Abstract: Single-cell analysis through mass cytometry has become an increasingly important tool for immunologists to study the immune system in health and disease. Mass cytometry creates a high-dimensional description vector for single cells by time-of-flight measurement. Recently, t-Distributed Stochastic Neighborhood Embedding (t-SNE) has emerged as one of the state-of-the-art techniques for the visualization and exploration of single-cell data. Ever increasing amounts of data lead to the adoption of Hierarchical Stochastic Neighborhood Embedding (HSNE), enabling the hierarchical representation of the data. Here, the hierarchy is explored selectively by the analyst, who can request more and more detail in areas of interest. Such hierarchies are usually explored by visualizing disconnected plots of selections in different levels of the hierarchy. This poses problems for navigation, by imposing a high cognitive load on the analyst. In this work, we present an interactive summary-visualization to tackle this problem. CyteGuide guides the analyst through the exploration of hierarchically represented single-cell data, and provides a complete overview of the current state of the analysis. We conducted a two-phase user study with domain experts that use HSNE for data exploration. We first studied their problems with their current workflow using HSNE and the requirements to ease this workflow in a field study. These requirements have been the basis for our visual design. In the second phase, we verified our proposed solution in a user evaluation.
More info at cyteguide.cytosplore.org.
Abstract: Deep neural networks are now rivaling human accuracy in several pattern recognition problems. Compared to traditional classifiers, where features are handcrafted, neural networks learn increasingly complex features directly from the data. Instead of handcrafting the features, it is now the network architecture that is manually engineered. The network architecture parameters such as the number of layers or the number of filters per layer and their interconnections are essential for good performance. Even though basic design guidelines exist, designing a neural network is an iterative trial-and-error process that takes days or even weeks to perform due to the large datasets used for training. In this paper, we present DeepEyes, a Progressive Visual Analytics system that supports the design of neural networks during training. We present novel visualizations, supporting the identification of layers that learned a stable set of patterns and, therefore, are of interest for a detailed analysis. The system facilitates the identification of problems, such as superfluous filters or layers, and information that is not being captured by the network. We demonstrate the effectiveness of our system through multiple use cases, showing how a trained network can be compressed, reshaped and adapted to different problems.
Abstract: Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for the data analysis. In particular, dimensionality reduction-based techniques like t-SNE offer single-cell resolution but are limited in the number of cells that can be analyzed. Here we introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for the analysis of mass cytometry data sets. HSNE constructs a hierarchy of non-linear similarities that can be interactively explored with a stepwise increase in detail up to the single-cell level. We apply HSNE to a study on gastrointestinal disorders and three other available mass cytometry data sets. We find that HSNE efficiently replicates previous observations and identifies rare cell populations that were previously missed due to downsampling. Thus, HSNE removes the scalability limit of conventional t-SNE analysis, a feature that makes it highly suitable for the analysis of massive high-dimensional data sets.
Abstract: Diffusion Tensor Imaging (DTI) group studies often require the comparison of two groups of 3D diffusion tensor fields. The total number of datasets involved in the study and the multivariate nature of diffusion tensors together make this a challenging process. The traditional approach is to reduce the six-dimensional diffusion tensor to some scalar quantities, which can be analyzed with univariate statistical methods, and visualized with standard techniques such as slice views. However, this provides merely part of the whole story due to information reduction. To take the full tensor information into account, only a few methods are available, and they focus on the analysis of a single group, rather than the comparison of two groups. Simultaneously comparing two groups of diffusion tensor fields by simple juxtaposition or superposition is rather impractical. In this work, we extend previous work to visually compare two groups of diffusion tensor fields. To deal with the wealth of information, the comparison is carried out at multiple levels of detail. In the 3D spatial domain, we propose a details-on-demand glyph representation to support the visual comparison of the tensor ensemble summary information in a progressive manner. The spatial view guides analysts to select voxels of interest. Then at the detail level, the respective original tensor ensembles are compared in terms of tensor intrinsic properties, with special care taken to reduce visual clutter. We demonstrate the usefulness of our visual analysis system by comparing a control group and an HIV positive patient group.
Winner best paper award
Abstract: Progressive Visual Analytics aims at improving the interactivity in existing analytics techniques by means of visualization as well as interaction with intermediate results. One key method for data analysis is dimensionality reduction, for example, to produce 2D embeddings that can be visualized and analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a well-suited technique for the visualization of several kinds of high-dimensional data. tSNE can create meaningful intermediate results but suffers from a slow initialization that constrains its application in Progressive Visual Analytics. We introduce a controllable tSNE approximation (A-tSNE), which trades off speed and accuracy, to enable interactive data exploration. We offer real-time visualization techniques, including a density-based solution and a Magic Lens to inspect the degree of approximation. With this feedback, the user can decide on local refinements and steer the approximation level during the analysis. We demonstrate our technique with several datasets, in a real-world research scenario and for the real-time analysis of high-dimensional streams to illustrate its effectiveness for interactive data analysis.
Abstract: A diffusion tensor imaging group study consists of a collection of volumetric diffusion tensor datasets (i.e., an ensemble) acquired from a group of subjects. The multivariate nature of the diffusion tensor imposes challenges on the analysis and the visualization. These challenges are commonly tackled by reducing the diffusion tensors to scalar-valued quantities that can be analyzed with common statistical tools. However, reducing tensors to scalars poses the risk of losing intrinsic information about the tensor. Visualization of tensor ensemble data without loss of information is still a largely unsolved problem. In this work, we propose an overview + detail visualization to facilitate the tensor ensemble exploration. We define an ensemble representative tensor and variations in terms of the three intrinsic tensor properties (i.e., scale, shape, and orientation) separately. The ensemble summary information is visually encoded into the newly designed aggregate tensor glyph which, in a spatial layout, functions as the overview. The aggregate tensor glyph guides the analyst to interesting areas that would need further detailed inspection. The detail views reveal the original information that is lost during aggregation. It helps the analyst to further understand the sources of variation and formulate hypotheses. To illustrate the applicability of our prototype, we compare with most relevant previous work through a user study and we present a case study on the analysis of a brain diffusion tensor dataset ensemble from healthy volunteers.
Abstract: Spatial and temporal brain transcriptomics has recently emerged as an invaluable data source for molecular neuroscience. The complexity of such data poses considerable challenges for analysis and visualization. We present BrainScope: a web portal for fast, interactive visual exploration of the Allen Atlases of the adult and developing human brain transcriptome. Through a novel methodology to explore high-dimensional data (dual t-SNE), BrainScope enables the linked, all-in-one visualization of genes and samples across the whole brain and genome, and across developmental stages. We show that densities in t-SNE scatter plots of the spatial samples coincide with anatomical regions, and that densities in t-SNE scatter plots of the genes represent gene co-expression modules that are significantly enriched for biological functions. We also show that the topography of the gene t-SNE maps reflects brain region-specific gene functions, enabling hypothesis- and data-driven research. We demonstrate the discovery potential of BrainScope through three examples: (i) analysis of cell-type-specific gene sets, (ii) analysis of a set of stable gene co-expression modules across the adult human donors and (iii) analysis of the evolution of co-expression of oligodendrocyte-specific genes over developmental stages.
BrainScope is publicly accessible at www.brainscope.nl.
Abstract: To understand how the immune system works, one needs to have a clear picture of its cellular composition and the cells’ corresponding properties and functionality. Mass cytometry is a novel technique to determine the properties of single cells with unprecedented detail. This amount of detail allows for much finer differentiation but also comes at the cost of more complex analysis. In this work, we present Cytosplore, implementing an interactive workflow to analyze mass cytometry data in an integrated system, providing multiple linked views, showing different levels of detail and enabling the rapid definition of known and unknown cell types. Cytosplore handles millions of cells, each represented as a high-dimensional data point, facilitates hypothesis generation and confirmation, and provides a significant speed up of the current workflow. We show the effectiveness of Cytosplore in a case study evaluation.
Abstract: In recent years, dimensionality-reduction techniques have been developed and are widely used for hypothesis generation in Exploratory Data Analysis. However, these techniques are confronted with overcoming the trade-off between computation time and the quality of the provided dimensionality reduction. In this work, we address this limitation, by introducing Hierarchical Stochastic Neighbor Embedding (Hierarchical-SNE). Using a hierarchical representation of the data, we incorporate the well-known mantra of Overview-First, Details-On-Demand in non-linear dimensionality reduction. First, the analysis shows an embedding, that reveals only the dominant structures in the data (Overview). Then, by selecting structures that are visible in the overview, the user can filter the data and drill down in the hierarchy. While the user descends into the hierarchy, detailed visualizations of the high-dimensional structures will lead to new insights. In this paper, we explain how Hierarchical-SNE scales to the analysis of big datasets. In addition, we show its application potential in the visualization of Deep-Learning architectures and the analysis of hyperspectral images.
Abstract: Hydrocarbon reservoir simulation models produce large amounts of heterogeneous data, combining multiple variables of different dimensionality, such as two or three-dimensional geospatial estimates with abstract estimates simulated for the complete field or different wells. In addition these simulations are nowadays often run as so-called ensemble simulations, to capture uncertainty of the model, as well as boundary conditions as variation in the output. The (visual) analysis of such data is a challenging process, due to the size and complexity of the data. In this paper we present an integrated system for the visual analysis of ensemble reservoir simulation data. We provide tools to inspect forecasts for multiple variables of complete fields, as well as different wells. Finally, we present a case study highlighting the effectiveness of the presented system.
Abstract: Inflammatory intestinal diseases are characterized by abnormal immune responses and affect distinct locations of the gastrointestinal tract. Although the role of several immune subsets in driving intestinal pathology has been studied, a system-wide approach that simultaneously interrogates all major lineages on a single-cell basis is lacking. We used high-dimensional mass cytometry to generate a system-wide view of the human mucosal immune system in health and disease. We distinguished 142 immune subsets and through computational applications found distinct immune subsets in peripheral blood mononuclear cells and intestinal biopsies that distinguished patients from controls. In addition, mucosal lymphoid malignancies were readily detected as well as precursors from which these likely derived. These findings indicate that an integrated high-dimensional analysis of the entire immune system can identify immune subsets associated with the pathogenesis of complex intestinal disorders. This might have implications for diagnostic procedures, immune-monitoring, and treatment of intestinal diseases and mucosal malignancies.
LUMC Best Article Prize 2016 (non clinical)
Abstract: Ocean forecasts nowadays are created by running ensemble simulations in combination with data assimilation techniques. Most of these techniques resample the ensemble members after each assimilation cycle. This means that in a time series, after resampling, every member can follow up on any of the members before resampling. Tracking behavior over time, such as all possible paths of a particle in an ensemble vector field, becomes very difficult, as the number of combinations rises exponentially with the number of assimilation cycles. In general, a single possible path is not of interest, only the probabilities that any point in space might be reached by a particle at some point in time. In this work we present an approach using probability-weighted piecewise particle trajectories to allow such a mapping interactively, instead of tracing quadrillions of individual particles. We achieve interactive rates by binning the domain and splitting up the tracing process into the individual assimilation cycles, so that particles that fall into the same bin after a cycle can be treated as a single particle with a larger probability as input for the next time step. As a result we lose the possibility to track individual particles, but can create probability maps for any desired seed at interactive rates.
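The binning idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each assimilation cycle is given as a list of per-member lookup tables mapping a start bin to an end bin (the hypothetical `cycles` structure), and that after resampling every member is equally likely to continue a trajectory. The function name `propagate` is likewise made up for illustration.

```python
from collections import defaultdict

def propagate(seed_bin, cycles):
    """Propagate probability mass from one seed bin through all cycles.

    cycles: list of assimilation cycles; each cycle is a list of member
    maps {start_bin: end_bin} (assumed, precomputed piecewise traces).
    Returns {bin: probability} after the last cycle.
    """
    mass = {seed_bin: 1.0}
    for members in cycles:
        weight = 1.0 / len(members)  # assumption: members equally likely
        merged = defaultdict(float)
        for start, prob in mass.items():
            for member in members:
                end = member.get(start)
                if end is not None:  # particle stayed inside the domain
                    # Particles ending in the same bin merge into one
                    # particle that carries the summed probability.
                    merged[end] += prob * weight
        mass = dict(merged)
    return mass

# Two cycles with two members each, on a toy domain of five bins.
cycles = [
    [{0: 1}, {0: 2}],
    [{1: 3, 2: 3}, {1: 3, 2: 4}],
]
result = propagate(0, cycles)
```

Because mass is merged per bin after every cycle, the work per cycle is bounded by the number of bins times the number of members, rather than growing with the number of possible member combinations.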
Abstract: We present a novel integrated visualization system that enables the interactive visual analysis of ensemble simulations and estimates of the sea surface height and other model variables that are used for storm surge prediction. Coastal inundation, caused by hurricanes and tropical storms, poses large risks for today's societies. High-fidelity numerical models of water levels driven by hurricane-force winds are required to predict these events, posing a challenging computational problem. Even though computational models continue to improve, uncertainties in storm surge forecasts are inevitable. Today this uncertainty is often exposed to the user by running the simulation many times with different parameters or inputs following a Monte-Carlo framework in which uncertainties are represented as stochastic quantities. This results in multidimensional, multivariate and multivalued data, so-called ensemble data. While the resulting datasets are very comprehensive, they are also huge in size and thus hard to visualize and interpret. In this paper we tackle this problem by means of an interactive and integrated visual analysis system. By harnessing the power of modern graphics processing units (GPUs) for visualization as well as computation, our system allows the user to browse through the simulation ensembles in real-time, view specific parameter settings or simulation models and move between different spatial or temporal regions without delay. In addition, our system provides advanced visualizations to highlight the uncertainty, or show the complete distribution of the simulations at user-defined positions over the complete time series of the prediction. We highlight the benefits of our system by presenting its application in a real-world scenario using a simulation of Hurricane Ike.
Abstract: We present a novel integrated visualization system that enables interactive visual analysis of ensemble simulations of the sea surface height that is used in ocean forecasting. The position of eddies can be derived directly from the sea surface height and our visualization approach enables their interactive exploration and analysis. The behavior of eddies is important in different application settings of which we present two in this paper. First, we show an application for interactive planning of placement as well as operation of off-shore structures using real-world ensemble simulation data of the Gulf of Mexico. Off-shore structures, such as those used for oil exploration, are vulnerable to hazards caused by eddies, and the oil and gas industry relies on ocean forecasts for efficient operations. We enable analysis of the spatial domain, as well as the temporal evolution, for planning the placement and operation of structures. Eddies are also important for marine life. They transport water over large distances and with it also heat and other physical properties as well as biological organisms. In the second application, we show the usefulness of our tool for marine scientists studying simulation data of the largely unexplored Red Sea, where it could be used for planning the paths of autonomous underwater vehicles, so-called gliders.
Abstract: Seismic interpretation is an important step in building subsurface models, which are needed to efficiently exploit fossil fuel reservoirs. However, seismic features are seldom unambiguous, resulting in a high degree of uncertainty in the extracted model. In this paper we present a novel system for the extraction, analysis and visualization of ensemble data of seismic horizons. By parameterizing the cost function of a global optimization technique for seismic horizon extraction, we can create ensembles of surfaces describing each horizon, instead of just a single surface. Our system also provides the tools for a complete statistical analysis of these data. Additionally, we allow an interactive exploration of the parameter space to help find optimal parameter settings for the current dataset.
Abstract: We present a novel integrated visualization system that enables interactive visual analysis of ensemble simulations used in ocean forecasting, i.e., simulations of sea surface elevation. Our system enables the interactive planning of both the placement and operation of off-shore structures. We illustrate this using a real-world simulation of the Gulf of Mexico. Off-shore structures, such as those used for oil exploration, are vulnerable to hazards caused by strong loop currents. The oil and gas industry therefore relies on accurate ocean forecasting systems for planning their operations. Nowadays, these forecasts are based on multiple spatio-temporal simulations resulting in multidimensional, multivariate and multivalued data, so-called ensemble data. Changes in sea surface elevation are a good indicator for the movement of loop current eddies, and our visualization approach enables their interactive exploration and analysis. We enable analysis of the spatial domain, for planning the placement of structures, as well as detailed exploration of the temporal evolution at any chosen position, for the prediction of critical ocean states that require the shutdown of rig operations.
Honorable mention for best paper award
Abstract: The most important resources to fulfill today’s energy demands are fossil fuels, such as oil and natural gas. When exploiting hydrocarbon reservoirs, a detailed and credible model of the subsurface structures is crucial in order to minimize economic and ecological risks. Creating such a model is an inverse problem: reconstructing structures from measured reflection seismics. The major challenge here is twofold: First, the structures in highly ambiguous seismic data are interpreted in the time domain. Second, a velocity model has to be built from this interpretation to match the model to depth measurements from wells. If it is not possible to obtain a match at all positions, the interpretation has to be updated, going back to the first step. This results in a lengthy back and forth between the different steps, or in an unphysical velocity model in many cases. This paper presents a novel, integrated approach to interactively creating subsurface models from reflection seismics. It integrates the interpretation of the seismic data using an interactive horizon extraction technique based on piecewise global optimization with velocity modeling. Computing and visualizing the effects of changes to the interpretation and velocity model on the depth-converted model on the fly enables an integrated feedback loop that enables a completely new connection of the seismic data in time domain and well data in depth domain. Using a novel joint time/depth visualization, depicting side-by-side views of the original and the resulting depth-converted data, domain experts can directly fit their interpretation in time domain to spatial ground truth data. We have conducted a domain expert evaluation, which illustrates that the presented workflow enables the creation of exact subsurface models much more rapidly than previous approaches.
Abstract: In this paper, a method for interactive direct volume rendering is proposed that computes ambient occlusion effects for visualizations that combine both volumetric and geometric primitives, specifically tube shaped geometric objects representing streamlines, magnetic field lines or DTI fiber tracts. The proposed algorithm extends the recently proposed Directional Occlusion Shading model to allow the rendering of those geometric shapes in combination with a context providing 3D volume, considering mutual occlusion between structures represented by a volume or geometry.
Abstract: Increasing demands in world-wide energy consumption and oil depletion of large reservoirs have resulted in the need for exploring smaller and more complex oil reservoirs. Planning of the reservoir valorization usually starts with creating a model of the subsurface structures, including seismic faults and horizons. However, seismic interpretation and horizon tracing is a difficult and error-prone task, often resulting in hours of work needing to be manually repeated. In this paper, we propose a novel, interactive workflow for horizon interpretation based on well positions, which include additional geological and geophysical data captured by actual drillings. Instead of interpreting the volume slice-by-slice in 2D, we propose 3D seismic interpretation based on well positions. We introduce a combination of 2D and 3D minimal cost path and minimal cost surface tracing for extracting horizons with very little user input. By processing the volume based on well positions rather than slice-based, we are able to create a piecewise optimal horizon surface at interactive rates. We have integrated our system into a visual analysis platform which supports multiple linked views for fast verification, exploration and analysis of the extracted horizons. The system is currently being evaluated by our collaborating domain experts.
Abstract: This paper presents a novel method for interactive exploration of industrial CT volumes such as cast metal parts, with the goal of interactively detecting, classifying, and quantifying features using a visualization-driven approach. The standard approach for defect detection builds on region growing, which requires manually tuning parameters such as target ranges for density and size, variance, as well as the specification of seed points. If the results are not satisfactory, region growing must be performed again with different parameters. In contrast, our method allows interactive exploration of the parameter space, completely separated from region growing in an unattended pre-processing stage. The pre-computed feature volume tracks a feature size curve for each voxel over time, which is identified with the main region growing parameter such as variance. A novel 3D transfer function domain over (density, feature size, time) allows for interactive exploration of feature classes. Features and feature size curves can also be explored individually, which helps with transfer function specification and allows coloring individual features and disabling features resulting from CT artifacts. Based on the classification obtained through exploration, the classified features can be quantified immediately.
Abstract: Mass cytometry allows high-resolution dissection of the cellular composition of the immune system. However, the high dimensionality, large size, and non-linear structure of the data pose considerable challenges for data analysis. We introduce Hierarchical Stochastic Neighbor Embedding (HSNE) for single-cell analysis, a computational approach that constructs a hierarchy of non-linear similarities, allowing the analysis of millions of cells via different levels of detail up to single-cell resolution within minutes. We integrated HSNE into the Cytosplore+HSNE framework to facilitate interactive exploration and analysis of the hierarchy by a set of corresponding two-dimensional plots with stepwise increase in detail up to the single-cell level. This divide-and-conquer approach minimizes computation time and, thereby, allows efficient and interactive visualization. We validated the discovery potential of Cytosplore+HSNE by re-analyzing a recent study on gastrointestinal disorders as well as two other publicly available mass cytometry datasets. We found that Cytosplore+HSNE efficiently identifies both abundant and rare cell populations, without resorting to downsampling of the data, including rare cell populations that were missed in a previous analysis due to downsampling. Taken together, Cytosplore+HSNE offers unprecedented possibilities for visual exploration and analysis of millions of cells measured in mass cytometry studies.
Abstract: High-dimensional mass cytometry (CyTOF) permits the simultaneous measurement of many cellular markers, providing a system-wide view of immune phenotypes at the single-cell level. Yet, the maximum number of markers that can be measured simultaneously is limited to ~50 due to several technical challenges. We propose a new method to integrate CyTOF data from several marker panels that include an overlapping set of markers, allowing for a deeper interrogation of the cellular composition of the immune system.
Abstract: Despite its importance to the world community for a variety of socio-economical reasons and the presence of extensive coral reef gardens along its shores, the Red Sea remains one of the most under-studied large marine physical and biological systems in the global ocean. We present our efforts to build advanced modeling, data assimilation, and uncertainty quantification capabilities for the Red Sea, which is part of the newly established Saudi ARAMCO Marine Environmental Research Center aiming at studying and forecasting the …
Abstract: This work presents a new workflow for the interpretation of seismic volume data, as well as a novel approach to interactively tracing seismic horizons. Instead of interpreting the seismic cube slice by slice, in the proposed workflow interpretation is performed on the planes connecting wells that have been drilled. Thereby the additional data provided by the well logs can easily be used during the interpretation process. Instead of manually picking the seismic horizon, we propose an algorithm which uses numerical integration over a vector field computed with diffusion tensors for automatic tracing, based on a user-defined seed point.
Abstract: The most important resources to fulfill today’s energy demands are fossil fuels, such as oil and natural gas. When exploiting hydrocarbon reservoirs, a detailed and credible model of the subsurface structures, used to plan the path of the borehole, is crucial in order to minimize economic and ecological risks. Before that, the placement as well as the operations of oil rigs need to be planned carefully, as off-shore oil exploration is vulnerable to hazards caused by strong currents. The oil and gas industry therefore relies on accurate ocean forecasting systems for planning their operations. This thesis presents visual workflows for creating subsurface models as well as planning the placement and operations of off-shore structures. Creating a credible subsurface model poses two major challenges: First, the structures in highly ambiguous seismic data are interpreted in the time domain. Second, a velocity model has to be built from this interpretation to match the model to depth measurements from wells. If it is not possible to obtain a match at all positions, the interpretation has to be updated, going back to the first step. This results in a lengthy back and forth between the different steps, or in an unphysical velocity model in many cases. We present a novel, integrated approach to interactively creating subsurface models from reflection seismics, by integrating the interpretation of the seismic data using an interactive horizon extraction technique based on piecewise global optimization with velocity modeling. Computing and visualizing the effects of changes to the interpretation and velocity model on the depth-converted model on the fly enables an integrated feedback loop that enables a completely new connection of the seismic data in time domain and well data in depth domain.
For planning the operations of off-shore structures we present a novel integrated visualization system that enables interactive visual analysis of ensemble simulations used in ocean forecasting, i.e., simulations of sea surface elevation. Changes in sea surface elevation are a good indicator for the movement of loop current eddies. Our visualization approach enables their interactive exploration and analysis. We enable analysis of the spatial domain, for planning the placement of structures, as well as detailed exploration of the temporal evolution at any chosen position, for the prediction of critical ocean states that require the shutdown of rig operations. We illustrate this using a real-world simulation of the Gulf of Mexico.
Abstract: Using programmable graphics hardware (GPU) is the de-facto standard for real-time volume rendering nowadays. In addition to that, GPUs are often used for non-graphical tasks to accelerate complex computations and also allow the direct rendering of (intermediate) results. However, the amount of graphics memory can become a problem when working with large volume datasets. Even though today's graphics hardware provides more memory than ever before, the amount of data is also increasing rapidly. In order to overcome this, many compression algorithms have been developed and some of them are even hardware-accelerated. As these implementations only support a small range of formats and often do not provide sufficient quality, custom algorithms have been implemented which often utilize shader programs for decoding and encoding. While this has proven useful for visualization, providing interactive framerates for direct volume rendering, the algorithms focus on displaying the data, not processing it. In this thesis, different compression techniques are compared with focus on their suitability for processing in the compression domain. A wavelet transform based compression scheme is implemented which allows lossless as well as lossy compression of volume data. Image processing operations are classified based on their applicability in the wavelet compression domain. Based on this classification, different image operations are implemented as examples. Furthermore, for visualization, multi-planar reconstruction directly from the compressed data is presented. The results of this thesis are compared to processing in the spatial domain, showing advantages and shortcomings. Finally, an outlook on possible future work is given.
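To make the notion of a wavelet transform that supports lossless compression concrete, here is a minimal one-dimensional sketch. The abstract does not say which wavelet the thesis uses; the integer Haar transform (also known as the S-transform) below is an assumed stand-in, and the function names are hypothetical. It maps integers to integers and is exactly invertible, which is the property a lossless scheme needs.

```python
def haar_forward(values):
    """One level of the integer Haar (S-) transform on an even-length list.

    Returns (averages, differences); both are lists of ints.
    """
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) >> 1 for a, b in pairs]  # floor of the mean
    differences = [a - b for a, b in pairs]
    return averages, differences

def haar_inverse(averages, differences):
    """Exactly undo haar_forward using integer arithmetic only."""
    values = []
    for avg, diff in zip(averages, differences):
        a = avg + ((diff + 1) >> 1)  # recovers the bit lost by the floor
        b = a - diff
        values.extend((a, b))
    return values

samples = [5, 2, 2, 5, 4, 2, 0, 7]
averages, differences = haar_forward(samples)
restored = haar_inverse(averages, differences)
```

A lossy variant would quantize or discard small entries in `differences` before storing them; for volume data the same step would be applied along each of the three axes.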