Multivariate image acquisition of a biomedical sample. A multivariate signal vector s(p) = (s1, s2, ···, sn) is associated with each pixel p in the spatial domain, which is usually a regular grid. The signal vectors s are data points in the n-dimensional signal domain.
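The pixel-to-vector association described in this caption can be sketched as an array layout. This is a minimal illustration, assuming the image is stored as a NumPy array of shape (height, width, n); the array sizes and the helper names `signal_vector` and `to_signal_domain` are hypothetical and not taken from the source.

```python
import numpy as np

# Hypothetical multivariate image: 4 x 5 pixels, n = 3 signals per pixel.
height, width, n_channels = 4, 5, 3
rng = np.random.default_rng(0)
image = rng.random((height, width, n_channels))


def signal_vector(image, row, col):
    """Return the signal vector s(p) for the pixel p = (row, col)."""
    return image[row, col, :]


def to_signal_domain(image):
    """Drop the grid layout: one row per pixel, one column per signal."""
    return image.reshape(-1, image.shape[-1])


# Each row of `signals` is one data point in the n-dimensional signal domain.
signals = to_signal_domain(image)
```

Flattening discards the spatial arrangement, so the row index (row * width + col) is what links a data point in the signal domain back to its pixel in the spatial domain.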
Schematic illustration of multivariate image analysis. Two domains, the signal domain and the spatial domain, are considered and linked to each other for a sophisticated knowledge discovery process (right). The analysis of each domain can be further subdivided into a statistical and an exploratory analysis level. Closely linking both levels to each other is usually referred to as visual data mining. In some applications, an image segmentation step precedes the actual data analysis. This can greatly reduce the number of signal vectors to be analyzed. Furthermore, it makes it possible to extract object-specific features such as size, shape, or texture, which can be used instead of or in addition to the original signal vectors. VDM, visual data mining.
Different levels of segmenting an input image. (a) A coarse segmentation assigns object (1) and background (0) labels. (b) Finer levels distinguish between individual object classes or (c) even object identities.
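The three label levels can be illustrated with small label maps. This is a sketch under assumptions: the instance map and the lookup table `class_of_instance` are invented toy data, and the code shows only how the coarser levels follow from the finest one, not how a segmentation is computed.

```python
import numpy as np

# Hypothetical level (c) map: 0 = background, 1..3 = object identities.
instances = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
    [3, 3, 0, 0, 0],
])

# Hypothetical class of each identity (index 0 is background):
# objects 1 and 3 belong to class 1, object 2 belongs to class 2.
class_of_instance = np.array([0, 1, 2, 1])

# Level (b): replace each identity by its object class.
classes = class_of_instance[instances]

# Level (a): collapse everything that is not background to the label 1.
binary = (instances > 0).astype(np.uint8)
```

Each coarser level is recoverable from the finer one, but not the reverse: the binary map alone cannot tell the three objects or the two classes apart.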
One way to analyze multivariate images is to use visual diagnostics and information visualization. In this example (Ref 38), a multichannel fluorescence micrograph is analyzed that shows a field of view in a tissue section from a mouse pancreas. Glucagon-positive cells show a green signal and insulin-positive cells show a red one. The blue signal shows tagged cell nuclei. In a segmentation step, all cells are detected in the blue channel, and a fluorescence feature value is computed for each cell ci for the red channel [r(ci)] and the green channel [g(ci)]. All pairs [r(ci), g(ci)] are plotted in a scatter plot (black dots in the background frames in (a) and (b)). The user can select two thresholds (tr, tg) in the scatter plot; all cells with r(ci) > tr are shown in the insulin channel (red, right) and all cells with g(ci) > tg are shown in the glucagon channel (green, left). The upper row (a) shows results for conservative thresholding and the lower row (b) shows results for more generous thresholding.
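The per-cell feature computation and thresholding can be sketched as follows. The caption does not say which fluorescence feature is used, so the mean channel intensity per cell is an assumption here, as are the function names, the toy label image, and the threshold values.

```python
import numpy as np


def cell_features(labels, channel):
    """Mean intensity of `channel` within each cell (labels 1..K, 0 = background).

    Using the mean as the fluorescence feature is an assumption of this sketch.
    """
    n_cells = int(labels.max())
    sums = np.bincount(labels.ravel(), weights=channel.ravel(),
                       minlength=n_cells + 1)
    counts = np.bincount(labels.ravel(), minlength=n_cells + 1)
    return sums[1:] / np.maximum(counts[1:], 1)


def select_cells(labels, features, threshold):
    """Pixel mask of all cells whose feature value exceeds `threshold`."""
    keep = np.concatenate(([False], features > threshold))
    return keep[labels]


# Toy data: three cells from a (hypothetical) nucleus segmentation.
labels = np.array([
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
    [3, 3, 0, 0],
])
red = np.array([0.0, 0.9, 0.2, 0.6])[labels]    # insulin channel
green = np.array([0.0, 0.1, 0.8, 0.5])[labels]  # glucagon channel

r = cell_features(labels, red)    # r(ci) for each cell
g = cell_features(labels, green)  # g(ci) for each cell

t_r, t_g = 0.5, 0.7               # thresholds picked in the scatter plot
insulin_mask = select_cells(labels, r, t_r)
glucagon_mask = select_cells(labels, g, t_g)
```

Lowering `t_r` or `t_g` admits more cells into the respective channel view, which corresponds to the difference between the conservative and the more generous thresholding rows.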
To analyze 4D image data from radiology, the HYDE system (Ref 60) is applied. In the three-dimensional volume data set, each voxel is associated with an n-dimensional feature vector that describes the kinetics of a contrast agent. The contrast agent is used to visualize increased vascularity, which is an indication of malignancy in breast tumors. The n-dimensional feature vectors are projected onto a unit disc using a hyperbolic SOM and a Poincaré projection (image on the left). The SOM prototype vectors are visualized with small box plot icons. The disc is filled with the top disc of an HSV color space cone, which allows a topology-preserving assignment of prototypes to colors. A best-matching unit search for each voxel allows voxels to be assigned to colors and the entire data cube to be pseudocolored (image in the middle). On the left, a single magnetic resonance image, a feature image (upper left), and subtraction images (lower left) are shown. These are included because they are the standard visualizations a radiologist looks at, so that the patterns found in all visualizations can be linked. SOM, self-organizing map (Reprinted with permission from Ref 60. Copyright 2005 Palgrave Macmillan, a division of Macmillan Publishers Limited).
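The best-matching unit search and the pseudocoloring step can be sketched as below. This is not the HYDE implementation: training the hyperbolic SOM is omitted, the prototypes and their positions in the unit disc are assumed to be given, Euclidean distance is assumed for the search, and all function names are hypothetical. The color mapping takes hue from the angle and saturation from the radius at full value, which is one way to read "top disc of an HSV cone".

```python
import colorsys

import numpy as np


def best_matching_units(features, prototypes):
    """Index of the nearest prototype for each feature vector.

    Euclidean distance is an assumption of this sketch.
    """
    diff = features[:, None, :] - prototypes[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def disc_colors(positions):
    """RGB color for each 2D position in the unit disc.

    Hue = angle, saturation = radius, value = 1.
    """
    colors = []
    for x, y in positions:
        hue = float(np.arctan2(y, x) / (2 * np.pi)) % 1.0
        sat = min(float(np.hypot(x, y)), 1.0)
        colors.append(colorsys.hsv_to_rgb(hue, sat, 1.0))
    return np.array(colors)


def pseudocolor(volume, prototypes, positions):
    """Color each voxel of a (Z, Y, X, n) volume by its best-matching unit."""
    flat = volume.reshape(-1, volume.shape[-1])
    bmu = best_matching_units(flat, prototypes)
    return disc_colors(positions)[bmu].reshape(volume.shape[:-1] + (3,))


# Toy data: two prototypes, one at the disc center and one on the rim.
prototypes = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
positions = np.array([[0.0, 0.0], [1.0, 0.0]])
volume = np.zeros((2, 2, 2, 3))
volume[0, 0, 0] = 1.0  # one voxel with strong uptake kinetics

rgb = pseudocolor(volume, prototypes, positions)
```

Because neighboring prototypes on the disc receive similar hue and saturation, voxels with similar kinetics end up with similar colors in the pseudocolored volume.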