Casiraghi, Cossa, Huber, Tozzi, Rivoltini, Villa, and Vergani: MIAQuant, a novel system for automatic segmentation, measurement, and localization comparison of different biomarkers from serialized histological slices

MIAQuant, a novel system for automatic segmentation, measurement, and localization comparison of different biomarkers from serialized histological slices





Competing interest statement

Conflict of interest: the authors declare no potential conflict of interest.

Abstract

In clinical practice, automatic image analysis methods that quickly quantify histological results in an objective and replicable way are becoming increasingly necessary and widespread. Although several commercial software products are available for this task, they offer little flexibility and are provided as black boxes without modifiable source code. To overcome these problems, we used the widely adopted MATLAB platform to develop MIAQuant, an automatic method for the analysis of histochemical and immunohistochemical images stained with various methods and acquired by different tools. It automatically extracts and quantifies markers characterized by various colors and shapes; furthermore, it aligns contiguous tissue slices stained for different markers and overlays them in different colors for visual comparison of their localization. The application of MIAQuant in clinical research fields, such as oncology and cardiovascular disease studies, has proven its efficacy, robustness, and flexibility with respect to various problems; this flexibility makes MIAQuant an important tool for basic research, where needs are constantly changing. MIAQuant software and its user manual are freely available for clinical studies, pathological research, and diagnosis.

Introduction

Histological variables with prognostic and predictive meaning are very important for clinical and decision-making purposes; pathologists have long understood the importance of quantifying them accurately. In clinical routine, manual methods are often used even though they are time-consuming, labor-intensive, and prone to errors and to high inter- and intra-observer variability, even when experienced histologists are involved.

Although commercially available automatic counting methods are faster and allow a more standardized quantification of the histological signal, they are not flexible with respect to different settings, and they can be neither extended nor customized because their source code is never provided.

This motivated us to create a novel automatic image analysis system that runs on standard PCs and is flexible with respect to images with different characteristics, e.g., images with different stainings or acquired by different imaging systems (from standard digital cameras connected to microscopes to more sophisticated scanners).

An extremely flexible image analysis system is even more important for basic research, where needs are constantly changing.

Materials and Methods

Image database

To develop and test MIAQuant we used different histological samples, staining procedures, and imaging systems.

Formalin-fixed and paraffin-embedded (FFPE) slides were deparaffinized in xylene and rehydrated in descending grades of alcohol. Standard Oil Red O or Alcian blue pH 2.5 procedures were used as histological stains to visualize lipid droplets or mucins, respectively.

Immunohistochemistry slides were processed as described in a previous study [1]. After incubation with primary antibodies (Abs), for signal detection we used either the REAL Detection System, Alkaline Phosphatase/RED (red color) or the UltraVision™ Quanto Detection System HRP (brown color), according to the manufacturer’s instructions. Images were acquired by Aperio Scanscope Cs (Aperio Technologies, Vista, CA, USA, color CCD camera, 14 μm x 14 μm pixel size), by Olympus BX63 equipped with DP80 camera (color CCD camera, 6.45 μm x 6.45 μm pixel size) and software cellSens (Shinjuku Monolith, Tokyo, Japan), or by Nikon Eclipse E600 microscope equipped with DS-Fi1 camera (color CCD camera, 3.4 μm x 3.4 μm pixel size) and software Nis-Elements AR3.10 (Tochigi Nikon Corporation, Tochigi, Japan). These imaging systems were used to acquire images whose size ranges from 5000x5000 to 35000x35000 pixels, and whose resolution is in the range.

Software development platform

To develop MIAQuant we used the MATLAB platform, which is widely used in computer science, physics, and mathematics because it is optimized for solving engineering and scientific problems. Moreover, the matrix-based MATLAB language is a natural way to express computational mathematics; for this reason, MATLAB is often used in automatic image processing research, where images are represented and treated as multidimensional matrices.

We developed and tested MIAQuant on a standard laptop (CPU, Intel i7; RAM, 16 GB; disk, 256 GB SSD). The system requirements depend on image size and resolution; given our memory limits, the software was used to process images stored in formats with lossless compression (e.g., TIF/TIFF, JPEG 2000, PNG, or GIF), provided their memory size was less than 1.5 GB. We easily circumvented this limit by downsampling the largest images (e.g., by 50%) when this did not compromise the quality of the analysis, or by using macros to automatically split images and then recompose the computed results.

Results

As described in the following, our software not only works on sections stained with the chromogenic methods most commonly used in immunohistochemistry (e.g., alkaline phosphatase resulting in reddish color, or peroxidase resulting in brownish color), but also is adaptable to particular chemical dyes (e.g., Alcian blue producing light blue markings, or oil red resulting in bright red markings). MIAQuant also analyzes images acquired with fluorescence microscopes, where segmentation problems are much simpler.

The first step of MIAQuant extracts the tissue area, to which the subsequent algorithms are applied. For computational efficiency, the image is first downsampled so that its larger side is less than or equal to 5000 pixels. The downsampling size was chosen experimentally to reduce the computational cost of the algorithms without decreasing their effectiveness. Secondly, the gray-level (gL) version [2] of the downsampled image is automatically thresholded by the Otsu algorithm [3]. The computed mask is then rescaled to the original size and automatically refined by applying morphological binary operators [2] and removing false positive areas (e.g., areas that are too small, not compact, or elongated). To discard white noise as well as salt-and-pepper noise, the pixels of the RGB image in the tissue area are filtered by median and Gaussian filters [2], both of size 5x5.
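The thresholding step can be illustrated with a minimal Python sketch of the Otsu criterion (maximizing the between-class variance of the gray-level histogram). This is not the MIAQuant code, which is written in MATLAB; the function name, the toy pixel values, and the assumption that tissue is darker than the background are ours.

```python
def otsu_threshold(gray_levels):
    """Return the gray level t that maximizes the between-class variance.

    gray_levels: iterable of integers in 0..255.
    Pixels with value <= t fall in one class, pixels > t in the other.
    """
    hist = [0] * 256
    for g in gray_levels:
        hist[g] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of the gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Toy image: dark tissue (about 60-80) on a bright background (about 230-250).
pixels = [60, 70, 80, 65, 75] * 20 + [230, 240, 250, 235, 245] * 30
t = otsu_threshold(pixels)
# Assumption: in a brightfield image the tissue is darker than the background.
tissue_mask = [g <= t for g in pixels]
```

On this toy histogram the threshold falls in the gap between the two gray-level clusters, so the 100 dark pixels are labeled as tissue.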

Marker segmentation via Rule-Based System and K-NN classifier

The automatic identification of marker areas by a computationally simple, efficient, and effective segmentation system requires a pixel-based approach composed of simple techniques, which classifies each pixel of the image as an element of either the marker-pixel class or the not-marker-pixel class. Note that the pixel-based approach, which classifies each pixel independently of its neighbors, allows processing of very large images: images that are too big can be split into smaller sub-images, which are segmented separately, and the segmented results are then recomposed. To keep the computational cost low, we employ two consecutive steps: first, a rule-based system discards most not-marker pixels, creating a first set of “candidate” marker pixels; second, a K-NN classifier [4], a non-parametric classification method, recognizes the pixels belonging to marker areas among the candidates.

Both systems were developed through the analysis of manually selected sample pixels. To collect them, we implemented a user interface that lets expert users select a highly unbalanced sample-pixel set [5], Sam, composed of a number, N, of marker pixels that is much lower than the number of not-marker pixels (in our case, the number of not-marker pixels is equal to N*1000).

To create the rule-based system and then train the K-NN classifier, Sam is randomly halved into two non-intersecting sets (Sam = SRules ∪ SKNN, SRules ∩ SKNN = ∅). SRules is used to create the rule-based system, and SKNN is used to train the K-NN classifier.
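Because each pixel is classified independently of its neighbors, splitting an image into tiles and recomposing the results gives the same output as processing the whole image at once. The following Python sketch illustrates this property; the helper names and the toy per-pixel classifier `is_reddish` are hypothetical and are not part of MIAQuant.

```python
def split_into_tiles(image, tile_h, tile_w):
    """Yield (row, col, tile) for each tile; image is a list of pixel rows."""
    for r in range(0, len(image), tile_h):
        for c in range(0, len(image[0]), tile_w):
            tile = [row[c:c + tile_w] for row in image[r:r + tile_h]]
            yield r, c, tile


def classify_tilewise(image, classify_pixel, tile_h, tile_w):
    """Classify each tile independently and paste the results back."""
    out = [[None] * len(image[0]) for _ in image]
    for r, c, tile in split_into_tiles(image, tile_h, tile_w):
        for i, row in enumerate(tile):
            for j, pixel in enumerate(row):
                out[r + i][c + j] = classify_pixel(pixel)
    return out


# Hypothetical per-pixel classifier on (R, G, B) tuples, for illustration only.
def is_reddish(pixel):
    r, g, b = pixel
    return r > 1.1 * b and r > g


# Synthetic 5x7 RGB image.
image = [[((x * 37 + y * 11) % 256, (x * 5) % 256, (y * 7) % 256)
          for x in range(7)] for y in range(5)]
whole = [[is_reddish(p) for p in row] for row in image]
tiled = classify_tilewise(image, is_reddish, tile_h=2, tile_w=3)
```

Here `tiled` and `whole` are identical, even though the tile size does not divide the image size evenly.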
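A random halving into two disjoint sets can be sketched in Python as follows. The function name, the seed parameter, and the toy sample set are assumptions made for illustration; note that a plain random split such as this one does not guarantee that the few marker pixels are evenly distributed between the two halves.

```python
import random


def halve_samples(samples, seed=None):
    """Randomly split samples into two disjoint halves (S_Rules, S_KNN)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Toy sample set of (pixel_id, label) pairs; the ids are hypothetical.
sam = [(i, "marker" if i < 4 else "not-marker") for i in range(40)]
s_rules, s_knn = halve_samples(sam, seed=0)
```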

Rule-Based System

To discover discriminative rules that discard most of the not-marker pixels while keeping all the marker pixels, we represented each pixel in SRules by its color coordinates in different color spaces (RGB, CIE L*a*b*, Y’CbCr, HSV, etc.) [6], and we statistically analyzed and compared the probability distribution estimates of marker and not-marker pixels. As an example, Figure 1 shows an image of a melanoma section where CD163 Abs are stained with alkaline phosphatase. The specific marker areas have a reddish appearance, but the image also shows cells with a brownish appearance; since brown and red have similar color coordinates in different color spaces, most software products wrongly include unspecific pixels among the segmented “marker pixels”. To avoid this error while correctly segmenting the marker pixels, the statistical analysis of our manually selected sample pixels suggested combining the (normalized) a*, b*, and Cr values to represent the color of each pixel by a compact and much more discriminating feature: fComb = a* - b* + Cr. The analysis of the estimated three-dimensional marker/not-marker probability map in the RGB space and of the one-dimensional probability map in the fComb space allowed the definition of optimal thresholding hyper-planes in the RGB and fComb spaces, which minimize the probability of segmentation errors and create a first candidate marker-pixel set. Specifically, to segment the markers with reddish appearance in our database, each pixel p in the tissue area is taken as a candidate marker if:

[ fComb(p) > 170 AND R(p) > 1.1*B(p) ] AND { NOT [ R(p) > 190 AND ( G(p) > 115 OR B(p) > 115 ) ] }.

We used the same procedure (manual selection followed by statistical analysis) to develop a rule-based system for the segmentation of the images in our database that contain brownish markers; in this case, candidate marker pixels are such that:

R(p) >= 1.15*G(p) AND R(p) >= B(p).
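The two candidate rules translate directly into Boolean predicates. In the Python sketch below, the thresholds are those stated in the text, while the function names are ours; fComb is passed in as a precomputed value because the normalization of a*, b*, and Cr is not detailed here.

```python
def is_red_candidate(f_comb, r, g, b):
    """Candidate rule for reddish markers.

    f_comb is the precomputed fComb value of the pixel (its normalization
    is not specified in the text, so it is taken as an input here).
    """
    passes_color_test = f_comb > 170 and r > 1.1 * b
    excluded = r > 190 and (g > 115 or b > 115)
    return passes_color_test and not excluded


def is_brown_candidate(r, g, b):
    """Candidate rule for brownish markers."""
    return r >= 1.15 * g and r >= b
```

For example, a pixel with fComb = 200 and (R, G, B) = (180, 60, 70) is a reddish candidate, whereas the same fComb with (220, 150, 100) is rejected by the exclusion clause.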

After all the candidate marker pixels have been automatically extracted, we delete small areas (that is, areas composed of fewer than 9 pixels), which are most probably due to noise or image artifacts.
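Removing small areas amounts to labeling the connected components of the candidate mask and clearing those below the size limit. A minimal Python sketch follows; the use of 8-connectivity and the function name are assumptions, since the text does not specify how areas are delimited.

```python
from collections import deque


def remove_small_areas(mask, min_pixels=9):
    """Clear connected areas with fewer than min_pixels pixels.

    mask: list of rows of 0/1 values. Uses 8-connectivity (an assumption).
    """
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected area.
            area, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                area.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(area) < min_pixels:
                for cy, cx in area:
                    out[cy][cx] = 0
    return out


mask = [[0] * 8 for _ in range(6)]
for y in range(3):
    for x in range(3):
        mask[y][x] = 1           # 3x3 block: 9 pixels, kept
mask[5][6] = mask[5][7] = 1      # 2-pixel speck, removed
cleaned = remove_small_areas(mask)
```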

K-NN classifier

In the pattern recognition field, the “K-nearest neighbors” algorithm (K-NN for short) is a non-parametric method used for both classification and regression [4]. We are interested in K-NN classification, where the output is the class membership, here the marker class versus the not-marker class. Given a pixel whose class is unknown, the K-NN algorithm classifies it by a majority vote of its K nearest neighbors in the training set, that is, it assigns the pixel to the class most common among them. The parameter K is a positive integer, typically small and experimentally chosen. In our case, we applied a cross-validation procedure [7], which selected K=8.

Before employing SKNN to create the K-NN classifier, we prune it by applying the derived rule-based system. This discards many of the not-marker pixels while keeping the number of marker pixels almost unchanged. The pruned SKNN set is still unbalanced, but its cardinality is strongly reduced; this lowers the time complexity of the subsequent K-NN classifier, since the search for the 8 nearest neighbors is faster in a smaller set. To train the K-NN classifier, the pixels in SKNN are represented by their color coordinates in the RGB space, and the distance between a pixel to be classified and each training pixel is computed as the Euclidean distance between their RGB coordinates.
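A brute-force K-NN vote over RGB coordinates can be sketched in Python as follows. The training pixels are invented for illustration, and the tie-breaking choice (a 4-4 vote is assigned to the not-marker class) is our assumption, since the text does not say how ties are resolved when K is even.

```python
def knn_is_marker(pixel, training, k=8):
    """Classify an RGB pixel by majority vote among its k nearest neighbors.

    training: list of ((r, g, b), is_marker) pairs.
    Ties (possible because k is even) go to not-marker: an assumption.
    """
    def sq_dist(a, b):
        # The squared Euclidean distance yields the same neighbor ranking.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(training, key=lambda item: sq_dist(pixel, item[0]))[:k]
    marker_votes = sum(1 for _, is_marker in nearest if is_marker)
    return marker_votes > len(nearest) / 2


# Hypothetical training pixels: reddish markers and brownish not-markers.
training = (
    [((200 + i, 40 + i, 60), True) for i in range(10)]
    + [((120 + i, 90 + i, 60), False) for i in range(10)]
)
red_pixel_is_marker = knn_is_marker((205, 45, 60), training)
brown_pixel_is_marker = knn_is_marker((125, 95, 60), training)
```

Sorting the whole training set is the simplest way to find the neighbors; its cost grows with the size of the training set, which is why pruning SKNN beforehand speeds up classification.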

Applications of marker segmentation and computed results

Given the segmented marker pixels, their density estimate is computed as the percentage of marker pixels with respect to the tissue area; note that density estimates can also be computed with respect to user-selected areas of interest. Furthermore, our software describes the markers’ location by computing the markers’ minimum distance from user-selected points or borders of interest (e.g., borders of cancer nodules). So far, we have tested the described segmentation and analysis procedure on about 1000 images of different sizes and resolutions, and the computed results have been judged as precise and promising [8].

Figure 2 A,C,E,G,I shows some sample images. The segmented markers (B, D, F, H, J) are represented by the software either in white (B and D) or in their original RGB color (F, H, J); the latter choice highlights the hue characterizing each marker area. The images have different magnifications, were acquired by different instruments (A, C, Aperio ScanScope; E, G, I, optical microscope), and were stained by different techniques. As shown in Figure 2, the software is robust with respect to all these variations and is easily adaptable to any marker color and shape, thanks to the user-selected examples, which are used first to analyze the marker appearance and create the rules for candidate segmentation, and then to train the classifier that selects the marker pixels. We further underline that our system effectively copes with many of the problems commonly affecting software solutions on the market. Examples are shown both in Figure 1A and in Figure 3 A,C,E; they are due to background signals, unspecific colorations, or technical artifacts (e.g., folds and pigments). Figure 3 C-F also shows two serial sections stained with the same primary antibody but different revelation methods. Note that the segmented markers are close to each other and mostly overlapping, and that the computed density estimates are comparable. This example shows that MIAQuant can be used for the effective quantitative analysis and comparison of serial sections.

In further support of the effectiveness of MIAQuant in the comparison of serial slices, and of its potential impact in cancer research, Figure 4 shows 6 serial slices of a melanoma containing an immunological infiltrate, together with the segmentation of 6 markers identifying the different cell populations that constitute the infiltrate. The comparison of their densities and of their relative localization is a key issue in oncology, because it could provide highly informative knowledge about the role of the immune infiltrate. Note that the results are effective and the comparison is successful even though the processed serial sections differ in staining and thickness and are affected by problems that typically occur in routine practice.

MIAQuant expresses the distribution of each marker, with respect to user-selected points or areas of interest, by plotting minimum distance histograms. In our experiments, expert users noted that each histogram plot effectively reflects the marker position, while the comparison of the histograms provides highly informative insights.
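Both measurements reduce to simple counting and distance computations on binary masks. The Python sketch below is only illustrative: the function names and toy masks are ours, and distances are expressed in pixels from a list of user-selected points given as (row, column) pairs.

```python
import math


def marker_density(marker_mask, tissue_mask):
    """Percentage of tissue pixels that are marker pixels."""
    tissue = sum(sum(row) for row in tissue_mask)
    marked = sum(m and t
                 for mrow, trow in zip(marker_mask, tissue_mask)
                 for m, t in zip(mrow, trow))
    return 100.0 * marked / tissue if tissue else 0.0


def min_distances(marker_mask, points):
    """Distance (in pixels) from each marker pixel to its nearest point."""
    return [min(math.hypot(y - py, x - px) for py, px in points)
            for y, row in enumerate(marker_mask)
            for x, m in enumerate(row) if m]


tissue = [[1, 1, 1, 1],
          [1, 1, 1, 1],
          [0, 0, 0, 0]]
marker = [[0, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 0, 0]]
density = marker_density(marker, tissue)          # 2 of 8 tissue pixels
distances = min_distances(marker, points=[(0, 0)])
```

Restricting the density to a user-selected area of interest only requires passing the mask of that area in place of the tissue mask.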
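A minimum distance histogram is obtained by binning the per-pixel minimum distances. The following Python sketch uses fixed-width bins; the bin width, the function name, and the distance values are hypothetical and serve only to show how two markers with different localizations yield clearly different histograms.

```python
def distance_histogram(distances, bin_width):
    """Count distances in consecutive bins [0, w), [w, 2w), ..."""
    if not distances:
        return []
    n_bins = int(max(distances) // bin_width) + 1
    counts = [0] * n_bins
    for d in distances:
        counts[int(d // bin_width)] += 1
    return counts


# Hypothetical minimum distances (in pixels) for two markers.
marker_a = [2.0, 3.5, 4.0, 12.0, 14.5]    # mostly close to the border of interest
marker_b = [21.0, 22.5, 27.0, 29.9]       # farther away
hist_a = distance_histogram(marker_a, bin_width=10)
hist_b = distance_histogram(marker_b, bin_width=10)
```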

Registration and marker comparison

Another important feature aimed at marker comparison is the representation of different markers (segmented from serial slices) in an image where they are superimposed with different colors (see Figure 5).

Note that the relative rotation between the shapes of contiguous slice images can be as large as 45°, and the relative scale as large as 1.5x; moreover, the shapes of serial slices can be quite different even when their orientation and scale are apparently similar. For this reason, the first step of our superimposition task applies a multiscale-hierarchical image registration procedure, which analyzes the shape of the serial sections and finds the transformations that best align and overlap them. The transformations are determined by the analysis of gL images representing the tissue shape as follows: the image background is black (gL=0), the tissue area is gray (gL=128), and its border is white (gL=255). With this choice, the border weighs about twice as much as the tissue interior when finding the best alignment.
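The three-level shape image can be built from a binary tissue mask, as in the Python sketch below. The border definition used here (a tissue pixel with at least one 4-neighbor that is background or outside the image) is an assumption, since the text does not specify how the border is extracted.

```python
def encode_shape(tissue_mask):
    """Build the gray-level shape image: 0 background, 128 tissue, 255 border.

    A tissue pixel counts as border if any 4-neighbor is background or lies
    outside the image (this border definition is an assumption).
    """
    h, w = len(tissue_mask), len(tissue_mask[0])

    def is_tissue(y, x):
        return 0 <= y < h and 0 <= x < w and bool(tissue_mask[y][x])

    shape = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not tissue_mask[y][x]:
                continue
            neighbors = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            on_border = not all(is_tissue(ny, nx) for ny, nx in neighbors)
            shape[y][x] = 255 if on_border else 128
    return shape


# 5x5 mask with a 3x3 block of tissue in the center.
mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)]
        for y in range(5)]
shape = encode_shape(mask)
```

In this example only the central pixel is interior tissue (128); the other eight tissue pixels are border (255).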

Given a set of serial shapes, the best alignment is found by iteratively choosing one shape as the template shape, and transforming the other shapes, so that they “optimally” overlap the template. The iteration stops when all the shapes have been used as templates; this procedure ensures that each slice is aligned to all the other serial slices.

Given a template shape, the best overlap is found by a hierarchical image transformation based on consecutive optimal transformations, from the simplest to the most complex: translation (only an image displacement), rigid (translation and rotation), similarity (translation, rotation, and scale), and affine (translation, rotation, scale, and shear). To find the most suitable transformations, the algorithm applies the “step gradient descent optimization algorithm” to minimize the “mean squares image similarity metric”, a measure of shape difference computed by squaring the difference of corresponding pixels in the two shape-images and taking the mean of those squared differences. The step gradient descent optimization algorithm iteratively adjusts the transformation parameters so that the optimization follows the gradient of the “mean squares image similarity” metric toward its minimum. The optimization algorithm uses constant-length steps along the gradient between consecutive iterations until the gradient changes direction; at that point, the step length is halved.

Any optimization algorithm can get trapped in local minima; this motivates the use of the hierarchical transformation, which reduces this risk by searching for the more complex transformations only after a coarse alignment has already been obtained with the simplest ones (this reduces the number of local minima in the neighborhood of the search area).

To further avoid local minima, we apply the aforementioned algorithm in a multiscale fashion. Specifically, we first consider images at a coarse resolution (downsampled to 1/10 of the original image size) to exploit only the broader shape details for alignment, then images at mid resolution, and finally the original images, to refine the alignment by considering finer and finer shape details. Furthermore, each transformation is applied only if it increases either the correlation between the two shape-images (that is, the template shape and the shape being aligned to it) or Cohen’s kappa coefficient [9].

Figure 5 shows 3 serial slices, and the segmented markers, before and after the alignment; the reader may note that the alignment allows a trustworthy comparison of the markers’ respective localization. After observing the results of applying our method to several sets of stained serial slices, we believe that our multiscale-hierarchical alignment procedure is a necessary preliminary step for any reliable comparative evaluation of marker densities and localization in sets of serial slices.
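The metric and the step-halving rule can be sketched in Python as follows. This is a deliberately simplified illustration: the optimizer works on a single scalar parameter, whereas the registration optimizes a whole vector of transformation parameters, and the function names, starting point, and toy objective are our assumptions.

```python
def mean_squares(img_a, img_b):
    """Mean of the squared differences between corresponding pixels."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(img_a, img_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def regular_step_descent(grad, x0, step=1.0, min_step=1e-6, max_iter=1000):
    """Minimize a 1-D function given its gradient.

    Moves by a constant length against the gradient sign; whenever the
    gradient changes direction, the step length is halved.
    """
    x = x0
    prev_sign = 0
    for _ in range(max_iter):
        g = grad(x)
        if g == 0 or step < min_step:
            break
        sign = 1 if g > 0 else -1
        if prev_sign and sign != prev_sign:
            step /= 2    # the gradient flipped: the minimum was overshot
        x -= sign * step
        prev_sign = sign
    return x


metric = mean_squares([[0, 128], [255, 0]], [[0, 128], [128, 0]])

# Toy objective f(x) = (x - 3.3)^2, whose gradient is 2 * (x - 3.3).
x_min = regular_step_descent(lambda x: 2 * (x - 3.3), x0=0.0)
```

Starting from 0, the optimizer advances in unit steps until it passes 3.3, then repeatedly halves the step each time it crosses the minimum, ending very close to 3.3.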
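The acceptance test can be sketched in Python as follows. For simplicity, Cohen's kappa is computed here on binary tissue masks rather than on the three-level shape images, and the correlation values are passed in precomputed; both simplifications, as well as the function names, are our assumptions.

```python
def cohen_kappa(mask_a, mask_b):
    """Cohen's kappa between two binary masks of equal size."""
    a = [bool(v) for row in mask_a for v in row]
    b = [bool(v) for row in mask_b for v in row]
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1:
        return 1.0    # both masks are constant and identical
    return (observed - expected) / (1 - expected)


def accept_transformation(kappa_before, kappa_after, corr_before, corr_after):
    """Keep a transformation only if the correlation or kappa increases."""
    return corr_after > corr_before or kappa_after > kappa_before


template = [[0, 1, 1, 0],
            [0, 1, 1, 0]]
shifted = [[1, 1, 0, 0],
           [1, 1, 0, 0]]
aligned = [[0, 1, 1, 0],
           [0, 1, 1, 0]]
kappa_before = cohen_kappa(template, shifted)
kappa_after = cohen_kappa(template, aligned)
```

In this toy case the shifted mask agrees with the template only at chance level (kappa 0), while the aligned mask agrees perfectly (kappa 1), so the transformation would be accepted.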

MIAQuant software and its user manual are freely available as supplementary material of this article, for clinical studies, pathological research, and diagnosis.

Discussion

MIAQuant is a simple means for the estimation of clinically interesting parameters. Since it is not affected by subjective variability, it might be a powerful tool to increase sensitivity, objectivity, and efficiency in parameter estimation.

It can be adapted to the staining methods used in routine pathology practice, such as histochemistry and immunohistochemistry, and it is able to mitigate biological inconsistencies and/or technical errors in sample processing, including uneven or incomplete sections or differences in staining intensity.

MIAQuant is reliable, easy to handle, and usable even in small laboratories, since image acquisition can be performed by cameras mounted on standard microscopes, which, thanks to their low cost, are commonly used in histopathological routine even in small hospitals. Moreover, MIAQuant is flexible, since it effectively analyzes images with different formats, pixel sizes, and resolutions, thus encouraging image exchange between clinical centers. In conclusion, MIAQuant has the potential to provide valuable assistance to pathologists in their daily practice, substantially enhancing the efficiency and accuracy of diagnostic processes, to the benefit of the patient.

We are presently testing MIAQuant in various clinical oncological studies, including the definition of myeloid and immune cell tissue scores in metastatic melanoma and hepatocellular carcinoma. Another important research application of MIAQuant concerns the study of the immunological infiltrate in human arterial plaques and its role in lympho-angiogenesis. The results of these studies, which confirm the applicability of MIAQuant in the patient setting, will be published elsewhere in the near future. All the aforementioned applications show that MIAQuant is a promising novel image analysis tool that might be successfully adapted to several medical research and clinical studies.

MIAQuant Software, its user manual, and further developments, are available online at www.consorziomia.org.

References

1. L Rivoltini, C Chiodoni, P Squarcina, M Tortoreto, A Villa, B Vergani. TNF-related apoptosis-inducing ligand (TRAIL)-armed exosomes deliver proapoptotic signals to tumor site. Clin Cancer Res 2016;22:3499-512.

2. RC Gonzalez, RE Woods. Digital image processing. 3rd ed. Prentice Hall, 2008.

3. N Otsu. A threshold selection method from gray-level histograms. IEEE Trans Systems Man Cybernetics 1979;9:62-6.

4. T Cover, P Hart. Nearest neighbor pattern classification. IEEE Trans Information Theory 1967;13:21-7.

5. C Bishop. Pattern recognition and machine learning. Springer-Verlag, Berlin: 2006.

6. W Pratt. Digital image processing: PIKS Inside. 3rd ed. J. Wiley & Sons: 2001.

7. S Geisser. Predictive inference. Chapman and Hall, New York: 1993.

8. E Casiraghi, S Ferraro, M Franchin, A Villa, B Vergani, M Tozzi. Semiautomatic analysis for evaluation of carotid plaques neovascularization. Proceedings 15th Annual Meet. Società Italiana di Chirurgia Vascolare ed Endovascolare, Rome, 2016.

9. J Cohen. A coefficient of agreement for nominal scales. Educ Psychol Meas 1960;20:37-46.

Figure 1.

Sample image and its segmentation computed by MIAQuant. A) CD163 staining of human melanoma (in red). B) Segmented signal. Note that a precise segmentation result has been computed, even though the image contains several pigmented areas.

Figure 2.

Segmentations computed by MIAQuant on images with differing characteristics. Sample images in the left column (A,C,E,G,I) and segmentation results in the right column (B,D,F,H,J). MIAQuant represents the segmented markers both as binary images (B and D) and in their original RGB color (F, H, J); the latter choice highlights the specific hue characterizing each of the different marker areas. The sample images were acquired with different instruments: Aperio Scanscope Cs (A), Olympus BX63 equipped with DP80 camera and software cellSens (C,E), and Nikon Eclipse E600 microscope equipped with DS-Fi1 camera and software Nis-Elements AR3.10 (G-I). A,B) Immunostaining of atherosclerotic plaque for podoplanin (D2-40 Abs), stained with alkaline phosphatase in red. C-F) B-cell lymphoma xenograft model immunostained with Ki67 Abs, stained with peroxidase in brown; the image in E was acquired at a much higher magnification than that in C. G,H) Histochemical staining with Alcian blue of normal colon tissue. I,J) Histochemical staining with Oil Red of cultured cells.

Figure 3.

MIAQuant tests on images containing problematic features. Sample images in the left panels (A,C,E) and segmentation results in the right panels (B,D,F). A) Immunostaining with CD15 Abs - Red alkaline phosphatase of melanoma slice; the image contains tissue folds and brownish red blood cells that do not affect the segmentation result (B). C,E) Serial melanoma slices immunostained with CD163 Abs; in (C) the Ab is stained with peroxidase in brown, and in (E) with alkaline phosphatase in red. The segmented results (D,F) show that the density estimates are comparable, and corresponding segmented cells are mostly overlapping.

Figure 4.

MIAQuant tests on serial slices of metastatic melanoma tissue. The serial slices were immunostained for, from top to bottom: CD8, CD3, CD15, CD14, CD163, CD66b. The left column shows the whole image, acquired with Aperio Scanscope CS. In the top row, the black square shows both the location and the size of the detail shown in the central column. Right column: segmented results. Note that the software segments each marker precisely, despite the highly variable background. The comparison between the size of the detail and the size of the whole slice highlights the usefulness of an automatic system, which avoids a manual counting procedure that would necessarily be too time-consuming.

Figure 5.

Testing of image alignment with MIAQuant. In the top row, three serial tissue sections sampled from metastatic melanoma and immunostained with Abs against CD3, CD163, and CD68. MIAQuant overlays the corresponding segmentation results, as shown in the bottom row. Note that the comparison of the markers’ respective localization is more reliable and consistent after the application of the multiscale-hierarchical alignment procedure, even when the original images (top row) are similar in size and orientation. The alignment procedure produces informative images that allow a proper visual comparison of the markers’ respective localization.


Copyright (c) 2017 Elena Casiraghi, Mara Cossa, Veronica Huber, Licia Rivoltini, Matteo Tozzi, Antonello Villa, Barbara Vergani

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.