Histopathology images vary widely in color even when the same staining chemicals — hematoxylin and eosin — are used

Soccer2020Live
11 min read · Dec 3, 2020

Introduction

Metastasis is the spread of cancer cells to new areas of the body, often by way of the lymphatic system or bloodstream. A metastatic cancer, or a metastatic tumor, is one that has spread from the primary site of origin, or where it started, into different areas of the body.

In order to identify metastatic tumors in histopathology slides, a biopsy is often done to remove tissue samples from affected sites of the body. A pathologist then examines the tissue samples under a microscope and decides whether or not a metastatic tumor is present in them. If a metastatic tumor is present, the pathologist performs further investigations to find out how deadly the tumor is and what grade of metastasis it should be assigned. This has to be done manually and is a time-consuming process; moreover, the decision depends on the expertise of the pathologist and the quality of the microscope in use.

Photo credit: Enso Discoveries

Therefore, advanced techniques in deep learning such as convolutional neural networks could be of great help in automatically detecting, locating, and grading tumors in diseased tissues of the body. In order to exploit the full potential of these techniques, one could build a pipeline using a massive amount of tissue histopathology data that has been evaluated by board-certified pathologists and train an ensemble of convolutional neural networks on it.

Aim

The aim of this study is to build a metastatic tumor classification system that can tell if there are metastatic tumors in the center 32x32-pixel region of 96x96-pixel histopathology slides of lymphatic node sections of the human body.

Dataset

The training dataset used in this study is a subset of the PatchCamelyon dataset [1] containing 220,025 96x96-pixel color images extracted from histopathology slides of lymphatic node sections of the body. Each image is assigned a positive or negative label.

Samples of the images


A positive label indicates that the center 32x32-pixel region of a slide contains at least one pixel of metastatic tumor tissue; tumors in the area outside this region do not influence the label.

Methods

A. Image Preprocessing

As a first step, I remove images that are almost totally black (very low maximum pixel intensity) or almost totally white (very high minimum pixel intensity). This is done by setting intensity thresholds of 10 and 245 respectively: an image whose maximum pixel value is below 10 is classified as almost totally black, and an image whose minimum pixel value is above 245 as almost totally white. In total, 6 images fall into the almost totally white category while only 1 image falls into the almost totally black category.
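The filtering rule above can be expressed as a small check; the function name and the array shape are illustrative:

```python
import numpy as np

def is_outlier(img, black_thresh=10, white_thresh=245):
    """Flag images that are almost totally black or almost totally white.
    img: (96, 96, 3) uint8 RGB array."""
    return img.max() < black_thresh or img.min() > white_thresh

# A nearly white patch is flagged; a typical stained patch is kept
white = np.full((96, 96, 3), 250, dtype=np.uint8)
stained = np.random.randint(60, 200, size=(96, 96, 3), dtype=np.uint8)
print(is_outlier(white), is_outlier(stained))  # True False
```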

Samples of the ‘almost totally white’ images

The motivation behind this preprocessing is that it is possible some of the images in the training dataset were captured from slides with unstained tissues or even no tissues at all and may serve as outliers and ‘bad images’ during training.

B. Stain Normalization and Augmentation

Histopathology images vary considerably in color even when the same staining protocol is used. This variability could easily pose a big challenge when working with the images algorithmically, especially if the algorithm in use has a high pattern recognition capacity, such as deep convolutional neural networks.

To mitigate this challenge, I adopt two methods:

  1. Normalizing the images before feeding them into the training pipeline (stain normalization).
  2. Making the pattern recognition system — convolutional neural network — more robust by artificially increasing the stain variability of its input (stain augmentation).

Both methods estimate the stain intensity of the two standard dyes in the digital microscopy slide and modify it either to represent a standardized staining (in the case of normalization) or to represent different stains (in the case of augmentation).

Several other methods of stain normalization and augmentation exist. These methods are seen in the works of Macenko et al. in A Method For Normalizing Histology Slides For Quantitative Analysis [2] and Yuan et al. in Neural Stain Normalization and Unsupervised Classification of Cell Nuclei in Histopathological Breast Cancer Images [3].
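As a rough illustration of the Macenko approach [2], the following NumPy sketch estimates the two stain directions from the optical-density distribution of an image and maps the image onto reference stains. The reference matrix and maximum concentrations are commonly used illustrative values, and this is a simplified sketch, not the exact pipeline used in this study:

```python
import numpy as np

# Reference H&E stain matrix and maximum concentrations: commonly used
# values in open-source Macenko implementations, illustrative only.
HE_REF = np.array([[0.5626, 0.2159],
                   [0.7201, 0.8012],
                   [0.4062, 0.5581]])
MAX_C_REF = np.array([1.9705, 1.0308])

def macenko_normalize(img, io=240, alpha=1, beta=0.15):
    """Simplified Macenko stain normalization of an RGB uint8 image."""
    h, w, _ = img.shape
    # convert to optical density and drop near-transparent pixels
    od = -np.log((img.reshape(-1, 3).astype(float) + 1) / io)
    od_fg = od[np.all(od > beta, axis=1)]
    # plane spanned by the two dominant eigenvectors of the OD covariance
    _, eigvecs = np.linalg.eigh(np.cov(od_fg.T))
    plane = eigvecs[:, 1:]
    # extreme angles within that plane give the two stain directions
    phi = np.arctan2(od_fg @ plane[:, 1], od_fg @ plane[:, 0])
    lo, hi = np.percentile(phi, alpha), np.percentile(phi, 100 - alpha)
    v1 = plane @ np.array([np.cos(lo), np.sin(lo)])
    v2 = plane @ np.array([np.cos(hi), np.sin(hi)])
    he = np.column_stack([v1, v2]) if v1[0] > v2[0] else np.column_stack([v2, v1])
    # per-pixel stain concentrations, rescaled to the reference maxima
    conc = np.linalg.lstsq(he, od.T, rcond=None)[0]
    conc *= (MAX_C_REF / np.percentile(conc, 99, axis=1))[:, None]
    out = io * np.exp(-HE_REF @ conc)
    return np.clip(out.T, 0, 255).astype(np.uint8).reshape(h, w, 3)
```

Stain augmentation follows the same decomposition but perturbs the estimated concentrations instead of mapping them to a fixed reference.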

C. Proposed Classification System

In this study, I propose a metastatic tumor classification system that consists of an ensemble of convolutional neural networks (CNN).

D. Component Details

  • CNN Architectures: The CNN architectures used in this study belong to the families of Residual Neural Networks [4] and Densely Connected Convolutional Networks [5] that have been pretrained on the ImageNet dataset. ResNet50, ResNet101, DenseNet169, and DenseNet201 are these CNN architectures.
  • Fully Connected Layers: The fully connected head comprises two layers with 512 and 256 neurons respectively. Each layer contains a Rectified Linear Unit (ReLU) activation, batch normalization, and dropout with p = 0.5. The head is joined to the output of the last convolutional layer of the CNNs after global max pooling, global average pooling, and flattening have been applied and the resulting outputs concatenated.
  • Loss Function: A fundamental way of training a neural network is to optimize the weights of its neurons so that the network's output fits the ground truth data as closely as possible. Loss functions make this possible. The loss function used in this study is the binary cross-entropy loss.

Binary Cross-Entropy Loss
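For N samples with ground-truth labels yᵢ ∈ {0, 1} and predicted probabilities ŷᵢ, the binary cross-entropy loss takes the standard form:

```latex
\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \Big[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \Big]
```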

  • Optimizer: The choice of optimizer for a neural network is important, as it influences the process of optimizing the weights of the neurons and the performance of the network at large. The Adam optimizer is used in this study. The optimizer parameters are initialized with the following settings: lr=0.001, beta_1=0.9, beta_2=0.999, decay=0.0, epsilon=None and amsgrad=False.

Adam update equations
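With gradient gₜ at step t, learning rate α, and the beta_1, beta_2, and epsilon settings listed above, the Adam updates are:

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t \\
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2 \\
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t} \\
\theta_{t+1} = \theta_t - \frac{\alpha \, \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
```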

  • Learning Rate: The learning rate affects the training of neural networks a great deal. A fixed learning rate of 0.00007 is used during training (overriding the optimizer's initial lr of 0.001), as it gives good model performance.
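The head-input construction described under Fully Connected Layers (global max pooling, global average pooling, and flattening of the last convolutional feature map, then concatenation) can be sketched in NumPy. The 3x3x2048 feature-map shape is what a ResNet50 backbone produces for a 96x96 input; the function name is illustrative:

```python
import numpy as np

def head_input(feature_map):
    """Concatenate global max pool, global average pool, and flatten
    of a CNN feature map of shape (H, W, C)."""
    gmp = feature_map.max(axis=(0, 1))    # (C,)  global max pooling
    gap = feature_map.mean(axis=(0, 1))   # (C,)  global average pooling
    flat = feature_map.reshape(-1)        # (H*W*C,)  flatten
    return np.concatenate([gmp, gap, flat])

fm = np.random.rand(3, 3, 2048)  # last conv output of ResNet50 for a 96x96 input
print(head_input(fm).shape)      # (22528,) = 2048 + 2048 + 3*3*2048
```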

Training

The dataset is split into a training set, a validation set, and a test set. The neural networks are then trained on the training set for 15 epochs with a batch size of 32 and validated on the validation set.

Loss and AUC curves of the Training set and Validation set

Prediction

After training, prediction is performed on the test set by using a 5-step test-time augmentation.
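A minimal sketch of this idea, assuming a hypothetical `model_predict` function that maps one image to a probability: each of the 5 steps applies random flips and rotations before predicting, and the predictions are averaged. The exact augmentations used in the study may differ:

```python
import numpy as np

def tta_predict(model_predict, image, steps=5, seed=0):
    """Test-time augmentation: average predictions over `steps`
    randomly flipped/rotated versions of the input image."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(steps):
        aug = image
        if rng.random() < 0.5:
            aug = np.fliplr(aug)
        if rng.random() < 0.5:
            aug = np.flipud(aug)
        aug = np.rot90(aug, k=rng.integers(0, 4))  # 0-3 quarter turns
        preds.append(model_predict(aug))
    return float(np.mean(preds))

img = np.random.default_rng(1).random((96, 96, 3))
print(tta_predict(lambda x: x.mean(), img))  # toy stand-in for a model
```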

Ensembling

Finally, a large ensemble of the neural networks is created. Consider configurations C = {c₁, …, cₙ}, where each configuration uses the same hyperparameters (e.g. test-time augmentation steps) but a different CNN architecture (e.g. ResNet50). Each configuration cᵢ consists of m = 1 trained model, and predictions yᵢ are obtained for each cᵢ. Ensembling is performed by computing the mean of the predictions y = {y₁, …, yₙ}.
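The mean-ensembling step can be sketched as follows; the architecture names match the study, but the prediction values are purely illustrative:

```python
import numpy as np

# Hypothetical per-architecture prediction arrays over the same test images
preds = {
    "ResNet50":    np.array([0.91, 0.12, 0.55]),
    "ResNet101":   np.array([0.88, 0.20, 0.47]),
    "DenseNet169": np.array([0.95, 0.08, 0.60]),
    "DenseNet201": np.array([0.90, 0.15, 0.52]),
}
# element-wise mean across configurations gives the ensemble prediction
ensemble = np.mean(list(preds.values()), axis=0)
print(ensemble)
```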

Results

To evaluate the effectiveness of the metastatic tumor classification system, 3 classical metrics are used: Area Under the Receiver Operating Characteristic Curve (AUC), Sensitivity and Specificity.

Formulas of Sensitivity and Specificity

  • Sensitivity is a measure of the ability of the metastatic tumor classification system to correctly identify histopathology slides with a metastatic tumor in the center 32x32-pixel region. It is also known as Recall.
  • Specificity, on the other hand, is a measure of the ability of the metastatic tumor classification system to correctly identify histopathology slides that have no metastatic tumor in the center 32x32-pixel region.
  • The Area Under the Receiver Operating Characteristic Curve (AUC) is equal to the probability that the metastatic tumor classification system will rank a randomly chosen positive instance higher than a randomly chosen negative one. The best AUC score is 1 (or 100%).
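Both formulas follow directly from confusion-matrix counts (true/false positives and negatives); the counts below are illustrative, not the study's results:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative confusion-matrix counts only
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(sens, spec)  # 0.9 0.8
```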

The results generated by the metastatic tumor classification system are as follows:

Scores

Confusion Matrix

Receiver Operating Characteristic Curve

Demystifying the ‘Black Box’

In the real world, deep learning models are often treated as ‘black box’ methods, and machine learning engineers frequently have to ask themselves questions such as:

  1. Where in the input images is the neural network ‘looking’?
  2. Which series of neurons activated in the forward-pass during inference/prediction?
  3. How did the network arrive at its final output?
  4. Can the decisions of the deep learning model be trusted?

To answer these questions, I adopt Gradient-weighted Class Activation Mapping (Grad-CAM) which was proposed by Selvaraju et al. in their paper Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization [6].

From arXiv: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization (Selvaraju et al., 2016) — “Grad-CAM uses the gradients of any target concept (say logits for “dog” or even a caption), flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept.”
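At its core, Grad-CAM spatially averages the gradients of the class score to weight the activation channels of the last convolutional layer. A minimal NumPy sketch, assuming those activations and gradients have already been extracted from the network:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM localization map for one image.
    activations, gradients: (H, W, K) arrays from the last conv layer,
    where gradients are d(class score)/d(activations)."""
    weights = gradients.mean(axis=(0, 1))       # alpha_k: spatial mean, shape (K,)
    cam = np.maximum(activations @ weights, 0)  # ReLU of weighted channel sum
    return cam / cam.max() if cam.max() > 0 else cam  # normalize to [0, 1]

acts = np.random.rand(6, 6, 512)   # illustrative shapes
grads = np.random.rand(6, 6, 512)
print(grad_cam(acts, grads).shape)  # (6, 6) coarse heatmap
```

The coarse (H, W) map is then upsampled to the input resolution and overlaid on the slide to produce heatmaps like the one shown below.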

A slide together with a heatmap showing its visual explanations from the networks via gradient-based localization

The heatmap on the right-hand side above is a product of the application of Grad-CAM on the image on its left-hand side. Since tumors are concentrated in the center 32x32-pixel region, we see that the neural network is actually ‘looking’ at this region in the histopathology image.

Conclusion

Although computers will not replace pathologists anytime soon, properly designed AI-based tools hold great potential to increase workflow efficiency and diagnostic accuracy in the practice of pathology. Recent trends, such as data augmentation, crowd-sourcing to generate annotated datasets, and unsupervised learning with molecular and/or clinical outcomes versus human diagnoses as a source of ground truth, are eliminating the direct role for pathologists in algorithm development.

Proper integration of AI-based systems into anatomical pathology practice will necessarily require fully digital imaging platforms, an overhaul of legacy information technology infrastructures, modification of laboratory/pathologist workflows, appropriate reimbursement/cost-offsetting models, and ultimately, active participation of pathologists to encourage buy-in and oversight [7].

About

Oluwafemi Ogundare is a 3rd year Medical Student at the University of Ibadan, Nigeria. He is passionate about Artificial Intelligence, Genomics, and Bioinformatics due to their potential to usher mankind into the era of Precision/Personalized medicine. He spends his free days improving his Machine Learning skills and learning new concepts in Genomics and Bioinformatics. He has worked on a number of health-related Machine Learning projects including Tumor Segmentation in 3D MRIs of the Brain, Pneumonia Detection in Chest X-Ray Images using Computer Vision, and Prediction of Patient Survival Rates using Random Forests. You can send him an email at femiogundare001@gmail.com or reach him on LinkedIn at https://www.linkedin.com/in/oluwafemi-ogundare-65b6a0185/.

The notebook containing the code used in this study can be found on GitHub.

References

[1] Veeling, B.S., Linmans, J., Winkens, J., Cohen, T. & Welling, M. (2018). Rotation Equivariant CNNs for Digital Pathology.

[2] Macenko, M., Niethammer, M., Marron, J.S. & Borland, D. (2009). A Method for Normalizing Histology Slides for Quantitative Analysis.

[3] Yuan, E. & Suh, J. (2018). Neural Stain Normalization and Unsupervised Classification of Cell Nuclei in Histopathological Breast Cancer Images.

[4] He, K., Zhang, X., Ren, S. & Sun, J. (2015). Deep Residual Learning for Image Recognition.

[5] Huang, G., Liu, Z., van der Maaten, L. & Weinberger, K.Q. (2016). Densely Connected Convolutional Networks.

[6] Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D. & Batra, D. (2016). Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.

[7] Cheng, J.Y., Abel, J.T., Balis, U.G.J., McClintock, D.S. & Pantanowitz, L. (2020). Challenges in the Development, Deployment & Regulation of Artificial Intelligence (AI) in Anatomical Pathology.
