Preface

This document contains four modules that should serve both as a guide for the interested reader who wishes to learn common methods of quantifying fluorescence microscopy and as a reference for users of the SMART center. It represents the sum of my work while I was supported to assist the SMART center in developing didactic materials during my FFGSI appointment in Winter 2020. I hope to continue working on it to share what I have learnt about fluorescence image analysis during my PhD, and aim to create a one-stop resource for other beginners.

Ameya Jalihal

January 2020


Introduction and how to use this document

Microscopy is and has been a mainstay of modern biology. With the discovery of fluorescent proteins, genetic fusions have completely changed how we see and think about biology at microscopic scales. Advances in microscopy mean that high resolution imaging has become ubiquitous. Optical tricks allow us to get closer and closer to the diffraction limit, and post-processing tricks allow us to surpass it altogether. The result of these developments is that the intricate and complex structures that underlie life are no longer schematics on a page, and can now be visualized in their native contexts, engaged in their functions, in real time. If you are reading this document, you probably have a biological system in mind that you want to image, or have fluorescence images and are wondering what to do with them. This handbook will help you think about how you can quantify your images to learn more about your biological system.

Quantification of fluorescence images frequently boils down to segmenting an image to obtain features, counting them, measuring their areas or brightness, or tracking them over time. The demands of experimental rigor often necessitate analyzing large numbers of images for the sake of statistics, and in these cases it is often the most effective use of the researcher’s time to automate these operations. Automation of image processing operations, however, can become complicated. The operations themselves tend to be conceptually simple, and are usually easy to perform by hand because the human brain is incredibly adept at discerning patterns in noisy, grainy or blurry images. However, they can be inordinately complex to automate for the same reason: it is notoriously difficult to get an algorithm to “see” the features you are interested in. This handbook should equip you with foundational ideas related to fluorescence images so that you can start to create your own analysis pipelines. It provides several examples and presents routines written in Matlab. At the end is a list of other resources that do similar things using ImageJ, along with related reading.

If you plan to start performing fluorescence microscopy at the SMART center, this guide should provide you with the tools you will need to think about extracting data from your images. I recommend you start thinking about quantification in parallel to designing your experiment, because the best data comes from the best images, and computational manipulation can’t help you if your images are noisy, low resolution or have too many artefacts to begin with. I highly recommend Lee and Kitaoka’s guide to rigorous and reproducible design of fluorescence microscopy experiments listed at the end of this document.

All images used in this handbook are obtained from the Cell Image Library, an open-source repository of biological images.

Part 1: The fluorescence image and pixel intensity

A digital image is a two-dimensional matrix of picture elements, or pixels. Each pixel in a gray-scale image holds a single number that corresponds to its brightness. In a 1-bit, or binary, image, each pixel is represented by 1 bit of information and can take two values: 0 (black) or 1 (white). The number of gray levels, or shades of gray between black and white, is determined by the data type of the image. So 8- or 16-bit images can have 256 or 65,536 distinct gray levels, respectively. The more gray levels, the smoother a gradient from black to white appears. Each pixel in a three-color image, typically composed of red, green, and blue channels, holds three numerical values, each corresponding to the brightness of one channel.

In a fluorescence image, the brightness of each pixel correlates with the fluorescence intensity of the sample. Thus the fluorescence image can be thought of as a map of fluorescence across a microscopic field of view. Darker pixels correspond to regions in the sample that show little or no fluorescence, whereas brighter ones correspond to greater fluorescence intensity, which usually arises from a high local density of fluorophores. Quantifying the brightness of a fluorescence image can therefore give useful information about the amount of fluorescent material present in the sample and its spatial distribution. The rest of this section deals with some common operations and manipulations that can be performed to extract useful information from fluorescence images.

Image operations and representations

Digital images are matrices, and can be manipulated in all the same ways matrices can be manipulated. A numerically oriented programming language such as Matlab/Octave or Scilab is therefore particularly suited for image operations. Some common operations are addition, subtraction, multiplication, division, rotation, transposition and various convolution and filtering operations. The first four act at the level of individual pixels, i.e., they produce a new matrix in which each pixel is the sum, difference, product or quotient of the corresponding pixels in two or more images. Rotation and transposition operate on the entire image matrix: they keep the pixel values the same but change their coordinates.
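As a brief illustration, here is a minimal sketch of both kinds of operation in Matlab, assuming two hypothetical grayscale images 'sample.tif' and 'background.tif' of the same size:

A = imread('sample.tif');
B = imread('background.tif');
diffImage = A - B;            % pixel-wise subtraction, e.g. removing a background image
sumImage  = A + B;            % pixel-wise addition
rotated   = imrotate(A, 90);  % whole-matrix operation: pixel values unchanged, coordinates change
flipped   = A';               % transposition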

2D heatmaps

The most common representation of a fluorescence intensity image is the 2-dimensional heatmap, where the color or brightness of each grid point on a Cartesian plane corresponds to the pixel value at that location. The recorded pixel value is related only to the number of photons collected by the camera pixel, which in turn is related to the fluorescence emission from the sample. This number carries no intrinsic information about the color of the incident light. Fluorescence images are therefore inherently grayscale images, and meta-information about the excitation wavelength, fluorophore emission characteristics and optical components such as emission filters is required to interpret the meaning of the pixel value. This also means that the 2D representation of a fluorescence image is not tied to these experimental considerations, and various lookup tables (LUTs) can be used to visualize images using pseudocolors that provide the best contrast. ImageJ offers an array of LUT choices (https://imagej.net/Color_Image_Processing). Matlab has several colormaps that can be used in conjunction with the imshow() function.

I = imread('filename.tif');   % load a grayscale image
imshow(I);                    % display with the default gray colormap
colormap('hot');              % apply a pseudocolor lookup table to the current axes

2D maps can also be used to represent multi-color images. Composite images in which the RGB/CMYK channels each represent a different fluorophore can be used to visualize the relative localization of multiple components in a sample. Such images can be visually inspected for colocalization of features by the apparent color of pixels in the composite image. Commonly used color combinations for colocalization are:

Red + Green = Yellow

Magenta + Green = White

Blue + Yellow = White, etc.

While useful for quick visual inspection, composite images that rely on red-green pseudocolors are discouraged because colorblind individuals cannot distinguish reds from greens. It is best to include the individual channels in addition to the composite image so that a viewer unfamiliar with the images can judge spatial overlap independently of the pseudocolor choice.

Line plots

While 2D heatmaps are useful to visually observe differences in localization of the fluorophore across a 2D field of view, it is sometimes necessary to visualize intensity variations across a single spatial or temporal dimension.

One way to plot intensity profiles is to plot the intensity along one of the spatial dimensions of an image, or along the temporal dimension of a Z-stack or movie, for example by using Matlab's indexing operations.

I = imread('filename.tif');   % 'I' avoids shadowing Matlab's built-in image() function
row = 100;                    % row along which to plot the intensity profile
plot(I(row,:));               % intensity along one spatial dimension
xlabel('Pixels');
ylabel('Intensity');

The intensity profile along one spatial and one temporal dimension is called a kymograph, and is useful to visualize directional movement of a blob over time.
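A minimal sketch of a kymograph follows, assuming 'movie.tif' is a hypothetical multi-frame grayscale TIFF and the feature of interest moves along row 100:

info = imfinfo('movie.tif');
nFrames = numel(info);
row = 100;
kymo = zeros(nFrames, info(1).Width);
for t = 1:nFrames
    frame = imread('movie.tif', t);   % read frame t of the multi-page TIFF
    kymo(t,:) = frame(row, :);        % one spatial line per time point
end
imagesc(kymo); xlabel('Position, px'); ylabel('Frame');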

Noise

Any contribution to a pixel value that is not directly a readout of the “true” fluorescence intensity is noise. “Noisy” images can appear grainy or fuzzy when the pixel values of the signal are too close to the pixel values of the noise. The signal-to-noise ratio is a good way to characterize how “sharp” an image appears, and is also a useful quantitative metric of image quality and signal brightness. There are two main categories of noise in a digital fluorescence image. Photon (shot) noise arises from the emission and detection of fluorescence from the sample; it follows a Poisson distribution, so its standard deviation grows with the brightness of the signal. Readout noise originates in the camera electronics that convert the collected photons into digital pixel values; it is approximately Gaussian, with a standard deviation that is fixed for a given camera and readout setting. Knowing the functional forms of these components allows us to estimate the relative contribution of each to the overall noise, subtract the background, and define intensity thresholds that separate true signal in binary images. Some things to remember about pixel noise are that:

  1. It is random.
  2. It is uncorrelated across pixels, i.e., pixel noise is independent of the value in an adjacent pixel.
  3. It can deviate both positively and negatively about a mean value.

Estimating noise

The signal-to-noise ratio (SNR) is a commonly used metric to estimate the brightness of the signal relative to the background noise. Estimating the SNR is useful when deciding whether a given fluorophore is an appropriate choice for labeling, when picking an intensity threshold for feature detection, and in many other instances where quantification of fluorescence becomes important. The most common method of estimating the SNR is as follows. First, define the signal in your image: use thresholding and binary operations, or manually drawn regions of interest (ROIs), to delineate the feature. The mean of the pixel intensities within this region represents the signal. If the selected feature, such as a cell, nucleus or other blob, is relatively well separated from other features in the field of view, an annular region extending 3-10 pixels beyond the feature can be used to estimate the background; the mean of the pixel intensities in this region is the noise. The SNR is the ratio of these two numbers. In general, an SNR > 3 is required for the signal to be easily distinguished from the background, and the higher the SNR, the better.
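A minimal sketch of this SNR estimate, assuming 'I' is a grayscale image and 'bw' is a hypothetical binary mask of a single feature of interest:

ring = imdilate(bw, strel('disk',10)) & ~imdilate(bw, strel('disk',3));   % annulus 3-10 px outside the feature
signal = mean(I(bw));     % mean intensity inside the feature
noise  = mean(I(ring));   % mean intensity in the surrounding annulus
SNR = signal / noise;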

Artefacts: Cross talk and cross excitation

While thinking about the relationship between pixel value and fluorescence intensity, it is important to consider the various sources of variation and light contamination in an experiment. Cross-excitation occurs when the sample contains a second fluorophore whose excitation spectrum overlaps with the wavelengths used to excite the first. Unless the emission from this second fluorophore is removed using appropriate dichroic mirrors and emission filters, it bleeds through into the image and contributes artefacts. Common sample-independent sources of artefacts are misaligned lasers and dirt in the light path, which can appear as bubbles, diffraction patterns or non-uniform illumination. Ultimately, careful experimental design, appropriate controls and careful data acquisition can minimize undesirable artefacts in the acquired image. However, some image features are impossible to eliminate because of the inherent limitations of the optics. The following section describes methods to estimate and correct non-uniform background.

Background correction

Biological fluorescence images are always dirty. Samples tend to be cloudy, and the growth media they are in and the glass they are imaged on contribute background fluorescence that can produce artefacts in the desired image or decrease the SNR. One common way to clean up an image is to subtract an estimate of the local background, for example with ImageJ’s “rolling ball” background subtraction, or by subtracting the mean or median pixel intensity within a circular neighborhood of specified radius from each pixel. In Matlab, a local-mean background estimate can be generated with a disk filter and subtracted from the image:

I = imread('filename.tif');
h = fspecial('disk',10);                 % circular averaging kernel, 10 px radius
background = imfilter(I,h,'replicate');  % local mean as a background estimate
corrected = I - background;              % subtract the background from the image
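An alternative sketch uses morphological top-hat filtering, which removes background structures larger than the structuring element; the 15-pixel radius here is an assumption to be tuned to the size of your features:

I = imread('filename.tif');
corrected = imtophat(I, strel('disk',15));   % subtracts a morphological background estimate
imshowpair(I, corrected, 'montage');         % compare raw and background-corrected images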

Part 2: Spot and other feature detection

The fluorescently tagged molecular species will often dictate what the feature of interest in the image is: DAPI stains nuclear DNA, phalloidin stains actin filaments, and immunofluorescence against LAMP1 stains lysosomes. Each of these appears as a distinct feature in an image and requires a slightly different method for detection. Once these features are extracted, they can be quantified to measure number, size, fluorescence intensity or position relative to another feature, e.g. the number of nuclei, the length of filaments, or the subcellular localization of lysosomes.

The quickest way to obtain such statistics is to manually draw an ROI around the visible feature using the freehand selection tool in ImageJ. While this method is neither rigorous nor time-efficient for large numbers of images, it can nevertheless provide useful information about preliminary experiments.

Algorithmic methods for feature detection fall into three categories:

  1. Intensity threshold-based
  2. Intensity variation-based
  3. Machine-learning/computer vision-based

Machine-learning based feature detection, while established and robust, often requires very large data sets to train networks on. The December 2019 Nature Collection on deep learning in microscopy and the introduction by von Chamier et al 2019 are good places to start exploring these ideas.

The two pixel intensity-based methods, however, are straightforward to implement on a wide range of fluorescence images and can yield highly precise quantification. Some common pipelines for such analyses are described below.

Intensity threshold-based methods

The basic premise of using intensity thresholds is that the feature of interest can be identified by its distinct range of fluorescence intensities. The threshold is the intensity value that reliably separates pixels that belong to the feature from those that correspond to the background. In a typical fluorescence image, a histogram of pixel intensities has a peak at low intensities, representing the background, and a tail at higher intensities, representing the signal (Figure 1).
Figure 1: A fluorescence image of nuclei1 stained with DAPI (left) and a histogram of pixel intensities (right).
Let’s consider the image of nuclei in Figure 1. The task at hand is to pick an intensity value such that as many true nuclei as possible are selected without including too many of the “dark” background pixels. This yields a binary image representing the operation intensity > threshold, where pixels above the threshold are assigned a value of 1 (“true”) and the rest are assigned a value of 0 (“false”). The thresholded pixels form blobs that represent nuclei (Figure 2).
Figure 2. Thresholding operations produce binary images.

ImageJ’s Threshold tool offers several algorithms that try to estimate the background peak. In Matlab, the multithresh() function uses Otsu’s method to determine thresholds.
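A minimal sketch, assuming a grayscale image like the one in Figure 1:

I = imread('nuclei.tif');
t = multithresh(I);          % Otsu's method estimates a threshold from the histogram
imhist(I);                   % pixel intensity histogram (cf. Figure 1, right)
hold on; xline(double(t), 'r'); hold off;   % mark the chosen threshold
bw = I > t;                  % binary image, as in Figure 2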

Here are some considerations regarding thresholding operations for feature detection:

  1. Depending on the brightness of the signal, it may or may not be possible to define a threshold that separates all true signal from the background. In such cases the objective of thresholding should be to minimize false positives rather than false negatives.
  2. Filters are your friend! Use them to pre-process raw images, for example to smooth out salt-and-pepper noise or speckles (Gaussian or median filters) and to remove non-uniform background (top-hat); see the sketch after this list. Matlab’s imfilter and fspecial functions provide several standard kernels that are useful for image pre-processing.
  3. The regions obtained from thresholding are only provisional. “Dilation” and “erosion” operations can be used to grow or shrink regions, and the watershed transform can be used to split up large blobs. Ultimately it is up to the researcher to interpret the blobs that thresholding reveals in terms of the underlying biology.
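A minimal sketch of pre-processing followed by morphological refinement, assuming the 'nuclei.tif' image from above; the kernel and structuring-element sizes are assumptions to be tuned to your features:

I  = imread('nuclei.tif');
Ig = imfilter(I, fspecial('gaussian', 7, 2));    % Gaussian smoothing to suppress speckle
It = imtophat(Ig, strel('disk', 50));            % top-hat to flatten non-uniform background
bw = It > multithresh(It);                       % threshold the pre-processed image
bw = imerode(imdilate(bw, strel('disk',1)), strel('disk',1));  % dilate then erode to close small gaps
imshow(bw);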

Intensity variation-based methods

While thresholds are useful to detect features that look like blobs, they are less useful for detecting the boundaries of blobs. A straightforward way of determining edges or boundaries is to use a convolution operation that looks for gradients in intensity: regions where the pixel intensity increases or decreases rapidly. At its core, edge detection is therefore equivalent to taking the spatial derivative of the image intensity. The Matlab edge() function is a simple way of obtaining edges (Figure 3).
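A minimal sketch comparing two edge detectors, assuming a grayscale image such as 'nuclei.tif':

I  = imread('nuclei.tif');
E1 = edge(I, 'Sobel');    % simple gradient-based detector
E2 = edge(I, 'Canny');    % gradient plus smoothing and hysteresis thresholding
imshowpair(E1, E2, 'montage');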

Figure 3. Edge detection using two different methods.

As with threshold-based blob detection, pre-processing a raw image can improve the quality of edge detection.

An extreme case in which edge detection is useful is spot detection. Diffraction-limited imaging, such as single-molecule imaging, produces images in which the features of interest are bright spots on a dark background. The Laplacian of Gaussian (LoG) or Mexican hat filter amplifies sharp local variations in intensity, resulting in a “sharper” image in regions of bright signal.
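A minimal sketch of LoG filtering for spot enhancement, assuming a hypothetical image 'spots.tif'; the kernel size and sigma are assumptions to be matched to the spot size:

I    = imread('spots.tif');
h    = fspecial('log', 15, 1.5);               % LoG ("Mexican hat") kernel
Ilog = imfilter(double(I), -h, 'replicate');   % negate so bright spots give positive peaks
imshowpair(mat2gray(I), mat2gray(Ilog), 'montage');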

Another property of diffraction-limited spots that can be exploited to detect them is that a spot’s intensity profile can be approximated by a 2D Gaussian. Fitting the local intensity profile around each candidate spot to a Gaussian of defined \(\sigma\) therefore yields a highly accurate, sub-pixel map of spot positions in an image.
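A minimal sketch of such a fit, assuming 'spotWindow' is a hypothetical small region cropped around one candidate spot and that the Optimization Toolbox (lsqcurvefit) is available:

[X, Y] = meshgrid(1:size(spotWindow,2), 1:size(spotWindow,1));
xy = cat(3, X, Y);
% parameter vector p = [amplitude, x-center, y-center, sigma, offset]
gauss2d = @(p, xy) p(1)*exp(-((xy(:,:,1)-p(2)).^2 + (xy(:,:,2)-p(3)).^2)/(2*p(4)^2)) + p(5);
p0 = [double(max(spotWindow(:))), size(spotWindow,2)/2, size(spotWindow,1)/2, 1.5, double(median(spotWindow(:)))];
pFit = lsqcurvefit(gauss2d, p0, xy, double(spotWindow));
% pFit(2:3) gives the sub-pixel center of the spot; pFit(4) its width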

Segmentation

Once a binary image has been obtained, the next step in the pipeline is segmentation, the process of separating each cluster of pixels by assigning it a distinct value to aid downstream analysis. Matlab’s bwlabel() function is a powerful tool for segmenting binary images. It assigns each cluster (or island) of white pixels a unique integer, so that each cluster can be accessed individually and analyzed further (Figure 4).

I = imread('nuclei.tif');
thresh = multithresh(I);        % Otsu threshold
blobs = bwlabel(I > thresh);    % label each connected blob with a unique integer
imshow(label2rgb(blobs));       % display the labeled blobs in pseudocolor
Figure 4. Blobs colored by their unique integer labels.

Now that we have access to the blobs, we can start collecting statistics. The following code produces a scatter plot with blob area on the x-axis and blob circularity on the y-axis, using the regionprops() function to extract these properties from the labeled image.

props = regionprops(blobs,'Area','Circularity');   % one struct per labeled blob
areas = zeros(1,length(props));
circs = zeros(1,length(props));
for j = 1:length(props)
    areas(j) = props(j).Area;
    circs(j) = props(j).Circularity;
end
scatter(areas,circs);
axis square;
xlabel('Nuclei Areas, px^2');
ylabel('Nuclei Circularity');

The scatter plot tells us that the smaller blobs are almost perfectly circular, whereas the larger blobs deviate significantly from circularity. This helps us diagnose the quality of our thresholding procedure: it appears that the automatically determined threshold cannot resolve nuclei that are close together, so these fused nuclei appear as larger, elongated blobs. We can use this information to refine our pipeline by introducing a watershed step that breaks up the “clumps” that thresholding introduces.

bin = I > thresh;      % binary mask from the threshold determined above
D = -bwdist(~bin);     % negative distance from the background
D(~bin) = -Inf;        % force the background into its own catchment basin
L = watershed(D);      % split touching blobs along watershed lines
imshow(label2rgb(L,'jet','w'))
Figure 5. Watershed-segmented image and corresponding scatter plot.

The scatter plot now looks more acceptable, suggesting that the watershed transform has split the previously “fused” blobs at reasonable positions into new blobs with comparable properties to the singletons.

In general it is important to always perform sanity checks at every step of the pipeline to make sure the algorithm itself minimally impacts the data.


Part 3: Particle tracking and diffusion analysis

Fluorescence microscopy allows both the visualization and the precise quantification of nano- to mesoscale motion over a range of time scales. Tracking is an operation that links features across the frames of a movie and produces a trajectory of each feature’s motion over time. Such motion can be directional, as in the axonal transport of vesicular cargo, or random, as in the diffusive motion of transcription factors in the nucleus. The trajectory itself may be a single curve in three spatial dimensions and time, or it may branch over time, as in the case of dividing cells. This section goes over the basics of acquiring movies and some commonly used methods for diffusion analysis.

Acquiring movies

A movie is a series of images captured at known intervals. A timelapse movie is one in which successive frames are captured at long intervals, typically minutes or hours. This may be appropriate to study cell migration or division, and other relatively slow processes. In contrast, a movie can approach video rate, where time between successive frames can be milliseconds, which can be useful in precise tracking of rapidly diffusing single molecules. There are two time quantities that are relevant in the context of making movies of fluorescence:

  1. The time required to capture a single frame is usually limited by the camera exposure time or its integration time. The nature of the camera and the microscope determine the lower bound on the integration time. More typically, however, the time per frame is limited by the exposure time: the time required to acquire an image with sufficient SNR. Faster acquisition requires brighter fluorophores or higher excitation power than slower acquisition, which can image dimmer or more slowly diffusing fluorophores. Pixel binning can help speed up acquisition at the expense of image size.
  2. The time between successive frames (the inverse of the frame rate), when it is not limited by the exposure or integration times, is chosen based on the required time resolution, the photobleaching rate of the fluorophores and the desired duration of the movie.

Tracking routines and data

The primary challenge of tracking objects is identifying the same object in different frames. This task is complicated in crowded images and in images in which the object moves rapidly. The following parameters, used by tracking programs such as the Spots function in Imaris and the TrackMate plugin in ImageJ, reduce the search space for the object-linking operation.

  1. The expected distance diffused between successive frames limits the search space drastically, so that the algorithm only looks for potential object matches within the specified radius around the position of the centroid in the previous frame.
  2. The frame gap is the maximum number of frames an object can be missing from the radius specified in 1. before the trajectory is terminated. This is useful in the case of rapid Z-direction diffusion, which is typically poorly resolved and not imaged in widefield modalities.
  3. The expected diffusion type can provide additional information by ranking different linking possibilities based on their likelihood under the specified diffusion model. Two classes of models are Brownian diffusion models, which assume that the motion can be described as a random walk in which a particle’s displacement in a given time interval is independent of its displacement in any other time interval, and autoregressive models, which account for time-dependent diffusion rates or displacements.

The following clip depicts a diffusing particle inside a cell, with an overlaid trajectory that tracks the center of the object over time.

The output of a tracking operation is a list of values that minimally contain the spatial and temporal coordinates for each tracked object. Additionally, the output may contain the intensity of the tracked object at each frame, its size and related parameters. The following is a sample tracking output for the trajectory shown in the clip, obtained using the TrackMate plugin in FIJI.

Diffusion analysis

Once trajectories are obtained from movies, they can be fed to analysis pipelines that infer the type of motion they represent. One common method is the mean squared displacement (MSD) method for inferring the diffusion rate:
\(MSD = 2\cdot n\cdot D\cdot t\),

where \(MSD\) is given by \(\langle (x_1(t) - x_1(0))^2\rangle + \langle (x_2(t) - x_2(0))^2\rangle + ... + \langle (x_n(t) - x_n(0))^2\rangle\), where \(x_i\) is the \(i^{th}\) spatial dimension, \(n\) is the number of spatial dimensions and \(D\) is the diffusion coefficient.

Thus the slope of the \(MSD\) vs \(t\) plot can be used to determine the value of \(D\). A linear MSD plot indicates random (Brownian) diffusion. Positive or negative deviations from linearity indicate super- or sub-diffusive behavior, which can reflect spatial constraints or biases, such as those seen in active transport.
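A minimal sketch of an MSD calculation, assuming 'x' and 'y' are hypothetical column vectors of particle coordinates (in microns) at uniformly spaced time points 'dt' seconds apart:

maxLag = floor(numel(x)/4);            % common rule of thumb: only use short time lags
msd = zeros(maxLag,1);
for lag = 1:maxLag
    dx = x(1+lag:end) - x(1:end-lag);
    dy = y(1+lag:end) - y(1:end-lag);
    msd(lag) = mean(dx.^2 + dy.^2);    % average over all start times
end
tau = (1:maxLag)' * dt;
p = polyfit(tau, msd, 1);              % linear fit: slope = 2*n*D, here n = 2
D = p(1) / 4;                          % diffusion coefficient in um^2/s
plot(tau, msd, 'o'); xlabel('\tau, s'); ylabel('MSD, \mum^2');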

Other considerations

Motion blur is an artefact that arises when a particle moves appreciably during a single exposure; it degrades spot detection and, consequently, linking.


Part 4: Multicolor imaging: colocalization and registration

The most exciting promise of biological fluorescence is the prospect of visualizing multiple cellular components simultaneously. With extant fluorophores and detection optics it is possible to simultaneously detect up to seven distinct fluorophores. In practice, however, imaging speed and fluorescence bleed-through limit the number of colors to 2-4. Real-time multi-color acquisition, as is the case for single-molecule and live-cell imaging, is achieved by using dichroic mirrors and band-pass filters to direct the emission light to two or more cameras.

The primary challenge of collecting light on multiple cameras is that the resulting images are hard to align perfectly in space, such that each pixel from one camera corresponds to the same region of the sample as the equivalent pixel from another camera. This is particularly important for high-resolution images of diffraction-limited spots, where even sub-micrometer misalignments can have major impacts on the outcome of any analysis. One way to overcome this limitation is to subject the images to a post-processing step called registration.

Registration

At its heart, image registration involves defining a transform that maps related pixels in two images onto each other based on some ground truth. One commonly used method relies on fiducial markers, such as fluorescent TetraSpeck beads, scattered randomly across the field of view. When imaged on two cameras at two different wavelengths, the resulting images can be aligned by identifying the pixels representing the same bead in each image and creating a vector map of the displacement of the pixels in one image with respect to the pixels in the other.

A typical analysis pipeline involves acquiring a multi-color bead image at one or more time points during an experiment, creating a registration transform from it, and applying the transform to the acquired multi-channel images. The final output is a set of “registered” images whose spatial alignment precision is dictated by the quality of the registration of the bead image.
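A minimal sketch of such a pipeline, assuming 'movingPoints' and 'fixedPoints' are hypothetical Nx2 arrays of matched bead centroids (in pixels) from the two cameras, and 'movingImage'/'fixedImage' are the corresponding channel images:

tform = fitgeotrans(movingPoints, fixedPoints, 'affine');   % estimate the channel-to-channel mapping
registered = imwarp(movingImage, tform, 'OutputView', imref2d(size(fixedImage)));   % apply it
imshowpair(fixedImage, registered);     % visual check of the alignment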

Colocalization

The term “colocalization analysis” is commonly used in three slightly different senses. These are:

1. Degree of spatial overlap of signal

This is frequently assessed visually by looking at the color of a composite image, where spatial overlap of two signals, say in the green and red channels, is discernible from the color of the pixel, in this case yellow if the two signals are colocalized.

2. Degree of intensity correlation

It is often the case that both fluorescence signals are diffuse and do not lend themselves to easy visual testing. In this case, it is possible to estimate the spatial correlation or anti-correlation of the two signals instead of simply relying on the color of the composite image. Commonly used measures of such correlation are Pearson’s correlation coefficient and the Manders’ coefficients, both of which estimate the covariance of the two signals. Dunn et al 2011 examine common pitfalls involved in using these methods.
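A minimal sketch of a whole-image Pearson correlation between two registered channels, assuming hypothetical file names; restricting the calculation to a cell ROI is usually advisable (see Dunn et al 2011):

R = imread('red_channel.tif');
G = imread('green_channel.tif');
pearsonR = corr2(double(R), double(G));   % Pearson correlation coefficient over all pixels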

3. Spatial overlap of detected objects

The power of high-resolution imaging lies in its ability to ascribe precise meaning to clusters of bright pixels, which may represent individual molecules or molecular complexes. The methods described in Part 2 can be used to detect diffraction-limited spots or larger blobs. To determine how blobs in different channels are distributed in space relative to each other, one approach is to measure the degree of overlap of their “footprints” (akin to the approach described in 1.). Another powerful approach is to calculate the distance between their centroids or spot centers, which represent the coordinates of the fluorophore within a diffraction-limited region.
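A minimal sketch of object-based colocalization by centroid distance, assuming 'bwR' and 'bwG' are hypothetical binary masks of detected spots in two registered channels and a 2-pixel cutoff (an assumption to be set from the registration error):

sR = regionprops(bwR, 'Centroid');  cR = cat(1, sR.Centroid);
sG = regionprops(bwG, 'Centroid');  cG = cat(1, sG.Centroid);
d = sqrt((cR(:,1) - cG(:,1)').^2 + (cR(:,2) - cG(:,2)').^2);   % pairwise distance matrix (implicit expansion, R2016b+)
nearest = min(d, [], 2);             % each red spot's distance to its closest green spot
fractionColocalized = mean(nearest < 2);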

Using distances to determine colocalization requires sub-pixel localization precision and an accurate assessment of the registration error. It is important to use appropriate positive and negative controls in which the degree of colocalization is known a priori, and to use tests such as pixel-shift analysis to determine the extent of random colocalization in a dense image. Ultimately, object-based colocalization yields more quantitative results and is appropriate for multi-color single-molecule tracking analyses, whereas pixel-based colocalization methods, while easier to carry out, are generally more difficult to interpret.


References

1Pekka Ruusuvuori (2011) CIL:27833. CIL. Dataset. https://doi.org/10.7295/W9CIL27833

Waters, Jennifer C. “Accuracy and precision in quantitative fluorescence microscopy.” The Journal of Cell Biology vol. 185,7 (2009): 1135-48. https://doi.org/10.1083/jcb.200903097

Dunn KW, Kamocka MM, McDonald JH. A practical guide to evaluating colocalization in biological microscopy. Am J Physiol Cell Physiol. 2011;300(4):C723-C742. https://doi.org/10.1152/ajpcell.00462.2010

Other resources and outlook

Christian Tischer’s introductory talk https://www.ibiology.org/techniques/introduction-to-bioimage-analysis/ …and other talks in the Bioimage Analysis series from iBiology.

Lee and Kitaoka’s excellent summary of considerations for experimental design with emphases on rigor and reproducibility: https://www.molbiolcell.org/doi/pdf/10.1091/mbc.E17-05-0276

3D and confocal imaging: https://www.nature.com/articles/s41596-020-0313-9.pdf

The following papers by Anne Carpenter provide good overviews of both image analysis pipelines and experimental design:

  1. On the future of image analysis, with a big-picture overview of where the field is at right now: https://www.nature.com/articles/nbt.3722

Analyzing images using ImageJ

Pete Bankhead’s book has a great introduction to file formats and other preliminaries: https://petebankhead.gitbooks.io/imagej-intro/content/

Robert Haase’s videos on advanced bioimage analysis using Fiji: https://www.youtube.com/playlist?list=PL5ESQNfM5lc7SAMstEu082ivW4BDMvd0U