TMA Image Analysis#
This notebook shows how to use the TMA Image Analysis
components of napari-prism
to perform masking, dearraying, segmenting and measuring on the raw .qptiff tissue microarray image into a structured anndata.AnnData
object containing cell level information.
We continue on with the NSCLSC4301 dataset from the previous section, already loaded and parsed into a SpatialData .zarr file.
# Below code loads napari to notebook for screenshot purposes
# All functions done on the viewer window3
import warnings
import matplotlib.pyplot as plt
import napari
from IPython import get_ipython
from loguru import logger
class ScreenshotContext:
def __enter__(self):
get_ipython().run_line_magic("matplotlib", "inline")
def __exit__(self, exc_type, exc_value, traceback):
get_ipython().run_line_magic("matplotlib", "qt")
plt.rcParams["figure.figsize"] = (20, 20)
warnings.filterwarnings("ignore")
logger.remove()
viewer = napari.current_viewer() or napari.Viewer()
Masking#
The first operation we perform is image masking. This is done to generate a more discrete representation of where the TMA cores are by doing this on a ubiquitious cell nuclei marker like DAPI
.
We currently provide a mini-pipeline for generating image masks, comprised of a sequence of the following operations:
A gaussian blur with a sigma/spread equal to a user set size (here in um, converted to px)
An expansion of the blur by user set size (um)
A series of user-tickable optional operations (in below order):
Li Thresholding
2D sobel / edge filter
Local histogram equalisation
Gamma correction (0.2 value)
The outputs are:
A binary mask (
{channels}_mask
)A thresholded polygon representation of the binary mask (
{channels}_mask_poly
)
Steps#
Select the
DAPI
var in theView (napari-spatialdata)
widgetSelect the
TMA Image Analysis (napari-prism)
widget tab on the bottom-right
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Select an image resolution (Channels, Y pixels, X pixels) to perform image masking. Recommended to do this on the lowest resolution (33, 2025, 1320):
(Optional) Constrain the masking operation to a bounding box#
Create a new
Shapes
layer by selecting the polygon icon in the layer list(Optional) rename this layer by double-clicking the layer name, typing a new name (i.e.
bounding_box
), then left click an empty spaceCreate a rectangle shape in this layer by selecting the 4-pointed square icon in the layer controls. Adjust the opacity to make more transparent.
Reselect the
NSCLSC4301
layerIn
Select shape to mask within
of the Masker widget tab, select the named shape layer (bounding_box
)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
0
Choose channel(s) for masking. If multiple channels are chosen it will combine the binary masks together. Let’s select
DAPI
for now.Choose masking parameters. Below are some recommendations:
Gaussian Blur Sigma in um
: 10Blur expansion in um
: 10Li Thresholding
: Only if the image has high background2D sobel / edge filter
: Only for very blurry images by highlighting edgesLocal histogram equalisation
: Only if the image has noticeable variation in intensity across the imageGamma correction (0.2 value)
: Only if intensities are very low
Provide mask polygon parameters:
Estimated core diameter in um
: This will compute and an estimate for the expected area of the TMA cores.Masks with this fraction of estimate core area to include
: Parts of the mask with less than this fraction of the estimated core area will be removed from the polygon representation.
Click
Generate masks
channel_mask and channel_mask_poly will appear in the
Elements
list at the bottom leftDouble click them to visualise in the layer
# This should have made `DAPI_mask` and `DAPI_mask_poly` in .labels
viewer.layers[0].metadata["sdata"]
SpatialData object, with associated Zarr store: D:\qptiff_images\NSCLC4301.zarr
├── Images
│ └── 'NSCLC4301': DataTree[cyx] (33, 32400, 21120), (33, 16200, 10560), (33, 8100, 5280), (33, 4050, 2640), (33, 2025, 1320)
├── Labels
│ └── 'DAPI_mask': DataArray[yx] (1699, 917)
└── Shapes
└── 'DAPI_mask_poly': GeoDataFrame shape: (39, 2) (2D shapes)
with coordinate systems:
▸ 'global', with elements:
NSCLC4301 (Images), DAPI_mask (Labels), DAPI_mask_poly (Shapes)
▸ 'um', with elements:
NSCLC4301 (Images)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
We can play with the mask/labels parameters like increasing contour
to 4 and blending
to additive
to just highlight the edges of the masks:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
0
The masks appear to be more fragmented in the cores with more sparsely spaced cells (nuclei), and with low raw DAPI intensity
We can try to amend this by either:
Playing with image parameters, i.e.) increasing the gaussian blur and/or expansion to 30
Using multiple channels to capture more of the core
Increasing blur and expansion to 30#
# Since we mask on DAPI, and it will have the same name, the old mask is overwritten
viewer.layers[0].metadata["sdata"]
SpatialData object, with associated Zarr store: D:\qptiff_images\NSCLC4301.zarr
├── Images
│ └── 'NSCLC4301': DataTree[cyx] (33, 32400, 21120), (33, 16200, 10560), (33, 8100, 5280), (33, 4050, 2640), (33, 2025, 1320)
├── Labels
│ └── 'DAPI_mask': DataArray[yx] (1752, 936)
└── Shapes
└── 'DAPI_mask_poly': GeoDataFrame shape: (41, 2) (2D shapes)
with coordinate systems:
▸ 'global', with elements:
NSCLC4301 (Images), DAPI_mask (Labels), DAPI_mask_poly (Shapes)
▸ 'um', with elements:
NSCLC4301 (Images)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
0
The masks at the bottom right still misses out on some cells. Let’s try also using multiple channels and including local histogram equalisation:
Multiple channel masking: DAPI + E-cadherin + Vimentin + Pan-Cytokeratin#
To select multiple channels, hold
Ctrl (⌘ on Mac)
then left click the channels inSelect channels(s) for masking
# Since we mask on multiple channels, this will have a new longer name of all
# channels chosen and show up as a new element in .labels
viewer.layers[0].metadata["sdata"]
SpatialData object, with associated Zarr store: D:\qptiff_images\NSCLC4301.zarr
├── Images
│ └── 'NSCLC4301': DataTree[cyx] (33, 32400, 21120), (33, 16200, 10560), (33, 8100, 5280), (33, 4050, 2640), (33, 2025, 1320)
├── Labels
│ ├── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask': DataArray[yx] (1699, 917)
│ └── 'DAPI_mask': DataArray[yx] (1699, 917)
└── Shapes
├── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_poly': GeoDataFrame shape: (43, 2) (2D shapes)
└── 'DAPI_mask_poly': GeoDataFrame shape: (41, 2) (2D shapes)
with coordinate systems:
▸ 'global', with elements:
NSCLC4301 (Images), DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask (Labels), DAPI_mask (Labels), DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_poly (Shapes), DAPI_mask_poly (Shapes)
▸ 'um', with elements:
NSCLC4301 (Images)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
0
This gives us a fuller picture of the the TMA cores, so we can proceed with this mask.
Some other useful applications for this module is to generate regional masks (i.e. Tumor regions by masking on tumor markers like Pan-Cytokeratin).
For these, it is recommended to mask at higher resolutions to produce more detailed and less pixelated mask boundaries.
Below we mask on Pan-Cytokeratin at the (33, 8100, 5280) resolution, with blur and expansion to 2um, and local histogram equalisation:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Dearraying#
Dearraying is the process of segmenting and assigning labels to cores according to the grid of the tissue microarray. This is important as this is is how the metadata (i.e. clinical metadata, patient information, disease state, etc) is related back to each core.
We currently provide a minimal and oversimplified dearraying algorithm:
Blob detection with the Difference of Gaussians (DoG) on the binary masks / raw image from the previous step
Estimation of the TMA grid using the detected blobs
Fitting of TMA grid labels to each blob
The outputs of this are:
The traditional TMA core circles:
tma_core
Rectangles enveloping the circle shapes AND the mask:
tma_envelope
Steps#
Select the
Dearrayer
tab at the top of the widgetSelect the binary mask layer in the layer list (
DAPI_mask
orDAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask
). This will show up as theSelected layer
.Input an estimated or known value for core diameters in um, similar to the previous tab
(Optional, but recommended) Input the expected number of rows and column in the TMA grid.
If values are given, one-dimensional KMeans clustering is used to estimate the x and/or y coordinates of each grid element.
Otherwise, a rolling-mean with a predefined threshold is used to estimate x and y coordinates of each grid element.
Click
Dearray
tma_core and tma_envelope will appear in the
Elements
at the bottom leftDouble click
tma_envelope
to visualise in the layer
# These will be added as .shapes, as GeoDataFrames.
viewer.layers[0].metadata["sdata"]
SpatialData object, with associated Zarr store: D:\qptiff_images\NSCLC4301.zarr
├── Images
│ └── 'NSCLC4301': DataTree[cyx] (33, 32400, 21120), (33, 16200, 10560), (33, 8100, 5280), (33, 4050, 2640), (33, 2025, 1320)
├── Labels
│ ├── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask': DataArray[yx] (1699, 917)
│ ├── 'DAPI_mask': DataArray[yx] (1699, 917)
│ └── 'Pan-Cytokeratin_mask': DataArray[yx] (6796, 3666)
├── Shapes
│ ├── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_poly': GeoDataFrame shape: (43, 2) (2D shapes)
│ ├── 'DAPI_mask_poly': GeoDataFrame shape: (41, 2) (2D shapes)
│ ├── 'Pan-Cytokeratin_mask_poly': GeoDataFrame shape: (27, 2) (2D shapes)
│ ├── 'tma_core': GeoDataFrame shape: (41, 6) (2D shapes)
│ └── 'tma_envelope': GeoDataFrame shape: (41, 2) (2D shapes)
└── Tables
└── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_tma_table': AnnData (41, 0)
with coordinate systems:
▸ 'global', with elements:
NSCLC4301 (Images), DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask (Labels), DAPI_mask (Labels), Pan-Cytokeratin_mask (Labels), DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_poly (Shapes), DAPI_mask_poly (Shapes), Pan-Cytokeratin_mask_poly (Shapes), tma_core (Shapes), tma_envelope (Shapes)
▸ 'um', with elements:
NSCLC4301 (Images)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Note that some of the edges of the boxes may not be rendered at full zoom. Zoom in to see:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Adjust the boxes as necessary. As an example let’s adjust C-5 to capture a larger padding around the core:
Select the
tma_envelope
layer in the layer listClick the box to adjust in the viewer
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Adjust the box:
Move the box by dragging the centre of the box
Reshape the box by dragging a corner or edge (white circle) Here we adjust we increase the box for C-5:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Navigate to the
Utils
tab at the top of the widgetClick
Overwrite save selected layer
, withtma_envelope
selected (this will save the adjustments to disk)
For the sake of demonstrating the removal of ‘poor’ quality cores, we will drop the C-5 core since it appears to be necrotic.
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
# We can confirm that C-5 has been deleted in the viewer...
print("C-5" not in viewer.layers[-1].metadata["_columns_df"])
# ... and in the SpatialData object:
print("C-5" not in viewer.layers[-1].metadata["sdata"]["tma_envelope"])
True
True
Segmenter#
Next, we perform cell segmentation on the image. We make this process more efficient by only performing segmentation within the detected TMA cores, skipping over background, and dropped TMA cores.
We currently only support cellpose for cell segmentation, and provide a similar interface for tweaking segmentation parameters as the cellpose-napari plugin.
We provide functions for creating ‘meta-channels’ which are merged intensity projections of multiple channels selected by the user. This is useful for performing cytoplasm-based cell segmentation (i.e. cyto3) on images with no single reliable cytoplasmic channel.
The outputs of this tab are:
An instance labelled mask (cell segmentation masks):
{image}_labels
A table which annotates the labelled mask, containing cell indices, the TMA core label a cell belongs to, and the parent image (NSCLC4301):
{image}_labels
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Steps#
Select the
Segmenter
tab at the top of the widgetReselect the image layer
NSCLC4301
Select an image resolution (Channels, Y pixels, X pixels) to perform cell segmentation. Currently, we enforce doing this only on the highest resolution.
Select a base cellpose model (here we use
cyto3
)(Optional) Select a cellpose denoiser model (this overrides the base model selection above). These models perform image restoration/denoising prior to segmentation.
(Optional) Select a custom cellpose model file (this overrides both model selections above).
Select a shapes layer for tiling: This performs segmentations in discrete regions specified by an annotated shapes layer. This will be the
tma_envelopes
layer from the previous step.Select column for tile names: This will be the column in the annotated shapes layer that detail the label of each region. This will be
tma_label
, containing the string name of each TMA core from the previous step.
Select channel(s) for segmentation. User can select multiple channels for segmentation by holding Ctrl (⌘ on Mac) then left clicking all desired channels.
Select a method for merging multiple channels: If multiple channels were selected, these are then merged into a ‘meta-channel’ by performing an intensity projection across the channel axis. i.e.) “max” performs a maximum intensity projection
Here, we select and merge E-cadherin
, Vimentin
, and Pan-Cytokeratin
(not shown) with a maximum intensity projection:
If supported, then an additional nuclear channel can be specified (i.e. the cyto3 model can take an additional nuclear channel such as DAPI to enhance results).
Input a nucleus diameter. If this value is -1, then it is automatically estimated by cellpose.
Cell probability threshold: Decrease if undersegmenting in dim areas, vice-versa
Flow threshold: Decrease if segmentation shapes are unexpected, increase if undersegmenting
(Optional, but highly recommended) Tick
First and Last Tile only
to segment the first and last region from the provided tiling shapes layer. This is useful for testing and optimising for segmentation parameters before performing it on the entire image.(Optional, but highly recommended) Click
Preview Segmentation
. This will show the user a preview of the merged channels in green with the optional nuclear channel in blue ( if specified):
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
We can inspect if we have a decent representation of cytoplasms with the merged channels, and nuclei with DAPI.
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
INFO Transposing `data` of type: <class 'dask.array.core.Array'> to ('y',
'x').
Once satisfied by visualising input data from the previews, we can proceed with segmentation:
Click
Run Segmentation
. This button will be greyed out while it is running. When it becomes clickable, it means that segmentation is complete.After segmentation, the segmentation masks
NSCLC_labels
will appear in theElements
listDouble click to add and visualise in the viewer.
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Zooming into A-5:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
We can set contour to 2 to show just the cell borders:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
INFO Transposing `data` of type: <class 'dask.array.core.Array'> to ('y',
'x').
INFO Transposing `data` of type: <class 'dask.array.core.Array'> to ('y',
'x').
INFO Transposing `data` of type: <class 'dask.array.core.Array'> to ('y',
'x').
The results appear to be undersegmenting, especially the more irregular, elongated shapes.
We can tweak Cell Probability Threshold
and Flow Threshold
to optimise segmentation:
Decrease the
Cell Probability Threshold
to i.e. -3.98, to find more and larger masks. and/orIncrease the
Flow threshold
to i.e. 0.7, to find more masks
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
It is important to also see how this affects the whole image, across all core by simply zooming out and panning:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
ExpressionMeasurer#
In this tab, we combine the raw image with the cell segmentation masks to extract cell level features. This essentially creates the single cell table in the form of an AnnData object.
For each cell/instance, the mean or median intensity of each channel is measured as a proxy for expression levels, alongside other morphological and image properties.
Steps#
Select the
ExpressionMeasurer
tab at the top of the widgetReselect the image layer in the layer list (NSCLC4301)
Select the segmentation layer containing the cells or instances (NSCLC4301_labels)
Select the shapes layer for tiling: the
tma_envelope
layer.(Optional) Measure extended properties. Hover over this to view the additional image properties to measure on each cell/segmentation instance.
Select the intensity modality for quantifying ‘expression’ of a cell/instance.
Click
Measure
(hover over the button to see the entire list of properties measured, by default intensity_mean is shown, but if median is selected will do intensity_median)
Navigate to the
View (napari-spatialdata)
tab.Reselect the labels layer
NSCLC4301_labels
. Note the new table containing the measured data:
# The NSCLC4301_labels_expression table will now have '33' features which are
# intensities or marker expression for each cell
viewer.layers[0].metadata["sdata"]
SpatialData object, with associated Zarr store: D:\qptiff_images\NSCLC4301.zarr
├── Images
│ └── 'NSCLC4301': DataTree[cyx] (33, 32400, 21120), (33, 16200, 10560), (33, 8100, 5280), (33, 4050, 2640), (33, 2025, 1320)
├── Labels
│ ├── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask': DataArray[yx] (1699, 917)
│ ├── 'DAPI_mask': DataArray[yx] (1699, 917)
│ ├── 'NSCLC4301_labels': DataArray[yx] (32400, 21120)
│ └── 'Pan-Cytokeratin_mask': DataArray[yx] (6796, 3666)
├── Shapes
│ ├── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_poly': GeoDataFrame shape: (43, 2) (2D shapes)
│ ├── 'DAPI_mask_poly': GeoDataFrame shape: (41, 2) (2D shapes)
│ ├── 'Pan-Cytokeratin_mask_poly': GeoDataFrame shape: (27, 2) (2D shapes)
│ ├── 'tma_core': GeoDataFrame shape: (41, 6) (2D shapes)
│ └── 'tma_envelope': GeoDataFrame shape: (40, 2) (2D shapes)
└── Tables
├── 'DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_tma_table': AnnData (41, 0)
└── 'NSCLC4301_labels_expression': AnnData (236553, 33)
with coordinate systems:
▸ 'global', with elements:
NSCLC4301 (Images), DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask (Labels), DAPI_mask (Labels), NSCLC4301_labels (Labels), Pan-Cytokeratin_mask (Labels), DAPI_E-cadherin_Vimentin_Pan-Cytokeratin_mask_poly (Shapes), DAPI_mask_poly (Shapes), Pan-Cytokeratin_mask_poly (Shapes), tma_core (Shapes), tma_envelope (Shapes)
▸ 'um', with elements:
NSCLC4301 (Images)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
We can visualise by these new attributes (.obs and .var) by double clicking them in this widget. We turn off the segmentation preview layer to show just the cell masks, and set contour to 0 to show the filled masks
i.e. by Obs
: color by the area
of the cell:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
i.e. by Vars
: color by the raw intensity of CD44
:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Next Steps#
Currently, there is no need to explicitly save any of the outputs generated in this widget, as (for now) the strategy is to save them directly to the on-disk SpatialData .zarr store with each button press.
The user may continue directly onto the AnnData analysis section.