AnnData Analysis#

This notebook explores the usage of the AnnData Analysis widget for interactively analysing anndata.AnnData objects contained within spatialdata.SpatialData objects.

The examples shown here follow on from the outputs of the previous TMA processing section.

# The code below loads napari into the notebook for screenshot purposes
# All operations are performed in the viewer window
import warnings

import matplotlib.pyplot as plt
import napari
from IPython import get_ipython
from loguru import logger


class ScreenshotContext:
    def __enter__(self):
        get_ipython().run_line_magic("matplotlib", "inline")

    def __exit__(self, exc_type, exc_value, traceback):
        get_ipython().run_line_magic("matplotlib", "qt")


plt.rcParams["figure.figsize"] = (20, 20)
warnings.filterwarnings("ignore")
logger.remove()
viewer = napari.current_viewer() or napari.Viewer()

Data Import#

If not done already:

Import the SpatialData .zarr (see Getting started)
Select coordinate system to work with (global)
(Optional) Select and double click the raw image NSCLC4301 to add it as a layer. Visualise the channel of choice.
Select and double click the NSCLC4301_labels element to add the cell segmentation (and cell table) as a layer
In the top part of the widget, select the NSCLC4301_labels_expression table

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/94c0d5397da328fd6a7a9cff835f5334921a37c063144428dfbebd139469144b.png

AnnData Analysis Tab#

The AnnData Analysis tab consists of various widgets arranged in a hierarchy.

From top to bottom:

Table Selection#

This provides a list of tables in the SpatialData object from which the user can choose to analyse. In this case, the suitable table is the expression table.

AnnData Tree#

This is a viewer-model-like widget and class called AnnDataNodeQT, which we introduce for holding and saving AnnData objects organised in a tree data structure. It wraps AnnData objects and establishes how they are related to one another. The widget addresses the lack of code blocks or cells in which the user could look up an AnnData object: instead, all AnnData objects are shown in a single visual and interactive widget. The tree structure facilitates highly interactive and exploratory analysis by tracking the many AnnData objects and subsets produced by the different combinations of operations the user performs.

At this stage, things to note:#

When performing analysis on a selected AnnData node, some computations do not propagate to other nodes.
This is mainly due to differences in the shapes between nodes (different obs and/or vars).
Therefore, embeddings (.obsm), transformations (.X and .layers), and neighbor graphs (.uns and/or .obsp) are contained within the selected node.

Automatic I/O:#

As the user adds, deletes, annotates, modifies and renames AnnData nodes, the changes are saved to on-disk representations automatically.
Each node is saved as a separate AnnData table
The tree structure is cached as a dictionary called “tree_attrs” in .uns, with a reference to the stores of an AnnData node’s parent and children
These are used to reconstruct the AnnData tree widget every time it is reloaded
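
The reconstruction step described above can be sketched in plain Python. This is only an illustration of rebuilding a tree from parent/child references: the key names "parent" and "children", and the idea that each node stores node names, are assumptions rather than the plugin's actual on-disk layout.

```python
# Hypothetical per-node metadata, mimicking what might be cached under
# .uns["tree_attrs"]; the key names "parent" and "children" are assumptions.
stored_nodes = {
    "Root": {"parent": None, "children": ["added_obs_as_var"]},
    "added_obs_as_var": {
        "parent": "Root",
        "children": ["separate_tumor_non_tumor"],
    },
    "separate_tumor_non_tumor": {"parent": "added_obs_as_var", "children": []},
}


def find_root(nodes):
    """Return the name of the single node that has no parent."""
    roots = [name for name, attrs in nodes.items() if attrs["parent"] is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


def render_tree(nodes, name=None, depth=0):
    """Return indented lines describing the tree in depth-first order."""
    if name is None:
        name = find_root(nodes)
    lines = ["  " * depth + name]
    for child in nodes[name]["children"]:
        lines.extend(render_tree(nodes, child, depth + 1))
    return lines


tree_lines = render_tree(stored_nodes)
print("\n".join(tree_lines))
```

Because every node carries a reference to both its parent and its children, the tree can be rebuilt starting from any stored table, not only from the root.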

Actions:#

Renaming: Double click the name in the widget to rename. NOTE: Only terminal nodes (AnnDatas with no subsets) can be renamed.
Delete: Only available for subsets. This deletes the node and all of its subsets.
Annotate Obs: Launches an annotation table (see below).

Annotate Obs#

When Annotate Obs is clicked, the following is launched:

When Annotate is pressed, an annotation table is shown where the user can enter values to map to a column in obs (e.g. based on tma_label).

The default column name Annotation can be relabelled by double clicking it (e.g. to core_type).

Select Confirm, and this will be mapped accordingly in .obs. Blank annotations (I-1) will be labelled as NaN:

# If we view the AnnData contained in this widget:
tree = (
    viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
    .widget()
    .tree
)
tree.adata.obs

	cell_ID	area	centroid_x	centroid_y	eccentricity	solidity	tma_label	lyr	core_type
1_A-5	1	448.0	74.765625	1314.640625	0.863006	0.945148	A-5	NSCLC4301_labels	example1
2_A-5	2	461.0	69.034707	1340.616052	0.837425	0.931313	A-5	NSCLC4301_labels	example1
3_A-5	3	332.0	82.954819	1328.078313	0.659165	0.935211	A-5	NSCLC4301_labels	example1
4_A-5	4	310.0	82.838710	1353.954839	0.768014	0.796915	A-5	NSCLC4301_labels	example1
5_A-5	5	215.0	93.274419	1399.720930	0.497691	0.942982	A-5	NSCLC4301_labels	example1
...	...	...	...	...	...	...	...	...	...
7363_I-1	236549	1879.0	2227.356573	1040.453433	0.404111	0.959163	I-1	NSCLC4301_labels	NaN
7364_I-1	236550	1221.0	2266.730549	1133.805897	0.776592	0.873391	I-1	NSCLC4301_labels	NaN
7365_I-1	236551	504.0	2264.309524	1060.043651	0.582475	0.947368	I-1	NSCLC4301_labels	NaN
7366_I-1	236552	809.0	2269.645241	1035.556242	0.640624	0.945093	I-1	NSCLC4301_labels	NaN
7367_I-1	236553	269.0	2282.869888	442.936803	0.860358	0.967626	I-1	NSCLC4301_labels	NaN

236553 rows × 9 columns

When Import CSV is pressed, the user can add multiple existing annotations to map to an .obs column from a .csv file. This is usually how clinical metadata is mapped back to the cores. The .csv file must have a column tma_label with the format of {letter}-{number} for cores.

Below we have a metadata csv file called tma_label_metadata.csv which has a column tma_label, and three example metadata columns.

If the .csv is valid, then the merged form to add to the AnnData.obs will be displayed:

Select Confirm, and this will be added to .obs, and mapped to every cell:

# If we view the AnnData contained in this widget:
tree = (
    viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
    .widget()
    .tree
)
tree.adata.obs

	cell_ID	area	centroid_x	centroid_y	eccentricity	solidity	tma_label	lyr	core_type	Metadata1	Metadata2	Metadata3
0	1	448.0	74.765625	1314.640625	0.863006	0.945148	A-5	NSCLC4301_labels	example1	Meta1	5.0	D
1	2	461.0	69.034707	1340.616052	0.837425	0.931313	A-5	NSCLC4301_labels	example1	Meta1	5.0	D
2	3	332.0	82.954819	1328.078313	0.659165	0.935211	A-5	NSCLC4301_labels	example1	Meta1	5.0	D
3	4	310.0	82.838710	1353.954839	0.768014	0.796915	A-5	NSCLC4301_labels	example1	Meta1	5.0	D
4	5	215.0	93.274419	1399.720930	0.497691	0.942982	A-5	NSCLC4301_labels	example1	Meta1	5.0	D
...	...	...	...	...	...	...	...	...	...	...	...	...
236548	236549	1879.0	2227.356573	1040.453433	0.404111	0.959163	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A
236549	236550	1221.0	2266.730549	1133.805897	0.776592	0.873391	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A
236550	236551	504.0	2264.309524	1060.043651	0.582475	0.947368	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A
236551	236552	809.0	2269.645241	1035.556242	0.640624	0.945093	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A
236552	236553	269.0	2282.869888	442.936803	0.860358	0.967626	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A

236553 rows × 12 columns
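
Conceptually, Import CSV validates the tma_label column and then broadcasts each core's metadata to every cell in that core. The sketch below illustrates this with the standard library only; the CSV contents are a made-up stand-in for tma_label_metadata.csv, the single-letter regular expression is one reading of the {letter}-{number} format, and the function name is hypothetical.

```python
import csv
import io
import re

# Hypothetical stand-in for tma_label_metadata.csv; values are illustrative.
csv_text = """tma_label,Metadata1,Metadata2,Metadata3
A-5,Meta1,5.0,D
I-1,Meta1,2.0,A
"""

# One reading of the required {letter}-{number} format (an assumption).
TMA_LABEL = re.compile(r"^[A-Za-z]-\d+$")


def load_core_metadata(text):
    """Parse the CSV into a {tma_label: metadata} mapping, validating labels."""
    reader = csv.DictReader(io.StringIO(text))
    if "tma_label" not in (reader.fieldnames or []):
        raise ValueError("the .csv file must have a tma_label column")
    by_core = {}
    for row in reader:
        label = row.pop("tma_label")
        if not TMA_LABEL.match(label):
            raise ValueError(f"invalid tma_label: {label!r}")
        by_core[label] = row
    return by_core


# Two toy cells standing in for rows of .obs.
cells = [
    {"cell_ID": 1, "tma_label": "A-5"},
    {"cell_ID": 236553, "tma_label": "I-1"},
]

core_metadata = load_core_metadata(csv_text)
# Every cell inherits the metadata of the core it belongs to.
annotated = [
    {**cell, **core_metadata.get(cell["tma_label"], {})} for cell in cells
]
print(annotated)
```

Note that values read with the csv module stay strings; a dataframe-based merge would typically infer numeric types for columns such as Metadata2.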

We also import real responder clinical metadata which marks ‘cores’ or patients as ‘Responders’ or ‘Non-Responders’:

A-5 is a core from a patient that responded to therapy
I-1 is a core from a patient that did not respond to therapy

# If we view the AnnData contained in this widget:
tree = (
    viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
    .widget()
    .tree
)
tree.adata.obs

	cell_ID	area	centroid_x	centroid_y	eccentricity	solidity	tma_label	lyr	core_type	Metadata1	Metadata2	Metadata3	Response
0	1	448.0	74.765625	1314.640625	0.863006	0.945148	A-5	NSCLC4301_labels	example1	Meta1	5.0	D	Responder
1	2	461.0	69.034707	1340.616052	0.837425	0.931313	A-5	NSCLC4301_labels	example1	Meta1	5.0	D	Responder
2	3	332.0	82.954819	1328.078313	0.659165	0.935211	A-5	NSCLC4301_labels	example1	Meta1	5.0	D	Responder
3	4	310.0	82.838710	1353.954839	0.768014	0.796915	A-5	NSCLC4301_labels	example1	Meta1	5.0	D	Responder
4	5	215.0	93.274419	1399.720930	0.497691	0.942982	A-5	NSCLC4301_labels	example1	Meta1	5.0	D	Responder
...	...	...	...	...	...	...	...	...	...	...	...	...	...
236548	236549	1879.0	2227.356573	1040.453433	0.404111	0.959163	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A	Non-responder
236549	236550	1221.0	2266.730549	1133.805897	0.776592	0.873391	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A	Non-responder
236550	236551	504.0	2264.309524	1060.043651	0.582475	0.947368	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A	Non-responder
236551	236552	809.0	2269.645241	1035.556242	0.640624	0.945093	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A	Non-responder
236552	236553	269.0	2282.869888	442.936803	0.860358	0.967626	I-1	NSCLC4301_labels	NaN	Meta1	2.0	A	Non-responder

236553 rows × 13 columns

AnnData Operator Tabs#

These tabs contain widgets which perform operations on the currently selected AnnData node. These operations encompass a set of tools and functions broadly categorised into cell typing and basic spatial analysis.

Feature modelling will be a future addition. This tab will involve an interface to train machine learning (sklearn / cuML) and deep learning models (PyTorch + PyTorch Geometric) using features encompassing all objects in the SpatialData object (i.e. from the raw image to cell-level spatial metrics).

Cell Typing#

Augmentation#

Cell Typing > Augmentation

This tab contains functions for modifying the shape of the expression matrix and the AnnData object.

Additive Augmentation#

Cell Typing > Augmentation > Additive Augmentation

Here the user can choose to expand along the expression matrix features by adding numerical anndata.AnnData.obs columns as anndata.AnnData.var attributes. For example, we can add morphological features like solidity, eccentricity, area as ‘expression’ features.

Under Root, this will add a new ‘subset’ AnnData added_obs_as_var.
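
As a rough illustration of what additive augmentation does to the expression matrix, the sketch below appends numerical obs columns as extra feature columns. It uses plain lists instead of an AnnData object, the marker values are made up, and the "_var" suffix is inferred from the area_var name that appears later in this notebook; the plugin's exact naming rule is an assumption.

```python
# Toy expression matrix: rows are cells, columns are markers (made-up values).
var_names = ["DAPI", "CD14"]
X = [
    [120.0, 3.5],
    [95.0, 10.2],
]

# Numerical obs columns, one value per cell.
obs = {
    "area": [448.0, 461.0],
    "solidity": [0.945148, 0.931313],
}


def add_obs_as_var(X, var_names, obs, columns, suffix="_var"):
    """Return a widened copy of X with the chosen obs columns appended."""
    for col in columns:
        if len(obs[col]) != len(X):
            raise ValueError(f"obs column {col!r} does not match the number of cells")
    new_names = var_names + [col + suffix for col in columns]
    new_X = [row + [obs[col][i] for col in columns] for i, row in enumerate(X)]
    return new_X, new_names


aug_X, aug_names = add_obs_as_var(X, var_names, obs, ["area", "solidity"])
print(aug_names)
print(aug_X)
```

The original matrix is left untouched, which mirrors the widget creating a new subset node rather than modifying Root in place.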

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/011c96b7f00903177cc915128d973b4da3807dda7bb4ffcf36c371038565b3c7.png

Reductive Augmentation#

Cell Typing > Augmentation > Reductive Augmentation

Here the user can choose to reduce the expression matrix features (e.g. to drop poor quality markers, or to perform clustering on a subset of tumour markers).

Subset multiple variables by holding Ctrl (⌘ on Mac) then left clicking or dragging.

As an example, we subset to a small handful of markers useful for differentiating tumor from non-tumor cells (including DAPI as a QC marker, and the morphological features added above).

By default, the node is named after the list of all markers in the subset. We can rename the node by double clicking the row title and entering a more informative name such as separate_tumor_non_tumor.

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/a7a3c508390842e1445cf546ac1ecaa1c19f062bc5ef00c574879d4ab753c0e5.png

Preprocessing Tab#

Cell Typing > Preprocessing

This tab consists of three sub-tabs:

Transforms: Perform transformations sequentially on a user-specified expression layer.
Quality Control / Filtering: Interactive filtering with an integrated histogram plot to choose thresholds accordingly.
Dimensionality Reduction: Perform PCA, UMAP, and Harmonypy on a user-specified expression layer.

If RAPIDS and/or rapids_singlecell were installed with the package, a Use GPU tick box will appear. When ticked, the relevant (not every!) functions will run on the GPU. Moving data between CPU and GPU memory will be automatically handled.

Transforms#

Cell Typing > Preprocessing > Transforms

We provide 5 transform functions to choose from:

Arcsinh (cofactor 150)
Scale (wrapper for scanpy.pp.scale())
Percentile (95th)
Z-score
Log1p (wrapper for scanpy.pp.log1p())

As an example, we perform 3 transforms in sequence: arcsinh, scale and then z-score. Once transform is called, a new expression layer (arcsinh_scale_zscore_augmented_X) will be added and available for selection:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/26ac9c929b514aea982e72620a18eae5dca242646e547b03541342839eefaf55.png
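
The sequential behaviour and the resulting layer name can be sketched as follows. The arcsinh form asinh(x / cofactor) is the common convention for a cofactor and is assumed here; whether the plugin's z-score uses the population standard deviation, and along which axis it is applied, is not stated in this notebook, so the zscore function below is only a placeholder. Only two of the five transforms are included.

```python
import math
import statistics


def arcsinh(values, cofactor=150.0):
    """Inverse hyperbolic sine after dividing by a cofactor (assumed form)."""
    return [math.asinh(v / cofactor) for v in values]


def zscore(values):
    """Centre to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


TRANSFORMS = {"arcsinh": arcsinh, "zscore": zscore}


def apply_sequence(values, names, input_layer):
    """Apply transforms in order; name the result like <t1>_<t2>_<input_layer>."""
    for name in names:
        values = TRANSFORMS[name](values)
    return "_".join([*names, input_layer]), values


# One marker measured in three cells (made-up intensities).
layer_name, transformed = apply_sequence(
    [0.0, 150.0, 1500.0], ["arcsinh", "zscore"], "augmented_X"
)
print(layer_name)
print(transformed)
```

Encoding the transform order in the layer name keeps different combinations distinguishable when several of them are stored on the same node.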

Different transformation choices and combinations can be performed:

To add a transform to the combination, press the + button on the very right.
To remove a transform, press the - button to its right.
To modify a transform, select its dropdown box and choose a different function.

Below, we do a different combination of a log1p transform, scale then a z-score on the augmented_X layer:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/57d6f3c330d99b82fbd1486d282e3e4630949577c383ec39a148bf8a7e06908f.png

Quality Control / Filtering#

Cell Typing > Preprocessing > Quality Control / Filtering

We provide 5 quality control functions to choose from:

filter_by_obs_count: Filter out cells which belong to a categorical anndata.AnnData.obs value with fewer than X cells (e.g. filter out cells which belong to a TMA core with fewer than 50 cells).
filter_by_obs_value: Filter out cells which have a numerical anndata.AnnData.obs value outside a user-defined range.
filter_by_obs_percentile: Same as filter_by_obs_value but using percentiles.
filter_by_var_value: Filter out cells which have values outside a user-defined range for the selected anndata.AnnData.var in the selected expression layer.
filter_by_var_quantile: Same as filter_by_var_value but using percentiles.

Each function launches its own histogram plot, in which the user chooses a lower and upper bound and the number of bins using sliders. If the number of bins is 0, it is set to an automatic value.
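
The logic behind two of these filters can be sketched in a few lines. This is not the plugin's implementation: the function below named filter_by_value is a hypothetical helper standing in for both the obs and var value filters, the data are made up, and treating both bounds as inclusive is an assumption.

```python
from collections import Counter


def filter_by_obs_count(labels, min_count):
    """Return indices of cells whose category contains at least min_count cells."""
    counts = Counter(labels)
    return [i for i, label in enumerate(labels) if counts[label] >= min_count]


def filter_by_value(values, lower, upper):
    """Return indices of cells whose value lies within [lower, upper]."""
    return [i for i, v in enumerate(values) if lower <= v <= upper]


# Four toy cells: three from core A-5 and one from core I-1.
tma_labels = ["A-5", "A-5", "A-5", "I-1"]
dapi = [0.5, 200.0, 180.0, 150.0]

kept_by_count = filter_by_obs_count(tma_labels, min_count=2)  # drops the lone I-1 cell
kept_by_value = filter_by_value(dapi, lower=10.0, upper=1000.0)  # drops the low-DAPI cell
print(kept_by_count, kept_by_value)
```

The returned indices correspond to the rows that would be kept in the new subset node.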

As an example, we use filter_by_var_value to filter out cells based on DAPI in the raw layer (augmented_X).

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/1e08638c01f66e4bdbe603bddabbcb6adee19a4e6c83e438df7d4c5f03e92d87.png

There seems to be a peak of cells with very low DAPI expression. We can work with different axes for visualising and filtering this.

For example, we can work with the X-axis on the log scale (which will be equivalent to doing a log transform on augmented_X):

1. Modify the embedded figure by selecting the figure options icon
2. Under X-Axis, change Scale to log
3. Press Apply, then Ok

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/7a11aaab1c3079ad4c1ab16bbce026a1cc8d2f7984a098d812563ae4f76467a3.png

We get a normal looking distribution for DAPI when log transformed, apart from the peak of cells with low DAPI expression.

We can filter these cells by dragging the leftmost slider to the cutoff point.

Note that the slider will be more sensitive due to the scale being log.
The absolute value of the slider will be annotated in red near the top of the vertical line:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/4811000bdb3168479457ac7b2d9cdcd698ae35177c777faf4f8dc49f4804bebd.png

Press Apply QC Function.

This will create a new subset labelled by the channel and the filtering method, filter_by_DAPI_value.

We visualise the distribution of DAPI signal again in this subset, and change the x-axis scale to log as described above:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/c36135a53e02aa34ebff3073ff8d5ffb268966ce7955cbd838c6ecd7956dcb57.png

This tab is also useful for visualising the effect the different transformations have on the channel intensity distributions:

e.g. CD14 in the arcsinh_scale_zscore layer:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/de64ccc73e356ad29b6fe93f10cfee4dabbb0dbd6f38afa0374485850d37abb4.png

e.g. CD14 in the log1p_scale_zscore layer:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/8cd8882999e7abf5a943047b9d10b886048b00267431b790333c871d5e3fcc97.png

Embeddings#

Cell Typing > Preprocessing > Embeddings

Following from the filtered AnnData above, we can then perform dimensionality reduction to generate various embeddings of the marker expression space using functions from the scanpy package (or rapids-singlecell package with GPU acceleration).

Currently, we provide wrappers for the following functions:

We can do a simple PCA on the log1p_scale_zscore_augmented_X layer with 15 principal components:

Change Scanpy tl/pp function to pca
Press Apply

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/d07bb08974a7fd8edbe681bec11512ccfd8a20aa0b22505bf7a2529bc2e2da85.png

Then compute a 15-nearest neighbors neighborhood graph between cells in that new ‘X_pca’ embedding.

Change Scanpy tl/pp function to neighbors
Set representation key to X_pca
Set n neighbors to 15
Set n pcs to 15
(Optional) Modify the algorithm or metric; the defaults are a brute force KNN search with euclidean distances
Press Apply

Note that N PCs must be less than or equal to the number of PCs computed above.

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/9e60fd13831a74bc9bbfe41f8de070d0dfd37b54627028f5b64ef0810af66091.png
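
To make the default settings concrete, a brute force KNN search with euclidean distances compares every cell against every other cell and keeps the k closest. The sketch below is a didactic pure Python version on made-up 2D points, not the routine used by scanpy or RAPIDS.

```python
import math


def brute_force_knn(points, k):
    """For each point, return the indices of its k nearest other points."""
    if k >= len(points):
        raise ValueError("k must be smaller than the number of points")
    neighbors = []
    for i, p in enumerate(points):
        # Distance from point i to every other point (O(n^2) overall).
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


# Five toy cells in a 2D embedding: three near the origin, two near (5, 5).
pcs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
knn = brute_force_knn(pcs, k=2)
print(knn)
```

The quadratic cost of this exhaustive comparison is why GPU acceleration or approximate search algorithms become attractive for datasets with hundreds of thousands of cells.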

And finally use that neighborhood graph to compute the UMAP embedding.

Change Scanpy tl/pp function to umap
Modify the parameters. The scanpy defaults are provided. Here we increase min dist to 0.7
Press Apply

Note: This will use the .uns["neighbors"] attribute. If neighbors was not computed as in the previous step, an error will be returned in the window.

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/3447046cb67f3336ef50da1dff0a9956b88016eb5016c220475a25ed0030fa63.png

Visualising Attributes#

New numerical attributes generated throughout the analysis can also be conveniently visualised with napari-spatialdata, using the Scatter and/or View widgets:

Plugins > napari spatialdata > Scatter and/or View

View#

Here we visualise the log1p_scale_zscore_augmented_X transformed values for DAPI of each cell:

Select the View (napari-spatialdata) tab on the bottom right
If not already selected, select the segmentation layer in the layer list on the left (NSCLC4301_labels)
Under Tables annotating layer: Select the NSCLC4301_labels_expression_ ... _filter_by_DAPI_value table (this is its on-disk name)
Under Layers: Select the transformed layer log1p_scale_zscore_augmented_X (note that the colormap will rescale values to between 0 and 1)
Under Vars: Double click DAPI or the channel of choice

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/9440fb03d582d47bc781e565aff249ccdf6269835676dfcd02bee568304ec749.png

Let’s zoom in on the top-right core.

Note how some masks are grey.

This is because NSCLC4301_labels contains ALL of the segmented cells in the original AnnData, but we are loading a filtered table which excludes the low DAPI signal cells, and these show up as grey masks:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/989a232baf809097b660162e02f940a01cc7d2e7353f4d43fce130b71c645370.png

Scatter#

With Scatter, we can visualise numerical attributes from .obsm, .obs, and .var. This widget is particularly useful for visualising embeddings we computed in 2D.

Here we visualise the umap embedding:

Select the Scatter (napari-spatialdata) tab on the bottom right
Select the segmentation layer in the layer list on the left (NSCLC4301_labels) if not already selected
Under Tables annotating layer: Select the ..._filter_by_DAPI_value table
Set X-axis type and Y-axis type to obsm
Select X_umap for both
Select 0 for one axis, and 1 for the other
(Optional) color by var or obs.

We increase the window size and adjust the viewer to enlarge the plot.

As an example, we can colour by the transformed area variable to see whether area correlates with certain cells in this embedding space:

Set Color type to var
Select area_var
Underneath, select log1p_scale_zscore_augmented_X (again, note that values will be rescaled to between 0 and 1)

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/a3cab9cb25317006df17be624c9e4a6ada88630cf07161ff250a17f84bc3a775.png

And do the same for Pan-Cytokeratin:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/73600b5ee0612943751158b1f4cbcdf5b6d8e5257440408d7585f8915ab52e24.png

For more information, also see: https://spatialdata.scverse.org/projects/napari/en/latest/notebooks/spatialdata.html

Clustering Tabs#

The next few tabs mark the start of the clustering component of the cell typing workflow.

Clustering methods require rather arbitrarily set parameters, which are tweaked by the user until ‘reasonable’ clusters are generated. This can lead to many repetitions of:

1. Clustering
2. Plotting and assessing the clustering results
3. Repeating steps 1-2 with different parameters if the results are of poor quality
4. Annotating the clusters based on marker expression AND the raw image
5. Repeating steps 1-4 if subclustering annotated clusters

This can quickly become tedious and disorganised when done in a jupyter notebook.

We streamline this process by:

Merging steps 1-3 into a single step by performing clustering over a range of parameters. To do this efficiently, we provide custom implementations which:
- Perform the parameter search efficiently
- Accelerate the computations using GPUs (RAPIDS and rapids-singlecell)
Having the marker expression plots, the raw image, and an annotation table in a single window to streamline step 4
Having a tree structure to keep track of subsets/subclusters

Clustering Search Tab#

AnnData Analysis (napari-prism) > Cell Typing > Clustering Search

To date, we support two different (yet similar) clustering methods: Phenograph and Scanpy clustering. These can be chosen in the Clustering Recipe selection box.

Both methods involve:

the computation of a KNN neighbors graph, which requires a parameter for K, or how many neighbors
leiden clustering on either method’s respective ‘refined’ graph, which requires a parameter for R, or the clustering resolution (how granular the clustering will be)
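
For Phenograph, the graph refinement weights each KNN edge by the Jaccard similarity of the two cells' neighbour sets, that is, the number of shared neighbours divided by the size of the union. The sketch below shows that calculation on a tiny hand-written KNN list; excluding a cell from its own neighbour set and dropping zero-weight edges are simplifying assumptions, not a statement of how napari-prism implements it.

```python
def jaccard_edges(knn):
    """Weight each directed KNN edge (i, j) by the Jaccard similarity
    between the neighbour sets of i and j."""
    neighbor_sets = [set(neighbors) for neighbors in knn]
    edges = {}
    for i, neighbors in enumerate(knn):
        for j in neighbors:
            shared = len(neighbor_sets[i] & neighbor_sets[j])
            total = len(neighbor_sets[i] | neighbor_sets[j])
            weight = shared / total
            if weight > 0:  # drop edges between cells with no shared neighbours
                edges[(i, j)] = weight
    return edges


# Toy KNN lists for five cells with K = 2.
knn = [[1, 2], [0, 2], [0, 1], [4, 2], [3, 2]]
edges = jaccard_edges(knn)
for edge, weight in sorted(edges.items()):
    print(edge, round(weight, 3))
```

Edges between cells that share many neighbours receive high weights, so the subsequent Leiden step tends to group cells from the same dense region together.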

We also include a minimum cluster size parameter (from Phenograph), and adapt it to both methods. Clusters with fewer than this many cells are assigned a label of -1.
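
The minimum cluster size rule amounts to counting the cells per cluster and relabelling the members of undersized clusters. A minimal sketch with a hypothetical helper name:

```python
from collections import Counter


def mask_small_clusters(labels, min_size):
    """Assign -1 to every cell whose cluster has fewer than min_size cells."""
    sizes = Counter(labels)
    return [label if sizes[label] >= min_size else -1 for label in labels]


# Cluster 2 contains a single cell, so it falls below the threshold.
labels = [0, 0, 0, 1, 1, 2]
masked = mask_small_clusters(labels, min_size=2)
print(masked)
```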

Like the previous tab, a GPU toggle button will appear if the GPU packages were installed with napari-prism.

As input, we subset again to remove DAPI, re-compute the PCA, and name this node clustering.

We perform a clustering search using Phenograph on the GPU and CPU:

Input data: X_pca
K: 10, 15, 20, …, 30 (start at 10, stop at 30, in steps of 5)
R: 0.1, 0.2, …, 0.6 (start at 0.1, stop at 0.6 in steps of 0.1)
Minimum cluster size: 10

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/a2144dacf76b5e68008001bd2211ae2319b65664e34aa9cebde62e8b44ac29dd.png
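
The search space defined by these settings is the Cartesian product of the K and R values. The sketch below enumerates it with the standard library; inclusive_range is a hypothetical helper, and rounding is used to avoid floating point drift in the resolution values.

```python
from itertools import product


def inclusive_range(start, stop, step, ndigits=6):
    """Evenly spaced values from start to stop, including both endpoints."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, ndigits) for i in range(count)]


ks = inclusive_range(10, 30, 5)       # number of neighbours K
rs = inclusive_range(0.1, 0.6, 0.1)   # clustering resolution R
grid = list(product(ks, rs))

print(ks)
print(rs)
print(len(grid), "clustering runs")
```

This gives 5 values of K and 6 values of R, or 30 clustering runs in total. The log output further below suggests the expensive steps are shared across runs: the KNN search is performed once at the largest K (30), the Jaccard graph is built once per K, and Leiden is then run once per resolution on that graph.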

The runtime of this may depend on:

dataset size
parameter search range
maximum K parameter
hardware + GPU availability

Running the above search for this dataset of 239,540 cells on an AMD Ryzen7 5800X 8 Core AM4 4.7GHz CPU + GeForce RTX 2070 Super GPU took ~3-5 minutes.

import sys

from loguru import logger

logger.add(sys.stdout, level="INFO")

2024-12-15 13:22:46.548 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:parameter_search:1320 - Beginning Parameter Search... 

2024-12-15 13:22:46.548 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:46]
2024-12-15 13:22:46.549 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_neighbors:265 - Performing KNN search on GPU: 
	 K = 30
	 Distance Metric = euclidean
	 Search Algorithm = brute

2024-12-15 13:22:48.844 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_neighbors:294 - KNN GPU computed in 2.29492449760437 seconds 

2024-12-15 13:22:48.844 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:48]
2024-12-15 13:22:48.849 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
	 KNN graph nodes = 233997
	 KNN graph K-neighbors = 10

CUDA call='cudaEventDestroy(event_)' at file=/__w/cuml/cuml/python/cuml/build/cp310-cp310-linux_x86_64/_deps/raft-src/cpp/include/raft/core/resource/cuda_event.hpp line=34 failed with initialization error
2024-12-15 13:22:53.961 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 5.11216926574707seconds 

2024-12-15 13:22:53.962 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:53]
2024-12-15 13:22:54.169 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.1
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:22:55.623 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.4533240795135498 seconds 

2024-12-15 13:22:55.623 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9484455446154784
2024-12-15 13:22:55.625 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:55]
2024-12-15 13:22:55.706 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.2
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:22:57.126 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.4200868606567383 seconds 

2024-12-15 13:22:57.127 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9331931568025645
2024-12-15 13:22:57.128 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:57]
2024-12-15 13:22:57.207 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.3
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:22:58.494 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.2873485088348389 seconds 

2024-12-15 13:22:58.495 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9243871856528958
2024-12-15 13:22:58.497 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:58]
2024-12-15 13:22:58.575 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.4
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:22:59.924 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.3494250774383545 seconds 

2024-12-15 13:22:59.925 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.917358648330681
2024-12-15 13:22:59.926 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:59]
2024-12-15 13:23:00.005 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.5
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:01.268 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.2633261680603027 seconds 

2024-12-15 13:23:01.268 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9110178311781054
2024-12-15 13:23:01.270 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:01]
2024-12-15 13:23:01.348 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.6
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:02.605 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.2569570541381836 seconds 

2024-12-15 13:23:02.606 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9076262134685122
2024-12-15 13:23:02.607 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:02]
2024-12-15 13:23:02.613 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
	 KNN graph nodes = 233997
	 KNN graph K-neighbors = 15

2024-12-15 13:23:09.503 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 6.88990044593811seconds 

2024-12-15 13:23:09.504 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:09]
2024-12-15 13:23:09.634 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.1
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:11.816 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.18188738822937 seconds 

2024-12-15 13:23:11.817 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9394774128210623
2024-12-15 13:23:11.820 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:11]
2024-12-15 13:23:11.945 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.2
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:14.215 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.270263195037842 seconds 

2024-12-15 13:23:14.216 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9194047840479728
2024-12-15 13:23:14.218 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:14]
2024-12-15 13:23:14.341 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.3
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:16.570 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.229368209838867 seconds 

2024-12-15 13:23:16.571 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.908666658799452
2024-12-15 13:23:16.573 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:16]
2024-12-15 13:23:16.696 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.4
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:18.806 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.109954595565796 seconds 

2024-12-15 13:23:18.807 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9002598279878875
2024-12-15 13:23:18.809 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:18]
2024-12-15 13:23:18.934 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.5
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:20.987 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.053276300430298 seconds 

2024-12-15 13:23:20.987 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8920818583457851
2024-12-15 13:23:20.990 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:20]
2024-12-15 13:23:21.114 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.6
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:23.243 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.1290621757507324 seconds 

2024-12-15 13:23:23.243 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8877194982744846
2024-12-15 13:23:23.246 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:23]
2024-12-15 13:23:23.253 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
	 KNN graph nodes = 233997
	 KNN graph K-neighbors = 20

2024-12-15 13:23:33.243 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 9.989278316497803seconds 

2024-12-15 13:23:33.244 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:33]
2024-12-15 13:23:33.415 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.1
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:36.838 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.423365354537964 seconds 

2024-12-15 13:23:36.839 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9363115083020052
2024-12-15 13:23:36.842 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:36]
2024-12-15 13:23:37.004 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.2
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:40.386 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.381678581237793 seconds 

2024-12-15 13:23:40.386 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9160461094701717
2024-12-15 13:23:40.390 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:40]
2024-12-15 13:23:40.568 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.3
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:44.015 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.44633150100708 seconds 

2024-12-15 13:23:44.015 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9022094993908114
2024-12-15 13:23:44.019 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:44]
2024-12-15 13:23:44.182 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.4
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:47.698 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.51560378074646 seconds 

2024-12-15 13:23:47.698 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8932483384780607
2024-12-15 13:23:47.702 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:47]
2024-12-15 13:23:47.859 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.5
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:50.882 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.0232059955596924 seconds 

2024-12-15 13:23:50.882 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8845792023086071
2024-12-15 13:23:50.885 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:50]
2024-12-15 13:23:51.048 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.6
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:23:54.198 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.150099277496338 seconds 

2024-12-15 13:23:54.198 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8778153974091304
2024-12-15 13:23:54.202 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:54]
2024-12-15 13:23:54.209 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
	 KNN graph nodes = 233997
	 KNN graph K-neighbors = 25

2024-12-15 13:24:07.513 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 13.303729057312012seconds 

2024-12-15 13:24:07.514 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:07]
2024-12-15 13:24:07.760 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.1
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:24:12.743 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.982553958892822 seconds 

2024-12-15 13:24:12.743 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.933574570112028
2024-12-15 13:24:12.747 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:12]
2024-12-15 13:24:12.946 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.2
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:24:18.406 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 5.459635257720947 seconds 

2024-12-15 13:24:18.406 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.911611833834457
2024-12-15 13:24:18.410 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:18]
2024-12-15 13:24:18.612 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.3
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:24:23.546 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.934455156326294 seconds 

2024-12-15 13:24:23.547 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8968284695454755
2024-12-15 13:24:23.551 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:23]
2024-12-15 13:24:23.756 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.4
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:24:29.181 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 5.424898386001587 seconds 

2024-12-15 13:24:29.181 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.886198516046092
2024-12-15 13:24:29.185 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:29]
2024-12-15 13:24:29.386 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.5
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:24:34.152 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.7659196853637695 seconds 

2024-12-15 13:24:34.152 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8747998349811575
2024-12-15 13:24:34.156 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:34]
2024-12-15 13:24:34.358 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.6
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:24:39.108 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.75072717666626 seconds 

2024-12-15 13:24:39.109 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8709781274910915
2024-12-15 13:24:39.113 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:39]
2024-12-15 13:24:39.119 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
	 KNN graph nodes = 233997
	 KNN graph K-neighbors = 30

2024-12-15 13:24:55.509 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 16.390007972717285seconds 

2024-12-15 13:24:55.511 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:55]
2024-12-15 13:24:55.774 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.1
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:25:02.064 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 6.290363788604736 seconds 

2024-12-15 13:25:02.065 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.934706743158094
2024-12-15 13:25:02.069 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:02]
2024-12-15 13:25:02.318 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.2
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:25:09.393 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 7.075368165969849 seconds 

2024-12-15 13:25:09.394 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9090085532750625
2024-12-15 13:25:09.398 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:09]
2024-12-15 13:25:09.646 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.3
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:25:16.919 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 7.272916555404663 seconds 

2024-12-15 13:25:16.919 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.892130617514392
2024-12-15 13:25:16.924 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:16]
2024-12-15 13:25:17.171 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.4
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:25:23.515 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 6.344385862350464 seconds 

2024-12-15 13:25:23.516 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8848316702791892
2024-12-15 13:25:23.520 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:23]
2024-12-15 13:25:23.768 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.5
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:25:29.742 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 5.973891496658325 seconds 

2024-12-15 13:25:29.742 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8723394623879358
2024-12-15 13:25:29.747 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:29]
2024-12-15 13:25:29.997 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU: 
	 Resolution = 0.6
	 Max iterations = 500
	 Min cluster size = 10

2024-12-15 13:25:36.089 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 6.09187388420105 seconds 

2024-12-15 13:25:36.090 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.863084275764576
2024-12-15 13:25:36.094 | INFO     | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:36]

logger.remove()

Based on the clustering method chosen this will create attributes in .obsm and .uns, which are read by the next tab to assess clustering ‘quality’.

Clustering Evaluator (Assess Cluster Runs)#

AnnData Analysis (prism) > Cell Typing > Assess Cluster Runs

We can compute quality metrics to assess the clustering runs and find parameters that generate ‘good’ clusterings. This can be quite subjective, and so we choose a strategy which guides the user towards multiple reasonable runs.

Here, we define ‘good’ runs as those with ‘stable’ clusters, i.e. runs whose clusters are similar to those of other runs. To assess this, we choose from 3 quality metrics, each of which scores every clustering run against every other run:

Adjusted Rand Index (ARI)
Normalized Mutual Information (NMI)
Adjusted Mutual Information (AMI)

These scores are currently computed only on the CPU.
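
As a rough illustration of what these between-run scores measure, the sketch below compares every run against every other run using the scikit-learn implementations of the three metrics. The helper `pairwise_run_scores`, the run names and the toy labels are hypothetical, and this is not necessarily how the plugin computes its scores.

```python
import numpy as np
from sklearn.metrics import (
    adjusted_mutual_info_score,
    adjusted_rand_score,
    normalized_mutual_info_score,
)

METRICS = {
    "ARI": adjusted_rand_score,
    "NMI": normalized_mutual_info_score,
    "AMI": adjusted_mutual_info_score,
}


def pairwise_run_scores(runs, metric="ARI"):
    """Score every clustering run against every other run.

    runs: dict mapping a run name to a 1D array of cluster labels
    (one label per cell). Returns the run names and a symmetric
    (n_runs, n_runs) score matrix with ones on the diagonal.
    """
    score = METRICS[metric]
    names = list(runs)
    scores = np.eye(len(names))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            s = score(runs[names[i]], runs[names[j]])
            scores[i, j] = scores[j, i] = s
    return names, scores


# Hypothetical toy labels for six cells under three parameter combinations.
toy_runs = {
    "K20_R0.2": np.array([0, 0, 0, 1, 1, 1]),
    "K20_R0.3": np.array([1, 1, 1, 0, 0, 0]),  # same partition, relabelled
    "K20_R0.6": np.array([0, 0, 1, 1, 2, 2]),
}
names, ari = pairwise_run_scores(toy_runs, metric="ARI")
```

All three metrics ignore the cluster label values themselves, so two runs that produce the same partition under different label names still score as identical. Blocks of high scores in the resulting matrix correspond to groups of runs that agree with one another.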

Following from the clustering run above:

Select HybridPhenographSearch
In Between-Cluster Score Plots, choose a metric (Here we choose Adjusted Rand Index)

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/663d9010ad7b993eb7d943d45b0965ccb58c2ec225f0b98288269c67f7d34afc.png

From this plot, the user would ideally choose a handful of resolutions which give stable clusters (the lowest resolution possible with the greatest similarity to other runs). Visually, this is the corner of a clearly greener box in the plot.

We decide to choose K = 20, and export both R = 0.2 and R = 0.3

We can export both runs:

Select 20 in Select K
Select 0.2 in Select R
Press Export Cluster Labels to Obs.
Repeat for 0.3 for R

This will export the cluster labels from these runs as obs columns called:

HybridPhenographSearch_K20_R0.2
HybridPhenographSearch_K20_R0.3

Visualise Clusters#

AnnData Analysis (napari-prism) > Cell Typing > Annotate Clusters

In this tab, the user can visualise clustering labels or any categorical .obs key.

To visualise grouped expression values, we provide wrappers for scanpy cluster plots.

Here, we select scanpy.pl.matrixplot() to visualise the mean group expression of all available markers in the log1p_scale_zscore_augmented_X layer. We group by the cluster labels exported in the previous tab, HybridPhenographSearch_K20_R0.2:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/8809f0b37d7e3d1bc8fbb677ce63d1c01f32c48d5f9ed24f25cb991b9d150599.png
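
The quantity shown in a matrix plot of this kind is the mean expression of each marker within each group. A minimal pandas sketch of that summary is given below; the marker values and cell names are made-up toy data, and this is not scanpy's implementation.

```python
import pandas as pd

# Hypothetical toy data: four cells, two markers, already transformed and z-scored.
expr = pd.DataFrame(
    {
        "Pan-Cytokeratin": [1.5, 1.0, -1.0, -1.5],
        "CD45": [-1.0, -0.5, 0.5, 1.0],
    },
    index=["cell_1", "cell_2", "cell_3", "cell_4"],
)

# Cluster label of each cell, as it would appear in an obs column.
clusters = pd.Series(
    ["0", "0", "1", "1"],
    index=expr.index,
    name="HybridPhenographSearch_K20_R0.2",
)

# Mean expression of every marker within every cluster:
# rows are clusters, columns are markers.
group_means = expr.groupby(clusters).mean()
```

In this toy table, cluster 0 would be the candidate tumor cluster, since it has the higher mean Pan-Cytokeratin value.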

For the sake of example, we do a simple annotation of cells being tumor cells or not. However, if desired, the user can perform finer cell type annotations (although we would do this on more markers since at the start we only included basic lineage markers).

We annotate each cluster using the annotation table:

Select HybridPhenographSearch_K20_R0.2
Press Annotate
Double click the Annotation column header, rename to any column name then press Enter. We name the column: Tumor
Annotate each respective cluster index to a given annotation. Here we annotate it as True or False (for tumor and non-tumor cells)

Press Confirm to add these annotations

We can check the group mean expression in the new annotations by changing the groupby key to Tumor. We expect tumor cells to be higher in E-cadherin, Ki67, Pan-Cytokeratin, and area_var (larger cells)

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/1acf4f6ea70031c5cec4a0a7322552b44f86bdc9346147730610f2e223ba0276.png

Most importantly, we can check these annotations directly back on the image by coloring the segmentation masks:

Double-click Tumor in Observations (currently there is no labels colormap legend in napari - this may be a future implementation in this package)
Select NSCLC4301 -> Double-click to switch to the Pan-Cytokeratin channel (43rd channel)

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/1a967dea63948a196e9a795dcf5f801bfe7e9283b5bcbaf2305ce35d21379ae1.png

Zooming into various cores and switching to a contour view of the cell boundaries, the True (tumor) cells in lighter blue overlap with the raw Pan-Cytokeratin signal, whereas the non-tumor cells in darker blue show close to none:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/42ec2ab9262e1f5e694454d26ecc6dafbc18ea325baf1ec7b355baca3dffe4a8.png

Subclusterer#

AnnData Analysis (prism) > Cell Typing > Subclusterer

The usual last step of a recursive and interactive cell typing workflow is to perform subclustering on a subset.

Select an obs variable to subcluster on (cell_type or any other annotations)
Select a category in that variable (let’s subcluster the non-tumor cells to get their functional types)
Click Subcluster

This will create a child AnnData node of only False or non-tumor cells (we rename this to non_tumor_cells).

The key function of this tab is that the full marker feature set is restored (since we may want to use markers we have excluded at the start of the workflow i.e. for finer cell typing within the non-tumor cell populations). From this we subset markers to perform cell-typing of base cell lineages.

We perform recursive cell typing, repeating the preprocessing, clustering and annotation steps on the non_tumor_cells subset to assign base cell type lineages and then, where possible, finer functional labels within those lineages. All results are stored in a new annotation column called ct_func.

For the tumor subset, we name all cells as Tumor in ct_func.

Each subset will have a ‘ragged’-like array of annotations. Annotations in finer subsets (i.e. functionalised cell type labels) will take priority over coarse labels made in parent subsets. For example:

If we define cells 1 to 10 to be ‘T_cells’
Then we subcluster this ‘T_cells’ population and annotate cells 1-5 as ‘T_cytotoxic’ and cells 6-7 as ‘T_reg’
Then cells 1-7 will take on the finer ‘T_cytotoxic’ and ‘T_reg’ labels, whereas cells 8-10, which have no further functional annotation, will keep the original ‘T_cells’ label
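
The priority rule in this example can be sketched with pandas. The cell names and the way the two label sets are merged are illustrative assumptions, not the plugin's internal implementation.

```python
import pandas as pd

cells = [f"cell_{i}" for i in range(1, 11)]

# Coarse labels from the parent subset: cells 1 to 10 are all 'T_cells'.
coarse = pd.Series("T_cells", index=cells)

# Finer labels from the subcluster; cells 8 to 10 were not annotated further.
fine = pd.Series(
    ["T_cytotoxic"] * 5 + ["T_reg"] * 2,
    index=cells[:7],
)

# Finer labels take priority; cells without one fall back to the coarse label.
merged = fine.reindex(cells).combine_first(coarse)
```

Here `merged` holds ‘T_cytotoxic’ for cells 1-5, ‘T_reg’ for cells 6-7 and ‘T_cells’ for cells 8-10.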

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/94274fc712dd5038880a3dc73c6596a40f11ac97e2d265a4c778224c766aa4cd.png

Spatial Analysis#

Currently, we have two spatial metrics (more coming in the future):

Cellular Neighborhoods or CNs (from Schürch et al. (2020))
Proximity Density (a part of spatial_pscore from scimap)

Below we show how we can construct different spatial graphs compatible with either method.

Build Graph#

AnnData Analysis (prism) > Spatial Analysis > Build Graph

In this tab, we provide a wrapper for the squidpy.gr.spatial_neighbors() function, exposing some parameters the user can tweak. See the squidpy documentation for information on the provided parameters.

Continuing on with the filter_by_DAPI_value subset:

spatial_key: spatial in .obsm.
library_key: This should represent distinct spatial regions. In this case, this will be tma_label (each TMA core).
percentile: If 0, no percentile thresholding is applied

For the CNs, we construct a 10-nearest neighbor spatial graph.

n_neighs: 10
key_added: knn. This will add knn_connectivities and knn_distances to .obsp

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/50f9f6cb79219b0cb49bc92062ebee06c50a4c61d6eee0d348cb08ba475dafbe.png

For proximity density, we construct a 40 pixel radius neighbors spatial graph.

radius: 40 (if not -1, then this will override n_neighs)
key_added: radial. This will add radial_connectivities and radial_distances to .obsp
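
To make the difference between the two graph types concrete, the sketch below builds a 10-nearest-neighbor graph and a 40 pixel radius graph on random coordinates. It uses scikit-learn as a stand-in for squidpy.gr.spatial_neighbors(), treats all cells as one region (so it does not reproduce the effect of library_key), and the coordinates are made up.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph, radius_neighbors_graph

rng = np.random.default_rng(0)
coords = rng.uniform(0, 200, size=(50, 2))  # hypothetical cell centroids in pixels

# 10-nearest-neighbor graph: every cell is linked to its 10 closest cells,
# however far away they are.
knn_connectivities = kneighbors_graph(
    coords, n_neighbors=10, mode="connectivity", include_self=False
)
knn_distances = kneighbors_graph(
    coords, n_neighbors=10, mode="distance", include_self=False
)

# Radius graph: every pair of cells within 40 pixels of each other is linked,
# so the number of neighbors varies from cell to cell.
radial_connectivities = radius_neighbors_graph(
    coords, radius=40, mode="connectivity", include_self=False
)
radial_distances = radius_neighbors_graph(
    coords, radius=40, mode="distance", include_self=False
)

neighbors_per_cell_knn = np.asarray(knn_connectivities.sum(axis=1)).ravel()
neighbors_per_cell_radial = np.asarray(radial_connectivities.sum(axis=1)).ravel()
```

The nearest-neighbor graph fixes the number of neighbors per cell, which suits the fixed-size windows used for CNs, while the radius graph fixes the physical distance, which suits a proximity-based score.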

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/a010ebbe81cd2a30ee4b18773e761a69205b6d5898a92ad3e413f094dadbf37f.png

# If we inspect the selected AnnData,
# obsp will have:
# connectivities/distances from the scanpy neighbors
# knn_connectivities/knn_distances from the knn graph
# radial_connectivities/radial_distances from the radial graph

tree = (
    viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
    .widget()
    .tree
)
print(tree.adata_tree_widget.currentItem().adata.obsp)

PairwiseArrays with keys: connectivities, distances, knn_connectivities, knn_distances, radial_connectivities, radial_distances

Cellular Neighborhoods#

AnnData Analysis (prism) > Spatial Analysis > Cellular Neighborhoods > Compute

We integrate the computation of cellular neighborhoods (CNs) as defined in Schürch et al. (2020), to be directly compatible with the squidpy neighborhood graphs in .obsp.

Computing CNs requires a user-set parameter for the number of neighborhoods to compute (K in KMeans clustering). Inspired by Enfield et al. (2024), we implement a search for this parameter and a plot to assess the best choice using an elbow detection algorithm.

Steps#

In spatial_key select knn_connectivities
In phenotype select the obs variable denoting the phenotype to compute CNs for (for this example, let’s use the ct_func labels)
Provide a search range for the number of CNs (usually, a number between 3 and 20 should be sufficient)
(Optional) for larger datasets, consider using mini batch KMeans (by ticking the box)
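
The general idea behind these steps can be sketched as follows: count the phenotypes among each cell's graph neighbors, then cluster those composition vectors with KMeans over a range of K. This is a simplified illustration on random toy data using scikit-learn. Details such as whether a cell counts itself in its own window, and the exact normalisation, are assumptions and may differ from the plugin.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
coords = rng.uniform(0, 500, size=(300, 2))  # hypothetical cell centroids
phenotypes = rng.choice(["Tumor", "T_cells", "Myeloid"], size=300)

# Binary cell-by-cell adjacency of a 10-nearest-neighbor graph.
adjacency = kneighbors_graph(coords, n_neighbors=10, mode="connectivity")

# One-hot encode the phenotypes, then count each phenotype among every cell's neighbors.
labels, codes = np.unique(phenotypes, return_inverse=True)
one_hot = np.eye(len(labels))[codes]
composition = np.asarray(adjacency @ one_hot)  # shape (n_cells, n_phenotypes)

# Convert counts to fractions so that every row sums to 1.
composition = composition / composition.sum(axis=1, keepdims=True)

# Search over candidate numbers of neighborhoods and record the KMeans inertia.
inertias = {}
for k in range(3, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(composition)
    inertias[k] = model.inertia_
```

For larger datasets, `sklearn.cluster.MiniBatchKMeans` could be swapped in for `KMeans` in the loop, which corresponds to the optional tick box above.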

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/644e5da7ba23e9296be8bb2721cbad8d7882c0af2aa9379cdcb32c4b6e3afc9a.png

AnnData Analysis (prism) > Spatial Analysis > Cellular Neighborhoods > Visualise

We can visualise the results:

Kneedle Plot - This shows the decay in KMeans ‘Inertia’ with higher Ks

Inertia is the sum of squared distances of each data point to its assigned cluster centroid (a measure of how tight the clusterings are)
The dashed vertical line marks the ‘optimal’ K according to the Kneedle algorithm: the knee of the curve, beyond which increasing K gives only small further reductions in inertia
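
The two quantities in this plot can be illustrated with a short sketch. The `inertia` function follows the definition above. `simple_knee` is a simplified stand-in for Kneedle (no smoothing and no sensitivity threshold), and the inertia values are made up for illustration.

```python
import numpy as np


def inertia(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    diffs = points - centroids[labels]
    return float((diffs**2).sum())


def simple_knee(ks, inertias):
    """Return the K at which a decreasing inertia curve bends the most.

    Simplified stand-in for Kneedle: no smoothing, no sensitivity threshold.
    """
    ks = np.asarray(ks, dtype=float)
    y = np.asarray(inertias, dtype=float)
    x_norm = (ks - ks.min()) / (ks.max() - ks.min())
    y_norm = (y - y.min()) / (y.max() - y.min())
    # Flip the decreasing curve so that it increases, then measure how far it
    # rises above the diagonal; the knee is where that gap is largest.
    gap = (1.0 - y_norm) - x_norm
    return int(ks[np.argmax(gap)])


# Hypothetical inertia values for K = 2..10.
ks = list(range(2, 11))
inertias = [1000, 560, 380, 300, 260, 235, 220, 208, 200]
best_k = simple_knee(ks, inertias)
```

With these toy values the largest gap falls at K = 4, after which each additional cluster reduces the inertia by progressively less.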

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/e07b06348ae8c4c044a761e106de452f85c7109a74f66e2bc9c92fbc2ac08936.png

K = 8 seems to be the ‘optimal’ number for CNs. We can check the enrichment of phenotypes with the Enrichment Matrix plotting tab:

Choose K Kmeans run: 8

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/32e2f042e9d7568edeae11f3e51c68aa855bfbc331c98845024df7ebfd2bdfed.png

Press Export K to export the 8 CNs as labels to .obs. This will appear as cns_k8.

We can annotate the CNs like we annotated cell types using the annotation table.

We give each CN index a general descriptive label:

0: Immune_1_enriched
1: Immune_2_enriched
2: Tumor_immune_depleted
3: Endothelial_enriched
4: Fibroblast_enriched
5: Immune_3_enriched
6: Tumor_immune_mixed
7: Myeloid_enriched

We can then visualise these CNs back on the cell segmentation masks on top of the raw image in View (napari-spatialdata):

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/8337f77428893a3002af7e782216f90736d2d8dbb0a9fca961764e877239ca58.png

In B-2, we can see the Tumor_immune_depleted CNs in light green and the Tumor_immune_mixed in olive green possibly showing a tumor nest + tumor interface-like pattern:

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/cd101f516c2d8844b91d2a7a4395e27d8cebb31dd920f14ccf6239da7e7c3494.png

Proximity Density#

AnnData Analysis (prism) > Spatial Analysis > Proximity Density > Compute

We provide a wrapper for computing the proximity density component of the spatial_pscore function from scimap.

Steps#

In spatial_key select radial_connectivities
In phenotype select cell_type_new
Select the pairs to compute proximity density between. If none are selected, all unique combinations without replacement are computed (i.e. it assumes commutativity or undirected edges, since radius-based graphs hold that assumption). Otherwise, it will compute the Cartesian product between the left and right selections.
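
The two pairing modes can be sketched with the standard library. The phenotype names below are hypothetical, and the sketch only shows which pairs would be enumerated, not how the score itself is computed.

```python
from itertools import combinations, product

phenotypes = ["Tumor", "T_cytotoxic", "T_reg", "Myeloid"]  # hypothetical labels

# No selection: every unordered pair of distinct phenotypes, each counted once.
all_pairs = list(combinations(phenotypes, 2))

# Explicit selection: Cartesian product of the left and right selections.
left = ["Tumor"]
right = ["T_cytotoxic", "T_reg", "Myeloid"]
selected_pairs = list(product(left, right))
```

With four phenotypes the first mode gives six pairs, and ("T_reg", "Tumor") is not listed separately from ("Tumor", "T_reg"). The second mode gives only the three Tumor-to-immune pairs.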

For example, below will compute:

Tumor to every typed immune cell population

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/88d1e3d6eb621ac2d3b18d35b3a1b37fc9c68d6a4779403c828f512c05266946.png

Starting plotting..
Starting calculating row orders..
Reordering rows..
Starting calculating col orders..
Reordering cols..
Plotting matrix..
Starting plotting HeatmapAnnotations
Collecting legends..
Collecting annotation legends..
Plotting legends..

AnnData Analysis (prism) > Spatial Analysis > Proximity Density > Visualise

We implement a clustermap from PyComplexHeatmap to visualise these scores for each cell type pair, in every distinct region.

region_key: tma_label
metadata_key: Select a metadata key to map. These will have a 1:1 or 1:N relation with tma_label (i.e. for a single or multiple tma_label(s), there will be a matching metadata label). Here we go with the example at the top of this notebook of the fake metadata column Metadata1.

Press Plot

with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()

../_images/a5c09cc7afb620b62f82b55d3605678865f1335a90d9e6642d2979e9d3a41cc7.png

Future + Roadmap#

This covers all current features so far in PRISM.

In the near future:

Expect to see more spatial features to compute and plot, alongside spatial graph plots in the main viewer with vectors.
Expect to see a Feature Modelling tab and usage notebook, covering how to combine cell types, regions, annotations and spatial metrics to generate features for machine learning / deep learning models to a) predict clinical responses / outcomes and b) perform feature selection.
