AnnData Analysis#
This notebook explores the use of the AnnData Analysis tab for interactively analysing anndata.AnnData objects contained within spatialdata.SpatialData objects.
The examples shown here follow on from the outputs of the previous TMA processing section.
# The code below attaches napari to the notebook for screenshot purposes.
# All operations are performed in the viewer window.
import warnings

import matplotlib.pyplot as plt
import napari
from IPython import get_ipython
from loguru import logger


class ScreenshotContext:
    """Render inline while taking a screenshot, then restore the Qt backend."""

    def __enter__(self):
        get_ipython().run_line_magic("matplotlib", "inline")

    def __exit__(self, exc_type, exc_value, traceback):
        get_ipython().run_line_magic("matplotlib", "qt")


plt.rcParams["figure.figsize"] = (20, 20)
warnings.filterwarnings("ignore")
logger.remove()
viewer = napari.current_viewer() or napari.Viewer()
Data Import#
If not done already:
Import the SpatialData .zarr (see Getting started)
Select coordinate system to work with (global)
(Optional) Select and double click the raw image NSCLC4301 to add it as a layer. Visualise the channel of choice.
Select and double click the NSCLC4301_labels element to add the cell segmentation (and cell table) as a layer
In the top part of the widget, select the NSCLC_labels_expression table
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
AnnData Analysis Tab#
The AnnData Analysis
tab consists of various widgets arranged in a hierarchy.
From top to bottom:
Table Selection#
This provides a list of tables in the SpatialData object from which the user can choose to analyse. In this case, the suitable table is the expression table.
AnnData Tree#
This is a viewer-model-like widget, backed by a class called AnnDataNodeQT, which we introduce for holding and saving AnnData objects organised in a tree data structure. It wraps AnnData objects and establishes how they are related to one another. Because the interface has no code blocks or cells in which the user can look up an AnnData object, all AnnData objects are instead shown in a single visual, interactive widget. The tree structure facilitates highly interactive and exploratory analysis by tracking the many AnnData objects and subsets produced by the different combinations of operations the user performs.
At this stage, things to note:#
When performing analysis on a selected AnnData node, some computations do not propagate to other nodes.
This is mainly due to differences in the shapes between nodes (different obs and/or vars).
Therefore, embeddings (.obsm), transformations (.X and .layers), and neighbor graphs (.uns and/or .obsp) are contained within the selected node.
Automatic I/O:#
As the user adds, deletes, annotates, modifies and renames AnnData nodes, the changes are saved to their on-disk representations automatically.
Each node is saved as a separate AnnData table
The tree structure is cached as a dictionary called “tree_attrs” in .uns, with a reference to the stores of an AnnData node’s parent and children
These are used to reconstruct the AnnData tree widget every time it is reloaded
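To make the idea of rebuilding the tree from per-node references concrete, here is a minimal sketch in plain Python. The node names and the "parent" / "children" keys are assumptions made for illustration; the actual layout of “tree_attrs” in napari-prism may differ.

```python
# Hypothetical per-node metadata, mimicking what could be cached in
# adata.uns["tree_attrs"]. The "parent"/"children" keys are assumptions.
stores = {
    "root": {"parent": None, "children": ["added_obs_as_var"]},
    "added_obs_as_var": {"parent": "root", "children": ["filter_by_DAPI_value"]},
    "filter_by_DAPI_value": {"parent": "added_obs_as_var", "children": []},
}


def find_root(stores):
    """Return the name of the single node that has no parent."""
    roots = [name for name, attrs in stores.items() if attrs["parent"] is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


def build_tree(stores, name):
    """Recursively rebuild a nested {child: {grandchild: ...}} mapping."""
    return {child: build_tree(stores, child) for child in stores[name]["children"]}


root = find_root(stores)
tree = {root: build_tree(stores, root)}
print(tree)
```

Because every node only stores references to its immediate relatives, each table can be saved independently and the full hierarchy can still be recovered on reload.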
Actions:#
Renaming: Double click the name in the widget to rename. NOTE: Only terminal nodes (AnnDatas with no subsets) can be renamed.
Delete (only if it's a subset) - This will delete the node and all of its subsets.
Annotate Obs - launches an annotation table (see below)
Annotate Obs#
When Annotate Obs is clicked, the following is launched:
When Annotate is pressed, an annotation table is shown where the user can enter values to map to a column in obs (e.g. based on tma_label).
The default column name Annotation can be relabelled by double clicking it (e.g. core_type).
Select Confirm, and this will be mapped accordingly in .obs. Blank annotations (I-1) will be labelled as NaN:
# If we view the AnnData contained in this widget:
tree = (
    viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
    .widget()
    .tree
)
tree.adata.obs
| | cell_ID | area | centroid_x | centroid_y | eccentricity | solidity | tma_label | lyr | core_type |
---|---|---|---|---|---|---|---|---|---|
1_A-5 | 1 | 448.0 | 74.765625 | 1314.640625 | 0.863006 | 0.945148 | A-5 | NSCLC4301_labels | example1 |
2_A-5 | 2 | 461.0 | 69.034707 | 1340.616052 | 0.837425 | 0.931313 | A-5 | NSCLC4301_labels | example1 |
3_A-5 | 3 | 332.0 | 82.954819 | 1328.078313 | 0.659165 | 0.935211 | A-5 | NSCLC4301_labels | example1 |
4_A-5 | 4 | 310.0 | 82.838710 | 1353.954839 | 0.768014 | 0.796915 | A-5 | NSCLC4301_labels | example1 |
5_A-5 | 5 | 215.0 | 93.274419 | 1399.720930 | 0.497691 | 0.942982 | A-5 | NSCLC4301_labels | example1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
7363_I-1 | 236549 | 1879.0 | 2227.356573 | 1040.453433 | 0.404111 | 0.959163 | I-1 | NSCLC4301_labels | NaN |
7364_I-1 | 236550 | 1221.0 | 2266.730549 | 1133.805897 | 0.776592 | 0.873391 | I-1 | NSCLC4301_labels | NaN |
7365_I-1 | 236551 | 504.0 | 2264.309524 | 1060.043651 | 0.582475 | 0.947368 | I-1 | NSCLC4301_labels | NaN |
7366_I-1 | 236552 | 809.0 | 2269.645241 | 1035.556242 | 0.640624 | 0.945093 | I-1 | NSCLC4301_labels | NaN |
7367_I-1 | 236553 | 269.0 | 2282.869888 | 442.936803 | 0.860358 | 0.967626 | I-1 | NSCLC4301_labels | NaN |
236553 rows × 9 columns
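The annotation step above amounts to mapping each tma_label value to a user-supplied value. A minimal pandas sketch, assuming a toy obs table and a hypothetical annotation dictionary in which I-1 was left blank:

```python
import pandas as pd

# Toy stand-in for adata.obs; only the column names come from the text.
obs = pd.DataFrame(
    {"tma_label": ["A-5", "A-5", "I-1"]},
    index=["1_A-5", "2_A-5", "7363_I-1"],
)

# Hypothetical contents of the annotation table: I-1 was left blank.
annotations = {"A-5": "example1"}

# Series.map leaves labels without an entry as NaN.
obs["core_type"] = obs["tma_label"].map(annotations)
print(obs)
```

This is only an illustration of the mapping behaviour; the widget itself handles the column creation once Confirm is pressed.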
When Import CSV is pressed, the user can add multiple existing annotations from a .csv file, each mapped to an .obs column. This is usually how clinical metadata is mapped back to the cores. The .csv file must have a column tma_label whose values follow the format {letter}-{number} for cores.
Below we have a metadata csv file called tma_label_metadata.csv, which has a column tma_label and three example metadata columns.
If the .csv is valid, the merged form to add to AnnData.obs will be displayed:
Select Confirm, and this will be added to .obs and mapped to every cell:
# If we view the AnnData contained in this widget:
tree = (
    viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
    .widget()
    .tree
)
tree.adata.obs
| | cell_ID | area | centroid_x | centroid_y | eccentricity | solidity | tma_label | lyr | core_type | Metadata1 | Metadata2 | Metadata3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 448.0 | 74.765625 | 1314.640625 | 0.863006 | 0.945148 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D |
1 | 2 | 461.0 | 69.034707 | 1340.616052 | 0.837425 | 0.931313 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D |
2 | 3 | 332.0 | 82.954819 | 1328.078313 | 0.659165 | 0.935211 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D |
3 | 4 | 310.0 | 82.838710 | 1353.954839 | 0.768014 | 0.796915 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D |
4 | 5 | 215.0 | 93.274419 | 1399.720930 | 0.497691 | 0.942982 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
236548 | 236549 | 1879.0 | 2227.356573 | 1040.453433 | 0.404111 | 0.959163 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A |
236549 | 236550 | 1221.0 | 2266.730549 | 1133.805897 | 0.776592 | 0.873391 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A |
236550 | 236551 | 504.0 | 2264.309524 | 1060.043651 | 0.582475 | 0.947368 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A |
236551 | 236552 | 809.0 | 2269.645241 | 1035.556242 | 0.640624 | 0.945093 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A |
236552 | 236553 | 269.0 | 2282.869888 | 442.936803 | 0.860358 | 0.967626 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A |
236553 rows × 12 columns
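Conceptually, the CSV import behaves like a left join of the per-cell obs table with the per-core metadata on tma_label. The sketch below uses pandas with made-up CSV contents that mirror the values in the table above; it is an illustration of the join, not the plugin's own code.

```python
import io

import pandas as pd

# Hypothetical contents of tma_label_metadata.csv (values mirror the table above).
csv_text = """tma_label,Metadata1,Metadata2,Metadata3
A-5,Meta1,5.0,D
I-1,Meta1,2.0,A
"""
metadata = pd.read_csv(io.StringIO(csv_text))

# Toy stand-in for adata.obs with one row per cell.
obs = pd.DataFrame({"cell_ID": [1, 2, 236553], "tma_label": ["A-5", "A-5", "I-1"]})

# A left join keeps every cell and broadcasts core-level values to its cells.
# validate="many_to_one" raises if a core appears more than once in the CSV.
merged = obs.merge(metadata, on="tma_label", how="left", validate="many_to_one")
print(merged)
```

Note that a plain pandas merge like this returns a fresh integer index rather than the original cell names.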
We also import real responder clinical metadata which marks ‘cores’ or patients as ‘Responders’ or ‘Non-Responders’:
A-5 is a core from a patient that responded to therapy
I-1 is a core from a patient that did not respond to therapy
# If we view the AnnData contained in this widget:
tree = (
    viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
    .widget()
    .tree
)
tree.adata.obs
| | cell_ID | area | centroid_x | centroid_y | eccentricity | solidity | tma_label | lyr | core_type | Metadata1 | Metadata2 | Metadata3 | Response |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 448.0 | 74.765625 | 1314.640625 | 0.863006 | 0.945148 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D | Responder |
1 | 2 | 461.0 | 69.034707 | 1340.616052 | 0.837425 | 0.931313 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D | Responder |
2 | 3 | 332.0 | 82.954819 | 1328.078313 | 0.659165 | 0.935211 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D | Responder |
3 | 4 | 310.0 | 82.838710 | 1353.954839 | 0.768014 | 0.796915 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D | Responder |
4 | 5 | 215.0 | 93.274419 | 1399.720930 | 0.497691 | 0.942982 | A-5 | NSCLC4301_labels | example1 | Meta1 | 5.0 | D | Responder |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
236548 | 236549 | 1879.0 | 2227.356573 | 1040.453433 | 0.404111 | 0.959163 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A | Non-responder |
236549 | 236550 | 1221.0 | 2266.730549 | 1133.805897 | 0.776592 | 0.873391 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A | Non-responder |
236550 | 236551 | 504.0 | 2264.309524 | 1060.043651 | 0.582475 | 0.947368 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A | Non-responder |
236551 | 236552 | 809.0 | 2269.645241 | 1035.556242 | 0.640624 | 0.945093 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A | Non-responder |
236552 | 236553 | 269.0 | 2282.869888 | 442.936803 | 0.860358 | 0.967626 | I-1 | NSCLC4301_labels | NaN | Meta1 | 2.0 | A | Non-responder |
236553 rows × 13 columns
AnnData Operator Tabs#
These tabs contain widgets which perform operations on the currently selected AnnData node. These operations encompass a set of tools and functions broadly categorised into cell typing and basic spatial analysis.
Feature modelling will be a future addition. This tab will provide an interface to train machine learning (sklearn / cuML) and deep learning models (PyTorch + PyTorch Geometric) using features encompassing all objects in the SpatialData object (i.e. from the raw image to cell-level spatial metrics).
Cell Typing#
Augmentation#
Cell Typing
> Augmentation
This tab contains functions for modifying the shape of the expression matrix and the AnnData object.
Additive Augmentation#
Cell Typing
> Augmentation
> Additive Augmentation
Here the user can choose to expand the expression matrix along the feature axis by adding numerical anndata.AnnData.obs columns as anndata.AnnData.var attributes. For example, we can add morphological features like solidity, eccentricity and area as ‘expression’ features.
Under Root, this will add a new ‘subset’ AnnData, added_obs_as_var.
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
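The additive augmentation described above can be pictured as appending obs columns to the expression matrix as extra feature columns. Below is a minimal NumPy/pandas sketch on toy data. The `_var` suffix is an assumption that mirrors the area_var name appearing later in this notebook, and the plugin's actual implementation may differ.

```python
import numpy as np
import pandas as pd

# Toy expression matrix (cells x markers) and obs table.
X = np.array([[10.0, 2.0], [8.0, 1.0], [0.5, 7.0]])
var_names = ["DAPI", "CD14"]
obs = pd.DataFrame({"area": [448.0, 461.0, 332.0], "solidity": [0.945, 0.931, 0.935]})

# Append the chosen numerical obs columns as extra feature columns.
obs_cols = ["area", "solidity"]
augmented_X = np.hstack([X, obs[obs_cols].to_numpy(dtype=float)])
augmented_var_names = var_names + [f"{c}_var" for c in obs_cols]

print(augmented_X.shape)
print(augmented_var_names)
```

The number of cells is unchanged; only the feature axis grows, which is why the result is stored as a new node rather than modifying the root.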
Reductive Augmentation#
Cell Typing
> Augmentation
> Reductive Augmentation
Here the user can choose to reduce the expression matrix features (e.g. to drop poor quality markers, or to perform clustering on a subset of tumour markers).
Subset multiple variables by holding Ctrl (⌘ on Mac) then left clicking or dragging.
As an example, we subset to a smaller handful of markers useful for differentiating tumour from non-tumour cells (including DAPI as a QC marker, and the morphological features added above).
By default, the node will be named after the list of all markers in the subset. We can rename the node by double clicking the row title and entering a more informative name like: separate_tumor_non_tumor
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
Preprocessing Tab#
Cell Typing
> Preprocessing
This tab consists of three sub-tabs:
Transforms: Perform transformations sequentially on a user-specified expression layer.
Quality Control / Filtering: Interactive filtering with an integrated histogram plot to choose thresholds accordingly.
Dimensionality Reduction: Perform PCA, UMAP, and Harmonypy on a user-specified expression layer.
If RAPIDS
and/or rapids_singlecell
were installed with the package, a Use GPU
tick box will appear. When ticked, the relevant (not every!) functions will run on the GPU. Moving data between CPU and GPU memory will be automatically handled.
Transforms#
Cell Typing
> Preprocessing
> Transforms
We provide 5 transform functions to choose from:
Arcsinh (cofactor 150)
Scale (wrapper for scanpy.pp.scale())
Percentile (95th)
Z-score
Log1p (wrapper for scanpy.pp.log1p())
As an example, we perform 3 transforms in sequence: arcsinh, scale and then z-score. Once transform is called, a new expression layer (arcsinh_scale_zscore_augmented_X
) will be added and available for selection:
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
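To illustrate how sequential transforms and the resulting layer name fit together, here is a small NumPy sketch. The helper names (`apply_chain`, `TRANSFORMS`, `standardise`) are hypothetical, and the z-score is assumed to be computed per feature. The scale and percentile transforms are left out because their exact behaviour in the plugin is not described here.

```python
import numpy as np


def arcsinh(x, cofactor=150.0):
    # Cofactor of 150 is the value listed above.
    return np.arcsinh(x / cofactor)


def log1p(x):
    return np.log1p(x)


def standardise(x):
    # Assumption: zero mean and unit variance per feature (column).
    sd = x.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant features
    return (x - x.mean(axis=0)) / sd


TRANSFORMS = {"arcsinh": arcsinh, "log1p": log1p, "zscore": standardise}


def apply_chain(layers, source, names):
    """Apply transforms in order and store the result under a joined name."""
    out = layers[source]
    for name in names:
        out = TRANSFORMS[name](out)
    key = "_".join([*names, source])
    layers[key] = out
    return key


rng = np.random.default_rng(0)
layers = {"augmented_X": rng.gamma(2.0, 200.0, size=(100, 4))}
key = apply_chain(layers, "augmented_X", ["arcsinh", "zscore"])
print(key)
```

Keeping the source layer untouched and writing each chain to its own key is what allows several transform combinations to coexist and be compared later.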
Different transformation choices and combinations can be performed:
To add a combination, press the + button on the very right.
To remove a combination, press the - button to the right of the transform to remove.
To modify a combination, select the dropdown box(es) and choose a different transform.
Below, we do a different combination of a log1p
transform, scale
then a z-score
on the augmented_X
layer:
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
Quality Control / Filtering#
Cell Typing
> Preprocessing
> Quality Control / Filtering
We provide 5 quality control functions to choose from:
filter_by_obs_count: Filter out cells which belong to a categorical anndata.AnnData.obs value with fewer than X cells (e.g. filter out cells which belong to a TMA core with fewer than 50 cells)
filter_by_obs_value: Filter out cells which have a numerical anndata.AnnData.obs value outside a user-defined range
filter_by_obs_percentile: Same as filter_by_obs_value but using percentiles
filter_by_var_value: Filter out cells which have values outside a user-defined range for the selected anndata.AnnData.var in the selected expression layer
filter_by_var_quantile: Same as filter_by_var_value but using percentiles
Each function launches its own histogram plot, on which the user chooses a lower and upper bound and the number of bins using sliders. If the number of bins is 0, it is set to an automatic value.
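As a rough picture of what two of these filters compute, the sketch below builds boolean keep-masks with pandas and NumPy. The function bodies are illustrative re-implementations, not the plugin's code, and the inclusive bounds are an assumption.

```python
import numpy as np
import pandas as pd


def filter_by_obs_count(obs, column, min_cells):
    """Boolean mask keeping cells whose category has at least min_cells cells."""
    counts = obs[column].map(obs[column].value_counts())
    return (counts >= min_cells).to_numpy()


def filter_by_var_value(values, lower, upper):
    """Boolean mask keeping cells whose value lies within [lower, upper]."""
    values = np.asarray(values)
    return (values >= lower) & (values <= upper)


obs = pd.DataFrame({"tma_label": ["A-5", "A-5", "A-5", "I-1"]})
dapi = np.array([0.2, 850.0, 1200.0, 900.0])  # made-up raw intensities

keep_core = filter_by_obs_count(obs, "tma_label", min_cells=2)
keep_dapi = filter_by_var_value(dapi, lower=100.0, upper=np.inf)
print(keep_core & keep_dapi)
```

Applying such a mask to the parent AnnData yields the filtered subset that appears as a new node in the tree.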
As an example, we use filter_by_var_value
to filter out cells based on DAPI
in the raw layer (augmented_X
).
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
There seems to be a peak of cells with very low DAPI expression. We can work with different axes for visualising and filtering this.
For example, we can work with the X-axis on the log scale (which will be equivalent to doing a log transform on augmented_X
):
3. Press `Apply`, then `Ok`
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
We get a normal looking distribution for DAPI when log transformed, apart from the peak of cells with low DAPI expression.
We can filter these cells by dragging the leftmost slider to the cutoff point.
Note that the slider will be more sensitive due to the scale being log.
The absolute value of the slider will be annotated in red near the top of the vertical line:
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
Press Apply QC Function
.
This will create a new subset labelled by the channel and the filtering method, filter_by_DAPI_value
.
We visualise the distribution of DAPI
signal again in this subset, and change the x-axis scale to log as described above:
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
This tab is also useful for visualising the effect the different transformations have on the channel intensity distributions:
e.g. CD14 in the arcsinh_scale_zscore layer
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
e.g. CD14 in the log1p_scale_zscore layer
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
Embeddings#
Cell Typing
> Preprocessing
> Embeddings
Following from the filtered AnnData above, we can then perform dimensionality reduction to generate various embeddings of the marker expression space using functions from the scanpy package (or rapids-singlecell package with GPU acceleration).
Currently, we provide wrappers for the following functions:
We can do a simple PCA on the log1p_scale_zscore_augmented_X layer with 15 principal components:
Change Scanpy tl/pp function to pca
Press Apply
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
Then compute a 15-nearest-neighbors graph between cells in that new ‘X_pca’ embedding.
Change Scanpy tl/pp function to neighbors
Set representation key to X_pca
Set n neighbors to 15
Set n pcs to 15
(Optional) Modify the algorithm or metric; the defaults are a brute-force KNN search with Euclidean distances
Press Apply
Note that n pcs must be less than or equal to the number of PCs computed above.
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
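For readers who want to see what the PCA and neighbors steps compute, here is a minimal NumPy sketch on random toy data. It is not what the widget runs: the widget wraps scanpy, where the corresponding calls would be sc.pp.pca and sc.pp.neighbors. Excluding each cell from its own neighbor list is a convention chosen for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))  # toy stand-in for a transformed layer

# PCA via SVD of the centred matrix; keep the first 15 components.
n_comps = 15
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = U[:, :n_comps] * S[:n_comps]

# Brute-force KNN with Euclidean distances on the first n_pcs components.
n_pcs, n_neighbors = 15, 15
if n_pcs > X_pca.shape[1]:
    raise ValueError("n_pcs cannot exceed the number of computed PCs")
Z = X_pca[:, :n_pcs]
sq = (Z**2).sum(axis=1)
d2 = sq[:, None] + sq[None, :] - 2 * Z @ Z.T  # squared pairwise distances
np.fill_diagonal(d2, np.inf)  # exclude each cell from its own neighbours
knn_idx = np.argsort(d2, axis=1)[:, :n_neighbors]

print(X_pca.shape, knn_idx.shape)
```

The explicit check on n_pcs reflects the note above: the neighbor search can only use as many components as the PCA step produced.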
And finally use that neighborhood graph to compute the UMAP embedding.
Change Scanpy tl/pp function to umap
Modify the parameters. The scanpy defaults are provided. Here we increase min dist to 0.7
Press Apply
Note: This will use the .uns["neighbors"] attribute. If neighbors was not computed as in the previous step, an error will be returned in the window.
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
Visualising Attributes#
New numerical attributes generated throughout the analysis can also be conveniently visualised with napari-spatialdata, using the Scatter and/or View widgets:
Plugins > napari spatialdata > Scatter and/or View
View#
Here we visualise the log1p_scale_zscore_augmented_X transformed values for DAPI of each cell:
Select the View (napari-spatialdata) tab on the bottom right
If not already selected, select the segmentation layer in the layer list on the left (NSCLC4301_labels)
Under Tables annotating layer: Select the NSCLC4301_labels_expression_ ... _filter_by_DAPI_value table (this is its on-disk name)
Under Layers: Select the transformed layer log1p_scale_zscore_augmented_X (note that the colormap will rescale values to between 0 and 1)
Under Vars: Double click DAPI or the channel of choice
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
Let’s zoom in on the top-right core.
Note how some masks are grey.
This is because NSCLC4301_labels
contains ALL of the segmented cells in the original AnnData, but we are loading a filtered table which excludes the low DAPI signal cells, and these show up as grey masks:
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
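The grey masks can be understood as label IDs that exist in the segmentation image but have no matching row in the filtered table. A minimal NumPy sketch on a toy label image, with hypothetical cell IDs:

```python
import numpy as np

# Toy label image: 0 is background, 1..4 are segmented cells.
labels = np.array([[0, 1, 1], [2, 2, 3], [4, 4, 3]])

# Hypothetical cell IDs that survived filtering (cells 2 and 4 were removed).
table_cell_ids = np.array([1, 3])

# Masks present in the image but absent from the table have no value to colour by.
image_ids = np.unique(labels[labels != 0])
unannotated = image_ids[~np.isin(image_ids, table_cell_ids)]
print(unannotated)
```

The same comparison can serve as a quick sanity check on how many cells a filtering step removed.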
Scatter#
With Scatter, we can visualise numerical attributes from .obsm, .obs, and .var. This widget is particularly useful for visualising the 2D embeddings we computed.
Here we visualise the umap embedding:
Select the Scatter (napari-spatialdata) tab on the bottom right
Select the segmentation layer in the layer list on the left (NSCLC4301_labels) if not already selected
Under Tables annotating layer: Select the ..._filter_by_DAPI_value table
Set X-axis type and Y-axis type to obsm
Select X_umap for both
Select 0 for one axis, and 1 for the other
(Optional) Color by var or obs.
We will increase the window size and adjust the viewer to enlarge the plot.
As an example, we can colour by the transformed area variable to see if area correlates with certain cells in this embedding space:
Set Color type to var
Select area_var
Underneath, select log1p_scale_zscore_augmented_X (again, note that values will be rescaled between 0 and 1)
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
And do the same for Pan-Cytokeratin
:
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
For more information, also see: https://spatialdata.scverse.org/projects/napari/en/latest/notebooks/spatialdata.html
Clustering Tabs#
The next few tabs mark the start of the clustering component of the cell typing workflow.
Clustering methods require rather arbitrarily set parameters, which are tweaked by the user until ‘reasonable’ clusters are generated. This can lead to many repetitions of:
1. Clustering
2. Plotting and assessing the clustering results
3. Repeating steps 1-2 with different parameters if the results are of poor quality
4. Annotating the clusters based on marker expression AND the raw image
5. Repeating steps 1-4 if subclustering annotated clusters
This can quickly become tedious and disorganised when done in a Jupyter notebook.
We streamline this process by:
Merging steps 1-3 into a single step by performing clustering over a range of parameters. To do this efficiently, we provide custom implementations which:
Perform the parameter search efficiently
Accelerate the computations using GPUs (RAPIDS and rapids-singlecell)
Having the marker expression plots, the raw image, and an annotation table in a single window to streamline step 4
Having a tree structure to keep track of subsets/subclusters
Clustering Search Tab#
AnnData Analysis (napari-prism)
> Cell Typing
> Clustering Search
To date, we support two different (yet similar) clustering methods: Phenograph and Scanpy clustering. These can be chosen in the Clustering Recipe selection box.
Both methods involve:
the computation of a KNN graph, which requires a parameter K, the number of neighbors
Leiden clustering on either method’s respective ‘refined’ graph, which requires a parameter R, the clustering resolution (how granular the clustering will be)
We also include a minimum cluster size parameter (from Phenograph) and adapt it to both methods. Clusters with fewer than this many cells are assigned a label of -1.
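The minimum cluster size rule can be sketched in a few lines of NumPy. The function name `mask_small_clusters` is hypothetical and the code only illustrates the relabelling, not how either clustering backend applies it internally.

```python
import numpy as np


def mask_small_clusters(labels, min_size):
    """Return a copy of labels with clusters smaller than min_size set to -1."""
    labels = np.asarray(labels)
    ids, counts = np.unique(labels, return_counts=True)
    small = ids[counts < min_size]
    out = labels.copy()
    out[np.isin(labels, small)] = -1
    return out


labels = np.array([0, 0, 0, 1, 1, 2])
print(mask_small_clusters(labels, min_size=2))
```

Cells labelled -1 can then be treated as unassigned when plotting or annotating the clusters.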
Like the previous tab, a GPU toggle button will appear if the GPU packages were installed with napari-prism.
As input, we subset again to remove DAPI, re-compute the PCA, and call this node clustering.
We perform a clustering search using Phenograph on the GPU and CPU:
Input data: X_pca
K: 10, 15, 20, …, 30 (start at 10, stop at 30, in steps of 5)
R: 0.1, 0.2, …, 0.6 (start at 0.1, stop at 0.6 in steps of 0.1)
Minimum cluster size: 10
with ScreenshotContext():
    plt.imshow(viewer.screenshot(canvas_only=False))
    plt.axis("off")
    plt.show()
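The search above covers every combination of K and R. The sketch below enumerates that grid and shows one way a search can avoid repeated neighbor computations: compute the KNN once at the largest K and slice it for smaller values. The log output in this notebook shows a single KNN search at K = 30 followed by per-K Jaccard steps, which is consistent with this approach, but the code is an illustration on toy data rather than the plugin's implementation.

```python
from itertools import product

import numpy as np

ks = list(range(10, 31, 5))  # 10, 15, 20, 25, 30
rs = [round(0.1 * i, 1) for i in range(1, 7)]  # 0.1, 0.2, ..., 0.6
grid = list(product(ks, rs))
print(len(grid))

# Toy data: brute-force Euclidean KNN computed once for the largest K.
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 5))
d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
np.fill_diagonal(d, np.inf)  # exclude self-matches
knn_max = np.argsort(d, axis=1)[:, : max(ks)]

# Neighbours are sorted by distance, so each smaller K is a column slice.
knn_by_k = {k: knn_max[:, :k] for k in ks}
print({k: v.shape for k, v in knn_by_k.items()})
```

With 5 values of K and 6 values of R, the grid contains 30 clusterings, which is why the runtime depends on the search range and the maximum K.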
The runtime of this may depend on:
dataset size
parameter search range
maximum K parameter
hardware and GPU availability
Running the above search for this dataset of 239,540 cells on an AMD Ryzen 7 5800X 8-core AM4 4.7GHz CPU + GeForce RTX 2070 Super GPU took ~3-5 minutes.
import sys
from loguru import logger
logger.add(sys.stdout, level="INFO")
1
2024-12-15 13:22:46.548 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:parameter_search:1320 - Beginning Parameter Search...
2024-12-15 13:22:46.548 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:46]
2024-12-15 13:22:46.549 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_neighbors:265 - Performing KNN search on GPU:
K = 30
Distance Metric = euclidean
Search Algorithm = brute
2024-12-15 13:22:48.844 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_neighbors:294 - KNN GPU computed in 2.29492449760437 seconds
2024-12-15 13:22:48.844 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:48]
2024-12-15 13:22:48.849 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
KNN graph nodes = 233997
KNN graph K-neighbors = 10
CUDA call='cudaEventDestroy(event_)' at file=/__w/cuml/cuml/python/cuml/build/cp310-cp310-linux_x86_64/_deps/raft-src/cpp/include/raft/core/resource/cuda_event.hpp line=34 failed with initialization error
2024-12-15 13:22:53.961 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 5.11216926574707seconds
2024-12-15 13:22:53.962 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:53]
2024-12-15 13:22:54.169 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.1
Max iterations = 500
Min cluster size = 10
2024-12-15 13:22:55.623 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.4533240795135498 seconds
2024-12-15 13:22:55.623 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9484455446154784
2024-12-15 13:22:55.625 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:55]
2024-12-15 13:22:55.706 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.2
Max iterations = 500
Min cluster size = 10
2024-12-15 13:22:57.126 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.4200868606567383 seconds
2024-12-15 13:22:57.127 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9331931568025645
2024-12-15 13:22:57.128 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:57]
2024-12-15 13:22:57.207 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.3
Max iterations = 500
Min cluster size = 10
2024-12-15 13:22:58.494 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.2873485088348389 seconds
2024-12-15 13:22:58.495 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9243871856528958
2024-12-15 13:22:58.497 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:58]
2024-12-15 13:22:58.575 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.4
Max iterations = 500
Min cluster size = 10
2024-12-15 13:22:59.924 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.3494250774383545 seconds
2024-12-15 13:22:59.925 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.917358648330681
2024-12-15 13:22:59.926 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:22:59]
2024-12-15 13:23:00.005 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.5
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:01.268 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.2633261680603027 seconds
2024-12-15 13:23:01.268 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9110178311781054
2024-12-15 13:23:01.270 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:01]
2024-12-15 13:23:01.348 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.6
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:02.605 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 1.2569570541381836 seconds
2024-12-15 13:23:02.606 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9076262134685122
2024-12-15 13:23:02.607 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:02]
2024-12-15 13:23:02.613 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
KNN graph nodes = 233997
KNN graph K-neighbors = 15
2024-12-15 13:23:09.503 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 6.88990044593811seconds
2024-12-15 13:23:09.504 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:09]
2024-12-15 13:23:09.634 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.1
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:11.816 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.18188738822937 seconds
2024-12-15 13:23:11.817 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9394774128210623
2024-12-15 13:23:11.820 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:11]
2024-12-15 13:23:11.945 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.2
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:14.215 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.270263195037842 seconds
2024-12-15 13:23:14.216 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9194047840479728
2024-12-15 13:23:14.218 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:14]
2024-12-15 13:23:14.341 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.3
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:16.570 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.229368209838867 seconds
2024-12-15 13:23:16.571 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.908666658799452
2024-12-15 13:23:16.573 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:16]
2024-12-15 13:23:16.696 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.4
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:18.806 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.109954595565796 seconds
2024-12-15 13:23:18.807 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9002598279878875
2024-12-15 13:23:18.809 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:18]
2024-12-15 13:23:18.934 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.5
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:20.987 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.053276300430298 seconds
2024-12-15 13:23:20.987 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8920818583457851
2024-12-15 13:23:20.990 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:20]
2024-12-15 13:23:21.114 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.6
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:23.243 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 2.1290621757507324 seconds
2024-12-15 13:23:23.243 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8877194982744846
2024-12-15 13:23:23.246 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:23]
2024-12-15 13:23:23.253 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
KNN graph nodes = 233997
KNN graph K-neighbors = 20
2024-12-15 13:23:33.243 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 9.989278316497803seconds
2024-12-15 13:23:33.244 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:33]
2024-12-15 13:23:33.415 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.1
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:36.838 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.423365354537964 seconds
2024-12-15 13:23:36.839 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9363115083020052
2024-12-15 13:23:36.842 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:36]
2024-12-15 13:23:37.004 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.2
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:40.386 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.381678581237793 seconds
2024-12-15 13:23:40.386 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9160461094701717
2024-12-15 13:23:40.390 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:40]
2024-12-15 13:23:40.568 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.3
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:44.015 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.44633150100708 seconds
2024-12-15 13:23:44.015 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9022094993908114
2024-12-15 13:23:44.019 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:44]
2024-12-15 13:23:44.182 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.4
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:47.698 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.51560378074646 seconds
2024-12-15 13:23:47.698 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8932483384780607
2024-12-15 13:23:47.702 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:47]
2024-12-15 13:23:47.859 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.5
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:50.882 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.0232059955596924 seconds
2024-12-15 13:23:50.882 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8845792023086071
2024-12-15 13:23:50.885 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:50]
2024-12-15 13:23:51.048 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.6
Max iterations = 500
Min cluster size = 10
2024-12-15 13:23:54.198 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 3.150099277496338 seconds
2024-12-15 13:23:54.198 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8778153974091304
2024-12-15 13:23:54.202 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:23:54]
2024-12-15 13:23:54.209 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
KNN graph nodes = 233997
KNN graph K-neighbors = 25
2024-12-15 13:24:07.513 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 13.303729057312012seconds
2024-12-15 13:24:07.514 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:07]
2024-12-15 13:24:07.760 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.1
Max iterations = 500
Min cluster size = 10
2024-12-15 13:24:12.743 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.982553958892822 seconds
2024-12-15 13:24:12.743 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.933574570112028
2024-12-15 13:24:12.747 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:12]
2024-12-15 13:24:12.946 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.2
Max iterations = 500
Min cluster size = 10
2024-12-15 13:24:18.406 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 5.459635257720947 seconds
2024-12-15 13:24:18.406 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.911611833834457
2024-12-15 13:24:18.410 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:18]
2024-12-15 13:24:18.612 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.3
Max iterations = 500
Min cluster size = 10
2024-12-15 13:24:23.546 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.934455156326294 seconds
2024-12-15 13:24:23.547 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8968284695454755
2024-12-15 13:24:23.551 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:23]
2024-12-15 13:24:23.756 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.4
Max iterations = 500
Min cluster size = 10
2024-12-15 13:24:29.181 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 5.424898386001587 seconds
2024-12-15 13:24:29.181 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.886198516046092
2024-12-15 13:24:29.185 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:29]
2024-12-15 13:24:29.386 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.5
Max iterations = 500
Min cluster size = 10
2024-12-15 13:24:34.152 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.7659196853637695 seconds
2024-12-15 13:24:34.152 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8747998349811575
2024-12-15 13:24:34.156 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:34]
2024-12-15 13:24:34.358 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.6
Max iterations = 500
Min cluster size = 10
2024-12-15 13:24:39.108 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 4.75072717666626 seconds
2024-12-15 13:24:39.109 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8709781274910915
2024-12-15 13:24:39.113 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:39]
2024-12-15 13:24:39.119 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:511 - Performing Jaccard on CPU:
KNN graph nodes = 233997
KNN graph K-neighbors = 30
2024-12-15 13:24:55.509 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_jaccard:536 - Jaccard CPU edgelist constructed in 16.390007972717285seconds
2024-12-15 13:24:55.511 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:24:55]
2024-12-15 13:24:55.774 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.1
Max iterations = 500
Min cluster size = 10
2024-12-15 13:25:02.064 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 6.290363788604736 seconds
2024-12-15 13:25:02.065 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.934706743158094
2024-12-15 13:25:02.069 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:02]
2024-12-15 13:25:02.318 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.2
Max iterations = 500
Min cluster size = 10
2024-12-15 13:25:09.393 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 7.075368165969849 seconds
2024-12-15 13:25:09.394 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.9090085532750625
2024-12-15 13:25:09.398 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:09]
2024-12-15 13:25:09.646 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.3
Max iterations = 500
Min cluster size = 10
2024-12-15 13:25:16.919 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 7.272916555404663 seconds
2024-12-15 13:25:16.919 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.892130617514392
2024-12-15 13:25:16.924 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:16]
2024-12-15 13:25:17.171 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.4
Max iterations = 500
Min cluster size = 10
2024-12-15 13:25:23.515 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 6.344385862350464 seconds
2024-12-15 13:25:23.516 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8848316702791892
2024-12-15 13:25:23.520 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:23]
2024-12-15 13:25:23.768 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.5
Max iterations = 500
Min cluster size = 10
2024-12-15 13:25:29.742 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 5.973891496658325 seconds
2024-12-15 13:25:29.742 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.8723394623879358
2024-12-15 13:25:29.747 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:29]
2024-12-15 13:25:29.997 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:908 - Performing Leiden on GPU:
Resolution = 0.6
Max iterations = 500
Min cluster size = 10
2024-12-15 13:25:36.089 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:920 - cugraph.leiden computed in 6.09187388420105 seconds
2024-12-15 13:25:36.090 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:compute_leiden:924 - Q = 0.863084275764576
2024-12-15 13:25:36.094 | INFO | napari_prism.models.adata_ops.cell_typing._clustsearch:_log_current_time:1263 - [2024-12-15 13:25:36]
logger.remove()
Based on the clustering method chosen this will create attributes in .obsm and .uns, which are read by the next tab to assess clustering ‘quality’.
Clustering Evaluator (Assess Cluster Runs)#
AnnData Analysis (prism)
> Cell Typing
> Assess Cluster Runs
We can compute quality metrics to identify which clustering runs, and hence which parameters, generate ‘good’ clusterings. This can be quite subjective, so we choose a strategy which guides the user towards multiple reasonable runs.
Here, we define ‘good’ runs as those with ‘stable’ clusters, i.e. runs whose clusters are similar to those of other runs. To assess this, we choose from 3 quality metrics, each of which scores every clustering run against every other run:
Adjusted Rand Index (ARI)
Normalized Mutual Information (NMI)
Adjusted Mutual Information (AMI)
These scores are currently computed only on the CPU.
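To make the run-against-run comparison concrete, here is a minimal standard-library sketch of the Adjusted Rand Index, applied to every pair of runs. The run names and labels are hypothetical, and this is not the package's implementation; in practice a library function such as `sklearn.metrics.adjusted_rand_score` would normally be used.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:  # fewer than two cells: nothing to disagree on
        return 1.0
    # Contingency table: how many cells fall in each (cluster_a, cluster_b) cell
    contingency = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(n, 2) for n in contingency.values())
    sum_a = sum(comb(n, 2) for n in Counter(labels_a).values())
    sum_b = sum(comb(n, 2) for n in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # e.g. both labelings are a single cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical cluster labels from three runs over the same six cells
runs = {
    "K20_R0.2": [0, 0, 0, 1, 1, 1],
    "K20_R0.3": [1, 1, 1, 0, 0, 0],  # same partition, different label names
    "K20_R0.6": [0, 0, 1, 1, 2, 2],
}

# Score every run against every other run (the values behind a score heatmap)
ari_matrix = {
    (a, b): adjusted_rand_index(runs[a], runs[b]) for a in runs for b in runs
}
```

Because the score depends only on which cells are grouped together, the first two runs score 1.0 against each other even though their label names differ.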
Following from the clustering run above:
Select HybridPhenographSearch
In Between-Cluster Score Plots, choose a metric (here we choose Adjusted Rand Index)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
From this plot, the user would ideally choose a handful of resolutions which have stable clusters (the lowest resolution possible with the greatest similarity to other runs). Visually, this is the corner of a clearly greener box in the plot.
We decide to choose K = 20, and export both R = 0.2 and R = 0.3
We can export both runs:
Select 20 in Select K
Select 0.2 in Select R
Press Export Cluster Labels to Obs.
Repeat with 0.3 for R
This will export the cluster labels from these runs as obs columns called:
HybridPhenographSearch_K20_R0.2
HybridPhenographSearch_K20_R0.3
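The two names above suggest a `<method>_K<K>_R<R>` pattern. Assuming that pattern holds (it is inferred from these two examples only), a small hypothetical helper can build and parse such keys when working with the exported columns in code:

```python
import re

# Pattern inferred from the two exported names above; treat it as an assumption
_KEY_PATTERN = re.compile(r"^(?P<method>.+)_K(?P<k>\d+)_R(?P<r>\d+(?:\.\d+)?)$")


def cluster_obs_key(method, k, resolution):
    """Build the obs column name for one exported clustering run."""
    return f"{method}_K{k}_R{resolution}"


def parse_cluster_obs_key(key):
    """Split an obs column name back into (method, K, R)."""
    match = _KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"not a cluster label key: {key!r}")
    return match["method"], int(match["k"]), float(match["r"])


exported = [cluster_obs_key("HybridPhenographSearch", 20, r) for r in (0.2, 0.3)]
parsed = [parse_cluster_obs_key(key) for key in exported]
```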
Visualise Clusters#
AnnData Analysis (napari-prism)
> Cell Typing
> Annotate Clusters
In this tab, the user can visualise clustering labels or any categorical .obs key.
To visualise grouped expression values, we provide wrappers for scanpy cluster plots.
Here, we select scanpy.pl.matrixplot() to visualise the mean group expression in the log1p_scale_zscore_augmented_X layer of all available markers. We group by the cluster labels exported in the previous tab, HybridPhenographSearch_K20_R0.2:
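As a rough illustration of what each cell of the matrix plot summarises, the sketch below computes the mean value of each marker per cluster using only the standard library. The toy values and the `CD45` marker are hypothetical, and the commented scanpy call is only an approximation of what the widget wraps.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: one entry per cell
cluster_labels = ["0", "0", "1", "1"]
expression = {
    "Pan-Cytokeratin": [2.0, 1.0, -1.0, -2.0],
    "CD45": [-1.0, -1.0, 1.5, 0.5],  # hypothetical marker, for illustration only
}


def group_means(values, groups):
    """Mean of `values` within each group label."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    return {group: mean(vals) for group, vals in by_group.items()}


# marker -> {cluster -> mean expression}: one colored square per entry
matrix = {
    marker: group_means(values, cluster_labels)
    for marker, values in expression.items()
}

# With scanpy installed, the plot itself would come from a call of roughly this form
# (`adata` and `markers` are placeholders):
# sc.pl.matrixplot(adata, var_names=markers,
#                  groupby="HybridPhenographSearch_K20_R0.2",
#                  layer="log1p_scale_zscore_augmented_X")
```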
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
For the sake of example, we do a simple annotation of cells being tumor cells or not. However, if desired, the user can perform finer cell type annotations (although we would do this on more markers since at the start we only included basic lineage markers).
We annotate each cluster using the annotation table:
Select HybridPhenographSearch_K20_R0.2
Press Annotate
Double click the Annotation column header, rename it to any column name, then press Enter. We name the column: Tumor
Annotate each respective cluster index with a given annotation. Here we annotate each cluster as True or False (for tumor and non-tumor cells)
Press Confirm to add these annotations
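Conceptually, confirming the annotation table maps every cell's cluster index to the value entered for that cluster. A minimal sketch with hypothetical cluster assignments (on an AnnData object, a pandas `Series.map` over the cluster column would play the same role):

```python
# Hypothetical cluster labels, one entry per cell
cluster_per_cell = ["0", "1", "2", "1", "0"]

# Annotation table: cluster index -> value for the new "Tumor" column
annotation_table = {"0": True, "1": False, "2": False}

# Every cluster present in the data needs an annotation before mapping
missing = set(cluster_per_cell) - set(annotation_table)
if missing:
    raise KeyError(f"clusters without an annotation: {sorted(missing)}")

# New per-cell column derived from the per-cluster annotations
tumor_column = [annotation_table[cluster] for cluster in cluster_per_cell]
```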
We can check the group mean expression in the new annotations by changing the groupby key to Tumor. We expect tumor cells to be higher in E-cadherin, Ki67, Pan-Cytokeratin, and area_var (larger cells).
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Most importantly, we can check these annotations directly on the image by coloring the segmentation masks:
Double-click Tumor in Observations (currently there is no labels colormap legend in napari; this may be a future implementation in this package)
Select NSCLC4301 -> Double-click to switch to the Pan-Cytokeratin channel (43rd channel)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Zooming into various cores and switching to a contour view of the cell boundaries, the True (tumor) cells in lighter blue overlap with the raw Pan-Cytokeratin signal, whereas the non-tumor cells in darker blue show close to no overlap:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Subclusterer#
AnnData Analysis (prism)
> Cell Typing
> Subclusterer
The usual last step of a recursive and interactive cell typing workflow is to perform subclustering on a subset.
Select an obs variable to subcluster on (cell_type or any other annotations)
Select a category in that variable (let’s subcluster the non-tumor cells to get their functional types)
Click Subcluster
This will create a child AnnData node of only False or non-tumor cells (we rename this to non_tumor_cells).
The key function of this tab is that the full marker feature set is restored (since we may want to use markers we have excluded at the start of the workflow i.e. for finer cell typing within the non-tumor cell populations). From this we subset markers to perform cell-typing of base cell lineages.
We perform recursive cell typing: we repeat the preprocessing, clustering and annotation steps on the non_tumor_cells subset to assign base cell type lineages, then, where possible, further functionalise these cell types. All results are stored in a new annotation column called ct_func.
For the tumor subset, we name all cells as Tumor in ct_func.
Each subset will have a ‘ragged’-like array of annotations. Annotations in finer subsets (i.e. functionalised cell type labels) will take priority over coarse labels made in parent subsets. For example:
If we define cells 1 to 10 to be ‘T_cells’
Then subcluster this ‘T_cells’ population, labelling cells 1-5 as ‘T_cytotoxic’ and 6-7 as ‘T_reg’
Then cells 1-7 will take on the ‘T_cytotoxic’ and ‘T_reg’ labels, whereas cells 8-10 without further functional annotations will keep the original ‘T_cells’ label
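The priority rule in this example can be sketched as a merge in which finer labels overwrite coarser ones. The helper name `merge_annotations` is hypothetical and only illustrates the rule, not how the package stores annotations internally.

```python
def merge_annotations(*levels):
    """Merge label dicts ordered coarse -> fine; finer labels overwrite coarser ones."""
    merged = {}
    for level in levels:
        merged.update(level)
    return merged


# Parent subset: cells 1 to 10 are all 'T_cells'
coarse = {cell: "T_cells" for cell in range(1, 11)}

# Child subset: only cells 1-7 receive functional labels
fine = {cell: "T_cytotoxic" for cell in range(1, 6)}
fine.update({cell: "T_reg" for cell in (6, 7)})

ct_func = merge_annotations(coarse, fine)
```

Cells 8-10 have no entry in the finer level, so they fall back to the coarse label.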
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Spatial Analysis#
Currently, we have two spatial metrics (more coming in the future):
Cellular Neighborhoods or CNs (from Schürch et al. (2020))
Proximity Density (a part of spatial_pscore from scimap)
Below we show how we can construct different spatial graphs compatible with either method.
Build Graph#
AnnData Analysis (prism)
> Spatial Analysis
> Build Graph
In this tab, we provide a wrapper for the squidpy.gr.spatial_neighbors() function, exposing some parameters the user can tweak. See the squidpy documentation for information on the provided parameters.
Continuing on with the filter_by_DAPI_value subset:
spatial_key: spatial in .obsm.
library_key: This should represent distinct spatial regions. In this case, this will be tma_label (each TMA core).
percentile: If 0, will not threshold by percentile
For the CNs, we construct a 10-nearest neighbor spatial graph.
n_neighs: 10
key_added: knn. This will add knn_connectivities and knn_distances to .obsp
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
For proximity density, we construct a 40 pixel radius neighbors spatial graph.
radius: 40 (if not -1, then this will override n_neighs)
key_added: radial. This will add radial_connectivities and radial_distances to .obsp
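To show how the two graph types differ, here is a small standard-library sketch that builds a k-nearest-neighbor graph and a fixed-radius graph from hypothetical cell centroids. It is not squidpy's implementation; it only illustrates why the two graphs can disagree for isolated cells.

```python
from math import dist


def knn_graph(coords, n_neighs):
    """Map each cell index to the indices of its n_neighs nearest other cells."""
    graph = {}
    for i, point in enumerate(coords):
        others = [j for j in range(len(coords)) if j != i]
        others.sort(key=lambda j: dist(point, coords[j]))
        graph[i] = others[:n_neighs]
    return graph


def radius_graph(coords, radius):
    """Map each cell index to the indices of other cells within `radius`."""
    return {
        i: [
            j
            for j, other in enumerate(coords)
            if j != i and dist(point, other) <= radius
        ]
        for i, point in enumerate(coords)
    }


# Hypothetical cell centroids in pixels; the last cell is far from the rest
coords = [(0, 0), (10, 0), (30, 0), (100, 0)]

knn = knn_graph(coords, n_neighs=2)
radial = radius_graph(coords, radius=40)
```

In the k-nearest-neighbor graph every cell gets exactly two neighbors, however distant, whereas in the radius graph the isolated cell at (100, 0) has none.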
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
# If we inspect the selected AnnData,
# obsp will have:
# connectivities/distances from the scanpy neighbors
# knn_connectivities/knn_distances from the knn graph
# radial_connectivities/radial_distances from the radial graph
tree = (
viewer.window._dock_widgets["AnnData Analysis (napari-prism)"]
.widget()
.tree
)
print(tree.adata_tree_widget.currentItem().adata.obsp)
PairwiseArrays with keys: connectivities, distances, knn_connectivities, knn_distances, radial_connectivities, radial_distances
Cellular Neighborhoods#
AnnData Analysis (prism)
> Spatial Analysis
> Cellular Neighborhoods
> Compute
We integrate the computation of cellular neighborhoods (CNs) as defined in Schürch et al. (2020), to be directly compatible with the squidpy neighborhood graphs in .obsp.
Computing CNs requires a user-set parameter for the number of neighborhoods to compute (K in KMeans clustering). Inspired by Enfield et al. (2024), we implement a search over this parameter and a plot to assess the best choice using an elbow detection algorithm.
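As a rough sketch of the input to the KMeans step, each cell can be described by the phenotype composition of its graph neighbors; these vectors are then clustered into K neighborhoods. The toy graph and labels are hypothetical, and details such as whether a cell counts itself, or whether counts or fractions are used, are assumptions here rather than a description of the package.

```python
from collections import Counter


def neighborhood_composition(neighbors, phenotypes, categories):
    """For each cell, the fraction of each phenotype among its graph neighbors."""
    vectors = {}
    for cell, neighs in neighbors.items():
        counts = Counter(phenotypes[n] for n in neighs)
        total = sum(counts.values())
        vectors[cell] = [
            counts[category] / total if total else 0.0 for category in categories
        ]
    return vectors


# Hypothetical spatial graph (cell -> neighboring cells) and phenotype labels
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
phenotypes = {0: "Tumor", 1: "Tumor", 2: "T_cells", 3: "T_cells"}
categories = ["Tumor", "T_cells"]

vectors = neighborhood_composition(neighbors, phenotypes, categories)
# Each vector would then be one row of the matrix passed to KMeans
```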
Steps#
In spatial_key select knn_connectivities
In phenotype select the obs variable denoting the phenotype to compute CNs for (for this example, let’s use the ct_func labels)
Provide a search range for the number of CNs (usually, a number between 3 and 20 should be sufficient)
(Optional) for larger datasets, consider using mini batch KMeans (by ticking the box)
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
AnnData Analysis (prism)
> Spatial Analysis
> Cellular Neighborhoods
> Visualise
We can visualise the results:
Kneedle Plot - This shows the decay in KMeans ‘Inertia’ with higher Ks
Inertia is the sum of squared distances of each data point to its assigned cluster centroid (a measure of how tight the clusters are)
The dashed vertical line is the ‘optimal’ K according to the Kneedle algorithm (the elbow of the curve, beyond which a higher K gives only small reductions in inertia)
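The two ideas above can be sketched in a few lines. The inertia function follows the definition given here; the elbow function is a simplified stand-in for Kneedle (no smoothing or sensitivity parameter) that picks the K lying furthest below the straight line joining the two ends of the curve. All input values are hypothetical.

```python
def inertia(points, centroids, assignment):
    """Sum of squared distances of each point to its assigned centroid."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(point, centroids[label]))
        for point, label in zip(points, assignment)
    )


def elbow_k(ks, inertias):
    """Simplified elbow detection for a decreasing inertia curve."""
    if len(ks) < 2:
        raise ValueError("need at least two K values")
    x0, x1 = ks[0], ks[-1]
    y0, y1 = inertias[0], inertias[-1]

    def gap(k, y):
        # Height of the end-to-end chord at k, minus the observed inertia
        chord = y0 + (y1 - y0) * (k - x0) / (x1 - x0)
        return chord - y

    return max(zip(ks, inertias), key=lambda pair: gap(*pair))[0]


# Hypothetical clustering of three 2D points into two clusters
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
centroids = [(0.0, 1.0), (10.0, 0.0)]
example_inertia = inertia(points, centroids, assignment=[0, 0, 1])

# Hypothetical inertia values from a search over K
best_k = elbow_k([2, 3, 4, 5, 6], [100.0, 40.0, 20.0, 15.0, 12.0])
```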
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
K = 8 seems to be the ‘optimal’ number for CNs. We can check the enrichment of phenotypes with the Enrichment Matrix plotting tab:
Choose K Kmeans run: 8
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Press Export K to export the 8 CNs as labels to .obs. This will appear as cns_k8.
We can annotate the CNs like we annotated cell types using the annotation table.
We give each CN index a general descriptive label:
0: Immune_1_enriched
1: Immune_2_enriched
2: Tumor_immune_depleted
3: Endothelial_enriched
4: Fibroblast_enriched
5: Immune_3_enriched
6: Tumor_immune_mixed
7: Myeloid_enriched
We can then visualise these CNs back on the cell segmentation masks on top of the raw image in View (napari-spatialdata):
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
In B-2, we can see the Tumor_immune_depleted CNs in light green and the Tumor_immune_mixed CNs in olive green, possibly showing a tumor nest + tumor interface-like pattern:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Proximity Density#
AnnData Analysis (prism)
> Spatial Analysis
> Proximity Density
> Compute
We provide a wrapper for computing the proximity density component of the spatial_pscore function from scimap.
Steps#
In spatial_key select radial_connectivities
In phenotype select cell_type_new
Select the pairs to compute proximity density between. If none are selected, it will compute all unique combinations without replacement (i.e. it assumes commutativity, or undirected edges, since radial-based graphs hold that assumption). Otherwise, it will compute the Cartesian product between the left and right selections.
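The pair-selection rule can be sketched with `itertools`. The function name `phenotype_pairs` and the phenotype names are hypothetical, and the behaviour when only one side is selected is not specified in this notebook, so the sketch only covers the two cases described.

```python
from itertools import combinations, product


def phenotype_pairs(phenotypes, left=(), right=()):
    """Pairs to score: all unordered pairs if nothing is selected,
    otherwise the Cartesian product of the left and right selections."""
    if not left and not right:
        return list(combinations(sorted(set(phenotypes)), 2))
    return list(product(left, right))


phenotypes = ["Tumor", "T_cytotoxic", "T_reg"]

# Nothing selected: every unordered pair appears exactly once
all_pairs = phenotype_pairs(phenotypes)

# Tumor on the left against the selected immune populations on the right
selected_pairs = phenotype_pairs(
    phenotypes, left=["Tumor"], right=["T_cytotoxic", "T_reg"]
)
```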
For example, the selection below will compute Tumor to every typed immune cell population:
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Starting plotting..
Starting calculating row orders..
Reordering rows..
Starting calculating col orders..
Reordering cols..
Plotting matrix..
Starting plotting HeatmapAnnotations
Collecting legends..
Collecting annotation legends..
Plotting legends..
AnnData Analysis (prism)
> Spatial Analysis
> Proximity Density
> Visualise
We implement a clustermap from PyComplexHeatmap to visualise these scores for each cell type pair, in every distinct region.
region_key: tma_label
metadata_key: Select a metadata key to map. These will have a 1:1 or 1:N relation with tma_label (i.e. for a single or multiple tma_label(s), there will be a matching metadata label). Here we go with the example at the top of this notebook of the fake metadata column Metadata1.
Press Plot
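Before plotting, it can help to confirm that the chosen metadata column really does map each region to a single value. The sketch below is a hypothetical check using only the standard library; the core names and metadata values are made up for illustration.

```python
from collections import defaultdict


def metadata_per_region(region_labels, metadata_labels):
    """Map each region to its single metadata value; raise if a region has several."""
    seen = defaultdict(set)
    for region, meta in zip(region_labels, metadata_labels):
        seen[region].add(meta)
    conflicting = {
        region: sorted(values) for region, values in seen.items() if len(values) > 1
    }
    if conflicting:
        raise ValueError(f"regions with more than one metadata value: {conflicting}")
    return {region: next(iter(values)) for region, values in seen.items()}


# Hypothetical per-cell columns: TMA core and a metadata label
tma_label = ["A-1", "A-1", "B-2", "C-3"]
metadata1 = ["responder", "responder", "responder", "non_responder"]

region_to_meta = metadata_per_region(tma_label, metadata1)
```

Several cores may share one metadata value, but a single core carrying two different values would raise an error.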
with ScreenshotContext():
plt.imshow(viewer.screenshot(canvas_only=False))
plt.axis("off")
plt.show()
Future + Roadmap#
This covers all current features so far in PRISM.
In the near future:
Expect to see more spatial features to compute and plot, alongside spatial graph plots in the main viewer with vectors.
Expect to see a Feature Modelling tab and usage notebook, covering how to combine cell types, regions, annotations and spatial metrics to generate features for machine learning / deep learning models to a) predict clinical responses / outcomes and b) perform feature selection.