sctk.multi_resolution_cluster_qc

sctk.multi_resolution_cluster_qc(ad, metrics, failed=True, res=array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.]), n_pcs=None, n_neighbors=None, threshold=0.5, clus_key='qc_cluster', umap_key='X_umap_qc', cell_qc_key='cell_passed_qc', key_added='cluster_passed_qc', consensus_threshold=0.5, consensus_frac_key='consensus_fraction', consensus_call_key='consensus_passed_qc') → None

Run generate_qc_clusters() and clusterwise_qc() for a number of potential resolutions. Uses the multiple clusterings to identify a more robust set of QC calls than that derived from a single resolution. Proposes two sets of QC calls:

1. A clustering resolution is proposed as the one that grants the highest Jaccard index between cell-level and cluster-level QC calls.

2. For each cell, the fraction of tested resolutions where the cell passes QC is computed. This fraction is then thresholded to get a QC call.

Stores both sets of QC calls in the input object.
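The two proposals above can be illustrated with a minimal NumPy sketch on toy data. This is not the sctk implementation: the `jaccard` helper, the toy arrays, and the convention for two empty sets are all assumptions made for illustration. The strict "more than" comparison for the consensus call follows the description of consensus_threshold.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard index of two boolean masks: |a AND b| / |a OR b|."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    # Assumption: two empty sets are treated as identical.
    return 1.0 if union == 0 else float(np.logical_and(a, b).sum() / union)

# Toy cell-level QC calls for six cells (True = passed QC).
cell_passed = np.array([True, True, False, False, True, False])

# Hypothetical cluster-level calls at three resolutions,
# already broadcast from clusters back to individual cells.
cluster_passed = {
    0.1: np.array([True, True, True, True, True, True]),
    0.5: np.array([True, True, False, False, True, True]),
    1.0: np.array([True, True, False, False, True, False]),
}

# Proposal 1: pick the resolution with the highest Jaccard index.
failed = True  # compare the sets of failing cells, as with failed=True
scores = {}
for res, calls in cluster_passed.items():
    if failed:
        scores[res] = jaccard(~cell_passed, ~calls)
    else:
        scores[res] = jaccard(cell_passed, calls)
best_res = max(scores, key=scores.get)

# Proposal 2: fraction of resolutions in which each cell passes,
# thresholded with a strict "more than" comparison.
consensus_threshold = 0.5
consensus_fraction = np.vstack(list(cluster_passed.values())).mean(axis=0)
consensus_passed = consensus_fraction > consensus_threshold
```

In this toy case the finest resolution reproduces the cell-level failures exactly, so it is the one selected, while the consensus call still passes the last cell because it passes in two of the three resolutions.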

Parameters:
  • ad – AnnData object to generate QC clusters for.

  • metrics – generate_qc_clusters() argument. List of QC metrics to use for generating QC clusters. Must be present as obs columns.

  • failed – If True, will maximise the Jaccard index for cell-level and cluster-level calls of cells failing QC. If False, will do so for cells passing QC.

  • res – Resolution values to check the clustering of.

  • n_pcs – generate_qc_clusters() argument. Number of principal components to use for PCA. If not provided, this will be set to max(2, len(metrics) - 2).

  • n_neighbors – generate_qc_clusters() argument. Number of nearest neighbors to use for constructing the nearest neighbor graph. If not provided, this will be set to min(max(5, int(ad.n_obs / 500)), 10).

  • threshold – clusterwise_qc() argument. Clusters featuring at least this fraction of good QC cells will be deemed good QC clusters.

  • clus_key – Obs column to store the QC clusters in.

  • umap_key – generate_qc_clusters() argument. Obsm key to store the QC UMAP coordinates in.

  • cell_qc_key – clusterwise_qc() argument. Key to use to retrieve per-cell QC calls from obs in the AnnData.

  • key_added – clusterwise_qc() argument. Key to use for storing the results in the AnnData obs object.

  • consensus_threshold – A cell has to pass QC in more than this fraction of tested resolutions to be flagged as a good QC cell in the consensus calls.

  • consensus_frac_key – Key to use for storing the consensus fraction in the AnnData obs.

  • consensus_call_key – Key to use for storing the consensus calls (thresholded consensus fraction) in the AnnData obs.
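The default formulas for n_pcs and n_neighbors and the threshold rule can be checked with a short pure-Python sketch. The helper names `default_n_pcs`, `default_n_neighbors`, and `clusterwise_calls` are hypothetical and are not part of sctk; only the formulas and the "at least this fraction" rule come from the parameter descriptions above.

```python
from collections import defaultdict

def default_n_pcs(metrics):
    # Documented default: max(2, len(metrics) - 2).
    return max(2, len(metrics) - 2)

def default_n_neighbors(n_obs):
    # Documented default: min(max(5, int(n_obs / 500)), 10).
    return min(max(5, int(n_obs / 500)), 10)

def clusterwise_calls(clusters, cell_passed, threshold=0.5):
    """Hypothetical helper: a cluster passes if at least `threshold`
    of its cells passed cell-level QC; every cell inherits that call."""
    total = defaultdict(int)
    good = defaultdict(int)
    for cluster, passed in zip(clusters, cell_passed):
        total[cluster] += 1
        good[cluster] += bool(passed)
    cluster_ok = {c: good[c] / total[c] >= threshold for c in total}
    return [cluster_ok[c] for c in clusters]

metrics = ["n_counts", "n_genes", "percent_mito", "percent_ribo", "percent_hb"]
n_pcs = default_n_pcs(metrics)          # five metrics give 3 components
n_neighbors = default_n_neighbors(4000) # 4000 cells give 8 neighbors

calls = clusterwise_calls(
    clusters=["0", "0", "1", "1", "1"],
    cell_passed=[True, False, False, False, True],
)
```

Here cluster "0" sits exactly at the threshold (half of its cells pass) and is therefore kept, whereas cluster "1" has only one passing cell in three and is failed as a whole.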

Returns:

None.

Raises:

None.

Examples

>>> import scanpy as sc
>>> import sctk
>>> adata = sc.datasets.pbmc3k()
>>> sctk.calculate_qc(adata)
>>> metrics_list = ["n_counts", "n_genes", "percent_mito", "percent_ribo", "percent_hb"]
>>> sctk.cellwise_qc(adata)
>>> sctk.multi_resolution_cluster_qc(adata, metrics=metrics_list)