sctk.multi_resolution_cluster_qc
- sctk.multi_resolution_cluster_qc(ad, metrics, failed=True, res=array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.]), n_pcs=None, n_neighbors=None, threshold=0.5, clus_key='qc_cluster', umap_key='X_umap_qc', cell_qc_key='cell_passed_qc', key_added='cluster_passed_qc', consensus_threshold=0.5, consensus_frac_key='consensus_fraction', consensus_call_key='consensus_passed_qc') → None
Run generate_qc_clusters() and clusterwise_qc() for a number of potential resolutions. Uses the multiple clusterings to identify a more robust set of QC calls than that derived from a single resolution. Proposes two sets of QC calls:
1. A clustering resolution is proposed as the one that grants the highest Jaccard index between cell-level and cluster-level QC calls.
2. For each cell, the fraction of tested resolutions where the cell passes QC is computed. This fraction is then thresholded to get a QC call.
Stores both sets of QC calls in the input object.
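The two proposals above can be illustrated on toy data. The following is a minimal sketch written from the description only, not sctk's actual implementation: the `jaccard` helper, the `cluster_passed` dictionary and its values are all hypothetical, and the tie-breaking behaviour (first resolution wins) is an assumption.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard index between two boolean masks (defined as 1.0 if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return np.logical_and(a, b).sum() / union

# Toy cell-level QC calls for six cells (True = passed).
cell_passed = np.array([True, True, True, False, False, True])

# Hypothetical cluster-level calls per cell at three tested resolutions.
cluster_passed = {
    0.1: np.array([True, True, True, True, True, True]),
    0.5: np.array([True, True, True, False, False, False]),
    1.0: np.array([True, False, False, False, False, True]),
}

failed = True  # mirror the `failed` parameter: compare the failing cells

def score(res):
    """Jaccard index between cell-level and cluster-level calls at one resolution."""
    calls = cluster_passed[res]
    if failed:
        return jaccard(~cell_passed, ~calls)
    return jaccard(cell_passed, calls)

# Proposal 1: the resolution with the highest Jaccard index.
best_res = max(cluster_passed, key=score)

# Proposal 2: fraction of resolutions in which each cell passes,
# thresholded with a strict "more than" comparison.
consensus_threshold = 0.5
consensus_fraction = np.stack(list(cluster_passed.values())).mean(axis=0)
consensus_passed = consensus_fraction > consensus_threshold

print(best_res)
print(consensus_fraction)
print(consensus_passed)
```

In this toy case the coarsest resolution merges failing cells into a passing cluster, so it scores a Jaccard index of zero on the failing set, while the consensus calls remain stable because each cell is judged across all resolutions.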
- Parameters:
ad – AnnData object to generate QC clusters for.
metrics – generate_qc_clusters() argument. List of QC metrics to use for generating QC clusters. Must be present as obs columns.
failed – If True, will maximise the Jaccard index for cell-level and cluster-level calls of cells failing QC. If False, will do so for cells passing QC.
res – Resolution values to check the clustering of.
n_pcs – generate_qc_clusters() argument. Number of principal components to use for PCA. If not provided, this will be set to max(2, len(metrics) - 2).
n_neighbors – generate_qc_clusters() argument. Number of nearest neighbors to use for constructing the nearest neighbor graph. If not provided, this will be set to min(max(5, int(ad.n_obs / 500)), 10).
threshold – clusterwise_qc() argument. Clusters featuring at least this fraction of good QC cells will be deemed good QC clusters.
clus_key – Obs column to store the QC clusters in.
umap_key – generate_qc_clusters() argument. Obsm key to store the QC UMAP coordinates in.
cell_qc_key – clusterwise_qc() argument. Key to use to retrieve per-cell QC calls from obs in the AnnData.
key_added – clusterwise_qc() argument. Key to use for storing the results in the AnnData obs object.
consensus_threshold – A cell has to pass QC in more than this fraction of tested resolutions to be flagged as a good QC cell in the consensus calls.
consensus_frac_key – Key to use for storing the consensus fraction in the AnnData obs.
consensus_call_key – Key to use for storing the consensus calls (thresholded consensus fraction) in the AnnData obs.
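To make the documented defaults and the cluster-level threshold concrete, here is a small sketch that evaluates the formulas quoted above in plain Python. The function names `default_n_pcs`, `default_n_neighbors` and `cluster_calls` are hypothetical helpers introduced for illustration; they are not part of sctk, and the per-cluster logic is inferred from the `threshold` description rather than taken from the library's source.

```python
from collections import defaultdict

def default_n_pcs(n_metrics):
    """Documented fallback for n_pcs: max(2, len(metrics) - 2)."""
    return max(2, n_metrics - 2)

def default_n_neighbors(n_obs):
    """Documented fallback for n_neighbors: min(max(5, int(n_obs / 500)), 10)."""
    return min(max(5, int(n_obs / 500)), 10)

def cluster_calls(clusters, cell_passed, threshold=0.5):
    """Flag a cluster as good if at least `threshold` of its cells passed QC."""
    totals = defaultdict(int)
    good = defaultdict(int)
    for cluster, passed in zip(clusters, cell_passed):
        totals[cluster] += 1
        good[cluster] += bool(passed)
    return {c: good[c] / totals[c] >= threshold for c in totals}

# Five metrics give three principal components.
print(default_n_pcs(5))

# The neighbour count is clamped to the range [5, 10].
for n_obs in (1000, 4000, 10000):
    print(n_obs, default_n_neighbors(n_obs))

# Cluster "0" has 1 of 2 good cells (0.5, passes at threshold 0.5);
# cluster "1" has 1 of 3 good cells (fails).
clusters = ["0", "0", "1", "1", "1"]
cell_passed = [True, False, False, False, True]
print(cluster_calls(clusters, cell_passed))
```

Because the comparison for clusters is "at least" while the consensus call uses "more than", a value exactly equal to 0.5 passes the cluster-level test but not the consensus test.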
- Returns:
None.
- Raises:
None.
Examples
>>> import scanpy as sc
>>> import sctk
>>> adata = sc.datasets.pbmc3k()
>>> sctk.calculate_qc(adata)
>>> metrics_list = ["n_counts", "n_genes", "percent_mito", "percent_ribo", "percent_hb"]
>>> sctk.cellwise_qc(adata)
>>> sctk.multi_resolution_cluster_qc(adata, metrics=metrics_list)