sc normalization and clustering #48

Open
lpantano opened this issue Aug 27, 2024 · 0 comments
Normalization: log-normalization, SCT, or both?
Log-normalized data for FindMarkers?
SCT creates a new assay (SCT) that holds the transformed data
Use SCT for clustering
Use log-normalized data for looking at expression
In the new version, SCT covers NormalizeData(), ScaleData(), and FindVariableFeatures() in one call.
The glmGamPoi package substantially improves speed, but memory use is high; turn it off if memory is an issue
Check whether the new version of SCT actually runs NormalizeData(), and check whether the numbers differ (compare the counts and data in the RNA assay)
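To make the "counts vs. data" check above concrete, here is a minimal sketch of the arithmetic usually described for log-normalization: each count is divided by the cell's total, multiplied by a scale factor, and passed through log1p. This is plain Python for illustration, not Seurat code; the function name `log_normalize`, the example counts, and the scale factor of 10,000 are assumptions made for the example.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Log-normalize one cell's counts: log1p(count / total * scale_factor)."""
    total = sum(counts)
    if total == 0:
        # An empty cell has nothing to scale; return zeros instead of dividing by zero.
        return [0.0] * len(counts)
    return [math.log1p(c / total * scale_factor) for c in counts]

# Hypothetical raw counts for one cell across four genes.
raw_counts = [0, 3, 1, 6]
normalized = log_normalize(raw_counts)

# The check from the notes: after normalization the values should differ from the counts.
values_differ = any(abs(n - c) > 1e-12 for n, c in zip(normalized, raw_counts))
```

If the normalized values were identical to the raw counts, that would suggest the normalization step had not actually been applied to that layer.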
Regress covariates? When they are not relevant to the question and are affecting clustering
Look at PCA for variables to possibly regress, if needed
Look at UMAP for variables to regress
Try not to regress covariates by default
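As a rough illustration of what regressing out a covariate means, the sketch below fits a least-squares line of one gene's expression on one covariate and keeps the residuals. It is a simplified, single-gene, single-covariate picture and not the implementation of any particular package; `regress_out`, `percent_mito`, and the numbers are hypothetical.

```python
def regress_out(values, covariate):
    """Return residuals of a least-squares line fit of values on covariate."""
    n = len(values)
    mean_x = sum(covariate) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        # A constant covariate explains nothing; only centre the values.
        return [y - mean_y for y in values]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]

# Hypothetical data: one gene's expression and percent mitochondrial reads per cell.
percent_mito = [1.0, 2.0, 3.0, 4.0, 5.0]
expression = [2.1, 2.9, 4.2, 4.8, 6.0]
residuals = regress_out(expression, percent_mito)
```

The residuals carry no linear trend with the covariate, which is also why regressing by default is risky: if the covariate tracks real biology, that signal is removed along with the nuisance variation.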
Integration
When to do it: with multiple samples or batches, when you need to make sure clusters are based on cell type/stage and not on another variable
What to integrate on: sample, batch, or any other variable that is confounding the clustering step
Always use Harmony?
It could be Harmony only, even with one covariate
Harmony for two or more covariates, CCA for one
Start with log-normalization and look at the metadata on the UMAP; if there is clear separation, try SCT; if mito/ribo genes are driving the differences, remove those genes from the variable genes (VariableGenes function)
If the biological variable separates the clusters too much, it may still be useful to force cells to be the same across conditions
Look at samples individually and annotate clusters first; then put the samples together to see how the prior clusters align, and decide based on that
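One way to see how prior per-sample annotations align with the clusters found after combining samples is a simple cross-tabulation. The sketch below is plain Python and makes no claim about how any integration tool works; `best_match` and the labels and cluster ids are hypothetical.

```python
from collections import Counter, defaultdict

def best_match(prior_labels, joint_clusters):
    """For each prior label, find the joint cluster holding most of its cells."""
    table = defaultdict(Counter)
    for prior, joint in zip(prior_labels, joint_clusters):
        table[prior][joint] += 1
    result = {}
    for prior, counts in table.items():
        cluster, n_cells = counts.most_common(1)[0]
        # Store the dominant joint cluster and the fraction of cells it captures.
        result[prior] = (cluster, n_cells / sum(counts.values()))
    return result

# Hypothetical per-cell prior annotations and cluster ids after combining samples.
prior = ["T", "T", "T", "B", "B", "Mono", "Mono", "Mono", "Mono"]
joint = [0, 0, 1, 1, 1, 2, 2, 2, 0]
alignment = best_match(prior, joint)
```

A prior label whose cells are spread over several joint clusters with low fractions is a candidate for a closer look before deciding whether and how to integrate.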
Clustering
How to display data and choose resolution?
Leiden clustering
Clustering tree
UMAP for each resolution:
Broader: 0.1
Granular: 0.8
Follow the clustering across resolutions
Plotting markers on UMAP
Bar plots showing the proportions of a metadata variable per cluster
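The numbers behind such a bar plot are per-cluster proportions of a metadata variable. The sketch below computes them in plain Python under stated assumptions: `proportions_per_cluster`, the cluster ids, and the `sample` values are all hypothetical, and plotting is left out.

```python
from collections import Counter, defaultdict

def proportions_per_cluster(clusters, metadata):
    """Return {cluster: {metadata value: fraction of that cluster's cells}}."""
    counts = defaultdict(Counter)
    for cluster, value in zip(clusters, metadata):
        counts[cluster][value] += 1
    return {
        cluster: {value: n / sum(c.values()) for value, n in c.items()}
        for cluster, c in counts.items()
    }

# Hypothetical cluster assignments and one metadata variable per cell.
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
sample = ["ctrl", "ctrl", "treated", "treated", "treated", "treated", "treated", "ctrl"]
props = proportions_per_cluster(clusters, sample)
```

Proportions within each cluster sum to 1, so clusters of different sizes can be compared side by side; a cluster dominated by one sample or batch is worth checking against the integration notes.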
Who IDs the clusters? (the client!)
Guide with known methods: celltypist, ?
Sub-cluster within a specific cell type to identify its sub-clusters
Annotate with high/low genes
FindMarkers vs. FindConservedMarkers vs. FindAllMarkers

@lpantano lpantano added the enhancement New feature or request label Aug 27, 2024