Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt Azimuth code for Wilms label transfer #843

Conversation

sjspielman
Copy link
Member

@sjspielman sjspielman commented Oct 29, 2024

Towards #810

This PR does the crux of the work to swap over label transfer strategies. There are a lot of code changes, but I think they are all ok for 1 PR since they are all highly related (please let me know if you want me to cherry pick some things out!).

This PR implements the following changes:

  • Add script notebook_template/utils/label-transfer-functions.R. This script contains functions adapted from Azimuth (with source links) to prepare a query for label transfer and perform label transfer
  • Update notebooks (02a and 02b) that perform label transfer to use these new functions rather than Azimuth
    • I also added a param for CI to this notebook so we can set k.weight accordingly
  • Update 00_run_workflow.sh (now a shell script; see Create shell script for wilms-06 workflow #817 & issue Use shell script for cell-type-wilms-tumor-06 workflow #816) so these steps are not run in CI, but use the CI param

There are also some additional smol changes:

  • A minor tweak to one of the .gitleaks.toml regex's for efficiency
  • Remove code that installs the fetus reference from scripts/prepare-fetal-references.R, since this data is now tracked with renv (Add fetusref.SeuratData to renv #831)

I also added a new directory supplemental-notebooks which contains a README and 1 notebook (here is the HTML for review though it's also committed: compare-label-transfer-approaches.nb.html.zip) to compare these results to Azimuth. Before running this notebook, I re-ran the main branch workflow up through label transfer (including object Seurat preparation) to ensure results being compared use the current data release. Then, I ran this notebook to compare results between Azimuth and the code here which is adapted from Azimuth over 5 samples. Results are consistent enough that I feel comfortable with this code!

Note that the pre-commit hook styled a lot of this code, so you may want to turn off whitespace when reviewing!

Opening this as a draft since we're running in CI for the first time!
EDIT: Several samples have now successfully undergone label transfer in CI; it's still running at the time of writing this, but seems enough to be be ready for review!

@sjspielman sjspielman marked this pull request as ready for review October 29, 2024 20:12
Copy link
Member

@jaclyn-taroni jaclyn-taroni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the results, but I am most curious about what happens when we use the adaptation with all the other results (e.g., inferCNV) to generate labels! We may not know that until next week (at the earliest).

The only question I had was about whether or not we should be setting query.assay = NULL.

k.filter = NA,
reference.neighbors = "refdr.annoy.neighbors",
reference.assay = "refAssay",
query.assay = NULL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to be doing this instead of accepting an argument to transfer_labels?

Copy link
Member Author

@sjspielman sjspielman Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NULL here is based on the source code, and to be honest I was hesitant to rock the boat on it...

This is defined in Seurat with this assay argument, and we previously didn't override the NULL default.

I'm ~sure the assay should be RNA, but I'd want to run it through to see if specifying* "RNA" changes the results. Do you think I should do that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ~sure the assay should be RNA, but I'd want to run it through to see if specifying* "RNA" changes the results. Do you think I should do that?

I too am ~sure, so I say go for it; it will bring us greater understanding at the very least!

@sjspielman
Copy link
Member Author

I re-ran this with RNA as the assay, and interestingly (though I don't fully understand why, may be part of the query prep actually?)...

  • the fetal full results were the same
  • the fetal kidney results were not, but they were "improved" - cell type "disagree" score distributions were shifted much lower (yay!), and there were way fewer cell type differences (yay!). There were no more meshenchymal swaps, either.

Here's the updated supplemental notebook:
compare-label-transfer-approaches.nb.html.zip

Given this increased agreement with Azimuth inferences, I think specifying "RNA" is indeed the move. I updated the actual label transfer notebooks accordingly.

Copy link
Member

@jaclyn-taroni jaclyn-taroni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@sjspielman sjspielman merged commit 3e2f90c into AlexsLemonade:feature/wilms-tumor-06-azimuth Nov 5, 2024
3 checks passed
@sjspielman sjspielman deleted the sjspielman/wilms-run-azimuth branch November 5, 2024 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants