[QUESTION] #129
-
Hi Kelvin, I have a question. I would like to generate a tree showing where mutations occur in the sequences of those clones that dandelion connects with edges. In other words, I want to generate a tree for the network produced by dandelion. Which column should I use to build the tree so that I can see the sequence mutations? Thank you,
Replies: 31 comments 15 replies
-
Hi @saramoein372, just converting to a discussion here. If you really want to plot the networks/trees, I think you should seriously consider using either dowser's or alakazam's implementations, as they are designed to do what you are asking for. dandelion's method is just a simple minimum spanning tree approach, compared to a phylogenetics/sequence alignment approach. If you still want to go down the route of using dandelion's network, I will suggest the following:
-
Hi Kelvin,
Thank you so much.
I am a little confused: could you please clarify which part of this BCR analysis is based on *Hamming distance* and which part is based on *Levenshtein distance*?
Thanks,
Sara
On Sat, Feb 12, 2022 at 7:06 PM Zewen Kelvin Tuong wrote:
Hi @saramoein372 <https://github.com/saramoein372>, just converting to a discussion here.
I think if you really want to plot the networks/trees, you should seriously consider using either dowser's <https://dowser.readthedocs.io/en/latest/vignettes/Plotting-Trees-Vignette/> or alakazam's <https://alakazam.readthedocs.io/en/stable/vignettes/Lineage-Vignette/> implementations, as they are designed to do what you are asking for. dandelion's method is just a simple minimum spanning tree approach compared to a phylogenetics/sequence alignment approach.
If you still want to go down the route of using dandelion's network, I will suggest the following:
To answer your points above:
1. You have to ask yourself where in the contig you want to count the mutations:
   A. sequence_alignment_aa : the entire sequence as an amino acid sequence
   B. sequence_alignment : the entire sequence as a nucleotide sequence. This column is gapped, so you will need to remove the . characters before doing the calculation
   C. junction_aa : just the CDR3 junction as an amino acid sequence
   D. junction : just the CDR3 junction as a nucleotide sequence
   I think one of the CDR3 columns would be sufficient. dandelion's network uses A as default.
2. I think Hamming distance is the most appropriate for looking at mutations (substitutions and not indels). dandelion's network uses Levenshtein distance as default and there's currently no option to swap this to Hamming.
3. If you can extract the specific clone from the dandelion networkx object, you can plot it in any layout you wish, such as a tree layout. I've personally not tried this, as I just use alakazam or dowser for that purpose. It's not an immediate priority to implement this for dandelion, as it requires the inference of a root/germline node to add to the data, which doesn't exactly fit with the single-cell nature of the data.
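To make the difference between the two distances in point 2 concrete, here is a minimal, standard-library sketch. It is not dandelion's code: the function names and the toy junction strings are made up for illustration, and `strip_gaps` only assumes that gaps in `sequence_alignment` are written as `.` as described in point 1B.

```python
def strip_gaps(seq: str) -> str:
    """Remove gap characters ('.') from a gapped alignment string."""
    return seq.replace(".", "")


def hamming(a: str, b: str) -> int:
    """Count mismatched positions; only defined for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


# Toy junction sequences (hypothetical, for illustration only).
j1 = "TGTGCGAGAGAT"
j2 = "TGTGCGAGGGAT"  # one substitution relative to j1
j3 = "TGTGCGAGAGA"   # one deletion relative to j1

print(hamming(j1, j2))      # substitutions only
print(levenshtein(j1, j3))  # indels count as edits too
print(strip_gaps("CAG...GTG..CAG"))
```

Hamming distance refuses sequences of different lengths, which is why it suits a substitution-only view of mutations, whereas Levenshtein distance also accepts length differences and counts them as edits.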
-
The clone_id definition is based on Hamming distance at the CDR3 region, and the BCR network is based on Levenshtein distance of the entire sequence.
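As a rough illustration of what "a minimum spanning tree on Levenshtein distances" means, the sketch below builds one with Prim's algorithm over a few toy sequences. This is a generic textbook construction and an assumption on my part, not dandelion's actual implementation; the function names and sequences are hypothetical.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def minimum_spanning_tree(seqs):
    """Prim's algorithm on the complete graph of pairwise Levenshtein distances.

    Returns a list of (i, j, distance) edges whose indices refer to seqs.
    """
    if not seqs:
        return []
    dist = {(i, j): levenshtein(seqs[i], seqs[j])
            for i, j in combinations(range(len(seqs)), 2)}
    in_tree = {0}
    edges = []
    while len(in_tree) < len(seqs):
        # Cheapest edge joining a node inside the tree to one outside it.
        i, j = min((pair for pair in dist
                    if (pair[0] in in_tree) != (pair[1] in in_tree)),
                   key=dist.get)
        edges.append((i, j, dist[(i, j)]))
        in_tree.update((i, j))
    return edges


# Toy sequences (hypothetical): the third one is several edits away from the others.
toy = ["ACGTACGT", "ACGTACGA", "ACGTTTTA"]
for i, j, d in minimum_spanning_tree(toy):
    print(toy[i], toy[j], d)
```

A plain minimum spanning tree always links every node using the cheapest available edges, so nothing in the construction itself restricts an edge to a single difference.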
-
Thanks Kelvin. The reason I am asking is that when I extract the edges and nodes from the data, I can see that there are edges between nodes with MORE THAN ONE base difference. But the paper says an edge connects two nodes ONLY when ONE base is different.
How can we explain this?
-
Hi Sara, I don't think I've explicitly said that that's how it works. In the Stephenson et al. paper, this is what is described in the methods:
Basically, think of it as two steps:
-
Hi Sara, is the issue because there are light chains in the object?
-
Yes, there are light chains.
But regardless of that, I have a new dataset and am trying to generate the network, but I get this error when identifying clones:
by_alleles = True
#ddl.tl.find_clones(new_vdj)
ddl.tl.find_clones(new_vdj, full_pairing_label = True)
ValueError: cannot reindex from a duplicate axis
Do you have any comments on how to solve this problem?
Thank you,
Sara
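The thread does not record what eventually fixed this. In pandas, "cannot reindex from a duplicate axis" is generally raised when an index with repeated labels is reindexed, so one thing worth ruling out is duplicated identifiers in the input table. The sketch below is a standard-library check under that assumption only; the column name `sequence_id` and the toy table are illustrative, and duplicated IDs may not be the cause in any given case.

```python
import csv
import io
from collections import Counter


def duplicated_values(tsv_text: str, column: str = "sequence_id") -> dict:
    """Return {value: count} for values appearing more than once in `column`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter(row[column] for row in reader)
    return {value: n for value, n in counts.items() if n > 1}


# Toy table (hypothetical): contig_1 of the first cell is listed twice.
example = (
    "sequence_id\tcell_id\tlocus\n"
    "AAAC-1_contig_1\tAAAC-1\tIGH\n"
    "AAAC-1_contig_2\tAAAC-1\tIGK\n"
    "AAAC-1_contig_1\tAAAC-1\tIGH\n"
)
print(duplicated_values(example))
```

For a real file, the text can be read with `open(path).read()` before being passed in.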
-
Thanks Kelvin. I managed to solve the issue after some hours of debugging.
Best,
Sara
…On Fri, Mar 4, 2022, 11:31 AM Zewen Kelvin Tuong ***@***.***> wrote:
Hi Sara, can you print the full error log here.
-
Hi Kelvin, I have a question about TCR data processing. I used filtered_contig.fasta and filtered_contig_annotation.tsv as the input, but got the error below: usage: dandelion_preprocess.py [-h] [--meta META] [--chain CHAIN] What could be the problem? I appreciate your help. Thanks,
-
Thanks Kelvin.
But where should I get the new one?
Originally I used the command: singularity pull library://kt16/default/sc-dandelion:latest
…On Thu, Mar 17, 2022 at 2:45 PM Zewen Kelvin Tuong ***@***.***> wrote:
Hi Sara, I had just uploaded a new sc-dandelion_latest.sif (to v0.2.0) in
the last hour or so. Can you delete your current .sif file and pull it
again? i think it should work then.
-
I get this error: What files from Cell Ranger are required for this TCR processing?
-
Strange...
Why do I get this error?
more slurm-8931679.out
Software versions:
Beginning preprocessing
command line parameters:
:
--------------------------------------------------------------
--meta = None
--chain = tr
--file_prefix = all
--sep = _
--flavour = strict
--skip_format_header = False
--filter_to_high_confidence = True
--keep_trailing_hyphen_number = False
--skip_reassign_dj = False
--clean_output = False
--------------------------------------------------------------
dandelion==0.2.0 pandas==1.3.4 numpy==1.20.3 matplotlib==3.4.3
networkx==2.6.3 scipy==1.7.1 skbio==0.5.6
Formating fasta(s) : 0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/share/dandelion_preprocess.py", line 264, in <module>
    main()
  File "/share/dandelion_preprocess.py", line 184, in main
    ddl.pp.format_fastas(
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py", line 304, in format_fastas
    format_fasta(
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py", line 88, in format_fasta
    raise FileNotFoundError(
FileNotFoundError: Path to fasta file is unknown. Please specify path to fasta file or folder containing fasta file. Starting folder should only contain 1 fasta file.
On Thu, Mar 17, 2022 at 3:07 PM Zewen Kelvin Tuong ***@***.***> wrote:
the same files as BCR, *_contig.fasta and *_contig_annotations.csv.
-
Hi Kelvin,
Thanks. I am not clear about the --meta file. What is it?
I could not work out from the guide where to put it.
Can you provide some details, please?
With this command:
singularity run -B /athena/namlab/scratch/sam4032/HL_8_s1_TCR /athena/namlab/scratch/sam4032/HL_8_s1_TCR/sc-dandelion_latest.sif dandelion-preprocess --chain TR
and the filtered_contig... inputs, I get this error:
FileNotFoundError: Path to .tsv file for BCR is unknown. Please specify path to reannotated .tsv file or folder containing reannotated .tsv file.
Would you please help me solve this error?
Thanks,
Sara
On Thu, Mar 17, 2022 at 4:12 PM Zewen Kelvin Tuong wrote:
So to run dandelion-preprocess, you need to be in the directory above the folder containing the files. For example:
.
└── sample1
    ├── all_contig.fasta
    └── all_contig_annotations.csv
where . is my current working directory. dandelion-preprocess will then process the sample1 folder. If you have more than one folder under . like:
.
├── irrelevant_folder
│   └── irrelevant_file.txt
├── sample1
│   ├── all_contig.fasta
│   └── all_contig_annotations.csv
└── sample2
    ├── all_contig.fasta
    └── all_contig_annotations.csv
this will come up with the error that you saw, i.e. FileNotFoundError: Path to fasta file is unknown. Please specify path to fasta file or folder containing fasta file. Starting folder should only contain 1 fasta file.
Specifying the --meta option with a .csv file as per the example <https://sc-dandelion.readthedocs.io/en/latest/notebooks/singularity_preprocessing.html> will get around this issue.
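The layout described above can be checked before launching the container. The following is a minimal sketch, not part of dandelion: `check_layout` is a hypothetical helper, and it assumes that meta.csv has a `sample` column and that each sample folder should hold exactly one `*_contig.fasta` and one `*_contig_annotations.csv`, as in the examples.

```python
import csv
from pathlib import Path


def check_layout(workdir, meta_csv="meta.csv"):
    """Report, per sample listed in meta.csv, whether its folder looks usable.

    Returns {sample: [problem, ...]}; an empty list means nothing was flagged.
    """
    workdir = Path(workdir)
    with open(workdir / meta_csv, newline="") as handle:
        samples = [row["sample"] for row in csv.DictReader(handle)]

    report = {}
    for sample in samples:
        folder = workdir / sample
        problems = []
        if not folder.is_dir():
            problems.append("folder not found")
        else:
            for pattern in ("*_contig.fasta", "*_contig_annotations.csv"):
                hits = sorted(folder.glob(pattern))
                if len(hits) != 1:
                    problems.append(
                        f"expected 1 file matching {pattern}, found {len(hits)}"
                    )
        report[sample] = problems
    return report
```

Calling `check_layout(".")` from the working directory returns a dictionary that can be printed or inspected before submitting the job.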
-
Thanks Kelvin.
I put meta.csv in the working directory.
My meta file is:
sample
BCR
But I get the error again:
Traceback (most recent call last):
  File "/share/dandelion_preprocess.py", line 264, in <module>
    main()
  File "/share/dandelion_preprocess.py", line 199, in main
    ddl.pp.reannotate_genes(samples,
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py", line 959, in reannotate_genes
    change_file_location(data, filename_prefix)
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/utilities/_io.py", line 736, in change_file_location
    raise FileNotFoundError(
FileNotFoundError: Path to .tsv file for BCR is unknown. Please specify path to reannotated .tsv file or folder containing reannotated .tsv file.
Any comments? It is asking for a .tsv file... what is that file?
On Fri, Mar 18, 2022 at 9:51 AM Zewen Kelvin Tuong wrote:
singularity run -B /athena/namlab/scratch/sam4032/HL_8_s1_TCR /athena/namlab/scratch/sam4032/HL_8_s1_TCR/sc-dandelion_latest.sif dandelion-preprocess --chain TR --meta meta.csv
The meta.csv file should look like this:
sample
5841STDY7998693
5841STDY7998694
5841STDY7998695
Your folder structure (where . refers to /athena/namlab/scratch/sam4032/HL_8_s1_TCR) should look like this:
.
├── 5841STDY7998693
│   ├── filtered_contig.fasta
│   └── filtered_contig_annotations.csv
├── 5841STDY7998694
│   ├── filtered_contig.fasta
│   └── filtered_contig_annotations.csv
└── 5841STDY7998695
    ├── filtered_contig.fasta
    └── filtered_contig_annotations.csv
-
Thanks Kelvin.
I am on an HPC cluster and the tree / sudo tree command is not working for me.
But here is the listing of my directory:
This is my working directory: /athena/namlab/scratch/sam4032/HL_8_s1_TCR
And inside this folder I have:
```
total 1835392
drwxr-sr-x 3 sam4032 namlab 4096 Mar 18 09:26 BCR
-rw-r--r--. 1 sam4032 namlab 14 Mar 18 10:49 meta.csv
-rwxr-xr-x 1 sam4032 namlab 1879352862 Mar 17 17:06 sc-dandelion_latest.sif
-rwxr-xr-x 1 sam4032 namlab 481 Mar 18 10:52 tcr_preprocess.sh
```
The BCR folder includes
```
-rw-r--r-- 1 sam4032 namlab 1039382 Mar 17 17:06 filtered_contig_annotations.csv
-rw-r--r-- 1 sam4032 namlab 990458 Mar 17 17:06 filtered_contig.fasta
```
So there is no prefix in my data.
My command is:
```
cd /athena/namlab/scratch/sam4032/HL_8_s1_TCR
singularity run -B /athena/namlab/scratch/sam4032/HL_8_s1_TCR /athena/namlab/scratch/sam4032/HL_8_s1_TCR/sc-dandelion_latest.sif dandelion-preprocess --chain TR --meta meta.csv
```
I appreciate any comment
On Fri, Mar 18, 2022 at 12:35 PM Zewen Kelvin Tuong wrote:
Hi Sara, can I ask you to print your working directory with tree? https://www.tecmint.com/linux-tree-command-examples/
1. There should be no blank lines in your meta.csv file, i.e. your file should be:
sample
BCR
not (with a blank line between the rows):
sample

BCR
2. The input file names have to end with *_contig.fasta and *_contig_annotations.csv. So your BCR folder must look like this:
.
├── BCR
│   ├── filtered_contig.fasta
│   └── filtered_contig_annotations.csv
If they are not named filtered_*, like BCR_something_something_contig.fasta and BCR_something_something_contig_annotations.csv, then you need to specify the filename prefix in your command like so:
singularity run -B $PWD sc-dandelion_latest.sif dandelion-preprocess --chain IG --meta meta.csv --filename_prefix BCR_something_something
If it's TCR:
singularity run -B $PWD sc-dandelion_latest.sif dandelion-preprocess --chain TR --meta meta.csv --filename_prefix TCR_something_something
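For the first point, blank lines in meta.csv can be stripped programmatically rather than by eye. This is a small sketch with a hypothetical helper name; it also drops stray carriage returns and surrounding whitespace, which is my own assumption about other things that might creep into a hand-edited file.

```python
def clean_meta_text(text: str) -> str:
    """Drop blank lines and surrounding whitespace from meta.csv content."""
    rows = [line.strip() for line in text.splitlines()]
    return "\n".join(row for row in rows if row) + "\n"


# Content with a blank line between the header and the sample name.
print(repr(clean_meta_text("sample\n\nBCR\n")))
```

To apply it to a file, read the text, pass it through `clean_meta_text`, and write the result back to meta.csv.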
-
Thanks Kelvin.
I have
drwxr-sr-x 3 sam4032 namlab 4096 Mar 18 14:29 .
drwxr-sr-x 29 sam4032 namlab 4096 Mar 17 17:06 ..
drwxr-sr-x 3 sam4032 namlab 4096 Mar 18 14:28 BCR
-rw-r--r-- 1 sam4032 namlab 16 Mar 18 14:27 meta.csv
-rwxr-xr-x 1 sam4032 namlab 1879352862 Mar 17 17:06 sc-dandelion_latest.sif
-rwxr-xr-x 1 sam4032 namlab 481 Mar 18 10:52 tcr_preprocess.sh
So, a very basic question: how do I remove the . and .. folders? I am using rm -rf . and it is not working.
Do you have any comments?
Also, so that you can see the exact error from running dandelion on the TCR data, I have copied it here:
***@***.*** HL_8_s1_TCR]$ more slurm-8932812.out
Software versions:
Beginning preprocessing
command line parameters:
:
--------------------------------------------------------------
--meta = meta.csv
--chain = tr
--file_prefix = filtered
--sep = _
--flavour = strict
--skip_format_header = False
--filter_to_high_confidence = False
--keep_trailing_hyphen_number = False
--skip_reassign_dj = False
--clean_output = False
--------------------------------------------------------------
dandelion==0.2.0 pandas==1.3.4 numpy==1.20.3 matplotlib==3.4.3
networkx==2.6.3 scipy==1.7.1 skbio==0.5.6
Formating fasta(s) : 100%|██████████| 1/1 [00:00<00:00, 4.13it/s]
Assigning genes : 0%| | 0/1 [00:00<?, ?it/s]
START> MakeDB
COMMAND> igblast
ALIGNER_FILE> filtered_contig_igblast.fmt7
SEQ_FILE> filtered_contig.fasta
ASIS_ID> False
ASIS_CALLS> False
PARTIAL> False
EXTENDED> True
INFER_JUNCTION> False
PROGRESS> 14:28:24 |Done | 0.0 min
PROGRESS> 14:28:24 | | 0% ( 0) 0.0 min
Traceback (most recent call last):
  File "/opt/conda/envs/sc-dandelion-container/bin/MakeDb.py", line 897, in <module>
    args.func(**args_dict)
  File "/opt/conda/envs/sc-dandelion-container/bin/MakeDb.py", line 542, in parseIgBLAST
    output = writeDb(germ_iter, fields=fields, aligner_file=aligner_file, total_count=total_count,
  File "/opt/conda/envs/sc-dandelion-container/bin/MakeDb.py", line 302, in writeDb
    record.setDict(annotations[record.sequence_id], parse=True)
KeyError: 'AAACCTGAGAAGGGTA-1_contig_1'
START> MakeDB
COMMAND> igblast
ALIGNER_FILE> filtered_contig_igblast.fmt7
SEQ_FILE> filtered_contig.fasta
ASIS_ID> False
ASIS_CALLS> False
PARTIAL> False
EXTENDED> True
INFER_JUNCTION> False
PROGRESS> 14:28:24 |Done | 0.0 min
PROGRESS> 14:28:24 | | 0% ( 0) 0.0 min
Traceback (most recent call last):
  File "/opt/conda/envs/sc-dandelion-container/bin/MakeDb.py", line 897, in <module>
    args.func(**args_dict)
  File "/opt/conda/envs/sc-dandelion-container/bin/MakeDb.py", line 542, in parseIgBLAST
    output = writeDb(germ_iter, fields=fields, aligner_file=aligner_file, total_count=total_count,
  File "/opt/conda/envs/sc-dandelion-container/bin/MakeDb.py", line 302, in writeDb
    record.setDict(annotations[record.sequence_id], parse=True)
KeyError: 'AAACCTGAGAAGGGTA-1_contig_1'
Assigning genes : 100%|██████████| 1/1 [00:18<00:00, 18.64s/it]
Traceback (most recent call last):
  File "/share/dandelion_preprocess.py", line 264, in <module>
    main()
  File "/share/dandelion_preprocess.py", line 199, in main
    ddl.pp.reannotate_genes(samples,
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/preprocessing/_preprocessing.py", line 959, in reannotate_genes
    change_file_location(data, filename_prefix)
  File "/opt/conda/envs/sc-dandelion-container/lib/python3.9/site-packages/dandelion/utilities/_io.py", line 736, in change_file_location
    raise FileNotFoundError(
FileNotFoundError: Path to .tsv file for BCR is unknown. Please specify path to reannotated .tsv file or folder containing reannotated .tsv file.
So it looks like it starts working and then goes looking for a .tsv file.
-
***@***.*** HL_8_s1_TCR]$ ls -A .
BCR meta.csv sc-dandelion_latest.sif slurm-8932812.out tcr_preprocess.sh
…On Fri, Mar 18, 2022 at 2:48 PM Zewen Kelvin Tuong ***@***.***> wrote:
can you do ls -A . and print what returns?
-
***@***.*** HL_8_s1_TCR]$ ls -la
total 1835392
drwxr-sr-x 3 sam4032 namlab 4096 Mar 18 14:29 .
drwxr-sr-x 29 sam4032 namlab 4096 Mar 17 17:06 ..
drwxr-sr-x 3 sam4032 namlab 4096 Mar 18 14:28 BCR
-rw-r--r-- 1 sam4032 namlab 16 Mar 18 14:27 meta.csv
-rwxr-xr-x 1 sam4032 namlab 1879352862 Mar 17 17:06 sc-dandelion_latest.sif
-rw-r--r-- 1 sam4032 namlab 3612 Mar 18 14:28 slurm-8932812.out
-rwxr-xr-x 1 sam4032 namlab 481 Mar 18 10:52 tcr_preprocess.sh
-
Thanks. I am looking forward to hearing from you. Thank you again
…On Fri, Mar 18, 2022, 2:53 PM Zewen Kelvin Tuong ***@***.***> wrote:
OK i think this is probably a bug. I'll look it up and see what's going on
-
Hi Sara, sorry, I can't seem to replicate the issue. Can you run the following lines exactly:
cd /athena/namlab/scratch/sam4032/HL_8_s1_TCR/
mkdir test_for_dandelion
cd test_for_dandelion
mkdir example
cd example
wget -O filtered_contig_annotations.csv https://cf.10xgenomics.com/samples/cell-vdj/3.1.0/vdj_v1_hs_pbmc3/vdj_v1_hs_pbmc3_t_filtered_contig_annotations.csv
wget -O filtered_contig.fasta https://cf.10xgenomics.com/samples/cell-vdj/3.1.0/vdj_v1_hs_pbmc3/vdj_v1_hs_pbmc3_t_filtered_contig.fasta
cd ..
singularity run -B $PWD /athena/namlab/scratch/sam4032/HL_8_s1_TCR/sc-dandelion_latest.sif dandelion-preprocess --chain TR
If it runs without issue, then the problem is that there's something wrong with your filtered_contig_annotations.csv and/or filtered_contig.fasta files.
-
Hi Kelvin,
Thank you so much for your help!
I was able to run this code!
But I am not sure what is wrong with my filtered_contig_anno... files.
I can see that my contig file has more columns than the one in your example folder. Could that cause a problem?
-
Okay. Thanks. I'll check that.
On Fri, Mar 18, 2022, 4:38 PM Zewen Kelvin Tuong wrote:
I think the issue is that your two files have different contig barcodes. Can you check whether the line counts match? If the annotations file has 700 lines + 1 header, your fasta file should have 1400 lines (a header line and a sequence line per contig).
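Beyond counting lines, the two files can be compared ID by ID, which would also show whether a contig such as the one named in the KeyError above is missing from one side. This is a sketch with hypothetical helper names; it assumes the annotations CSV has a `contig_id` column whose values match the FASTA record names.

```python
import csv


def fasta_ids(fasta_path):
    """Collect record IDs (text after '>' up to the first whitespace)."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    ids.add(parts[0])
    return ids


def annotation_ids(csv_path, column="contig_id"):
    """Collect contig IDs from the annotations table."""
    with open(csv_path, newline="") as handle:
        return {row[column] for row in csv.DictReader(handle)}


def compare_contigs(fasta_path, csv_path):
    """Return (only_in_fasta, only_in_annotations) as sorted lists."""
    in_fasta = fasta_ids(fasta_path)
    in_csv = annotation_ids(csv_path)
    return sorted(in_fasta - in_csv), sorted(in_csv - in_fasta)
```

Two empty lists from `compare_contigs("filtered_contig.fasta", "filtered_contig_annotations.csv")` would mean both files describe the same set of contigs.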
-
Thanks Kelvin. Does the order of the columns in filtered_contig_annotation.csv matter?
-
Hi Kelvin, I have a question: could I have the link to the tutorial for TCR analysis?
-
Hi Kelvin,
Thank you so much.
I could see a part related to TCR analysis. However, I get an error when generating the network. I am not sure what the input of this analysis is, since it is not clear where "filtered_contig_dandelion.tsv" is used.
May I ask which input file is used for the TCR analysis: filtered_contig_igblast_db-pass_genotyped.tsv or filtered_contig_dandelion.tsv?
I think the tutorial is not up to date for this part.
Thanks,
Sara
…On Fri, Mar 25, 2022 at 9:40 AM Zewen Kelvin Tuong ***@***.***> wrote:
it's all in the docs:
https://sc-dandelion.readthedocs.io
|
-
Hi Kelvin,
Thank you. I am trying to follow the tutorial "Step 2: Reannotate the V/D/J genes with igblastn":
ddl.pp.format_fastas(samples, prefix=samples, filename_prefix=filename_prefixes)
But I get this error:
KeyError: 'Environmental variable IGDATA must be set. Otherwise, please provide path to igblast database'
Could you please help me solve this issue? What path should I use?
Thanks,
Sara
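For readers hitting the same error: the message asks for the `IGDATA` environment variable, which points to the igblast database folder. A minimal sketch of setting it from Python before calling dandelion follows. The `GERMLINE` variable and the `igblast`/`germlines` subfolder layout are assumptions about how the database folder is organised, and `set_igblast_env` is a hypothetical helper, so adjust the paths to wherever the database lives on your machine.

```python
import os
from pathlib import Path


def set_igblast_env(database_root):
    """Point IGDATA (and, by assumption, GERMLINE) at subfolders of database_root."""
    root = Path(database_root).expanduser()
    # IGDATA is the variable named in the error message.
    os.environ["IGDATA"] = str(root / "igblast")
    # GERMLINE and this folder name are assumptions; change them if your layout differs.
    os.environ["GERMLINE"] = str(root / "germlines")
    return {key: os.environ[key] for key in ("IGDATA", "GERMLINE")}
```

Call it once at the top of the session, for example `set_igblast_env("~/path/to/database")`, before any reannotation step. Setting the variables in the shell profile with `export` should have the same effect.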
…On Fri, Mar 25, 2022 at 6:10 PM Zewen Kelvin Tuong ***@***.***> wrote:
filtered_contig_dandelion.tsv should work. If you are using data reprocessed
through dandelion's preprocessing workflow, you can use the same
workflow as with BCR data but specify locus='tr' where it's asked for.
When not specified in the call, the default is locus='ig'. I would ask
you to go through the individual commands here
<https://sc-dandelion.readthedocs.io/en/latest/api.html> to find out
which commands require this toggle.
Indeed, the tutorial is not up to date for that part. I will update it soon.
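As a complement to reading the API page, one way to see which commands take the toggle is to inspect function signatures. The sketch below uses only the standard library; `functions_accepting` is a hypothetical helper, and it makes no claim about which dandelion functions actually have a `locus` parameter in your installed version.

```python
import inspect


def functions_accepting(module, parameter="locus"):
    """List names of callables in `module` whose signature has `parameter`."""
    names = []
    for name, obj in inspect.getmembers(module, callable):
        try:
            params = inspect.signature(obj).parameters
        except (TypeError, ValueError):
            # Some built-ins do not expose a signature; skip them.
            continue
        if parameter in params:
            names.append(name)
    return names


# Example (assumes dandelion is installed and imported as ddl):
# import dandelion as ddl
# print(functions_accepting(ddl.pp))
```

Any function listed would then be called with `locus='tr'` for TCR data, since the default is `locus='ig'`.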
—
Reply to this email directly, view it on GitHub
<#129 (reply in thread)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AVVJONXT2LCD24MPZBHQKATVBY2TLANCNFSM5OHZEBBA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
-
While running the dandelion preprocessing, this call:
vdj1 = ddl.read_10x_airr('filtered_contig_dandelion.tsv')
gives this error:
KeyError: 'd_call'
Do you have any comments?
Thanks,
Sara
|
-
Hi Kelvin,
I have a question about BCR clones: how can I find the biggest clone in
each BCR clone network?
Thank you,
Sara
…On Tue, Mar 29, 2022 at 4:47 AM Zewen Kelvin Tuong ***@***.***> wrote:
your filtered_contig_dandelion.tsv should contain a d_call column.
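A quick way to confirm this before loading the file is to read only the header row. The sketch below is a standard-library illustration, not a dandelion function. `missing_columns` is a hypothetical name, and the default list of required columns is an assumption apart from `d_call`, which is the one the error names.

```python
import csv


def missing_columns(handle, required=("sequence_id", "v_call", "d_call", "j_call")):
    """Return the required column names absent from a tab-separated file's header."""
    header = next(csv.reader(handle, delimiter="\t"), [])
    return [column for column in required if column not in header]
```

If `d_call` shows up in the result, the file was probably not written by the full preprocessing workflow, or a different file is being read than intended.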
|
-
When you say "each BCR clone network", what do you mean exactly? In the whole sample? |
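If the question is about the whole sample, one possible reading of "biggest clone" is the clone ID shared by the most cells. The sketch below is an illustration under that assumption only. It expects a tab-separated AIRR-style table with `clone_id` and `cell_id` columns (both names are assumptions about the file), and counts unique cells so that a cell with a heavy and a light chain contig is not counted twice. `clone_sizes` is a hypothetical helper.

```python
import csv
from collections import defaultdict


def clone_sizes(handle, clone_col="clone_id", cell_col="cell_id"):
    """Return (clone_id, n_cells) pairs, largest clone first."""
    cells_per_clone = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        clone = row.get(clone_col, "")
        if clone:  # skip contigs without a clone assignment
            cells_per_clone[clone].add(row[cell_col])
    # Sort by descending size, then by clone ID to make ties deterministic.
    return sorted(
        ((clone, len(cells)) for clone, cells in cells_per_clone.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
```

The first element of the returned list is the largest clone under this definition.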
Hi @saramoein372,
just converting to a discussion here.
If you really want to plot the networks/trees, you should seriously consider using either dowser's or alakazam's implementations, as they are designed to do what you are asking for.
dandelion's method is just a simple minimum spanning tree approach compared to a phylogenetics/sequence alignment approach.
If you still want to go down the route of using dandelion's network, I suggest the following:
To answer your points above,
A. sequence_alignment_aa: the entire sequence as amino acid sequence
B. sequence_alignment: the entire sequenc…