We have several files in the ngi-igenomes folder on juno that do not actually exist in the remote reference repository, making recreation of this reference folder difficult in any other environment. Many of these paths are listed in the tempo references configuration file. Here's a list of files that are newer than Nov 16, 2018:
$ find $PWD -mtime -1930 -type f -exec ls -l {} \;
-rw-r----- 1 gongy cmopipeline 242018150 Mar 10 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/hg19/1000G_phase1.indels.hg19.sites.vcf
-rw-r----- 1 gongy cmopipeline 90196895 Mar 10 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/hg19/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf
-rw-r----- 1 gongy cmopipeline 1484596 Mar 10 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/hg19/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.idx
-rw-r----- 1 gongy cmopipeline 12381528 Mar 10 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/hg19/dbsnp_138.hg19.vcf.idx
-rw-r----- 1 gongy cmopipeline 1238920 Mar 10 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/hg19/1000G_phase1.indels.hg19.sites.vcf.idx
-rw-r----- 1 gongy cmopipeline 10796220779 Mar 10 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/hg19/dbsnp_138.hg19.vcf
-rw-r--r-- 1 socci cmopipeline 1517 Mar 18 2019 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Annotation/intervals/human.b37.genome.bed
-rw-r--r-- 1 socci cmopipeline 1360930446 Mar 7 2019 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/WholeGenomeFasta/human_g1k_v37_decoy.fasta.microsatellites.list
-rw-rw-r-- 1 socci cmopipeline 3189750467 Feb 27 2019 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/WholeGenomeFasta/human_g1k_v37_decoy.fasta
-rw-r--r-- 1 socci cmopipeline 67108864 Apr 22 2019 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/WholeGenomeFasta/human_g1k_v37_decoy.fasta.index
-rw-r--r-- 1 noronhaa cmopipeline 16854 Jun 29 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/BWAIndex/human_g1k_v37_decoy.fasta.dict
-rw-r--r-- 1 noronhaa cmopipeline 1176551519 Jun 30 2022 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/BWAIndex/human_g1k_v37_decoy.fasta.gridsscache
-rw-rw-r-- 1 socci cmopipeline 3189750467 Jul 1 2019 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/BWAIndex/human_g1k_v37_decoy.fasta
-rw-r--r-- 1 wooh cmopipeline 2813 Jun 7 2021 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/BWAIndex/human_g1k_v37_decoy.fasta.fai
-rw-r--r-- 1 socci cmopipeline 9040952644 Mar 5 2019 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/b37/dbsnp_137.b37__RmDupsClean__plusPseudo50__DROP_SORT.vcf
-rw-r--r-- 1 socci cmopipeline 1015019014 Mar 5 2019 /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/b37/dbsnp_137.b37__RmDupsClean__plusPseudo50__DROP_SORT.vcf.gz
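Note that `-mtime -1930` counts days back from whenever the command is run, so the same command will return a different set later on. A rough sketch of the same check with a fixed cutoff date is below; the function name `files_newer_than` is made up for illustration, and this is only one way to do it:

```python
import os
from datetime import datetime


def files_newer_than(root, cutoff):
    """Return sorted paths under `root` modified after `cutoff` (a naive local datetime)."""
    cutoff_ts = cutoff.timestamp()
    newer = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # getmtime is the same modification time that `find -mtime` inspects
            if os.path.getmtime(path) > cutoff_ts:
                newer.append(path)
    return sorted(newer)


# Possible usage on juno (assumes the igenomes root from the listing above):
# files_newer_than("/juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes",
#                  datetime(2018, 11, 16))
```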
Some files, such as human.b37.genome.bed, human_g1k_v37_decoy.fasta.microsatellites.list and dbsnp_137.b37__RmDupsClean__plusPseudo50__DROP_SORT.vcf*, can be relocated somewhere outside of the igenomes directory. The fasta, fai, and dict files in /juno/work/taylorlab/cmopipeline/mskcc-igenomes/igenomes/Homo_sapiens/GATK/GRCh37/Sequence/BWAIndex/ can be cleaned up or ignored, because I don't believe they are being used by tempo.
Some of the vcf files are also stored unzipped in the juno folder, whereas on igenomes they exist only as zipped files. This might cause confusion as well.
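To sort out which local files are missing remotely and which are just unzipped copies of zipped remote files, something like the sketch below might help. It assumes the remote listing has already been exported as a list of paths relative to the igenomes root; how that listing is obtained is not covered here. The function name, the bucket labels, and the example paths are all hypothetical:

```python
def classify_against_remote(local_paths, remote_paths):
    """Bucket local relative paths by how they match a remote listing.

    Both arguments are iterables of paths relative to the igenomes root.
    """
    remote = set(remote_paths)
    report = {"in_remote": [], "remote_has_gz_only": [], "local_only": []}
    for path in sorted(local_paths):
        if path in remote:
            report["in_remote"].append(path)
        elif path + ".gz" in remote:
            # local copy is unzipped, remote only carries the zipped file
            report["remote_has_gz_only"].append(path)
        else:
            # candidate for relocation outside the igenomes directory
            report["local_only"].append(path)
    return report


# Made-up paths, only to show the shape of the result
example = classify_against_remote(
    ["dir/a.vcf", "dir/b.vcf", "dir/c.bed"],
    ["dir/a.vcf", "dir/b.vcf.gz"],
)
print(example)
```

The `remote_has_gz_only` bucket is kept apart from `local_only` because those files do not need to be moved, only zipped or ignored.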