pairtools:Empty of fully duplicated library, can't estimate complexity #254

Open
Wong718 opened this issue Nov 4, 2024 · 4 comments

@Wong718

Wong718 commented Nov 4, 2024

Thanks for developing this useful tool for 3D genome analysis.
However, when I tried to convert a BAM file (haplotagged with WhatsHap) to the pairs format, I got this error:

pairtools:Empty of fully duplicated library, can't estimate complexity

The command I ran was as follows:

pairtools parse2  \
	--output-stats scNM-C_001.stats.txt \
	-c $fai --drop-sam --drop-seq --expand --add-pair-index --min-mapq 20 \
	scNM-C_001.ht.bam -o scNM-C_001.ht.pairs.gz

Could you help me fix this problem? Thanks a lot.

@Phlya
Member

Phlya commented Nov 4, 2024

Do you get any pairs in the output? If yes, this should be safe to ignore at this stage. The message is a warning from the library complexity estimation, which requires duplicate pairs to be annotated; at the parsing stage, before dedup, this information is not yet available.
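
To illustrate why duplicate annotation matters for this estimate, here is a minimal sketch of a Lander-Waterman-style complexity calculation. It is not pairtools' actual implementation: the function name `estimate_complexity` and the bisection solver are assumptions made for illustration. The sketch solves U = C * (1 - exp(-N / C)) for the complexity C, where N is the number of mapped pairs and U the number of unique ones, and it has no finite answer when the library is empty, fully duplicated, or has no annotated duplicates.

```python
import math


def estimate_complexity(n_mapped, n_dups):
    """Estimate the number of distinct molecules C from N = n_mapped pairs,
    of which n_dups are duplicates, by solving U = C * (1 - exp(-N / C))
    with U = N - n_dups. Returns None when no finite estimate exists."""
    n_unique = n_mapped - n_dups
    if n_mapped <= 0 or n_unique <= 0:
        return None  # empty or fully duplicated library
    if n_dups <= 0:
        return None  # no duplicates annotated: the estimate is unbounded

    def excess(c):
        # Expected unique pairs at complexity c, minus the observed count.
        return -c * math.expm1(-n_mapped / c) - n_unique

    lo, hi = float(n_unique), 2.0 * n_unique
    while excess(hi) < 0:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):  # plain bisection on the bracketed root
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# A parse-stage result with no mapped pairs and no duplicates annotated.
print(estimate_complexity(0, 0))  # None
```

With zero mapped pairs or zero annotated duplicates the equation has no finite solution, which is consistent with a warning (rather than a hard failure) at the parsing stage.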

@Wong718
Author

Wong718 commented Nov 5, 2024

Thanks for your extremely quick reply!
However, I checked the output, and the .pairs file contains no valid pairs. I also tried the unhaplotagged .sam file generated directly by bwa, and the result is the same.
The output .pairs file contains:

d3d59d85-f117-406d-93e3-4901250df094    !       0       !       0       -       -       XX      1       R1-2
f243eba2-22fe-4b38-a415-d8985d077396    !       0       !       0       -       -       XX      1       R1-2
7d11cd1f-e60c-448a-abf4-491d4e2fbcb3    !       0       !       0       -       -       XX      1       R1-2
ff74d552-2350-4449-a06b-11c06e5de5de    !       0       !       0       -       -       XX      1       R1-2
e7669fad-39b2-4ceb-b1ce-57d2f3b9feff    !       0       !       0       -       -       XX      1       R1-2
cc6863d0-ad92-40a4-a82d-c10c4cd99c0b    !       0       !       0       -       -       XX      1       R1-2
b400bcfc-0f06-463d-8948-35f897c7fdfb    !       0       !       0       -       -       XX      1       R1-2
fe0b58e9-4562-4735-9a76-931bc108771b    !       0       !       0       -       -       XX      1       R1-2
7013ec1b-fe2a-4c5f-b58b-8eb9af7e96e5    !       0       !       0       -       -       XX      1       R1-2

and the stats file contains:

total   1406727
total_unmapped  1406727
total_single_sided_mapped       0
total_mapped    0
total_dups      0
total_nodups    0
cis     0
trans   0
pair_types/XX   1406727

Previously, I processed the same .sam file with hickit::sam2seg, and it produced informative, correct results. So what is the problem? I sincerely appreciate your reply. Thank you again.
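
The stats reported above can be checked with a short script. This is a minimal sketch: `parse_stats` is a hypothetical helper, not part of pairtools, and it assumes the stats file holds one whitespace-separated key and numeric value per line, as in the output quoted in this comment. It shows that every pair was counted as unmapped, so no pairs were available for the complexity estimate.

```python
def parse_stats(text):
    """Parse 'key value' lines into a dict of numeric counts."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        key, raw = fields
        try:
            stats[key] = float(raw) if "." in raw else int(raw)
        except ValueError:
            continue  # skip non-numeric values
    return stats


# Values copied from the stats output quoted in this thread.
STATS_TEXT = """\
total 1406727
total_unmapped 1406727
total_single_sided_mapped 0
total_mapped 0
total_dups 0
total_nodups 0
cis 0
trans 0
pair_types/XX 1406727
"""

stats = parse_stats(STATS_TEXT)
unmapped_fraction = stats["total_unmapped"] / stats["total"]
print(f"unmapped: {unmapped_fraction:.1%}")  # unmapped: 100.0%
print("mapped pairs:", stats["total_mapped"])  # mapped pairs: 0
```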

@Phlya
Member

Phlya commented Nov 5, 2024

@agalitsyna is this something you fixed recently?

@agalitsyna
Member

Hi @Wong718,
Which version of pairtools are you using?
Is the problem reproducible with the latest version from GitHub?
Is it a single-end or paired-end library?
Also, feel free to share a sample of this BAM file.
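
One way to answer the single-end versus paired-end question is to look at the SAM FLAG field. The sketch below is an illustration under stated assumptions: `summarize_sam` is a hypothetical helper, and the records in `SYNTHETIC_SAM` are made-up examples, not data from this issue. It relies only on the SAM format itself: FLAG is column 2, MAPQ is column 5, bit 0x1 marks a paired read, and bit 0x4 marks an unmapped read. The default threshold of 20 mirrors the `--min-mapq 20` option used in the original command.

```python
from collections import Counter

FLAG_PAIRED = 0x1    # template has multiple segments (paired-end)
FLAG_UNMAPPED = 0x4  # segment is unmapped


def summarize_sam(lines, min_mapq=20):
    """Count paired, unmapped, and low-MAPQ records in SAM text lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        counts["records"] += 1
        counts["paired"] += bool(flag & FLAG_PAIRED)
        counts["unmapped"] += bool(flag & FLAG_UNMAPPED)
        counts["below_min_mapq"] += mapq < min_mapq
    return counts


# Synthetic records for illustration only (not taken from the reported file).
SYNTHETIC_SAM = [
    "@SQ\tSN:chr1\tLN:1000",
    "readA\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "readB\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "readC\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII",
]

summary = summarize_sam(SYNTHETIC_SAM)
print(dict(summary))
```

In practice the input lines could be the text output of `samtools view` on a small sample of the BAM file. A paired count of zero would indicate a single-end library.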
