KO abundances #596

LunavdL · 2022-12-07T16:21:18Z

Hello,

I have a dataset with samples across an environmental gradient and I ran Atlas for the first time (with both the genomes and genecatalog workflow). I have analyzed the MAGs and their taxonomic patterns, but I would like to dive into the functional patterns now.

I followed the online tutorial to analyze the Atlas output and made the heatmaps of the relative abundance of the KEGG modules (obtained by multiplying the relative abundance matrix of the MAGs with the presence matrix of the KEGG modules, using step coverage > 0.8). However, I also see a lot of papers that use KO abundances, and I can't find these in the Atlas output folders. I am a bit confused about whether I should use the output in the ./genome folder or the ./Genecatalog folder for these functional analyses, and how I can find or calculate the KO abundances.

Should I use the Nmapped_reads (in ./Genecatalog/counts) and look up the KO identifier for each of the genes listed?

My apologies if this is a very simple question.
Best regards

Answered by SilasK

SilasK · 2022-12-13T10:26:41Z
Maintainer

@LunavdL It is not a silly question.
Great that you managed to calculate the abundance of the modules.

There are two ways of getting to KO abundances, and you mentioned both of them:

1. Based on the genomes
2. Based on the Genecatalog

I suggest you start with 1. The analysis works the same way as for the KEGG modules, with a matrix multiplication. You just need to take the KOs instead of the modules; I think they are in the file genomes/annotations/dram/annotations.tsv

For 2, you can use the eggNOG annotations, filter for the genes whose annotation has a link to KEGG, and sum the abundances of these genes per KO.

I don't know which way is better. The Genecatalog is certainly more comprehensive, but the eggNOG annotations are linked to somewhat outdated KOs, if I'm not wrong.
I planned to add KOfam annotations for the Genecatalog.
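
As a rough illustration of option 2, the sketch below sums gene counts per KO in base R. The toy tables and the names gene_counts, gene_annot, gene and ko are hypothetical and are not the actual Atlas output layout; the sketch also assumes one KO per gene, so genes annotated with several KOs would have to be split into one row per KO first.

```r
# Toy gene-by-sample count matrix (rows = genes, columns = samples).
# In practice this would come from the Genecatalog counts; values here are made up.
gene_counts <- matrix(
  c(10, 0, 5,
     3, 7, 0,
     0, 2, 8,
     4, 4, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Gene1", "Gene2", "Gene3", "Gene4"), c("S1", "S2", "S3"))
)

# Toy annotation table: one KO per gene, NA where the gene has no KEGG link.
# The column names "gene" and "ko" are assumptions for this example.
gene_annot <- data.frame(
  gene = c("Gene1", "Gene2", "Gene3", "Gene4"),
  ko   = c("K00001", "K00001", NA, "K00002"),
  stringsAsFactors = FALSE
)

# Keep only genes that have a KO and that are present in the count table.
has_ko   <- !is.na(gene_annot$ko) & gene_annot$ko != ""
annot_ko <- gene_annot[has_ko & gene_annot$gene %in% rownames(gene_counts), ]

# Sum the counts of all genes that share a KO (result: KOs x samples).
ko_counts <- rowsum(gene_counts[annot_ko$gene, , drop = FALSE], group = annot_ko$ko)
print(ko_counts)
```

No matrix multiplication is involved here, because each gene in the catalog maps directly to its KO.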

6 replies

SilasK Dec 16, 2022
Maintainer

Very good.
Yes, you don't need the matrix multiplication with the Genecatalog abundance.

You can use a CLR transformation on the relative abundances, but if you want to use ALDEx2, you would need the counts.
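
For readers unfamiliar with the CLR (centered log-ratio) transformation, here is a minimal base R sketch on a toy table. The table KO_abund, the function name clr_transform and the pseudocount used to avoid log(0) are assumptions made for this example; other ways of handling zeros exist.

```r
# Toy relative-abundance table: rows = samples, columns = KOs (made-up values).
KO_abund <- matrix(
  c(0.50, 0.30, 0.20,
    0.10, 0.00, 0.90),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("S1", "S2"), c("K00001", "K00002", "K00003"))
)

# Centered log-ratio: the log of each value minus the mean log of its sample.
# The pseudocount is an assumption of this sketch, added only to avoid log(0).
clr_transform <- function(x, pseudocount = 1e-6) {
  log_x <- log(x + pseudocount)
  sweep(log_x, 1, rowMeans(log_x), "-")
}

KO_clr <- clr_transform(KO_abund)
print(round(KO_clr, 3))
```

After the transformation the values of each sample sum to zero, which is a quick sanity check.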

LunavdL Dec 16, 2022
Author

Thank you very much!

teddy256 May 8, 2023

Hello @LunavdL

Do you mind sharing the code you used to calculate the KO abundances from the annotations.tsv file?
Thank you.

PS. R beginner user :)

LunavdL May 9, 2023
Author

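I first made a KO presence matrix (MAGs x KOs):

library(reshape2)  # provides dcast()

# The DRAM annotations are tab-separated; drop genes without a KO
kegg_KO <- read.table("annotations.tsv", header = TRUE, sep = "\t", quote = "", comment.char = "")
kegg_KO <- kegg_KO[!is.na(kegg_KO$kegg_id) & kegg_KO$kegg_id != "", ]

# Count genes per MAG ("fasta") and KO, then reduce the counts to presence/absence
KO_presence_matrix <- dcast(kegg_KO, fasta ~ kegg_id, value.var = "kegg_id", fun.aggregate = length)
row.names(KO_presence_matrix) <- KO_presence_matrix$fasta
KO_presence_matrix$fasta <- NULL
KO_presence_matrix <- as.matrix(KO_presence_matrix)
KO_presence_matrix[KO_presence_matrix > 0] <- 1

And then used a matrix multiplication to multiply the abundance matrix (samples x MAGs) with the KO presence matrix:

# Use the same MAGs, in the same order, on both sides of the multiplication
mags <- intersect(colnames(abund), rownames(KO_presence_matrix))
KO_abund <- as.matrix(abund[, mags]) %*% KO_presence_matrix[mags, ]
or, with the raw counts:
KO_counts <- as.matrix(counts[, mags]) %*% KO_presence_matrix[mags, ]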
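
To see what the multiplication does, here is a small self-contained toy example with made-up numbers; the MAG, sample and KO names are hypothetical. Each entry of the result is the summed abundance of all MAGs that carry the KO.

```r
# Toy abundance table: rows = samples, columns = MAGs (made-up values).
abund <- matrix(
  c(0.5, 0.3, 0.2,
    0.1, 0.1, 0.8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("S1", "S2"), c("MAG1", "MAG2", "MAG3"))
)

# Toy presence matrix: rows = MAGs, columns = KOs (1 = MAG carries the KO).
presence <- matrix(
  c(1, 0,
    1, 1,
    0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("MAG1", "MAG2", "MAG3"), c("K00001", "K00002"))
)

# The MAG order must agree between the two matrices before multiplying.
stopifnot(identical(colnames(abund), rownames(presence)))

# Result: samples x KOs.
KO_abund <- abund %*% presence
print(KO_abund)
```

For sample S1, K00001 is carried by MAG1 and MAG2, so its abundance is 0.5 + 0.3 = 0.8.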

teddy256 May 9, 2023

Thank you so much!

SilasK · 2022-12-13T10:27:52Z
Maintainer

Another way to analyze this would be to do a kind of enrichment analysis based on the genomes.

E.g., among all the genomes that change above a threshold, are they enriched in a KO or in a KEGG module?

3 replies

LunavdL Jan 12, 2023
Author

Hi @SilasK,
I was wondering if you know of an R package you would advise using to test whether certain genomes are enriched in certain KOs? (there are a lot of options available, but many functions seem to be working mainly for a specific KEGG organism and I have a diverse microbiome)

I used ALDEx2 to obtain a list of MAGs that significantly change in abundance with salinity (my environmental variable) and I have a KO count table for my samples (obtained with a matrix multiplication of the count table and KO presence table, both containing only the significant MAGs). Now I would like to test whether these MAGs that are significantly more abundant in low versus high salinity are enriched in specific KOs.

SilasK Jan 13, 2023
Maintainer

I'm sorry, I'm not an R expert.
Just to clarify, my idea in this second point was a bit different.

Take the effect from ALDEx2 and then check, for each KO in the KO-genome presence matrix, whether there is an enrichment of the taxa that increased.
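
One possible reading of this idea is a per-KO over-representation test. The base R sketch below uses a one-sided Fisher's exact test on toy data; the choice of test, the matrix presence and the vector increased are assumptions made for illustration and are not a method prescribed in this thread. In practice, increased would be derived from the ALDEx2 results with a threshold of your choice.

```r
# Toy KO-genome presence matrix: rows = genomes, columns = KOs (made-up values).
presence <- matrix(
  c(1, 1,
    1, 0,
    1, 0,
    1, 0,
    0, 1,
    0, 0,
    0, 0,
    0, 0),
  nrow = 8, byrow = TRUE,
  dimnames = list(paste0("MAG", 1:8), c("K00001", "K00002"))
)

# Hypothetical flag per genome: TRUE if it increased in abundance.
increased <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
names(increased) <- rownames(presence)

# For each KO: are genomes carrying it over-represented among the increased genomes?
p_values <- sapply(colnames(presence), function(ko) {
  has_ko <- factor(presence[, ko] == 1, levels = c(FALSE, TRUE))
  inc    <- factor(increased, levels = c(FALSE, TRUE))
  fisher.test(table(has_ko, inc), alternative = "greater")$p.value
})

# Correct for testing many KOs (Benjamini-Hochberg).
p_adjusted <- p.adjust(p_values, method = "BH")
print(cbind(p_values, p_adjusted))
```

With thousands of KOs, the multiple-testing correction matters, and KOs present in very few genomes will have little power.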

LunavdL Jan 17, 2023
Author

Thank you for the clarification!

SilasK · 2023-05-08T12:08:00Z
Maintainer

@teddy256 @LunavdL

For your information, in version 2.15.2 I added the option to annotate genes with DRAM, which gives more precise and up-to-date KO annotations than eggNOG.

You need to add - dram to the genecatalog annotation section.

LunavdL May 9, 2023
Author

Thank you for letting us know!

teddy256 May 10, 2023

Thank you @SilasK. Using atlas version 2.15.2 I was able to get the medium_coverage.h5, kegg.parquet and cazy.parquet. Precisely what I needed.
