Processed Data Access #15

Qiqing-Fu · 2023-12-15T03:44:40Z

We are having trouble downloading the pre-trained model and the pre-processed data stored on Amazon. Could you share them with me by email?

Qiqing-Fu · 2023-12-15T04:01:09Z

Also, where can I find /home/ubuntu/COVID_Data/NeuroCOVID/TrainSplitData/NeuroCOVID_preprocessed_splitted.h5ad and /home/ubuntu/scGAN_ProcessedData/MADE_BY_scGAN/20Kneurons_2KTest.h5?

dr-aheydari · 2023-12-15T05:37:58Z

Hi @Seraph-009,

Thank you for your interest in ACTIVA! Given the size and the number of files, it would be best to share it over a cloud service, such as AWS (as we have).

If you are not familiar with downloading data from AWS, I highly recommend taking a look at issue #14. I provided some guidance on different ways of downloading our data from AWS there : )

The path in the code is a local path that we used for training (on a virtual machine). Once the data is downloaded, you would need to replace it with the path to the data on your machine.
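As a rough illustration of swapping the hard-coded path for a local one, here is a minimal sketch. The `ACTIVA_DATA_DIR` environment variable and the `local_path` helper are hypothetical names made up for this example; they are not part of ACTIVA, and the sketch assumes the downloaded files keep their original file names.

```python
import os
from pathlib import Path, PurePosixPath

# Hypothetical environment variable pointing at the folder where the
# downloaded files live; falls back to ./data if it is not set.
DATA_ROOT = Path(os.environ.get("ACTIVA_DATA_DIR", "./data"))


def local_path(original_path: str, data_root: Path = DATA_ROOT) -> Path:
    """Map a hard-coded training path onto a local data folder.

    Only the file name of the original (POSIX-style) path is kept;
    the directory part is replaced by `data_root`.
    """
    return data_root / PurePosixPath(original_path).name


# The path below is the one mentioned earlier in this thread.
train_file = local_path(
    "/home/ubuntu/COVID_Data/NeuroCOVID/TrainSplitData/"
    "NeuroCOVID_preprocessed_splitted.h5ad"
)
print(train_file)
```

The resulting `train_file` can then be passed wherever the original script expects the data path.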

I hope this helps. Please let me know if you have any other questions : )

Best,
Ali

Qiqing-Fu · 2023-12-16T06:32:49Z

Thank you for your kind reply.
However, registering an Amazon account requires a VISA card, which I don't have at the moment. I live in China, and we don't use VISA cards here.
Could you provide the script that converts raw_68kPBMCs.h5ad into 68kPBMCs_7kTest.h5ad? That might help me more. Thank you!

dr-aheydari · 2023-12-19T23:38:06Z

hi @Seraph-009,

Sorry to hear about the AWS registration issues. Of course, I'd be happy to point you to the Notebooks for splitting sc files to train/test sets.

This notebook shows how one can use our SCProcessing pipeline for splitting the data into train/test/validation sets. This is how we went from <dataset>.h5ad to <dataset>_xkTrain/Test.h5ad : )
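For readers who cannot open the notebook, the general idea of a random train/validation/test split can be sketched as below. This is not the SCProcessing pipeline itself, only a generic illustration using the Python standard library. The `split_indices` helper, the seed, and the cell counts are assumptions for the example (the 68,000 and 7,000 figures are placeholders suggested by the file name 68kPBMCs_7kTest.h5ad, not confirmed values).

```python
import random


def split_indices(n_cells, test_size, valid_size=0, seed=0):
    """Randomly partition cell indices into train/validation/test lists.

    Hypothetical helper for illustration; every index ends up in
    exactly one of the three returned lists.
    """
    if test_size + valid_size > n_cells:
        raise ValueError("test_size + valid_size exceeds n_cells")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    idx = list(range(n_cells))
    rng.shuffle(idx)
    test = sorted(idx[:test_size])
    valid = sorted(idx[test_size:test_size + valid_size])
    train = sorted(idx[test_size + valid_size:])
    return train, valid, test


# Placeholder sizes: hold out 7,000 of 68,000 cells for testing.
train_idx, valid_idx, test_idx = split_indices(n_cells=68_000, test_size=7_000)
print(len(train_idx), len(valid_idx), len(test_idx))
```

With the `anndata` package, index lists like these could then be used to subset a loaded object (for example `adata[test_idx].copy()`) and write each subset to its own .h5ad file.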

I hope this helps!

Best,
Ali

Qiqing-Fu · 2023-12-22T07:33:15Z

Hi @dr-aheydari,
Thank you for the instructions, it did work!
However, looking through the pipeline, I see that the labels of the generated cells are subsampled from the raw labels, which puzzles me. Does this work?
For example, I have raw data with 9000 cells and their corresponding cell types. If I want to generate 20000 cells, how should I use ACTIVA?
I appreciate your patience.
