Parsing bug for GEO dataset summary and newlines #36

Open
JTFouquier opened this issue May 10, 2016 · 3 comments
Comments

@JTFouquier
Collaborator

When loading GEO datasets, the "summary" field should be imported into the metadata. There was a bug importing the summary for GSE18842. The summary right now only reads "PURPOSE", presumably because it was truncated at the first newline in the GEO entry (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18842). If this diagnosis is correct, then the GEO data importer should be fixed to import the entire summary field.
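To illustrate the suspected failure mode, here is a minimal sketch, assuming the summary reaches the importer as one string with its paragraphs separated by newlines. The function names and the sample text are hypothetical; they are not taken from the actual importer or from the real GSE18842 entry.

```python
def truncated_summary(raw_summary):
    """Mimic the suspected bug: keep only the text before the first newline."""
    return raw_summary.split("\n", 1)[0].strip()


def full_summary(raw_summary):
    """Keep every non-empty line, joining them with single spaces."""
    lines = [line.strip() for line in raw_summary.splitlines() if line.strip()]
    return " ".join(lines)


# Made-up text shaped like a multi-paragraph summary (assumption, not real GEO data).
example = "PURPOSE\nFirst paragraph of the summary.\n\nRESULTS\nSecond paragraph."

print(truncated_summary(example))
print(full_summary(example))
```

With this input the first function returns only "PURPOSE", which matches the symptom above, while the second keeps the whole text. Whether the line breaks should be collapsed to spaces or preserved is a separate choice for the importer.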


@JTFouquier
Collaborator Author

Confirmed that the "ParseGSEFiles.py" script takes only the first paragraph of the summary text. The logic should be fixed there.


Original comment by: Chunlei Wu

@JTFouquier
Collaborator Author

Re-uploading GEO datasets requires a revamp of the original "ParseGSEFiles.py" script, as the original insilicodb API no longer works. Their new API is here:

https://insilicodb.org/api/class-interface-controller.html


Original comment by: Chunlei Wu

@JTFouquier
Collaborator Author

Instead of fixing this bug, we will build a new dataset-loading pipeline in #37.


Original comment by: Chunlei Wu
