Fix: Multiple header lines in the genbank tsv file
The fetch of RSV GenBank data involved three separate curl calls, each of which returned its own header line, so the fetched TSV data contained three header lines. The first header line was interpreted correctly, but the 2nd and 3rd were parsed as data and turned into invalid metadata and sequence records in the data/genbank.ndjson file, which caused the subsequent data processing in the transform rule to fail. This commit fixes the issue by using tsv-append to concatenate the three files, ensuring that only a single header line is kept.
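
To illustrate the idea behind the fix, here is a minimal Python sketch of a header-aware append: the header is taken from the first input and the first line of every later input is dropped. This is not the code from this commit, which relies on tsv-append; the function name `append_tsv` and the sample column names are hypothetical, and the header-mismatch check is an extra safeguard added for this sketch.

```python
from typing import Iterable, List, Optional


def append_tsv(chunks: Iterable[str]) -> str:
    """Concatenate TSV texts, keeping only the header of the first non-empty one.

    Hypothetical helper for illustration; each chunk is assumed to be the
    full text of one TSV download, starting with its own header line.
    """
    out: List[str] = []
    header: Optional[str] = None
    for chunk in chunks:
        lines = chunk.splitlines()
        if not lines:
            continue  # skip empty downloads
        if header is None:
            # First non-empty chunk: keep its header line.
            header = lines[0]
            out.append(header)
        elif lines[0] != header:
            # Safeguard added in this sketch: refuse to merge mismatched files.
            raise ValueError("header mismatch between TSV inputs")
        # Data rows only; the repeated header line is dropped.
        out.extend(lines[1:])
    return ("\n".join(out) + "\n") if out else ""


# Three downloads, each with its own header (made-up sample data).
parts = [
    "accession\tstrain\nA1\tx\n",
    "accession\tstrain\nB2\ty\n",
    "accession\tstrain\nC3\tz\n",
]
merged = append_tsv(parts)
```

Naively joining `parts` would leave two extra `accession	strain` lines in the middle of the data, which is the failure described above; `merged` contains that line once, followed by the three data rows.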