For example, the download fails for accession OM900516.2 with this message:
$ datasets download genome accession OM900516.2 --filename OM900516.zip --assembly-version latest --include genome
Error: invalid or unsupported assembly accession: OM900516
Use datasets download genome accession <command> --help for detailed help about a command.
The reason is that accessions of this kind are only available through the NCBI Virus data package, so a different subcommand is needed to download the genome (and other associated files).
This command works:
$ datasets download virus genome accession OM900516.2 --filename OM900516.2.zip --include genome
New version of client (16.27.2) available at https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/LATEST/linux-amd64/datasets.
Downloading: OM900516.2.zip 15.1kB valid zip structure -- files not checked
Validating package [================================================] 100% 5/5
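A minimal sketch of how the two commands above could be combined: try the regular genome subcommand first and fall back to the virus subcommand only if it fails. The function name `fetch_accession` and the default zip name are hypothetical and not taken from the existing workflow; only the `datasets` invocations shown above are reused.

```shell
# Sketch only: fetch_accession is a hypothetical helper, not existing workflow code.
# Prints which data package succeeded ("genome" or "virus") on stdout.
fetch_accession() {
  local accession="$1"
  local zip="${2:-${accession}.zip}"

  # Regular assembly accessions go through the genome data package.
  # CLI output is sent to stderr so stdout carries only the label.
  if datasets download genome accession "$accession" \
       --filename "$zip" --assembly-version latest --include genome >&2; then
    echo "genome"
    return 0
  fi

  # Accessions such as OM900516.2 are only served by the virus data package.
  if datasets download virus genome accession "$accession" \
       --filename "$zip" --include genome >&2; then
    echo "virus"
    return 0
  fi

  echo "error: could not download ${accession} with either subcommand" >&2
  return 1
}
```

Calling `fetch_accession OM900516.2 OM900516.2.zip` would attempt both subcommands in order. Because the genome subcommand is always tried first, accessions that already work take the same path as before.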
I started a dev branch called cjk-assembly-fetch for this a long time ago, but it fell by the wayside as higher priorities arose.
It would be good to keep committing to this branch and make the support more complete. Things that need to be done:
- Update the Docker image to the latest version of the NCBI datasets CLI tool. StaPH-B has an image that is a little more up to date, though likely not the very latest version.
- Clean up output file handling and ensure that regular downloads, which do not use the datasets download virus subcommand, are unaffected by the changes.
- Review the datasets documentation before further code development, since there may be other CLI features we want to make accessible to the user.
- Test on a variety of virus and non-virus accessions to ensure functionality.
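One possible way to keep regular downloads unaffected is to choose the subcommand from the accession format before calling the CLI. The sketch below relies on the fact that NCBI assembly accessions start with `GCA_` or `GCF_`; treating every other accession as a virus accession is an assumption made here for illustration, and `select_subcommand` is a hypothetical helper name.

```shell
# Sketch only: select_subcommand is a hypothetical helper.
# Prints the words to place between "datasets download" and "accession".
select_subcommand() {
  case "$1" in
    # Assembly accessions (GenBank GCA_ / RefSeq GCF_) keep using the genome package.
    GCA_*|GCF_*) echo "genome" ;;
    # Assumption: anything else (e.g. OM900516.2) is routed to the virus package.
    *) echo "virus genome" ;;
  esac
}
```

The result is meant to be expanded unquoted so that `virus genome` splits into two words, for example `datasets download $(select_subcommand "$acc") accession "$acc" --filename "$acc.zip" --include genome`. Whether that routing rule is too broad is exactly what testing across a range of accessions would need to confirm.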
📌 Explain the Request
Some GenBank accessions cannot be downloaded with the command currently used in the Assembly_fetch workflow.
The command is defined here: https://github.com/theiagen/public_health_bioinformatics/blob/5be343354f716d77e9e4a0fb4a2ec10eb3bc00a5/tasks/utilities/data_import/task_ncbi_datasets.wdl#L27C5-L28C24