Ask VEuPathDb for reference sync with NCBI #191

Open
nekrut opened this issue Dec 5, 2024 · 2 comments
@nekrut
Contributor

nekrut commented Dec 5, 2024

Contact VEuPathDB about syncing references with NCBI.

@nekrut nekrut converted this from a draft issue Dec 5, 2024
@nekrut nekrut self-assigned this Dec 5, 2024
@d-callan
Collaborator

d-callan commented Dec 9, 2024

Some things to think about prior to a call:

  1. It might help to put together a list, in a Google Sheet or another format we can share with them, of which of their assemblies don't match the GCA accession they claim, and which of those are in our list of initial targets.
  2. It might be possible to do some interesting things in spite of these out-of-sync issues. For example, a first step might be to let people run Galaxy workflows on the assembly claimed by the GCA accession, regardless of whether it's identical to what's actually in NCBI, and then pass gene identifiers rather than loci back to VEuPathDB. Their JBrowse tracks wouldn't work, but I'd think some other plots would.
  3. They might have thoughts on our initial organisms list too. I've done my best to include the big ones I remember, in addition to what Anton already had, but there might be a couple they'd really like people to be able to run Galaxy workflows on. Are we open to that feedback?
  4. How much does it really matter what they're claiming as the reference, as long as we also have that assembly in addition to the NCBI RefSeq one? Is this a 'let the user decide' type of moment? Maybe I've misunderstood something.
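
For item 1, one way the mismatch list could be built is by comparing per-sequence checksums between the two copies of an assembly. The sketch below is only an illustration under assumptions: both copies are available as FASTA, an MD5 of the uppercased sequence is an acceptable fingerprint, and the function names (`read_fasta`, `sequence_digests`, `compare_assemblies`) are hypothetical helpers, not part of any VEuPathDB or NCBI tooling.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from FASTA-formatted lines."""
    seq_id = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""  # ID is the first token of the header
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def sequence_digests(lines: Iterable[str]) -> Dict[str, str]:
    """Map MD5(uppercased sequence) -> sequence ID.

    Assumption: case (soft-masking) is ignored. Duplicate sequences
    collapse onto one entry, which is acceptable for a rough report.
    """
    return {
        hashlib.md5(seq.upper().encode("ascii")).hexdigest(): seq_id
        for seq_id, seq in read_fasta(lines)
    }


def compare_assemblies(veupath_lines: Iterable[str], genbank_lines: Iterable[str]) -> dict:
    """Report which sequences have no identical counterpart in the other copy."""
    veupath = sequence_digests(veupath_lines)
    genbank = sequence_digests(genbank_lines)
    shared = veupath.keys() & genbank.keys()
    return {
        "identical": veupath.keys() == genbank.keys(),
        "only_in_veupath": sorted(veupath[d] for d in veupath.keys() - shared),
        "only_in_genbank": sorted(genbank[d] for d in genbank.keys() - shared),
    }
```

Running `compare_assemblies` over each assembly pair and writing one row per assembly would give a table that could be pasted into a shared sheet. It only flags that sequences differ; it does not say how or why.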

@nekrut nekrut moved this to In Progress in BRC development tasks Dec 12, 2024
@nekrut
Contributor Author

nekrut commented Dec 17, 2024

Here is a link to slides from the meeting: https://docs.google.com/presentation/d/1JDHoirXmqb6ovHW-eKWq232alnI_Ro2yb_6c5PQdM5U/edit?usp=sharing

The next steps are:

• Vamsi to provide a list of genomes where VEuPathDB and GenBank sequences differ.
• Terrence/NCBI team to allocate staff time (estimated 1 week) to analyze and categorize genome discrepancies.
• VEuPathDB team to provide usage statistics to help prioritize genomes for analysis.
• Anton to follow up in the second week of January to schedule the next meeting on genome reconciliation.
• VEuPathDB team to discuss internally about adding links to BRC Analytics tools.
• BRC Analytics team to implement sequence identifier mapping between VEuPathDB and GenBank accessions.
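
The last item, identifier mapping, could be approached roughly as follows. This is a minimal sketch, not the planned implementation: it assumes sequences that are byte-identical (ignoring case) can be paired by checksum, and the names `build_id_map` and `translate_bed_line`, along with the example identifiers, are hypothetical.

```python
import hashlib
from typing import Dict, Optional


def digest(seq: str) -> str:
    """Case-insensitive MD5 fingerprint of a sequence."""
    return hashlib.md5(seq.upper().encode("ascii")).hexdigest()


def build_id_map(
    veupath_seqs: Dict[str, str], genbank_seqs: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Map each VEuPathDB sequence ID to the GenBank accession with the
    same sequence, or None when no identical sequence exists."""
    by_digest = {digest(seq): acc for acc, seq in genbank_seqs.items()}
    return {vid: by_digest.get(digest(seq)) for vid, seq in veupath_seqs.items()}


def translate_bed_line(line: str, id_map: Dict[str, Optional[str]]) -> Optional[str]:
    """Rewrite the first (sequence name) column of a tab-separated BED line.

    Returns None when the sequence has no mapped accession, since the
    coordinates cannot be trusted if the underlying sequences differ.
    """
    fields = line.rstrip("\n").split("\t")
    target = id_map.get(fields[0])
    if target is None:
        return None
    return "\t".join([target] + fields[1:])
```

For example, with `{"chr1": "CM000001.1"}` as the map (made-up identifiers), the line `chr1<TAB>10<TAB>20` would come back with `CM000001.1` in the first column. Sequences that differ between the two sources are deliberately left unmapped here; reconciling those is what the discrepancy analysis above is for.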
