
Input files

Pieter Verschaffelt edited this page Mar 20, 2023 · 4 revisions

This page lists all the files (and file formats) used by the different steps in the construction process of Unipept's database.

NCBI

The files in this section are all provided by NCBI and can be retrieved from their FTP server: https://ftp.ncbi.nih.gov/pub/taxonomy/taxdmp.zip. The ZIP file linked above contains more files than the ones listed below, but these are not relevant to our project.

names.dmp

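This file contains all NCBI taxa and their associated names. A taxon can appear on several lines, once for each name (for example, a synonym and a scientific name).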

Example

1	|	all	|		|	synonym	|
1	|	root	|		|	scientific name	|
2	|	Bacteria	|	Bacteria <bacteria>	|	scientific name	|
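
As an illustration of the format above, the following minimal Python sketch splits names.dmp lines into fields. It assumes that fields are separated by a tab, a pipe and a tab, and that each line ends with a tab and a pipe, as in the example. The function names and the field names (tax_id, name, unique_name, name_class) are chosen for this sketch and are not part of Unipept's code.

```python
def parse_names(lines):
    """Yield (tax_id, name, unique_name, name_class) for each names.dmp line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Drop the trailing "\t|" terminator before splitting on "\t|\t".
        if line.endswith("\t|"):
            line = line[:-2]
        tax_id, name, unique_name, name_class = line.split("\t|\t")[:4]
        yield int(tax_id), name, unique_name, name_class


def scientific_names(lines):
    """Map each taxon id to its scientific name, ignoring other name classes."""
    return {
        tax_id: name
        for tax_id, name, _unique, name_class in parse_names(lines)
        if name_class == "scientific name"
    }


if __name__ == "__main__":
    sample = [
        "1\t|\tall\t|\t\t|\tsynonym\t|\n",
        "1\t|\troot\t|\t\t|\tscientific name\t|\n",
        "2\t|\tBacteria\t|\tBacteria <bacteria>\t|\tscientific name\t|\n",
    ]
    print(scientific_names(sample))
```

For a real file, the same functions can be called on an open file handle, since iterating over a text file yields its lines.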

nodes.dmp

This file contains all NCBI taxa with their rank and parent taxon, from which the lineage (as described by the NCBI taxonomy) can be derived.

Example

1	|	1	|	no rank	|		|	8	|	0	|	1	|	0	|	0	|	0	|	0	|	0	|		|
2	|	131567	|	superkingdom	|		|	0	|	0	|	11	|	0	|	0	|	0	|	0	|	0	|		|
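
The sketch below shows one possible way to read nodes.dmp and walk from a taxon up to the root. It assumes that the first three fields are the taxon id, the parent taxon id and the rank (as the example rows suggest), and that the root lists itself as its own parent. It splits on the pipe character and strips whitespace, so it tolerates irregular spacing around the separators. The function names are hypothetical and not taken from Unipept.

```python
def parse_nodes(lines):
    """Return {tax_id: (parent_tax_id, rank)} from nodes.dmp lines."""
    nodes = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("|")]
        tax_id, parent_id, rank = int(fields[0]), int(fields[1]), fields[2]
        nodes[tax_id] = (parent_id, rank)
    return nodes


def lineage(nodes, tax_id):
    """List taxon ids from tax_id up to the root (or the last known ancestor)."""
    path = []
    # Stop at an unknown parent, or when a taxon repeats (the root is its own parent).
    while tax_id in nodes and tax_id not in path:
        path.append(tax_id)
        tax_id = nodes[tax_id][0]
    return path


if __name__ == "__main__":
    sample = [
        "1\t|\t1\t|\tno rank\t|\t\t|\t8\t|\t0\t|\t1\t|\t0\t|\t0\t|\t0\t|\t0\t|\t0\t|\t\t|\n",
        "2\t|\t131567\t|\tsuperkingdom\t|\t\t|\t0\t|\t0\t|\t11\t|\t0\t|\t0\t|\t0\t|\t0\t|\t0\t|\t\t|\n",
    ]
    nodes = parse_nodes(sample)
    # Taxon 131567 is not in the two sample rows, so the walk stops at taxon 2.
    print(lineage(nodes, 2))
```

With the complete file loaded, the walk continues through every ancestor until it reaches taxon 1.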