From 21d39fe4bad1f0cf22537ba3814d5286865b4d04 Mon Sep 17 00:00:00 2001
From: EliLawrence
1.5.0.1 General
Where can I make suggestions for improvements on this Manual?
1.5.0.3 Formatting Data
How does data flow in OBIS?
+Can I add polygon data to OBIS?
+
+Yes, polygons, lines, or combinations of polygon, line, and/or point data can be added to OBIS by using the footprintWKT field. This can be used to record tracks, transects, tows, trawls, habitat extent, or cases where an exact location is not known. Midpoints of polygons can be added to the required fields decimalLongitude and decimalLatitude.
The centroid and radius of a WKT polygon can be recorded in decimalLongitude, decimalLatitude, and coordinateUncertaintyInMeters respectively. There is also an obistools R function to calculate the centroid and radius for WKT polygons.
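To illustrate what a centroid and radius mean for these three fields, here is a minimal base R sketch. It is a stand-in for illustration only, not the obistools implementation: the helper names `wkt_ring`, `haversine_m`, and `polygon_midpoint` are hypothetical, the centroid is a planar approximation in degrees, and only simple `POLYGON` strings without holes are handled.

```r
# Parse the outer ring of a simple WKT POLYGON (no holes) into a lon/lat matrix.
wkt_ring <- function(wkt) {
  inner <- sub("^\\s*POLYGON\\s*\\(\\((.*)\\)\\)\\s*$", "\\1", wkt, ignore.case = TRUE)
  pts <- strsplit(trimws(strsplit(inner, ",")[[1]]), "\\s+")
  do.call(rbind, lapply(pts, as.numeric))
}

# Great-circle distance in meters (haversine formula, mean Earth radius).
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371008.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Planar (shoelace) centroid, plus the largest centroid-to-vertex distance
# as a conservative radius. Returns NaN coordinates for zero-area rings.
polygon_midpoint <- function(wkt) {
  xy <- wkt_ring(wkt)
  x <- xy[, 1]
  y <- xy[, 2]
  i <- seq_len(nrow(xy) - 1)
  j <- i + 1
  cross <- x[i] * y[j] - x[j] * y[i]
  area <- sum(cross) / 2
  cx <- sum((x[i] + x[j]) * cross) / (6 * area)
  cy <- sum((y[i] + y[j]) * cross) / (6 * area)
  data.frame(
    decimalLongitude = cx,
    decimalLatitude = cy,
    coordinateUncertaintyInMeters = max(haversine_m(cx, cy, x, y))
  )
}

polygon_midpoint("POLYGON ((0 0, 2 0, 2 2, 0 2, 0 0))")
```

For real datasets, a geodesic-aware tool is preferable, since a planar centroid drifts for large polygons or those near the poles or the antimeridian.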
+1.5.0.3 Formatting Data
DNA data guidelines for more details.
Occurrences unknown or new to science should be documented according to recommendations by Horton et al. 2021. You should populate the scientificName field with the genus, and in identificationQualifier provide the Open Nomenclature (ON) sign ‘sp.’. However, you must also indicate the reason why species-level identification is unavailable. To do this, supplement ‘sp.’ with either stet. (stetit) or indet. (indeterminabilis). If neither of these is applicable (e.g., for undescribed new species), add a unique taxon identifier code after ‘sp.’ in identificationQualifier. For example, Eurythenes sp. DISCOLL.PAP.JC165.674.
As with eventIDs or occurrenceIDs, you should strive to provide a more complex and globally unique identifier. Identifiers could be constructed by combining higher taxonomic information with information related to a collection, institution, museum or collection code, sample number or museum accession number, expedition, dive number, or timestamp. This ensures namestrings will remain unique within larger repositories like OBIS. It is also recommended to include these temporary names on specimen labels for physical specimens.
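The ‘sp.’ convention above can be sketched in base R. The helper names `make_taxon_code` and `make_identification_qualifier` are hypothetical, and the `.` delimiter is an assumption taken from the Eurythenes example rather than a documented rule.

```r
# Join collection-level parts (project, site, cruise, sample number, ...)
# into one taxon identifier code. The "." delimiter is an assumption.
make_taxon_code <- function(...) {
  paste(c(...), collapse = ".")
}

# Build the identificationQualifier value: 'sp.' plus either a reason
# (stet. / indet.) or a unique taxon identifier code.
make_identification_qualifier <- function(reason = c("stet.", "indet.", "code"),
                                          code = NULL) {
  reason <- match.arg(reason)
  if (reason == "code") {
    if (is.null(code) || !nzchar(code)) {
      stop("a taxon identifier code is required when reason = 'code'")
    }
    return(paste("sp.", code))
  }
  paste("sp.", reason)
}

record <- data.frame(
  scientificName = "Eurythenes",
  identificationQualifier = make_identification_qualifier(
    "code", make_taxon_code("DISCOLL", "PAP", "JC165", "674")
  )
)
record
```

Here scientificName holds only the genus, while the qualifier carries the temporary name, so the genus can still be matched against WoRMS.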
+Yes. The identifier in scientificNameID should always correspond with the name that is in the scientificName field, even if the name is an unaccepted name in WoRMS. For example, the species name “Holothuria mammiculata” was provided, but this name is unaccepted in WoRMS. The accepted name is “Holothuria (Stauropora) pervicax Selenka, 1867”. In this case scientificNameID should correspond to the original name with LSID urn:lsid:marinespecies.org:taxname:529968 because the ID must correlate with the name as recorded in scientificName.
When species are reclassified in WoRMS, the original scientificName and scientificNameID provided in a dataset remain unchanged. However, WoRMS will list the old ID as “Unaccepted” and link to the accepted taxon entry, and this will be reflected in the taxonomic information attached to a dataset download.
For example, if we search for Manta birostris in OBIS (https://obis.org/taxon/105857), we see that the taxon’s status in WoRMS is unaccepted. At the bottom of the page it links to the currently accepted name: https://obis.org/taxon/1026118. We can find an occurrence which shows the source scientificName as “Manta” while the interpreted scientificName is “Mobula”: https://obis.org/occurrence/0020c873-02f1-4bd7-b396-ad36600bc8b2. We can also see that originalScientificName is populated with the source name in the interpreted output.
As a user, you don’t have to trace species names. However, if the dataset’s DwC-A is downloaded from the dataset page instead of obtained through R or the Mapper, all fields will contain the original value. It remains good practice to also check identifiers against WoRMS to see if any have been updated when you download data.
Yes. There is an Excel template generator developed by Luke Marsden & Olaf Schneider as part of the Nansen Legacy project. Note this template generator is aimed at GBIF users, so make sure to account for and include required OBIS terms.
There is also this Excel to Darwin Core macro tool developed by GBIF Norway you can use to help generate templates.
If the areas OBIS currently uses do not work for your use case, then it is best to first define all the boundaries for the desired regions. OBIS can be queried using WKT polygons by providing a WKT string to the geometry parameter in the robis::occurrence function. However, there are some limitations with respect to polygon complexity, and if the polygon is too complex you will likely receive the error “The OBIS API was not able to process your request”.
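As a minimal sketch of such a geometry query, the snippet below builds a simple rectangular WKT polygon and passes it to robis::occurrence. The helper `bbox_to_wkt`, the coordinates, and the taxon are illustrative assumptions. The download is guarded because it needs the robis package and network access.

```r
# Build a rectangular WKT polygon (counter-clockwise, closed ring)
# from a bounding box given in decimal degrees.
bbox_to_wkt <- function(xmin, ymin, xmax, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  sprintf(
    "POLYGON ((%s %s, %s %s, %s %s, %s %s, %s %s))",
    xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax, xmin, ymin
  )
}

area_wkt <- bbox_to_wkt(2.5, 51, 3.5, 52)

# Only query when run interactively with robis installed.
if (interactive() && requireNamespace("robis", quietly = TRUE)) {
  occ <- robis::occurrence("Mollusca", geometry = area_wkt)
}
```

A polygon with few vertices, like this one, is less likely to run into the complexity limit than a detailed coastline or boundary polygon.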
For more complex spatial queries we recommend indexing OBIS and GBIF data against polygons and using (finely) gridded versions of these datasets to make the process faster. We note we have not yet properly documented this process, but see the example script produced by Pieter Provoost below. The script first indexes a polygon to the H3 spatial index, then queries a gridded version of OBIS+GBIF data on AWS to get the species list, and finally fetches taxonomy from WoRMS for every species, which may take some time.
+library(readr)
+library(h3jsr)
+library(sf)
+library(duckdb)
+library(DBI)
+library(dplyr)
+
+sf_use_s2(FALSE)
+
+# Read WKT from text file, convert to sf, and index to H3 resolution 7
+# https://wktmap.com/?e6b28728
+
+wkt <- read_file("wkt_21773.txt")
+geom <- st_as_sfc(wkt, crs = 4326)
+cells <- data.frame(cell = polygon_to_cells(geom, 7)[[1]])
+
+# Set up duckdb connection and register cells table
+
+con <- dbConnect(duckdb())
+# dbExecute runs the statements without leaving an open result set behind
+dbExecute(con, "install httpfs; load httpfs;")
+duckdb_register(con, "cells", cells)
+
+# Join cells list and gridded species dataset
+
+species <- dbGetQuery(con, "
+ select species, AphiaID
+ from cells
+ inner join read_parquet('s3://obis-products/speciesgrids/h3_7/*') h3 on cells.cell = h3.h3_07
+ group by species, AphiaID
+")
+
+# Add WoRMS taxonomy
+
+id_batches <- split(species$AphiaID, ceiling(seq_along(species$AphiaID) / 50))
+taxa_batches <- purrr::map(id_batches, worrms::wm_record)
+taxa <- bind_rows(taxa_batches) %>%
+ select(AphiaID, scientificname, phylum, class, order, family, genus, scientificName = scientificname)
+
+# Get Mollusca species
+
+mollusca <- taxa %>%
+ filter(phylum == "Mollusca")
+When you download data from the Mapper or full export, the data you receive is flattened into one table with occurrence plus event data. eMoF data tables are provided separately upon request. However, when you download a dataset from the OBIS homepage or dataset page, all tables (Event, Occurrence, eMoF) are separate files.
+From the OBIS homepage, you can search for data in the search bar in the middle of the page. You can search by particular taxonomic groups, common names, dataset names, OBIS nodes, institute name, areas (e.g., Exclusive Economic Zone (EEZ)), or by the data provider’s country.
@@ -355,7 +357,7 @@
+Watch this video demonstration of how to use the Mapper as well as the OBIS homepage search.
If you’d like to then download this data, you can simply export R objects with the write.csv function. For example, if we wanted to obtain Mollusc data from OBIS:
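A minimal sketch of that export, assuming the robis package is installed and network access is available. The helper name `save_occurrences` and the file name `mollusca-obis.csv` are illustrative choices, not part of robis. The download is guarded because a phylum-level query returns a very large table.

```r
# Write an occurrence table to CSV without the row-number column.
save_occurrences <- function(occ, file) {
  write.csv(occ, file, row.names = FALSE)
  invisible(file)
}

# Only download when run interactively with robis installed.
if (interactive() && requireNamespace("robis", quietly = TRUE)) {
  moll <- robis::occurrence("Mollusca")
  save_occurrences(moll, "mollusca-obis.csv")
}
```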
This file will be saved to your working directory (if you are not familiar with working directories, read here). After opening the file, you will notice that the fields in the download do not include every possible field, but instead only those where information has been recorded by data providers, plus the fields added by OBIS’s quality control pipeline.
To use robis for visualizing and mapping occurrences, see the Visualization section of the manual.
Watch the video below for a walkthrough of how to use the robis package to obtain OBIS data.
Watch the video tutorial of this process below.
Watch our video tutorial for a demonstration of this procedure:

Note: You should consider carefully what combination of fields will generate a unique event. Combinations including date, time, location, and depth are common elements to help generate such unique codes. Including the event type can also be useful for datasets with hierarchical sampling methods (e.g., samples taken from a station within a cruise).

Repeating the parentEventID in the child event (use : as delimiter) can make the structure of the dataset easier to understand. Nesting event information in this way also allows you to reduce redundancy and still provide information relevant to each level of sampling.

Broadly, an eventID can take the form of [parentEventID]:[sample type]_[sample ID]

Thus to construct a unique eventID for parent and child events, you join relevant sampling information. Possible configurations (with examples) could include:

4.5.1 eventID

Environmental impact assessments in the eastern part of Adriatic sea - species list of benthic invertebrates and phytobenthos (2000-2010).

We can see that each record has a similar eventID structure, except for the last part, which indicates the event type - documented in the eventRemarks column. In this dataset, records with the eventID IOF_benthos_Plominski_zaljev_2000_crs have information applicable to records with eventIDs ending in _stat1, _stat2, _s01, and _s02 because _crs is their parent event. Similarly, information (e.g., date of station visit, coordinates) documented in records with eventID IOF_benthos_Plominski_zaljev_2000_stat1 is applicable to the two sample records (eventID _s01 and _s02), because these samples were taken at Station 1 (indicated by the parentEventID). These eventIDs could have been nested in another way, such as IOF_benthos_Plominsku_zaljev_2000_crs:stat1:s01, which would embed the parentEventID into the identifier.

See also De Pooter et al. 2017 for an example of an event hierarchy in a complex benthos dataset.

Watch this video for a demonstration on how to construct eventIDs:

05 September, 2024 11 September, 2024

Watch the video below for an overview of all of the above procedures, from uploading data to mapping terms to DwC.

Follow the guidelines on the OBIS metadata standards and best practices page, or check the IPT manual for detailed instructions about the metadata editor. You can also upload a file with metadata information. The video below also demonstrates how to fill metadata on the IPT.

The IPT will now generate your data as Darwin Core, and combine the data with the metadata to package it as a standardized zip-file called a “Darwin Core Archive”. See the IPT manual for more details.

Note: Hitting the “publish” button does not mean that your dataset is available to everyone. It is still private, with access limited to the resource managers. It will only be publicly available when you have changed Visibility to Public. You can choose to do this immediately or at a set date. The first minute of the video below provides an overview of how to publish on the IPT.

OBIS recommends sharing the data as widely as possible, including with other networks such as GBIF. On 13 October 2014, a cooperation agreement was signed between the secretariats of IOC-UNESCO/OBIS and GBIF in which the two parties recognized the two initiatives (OBIS and GBIF) as complementary with common goals (and in particular OBIS’s role in Marine Biodiversity Data). Together they agreed to work towards maximizing the quantity, quality, completeness and fitness for use of marine biodiversity data accessible through OBIS and GBIF, in particular through the development of data standards (DwC) and technology (IPT), maximizing fitness for use, developing biodiversity indicators for assessments, enhancing capacity through training, and coordinating approaches to the global science/policy interface.

At the 4th session of the OBIS Steering Group (SG-OBIS-IV, Feb 2015), it was recommended that GBIF should harvest OBIS tier 2 nodes if OBIS tier 2 nodes could also harvest marine datasets from their GBIF nodes. In this way OBIS could work directly with the entire marine community and promote its standards and best practices. It was not recommended that iOBIS set up a separate IPT for GBIF to harvest, since this would mean a duplication of effort.

In order to publish data with GBIF, the OBIS node also needs to become a data publisher in GBIF, and link the IPT installation with this publishing organization. OBIS nodes are encouraged to use the OBIS node name as the publisher’s name, unless the host institution requires its institutional name to be used. In the latter case, reference to the OBIS node can be added in the description, as well as between brackets in the title. The name of the IPT instance can also refer to the OBIS node. OBIS nodes are also encouraged to select OBIS as the endorsing organization. In this way, the OBIS node is also listed on the OBIS page at GBIF. The video tutorial below will help guide you on registering your OBIS node IPT with GBIF.

The OBIS manual
-Map your data to Darwin Core
-
+
Publish on the IPTConfigure IPT settings section of the IPT manual).
-