CARE-SM CSV Module Documentation

This guide explains how to structure, populate, and utilize CSV files for patient data in CARE-SM.


Step-by-Step Guide for FAIR-in-a-box implementation

  1. Create the CSV File

    Begin with an empty CSV and save it as preCARE.csv. This filename is required for recognition and quality control within CARE-SM.

    The file accepts only the following column names:

    • model, pid, event_id, value, age, value_datatype, valueIRI, activity, unit, input, target, protocol_id, frequency_type, frequency_value, agent, startdate, enddate, comments

    Example:

    model,pid,event_id,value,age,value_datatype,valueIRI,activity,unit,input,target,protocol_id,frequency_type,frequency_value,agent,startdate,enddate,comments
    ,,,,,,,,,,,,,,,,,
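
    The header above can also be generated and checked programmatically. The following is a minimal sketch using only the Python standard library; the function names `create_empty_precare` and `unexpected_columns` are illustrative and are not part of CARE-SM.

```python
import csv

# Column names exactly as listed above, in the order of the example header.
CARE_SM_COLUMNS = [
    "model", "pid", "event_id", "value", "age", "value_datatype", "valueIRI",
    "activity", "unit", "input", "target", "protocol_id", "frequency_type",
    "frequency_value", "agent", "startdate", "enddate", "comments",
]


def create_empty_precare(path="preCARE.csv"):
    """Write a CSV file that contains only the header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(CARE_SM_COLUMNS)


def unexpected_columns(path):
    """Return the header names found in `path` that are not allowed."""
    with open(path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return [name for name in header if name not in CARE_SM_COLUMNS]


if __name__ == "__main__":
    create_empty_precare()
    print(unexpected_columns("preCARE.csv"))
```

    Running the check before handing the file over makes a misspelled column name visible early.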
    
  2. Populate Data
    Each data entry needs specific fields (not all fields are mandatory). See the glossary below for details.

    Example:

    model,pid,event_id,value,age,value_datatype,valueIRI,activity,unit,input,target,protocol_id,frequency_type,frequency_value,agent,startdate,enddate,comments
    Diagnosis,30056,,,,,http://www.orpha.net/ORDO/Orphanet_93552,,,,,,,,,2006-01-19,,
    

    For more examples, refer to the exemplar_data folder.
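
    Counting commas by hand is error-prone, so a row can be built from named fields instead. This sketch uses `csv.DictWriter` from the Python standard library; the helper name `row_to_csv_line` is illustrative, and placing the date in `startdate` is an assumption based on the glossary entry for Diagnosis.

```python
import csv
import io

COLUMNS = [
    "model", "pid", "event_id", "value", "age", "value_datatype", "valueIRI",
    "activity", "unit", "input", "target", "protocol_id", "frequency_type",
    "frequency_value", "agent", "startdate", "enddate", "comments",
]


def row_to_csv_line(row):
    """Serialize one record; columns missing from `row` are left empty.

    DictWriter raises ValueError if `row` has a key outside COLUMNS.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, restval="", lineterminator="\n"
    )
    writer.writerow(row)
    return buffer.getvalue().rstrip("\n")


diagnosis = {
    "model": "Diagnosis",
    "pid": "30056",
    "valueIRI": "http://www.orpha.net/ORDO/Orphanet_93552",
    "startdate": "2006-01-19",  # assumption: the date is the observation date
}
line = row_to_csv_line(diagnosis)
print(line)
```

    Because every row is written against the same column list, each value lands in the column it was named for regardless of how many columns are empty.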

Data Element Glossary

Legend:

  • This column is UNUSED for this case.
  • This column is filled in only when the information is available (IN CASE OF ANY).
  • This column is REQUIRED for this case.

The list below covers every data element and the columns it uses. Fill in optional columns when the information is available; otherwise leave them empty.

Birthdate:

This data element can be queried (to count anonymized patients) through the Beacon API created for CARE-SM.

  • model: Birthdate
  • pid: patient unique identifier.
  • value: ISO 8601 formatted date (not date time)
  • value_datatype:
  • valueIRI:
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation; it may be the same as the value.
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age:
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.
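
The "date, not date time" requirement can be checked before a row is written. This is a minimal sketch, assuming the extended calendar form `YYYY-MM-DD` is what CARE-SM expects; the function name `is_iso_date` is illustrative.

```python
import re
from datetime import date

# Assumption: only the extended calendar form YYYY-MM-DD is accepted.
_DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_iso_date(text):
    """Return True for a real calendar date written as YYYY-MM-DD."""
    if not _DATE_PATTERN.fullmatch(text):
        return False  # rejects date-times, bare years, and other layouts
    try:
        date.fromisoformat(text)
    except ValueError:
        return False  # rejects impossible dates such as 1984-02-30
    return True
```

The same check applies to the `startdate` and `enddate` columns of every data element.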

Birthyear:

This data element can be queried (to count anonymized patients) through the Beacon API created for CARE-SM.

  • model: Birthyear
  • pid: patient unique identifier.
  • value: The year in which a person was born. E.g. 1984
  • value_datatype:
  • valueIRI:
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate:
  • enddate:
  • age:
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Deathdate:

  • model: Deathdate
  • pid: individual identifier, in the form of a patient identifier.
  • value: ISO 8601 formatted date of death (not date time)
  • value_datatype:
  • valueIRI:
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation; it may be the same as the value.
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: Although this element is named "Deathdate", it also accepts an age of death through this age column: the patient age when this observation was taken. The age can complement or replace the value/startdate/enddate information. It is expressed in fractional years, so any decimal figure is accepted, e.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

First Confirmed Visit:

  • model: First_visit
  • pid: individual identifier, in the form of a patient identifier.
  • value: ISO 8601 formatted date of first confirmed visit (not date time). At least one of the value and age columns must be filled in (preferably value).
  • value_datatype:
  • valueIRI:
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation; it may be the same as the value.
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken; it can complement or replace the value/startdate/enddate information. At least one of the value and age columns must be filled in (preferably value). The age is expressed in fractional years, so any decimal figure is accepted, e.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.
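
The "value and/or age" rule for First_visit can be expressed as a small check. This is a sketch only: the function name `first_visit_problems` and the wording of the messages are illustrative, and the CARE-SM quality control may apply further rules.

```python
import math


def first_visit_problems(row):
    """Return a list of human-readable problems found in one First_visit row."""
    problems = []
    value = (row.get("value") or "").strip()
    age = (row.get("age") or "").strip()

    # At least one of the two columns must carry information.
    if not value and not age:
        problems.append("either value or age must be filled in")

    # Age is given in fractional years, e.g. 33.75.
    if age:
        try:
            years = float(age)
        except ValueError:
            problems.append("age must be a number of years, e.g. 33.75")
        else:
            if not math.isfinite(years) or years < 0:
                problems.append("age must be a finite, non-negative number")
    return problems
```

An empty list means the row satisfies the rule; for example, `{"model": "First_visit", "pid": "1", "age": "33.75"}` passes without a value.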

Sex:

This data element can be queried (to count anonymized patients) through the Beacon API created for CARE-SM.

  • model: Sex
  • pid: individual identifier, in form of a patient identifier.
  • value: human readable label, e.g. Female
  • value_datatype:
  • valueIRI: one of the following:
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age:
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Participation status:

  • model: Status
  • pid: individual identifier, in the form of a patient identifier.
  • value: any human readable response, e.g. Lost to follow-up
  • value_datatype:
  • valueIRI: one of the following:
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Education:

  • model: Education
  • pid: individual identifier, in the form of a patient identifier.
  • value: International Standard Classification of Education score. E.g. 7
  • value_datatype:
  • valueIRI:
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Symptom/phenotype:

This data element can be queried (to count anonymized patients) through the Beacon API created for CARE-SM.

  • model: Phenotype
  • pid: individual identifier, in the form of a patient identifier.
  • value:
  • value_datatype:
  • valueIRI: IRI that defines the clinical phenotypic symptom or sign, for example a Human Phenotype Ontology (HPO) term given as a full URL such as http://purl.obolibrary.org/obo/HP_0001251
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Diagnosis:

This data element can be queried (to count anonymized patients) through the Beacon API created for CARE-SM.

  • model: Diagnosis
  • pid: individual identifier, in the form of a patient identifier.
  • value: Human readable label of the diagnosed condition.
  • value_datatype:
  • valueIRI: IRI that defines clinical condition as disease or disorder: Orphanet disease ontology (ORDO) represented with a full URL such as http://www.orpha.net/ORDO/Orphanet_199630
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Genetic:

This data element can be queried (to count anonymized patients) through the Beacon API created for CARE-SM.

  • model: Genetic

  • pid: individual identifier, in the form of a patient identifier.

  • value: Lexical annotation code for the genetic variant. E.g. NM-004006.2:c.4375C>T p.(Arg1459*)

  • value_datatype:

  • valueIRI: Genetic variant code constructed by appending the HGNC/OMIM/HGVS annotation, e.g. https://www.ncbi.nlm.nih.gov/clinvar/RCV000008537

  • activity: Specific method, in the form of an ontological class that describes the process, e.g. NCIT:Microarray Analysis: http://purl.obolibrary.org/obo/NCIT_C18477

  • unit:

  • input: Anatomical structure from which the sample was extracted. A child of NCIT:Biospecimen or NCIT:Anatomic Structure, System, or Substance is recommended, e.g. NCIT:Blood Sample: http://purl.obolibrary.org/obo/NCIT_C17610

  • target: Zygosity associated with this particular genetic variant. Defined by GENO OBO Foundry ontology: One of the following:

  • protocol_id:

  • frequency_type:

  • frequency_value:

  • agent: Molecular target type, referring to the level of the central dogma of molecular biology studied for this genetic variant. Some example terms from NCIT:

  • startdate: ISO 8601 formatted start date of observation

  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.

  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.

  • comments: human readable comments of any kind related to this procedure.

  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Biobank:

  • model: Biobank
  • pid: individual identifier, in the form of a patient identifier.
  • value: sample identifier
  • value_datatype:
  • valueIRI:
  • activity:
  • unit:
  • input: tissue/sample collected during the sampling process. E.g. Cerebrospinal Fluid http://purl.obolibrary.org/obo/NCIT_C12692
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent: biobank Identifier, e.g. https://directory.bbmri-eric.eu/biobankid
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Symptoms onset:

This data element can be queried (to count anonymized patients) through the Beacon API created for CARE-SM.

  • model: Symptoms_onset
  • pid: individual identifier, in the form of a patient identifier.
  • value: age or date of symptoms occurrence (Do not confuse with startdate/enddate/age for defining when this observation was registered).
  • value_datatype: XSD datatype that defines the value column: xsd:date for a date or xsd:integer for an age (xsd:float is not an option because the CARE-SM Toolkit does not accept fractional ages).
  • valueIRI:
  • activity:
  • unit:
  • input:
  • target: URI-based ontological term that defines the symptom/phenotype measured, if any, e.g. the HPO term for the symptom.
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.
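
Since the Symptoms_onset value is either a date or a whole-number age, the matching `value_datatype` can be derived from the value itself. A minimal sketch, assuming dates are written as `YYYY-MM-DD`; the function name `onset_value_datatype` is illustrative.

```python
import re


def onset_value_datatype(value):
    """Pick xsd:date or xsd:integer for a Symptoms_onset value.

    Raises ValueError for anything else, including fractional ages,
    which the glossary says are not accepted for this element.
    """
    text = value.strip()
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text):
        return "xsd:date"
    if re.fullmatch(r"[0-9]+", text):
        return "xsd:integer"
    raise ValueError(f"not a date or whole-number age: {value!r}")
```

For example, `onset_value_datatype("12")` yields `xsd:integer`, while `"12.5"` is rejected.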

Disability:

  • model: Disability
  • pid: individual identifier, in the form of a patient identifier.
  • value: Score/value of the test output.
  • value_datatype: XSD datatype that defines value column type, e.g. xsd:float for a decimal score.
  • valueIRI:
  • activity: URL that defines the specific clinical questionnaire, e.g. http://purl.obolibrary.org/obo/NCIT_C107391 for the Edmonton symptom disability assessment.
  • unit: score/ value unit of measurement.
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Questionnaire:

  • model: Questionnaire
  • pid: individual identifier, in the form of a patient identifier.
  • value: Score/value of the test output.
  • value_datatype: XSD datatype that defines value column type, e.g. xsd:float for a decimal score.
  • valueIRI:
  • activity: URL that defines the specific clinical question defined in the questionnaire or PROM.
  • unit: score/ value unit of measurement.
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Corporal:

  • model: Corporal
  • pid: individual identifier, in the form of a patient identifier.
  • value: resulting value from this observation
  • value_datatype: XSD datatype that defines value column type, e.g. xsd:float or xsd:integer for numerical values. In case of none, xsd:float will be added by default.
  • valueIRI: child of NCIT:Personal Attribute, for instance:
  • activity:
  • unit: child of UO:unit http://purl.obolibrary.org/obo/UO_0000000
  • input:
  • target:
  • protocol_id: URL reference to a protocol, e.g. https://protocols.io deposit or any identifier that describes the specific properties of this clinical procedure. E.g. https://www.protocols.io/view/hplc-sample-prep-4r3l25ew4l1y/v1
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Laboratory:

  • model: Laboratory
  • pid: individual identifier, in the form of a patient identifier.
  • value: resulting value from this analysis.
  • value_datatype: XSD datatype that defines value column type, e.g. xsd:float or xsd:integer for numerical values. In case of none, xsd:float will be added by default.
  • valueIRI:
  • activity: Specific method, in the form of an ontological class that describes the process, e.g. NCIT:Creatinine Clearance Adjusted for BSA: http://purl.obolibrary.org/obo/NCIT_C147324
  • unit: child of UO:unit http://purl.obolibrary.org/obo/UO_0000000
  • input: material input, represented as a child of Anatomic Structure, System, or Substance http://purl.obolibrary.org/obo/NCIT_C12219 (e.g. obo:Urine)
  • target: compound being measured in the sample. Child of Drug, Food, Chemical or Biomedical Material http://purl.obolibrary.org/obo/NCIT_C1908 (e.g. obo:Creatinine http://purl.obolibrary.org/obo/NCIT_C399)
  • protocol_id: URL reference to a protocol, e.g. https://protocols.io deposit or any identifier that describes the specific properties of this clinical procedure. E.g. https://www.protocols.io/view/hplc-sample-prep-4r3l25ew4l1y/v1
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.
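
For Corporal and Laboratory, an empty `value_datatype` is treated as xsd:float. The sketch below only mirrors that documented behaviour so a data provider can preview the effective datatype of a row; it is not the toolkit's own code, and the names `effective_value_datatype` and `MODELS_WITH_DEFAULT` are illustrative.

```python
DEFAULT_DATATYPE = "xsd:float"
# Models whose glossary entry documents the xsd:float default.
MODELS_WITH_DEFAULT = {"Corporal", "Laboratory"}


def effective_value_datatype(row):
    """Return the datatype that applies to `row` after defaulting."""
    declared = (row.get("value_datatype") or "").strip()
    if declared:
        return declared  # an explicit datatype always wins
    if row.get("model") in MODELS_WITH_DEFAULT:
        return DEFAULT_DATATYPE
    return ""  # no default is documented for other models
```

Declaring xsd:integer explicitly is therefore only needed when a whole-number reading should not be stored as a float.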

Medical imaging:

  • model: Imaging
  • pid: individual identifier, in the form of a patient identifier.
  • value:
  • value_datatype:
  • valueIRI: GUID of the medical imaging file (must be a GUID system compatible with RDF resource identifiers)
  • activity: child of Imaging technique http://purl.obolibrary.org/obo/NCIT_C17369 (e.g. obo:Digital X-Ray http://purl.obolibrary.org/obo/NCIT_C18001)
  • unit:
  • input:
  • target: child of Anatomic Structure, System, or Substance http://purl.obolibrary.org/obo/NCIT_C12219 (e.g. obo:Palmar Region http://purl.obolibrary.org/obo/NCIT_C33252)
  • protocol_id: URL reference to a protocol, e.g. https://protocols.io deposit or any identifier that describes the specific properties of this clinical procedure. E.g. https://www.protocols.io/view/anatomical-variations-and-dimensions-of-the-poplit-3byl4qqk8vo5
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Medication:

  • model: Medication
  • pid: individual identifier, in the form of a patient identifier.
  • value: dose value prescribed to the patient
  • value_datatype: XSD datatype that defines value column type, e.g. xsd:float or xsd:integer for numerical values.
  • valueIRI:
  • activity: child of Route of Administration http://purl.obolibrary.org/obo/NCIT_C38114 (e.g. obo:Sublingual Route of Administration http://purl.obolibrary.org/obo/NCIT_C38300)
  • unit: child of UO:unit http://purl.obolibrary.org/obo/UO_0000000
  • input:
  • target:
  • protocol_id: URL reference to a protocol, e.g. https://protocols.io deposit or any identifier that describes the specific properties of this clinical procedure. E.g. https://www.protocols.io/view/hplc-sample-prep-4r3l25ew4l1y/v1
  • frequency_type: child of obo:Temporal Qualifier http://purl.obolibrary.org/obo/NCIT_C21514 (e.g. obo:Per Day)
  • frequency_value: frequency value prescribed to the patient
  • agent: ATC code URI for the drug component (e.g. https://www.whocc.no/atc_ddd_index/?code=A07EA01)
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Surgery:

  • model: Surgery
  • pid: individual identifier, in the form of a patient identifier.
  • value:
  • value_datatype:
  • valueIRI:
  • activity: child of Intervention or Procedure http://purl.obolibrary.org/obo/NCIT_C25218 (e.g. obo:Tumor Resection http://purl.obolibrary.org/obo/NCIT_C164212)
  • unit:
  • input:
  • target: child of Anatomic Structure, System, or Substance http://purl.obolibrary.org/obo/NCIT_C12219
  • protocol_id: URL reference to a protocol, e.g. https://protocols.io deposit or any identifier that describes the specific properties of this clinical procedure. E.g. https://www.protocols.io/view/hplc-sample-prep-4r3l25ew4l1y/v1
  • frequency_type:
  • frequency_value:
  • agent:
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for value/startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.

Clinical trial:

  • model: Clinical_trial
  • pid: individual identifier, in the form of a patient identifier.
  • value:
  • value_datatype:
  • valueIRI: IRI that defines clinical condition as disease or disorder: Orphanet disease ontology (ORDO) represented with a full URL such as http://www.orpha.net/ORDO/Orphanet_199630
  • activity:
  • unit:
  • input:
  • target:
  • protocol_id:
  • frequency_type:
  • frequency_value:
  • agent: GUID of the medical center where the clinical trial takes place.
  • startdate: ISO 8601 formatted start date of observation
  • enddate: ISO 8601 formatted enddate of observation in case it is different from startdate.
  • age: patient age when this observation was taken, this age information can be both an addition or an alternative for startdate/enddate information. Its units are fractional years, so it accepts any decimal figure for age. E.g. 33.75 years.
  • comments: human readable comments of any kind related to this procedure.
  • event_id: contextual identifier (formatted as integer) used for relating several of these data elements under the same visit occurrence event.
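
Every data element shares the `event_id` column, which ties rows to one visit. The sketch below shows one way to inspect that relationship before submission. It assumes a visit is identified by the pair (`pid`, `event_id`), which is an interpretation of the glossary rather than a documented rule; the function name `group_by_visit` is illustrative.

```python
from collections import defaultdict


def group_by_visit(rows):
    """Map (pid, event_id) to the list of models recorded in that visit.

    Rows without an event_id are skipped. A non-integer event_id raises
    ValueError, because the glossary requires an integer.
    """
    visits = defaultdict(list)
    for row in rows:
        event_id = (row.get("event_id") or "").strip()
        if not event_id:
            continue
        visits[(row["pid"], int(event_id))].append(row["model"])
    return dict(visits)
```

Rows read with `csv.DictReader` from preCARE.csv can be passed in directly, since each row is a mapping from column name to string.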