Typos in correspondence with GIFT #28
Comments
@Rekyt Thank you for documenting these! I'll make the changes on a branch tomorrow. It is sometimes intentional to have multiple close matches within a single cell in the csv files. The code that builds the formal ontology will split those into multiple lines, each assigned as an example of type "close_match". But I'll double check the one you mentioned to ensure there isn't something else wrong.
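As a rough illustration of the splitting described above, a minimal base R sketch might look like the following. The `;` separator, the column names, and the sample rows are all assumptions made up for illustration; none of them are taken from the actual APD build code.

```r
# Hypothetical input: one row per trait, with several close matches
# packed into a single cell. The ";" separator is an assumption.
apd <- data.frame(
  trait = c("trait_A", "trait_B"),
  GIFT_close = c("Name_one [GIFT:1.1.1]; Name_two [GIFT:1.1.2]",
                 "Name_three [GIFT:2.1.1]"),
  stringsAsFactors = FALSE
)

# Split each cell on the separator and trim surrounding whitespace.
pieces <- lapply(strsplit(apd$GIFT_close, ";", fixed = TRUE), trimws)

# Build one row per individual match, each labelled "close_match".
long <- data.frame(
  trait = rep(apd$trait, lengths(pieces)),
  match_type = "close_match",
  match = unlist(pieces),
  stringsAsFactors = FALSE
)
print(long)
```

With this shape, a cell holding two matches yields two rows for the same trait, so a multi-match cell and a multi-line entry end up equivalent after the build step.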
First of all, thank you for making all the fixes! I'm checking the correspondence with GIFT traits with the script I shared at the end of the initial message in the issue.

For GIFT traits there is an additional complexity: GIFT offers three levels of granularity of traits. Level 1 traits are very broad categories, level 2 are the type of traits, and level 3 are detailed to the exact meaning of the trait. Level 2 and Level 3 traits have their respective names in the columns Trait1 and Trait2. This is what I obtain for example for the first few traits referenced in GIFT:

head(GIFT::GIFT_traits_meta())[,1:6]
#> You are asking for the latest stable version of GIFT which is 3.2.
#>   Lvl1   Category Lvl2       Trait1   Lvl3           Trait2
#> 1    1 Morphology  1.1    Woodiness  1.1.1      Woodiness_1
#> 2    1 Morphology 1.10 Shoot length 1.10.1 Shoot_length_min
#> 3    1 Morphology 1.10 Shoot length 1.10.2 Shoot_length_max
#> 4    1 Morphology  1.2  Growth form  1.2.1    Growth_form_1
#> 5    1 Morphology  1.2  Growth form  1.2.2    Growth_form_2
#> 6    1 Morphology  1.3     Epiphyte  1.3.1       Epiphyte_1

Created on 2024-08-30 with reprex v2.1.0

The Level 2 traits have a "Sentence case" naming convention while the Level 3 traits have a Capitalized_snake_case one.

For level 2 traits these are the ones I obtained that are still mismatching using the updated APD file:
For the level 3 traits this is what I'm getting:
I may have missed something regarding GIFT's trait naming convention, but AFAIK, the traits in the database are indexed by the
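To make the two naming conventions concrete, here is a small base R sketch that tests names against each pattern. The regular expressions are assumptions inferred only from the six rows shown in this comment; real GIFT names containing acronyms or other capitalisation would not fit them, so treat this as a rough filter and not as GIFT's actual rule.

```r
# "Sentence case": capital first letter, then lower-case words
# separated by single spaces (assumed pattern).
is_sentence_case <- function(x) grepl("^[A-Z][a-z]*( [a-z]+)*$", x)

# "Capitalized_snake_case": capital first letter, then lower-case or
# numeric tokens joined by underscores (assumed pattern).
is_cap_snake_case <- function(x) grepl("^[A-Z][a-z0-9]*(_[a-z0-9]+)*$", x)

# Names taken from the output shown in this comment.
lvl2_names <- c("Woodiness", "Shoot length", "Growth form", "Epiphyte")
lvl3_names <- c("Woodiness_1", "Shoot_length_min", "Growth_form_1")

print(is_sentence_case(lvl2_names))
print(is_cap_snake_case(lvl3_names))

# A Level 2 name supplied where a Level 3 name is expected is flagged.
print(is_cap_snake_case("Shoot length"))
```

Note that a single-word name such as "Woodiness" satisfies both patterns, and a misspelling such as "Fuiting time" still passes the Sentence case check, so a format test like this can only complement a lookup against the GIFT metadata.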
Hi @ehwenk & @dfalster,
while building the trait correspondence network I noticed some issues with the correspondence with GIFT traits (I still have to do the same for BIEN and TRY).
I checked that both the GIFT trait codes and the GIFT trait names provided by AusTraits exist in GIFT, and that each provided GIFT trait name matches its provided GIFT trait code.
The script I used is below. But I'll first detail my findings.
- For trait_0030020 & trait_0030015, the GIFT_close contains multiple traits as a single line. Is it on purpose? Because other matched traits span multiple lines.
- For trait_0030215, there is a typo in the GIFT_exact name as it is referenced "Fuiting time", missing an r.
- For trait_0030020, there is a typo in the GIFT code 'leaf_thorns_1 [GIFT:4:14.1]' which should be 'leaf_thorns_1 [GIFT:4.14.1]'.
- For trait_0030060, GIFT_close matches with GIFT 1.4.1 (Climber_1) while it should match with GIFT 3.4.1 (Reproduction_sexual_1). This was the trait that triggered my systematic search for potential mismatches, as I obtained in the correspondence network a much larger connected component than expected, with traits that shouldn't be matching.

Maybe you could use an adaptation of the script below to perform semi-automated quality checks when updating the APD?
For the sake of completeness, I'll try performing the same checks for TRY and BIEN.
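The three checks described above (code exists, name exists, name matches code) can be sketched in base R as follows. This is not the matching script from this issue, only a minimal illustration: the lookup table reuses a few Lvl3/Trait2 pairs from the GIFT output quoted in this thread, while the `entries` vector, the `pattern`, and the column names of `parsed` are hypothetical. In practice the lookup would presumably come from `GIFT::GIFT_traits_meta()`.

```r
# Small lookup built from Lvl3/Trait2 pairs quoted in this thread.
gift_meta <- data.frame(
  Lvl3 = c("1.1.1", "1.10.1", "1.10.2", "1.2.1"),
  Trait2 = c("Woodiness_1", "Shoot_length_min",
             "Shoot_length_max", "Growth_form_1"),
  stringsAsFactors = FALSE
)

# Hypothetical entries in the "name [GIFT:code]" form: one correct,
# one with a colon typo in the code, one pairing a name with the wrong code.
entries <- c("Woodiness_1 [GIFT:1.1.1]",
             "Shoot_length_min [GIFT:1:10.1]",
             "Growth_form_1 [GIFT:1.10.2]")

# Split each entry into its name and code parts.
pattern <- "^(.+) \\[GIFT:([^]]+)\\]$"
parsed <- data.frame(
  entry = entries,
  name = sub(pattern, "\\1", entries),
  code = sub(pattern, "\\2", entries),
  stringsAsFactors = FALSE
)

# Check 1 and 2: does the code, and does the name, exist in GIFT?
parsed$code_in_gift <- parsed$code %in% gift_meta$Lvl3
parsed$name_in_gift <- parsed$name %in% gift_meta$Trait2

# Check 3: does the name registered under that code equal the given name?
expected_name <- gift_meta$Trait2[match(parsed$code, gift_meta$Lvl3)]
parsed$pair_matches <- !is.na(expected_name) & expected_name == parsed$name

# Rows needing manual review.
print(parsed[!parsed$pair_matches, ])
```

The third check is the one that would catch a case like a valid name attached to a valid but unrelated code, which passes both existence checks on its own.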
Matching script