need to clarify some terms #13

Open
saubin78 opened this issue Jan 16, 2023 · 3 comments

@saubin78

Hi. I feel there are some sources of confusion in the definitions of the semapv elements:

  • matching: in semapv, this sounds like the activity/process. Should it be performed by machines only (as it looks like today), or also by humans?
  • mapping: in semapv, this sounds like the result
  • curation:
    - in semapv, ManualMappingCuration is defined as "An matching process that is performed by a human agent and is based on human judgement and domain knowledge." With this definition, the element should rather be named "ManualMatching" (as it is a process)
    - otherwise, according to Merriam-Webster, curation is "the act or process of selecting and organizing (something, such as articles or images) for distribution or publication", which means that mappings already exist, i.e. have been computed. This would then be close to the semapv definition of review
  • review: "A process that is concerned with determining if a mapping “candidate” (otherwise determined) is reasonable/correct." This should be applicable to mappings created by either a human or a machine

Could you please clarify or harmonize the lexicon used?

@matentzn
Contributor

> matching: in semapv, sounds like the activity/process. Should it be by machines only (as it looks like today)? or also by humans?
> mapping: in semapv, sounds like the result

Yes, both are correct; see the details of the discussion here: mapping-commons/sssom#169

Note that this is not an attempt to be normative. There is no denying that many people use "matching" only for machines, while others use it for both humans and machines. Many use "mapping" for a process. This is just to define how the terms are used in the context of SEMAPV (and, by extension, SSSOM).

> curation:
>
>   • in semapv, ManualMappingCuration is defined as "An matching process that is performed by a human agent and is based on human judgement and domain knowledge." With this definition, the element should rather be named "ManualMatching" (as it is a process)
>   • otherwise, according to Merriam-Webster, curation is "the act or process of selecting and organizing (something, such as articles or images) for distribution or publication", which means that mappings already exist, i.e. have been computed. This would then be close to the semapv definition of review

This is a great point. I would argue that the "selecting" part can be seen as "selecting an appropriate identifier"; we have been using "mapping curation" primarily for the task of "finding an appropriate term to map to". Unfortunately, we can't really change the property at this stage, as it is too widely used and the churn of having everyone update their SSSOM files would be enormous. Had we heard your comment earlier in the process, we would have considered it. Thanks in any case for making that point!

> review: "A process that is concerned with determining if a mapping “candidate” (otherwise determined) is reasonable/correct." This should be applicable to mappings created by either a human or a machine
>
> Could you please clarify or harmonize the lexicon used?

I'm not sure I understand. Could you perhaps make concrete suggestions for how we could clarify the definitions to disambiguate them better?

@saubin78
Author

Taking a look at the specifications' definitions:

  • manual mapping curation: "An matching process that is performed by a human agent and is based on human judgement and domain knowledge." --> this one is clear enough
  • mapping review: "A process that is concerned with determining if a mapping “candidate” (otherwise determined) is reasonable/correct." --> this one is clear (now)
  • matching process: "An process that results in a mapping between a subject and an object entity." --> either this one should explicitly mention that the matching is performed by a machine, OR the definition of "matching process" stays as it is and "manual mapping curation" becomes a subclass of "matching process".

@matentzn
Contributor

Yes, I agree, @saubin78. Thanks for this analysis.

Which of the two do you prefer? I personally feel more inclined towards having "manual mapping curation" as a subclass of "matching process". This is more practical IMO. As automated methods improve, the distinction between the two will keep shrinking. In the end, the human brain is just a big "pattern matching system". What do you think? cc @graybeal
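
For illustration, the option of making "manual mapping curation" a subclass of "matching process" could be sketched roughly as below. This is only a minimal Python sketch, not the actual semapv model: the class names are stand-ins for the terms discussed in this thread, and `AutomatedMatching` is a hypothetical sibling added purely for contrast.

```python
class MatchingProcess:
    """A process that results in a mapping between a subject and an object entity.

    Deliberately silent about whether the agent is a human or a machine.
    """


class ManualMappingCuration(MatchingProcess):
    """A matching process performed by a human agent, based on human
    judgement and domain knowledge."""


class AutomatedMatching(MatchingProcess):
    """Hypothetical sibling (not a term from this thread): a matching
    process performed by a machine."""


def is_matching_process(term: type) -> bool:
    """Return True if the given term is a kind of matching process."""
    return issubclass(term, MatchingProcess)


if __name__ == "__main__":
    for term in (ManualMappingCuration, AutomatedMatching):
        print(term.__name__, is_matching_process(term))
```

With this shape, the definition of "matching process" stays agent-neutral, and the human/machine distinction lives only in the subclasses.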

matentzn self-assigned this on Feb 23, 2023