Writing "Good" Labeling functions #8
We'll know more about this issue once we start writing labeling functions. Some background: for each relationship type, we'll start with a knowledge base (gold standard) of known relationships, which will generally be a relationship type from Hetionet. So the first two labeling functions will be:
Then we will have to write additional labeling functions to refine the classifier. We're hoping to parallelize this task to some degree, i.e. everyone involved can submit additional labeling functions, so we'll have to develop a framework that allows anyone to submit them. It's also our impression (still to be confirmed) that Snorkel will be able to evaluate the quality of each labeling function, so it's not the end of the world if some of our labeling functions are imperfect.
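As a rough illustration of the knowledge-base idea described above, here is a minimal sketch in plain Python. It is not the snippet originally posted in this thread and it does not use Snorkel's API. The `Candidate` type, the `KNOWN_PAIRS` set, the function names, and the placeholder disease and gene names are all hypothetical, and treating 0 as "abstain" is an assumption.

```python
from typing import Callable, List, NamedTuple, Set, Tuple


class Candidate(NamedTuple):
    """Hypothetical container for one Disease-Gene candidate."""
    disease: str
    gene: str
    sentence: str


# Hypothetical stand-in for the gold standard; the entries are placeholders,
# not real Hetionet relationships.
KNOWN_PAIRS: Set[Tuple[str, str]] = {("disease_x", "GENE_A")}


def _in_kb(c: Candidate) -> bool:
    # Normalize case so lookups do not depend on how the mention was written.
    return (c.disease.lower(), c.gene.upper()) in KNOWN_PAIRS


def lf_in_kb(c: Candidate) -> int:
    """Vote +1 if the pair is in the knowledge base, otherwise abstain (0)."""
    return 1 if _in_kb(c) else 0


def lf_not_in_kb(c: Candidate) -> int:
    """Vote -1 if the pair is absent from the knowledge base, otherwise abstain (0)."""
    return 0 if _in_kb(c) else -1


def apply_lfs(candidates: List[Candidate],
              lfs: List[Callable[[Candidate], int]]) -> List[List[int]]:
    """Build a label matrix: one row per candidate, one column per labeling function."""
    return [[lf(c) for lf in lfs] for c in candidates]


candidates = [
    Candidate("Disease_X", "gene_a", "placeholder sentence 1"),
    Candidate("disease_y", "GENE_B", "placeholder sentence 2"),
]
label_matrix = apply_lfs(candidates, [lf_in_kb, lf_not_in_kb])
```

Keeping every labeling function to the same signature (a candidate in, an integer vote out) is one way to let several people submit functions independently and apply them all in a single pass.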
Below are examples of the desired and undesired Disease-Gene candidate relationships we will be working with.
Good Example:
In the quote above, the Disease-Gene candidate relation is in bold. This is a good example because the relationship is in our gold standard list, so it would receive a +1.
Bad Example:
This is a bad example because the disease in this context has nothing to do with epilepsy, so this relationship would receive a -1.
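Given a handful of candidates hand-labeled +1 or -1 as in the examples above, one rough way to sanity-check a submitted labeling function is to compare its votes against those labels. The sketch below is plain Python and is not Snorkel's own evaluation. The function name `summarize_lf`, the convention that 0 means "abstain", and the toy vote and gold lists are assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence


def summarize_lf(votes: Sequence[int],
                 gold: Sequence[int]) -> Dict[str, Optional[float]]:
    """Report coverage and accuracy of one labeling function.

    votes: one entry per candidate, +1, -1, or 0 (abstain).
    gold:  one hand-assigned label per candidate, +1 or -1.
    """
    if len(votes) != len(gold):
        raise ValueError("votes and gold must have the same length")

    # Keep only the candidates on which the function actually voted.
    labeled = [(v, g) for v, g in zip(votes, gold) if v != 0]

    coverage = len(labeled) / len(votes) if votes else 0.0
    # Accuracy is undefined when the function abstained everywhere.
    accuracy = (
        sum(1 for v, g in labeled if v == g) / len(labeled) if labeled else None
    )
    return {"coverage": coverage, "accuracy": accuracy}


# Toy data: four candidates, the function abstains on the second one.
votes = [1, 0, -1, 1]
gold = [1, -1, -1, -1]
summary = summarize_lf(votes, gold)
```

A function with low coverage but high accuracy can still be worth keeping, since it only needs to be right on the candidates it chooses to label.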
Our aim is to generate useful labeling functions from a given set of candidate sentences provided below: