Add `get_loftee_end_trunc_filter_expr` and `update_loftee_end_trunc_filter` to patch the LOFTEE END_TRUNC filter
#740
base: main
I have two minor comments.
:param csq_expr: StructExpression containing the LOFTEE annotation 'lof_info', with
'GERP_DIST' and '50_BP_RULE' info.
:param gerp_dist_cutoff: GERP distance cutoff for end truncation. Default is 0.0.
Thanks for making this notebook! So gerp_dist_cutoff is changing from -58 to 0, and some "HC" calls became "LC". Is there a justification for this cutoff that we could document?
I summarized our discussion with Konrad last year to answer the questions in the emails and on the forum.
I only checked variants in the BRCA2 gene; the last 2 LoF variants now have the "END_TRUNC" filter and became "LC".
Yup, I'm hoping Julia can help me add to the documentation to justify the cutoff, and to also confirm that the patch is being done correctly.
:return: Consequence Struct with updated LOFTEE annotations.
"""
end_trunc_expr = get_loftee_end_trunc_filter_expr(csq_expr, gerp_dist_cutoff)
filter_expr = hl.or_else(
Do you use `hl.set` here and `hl.delimit` below because there might be multiple lof_filter per transcript? I saw in your test, after explode, there seemed to be only one.
The `lof_filter` is a list of filters represented as a string delimited by `,`, for example: "5UTR_SPLICE,ANC_ALLELE,GC_TO_GT_DONOR".
If `end_trunc_expr` is True, I add `END_TRUNC` to the filters, but it might already be in the filters and I only want it represented once.
Tested in this notebook: test_loftee_end_trunc.html.zip
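To illustrate the deduplication described above, here is a minimal plain-Python sketch of the set-then-join idea. It is not the Hail code from this PR: `add_end_trunc` is a hypothetical helper name, it only covers the "add" case from the comment (it never removes an existing `END_TRUNC`), and it sorts the filters for a deterministic result, which the actual expression may not do.

```python
from typing import Optional


def add_end_trunc(lof_filter: Optional[str], end_trunc: bool) -> Optional[str]:
    """Return the comma-delimited filter string with END_TRUNC present at most once.

    Hypothetical helper for illustration only; not part of the PR.
    """
    # Treat a missing or empty filter string as an empty set of filters.
    filters = set(lof_filter.split(",")) if lof_filter else set()
    if end_trunc:
        # Adding to a set is a no-op when END_TRUNC is already present.
        filters.add("END_TRUNC")
    # Sets are unordered, so sort before joining to get a stable string.
    return ",".join(sorted(filters)) if filters else None
```

Going through a set is what keeps `END_TRUNC` from appearing twice when the original string already contains it; joining afterwards restores the comma-delimited representation.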