-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Site upstream of transcription start site annotated as intronic #203
Comments
Heya @pkothiyal and devs, I happened to peek at this too, and just because of the older version of R where I tested, version 1.16.0 of genomation was installed, and this didn't happen in that version (in case it helps track down whatever is going on): library(genomation)
packageVersion("genomation") # 1.16.0
library(GenomicRanges)
packageVersion("GenomicRanges") # 1.36.1
bed <- c("chr12", 3906815, 3908547, "ENSMUST00000172913", 0, "+", 3906815, 3908265, 0, 3, "68,138,295,", "0,790,1437,")
bed_df <- t(data.frame(bed))
write.table(bed_df, "mm10.bed", sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
df <- data.frame (chr = c("chr12"),
start = c(3905814),
end = c(3905814)
)
gr <- makeGRangesFromDataFrame(df, keep.extra.columns = FALSE,
start.field = "start",
end.field = "end")
gene.obj <- readTranscriptFeatures("mm10.bed", up.flank = 1000, down.flank = 1000, remove.unusual = TRUE, unique.prom = TRUE)
annRes <- annotateWithGeneParts(gr, gene.obj)
cbind(getAssociationWithTSS(annRes), as.data.frame(getMembers(annRes)))
# target.row dist.to.feature feature.name feature.strand prom exon intron
# 1 -1002 ENSMUST00000172913 + 0 0 0
sessionInfo()
# R version 3.6.3 (2020-02-29)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS 10.16
#
# Matrix products: default
# LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# attached base packages:
# [1] parallel stats4 grid stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] GenomicRanges_1.36.1 GenomeInfoDb_1.20.0 IRanges_2.18.3 S4Vectors_0.22.1 BiocGenerics_0.30.0 genomation_1.16.0
#
# loaded via a namespace (and not attached):
# [1] Rcpp_1.0.8 lattice_0.20-45 Rsamtools_2.0.3 Biostrings_2.52.0
# [5] assertthat_0.2.1 utf8_1.2.2 gridBase_0.4-7 R6_2.5.1
# [9] plyr_1.8.6 ggplot2_3.3.5 pillar_1.6.5 zlibbioc_1.30.0
# [13] rlang_1.0.0 rstudioapi_0.13 data.table_1.14.2 Matrix_1.2-18
# [17] BiocParallel_1.18.1 readr_2.1.1 stringr_1.4.0 RCurl_1.98-1.5
# [21] bit_4.0.4 munsell_0.5.0 DelayedArray_0.10.0 compiler_3.6.3
# [25] rtracklayer_1.44.4 pkgconfig_2.0.3 tidyselect_1.1.1 SummarizedExperiment_1.14.1
# [29] tibble_3.1.6 GenomeInfoDbData_1.2.1 matrixStats_0.61.0 XML_3.99-0.3
# [33] fansi_1.0.2 withr_2.4.3 crayon_1.4.2 dplyr_1.0.7
# [37] tzdb_0.2.0 GenomicAlignments_1.20.1 bitops_1.0-7 gtable_0.3.0
# [41] lifecycle_1.0.1 DBI_1.1.2 magrittr_2.0.2 scales_1.1.1
# [45] KernSmooth_2.23-20 cli_3.1.1 stringi_1.7.6 impute_1.58.0
# [49] vroom_1.5.7 XVector_0.24.0 reshape2_1.4.4 ellipsis_0.3.2
# [53] generics_0.1.1 vctrs_0.3.8 tools_3.6.3 bit64_4.0.5
# [57] BSgenome_1.52.0 Biobase_2.44.0 glue_1.6.1 seqPattern_1.16.0
# [61] purrr_0.3.4 hms_1.1.1 plotrix_3.8-2 colorspace_2.0-2
# [65] BiocManager_1.30.16 |
Thank you, something must have been changed in GenomicRanges that causes
this.
On Mon, Jan 31, 2022 at 5:28 PM Mike Lee ***@***.***> wrote:
Heya @pkothiyal <https://github.com/pkothiyal> and devs,
I happened to peek at this too, and just because of the older version of R
where I tested, version 1.16.0 of genomation was installed, and this didn't
happen in that version (in case it helps track down whatever is going on):
library(genomation)
packageVersion("genomation") # 1.16.0
library(GenomicRanges)
packageVersion("GenomicRanges") # 1.36.1
bed <- c("chr12", 3906815, 3908547, "ENSMUST00000172913", 0, "+", 3906815, 3908265, 0, 3, "68,138,295,", "0,790,1437,")bed_df <- t(data.frame(bed))
write.table(bed_df, "mm10.bed", sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
df <- data.frame (chr = c("chr12"),
start = c(3905814),
end = c(3905814)
)
gr <- makeGRangesFromDataFrame(df, keep.extra.columns = FALSE,
start.field = "start",
end.field = "end")
gene.obj <- readTranscriptFeatures("mm10.bed", up.flank = 1000, down.flank = 1000, remove.unusual = TRUE, unique.prom = TRUE)
annRes <- annotateWithGeneParts(gr, gene.obj)
cbind(getAssociationWithTSS(annRes), as.data.frame(getMembers(annRes)))
# target.row dist.to.feature feature.name feature.strand prom exon intron# 1 -1002 ENSMUST00000172913 + 0 0 0
sessionInfo()# R version 3.6.3 (2020-02-29)# Platform: x86_64-apple-darwin15.6.0 (64-bit)# Running under: macOS 10.16# # Matrix products: default# LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib# # locale:# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8# # attached base packages:# [1] parallel stats4 grid stats graphics grDevices utils datasets methods base # # other attached packages:# [1] GenomicRanges_1.36.1 GenomeInfoDb_1.20.0 IRanges_2.18.3 S4Vectors_0.22.1 BiocGenerics_0.30.0 genomation_1.16.0 # # loaded via a namespace (and not attached):# [1] Rcpp_1.0.8 lattice_0.20-45 Rsamtools_2.0.3 Biostrings_2.52.0 # [5] assertthat_0.2.1 utf8_1.2.2 gridBase_0.4-7 R6_2.5.1 # [9] plyr_1.8.6 ggplot2_3.3.5 pillar_1.6.5 zlibbioc_1.30.0 # [13] rlang_1.0.0 rstudioapi_0.13 data.table_1.14.2 Matrix_1.2-18 # [17] BiocParallel_1.18.1 readr_2.1.1 stringr_1.4.0 RCurl_1.98-1.5 # [21] bit_4.0.4 munsell_0.5.0 DelayedArray_0.10.0 compiler_3.6.3 # [25] rtracklayer_1.44.4 pkgconfig_2.0.3 tidyselect_1.1.1 SummarizedExperiment_1.14.1# [29] tibble_3.1.6 GenomeInfoDbData_1.2.1 matrixStats_0.61.0 XML_3.99-0.3 # [33] fansi_1.0.2 withr_2.4.3 crayon_1.4.2 dplyr_1.0.7 # [37] tzdb_0.2.0 GenomicAlignments_1.20.1 bitops_1.0-7 gtable_0.3.0 # [41] lifecycle_1.0.1 DBI_1.1.2 magrittr_2.0.2 scales_1.1.1 # [45] KernSmooth_2.23-20 cli_3.1.1 stringi_1.7.6 impute_1.58.0 # [49] vroom_1.5.7 XVector_0.24.0 reshape2_1.4.4 ellipsis_0.3.2 # [53] generics_0.1.1 vctrs_0.3.8 tools_3.6.3 bit64_4.0.5 # [57] BSgenome_1.52.0 Biobase_2.44.0 glue_1.6.1 seqPattern_1.16.0 # [61] purrr_0.3.4 hms_1.1.1 plotrix_3.8-2 colorspace_2.0-2 # [65] BiocManager_1.30.16
—
Reply to this email directly, view it on GitHub
<#203 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAE32EPZRTVRET5SFUDJ4YLUY22BFANCNFSM5M3QWVXQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
--
Sent from mobile, excuse the brevity
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi, I tried annotating a few sites with annotateWithGeneParts and I'm getting an intron annotation even though the site is upstream of transcription start site. Here's an example:
BED12 entry for the associated transcript:
Site for annotation (chr12:3905814):
Annotate:
As seen above, the site is annotated in the intron of the transcript even though it is 1002 bases upstream of the transcription start site. The promoter flank is set at 1,000 so I understand it not being mapped to promoter but should it be intergenic instead of intronic in this case?
Thanks,
Prachi
The text was updated successfully, but these errors were encountered: