clarify piscem infer details
mikelove committed Oct 9, 2023
1 parent 18ba9da commit 1ace0f0
Showing 5 changed files with 31 additions and 27 deletions.
6 changes: 3 additions & 3 deletions DESCRIPTION
@@ -1,9 +1,9 @@
 Package: tximeta
-Version: 1.19.9
+Version: 1.19.10
 Title: Transcript Quantification Import with Automatic Metadata
 Description: Transcript quantification import from Salmon and
-alevin with automatic attachment of transcript ranges and
-release information, and other associated metadata. De novo
+other quantifiers with automatic attachment of transcript ranges
+and release information, and other associated metadata. De novo
 transcriptomes can be linked to the appropriate sources with
 linkedTxomes and shared for computational reproducibility.
 Authors@R: c(
6 changes: 2 additions & 4 deletions R/tximeta.R
@@ -55,13 +55,11 @@ NULL

 #' Import transcript quantification with metadata
 #'
-#' \code{tximeta} leverages the hashed checksum of the Salmon index,
+#' \code{tximeta} leverages the hashed checksum of the Salmon or piscem index,
 #' in addition to a number of core Bioconductor packages (GenomicFeatures,
 #' ensembldb, AnnotationHub, GenomeInfoDb, BiocFileCache) to automatically
 #' populate metadata for the user, without additional effort from the user.
-#' Note that \code{tximeta} requires that the entire output directory of Salmon
-#' or alevin is present and unmodified in order to identify the provenance of the
-#' reference transcripts.
+#' For other quantifiers see the \code{customMetaInfo} argument below.
 #'
 #' Most of the code in \code{tximeta} works to add metadata and transcript ranges
 #' when the quantification was performed with Salmon. However,
13 changes: 10 additions & 3 deletions README.md
@@ -9,7 +9,12 @@ metadata for transcript quantification data in Bioconductor. The
 `tximeta()` function imports quantification data from *Salmon* or
 other quantifiers, and returns a
 [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/vignettes/SummarizedExperiment/inst/doc/SummarizedExperiment.html#anatomy-of-a-summarizedexperiment)
-object.
+object. *tximeta* works natively with
+[Salmon](https://salmon.readthedocs.io/en/latest/),
+[alevin](https://salmon.readthedocs.io/en/latest/alevin.html),
+or [piscem-infer](https://piscem-infer.readthedocs.io/en/latest/),
+but can easily be configured to work with any transcript
+quantification tool.

 If `tximeta()` recognizes the reference transcripts used
 for quantification, it will automatically download relevant
@@ -31,12 +36,14 @@ quantification data).

 # How it works

-The key idea behind *tximeta* is that *Salmon* propagates a hash value
+The key idea behind *tximeta* is that *Salmon*, *alevin*, and
+*piscem-infer* propagate a hash value
 summarizing the reference transcripts into each quantification
 directory it outputs. *tximeta* can be used with other tools as long
 as the
 [hash of the transcripts](https://github.com/COMBINE-lab/FastaDigest)
-is also included in the output directories.
+is also included in the output directories. See `customMetaInfo`
+argument of `tximeta()` for more details.

 ![](man/figures/diagram.png)

6 changes: 2 additions & 4 deletions man/tximeta.Rd

Some generated files are not rendered by default.

27 changes: 14 additions & 13 deletions vignettes/tximeta.Rmd
@@ -10,9 +10,9 @@ output:
 abstract: >
 Tximeta performs numerous annotation and metadata gathering tasks on
 behalf of users during the import of transcript quantifications from
-*Salmon* or *alevin* into R/Bioconductor. Metadata and transcript
-ranges are added automatically, facilitating genomic analyses and
-assisting in computational reproducibility.
+*Salmon*, *alevin*, or *piscem-infer* into R/Bioconductor. Metadata
+and transcript ranges are added automatically, facilitating genomic
+analyses and assisting in computational reproducibility.
 bibliography: library.bib
 vignette: |
 %\VignetteIndexEntry{Transcript quantification import with automatic metadata}
@@ -25,7 +25,8 @@ vignette: |
 The `tximeta` package [@Love2020] extends the `tximport` package
 [@Soneson2015] for import of transcript-level quantification data into
 R/Bioconductor. It automatically adds annotation metadata when the
-RNA-seq data has been quantified with *Salmon* [@Patro2017] or for
+RNA-seq data has been quantified with *Salmon* [@Patro2017] or
+[piscem-infer](https://piscem-infer.readthedocs.io/en/latest/), or the
 scRNA-seq data quantified with *alevin* [@Srivastava2019]. To our
 knowledge, `tximeta` is the only package for RNA-seq data import that
 can automatically identify and attach transcriptome metadata based on
@@ -34,15 +35,15 @@ For more details on these packages -- including the motivation for
 `tximeta` and description of similar work -- consult the
 **References** below.

-**Note:** `tximeta` requires that the **entire output directory** of
-*Salmon* / *alevin* is present and unmodified in order to identify the
-provenance of the reference transcripts. In general, it's a good idea
-to not modify or re-arrange the output directory of bioinformatic
-software as other downstream software rely on and assume a consistent
-directory structure. For sharing multiple samples, one can use, for
-example, `tar -czf` to bundle up a set of Salmon output directories,
-or to bundle one alevin output directory. For tips on using `tximeta`
-with other quantifiers see the
+**Note:** `tximeta` requires that the **entire output** of
+*Salmon* / *piscem-infer* / *alevin* is present and unmodified in
+order to identify the provenance of the reference transcripts. In
+general, it's a good idea to not modify or re-arrange the output
+directory of bioinformatic software as other downstream software rely
+on and assume a consistent directory structure. For sharing multiple
+samples, one can use, for example, `tar -czf` to bundle up a set of
+Salmon output directories, or to bundle one alevin output
+directory. For tips on using `tximeta` with other quantifiers see the
 [other quantifiers](#other_quantifiers) section below.

```{r echo=FALSE}