revamp documentation and expose docstrings
working towards #44
tlnagy committed Feb 8, 2018
1 parent d931677 commit b66357e
Showing 14 changed files with 349 additions and 34 deletions.
1 change: 1 addition & 0 deletions REQUIRE
@@ -9,3 +9,4 @@ Gadfly
ArgParse
Compat v0.17.0
YAML
DocStringExtensions
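
The added `DocStringExtensions` dependency provides the `$(FIELDS)`, `$(SIGNATURES)`, and `$(TYPEDEF)` abbreviations that appear in the docstrings later in this commit. A minimal sketch of how such abbreviations expand, assuming Julia 1.x syntax (`struct` rather than this commit's `type`) and a hypothetical `ExampleScreen` type that is not part of the package:

```julia
using DocStringExtensions

"""
$(TYPEDEF)

Hypothetical screen description, used only for illustration.

$(FIELDS)
"""
struct ExampleScreen
    "Number of genes targeted by the screen"
    num_genes::Int

    "Number of guides per gene"
    coverage::Int
end

# Render the expanded docstring; the per-field docstrings should be listed in it.
doc_text = string(@doc ExampleScreen)
println(doc_text)
```

Keeping the field descriptions next to the fields and letting `$(FIELDS)` collect them avoids maintaining a second, hand-written field list in the type's docstring.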
26 changes: 24 additions & 2 deletions docs/make.jl
@@ -1,12 +1,34 @@
using Documenter

module Simulation
packages = [:StatsBase,
:Distributions,
:DataFrames,
:HypothesisTests,
:IterTools,
:DocStringExtensions]

for package in packages
eval(:(using $package))
end

filenames = ["common.jl", "utils.jl", "library.jl", "transfection.jl",
"selection.jl", "sequencing.jl", "processing.jl",
"designs.jl"]
for filename in filenames
include(joinpath(Base.source_dir(), "..", "src", "simulation", filename))
end
end
using Simulation

makedocs(
modules = [Simulation],
clean = false,
-format = Documenter.Formats.HTML,
format = :html,
sitename = "Crispulator.jl",
pages = Any[
"Home" => "index.md",
-"FACS vs Growth" => "facs_growth.md"
"Simulation Internals" => "internals.md"
]
)

1 change: 0 additions & 1 deletion docs/src/facs_growth.md

This file was deleted.

8 changes: 6 additions & 2 deletions docs/src/index.md
@@ -15,11 +15,11 @@ navigate into the root directory of the project and run `julia`. Run the
following command:

```
-julia -e 'Pkg.clone(pwd()); Pkg.build("Crispulator")'
julia -e 'Pkg.update(); Pkg.add("Crispulator")'
```

this copies `Crispulator` over to the Julia package directory and installs
-all of its dependencies.
all of its dependencies.

## Quickstart

@@ -28,3 +28,7 @@ From the root directory of the project run
```julia
julia src/run.jl config example_config.yml .
```

## Advanced

See the [Simulation Internals](@ref) page for in-depth documentation needed for more advanced usage.
56 changes: 56 additions & 0 deletions docs/src/internals.md
@@ -0,0 +1,56 @@
# Simulation Internals

More in-depth documentation on specific types and functions.

## Contents

```@contents
Pages = ["internals.md"]
```

## Index

```@index
Pages = ["internals.md"]
```

## Simulation Types

```@docs
Simulation.FacsScreen
Simulation.GrowthScreen
```

## Key functions

```@docs
Simulation.construct_library
Simulation.transfect
Simulation.select
Simulation.sequencing
Simulation.counts_to_freqs
Simulation.differences_between_bins
```

## Miscellaneous Types

```@docs
Simulation.Library
Simulation.Barcode
Simulation.KDPhenotypeRelationship
Simulation.Cas9Behavior
Simulation.CRISPRn
Simulation.CRISPRi
```

## Miscellaneous functions

```@docs
Simulation.signal
Simulation.noise
Simulation.linear
Simulation.sigmoid
Simulation.auprc
Simulation.auroc
Simulation.venn
```
115 changes: 101 additions & 14 deletions src/simulation/common.jl
@@ -4,6 +4,8 @@ using Compat
Any entity that is tracked through the pooled experiment. For CRISPR screens,
this is equivalent to the sgRNA. This object stores properties relating to the
performance of this entity in the screen.
$(FIELDS)
"""
type Barcode
"The target gene id"
@@ -42,45 +44,130 @@ A description of screen parameters

as_array(ss::ScreenSetup) = [getfield(ss, fld) for fld in fieldnames(ss)]

"""
A type representing the parameters used in a typical FACS screen.
$(FIELDS)
"""
type FacsScreen <: ScreenSetup
-"Number of target genes"
"Number of genes targeted by the screen"
num_genes::Int

"Number of guides per gene"
coverage::Int
-"Number of cells with each guide"

"""Number of cells with each guide. `representation=10` means that there are
10 times as many cells as guides during transfection. The actual number of
cells per guide post-transfection will be less depending on the MOI"""
representation::Int
-"Multiplicity of infection"

"""The multiplicity of infection, ``\\lambda``, of the screen. We model this as a Poisson
process during transfection (see [`Simulation.transfect`](@ref)).
!!! note
We **do not** model multiple infections. We assume that the MOI is properly
selected and less than half the cells are transfected by any virus, i.e.
``\\lambda \\lt 0.5`` and then select only the cells that have a single
transfection occurrence:
```math
P(x = 1; Poisson(λ))
```
For ``\\lambda = 0.25`` this works out to being ``\\approx 19.5\\%`` of the
number of cells (`num_genes` * `coverage` * `representation`).
"""
moi::Float64
-"Std dev expected for cells during facs sorting"

"""The standard deviation expected for cells during FACS sorting. This should
be set according to the biological variance experimentally observed, e.g. in
the fluorescence intensity of isogenic cells"""
σ::Float64
-"Range of guide phenotypes to collect in each bin"

"""Range of guide phenotypes to collect in each bin
# Example
In the following example
```julia
p = 0.05
bin_info = Dict(:bin1 => (0.0, p), :bin2 => (1.0-p, 1.0))
```
Cells below the 5th percentile of the sorted phenotype (fluorescence,
size, etc.) will be compared to cells above the 95th percentile.
"""
bin_info::Dict{Symbol, Tuple{Float64, Float64}}
-"Number of cells sorted"

"Number of cells sorted expressed as an integer multiple of the number of guides"
bottleneck_representation::Int
-"Sequencing depth"

"""Sequencing depth as an integer multiple of the number of guides, i.e.
`seq_depth=10` is equivalent to `10 * num_genes * coverage` reads.
"""
seq_depth::Int

function FacsScreen()
new(500, 5, 100, 0.25, 1.0, Dict(:bin1 => (0.0, 1/3), :bin2 => (2/3, 1.0)), 1000, 1000)
end
end
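
One way to read the `bin_info` docstring above is that each tuple is a quantile range over the cells' phenotypes. The sketch below illustrates that reading only; it is not the package's sorting code. It assumes Julia 1.x, a made-up `phenotypes` vector, and a hypothetical helper `cells_in_bin`.

```julia
using Statistics

# Made-up phenotype values for 100 cells (illustration only, not package data).
phenotypes = collect(1.0:100.0)

p = 0.05
bin_info = Dict(:bin1 => (0.0, p), :bin2 => (1.0 - p, 1.0))

# Hypothetical helper: translate a quantile range into phenotype cutoffs
# and keep the cells whose phenotype falls between them.
function cells_in_bin(values, (lo, hi))
    lo_cut = quantile(values, lo)
    hi_cut = quantile(values, hi)
    return filter(v -> lo_cut <= v <= hi_cut, values)
end

low_bin  = cells_in_bin(phenotypes, bin_info[:bin1])
high_bin = cells_in_bin(phenotypes, bin_info[:bin2])

println("bin1 holds ", length(low_bin), " cells; bin2 holds ", length(high_bin), " cells")
```

With `p = 0.05` the two bins are the extreme tails of the distribution, which is the comparison the docstring describes.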

"""
A type representing the parameters used in a typical growth-based screen.
$(FIELDS)
"""
type GrowthScreen <: ScreenSetup
-"Number of target genes"
"Number of genes targeted by the screen"
num_genes::Int

"Number of guides per gene"
coverage::Int
-"Number of cells with each guide"

"""Number of cells with each guide. `representation=10` means that there are
10 times as many cells as guides during transfection. The actual number of
cells per guide post-transfection will be less depending on the MOI"""
representation::Int
-"Multiplicity of infection"

"""The multiplicity of infection, ``\\lambda``, of the screen. We model this as a Poisson
process during transfection (see [`Simulation.transfect`](@ref)).
!!! note
We **do not** model multiple infections. We assume that the MOI is properly
selected and less than half the cells are transfected by any virus, i.e.
``\\lambda \\lt 0.5`` and then select only the cells that have a single
transfection occurrence:
```math
P(x = 1; Poisson(λ))
```
For ``\\lambda = 0.25`` this works out to being ``\\approx 19.5\\%`` of the
number of cells (`num_genes` * `coverage` * `representation`).
"""
moi::Float64
-"Sequencing depth"

"""Sequencing depth as an integer multiple of the number of guides, i.e.
`seq_depth=10` is equivalent to `10 * num_genes * coverage` reads.
"""
seq_depth::Int
-"For growth screens, how much of a bottleneck is applied"

"""For growth screens, how much of a bottleneck is applied. This is the minimum
number of cells that is kept when passaging the pool of cells. This is
expressed as an integer multiple of the number of guides."""
bottleneck_representation::Int
-"For growth screens, how many bottlenecks are applied"

"""For growth screens, how many bottlenecks are applied. This is the integer
number of passages during the growth screen."""
num_bottlenecks::Int
-"The σ of the normal noise distribution added to each cell's phenotype"

"""Before each passage, the theoretical phenotype of each cell is convolved
with a normal noise distribution with a standard deviation, σ, dictated by
this parameter. This should be set based on an expectation of noisiness of
the subsampling"""
noise::Float64

function GrowthScreen()
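
The figure quoted in the `moi` docstrings can be checked directly. The sketch below is a standalone calculation, not the package's `transfect` implementation. It assumes Julia 1.x syntax and the `FacsScreen()` defaults shown in this diff (`num_genes = 500`, `coverage = 5`, `representation = 100`, `seq_depth = 1000`); the helper name `poisson_pmf` is hypothetical.

```julia
# Poisson probability of exactly k infection events at multiplicity λ.
# `poisson_pmf` is a hypothetical helper, not a function from the package.
poisson_pmf(k::Integer, λ::Real) = exp(-λ) * λ^k / factorial(k)

λ = 0.25                                   # moi
num_genes, coverage, representation = 500, 5, 100
seq_depth = 1000

num_guides  = num_genes * coverage         # guides in the library
total_cells = num_guides * representation  # cells before transfection

single_fraction = poisson_pmf(1, λ)        # equals λ * exp(-λ), about 0.195
expected_single = single_fraction * total_cells
total_reads     = seq_depth * num_guides   # reads implied by seq_depth

println("single-infection fraction: ", round(single_fraction, digits=4))
println("expected singly infected cells: ", round(Int, expected_single))
println("sequencing reads: ", total_reads)
```

Because only singly infected cells are kept, the effective number of cells per guide after transfection is roughly a fifth of `representation` at this MOI.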
2 changes: 2 additions & 0 deletions src/simulation/designs.jl
@@ -1,4 +1,6 @@
"""
$(SIGNATURES)
Runs a screen given the parameters specified in `setup` using the
library `lib` and applies the `processing_func` function to the result.
"""
36 changes: 36 additions & 0 deletions src/simulation/library.jl
@@ -1,6 +1,8 @@
using Compat

"""
$(SIGNATURES)
Given value(s) `x` apply the sigmoidal function with maximum
value `l`, a steepness `k`, and an inflection point `p`.
"""
@@ -21,10 +23,16 @@ function sigmoid(xs::AbstractArray{Float64}, l, k, p)
end

"""
$(SIGNATURES)
Given value(s) `x` apply a simple linear function with maximum value `l`
"""
linear(x, l) = clamp(l.*x, min(0, l), max(0, l))
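
To make the clamping in `linear` concrete, here is a small sketch assuming Julia 1.x. It uses the same expression as the line above but with an explicit broadcast (`clamp.`), which is an adaptation so that array inputs also work on current Julia versions; the name `linear_sketch` is hypothetical.

```julia
# Scale x by l, then clamp the result to the interval between 0 and l.
# Works for positive and negative l because the bounds are ordered with min/max.
linear_sketch(x, l) = clamp.(l .* x, min(0, l), max(0, l))

println(linear_sketch(0.5, 1.0))              # inside the range, unchanged
println(linear_sketch(2.0, 1.0))              # above the maximum, clamped to l
println(linear_sketch(0.5, -1.0))             # negative l gives a negative effect
println(linear_sketch([0.0, 0.5, 2.0], 1.0))  # element-wise on arrays
```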

"""
A type representing a relationship between degree of knockdown and effect on
phenotype
"""
@compat abstract type KDPhenotypeRelationship end

type Linear <: KDPhenotypeRelationship end
@@ -51,10 +59,34 @@ function response(sig::Sigmoidal)
(x, l) -> sigmoid(x, l, width, inflection)
end

"""
A type representing the behavior of different Cas9s
"""
@compat abstract type Cas9Behavior end

"""
$(TYPEDEF)
CRISPRi behavior is simply determined by the activity of the guide
"""
type CRISPRi <: Cas9Behavior end

"""
$(TYPEDEF)
CRISPR KO behavior is more complex since sgRNA-directed DNA damage repair is
stochastic. We assume that 2/3 of repair events at a given locus lead to a
frameshift, and that the screen is carried out in diploid cells. The assumption
that only bi-allelic frame-shift mutations lead to a phenotype in
CRISPRn screens for most sgRNAs is supported by the empirical finding that
in-frame deletions mostly do not show strong phenotypes, unless they occur in
regions encoding conserved residues or domains[^2]
[^2]: Horlbeck MA, Gilbert LA, Villalta JE, Adamson B, Pak RA, Chen Y, Fields AP,
Park CY, Corn JE, Kampmann M, Weissman JS: Compact and highly active next-
generation libraries for CRISPR-mediated gene repression and activation.
*Elife* 2016, 5.
"""
type CRISPRn <: Cas9Behavior
knockout_dist::Categorical

@@ -64,6 +96,8 @@ CRISPRn() = CRISPRn(Categorical([1/9, 4/9, 4/9]))

"""
Wrapper containing all library construction parameters
$(FIELDS)
"""
type Library
"Distribution of guide knockdown efficiencies"
@@ -190,6 +224,8 @@ function rand_gene(lib::Library)
end

"""
$(SIGNATURES)
Constructs the guide library for `N` genes with `coverage` number of guides per
gene. Returns a tuple of guides and their relative frequencies (assigned randomly).
"""
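
The probabilities in `Categorical([1/9, 4/9, 4/9])` follow from the two assumptions stated in the `CRISPRn` docstring: a 2/3 chance of a frameshift per repaired allele, and diploid cells. The sketch below reproduces them with a binomial calculation, assuming Julia 1.x. Mapping the three categories to 0, 1, and 2 frameshifted alleles in that order is an assumption made here; the diff does not state the ordering.

```julia
# Assumptions taken from the CRISPRn docstring.
p_frameshift = 2 / 3   # chance that repair of one allele causes a frameshift
n_alleles    = 2       # diploid cells

# Binomial probability of k frameshifted alleles, for k = 0, 1, 2.
knockout_probs = [binomial(n_alleles, k) * p_frameshift^k *
                  (1 - p_frameshift)^(n_alleles - k) for k in 0:n_alleles]

println(knockout_probs)   # approximately [1/9, 4/9, 4/9]
```

Under the docstring's further assumption that only bi-allelic frameshifts produce a phenotype, the last entry (4/9) is the fraction of cells expected to show the full knockout effect.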
3 changes: 2 additions & 1 deletion src/simulation/load.jl
@@ -4,7 +4,8 @@ packages = [:StatsBase,
:Distributions,
:DataFrames,
:HypothesisTests,
-:IterTools]
:IterTools,
:DocStringExtensions]

for package in packages
eval(:(using $package))