Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Choose an SSA if no SSA is passed in JumpProblem. #351

Merged
merged 23 commits into from
Aug 6, 2024
Merged

Conversation

Vilin97
Copy link
Contributor

@Vilin97 Vilin97 commented Sep 22, 2023

If JumpProblem(prob, jumps) is called, an SSA is selected via the function below. The assumption is that Direct is fastest for a small network, while RSSACR is fastest for other networks. If RSSACR cannot be used because no species-to-jumps graph is given, then DirectCR is used instead.

# return the fastest aggregator out of the available ones
function select_aggregator(jumps::JumpSet; vartojumps_map=nothing, jumptovars_map=nothing, dep_graph=nothing, spatial_system=nothing, hopping_constants=nothing)
    
    # detect if a spatial SSA should be used
    !isnothing(spatial_system) && !isnothing(hopping_constants) && return DirectCRDirect

    # if variable rate jumps are present, return the only SSA that supports them
    num_vrjs(jumps) != 0 && return Coevolve
    
    # if the number of jumps is small, return the Direct
    num_majumps(jumps) + num_crjs(jumps) < 10 && return Direct

    # if there are only massaction jumps, we can any build dependency graph, so return the fastest aggregator
    num_crjs(jumps) == 0 && num_vrjs(jumps) == 0 && return RSSACR

    # in the presence of constant rate jumps, we rely on the given dependency graphs
    if !isnothing(vartojumps_map) && !isnothing(jumptovars_map)
        return RSSACR
    elseif !isnothing(dep_graph)
        return DirectCR
    else
        return Direct
    end
end

@github-actions
Copy link
Contributor

github-actions bot commented Sep 22, 2023

Pull Request Test Coverage Report for Build 7514454839

Warning: This coverage report may be inaccurate.

We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
To ensure accuracy in future PRs, please see these guidelines.
A quick fix for this PR: rebase it; your next report should be accurate.

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-1.1%) to 88.765%

Totals Coverage Status
Change from base Build 6269620671: -1.1%
Covered Lines: 2070
Relevant Lines: 2332

💛 - Coveralls

test/spatial/ABC.jl Outdated Show resolved Hide resolved
src/problem.jl Outdated Show resolved Hide resolved
src/aggregators/aggregators.jl Outdated Show resolved Hide resolved
Comment on lines 202 to 213
num_majumps(jumps) + num_crjs(jumps) < 10 && return Direct

# if there are only massaction jumps, we can any build dependency graph, so return the fastest aggregator
num_crjs(jumps) == 0 && num_vrjs(jumps) == 0 && return RSSACR

# in the presence of constant rate jumps, we rely on the given dependency graphs
if !isnothing(vartojumps_map) && !isnothing(jumptovars_map)
return RSSACR
elseif !isnothing(dep_graph)
return DirectCR
else
return Direct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are missing a middle regime where SortingDirect or RSSA might be better. Have you played around with this at all / looked over the SciMLBenchmarks and the Catalyst paper results?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I did the Catalyst tests, RSSA was best for small systems. However, we didn't test very small systems, so not sure about these. But there definitely was a regime where RSSA as the best choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image
This image suggests that RSSA is indeed better on small models but only marginally. I did not see a comparison with Direct and SortingDirect there. @TorkelE , do you have a table of times from when you did the benchmarking? I did not find such a table for SSA benchmarks in the paper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image
image
These two are from models with 6 and 18 reactions, respectively. Indeed, it looks like between ~6 and ~66 reactions we might want to use RSSA. The number 66 comes from the benchmark from the Catalyst paper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think the smallest model (Multistate) is 18 reactions. It is kind of hard to see the Direct and Sorting DIrect as RSSA (and RSSACR) is more or less on top of them (also suggesting that the choice of method is not that important).

There is not good table, but the results are in JSON files here: https://github.com/SciML/Catalyst_PLOS_COMPBIO_2023/tree/master/Benchmarks/Benchmarking_results/Threads_1

Look for e.g. RSSA_multistate and you find the multistate results. The data is saved in Dicts with the benchmark median times and simulation lengs as separate entries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that hold with static array states?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added RSSA if the number of jumps is between 10 and 100. This schema is bound to be just a heuristic so I don't think we should spend a ton of time finding the exact threshold when e.g. RSSA is better than e.g. SortingDirect.

@ChrisRackauckas
Copy link
Member

Perfect is always the enemy of good for this kind of thing. I can tell you that 90% of users just use Gillespie's method if we document Direct as Gillespie's method. Even the simplest defaulting scheme can be merged and do better than the vast majority of users, so the details can always evolve but I think the biggest thing is to just get something with the right broad strokes and also update all of the tutorials to not hardcode a method. The latter is BTW very important, because there's still people who use DiffEq who just throw Tsit5 everywhere instead of using the default, and you really want to stop that habit via tutorials otherwise people will overuse Gillespie.

Copy link
Member

@TorkelE TorkelE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

src/aggregators/aggregators.jl Outdated Show resolved Hide resolved
src/aggregators/aggregators.jl Outdated Show resolved Hide resolved
test/geneexpr_test.jl Outdated Show resolved Hide resolved
@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

@Vilin97 will merge when I get tests passing.

@TorkelE expanded this a bit to allow not needing to pass SSAStepper too, i.e. with this PR this should now work

jprob = JumpProblem(rn, dprob)
sol = solve(jprob)

@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

@TorkelE actually MTK and Catalyst need appropriate dispatches for JumpProblem to make this work, so not quite there yet for being usable in Catalyst.

src/solve.jl Outdated
Comment on lines 11 to 13
function DiffEqBase.__solve(jump_prob::DiffEqBase.AbstractJumpProblem; kwargs...)
DiffEqBase.__solve(jump_prob, SSAStepper(); kwargs...)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisRackauckas can you confirm this is an OK dispatch for me to add to enable

sol = solve(jprob; kwargs...)  # equivalent to solve(jprob, SSAStepper(); kwargs...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess this ignores when the JumpProblem is over an ODEProblem or such. Is there a way I can explicitly select the default auto-solver for a given problem type?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't work currently either, so for now I'll just make this dispatch only work for JumpProblems over DiscreteProblems.

@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

@ChrisRackauckas if you could give a quick look to ensure I haven't added a dispatch better handled elsewhere (or shadowing one elsewhere) that would be helpful.

HISTORY.md Outdated Show resolved Hide resolved

# if variable rate jumps are present, return one of the two SSAs that support them
if num_vrjs(jumps) > 0
(num_bndvrjs(jumps) > 0) && return Coevolve
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't they all need bounds?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we support mixing bounded and non-bounded vrjs (but if there are any non-bounded vrjs you need to use a JumpProblem over an ODEProblem or such). JumpProblem has code to check these cases, but I'm not sure how well this is actually tested currently.

@isaacsas isaacsas merged commit 4f262cb into master Aug 6, 2024
5 checks passed
@isaacsas isaacsas deleted the default-ssa branch August 6, 2024 17:50
@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

@Vilin97 thanks for your work on this and sorry for the long long delay!

@Vilin97
Copy link
Contributor Author

Vilin97 commented Aug 6, 2024

Thank you for cleaning up and adding a default solver too. Now we need to update the docs, right?

@isaacsas
Copy link
Member

isaacsas commented Aug 6, 2024

Yes, the docs need updates in places now. Using the defaults in the most introductory places, but explaining how to select solvers and such as they currently do in more advanced places.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants