calling HMCODE in Cosmology object -- slow #1104

Open
nikosarcevic opened this issue Jul 14, 2023 · 1 comment

Comments

@nikosarcevic

nikosarcevic commented Jul 14, 2023

As I understand it, HMCODE is currently run by calling CAMB externally. I have not looked into the code and am not sure exactly how HMCODE is called internally, but for me, getting Pk is very, very slow compared to when I use halofit or BCM.

Can this be done differently so that it is faster?

Also, could HMCODE be selected differently when constructing the cosmology object? I think the way it is done now is a bit complicated (several layers of dictionaries).

ETA: my setup

I have a Fisher forecasting pipeline with several "modes" for shear, gglensing, clustering and 3x2pt for SRD Y1 and Y10.
The modes are: SRD (obtain the SRD results), JMAS (my project within DESC), SRD+BCM (SRD extended to have a boosted Pk and include BCM params) and SRD+HMCODE (Pk boosted using the HMCODE route).
The pipeline is modular, optimized and stable. The only bottleneck is when I choose the SRD+HMCODE mode. For example, if I run the cosmic shear probe in SRD+BCM mode, I obtain a Fisher matrix (the final product) in about 20 min. If I go with SRD+HMCODE, it takes over 2 hours. The bottleneck is constructing the cosmology object, getting the Pk and eventually getting the Cls. Hope this explanation helps @tilmantroester

@tilmantroester

It's good practice to reduce the problem to a minimal example that reproduces the issue.

In the test below, using HMCode in CAMB takes between 50% and 100% longer than using halofit (either from CAMB or CCL). That is more than I had expected (when I did similar tests previously, the added runtime due to running HMCode was negligible). But maybe I was doing the tests with different accuracy settings for CAMB, where computing the transfer function took much longer, making the time difference in the power spectra computation less significant.

So this doesn't explain the factor of 6 in runtime you see in your pipeline. Have you used a profiler to pinpoint the bottleneck?
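As a minimal sketch of what such a profiling run could look like, the snippet below wraps a callable in the standard-library `cProfile` and returns a report sorted by cumulative time. The helper `profile_call` and the `stand_in_workload` function are hypothetical names introduced here for illustration; in practice the stand-in would be replaced by one step of the pipeline (for example a function that builds the cosmology object and computes the Cls).

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile; return (result, text report by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "cumulative" ranks functions by time spent in them and their callees,
    # which makes the expensive top-level stage easy to spot.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def stand_in_workload(n=200_000):
    # Hypothetical placeholder for one pipeline step.
    return sum(i * i for i in range(n))


result, report = profile_call(stand_in_workload)
print(report)
```

The lines at the top of the report are the ones worth looking at first: if most of the cumulative time sits in a single call, that is the likely bottleneck.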

import os
# Restrict OpenMP to a single thread (set before the other imports) so the timings are comparable.
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np
import pyccl as ccl


p = dict(
    Omega_b = 0.05,
    Omega_c = 0.25,
    h       = 0.67,
    n_s     = 0.96,
    A_s     = 2.1e-9,
)


z = np.linspace(0, 3, 300)
nz = np.exp(-0.5 * (z - 0.5)**2/0.1**2)
ell = np.unique(np.geomspace(2, 2000, 400, dtype=int))


# Dark-matter-only baseline: CCL's default nonlinear power spectrum.
def dmo():
    cosmo = ccl.Cosmology(**p,
                          extra_parameters={"camb": {"kmax": 20.0,}})
    wl_tracer = ccl.WeakLensingTracer(cosmo=cosmo, dndz=(z, nz))
    cell = ccl.angular_cl(cosmo=cosmo, ell=ell, tracer1=wl_tracer, tracer2=wl_tracer)
    return cell


# HMCode route: nonlinear power spectrum computed by CAMB with halofit_version="mead2020".
def hmcode():
    cosmo = ccl.Cosmology(**p,
        matter_power_spectrum="camb",
        extra_parameters={"camb": {"kmax": 20.0,
                                   "halofit_version": "mead2020",
                                   "HMCode_logT_AGN": 7.8}}
    )
    wl_tracer = ccl.WeakLensingTracer(cosmo=cosmo, dndz=(z, nz))
    cell = ccl.angular_cl(cosmo=cosmo, ell=ell, tracer1=wl_tracer, tracer2=wl_tracer)
    return cell


# BCM route: apply the BaryonsSchneider15 correction to the nonlinear power spectrum.
def bcm():
    cosmo = ccl.Cosmology(**p,
                          extra_parameters={"camb": {"kmax": 20.0,}})
    baryon_model = ccl.baryons.BaryonsSchneider15()
    baryon_pk = baryon_model.include_baryonic_effects(cosmo=cosmo, pk=cosmo.get_nonlin_power())
    wl_tracer = ccl.WeakLensingTracer(cosmo=cosmo, dndz=(z, nz))
    cell = ccl.angular_cl(cosmo=cosmo, ell=ell, tracer1=wl_tracer, tracer2=wl_tracer, p_of_k_a=baryon_pk)
    return cell


%timeit dmo()
%timeit hmcode()
%timeit bcm()

Output (in the order dmo, hmcode, bcm):

1.17 s ± 19.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.43 s ± 35.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.12 s ± 6.71 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
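
The test above times each function end to end, so it cannot show how the cost splits between constructing the cosmology object, computing the power spectrum, and computing the Cls. A rough sketch of per-stage timing is given below. The `timed` context manager, the `stage_times` dictionary and the stage names are hypothetical, and the `time.sleep` calls are placeholders standing in for the real pipeline steps.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Wall-clock durations collected per stage name.
stage_times = defaultdict(list)


@contextmanager
def timed(stage):
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_times[stage].append(time.perf_counter() - start)


def summarize(times):
    """Return the total time spent in each stage."""
    return {stage: sum(values) for stage, values in times.items()}


# Placeholder loop: each sleep stands in for one real pipeline step.
for _ in range(3):
    with timed("cosmology"):
        time.sleep(0.01)
    with timed("power_spectrum"):
        time.sleep(0.02)
    with timed("angular_cl"):
        time.sleep(0.005)

totals = summarize(stage_times)
for stage, total in totals.items():
    print(f"{stage}: {total:.3f} s")
```

One caveat when applying this to CCL: as far as I know the power spectrum is computed lazily on first use, so its cost will show up in whichever stage first needs it unless the computation is triggered explicitly inside its own `timed` block.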
