Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fitkwargs are not propagated to lmoments3 lmom_fit function #2043

Open
2 tasks done
saschahofmann opened this issue Jan 13, 2025 · 8 comments
Open
2 tasks done

fitkwargs are not propagated to lmoments3 lmom_fit function #2043

saschahofmann opened this issue Jan 13, 2025 · 8 comments
Labels
enhancement New feature or request

Comments

@saschahofmann
Copy link
Contributor

Addressing a Problem?

I am using xclim.indices.stats.standardized_index_fit_params to compute SPI and SPEI. I am calling it with method='PWM', dist=lmoments3.distr['gam'] and sometimes get errors where the L-moments are invalid (lmoments3/distr.py:1207). This is due to potentially negative values.

I have seen two situations for negative values:

A common solution seems to be adding an offset (e.g. I can add a large value for water_budget+=1000 or a very small number pr+=0.0001 for the second case). Previously xclim supported an offset parameter that is soon to be deprecated.
I couldn't really figure out a good way to provide a scalar with units but using a one-dimensional DataArray with units attributes worked.

The suggest alternative for the deprecated offset is using fitkwargs=dict(floc= -offset) but this parameter doesn't seem to be propagated to the lmom_fit function when providing method='PWM' to _fitfunc_1d.

    elif method == "PWM":
        # lmoments3 will raise an error if only dist.numargs + 2 values are provided
        if len(x) <= dist.numargs + 2:
            return np.asarray([np.nan] * nparams)
        params = list(dist.lmom_fit(x).values())

Potential Solution

I am a little confused to how the lmoments3 part works but the GammaGen inherits from scipy.stats._continuous_distns.gamma_gen so we might be able to propagate it the argument to the internal fit. Another solution that could also solve the lmoments3 license issue, is using scipy.stats.lmoment introduced in SciPy 1.15 and deprecating lmoments3.

Additional context

No response

Contribution

  • I would be willing/able to open a Pull Request to contribute this feature.

Code of Conduct

  • I agree to follow this project's Code of Conduct
@saschahofmann saschahofmann added the enhancement New feature or request label Jan 13, 2025
@coxipi
Copy link
Contributor

coxipi commented Jan 13, 2025

Hi Sascha,

I didn't check specifically with Lmoments3 because of the licensing issues. I think that specifically for gamma this is a problem, because loc is accepted as an argument, but then it is assumed to be 0:
https://lmoments3.readthedocs.io/stable/distributions.html

a, loc, scale (The location parameter is not calculated using L-moments and assumed to be zero.)

I guess that for other distributions, a loc parameter will be computed appropriately, but for the lmoments3 gamma distribution, you just need to offset your distribution.

The floc is something you fix for the _fit_start, but after that it is given in the full-fledged optimization a the starting parameter for loc, so you will presumably still get the problem outlined above, lmoments3 doesn't really support loc for gamma? So I realize that lmoments3 just doesn't take starting values for parameters, so floc cannot be used to fix the starting point of the distribution in this way

In this sense, the previous argument offset would have let you work with Lmoments3 in this way, but I prefer the current way we handle arguments because it more closely follows what can be given to scipy, instead of relying on manual shifts, we use built-in procedure to compute loc. If possible, it might be good for Lmoments3 to accept a loc argument, and simply shift the distribution accordingly, stating that loc will be frozen and not optimized by the method?

@coxipi
Copy link
Contributor

coxipi commented Jan 13, 2025

I couldn't really figure out a good way to provide a scalar with units but using a one-dimensional DataArray with units attributes worked.

Did you try with a string offset = "1000 mm/d"? It worked like thresholds

@coxipi
Copy link
Contributor

coxipi commented Jan 13, 2025

Ah, sorry, I read too fast. I do think we will hit a problem specifically with lmom-gamma, but currently, fitkwargs is not even propagated as you say, we need to input fitkwargs to dist.lmom_fit:
params = list(dist.lmom_fit(x, **fitkwargs).values())

But I guess the way that lmoments works, you don't take starting values for an optimization process, you just compute moments of you data and use them in some formulas that give you the parameters? So fitkwargs can't be used as an argument in this case.

@coxipi
Copy link
Contributor

coxipi commented Jan 13, 2025

So, right now, we accept fitkwargs arguments in conjunction with method="PWM" although they are not used, so this is confusing.

  1. We could forbid the use of fitkwargs arguments in this case. Or, simply a warning, that with lmoments, the fitkwargs are not used.
  2. Specifically for "lmoments3/gamma", we could allow floc and use it as a shifting parameter?
    elif method == "PWM":
        # ...
        if "floc" in fitkwargs and type(dist) == "GammaGen":
            # lmoments3 assumes `loc` is 0, so `x` may need to be shifted
            # note that `floc` must already be in appropriate units for `x`
            params = list(dist.lmom_fit(x - fitkwargs["floc"]).values())
            params["loc"] = fitkwargs["floc"]
        else:
            params = list(dist.lmom_fit(x).values())

I don't think we plan on having a full-fledged support for lmoments3 because of licensing issues, so I think this kind of specific crafty solution is all right. Although it's not exactly the way floc was expected to be used for, I think it's acceptable in this case. @aulemahal thoughts?

@aulemahal
Copy link
Collaborator

I agree with this solution! I support a warning if fitkwargs is passed, but lmom and dist != 'GammaGen' is used.

Don't you need to edit params afterwards to invert the offset ?

@coxipi
Copy link
Contributor

coxipi commented Jan 13, 2025

Don't you need to edit params afterwards to invert the offset ?

I have to see how params are stored. The best would be simply to include loc=floc in the final parameters found.

That is,

  1. compute dist.lmom_fit(x_shifted) & obtain a=a_sol, scale=scale_sol, loc=0.
  2. update loc since we know we shifted x?

I have updated my previous comment to reflect this

@coxipi
Copy link
Contributor

coxipi commented Jan 13, 2025

see #2045

@saschahofmann
Copy link
Contributor Author

Wow that was fast. I actually agree and the easiest solution would probably be to even just raise an error. When the floc parameter is provided for PWM. I currently just apply the offset before calling standardized_index_fit_params, so that's always an option.

But I see you already put in a solution 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants