diff --git a/README.md b/README.md
new file mode 100644
index 0000000..524f179
--- /dev/null
+++ b/README.md
@@ -0,0 +1,69 @@
+# ProbabilisticParameterEstimators.jl
+
+Implementation of different parameter estimators that take in measurements under uncertainty and produce a probability distribution over the parameters.
+
+## High Level Example
+
+``` julia
+using ProbabilisticParameterEstimators
+using Distributions, LinearAlgebra
+
+# observation function with multivariate observations
+f(x, p) = [(x + 1)^2 - sum(p);
+           (x + 1)^3 + diff(p)[1]]
+
+# true parameter (to be estimated) and a prior belief
+θtrue = [1.0, 2.0]
+prior = MvNormal(zeros(2), 4.0 * I)
+
+# observation inputs x
+xs = rand(5)
+
+# observation noise model (one noise distribution per observation)
+obsnoises = [rand()/10 * I(2) * MvNormal(zeros(2), I) for _ in eachindex(xs)]
+noisemodel = UncorrGaussianNoiseModel(obsnoises)
+
+# noisy observations y
+ysmeas = f.(xs, [θtrue]) .+ rand.(obsnoises)
+
+# find a probabilistic description of θ either as samples or as a distribution
+# currently we provide three methods
+for est in [MCMCEstimator(prior, f),
+            LinearApproxEstimator(prior, f),
+            LSQEstimator(prior, f)]
+    # either
+    samples = predictsamples(est, xs, ysmeas, noisemodel, 100)
+    # or
+    dist = predictdist(est, xs, ysmeas, noisemodel; nsamples=100)
+end
+```
+
+## Problem Setup
+We assume parameters $\theta \in \mathbb{R}^m$, inputs $x \in \mathbb{R}^n$, and measurements $y \in \mathbb{R}^l$, linked by an observation function $$y = f(x, \theta) + \varepsilon$$ where $\varepsilon$ is sampled from a known noise distribution $p_{\bar{\varepsilon}}$.
+Further assumptions on the noise models are discussed below.
+Notice also that $x$, $y$, and $\theta$ may all be multidimensional, with different dimensions.
+
+Given that we have uncertainty in the observations, we are interested in constructing a probabilistic description $p_{\bar{\theta}}(\theta \mid y)$ of the parameters $\theta$, either as a distribution or as a set of samples.
+We implement three estimators for this task, which map to either samples or a distribution via `predictsamples(est, xs, ys, noisemodel, nsamples)` and `predictdist(est, xs, ys, noisemodel)`, respectively.
+The conversion between samples and a distribution is done automatically, either by sampling from the distribution or by fitting a multivariate normal distribution to the samples.
+
+![Estimator Overview](figs/distribution_graph/distribution_graph.png)
+
+### MCMCEstimator
+The `MCMCEstimator` simply phrases the problem as a Markov chain Monte Carlo (MCMC) inference problem, which we solve using the `NUTS` algorithm provided by `Turing.jl`.
+Therefore `predictsamples(::MCMCEstimator, xs, ys, noisemodel, nsamples)` will create `nsamples` samples (after skipping a number of warmup steps).
+`predictdist(::MCMCEstimator, xs, ys, noisemodel)` will do the same, and then fit an `MvNormal` distribution.
+
+### LSQEstimator
+The `LSQEstimator` works by sampling noise $\varepsilon^{(k)}$ from the noise model and repeatedly solving a least-squares parameter estimation problem for modified measurements $y - \varepsilon^{(k)}$, i.e.
+$$
+\theta^{(k)} = \arg \min_\theta \sum_i \left( (y_i - \varepsilon_i^{(k)}) - f(x_i, \theta) \right)^2 \cdot w_i
+$$
+for uncorrelated noise, where the weights $w_i$ are chosen as the inverse variances.
+For correlated noise, the weights are instead derived from the full covariance matrix.
+
+Therefore `predictsamples(::LSQEstimator, xs, ys, noisemodel, nsamples)` will solve `nsamples` optimization problems and return one parameter sample for each.
+`predictdist(::LSQEstimator, xs, ys, noisemodel)` will do the same, and then fit an `MvNormal` distribution.
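+
+As a rough illustration of the procedure above, the following hypothetical sketch (not the package implementation) draws noise realizations, re-solves a weighted least-squares problem for each perturbed measurement vector, and collects the resulting parameter samples. The linear observation model `A(x)`, the scalar `Normal` noise, and all variable names are assumptions made for this sketch, chosen so that each solve has a closed-form solution instead of requiring a numerical optimizer.
+
+``` julia
+using Distributions, LinearAlgebra
+
+A(x) = [x  x^2]                  # illustrative 1×2 design row, so y = A(x)*θ + ε
+θtrue = [1.0, 2.0]
+xs = rand(10)
+noise = Normal(0.0, 0.1)         # uncorrelated scalar observation noise
+ys = [only(A(x) * θtrue) + rand(noise) for x in xs]
+
+w = 1 / var(noise)               # inverse-variance weight wᵢ (identical for all i here)
+Amat = reduce(vcat, A.(xs))      # stacked 10×2 design matrix
+
+samples = map(1:100) do _
+    εk = rand(noise, length(ys))                   # ε⁽ᵏ⁾ drawn from the noise model
+    (w * Amat' * Amat) \ (w * Amat' * (ys .- εk))  # weighted LSQ solution θ⁽ᵏ⁾
+end
+θdist = fit(MvNormal, reduce(hcat, samples))       # analogous to the predictdist step
+```
+
+With a nonlinear `f` and a multivariate noise model, each re-solve becomes an actual optimization problem as described above, but the sample-then-fit structure stays the same.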
+
+### LinearApproxEstimator
+The `LinearApproxEstimator` solves the optimization problem above just once, and then constructs a multivariate normal distribution centered at the solution.
+The covariance is constructed by computing the Jacobian of $f(x, \theta)$ with respect to $\theta$ and (roughly) propagating the measurement uncertainty through it.
+See also [this Wikipedia article](https://en.wikipedia.org/wiki/Non-linear_least_squares#Extension_by_weights).
+
+Therefore `predictdist(::LinearApproxEstimator, xs, ys, noisemodel)` will solve one optimization problem and compute one Jacobian, yielding an `MvNormal` and making it very efficient.
+`predictsamples(::LinearApproxEstimator, xs, ys, noisemodel, nsamples)` will simply sample `nsamples` times from this distribution, which is also very fast.
diff --git a/figs/distribution_graph/distribution_graph.pdf b/figs/distribution_graph/distribution_graph.pdf
new file mode 100644
index 0000000..50cb939
Binary files /dev/null and b/figs/distribution_graph/distribution_graph.pdf differ
diff --git a/figs/distribution_graph/distribution_graph.png b/figs/distribution_graph/distribution_graph.png
new file mode 100644
index 0000000..3bfaf67
Binary files /dev/null and b/figs/distribution_graph/distribution_graph.png differ
diff --git a/figs/distribution_graph/distribution_graph.tex b/figs/distribution_graph/distribution_graph.tex
new file mode 100644
index 0000000..bb30770
--- /dev/null
+++ b/figs/distribution_graph/distribution_graph.tex
@@ -0,0 +1,24 @@
+\documentclass[tikz, convert={density=600,outext=.png}]{standalone}
+\usetikzlibrary{positioning}
+
+\begin{document}
+\begin{tikzpicture}[
+  mybox/.style={minimum width=2.0cm, minimum height=0.70cm, draw, thick},
+  myarrow/.style={shorten <=2pt, shorten >=2pt, ->, thick},
+  ]
+  \node[mybox] (mvnormal) {\texttt{MvNormal}};
+  \node[mybox, right=2 of mvnormal, align=center] (samples) {\texttt{Samples}};
+
+  \draw[myarrow] (mvnormal) to[bend left=20] node[above] {\texttt{rand}} (samples);
+  \draw[myarrow] (samples) to[bend left=20] node[below] {\texttt{fit(MvNormal, samples)}} (mvnormal);
+
+  \node[draw, minimum width=4.5cm, above=3 of mvnormal.west, anchor=east] (mcmcestimator) {\texttt{MCMCEstimator}};
+  \node[draw, minimum width=4.5cm, below=0 of mcmcestimator] (lsqestimator) {\texttt{LSQEstimator}};
+  \node[draw, minimum width=4.5cm, below=0 of lsqestimator] (linearapproxestimator) {\texttt{LinearApproxEstimator}};
+
+  \draw[->] (mcmcestimator) to[out=0, in=90] (samples);
+  \draw[->] (lsqestimator) to[out=0, in=120] node[above, rotate=-30, pos=0.62, scale=0.6] {\texttt{predictsamples}} (samples);
+  \draw[->] (linearapproxestimator) to[out=0, in=90] node[above, rotate=-70, scale=0.6, pos=0.62] {\texttt{predictdist}} (mvnormal);
+\end{tikzpicture}
+
+\end{document}