\documentclass[12pt, letterpaper]{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage[margin=1in]{geometry}
\usepackage{graphicx}
\usepackage[ruled,vlined]{algorithm2e}
\graphicspath{ {./img/} }
\author{Jeff Liu}
\title{Bayesian Modeling of Polar Sea Ice from IceSat-2 Freeboard Measurements}
\begin{document}
\maketitle
\section{Introduction}
Polar sea ice thickness is of interest to researchers for a variety of reasons including maritime navigation and climate change.
In 2018, NASA launched the IceSat-2 satellite, which relies on a laser altimeter to help study polar sea ice.
To measure the thickness of this ice, the satellite takes measurements of ``freeboard'', which is the height of the ice above sea water.
Typically, 7/8 of the ice lies below water, so the freeboard alone could be used to infer the true thickness of the ice;
however, this measurement is complicated by an unknown amount of snow. Any snow present may weigh down the ice, pushing it deeper into the water,
which renders this 7/8 rule of thumb inaccurate.
\\\\
In response to this complication, scientists have devised a more sophisticated model that incorporates the amount of snow present into
the relationship between ice thickness and freeboard.
It relies on three additional parameters: the densities of the surrounding sea water, the sea ice in question, and the snow on top of the ice.
My goal in this paper is to incorporate this physical model as well as prior knowledge of the parameters to estimate the sea ice thickness
corresponding to a freeboard measurement of 0.1 meters. I will proceed to evaluate this estimate using Bayesian inference techniques
by defining a prior and likelihood and measuring the posterior.
\\\\
I will first proceed to elaborate on this physical model and formulate the Bayesian process to evaluate the expected ice thickness in section II,
including discussion of the assumptions, prior, hyper priors, forward model, likelihood functions, and posterior densities.
Following that, I will discuss the computational strategies for computing the posterior densities, including Markov Chain Monte Carlo,
in section III. Finally, I will report my model results and assess model accuracy in section IV, and discuss the results in section V.
\newpage
\section{Formulation}
\subsection{Physical Model}
The physical model relating the ice thickness $H$, snow thickness $S$, and freeboard $F$ is the following:
\[
F = \frac{\rho_w - \rho_i}{\rho_w} \left( H - S \frac{\rho_s}{\rho_w - \rho_i} \right),
\]
where $\rho_w$, $\rho_i$, and $\rho_s$ are the densities of the sea water, sea ice, and snow, respectively.
In the literature, it is given that Arctic seawater density $\rho_w$ ranges from 1010 to 1030 $kg/m^3$,
sea ice density $\rho_i$ ranges from 750 to 940 $kg/m^3$ with an average value of 910 $kg/m^3$,
and snow density averages 300 $kg/m^3$ while ranging from 100 to 400 $kg/m^3$. In addition, I add a constraint
on sea ice thickness from Dr.\ Matthew Parno's Math 76 lecture on a similar model, where sea ice thickness ranges
from 0.1 to 5 meters while having an average value of 2 meters. No information is given on snow thickness, but it
is safe to say that snow thickness must be greater than or equal to zero meters.
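\medskip
\noindent
To make this relationship concrete, the short Python sketch below evaluates the forward model at the prior means used later in this report; the function name \texttt{forward\_model} and the parameter ordering are my own illustrative choices, not part of any established package.
\begin{verbatim}
import numpy as np

def forward_model(x):
    """Predict freeboard F from x = (rho_w, rho_i, rho_s, H, S)."""
    rho_w, rho_i, rho_s, H, S = x
    return (rho_w - rho_i) / rho_w * (H - S * rho_s / (rho_w - rho_i))

# Example: evaluating at the prior means gives a feel for the scale of F.
x_mean = np.array([1020.0, 910.0, 300.0, 2.0, 0.15])
print(forward_model(x_mean))  # predicted freeboard in meters
\end{verbatim}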
\subsection{Assumptions}
To make this inference possible, I will have to make many assumptions because the given information is quite sparse.
\\\\
First, I must give some characterization to a prior density over these parameters. It is helpful that I have a given
range to work with for the parameters, but I will have to arbitrarily assume variances for all of them.
The potential bias introduced by any poorly assumed variances is dampened by the fact that I have a predefined range of possibilities.
\\\\
In concert with these variance assumptions, I also assume that the prior densities are truncated Gaussian densities.
The reason for this choice is twofold. First, Gaussian densities are easily computed, and an added truncation is trivial
as long as we do not need to integrate to find the normalizing constant. Second, these truncations allow me to impose a strict
range upon the parameters, which helps with limiting the bias of arbitrarily chosen variances in my first assumption.
\\\\
My third assumption is that this model has additive noise. It is given that IceSat-2 can measure ice height within a few centimeters.
Since this measurement affects only the freeboard, which is not multiplied with anything in the physical model, I am assuming that this error
is additive. However, I will not fix the exact amount of additive noise; instead, I will reserve that for a hyperprior distribution,
which I will discuss later.
\\\\
I will also need to make certain assumptions about the snow depth to make this model more easily computable. I visited a NASA website
discussing arctic snow depth over sea ice (\texttt{https://earthobservatory.nasa.gov/images/146758/mapping-snow-on-arctic-sea-ice}), and used the graphics
to semi-arbitrarily choose an average snow depth of 0.15 meters. However, giving an arbitrary estimate of the snow depth variance may
introduce too much bias into the model since I have so little information. Therefore, I will be modeling this variance as a hyperparameter
with a hyperprior as well.
\\\\
Because I am assuming truncated Gaussians for all prior densities, I will assume that the inverse-Gamma distribution will serve as a proper
hyperprior on my additive noise/error variance and my snow thickness variance. This is typically chosen for conjugacy reasons,
which I cannot take advantage of due to my truncations, but the general shape of the distribution is suitable for Gaussian variances anyway.
The ``hyper-hyper-priors'' $\alpha$ and $\beta$ of the inverse-Gamma are chosen somewhat arbitrarily,
but this is not a huge issue since bias in the hyper-hyper-parameters has little influence over the posterior. I will discuss my decision strategy
for these hyper-hyper-priors in a later subsection.
\\\\
My final assumption is that the prior parameters are independent. This is actually a poor assumption that I will revisit in the discussion section,
but I have no way to quantify the covariances of the parameters \emph{a priori}, so I have to make this assumption to make modeling possible.
Additionally, I make the assumption that the hyperparameters are independent from each other, and that the noise variance is also independent
from the parameters. This allows me to make some simplifications using conditional independence in my posterior formulation.
\subsection{Prior Formulation and Densities}
As I mentioned in the assumptions section, I chose the variances on the three densities rather arbitrarily.
I estimated the standard deviations of $\rho_w$ to 7, $\rho_i$ to 80, and $\rho_s$ to 90, and squared those to obtain the variances.
I also set the standard deviation of ice thickness to 2 meters, equal to the mean, to account for a fair amount of uncertainty, and squared that as well.
Finally, I guessed a standard deviation of snow thickness equal to 0.2 meters, which is very wide. However, this value will not be going straight
into the prior model, since I will be modeling that standard deviation with an inverse-Gamma hyperprior. It remains useful for getting situated
and visualizing the prior, though.
\medskip
\begin{center}
\begin{tabular} { |c|c|c|c|c|c| }
\hline
parameter & mean & std.\ dev & variance & lower bound & upper bound \\
\hline
$\rho_w$ & 1020 & 7 & 49 & 1010 & 1030 \\
$\rho_i$ & 910 & 80 & 6400 & 750 & 940 \\
$\rho_s$ & 300 & 90 & 8100 & 100 & 400 \\
$H$ & 2.0 & 2.0 & 4.0 & 0.1 & 5.0 \\
$S$ & 0.15 &$\approx0.2$&$\approx0.04$& 0 & unknown \\
\hline
\end{tabular}
\end{center}
\medskip
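\noindent
As a rough sketch of how these truncated Gaussian priors could be evaluated (not the exact code I ran), the snippet below uses \texttt{scipy.stats.truncnorm}; the bounds and moments come from the table above, the snow thickness standard deviation is passed in separately because it is governed by a hyperprior, and the function names are illustrative.
\begin{verbatim}
import numpy as np
from scipy.stats import truncnorm

def log_truncnorm(value, mean, std, lower, upper):
    """Log-density of a Gaussian(mean, std) truncated to [lower, upper]."""
    a, b = (lower - mean) / std, (upper - mean) / std  # standardized bounds
    return truncnorm.logpdf(value, a, b, loc=mean, scale=std)

def log_prior(x, std_S):
    """Log-prior on x = (rho_w, rho_i, rho_s, H, S)."""
    rho_w, rho_i, rho_s, H, S = x
    return (log_truncnorm(rho_w, 1020, 7, 1010, 1030)
            + log_truncnorm(rho_i, 910, 80, 750, 940)
            + log_truncnorm(rho_s, 300, 90, 100, 400)
            + log_truncnorm(H, 2.0, 2.0, 0.1, 5.0)
            + log_truncnorm(S, 0.15, std_S, 0.0, np.inf))
\end{verbatim}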
Plots of the prior are shown on the next page. Note that the snow thickness plot is not a ``true'' prior; it is only an estimate
based on the hyperprior I will define next.
\begin{figure}[p]
\caption{Plots of Prior Densities}
\includegraphics[width=0.5\textwidth]{prior_rw}
\includegraphics[width=0.5\textwidth]{prior_ri}
\includegraphics[width=0.5\textwidth]{prior_rs}
\includegraphics[width=0.5\textwidth]{prior_H}
\includegraphics[width=0.5\textwidth]{prior_S}
\end{figure}
\newpage
\subsection{Hyperprior Formulation and Densities}
As mentioned in the assumptions, I chose to use inverse-Gamma densities as hyperpriors on the additive noise variance as well as the snow depth variance.
The inverse-Gamma distribution has two parameters in its typical parametrization: $\alpha$ and $\beta$. These are largely arbitrary for my usage,
so I did a re-parametrization in terms of mean and variance, and used identities to convert them back to $\alpha$ and $\beta$ (from Wikipedia):
\[
\mu = \frac{\beta}{\alpha - 1} \text{ for } \alpha > 1; \>
\sigma^2 = \frac{\beta^2}{(\alpha-1)^2(\alpha-2)} \text{ for } \alpha > 2
\]
I chose hyperprior means equal to my estimates of the variances of the snow depth and error noise, which are the squared standard deviations.
For snow depth, I chose a semi-arbitrary estimate of 0.2 meters as mentioned earlier. For additive noise, I estimated three centimeters
based on the description of ``accurate to a few centimeters'' on IceSat-2. I chose the hyperprior variances assuming that the standard deviations
would be equal to the means. After that, I used the sympy package in Python to solve the equations for the hyper-hyper-parameters $\alpha$ and $\beta$.
Additionally, I added a small ``nugget'' to each distribution to shift it a tiny bit away from zero to aid computation.
\medskip
\begin{center}
\begin{tabular} { |c|c|c|c|c|c|c| }
\hline
hyperparameter &expected mean &std.\ dev& variance &$\alpha$& $\beta$ & nugget \\
\hline
$\sigma^2_\epsilon$&$0.03^2=$ 9.0e-4 & 9e-4 & 8.1e-7 & 3.0 & 0.0018 & 0.0001 \\
$\sigma^2_S$ &$0.2^2=$ 0.04 & 0.04 & 1.6e-3 & 3.0 & 0.08 & 0.003 \\
\hline
\end{tabular}
\end{center}
\medskip
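\noindent
For reference, the closed form that these identities yield can be sketched directly in Python (the function name is illustrative; I obtained the same values by solving the identities with sympy's \texttt{solve}).
\begin{verbatim}
def inverse_gamma_hyperparams(mean, variance):
    """Convert a desired (mean, variance) into inverse-Gamma (alpha, beta).

    These follow from solving mu = beta/(alpha-1) and
    sigma^2 = beta^2/((alpha-1)^2 (alpha-2)) for alpha and beta.
    """
    alpha = 2.0 + mean**2 / variance
    beta = mean * (alpha - 1.0)
    return alpha, beta

# Noise variance: hyperprior mean 0.03**2, std. dev. equal to the mean.
print(inverse_gamma_hyperparams(0.03**2, (0.03**2)**2))  # (3.0, 0.0018)
# Snow thickness variance: hyperprior mean 0.2**2, std. dev. equal to the mean.
print(inverse_gamma_hyperparams(0.2**2, (0.2**2)**2))    # (3.0, 0.08)
\end{verbatim}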
\begin{figure}[h]
\caption{Plots of Hyperprior Densities}
\includegraphics[width=0.5\textwidth]{hypprior_var_eps.png}
\includegraphics[width=0.5\textwidth]{hypprior_var_S.png}
\end{figure}
\newpage
\subsection{Forward Model and Likelihood Function}
Since I assume an additive Gaussian noise error, the likelihood function is a simple extension of the physical forward model.
Let $x$ be the five-dimensional vector containing the parameters $\rho_w$, $\rho_i$, $\rho_s$, $H$, and $S$,
and let $g(x)$ be the forward model predicting some freeboard value $F_{pred}$ from parameters $x$.
We can then describe the likelihood function as a Gaussian distribution with mean zero and variance $ \sigma^2_\epsilon$,
evaluated at the difference $F - g(x)$ as observed minus predicted.
\[
F - g(x) \,|\, x, \sigma^2_\epsilon \sim N(0, \sigma^2_\epsilon),
\quad\text{i.e.,}\quad
f(F \,|\, x, \sigma^2_\epsilon) = N\!\left(F - g(x);\, 0, \sigma^2_\epsilon\right)
\]
This is notably dependent on a variable noise variance, which I will explore in the next subsection on the hierarchical model.
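\medskip
\noindent
A minimal sketch of this likelihood evaluation in Python, assuming the hypothetical \texttt{forward\_model} from the earlier sketch is in scope, could be:
\begin{verbatim}
import numpy as np
from scipy.stats import norm

def log_likelihood(F_obs, x, var_eps):
    """Gaussian log-likelihood of the observed freeboard given parameters x
    and additive-noise variance var_eps."""
    residual = F_obs - forward_model(x)   # observed minus predicted
    return norm.logpdf(residual, loc=0.0, scale=np.sqrt(var_eps))
\end{verbatim}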
\subsection{Hierarchical Model and Posterior Formulation}
Now I will proceed to discuss the posterior of the model, which contains hierarchical components regarding the noise and snow variances.
Traditionally, a non-hierarchical Bayesian model follows Bayes' rule, where the posterior is proportional to the prior times the likelihood:
\[
f(x | F) \propto f(F | x) f(x)
\]
However, we have a hierarchical model which tries to incorporate the observation and obtain posterior hyperparameters as well.
These posterior hyperparameters show up here:
\[
f(x, \sigma^2_\epsilon, \sigma^2_S | F) \propto f(F | x, \sigma^2_\epsilon) f(\sigma^2_\epsilon | \alpha_\epsilon, \beta_\epsilon)
f(x | \sigma^2_S) f(\sigma^2_S | \alpha_S, \beta_S)
\]
I have dropped a few terms in this equation due to conditional independence. For example, $\sigma^2_S$ does not show up
in the likelihood function because $F$ is conditionally independent of $\sigma^2_S$ given $x$.
I also make use of my assumption that the two hyperparameter variances are independent, so their joint distribution factors
and each reduces to the term we are interested in when the other is held constant.
Finally, I have left the $\alpha$ and $\beta$ terms out of the conditioning on the left side to save space; they are just constants here,
but I keep them on the right side to make clear which terms are inverse-Gamma hyperpriors.
\\\\
I will elaborate on the specific computation of this posterior, as well as the hierarchical separation of the parameters and hyperparameters,
in the next section.
\subsection{Expected Value of Ice Thickness}
Obtaining the answer is relatively simple after computing the posterior density. Since I am purely making an inference on the parameters,
I can just compute $\mathbb{E}[x_{post}]$ and then extract the specific expected value and posterior density on ice thickness $H$.
\newpage
\section{Solution and Computation Strategy}
\subsection{Non-Conjugacy and Markov Chain Monte Carlo}
My model is non-conjugate in two areas. First, the physical model is simple but non-linear, so no linear algebra identities can be used to
analytically derive the distributions. Second, I chose to use truncated Gaussians, whose normalizing constants are awkward enough to rule out conjugacy
on their own. These conditions leave sampling as the best technique for approximating the posterior, since sampling lets us rely on the Central Limit
Theorem and avoid the exponential growth in computational cost that numerical integration of a multi-dimensional density would require.
\\\\
Within sampling techniques, I chose to use Markov Chain Monte Carlo due to the difficulty of generating direct samples from the truncated prior density.
\subsection{Hierarchical MCMC and Gibbs Sampling}
With a hierarchical model, it is easier to break up each sampling step into three distinct steps: one for the parameters $x$
and one for each hyperparameter $\sigma^2_\epsilon$ and $\sigma^2_S$. This technique, called Gibbs sampling, is computationally easier
to evaluate. In each sub-step, I can hold the other variables constant, reducing the amount of required computation. In addition,
I free up the hyperparameters to be sampled independently of the parameters. Gibbs sampling will necessitate a modification to
the formulation where each parameter and hyperparameter is isolated in the posterior.
\\\\
For the sub-step involving the parameters $x$ while the hyperparameters are held constant, the full conditional on $x$ looks like this.
It can then be simplified by dropping $\sigma^2_S$ in the likelihood function and $\sigma^2_\epsilon$ in the prior,
due to our independence and conditional independence assumptions on the parameters.
\begin{eqnarray*}
f(x | F, \sigma^2_\epsilon, \sigma^2_S) & \propto & f(F | x, \sigma^2_\epsilon, \sigma^2_S) f(x | \sigma^2_\epsilon, \sigma^2_S) \\
& \propto & f(F | x, \sigma^2_\epsilon) f(x | \sigma^2_S)
\end{eqnarray*}
For the sub-step involving $\sigma^2_\epsilon$ while $x$ and $\sigma^2_S$ are held constant, the full conditional looks like this.
It can be simplified by dropping the dependence on $\sigma^2_S$ entirely (since I assume the two hyperpriors are independent),
and dropping both dependencies on $x$ and $\sigma^2_S$ in the hyperprior (again due to the same independence assumptions).
\begin{eqnarray*}
f(\sigma^2_\epsilon | F, x, \sigma^2_S) & \propto & f(F | x, \sigma^2_\epsilon, \sigma^2_S) f(\sigma^2_\epsilon | x, \sigma^2_S) \\
& \propto & f(F | x, \sigma^2_\epsilon) f(\sigma^2_\epsilon)
\end{eqnarray*}
Finally, for the sub-step involving $\sigma^2_S$ while $x$ and $\sigma^2_\epsilon$ are held constant, the full conditional
looks like this:
\begin{eqnarray*}
f(\sigma^2_S | F, x, \sigma^2_\epsilon) & \propto & f(F | x, \sigma^2_\epsilon, \sigma^2_S) f(\sigma^2_S | x, \sigma^2_\epsilon) \\
& \propto & f(F | x, \sigma^2_\epsilon) f(x | \sigma^2_S) f(\sigma^2_S)
\end{eqnarray*}
In the $\sigma^2_S$ step, I first simplify by dropping $\sigma^2_S$ in the likelihood due to conditional independence given fixed $x$.
Next, I must expand the hyperprior, since I do not have a calculation for the density on $\sigma^2_S$ given $x$.
I do this expansion using Bayes' rule to ``flip'' the density into something I can calculate, which is just a prior and hyperprior.
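\medskip
\noindent
To make these three conditionals concrete, the sketch below assembles each one (up to a constant) from the hypothetical \texttt{log\_prior} and \texttt{log\_likelihood} helpers of the earlier sketches; the inverse-Gamma hyperpriors use \texttt{scipy.stats.invgamma}, with the nugget supplied as a location shift. The function names are illustrative, not the exact code I ran.
\begin{verbatim}
import numpy as np
from scipy.stats import invgamma

def log_hyperprior(var, alpha, beta, nugget=0.0):
    """Inverse-Gamma(alpha, beta) log-density, shifted by a small nugget."""
    return invgamma.logpdf(var, alpha, loc=nugget, scale=beta)

def log_cond_x(x, F_obs, var_eps, var_S):
    # f(x | F, var_eps, var_S): likelihood times prior
    return log_likelihood(F_obs, x, var_eps) + log_prior(x, np.sqrt(var_S))

def log_cond_var_eps(var_eps, x, F_obs):
    # f(var_eps | F, x, var_S): likelihood times its inverse-Gamma hyperprior
    if var_eps <= 0:
        return -np.inf
    return (log_likelihood(F_obs, x, var_eps)
            + log_hyperprior(var_eps, 3.0, 0.0018, nugget=0.0001))

def log_cond_var_S(var_S, x):
    # f(var_S | F, x, var_eps): the likelihood is constant in var_S and is
    # dropped; what remains is the prior on x times the hyperprior on var_S.
    if var_S <= 0:
        return -np.inf
    return (log_prior(x, np.sqrt(var_S))
            + log_hyperprior(var_S, 3.0, 0.08, nugget=0.003))
\end{verbatim}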
\subsection{Metropolis-Within-Gibbs and Proposal Distribution}
Using these new posterior distributions, I executed my Gibbs sampling using the Metropolis-Hastings rule for each sub-step.
Together with Gibbs sampling, this creates three ``orthogonal'' walking steps per iteration, and the overall combination of these
techniques is called Metropolis-Within-Gibbs sampling.
\\\\
For the Metropolis-Hastings rule, I chose to use either univariate or multivariate Gaussians for all proposal densities.
This is convenient because it removes the need to consider the proposal ratios when calculating the acceptance ratio,
since Gaussians are symmetric. I scaled the proposal standard deviations of the parameters together,
such that they were equally proportional to their respective prior standard deviations. I chose that scaling factor
rather arbitrarily through trial-and-error in an attempt to make the autocorrelation as low as possible.
The proposal standard deviations of my hyperparameters were each chosen independently as well.
The standard deviations are listed below:
\medskip
\begin{center}
\begin{tabular} { |c|c|c|c|c|c|c| }
\hline
$\rho_w$ & $\rho_i $& $\rho_s$ & $H$ & $S$ & $\sigma^2_\epsilon$ & $\sigma^2_S$ \\
\hline
1.96 & 22.4 & 25.2 &0.56 &0.056& 1.05e-3 & 0.05 \\
\hline
\end{tabular}
\end{center}
\medskip
Note that since my Metropolis-Hastings step with regard to the first five parameters is contained within one vector $x$,
the five standard deviations make up a diagonal covariance matrix, which encodes the independence assumption on the components
(consistent with my prior assumptions). As I stated in the assumptions section, I will revisit the impact of this in the discussion section.
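\medskip
\noindent
Because the proposals are symmetric, the acceptance ratio reduces to a ratio of posterior densities. A generic accept/reject helper, sketched here in log space with illustrative names, can then be shared by all three sub-steps.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng()

def metropolis_step(current, log_density, proposal_std):
    """One symmetric-proposal Metropolis step on a scalar or vector state."""
    noise = proposal_std * rng.standard_normal(np.shape(current))
    proposed = current + noise
    # Symmetric proposal: the acceptance ratio is just the density ratio.
    log_gamma = log_density(proposed) - log_density(current)
    if np.log(rng.uniform()) < log_gamma:
        return proposed
    return current
\end{verbatim}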
\subsection{Overview of Software Tools}
To do this computation, I have used the Jupyter Notebook interface with classic Python scientific computing and statistics packages
such as Matplotlib, Numpy, Scipy, and Sympy. I additionally import a specific function from Pandas for a more sophisticated plot,
although I do not use Pandas for any data manipulation.
\\\\
I wrote most of the code myself, with small parts taken from Dr.\ Parno's example code in places such as plotting, MCMC setup,
and autocorrelation analysis.
\newpage
\subsection{Pseudocode}
Here, I will outline the general algorithmic process I take to compute the Markov Chain Monte Carlo.
Notably, I will work with the logarithms of all calculated densities for numerical stability.
Additionally, I chose to record 10,000 iterations, since my autocorrelation was rather high.
\medskip
\begin{algorithm}[H]
\SetAlgoLined
\KwResult{Samples from the posterior distributions of the parameters and hyperparameters}
Define constants\\
Set up functions to evaluate the log-prior\\
Set up functions to evaluate the log-hyperpriors\\
Set up functions to evaluate the log-likelihood function\\
Create initial conditions for MCMC and prepare proposal distribution\\
\For{range from 1 to 10000}{
propose new point for $x$ and compute the acceptance ratio $\gamma$\\
\eIf{random number 0 to 1 $< \gamma$}{
set next $x$ to proposed $x$\\
}{
set next $x$ to previous $x$\\
}
propose new point for $\sigma^2_\epsilon$ and compute the acceptance ratio $\gamma$\\
\eIf{random number 0 to 1 $< \gamma$}{
set next $\sigma^2_\epsilon$ to proposed $\sigma^2_\epsilon$\\
}{
set next $\sigma^2_\epsilon$ to previous $\sigma^2_\epsilon$\\
}
propose new point for $\sigma^2_S$ and compute the acceptance ratio $\gamma$\\
\eIf{random number 0 to 1 $< \gamma$}{
set next $\sigma^2_S$ to proposed $\sigma^2_S$\\
}{
set next $\sigma^2_S$ to previous $\sigma^2_S$\\
}
}
Record, plot, and analyze results\\
Compute Integrated Autocorrelation time and standard error\\
Check Posterior Predictive\\
\caption{Metropolis-Within-Gibbs MCMC}
\end{algorithm}
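\medskip
\noindent
A condensed Python sketch of this loop, built from the hypothetical helpers in the earlier sketches (the names and starting values are illustrative, not the exact code I ran), might look like this.
\begin{verbatim}
import numpy as np

F_obs = 0.1                                       # observed freeboard (m)
x = np.array([1020.0, 910.0, 300.0, 2.0, 0.15])   # start at the prior means
var_eps, var_S = 0.03**2, 0.2**2                  # start at hyperprior means

prop_std_x = np.array([1.96, 22.4, 25.2, 0.56, 0.056])
samples = []

for _ in range(10_000):
    # Sub-step 1: update x with the hyperparameters held fixed.
    x = metropolis_step(
        x, lambda z: log_cond_x(z, F_obs, var_eps, var_S), prop_std_x)
    # Sub-step 2: update the additive-noise variance.
    var_eps = metropolis_step(
        var_eps, lambda v: log_cond_var_eps(v, x, F_obs), 1.05e-3)
    # Sub-step 3: update the snow thickness variance.
    var_S = metropolis_step(
        var_S, lambda v: log_cond_var_S(v, x), 0.05)
    samples.append((x.copy(), var_eps, var_S))
\end{verbatim}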
\newpage
\section{Results}
\subsection{Sampling Results and Posterior Densities}
After 10,000 iterations, I recorded the following posterior plots:
\begin{figure}[h]
\caption{Scatter Matrix of Posterior on $x$}
\includegraphics[width=\textwidth]{posterior_matrix.png}
\end{figure}
\begin{figure}[h]
\caption{Posterior Histograms on Hyperparameters and Ice Thickness}
\includegraphics[width=\textwidth]{posteriors.png}
\end{figure}
Out of the five primary parameters, most of the densities remained similar to the priors. However, the snow density $\rho_s$ was shifted down slightly,
and the variance of the ice thickness $H$ was significantly tightened. Furthermore, observing the scatter matrix, significant covariances were
found in the posterior. This makes sense; our physical model contains many ratios, which means that the freeboard will stay constant when
certain parameters move in tandem.
\\\\
The posteriors on the error and snow variances have become extremely tight, as seen in the histograms. The ice thickness
posterior has grown reasonably tight too. Whereas the prior had a mean and standard deviation of 2.0 meters each, the posterior
has a mean of 1.34 meters and standard deviation of only 0.55 meters. This is still a pretty wide curve, but is an insightful result
compared to the absolute uncertainty of the prior.
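\medskip
\noindent
For reference, a short sketch of how such summary statistics could be pulled from the stored samples (assuming the hypothetical \texttt{samples} list from the MCMC sketch):
\begin{verbatim}
import numpy as np

# Ice thickness H is the fourth component of each stored x sample.
H_samples = np.array([x[3] for x, var_eps, var_S in samples])
print(H_samples.mean(), H_samples.std())  # posterior mean and std. dev. of H
\end{verbatim}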
\subsection{Central Limit Theorem, Autocorrelation, and Error Analysis}
Since I am estimating expectations from samples, the Central Limit Theorem governs the accuracy of those estimates. However, because I am working with Markov Chain Monte Carlo,
the samples are not independent, so the strict CLT does not apply directly. To use the CLT, I performed autocorrelation analysis to quantify
the lack of independence between samples.
\begin{figure}[h]
\centering
\caption{Autocorrelation Functions of Posterior w.r.t. Lag}
\includegraphics[width=0.85\textwidth]{autocorr.png}
\end{figure}
\\\\
The autocorrelation functions for the hyperpriors $\sigma^2_S$ and $\sigma^2_\epsilon$ look excellent, quickly decaying to zero,
but the five parameters in $x$ are a bit lackluster. They take a long time to decay, and even have a low negative autocorrelation
for large parts of the chain.
\\\\
By integrating the autocorrelation functions, we obtain the integrated autocorrelation time (IACT). This value serves as a coefficient
in the modified CLT for MCMC. The IACTs for all seven parameters/hyperparameters are below:
\medskip
\begin{center}
\begin{tabular} { |c|c|c|c|c|c|c| }
\hline
$\rho_w$ & $\rho_i $& $\rho_s$ & $H$ & $S$ & $\sigma^2_\epsilon$ & $\sigma^2_S$ \\
\hline
37.81 & 12.36 & 58.08 &15.86&35.04& 1.93 & 1.41 \\
\hline
\end{tabular}
\end{center}
\medskip
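\noindent
As a sketch of how these quantities could be obtained, the snippet below estimates the autocorrelation function of a single chain, integrates it to get the IACT, and forms the MCMC standard error; truncating the sum at the first negative value is one common heuristic, not necessarily the exact rule used here.
\begin{verbatim}
import numpy as np

def autocorrelation(chain):
    """Normalized autocorrelation function of a 1-D chain of samples."""
    centered = chain - chain.mean()
    acf = np.correlate(centered, centered, mode='full')[len(chain) - 1:]
    return acf / acf[0]

def iact_and_stderr(chain):
    """Integrated autocorrelation time and MCMC standard error of the mean."""
    chain = np.asarray(chain, dtype=float)
    rho = autocorrelation(chain)
    negative = np.where(rho < 0)[0]
    cutoff = negative[0] if negative.size else len(rho)  # truncate the sum
    iact = 1.0 + 2.0 * rho[1:cutoff].sum()
    stderr = np.sqrt(iact * chain.var() / len(chain))
    return iact, stderr
\end{verbatim}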
As expected, the IACT indicates fast convergence for the hyperparameters, while the parameters lag behind. This will negatively impact the
standard error. In addition, something to note is that I did not account for modeling error from negative autocorrelation. The standard
error should perhaps be even higher to account for this, which is a potential limitation of my model. The full summary statistics of the posterior,
along with calculated standard error, are below:
\medskip
\begin{center}
\begin{tabular} { |c|c|c|c|c|c|c|c| }
\hline
&$\rho_w$ & $\rho_i $& $\rho_s$ & $H$ & $S$ & $\sigma^2_\epsilon$ & $\sigma^2_S$ \\
\hline
mean & 1019.91 &887.81 & 267.57 & 1.34& 0.23& 0.001 & 0.043 \\
std.\ dev & 4.99 & 40.25 & 69.47 &0.55 & 0.15& 0.003 & 0.0123 \\
std.\ error & 0.31 & 1.42 & 5.29 &0.02 &0.01& 4.0e-6 & 1.46e-4 \\
\hline
\end{tabular}
\end{center}
\medskip
\subsection{Posterior Predictive}
As a final check on my model, I used the samples of the Markov Chain Monte Carlo to also sample a posterior predictive on freeboard.
I did this by stepping through the iterations and computing the forward model plus a noise value with the current $\sigma^2_\epsilon$.
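\medskip
\noindent
A sketch of this posterior predictive computation, looping over the hypothetical \texttt{samples} list and \texttt{forward\_model} from the earlier sketches:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng()

# One predicted freeboard per stored posterior sample: forward model output
# plus Gaussian noise drawn with that sample's noise variance.
F_pred = np.array([forward_model(x) + rng.normal(0.0, np.sqrt(var_eps))
                   for x, var_eps, var_S in samples])
\end{verbatim}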
\begin{figure}[h]
\centering
\caption{Posterior Predictive}
\includegraphics[width=0.4\textwidth]{predictive.png}
\end{figure}
\\\\
The posterior predictive is understandably very wide to account for a lot of uncertainty, but our measured value of 0.1 is in the middle,
so this looks qualitatively reasonable.
\section{Discussion}
\subsection{Sea Ice Thickness}
The original goal of this analysis was to find a posterior on the sea ice thickness, and I have found one. My model estimates that
the expected value of the ice thickness producing the 0.1 meter freeboard measurement is 1.34 meters. As stated earlier,
my model admits a wide range of possibilities, with a 0.55 meter standard deviation, but we nonetheless have a reasonable expectation now.
Additionally, there is a low standard error of 0.02 meters on this expected value, so we can be reasonably sure that our expectation
is accurate. With further measurements, this posterior can be further shrunk to more precisely show a range of thickness values.
\subsection{Observations on Posterior}
The posterior revealed a lot about the internal structure of the model despite only having a single freeboard measurement to work off of.
For example, it clearly learned the nature of the covariances between the parameters without any additional input. Looking at the
scatter matrix, we can see that ice density and ice thickness have a strong positive covariance, as do ice thickness and snow thickness.
These both reflect intuitive ideas about the physics of sea ice. The denser the ice is, the thicker it will need to be to have enough buoyancy
to stay floating at the same freeboard. Similarly, the more snow there is, the more ice will be needed to counterbalance the snow mass with
buoyancy as well. There are likely more nuances in the model that depend on three or more variables acting in tandem, but even at
first glance, this insight is fascinating yet intuitive.
\\\\
Some additional observations can be gleaned as well. The model predicts a higher ice density than we originally set in the prior,
and a lower snow density, although the snow variance is very wide. Furthermore, it predicts a noise standard deviation of only one centimeter,
which puts pretty high confidence on IceSat-2's precision.
\subsection{Revisiting Assumptions and Potential Bias}
The model contains a lot of insight, but relies on several assumptions, some of which may be problematic. First of all,
the model seems to not have affected the water and snow densities much. Selecting a different prior may have made a large difference
as a result. In general, this model relies heavily on preexisting knowledge because of the tiny amount of data present.
\\\\
Another potential source of bias is the independence assumption between the model parameters, and consequently the choice
of a diagonal covariance matrix for the proposal distribution. As we can see in the posterior scatter matrix, there is clearly
strong covariance in the model, which does not agree with our prior. I suspect that this strong covariance has to do with the difficulty I
experienced in getting the autocorrelation functions of my Markov chains to decay, and it may also be the reason behind the negative autocorrelation.
When proposals are drawn from a zero-covariance distribution, the acceptance rate will be artificially deflated, since many of the proposal steps will go in the
``wrong'' direction relative to the posterior covariance. This results in an unusually small effective step size, which can increase autocorrelation. Furthermore,
the high-covariance nature of the posterior may lend itself to a ``yo-yo''-like back-and-forth in accepted proposal steps, which can cause negative
autocorrelation. This negative autocorrelation may be artificially deflating the standard errors of my parameters. However, I still have a large sample size,
so the convergence of the model should still be sufficient compared to how much uncertainty is already present.
\subsection{Conclusion}
In conclusion, the model has its issues, but the general uncertainty of the model combined with the large sample size allow its inferences to be valid.
We can be reasonably sure that the model's wide range of predictions is true to life, as long as our prior distributions are accurate as well.
With further data, we can hopefully bring our 1.34 meter estimate of the sea ice thickness closer to the true value.
\end{document}