---
title: "Lesson 11 Inference for One Mean (Sigma Unknown)"
output:
  html_document:
    theme: cerulean
    toc: true
    toc_float: false
---
<script type="text/javascript">
function showhide(id) {
var e = document.getElementById(id);
e.style.display = (e.style.display == 'block') ? 'none' : 'block';
}
</script>
<div style="width:50%;float:right;">
#### Optional Videos for this Lesson {.tabset .tabset-pills}
##### Part 1
<iframe id="kaltura_player_1644540668" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1644540668&entry_id=1_yvuyewxm" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 2
<iframe id="kaltura_player_1644540833" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1644540833&entry_id=1_aj1ls83v" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 3
<iframe id="kaltura_player_1644541018" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1644541018&entry_id=1_0so3fjua" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 4
<iframe id="kaltura_player_1644542243" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1644542243&entry_id=1_x0hwk1gr" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
####
</div><div style="clear:both;"></div>
## Lesson Outcomes
By the end of this lesson, you should be able to do the following with regard to confidence intervals and hypothesis testing.
**Regarding Confidence Intervals for a single mean with $\sigma$ unknown:**
* Calculate and interpret a confidence interval for a population mean given a confidence level.
* Identify a point estimate and margin of error for the confidence interval.
* Show the appropriate connections between the numerical and graphical summaries that support the confidence interval.
* Check the requirements for the confidence interval.
**Regarding Hypothesis Testing for a single mean with $\sigma$ unknown:**
* State the null and alternative hypotheses.
* Calculate the test-statistic, degrees of freedom and p-value of the hypothesis test.
* Assess the statistical significance by comparing the p-value to the α-level.
* Check the requirements for the hypothesis test.
* Show the appropriate connections between the numerical and graphical summaries that support the hypothesis test.
* Draw a correct conclusion for the hypothesis test.
<br>
## What If We Don't Know $\sigma$?
<img src="./Images/WilliamSealyGosset-NoCopyright.png">
In practice, we almost never know the population standard deviation, $\sigma$. So, it is generally not appropriate to use the formula
$$
z = \frac{ \bar x - \mu }{ \sigma / \sqrt{n} }
$$
In 1908, William Sealy Gosset published a solution to this problem<!--<cite>Gosset08</cite>-->. He found a way to appropriately compute the confidence interval for the mean when $\sigma$ is not known. The basic idea is to use the sample standard deviation, $s$, in the place of the true population standard deviation, $\sigma$. If $\sigma$ is not known, we cannot base the calculations on the standard normal distribution, and we cannot use the formula above to conduct hypothesis tests.
In a remarkable piece of work, Gosset found the appropriate distribution to use when $\sigma$ is unknown. At the time of this discovery, Gosset worked for the Guinness brewery. To avoid problems with industrial espionage, Guinness prohibited employees from publishing any research results. Knowing that his work was a significant contribution to statistics, Gosset convinced his employer to allow him to publish under a fictitious name. He chose the pseudonym "Student". Gosset's test statistic was denoted by the letter $t$, and its distribution has come to be known as **Student's t-distribution**.
$$
t = \frac{ \bar x - \mu }{ s / \sqrt{n} }
$$
The $t$-distribution is bell-shaped and symmetrical. The $t$-distribution has a mean of 0, but it has more area in the tails than the standard normal distribution. The exact shape of the $t$-distribution depends on a parameter called the **degrees of freedom** (abbreviated $df$). The degrees of freedom is related to the sample size. As the sample size goes up, the degrees of freedom increase accordingly. For the procedures discussed in this lesson, the degrees of freedom is one less than the sample size: $df = n-1$.
[Click here](http://byuimath.com/apps/normprobwitht.html){target="_blank"} to explore how the shape of the $t$-distribution changes with the degrees of freedom. Notice that as the degrees of freedom increases, the shape of the $t$-distribution (dark curve) gets closer to the standard normal distribution (lighter curve). The lighter gray curve is identical (always normal) in the three images below, while the darker curve changes shape as the degrees of freedom increase.
<center>
<table style="table-layout: fixed; width: 100%;">
<thead>
<tr class="header">
<th style="text-align:center;"><p>$t$-distribution<br />
with $df = 1$</p></th>
<th style="text-align:center;"><p>$t$-distribution<br />
with $df = 5$</p></th>
<th style="text-align:center;"><p>$t$-distribution<br />
with $df = 15$</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="width:33.3%;">
```{r, echo=FALSE, fig.width=4, fig.height=4}
par(mai=c(.6,.6,.3,.1))
curve(dnorm(x), xlim=c(-3.2,3.2), col="gray", lwd=3, xlab="", ylab="")
curve(dt(x, 1), add=TRUE, col="gray44", lwd=4)
abline(h=0, lty=1, col="gray")
legend("topright", legend=c("Normal", "t"), col=c("gray","gray44"), lwd=c(2,4), bty="n", cex=.8)
mtext(side=2, las=2, "density", at=0.43, line=.3)
```
</td>
<td style="width:33.3%;">
```{r, echo=FALSE, fig.width=4, fig.height=4}
par(mai=c(.6,.6,.3,.1))
curve(dnorm(x), xlim=c(-3.2,3.2), col="gray", lwd=3, xlab="", ylab="")
curve(dt(x, 5), add=TRUE, col="gray44", lwd=4)
abline(h=0, lty=1, col="gray")
legend("topright", legend=c("Normal", "t"), col=c("gray","gray44"), lwd=c(2,4), bty="n", cex=.8)
mtext(side=2, las=2, "density", at=0.43, line=.3)
```
</td>
<td style="width:33.3%;">
```{r, echo=FALSE, fig.width=4, fig.height=4}
par(mai=c(.6,.6,.3,.1))
curve(dnorm(x), xlim=c(-3.2,3.2), col="gray", lwd=3, xlab="", ylab="")
curve(dt(x, 15), add=TRUE, col="gray44", lwd=4)
abline(h=0, lty=1, col="gray")
legend("topright", legend=c("Normal", "t"), col=c("gray","gray44"), lwd=c(2,4), bty="n", cex=.8)
mtext(side=2, las=2, "density", at=0.43, line=0.3)
```
</td>
</tr>
<tr class="even">
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</center>
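
The same convergence can be checked numerically. The sketch below uses base R's `qt()` and `qnorm()` to compare the value that cuts off the upper 2.5% of the $t$-distribution with the matching standard normal value. The degrees of freedom are the three plotted above plus 147, which is added here only for illustration.

```{r}
# Degrees of freedom to compare (1, 5, 15 match the plots; 147 is illustrative)
dfs <- c(1, 5, 15, 147)

# Upper 2.5% critical values of the t-distribution for each df
t_crit <- qt(0.975, df = dfs)

# Matching critical value of the standard normal distribution (about 1.96)
z_crit <- qnorm(0.975)

# The t critical values shrink toward the normal value as df grows
data.frame(df = dfs, t_crit = round(t_crit, 3), z_crit = round(z_crit, 3))
```

Because the $t$-distribution has more area in its tails, each of its critical values is larger than the normal one, and the gap narrows as the degrees of freedom increase.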
<!-- <center> -->
<!-- {| class="wikitable" -->
<!-- |- -->
<!-- ! $t$-distribution <br> with $df = 1$ -->
<!-- ! $t$-distribution <br> with $df = 5$ -->
<!-- ! $t$-distribution <br> with $df = 15$ -->
<!-- |- -->
<!-- | <img src="./Images/T-dist-df-01.png"> -->
<!-- | <img src="./Images/T-dist-df-05.png"> -->
<!-- | <img src="./Images/T-dist-df-15.png"> -->
<!-- |- -->
<!-- | align="center" colspan="3" | (Image credit: Webster West, http://www.stat.tamu.edu/~west/applets/tdemo1.html) <!-- We do not have permission to use these images....-->
<!-- |} -->
<!-- </center> -->
<br>
## Hypothesis Tests
<img src="./Images/StepsAll.png">
### Body Temperatures Revisited
We will apply the $t$-distribution to the body temperature data we explored previously.
This hypothesis test is conducted in a manner similar to a test for a single mean where $\sigma$ is known, except that instead of using the population standard deviation, $\sigma$, in the calculations, we estimate this value using the sample standard deviation, $s$. This leads to a $t$-distribution, rather than a normal distribution, for the test statistic. We will not need to compute the value of this test statistic by hand. It will be computed using software.
<img src="./Images/Step1.png">
**Summarize the relevant background information**
We want to conduct a hypothesis test to determine if the mean body temperature is different from 98.6° Fahrenheit. Previously, we assumed that we knew the value of $\sigma$. Actually, this value is not known.
**State the null and alternative hypotheses and the level of significance**
$$
\begin{array}{rl}
H_0: \mu = 98.6\\
H_a: \mu \ne 98.6\\
\end{array}
$$
$$
\alpha = 0.05
$$
<img src="./Images/Step2.png">
**Describe the data collection procedures**
We will use the body temperature data, [BodyTemp.xlsx](./Data/BodyTemp.xlsx), collected by Dr. Mackowiak and his colleagues to conduct the test. <!-- <cite>Mackowiak92</cite> -->
<img src="./Images/Step3.png">
**Give the relevant summary statistics**
$$
\begin{array}{l}
\bar x = 98.23\\
s = 0.738\\
n = 148
\end{array}
$$
**Make an appropriate graph (e.g. a histogram) to illustrate the data**
<img src="./Images/BodyTemp-Histogram-LoRes.png">
<img src="./Images/Step4.png">
**Verify the requirements have been met**
- We assume that the individuals chosen to participate in the study represent a (simple) random sample from the population.
- $\bar x$ will be normally distributed, because the sample size is large. (Note: We could have also noticed that the body temperature data appear to be normally distributed, so even with a small sample size, $\bar x$ would be normal.)
**Give the test statistic and its value**
We will need to conduct the analysis using software, so we can report this value.
**Instructions for conducting a test for one mean with $\sigma$ unknown:**
<br>
<!-- To access this content, scroll to the bottom of the editing page and click on the link "Software:(Excel or SPSS)-(PageName)" -->
<div class="SoftwareHeading">Excel Instructions</div>
<div class="Summary">
Here are the instructions for conducting the one sample t-test in Excel:
The Excel file needed for this analysis is [Math 221 Statistics Toolbox](./Data/Math221StatisticsToolbox.xltx). We will use this Excel file to conduct the hypothesis tests for a single mean with $\sigma$ unknown.
+ After opening the file, please click on the tab labeled "One-sample t-test".
+ Paste the data in the appropriate place in Column A.
+ Set the null hypothesis value in cell M5. For this example, the value should be 98.6.
+ Click in cell L6 and use the drop-down menu to set the alternative hypothesis to: "Not Equal To"
The image below shows the Excel file after these changes have been made.
<center><img src="./Images/BodyTemp-tTest-Excel_Toolbox.PNG"></center>
Consider each of the following alternative hypotheses that could be used in a test for a single mean where the null hypothesis is $H_0: \mu = 98.6$.
- $H_a: \mu \ne 98.6$
+ Choose "Not Equal To" in the drop-down menu in cell L6.
- $H_a: \mu < 98.6$
+ Choose "Less Than" in the drop-down menu in cell L6.
- $H_a: \mu > 98.6$
+ Choose "Greater Than" in the drop-down menu in cell L6.
You will have opportunities to practice using each of these.
<br>
</div>
<br>
The interpretation of the results will follow the pattern established in the previous hypothesis tests. If the $P$-value is less than the $\alpha$ level, we reject the null hypothesis. If the $P$-value is greater than the $\alpha$ level, we fail to reject the null hypothesis. This is true for every hypothesis test.
The test statistic is $t$, and the software reports its value as $t = -6.029$.
Notice this is the same number we get if we use the formula:
$$
\begin{align}
t &= \frac{ \bar x - \mu }{ s / \sqrt{n} } \\
&\approx \frac{ 98.234 - 98.6 }{ 0.738 / \sqrt{148} } \\
&\approx -6.03
\end{align}
$$
Any differences are due to rounding.
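
For readers who would like to check the arithmetic outside Excel, here is a minimal sketch in base R that plugs the rounded summary statistics into the formula. The variable names are chosen only for this illustration.

```{r}
# Summary statistics reported for the body temperature data
xbar <- 98.234   # sample mean
s    <- 0.738    # sample standard deviation
n    <- 148      # sample size
mu0  <- 98.6     # mean under the null hypothesis

# One-sample t statistic and its degrees of freedom
t_stat <- (xbar - mu0) / (s / sqrt(n))
df_t   <- n - 1

round(t_stat, 2)  # about -6.03
df_t              # 147
```

The small gap between this result and the $-6.029$ from the software comes from using rounded inputs.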
**State the degrees of freedom**
In Excel, the degrees of freedom ($df$) are given in cell L9.
$$df = 147$$
<!-- **Mark the test statistic and $P$-value on a graph of the sampling distribution** --> <!-- CONSOLIDATED and UPDATED -->
**Find the $P$-value and compare it to the level of significance**
The $P$-value is given in the software as "0.000." (The actual value is 1.2723e-08, a very small number that rounds to 0.000 at three decimal places.) Writing this properly and comparing it to $\alpha$, we have:
$$
P\text{-value} = 1.2723 \times 10^{-8} = 0.000~000~012~723 < 0.05 = \alpha
$$
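
As a rough check on this $P$-value, the sketch below uses base R's `pt()` function, which gives the area under the $t$-distribution to the left of a value. It assumes the rounded test statistic and degrees of freedom reported above, so the result should agree with the software only approximately.

```{r}
# Test statistic and degrees of freedom reported above
t_stat <- -6.029
df_t   <- 147
alpha  <- 0.05

# Two-sided P-value: area in both tails beyond |t|
p_value <- 2 * pt(-abs(t_stat), df = df_t)

p_value          # a very small number, on the order of 1e-08
p_value < alpha  # TRUE
```

Doubling the lower-tail area reflects the two-sided alternative $H_a: \mu \ne 98.6$. A one-sided alternative would use a single tail instead.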
**State your decision**
Since the $P$-value was less than $\alpha$, we reject the null hypothesis.
<img src="./Images/Step5.png">
**Present your conclusion in an easy to understand sentence, relating the result to the context of the problem**
There is sufficient evidence to suggest that the mean body temperature is not 98.6° Fahrenheit.
<img src="./Images/StepsAll.png">
### Baby Boom
<img src="./Images/Step1.png">
**Summarize the relevant background information**
The birth weight of a child is an important indicator of their neonatal health. It is important that pediatric health care providers track changes in the birth weights over time. The birth weight of children in Australia has historically had a population mean of 3373 grams.<!--<cite>Steele97,JSEdunn99</cite>--> Is this still the mean birth weight of Australian children, or has there been a change? We will use the 0.05 level of significance.
**State the null and alternative hypotheses and the level of significance**
$$
\begin{align}
H_0: &~ \mu=3373 \\
H_a: &~ \mu \ne 3373
\end{align}
$$
$$ \alpha = 0.05 $$
<img src="./Images/Step2.png">
**Describe the data collection procedures**
The birth weights of all children born on December 18, 1997 at the Mater Mothers' Hospital in Brisbane, Australia were recorded<!--<cite>JSEdunn99</cite>-->. The time of birth (on a 24 hour clock), gender, and birth weight of each child are given in the file [BabyBoom.xlsx](./Data/BabyBoom-JSE.xlsx).
Using this data set, test the hypothesis that the mean weight of babies born in Australia is 3373 grams. Use the $\alpha=0.05$ level of significance for this problem. Make an appropriate graph of the data.
<img src="./Images/Step3.png">
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
1. **Give the relevant summary statistics**
<a href="javascript:showhide('Q1')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q1" style="display:none;">
When the data are entered into the file
[Math 221 Statistics Toolbox](./Data/Math221StatisticsToolbox.xltx), the following output is generated:
<img src="./Images/BabyBoom-Ttest-Excel_Toolbox.PNG">
<br>
* The relevant summary statistics are:
<center>
$$
\begin{align}
\bar x &= 3275.955 \textrm{g}\\
s &= 528.032 \textrm{g}\\
n &= 44
\end{align}
$$
</center>
</div>
<br>
2. **Make an appropriate graph to illustrate the data**
<a href="javascript:showhide('Q2')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q2" style="display:none;">
The histogram is included in the above image.
</div>
</div>
<br>
<img src="./Images/Step4.png">
**Verify the requirements have been met**
The data show a left-skewed shape; however, the sample size is large. By the Central Limit Theorem, the sample mean is approximately normally distributed, so the requirements are satisfied.
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
3. **Give the test statistic and its value**
<a href="javascript:showhide('Q3')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q3" style="display:none;">
- The test statistic is $t$. Its value, taken from the output above, is $t = -1.219$.
</div>
<br>
4. **State the degrees of freedom**
<a href="javascript:showhide('Q4')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q4" style="display:none;">
- The degrees of freedom are given in the output above:
<center>
$$
df = 43
$$
</center>
- Notice that this is one less than the sample size. For a test for a single mean, the degrees of freedom are always equal to one less than the sample size.
</div>
</div>
<br>
**Mark the test statistic and $P$-value on a graph of the sampling distribution** <!-- CONSOLIDATED and UPDATED -->
The $P$-value is shaded in green:
<img src="./Images/BabyBoom-JSE-TwoSidedT1-2.png">
**Find the $P$-value and compare it to the level of significance**
$$P\text{-value}=0.2295 > 0.05 = \alpha$$
**State your decision**
Since the $P$-value is greater than the level of significance, we fail to reject the null hypothesis.
<img src="./Images/Step5.png">
**Present your conclusion in an easy to understand sentence, relating the result to the context of the problem**
There is insufficient evidence to suggest that the mean weight of babies born in Australia is different from 3373 grams.
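
The numbers in this example can be reproduced from the summary statistics alone. The following is a sketch in base R under that assumption; it is not the Excel Toolbox's own calculation, and the variable names are illustrative.

```{r}
# Summary statistics for the birth weights (grams)
xbar  <- 3275.955  # sample mean
s     <- 528.032   # sample standard deviation
n     <- 44        # sample size
mu0   <- 3373      # historical mean under the null hypothesis
alpha <- 0.05

# Test statistic, degrees of freedom, and two-sided P-value
t_stat  <- (xbar - mu0) / (s / sqrt(n))
df_t    <- n - 1
p_value <- 2 * pt(-abs(t_stat), df = df_t)

round(t_stat, 3)   # about -1.219
round(p_value, 4)  # close to the 0.2295 reported by the software
p_value > alpha    # TRUE, so we fail to reject the null hypothesis
```

With the raw weights stored in a numeric vector (called `weights` here as a hypothetical name), base R's `t.test(weights, mu = 3373)` reports the same test statistic, degrees of freedom, and $P$-value in one step.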
<br>
## Confidence Intervals
The procedure for finding confidence intervals when $\sigma$ is not known is very similar to the process when $\sigma$ is known. As a reminder, here is the confidence interval (from Lesson 10) when $\sigma$ is known.
$$
\left( \bar x - z^* \frac{\sigma}{\sqrt{n}}, ~ \bar x + z^* \frac{\sigma}{\sqrt{n}} \right)
$$
To construct the confidence interval when $\sigma$ is not known, we replace the population standard deviation, $\sigma$, with its point estimate, $s$. Because the appropriate distribution is a $t$ rather than a $z$, we also replace $z^*$ with $t^*$. The confidence interval when $\sigma$ is not known is
$$
\left( \bar x - t^* \frac{s}{\sqrt{n}}, ~ \bar x + t^* \frac{s}{\sqrt{n}} \right)
$$
where $t^*$ is the $t$-score corresponding to the confidence level and $s$ is the sample standard deviation.
We will use Excel to construct confidence intervals when $\sigma$ is unknown. The values of $t^*$ and $s$ will be computed for us. Examples of this are given below. You can see the details by clicking on [Show Hand Calculations] in Step 4.
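
To make the formula concrete, here is a minimal sketch in base R. The function `t_conf_int` is a hypothetical helper written for this illustration (it is not part of R or of the Excel Toolbox), and it is applied to the body temperature summary statistics from earlier in this lesson.

```{r}
# t confidence interval for a mean, computed from summary statistics
# (t_conf_int is a hypothetical helper, not a built-in R function)
t_conf_int <- function(xbar, s, n, level = 0.95) {
  t_star <- qt(1 - (1 - level) / 2, df = n - 1)  # critical value t*
  margin <- t_star * s / sqrt(n)                 # margin of error
  c(lower = xbar - margin, upper = xbar + margin)
}

# Body temperature summary statistics from earlier in this lesson
ci <- t_conf_int(xbar = 98.234, s = 0.738, n = 148, level = 0.95)
round(ci, 3)
```

The interval is centered at the point estimate $\bar x$ and extends one margin of error, $t^* \, s / \sqrt{n}$, in each direction. For these data the whole interval lies below 98.6, which agrees with the result of the two-sided test earlier in the lesson.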
<img src="./Images/StepsAll.png">
### Automatic Language Translation Programs
<img src="./Images/Step1.png">
**Summarize the relevant background information**
Computer software is commonly used to translate text from one language to another. As part of his Ph.D. thesis, Philipp Koehn developed a phrase-based translation program called Pharaoh. <!--<cite>Koehn04Pharaoh</cite>-->
The quality of the translation can vary. A good translation system should match a professional human translation. <!--<cite>Papineni02</cite>--> It is important to be able to quantify how good the translations produced by Pharaoh are.
The IBM T. J. Watson Research Center developed methods to measure the quality of a translation from one language to another. <!--<cite>Brown90</cite>--> One of these is the BiLingual Evaluation Understudy (BLEU). <!--<cite>Papineni02</cite>--> BLEU is a score ranging from 0 to 1 that indicates how well a computer translation matches a professional human translation of the same text. Higher scores indicate a better match. BLEU helps companies who develop translation software "to monitor the effect of daily changes to their systems in order to weed out bad ideas from good ideas". <!--<cite>Papineni02</cite>-->
<img src="./Images/Step2.png">
**Describe the data collection procedures**
To test Pharaoh's ability to translate, Koehn took a random sample of 100 blocks of Spanish text, each of which contained 300 sentences, and used Pharaoh to translate each of these to English. The BLEU score was calculated for each of the 100 blocks. The data were extracted from Figure 2 in a paper Koehn published. <!--<cite>Koehn04</cite>--> The 100 BLEU scores are given in
[BLEU-Scores](./Data/BLEU-Scores.xlsx).
Koehn wants to find an estimate of the true mean BLEU score for text translated by the Pharaoh computer program. He would like to compute a confidence interval, but he does not know the true population standard deviation, $\sigma$.
<img src="./Images/Step3.png">
**Give the relevant summary statistics**
<!-- To access this content, scroll to the bottom of the editing page and click on the link "Software:(Excel or SPSS)-(PageName)" -->
<div class="SoftwareHeading">Excel Instructions</div>
<div class="Summary">
Copying the
[BLEU-Scores](./Data/BLEU-Scores.xlsx)
data into the
[Math 221 Statistics Toolbox](./Data/Math221StatisticsToolbox.xltx)
file in Excel, we get the following:
<center><img src="./Images/BLEUscores-Excel_Toolbox.PNG"></center>
<br>
</div>
<br>
The summary statistics are:
$$
\begin{align}
\bar x &= 0.288 \\
s &= 0.026 \\
n &= 100
\end{align}
$$
**Make an appropriate graph to illustrate the data**
The histogram is included in the image above.
<img src="./Images/Step4.png">
**Make Inference**
#### Find the confidence interval
We will use Excel to find the confidence intervals for the mean.
<div class="SoftwareHeading">Excel Instructions</div>
<div class="Summary">
Do the following:
- Open the file [Math 221 Statistics Toolbox](./Data/Math221StatisticsToolbox.xltx)
- Click on the tab labeled "One-sample t-test"
- Enter the data, and
- Set the confidence level.
In this case, we are using a 95% confidence level. So, you set the confidence level in the Excel file to 0.95 (i.e., 95%). The confidence interval is given to you in cells L14 and M14.
<br>
</div>
<br>
The 95% confidence interval for the true mean BLEU score is:
$$(0.282, 0.293)$$
<a href="javascript:showhide('hc')"><span style="font-size:8pt;">Show Hand Calculations</span></a>
<br>
<div id="hc" style="display:none;">
The formula for the confidence interval where $\sigma$ is known was given in the reading titled
[Lesson 10: Inference for One Mean: Sigma Known (Confidence Interval)](Lesson10.html) as
$$
\left( \bar x - z^* \frac{\sigma}{\sqrt{n}}, ~ \bar x + z^* \frac{\sigma}{\sqrt{n}} \right)
$$
It is impossible to know the true standard deviation of the BLEU scores for a new translation program like Pharaoh. Replacing $\sigma$ with $s$ and replacing $z^*$ with $t^*$, we get:
$$
\left( \bar x - t^* \frac{s}{\sqrt{n}}, ~ \bar x + t^* \frac{s}{\sqrt{n}} \right)
$$
The value of $t^*$ depends on the level of confidence and the sample size. It must be computed using software or looked up in a table. If you are asked to compute a confidence interval for a mean where the population standard deviation is unknown, the value of $t^*$ will be given to you. The other numbers ($\bar x$, $s$, and $n$) can all be obtained directly from your data.
If we want to create a 95% confidence interval for $\mu$, with 99 degrees of freedom, then $t^* = 1.9842$.
Using the sample statistics:
$$
\begin{align}
\bar x &= 0.2876 \\
s &= 0.0264 \\
n &= 100
\end{align}
$$
The 95% confidence interval for $\mu$ is:
$$
\left( 0.2876 - 1.9842 \frac{0.0264}{\sqrt{100}}, ~ 0.2876 + 1.9842 \frac{0.0264}{\sqrt{100}} \right)
$$
which reduces to:
$$
\left( 0.2824, ~ 0.2928 \right)
$$
</div>
<br>
#### Verify Requirements Are Met
The requirements for creating a confidence interval for a mean with $\sigma$ unknown are the same as the requirements for this procedure when $\sigma$ is known.
There are two requirements that need to be checked:
1. A simple random sample was drawn from the population
2. $\bar x$ is normally distributed
Recall the requirement of normality is satisfied if the data are approximately normally distributed or if the sample size is large.
The data are bell-shaped and fairly symmetric. So, the sample mean, $\bar x$, is approximately normally distributed.
<img src="./Images/Step5.png">
**Present your observations in an easy to understand sentence, relating the result to the context of the problem**
We are 95% confident that the true mean BLEU score for all translations by the Pharaoh program is between 0.2824 and 0.2928.
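
The point estimate and margin of error behind this interval can be separated out with a few lines of base R. This is a sketch that assumes only the summary statistics used in the hand calculations; the variable names are illustrative.

```{r}
# Summary statistics for the 100 BLEU scores
xbar  <- 0.2876  # point estimate of the mean BLEU score
s     <- 0.0264  # sample standard deviation
n     <- 100     # sample size
level <- 0.95    # confidence level

# Critical value t* with n - 1 = 99 degrees of freedom (about 1.9842)
t_star <- qt(1 - (1 - level) / 2, df = n - 1)

# Margin of error and the resulting interval
margin <- t_star * s / sqrt(n)
round(c(lower = xbar - margin, upper = xbar + margin), 4)  # 0.2824 0.2928
```

Here `xbar` is the center of the interval and `margin` is its half-width, so the interval can be read as $0.2876 \pm 0.0052$.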
Answer the following questions using the data set [BLEU-Scores](./Data/BLEU-Scores.xlsx), which gives the BLEU scores for $n=100$ translations from Spanish to English by the computer program Pharaoh.
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
5. Use Excel to find a 90% confidence interval for the true mean BLEU score for translations by the Pharaoh program. Give your answer accurate to 3 decimal places. Interpret this confidence interval in a complete sentence.
<a href="javascript:showhide('Q5')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q5" style="display:none;">
- $(0.283, 0.292)$<br>We are 90% confident that the true mean BLEU score for translations by the Pharaoh program lies between 0.283 and 0.292.
</div>
<br>
6. Use Excel to find a 99% confidence interval for the true mean BLEU score for translations by the Pharaoh program. Give your answer accurate to 3 decimal places. Interpret this confidence interval in a complete sentence.
<a href="javascript:showhide('Q6')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q6" style="display:none;">
- $(0.281, 0.295)$<br>We are 99% confident that the true mean BLEU score for translations by the Pharaoh program lies between 0.281 and 0.295.
</div>
<br>
7. What do you notice about the confidence interval as the confidence level increased from 90% to 99%?
<a href="javascript:showhide('Q7')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q7" style="display:none;">
- As the confidence level increases, the width of the confidence interval increases. We could also say, as the confidence level increases, the margin of error increases. The center of the confidence interval (the sample mean) is unchanged.
</div>
</div>
<br>
<img src="./Images/StepsAll.png">
### Euro Weights
<img src="./Images/Step1.png">
**Summarize the relevant background information**
A group of statisticians measured the weights of 2000 Belgian one Euro coins in eight batches. Each batch contains coins that were all minted together. <!--<cite>JSEeuros</cite>--> You can learn more about these data at:
[http://www.amstat.org/publications/jse/](http://www.amstat.org/publications/jse/datasets/euroweight.txt){target="_blank"}
<img src="./Images/Step2.png">
**Describe the data collection procedures**
The coins were "borrowed" from a bank in Belgium, one batch at a time. The weights (in grams) of the coins are given in the file [EuroWeight.xlsx](./Data/EuroWeight.xlsx).
<img src="./Images/Step3.png">
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
8. **Give the relevant summary statistics**.
<a href="javascript:showhide('Q8')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q8" style="display:none;">
The following output was generated using the Excel file
[Math 221 Statistics Toolbox](./Data/Math221StatisticsToolbox.xlsx):
<center><img src="./Images/EuroWeights-Histogram.PNG"></center>
* Typically, we round our computations to one more decimal place of precision than was given in the original data. This means we need one more decimal place than is shown in the output above. We can increase the precision by selecting the cell we want to modify and then clicking on the following button in the "Home" menu ribbon in Excel:
<center><img src="./Images/Excel-IncreaseDecimals.png"></center>
<br>
* When we do this, we get the following summary statistics:
<center>
$$
\begin{align}
\bar x &= 7.521 \\
s &= 0.0344 \\
n &= 2000
\end{align}
$$
</center>
</div>
<br>
9. **Make an appropriate graph to illustrate the data**
<div class="myemphasis"> **CAUTION:** The Excel Toolbox boxplot only works for datasets of **1,000 observations or less**. This dataset contains 2,000 observations. You will need to create your own boxplot. </div>
<a href="javascript:showhide('Q9')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q9" style="display:none;">
If you create your own histogram and boxplot of the data, the graphs should look something like the image below
<img src="./Images/Lesson_11_euro_weights.PNG">
You can see that the data are approximately normal, with just a couple of potential outliers that could be investigated further.
</div>
</div>
<br>
<img src="./Images/Step4.png">
**Verify the requirements have been met**
We need to check the following two requirements:
1. A simple random sample was drawn from the population
2. $\bar x$ is normally distributed
The coins were taken in batches, but we can think of the batches as a random sample of the possible coins that could be chosen.
The data follow a bell-shaped distribution. There are a few potential outliers, but they are not too frequent or extreme. We can conclude that $\bar x$ is normally distributed.
<div class="QuestionsHeading">Answer the following question:</div>
<div class="Questions">
10. **Find the confidence interval**: Use Excel to create a 99% confidence interval for the true mean.
<a href="javascript:showhide('Q10')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q10" style="display:none;">
<center>
$$
(7.519, ~ 7.523)
$$
</center>
</div>
</div>
<br>
<img src="./Images/Step5.png">
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
11. **Present your observations in an easy to understand sentence, relating the result to the context of the problem**.
<a href="javascript:showhide('Q11')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q11" style="display:none;">
- We are 99% confident that the true mean weight of all Belgian Euros is between 7.519 grams and 7.523 grams.
</div>
</div>
<br>
## Summary
<div class="SummaryHeading">Remember...</div>
<div class="Summary">
- In practice we rarely know the true standard deviation $\sigma$ and will therefore be unable to calculate a z-score. **Student's t-distribution** gives us a new test statistic, $t$, that is calculated using the sample standard deviation ($s$) instead.
$$\displaystyle{ t = \frac {\bar x - \mu} {s / \sqrt{n}} }$$
- The $t$-distribution is similar to a normal distribution in that it is bell-shaped and symmetrical, but the exact shape of the $t$-distribution depends on the **degrees of freedom ($df$)**.
$$df=n-1$$
- You will use Excel to carry out hypothesis testing and create confidence intervals involving $t$-distributions.
</div>
<br>
## Navigation
<center>
| **Previous Reading** | **This Reading** | **Next Reading** |
| :------------------: | :--------------: | :--------------: |
| [Lesson 10: <br> Inference for One Mean: Sigma Known (Confidence Interval)](Lesson10.html) | Lesson 11: <br> Inference for One Mean: Sigma Unknown | [Lesson 12: <br> Inference for Two Means: Paired Data](Lesson12.html) |
</center>