---
title: "Lesson 9: Inference for One Mean with Sigma Known (Hypothesis Test)"
output:
  html_document:
    theme: cerulean
    toc: true
    toc_float: false
---
<script type="text/javascript">
function showhide(id) {
var e = document.getElementById(id);
e.style.display = (e.style.display == 'block') ? 'none' : 'block';
}
</script>
<div style="width:50%;float:right;">
#### Optional Videos for this Lesson {.tabset .tabset-pills}
##### Part 1
<iframe id="kaltura_player_1643927368" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1643927368&entry_id=1_7nd4vdlc" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 2
<iframe id="kaltura_player_1643927438" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1643927438&entry_id=1_33420h63" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 3
<iframe id="kaltura_player_1643927537" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1643927537&entry_id=1_5pwu6g5i" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 4
<iframe id="kaltura_player_1643927595" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1643927595&entry_id=1_ituoja82" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 5
<iframe id="kaltura_player_1643927653" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1643927653&entry_id=1_krie6f5w" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
##### Part 6
<iframe id="kaltura_player_1643927718" src="https://cdnapisec.kaltura.com/p/1157612/sp/115761200/embedIframeJs/uiconf_id/47306393/partner_id/1157612?iframeembed=true&playerId=kaltura_player_1643927718&entry_id=1_n39z2vgy" width="480" height="270" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" frameborder="0"></iframe>
####
Data files used in the videos:
* [Conference Talk Lengths.xlsx](./Data/ConferenceTalkLengths.xlsx)
* [Balloon Float Times.xlsx](Data/BalloonFloatTimes.xlsx)
</div><div style="clear:both;"></div>
## Lesson Outcomes
By the end of this lesson, you should be able to:
- Conduct a hypothesis test for a single mean with σ known:
    + State the null and alternative hypotheses.
    + Calculate the test statistic and $P$-value of the hypothesis test.
    + Assess the statistical significance by comparing the $P$-value to the α-level.
    + Check the requirements for the hypothesis test.
    + Show the appropriate connections between the numerical and graphical summaries that support this hypothesis test.
    + Draw a correct conclusion for the hypothesis test.
    + Interpret Type I and Type II errors.
<br>
<img src="./Images/StepsAll.png">
<br>
## The Ethan Allen
<img src="./Images/The_Judge_Ben_Wiles_(TJG).jpg">
<!--|From [http://commons.wikimedia.org/wiki/File:The_Judge_Ben_Wiles_(TJG).jpg commons.wikimedia.org]-->
A tragic accident on Lake George in New York, USA, called into question the safety regulations for commercial tour boats. On October 2, 2005, a full boat of 47 passengers and 1 crew member began a routine one-hour tour of Lake George. As the operator initiated a turn, the tour boat "Ethan Allen" listed (tipped) enough to take water aboard. The force caused by dipping beneath the surface made the vessel list further, shifting the passengers to one side of the boat. After this shift in the weight distribution, the boat capsized, killing 20 passengers and injuring 9 others.
<img src="./Images/Step1.png">
We assume that at the time of the accident, the stability requirements were based on the Coast Guard criterion of a mean of 140 pounds per person. So, the Ethan Allen was supposed to be able to safely transport passengers and crew with a mean weight of 140 pounds. We want to investigate if 140 pounds is a reasonable value for the mean weight of tour boat passengers. This leads to the research question: "Is the mean weight of tour boat passengers greater than 140 pounds?"
We can rewrite the research question as a declarative sentence to obtain a hypothesis, or a testable statement about a population.
The first hypothesis we will write is that the Coast Guard criterion is appropriate: "The mean weight of tour boat passengers is 140 pounds." We call this the null hypothesis. The **null hypothesis** is a statement of the "status quo", or the value typically considered to be appropriate. Notice that the null hypothesis is expressed with a statement involving equality ($=$).
$H_0:~~\mu = 140$ pounds
In contrast to the null hypothesis, we write the alternative hypothesis. This is typically the statement that a researcher suspects is the actual truth. In our case, we suspect that "The mean weight of tour boat passengers is greater than 140 pounds."
$H_a:~~\mu > 140$ pounds
We label the null hypothesis $H_0$ and the alternative hypothesis $H_a$. In every hypothesis test in this class, the null hypothesis will be a statement involving equality. The alternative hypothesis can include greater than ($>$), less than ($<$), or not equal ($\ne$).
When we test hypotheses, we assume the null hypothesis is true. Because of this assumption, whenever we need to use μ in a calculation, we can use the value specified in the null hypothesis. When we conduct a hypothesis test, we gather evidence against the assumption that the null hypothesis is true. If we get enough evidence against the null hypothesis, we reject it. If we do not have sufficient evidence against the null hypothesis, we do not reject it.
<img src="./Images/Step2.png">
How do we gather evidence against a null hypothesis? We collect data.
The marine accident report gives the weight (in pounds) of each of the passengers and the crew member. These values are reproduced below. <!-- [Cite: "Capsizing of New York State-Certificated Vessel Ethan Allen, Lake George, New York October 2, 2005"] -->
```{r, echo = FALSE, message = FALSE, warning = FALSE, results = "asis"}
library(pander)

# Weights (in pounds) of the 47 passengers and 1 crew member on the Ethan Allen
w <- c(189, 110, 144, 141, 185, 194, 180, 211,
       128, 135, 141, 205, 200, 164, 150, 170,
       194, 260, 165, 137, 198, 195, 158, 204,
       170, 190, 129, 146, 135, 176, 204, 170,
       142, 210, 180, 155, 217, 198, 126, 247,
       173, 155, 165, 175, 235, 230, 268, 170)

# Display the weights as a 6 x 8 table, filled row by row
pander(matrix(data = w, nrow = 6, ncol = 8, byrow = TRUE))
```
<img src="./Images/Step3.png">
To help us understand the data, we first create a graph summarizing the values.
<center>
**Weights of Passengers and Crew on the Ethan Allen**<br>
<img src="./Images/EthanAllen-PassengerWeights-Histogram.png">
</center>
Next, we compute summary statistics. The sample size is $n=48$, and the sample mean is $\bar{x}=177.6$ pounds. According to the CDC, the standard deviation of the weights of individuals in the United States is $\sigma=26.7$ pounds. <!-- CITE: http://www.cdc.gov/growthcharts/2000growthchart-us.pdf, page 154-155.-->
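
The summary statistics above can be reproduced in R. The following is a minimal sketch, assuming the 48 weights from the table are typed into a vector; the names `w`, `n`, and `x_bar` are our own choices for illustration and are not part of the accident report.

```r
# Weights (in pounds) of the 47 passengers and 1 crew member, as listed above
w <- c(189, 110, 144, 141, 185, 194, 180, 211,
       128, 135, 141, 205, 200, 164, 150, 170,
       194, 260, 165, 137, 198, 195, 158, 204,
       170, 190, 129, 146, 135, 176, 204, 170,
       142, 210, 180, 155, 217, 198, 126, 247,
       173, 155, 165, 175, 235, 230, 268, 170)

n <- length(w)     # sample size
x_bar <- mean(w)   # sample mean, in pounds

n
round(x_bar, 1)
```

Rounding the sample mean to one decimal place gives the value of $\bar{x}$ reported in the text.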
Considering the data as a random sample of all possible tour boat passengers, it appears that the true mean weight of tour boat passengers might be greater than 140 pounds. However, we need to check this with a formal test of our hypotheses.
<img src="./Images/Step4.png">
It is not sufficient to gain an intuitive sense for the data. We will test if there is sufficient evidence to reject the null hypothesis that the true mean weight of tour boat passengers is 140 pounds.
Assuming the null hypothesis is true, what is the probability that we would observe a sample mean as extreme or more extreme than the one we observed? This probability is called the $P$-value.
To find the $P$-value, we first calculate the number of standard deviations that $\bar{x}$ is away from the assumed value of true mean $\mu=140$ pounds. This is our $z$-score. Then we use the applet to determine the probability of observing a value of $z$ that is as large or larger than the value we observed.
$$\displaystyle{ z= \frac{ \bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{177.6-140}{26.7/\sqrt{48}}=9.757 }$$
We use the applet to determine the probability of observing a value of $z$ that is as large or larger than $9.757$. This is the same as the probability of observing a value of $\bar{x} =177.6$ or more pounds, given that the true mean really is $\mu = 140$ pounds.
The area to the right of $z=9.757$ is so small that the normal probability applet gives the area as "8.6086e-23." This is how a computer represents scientific notation. This number is actually $8.6086 \times 10^{-23}$ or, in other words, 0.000 000 000 000 000 000 000 086 086. This is the probability of observing a sample mean of $\bar x=177.6$ pounds or greater just by chance, assuming the true mean is $\mu = 140$ pounds. This is very, very unlikely. A more plausible explanation is that the true mean weight $\mu$ of individuals in the population is greater than 140 pounds.
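
For readers who prefer code to the applet, here is a minimal sketch of the same calculation in R. It assumes the rounded summary values from the text; the variable names (`x_bar`, `mu_0`, `sigma`, `n`, `se`) are ours. The base R function `pnorm()` gives areas under the standard normal curve, and `lower.tail = FALSE` asks for the area to the right.

```r
x_bar <- 177.6   # sample mean (pounds)
mu_0  <- 140     # mean stated in the null hypothesis (pounds)
sigma <- 26.7    # population standard deviation, assumed known (pounds)
n     <- 48      # sample size

se <- sigma / sqrt(n)        # standard deviation of the sample mean
z  <- (x_bar - mu_0) / se    # test statistic

# Right-tailed P-value: area under the standard normal curve to the right of z
p_value <- pnorm(z, lower.tail = FALSE)

round(z, 3)
format(p_value, scientific = TRUE)
```

Because this sketch does not round $z$ before finding the area, the $P$-value it prints may differ slightly from the applet's value, but it is of the same order of magnitude ($10^{-23}$).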
Based on this analysis and the hypothesis test, the evidence against $H_0$ is strong enough to conclude that if every seat on the Ethan Allen were occupied, the boat would be carrying a greater load than it was certified to handle.
<img src="./Images/Step5.png">
As a result of this accident, the United States government took several actions. The Coast Guard stability regulations were changed, and the assumed average weight per person was increased to 185 pounds. <!-- [[CITE: http://www.uscg.mil/hq/cg5/cg5212/docs/secg12142010.pdf]]--> As a result, the safety of public vessels has been improved.
<!--
[[FOLLOW-UP QUESTION: FIND THE TRUE MEAN WEIGHT OF ADULTS IN THE US. FIND THE PROBABILITY THAT A RANDOM SAMPLE OF n=48 WILL HAVE A MEAN GREATER THAN 185 POUNDS.]]
[[(AGES 20--74 YEARS) MU = 175.9 LBS, SIGMA = 64.9 LBS AS OF 2002. SOURCE: ADVANCE DATA FROM VITAL HEALTH STATISTICS, NUMBER 347, 27 OCTOBER 2004 (PUBLISHED BY THE CDC) http://www.cdc.gov/nchs/Data/ad/ad347.pdf, ACCESSED ON 14 OCTOBER 2013]]
-->
<img src="./Images/StepsAll.png">
<br>
## Body Temperatures of Healthy Adults
### Introduction
Have you ever wondered how it was determined that the true mean body temperature of healthy adults is 98.6° F?
It is not exactly clear who first reported this value, but this temperature has been used since the 1800s. <!-- \cite{Mackowiak92,Horvath50} --> One of the most influential researchers in this area is Carl Reinhold August Wunderlich. He reported measuring over 1,000,000 body temperatures on over 20,000 patients.<!-- \cite{Mackowiak92,Wunderlich68} --> When the temperature is measured in the armpit, it is called an axillary temperature measurement.
Based on his research, Wunderlich stated, "The axillary temperature of 98.6° F = 37° C$\ldots$is considered the central thermic point of health". <!-- \cite{Wunderlich71}--> In other words, the mean body temperature of healthy adults is 98.6° F (or 37° C).
<div class="QuestionsHeading">Answer the following question:</div>
<div class="Questions">
1. How would you design a study to determine if the mean body temperature of healthy adults is 98.6° F?
<a href="javascript:showhide('Q1')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q1" style="display:none;">
* Take the temperature of a large number of healthy individuals in a clinical setting and compare your mean to the assumed mean body temperature of 98.6 degrees F.
</div>
</div>
<br>
### Data on Body Temperatures
A group of researchers led by Philip A. Mackowiak, MD, conducted a study to assess the true mean body temperatures of healthy adults.<!-- \cite{Mackowiak92}--> They selected $n=148$ subjects between the ages of 18 and 40 years old, representative of the general population. Each volunteer was given a physical to assure that they were not ill at the time of the data collection. Their axillary body temperature was measured and reported in a paper published in the "Journal of the American Medical Association". <!-- \cite{Mackowiak92}--> These data were extracted and are presented in the file [BodyTemp](./Data/BodyTemp.xlsx). The body temperatures in the file are given in degrees Fahrenheit. Based on historical data, the standard deviation of body temperatures is assumed to be $\sigma = 0.675$° F.
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
2. How could you use the body temperature data collected by Dr. Mackowiak to determine if the mean body temperature is really 98.6° F?
<a href="javascript:showhide('Q2')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q2" style="display:none;">
* We can compare his sample results to the assumed population mean of 98.6° F.
</div>
<br>
3. Find the mean of the $n=148$ body temperatures.
<a href="javascript:showhide('Q3')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q3" style="display:none;">
- $\bar{x} = 98.23~^\circ\text{F}$.
</div>
<br>
4. Create a histogram illustrating the body temperatures of the individuals in the Mackowiak study.
<a href="javascript:showhide('Q4')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q4" style="display:none;">
<img src="./Images/BodyTemp-Hist-Excel_v2.png">
</div>
<br>
5. Based on the mean of the observations (Question 3) and the histogram of the data (Question 4), does it appear that the mean body temperature of healthy adults is significantly different from 98.6° F?
<a href="javascript:showhide('Q5')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q5" style="display:none;">
* Answers may vary. However, many students are likely to say that the sample mean and population mean are very close to each other.
</div>
<br>
6. What is the shape of the distribution of all possible sample means? How do we know?
<a href="javascript:showhide('Q6')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q6" style="display:none;">
* The distribution of all possible sample means will be approximately normal. Since the sample size is large, the Central Limit Theorem guarantees this result.
</div>
<br>
7. Assuming that the true mean body temperature of healthy adults is $\mu=98.6$° F, and the population standard deviation is $\sigma = 0.675$° F,<!-- \cite{SundLevander02}--> find the mean and standard deviation of the random variable $\bar{x}$.
<a href="javascript:showhide('Q7')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q7" style="display:none;">
* The random variable $\bar{x}$ will have a mean of $\mu = 98.6$ and a standard deviation of $\displaystyle{ \frac{\sigma}{\sqrt{n}} = \frac{0.675}{\sqrt{148}} = 0.05548 }$.
</div>
<br>
8. Use the information in Question 7 to find the $z$-score for the mean you calculated in Question 3.
<a href="javascript:showhide('Q8')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q8" style="display:none;">
<center>$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n} } = \frac{98.23 - 98.6}{0.675 / \sqrt{148} } = -6.6685$</center>
</div>
<br>
9. What is the probability of observing a $z$-score that is as extreme or more extreme (further away from 0) than the $z$-score you calculated in Question 8?
<a href="javascript:showhide('Q9')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q9" style="display:none;">
* This is a two-tailed test, so we need to shade both tails in the applet. This gives a $P$-value of $2.584\times 10^{-11}$.
</div>
<br>
10. Assuming the mean body temperature really is 98.6° F, how likely would it be for a random sample of $n=148$ people in the population to have a mean body temperature that is as extreme as was observed in Question 3?
<a href="javascript:showhide('Q10')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q10" style="display:none;">
* This is extremely unlikely.
</div>
<br>
11. Results as unlikely as this demand an explanation. What do you think is the reason for a $z$-score as extreme as this?
* A. The group of volunteers included in the study had unusually low body temperatures.
* B. The researchers did not measure the temperature correctly.
* C. The true mean body temperature is different than 98.6° F.
* D. The sample size of $n=148$ is not large enough.
* E. The data were collected on a cold day.
* F. The data are not normally distributed.
<a href="javascript:showhide('Q11')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q11" style="display:none;">
* Correct answer: C. The true mean is different than 98.6° F. The data were collected carefully with properly calibrated modern thermometers. The sample was representative of the population and the size of the sample was not an issue.
</div>
</div>
<br>
## Hypothesis Test for the True Mean Body Temperature
### Null and Alternative Hypotheses
It is commonly believed that the true mean body temperature is 98.6° F (37° C). In science, a statement or claim such as this is called a **hypothesis**. We can use data to test a hypothesis. If there is enough evidence against a hypothesis, we reject it in favor of something else.
The claim representing the "status quo" or the commonly held belief or the usual value is called the **null hypothesis**. In the case of body temperatures, our null hypothesis is defined by the belief that the true mean body temperature of healthy adults is 98.6° F.
We present the null hypothesis in the following way:
$$H_0: \mu = 98.6$$
If we gather enough evidence against the null hypothesis and determine that it should be rejected, we need another hypothesis to propose in its place. This is called the **alternative hypothesis**. If 98.6° F is not the correct body temperature, then it is logical to propose the following alternative hypothesis:
$$H_a: \mu \ne 98.6$$
To be brief in writing, we label the null hypothesis $H_0$. The zero in the subscript represents "null," "baseline," "default," "no effect," etc. Similarly, we label the alternative hypothesis $H_a$.
Our alternative hypothesis could have been written as:
- $H_a: \mu \ne 98.6$ (two-sided hypothesis; two-tailed)
- $H_a: \mu < 98.6$ (one-sided hypothesis; left-tailed)
- $H_a: \mu > 98.6$ (one-sided hypothesis; right-tailed)
Notice that each of these is a viable alternative to the claim that the mean is 98.6 degrees. We would only use $<$ or $>$ if we had a belief in advance that the mean was less than or greater than 98.6. If we do not have a strong reason, before we collect our data, to believe the mean is either smaller or larger than the stated value, then we use $\ne$ in our alternative hypothesis.
It is important that the null and alternative hypotheses be determined prior to collecting the data. It is not appropriate to use the data from your study to choose the alternative hypothesis that will be used to test the same data! This is an example of using data twice, once to choose the test and again to conduct the test. It is okay to use data from a previous study to determine your null and alternative hypotheses, but it is an improper use of the statistical procedures to use the data to define *and* conduct a hypothesis test.
Notice that the null and alternative hypotheses are statements about a population parameter (e.g., $\mu$). They will never involve a sample statistic (e.g., $\bar{x}$). Population parameters are unknown, and we are trying to make a judgment about whether or not they are equal to a particular value. Sample statistics are calculated from our data, so there is no reason to do any test to assess what their value is.
<!--
{| class="wikitable"
|-
|
See Lesson \ref{Reading:SampDistOfXbar} to review the concept of the distribution of sample means, $\bar{x}$.
|}
-->
The null hypothesis will always be a statement involving equality. This gives us a starting point in our analysis. In the reading for [Normal Distributions](Lesson05.html){target="_blank"}, we assumed that the true mean body temperature was 98.6° F. This becomes the assumed value of the mean of the distribution of the random variable $\bar{x}$.
The scientific method demands that we make a hypothesis (a claim or an educated guess). We assume that the null hypothesis is true and gather evidence against it. In other words, we "do research" (e.g., collect data) to gather evidence against that hypothesis. If we can gather enough evidence to discredit our initial hypothesis, we conclude that it was false, and begin the process again.
If we are unable to reject the original hypothesis, we do not conclude that it is correct. For example, we do not know that Einstein's Theory of Relativity is correct. We have simply not been able to disprove it$\ldots$yet.
In the same way, we never can prove a null hypothesis is true. We can only gather evidence against it. If we get enough evidence, we reject the null hypothesis.
### Test Statistic
Under the null hypothesis, we assume that the true mean body temperature of healthy adults actually is $\mu = 98.6^{\circ}$ F. Wanting to test this claim, researchers collected data on the body temperatures of $n=148$ healthy adults. The mean of the observed values was $\bar{x} = 98.23^{\circ}$ F.
A histogram of the body temperature data shows a nice bell-shaped distribution. Also, the sample mean, $\bar{x}=98.23^\circ$ F, appears to be reasonably close to the assumed population mean, $\mu=98.6^\circ$ F.
However, it is important to remember that the standard deviation of $\bar{x}$ is $\frac{\sigma}{\sqrt{n}} = \frac{0.675}{\sqrt{148}}=0.055$. We need to determine how far $\bar{x}$ is from $\mu$. Since we know $\sigma$ in this case, we can use the $z$-score to determine if $\bar{x}$ is far from the assumed value. This is called the **test statistic**:
$$
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{98.23 - 98.6}{0.675 / \sqrt{148}} = -6.6685
$$
Here $z$ is the test statistic, $\bar{x}$ is the sample mean, $\mu$ is the population mean stated in the null hypothesis, $\sigma$ is the population standard deviation, and $n$ is the sample size.
Based on the results from the formula above, $\bar{x}$ is over 6 standard deviations below the hypothesized mean!
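
The test statistic can also be computed directly in R. This is a minimal sketch using the summary values given in the text; the variable names are our own and are chosen only for illustration.

```r
x_bar <- 98.23   # sample mean (degrees F)
mu_0  <- 98.6    # mean stated in the null hypothesis (degrees F)
sigma <- 0.675   # population standard deviation, assumed known (degrees F)
n     <- 148     # sample size

se <- sigma / sqrt(n)        # standard deviation of the sample mean
z  <- (x_bar - mu_0) / se    # number of standard deviations x_bar is from mu_0

round(se, 5)
round(z, 4)
```

Dividing by `se` rather than by `sigma` is the key step: the distance between $\bar{x}$ and $\mu$ is measured in standard deviations of the sample mean, not of individual temperatures.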
### $P$-value
#### Definition
The **$P$-value** is the probability of obtaining a test statistic (such as $z$) at least as extreme as the one you calculated, assuming the null hypothesis is true. In other words, our $P$-value is the probability that we would get a $z$-score that is as extreme or more extreme than $z=-6.6685$, assuming the true mean is 98.6° F.
In advance of collecting the data, we may have had no idea whether the true mean would be greater than 98.6 or less than 98.6, if 98.6° F was not the right value. This is why we used a two-sided alternative hypothesis ($H_a: \mu \ne 98.6$). When computing the $P$-value, we need to continue this logic. So, we would have considered a value of $z=-6.6685$ to be just as extreme as $z=6.6685$.
In the case of a two-sided test for one mean with $\sigma$ known, the $P$-value is the area in the tails beyond $\pm z$, in both the left and right tails of the standard normal distribution.
<img src="./Images/Applet2584E-11-twosided-LoR.png">
Using the applet, we shade the area in both tails and then type in the value of $z=-6.669$. The applet reports an area of "2.584E-11." This is the way computers indicate scientific notation. Expressed using more traditional notation, we get
$$ P\text{-value} = 2.584 \times 10^{-11} = 0.000~000~000~026 $$
You should always convert the computer's notation involving "E" to scientific notation or decimal notation.
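
As a check on the applet, the two-sided $P$-value can be sketched in R with the base function `pnorm()`. The code below is an illustration under the assumption that we use the test statistic $z=-6.6685$ computed earlier; the names `z` and `p_value` are ours.

```r
z <- -6.6685   # test statistic from the body temperature data

# Two-sided P-value: area to the left of -|z| plus the equal area to the right of +|z|
p_value <- 2 * pnorm(-abs(z))

format(p_value, scientific = TRUE, digits = 4)
```

R prints very small numbers with a lowercase "e" in place of the applet's "E"; both mean "times ten to the power of."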
<div class="Emphasis">
**Scientific Notation:**
- Scientific notation is a method used to write very large or very small numbers without a lot of extra zeros. Computers express scientific notation using the letter "E." For example, to write $7.8\times 10^3$, a computer would output the expression 7.8E3. The number $7.8\times 10^3$ is $7.8 \times 1000=7800$, since $10^3 = 10 \cdot 10 \cdot 10 = 1000$. Another way to think of this is to move the decimal point in the number 7.8 three places to the right, which gives you
$$ 7~\underset{\rightarrow}{8} \underset{\rightarrow}{0} \underset{\rightarrow}{0}. $$
- This is an example of how very large numbers can be represented. To express small numbers (close to zero), we use negative exponents. The number $9.12 \times 10^{-5}$ is the same as $9.12 \times (10^5)^{-1} = 9.12 \times (100000)^{-1}$. This can be written as $9.12 \times 0.00001 = 0.0000912$. Since the exponent on the 10 is $-5$, you move the decimal point in the number 9.12 five places to the left:
$$0.\underset{\leftarrow}{0}\underset{\leftarrow}{0}\underset{\leftarrow}{0}\underset{\leftarrow}{0}\underset{\leftarrow}{9}~1~2$$
- Similarly, the value $2.57552 \times 10^{-11}$ can be written in regular notation as $0.000~000~000~025~755~2$.
</div>
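
The conversions described above can be tried out in R, which accepts and prints the same "E" notation. This is a small sketch for illustration only; the variable names are ours, and `format()` is the base R function for controlling how a number is displayed.

```r
large   <- 7.8e3       # 7.8 x 10^3
small   <- 9.12e-5     # 9.12 x 10^-5
p_value <- 2.584e-11   # the P-value reported by the applet

# Ask R to write each number out in ordinary decimal notation
format(large, scientific = FALSE)
format(small, scientific = FALSE)
format(p_value, scientific = FALSE)

# Or convert an ordinary number to scientific notation
format(7800, scientific = TRUE)
```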
<br>
If the $P$-value is small, that implies that the $z$-score was large in magnitude (far away from 0). In other words, $\bar{x}$ was far away from $\mu$. In this case, $\bar{x}$ was 6.669 standard deviations below the hypothesized mean. It is very unusual to get a value of $\bar{x}$ that is this far from the value of $\mu$ given in the null hypothesis. This implies that it is not plausible that the mean is equal to 98.6° F. We reject the null hypothesis and conclude that there is sufficient evidence that the true mean body temperature of healthy adults is not 98.6° F.
If this conclusion is correct, you may wonder why so many people think the true mean body temperature of healthy adults is 98.6° F (37° C). Good question!
#### Relationship to the Alternative Hypothesis
The $P$-value is determined using the alternative hypothesis. If the alternative hypothesis is two-sided (i.e., if the alternative hypothesis involves $\ne$), then the $P$-value is the area in the tails further than $\pm z$. That is, it is the area to the left of $z=-6.669$ plus the area to the right of $z=+6.669$.
If the alternative hypothesis involves less than (such as $\mu<98.6$), we call it a left-tailed test. For a left-tailed test, the $P$-value is the area under the standard normal curve to the left of the test statistic, $z$.
In the case of a right-tailed test, the $P$-value is the area to the right of the test statistic, $z$.
When a $z$ statistic is very far from zero, the shaded area can be so small that it cannot be seen. This is the case for $z=-6.669$. It is such an extreme value for $z$ that it is hard to see the actual shading. However, the shading is there, and the applet calculates the correct $P$-value. To help you see how the shading works, the three possibilities for alternative hypotheses are illustrated in the figures below, using a less extreme value of $z=-1.6$.
<center>
<table>
<thead>
<tr class="header">
<th><p>Two-tailed</p></th>
<th><p>Left-tailed</p></th>
<th><p>Right-tailed</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><p>$H_0:~~\mu = 98.6$</p></td>
<td><p>$H_0:~~\mu = 98.6$</p></td>
<td><p>$H_0:~~\mu = 98.6$</p></td>
</tr>
<tr class="even">
<td><p>$H_a:~~\mu \ne 98.6$</p></td>
<td><p>$H_a:~~\mu \lt 98.6$</p></td>
<td><p>$H_a:~~\mu \gt 98.6$</p></td>
</tr>
<tr class="odd">
<td><p><img src="./Images/Applet-TwoTail-LoRes.png"></p></td>
<td><p><img src="./Images/Applet-LeftTail-LoRes.png"></p></td>
<td><p><img src="./Images/Applet-RightTail-LoRes.png"></p></td>
</tr>
</tbody>
</table>
</center>
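
The three tail areas can also be computed directly. The following is a minimal R sketch, assuming the standard normal model described above and the illustrative value $z=-1.6$ used in the figures. `pnorm()` is base R's normal cumulative distribution function; the variable names are chosen here only for illustration.

```r
# Test statistic used in the illustration above
z <- -1.6

# Left-tailed test: area under the standard normal curve to the left of z
p_left <- pnorm(z)

# Right-tailed test: area to the right of z
p_right <- pnorm(z, lower.tail = FALSE)

# Two-tailed test: area beyond -|z| plus area beyond +|z|
p_two <- 2 * pnorm(-abs(z))

round(c(two_tailed = p_two, left_tailed = p_left, right_tailed = p_right), 4)
```

With $z=-1.6$, the left-tailed area is about 0.055, the two-tailed area is twice that (about 0.110), and the right-tailed area is about 0.945.
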
If $\bar{x}$ is close to $\mu$, then the $z$-score will be close to 0. If the $z$-score is close to 0, then there will be a lot of area in the tails beyond $z$. In other words, the $P$-value will be large. The two possibilities, a large $P$-value and a small $P$-value, are illustrated for a two-tailed test in the figure below. Similar results hold for a one-tailed test.
<center>
<table>
<thead>
<tr class="header">
<th><p>Large $P$-value</p></th>
<th><p>Small $P$-value</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><figure>
<img src="./Images/Applet-largeP-LoRes.png" width="300">
</figure></td>
<td><figure>
<img src="./Images/Applet-smallP-LoRes.png">
</figure></td>
</tr>
<tr class="even">
<td><p>Fail to reject $H_0$</p></td>
<td><p>Reject $H_0$</p></td>
</tr>
</tbody>
</table>
</center>
In the case of the body temperatures, a $z$-score far from 0 ($z=-6.669$) led to a small $P$-value (the area in the tails). We observed a value for $\bar{x}$ that was very far from $\mu=98.6$. Strictly due to chance, it would be very unlikely to observe a mean as far from 98.6° F as $\bar{x}=98.23^\circ$ F in a random sample of $n=148$ if the true mean is 98.6° F.
### Level of Significance, $\alpha$
How do we decide if the $P$-value is small enough to reject the null hypothesis? We need a standard for deciding whether there is enough evidence to reject the null hypothesis, and that standard must not depend on the data.
This standard is a number to which the $P$-value is compared. It is called the **level of significance** and is often denoted by the symbol **$\alpha$** (pronounced "alpha").
**Memory Aid:** Some students find it helpful to remember the decision rule using the couplet:<br />
**If the $P$ is low, reject the null.**
We will use the same decision rule for all hypothesis tests.
If the $P$-value is less than $\alpha$, we reject the null hypothesis. Conversely, if the $P$-value is greater than $\alpha$, we fail to reject the null hypothesis.
The level of significance, $\alpha$, must be chosen prior to collecting the data.
This is our standard for determining if the null hypothesis should be rejected. It is important that the personal bias of the researcher is not imposed on the data. So, we choose the $\alpha$ level prior to collecting the data. If we wanted to reject the null hypothesis, we could compute the $P$-value and then unscrupulously choose the $\alpha$ level. In every case, we could choose a value of $\alpha$ that is larger than the computed $P$-value, and therefore always reject the null hypothesis. This practice would defeat the purpose of hypothesis testing.
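
The decision rule is simple enough to write as a one-line function. The sketch below is only an illustration: `decide()` is a hypothetical helper defined here (it is not part of base R), and the two $P$-values are the ones reported in the worked examples later in this lesson.

```r
# Hypothetical helper: applies the decision rule "if the P is low, reject the null"
decide <- function(p_value, alpha = 0.05) {
  ifelse(p_value < alpha, "Reject H0", "Fail to reject H0")
}

alpha <- 0.05                                   # chosen before collecting data

body_temp <- decide(2.57552e-11, alpha)         # body temperature example
cardiac   <- decide(0.0616, alpha)              # cardiac arrest example

# Why alpha must be fixed in advance: the same P-value leads to a different
# decision if alpha is changed to 0.10 after the P-value has been seen.
cardiac_after_the_fact <- decide(0.0616, alpha = 0.10)

c(body_temp, cardiac, cardiac_after_the_fact)
```

The last call shows the problem described above: a researcher who picks $\alpha$ after seeing the $P$-value can always arrange to reject the null hypothesis.
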
### Type I and Type II Errors
If the null hypothesis is true (for example, if the true mean body temperature of healthy adults really is 98.6° F), sometimes we will still get a very large or very small value of $\bar{x}$. This will lead to a very extreme $z$-score, strictly due to chance. In this case, we would reject the null hypothesis, even though it is true!
Whenever we reject a true null hypothesis, we say that a **Type I error** was committed. Assuming the null hypothesis is true, the level of significance ($\alpha$) is the probability of getting a value of the test statistic that is extreme enough that the null hypothesis will be rejected. In other words, the level of significance ($\alpha$) is the probability of committing a Type I error.
We choose $\alpha$ to be some small number so that the probability of committing a Type I error is low. The most common value for $\alpha$ is $\alpha=0.05$. This is equal to $\frac{1}{20}$. So, when the null hypothesis is true, there is a one-in-twenty chance that it will be rejected at the 0.05 level of significance. Other common choices for $\alpha$ include $\alpha = 0.1$ and $\alpha=0.01$.
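
The one-in-twenty claim can be checked by simulation. The following R sketch assumes the null hypothesis is true and that body temperatures are normally distributed with $\mu=98.6$ and $\sigma=0.675$. The seed and the number of simulated studies (`reps`) are arbitrary choices made for this illustration.

```r
set.seed(221)      # arbitrary seed so the sketch is reproducible

mu0   <- 98.6      # mean stated in the null hypothesis (true in this simulation)
sigma <- 0.675     # known population standard deviation
n     <- 148       # sample size in each simulated study
alpha <- 0.05      # level of significance
reps  <- 10000     # number of simulated studies (arbitrary)

# Each row of the matrix is one random sample drawn with H0 true
samples <- matrix(rnorm(reps * n, mean = mu0, sd = sigma), nrow = reps)

# Test statistic and two-tailed P-value for every simulated study
xbar    <- rowMeans(samples)
z       <- (xbar - mu0) / (sigma / sqrt(n))
p_value <- 2 * pnorm(-abs(z))

# Proportion of studies in which a true null hypothesis was rejected
type1_rate <- mean(p_value < alpha)
type1_rate
```

The proportion of rejections should come out close to 0.05, which is the Type I error rate set by $\alpha$.
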
If committing a Type I error is undesirable, why can't we let $\alpha=0$? This would make it impossible to commit a Type I error. Actually, this would also make it impossible to reject *any* null hypothesis! If $\alpha=0$, no matter what test we do, no matter what the data show, we always fail to reject the null hypothesis. This would be pointless.
A Type I error is committed when we reject a true null hypothesis. Another problem that can arise is to fail to reject a false null hypothesis. This is called a **Type II error**. The probability of a Type II error is represented by $\beta$.
<center><img src="./Images/truth_table.PNG"></center>
<!-- <center> -->
<!-- <table border="1"> -->
<!-- <thead> -->
<!-- <tr class="header"> -->
<!-- <th>Decision</th> -->
<!-- <th colspan="2"><p> Truth</p></th> -->
<!-- </tr> -->
<!-- </thead> -->
<!-- <tbody> -->
<!-- <tr class="odd"> -->
<!-- <td><p></p></td> -->
<!-- <td><p>$H_0$ is true</p></td> -->
<!-- <td><p>$H_0$ is false</p></td> -->
<!-- </tr> -->
<!-- <tr class="even"> -->
<!-- <td><p>Fail to<br /> -->
<!-- reject $H_0$</p></td> -->
<!-- <td><p>Correct<br /> -->
<!-- Decision</p></td> -->
<!-- <td><p>Type II <br /> -->
<!-- Error</p></td> -->
<!-- </tr> -->
<!-- <tr class="odd"> -->
<!-- <td><p>Reject $H_0$</p></td> -->
<!-- <td><p>Type I<br /> -->
<!-- Error</p></td> -->
<!-- <td><p>Correct<br /> -->
<!-- Decision</p></td> -->
<!-- </tr> -->
<!-- </tbody> -->
<!-- </table> -->
<!-- </center> -->
We do not want to commit Type I errors, so why not choose a very small value for $\alpha$? If we choose $\alpha$ to be so small that we rarely commit a Type I error, what would happen to the probability of committing a Type II error? (Think about it.)
All else being equal (for example, with the same sample size), decreasing $\alpha$ (the probability of committing a Type I error) increases the probability of committing a Type II error. Conversely, increasing $\alpha$ decreases the probability of committing a Type II error.
A level of significance of $\alpha=0.05$ seems to strike a good balance between the probabilities of committing a Type I versus a Type II error. However, there may be instances where it will be important to choose a different value for $\alpha$. The important thing is to choose $\alpha$ before you collect your data. Typical choices of $\alpha$ are $0.05$ (most common), $0.1$, and $0.01$.
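
To see the trade-off numerically, $\beta$ can be computed for a two-tailed $z$-test once a specific true mean is assumed. The R sketch below uses $\mu_0=98.6$ and $\sigma=0.675$ from the body temperature example, but the sample size `n = 25` and the true mean `mu_true = 98.3` are hypothetical values chosen only for illustration; they do not come from the study.

```r
mu0     <- 98.6    # mean stated in the null hypothesis
sigma   <- 0.675   # known population standard deviation
n       <- 25      # hypothetical sample size (assumption for illustration)
mu_true <- 98.3    # hypothetical true mean (assumption for illustration)

# How far the true mean is from mu0, measured in standard errors
shift <- (mu_true - mu0) / (sigma / sqrt(n))

alpha  <- c(0.01, 0.05, 0.10)
z_crit <- qnorm(1 - alpha / 2)   # two-tailed critical values

# beta = P(fail to reject H0) = P(-z_crit < z < z_crit) when z ~ N(shift, 1)
beta <- pnorm(z_crit - shift) - pnorm(-z_crit - shift)

data.frame(alpha = alpha, beta = round(beta, 3))
```

Under these assumptions, $\beta$ falls from roughly 0.64 to roughly 0.28 as $\alpha$ rises from 0.01 to 0.10, with $\alpha=0.05$ giving a value of about 0.40 in between.
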
Read each of the following scenarios and answer the questions.
**Scenario 1**:
The BYU-Idaho Honor Code specifically prohibits wearing flip-flops on campus.
A student is seen on campus wearing shoes that look somewhat like flip-flops.
Consider the following null and alternative hypotheses:
$$
\begin{align}
H_0: & \textrm{Shoes of this type are technically flip-flops}\\
H_a: & \textrm{Shoes of this type are not technically flip-flops}\\
\end{align}
$$
You are considering approaching this student to discuss their footwear.
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
12. What action would lead to a Type I error in this case?
<a href="javascript:showhide('Q12')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q12" style="display:none;">
* A Type I error would be committed if you decided that they were not flip-flops, when in reality they were flip-flops.
</div>
<br>
13. What action corresponds to committing a Type II error?
<a href="javascript:showhide('Q13')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q13" style="display:none;">
* A Type II error would be committed if you concluded that they were technically flip-flops, when in reality they were not flip-flops.
</div>
<br>
14. In this case, is a Type I error or a Type II error more serious? Justify your response.
<a href="javascript:showhide('Q14')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q14" style="display:none;">
* Answers may vary.
</div>
</div>
<br>
**Scenario 2**:
The U. S. Food and Drug Administration (FDA) regulates all prescription medications dispensed in the United States.
Part of their responsibility is to oversee the investigation of the safety and efficacy of new drugs.
A new drug is under consideration for approval.
The following null and alternative hypotheses relate to the effectiveness of the proposed drug:
$$
\begin{align}
H_0: & \textrm{The drug does not improve patients' conditions}\\
H_a: & \textrm{The drug improves patients' conditions}\\
\end{align}
$$
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
15. What action by the FDA would lead to a Type I error in this case?
<a href="javascript:showhide('Q15')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q15" style="display:none;">
* A Type I error would be committed if the FDA decided that the drug improves patients' conditions, when in actuality it does not.
</div>
<br>
16. What action by the FDA would be associated with committing a Type II error?
<a href="javascript:showhide('Q16')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q16" style="display:none;">
* The FDA would commit a Type II error if they decided that the drug does not improve patients' conditions, when in reality it does.
</div>
<br>
17. In this case, is a Type I error or a Type II error more serious? Justify your response.
<a href="javascript:showhide('Q17')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q17" style="display:none;">
* Answers may vary. However, a Type I error is probably more serious, since patients would be exposed to the negative side effects of the drug without any benefit.
</div>
</div>
<br>
**Scenario 3**:
With regard to a new drug under consideration for approval by the FDA, the following null and alternative hypotheses address the safety of the drug:
$$
\begin{align}
H_0: & \textrm{The drug does not cause serious side effects}\\
H_a: & \textrm{The drug causes serious side effects}\\
\end{align}
$$
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
18. What action by the FDA would lead to a Type I error in this case?
<a href="javascript:showhide('Q18')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q18" style="display:none;">
* A Type I error would be committed if the FDA decided that the drug causes serious side effects, when it does not.
</div>
<br>
19. What action by the FDA corresponds to committing a Type II error?
<a href="javascript:showhide('Q19')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q19" style="display:none;">
* A Type II error would be committed if the FDA decided that the drug does not cause serious side effects, when in reality it does cause serious side effects.
</div>
<br>
20. In this case, is a Type I error or a Type II error more serious? Justify your response.
<a href="javascript:showhide('Q20')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q20" style="display:none;">
* A Type II error would be more serious, because the FDA could release a drug that they say has no serious side effects, when it actually does.
</div>
</div>
<br>
### Requirements
Certain requirements must be met for us to conduct the hypothesis test for a single mean with $\sigma$ known.
First of all, we must assume that the data represent a simple random sample from a large population. In practice, this requirement is rarely satisfied. Frequently researchers working with humans rely on convenience samples, where the subjects volunteer to participate in a research project. The researcher may advertise using television, radio, and flyers in a clinic. If there is no relationship between the issue being studied and people's desire to participate, this method works fairly well.
A second requirement is that the data are drawn from a population that has a normal distribution with the mean $\mu$ given in the null hypothesis (e.g., 98.6° F) and a known standard deviation (e.g., $\sigma=0.675^\circ$ F).
It turns out that the procedure actually works quite well--even if the data are not normally distributed--as long as $\bar{x}$ follows a normal distribution. Under what conditions will $\bar{x}$ be normally distributed? (Think about it.)
In the reading for [Lesson 6: Distribution of Sample Means & The Central Limit Theorem](Lesson06.html){target="_blank"}, we learned that $\bar{x}$ will follow a normal distribution if the data are drawn from a normal distribution or if the sample size ($n$) is large.
Putting this all together, we can summarize the requirements for a test for a single mean with $\sigma$ known as:
- The data represent a simple random sample from a large population.
- The sample mean $\bar{x}$ is normally distributed. This happens if either one of the following is true:
+ The population is normally distributed.
+ The sample size is large.
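
The second requirement can be explored with a short simulation. The R sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here only as an illustration) and checks whether sample means from samples of size $n=50$ behave the way a normal distribution predicts. The seed, sample size, and number of repetitions are arbitrary choices.

```r
set.seed(221)    # arbitrary seed so the sketch is reproducible

n    <- 50       # size of each sample (arbitrary illustration)
reps <- 5000     # number of simulated samples (arbitrary)

# Sample means from a skewed population: exponential with mean 1 and sd 1
xbars <- replicate(reps, mean(rexp(n, rate = 1)))

# Center and spread of the sample means, compared with mu = 1 and sigma / sqrt(n)
center <- mean(xbars)
spread <- sd(xbars)
c(center = center, spread = spread, sigma_over_sqrt_n = 1 / sqrt(n))

# If xbar is approximately normal, about 95% of the sample means should fall
# within 1.96 standard errors of the population mean
within_196 <- mean(abs(xbars - 1) <= 1.96 / sqrt(n))
within_196
```

Even though the individual observations are far from normal, the proportion within 1.96 standard errors should be close to 0.95 for a sample size like this one.
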
<img src="./Images/StepsAll.png">
### Worked Example: Body Temperatures
<img src="./Images/Step1.png">
**Summarize the relevant background information**
The following example illustrates how to conduct the hypothesis test for a single mean with $\sigma$ known. We will conduct a hypothesis test to determine if the true mean body temperature of healthy adults is 98.6° F, using the $\alpha=0.05$ level of significance. We will assume that the population standard deviation is known to be $\sigma=0.675^\circ$ F. This will summarize the work presented in the paragraphs above.
**State the null and alternative hypotheses and the level of significance**
$$
\begin{align}
H_0:& \mu = 98.6 \\
H_a:& \mu \ne 98.6
\end{align}
$$
$$ \alpha = 0.05 $$
<img src="./Images/Step2.png">
**Describe the data collection procedures**
The body temperatures of $n = 148$ healthy adults were measured using calibrated thermometers in a clinical setting.
<img src="./Images/Step3.png">
**Give the relevant summary statistics**
$$
\begin{array}{l}
\bar{x} = 98.23\\
\sigma = 0.675\\
n = 148
\end{array}
$$
**Make an appropriate graph (e.g. a histogram) to illustrate the data**
<img src="./Images/BodyTemp-Histogram-LoRes.png">
<img src="./Images/Step4.png">
**Verify the requirements have been met**
- We assume that the individuals chosen to participate in the study represent a (simple) random sample from the population.
- $\bar{x}$ will be normally distributed because the sample size is large. (Note: We could have also noticed that the body temperature data appear to be normally distributed, so even with a small sample size, $\bar{x}$ would be normal.)
**Give the test statistic and its value**
$$z=-6.669$$
**Mark the test statistic and $P$-value on a graph of the sampling distribution (i.e. the standard normal curve)**
<img src="./Images/BodyTemp-AppletPvalue.png">
**Find the $P$-value and compare it to the level of significance**
$$P\textrm{-value} = 2.57552 \times 10^{-11} \approx 0.000~000~000~026 < 0.05 = \alpha$$
**State your decision**
Since the $P$-value is less than $\alpha$, we reject the null hypothesis.
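
The test statistic, $P$-value, and decision for this example can be reproduced from the summary statistics. This is a minimal R sketch, assuming the rounded values $\bar{x}=98.23$, $\sigma=0.675$, and $n=148$ given above; because $\bar{x}$ is rounded, the last digits may differ slightly from the values reported by the applet.

```r
xbar  <- 98.23    # sample mean
mu0   <- 98.6     # mean stated in the null hypothesis
sigma <- 0.675    # known population standard deviation
n     <- 148      # sample size
alpha <- 0.05     # level of significance

# Test statistic: number of standard errors between xbar and mu0
z <- (xbar - mu0) / (sigma / sqrt(n))

# Two-tailed P-value, because the alternative hypothesis uses "not equal"
p_value <- 2 * pnorm(-abs(z))

# Decision rule: reject H0 when the P-value is less than alpha
reject_h0 <- p_value < alpha

z
p_value
reject_h0
```

The $z$-score is approximately $-6.67$ and the $P$-value is on the order of $10^{-11}$, so the null hypothesis is rejected.
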
<img src="./Images/Step5.png">
**Present your conclusion in an English sentence, relating the result to the context of the problem**
There is sufficient evidence to suggest that the true mean body temperature of healthy adults is different from 98.6° F.
<img src="./Images/StepsAll.png">
## Worked Example: Perceived Health after a Cardiac Arrest
A group of researchers led by Dr. Jared Bunch studied the long-term effects suffered by patients who experienced a cardiac arrest outside a hospital<!-- \cite{Bunch03} -->. The long-term health of the patients was assessed using the Short-Form General Health Survey (SF-36)<!-- \cite{Ware92} --> at the time of their last follow-up visit. The SF-36 is normalized so the mean score in the general population is $\mu=50$ and the standard deviation is $\sigma=10$<!-- \cite{Jenkinson99} -->. The minimum score is 0 and the maximum score is 100. Lower scores on the SF-36 indicate a poorer quality of health.
Using the 0.05 level of significance, we will conduct a hypothesis test to determine if the mean perceived general health level among cardiac arrest survivors is less than 50. In other words, we want to know if the mean perceived overall health is lower among cardiac arrest survivors than in the general population.
Dr. Bunch summarized the responses in a figure, <!--\cite{Bunch03}--> from which the data were extracted. These data are given in the file [CardiacArrestHealth](./Data/CardiacArrestHealth.xlsx).
<div class="QuestionsHeading">Answer the following questions:</div>
<div class="Questions">
<img src="./Images/Step1.png">
21. Summarize the relevant background information.
<a href="javascript:showhide('Q21')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q21" style="display:none;">
* The long-term health of patients who had previously suffered a cardiac arrest was studied. The researchers want to know if these patients feel that their overall long-term health was worsened by the cardiac arrest.
* After patients concluded their treatment for the cardiac arrest, they were given the SF-36 survey to assess their overall health. In the general population, the mean score on the SF-36 is 50. Lower scores indicate poorer health. The researchers will test to see if the mean score of those who have had a cardiac arrest is less than 50.
</div>
<br>
22. State the null and alternative hypotheses and the level of significance.
<a href="javascript:showhide('Q22')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q22" style="display:none;">
<center>
$$
\begin{array}{lcl}
H_0:~~\mu = 50\\
H_a:~~\mu < 50\\
\alpha = 0.05
\end{array}
$$
</center>
* Recall that we are testing to see if the mean is less than 50.
</div>
<br>
<img src="./Images/Step2.png">
23. Describe the data collection procedures.
<a href="javascript:showhide('Q23')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q23" style="display:none;">
* The researchers collected data from $n=50$ patients who had suffered a cardiac arrest outside a hospital. The data were collected using the SF-36 survey instrument.
</div>
<br>
<img src="./Images/Step3.png">
24. Give the relevant summary statistics.
<a href="javascript:showhide('Q24')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q24" style="display:none;">
<center>
$$
\begin{array}{l}
\bar{x} = 47.82\\
\sigma = 10\\
n = 50
\end{array}
$$
</center>
</div>
<br>
25. Make an appropriate graph to illustrate the data.
<a href="javascript:showhide('Q25')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q25" style="display:none;">
<img src="./Images/CardiacArrestHealth-Histogram.png">
</div>
<br>
<img src="./Images/Step4.png">
26. Verify the requirements have been met.
<a href="javascript:showhide('Q26')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q26" style="display:none;">
+ We assume that the individuals chosen to participate in the study represent a (simple) random sample from the population.
+ $\bar{x}$ will be normally distributed, because the sample size is large.
</div>
<br>
27. Give the test statistic and its value.
<a href="javascript:showhide('Q27')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q27" style="display:none;">
<center>
$\displaystyle{ z=\frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{47.82 - 50}{10 / \sqrt{50}} = -1.5415 }$
</center>
</div>
<br>
28. Mark the test statistic and $P$-value on a graph of the sampling distribution.
<a href="javascript:showhide('Q28')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q28" style="display:none;">
<img src="./Images/CardiacArrestHealth-AppletPvalue.png">
</div>
<br>
29. Find the $P$-value and compare it to the level of significance.
<a href="javascript:showhide('Q29')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q29" style="display:none;">
<center>
$P\textrm{-value} = 0.0616 > 0.05 = \alpha$
</center>
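
As a rough check of the values in Questions 27 and 29, the calculation can be sketched in R from the summary statistics. The sketch assumes the rounded values $\bar{x}=47.82$, $\sigma=10$, and $n=50$, and uses the left tail because the alternative hypothesis is $\mu<50$.

```r
xbar  <- 47.82    # sample mean of the SF-36 scores
mu0   <- 50       # mean stated in the null hypothesis
sigma <- 10       # known population standard deviation
n     <- 50       # sample size
alpha <- 0.05     # level of significance

# Test statistic
z <- (xbar - mu0) / (sigma / sqrt(n))

# Left-tailed P-value: area under the standard normal curve to the left of z
p_value <- pnorm(z)

round(c(z = z, p_value = p_value), 4)

# TRUE means the P-value is greater than alpha
p_value > alpha
```
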
</div>
<br>
30. State your decision.
<a href="javascript:showhide('Q30')"><span style="font-size:8pt;">Show/Hide Solution</span></a>
<div id="Q30" style="display:none;">
* Since the $P$-value is greater than $\alpha$, we fail to reject the null hypothesis.
</div>
<br>
<img src="./Images/Step5.png">