There are many different parameters that you can test. There is a test for the mean, such as was introduced with the z-test. There is also a test for the population proportion, p. This is where you might be curious if the proportion of students who smoke at your school is lower than the proportion in your area. Or you could question if the proportion of accidents caused by teenage drivers who do not have a drivers’ education class is more than the national proportion.
To test a population proportion, a few things need to be defined first. Usually, Greek letters are used for parameters and Latin letters for statistics. When talking about proportions, it makes sense to use p for proportion. The Greek letter for p is \(\pi\), but that is too easily confused with the constant \(\pi \approx 3.14\). Instead, p is used for the population proportion. That means that a different symbol is needed for the sample proportion. The convention is to use \(\hat{p}\), known as p-hat. This way you know that p is the population proportion, and that \(\hat{p}\) is the sample proportion related to it.
Now proportion tests are about looking for the percentage of individuals who have a particular attribute. You are really looking for the number of successes that happen. Thus, a proportion test involves a binomial distribution.
Hypothesis Test for One Population Proportion (1-Prop Test)
- State the random variable and the parameter in words.
x = number of successes
p = proportion of successes
- State the null and alternative hypotheses and the level of significance
\(H_{o} : p=p_{o}\), where \(p_{o}\) is the known proportion
\(H_{A} : p<p_{o}\), \(H_{A} : p>p_{o}\), or \(H_{A} : p \neq p_{o}\); use the appropriate one for your problem
Also, state your \(\alpha\) level here.
- State and check the assumptions for a hypothesis test
- A simple random sample of size n is taken.
- The conditions for the binomial distribution are satisfied
- To determine the sampling distribution of \(\hat{p}\), you need to show that \(n p \geq 5\) and \(n q \geq 5\), where \(q=1-p\). If this requirement is true, then the sampling distribution of \(\hat{p}\) is well approximated by a normal curve.
- Find the sample statistic, test statistic, and p-value
Sample Proportion:
\(\hat{p}=\dfrac{x}{n}=\dfrac{\# \text { of successes }}{\# \text { of trials }}\)
Test Statistic:
\(z=\dfrac{\hat{p}-p}{\sqrt{\dfrac{p q}{n}}}\)
p-value:
TI-83/84: Use normalcdf(lower limit, upper limit, 0, 1)
Note
If \(H_{A} : p<p_{o}\), then the lower limit is \(-1 E 99\) and the upper limit is your test statistic. If \(H_{A} : p>p_{o}\), then the lower limit is your test statistic and the upper limit is \(1 E 99\). If \(H_{A} : p \neq p_{o}\), then find the one-tailed p-value on the side where your test statistic falls (left tail if z is negative, right tail if z is positive) and multiply by 2.
R: Use pnorm(z, 0, 1)
Note
If \(H_{A} : p<p_{o}\), then you can use pnorm directly. If \(H_{A} : p>p_{o}\), then you have to find pnorm and then subtract from 1. If \(H_{A} : p \neq p_{o}\), then find the one-tailed p-value on the side where your test statistic falls and multiply by 2.
- Conclusion
This is where you write reject \(H_{o}\) or fail to reject \(H_{o}\). The rule is: if the p-value < \(\alpha\), then reject \(H_{o}\). If the p-value \(\geq \alpha\), then fail to reject \(H_{o}\).
- Interpretation
This is where you interpret in real world terms the conclusion to the test. The conclusion for a hypothesis test is that you either have enough evidence to show \(H_{A}\) is true, or you do not have enough evidence to show \(H_{A}\) is true.
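The steps of the 1-Prop Test can be sketched in code. The following is a minimal Python sketch and not part of the original procedure: the names `normal_cdf` and `one_prop_z_test` and the `alternative` labels are hypothetical, and the data in the usage lines (60 successes in 100 trials, tested against \(p_{o}=0.5\)) are made up for illustration. It uses only the standard library, with `math.erf` standing in for normalcdf or pnorm.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, the same quantity as pnorm(z, 0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_prop_z_test(x, n, p0, alternative="two-sided"):
    """Return (p_hat, z, p_value) for Ho: p = p0.

    alternative is "less", "greater", or "two-sided" (hypothetical labels).
    """
    # Assumption check from step 3: n*p >= 5 and n*q >= 5
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need n*p0 >= 5 and n*(1 - p0) >= 5")

    p_hat = x / n                                   # sample proportion
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)  # test statistic

    if alternative == "less":
        p_value = normal_cdf(z)                 # left tail
    elif alternative == "greater":
        p_value = 1.0 - normal_cdf(z)           # right tail
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - normal_cdf(abs(z)))  # both tails
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return p_hat, z, p_value


# Made-up illustration: 60 successes in 100 trials, Ho: p = 0.5, HA: p > 0.5
p_hat, z, p_value = one_prop_z_test(60, 100, 0.5, alternative="greater")
alpha = 0.05
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
print(f"p_hat = {p_hat:.3f}, z = {z:.4f}, p-value = {p_value:.4f}: {decision}")
```

The two-sided branch doubles the tail beyond |z|, so it gives a valid p-value whether the test statistic is negative or positive.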
Example \(\PageIndex{1}\) hypothesis test for one proportion using formula
A concern was raised in Australia that the percentage of deaths of Aboriginal prisoners was higher than the percent of deaths of non-Aboriginal prisoners, which is 0.27%. A sample of six years (1990-1995) of data was collected, and it was found that out of 14,495 Aboriginal prisoners, 51 died ("Indigenous deaths in," 1996). Do the data provide enough evidence to show that the proportion of deaths of Aboriginal prisoners is more than 0.27%?
- State the random variable and the parameter in words.
- State the null and alternative hypotheses and the level of significance.
- State and check the assumptions for a hypothesis test.
- Find the sample statistic, test statistic, and p-value.
- Conclusion
- Interpretation
Solution
1. x = number of Aboriginal prisoners who die
p = proportion of Aboriginal prisoners who die
2. \(\begin{array}{l}{H_{o} : p=0.0027} \\ {H_{A} : p>0.0027}\end{array}\)
Example \(\PageIndex{5}\)b argued that \(\alpha=0.05\).
3.
- The assumption is that a simple random sample of 14,495 Aboriginal prisoners was taken. In fact, the sample was not random: it contains all prisoners from six years, and those six years were not picked at random. Unless there was something special about the six years that were chosen, the sample is probably representative, so this assumption is probably met.
- There are 14,495 prisoners in this case. The prisoners are all Aboriginals, so you are not mixing Aboriginal with non-Aboriginal prisoners. There are only two outcomes, either the prisoner dies or doesn’t. The chance that one prisoner dies over another may not be constant, but if you consider all prisoners the same, then it may be close to the same probability. Thus the conditions for the binomial distribution are satisfied.
- In this case p = 0.0027 and n = 14,495. \(n p=14495 \times 0.0027 \approx 39 \geq 5\) and \(n q=14495 \times (1-0.0027) \approx 14456 \geq 5\). So, the sampling distribution of \(\hat{p}\) is well approximated by a normal curve.
4. Sample Proportion:
x = 51
n = 14495
\(\hat{p}=\dfrac{x}{n}=\dfrac{51}{14495} \approx 0.003518\)
Test Statistic:
\(z=\dfrac{\hat{p}-p}{\sqrt{\dfrac{p q}{n}}}=\dfrac{0.003518-0.0027}{\sqrt{\dfrac{0.0027(1-0.0027)}{14495}}} \approx 1.8979\)
p-value:
TI-83/84: p-value = \(P(z>1.8979)=\text { normalcdf }(1.8979,1 E 99,0,1) \approx 0.029\)
R: p-value = \(P(z>1.8979)=1-\text { pnorm }(1.8979,0,1) \approx 0.029\)
5. Since the p-value < 0.05, then reject \(H_{o}\).
6. There is enough evidence to show that the proportion of deaths of Aboriginal prisoners is more than for non-Aboriginal prisoners.
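The arithmetic in step 4 of this example can be checked with a few lines of Python. This is only a sketch using the standard library, with `math.erfc` standing in for normalcdf or pnorm; the variable names are illustrative. Because it carries \(\hat{p}\) at full precision instead of the rounded 0.003518, the z it prints is about 1.899, marginally different from the 1.8979 above, and the p-value still rounds to 0.029.

```python
import math

x, n, p0 = 51, 14495, 0.0027   # deaths, prisoners, and the value of p under Ho

p_hat = x / n                         # sample proportion
se = math.sqrt(p0 * (1 - p0) / n)     # standard deviation of p_hat under Ho
z = (p_hat - p0) / se                 # test statistic

# Right-tailed p-value P(Z > z) for HA: p > 0.0027
p_value = 0.5 * math.erfc(z / math.sqrt(2.0))

print(f"p_hat   = {p_hat:.6f}")
print(f"z       = {z:.4f}")
print(f"p-value = {p_value:.3f}")
```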
Example \(\PageIndex{2}\) hypothesis test for one proportion using technology
A researcher who is studying the effects of income levels on breastfeeding of infants hypothesizes that countries where the income level is lower have a higher rate of infant breastfeeding than higher income countries. It is known that in Germany, considered a high-income country by the World Bank, 22% of all babies are breastfed. In Tajikistan, considered a low-income country by the World Bank, researchers found that in a random sample of 500 new mothers, 125 were breastfeeding their infants. At the 5% level of significance, does this show that low-income countries have a higher incidence of breastfeeding?
- State the random variable and the parameter in words.
- State the null and alternative hypotheses and the level of significance.
- State and check the assumptions for a hypothesis test.
- Find the sample statistic, test statistic, and p-value.
- Conclusion
- Interpretation
Solution
1. x = number of women who breastfeed in a low-income country
p = proportion of women who breastfeed in a low-income country
2. \(\begin{array}{l}{H_{o} : p=0.22} \\ {H_{A} : p>0.22} \\ {\alpha=0.05}\end{array}\)
3.
- A simple random sample of the breastfeeding habits of 500 women in a low-income country was taken, as stated in the problem.
- There were 500 women in the study. The women are considered identical, though they probably have some differences. There are only two outcomes, either the woman breastfeeds or she doesn’t. The probability of a woman breastfeeding is probably not the same for each woman, but it is probably not very different for each woman. The conditions for the binomial distribution are satisfied.
- In this case, n = 500 and p = 0.22. \(n p=500(0.22)=110 \geq 5\) and \(n q=500(1-0.22)=390 \geq 5\), so the sampling distribution of \(\hat{p}\) is well approximated by a normal curve.
4. This time, all calculations will be done with technology. On the TI-83/84 calculator, go into the STAT menu, then arrow over to TESTS. This test is 1-PropZTest. Then type in the information just as shown in Figure \(\PageIndex{1}\).
[Figure \(\PageIndex{1}\): TI-83/84 setup screen for 1-PropZTest]
Once you press Calculate, you will see the results as in Figure \(\PageIndex{2}\).
[Figure \(\PageIndex{2}\): TI-83/84 results screen for 1-PropZTest]
The z in the results is the test statistic. The p = 0.052683219 is the p-value, and the \(\hat{p}=0.25\) is the sample proportion.
The p-value is approximately 0.053.
On R, the command is prop.test(x, n, po, alternative = "less" or "greater"), where po is what \(\mathrm{H}_{\mathrm{o}}\) says p equals, and you use less if your \(\mathrm{H}_{\mathrm{A}}\) is less and greater if your \(\mathrm{H}_{\mathrm{A}}\) is greater. If your \(\mathrm{H}_{\mathrm{A}}\) is not equal to, then leave off the alternative statement. So for this example, the command would be prop.test(125, 500, .22, alternative = "greater")
1-sample proportions test with continuity correction
data: 125 out of 500, null probability 0.22
X-squared = 2.4505, df = 1, p-value = 0.05874
alternative hypothesis: true p is greater than 0.22
95 percent confidence interval:
0.218598 1.000000
sample estimates:
p
0.25
Note
R does a continuity correction that the formula and the TI-83/84 calculator do not do. You can put in a command that says not to use the continuity correction, but it is correct to use it. Also, R doesn’t give the z test statistic, so you don’t need to worry about this. It does give a p-value that is slightly off from the formula and the calculator due to the continuity correction.
p-value = 0.05874
5. Since the p-value is more than 0.05, you fail to reject \(H_{o}\).
6. There is not enough evidence to show that the proportion of women who breastfeed in low-income countries is more than in high-income countries.
Notice, the conclusion is that there wasn't enough evidence to show what \(H_{A}\) said. The conclusion was not that you proved \(H_{o}\) true. There are many reasons why you can’t say that \(H_{o}\) is true. It could be that the countries you chose were not very representative of what truly happens. If you instead looked at all high-income countries and compared them to low-income countries, you might have different results. It could also be that the sample you collected in the low-income country was not representative. It could also be that income level is not an indication of breastfeeding habits. There could be other factors involved. This is why you can’t say that you have proven \(H_{o}\) is true. There are too many other factors that could be the reason that you failed to reject \(H_{o}\).
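The two p-values reported in this example, about 0.053 from the calculator and 0.05874 from R, can both be reproduced with a short Python sketch. It is an illustration only: it assumes the continuity correction subtracts 0.5 from \(|x-n p|\) before standardizing, which matches the X-squared = 2.4505 in the R output above. The helper name `upper_tail` is hypothetical.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal Z, i.e. 1 - pnorm(z, 0, 1)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


x, n, p0 = 125, 500, 0.22                 # data and the value of p under Ho
sd_count = math.sqrt(n * p0 * (1 - p0))   # standard deviation of the count x under Ho

# Without continuity correction (formula and TI-83/84)
z_plain = (x - n * p0) / sd_count
p_plain = upper_tail(z_plain)

# With continuity correction (assumed form: shrink |x - n*p0| by 0.5)
z_corrected = (abs(x - n * p0) - 0.5) / sd_count
p_corrected = upper_tail(z_corrected)

print(f"no correction:   z = {z_plain:.4f}, p-value = {p_plain:.4f}")
print(f"with correction: z^2 = {z_corrected ** 2:.4f}, p-value = {p_corrected:.5f}")
```

Both p-values are above 0.05, so the decision to fail to reject \(H_{o}\) is the same either way.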
Homework
Exercise \(\PageIndex{1}\)
In each problem show all steps of the hypothesis test. If some of the assumptions are not met, note that the results of the test may not be correct and then continue the process of the hypothesis test.
- Eyeglassomatic manufactures eyeglasses for different retailers. They test to see how many defective lenses they made in a given time period and found that 11% of all lenses had defects of some type. Looking at the type of defects, they found in a three-month time period that out of 34,641 defective lenses, 5865 were due to scratches. Are there more defects from scratches than from all other causes? Use a 1% level of significance.
- In July of 1997, Australians were asked if they thought unemployment would increase, and 47% thought that it would increase. In November of 1997, they were asked again. At that time 284 out of 631 said that they thought unemployment would increase ("Morgan gallup poll," 2013). At the 5% level, is there enough evidence to show that the proportion of Australians in November 1997 who believe unemployment would increase is less than the proportion who felt it would increase in July 1997?
- According to the February 2008 Federal Trade Commission report on consumer fraud and identity theft, 23% of all complaints in 2007 were for identity theft. In that year, Arkansas had 1,601 complaints of identity theft out of 3,482 consumer complaints ("Consumer fraud and," 2008). Does this data provide enough evidence to show that Arkansas had a higher proportion of identity theft than 23%? Test at the 5% level.
- According to the February 2008 Federal Trade Commission report on consumer fraud and identity theft, 23% of all complaints in 2007 were for identity theft. In that year, Alaska had 321 complaints of identity theft out of 1,432 consumer complaints ("Consumer fraud and," 2008). Does this data provide enough evidence to show that Alaska had a lower proportion of identity theft than 23%? Test at the 5% level.
- In 2001, the Gallup poll found that 81% of American adults believed that there was a conspiracy in the death of President Kennedy. In 2013, the Gallup poll asked 1,039 American adults if they believe there was a conspiracy in the assassination, and found that 634 believe there was a conspiracy ("Gallup news service," 2013). Do the data show that the proportion of Americans who believe in this conspiracy has decreased? Test at the 1% level.
- In 2008, there were 507 children in Arizona out of 32,601 who were diagnosed with Autism Spectrum Disorder (ASD) ("Autism and developmental," 2008). Nationally 1 in 88 children are diagnosed with ASD ("CDC features -," 2013). Is there sufficient data to show that the incidence of ASD is higher in Arizona than nationally? Test at the 1% level.
- Answer
For all hypothesis tests, just the conclusion is given. See solutions for the entire answer.
1. Reject Ho.
3. Reject Ho.
5. Reject Ho.
FAQs
How do you interpret a one sample proportion test?
Compare the p-value to α: if the p-value < α, reject Ho; if the p-value ≥ α, fail to reject Ho. Then interpret the conclusion in real-world terms: you either have enough evidence to show HA is true, or you do not have enough evidence to show HA is true.
What is a 1 proportion test?
The 1 proportion test assesses whether a population proportion differs from a target value. In statistical terms, the procedure computes a confidence interval and performs a hypothesis test. The null hypothesis is that the population proportion (p) equals a hypothesized value (H0: p = p0).
What is a one sample z-test for proportions?
The one proportion z-test, or one-sample z-test for a proportion, is one of the most popular statistical hypothesis tests dealing with one sample proportion. It is used to determine whether a hypothesized value of the population proportion can be rejected based on sample data.
How do you find the test statistic for a single proportion?
The test statistic is a z-score defined by \(z=\dfrac{p-P}{\sigma}\), where P is the hypothesized value of the population proportion in the null hypothesis, p is the sample proportion, and \(\sigma=\sqrt{\dfrac{P(1-P)}{n}}\) is the standard deviation of the sampling distribution.
What is a normal sample proportion?
For large samples, the sample proportion is approximately normally distributed, with mean \(\mu_{\hat{p}}=p\) and standard deviation \(\sigma_{\hat{p}}=\sqrt{\dfrac{p q}{n}}\). A sample is large if the interval \(\left[p-3 \sigma_{\hat{p}}, p+3 \sigma_{\hat{p}}\right]\) lies wholly within the interval [0, 1].
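The large-sample condition in the last answer can be compared with the \(n p \geq 5\) and \(n q \geq 5\) rule used in the procedure. The sketch below is only an illustration: the function name `normality_checks` is hypothetical, the first two calls use n and p from Examples 1 and 2, and the third (n = 20, p = 0.1) is a made-up small sample.

```python
import math


def normality_checks(n, p):
    """Return (count_rule, interval_rule) for the sampling distribution of p-hat.

    count_rule:    n*p >= 5 and n*q >= 5
    interval_rule: [p - 3*sigma, p + 3*sigma] lies within [0, 1]
    """
    q = 1 - p
    sigma = math.sqrt(p * q / n)   # standard deviation of p-hat
    count_rule = n * p >= 5 and n * q >= 5
    interval_rule = (p - 3 * sigma) >= 0 and (p + 3 * sigma) <= 1
    return count_rule, interval_rule


example_1 = normality_checks(14495, 0.0027)   # Example 1
example_2 = normality_checks(500, 0.22)       # Example 2
small = normality_checks(20, 0.1)             # made-up small sample

print(example_1, example_2, small)
```

For these three cases the two rules of thumb agree: both examples pass and the small sample fails.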
How do you interpret ratios and proportions?
A ratio is an ordered pair of numbers a and b, written a / b, where b does not equal 0. A proportion is an equation in which two ratios are set equal to each other. For example, if there are 1 boy and 3 girls, you could write the ratio as 1 : 3 (for every one boy there are 3 girls).
What is the p-value in a proportion test?
The p-value is the proportion of samples in the randomization distribution that are at least as extreme as the observed sample in the direction of the alternative hypothesis. The p-value is compared to the alpha level (typically 0.05).
What does a 1 p-value mean?
Being a probability, a p-value can take any value between 0 and 1. Values close to 0 indicate that a result like the one observed would be unlikely if the null hypothesis were true, whereas a p-value close to 1 indicates that the data are consistent with the null hypothesis.
What is a one sample t test example?
For example, imagine a company wants to test the claim that its batteries last more than 40 hours. A simple random sample of 15 batteries yielded a mean of 44.9 hours, with a standard deviation of 8.9 hours. The claim is then tested using a significance level of 0.05.
What does a 1 z-score tell you?
A z-score of 1.0 indicates a value that is one standard deviation from the mean. Z-scores may be positive or negative, with a positive value indicating the score is above the mean and a negative value indicating it is below the mean.
What is the z-score for a population value of 1?
A z-score of 1 means that the data point is exactly 1 standard deviation above the mean.
What does a 1 z-score mean?
A 1 in a z-score means 1 standard deviation, not 1 unit. So if the standard deviation of the data set is 1.69, a z-score of 1 would mean that the data point is 1.69 units above the mean.
How do you calculate sample proportion?
The sample proportion P is given by P = X/N, where X denotes the number of successes and N denotes the size of the sample in question.
What is a healthy sample size?
A good maximum sample size is usually around 10% of the population, as long as this does not exceed 1000. For example, in a population of 5000, 10% would be 500. In a population of 200,000, 10% would be 20,000, which exceeds 1000, so the maximum would be 1000.
What sample size is normal?
The conventional rule of thumb is that a sample size of 30 is big enough for the sample mean to be distributed roughly normally, even when the underlying population is skewed.
What proportion of scores is normal distribution?
The Empirical Rule: Given a data set that is approximately normally distributed, approximately 68% of the data is within one standard deviation of the mean, approximately 95% is within two standard deviations of the mean, and approximately 99.7% is within three standard deviations of the mean.
What is ratio grade 7?
In mathematics, a ratio indicates how many times one number contains another, while a rate expresses a ratio for two quantities measured in different units.
What is ratio and proportion Grade 7?
A ratio may be treated as a fraction. Two ratios are equivalent if the fractions corresponding to them are equivalent. Four quantities are said to be in proportion if the ratio of the first and the second quantities is equal to the ratio of the third and the fourth quantities.
What grade level is ratios and proportions?
In sixth grade, students are introduced to the concept of a ratio between two quantities.
What does p-value of 5% mean?
A p-value of 0.05 does not mean that there is a 95% chance that a given hypothesis is correct. Instead, it signifies that if the null hypothesis is true, and all other assumptions made are valid, there is a 5% chance of obtaining a result at least as extreme as the one observed.
What happens if p-value is too high?
High p-values indicate that your evidence is not strong enough to suggest an effect exists in the population. An effect might exist, but it's possible that the effect size is too small, the sample size is too small, or there is too much variability for the hypothesis test to detect it.
What does 0.7 p-value mean?
A p-value of 0.7 means that, if the null hypothesis is true, there is a 70% chance of obtaining a result at least as extreme as the one observed. This is far above the usual significance levels, so you fail to reject the null hypothesis.
What if the p-value is above 1?
A p-value is a probability and, as a probability, it ranges from 0 to 1 and cannot exceed 1. A p-value higher than 1 would mean a probability greater than 100%, and this can't occur.
Can a p-value be higher than 1?
No. P-values are probabilities and so cannot exceed 1.
What if p-value is less than 1?
Every p-value is at most 1, so what matters is how it compares with the significance level. If a p-value is lower than the significance level, we reject the null hypothesis. If not, we fail to reject the null hypothesis.
How do you analyze t-test results?
Compare the p-value to the alpha significance level. If the p-value is smaller than the alpha level, you reject the null hypothesis: the result is statistically significant and supports the alternative hypothesis.
How do you know when to use a one-sample t-test?
The one-sample t-test is used when we want to know whether our sample comes from a particular population but we do not have full population information available to us. For instance, we may want to know if a particular sample of college students is similar to or different from college students in general.
How do you know if it's a one-sample t-test?
The one sample t test compares the mean of your sample data to a known value. For example, you might want to know how your sample mean compares to the population mean. You should run a one sample t test when you don't know the population standard deviation or you have a small sample size.
What does a z-score of 2.5 mean?
Z-scores are measured in standard deviation units. A z-score of 2.5 means your observed value is 2.5 standard deviations from the mean. The closer your z-score is to zero, the closer your value is to the mean. Positive z-scores result from values that are above the mean, and negative z-scores from values below the mean. The greater a z-score's absolute value, the further the data point lies from the mean.
Which z-score is higher 1 or 2?
A z-score of 2 is higher. A z-score of 1 is 1 standard deviation above the mean. A score of 2 is 2 standard deviations above the mean. A score of -1.8 is 1.8 standard deviations below the mean.
What is the z-score of 1.5 mean?
A z-score of 1.5 means that a value is 1.5 standard deviations greater than the mean. Z-scores are negative if they are below the mean, so under the empirical rule about 68% of the values fall between the z-scores of -1 and 1.
Is 1.3 A good z-score?
It depends on the standard being applied. In a normal distribution the 90th percentile is at z ≈ 1.2816, so if the top 10% of scores counts as "good", a z-score of 1.3 qualifies.
What if the z-score is greater than 1?
A positive z-score indicates the raw score is higher than the mean. For example, if a z-score is equal to +1, it is 1 standard deviation above the mean. A negative z-score reveals the raw score is below the mean. For example, if a z-score is equal to -2, it is 2 standard deviations below the mean.
Is 3 a good z-score?
A positive z-score says the data point is above average. A negative z-score says the data point is below average. A z-score close to 0 says the data point is close to average. A data point can be considered unusual if its z-score is above 3 or below −3.
What is a high z-score?
A high z-score means the data point is many standard deviations away from the mean. This could happen as a matter of course with heavy-tailed or long-tailed distributions, or could signify outliers. A good first step would be to plot a histogram or other density estimate and take a look at the distribution.
What z-score is abnormal?
As a general rule, z-scores lower than -1.96 or higher than 1.96 are considered unusual, because in a normal distribution only about 5% of values lie that far from the mean.
How do you solve a proportion solution?
To solve a proportion, start by taking the numerator, or top number, of the fraction you know and multiplying it by the denominator, or bottom number, of the fraction you don't know. Next, take that number and divide it by the denominator of the fraction you know. Now you can replace x with this final number.
How do you interpret a confidence interval for a proportion?
A confidence interval indicates where the population parameter is likely to reside. For example, a 95% confidence interval for the mean of [9, 11] suggests you can be 95% confident that the population mean is between 9 and 11.
What does p indicate in a single sample test result?
The p-value only tells you how likely the data you have observed is to have occurred under the null hypothesis. If the p-value is below your threshold of significance (typically p < 0.05), then you can reject the null hypothesis, but this does not necessarily mean that your alternative hypothesis is true.
What is considered a high standard error?
A high standard error shows that sample means are widely spread around the population mean, so your sample may not closely represent your population. A low standard error shows that sample means are closely distributed around the population mean, so your sample is representative of your population.
How do you know if standard error is significant?
When the standard error is large relative to the statistic, the statistic will typically be non-significant. However, if the sample size is very large, for example greater than 1,000, then even very small effects are likely to be statistically significant.
What does standard error of proportion tell you?
The standard error of a proportion is a statistic indicating how greatly a particular sample proportion is likely to differ from the population proportion, p.
What is a good 95 confidence interval?
Once the standard error is calculated, the margin of error is found by multiplying the standard error by a constant that reflects the desired level of confidence, based on the normal distribution. The confidence interval is the estimate plus or minus this margin. The constant for 95 percent confidence intervals is 1.96.
What does the 95% confidence interval tell you about your data?
The 95% confidence interval defines a range of values that you can be 95% certain contains the population mean. With large samples, you know that mean with much more precision than you do with a small sample, so the confidence interval is quite narrow when computed from a large sample.
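As an illustration of the 1.96 multiplier applied to a proportion, the sketch below computes a simple normal-approximation interval for the Example 2 data (125 out of 500). It is an assumption-laden sketch: it estimates the standard error from \(\hat{p}\) itself, and it is not the one-sided interval that prop.test prints in the R output of Example 2. The variable names are illustrative.

```python
import math

x, n = 125, 500          # Example 2 data
p_hat = x / n            # sample proportion

# Standard error of the sample proportion, estimated from p_hat (assumption)
se = math.sqrt(p_hat * (1 - p_hat) / n)

margin = 1.96 * se       # 1.96 is the constant for 95 percent confidence
lower, upper = p_hat - margin, p_hat + margin

print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

With these numbers the interval runs from roughly 0.212 to 0.288.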
What is a good p-value?
A p-value of 0.05 or lower is generally considered statistically significant. The p-value can serve as an alternative to, or in addition to, preselected confidence levels for hypothesis testing.
What is a good p test value?
A p-value of 0.05 or less is conventionally treated as statistically significant. It indicates evidence against the null hypothesis: if the null hypothesis were true, there would be at most a 5% probability of obtaining results at least as extreme as those observed.