Traits | Probability | Observed | Expected |
---|---|---|---|
Purple and Long | 9/16 | 284 | |
Purple and Round | 3/16 | 21 | |
Red and Long | 3/16 | 21 | |
Red and Round | 1/16 | 55 | |
How would we conduct a goodness-of-fit test for no linkage?
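One way to sketch the calculation, assuming that "no linkage" corresponds to the 9:3:3:1 probabilities listed in the table. The variable names and the helper `chi2_sf_3df` are illustrative and not part of the lecture; the helper uses the closed form of the chi-square upper tail for exactly 3 degrees of freedom.

```python
import math

# Observed counts from the table above, in row order.
observed = [284, 21, 21, 55]
# Probabilities under the null hypothesis of no linkage (9:3:3:1 ratio).
null_probs = [9 / 16, 3 / 16, 3 / 16, 1 / 16]

n = sum(observed)
# Expected count for each category: sample size times null probability.
expected = [n * p for p in null_probs]

# Pearson statistic: sum of (observed - expected)^2 / expected.
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# Degrees of freedom: number of categories minus one.
df = len(observed) - 1


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_3df(x2)

print("expected:", [round(e, 2) for e in expected])
print("X^2 =", round(x2, 2), "df =", df, "p-value =", p_value)
```

A very small \(p\)-value here would be evidence against the 9:3:3:1 model, that is, evidence of linkage.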
Two categorical variables are said to be independent if the distribution of one variable does not depend on the value of the other variable.
Example: Consider the following data, in which a sample of 1398 children was classified with respect to tonsil size and carrier status of Streptococcus pyogenes.¹
Size | yes | no | Total |
---|---|---|---|
small | 19 | 497 | 516 |
medium | 29 | 560 | 589 |
large | 24 | 269 | 293 |
Total | 72 | 1326 | 1398 |
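To connect the definition of independence to these data, the following sketch compares the proportion of carriers within each tonsil-size group with the overall proportion. The dictionary layout and names are illustrative assumptions. If the variables were independent, the within-group proportions would all be close to the overall proportion.

```python
# Observed counts from the table above: (carriers, non-carriers) by tonsil size.
counts = {"small": (19, 497), "medium": (29, 560), "large": (24, 269)}

# Conditional proportion of carriers within each tonsil-size group.
carrier_rate = {size: yes / (yes + no) for size, (yes, no) in counts.items()}

# Overall (marginal) proportion of carriers, ignoring tonsil size.
total_yes = sum(yes for yes, _ in counts.values())
total = sum(yes + no for yes, no in counts.values())
overall_rate = total_yes / total

for size, rate in carrier_rate.items():
    print(f"{size:>7}: {rate:.3f}")
print(f"overall: {overall_rate:.3f}")
```

Multiplying the overall proportion by a row total gives the count of carriers we would expect in that row under independence.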
Size | yes | no | Total |
---|---|---|---|
small | 26.58 | 489.42 | 516 |
medium | 30.33 | 558.67 | 589 |
large | 15.09 | 277.91 | 293 |
Total | 72 | 1326 | 1398 |
State hypotheses in terms of independence of the variables.
Check assumptions (all expected counts should be at least five).
Compute the \(X^2\) test statistic. Estimate the expected count for each cell using the formula \[ \frac{R \times C}{T} \] where \(R\) and \(C\) are the sums of the observed counts in the corresponding row and column, respectively, and \(T\) is the total of all the observed counts.
Compute the \(p\)-value using \((r-1)(c-1)\) as the degrees of freedom, where \(r\) and \(c\) are the number of rows and columns of observed counts in the table, respectively.
Make a decision/conclusion.
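The steps above can be sketched for the tonsil-size data as follows. The function name `independence_test` is a hypothetical helper, not something from the lecture, and the \(p\)-value line relies on the fact that the chi-square upper tail has the closed form \(e^{-x/2}\) when there are exactly 2 degrees of freedom.

```python
import math


def independence_test(table):
    """Return (expected counts, X^2, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count for each cell: R * C / T.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # Pearson statistic summed over every cell of the table.
    x2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, x2, df


# Observed counts (rows: small, medium, large; columns: yes, no).
tonsils = [[19, 497], [29, 560], [24, 269]]
expected, x2, df = independence_test(tonsils)

# Assumption check: every expected count should be at least five.
smallest_expected = min(min(row) for row in expected)

# With df = 2 the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-x2 / 2)

print("smallest expected count:", round(smallest_expected, 2))
print("X^2 =", round(x2, 2), "df =", df, "p-value =", round(p_value, 4))
```

The function itself works for any number of rows and columns; only the closed-form \(p\)-value is specific to 2 degrees of freedom.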
Applicant | yes | no | Total |
---|---|---|---|
male | 21 | 3 | 24 |
female | 14 | 10 | 24 |
Total | 35 | 13 | 48 |
We could investigate the relationship between applicant sex and promotion decision by a test of the hypotheses \(H_0\!: p_m-p_f = 0\) versus \(H_a\!: p_m-p_f \neq 0\) using the test statistic \[ z = \frac{\hat{p}_m-\hat{p}_f}{\sqrt{\hat{p}(1-\hat{p})(1/n_m + 1/n_f)}}, \] which yields a test statistic of \(z \approx 2.27\) and a \(p\)-value of about 0.02. How is this test related to the test of independence using the \(X^2\) test statistic? How is the \(z\) test statistic limited?
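As a numerical check on the first question, this sketch computes both statistics for the promotion data. Variable names are illustrative. It uses two standard facts: the two-sided normal \(p\)-value equals \(\operatorname{erfc}(|z|/\sqrt{2})\), and the chi-square upper tail with 1 degree of freedom equals \(\operatorname{erfc}(\sqrt{x/2})\).

```python
import math

# Promotion decisions from the table above: (yes, no) for each applicant sex.
male_yes, male_no = 21, 3
female_yes, female_no = 14, 10

n_m = male_yes + male_no
n_f = female_yes + female_no
p_m = male_yes / n_m
p_f = female_yes / n_f
# Pooled proportion of "yes" decisions across both groups.
p_pooled = (male_yes + female_yes) / (n_m + n_f)

# Two-sample z statistic with the pooled standard error.
z = (p_m - p_f) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_f))
# Two-sided p-value from the standard normal distribution.
p_z = math.erfc(abs(z) / math.sqrt(2))

# Pearson X^2 for the same 2x2 table, using expected counts R * C / T.
table = [[male_yes, male_no], [female_yes, female_no]]
row_totals = [sum(r) for r in table]
col_totals = [sum(c) for c in zip(*table)]
total = sum(row_totals)
x2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / total) ** 2
    / (row_totals[i] * col_totals[j] / total)
    for i in range(2)
    for j in range(2)
)
# Chi-square upper-tail probability with 1 degree of freedom.
p_x2 = math.erfc(math.sqrt(x2 / 2))

print("z =", round(z, 2), "z^2 =", round(z**2, 3), "X^2 =", round(x2, 3))
print("p (z test) =", round(p_z, 4), "p (X^2 test) =", round(p_x2, 4))
```

For a 2×2 table the two approaches agree, since \(z^2 = X^2\) and the two-sided \(p\)-values coincide. The \(z\) statistic compares exactly two proportions, so it does not extend directly to tables with more than two rows or columns.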
Strategy | progressive disease | no change | partial remission | complete remission | Total |
---|---|---|---|---|---|
sequential | 32 | 57 | 34 | 28 | 151 |
alternating | 53 | 51 | 23 | 21 | 148 |
Total | 85 | 108 | 57 | 49 | 299 |
Holmes, M. C. & Williams, R. E. O. (1954). The distribution of carriers of Streptococcus pyogenes among 2413 healthy children. Journal of Hygiene, 52, 165–179.
Rosen, B. & Jerdee, J. (1974). Influence of sex role stereotypes on personnel decisions. Journal of Applied Psychology, 59, 9–14.
Holtbrugge, W. & Schumacher, M. (1991). A comparison of regression models for the analysis of ordered categorical data. Applied Statistics, 40, 249–259.