
Rate Ratios (Quantitative Explanatory Variable)

Consider the model \[ \log E(Y) = \beta_0 + \beta_1 x, \] and let \[ \log E(Y_a) = \beta_0 + \beta_1 (x+1) \ \ \ \text{and} \ \ \ \log E(Y_b) = \beta_0 + \beta_1 x \] for an arbitrary value of \(x\). Then the difference in the log of the expected values is \[ \log E(Y_a) - \log E(Y_b) = \underbrace{\beta_0 + \beta_1 (x+1)}_{\log E(Y_a)} - \underbrace{(\beta_0 + \beta_1 x)}_{\log E(Y_b)} = \beta_1, \] meaning that \(\beta_1\) is the additive change in \(\log E(Y)\) per unit increase in \(x\).

Now consider the same model written as \[ E(Y) = e^{\beta_0}e^{\beta_1 x}, \] and let \[ E(Y_a) = e^{\beta_0}e^{\beta_1 (x+1)} \ \ \ \text{and} \ \ \ E(Y_b) = e^{\beta_0}e^{\beta_1 x} \] for an arbitrary value of \(x\). Then the ratio of the expected values is \[ \frac{E(Y_a)}{E(Y_b)} = \frac{\overbrace{e^{\beta_0}e^{\beta_1 (x+1)}}^{E(Y_a)}}{\underbrace{e^{\beta_0}e^{\beta_1 x}}_{E(Y_b)}} = \frac{e^{\beta_0}e^{\beta_1 x}e^{\beta_1}}{e^{\beta_0}e^{\beta_1 x}} = e^{\beta_1} \Rightarrow E(Y_a) = E(Y_b)e^{\beta_1}, \] so that \(E(Y)\) changes by a factor of \(e^{\beta_1}\) per unit increase in \(x\). The “exponentiated” parameter, \(e^{\beta_1}\), is sometimes called a “rate ratio” because it is often the ratio of two rates when the counts are per unit space, time, or something else.
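
The two results above can be checked numerically. The following is a minimal sketch in base R; the parameter values and the function name `mu` are hypothetical and chosen only for illustration.

```r
# Hypothetical parameter values, chosen only for illustration
b0 <- 0.5
b1 <- -0.3

# Expected response under the model log E(Y) = b0 + b1 * x
mu <- function(x) exp(b0 + b1 * x)

x <- c(-2, 0, 1.5, 10)                   # several arbitrary starting values
ratio <- mu(x + 1) / mu(x)               # E(Y_a) / E(Y_b) at each x
logdiff <- log(mu(x + 1)) - log(mu(x))   # log E(Y_a) - log E(Y_b) at each x

print(ratio)     # every element equals exp(b1)
print(exp(b1))
print(logdiff)   # every element equals b1
```

The ratio does not depend on the starting value of \(x\), which is why a single number, \(e^{\beta_1}\), summarizes the effect of a one-unit increase.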

Example: Consider again the ceriodaphniastrain data and model.

library(trtools)
ceriodaphniastrain$strainf <- factor(ceriodaphniastrain$strain, 
  labels = c("a","b"))
m <- glm(count ~ concentration + strainf, 
  family = poisson, data = ceriodaphniastrain) # log link is default
cbind(summary(m)$coefficients, confint(m))
              Estimate Std. Error z value  Pr(>|z|) 2.5 % 97.5 %
(Intercept)      4.455     0.0391  113.82  0.00e+00  4.38   4.53
concentration   -1.543     0.0466  -33.11 2.06e-240 -1.63  -1.45
strainfb        -0.275     0.0484   -5.68  1.31e-08 -0.37  -0.18
exp(cbind(coef(m), confint(m))) # coef extracts the parameter estimates only
                      2.5 % 97.5 %
(Intercept)   86.025 79.615 92.817
concentration  0.214  0.195  0.234
strainfb       0.760  0.691  0.835

Note: It only makes sense to apply the exponential function to the point estimates and the endpoints of the confidence interval. A standard error of \(e^{\hat\beta_1}\) could be obtained, but it is not equal to the exponentiated standard error of \(\hat\beta_1\). A test concerning \(e^{\beta_1}\) can be done either by using the confidence interval or by stating the hypotheses in terms of \(\beta_1\) (e.g., the null hypothesis that \(e^{\beta_1} = 1\) is the same as the null hypothesis that \(\beta_1 = 0\)).
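
To illustrate the note, the sketch below uses the estimate and standard error for concentration as printed above. It assumes the delta method as one common way to approximate a standard error for \(e^{\hat\beta_1}\); the lecture does not say which method it has in mind. It also builds an interval on the log scale from the estimate and standard error and then exponentiates the endpoints.

```r
# Estimate and standard error for concentration, copied from the output above
b  <- -1.543
se <- 0.0466

rr <- exp(b)                  # estimated rate ratio
se_rr_delta <- exp(b) * se    # delta-method approximation (an assumption here)
se_exp <- exp(se)             # exponentiated SE: NOT a standard error of exp(b)

# Interval on the log scale (estimate +/- z * SE), then exponentiate endpoints
z <- qnorm(0.975)
ci_rr <- exp(b + c(-1, 1) * z * se)

round(c(rr = rr, se_delta = se_rr_delta, exp_se = se_exp), 4)
round(ci_rr, 3)   # compare with the exponentiated limits for concentration above
```

The exponentiated standard error is larger than one, so it cannot be a sensible standard error for a rate ratio of about 0.21.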

Another approach is to use lincon and the tf (transformation function) argument.

lincon(m, tf = exp)
              estimate  lower  upper
(Intercept)     86.025 79.673 92.884
concentration    0.214  0.195  0.234
strainfb         0.760  0.691  0.835

Note that the confidence interval endpoints are not quite the same as what we obtained using confint. This is because confint and lincon use different approaches to confidence intervals (more on that later).
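
One common source of such a difference can be seen with simulated data. In base R, confint applied to a glm object gives profile likelihood intervals, whereas confint.default gives intervals of the form estimate ± z × standard error. Whether lincon uses the latter approach is an assumption here, not something stated above. The data, object names, and parameter values below are hypothetical.

```r
set.seed(1)
n <- 40
x <- runif(n, 0, 2)
y <- rpois(n, lambda = exp(1 - 0.8 * x))   # simulated counts, not real data
fit <- glm(y ~ x, family = poisson)

ci_profile <- suppressMessages(confint(fit))   # profile likelihood intervals
ci_wald <- confint.default(fit)                # estimate +/- z * SE

exp(ci_profile)   # limits for the exponentiated parameters
exp(ci_wald)      # similar but not identical limits
```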

Example: Consider a model for the expected number of matings of African elephants as a function of age.

library(Sleuth3)
head(case2201)
  Age Matings
1  27       0
2  28       1
3  28       1
4  28       1
5  28       3
6  29       0
m <- glm(Matings ~ Age, family = poisson, data = case2201)
cbind(summary(m)$coefficients, confint(m))
            Estimate Std. Error z value Pr(>|z|)   2.5 %  97.5 %
(Intercept)  -1.5820     0.5446    -2.9 3.68e-03 -2.6667 -0.5289
Age           0.0687     0.0137     5.0 5.81e-07  0.0417  0.0956
exp(cbind(m$coefficients, confint(m))) 
                   2.5 % 97.5 %
(Intercept) 0.206 0.0695  0.589
Age         1.071 1.0426  1.100

Percent Change (Quantitative Explanatory Variable)

The percent change in the expected response is \[ 100\% \times \left[\frac{E(Y_a)-E(Y_b)}{E(Y_b)}\right] = 100\% \times \left[E(Y_a)/E(Y_b) - 1\right], \]
where \(E(Y_a)\) and \(E(Y_b)\) are the expected responses at two different points (\(a\) and \(b\)) defined in terms of the explanatory variable(s).

  1. Note that if this is positive then it is a percent increase, whereas if it is negative then it is a percent decrease.

  2. The ratio \(E(Y_a)/E(Y_b)\) is the rate ratio.
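
The formula above is simple enough to wrap in a small function. This is a minimal sketch; `pct_change` is a hypothetical helper and not part of trtools or base R.

```r
# Hypothetical helper: percent change implied by a rate ratio
pct_change <- function(rate_ratio) 100 * (rate_ratio - 1)

pct_change(1.25)   # positive, so a percent increase
pct_change(0.8)    # negative, so a percent decrease

# Starting from parameters on the log scale, exponentiate first
pct_change(exp(c(0.22, -0.22)))
```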

Example: Suppose we have the model \(\log E(Y) = \beta_0 + \beta_1 x\) where \(x\) is a quantitative variable and \(\beta_1 = 0.22\). Then \(e^{\beta_1} \approx 1.25\). So when \(x\) increases by one unit (i.e., to \(x + 1\)), the expected response goes from \(E(Y_b) = e^{\beta_0}e^{\beta_1x}\) to \(E(Y_a) = e^{\beta_0}e^{\beta_1(x+1)}\). That is, the expected response increases by a factor of \[ E(Y_a)/E(Y_b) = e^{\beta_1} \approx 1.25, \] and because \[ 100\% \times \left[1.25 - 1\right] = 25\%, \]
we can say that it increases by 25%.

Example: Consider again the model for the elephant mating data.

m <- glm(Matings ~ Age, family = poisson, data = case2201)
exp(cbind(m$coefficients, confint(m))) 
                   2.5 % 97.5 %
(Intercept) 0.206 0.0695  0.589
Age         1.071 1.0426  1.100

The percent change in the expected count per unit (year) increase in Age is approximately 100%(1.07 - 1) = 7% (i.e., a 7% increase).
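
The same conversion can be applied to the confidence limits. The sketch below copies the exponentiated estimate and limits for Age from the output above; the object names are hypothetical.

```r
# Exponentiated estimate and confidence limits for Age, copied from above
rr_age <- c(estimate = 1.071, lower = 1.0426, upper = 1.100)

# Percent change is an increasing function of the rate ratio, so the
# endpoints can be converted in the same way as the point estimate
pct_age <- 100 * (rr_age - 1)
round(pct_age, 1)
```

So the estimated increase of about 7% per year comes with an interval of roughly 4% to 10%.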

Example: Suppose we have the model \(\log E(Y) = \beta_0 + \beta_1 x\) where \(x\) is a quantitative variable and \(\beta_1 = -0.22\). Then \(e^{\beta_1} \approx 0.8\). So when \(x\) increases by one unit (i.e., to \(x + 1\)), the expected response goes from \(E(Y_b) = e^{\beta_0}e^{\beta_1x}\) to \(E(Y_a) = e^{\beta_0}e^{\beta_1(x+1)}\). That is, the expected response changes by a factor of \[ E(Y_a)/E(Y_b) = e^{\beta_1} \approx 0.8, \] and because \[ 100\% \times \left[0.8 - 1\right] = -20\%, \]
we can say that it decreases by 20%.

Example: Consider again the model for the ceriodaphniastrain data.

m <- glm(count ~ concentration + strainf, family = poisson, data = ceriodaphniastrain) 
exp(cbind(coef(m), confint(m)))
                      2.5 % 97.5 %
(Intercept)   86.025 79.615 92.817
concentration  0.214  0.195  0.234
strainfb       0.760  0.691  0.835

The percent change in the expected count per unit increase in concentration is approximately 100%(0.21 - 1) = -79% (i.e., a 79% decrease or reduction).

Rate Ratios (Categorical Explanatory Variable)

Consider the model \[ \log E(Y) = \beta_0 + \beta_1 x, \ \ \text{or, equivalently,} \ \ E(Y) = e^{\beta_0}e^{\beta_1 x}, \] where \[ x = \begin{cases} 1, & \text{if the observation is in group $a$}, \\ 0, & \text{if the observation is in group $b$}. \end{cases} \] Then \[ E(Y) = \begin{cases} e^{\beta_0}e^{\beta_1}, & \text{if the observation is in group $a$}, \\ e^{\beta_0}, & \text{if the observation is in group $b$}. \end{cases} \] Let \[ E(Y_a) = e^{\beta_0}e^{\beta_1} \ \ \ \text{and} \ \ \ E(Y_b) = e^{\beta_0}. \] Then the ratio of the expected values is \[ \frac{E(Y_a)}{E(Y_b)} = \frac{e^{\beta_0}e^{\beta_1}}{e^{\beta_0}} = e^{\beta_1} \Leftrightarrow E(Y_a) = E(Y_b)e^{\beta_1}, \] so that \(E(Y_a)\) is \(e^{\beta_1}\) times \(E(Y_b)\). Also \[ \frac{E(Y_b)}{E(Y_a)} = \frac{e^{\beta_0}}{e^{\beta_0}e^{\beta_1}} = \frac{1}{e^{\beta_1}} = e^{-\beta_1}, \] so that \(E(Y_b)\) is \(1/e^{\beta_1}\) times \(E(Y_a)\).

Example: Consider again the ceriodaphniastrain data and model.

m <- glm(count ~ concentration + strainf, 
  family = poisson, data = ceriodaphniastrain) 
cbind(summary(m)$coefficients, confint(m))
              Estimate Std. Error z value  Pr(>|z|) 2.5 % 97.5 %
(Intercept)      4.455     0.0391  113.82  0.00e+00  4.38   4.53
concentration   -1.543     0.0466  -33.11 2.06e-240 -1.63  -1.45
strainfb        -0.275     0.0484   -5.68  1.31e-08 -0.37  -0.18
exp(cbind(coef(m), confint(m)))
                      2.5 % 97.5 %
(Intercept)   86.025 79.615 92.817
concentration  0.214  0.195  0.234
strainfb       0.760  0.691  0.835

Alternatively we can reparameterize the model so that strain b, rather than strain a, is the reference level.

ceriodaphniastrain$strainf <- relevel(ceriodaphniastrain$strainf, ref = "b")
m <- glm(count ~ concentration + strainf, 
  family = poisson, data = ceriodaphniastrain) 
cbind(summary(m)$coefficients, confint(m))
              Estimate Std. Error z value  Pr(>|z|) 2.5 % 97.5 %
(Intercept)      4.180     0.0430   97.14  0.00e+00  4.09   4.26
concentration   -1.543     0.0466  -33.11 2.06e-240 -1.63  -1.45
strainfa         0.275     0.0484    5.68  1.31e-08  0.18   0.37
exp(cbind(coef(m), confint(m)))
                      2.5 % 97.5 %
(Intercept)   65.344 60.008 71.034
concentration  0.214  0.195  0.234
strainfa       1.316  1.198  1.448
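
Because \(E(Y_b)/E(Y_a) = 1/e^{\beta_1}\), the rate ratio for strain a relative to strain b can also be recovered from the first parameterization without refitting the model. The sketch below uses the rounded values printed above, so agreement is only up to rounding; the object names are hypothetical.

```r
# Exponentiated estimate and limits for strainfb (reference level a), from above
rr_b_vs_a <- c(estimate = 0.760, lower = 0.691, upper = 0.835)

# Reciprocals give the ratio in the other direction. Taking reciprocals
# reverses the order of the limits, so the lower and upper limits are swapped.
rr_a_vs_b <- 1 / rr_b_vs_a[c("estimate", "upper", "lower")]
names(rr_a_vs_b) <- c("estimate", "lower", "upper")
round(rr_a_vs_b, 3)   # compare with the strainfa row above
```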

Example: Consider these data from a stratified random sampling design and a Poisson regression model.

library(trtools)
library(ggplot2) 
p <- ggplot(daphniastrat, aes(x = layer, y = count)) + 
  geom_dotplot(binaxis = "y", binwidth = 1, stackdir = "center") + 
  labs(x = "Layer", y = "Number of Daphnia") + theme_minimal()
plot(p)

daphniastrat$layer <- relevel(daphniastrat$layer, ref = "thermocline")
m <- glm(count ~ layer, family = poisson, data = daphniastrat)
summary(m)$coefficients
                 Estimate Std. Error z value  Pr(>|z|)
(Intercept)         2.425     0.0941   25.78 1.65e-146
layerepilimnion     0.546     0.1068    5.11  3.27e-07
layerhypolimnion   -1.875     0.2175   -8.62  6.74e-18
exp(cbind(coef(m), confint(m)))
                         2.5 % 97.5 %
(Intercept)      11.300 9.3425 13.513
layerepilimnion   1.726 1.4050  2.137
layerhypolimnion  0.153 0.0981  0.231
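
When the only explanatory variable is a single factor, as in the model above, the exponentiated parameter estimates have a direct reading in terms of sample means: the exponentiated intercept is the mean count in the reference group, and each remaining exponentiated estimate is the ratio of a group mean to the reference group mean. The sketch below shows this with simulated counts for three hypothetical groups (these are not the daphniastrat data).

```r
set.seed(2)
# Simulated counts for three hypothetical groups
d <- data.frame(
  group = factor(rep(c("g1", "g2", "g3"), each = 15)),
  count = rpois(45, lambda = rep(c(10, 18, 2), each = 15))
)
d$group <- relevel(d$group, ref = "g1")   # g1 is the reference level

fit <- glm(count ~ group, family = poisson, data = d)
rr <- exp(coef(fit))

means <- tapply(d$count, d$group, mean)
rr                    # exp(intercept) is the mean count in g1
means / means["g1"]   # first element is 1; the others match the rate ratios
```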

Percent Larger/Smaller (Categorical Explanatory Variable)

The percent change in the expected response is \[ 100\% \times \left[\frac{E(Y_a)-E(Y_b)}{E(Y_b)}\right] = 100\% \times \left[E(Y_a)/E(Y_b) - 1\right], \]
where \(E(Y_a)\) and \(E(Y_b)\) are the expected responses at two different points (\(a\) and \(b\)) defined in terms of the explanatory variable(s).

  1. Note that if this is positive then \(E(Y_a)\) is that percent larger than \(E(Y_b)\), whereas if this is negative then \(E(Y_a)\) is that percent smaller than \(E(Y_b)\).

  2. The ratio \(E(Y_a)/E(Y_b)\) is the rate ratio.

Example: Suppose we have the model \(\log E(Y) = \beta_0 + \beta_1 x\) where \(x\) is an indicator variable for category \(a\) and \(\beta_1 = 0.22\). Then \(e^{\beta_1} \approx 1.25\), \(E(Y_a) = e^{\beta_0}e^{\beta_1}\) and \(E(Y_b) = e^{\beta_0}\), and \(E(Y_a)\) is about 1.25 times as large as \(E(Y_b)\) because \[ E(Y_a)/E(Y_b) = e^{\beta_1} \approx 1.25, \] and because \[ 100\% \times \left[1.25 - 1\right] = 25\%, \]
we can say that \(E(Y_a)\) is about 25% larger than \(E(Y_b)\).

Example: Suppose we have the model \(\log E(Y) = \beta_0 + \beta_1 x\) where \(x\) is an indicator variable for category \(a\) and \(\beta_1 = -0.22\). Then \(e^{\beta_1} \approx 0.8\), \(E(Y_a) = e^{\beta_0}e^{\beta_1}\) and \(E(Y_b) = e^{\beta_0}\), and \(E(Y_a)\) is about 0.8 times as large as \(E(Y_b)\) because \[ E(Y_a)/E(Y_b) = e^{\beta_1} \approx 0.8, \] and because \[ 100\% \times \left[0.8 - 1\right] = -20\%, \]
we can say that \(E(Y_a)\) is about 20% smaller than \(E(Y_b)\).

Example: Consider again the model for the daphnia data.

exp(cbind(coef(m), confint(m)))
                         2.5 % 97.5 %
(Intercept)      11.300 9.3425 13.513
layerepilimnion   1.726 1.4050  2.137
layerhypolimnion  0.153 0.0981  0.231

The expected number of daphnia per liter in the epilimnion layer is estimated to be about 100%(1.73-1) = 73% more than in the thermocline layer. And because 100%(0.15-1) = -85%, we estimate that the expected number of daphnia per liter in the hypolimnion layer is about 85% less than it is in the thermocline layer.