Prof Randi Garcia
March 28, 2018
Modern zoos try to reproduce natural habitats in their exhibits as much as possible. They try to use appropriate plants, but these plants can be infested with inappropriate insects. Cycads (plants that look vaguely like palms) can be infested with mealybugs, and the zoo wishes to test three treatments: 1) water, 2) horticultural oil, and 3) fungal spores in water. Five infested cycads are taken to the testing area. Three branches are randomly selected from each tree, and a 3 cm by 3 cm patch is marked on each branch. The number of mealybugs on each patch is counted. The three treatments are then randomly assigned to the three branches within each tree. After three days the mealybugs are counted again, and the change in the number of mealybugs is computed (\( before-after \)), so positive values mean fewer mealybugs after treatment.
treatment | tree1 | tree2 | tree3 | tree4 | tree5 |
---|---|---|---|---|---|
oil | 4 | 29 | 14 | 14 | 7 |
spores | -4 | 29 | 4 | -2 | 11 |
water | -9 | 18 | 10 | 9 | -6 |
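The within-tree randomization described above could be sketched as follows. This is a minimal illustration and not the zoo's actual procedure: the seed, the `assignment` data frame, and the `branch` labels are assumptions introduced here.

```r
# Sketch: randomly assign the three treatments to the three branches of each tree.
set.seed(2018)  # arbitrary seed, used only to make the sketch reproducible

trees      <- paste0("tree", 1:5)
treatments <- c("water", "oil", "spores")

# Within each tree (block), shuffle the treatments across branches 1 to 3.
assignment <- do.call(rbind, lapply(trees, function(tr) {
  data.frame(tree = tr, branch = 1:3, treatment = sample(treatments))
}))

# Every treatment should appear exactly once in every tree.
table(assignment$tree, assignment$treatment)
```

Because the shuffle is done separately inside each tree, every tree receives all three treatments, which is what makes the tree a complete block.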
Draw the factor diagram, labeling inside and outside factors.
\[ {y}_{ij}={\mu}+{\tau}_{i}+{\beta}_{j}+{e}_{ij} \]
Source | SS | df | MS | F |
---|---|---|---|---|
Treatment | \( \sum_{i=1}^{a}b(\bar{y}_{i.}-\bar{y}_{..})^{2} \) | \( a-1 \) | \( \frac{{SS}_{T}}{{df}_{T}} \) | \( \frac{{MS}_{T}}{{MS}_{E}} \) |
Blocks | \( \sum_{j=1}^{b}a(\bar{y}_{.j}-\bar{y}_{..})^{2} \) | \( b-1 \) | \( \frac{{SS}_{B}}{{df}_{B}} \) | \( \frac{{MS}_{B}}{{MS}_{E}} \) |
Error | \( \sum_{i=1}^{a}\sum_{j=1}^{b}({y}_{ij}-\bar{y}_{i.}-\bar{y}_{.j}+\bar{y}_{..})^{2} \) | \( (a-1)(b-1) \) | \( \frac{{SS}_{E}}{{df}_{E}} \) | |
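As a check on the formulas in the table above, the sums of squares can be computed by hand from the mealybug data. This is a sketch in base R; the object names (`y`, `ss_trt`, `ss_blk`, and so on) are illustrative and not part of the original analysis.

```r
# Change in mealybug counts (before - after): rows = treatments, columns = trees.
y <- rbind(
  oil    = c( 4, 29, 14, 14,  7),
  spores = c(-4, 29,  4, -2, 11),
  water  = c(-9, 18, 10,  9, -6)
)
colnames(y) <- paste0("tree", 1:5)

a <- nrow(y)  # number of treatments
b <- ncol(y)  # number of blocks (trees)

grand    <- mean(y)
trt_mean <- rowMeans(y)
blk_mean <- colMeans(y)

# Treatment and block sums of squares, as in the table.
ss_trt <- b * sum((trt_mean - grand)^2)
ss_blk <- a * sum((blk_mean - grand)^2)

# Residuals: observation - treatment mean - block mean + grand mean.
resid  <- sweep(sweep(y, 1, trt_mean), 2, blk_mean) + grand
ss_err <- sum(resid^2)

df_trt <- a - 1
df_blk <- b - 1
df_err <- (a - 1) * (b - 1)

ms_err <- ss_err / df_err
f_trt  <- (ss_trt / df_trt) / ms_err
f_blk  <- (ss_blk / df_blk) / ms_err

round(c(SS_T = ss_trt, SS_B = ss_blk, SS_E = ss_err), 2)
round(c(F_T = f_trt, F_B = f_blk), 4)
```

These values should agree with the `anova()` output for the `lm` fit in this section (treatment SS about 218.13, tree SS 1316.40, residual SS 291.20).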
mealybugs
tree treatment bugs_change
1 tree1 water -9
2 tree1 spores -4
3 tree1 oil 4
4 tree2 water 18
5 tree2 spores 29
6 tree2 oil 29
7 tree3 water 10
8 tree3 spores 4
9 tree3 oil 14
10 tree4 water 9
11 tree4 spores -2
12 tree4 oil 14
13 tree5 water -6
14 tree5 spores 11
15 tree5 oil 7
mod <- lm(bugs_change ~ treatment + tree, data = mealybugs)
anova(mod)
Analysis of Variance Table
Response: bugs_change
Df Sum Sq Mean Sq F value Pr(>F)
treatment 2 218.13 109.07 2.9963 0.106846
tree 4 1316.40 329.10 9.0412 0.004603 **
Residuals 8 291.20 36.40
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(tidyr)
library(ggplot2)
mealybugs %>%
spread(treatment, bugs_change)
tree oil spores water
1 tree1 4 -4 -9
2 tree2 29 29 18
3 tree3 14 4 10
4 tree4 14 -2 9
5 tree5 7 11 -6
Spores versus oil
mealybugs %>%
spread(treatment, bugs_change) %>%
ggplot(aes(x = spores, y = oil)) +
geom_point() +
geom_abline(slope = 1, intercept = 8)
Spores versus water
mealybugs %>%
spread(treatment, bugs_change) %>%
ggplot(aes(x = spores, y = water)) +
geom_point() +
geom_abline(slope = 1, intercept = -5)
Oil versus water
mealybugs %>%
spread(treatment, bugs_change) %>%
ggplot(aes(x = oil, y = water)) +
geom_point() +
geom_abline(slope = 1, intercept = -13)
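The reference lines in the three plots above have slope 1 because, under the additive block model, two treatments measured on the same tree differ only by a constant. The intercepts used above (8, -5, and -13) look like values chosen by eye; that is an assumption, since the source does not say how they were picked. One data-driven alternative, sketched here, is the difference in treatment means, which is the least-squares intercept when the slope is fixed at 1. The vector name `intercepts` is illustrative.

```r
# Changes in mealybug counts for each treatment, ordered tree1 to tree5.
oil    <- c( 4, 29, 14, 14,  7)
spores <- c(-4, 29,  4, -2, 11)
water  <- c(-9, 18, 10,  9, -6)

# With slope fixed at 1, the least-squares intercept of y on x is mean(y) - mean(x).
intercepts <- c(
  oil_vs_spores   = mean(oil)   - mean(spores),  # y = oil,   x = spores
  water_vs_spores = mean(water) - mean(spores),  # y = water, x = spores
  water_vs_oil    = mean(water) - mean(oil)      # y = water, x = oil
)
intercepts
```

For these data the differences come out to 6, -3.2, and -9.2, so any of them could be passed to `geom_abline(slope = 1, intercept = ...)` in place of the hand-picked values.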
This experiment examines the blood concentration of a drug after it has been administered. The concentration starts at zero, rises, and then falls as the drug is metabolized. This curve may differ depending on the form of the drug (a solution, a tablet, or a capsule). We will use three subjects, and each subject will be given the drug three times, once in each form. The area under the time-concentration curve is recorded for each subject after each administration.
In the bioequivalence example, because the body may adapt to the drug in some way, each form of the drug will be used once in the first period, once in the second period, and once in the third period.
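Treatments: A = solution, B = tablet, C = capsule. Rows are periods, columns are subjects, and each cell gives the treatment and the area under the curve.
period | subject 1 | subject 2 | subject 3 |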
---|---|---|---|
1 | A 1799 | C 2075 | B 1396 |
2 | C 1846 | B 1156 | A 868 |
3 | B 2147 | A 1777 | C 2291 |
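The defining property of this layout, each treatment appearing once in every period and once for every subject, can be checked directly. The sketch below copies the treatment letters from the table above; the names `square`, `once_each`, and `is_latin` are illustrative.

```r
# Treatment letters from the table: rows = periods, columns = subjects.
square <- matrix(c("A", "C", "B",
                   "C", "B", "A",
                   "B", "A", "C"),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(period  = c("1", "2", "3"),
                                 subject = c("1", "2", "3")))

treatments <- sort(unique(as.vector(square)))

# TRUE when a row or column contains every treatment exactly once.
once_each <- function(v) all(sort(v) == treatments)

# A Latin square passes the check for every row and every column.
is_latin <- all(apply(square, 1, once_each)) && all(apply(square, 2, once_each))
is_latin
```

A layout that fails this check would confound treatment with period or with subject, so the three effects could no longer be separated.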
Factor diagram for the Latin square?
The actual data structure for analysis is “long.”
subject | treatment | period | group | c_curve |
---|---|---|---|---|
1 | solution | 1 | A | 1799 |
1 | capsule | 2 | C | 1846 |
1 | tablet | 3 | B | 2147 |
2 | capsule | 1 | C | 2075 |
2 | tablet | 2 | B | 1156 |
2 | solution | 3 | A | 1777 |
3 | tablet | 1 | B | 1396 |
3 | solution | 2 | A | 868 |
3 | capsule | 3 | C | 2291 |
We can make a parallel dot graph
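One way such a graph might be drawn with ggplot2 is sketched below. The data frame is rebuilt from the long table above so the example is self-contained; the choice to color points by subject, the axis labels, and the name `dot_plot` are assumptions, not the original slide's code.

```r
library(ggplot2)

# Long-format data copied from the table above.
bioequivalence <- data.frame(
  subject   = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)),
  treatment = c("solution", "capsule", "tablet",
                "capsule", "tablet", "solution",
                "tablet", "solution", "capsule"),
  period    = factor(c(1, 2, 3, 1, 2, 3, 1, 2, 3)),
  c_curve   = c(1799, 1846, 2147, 2075, 1156, 1777, 1396, 868, 2291)
)

# One column of dots per treatment, so the spreads can be compared side by side.
dot_plot <- ggplot(bioequivalence,
                   aes(x = treatment, y = c_curve, color = subject)) +
  geom_point(size = 3) +
  labs(x = "Form of drug", y = "Area under concentration curve")
dot_plot
```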
And check for equal standard deviations
library(mosaic)
sds <- favstats(c_curve ~ treatment, data = bioequivalence)$sd
max(sds)/min(sds)
[1] 2.387418
\[ {y}_{ijk}={\mu}+{\alpha}_{i}+{\beta}_{j}+{\tau}_{k}+{e}_{ijk} \]
Source | SS | df | MS | F |
---|---|---|---|---|
rows | \( \sum_{i=1}^{p}p(\bar{y}_{i..}-\bar{y}_{...})^{2} \) | \( p-1 \) | \( \frac{{SS}_{A}}{{df}_{A}} \) | \( \frac{{MS}_{A}}{{MS}_{E}} \) |
columns | \( \sum_{j=1}^{p}p(\bar{y}_{.j.}-\bar{y}_{...})^{2} \) | \( p-1 \) | \( \frac{{SS}_{B}}{{df}_{B}} \) | \( \frac{{MS}_{B}}{{MS}_{E}} \) |
treatment | \( \sum_{k=1}^{p}p(\bar{y}_{..k}-\bar{y}_{...})^{2} \) | \( p-1 \) | \( \frac{{SS}_{T}}{{df}_{T}} \) | \( \frac{{MS}_{T}}{{MS}_{E}} \) |
Error | \( \sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{k=1}^{p}({y}_{ijk}-\bar{y}_{i..}-\bar{y}_{.j.}-\bar{y}_{..k}+2\bar{y}_{...})^{2} \) | \( (p-1)(p-2) \) | \( \frac{{SS}_{E}}{{df}_{E}} \) | |
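The formulas in the table above can be checked by hand against the bioequivalence data. This is a sketch in base R; periods play the role of rows and subjects the role of columns, and the helper `ss_for` and the other object names are illustrative.

```r
# Long-format bioequivalence data (area under the concentration curve).
y         <- c(1799, 1846, 2147, 2075, 1156, 1777, 1396, 868, 2291)
subject   <- factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))
period    <- factor(c(1, 2, 3, 1, 2, 3, 1, 2, 3))
treatment <- factor(c("solution", "capsule", "tablet",
                      "capsule", "tablet", "solution",
                      "tablet", "solution", "capsule"))

p     <- 3        # size of the Latin square
grand <- mean(y)

# Each factor's SS is p times the sum of squared deviations of its level means.
ss_for <- function(f) p * sum((tapply(y, f, mean) - grand)^2)
ss_period    <- ss_for(period)
ss_subject   <- ss_for(subject)
ss_treatment <- ss_for(treatment)

# Residual: y minus its period, subject, and treatment means, plus twice the grand mean.
resid  <- y - ave(y, period) - ave(y, subject) - ave(y, treatment) + 2 * grand
ss_err <- sum(resid^2)
df_err <- (p - 1) * (p - 2)

round(c(period = ss_period, subject = ss_subject,
        treatment = ss_treatment, error = ss_err))
```

Note that `period` and `subject` are stored as factors here. If they were left as numbers, `lm()` would fit each as a single slope with 1 degree of freedom instead of 2.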
ls_mod <- lm(c_curve ~ treatment + period + subject, data = bioequivalence)
anova(ls_mod)
Analysis of Variance Table
Response: c_curve
Df Sum Sq Mean Sq F value Pr(>F)
treatment 2 608891 304445 67.733 0.014549 *
period 2 928006 464003 103.231 0.009594 **
subject 2 261115 130557 29.047 0.033282 *
Residuals 2 8990 4495
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
bioequivalence <- bioequivalence %>%
mutate(fitted = fitted(ls_mod),
residuals = residuals(ls_mod))
ggplot(bioequivalence, aes(x = fitted, y = residuals)) +
geom_point() +
geom_hline(yintercept = 0, color = "red")
If you suspect a design is a split-plot design, you should be able to answer the following questions:
The Canada goose is a magnificent bird, but in large numbers it can be a nuisance in urban areas. One method of population control is to addle eggs in nests, but this method can harm adult females. Would removing the eggs at the usual hatch date prevent harm? It is suspected that females nesting at the same site are similar to each other. We randomly select 5 different sites, and within each site we randomly assign 5 nests to the addle with no removal condition and 5 nests to the addle plus removal condition. The females at the nests are banded so that survival age can be measured later.
Diabetes affects the rate of turnover of lactic acid in a system of biochemical reactions called the Cori cycle. This experiment compares two methods of using radioactive carbon-14 to measure the rate of turnover. Method 1 is injection all at once, and method 2 is continuous infusion. Ten dogs were sorted into two groups: 5 were controls and 5 had their pancreas removed (to make them diabetic). The rate of turnover was then measured twice for each dog, once with each method. The order of the two methods was randomly assigned.
Draw the factor diagram for the data on page 263.