Split Plot/Repeated Measures Design

Prof Randi Garcia
March 28, 2018

Reading Free-Write (5 minutes)

  1. When we make a scatterplot of observations between two treatments within-blocks, what are we hoping to see? Can you explain why (OK if you cannot)?
  2. When you plot the fitted values versus residual values after running a model, what are the issues you should look out for?
  3. What sorts of things are confusing/fuzzy from Ch 7?

Announcements

  • Jenny Smetzer, Lab Instructor candidate, talk/demo today 12:00-1:00p in Bass 002
    • Pizza will be served!
  • HW 7 is posted
    • You can complete your homework in R if you'd like
  • Project Method draft due Monday at midnight on Moodle

Agenda

  • Fred Conrad, Research Professor and Director, Michigan Program in Survey Methodology
  • ANOVA for CB[1]
  • Latin square designs
  • Split plot designs

Inappropriate Insects

Modern zoos try to reproduce natural habitats in their exhibits as much as possible. They try to use appropriate plants, but these plants can be infested with inappropriate insects. Cycads (plants that look vaguely like palms) can be infested with mealybugs, and the zoo wishes to test three treatments: 1) water, 2) horticultural oil, and 3) fungal spores in water. Five infested cycads are taken to the testing area. Three branches are randomly selected from each tree, and a 3 cm by 3 cm patch is marked on each branch. The number of mealybugs on each patch is counted. The three treatments are then randomly assigned to the three branches of each tree. After three days the mealybugs are counted again, and the change in the number of mealybugs is computed (\( \text{before} - \text{after} \)).
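
The within-tree randomization described above could be carried out along the following lines. This is only a sketch: the seed, the branch numbering, and the object name `assignment` are assumptions made for illustration and do not come from the original study.

```r
# Complete block randomization: each tree (block) receives all three
# treatments, and the order is shuffled separately within each tree.
set.seed(2018)  # hypothetical seed, used only to make the sketch reproducible

trees      <- paste0("tree", 1:5)
treatments <- c("water", "oil", "spores")

assignment <- do.call(rbind, lapply(trees, function(tr) {
  data.frame(tree      = tr,
             branch    = 1:3,                  # hypothetical branch labels
             treatment = sample(treatments))   # random order within this tree
}))

assignment

# Every tree should contain each treatment exactly once
table(assignment$tree, assignment$treatment)
```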

Inappropriate Insects

treatment tree1 tree2 tree3 tree4 tree5
oil 4 29 14 14 7
spores -4 29 4 -2 11
water -9 18 10 9 -6

Draw the factor diagram, labeling the inside and outside factors.

Formal ANOVA for CB[1]

\[ {y}_{ij}={\mu}+{\tau}_{i}+{\beta}_{j}+{e}_{ij} \]

Source SS df MS F
Treatment \( \sum_{i=1}^{a}b(\bar{y}_{i.}-\bar{y}_{..})^{2} \) \( a-1 \) \( \frac{{SS}_{T}}{{df}_{T}} \) \( \frac{{MS}_{T}}{{MS}_{E}} \)
Blocks \( \sum_{j=1}^{b}a(\bar{y}_{.j}-\bar{y}_{..})^{2} \) \( b-1 \) \( \frac{{SS}_{B}}{{df}_{B}} \) \( \frac{{MS}_{B}}{{MS}_{E}} \)
Error \( \sum_{i=1}^{a}\sum_{j=1}^{b}({y}_{ij}-\bar{y}_{i.}-\bar{y}_{.j}+\bar{y}_{..})^{2} \) \( (a-1)(b-1) \) \( \frac{{SS}_{E}}{{df}_{E}} \)

Data Analysis Structure

mealybugs
    tree treatment bugs_change
1  tree1     water          -9
2  tree1    spores          -4
3  tree1       oil           4
4  tree2     water          18
5  tree2    spores          29
6  tree2       oil          29
7  tree3     water          10
8  tree3    spores           4
9  tree3       oil          14
10 tree4     water           9
11 tree4    spores          -2
12 tree4       oil          14
13 tree5     water          -6
14 tree5    spores          11
15 tree5       oil           7

Formal ANOVA

mod <- lm(bugs_change ~ treatment + tree, data = mealybugs)

anova(mod)
Analysis of Variance Table

Response: bugs_change
          Df  Sum Sq Mean Sq F value   Pr(>F)   
treatment  2  218.13  109.07  2.9963 0.106846   
tree       4 1316.40  329.10  9.0412 0.004603 **
Residuals  8  291.20   36.40                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
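
As a check on the formulas in the CB[1] table, the sums of squares can be computed by hand from the treatment and tree means. The sketch below rebuilds the `mealybugs` data frame from the listing shown earlier; the helper names (`trt_fit`, `blk_fit`, and so on) are illustrative choices, not part of the original analysis.

```r
# Rebuild the mealybugs data in long format
mealybugs <- data.frame(
  tree        = rep(paste0("tree", 1:5), each = 3),
  treatment   = rep(c("water", "spores", "oil"), times = 5),
  bugs_change = c(-9, -4, 4, 18, 29, 29, 10, 4, 14, 9, -2, 14, -6, 11, 7)
)

y     <- mealybugs$bugs_change
grand <- mean(y)

# ave() returns, for every row, the mean of that row's group
trt_fit <- ave(y, mealybugs$treatment)  # treatment mean for each observation
blk_fit <- ave(y, mealybugs$tree)       # tree (block) mean for each observation

# Summing over all observations is the same as b * sum over treatments
# (or a * sum over blocks) in the table
ss_treatment <- sum((trt_fit - grand)^2)
ss_blocks    <- sum((blk_fit - grand)^2)
ss_error     <- sum((y - trt_fit - blk_fit + grand)^2)

a <- 3  # treatments
b <- 5  # blocks (trees)
ms_treatment <- ss_treatment / (a - 1)
ms_error     <- ss_error / ((a - 1) * (b - 1))
f_treatment  <- ms_treatment / ms_error

c(SS_T = ss_treatment, SS_B = ss_blocks, SS_E = ss_error, F_T = f_treatment)
```

These values should agree with the Sum Sq and F value columns of the `anova(mod)` output.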

Informal Analysis Structure

library(tidyr)
library(ggplot2)

mealybugs %>%
  spread(treatment, bugs_change)
   tree oil spores water
1 tree1   4     -4    -9
2 tree2  29     29    18
3 tree3  14      4    10
4 tree4  14     -2     9
5 tree5   7     11    -6

Scatterplots

Spores versus oil

mealybugs %>%
  spread(treatment, bugs_change) %>%
  ggplot(aes(x = spores, y = oil)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 8)

[Figure: scatterplot of oil (y) against spores (x), one point per tree, with a slope-1 reference line]

Scatterplots

Spores versus water

mealybugs %>%
  spread(treatment, bugs_change) %>%
  ggplot(aes(x = spores, y = water)) +
  geom_point() +
  geom_abline(slope = 1, intercept = -5)

[Figure: scatterplot of water (y) against spores (x), one point per tree, with a slope-1 reference line]

Scatterplots

Oil versus water

mealybugs %>%
  spread(treatment, bugs_change) %>%
  ggplot(aes(x = oil, y = water)) +
  geom_point() +
  geom_abline(slope = 1, intercept = -13)

[Figure: scatterplot of water (y) against oil (x), one point per tree, with a slope-1 reference line]
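
If treatment and block effects are additive, the points in each scatterplot should fall near a line with slope 1, and the vertical offset of that line estimates the difference between the two treatments. One natural estimate of that offset is the mean within-tree difference, sketched below. The intercepts used for the reference lines in the plots may have been chosen differently (for example, by eye), so they need not match these values exactly. The object names are illustrative.

```r
# Wide layout: one row per tree, one column per treatment
wide <- data.frame(
  tree   = paste0("tree", 1:5),
  oil    = c(4, 29, 14, 14, 7),
  spores = c(-4, 29, 4, -2, 11),
  water  = c(-9, 18, 10, 9, -6)
)

# Mean of (y - x) within trees, for each of the three plots
diffs <- c(
  oil_minus_spores   = mean(wide$oil   - wide$spores),
  water_minus_spores = mean(wide$water - wide$spores),
  water_minus_oil    = mean(wide$water - wide$oil)
)

diffs
```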

Bioequivalence of drug delivery

This experiment concerns the blood concentration of a drug after it has been administered. The concentration starts at zero, rises, and then falls as the drug is metabolized. This curve may differ depending on the form of the drug (a solution, a tablet, or a capsule). We will use three subjects, and each subject will be given the drug three times, once in each form. The area under the time-concentration curve is recorded for each subject after each method of drug delivery.

Latin Square Design

In the bioequivalence example, because the body may adapt to the drug in some way, each drug will be used once in the first period, once in the second period, and once in the third period.

  • We can use a Latin Square design to control the order of drug administration
  • In this way, time is a second blocking factor (subject is the first)
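
One common way to produce a randomized Latin square is to start from a cyclic square and then shuffle its rows, columns, and treatment labels. The sketch below does this for three treatments; the seed and object names are assumptions for illustration, and the square it produces is not the one used in the bioequivalence study.

```r
set.seed(328)  # hypothetical seed, used only to make the sketch reproducible

p          <- 3
treatments <- c("A", "B", "C")

# Cyclic base square: cell (i, j) holds treatment index ((i + j - 2) mod p) + 1
base <- outer(1:p, 1:p, function(i, j) (i + j - 2) %% p + 1)

# Shuffle rows and columns, then shuffle which letter each index stands for
shuffled <- base[sample(p), sample(p)]
labels   <- sample(treatments)

square <- matrix(labels[shuffled], nrow = p,
                 dimnames = list(period  = paste0("period", 1:p),
                                 subject = paste0("subject", 1:p)))
square

# Latin square property: every treatment appears once in each row and column
is_latin <- all(apply(square, 1, function(r) setequal(r, treatments))) &&
            all(apply(square, 2, function(cc) setequal(cc, treatments)))
is_latin
```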

Latin Square Design

Treatments:

  • Solution is treatment A
  • Tablet is treatment B
  • Capsule is treatment C

Rows are periods and columns are subjects:

period  subject 1  subject 2  subject 3
1       A 1799     C 2075     B 1396
2       C 1846     B 1156     A 868
3       B 2147     A 1777     C 2291

What is the factor diagram for the Latin square?

Latin Square Design

The actual data structure for analysis is “long.”

subject treatment period group c_curve
1 solution 1 A 1799
1 capsule 2 C 1846
1 tablet 3 B 2147
2 capsule 1 C 2075
2 tablet 2 B 1156
2 solution 3 A 1777
3 tablet 1 B 1396
3 solution 2 A 868
3 capsule 3 C 2291
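
The long-format table can be entered in R and checked for the Latin square property, namely that each treatment occurs exactly once per subject and once per period. This is a sketch: coding `subject` and `period` as factors is an assumption here, chosen so that they are treated as blocking factors rather than numeric predictors.

```r
bioequivalence <- data.frame(
  subject   = factor(rep(1:3, each = 3)),
  treatment = c("solution", "capsule", "tablet",
                "capsule", "tablet", "solution",
                "tablet", "solution", "capsule"),
  period    = factor(rep(1:3, times = 3)),
  group     = c("A", "C", "B", "C", "B", "A", "B", "A", "C"),
  c_curve   = c(1799, 1846, 2147, 2075, 1156, 1777, 1396, 868, 2291)
)

# Each cell of these tables should be 1 in a Latin square
by_period  <- with(bioequivalence, table(treatment, period))
by_subject <- with(bioequivalence, table(treatment, subject))
by_period
by_subject

# Recover the period-by-subject layout of the responses
layout <- xtabs(c_curve ~ period + subject, data = bioequivalence)
layout
```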

Informal ANOVA for Latin Square

We can make a parallel dot graph

[Figure: parallel dot graph for the bioequivalence data]

And check for equal standard deviations

library(mosaic)

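# The "sd" column of favstats() holds each treatment's standard deviation;
# the name sds avoids masking the base function sd()
sds <- favstats(c_curve ~ treatment, data = bioequivalence)$sd

max(sds)/min(sds)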
[1] 2.387418

Formal ANOVA for the Latin Square

\[ {y}_{ijk}={\mu}+{\alpha}_{i}+{\beta}_{j}+{\tau}_{k}+{e}_{ijk} \]

  • \( {\mu} \) is the benchmark
  • \( {\alpha}_{i} \) is the row effect
  • \( {\beta}_{j} \) is the column effect
  • \( {\tau}_{k} \) is the treatment effect
  • There are p rows, columns, and treatments
Source SS df MS F
rows \( \sum_{i=1}^{p}p(\bar{y}_{i..}-\bar{y}_{...})^{2} \) \( p-1 \) \( \frac{{SS}_{A}}{{df}_{A}} \) \( \frac{{MS}_{A}}{{MS}_{E}} \)
columns \( \sum_{j=1}^{p}p(\bar{y}_{.j.}-\bar{y}_{...})^{2} \) \( p-1 \) \( \frac{{SS}_{B}}{{df}_{B}} \) \( \frac{{MS}_{B}}{{MS}_{E}} \)
treatment \( \sum_{k=1}^{p}p(\bar{y}_{..k}-\bar{y}_{...})^{2} \) \( p-1 \) \( \frac{{SS}_{T}}{{df}_{T}} \) \( \frac{{MS}_{T}}{{MS}_{E}} \)
Error \( \sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{k=1}^{p}({y}_{ijk}-\bar{y}_{i..}-\bar{y}_{.j.}-\bar{y}_{..k}+2\bar{y}_{...})^{2} \) \( (p-1)(p-2) \) \( \frac{{SS}_{E}}{{df}_{E}} \)

Formal ANOVA for the Latin Square

ls_mod <- lm(c_curve ~ treatment + period + subject, data = bioequivalence)

anova(ls_mod)
Analysis of Variance Table

Response: c_curve
          Df Sum Sq Mean Sq F value   Pr(>F)   
treatment  2 608891  304445  67.733 0.014549 * 
period     2 928006  464003 103.231 0.009594 **
subject    2 261115  130557  29.047 0.033282 * 
Residuals  2   8990    4495                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
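
The Latin square sums of squares can also be computed directly from the row, column, and treatment means, following the formulas in the table. The sketch below re-enters the data so that it runs on its own; the helper names are illustrative, and periods are taken as rows and subjects as columns.

```r
y         <- c(1799, 1846, 2147, 2075, 1156, 1777, 1396, 868, 2291)
subject   <- rep(1:3, each = 3)
period    <- rep(1:3, times = 3)
treatment <- c("solution", "capsule", "tablet",
               "capsule", "tablet", "solution",
               "tablet", "solution", "capsule")

p     <- 3
grand <- mean(y)

# ave() returns the group mean for each observation
row_fit <- ave(y, period)     # period (row) means
col_fit <- ave(y, subject)    # subject (column) means
trt_fit <- ave(y, treatment)  # treatment means

ss_period    <- sum((row_fit - grand)^2)
ss_subject   <- sum((col_fit - grand)^2)
ss_treatment <- sum((trt_fit - grand)^2)
ss_error     <- sum((y - row_fit - col_fit - trt_fit + 2 * grand)^2)

df_error    <- (p - 1) * (p - 2)
f_treatment <- (ss_treatment / (p - 1)) / (ss_error / df_error)

c(period = ss_period, subject = ss_subject,
  treatment = ss_treatment, error = ss_error, F_treatment = f_treatment)
```

Up to rounding, these should match the Sum Sq column and the treatment F value in the `anova(ls_mod)` output.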

Residual Plot

bioequivalence <- bioequivalence %>%
  mutate(fitted = fitted(ls_mod), 
         residuals = residuals(ls_mod))

ggplot(bioequivalence, aes(x = fitted, y = residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, color = "red")

Residual Plot

[Figure: residuals versus fitted values for ls_mod, with a horizontal reference line at 0]

Split Plot Design

If you suspect a design is a split-plot design, you should be able to answer the following questions:

  1. What are the whole plots, that is, what is the nuisance factor?
  2. What is the between-blocks factor? Is it observational or experimental?
  3. What is the within-blocks factor? Is it observational or experimental?

Example: Addled Goose Eggs

The Canada goose is a magnificent bird, but it can be a nuisance in urban areas in large numbers. One method of population control is to addle eggs in nests, but this method can harm adult females. Would removing the eggs at the usual hatch date prevent harm? It is suspected that females nesting together at the same site are similar to each other. We randomly select 5 different sites, then randomly assign 5 nests per site to the addle-with-no-removal condition and 5 nests per site to the addle-plus-removal condition. The females at the nests are banded so that survival age can be measured later.

Crossing versus Nesting

  1. Crossing: Two sets of treatments are crossed if all possible combinations of treatments occur in the design. The design is called a two-way factorial and has factorial treatment structure.
  2. Nesting: One factor is nested within another if each level of the first (“inside”) factor occurs with exactly one level of the second (“outside”) factor.
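
These two definitions can be checked mechanically from a cross-tabulation of the factors. The sketch below uses a small hypothetical layout (3 sites, 2 conditions, 2 nests per site and condition), loosely modeled on a sites-and-nests study; the functions `is_crossed()` and `is_nested()` are illustrative helpers, not functions from any package.

```r
# Hypothetical design, for illustration only
design <- expand.grid(nest_in_cell = 1:2,
                      condition    = c("addle", "addle+removal"),
                      site         = paste0("site", 1:3),
                      stringsAsFactors = FALSE)
# Give every nest its own unique label
design$nest <- paste(design$site, design$condition, design$nest_in_cell, sep = "_")

# Crossed: every combination of levels occurs at least once
is_crossed <- function(f1, f2) all(table(f1, f2) > 0)

# Nested: each level of the inside factor occurs with exactly one
# level of the outside factor
is_nested <- function(inside, outside) {
  all(rowSums(table(inside, outside) > 0) == 1)
}

is_crossed(design$site, design$condition)  # site and condition
is_nested(design$nest, design$site)        # nest within site
is_nested(design$site, design$condition)   # site within condition
```

In this layout site and condition are crossed and nest is nested within site, while site is not nested within condition because every site receives both conditions.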

Example: Diabetic Dogs

Diabetes affects the rate of turnover of lactic acid in a system of biochemical reactions called the Cori cycle. This experiment compares two methods of using radioactive carbon-14 to measure the rate of turnover. Method 1 is injection all at once, and method 2 is continuous infusion. Ten dogs were sorted into two groups: 5 were controls and 5 had their pancreas removed (to make them diabetic). The rate of turnover was then measured twice for each dog, once with each method. The order of the two methods was randomly assigned.

Draw the factor diagram for the data on page 263.
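
One way a split plot/repeated measures design with this structure might be analyzed in R is with `aov()` and an `Error()` term for the whole plots (dogs). The sketch below is only an illustration of the model structure: the `turnover` values are SIMULATED, not the textbook data, and the seed, effect sizes, and object names are assumptions.

```r
set.seed(263)  # hypothetical seed, used only to make the sketch reproducible

# 10 dogs, each measured once with each method
dogs <- expand.grid(method = c("injection", "infusion"),
                    dog    = factor(1:10))
# Between-blocks factor: dogs 1-5 are controls, dogs 6-10 are diabetic
dogs$group <- factor(ifelse(as.integer(dogs$dog) <= 5, "control", "diabetic"))

# SIMULATED response: a dog-level effect plus measurement noise
dog_effect    <- rnorm(10, sd = 10)
dogs$turnover <- 50 + dog_effect[as.integer(dogs$dog)] + rnorm(20, sd = 5)

# group is tested against dog-to-dog variation (the "dog" stratum);
# method and group:method are tested within dogs (the "Within" stratum)
sp_mod <- aov(turnover ~ group * method + Error(dog), data = dogs)
summary(sp_mod)
```

With 10 dogs in 2 groups, the dog stratum leaves 8 residual degrees of freedom for testing group, and the within-dog stratum leaves 8 for testing method and the group-by-method interaction.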