Packages and Functions

A function with \(R^2\) added to anova table output

1 Generate Designs with Computer

1.1 Randomized Complete Block Design

## $book
##    plots block treatments
## 1    101     1          D
## 2    102     1          A
## 3    103     1          B
## 4    104     1          C
## 5    201     2          C
## 6    202     2          D
## 7    203     2          A
## 8    204     2          B
## 9    301     3          D
## 10   302     3          A
## 11   303     3          C
## 12   304     3          B
## 13   401     4          D
## 14   402     4          B
## 15   403     4          A
## 16   404     4          C
## 17   501     5          D
## 18   502     5          A
## 19   503     5          B
## 20   504     5          C
## 
## $sketch
##      [,1] [,2] [,3] [,4]
## [1,] "D"  "A"  "B"  "C" 
## [2,] "C"  "D"  "A"  "B" 
## [3,] "D"  "A"  "C"  "B" 
## [4,] "D"  "B"  "A"  "C" 
## [5,] "D"  "A"  "B"  "C"
## $book
##    plots block treatments2
## 1    101     1           C
## 2    102     1           A
## 3    103     1           D
## 4    104     1           B
## 5    105     1           B
## 6    106     1           C
## 7    107     1           D
## 8    108     1           A
## 9    201     2           D
## 10   202     2           D
## 11   203     2           C
## 12   204     2           A
## 13   205     2           B
## 14   206     2           C
## 15   207     2           A
## 16   208     2           B
## 17   301     3           B
## 18   302     3           A
## 19   303     3           D
## 20   304     3           A
## 21   305     3           C
## 22   306     3           C
## 23   307     3           D
## 24   308     3           B
## 
## $sketch
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] "C"  "A"  "D"  "B"  "B"  "C"  "D"  "A" 
## [2,] "D"  "D"  "C"  "A"  "B"  "C"  "A"  "B" 
## [3,] "B"  "A"  "D"  "A"  "C"  "C"  "D"  "B"

1.2 Latin-Square Design

## $book
##    plots row col treatments
## 1    101   1   1          A
## 2    102   1   2          C
## 3    103   1   3          B
## 4    104   1   4          D
## 5    201   2   1          B
## 6    202   2   2          D
## 7    203   2   3          C
## 8    204   2   4          A
## 9    301   3   1          D
## 10   302   3   2          B
## 11   303   3   3          A
## 12   304   3   4          C
## 13   401   4   1          C
## 14   402   4   2          A
## 15   403   4   3          D
## 16   404   4   4          B
## 
## $sketch
##      [,1] [,2] [,3] [,4]
## [1,] "A"  "C"  "B"  "D" 
## [2,] "B"  "D"  "C"  "A" 
## [3,] "D"  "B"  "A"  "C" 
## [4,] "C"  "A"  "D"  "B"
plots row col treatments
4 104 1 4 A
6 202 2 2 A
9 301 3 1 A
15 403 4 3 A
1 101 1 1 B
7 203 2 3 B
10 302 3 2 B
16 404 4 4 B
2 102 1 2 C
8 204 2 4 C
11 303 3 3 C
13 401 4 1 C
3 103 1 3 D
5 201 2 1 D
12 304 3 4 D
14 402 4 2 D
##    col
## row 1 2 3 4
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1
##    treatments
## row A B C D
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1
##    treatments
## col A B C D
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1

1.3 Graeco-Latin Design

## $book
##    plots row col treat1 treat2
## 1    101   1   1      D      d
## 2    102   1   2      C      b
## 3    103   1   3      A      c
## 4    104   1   4      B      a
## 5    201   2   1      C      c
## 6    202   2   2      D      a
## 7    203   2   3      B      d
## 8    204   2   4      A      b
## 9    301   3   1      A      a
## 10   302   3   2      B      c
## 11   303   3   3      D      b
## 12   304   3   4      C      d
## 13   401   4   1      B      b
## 14   402   4   2      A      d
## 15   403   4   3      C      a
## 16   404   4   4      D      c
## 
## $sketch
##      [,1]  [,2]  [,3]  [,4] 
## [1,] "D d" "C b" "A c" "B a"
## [2,] "C c" "D a" "B d" "A b"
## [3,] "A a" "B c" "D b" "C d"
## [4,] "B b" "A d" "C a" "D c"
##    col
## row 1 2 3 4
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1
##    treat1
## row A B C D
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1
##    treat2
## row a b c d
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1
##    treat1
## col A B C D
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1
##    treat2
## col a b c d
##   1 1 1 1 1
##   2 1 1 1 1
##   3 1 1 1 1
##   4 1 1 1 1

1.4 Balanced Incomplete Block Design

## 
## Parameters BIB
## ==============
## Lambda     : 2
## treatmeans : 4
## Block size : 3
## Blocks     : 4
## Replication: 3 
## 
## Efficiency factor 0.8888889 
## 
## <<< Book >>>
## $book
##    plots block treatments
## 1    101     1          C
## 2    102     1          A
## 3    103     1          B
## 4    201     2          D
## 5    202     2          B
## 6    203     2          C
## 7    301     3          C
## 8    302     3          D
## 9    303     3          A
## 10   401     4          B
## 11   402     4          A
## 12   403     4          D
## 
## $sketch
##      [,1] [,2] [,3]
## [1,] "C"  "A"  "B" 
## [2,] "D"  "B"  "C" 
## [3,] "C"  "D"  "A" 
## [4,] "B"  "A"  "D"
## block
## 1 2 3 4 
## 3 3 3 3
## treatments
## A B C D 
## 3 3 3 3
##      treatments
## block A B C D
##     1 1 1 1 0
##     2 0 1 1 1
##     3 1 0 1 1
##     4 1 1 0 1
## 
## Parameters BIB
## ==============
## Lambda     : 1
## treatmeans : 4
## Block size : 2
## Blocks     : 6
## Replication: 3 
## 
## Efficiency factor 0.6666667 
## 
## <<< Book >>>
## $book
##    plots block treatments
## 1    101     1          C
## 2    102     1          A
## 3    201     2          B
## 4    202     2          C
## 5    301     3          D
## 6    302     3          C
## 7    401     4          A
## 8    402     4          D
## 9    501     5          A
## 10   502     5          B
## 11   601     6          D
## 12   602     6          B
## 
## $sketch
##      [,1] [,2]
## [1,] "C"  "A" 
## [2,] "B"  "C" 
## [3,] "D"  "C" 
## [4,] "A"  "D" 
## [5,] "A"  "B" 
## [6,] "D"  "B"
## block
## 1 2 3 4 5 6 
## 2 2 2 2 2 2
## treatments
## A B C D 
## 3 3 3 3
##      treatments
## block A B C D
##     1 1 0 1 0
##     2 0 1 1 0
##     3 0 0 1 1
##     4 1 0 0 1
##     5 1 1 0 0
##     6 0 1 0 1

2 Analyzing Randomized Complete Block Design

This is the example 4.1 from the textbook. A medical device manufacturer produces vascular grafts (artificial veins). These grafts are produced by extruding billets of polytetrafluoroethylene (PTFE) resin combined with a lubricant into tubes. Frequently, some of the tubes in a production run contain small, hard protrusions on the external surface. These defects are known as “flicks.” The defect is cause for rejection of the unit. The product developer responsible for the vascular grafts suspects that the extrusion pressure affects the occur- rence of yield and therefore intends to conduct an experi- ment to investigate this hypothesis. However, the resin is manufactured by an external supplier and is delivered to the medical device manufacturer in batches. The engineer also suspects that there may be significant batch-to-batch variation. Therefore, the product developer decided to investigate the effect of four different levels of extrusion pressure on yield using a randomized com- plete block design considering batches of resin as blocks. Note that there are four levels of extrusion pressure (treatments) and six batches of resin (blocks). Remember that the order in which the extrusion pressures are tested within each block is random. The response variable is yield, or the percentage of tubes in the production run that did not contain any flicks.

2.1 Read Data

## The following object is masked from package:datasets:
## 
##     pressure
yield batch pressure
90.3 1 1
89.2 2 1
98.2 3 1
93.9 4 1
87.4 5 1
97.9 6 1
92.5 1 2
89.5 2 2
90.6 3 2
94.7 4 2
87.0 5 2
95.8 6 2
85.5 1 3
90.8 2 3
89.6 3 3
86.2 4 3
88.0 5 3
93.4 6 3
82.5 1 4
89.5 2 4
85.6 3 4
87.4 4 4
78.9 5 4
90.7 6 4
##      1    2    3    4
## 1 90.3 92.5 85.5 82.5
## 2 89.2 89.5 90.8 89.5
## 3 98.2 90.6 89.6 85.6
## 4 93.9 94.7 86.2 87.4
## 5 87.4 87.0 88.0 78.9
## 6 97.9 95.8 93.4 90.7
##         Pressure1 Pressure2 Pressure3 Pressure4  Average
## Batch1   90.30000  92.50000  85.50000  82.50000 87.70000
## Batch2   89.20000  89.50000  90.80000  89.50000 89.75000
## Batch3   98.20000  90.60000  89.60000  85.60000 91.00000
## Batch4   93.90000  94.70000  86.20000  87.40000 90.55000
## Batch5   87.40000  87.00000  88.00000  78.90000 85.32500
## Batch6   97.90000  95.80000  93.40000  90.70000 94.45000
## Average  92.81667  91.68333  88.91667  85.76667 89.79583

2.3 Linear Models

2.3.1 Effect Model

batch pressure X.Intercept. batch1 batch2 batch3 batch4 batch5 pressure1 pressure2 pressure3
1 1 1 1 0 0 0 0 1 0 0
2 1 1 0 1 0 0 0 1 0 0
3 1 1 0 0 1 0 0 1 0 0
4 1 1 0 0 0 1 0 1 0 0
5 1 1 0 0 0 0 1 1 0 0
6 1 1 -1 -1 -1 -1 -1 1 0 0
1 2 1 1 0 0 0 0 0 1 0
2 2 1 0 1 0 0 0 0 1 0
3 2 1 0 0 1 0 0 0 1 0
4 2 1 0 0 0 1 0 0 1 0
5 2 1 0 0 0 0 1 0 1 0
6 2 1 -1 -1 -1 -1 -1 0 1 0
1 3 1 1 0 0 0 0 0 0 1
2 3 1 0 1 0 0 0 0 0 1
3 3 1 0 0 1 0 0 0 0 1
4 3 1 0 0 0 1 0 0 0 1
5 3 1 0 0 0 0 1 0 0 1
6 3 1 -1 -1 -1 -1 -1 0 0 1
1 4 1 1 0 0 0 0 -1 -1 -1
2 4 1 0 1 0 0 0 -1 -1 -1
3 4 1 0 0 1 0 0 -1 -1 -1
4 4 1 0 0 0 1 0 -1 -1 -1
5 4 1 0 0 0 0 1 -1 -1 -1
6 4 1 -1 -1 -1 -1 -1 -1 -1 -1
## 
## Call:
## lm(formula = yield ~ batch + pressure, data = vascular, contrasts = list(batch = contr.sum, 
##     pressure = contr.sum))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5708 -1.3333 -0.3167  1.1417  4.1792 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 89.79583    0.55248 162.531  < 2e-16 ***
## batch1      -2.09583    1.23539  -1.696  0.11044    
## batch2      -0.04583    1.23539  -0.037  0.97089    
## batch3       1.20417    1.23539   0.975  0.34516    
## batch4       0.75417    1.23539   0.610  0.55069    
## batch5      -4.47083    1.23539  -3.619  0.00253 ** 
## pressure1    3.02083    0.95693   3.157  0.00652 ** 
## pressure2    1.88750    0.95693   1.972  0.06728 .  
## pressure3   -0.87917    0.95693  -0.919  0.37277    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.707 on 15 degrees of freedom
## Multiple R-squared:  0.7712, Adjusted R-squared:  0.6492 
## F-statistic: 6.321 on 8 and 15 DF,  p-value: 0.00113
##         Pressure1 Pressure2 Pressure3 Pressure4  Average
## Batch1   90.30000  92.50000  85.50000  82.50000 87.70000
## Batch2   89.20000  89.50000  90.80000  89.50000 89.75000
## Batch3   98.20000  90.60000  89.60000  85.60000 91.00000
## Batch4   93.90000  94.70000  86.20000  87.40000 90.55000
## Batch5   87.40000  87.00000  88.00000  78.90000 85.32500
## Batch6   97.90000  95.80000  93.40000  90.70000 94.45000
## Average  92.81667  91.68333  88.91667  85.76667 89.79583
## [1] 89.79583
##  Pressure1  Pressure2  Pressure3  Pressure4 
##  3.0208333  1.8875000 -0.8791667 -4.0291667
##      Batch1      Batch2      Batch3      Batch4      Batch5      Batch6 
## -2.09583333 -0.04583333  1.20416667  0.75416667 -4.47083333  4.65416667

2.3.2 Treatment Difference Model

batch pressure X.Intercept. batch2 batch3 batch4 batch5 batch6 pressure2 pressure3 pressure4
1 1 1 0 0 0 0 0 0 0 0
2 1 1 1 0 0 0 0 0 0 0
3 1 1 0 1 0 0 0 0 0 0
4 1 1 0 0 1 0 0 0 0 0
5 1 1 0 0 0 1 0 0 0 0
6 1 1 0 0 0 0 1 0 0 0
1 2 1 0 0 0 0 0 1 0 0
2 2 1 1 0 0 0 0 1 0 0
3 2 1 0 1 0 0 0 1 0 0
4 2 1 0 0 1 0 0 1 0 0
5 2 1 0 0 0 1 0 1 0 0
6 2 1 0 0 0 0 1 1 0 0
1 3 1 0 0 0 0 0 0 1 0
2 3 1 1 0 0 0 0 0 1 0
3 3 1 0 1 0 0 0 0 1 0
4 3 1 0 0 1 0 0 0 1 0
5 3 1 0 0 0 1 0 0 1 0
6 3 1 0 0 0 0 1 0 1 0
1 4 1 0 0 0 0 0 0 0 1
2 4 1 1 0 0 0 0 0 0 1
3 4 1 0 1 0 0 0 0 0 1
4 4 1 0 0 1 0 0 0 0 1
5 4 1 0 0 0 1 0 0 0 1
6 4 1 0 0 0 0 1 0 0 1
## 
## Call:
## lm(formula = yield ~ batch + pressure, data = vascular)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5708 -1.3333 -0.3167  1.1417  4.1792 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   90.721      1.657  54.735  < 2e-16 ***
## batch2         2.050      1.914   1.071 0.301043    
## batch3         3.300      1.914   1.724 0.105201    
## batch4         2.850      1.914   1.489 0.157175    
## batch5        -2.375      1.914  -1.241 0.233684    
## batch6         6.750      1.914   3.527 0.003050 ** 
## pressure2     -1.133      1.563  -0.725 0.479457    
## pressure3     -3.900      1.563  -2.496 0.024713 *  
## pressure4     -7.050      1.563  -4.512 0.000414 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.707 on 15 degrees of freedom
## Multiple R-squared:  0.7712, Adjusted R-squared:  0.6492 
## F-statistic: 6.321 on 8 and 15 DF,  p-value: 0.00113
##         Pressure1 Pressure2 Pressure3 Pressure4  Average
## Batch1   90.30000  92.50000  85.50000  82.50000 87.70000
## Batch2   89.20000  89.50000  90.80000  89.50000 89.75000
## Batch3   98.20000  90.60000  89.60000  85.60000 91.00000
## Batch4   93.90000  94.70000  86.20000  87.40000 90.55000
## Batch5   87.40000  87.00000  88.00000  78.90000 85.32500
## Batch6   97.90000  95.80000  93.40000  90.70000 94.45000
## Average  92.81667  91.68333  88.91667  85.76667 89.79583
## Pressure2 Pressure3 Pressure4 
## -1.133333 -3.900000 -7.050000
## Batch2 Batch3 Batch4 Batch5 Batch6 
##  2.050  3.300  2.850 -2.375  6.750
##   Batch1 
## 90.72083

2.3.3 Visualize the fitted model

## 
## Call:
## lm(formula = yield ~ batch + pressure, data = vascular)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5708 -1.3333 -0.3167  1.1417  4.1792 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   90.721      1.657  54.735  < 2e-16 ***
## batch2         2.050      1.914   1.071 0.301043    
## batch3         3.300      1.914   1.724 0.105201    
## batch4         2.850      1.914   1.489 0.157175    
## batch5        -2.375      1.914  -1.241 0.233684    
## batch6         6.750      1.914   3.527 0.003050 ** 
## pressure2     -1.133      1.563  -0.725 0.479457    
## pressure3     -3.900      1.563  -2.496 0.024713 *  
## pressure4     -7.050      1.563  -4.512 0.000414 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.707 on 15 degrees of freedom
## Multiple R-squared:  0.7712, Adjusted R-squared:  0.6492 
## F-statistic: 6.321 on 8 and 15 DF,  p-value: 0.00113

2.4 ANOVA

2.4.1 Formulae

The formula for computing block and treatment sum squares is the same as in one-way ANOVA.

##         Pressure1 Pressure2 Pressure3 Pressure4  Average
## Batch1   90.30000  92.50000  85.50000  82.50000 87.70000
## Batch2   89.20000  89.50000  90.80000  89.50000 89.75000
## Batch3   98.20000  90.60000  89.60000  85.60000 91.00000
## Batch4   93.90000  94.70000  86.20000  87.40000 90.55000
## Batch5   87.40000  87.00000  88.00000  78.90000 85.32500
## Batch6   97.90000  95.80000  93.40000  90.70000 94.45000
## Average  92.81667  91.68333  88.91667  85.76667 89.79583
## [1] 480.3096
## [1] 178.1712
## [1] 192.2521
## [1] 109.8863

2.4.2 Using lm and anova functions

Df Sum Sq Mean Sq F value Pr(>F)
pressure 3 178.1712 59.39042 3.931339 0.023448
Residuals 20 302.1383 15.10692 NA NA
Df Sum Sq Mean Sq F value Pr(>F)
batch 5 192.2521 38.45042 2.402671 0.0777009
Residuals 18 288.0575 16.00319 NA NA
Df Sum Sq Mean Sq F value Pr(>F)
batch 5 192.2521 38.45042 5.248666 0.0055317
pressure 3 178.1712 59.39042 8.107077 0.0019163
Residuals 15 109.8863 7.32575 NA NA
Df Sum Sq Mean Sq F value Pr(>F)
pressure 3 178.1712 59.39042 8.107077 0.0019163
batch 5 192.2521 38.45042 5.248666 0.0055317
Residuals 15 109.8863 7.32575 NA NA

2.5 Effect Comparison

2.5.1 Standard Error of the Difference Between Two Means

\[se(\hat \mu_i-\hat \mu_j)=\hat\sigma\times \sqrt{1/n_i+1/n_j}\] where \(\hat\sigma=\sqrt{MSE}\) from the treatment+block model. This SE will be multiplied by the t quantile or Tukey quantiles to obtain the LSD CI or family-wise CI.

2.5.2 Effect Comparison for Fixed Pair with LSD

##              Estimate Std. Error    t value     Pr(>|t|)       2.5 %    97.5 %
## (Intercept) 90.720833   1.657455 54.7350287 1.091578e-18  87.1880522 94.253615
## batch2       2.050000   1.913864  1.0711316 3.010429e-01  -2.0293043  6.129304
## batch3       3.300000   1.913864  1.7242605 1.052012e-01  -0.7793043  7.379304
## batch4       2.850000   1.913864  1.4891341 1.571751e-01  -1.2293043  6.929304
## batch5      -2.375000   1.913864 -1.2409451 2.336842e-01  -6.4543043  1.704304
## batch6       6.750000   1.913864  3.5268966 3.050449e-03   2.6706957 10.829304
## pressure2   -1.133333   1.562663 -0.7252575 4.794567e-01  -4.4640714  2.197405
## pressure3   -3.900000   1.562663 -2.4957391 2.471273e-02  -7.2307380 -0.569262
## pressure4   -7.050000   1.562663 -4.5115284 4.136854e-04 -10.3807380 -3.719262
## [1] 1.913864
## [1] 1.562663
## [1] 4.079304
## [1] 3.330738

2.5.3 Tukey Multiple Comparison of Effects

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = yield ~ pressure + batch, data = vascular)
## 
## $pressure
##          diff        lwr       upr     p adj
## 2-1 -1.133333  -5.637161  3.370495 0.8854831
## 3-1 -3.900000  -8.403828  0.603828 0.1013084
## 4-1 -7.050000 -11.553828 -2.546172 0.0020883
## 3-2 -2.766667  -7.270495  1.737161 0.3245644
## 4-2 -5.916667 -10.420495 -1.412839 0.0086667
## 4-3 -3.150000  -7.653828  1.353828 0.2257674
## 
## $batch
##       diff         lwr        upr     p adj
## 2-1  2.050  -4.1680828  8.2680828 0.8853016
## 3-1  3.300  -2.9180828  9.5180828 0.5376297
## 4-1  2.850  -3.3680828  9.0680828 0.6757699
## 5-1 -2.375  -8.5930828  3.8430828 0.8105903
## 6-1  6.750   0.5319172 12.9680828 0.0297368
## 3-2  1.250  -4.9680828  7.4680828 0.9845521
## 4-2  0.800  -5.4180828  7.0180828 0.9980198
## 5-2 -4.425 -10.6430828  1.7930828 0.2483499
## 6-2  4.700  -1.5180828 10.9180828 0.1986961
## 4-3 -0.450  -6.6680828  5.7680828 0.9998784
## 5-3 -5.675 -11.8930828  0.5430828 0.0837504
## 6-3  3.450  -2.7680828  9.6680828 0.4925715
## 5-4 -5.225 -11.4430828  0.9930828 0.1263042
## 6-4  3.900  -2.3180828 10.1180828 0.3674672
## 6-5  9.125   2.9069172 15.3430828 0.0027838

## [1] 6.218083
## [1] 4.503828

2.6 Model Checking

2.6.1 Checking the treatment or block only model

## 
##  Shapiro-Wilk normality test
## 
## data:  fit_batch$residuals
## W = 0.98098, p-value = 0.9132
## 
##  Shapiro-Wilk normality test
## 
## data:  fit_pressure$residuals
## W = 0.95664, p-value = 0.3747

From the boxplots of residuals versus un-accounted variables, we can see clearly that there are still signals uncaptured. However, the normality test for the residuals of each model may tell you that the model is not bad good. To compare (not check) the goodness of fits of different models, another criteria, AIC, is often used. AIC is a criterion related to sum squares of residuals and the number of parameters.

## [1] 141.7516
## [1] 138.897
## [1] 124.6225

Note: AIC is not required for this course. ### Checking the treatment+block model

## 
##  Shapiro-Wilk normality test
## 
## data:  fit_vas$res
## W = 0.95631, p-value = 0.3689

Df Sum Sq Mean Sq F value Pr(>F)
vascular$batch 5 13.44405 2.688810 1.84921 0.1538582
Residuals 18 26.17257 1.454032 NA NA
Df Sum Sq Mean Sq F value Pr(>F)
vascular$pressure 3 0.1339583 0.0446528 0.0226189 0.9952375
Residuals 20 39.4826620 1.9741331 NA NA

3 Analyzing Latin Square Design Experiments

A hardness testing machine presses a pointed rod (the ‘tip’) into a metal specimen (a ‘coupon’), with a known force. The depth of the depression is a measure of the hardness of the specimen. It is feared that, depending on the kind of tip used, the machine might give different readings. The experimenter wants 4 observations on each of the 4 types of tips. Suppose that the ‘coupon’ and the ‘operator’ of the testing machine are thought to be factors. Suppose there are p = 4 operators, p = 4 coupons, and p = 4 tips. The first two are nuisance factors, the last is the treatment factor.

operator coupon tip y
1 1 A 9.3
1 2 B 9.4
1 3 C 9.2
1 4 D 9.7
2 1 B 9.3
2 2 A 9.4
2 3 D 9.6
2 4 C 9.4
3 1 C 9.5
3 2 D 10.0
3 3 A 9.6
3 4 B 9.8
4 1 D 10.2
4 2 C 9.7
4 3 B 9.9
4 4 A 10.0
##         tip
## operator A B C D
##        1 1 1 1 1
##        2 1 1 1 1
##        3 1 1 1 1
##        4 1 1 1 1
##         coupon
## operator 1 2 3 4
##        1 1 1 1 1
##        2 1 1 1 1
##        3 1 1 1 1
##        4 1 1 1 1
##       tip
## coupon A B C D
##      1 1 1 1 1
##      2 1 1 1 1
##      3 1 1 1 1
##      4 1 1 1 1

## 
## Call:
## lm(formula = y ~ operator + coupon + tip, data = hardness)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -0.075 -0.025  0.000  0.025  0.050 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.300e+00  4.564e-02 203.753 9.43e-13 ***
## operator2    2.500e-02  4.082e-02   0.612 0.562764    
## operator3    3.250e-01  4.082e-02   7.961 0.000209 ***
## operator4    5.500e-01  4.082e-02  13.472 1.04e-05 ***
## coupon2      5.000e-02  4.082e-02   1.225 0.266570    
## coupon3      9.433e-16  4.082e-02   0.000 1.000000    
## coupon4      1.500e-01  4.082e-02   3.674 0.010402 *  
## tipB         2.500e-02  4.082e-02   0.612 0.562764    
## tipC        -1.250e-01  4.082e-02  -3.062 0.022172 *  
## tipD         3.000e-01  4.082e-02   7.348 0.000325 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05774 on 6 degrees of freedom
## Multiple R-squared:  0.9845, Adjusted R-squared:  0.9612 
## F-statistic: 42.33 on 9 and 6 DF,  p-value: 9.592e-05
Df Sum Sq Mean Sq F value Pr(>F) R2
operator 3 0.825 0.2750000 82.5 0.0000287 0.6395349
coupon 3 0.060 0.0200000 6.0 0.0307958 0.0465116
tip 3 0.385 0.1283333 38.5 0.0002585 0.2984496
Residuals 6 0.020 0.0033333 NA NA 0.0155039

4 Analyzing Balanced Incomplete Block Design

Suppose that a chemical engineer thinks that the time of reaction for a chemical process is a function of the type of catalyst employed. Four catalysts are currently being investigated. The experimental procedure consists of selecting a batch of raw material, loading the pilot plant, applying each catalyst in a separate run of the pilot plant, and observing the reaction time. Because variations in the batches of raw material may affect the performance of the catalysts, the engineer decides to use batches of raw material as blocks. However, each batch is only large enough to permit three catalysts to be run. Therefore a randomized incomplete block design must be used.

Treatment Block y
1 1 73
1 2 74
1 4 71
2 2 75
2 3 67
2 4 72
3 1 73
3 2 75
3 3 68
4 1 75
4 3 72
4 4 75
##      Treatment
## Block 1 2 3 4
##     1 1 0 1 1
##     2 1 1 1 0
##     3 0 1 1 1
##     4 1 1 0 1

4.2 Effect Comparison in BIBD

BIBD Effect Estimations is not exactly equal to treatment and block mean differences

##          1          2          3          4 
##  0.0000000 -1.3333333 -0.6666667  1.3333333
##         1         2         3         4 
##  0.000000  1.000000 -4.666667 -1.000000
## 
## Call:
## lm(formula = y ~ Block + Treatment, data = catalyst)
## 
## Residuals:
##          1          2          3          4          5          6          7 
##  7.500e-01 -3.750e-01 -3.750e-01  3.750e-01 -7.500e-01  3.750e-01  1.250e-01 
##          8          9         10         11         12 
##  1.188e-14 -1.250e-01 -8.750e-01  8.750e-01  1.077e-14 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  72.2500     0.6158 117.334 8.53e-10 ***
## Block2        2.1250     0.6982   3.043  0.02864 *  
## Block3       -4.7500     0.6982  -6.803  0.00105 ** 
## Block4       -0.8750     0.6982  -1.253  0.26554    
## Treatment2    0.2500     0.6982   0.358  0.73492    
## Treatment3    0.6250     0.6982   0.895  0.41173    
## Treatment4    3.6250     0.6982   5.192  0.00349 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8062 on 5 degrees of freedom
## Multiple R-squared:  0.9599, Adjusted R-squared:  0.9117 
## F-statistic: 19.94 on 6 and 5 DF,  p-value: 0.002396
##                  2.5 %     97.5 %
## (Intercept) 70.6671254 73.8328746
## Block2       0.3301889  3.9198111
## Block3      -6.5448111 -2.9551889
## Block4      -2.6698111  0.9198111
## Treatment2  -1.5448111  2.0448111
## Treatment3  -1.1698111  2.4198111
## Treatment4   1.8301889  5.4198111
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = y ~ Block + Treatment, data = catalyst)
## 
## $Block
##          diff       lwr       upr     p adj
## 2-1  1.000000 -1.428998  3.428998 0.4917130
## 3-1 -4.666667 -7.095665 -2.237669 0.0032954
## 4-1 -1.000000 -3.428998  1.428998 0.4917130
## 3-2 -5.666667 -8.095665 -3.237669 0.0013433
## 4-2 -2.000000 -4.428998  0.428998 0.0975482
## 4-3  3.666667  1.237669  6.095665 0.0096068
## 
## $Treatment
##          diff        lwr      upr     p adj
## 2-1 0.2222222 -2.2067758 2.651220 0.9852477
## 3-1 0.5555556 -1.8734425 2.984554 0.8323947
## 4-1 3.2222222  0.7932242 5.651220 0.0165827
## 3-2 0.3333333 -2.0956647 2.762331 0.9540767
## 4-2 3.0000000  0.5710020 5.428998 0.0222012
## 4-3 2.6666667  0.2376686 5.095665 0.0352698

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = y ~ Treatment + Block, data = catalyst)
## 
## $Treatment
##           diff        lwr      upr     p adj
## 2-1 -1.3333333 -3.7623314 1.095665 0.2923571
## 3-1 -0.6666667 -3.0956647 1.762331 0.7501716
## 4-1  1.3333333 -1.0956647 3.762331 0.2923571
## 3-2  0.6666667 -1.7623314 3.095665 0.7501716
## 4-2  2.6666667  0.2376686 5.095665 0.0352698
## 4-3  2.0000000 -0.4289980 4.428998 0.0975482
## 
## $Block
##           diff        lwr        upr     p adj
## 2-1  1.8888889 -0.5401092  4.3178869 0.1167979
## 3-1 -4.2222222 -6.6512203 -1.7932242 0.0051736
## 4-1 -0.7777778 -3.2067758  1.6512203 0.6622480
## 3-2 -6.1111111 -8.5401092 -3.6821131 0.0009414
## 4-2 -2.6666667 -5.0956647 -0.2376686 0.0352698
## 4-3  3.4444444  1.0154464  5.8734425 0.0125455
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = y ~ Block + Treatment, data = catalyst)
## 
## $Block
##          diff       lwr       upr     p adj
## 2-1  1.000000 -1.428998  3.428998 0.4917130
## 3-1 -4.666667 -7.095665 -2.237669 0.0032954
## 4-1 -1.000000 -3.428998  1.428998 0.4917130
## 3-2 -5.666667 -8.095665 -3.237669 0.0013433
## 4-2 -2.000000 -4.428998  0.428998 0.0975482
## 4-3  3.666667  1.237669  6.095665 0.0096068
## 
## $Treatment
##          diff        lwr      upr     p adj
## 2-1 0.2222222 -2.2067758 2.651220 0.9852477
## 3-1 0.5555556 -1.8734425 2.984554 0.8323947
## 4-1 3.2222222  0.7932242 5.651220 0.0165827
## 3-2 0.3333333 -2.0956647 2.762331 0.9540767
## 4-2 3.0000000  0.5710020 5.428998 0.0222012
## 4-3 2.6666667  0.2376686 5.095665 0.0352698

5 Analysis of Covariance (Accounting for Uncontrollable Continuous Covariates)

Textbook section 15.3

Consider an experiment (Flurry, 1939) to study the breaking strength (y) in grams of three types of starch film. The breaking strength is also known to depend on the thickness of the film (x) as measured in 10-4 inches. Because film thickness varies from run to run and its values cannot be controlled or chosen prior to the experiment, it should be treated as a covariate whose effect on strength needs to be accounted for before comparing the three types of starch.

strength thickness type
791.7 7.7 canna
610.0 6.3 canna
710.0 8.6 canna
940.7 11.8 canna
990.0 12.4 canna
916.2 12.0 canna
835.0 11.4 canna
724.3 10.4 canna
611.1 9.2 canna
621.7 9.0 canna
735.4 9.5 canna
990.0 12.5 canna
862.7 11.7 canna
731.0 8.0 corn
710.0 7.3 corn
604.7 7.2 corn
508.8 6.1 corn
393.0 6.4 corn
416.0 6.4 corn
400.0 6.9 corn
335.6 5.8 corn
306.4 5.3 corn
426.0 6.7 corn
382.5 5.8 corn
340.8 5.7 corn
436.7 6.1 corn
333.3 6.2 corn
382.3 6.3 corn
397.7 6.0 corn
619.1 6.8 corn
857.3 7.9 corn
592.5 7.2 corn
983.3 13.0 potato
958.8 13.3 potato
747.8 10.7 potato
866.0 12.2 potato
810.8 11.6 potato
950.0 9.7 potato
1282.0 10.8 potato
1233.8 10.1 potato
1660.0 12.7 potato
746.0 9.8 potato
650.0 10.0 potato
992.5 13.8 potato
896.7 13.3 potato
873.9 12.4 potato
924.4 12.2 potato
1050.0 14.1 potato
973.3 13.7 potato

5.1 Analysis 1-1: Considering ‘Type’ only, no ‘thickness’

Df Sum Sq Mean Sq F value Pr(>F) R2
type 2 2246204 1123102.19 32.6191 0 0.586473
Residuals 46 1583818 34430.82 NA NA 0.413527

5.2 Analysis 1-2: Considering ‘thickness’ only, no ‘Type’

## (Intercept)   thickness 
##   -43.46202    83.13356

Df Sum Sq Mean Sq F value Pr(>F) R2
thickness 1 2553357 2553357.20 94.00102 0 0.6666691
Residuals 47 1276665 27163.08 NA NA 0.3333309

5.3 Analysis 2-1: Considering “thickness” and then “type”

## (Intercept)   thickness   typecanna  typepotato 
##    74.59478    62.50120    83.66605   154.02614
##    (Intercept) thickness typecanna typepotato
## 1            1       7.7         1          0
## 2            1       6.3         1          0
## 3            1       8.6         1          0
## 4            1      11.8         1          0
## 5            1      12.4         1          0
## 6            1      12.0         1          0
## 7            1      11.4         1          0
## 8            1      10.4         1          0
## 9            1       9.2         1          0
## 10           1       9.0         1          0
## 11           1       9.5         1          0
## 12           1      12.5         1          0
## 13           1      11.7         1          0
## 14           1       8.0         0          0
## 15           1       7.3         0          0
## 16           1       7.2         0          0
## 17           1       6.1         0          0
## 18           1       6.4         0          0
## 19           1       6.4         0          0
## 20           1       6.9         0          0
## 21           1       5.8         0          0
## 22           1       5.3         0          0
## 23           1       6.7         0          0
## 24           1       5.8         0          0
## 25           1       5.7         0          0
## 26           1       6.1         0          0
## 27           1       6.2         0          0
## 28           1       6.3         0          0
## 29           1       6.0         0          0
## 30           1       6.8         0          0
## 31           1       7.9         0          0
## 32           1       7.2         0          0
## 33           1      13.0         0          1
## 34           1      13.3         0          1
## 35           1      10.7         0          1
## 36           1      12.2         0          1
## 37           1      11.6         0          1
## 38           1       9.7         0          1
## 39           1      10.8         0          1
## 40           1      10.1         0          1
## 41           1      12.7         0          1
## 42           1       9.8         0          1
## 43           1      10.0         0          1
## 44           1      13.8         0          1
## 45           1      13.3         0          1
## 46           1      12.4         0          1
## 47           1      12.2         0          1
## 48           1      14.1         0          1
## 49           1      13.7         0          1
## attr(,"assign")
## [1] 0 1 2 2
## attr(,"contrasts")
## attr(,"contrasts")$type
## [1] "contr.treatment"

Df Sum Sq Mean Sq F value Pr(>F) R2
thickness 1 2553357.20 2553357.20 94.185852 0.0000000 0.6666691
type 2 56724.87 28362.44 1.046207 0.3596542 0.0148106
Residuals 45 1219939.85 27109.77 NA NA 0.3185203

5.4 Analysis 2-2: Considering “type” then “thickness” (wrong for this question)

## (Intercept)   typecanna  typepotato   thickness 
##    74.59478    83.66605   154.02614    62.50120

Note that linear model estimation is the same as analysis 2-1, but anova will be different

Df Sum Sq Mean Sq F value Pr(>F) R2
type 2 2246204.4 1123102.19 41.42794 0.0000000 0.5864730
thickness 1 363877.7 363877.69 13.42238 0.0006526 0.0950067
Residuals 45 1219939.9 27109.77 NA NA 0.3185203

Note that the ordering of variables matters in producing ANOVA table.

5.5 Analyzing Starch Data Again: Varying Slopes (Interaction)

## 
## Call:
## lm(formula = strength ~ thickness * type)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -262.56  -83.31  -18.84   47.20  659.67 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -751.04     323.32  -2.323  0.02499 *  
## thickness              188.91      49.20   3.839  0.00040 ***
## typecanna              942.01     399.62   2.357  0.02303 *  
## typepotato            1338.49     446.49   2.998  0.00451 ** 
## thickness:typecanna   -129.62      54.17  -2.393  0.02116 *  
## thickness:typepotato  -156.40      55.44  -2.821  0.00721 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 154.7 on 43 degrees of freedom
## Multiple R-squared:  0.7314, Adjusted R-squared:  0.7002 
## F-statistic: 23.42 on 5 and 43 DF,  p-value: 2.774e-11
x
(Intercept) -751.0374
thickness 188.9074
typecanna 942.0086
typepotato 1338.4945
thickness:typecanna -129.6155
thickness:typepotato -156.3974
X.Intercept. thickness typecanna typepotato thickness.typecanna thickness.typepotato
1 7.7 1 0 7.7 0.0
1 6.3 1 0 6.3 0.0
1 8.6 1 0 8.6 0.0
1 11.8 1 0 11.8 0.0
1 12.4 1 0 12.4 0.0
1 12.0 1 0 12.0 0.0
1 11.4 1 0 11.4 0.0
1 10.4 1 0 10.4 0.0
1 9.2 1 0 9.2 0.0
1 9.0 1 0 9.0 0.0
1 9.5 1 0 9.5 0.0
1 12.5 1 0 12.5 0.0
1 11.7 1 0 11.7 0.0
1 8.0 0 0 0.0 0.0
1 7.3 0 0 0.0 0.0
1 7.2 0 0 0.0 0.0
1 6.1 0 0 0.0 0.0
1 6.4 0 0 0.0 0.0
1 6.4 0 0 0.0 0.0
1 6.9 0 0 0.0 0.0
1 5.8 0 0 0.0 0.0
1 5.3 0 0 0.0 0.0
1 6.7 0 0 0.0 0.0
1 5.8 0 0 0.0 0.0
1 5.7 0 0 0.0 0.0
1 6.1 0 0 0.0 0.0
1 6.2 0 0 0.0 0.0
1 6.3 0 0 0.0 0.0
1 6.0 0 0 0.0 0.0
1 6.8 0 0 0.0 0.0
1 7.9 0 0 0.0 0.0
1 7.2 0 0 0.0 0.0
1 13.0 0 1 0.0 13.0
1 13.3 0 1 0.0 13.3
1 10.7 0 1 0.0 10.7
1 12.2 0 1 0.0 12.2
1 11.6 0 1 0.0 11.6
1 9.7 0 1 0.0 9.7
1 10.8 0 1 0.0 10.8
1 10.1 0 1 0.0 10.1
1 12.7 0 1 0.0 12.7
1 9.8 0 1 0.0 9.8
1 10.0 0 1 0.0 10.0
1 13.8 0 1 0.0 13.8
1 13.3 0 1 0.0 13.3
1 12.4 0 1 0.0 12.4
1 12.2 0 1 0.0 12.2
1 14.1 0 1 0.0 14.1
1 13.7 0 1 0.0 13.7

Df Sum Sq Mean Sq F value Pr(>F) R2
thickness 1 2553357.20 2553357.20 106.741403 0.0000000 0.6666691
type 2 56724.87 28362.44 1.185673 0.3153306 0.0148106
thickness:type 2 191338.42 95669.21 3.999388 0.0255292 0.0499575
Residuals 43 1028601.43 23920.96 NA NA 0.2685628

The most resonable way to account for the contribution of “type” should be comparing these two models.

Res.Df RSS Df Sum of Sq F Pr(>F)
47 1276665 NA NA NA NA
43 1028601 4 248063.3 2.59253 0.0497493
## [1] 0.1943055

5.6 Which model should we choose to interpret? Using AIC

Notes: not required for this course

AIC is an estimate of the leave-one-out predictive accuracy, a value related to SSE but penalized by the number of model parameters.

## [1] 640.6986
## [1] 645.0581
## [1] 643.2851
## [1] 655.8489

The model considering the interaction between thickness and type has the lowest AIC value, which means that it has the best out-of-sample predictive ability. Our conclusion should be based on this model.

The model considering thickness+type has lower predictive accuracy than the model considering thickness only. This is another approach to conclude that the type of starch does not affect the mean of the strengthness of films. After considering the interaction between the thickness and the type of starch, we discover that the type of starch affects the slope between the strengthness and thickness.