An experiment was conducted to determine the optimum conditions for
extruding plastic film. Three responses, tear resistance,
film gloss and film opacity were measured in
relation to two factors, rate of extrusion and amount of an
additive, both of these being set to two values, High and
Low. The data set comes from Johnson &
Wichern (1992).
Multivariate tests
We begin with an overall MANOVA for the two-way MANOVA model. In all
these analyses, we use car::Anova() for significance tests
rather than stats::anova(), which only provides so-called
“Type I” (sequential) tests for terms in linear models.
In this example, because each effect has 1 df, all of the
multivariate statistics (Roy’s maximum root test, Pillai and Hotelling
trace criteria, Wilks’ Lambda) are equivalent, in that they give the
same \(F\) statistics and \(p\)-values. We specify
test.statistic="Roy" to emphasize that Roy’s test has a
natural visual interpretation in HE plots.
plastic.mod <- lm(cbind(tear, gloss, opacity) ~ rate*additive, data=Plastic)
Anova(plastic.mod, test.statistic="Roy")
#>
#> Type II MANOVA Tests: Roy test statistic
#> Df test stat approx F num Df den Df Pr(>F)
#> rate 1 1.619 7.55 3 14 0.003 **
#> additive 1 0.912 4.26 3 14 0.025 *
#> rate:additive 1 0.287 1.34 3 14 0.302
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
For the three responses jointly, the main effects of
rate and additive are significant, while their
interaction is not. In some approaches to testing effects in
multivariate linear models (MLMs), significant multivariate tests are
often followed by univariate tests on each of the responses separately
to determine which responses contribute to each significant effect.
In R, univariate analyses are conveniently performed using the
update() method for the mlm object
plastic.mod, which re-fits the model with only a single
outcome variable.
Anova(update(plastic.mod, tear ~ .))
#> Anova Table (Type II tests)
#>
#> Response: tear
#> Sum Sq Df F value Pr(>F)
#> rate 1.74 1 15.8 0.0011 **
#> additive 0.76 1 6.9 0.0183 *
#> rate:additive 0.00 1 0.0 0.9471
#> Residuals 1.76 16
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(update(plastic.mod, gloss ~ .))
#> Anova Table (Type II tests)
#>
#> Response: gloss
#> Sum Sq Df F value Pr(>F)
#> rate 1.300 1 7.92 0.012 *
#> additive 0.613 1 3.73 0.071 .
#> rate:additive 0.544 1 3.32 0.087 .
#> Residuals 2.628 16
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(update(plastic.mod, opacity ~ .))
#> Anova Table (Type II tests)
#>
#> Response: opacity
#> Sum Sq Df F value Pr(>F)
#> rate 0.4 1 0.10 0.75
#> additive 4.9 1 1.21 0.29
#> rate:additive 4.0 1 0.98 0.34
#> Residuals 64.9 16
The results above show significant main effects for
tear, a significant main effect of rate for
gloss, and no significant effects for opacity,
but they don’t shed light on the nature of these effects.
Traditional univariate plots of the means for each variable separately
are useful, but they don’t allow visualization of the relations
among the response variables.
HE plots
We can visualize these effects for pairs of variables in an HE plot,
showing the “size” and orientation of hypothesis variation (\(\mathbf{H}\)) in relation to error
variation (\(\mathbf{E}\)) as
ellipsoids. When, as here, the model terms have 1 degree of freedom, the
\(\mathbf{H}\) ellipsoids degenerate to
a line.
In HE plots, the \(\mathbf{H}\)
ellipses can be scaled relative to the \(\mathbf{E}\) to show
significance of effects (size="evidence"),
or effect size (size="effect"). In the
former case, a model term is significant (using Roy’s maximum root test)
iff the \(\mathbf{H}\)
projects anywhere outside the \(\mathbf{E}\) ellipse.
This plot overlays those for both scaling, using thicker lines for
the effect scaling.
## Compare evidence and effect scaling
colors = c("red", "darkblue", "darkgreen", "brown")
heplot(plastic.mod, size="evidence",
col=colors, cex=1.25,
fill=TRUE, fill.alpha=0.1)
heplot(plastic.mod, size="effect",
add=TRUE, lwd=5, term.labels=FALSE, col=colors)
The interpretation can be easily read from the plot, at least for the
two response variables (tear and gloss) that
are shown in this bivariate view. The effect of rate of
extrusion is highly significant: high rate shows greater
tear compared to low rate. The effect of amount of additive
is not significant in this view, but high level of additive has greater
tear and gloss.
With effect scaling, both the \(\mathbf{H}\) and \(\mathbf{E}\) sums of squares and products
matrices are both divided by the error df, giving multivariate analogs
of univariate measures of effect size, e.g., \((\bar{y}_1-\bar{y}_2) / s\). With
significance scaling, the \(\mathbf{H}\) ellipse is further divided by
\(\lambda_\alpha\), the critical value
of Roy’s largest root statistic. This scaling has the property that an
\(\mathbf{H}\) ellipse will protrude
somewhere outside the \(\mathbf{E}\)
ellipse iff the multivariate test is significant at level \(\alpha\). Figure @ref(fig:plastic1) shows
both scalings, using a thinner line for significance scaling. Note that
the (degenerate) ellipse for additive is significant, but
does not protrude outside the \(\mathbf{E}\) ellipse in this view. All that
is guaranteed is that it will protrude somewhere in the 3D space of the
responses.
By design, means for the levels of interaction terms are not shown in
the HE plot, because doing so in general can lead to messy displays. We
can add them here for the term rate:additive as
follows:
# Compare evidence and effect scaling
colors = c("red", "darkblue", "darkgreen", "brown")
heplot(plastic.mod, size="evidence",
col=colors, cex=1.25,
fill=TRUE, fill.alpha=0.05)
heplot(plastic.mod, size="effect",
add=TRUE, lwd=5, term.labels=FALSE, col=colors)
## add interaction means
intMeans <- termMeans(plastic.mod, 'rate:additive', abbrev.levels=2)
points(intMeans[,1], intMeans[,2], pch=18, cex=1.2, col="brown")
text(intMeans[,1], intMeans[,2], rownames(intMeans),
adj=c(0.5, 1), col="brown")
lines(intMeans[c(1,3),1], intMeans[c(1,3),2], col="brown")
lines(intMeans[c(2,4),1], intMeans[c(2,4),2], col="brown")
The factor means in this plot (Figure @ref(fig:plastic1) have a
simple interpretation: The high rate level yields greater
tear resistance but lower gloss than the low
level. The high additive amount produces greater
tear resistance and greater gloss.
The rate:additive interaction is not significant
overall, though it approaches significance for gloss. The
cell means for the combinations of rate and
additive shown in this figure suggest an explanation, for
tutorial purposes: with the low level of rate, there is
little difference in gloss for the levels of
additive. At the high level of rate, there is
a larger difference in gloss. The \(\mathbf{H}\) ellipse for the interaction of
rate:additive therefore “points” in the direction of
gloss indicating that this variable contributes to the
interaction in the multivariate tests.
In some MANOVA models, it is of interest to test sub-hypotheses of a
given main effect or interaction, or conversely to test composite
hypotheses that pool together certain effects to test them jointly. All
of these tests (and, indeed, the tests of terms in a given model) are
carried out as tests of general linear hypotheses in the MLM.
In this example, it might be useful to test two composite hypotheses:
one corresponding to both main effects jointly, and another
corresponding to no difference among the means of the four groups
(equivalent to a joint test for the overall model). These tests are
specified in terms of subsets or linear combinations of the model
parameters.
plastic.mod
#>
#> Call:
#> lm(formula = cbind(tear, gloss, opacity) ~ rate * additive, data = Plastic)
#>
#> Coefficients:
#> tear gloss opacity
#> (Intercept) 6.30 9.56 3.74
#> rateHigh 0.58 -0.84 -0.60
#> additiveHigh 0.38 0.02 0.10
#> rateHigh:additiveHigh 0.02 0.66 1.78
Thus, for example, the joint test of both main effects tests the
parameters rateHigh and additiveHigh.
linearHypothesis(plastic.mod, c("rateHigh", "additiveHigh"),
title="Main effects") |>
print(SSP=FALSE)
#>
#> Multivariate Tests: Main effects
#> Df test stat approx F num Df den Df Pr(>F)
#> Pillai 2 0.71161 2.7616 6 30 0.029394 *
#> Wilks 2 0.37410 2.9632 6 28 0.022839 *
#> Hotelling-Lawley 2 1.44400 3.1287 6 26 0.019176 *
#> Roy 2 1.26253 6.3127 3 15 0.005542 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linearHypothesis(plastic.mod, c("rateHigh", "additiveHigh", "rateHigh:additiveHigh"),
title="Groups") |>
print(SSP=FALSE)
#>
#> Multivariate Tests: Groups
#> Df test stat approx F num Df den Df Pr(>F)
#> Pillai 3 1.14560 3.2948 9 48.000 0.003350 **
#> Wilks 3 0.17802 3.9252 9 34.223 0.001663 **
#> Hotelling-Lawley 3 2.81752 3.9654 9 38.000 0.001245 **
#> Roy 3 1.86960 9.9712 3 16.000 0.000603 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correspondingly, we can display these tests in the HE plot by
specifying these tests in the hypothesis argument to
heplot(), as shown in Figure @ref(fig:plastic2).
heplot(plastic.mod,
hypotheses=list("Group" = c("rateHigh", "additiveHigh", "rateHigh:additiveHigh ")),
col=c(colors, "purple"),
fill = TRUE, fill.alpha = 0.1,
lwd=c(2, 3, 3, 3, 2), cex=1.25)
heplot(plastic.mod,
hypotheses=list("Main effects" = c("rateHigh", "additiveHigh")),
add=TRUE,
col=c(colors, "darkgreen"), cex=1.25)
Finally, a 3D HE plot can be produced with heplot3d(),
giving Figure @ref(fig:plastic1-HE3D). This plot was rotated
interactively to a view that shows both main effects protruding outside
the error ellipsoid.
colors = c("pink", "darkblue", "darkgreen", "brown")
heplot3d(plastic.mod, col=colors)
3D HE plot for the plastic MLM