ANCOVA
Note (August 19, 2010): although the approaches described here are still as valid as their assumptions, a better approach in terms of modeling capability, flexibility, and result presentation is to use either 3dttest++ or mixed-effects meta-analysis with 3dMEMA.
FMRI group analysis with ANOVA assumes that all subjects are drawn from a completely randomized design. However, this assumption is not always met. In many situations the investigator does not have a homogeneous set of subjects, and the differences among them could distort the analysis. For example, in FMRI studies direct control of variability due to subject differences (e.g., the age effect of the subjects) is usually unrealistic, because the investigator cannot select individual subjects for an experiment at will. Instead, indirect (statistical) control is available through analysis of covariance (ANCOVA), achieved by measuring one or more concomitant variates in addition to the variates (factors) of primary interest. Such concomitant variates (nuisance, extraneous, or ancillary variables) are also called covariates; they are measured for the purpose of adjusting the measurements on the variates (factors). A covariate is supposed to have some cause-effect relation with the dependent variable (percent signal change). Potential covariates include age, cortical thickness, and behavioral data such as response time. Although covariates are mostly continuous variables, discrete variables (e.g., sex) are sometimes treated as covariates if the user is not specifically interested in such a variable beyond "regressing" it out in the analysis.
By accounting for the correlation between the covariate and the dependent variable, potential sources of bias (variability of the dependent variable that is attributable to the covariates) can be removed from the experiment, and the group analysis gains statistical power because the mean square against which the effects are tested becomes smaller. Covariance adjustment is especially effective when the regression on the covariates is linear, because it makes the groups more comparable.
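As a rough illustration of the power argument above, the following sketch compares the error mean square of a one-sample model with and without a covariate. The numbers are invented toy values (not from any study), and the helper name `residual_mean_square` is hypothetical.

```python
# Toy illustration: adjusting for a covariate can shrink the error mean square.
# All numbers below are invented for illustration only.

def residual_mean_square(y, x=None):
    """Residual mean square of y ~ 1 (x is None) or y ~ 1 + x (least squares)."""
    n = len(y)
    y_bar = sum(y) / n
    if x is None:
        residuals = [yi - y_bar for yi in y]
        dof = n - 1                      # one parameter: the mean
    else:
        x_bar = sum(x) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = y_bar - slope * x_bar
        residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        dof = n - 2                      # two parameters: intercept and slope
    return sum(r ** 2 for r in residuals) / dof

# Hypothetical ages and percent signal changes for 8 subjects
age = [21, 24, 27, 30, 33, 36, 39, 42]
psc = [0.52, 0.55, 0.61, 0.60, 0.68, 0.70, 0.71, 0.78]

ms_without = residual_mean_square(psc)
ms_with = residual_mean_square(psc, age)
print(f"error mean square without covariate: {ms_without:.5f}")
print(f"error mean square with covariate:    {ms_with:.5f}")
```

The reduction appears only because the toy signal was made to depend on age. A covariate unrelated to the dependent variable costs one degree of freedom without reducing the error term.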
There are basically three assumptions in ANCOVA: (1) linearity of regression, i.e., a linear relation between percent signal change and the covariate; (2) exact measurement of the covariate; (3) independence of the covariate from all factors (independent variables), and no high correlation with other covariates. Nonlinear relationships can compromise the analysis, while inaccuracy in the measurement of the covariate contributes additional variability to the data that is not taken into account by ANCOVA.
Before implementing a group analysis, the user should carefully plan which contrasts and simple effects are desirable. To account for subject variability in group analysis, it is recommended that the user run a one-way ANCOVA for each contrast or simple effect separately, similar to its counterpart of a two-sample or one-sample t test with 3dttest. The following examples illustrate how to implement such commonly encountered covariance analyses with 3dRegAna. ANCOVA with more than one covariate can be easily worked out with 3dRegAna if the user understands the principle underlying ANCOVA. In fact, with its flexibility in the number of predictor variables and in design balance, 3dRegAna is a versatile program that can handle most group analyses if the user knows how to implement them.
(1) Contrast with covariate effect removed
Suppose we want to test a contrast at group level with age effect removed from the analysis. This is the counterpart of a one-sample t test.
The corresponding model for this analysis is
Yi = βo + β1Xi + εi, i = 1, 2, ..., n,
where Yi is the contrast from subject i and Xi is the corresponding covariate. Parameter β1 reflects the effect of the covariate on the dependent variable Yi (percent signal change) that we would like to remove from Yi, and βo is the adjusted contrast after the covariate effect is removed. The sign of β1 indicates the direction of the covariate effect on percent signal change: a positive β1 means that a larger Xi goes with a higher percent signal change, and a negative β1 means that a larger Xi goes with a lower percent signal change.
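A small sketch may make the roles of βo and β1 concrete. It fits the model above at a single hypothetical voxel, once with the raw covariate and once with the covariate centered at its mean. The ages and contrast values are invented, and `fit_line` is an illustrative helper, not part of 3dRegAna.

```python
# Single-voxel sketch of the model Yi = b0 + b1*Xi (toy numbers, not real data).

def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical ages and contrast values (percent signal change) at one voxel
age = [20.0, 24.0, 26.0, 30.0, 35.0]
contrast = [0.42, 0.47, 0.46, 0.55, 0.60]

mean_age = sum(age) / len(age)
centered_age = [a - mean_age for a in age]

b0_raw, b1_raw = fit_line(age, contrast)
b0_cen, b1_cen = fit_line(centered_age, contrast)

print(f"slope (raw covariate):          {b1_raw:.4f}")
print(f"slope (centered covariate):     {b1_cen:.4f}")
print(f"intercept (raw covariate):      {b0_raw:.4f}  # contrast extrapolated to age 0")
print(f"intercept (centered covariate): {b0_cen:.4f}  # contrast at the mean age")
```

The slope is the same in both fits. Only the intercept changes: with the centered covariate it equals the mean of the contrast values, which is the quantity a group-level test is usually meant to assess.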
First remove the mean from the covariate (the ages of the subjects in this example). Centering the covariate is mandatory for this type of analysis; otherwise βo in the above model would be the contrast extrapolated to age 0 rather than the contrast at the mean age. The centered (demeaned) age is listed as the only column in the following 3dRegAna script. To verify the demeaning, check that this column adds up to 0.
3dRegAna \
-rows 15 \
-cols 1 \
-workmem 1000 \
-xydata 0.1 Contrast1+tlrc.BRIK \
-xydata 7.1 Contrast2+tlrc.BRIK \
...
-xydata -0.9 Contrast8+tlrc.BRIK \
...
-xydata -5.9 Contrast11+tlrc.BRIK \
-xydata 4.1 Contrast12+tlrc.BRIK \
...
-xydata -3.9 Contrast15+tlrc.BRIK \
-model 1 : 0 \
-bucket 0 GroupContr \
-brick 0 coef 0 "GroupContr" \
-brick 1 tstat 0 "GroupContr t" \
-brick 2 coef 1 "Age Effect" \
-brick 3 tstat 1 "Age Effect t"
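The covariate column of a script like the one above can be prepared with a few lines of code. The sketch below centers a list of ages, checks that the column sums to 0, and prints the corresponding -xydata lines. The ages are made up, the helper name `centered_xydata_lines` is hypothetical, and the dataset names only follow the naming pattern used in the script.

```python
# Center a covariate and print -xydata lines for a one-covariate 3dRegAna script.
# The ages are hypothetical; replace them with the real values for your subjects.

def centered_xydata_lines(ages, datasets):
    """Return (centered ages, '-xydata' lines) for one covariate column."""
    mean_age = sum(ages) / len(ages)
    centered = [a - mean_age for a in ages]
    lines = [f"-xydata {c:.2f} {d} \\" for c, d in zip(centered, datasets)]
    return centered, lines

ages = [23, 31, 27, 19, 25]                      # hypothetical ages
datasets = [f"Contrast{i}+tlrc.BRIK" for i in range(1, len(ages) + 1)]

centered, lines = centered_xydata_lines(ages, datasets)

# The demeaned column should add up to 0 (up to floating-point rounding)
assert abs(sum(centered)) < 1e-9

for line in lines:
    print(line)
```

If the centered values are rounded before being pasted into the script, the column may no longer sum to exactly 0, so it is safer to keep enough decimal places.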
Please note:
(a) The -model option in 3dRegAna is for specifying a reduced model, and affects only the following output sub-bricks: the F-statistic and R² for the specified model, and the F-statistic for each regression coefficient if option -fcoef is used. The regression coefficients and their t statistics are calculated from the full model using the least squares principle, regardless of the model specified with the -model option.
(b) Because the program was developed in the old days when computer memory was a big deal, option -workmem (unit: MB) was provided to adapt to the user's computing environment. With a miserable default value of 12 MB, most likely you'd like to juice it up to something like 1000 MB (= 1 GB); otherwise your patience will be significantly challenged.
(c) If the last few lines with option -brick are absent, the sub-brick names in the output file GroupContr+tlrc would be labeled as "Coef #0", "Coef #0 t", etc. The following script can be used to make them more self-explanatory:
3drefit \
-sublabel 0 "GroupContr" \
-sublabel 1 "GroupContr t" \
-sublabel 2 "Age Effect" \
-sublabel 3 "Age Effect t" \
GroupContr+tlrc
(2) Comparing two groups with covariate effect removed
Suppose we would like to compare two groups (patient and normal) on a condition or contrast with age effect removed from the analysis. This is similar to a two-sample t test on the two groups. The model for this case is
Yi = βo + β1X1i + β2X2i + β3X3i + εi, i = 1, 2, ..., n,
where βo is the intercept of the straight-line fit in the model.
Centering the covariate, variable X1i in the above model, is not strictly necessary for this type of analysis, but if it is done, make sure the whole covariate column in the 3dRegAna script adds up to 0. Demeaning X1i makes βo easy to interpret: with X2i defined as below, βo is the effect of the patient group at the mean age (flip the 0's and 1's in the 2nd column if the normal group effect is desired). The coefficient β1 is the slope of the fitted line for the patient group (the group coded as 0), representing the influence of the covariate on the dependent variable Yi (percent signal change) in that group.
The 0's and 1's in the second column differentiate the two groups, with 0 coding for patient and 1 for normal. Basically this defines a dummy variable in the above model:
X2i = 0, when the subject is a patient;
X2i = 1, when the subject is normal.
Coefficient β2 is thus the effect of the normal group relative to the patient group, i.e., the contrast between the two groups evaluated at X1i = 0 (the mean age if the covariate is centered): the amount by which the normal group is more active than the patient group if positive, or less active if negative.
The variable X3i in the above model is defined as the product of X1i and X2i. Accordingly, the 3rd column in the 3dRegAna script below is simply the product of the first and second columns, and it models the interaction between age effect and group effect, which is meant to find out whether the effect of one variable depends on the specific value of the other. The coefficient β3 is the difference in age slope between the normal and patient groups. When β3 is positive, the (signed) difference between the normal and patient groups increases with age; when negative, it decreases with age. Without this variable in the model, ANCOVA would bear an assumption of homogeneity of regression, in which two parallel lines are separated vertically by the main effect of group. Parallelism (equal slopes of the regression lines) is equivalent to having no interaction between the covariate and the factors. However, it is more appropriate to allow for a potential interaction, in which the relation between age and hemodynamic response is assumed to be different for each group. In other words, each group gets its own slope of linear regression instead of a parallel fit, and thus the interaction between the covariate and group is automatically considered.
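One way to read the four coefficients is through the two per-group regression lines. The sketch below, using invented single-voxel data and an illustrative helper `fit_line`, fits a separate line to each group, derives βo, β1, β2, and β3 from them, and then checks that these values satisfy the least-squares conditions of the full model with the dummy and product columns. It covers the coefficient estimates only, not their t statistics.

```python
# Toy check: the model with a group dummy and a covariate-by-group product
# reproduces two separately fitted regression lines (invented data, one voxel).

def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical centered ages and percent signal changes
pat_age, pat_psc = [-6.0, -2.0, 1.0, 4.0], [0.30, 0.34, 0.41, 0.43]
norm_age, norm_psc = [-4.0, -1.0, 3.0, 5.0], [0.52, 0.50, 0.47, 0.44]

pat_b0, pat_b1 = fit_line(pat_age, pat_psc)
norm_b0, norm_b1 = fit_line(norm_age, norm_psc)

beta0 = pat_b0              # patient effect at covariate = 0 (mean age)
beta1 = pat_b1              # age slope in the patient group
beta2 = norm_b0 - pat_b0    # Norm-Pat difference at covariate = 0
beta3 = norm_b1 - pat_b1    # difference in age slope (interaction)
betas = (beta0, beta1, beta2, beta3)

# Design rows: constant, X1 (age), X2 (group dummy), X3 = X1*X2, then Y
rows = [(1.0, a, 0.0, 0.0, y) for a, y in zip(pat_age, pat_psc)]
rows += [(1.0, a, 1.0, a, y) for a, y in zip(norm_age, norm_psc)]

residuals = [y - sum(b * x for b, x in zip(betas, cols)) for *cols, y in rows]
# Least squares holds when the residuals are orthogonal to every design column
orthogonality = [sum(r * row[j] for r, row in zip(residuals, rows)) for j in range(4)]
print([f"{v:.1e}" for v in orthogonality])
print(f"Norm-Pat at mean age: {beta2:.4f}, slope difference: {beta3:.4f}")
```

In this reading, the group difference at a covariate value a is β2 + β3·a, which is why β2 alone describes the group contrast only at a = 0.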
The input files are the regression coefficients or contrast intensities (not statistics) from individual subject analysis. As with 3dttest, unequal sample sizes for the two groups are not a problem with 3dRegAna.
3dRegAna \
-rows 30 \
-cols 3 \
-workmem 1000 \
-xydata 0.1 0 0 patient/Pat1+tlrc.BRIK \
-xydata 7.1 0 0 patient/Pat2+tlrc.BRIK \
...
-xydata -0.9 0 0 patient/Pat8+tlrc.BRIK \
...
-xydata -5.9 0 0 patient/Pat11+tlrc.BRIK \
-xydata 4.1 0 0 patient/Pat12+tlrc.BRIK \
...
-xydata -3.9 0 0 patient/Pat15+tlrc.BRIK \
-xydata 2.1 1 2.1 normal/Norm1+tlrc.BRIK \
...
-xydata -0.9 1 -0.9 normal/Norm3+tlrc.BRIK \
-xydata 0.1 1 0.1 normal/Norm4+tlrc.BRIK \
...
-xydata -3.9 1 -3.9 normal/Norm9+tlrc.BRIK \
...
-xydata -8.9 1 -8.9 normal/Norm14+tlrc.BRIK \
-xydata 0.1 1 0.1 normal/Norm15+tlrc.BRIK \
-model 1 2 3 : 0 \
-bucket 0 Pat_vs_Norm \
-brick 0 coef 0 'PatEff' \
-brick 1 tstat 0 'PatEff t' \
-brick 2 coef 1 'Age Effect of Pat Group' \
-brick 3 tstat 1 'Age Effect of Pat Group t-stat' \
-brick 4 coef 2 'Norm-Pat' \
-brick 5 tstat 2 'Norm-Pat t' \
-brick 6 coef 3 'Interaction' \
-brick 7 tstat 3 'Interaction t'
# 'Interaction' is the difference of age effect between Normal and Patient groups
If the last few lines with option -brick are absent, the sub-brick names in the output file Pat_vs_Norm+tlrc would be labeled as "Coef #0", "Coef #1", etc. The following script can be used instead to make them more self-explanatory:
3drefit \
-sublabel 0 "PatEff" \
-sublabel 1 "PatEff t" \
-sublabel 2 "Age Effect" \
-sublabel 3 "Age Effect t" \
-sublabel 4 "Norm-Pat" \
-sublabel 5 "Norm-Pat t" \
-sublabel 6 "Interaction" \
-sublabel 7 "Interaction t" \
Pat_vs_Norm+tlrc
(3) More than one covariate
With one more covariate, simply add one more column devoted to that covariate, just as for the first covariate. In addition, if you want to consider any potential interactions with the group and/or the first covariate, insert the appropriate columns as well. Anything else should be self-evident.
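As a minimal sketch of how the columns multiply, the following builds one row of predictor values per subject for any number of covariates, with optional covariate-by-group product columns. The column order (covariates, then group, then products), the helper name `design_rows`, the `<dataset>` placeholder, and all data values are assumptions made for illustration.

```python
# Build predictor rows for a two-group design with several covariates.
# Assumed column order: centered covariates, group dummy, covariate*group products.

def design_rows(covariates, group, interact_with_group=True):
    """covariates: list of columns (one list per covariate); group: list of 0/1."""
    n = len(group)
    centered = []
    for column in covariates:
        mean = sum(column) / n
        centered.append([value - mean for value in column])
    rows = []
    for i in range(n):
        row = [c[i] for c in centered] + [group[i]]
        if interact_with_group:
            # product column: the centered covariate for group 1, zero for group 0
            row += [c[i] if group[i] == 1 else 0.0 for c in centered]
        rows.append(row)
    return rows

# Hypothetical values for four subjects
age = [20, 30, 25, 35]
response_time = [0.5, 0.7, 0.6, 0.8]
group = [0, 0, 1, 1]          # 0 = patient, 1 = normal, as in example (2)

rows = design_rows([age, response_time], group)
print(f"-cols {len(rows[0])}")
for row in rows:
    print("-xydata " + " ".join(f"{v:g}" for v in row) + " <dataset> \\")
```

With two covariates and their group interactions this gives five predictor columns, so the -cols value and the -brick lines of the script would have to grow accordingly.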
Last modified: July 19, 2005
Last modified 2011-03-02 10:23