ANCOVA

Note (August 19, 2010): although the approaches described here remain as valid as their assumptions, a better approach in terms of modeling capability/flexibility and result presentation is to use either 3dttest++ or mixed-effects meta-analysis with 3dMEMA.

FMRI group analysis with ANOVA assumes that all subjects are drawn from a completely randomized design. However, such randomness is not always achieved. In many situations the investigator does not have a homogeneous set of subjects, and the differences among them could distort the analysis. For example, in FMRI studies direct control of variability due to subject characteristics or performance (e.g., the age effect of the subjects) is most likely unrealistic: the investigator cannot select individual subjects for an experiment at will. Instead, indirect (statistical) control is available through analysis of covariance (ANCOVA), achieved by measuring one or more concomitant variates in addition to the variates (factors) of primary interest. Such concomitant variates (nuisance, extraneous, or ancillary variables) are also called covariates; they are measured for the purpose of adjusting the measurements on the variates (factors). A covariate is supposed to have some cause-effect relation with the dependent variable (percent signal change). Potential covariates include age, cortical thickness, and behavioral data such as response time. Although covariates are mostly continuous variables, discrete variables (e.g., sex) are sometimes treated as covariates if the user has no specific interest in such a variable other than "regressing" it out in the analysis.

By accounting for the correlation between the covariate and the dependent variable, potential sources of bias (variability of the dependent variable that is attributable to the covariates) can be removed from the experiment, and the group analysis gains statistical power because the mean square against which the effects are tested becomes smaller. Covariance adjustment is especially effective when the regression on the covariates is linear, because it makes the groups more comparable.
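
The gain in power can be illustrated with a toy calculation at a single voxel. The sketch below uses invented ages and percent signal changes (assumptions for illustration, not data from any study) and compares the residual mean square of a mean-only model with that of a model that also includes the covariate.

```python
# Toy illustration: residual mean square with and without a covariate.
# All numbers are invented for illustration only.

def mean(values):
    return sum(values) / len(values)

def residual_mean_squares(x, y):
    """Return (MS without covariate, MS with covariate) for one voxel."""
    n = len(y)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    ms_without = syy / (n - 1)        # mean-only model: n - 1 degrees of freedom
    sse_with = syy - sxy ** 2 / sxx   # error sum of squares after fitting the slope
    ms_with = sse_with / (n - 2)      # one more parameter: n - 2 degrees of freedom
    return ms_without, ms_with

ages = [22, 25, 31, 38, 44, 52, 60, 67]
signal = [0.95, 0.90, 0.88, 0.75, 0.70, 0.62, 0.50, 0.48]

ms_without, ms_with = residual_mean_squares(ages, signal)
print(f"residual MS without covariate: {ms_without:.4f}")
print(f"residual MS with covariate:    {ms_with:.4f}")
```

The reduction only materializes when the covariate actually correlates with the response; with an unrelated covariate, the degree of freedom spent on the slope can leave the mean square slightly larger.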

There are basically three assumptions in ANCOVA: (1) linearity of regression, i.e., a linear relation between percent signal change and the covariate; (2) exact measurement of the covariate; (3) the covariate is independent of all factors (independent variables) and does not correlate highly with other covariates. Nonlinear relationships can compromise the analysis, while inaccuracy in the measurement of the covariate contributes additional variability to the data that is not taken into account by ANCOVA.

Before implementing a group analysis with ANCOVA, the user should carefully decide which contrasts and simple effects are of interest. To account for subject variability in group analysis, it is recommended that the user run a one-way ANCOVA for each contrast or simple effect separately, similar to its counterpart of a two-sample or one-sample t test with 3dttest. The following examples illustrate how to implement such commonly encountered covariance analyses with 3dRegAna. ANCOVA with more than one covariate can be easily worked out with 3dRegAna if the user understands the principle underlying ANCOVA. In fact, with its flexibility in the number of predictor variables and in design balance, 3dRegAna is a versatile program that can handle most group analyses if the user knows how to implement them.

(1) Contrast with covariate effect removed

Suppose we want to test a contrast at group level with age effect removed from the analysis. This is the counterpart of a one-sample t test.

The corresponding model for this analysis is

Y_i = β₀ + β₁X_i + ε_i, i = 1, 2, ..., n,

where Y_i is the contrast from subject i and X_i is the corresponding covariate. Parameter β₁ reflects the effect of the covariate on the dependent variable Y_i (percent signal change) that we would like to remove from Y_i, and β₀ is the adjusted contrast after the covariate effect is removed. The sign of β₁ indicates the direction of the covariate effect on percent signal change: a positive β₁ means that a bigger X_i leads to a higher percent signal change, and a negative β₁ means that a bigger X_i decreases percent signal change.

First remove the mean from the covariate (the ages of the subjects in this example). Centering the covariate is mandatory for this type of analysis; otherwise the mean age would interfere with the contrast (β₀ in the above model). The centered (demeaned) age is listed as the only column in the following 3dRegAna script. To verify the demeaning, check that this column adds up to 0.
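
Before turning to the script, the centering step can be sketched for a single voxel as follows. The ages and contrast values are hypothetical. The point is only that the centered column sums to zero and that, with a centered covariate, the least-squares intercept equals the group mean of the contrast, whereas with raw ages it is the fitted value at age 0.

```python
# Sketch: center a covariate and fit Y = b0 + b1 * X at one voxel.
# Ages and contrast values are hypothetical.

def center(values):
    """Subtract the mean so the column sums to (numerically) zero."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def fit_line(x, y):
    """Ordinary least squares for a single predictor; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

ages = [23, 30, 22, 17, 27, 19]
contrast = [0.42, 0.55, 0.38, 0.30, 0.51, 0.36]

centered_ages = center(ages)
b0_raw, b1_raw = fit_line(ages, contrast)
b0_centered, b1_centered = fit_line(centered_ages, contrast)

print("sum of centered ages:", round(sum(centered_ages), 10))
print("intercept, raw ages:     ", round(b0_raw, 4))
print("intercept, centered ages:", round(b0_centered, 4))
print("slope (same either way): ", round(b1_centered, 4))
```

Centering leaves the slope untouched; it only moves the point at which the intercept is evaluated from age 0 to the mean age of the group.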

3dRegAna \
-rows 15 \
-cols 1 \
-workmem 1000 \
-xydata 0.1 Contrast1+tlrc.BRIK \
-xydata 7.1 Contrast2+tlrc.BRIK \
...
-xydata -0.9 Contrast8+tlrc.BRIK \
...
-xydata -5.9 Contrast11+tlrc.BRIK \
-xydata 4.1 Contrast12+tlrc.BRIK \
...
-xydata -3.9 Contrast15+tlrc.BRIK \
-model 1 : 0 \
-bucket 0 GroupContr \
-brick 0 coef 0 "GroupContr" \
-brick 1 tstat 0 "GroupContr t" \
-brick 2 coef 1 "Age Effect" \
-brick 3 tstat 1 "Age Effect t"
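
Typing centered ages by hand is error-prone, so the command can also be generated. The following is a minimal sketch, assuming hypothetical ages and dataset names; it only reproduces the options used in the script above and prints the command text rather than running it.

```python
# Sketch: generate the one-covariate 3dRegAna command from a subject table.
# Ages and dataset names are hypothetical; the options are the ones used in
# the script above. The command is printed, not executed.

def build_command(ages, datasets, prefix="GroupContr", workmem=1000):
    if len(ages) != len(datasets):
        raise ValueError("need one age per dataset")
    mean_age = sum(ages) / len(ages)
    lines = [
        "3dRegAna",
        f"-rows {len(ages)}",
        "-cols 1",
        f"-workmem {workmem}",
    ]
    # One -xydata line per subject: centered age, then the input dataset.
    for age, dset in zip(ages, datasets):
        lines.append(f"-xydata {age - mean_age:.1f} {dset}")
    lines += [
        "-model 1 : 0",
        f"-bucket 0 {prefix}",
        '-brick 0 coef 0 "GroupContr"',
        '-brick 1 tstat 0 "GroupContr t"',
        '-brick 2 coef 1 "Age Effect"',
        '-brick 3 tstat 1 "Age Effect t"',
    ]
    # Join with a trailing backslash so the shell reads one command.
    return " \\\n".join(lines)

ages = [23, 30, 22, 17]
datasets = [f"Contrast{i}+tlrc.BRIK" for i in range(1, 5)]
command = build_command(ages, datasets)
print(command)
```

Because the centered values are rounded to one decimal place, the printed column may sum to a value very slightly different from 0 when the mean age is not a round number.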

Please note:

(a) The -model option in 3dRegAna specifies a reduced model and affects only the following output sub-bricks: the F-statistic and R² for the specified model, and the F-statistic for each regression coefficient if option -fcoef is used. The regression coefficients and their t statistics are calculated from the full model using the least-squares principle, regardless of the model specified with the -model option.

(b) As the program was developed in the old days when computer memory was a big deal, option -workmem (unit: MB) was provided to adapt to the user's computing environment. With a miserable default value of 12 MB, you'd most likely want to juice it up to something like 1000 MB (= 1 GB); otherwise your patience will be significantly challenged.

(c) If the last few lines with option -brick are absent, the sub-bricks in the output file GroupContr+tlrc are labeled "Coef #0", "Coef #0 t", etc. The following script can be used to make the labels more descriptive:

3drefit \
-sublabel 0 "GroupContr" \
-sublabel 1 "GroupContr t" \
-sublabel 2 "Age Effect" \
-sublabel 3 "Age Effect t" \
GroupContr+tlrc

(2) Comparing two groups with covariate effect removed

Suppose we would like to compare two groups (patient and normal) on a condition or contrast with age effect removed from the analysis. This is similar to a two-sample t test on the two groups. The model for this case is

Y_i = β₀ + β₁X_1i + β₂X_2i + β₃X_3i + ε_i, i = 1, 2, ..., n,

where β₀ is the intercept of the straight-line fit in the model.

Centering the covariate, variable X_1i in the above model, is not strictly necessary for this type of analysis, but if it is done, make sure the whole covariate column in the 3dRegAna script adds up to 0. Demeaning X_1i makes β₀ easy to interpret: it is the effect of the patient group when X_2i is defined as below and the covariate mean is removed (flip the 0's and 1's in the 2nd column if the normal group effect is desired). The coefficient β₁ is the slope of the fitted line for the group coded 0 (the patient group here), representing the influence of the covariate on the dependent variable Y_i (percent signal change).

The 0's and 1's in the second column differentiate the two groups with 0 coding for patient and 1 for normal. Basically this defines a dummy variable in the above model:

X_2i = 0, when the subject is a patient;
X_2i = 1, when the subject is normal.

Coefficient β₂ is thus the effect of the normal group relative to the patient group, i.e., the contrast between the two groups: if positive, it is the amount by which the normal group is more active than the patient group; if negative, the amount by which it is less active.

The variable X_3i in the above model is defined as the product of X_1i and X_2i, and thus the third column in the script below models the interaction between age effect and group effect, which is meant to find out whether the effect of one variable depends on the specific value of the other.

In the 3dRegAna script below, the 3rd column is simply the product of the first and second columns, and the coefficient β₃ reveals the interaction effect between age and group. When β₃ is positive, the age effect augments the difference between the two groups; when negative, it decreases the group difference. Without this variable in the model, ANCOVA would assume homogeneity of regression, in which two parallel lines are separated vertically by the main effect of each group. Parallelism (equal slopes of the regression lines) is equivalent to having no interaction between the covariate and the factors. However, it is more appropriate to allow for a potential interaction, in which the relation between age and the hemodynamic response is assumed to differ between groups. In other words, each group has its own regression slope instead of a parallel fit, so the interaction between the covariate and the group is automatically considered.
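
The three predictor columns and the meaning of the four coefficients can be checked with a small single-voxel sketch. The ages and responses below are hypothetical. The sketch relies on the fact that, once the interaction column is included, the full model amounts to one regression line per group, so the coefficients can be read off two separate straight-line fits.

```python
# Sketch: design columns for the two-group model and what the coefficients mean.
# Ages and responses are hypothetical, for a single voxel.

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def design_rows(ages, groups):
    """Columns X1 (centered age), X2 (0 = patient, 1 = normal), X3 = X1 * X2."""
    mean_age = sum(ages) / len(ages)
    return [(a - mean_age, g, (a - mean_age) * g) for a, g in zip(ages, groups)]

ages = [20, 26, 31, 35, 22, 27, 30, 33]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0.30, 0.36, 0.44, 0.47, 0.52, 0.60, 0.66, 0.71]

rows = design_rows(ages, groups)

# Split by group and fit one line per group on the centered age.
pat = [(r[0], yi) for r, yi in zip(rows, y) if r[1] == 0]
norm = [(r[0], yi) for r, yi in zip(rows, y) if r[1] == 1]
pat_b0, pat_b1 = fit_line([p[0] for p in pat], [p[1] for p in pat])
norm_b0, norm_b1 = fit_line([p[0] for p in norm], [p[1] for p in norm])

beta0 = pat_b0             # patient effect at the overall mean age
beta1 = pat_b1             # age slope in the patient group
beta2 = norm_b0 - pat_b0   # normal minus patient at the overall mean age
beta3 = norm_b1 - pat_b1   # difference in age slope (interaction)

print(f"beta0={beta0:.4f} beta1={beta1:.4f} beta2={beta2:.4f} beta3={beta3:.4f}")
```

Note that the age column is centered on the mean over all subjects, not within each group, so β₀ and β₂ both refer to the overall mean age.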

The input files are the regression coefficients or contrast intensities (not statistics) from individual subject analysis. As with 3dttest, unequal sample sizes in the two groups are not a problem for 3dRegAna.

3dRegAna \
-rows 30 \
-cols 3 \
-workmem 1000 \
-xydata 0.1 0 0 patient/Pat1+tlrc.BRIK \
-xydata 7.1 0 0 patient/Pat2+tlrc.BRIK \
...
-xydata -0.9 0 0 patient/Pat8+tlrc.BRIK \
...
-xydata -5.9 0 0 patient/Pat11+tlrc.BRIK \
-xydata 4.1 0 0 patient/Pat12+tlrc.BRIK \
...
-xydata -3.9 0 0 patient/Pat15+tlrc.BRIK \
-xydata 2.1 1 2.1 normal/Norm1+tlrc.BRIK \
...
-xydata -0.9 1 -0.9 normal/Norm3+tlrc.BRIK \
-xydata 0.1 1 0.1 normal/Norm4+tlrc.BRIK \
...
-xydata -3.9 1 -3.9 normal/Norm9+tlrc.BRIK \
...
-xydata -8.9 1 -8.9 normal/Norm14+tlrc.BRIK \
-xydata 0.1 1 0.1 normal/Norm15+tlrc.BRIK \
-model 1 2 3 : 0 \
-bucket 0 Pat_vs_Norm \
-brick 0 coef 0 'PatEff' \
-brick 1 tstat 0 'PatEff t' \
-brick 2 coef 1 'Age Effect of Pat Group' \
-brick 3 tstat 1 'Age Effect of Pat Group t-stat' \
-brick 4 coef 2 'Norm-Pat' \
-brick 5 tstat 2 'Norm-Pat t' \
-brick 6 coef 3 'Interaction' \
-brick 7 tstat 3 'Interaction t'

The 'Interaction' coefficient is the difference in age effect between the normal and patient groups. If the last few lines with option -brick are absent, the sub-bricks in the output file Pat_vs_Norm+tlrc are labeled "Coef #0", "Coef #1", etc. Alternatively, the following script can be used afterwards to give them more descriptive labels:

3drefit \
-sublabel 0 "PatEff" \
-sublabel 1 "PatEff t" \
-sublabel 2 "Age Effect" \
-sublabel 3 "Age Effect t" \
-sublabel 4 "Norm-Pat" \
-sublabel 5 "Norm-Pat t" \
-sublabel 6 "Interaction" \
-sublabel 7 "Interaction t" \
Pat_vs_Norm+tlrc

(3) More than one covariate

A second covariate is handled just like the first: add one more column devoted to that covariate. If you also want to model interactions with the group and/or the first covariate, insert the corresponding product columns as well. The rest of the script follows the same pattern as above.
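
As a rough sketch of what the extra columns might look like, the snippet below builds one row of predictors per subject for two groups and two covariates. The covariate values, the column order, and the choice of interaction terms (each covariate times the group) are assumptions made for illustration only.

```python
# Sketch: predictor columns for two groups and two covariates.
# Values, column order, and the choice of interaction terms are assumptions.

def center(values):
    """Subtract the mean so the column sums to (numerically) zero."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def design_rows(cov1, cov2, groups, with_interactions=True):
    """Return one list of predictors per subject.

    Columns: cov1, cov2, group, then (optionally) cov1*group and cov2*group.
    """
    c1, c2 = center(cov1), center(cov2)
    rows = []
    for a, b, g in zip(c1, c2, groups):
        row = [a, b, g]
        if with_interactions:
            row += [a * g, b * g]
        rows.append(row)
    return rows

ages = [20, 30, 25, 35, 40, 30]
reaction_times = [510, 560, 480, 620, 650, 540]   # ms, hypothetical
groups = [0, 0, 0, 1, 1, 1]                       # 0 = patient, 1 = normal

rows = design_rows(ages, reaction_times, groups)
print("number of predictor columns:", len(rows[0]))
print("first normal subject:", rows[3])
```

Each row would supply the numbers on one -xydata line, and the number of predictor columns is the value to give to -cols, by analogy with the two scripts above.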

Related links

* 3dRegAna manual

* Basics of group analysis

* Back to Gang Chen's home page

Last modified: July 19, 2005

Last modified 2011-03-02 10:23

AFNI and NIfTI Server for NIMH/NIH/PHS/DHHS/USA/Earth
