# ANOVA

The nice thing about multi-way ANOVA is that we can explore possible interactions among factors, which are arguably of greater interest to the investigator than the main effects.

This Matlab package implements group analyses in a streamlined fashion, up to 5-way ANOVA. It can handle both **volumetric** and **surface** data. Unbalanced designs (unequal sample sizes) are also supported for certain ANOVA types (see Group Analysis with Unbalanced Designs).

With flexibility and potential extensions (e.g., unbalanced designs) in mind, the whole group analysis is treated as a general linear model (cell means model) in which all factors are coded as dummy/indicator variables. Of the two popular methods for solving the normal equations of the linear system (restrictions and generalized inverse), we adopt the former and impose restrictions, based on the fact that fixing the definition of one parameter in the system leads to the other parameters being uniquely defined. The core computation is QR decomposition of huge matrices, and the price of this approach is computational cost. All possible interaction terms are automatically modeled during the analysis. The following poster (in PowerPoint format) on the implementation of four-way ANOVA provides a little more detail about the numerical steps and the ANOVA table:

Group Analysis with Four-Way ANOVA in AFNI
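
As a rough illustration of the ideas above (indicator coding of a cell means model, solved by QR decomposition), here is a minimal Python sketch for a toy 2X2 design. It is not the GroupAna code: the function names, the data, and the use of modified Gram-Schmidt are all assumptions made for illustration, and the restrictions mentioned above are not needed in this toy case because every cell has observations.

```python
from itertools import product

def cell_means_design(levels_a, levels_b, labels):
    """Indicator (dummy) design matrix with one column per cell (a, b)."""
    cells = list(product(range(levels_a), range(levels_b)))
    X = [[1.0 if lab == cell else 0.0 for cell in cells] for lab in labels]
    return cells, X

def qr_least_squares(X, y):
    """Least squares via modified Gram-Schmidt QR and back substitution.

    Assumes X has full column rank (every cell has at least one observation).
    """
    n, p = len(X), len(X[0])
    Q = [[X[i][j] for i in range(n)] for j in range(p)]  # stored column-wise
    R = [[0.0] * p for _ in range(p)]
    for j in range(p):
        R[j][j] = sum(v * v for v in Q[j]) ** 0.5
        Q[j] = [v / R[j][j] for v in Q[j]]
        for k in range(j + 1, p):
            R[j][k] = sum(a * b for a, b in zip(Q[j], Q[k]))
            Q[k] = [b - R[j][k] * a for a, b in zip(Q[j], Q[k])]
    qty = [sum(a * b for a, b in zip(Q[j], y)) for j in range(p)]
    beta = [0.0] * p
    for j in reversed(range(p)):  # solve R * beta = Q'y
        tail = sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = (qty[j] - tail) / R[j][j]
    return beta

# Hypothetical data: two observations in each cell of a 2X2 design.
labels = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
y = [1.0, 3.0, 2.0, 4.0, 5.0, 7.0, 10.0, 12.0]

cells, X = cell_means_design(2, 2, labels)
beta = qr_least_squares(X, y)  # one estimated mean per cell
# Interaction contrast on the cell means: (m00 - m01) - (m10 - m11)
interaction = (beta[0] - beta[1]) - (beta[2] - beta[3])
for cell, mean in zip(cells, beta):
    print(cell, round(mean, 6))
print("interaction contrast:", round(interaction, 6))
```

In the cell means parameterization each coefficient is simply the mean of its cell, so main effects and interactions are all expressed as contrasts among those means.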

Please be aware that the current version of the group analysis package rests on a few assumptions, among which the user should pay particular attention to the following:

**(1) Homogeneity of variance for between-subjects factor: all levels of the between-subjects factor have the same variance.** This is essentially the same assumption as in the two-sample *t* test (*3dttest*) when the two samples are assumed to have the same variance.

According to David C. Howell, if the population distributions tend to be symmetric, or at least similarly shaped or unidirectionally skewed, and if the maximum variance among the populations is less than 4 times the minimum, the consequence of variance heterogeneity is not very serious when sample sizes are equal. However, unequal sample sizes raise a huge concern if the homogeneity of variance assumption is violated. If this assumption may be violated, try to avoid unequal sample sizes.
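
The rule of thumb above can be checked descriptively on a summary measure (for example, a regional average per subject). The following Python sketch is only an illustration: the function name and the two groups of numbers are hypothetical, and it is not a formal test of variance homogeneity.

```python
from statistics import variance

def variance_ratio(groups):
    """Largest sample variance divided by the smallest, across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Hypothetical per-subject values for two levels of a between-subjects factor.
group_a = [2.1, 2.9, 3.4, 4.0, 2.6]
group_b = [1.0, 4.5, 2.2, 5.1, 3.3]

ratio = variance_ratio([group_a, group_b])
equal_n = len(group_a) == len(group_b)
print(f"variance ratio = {ratio:.2f}, equal sample sizes = {equal_n}")
if ratio < 4 and equal_n:
    print("within the rule of thumb (max variance < 4 x min, equal n)")
else:
    print("treat the homogeneity assumption with caution")
```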

**(2) Sphericity (circularity) for within-subject factor - it requires both the homogeneity of variance and the homogeneity of correlation among the levels of the within-subject factor.**

To check sphericity, one can create a new set of variables composed of all possible pairwise differences between the levels of the within-subject factor; the variances of these differences must all be equal in the group. A related property, compound symmetry, requires of the original within-subject factor levels that (a) all the variances be equal in the group, and (b) all the covariances be equal in the group. Compound symmetry is much easier to verify, and is a special case of the sphericity assumption: if compound symmetry is satisfied, then sphericity is met.
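
The pairwise-difference description of sphericity can be sketched in a few lines of Python. This is a descriptive check only, under the assumption that the data are arranged with one row per subject and one column per level; the function name and the numbers are hypothetical. A formal assessment would use something like Mauchly's test, which is not shown here.

```python
from itertools import combinations
from statistics import variance

def pairwise_difference_variances(data):
    """Variance of the difference between every pair of levels.

    data: one row per subject, one column per within-subject level.
    """
    n_levels = len(data[0])
    result = {}
    for i, j in combinations(range(n_levels), 2):
        diffs = [row[i] - row[j] for row in data]
        result[(i, j)] = variance(diffs)
    return result

# Hypothetical data: 5 subjects measured at 3 levels of a within-subject factor.
scores = [
    [1.2, 2.0, 2.9],
    [0.8, 1.9, 3.4],
    [1.5, 2.4, 3.0],
    [1.1, 1.6, 2.7],
    [0.9, 2.2, 3.6],
]

# Sphericity holds (descriptively) when these variances are roughly equal.
for pair, var in pairwise_difference_variances(scores).items():
    print(f"levels {pair}: variance of differences = {var:.3f}")
```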

**If sphericity violation is a concern for a specific group analysis, it is highly recommended that 3dttest, 3dANOVA / 3dANOVA2 / 3dANOVA3 be used for contrast testing because of their high immunity to such a violation**.

`Available 1-way ANOVA types`

Type 1: one factor with fixed effect.

`Available 2-way ANOVA types`

Type 1: Factorial (crossed) design AXB - both factors are fixed.

Type 2: Factorial (crossed) design AXB - factor A is fixed while B is random. For example, A = stimulus category and B = subject (the usual case); this is also called a 1-way design with A varying within subject.

Type 3: Factorial (crossed) design AXB - both factors are random. Rarely used in FMRI group analysis.

`Available 3-way ANOVA types`

Type 1: Factorial (crossed) design AXBXC - all factors are fixed.

Type 2: Factorial (crossed) design AXBXC - factors A and B are fixed while C is random. For example, A, B = stimulus category, C = subject, sometimes (usually in behavioral sciences) called 2-way ANOVA with A and B varying within subject.

Type 3: Mixed design BXC(A) - A and B are fixed while C is random and nested within A. For example, A = subject group, B = stimulus category, C = subject, usually called 2-way ANOVA with B varying within subject and A between subjects in behavioral sciences.

Type 4: Mixed design BXC(A) - Fixed factor C is nested within fixed factor A while B (usually subject) is random. For example, A = stimulus category, B = subject, C = stimulus subtype.

`Available 4-way ANOVA types`

Type 1: Factorial (crossed) design AXBXCXD - all 4 factors are fixed.

Type 2: Factorial (crossed) design AXBXCXD - only factor D is random. For example, A, B, C = stimulus category, D = subject, called 3-way design with all 3 factors A, B, and C varying within subject.

Type 3: Mixed design BXCXD(A) - only the nested (4th) factor (usually subject) is random. For example, A = subject group, B, C = stimulus category, D = subject, called 3-way design with factors B and C varying within subject and factor A between subjects.

Type 4: Mixed design BXCXD(A) - D is nested within A, but only the 3rd factor (usually subject) is random. For example, A, B = stimulus category, C = subject, D = stimulus subtype.

Type 5: Mixed design CXD(AXB) - only the nested (4th) factor (usually subject) is random, but factor D is nested within both factors A and B. For example, A, B = subject classes, C = stimulus category, D = subject, also called 3-way design with factor C varying within subject and factors A and B between subjects.

`Available 5-way ANOVA types`

Type 1: Factorial (crossed) design AXBXCXDXE - all 5 factors are fixed.

Type 2: Factorial (crossed) design AXBXCXDXE - only factor E is random. If E is subject it is also called 4-way design with all 4 factors A, B, C and D varying within subject.

Type 3: Mixed design BXCXDXE(A) - only the nested (5th) factor E (usually subject) is random. Also called 4-way design with factors B, C and D varying within subject and factor A between subjects.

Type 4: Mixed design BXCXDXE(A) - the 5th factor E is nested within factor A, but factor D (usually subject) is random.

`Running ANOVA`

- The input files are assumed to be regression coefficients (or contrast effects) from the regression analysis of individual subjects, and each of them should contain only one AFNI subbrik (one regression coefficient). Use **3dbucket** to extract those subbriks so that each file contains only one.

- As all the factors and their interactions in ANOVA are coded as dummy/indicator variables, solving the regression model is very time consuming, especially with high-resolution input data such as 1X1X1 mm³, which gains nothing in terms of analytical results. To significantly reduce runtime without sacrificing accuracy, I suggest that you use option -dxyz # in adwarp during Talairach conversion. Here # can be the lower resolution of your EPI dataset along the x, y, and z directions, or simply 2 mm. Afterward you can run 3dresample -master ... -rmode ... on the output of the group analysis, or simply use the resample option in the AFNI graphical viewer (Define Datamode --> Stat resam mode --> Cu), if higher resolution is desirable for cosmetic reasons. As AFNI programs such as the viewer, 3dinfo and 3dresample can only handle files up to 2GB in size on a 32-bit machine, it is almost mandatory to run higher-way group analyses with low-resolution input data.
- Go to the directory where you want to run the analysis of your datasets, start Matlab with "matlab -nodesktop" (running Matlab directly in the terminal lets the user easily copy and paste multiple lines), and type "GroupAna" at the prompt. You will be asked a list of streamlined, interactive questions regarding factors, levels, input data info, contrast tests, etc. No extra space character is tolerated in input lines.
- A typical interface of running group analysis is shown here
- I thought about designing a graphical interface so that the user could select the input files one by one, but since an analysis is most likely run more than once, I decided to keep everything in pure text on the terminal, so that the user can simply copy and paste the saved input lines to repeat an analysis. It is highly recommended that all the input command lines be saved in a text file (see an example), both as a record and for rerunning the steps by pasting all the input lines at once into the Matlab terminal. It might be helpful to run ls -1 *.BRIK, copy the input file names into a text editor, and modify their order if necessary. As the input sequence usually wraps around all the subjects first, you can list the files even more conveniently by using a more specific file name pattern in the ls -1 *.BRIK command. A file naming strategy that is indicative but terse definitely helps a lot here.

**Coding contrasts when running group analysis with the Matlab package**

Suppose we have a 3-way ANOVA BXC(A) (type 3), where C is a random factor (subject), A is Group with two levels (male and female), and B is Task with 2 conditions. The order of a contrast is basically the number of factors coded with nonzeroes. Keep in mind that **only contrasts with coefficients adding up to 0 are allowed in GroupAna**.

### For 1st-order contrast between condition #1 and #2, do the following:

Label for 1st order contrast No. 1 is: c1vc2

How many terms are involved? 2

Factor index for No. 1 term is (e.g., 0120): 010

Corresponding coefficient (e.g., 1 or -1): 1

Factor index for No. 2 term is (e.g., 0120): 020

Corresponding coefficient (e.g., 1 or -1): -1

### Along the same line, code the 2nd-order contrast of condition #1 between the two groups as:

Label for 2nd order contrast No. 1 is: G1vG2@c1

How many terms are involved? 2

Factor index for No. 1 term is (e.g., 0120): 110

Corresponding coefficient (e.g., 1 or -1): 1

Factor index for No. 2 term is (e.g., 0120): 210

Corresponding coefficient (e.g., 1 or -1): -1
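
Before typing contrasts at the prompt, it can help to check them against the two rules stated above: the order is the number of factors coded with nonzeroes, and the coefficients must add up to 0. The Python sketch below does this for the two examples. The function names are hypothetical, and the requirement that all terms of one contrast code the same number of factors is an assumption of this sketch, not something stated for GroupAna.

```python
def contrast_order(factor_index):
    """Order of a term = number of factors coded with a nonzero level."""
    return sum(1 for digit in factor_index if digit != "0")

def check_contrast(terms, n_factors):
    """Validate a contrast given as (factor_index, coefficient) pairs.

    Returns the contrast order, or raises ValueError on a problem.
    """
    orders = set()
    for index, _ in terms:
        if len(index) != n_factors or not index.isdigit():
            raise ValueError(f"bad factor index: {index!r}")
        orders.add(contrast_order(index))
    # Assumption of this sketch: every term codes the same number of factors.
    if len(orders) != 1:
        raise ValueError("terms code different numbers of factors")
    if abs(sum(coef for _, coef in terms)) > 1e-12:
        raise ValueError("coefficients must add up to 0")
    return orders.pop()

# The two contrasts from the examples above, for the 3-way design BXC(A).
first = check_contrast([("010", 1), ("020", -1)], n_factors=3)   # c1vc2
second = check_contrast([("110", 1), ("210", -1)], n_factors=3)  # G1vG2@c1
print(first, second)  # prints "1 2"
```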

**A text file "diary" automatically saves all command window input and most of the resulting command window output.** It keeps a copy of the user's responses, and helps with diagnosis if something goes awry. An example of "diary" is like this.

Last modified 2012-07-26 10:50