Bivariate Auto-Regressive Modeling
Similar to the Granger causality analysis in BrainVoyager, 3dGC.R adopts bivariate auto-regressive modeling, a purely exploratory method, and focuses only on the connection of a target voxel with a seed region in a bivariate fashion, similar to simple correlation or context-dependent correlation (aka PPI) analysis. It starts with a pre-selected seed region and looks for potential connectivity between the seed region and the rest of the brain. This exploratory step may help select regions of interest for multivariate auto-regressive modeling with 1dGC.R.
For more discussion, see G. Chen, et al., Vector autoregression, structural equation modeling, and their synthesis in neuroimaging data analysis, Comput. Biol. Med. (2011), doi:10.1016/j.compbiomed.2011.09.004
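As a rough illustration of the bivariate idea described above, the following sketch fits a first-order model between a simulated seed series and a simulated target series using only base R. The variable names and the simulated coefficients are made up for this example; it is not the code 3dGC.R runs internally, just a minimal instance of a bivariate auto-regressive fit that yields a path coefficient and t-statistic in each direction.

```r
# Minimal bivariate VAR(1) sketch on simulated data (all names hypothetical).
set.seed(1)
n <- 200
seed   <- numeric(n)
target <- numeric(n)
for (i in 2:n) {
  seed[i]   <- 0.5 * seed[i - 1] + rnorm(1)
  # The simulated seed drives the target with a one-step delay.
  target[i] <- 0.3 * target[i - 1] + 0.4 * seed[i - 1] + rnorm(1)
}

# Lag-1 design: current values paired with both series one step back.
dat <- data.frame(
  seed      = seed[-1],
  target    = target[-1],
  seed_l1   = seed[-n],
  target_l1 = target[-n]
)

fit_target <- lm(target ~ seed_l1 + target_l1, data = dat)  # seed -> target
fit_seed   <- lm(seed   ~ seed_l1 + target_l1, data = dat)  # target -> seed

# Path coefficient and t-statistic for each direction.
path_seed_to_target <- coef(summary(fit_target))["seed_l1", c("Estimate", "t value")]
path_target_to_seed <- coef(summary(fit_seed))["target_l1", c("Estimate", "t value")]
print(path_seed_to_target)
print(path_target_to_seed)
```

In a whole-brain analysis the same pair of regressions would be repeated with each voxel in turn playing the role of the target.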
Program 3dGC
3dGC, written in R, can be run on all major platforms such as unix-based systems and Windows, and requires R installation:
Choose a mirror site geographically close to you, and download the appropriate binary for your platform (or the source code and then compile yourself). Set your path appropriately. For example, my R executable is under /Applications/R.app/Contents/MacOS on my Mac OS X, so I add /Applications/R.app/Contents/MacOS as one of the search paths in my C shell startup configuration file .cshrc.
If the installation is successful, start the R interface with the following command at the prompt:
R
You can also work with the GUI version of R on Mac OS and Windows.
3dGC.R should already be in the most recent version of the AFNI package (say, under ~/abin/). You can also download it from here, and place it wherever you prefer.
If you know the directory (e.g., ~/abin/) 3dGC.R is in, launch it inside R by typing/copying, for example,
source("~/abin/3dGC.R")
If you are not sure about the location of 3dGC.R, copy the following into R (assuming 3dGC.R is under the same directory as the AFNI graphic viewer):
source(file.path(dirname(system("which afni", intern=TRUE)), "3dGC.R"))
or (assuming 3dGC.R is on one of the search paths)
source(Find(file.exists, file.path(strsplit(system("echo $PATH", intern=TRUE), "\\:")[[1]], "3dGC.R")))
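If neither one-liner finds the script, it fails with a fairly cryptic error. A slightly more defensive variant of the search-path lookup might look like the sketch below; `find_on_path` is a hypothetical helper name introduced here, not something shipped with AFNI.

```r
# Hypothetical helper: return the first file named `script` found on the
# search path, or stop with a readable message if there is none.
find_on_path <- function(script, path = Sys.getenv("PATH")) {
  dirs <- strsplit(path, .Platform$path.sep, fixed = TRUE)[[1]]
  hit <- Find(file.exists, file.path(dirs, script))
  if (is.null(hit)) stop("Could not find ", script, " on the search path")
  hit
}

# Usage (uncomment once 3dGC.R is on your search path):
# source(find_on_path("3dGC.R"))
```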
3dGC.R works in a procedural or streamlined fashion with a string of inputs about parameters and options, and can run analysis at both individual subject and group levels. Hopefully anything else should be self-evident from there.
In case the program chokes because of failure on installing packages such as vars, network, tcltk, etc. for some reason, run the following commands in R:
install.packages("vars",dependencies=TRUE)
install.packages("gsl",dependencies=TRUE)
install.packages("car",dependencies=TRUE)
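To avoid reinstalling packages that are already present, you could first check which ones are missing. The sketch below uses only base R; `missing_packages` is a hypothetical helper name, and the install call is left commented out because it needs a network connection.

```r
# Report which of the named packages cannot be loaded on this system.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- missing_packages(c("vars", "gsl", "car"))
print(needed)

# Install only what is missing:
# if (length(needed) > 0) install.packages(needed, dependencies = TRUE)
```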
The runtime can be significantly reduced through parallel computing: If multiple cores are available on your computer, simply specify the number of parallel jobs in the program.
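If you are unsure how many cores your machine has, R can report it. The sketch below relies on the `parallel` package that ships with R; leaving one core free for the rest of the system is only a suggestion made here, not a requirement of 3dGC.R.

```r
# Number of logical cores R can see (may be NA on unusual platforms).
n_cores <- parallel::detectCores()

# Keep one core free; fall back to a single job if the count is unknown.
n_jobs <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)
print(n_jobs)
```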
Once you know the exact answers to those sequential questions, you may want to run 3dGC.R in batch mode for multiple subjects by creating a file, called Cmds.R for example, with the following content (don't include the explanatory words within parentheses in your file):
source("~/abin/3dGC.R")
1 (individual subject analysis)
myInput+tlrc
1 (will provide a mask)
myMask+tlrc
0 (no time breaks)
seed.1D
0 (not plotting out the seed time series)
3 (3rd order polynomials for baseline drift)
1 (will provide covariates)
6 (number of covariates)
1 (covariates in multi-column format)
covariates.1D
1 (covariates file has header containing regressor names)
0 (no plotting for covariates)
1 (1st order VAR model)
myOutput
4 (number of parallel jobs)
0 (quit R when done)
Then run 3dGC.R at the terminal prompt (not inside R):
R CMD BATCH Cmds.R myOut &
or remotely:
nohup R CMD BATCH Cmds.R myOut &
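For more than a handful of subjects, the batch files themselves can be generated from R. The sketch below writes one file per subject with the answers in the same order as the Cmds.R example above. The subject IDs, the file-name patterns (such as `subj01_input+tlrc`), and the helper name `write_cmds` are all hypothetical; adapt them to your own naming scheme.

```r
# Write one batch file per subject; answers follow the order of the
# Cmds.R example. Subject IDs and file-name patterns are hypothetical.
write_cmds <- function(subj, dir = ".", script = "~/abin/3dGC.R", n_jobs = 4) {
  answers <- c(
    sprintf('source("%s")', script),
    "1",                                # individual subject analysis
    sprintf("%s_input+tlrc", subj),     # 3D+time input dataset
    "1",                                # will provide a mask
    sprintf("%s_mask+tlrc", subj),
    "0",                                # no time breaks
    sprintf("%s_seed.1D", subj),        # seed time series
    "0",                                # no plot of the seed time series
    "3",                                # polynomial order for baseline drift
    "1",                                # will provide covariates
    "6",                                # number of covariates
    "1",                                # covariates in multi-column format
    sprintf("%s_covariates.1D", subj),
    "1",                                # covariates file has a header
    "0",                                # no plotting for covariates
    "1",                                # 1st order VAR model
    sprintf("%s_output", subj),         # output name
    as.character(n_jobs),               # number of parallel jobs
    "0"                                 # quit R when done
  )
  out <- file.path(dir, sprintf("Cmds_%s.R", subj))
  writeLines(answers, out)
  out
}

# Written to a temporary directory here so the example leaves no clutter.
files <- vapply(c("subj01", "subj02"), write_cmds, character(1), dir = tempdir())
print(files)
```

Each generated file can then be run from the terminal with R CMD BATCH in the same way as Cmds.R.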
To quit R, type
q()
(or hit letter "d" while holding down CTRL key on UNIX-based systems).
Input data
Some suggestions about input files below might be too specific for FMRI data. Make proper adjustment if the program is used under other circumstances.
(1) The time series data from the seed region and covariates (conditions/tasks of no interest, head motion, and physiological noise) are assumed to have the suffix .1D, following the AFNI convention. They should be in plain text format, as either multiple one-column files or one multi-column file. A header is NOT allowed in a one-column input file, but is optional for a multi-column input file: if provided in the first row, it must contain the labels of those ROIs/nodes, as in an R data frame. ROI time series are required as input, but covariates are optional. All 1D ROI time series can be stored in one data frame or in multiple one-column files, but not a mixture.
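The two accepted layouts described in (1) can be illustrated by writing and reading small example files in R. The file contents and the column labels `seedA` and `seedB` are made up for this sketch, and `read.table` is used here only as a plausible way to inspect such files, not necessarily what 3dGC.R does internally.

```r
# Multi-column file: an optional header row carries the ROI labels.
multi <- tempfile(fileext = ".1D")
writeLines(c("seedA seedB", "1.2 0.7", "1.4 0.9", "1.1 0.8"), multi)
rois <- read.table(multi, header = TRUE)

# One-column file: no header allowed, so the first row is already data.
single <- tempfile(fileext = ".1D")
writeLines(c("1.2", "1.4", "1.1"), single)
seed <- read.table(single, header = FALSE)[[1]]

str(rois)
print(seed)
```

Checking a file this way before starting the analysis is a quick way to catch a stray header in a one-column file.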
(2) The 3D+time input data from a subject is assumed to be of typical AFNI data format (NIfTI data also acceptable, but not tested).
(3) The minimum pre-processing steps for the seed and BOLD time series data include slice timing correction and volume registration. Spatial smoothing is typically recommended for noise reduction, but is not mandatory. Because it is the change relative to the baseline that is comparable across blocks/runs, regions, and subjects, signal normalization through scaling (in terms of the loose concept of percent signal change) is very important. It can be done during pre-processing, or you can leave it for 3dGC.R to handle.
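One common way to express a signal as percent signal change is to divide each run by its own mean and multiply by 100. The sketch below assumes that convention; the exact scaling 3dGC.R applies is not spelled out here, and the intensities and run labels are made up.

```r
# Scale a series so its mean is 100; deviations from 100 then read as
# percent signal change relative to the baseline (mean) level.
scale_psc <- function(x) 100 * x / mean(x)

bold <- c(980, 1000, 1020, 2020, 1980)  # made-up raw intensities
run  <- c(1, 1, 1, 2, 2)                # run membership of each time point

# Scale each run separately so runs with different baselines are comparable.
psc <- ave(bold, run, FUN = scale_psc)
print(psc)
```

After scaling, the two runs in this toy example sit on the same footing even though their raw baselines differ by a factor of two.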
(4) Any confounding effects are better entered as covariates in the causality model unless they are orthogonal to the autocorrelation present in the network, which is rarely true.
(5) Since the low frequencies (drifting effect including baseline) in the signal can be modeled with polynomials embedded in the causality model, it's NOT recommended to remove the trend (including baseline) during pre-processing, for the reason given in (4) above. You don't have to include the polynomial time series from the design matrix as covariates for input, but if you do decide to include them, disable the embedded option by specifying an order of -1 for polynomials in the program.
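To make point (5) concrete, the sketch below fits the lagged terms and a third-order polynomial drift in a single regression instead of detrending first. It uses base R's `poly` for the drift terms and simulated data; the polynomial basis that 3dGC.R actually uses may differ, and all variable names are hypothetical.

```r
set.seed(2)
n  <- 120
tt <- seq_len(n)

# Simulated series: the target carries a slow linear drift plus a delayed
# contribution from the seed.
seed   <- rnorm(n)
target <- 100 + 0.05 * tt + c(0, 0.4 * seed[-n]) + rnorm(n)

# Orthogonal polynomials of order 1 to 3; the constant baseline is the
# intercept of the regression. The first row is dropped to match the lag.
drift <- poly(tt, degree = 3)
P <- drift[-1, ]

dat <- data.frame(
  target    = target[-1],
  seed_l1   = seed[-n],
  target_l1 = target[-n]
)

# Drift and lagged effects are estimated jointly in one model.
fit <- lm(target ~ seed_l1 + target_l1 + P, data = dat)
print(coef(summary(fit)))
```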
(6) Covariates can be stripped from the design matrix of the individual subject regression analysis. All the covariates time series can be multiple one-column files or one multi-column file, but not a mixture of both formats. All time series from BOLD signal, seed and covariates must have the same length and match up in temporal sequence. They can be data from multiple blocks and/or runs stitched together.
(7) If you want to censor out a few time points in the time series, create one covariate for each censored time point. Each covariate has the same length as the time series and is all 0's except for a 1 at the censored time point.
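The censoring covariates described in (7) are easy to build in R. In the sketch below, `censor_regressors` is a hypothetical helper name, and the series length and censored positions are made up.

```r
# One indicator column per censored time point: all 0's with a single 1.
censor_regressors <- function(n, censored) {
  m <- matrix(0, nrow = n, ncol = length(censored),
              dimnames = list(NULL, paste0("censor_", censored)))
  m[cbind(censored, seq_along(censored))] <- 1
  m
}

# Censor time points 3 and 7 of a 10-point series.
cens <- censor_regressors(n = 10, censored = c(3, 7))
print(cens)
```

The resulting matrix can be appended as extra columns to a multi-column covariate file.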
(8) Input files for group analysis are supposed to be path coefficients (plus the corresponding t-statistics when using 3dMEMA) saved from analysis at individual subject level.
Features of 3dGC
Compared to other generic Granger analysis tools, 3dGC is used in time domain with the following features:
(1) written in an open source language and executable on all computer platforms;
(2) allowing time breaks (discontinuities due to multiple runs or censored time points) in the data;
(3) extending VAR model with all possible covariates included as part of the analysis instead of being regressed out prior to the analysis;
(4) outputting one network per lag (based on path coefficients and their t-statistics) instead of lumping all lags into one network (based on overall F-statistics across lags);
(5) running group analysis on path coefficients per lag (plus t-statistic when using 3dMEMA) from individual subjects instead of overall F values across all lags.
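Feature (2) matters because a lagged predictor should not reach across a discontinuity: the last time point of one run says nothing reliable about the first time point of the next. The sketch below shows one way to build lag-1 pairs that respect run boundaries. It only illustrates the idea with made-up data and a hypothetical helper name, and is not necessarily how 3dGC.R handles breaks internally.

```r
# Build lag-1 pairs that never straddle a run boundary.
lag_within_runs <- function(x, run) {
  n <- length(x)
  # Keep time points whose immediate predecessor belongs to the same run.
  keep <- which(run[-1] == run[-n]) + 1
  data.frame(current = x[keep], lag1 = x[keep - 1])
}

x   <- c(1, 2, 3, 10, 11, 12)  # two runs of three time points each
run <- c(1, 1, 1, 2, 2, 2)
lagged <- lag_within_runs(x, run)
print(lagged)
```

The pair (10, 3), which would link the start of the second run to the end of the first, is excluded.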
Useful links
1. Vector auto-regressive modeling
Acknowledgements
I'd like to thank Patrick Brandt and Chris Sims for theoretical consultation, Bernhard Pfaff for programming support, and Paul Hamilton for help in testing the program and for providing feedback.
Last modified 2011-10-08 17:28