physio_calc.py¶
Overview¶
This program creates slice-based regressors for regressing out
components of cardiac and respiratory rates, as well as the
respiration volume per time (RVT).
Much of the calculations are based on the following papers:
Glover GH, Li TQ, Ress D (2000). Image-based method for
retrospective correction of physiological motion effects in fMRI:
RETROICOR. Magn Reson Med 44(1):162-7.
Birn RM, Diamond JB, Smith MA, Bandettini PA (2006). Separating
respiratory-variation-related fluctuations from
neuronal-activity-related fluctuations in fMRI. Neuroimage
31(4):1536-48.
This code has been informed by earlier programs that estimated
RETROICOR and RVT regressors, namely the RetroTS.py code by J Zosky,
which itself follows on from the original RetroTS.m code by ZS Saad.
That being said, the current code's implementation was written
separately, to understand the underlying processes and algorithms
afresh, to modularize several pieces, to refactorize others, and to
produce more QC outputs and logs of information. Several steps in the
eventual regressor estimation depend on processes like peak- (and
trough-) finding and outlier rejection, which can be reasonably
implemented in many ways. We do not expect exact matching of outcomes
between this and the previous versions.
Below, "resp" refers to respiratory input and results, and "card"
refers to the same for cardiac data.
Options¶
options:
-resp_file RESP_FILE Path to one respiration data file
-card_file CARD_FILE Path to one cardiac data file
-phys_file PHYS_FILE BIDS-formatted physio file in tab-separated format.
May be gzipped
-phys_json PHYS_JSON BIDS-formatted physio metadata JSON file. If not
specified, the JSON corresponding to the '-phys_file
..' will be loaded
-freq FREQ Physiological signal sampling frequency (in Hz)
-start_time START_TIME
The start time for the physio time series, relative to
the initial MRI volume (in s) (def: None)
-prefilt_max_freq PREFILT_MAX_FREQ
Allow for downsampling of the input physio time
series, by providing a maximum sampling frequency (in
Hz). This is applied just after badness checks. Values
<=0 mean that no downsampling will occur (def: -1)
-prefilt_mode PREFILT_MODE
Filter input physio time series (after badness
checks), likely aiming at reducing noise; can be
combined usefully with prefilt_max_freq. Allowed
modes: none, median
-prefilt_win_card PREFILT_WIN_CARD
Window size (in s) for card time series, if
prefiltering input physio time series with
'-prefilt_mode ..'; value must be >0 (def: 0.1, only
used if prefiltering is on)
-prefilt_win_resp PREFILT_WIN_RESP
Window size (in s) for resp time series, if
prefiltering input physio time series with
'-prefilt_mode ..'; value must be >0 (def: 0.25, only
used if prefiltering is on)
-do_interact Enter into interactive mode as the last stage of
peak/trough estimation for the physio time series
(def: only automatic peak/trough estimation)
-out_dir OUT_DIR Output directory name (can include path)
-prefix PREFIX Prefix of output filenames, without path (def: physio)
-dset_epi DSET_EPI Accompanying EPI/FMRI dset to which the physio
regressors will be applied, for obtaining the
volumetric parameters (namely, dset_tr, dset_nslice,
dset_nt)
-dset_tr DSET_TR FMRI dataset's repetition time (TR), which defines the
time interval between consecutive volumes (in s)
-dset_nt DSET_NT Integer number of time points to have in the output
(should likely match FMRI dataset's number of volumes)
-dset_nslice DSET_NSLICE
Integer number of slices in FMRI dataset
-dset_slice_times SLI_T1 [STI_T2 ...]
Slice time values (space separated list of numbers)
-dset_slice_pattern DSET_SLICE_PATTERN
Slice timing pattern code (def: None). Use
'-disp_all_slice_patterns' to see all allowed
patterns. Alternatively, one can enter the filename of
a file containing a single column of slice times.
-do_fix_nan Fix (= replace with interpolation) any NaN values in
the physio time series (def: exit if any appears)
-do_fix_null Fix (= replace with interpolation) any null or missing
values in the physio time series (def: exit if any
appears)
-do_fix_outliers Fix (= replace with interpolation) any outliers in the
physio time series (def: don't change them and
continue)
-extra_fix_list FVAL1 [FVAL2 ...]
List of one or more values that will also be
considered 'bad' if they appear in the physio time
series, and replaced with interpolated values
-remove_val_list RVAL1 [RVAL2 ...]
List of one or more values that will removed (not
interpolated: the time series will be shorter, if any
are found) if they appear in the physio time series;
this is necessary with some manufacturers' outputs,
see "Notes of input peculiarities," below.
-rvt_shift_list SHIFT1 [SHIFT2 ...]
Provide one or more values to specify how many and
what kinds of shifted copies of RVT are output as
regressors. Units are seconds, and including 0 may be
useful. Shifts could also be entered via
'-rvt_shift_linspace ..' (def: 0 1 2 3 4)
-rvt_shift_linspace START STOP N
Alternative to '-rvt_shift_list ..'. Provide three
space-separated values (start stop N) used to
determine how many and what kinds of shifted copies of
RVT are output as regressors, according to the Python-
Numpy function linspace(start, stop, N). Both start
and stop (units of seconds) can be negative, zero or
positive. Including 0 may be useful. Example params: 0
4 5, which lead to shifts of 0, 1, 2, 3 and 4 sec
(def: None, use '-rvt_shift_list')
-rvt_off Turn off output of RVT regressors
-no_card_out Turn off output of cardiac regressors
-no_resp_out Turn off output of respiratory regressors
-do_extend_bp_resp Use less strict initial bandpass for resp data
-min_bpm_resp MIN_BPM_RESP
Set the minimum breaths per minute for respiratory
proc (def: 6.0)
-max_bpm_resp MAX_BPM_RESP
Set the maximum breaths per minute for respiratory
proc (def: 60.0)
-min_bpm_card MIN_BPM_CARD
Set the minimum beats per minute for cardiac proc
(def: 25.0)
-max_bpm_card MAX_BPM_CARD
Set the maximum beats per minute for cardiac proc
(def: 250.0)
-img_verb IMG_VERB Verbosity level for saving QC images during
processing, by choosing one integer: 0 - Do not save
graphs 1 - Save end results (card and resp peaks,
final RVT) 2 - Save end results and intermediate steps
(bandpassing, peak refinement, etc.) (def: 1)
-img_figsize WID LEN Figure dimensions used for QC images (def: depends on
length of physio time series)
-img_fontsize IMG_FONTSIZE
Font size used for QC images (def: 10)
-img_line_time IMG_LINE_TIME
Maximum time duration per line in the QC images, in
units of sec (def: 60)
-img_fig_line IMG_FIG_LINE
Maximum number of lines per fig in the QC images (def:
6)
-img_dot_freq IMG_DOT_FREQ
Maximum number of dots per line in the QC images (to
save filesize and plot time), in units of dots per sec
(def: 50)
-img_bp_max_f IMG_BP_MAX_F
Maximum frequency in the bandpass QC images (i.e.,
upper value of x-axis), in units of Hz (def: 5.0)
-save_proc_peaks Write out the final set of peaks indices to a text
file called PREFIX_LABEL_proc_peaks_00.1D ('LABEL' is
'card', 'resp', etc.), which is a single column of the
integer values (def: don't write them out)
-save_proc_troughs Write out the final set of trough indices to a text
file called PREFIX_LABEL_proc_troughs_00.1D ('LABEL'
is 'card', 'resp', etc.), which is a single column of
the integer values (def: don't write them out). The
file is only output for LABEL types where troughs were
estimated (e.g., resp).
-verb VERB Integer values to control verbosity level (def: 0)
-disp_all_slice_patterns
Display all allowed slice pattern names
-disp_all_opts Display all options for this program
-ver Display program version number
-help Display help text in terminal
-hview Display help text in a text editor (AFNI
functionality)
Notes on usage and inputs¶
* Physio data input:
At least one of the following input option sets must be used:
-card_file
-resp_file
-card_file and -resp_file
-phys_file and -phys_json
* FMRI information input:
It is preferable to use:
-dset_epi
to provide EPI dset for which regressors will be made, to provide
the volumetric information that would otherwise be provided with:
-dset_tr
-dset_nslice
-dset_nt
... and the slice timing information
* Slice timing input:
If '-dset_epi ..' is not used to provide the slice timing (and other
useful) volumetric information, then exactly one of the following
input option must be used:
-dset_slice_times
-dset_slice_pattern
* Physio information input:
Each of the following input options must be provided through some
combination of phys_json file, dset_epi file, or the command line
opts themselves:
-freq
-dset_tr
-dset_nslice
-dset_nt
* The following table shows which keys from 'phys_json' can be used to
set (= replace) certain command line argument/option usage:
ARG/OPT JSON KEY EPS VAL
freq SamplingFrequency 1.000e-03
start_time StartTime 1.000e-03
The 'EPS VAL' shows the maximum difference tolerated between a
JSON-provided key and an option-provided one, in case both exist in
a command line call. It would be better to avoid such dual-calling.
Notes on input peculiarities¶
With Siemens physiological monitoring, values of 5000, 5003 and 6000 can be
used as trigger events to mark the beginning or end of something, like the
beginning of a TR. The meanings, from the Siemens Matlab code are:
5000 = cardiac pulse on
5003 = cardiac pulse off
6000 = cardiac pulse off
6002 = phys recording on
6003 = phys recording off
It appears that the number is inserted into the series, in which case,
5000 values could simply be removed rather than replaced by an
interpolation of the two adjacent values, using the option
'remove_val_list ..'.
Notes on prefiltering physio time series¶
Many physio time series contain noisy spikes or occasional blips. The
effects of these can be reduced during processing with some
"prefiltering". At present, this includes using a moving median
filter along the time series, to try to remove spiky things that are
likely nonphysiological. This can be implemented by using this opt+arg:
-prefilt_mode median
An additional decision to make then becomes what width of filter to
apply. That is, over how many points should the median be calculated?
One wants to balance making it large enough to be stable/useful with
small enough to not remove real features (like real peaks, troughs or
other time series changes). This is done by choosing a time interval,
and this interval is specified separately for each of the card and
resp time series, because each has a different expected time scale of
variability (and experimental design can affect this choice, as well).
So, the user can use:
-prefilt_win_card TIME_C
-prefilt_win_resp TIME_R
... and replace TIME_* with real time values, in using of seconds. There
are default time values in place, when '-prefilt_mode ..' is used; see
above.
Finally, physio time series are acquired with a variety of sampling
frequencies. These can easily range from 50 Hz to 2000 Hz (or more).
That means 50 (or 2000) point estimates per second---which is a lot
for most applications. Consider that typical FMRI sampling rates are
TR = 1-2 sec or so, meaning that they have 0.5 or 1 point estimates
per sec. Additionally, many (human) cardiac cycles are roughly of
order 1 per sec or so, and (human) respiration is at a much slower
rate. All this is to say, having a highly sampled physio time series
can be unnecessary for most practical applications and analyses. We
can reduce computational cost and processing time by downsampling it
near the beginning of processing. This would be done by specifying a
max sampling frequency MAX_F for the input data, to downsample to (or
near to), via:
-prefilt_max_freq MAX_F
All of the above prefiltering is applied after initial 'badness'
checks for outliers or missing values, so those processes can be a bit
slow for densely acquired data.
*Recommendation*
In general, at least for human applications, it seems hard to see why
one would need more than 50 physio measures per second. It also seems
like median filtering over even relatively small windows typically be
useful. So, perhaps consider adding these options to most processing
(but adjust as appropriate!):
-prefilt_max_freq 50
-prefilt_mode median
If reasonable, the '-prefilt_win_card ..' and '-prefilt_win_resp ..'
values could also be adjusted.
User interaction for peak/trough editing¶
This program includes functionality whereby the user can directly edit
the peaks and troughs that have estimated. This includes adding,
deleting or moving the points around, with the built-in constraint of
keeping the points on the displayed physio time series line. It's
kind of fun.
To enter interactive mode during the runtime of the program, use the
'-do_interact' option. Then, at some stage during the processing, a
Matplotlib panel will pop up, showing estimated troughs and/or peaks,
which the user can edit if desired. Upon closing the pop-up panel,
the final locations of peaks/troughs are kept and used for the
remainder of the code's run.
Key+mouse bindings being used:
4 : delete the vertex (peak or trough) nearest to mouse point
3 : add a peak vertex
2 : add a trough vertex
1 : toggle vertex visibility+editability on and off
Left-click : select closest vertex, which can then be dragged along
the reference line.
Some additional Matplotlib keyboard shortcuts:
f : toggle fullscreen view of panel
o : toggle zoom-to-rectangle mode
p : toggle pan/zoom mode
r : reset panel view (not point edits, but zoom/scroll/etc.)
q : quit/close viewer (also Ctrl+w), when done editing
For more on the Matplotlib panel navigation keypresses and tips, see:
https://matplotlib.org/3.2.2/users/navigation_toolbar.html
At present, there is no "undo" functionality. If you accidentally
delete a point, you can add one back, or vice versa.
Output files¶
The following files will/can be created in the output dir, with the
chosen prefix PREFIX. Some are primary output files (like the file of
physio and RVT regressors), and some are helpful QC images. The
*resp* files are only output if respiratory signal information were
input, and similarly for *card* files with cardiac input. At present,
RVT is only calculated from resp input.
PREFIX_slibase.1D : slice-based regressor file, which can include
card, resp and RVT regressors, and provided
to afni_proc.py for inclusion in FMRI processing
PREFIX_regressors_phys.svg: QC image of all physio regressors (including
card and/or resp), corresponding to slice=0
physio regressors in *slibase.1D
PREFIX_regressors_rvt_resp.svg:
QC image of all RVT regressors from resp data,
corresponding to all shifted RVT regressors in
in *slibase.1D
PREFIX_resp_review.txt : summary statistics and information for resp proc
PREFIX_card_review.txt : summary statistics and information for card proc
PREFIX_pc_cmd.tcsh : log/copy of the command used
PREFIX_info.json : reference dictionary of all command inputs after
interpreting user options and integrating
default values
PREFIX_card_*_final_peaks*.svg
: QC image of final peak estimation for card data.
Can be several files, depending on length of
input data. Colorbands highlight longer (red)
and shorter (blue) intervals, compared to median
(white)
PREFIX_resp_10_final_peaks_troughs*.svg
: same as above image but for resp data (so also
includes troughs)
The following text files are only output when using the
'-save_proc_peaks' and/or '-save_proc_troughs' option flag(s):
PREFIX_card_peaks_00.1D : 1D column file of peak indices for card data,
corresponding to card*final_peaks*svg image.
PREFIX_resp_peaks_00.1D : 1D column file of peak indices for resp data,
corresponding to resp*final_peaks*svg image.
PREFIX_resp_troughs_00.1D : 1D column file of trough indices for resp data,
corresponding to resp*final_peaks*svg image.
The following intermediate QC images are only output when the value of
'-img_verb' is 2 or more. In each time series plotting case, there
may be multiple images, depending on time series length:
PREFIX_card_*_peaks*.svg : QC images showing intermediate stages of peak
calculation for card data
PREFIX_resp_*_peaks*.svg : same as above image but for resp data peaks
PREFIX_resp_*_troughs*.svg: same as above image but for resp data troughs
PREFIX_card_bandpass_spectrum.svg,
PREFIX_resp_bandpass_spectrum.svg
: QC images showing intermediate stage of peak
and/or trough estimation, namely the Fourier
Transform frequency spectrum (magnitude only),
both full and bandpassed.
PREFIX_card_bandpass_ts_peaks*.svg,
PREFIX_resp_bandpass_ts_peaks*.svg,
PREFIX_resp_bandpass_ts_troughs*.svg
: QC images showing intermediate stage of peak
and/or trough estimation, namely the initial
peak/trough estimation on the bandpassed
physio time series
PREFIX_card_20_est_phase*.svg,
PREFIX_resp_20_est_phase*.svg
: QC images showing intermediate stages of phase
calculation for card and/or resp data
PREFIX_resp_21_rvt_env*.svg
: QC images showing intermediate stages of RVT
calculation, namely envelope estimation
PREFIX_resp_22_rvt_measure*.svg
: QC images showing intermediate stages of RVT
calculation, RVT per input time series point
Interpreting coloration in images¶
The QC images contain images that are supposed to be helpful in
interpreting the data. Here are some notes on various aspects.
When viewing physio time series, the interval that overlaps the FMRI
dataset in time has a white background, while any parts that do not
have a light gray background. Essentially, only the overlap regions
should affect regressor estimation---the parts in gray are useful to
have as realistic boundary conditions, though.
Peaks are always shown as downward pointing triangles, and troughs are
upward pointing triangles.
When viewing "final" peak and trough images, there will be color bands
made of red/white/blue rectangles shown in the subplots. These
highlight the relative duration of a given interpeak interval (top
band in the subplot) and/or intertrough interval (bottom intervals),
relative to their median values across the entire time series.
Namely:
white : interval matches median
blue : interval is shorter than median (darker blue -> much shorter)
red : interval is longer than median (darker red -> much longer)
The more intense colors mean that the interval is further than the median,
counting in standard deviations of the interpeak or intertrough intervals.
This coloration is meant to help point out variability across time: this
might reflect natural variability of the physio time series, or possibly
draw attention to a QC issue like an out-of-place or missing extremum
(which could be edited in "interactive mode").
Examples¶
Example 1
physio_calc.py \
-card_file physiopy/test000c \
-freq 400 \
-dset_epi DSET_MRI \
-dset_slice_pattern alt+z \
-extra_fix_list 5000 \
-do_fix_nan \
-out_dir OUT_DIR \
-prefix PREFIX
Example 2
physio_calc.py \
-phys_file physiopy/test003c.tsv.gz \
-phys_json physiopy/test003c.json \
-dset_tr 2.2 \
-dset_nt 34 \
-dset_nslice 34 \
-dset_slice_pattern alt+z \
-do_fix_nan \
-extra_fix_list 5000 \
-out_dir OUT_DIR \
-prefix PREFIX
Example 3
physio_calc.py \
-card_file sub-005_ses-01_task-rest_run-1_physio-ECG.txt \
-resp_file sub-005_ses-01_task-rest_run-1_physio-Resp.txt \
-freq 50 \
-dset_tr 2.2 \
-dset_nt 219 \
-dset_nslice 33 \
-dset_slice_pattern alt+z \
-do_fix_nan \
-out_dir OUT_DIR \
-prefix PREFIX
written by: Peter Lauren, Paul Taylor, Richard Reynolds and
Daniel Glen (SSCC, NIMH, NIH, USA)