EEG analyses using Brain Vision Analyzer software
Below you'll find a manual for preprocessing EEG data using Brain Vision Analyzer (version 2.x.x) software. Please note that this manual is not meant to serve as a gold standard for EEG analysis; it merely describes the preprocessing steps that we use in the CoDAP Lab, which are heavily inspired by resources offered by renowned EEG researchers (e.g., An Introduction to the Event-Related Potential Technique; Analyzing Neural Time Series Data: Theory and Practice). This preprocessing manual is meant to offer the basic EEG analysis steps that you can use prior to performing various subsequent EEG analyses, such as examining event-related brain potentials (FRN, P3) and/or performing time-frequency analyses. [updated Dec 17th 2020]
Setting up a work space
When you start your analyses in Brain Vision Analyzer (BVA) for the first time, you will need to create a workspace. A workspace is basically a file that you can use on future occasions to open your EEG study with all processing steps applied to the raw EEG data of each participant in your study.
To create a workspace, go to File >> New.
Next, a window pops up that allows you to specify paths to the following information:
raw data files
Raw EEG data files are large files (about 1 GB per participant is common in most studies). If you work with a team of researchers/students on the same dataset, you can use a single raw EEG data folder that each researcher/student specifies in their personal workspace. This will save disk space and time. Other folders that contain information that is unique to your data analyses (e.g., preprocessing steps) can be stored in personal folders at a location of your choice (see below).
Create a personal study folder that contains the following subfolders:
Export: this folder will contain data exported from BVA, such as peak amplitude, or average EEG frequency band power.
History: this folder will contain all preprocessing steps/transformations applied to the raw files.
History templates: this folder will contain any history templates [or scripts] that you create to process multiple raw EEG files at once.
Workspace: this folder will contain the workspace file that BVA uses to read where the raw/history/export folders are located.
You can copy-paste (or browse) the paths of the export, history, and history template folders you've just created. Click on OK. You'll be asked to save your workspace. Save the workspace in the workspace folder you have just created.
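If you set up several studies, the folder structure described above can also be created with a short script. The following is only a minimal sketch: the function name and the example location are hypothetical, and BVA itself does not require you to create the folders this way.

```python
from pathlib import Path

# Folder names follow the list above; everything else is illustrative.
SUBFOLDERS = ("Export", "History", "History templates", "Workspace")


def create_study_folders(study_root):
    """Create the personal study folder and its subfolders.

    Returns a dict that maps each subfolder name to its path, so the
    paths can be copy-pasted into the BVA workspace dialog.
    """
    root = Path(study_root)
    created = {}
    for name in SUBFOLDERS:
        folder = root / name
        # parents=True also creates the study folder itself;
        # exist_ok=True makes the call safe to repeat.
        folder.mkdir(parents=True, exist_ok=True)
        created[name] = folder
    return created
```

Calling `create_study_folders("my_eeg_study")` (a placeholder location) returns the four paths, which you can then enter in the workspace window.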
Inspecting the raw EEG data (sanity checks)
After you've created a workspace, and the EEG files exist in your raw folder, BVA will read these data. Depending on the raw data file characteristics (number, size) this can take a while since for each raw data file a history file will be generated (and stored in your history folder) and the markers that specify certain events in your experiment will be added. Fortunately, this process only happens once, and on subsequent occasions when you're starting your workspace the data will appear immediately.
Below you will see the EEG files of all participants in the primary tab (left). These files appear as folders that contain the raw EEG data (Raw Data) and the subsequent analysis steps applied (stored in history files). To open the data file of a participant, click on the "+" sign. This expands a single step or node. To reveal all steps/nodes included in the history tree, right click on the folder and choose "expand all". If you double click on Raw Data, the EEG time series will appear on the right.
Luckily you can't delete the raw EEG data from BVA itself or by deleting the history files in your history folder😅. If for some reason you don't want to have some of the raw data files in your workspace (primary tab) then you'll need to move the raw EEG data files from the raw data folder. As moving files from a raw data folder will affect the workspace of other researchers/students working on the same data, you don't want to do this (unless some raw files are corrupted).
Sanity checks before collecting large amounts of data
Prior to starting a study (and collecting large amounts of data), it's always a good idea to do a few sanity checks, such as checking whether stimulus/response markers are written into the EEG data correctly and verifying that the number of markers is correct. In BVA, stimulus/response markers are shown at the bottom (small red/blue vertical lines). Stimulus markers are often colored in red and labeled with S, whereas response markers are often colored in blue and labeled with R. To inspect the details of the markers, as well as the number of each marker present in the data, right click on the Raw Data file of a participant >> choose markers. This will open a box that shows you all marker details in the participant's EEG recording.
In the lab, we often record the EEG with a 1024 or 2048 Hz sampling rate. For most of our offline analyses, we can down-sample the EEG time-series. This will speed up the analyses as it reduces the temporal resolution of the data (while still leaving sufficient resolution for testing your research questions). Decisions on what sampling rate to use, or to what extent you should down-sample, depend on the Nyquist Theorem, which basically states that your sampling rate should be at least twice the highest frequency you're interested in analyzing.
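The marker check can also be done outside BVA once you have a list of marker labels. The sketch below assumes you have already obtained such a list (for example by copying it from the markers box); the labels and expected counts are made up for illustration.

```python
from collections import Counter


def check_marker_counts(markers, expected):
    """Compare observed marker counts with the counts the task should produce.

    markers:  list of marker labels in the order they were recorded
    expected: dict mapping each label to the number of times it should occur
    Returns a dict {label: (observed, expected)} containing only mismatches.
    """
    observed = Counter(markers)
    labels = set(observed) | set(expected)
    return {
        label: (observed.get(label, 0), expected.get(label, 0))
        for label in sorted(labels)
        if observed.get(label, 0) != expected.get(label, 0)
    }


# Hypothetical pilot recording: one response marker is missing.
pilot_markers = ["S1", "R1", "S2", "R1", "S1"]
expected_counts = {"S1": 2, "S2": 1, "R1": 3}
print(check_marker_counts(pilot_markers, expected_counts))
```

An empty result means that every marker occurred as often as expected; any entry in the result points to a marker that deserves a closer look before you collect more data.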
Go to Transformations >> Change Sampling Rate
This opens the box below where you should select the required sampling rate (Hz). Choose "Use Spline Interpolation". Thereafter click on "OK". This will add a new (history) node underneath the participant's Raw Data.
New history nodes will always inherit the name of the specific BVA transformation applied to the data, in this case "Change Sampling Rate". It's recommended to rename all transformation nodes such that they include more specific info that characterizes your analyses, as well as labels for condition differences. In this case, we could rename this node to "downsample_512Hz".
In general, you want to re-reference your EEG data to a different reference scheme than the one used in your EEG recording software. The following referencing options are commonly used in the literature:
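To make the Nyquist rule concrete, here is a small sketch that checks whether a target sampling rate is high enough for the highest frequency you want to analyze. The function names are our own, and the 70 Hz value in the usage note is only an example of an upper frequency of interest.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest frequency that can be represented at a given sampling rate."""
    return sampling_rate_hz / 2.0


def is_rate_sufficient(sampling_rate_hz, highest_frequency_hz):
    """True if the sampling rate is at least twice the frequency of interest."""
    return sampling_rate_hz >= 2.0 * highest_frequency_hz


def downsampling_factor(original_rate_hz, target_rate_hz):
    """How many times fewer samples remain after down-sampling."""
    return original_rate_hz / target_rate_hz
```

For example, down-sampling a 2048 Hz recording to 512 Hz reduces the number of samples by a factor of 4 and leaves a Nyquist frequency of 256 Hz, so `is_rate_sufficient(512, 70)` is true. Note that the theorem only gives a lower bound; in practice you may want to stay well above it.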
Cz reference (single reference electrode)
Average mastoid reference
Single mastoid reference (either left or right)
Average Reference (average of all EEG electrodes)
Laplacian or Current Source Density (reference free montages)
Although it's beyond the scope of this manual to discuss which reference scheme is the most appropriate (a variety of EEG textbooks exist on this topic), we do not recommend using a reference that consists of a single electrode (Cz, single mastoid).
The average mastoid reference is often used since the mastoid electrodes are placed on less noisy and more comfortable areas of the head. However, if you are interested in temporal or occipital EEG activity, average mastoid referencing might be less suitable. As our lab frequently performs event-related analyses and source localization, we use the Average Reference scheme. However, both the Average Reference and source localization analyses require high-density EEG data (at least 64 channels).
Here we demonstrate how to perform the Average Mastoid Reference scheme. We use this type of referencing prior to artifact rejection and channel interpolation, since this reference will minimize the influence of bad channels on other EEG channels. After the EEG data cleaning steps, we recommend using a different reference scheme (e.g., Average Reference, CSD reference), for example if you intend to perform source localization.
In order to carry out the Average Mastoid Referencing procedure, perform the following steps:
Go to Transformations >> Channel Preprocessing >> New Reference
Step 1 of 3:
Select the EEG channels that were used as mastoid electrodes (EXG5 and EXG6 in this example). All other channels remain in the 'available' channels list. Important: do not include the implicit reference into the calculation of the New Reference.
Step 2 of 3:
Select all EEG channels to which the new reference will be applied. All other channels remain in the 'available' channels list.
Step 3 of 3:
Here you could add a suffix to the channels (you can leave this space blank as the re-referencing will still be applied).
Click on Finish. A new node will be added to the history file.
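Conceptually, the three steps above boil down to a simple subtraction: at every time point, the mean of the reference channels is subtracted from each channel that you selected in step 2. The sketch below illustrates this with plain Python lists. It is not BVA's implementation, and the data layout (a dict of channel name to samples) is an assumption made for this example.

```python
def rereference(data, reference_channels, target_channels):
    """Re-reference the target channels to the mean of the reference channels.

    data:               dict mapping channel name -> list of samples (uV)
    reference_channels: channels that form the new reference (step 1)
    target_channels:    channels the new reference is applied to (step 2)
    Channels that are not targets are copied unchanged.
    """
    n_samples = len(data[reference_channels[0]])
    # New reference: mean of the reference channels at each time point.
    reference = [
        sum(data[ch][i] for ch in reference_channels) / len(reference_channels)
        for i in range(n_samples)
    ]
    result = {ch: list(samples) for ch, samples in data.items()}
    for ch in target_channels:
        result[ch] = [x - r for x, r in zip(data[ch], reference)]
    return result


# Tiny example with the mastoid channels used in this manual.
example = {"Cz": [10.0, 20.0], "EXG5": [2.0, 4.0], "EXG6": [4.0, 8.0]}
print(rereference(example, ["EXG5", "EXG6"], ["Cz"])["Cz"])
```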
The EEG time series that you recorded contains signals from different kinds of sources, and not all of them are neural. An important source of "noise" is electrical line noise (50 Hz in Europe; 60 Hz in the US). Depending on where in the world your lab is based, you can attenuate this type of noise by applying a notch filter that cancels out most of the line noise. Other noise may also contaminate your data, such as muscle activity (often around 100 Hz) or skin conductance/sweating (causing slow drifts in the data, often below 0.01 Hz). For most analyses, you can use a band-pass filter of 0.01-70 Hz with a 50 Hz notch filter (assuming that you've collected data in Europe 😉). Note that the lower and upper cut-off frequencies really depend on the frequencies you are interested in. Going lower than 0.01 Hz is not recommended due to slow drifts. Conversely, using a higher high-pass cut-off (e.g., 0.5 or even 0.1 Hz) is not recommended either, because low frequencies contribute to ERPs such as the P300 (see also Chapter 3 of Steve Luck's book on the ERP technique). But of course, filter settings greatly depend on your research questions.
The high-pass (or low cut-off) frequency can be set higher than 0.01 Hz if you've recorded data from children or patient populations (due to movements) or when you are really not interested in the low-frequency EEG components.
Filters with a low cut-off frequency of 0.01 Hz can only be used on continuous data, and not on epoched data. Filters introduce edge artifacts, which hardly affect continuous recordings because there is sufficient data (not relevant for your analyses) before the first and after the last event of interest. Thus, always make sure to start recording the EEG time series well ahead of your experiment.
Go to Transformations >> Data Filtering >> IIR Filters
Select the settings as shown below:
What does the "Order" do?
The steepness of the filter can be adjusted with the Order value, which determines the slope of the filter. A modest slope is 12 dB/Oct, with which the transition from full attenuation to no attenuation is spread over a long section of the spectrum. As a consequence, frequencies just beyond the cut-off will sometimes not be entirely attenuated. A very strict slope of 48 dB/Oct means the filter has a steep transition and therefore attenuates the frequencies beyond the cut-off strongly. A steep filter, however, can introduce artificial oscillations (ringing) that were not present in the original signal. The intermediate option of 24 dB/Oct is therefore often the most appropriate. Update: in the newest versions of the BVA software these orders (or slopes) are 2 (lowest), 4 (medium), and 8 (highest).
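The relation between order and slope can be illustrated with the textbook magnitude response of a Butterworth low-pass filter, whose roll-off approaches roughly 6 dB/Oct per order. This matches the correspondence mentioned above (orders 2, 4, and 8 for 12, 24, and 48 dB/Oct), but note that it is only a sketch under the assumption of an ideal Butterworth response; it does not reproduce BVA's actual filter implementation.

```python
import math


def butterworth_lowpass_gain_db(freq_hz, cutoff_hz, order):
    """Gain (in dB) of an ideal Butterworth low-pass filter at freq_hz."""
    ratio = freq_hz / cutoff_hz
    gain = 1.0 / math.sqrt(1.0 + ratio ** (2 * order))
    return 20.0 * math.log10(gain)


def asymptotic_slope_db_per_octave(order):
    """Roll-off far beyond the cut-off: about 6 dB per octave per order."""
    return 20.0 * math.log10(2.0) * order


# Attenuation one octave above a 70 Hz cut-off for the three orders.
for order in (2, 4, 8):
    attenuation = butterworth_lowpass_gain_db(140.0, 70.0, order)
    slope = asymptotic_slope_db_per_octave(order)
    print(f"order {order}: {attenuation:.1f} dB at 140 Hz, slope ~{slope:.0f} dB/Oct")
```

At the cut-off frequency itself the gain is about -3 dB regardless of the order; the order only changes how quickly the attenuation grows beyond that point.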
This procedure allows you to generate a new channel based on the activity of other channels. For example, when you have collected horizontal EOG activity from two channels, you may want to apply this procedure to generate a single HEOG channel. The same applies to VEOG or ECG. Here, this bipolar signal is created via a simple subtraction (e.g., Chan 1 – Chan 2). Our lab typically records HEOG from EXG1 and EXG2, VEOG from EXG3 and EXG4, and ECG from EXG7 and EXG8.
Go to Transformations >> Linear Derivation
This will open the Linear Derivation box, which shows a table in which each column corresponds to an existing channel and each row to a new channel. The values in a row are the coefficients that indicate which channels are involved in the subtraction (e.g., 1 for Chan 1 and -1 for Chan 2). In the first column, you can enter the name of the new channel that you'd like to create. In the example below, we've created three new channels (HEOG, VEOG, and ECG).
You can directly specify the number of channels that you'd like to create by entering a value in the box "Number of Channels". If you click on "Refresh" the number of rows will change accordingly.
Make sure that you tick the box "Keep old channels", so that the EEG channels remain in the data set.
Afterwards, delete the old channels that were used to generate the new channel(s). You can do so using the following tool:
Go to Transformations >> Edit Channels
Untick all the old channels and any other channels that you no longer need and click OK.
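The linear derivation and the subsequent channel clean-up can be summarized in a few lines of code. In this sketch each new channel is a weighted sum of existing channels, with weights of 1 and -1 giving the bipolar subtraction. The channel labels are placeholders for whatever your recording uses, and the function is an illustration rather than BVA's own routine.

```python
# New channel -> {existing channel: coefficient}; 1 and -1 give Chan 1 - Chan 2.
DERIVATIONS = {
    "HEOG": {"EXG1": 1.0, "EXG2": -1.0},
    "VEOG": {"EXG3": 1.0, "EXG4": -1.0},
    "ECG": {"EXG7": 1.0, "EXG8": -1.0},
}


def linear_derivation(data, derivations):
    """Compute each new channel as a weighted sum of existing channels.

    data:        dict mapping channel name -> list of samples
    derivations: dict mapping new channel name -> {old channel: coefficient}
    Returns a dict with the new channels only.
    """
    n_samples = len(next(iter(data.values())))
    return {
        name: [
            sum(coeff * data[ch][i] for ch, coeff in coeffs.items())
            for i in range(n_samples)
        ]
        for name, coeffs in derivations.items()
    }


def drop_channels(data, unwanted):
    """Counterpart of unticking channels in Edit Channels."""
    return {ch: samples for ch, samples in data.items() if ch not in unwanted}
```

Keeping the old channels first and dropping the source channels afterwards mirrors the two-step procedure above: the EEG channels stay in the data set, while the external channels that are now represented by HEOG, VEOG, and ECG are removed.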
With the segmentation option, you create the epochs (often time-locked to an event of interest) that you want to analyze. The length of the segment really depends on your research question. Short segments may suffice in case you are only interested in the Event-Related Potential (ERP), time-locked to an event of interest. In such situations, segments of 1 or 1.5 seconds (including a short pre-stimulus baseline) are fine. However, if you are interested in pre- and post-event activity, or activity in the time-frequency domain, you'll need to create longer segments, mainly because wavelet transformations in time-frequency methods introduce edge artifacts. In principle, it's better to start off with longer segments, as you can always shorten them at a later stage if you need to.
We typically epoch at this stage to create rather large epochs that include all trial events. Thus, for most of our experiments, this means that 8-sec segments are not uncommon. After epoching, we will semi-automatically inspect the time-series for artifacts and we reject a whole epoch when an artifact is detected. This is a rather stringent process since some artifacts might be detected far from the event-of-interest, or on a channel that might not at first be a priority channel. However, rejecting artifacts from the EEG time series is a very time-consuming and meticulous process. Cleaning the time-series thoroughly allows for performing a multitude of analyses that might not specifically include a 'one-channel, one-event' approach. Think of source-localization analyses and connectivity/synchronization analyses that require more channels and more data. Having relatively long segments is also important considering the fact that some filtering techniques introduce edge-artifacts to the bounds of EEG epochs (such as wavelet-based techniques).
Go to Transformations >> Segmentation
Select "Create segments based on a marker position".
Select the stimulus/stimuli of interest that you will segment around. You can enter an 'Advanced Boolean Expression' in the bottom field. This might be particularly useful if you want to exclude segments that contain undesired events (such as wrong response types or omitted responses).
Specify the pre-stimulus and post-stimulus interval. The pre-stimulus interval should be large enough for a baseline correction (500 ms is fine, or opt for a longer interval if you're performing time-frequency analyses later on that will have these stimuli as event-of-interest).
Prior to artifact rejection, the epochs should be baseline corrected using a linear subtraction method. This method subtracts the average baseline activity from each trial. This is particularly useful to remove any large DC offset differences across trials/channels, as well as slow drifts. Please note that the Baseline Correction method in BVA is only activated for segmented data.
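The segmentation step can be pictured as cutting a fixed window out of the continuous signal around every marker. The following sketch shows the idea for a single channel; the function name and the choice to skip markers that lie too close to the edges of the recording are assumptions of this example, not a description of BVA's behavior.

```python
def segment(samples, marker_positions, sampling_rate_hz, pre_ms, post_ms):
    """Cut epochs from pre_ms before to post_ms after each marker.

    samples:          continuous signal of one channel (list of values)
    marker_positions: sample indices of the events of interest
    Markers whose window would extend beyond the recording are skipped.
    """
    pre = round(pre_ms * sampling_rate_hz / 1000)
    post = round(post_ms * sampling_rate_hz / 1000)
    epochs = []
    for position in marker_positions:
        start, end = position - pre, position + post
        if start >= 0 and end <= len(samples):
            epochs.append(samples[start:end])
    return epochs
```

With a 512 Hz signal, `segment(signal, markers, 512, 500, 1000)` would return epochs of 768 samples (256 before and 512 after each marker), where `signal` and `markers` stand for your own data.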
Go to Transformations >> Baseline Correction (only active when data is segmented)
Enter the start and end of the baseline period (-200 to 0 ms is fine).
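Baseline correction itself is a simple operation: compute the mean of the samples in the baseline window and subtract it from every sample in the epoch. A minimal single-channel sketch follows; the parameter names are our own, and treating the end of the window as exclusive is a choice made for this example.

```python
def baseline_correct(epoch, sampling_rate_hz, epoch_start_ms,
                     baseline_start_ms, baseline_end_ms):
    """Subtract the mean of the baseline window from every sample.

    epoch:          samples of one channel for one trial
    epoch_start_ms: time of the first sample relative to the event
                    (negative for a pre-stimulus interval)
    The baseline window runs from baseline_start_ms up to, but not
    including, baseline_end_ms.
    """
    def to_index(time_ms):
        return round((time_ms - epoch_start_ms) * sampling_rate_hz / 1000)

    baseline = epoch[to_index(baseline_start_ms):to_index(baseline_end_ms)]
    baseline_mean = sum(baseline) / len(baseline)
    return [value - baseline_mean for value in epoch]


# Toy epoch sampled at 1000 Hz that starts 4 ms before the event.
print(baseline_correct([2, 2, 2, 2, 5, 6], 1000, -4, -4, 0))
```

In the toy epoch the four pre-event samples have a mean of 2, so the offset of 2 is removed from the whole epoch while the post-event deflection is preserved.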
Using History Templates (Batch Mode)
Pre-artifact rejection template
By using the History Template option in BVA you can apply the preprocessing steps that you have performed on the raw data of a single participant automatically on all (or a group of) participants. An important prerequisite for this procedure is that the template doesn't contain subject-specific information or manual adjustments (such as marker editing).
Go to History Template >> New
This will open a new window on the right (typically where you see EEG data). In this window, you'll see a folder called 'Root'. This Root folder will contain the history template. To create such a template you can follow these steps:
Drag & drop a history tree (i.e., a set of history nodes that will form the template) - to the window on the right. Make sure to place the history tree on the Root.
Click on Save As (in the History Template tab)
Click on Apply to History Files (in the History Template tab)
The image below shows you an example of a history template (attached to the Root folder). If you have saved the History Template, this file will automatically appear in 'Select History Template'. You can also open and apply another template that you've created previously via the 'Browse' option. Note that the 'OK' button will only become active when a history template file is loaded.
In the example below, all nodes in the history template have been labeled (renamed) to match their procedures. This comes in handy if you are sharing your analysis pipeline with others, or when you revisit your workspace after a long time.
Depending on where in your analysis pipeline you'll be applying the history template, you either tick Root ("Raw Data") or Choose Data Set (History Node). Thus, in case you've already done some preprocessing, and you'd like to apply a history template on one of the nodes in the existing history tree, make sure that you will enter the name of the history node in Choose Data Set.
Tick Raw Data if this is the first template you'll be running in batch mode, OR
Tick Choose Data Set (History Node) if you are applying a template after certain manual operations (such as artifact rejection). You will then need to enter the name of the specific History Node (Data Set) the template needs to start from.
Important: If you'd like to apply a history template on nodes further down the history tree (thus not on Raw Data), make sure that the 'target' node (entered in Data Set) has the same name for all participants. Otherwise, the template won't run properly.
Select the data sets of the participants to which the History Template needs to be applied.
Remove bad channels
A first step is to screen your EEG time-series for any bad channels. You can do so via the scroll left/right buttons in the lower left corner of the screen. If you've detected any bad channels in your data, you can remove these channels from the data set via the "Edit Channels" transformation.
Go to Transformations >> Edit Channels.
Untick the channels that you've considered 'bad channels'.
The reason why we exclude the channels at this stage, rather than interpolating them 'back into the EEG time-series', is that an interpolated channel is a linear combination of the other channels. Interpolation therefore reduces the rank of the data, which causes problems if you use an ICA-based method to remove EOG activity from your EEG time-series.
Alternatively, you can exclude bad channels right after the semi-automatic artifact rejection process, as sometimes artifacts happen consistently (and only) on specific channels. If you do this, keep track of these bad channels, don't reject epochs based on these channels, and remove the channels after the semi-automatic artifact rejection step.
Semi-automatic artifact rejection
The next step is to reject the epochs that contain artifacts. In the CoDAP lab, we do this via a semi-automatic procedure. Based on a set of criteria, we have BVA look for artifacts, and we manually inspect the outcome for all trials and participants. At this stage, we don't reject EOG artifacts, as these types of artifacts will be corrected via an independent component analysis (ICA).
Proper cleaning of EEG data is time-consuming, and you want to do this thoroughly and preferably only once. Removing all artifacts at this stage allows you to run various analyses on the resulting clean epochs. For example, neglecting artifacts at electrodes that are not of interest to you in the current study may be a bad idea if, at a later stage, you want to do network or functional connectivity analyses, which require all electrodes. Likewise, screening for artifacts only in a time-window that mainly includes post-stimulus activity prevents you from properly analyzing pre-stimulus activity at a later stage. Tip: do not reject epochs because of artifacts that occur frequently on the same electrode. Instead, remove these channels from the dataset after the semi-automatic artifact rejection procedure (and re-introduce them via the topographical interpolation method after the ICA).
For examples of various artifacts you might find in EEG time-series, please consult Chapter 4 of Steve Luck's ERP handbook. Mike Cohen's web lecturelets also deal with important considerations.
Go to Transformations >> Artifact Rejection
This opens a box that allows you to select 3 different inspection methods. We describe the Semiautomatic Segment Selection Method here.
Next, click on the Channels tab. This allows you to enable/disable channels that you'd like to use for the screening process. As mentioned earlier, we don't correct for EOG activity at this stage, thus you should disable the EOG channels. Also, EOG activity is often quite prominent on the frontal electrodes, which is why we have disabled some of the most anterior-frontal / frontal-polar channels in the example below.
Next, click on the Criteria tab. This allows you to specify some automatic artifact screening criteria.
In the examples below, we have activated the Gradient, Max-Min, Low-Activity, and Intervals.
Gradient: this checks for the max allowed difference in microvolts/ms of neighboring sampling points. In this case, this maximum was set to 50 uV/ms.
Max-Min: this option checks that the difference between a maximum and minimum point in an epoch can't exceed a pre-specified value. In this case, this value was set to 170 uV.
Amplitude: this option is not used here (for a thorough explanation of why we don't use this option, please consult Chapter 4 of Steve Luck's ERP book).
Low activity: this option verifies that the difference between the maximum and minimum value in an interval does not fall below a specific value, which would indicate a flat or inactive channel. In the example below, this value is set to 0.5 uV.
Intervals: Note that in the previous tabs you were able to specify intervals before and after an event. In the Intervals tab, you change the inspection ranges to periods in the whole epoch (rather than tied to events).
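To get a feeling for what these criteria do, the sketch below applies simplified versions of the Gradient, Max-Min, and Low Activity checks to a single channel of one epoch, using the threshold values from this manual as defaults. It is an illustration only: it evaluates each criterion over the whole epoch, whereas in BVA you configure the intervals over which the criteria are evaluated, and "low activity" is treated here as a max-min difference that is too small (a flat line).

```python
def gradient_violation(samples, sampling_rate_hz, max_uv_per_ms=50.0):
    """True if two neighboring samples differ by more than the allowed uV/ms."""
    step_ms = 1000.0 / sampling_rate_hz
    return any(
        abs(later - earlier) / step_ms > max_uv_per_ms
        for earlier, later in zip(samples, samples[1:])
    )


def max_min_violation(samples, max_difference_uv=170.0):
    """True if the max-min difference exceeds the allowed value."""
    return max(samples) - min(samples) > max_difference_uv


def low_activity_violation(samples, min_difference_uv=0.5):
    """True if the max-min difference is below the minimum (flat signal)."""
    return max(samples) - min(samples) < min_difference_uv


def flag_epoch(samples, sampling_rate_hz):
    """Report which of the three criteria an epoch violates."""
    return {
        "gradient": gradient_violation(samples, sampling_rate_hz),
        "max_min": max_min_violation(samples),
        "low_activity": low_activity_violation(samples),
    }
```

A flagged epoch is only a candidate for rejection: in the semi-automatic procedure you still inspect each marked segment yourself before keeping or rejecting it.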
>> Click OK to proceed
In the example above you see the result from a semi-automatic artifact rejection screening. The interactive window allows you to Keep or Reject an epoch. In addition, with the << or >> buttons you can move 'freely' to the left or right without keeping or rejecting any segments.
We recommend visualizing all available channels and setting the microvolt legend to 100 uV.
We mark bad channels for disabling/interpolation (thus we don't reject epochs because of recurring artifacts that result from bad channels).
We ignore blink activity.
If you get an error message that the screening process did not yield any 'clean' segments (or that all segments were removed), this might be due to a consistently bad channel (or a channel dominated by EOG activity). Disabling this channel at the start of this process might solve this.
Inspect all trials and keep track of potential bad channels (consistent or frequent artifactual channels). Bad channels will be removed in the following step.
Ocular Independent Component Analysis
The next step is to correct the EEG time-series for EOG-related artifacts. In our lab we use a method based on independent component analysis that uses the variance of the VEOG and HEOG channels to correct the EEG time-series for EOG activity. In our experience, this method yields better results than regression-based methods, such as the Gratton & Coles technique. Also, we use the Ocular ICA automatic option in BVA, so we avoid the subjectivity of manual component selection.
Go to Transformations >> Ocular Correction ICA
This will lead you through a series of dialog boxes.
Create new blink markers, using an automatic ICA-based correction method.
Select the meaned slope algorithm.
Select the VEOG channel as the vertical activity 'common reference channel', and the HEOG channel for the horizontal activity 'common reference channel'.
Select all EEG channels for correction
Apply the correction to the whole data, using the Biased Infomax, extended ICA
Use the sum of squared correlations with VEOG/HEOG to find the ICA components
You have the option to export the ICA matrices to the export folder, but we don't use this option.
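The "sum of squared correlations" criterion can be illustrated as follows: for every independent component, correlate its time course with the VEOG and HEOG channels and add up the squared correlations, so that components that resemble the eye channels receive a high score. The sketch below only computes such a score from time courses you already have. It is not BVA's implementation, and how BVA turns these scores into a selection of components is not covered here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return covariance / (spread_x * spread_y)


def ocular_scores(components, veog, heog):
    """Sum of squared correlations of each component with VEOG and HEOG."""
    return [
        pearson_r(component, veog) ** 2 + pearson_r(component, heog) ** 2
        for component in components
    ]
```

Each score lies between 0 and 2; a component whose time course closely follows the blinks in the VEOG channel, for instance, will score much higher than a component that reflects brain activity.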
>> Click on Finish when at the last box. This will initiate the Ocular ICA, and this process can take up to 20-30 minutes, depending on the length of the EEG time-series and your processor.
If you have deleted bad EEG channels prior to the Ocular ICA, you can re-introduce these channels to the EEG time-series via the Topographical Interpolation transformation. Basically, this method calculates the time-series of a channel based on Spherical Spline Interpolation or a Triangulation/Linear Interpolation method (we always choose the Spherical Spline method).
If you didn't delete any EEG channels you can skip this step...
Go to Transformations >> Topographical Interpolation
Tick the Interpolation by Spherical Spline option
Leave the other settings as shown above
Enter the names of the EEG channels that you wish to interpolate (coordinates are calculated automatically)
Make sure to tick the Keep Old Channels box