CLEEGN: A CONVOLUTIONAL NEURAL NETWORK FOR PLUG-AND-PLAY AUTOMATIC EEG RECONSTRUCTION

Abstract

Human electroencephalography (EEG) is a brain monitoring modality that senses cortical neuroelectrophysiological activity at high temporal resolution. One of the greatest challenges in applications of EEG is the unstable signal quality, which is susceptible to inevitable artifacts during recording. To date, most existing techniques for EEG artifact removal and reconstruction are applicable to offline analysis only, or require individualized training data to facilitate online reconstruction. We propose CLEEGN, a novel convolutional neural network for plug-and-play automatic EEG reconstruction. CLEEGN is based on a subject-independent model pre-trained on existing data and can operate on a new user without any further calibration. The performance of CLEEGN was validated using multiple evaluations, including waveform observation, reconstruction error assessment, and decoding accuracy on well-studied labeled datasets. The results of simulated online validation suggest that, even without any calibration, CLEEGN largely preserves inherent brain activity and outperforms leading online/offline artifact removal methods in the decoding accuracy of reconstructed EEG data. In addition, visualization of model parameters and latent features exhibits the model's behavior and reveals explainable insights related to existing knowledge of neuroscience. We foresee pervasive applications of CLEEGN in prospective work on online plug-and-play EEG decoding and analysis.

1. INTRODUCTION

Since the first recording of a human electroencephalogram (EEG), performed almost a century ago in 1924, EEG has been one of the most widely used non-invasive neural modalities that monitor brain activity at high temporal resolution (Koike et al., 2013; Mehta & Parasuraman, 2013; Sejnowski et al., 2014). Among a variety of modalities, EEG sees extensive use in the clinical assessment of neurological and psychiatric conditions, as well as in research on neuroscience, cognitive science, psychology, and brain-computer interfacing (BCI). EEG signals measure subtle fluctuations of the electrical field driven by the local neuroelectrophysiological activity of populations of neurons in the cortex (Cohen, 2017). Because the electrodes are placed on the surface of the scalp, undesired artifacts may interrupt the measurements and distort the signal of interest. Even in a well-controlled laboratory with a well-trained subject who keeps the body maximally still and relaxed, the EEG signals can unfortunately be contaminated by inevitable behavioral and physiological artifacts such as eye blinks, reflexive muscle movements, ocular activity, and cardiac activity (Croft & Barry, 2000; Wallstrom et al., 2004; Romero et al., 2008). In practice, it is difficult to identify and track the sources of artifacts entirely due to their diversity and non-stationarity; noise cancellation and artifact removal remain major issues in EEG signal processing and decoding.

Numerous methods have been proposed to alleviate the influence of artifacts in EEG signals. Traditional EEG denoising algorithms include filtering, regression, and data separation or decomposition (Makeig et al., 1995; Islam et al., 2016; Kothe & Jung, 2016). According to previous meta-analyses of the EEG artifact removal literature (Urigüen & Garcia-Zapirain, 2015; Jiang et al., 2019), independent component analysis (ICA) is especially popular, used in 45% of the EEG denoising literature. ICA-based artifact removal estimates component activity by unmixing the EEG data in the channel domain. Through manual or automatic identification, one can exclude the artifact components and then reconstruct the EEG data through back-projection of the non-artifact components (Jung et al., 2000a).

The fast growth of deep learning has yielded state-of-the-art performance on a variety of machine learning problems (LeCun et al., 2015). Lately, deep-learning-based EEG reconstruction has drawn much attention (Leite et al., 2018; Sun et al., 2020; Lopes et al., 2021; Lee et al., 2020; Chuang et al., 2022). Although these methods can effectively remove artifacts from synthetic signals, their performance in reconstructing real EEG data has not yet been validated in terms of decoding labeled EEG data. Meanwhile, the model design of existing deep-learning-based techniques for EEG reconstruction rarely takes the characteristics of EEG into account.

In this work, we propose CLEEGN, a ConvoLutional neural network for EEG reconstructioN. CLEEGN is capable of subject-independent EEG reconstruction without any training/calibration for a new subject. The contributions of this work are three-fold:

• a light-weight autoencoder CNN, CLEEGN, with a subject-independent framework that facilitates plug-and-play EEG reconstruction;
• CLEEGN outperforms leading online/offline methods in providing reconstructed EEG data with the best decoding performance on BCI datasets;
• with a novel model design dedicated to EEG reconstruction, CLEEGN characterizes interpretable EEG patterns and provides neuroscientific insights.

2. RELATED WORK

Current processing techniques for EEG artifact removal vary widely with the context in which the algorithm is applied. Earlier attempts at EEG denoising assumed that EEG signals and artifacts occupy different frequency ranges. Under this assumption, some prominent artifacts can be eliminated by linear filtering during the online stage (Seifzadeh et al., 2014). Despite its low computational cost, linear filtering can hardly remove artifacts whose frequency range overlaps that of the EEG signals. Another approach, adaptive filtering (Schlögl et al., 2007), estimates artifact signals through additional EOG, EMG, and ECG channels and removes these noisy signals from the recorded signals by regression. Nevertheless, this approach requires additional auxiliary electrodes, raising the cost and inconvenience of practical applications. Blind source separation (BSS) methods for EEG denoising were developed under the assumption that the recorded EEG signals are linear combinations of signals from noise sources and from brain neurons. One of the most well-known BSS methods is independent component analysis (ICA) (Jung et al., 2000a;b), which separates EEG signals into independent components (ICs) (Makeig et al., 1995). Traditionally, the artifact components extracted by ICA are identified and removed through manual inspection. The recently developed ICLabel can label the provenance of ICs into seven categories: brain, eye, heart, muscle, line noise, channel noise, and other (Pion-Tonachini et al., 2019). Artifact subspace reconstruction (ASR) is another automatic approach, based on principal component analysis (PCA) (Kothe & Jung, 2016). ASR selects relatively noiseless periods of the multi-channel EEG data as a reference based on the data distribution. After projecting all EEG data onto the principal-component domain, high-variance components attributed to artifacts are detected by a cutoff parameter k. The noiseless signals are reconstructed by preserving the components that do not carry artifacts and back-projecting them to the time domain. ASR has been shown to improve the quality of ICA decomposition (Chang et al., 2020). Among deep-learning approaches, 1D-ResCNN (Sun et al., 2020) is able to remove EOG, ECG, and EMG artifacts from single-channel synthesized EEG data. Instead of using synthesized EEG data, IC-U-Net (Chuang et al., 2022) generates pairs of noisy and noiseless EEG data through ICA as training data. That network is a one-dimensional adaptation of the U-Net architecture trained with an ensemble of four loss functions that minimize the differences in the amplitude, velocity, acceleration, and frequency components of the signals. The authors showed that their reconstructed signals have higher SNR and increase the number of brain components classified by ICLabel.

The encoder of CLEEGN consists of two convolutional blocks: the first applies C spatial kernels of shape (C, 1) and the second applies N_F temporal kernels of shape (1, ⌊fs/10⌋), where fs denotes the sampling frequency of the EEG signals. As for the decoder, we design an approximately symmetric structure with three convolutional blocks. The first block decodes the EEG features with N_F temporal kernels of shape (1, ⌊fs/10⌋). The second block decodes the EEG features through a convolution layer with C spatial kernels of shape (C, 1). The last block projects the feature domain back to the original time domain. To keep the shapes of the model inputs and outputs equal, every convolution block applies zero-padding except the first block of the encoder. The details of the CLEEGN architecture are given in Table 1.
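The component-domain reconstruction shared by the ICA- and ASR-style methods above (zeroing rejected components and back-projecting to the channel domain) can be sketched in NumPy. The unmixing matrix `W` and the artifact indices below are hypothetical stand-ins for what an actual ICA decomposition and a classifier such as ICLabel would provide:

```python
import numpy as np

def remove_components(X, W, artifact_ics):
    """Zero out artifact ICs and back-project to the channel domain.

    X : (C, T) multi-channel EEG, W : (C, C) unmixing matrix,
    artifact_ics : indices of components judged artifactual.
    """
    S = W @ X                      # component activations (ICs)
    S[artifact_ics, :] = 0.0       # exclude artifact components
    A = np.linalg.inv(W)           # mixing matrix (back-projection)
    return A @ S                   # reconstructed, artifact-reduced EEG

# Toy example: 3 channels, with component 2 treated as an artifact.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3))    # hypothetical unmixing matrix
X = rng.standard_normal((3, 100))  # hypothetical noisy recording
X_clean = remove_components(X, W, artifact_ics=[2])
```

After the back-projection, the rejected component carries no energy in the reconstructed data, while the remaining components are untouched.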

3. MATERIALS AND METHODS

Table 1: The CLEEGN model architecture.

Block     Layer       # Kernels   Kernel size     Output shape
Input                                             (B, 1, C, T)
Encoder   Conv2D      C           (C, 1)          (B, C, 1, T)
          Permute                                 (B, 1, C, T)
          BatchNorm                               (B, 1, C, T)
          Conv2D      N_F         (1, ⌊fs/10⌋)    (B, N_F, C, T)
          BatchNorm                               (B, N_F, C, T)
Decoder   Conv2D      N_F         (1, ⌊fs/10⌋)    (B, N_F, C, T)
          BatchNorm                               (B, N_F, C, T)
          Conv2D      C           (C, 1)          (B, C, C, T)
          BatchNorm                               (B, C, C, T)
          Conv2D      1           (C, 1)          (B, 1, C, T)

As illustrated in Figure 1, the objective of the proposed method is to minimize the difference between the noiseless signals, Y, and the model output, Ŷ. The recorded EEG data, X, are the combination of the brain signals, Y, and the signals from multiple noise sources, N. CLEEGN can be regarded as a denoising autoencoder that performs artifact removal on EEG data by learning a mapping between multi-channel noisy EEG signals, X, and noiseless signals, Y. The training process of CLEEGN uses pairs of noisy raw EEG data and noiseless reference EEG data so that the model learns to transform noisy EEG data into reconstructed EEG data with maximal similarity to the reference. To generate large-scale, real reference EEG data, we adopted the automatic denoising methods ICLabel and ASR to remove artifacts and reconstruct clean waveforms offline. In contrast to synthetic data, the use of real EEG data ensures the presence of artifacts/noise in a natural way and thus provides a realistic evaluation of our model. Each paired EEG recording with C channels of continuous noisy/noiseless data was further segmented into training sample pairs. The input size is (C, T), where T is the number of time points in a 4-second segment at the sampling rate of the EEG data. The window size and stride are set to T and 0.5T, respectively. Considering the context of plug-and-play EEG reconstruction, we adopt a subject-independent training scheme in which the EEG data are partitioned into k disjoint sets by subject. During the training process, one of the sets was left out for testing; the EEG data of subjects in the left-out set were involved in neither training nor validation. A complete experiment on a single dataset therefore results in k different models.
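As a quick sanity check of the CLEEGN architecture in Table 1, the following sketch propagates the tensor shapes through the convolutional blocks. It is pure Python bookkeeping, not a framework implementation, and the choices N_F = 20 and the other sizes are illustrative assumptions rather than the paper's exact configuration:

```python
def cleegn_shapes(B, C, T, n_f, fs):
    """Return (layer name, output shape) after each CLEEGN stage."""
    k_t = fs // 10  # temporal kernel length, i.e. floor(fs / 10)
    return [
        ("enc: Conv2D, C kernels (C,1)",           (B, C, 1, T)),
        ("enc: Permute",                           (B, 1, C, T)),
        (f"enc: Conv2D, N_F kernels (1,{k_t})",    (B, n_f, C, T)),
        (f"dec: Conv2D, N_F kernels (1,{k_t})",    (B, n_f, C, T)),
        ("dec: Conv2D, C kernels (C,1)",           (B, C, C, T)),
        ("dec: Conv2D, 1 kernel (C,1)",            (B, 1, C, T)),
    ]

# Hypothetical sizes: batch 64, 56 channels, 4 s at 128 Hz, N_F = 20.
shapes = cleegn_shapes(B=64, C=56, T=512, n_f=20, fs=128)
```

Note that the final decoder block restores the input shape (B, 1, C, T), which is what allows CLEEGN to output a reconstruction aligned sample-for-sample with its input.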
The artifact removal performance of a model was evaluated on the left-out set. The number of subjects in one set and the EEG duration available for each subject depend on the experiment setting and the dataset used.

EEG Reconstruction and Evaluation. The reconstructed EEG data of CLEEGN are generated through a simulated online reconstruction. In the online stage, the system performs artifact removal for a new subject using a pre-trained CLEEGN model without any requirement of training/calibration. Figure 2 shows the online EEG reconstruction simulated offline. The EEG data from a subject in the left-out set are fed into the model sequentially. The stride of the sliding window is set to 0.5 seconds to minimize the delay of online reconstruction. The fitness of the model can be evaluated by visually observing the waveform or by measuring the similarity between the reconstructed and noiseless EEG using the mean squared error (MSE). In addition, we propose using the decoding performance on labeled EEG data as an objective measure of reconstructed EEG quality. We employed EEGNet, a compact CNN for end-to-end EEG decoding (Lawhern et al., 2018), as the classifier for the labeled EEG data in our study.

The EEG time series from top to bottom represent the recordings of the Fp1, T7, Cz, T8, and O2 channels. The nine sub-figures compare the noisy raw data against the reference data reconstructed by ICLabel and by ASR with or without ICLabel for cutoff parameters k of 4, 8, 16, and 32. The waveforms of ICLabel and of the ASR-ICLabel hybrid method with the four different cutoff parameters are similar, and the noiseless waveforms reconstructed by CLEEGN are highly correlated with the results of these five methods. High-amplitude ocular artifacts such as blinks and eye saccades are prominent in the frontal channel (Fp1), and high-frequency muscular artifacts (around 24-30 Hz, in the β band) are observed on the T7 channel.
CLEEGN, ICA, and the ASR-ICLabel hybrid methods can eliminate these kinds of artifacts. In contrast to ICLabel and the ASR-ICLabel hybrids, ASR is unable to eliminate the high-frequency EMG artifacts in this dataset. The tolerance of large-amplitude EOG artifacts increases with larger values of the cutoff parameter k. Limited by the reference data generated by ASR, CLEEGN preserves the high-frequency artifacts. As for the large-amplitude artifacts that ASR-32 and ASR-16 fail to identify, CLEEGN is still able to mitigate those EOG artifacts. As shown in Figure 6(a), the reference data prepared by ICLabel offer the lowest MSE, i.e., the best fitness for CLEEGN training. We also compare the types of reference EEG data and their corresponding CLEEGN reconstruction results in terms of the decoding performance on the ERN EEG dataset to assess their data quality. As illustrated in Figure 6(b), CLEEGN reconstruction yields better performance for all denoising methods. Although the performance decreases with a smaller cutoff parameter k, CLEEGN can outperform the reference data used for its training. This result suggests that our method not only removes artifacts but also preserves informative brain activity in the EEG under our cross-subject training scheme. As ICLabel provides the reference data with the best decoding performance and fitness, we chose ICLabel as the source of reference data for the remaining experiments in this study.

Training Data Size. We explored the effect of the training data length per subject on the fitness and the decoding performance of CLEEGN. The training data were segmented from the first 1, 2, 4, 10, 20, and 30 minutes of each EEG recording to investigate the trade-off between training time and reconstruction performance. Figure 7(a) shows the fitness of CLEEGN to the reference data. The model trained on the first 10 minutes obtains the minimal value among all duration configurations.
Though the differences in MSE between 10, 20, and 30 minutes are not noticeable, Figure 7(b) shows that the 10-minute training data yield the best decoding performance among all settings. Interestingly, with only one minute of training data from each subject, CLEEGN achieves decoding performance comparable to the reference data. This indicates that CLEEGN retains its performance even when each subject contributes only a short recording for training. In addition, we explored the effect of the number of subjects in the training set on the fitness of CLEEGN model training and the decoding performance of the reconstructed EEG data. We randomly reduced the number of training subjects from 12 to 6, 4, and 2. Since the particular subset of subjects may influence the performance, we tested multiple combinations and averaged the results.

Visualization. To examine the interpretability of the CLEEGN model, we visualized its intermediate latent features by mapping them onto a 2-dimensional domain based on principal component analysis (PCA) of the noisy raw EEG data (Wold et al., 1987). Figure 9(a) shows the principal component space and the projections of the noisy EEG channels. We observe that the scatter of the noisy EEG data retains the spatial relationship of the actual EEG electrodes. The arrangement from frontal electrodes (prefixed with Fp or F) to posterior electrodes (prefixed with P or O) is projected along the x-axis from right to left. The projection from top to bottom matches the left-side electrodes (suffixed with odd numbers), central electrodes (suffixed with z), and right-side electrodes (suffixed with even numbers). The projection suggests that EEG data of adjacent channels tend to show similar waveforms. The blue dots in Figure 9(b), (c) are the projections of the latent features from the first and second convolutional layers, which make up the encoder of CLEEGN. The projections of the latent features from the next two convolutional layers are presented in Figure 9(d), (e).
From Figure 9(b) to (e), we can see that the distribution range of the latent features shrinks through the encoder and expands through the decoder of CLEEGN. Figure 9(f) compares the distributions of the noisy and the CLEEGN-reconstructed data, showing that the reconstructed data form a more compact cluster than the noisy EEG in the PCA space. This implies that CLEEGN internally performs a projection of the original EEG data in the upstream layer, complex temporal filtering and combination in the midstream layers, and a final projection that converts the noiseless latent features back to the channel domain.
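A minimal sketch of the PCA channel projection behind Figure 9, under the assumption that each channel's time course is projected onto the top-2 principal directions of the noisy EEG (the data here are synthetic placeholders):

```python
import numpy as np

def channel_pca_projection(X):
    """X: (C, T) EEG. Returns (C, 2) coordinates of channels in PCA space."""
    Xc = X - X.mean(axis=1, keepdims=True)       # center each channel
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                         # scores on top-2 components

# Hypothetical 56-channel, 1000-sample recording.
rng = np.random.default_rng(1)
X = rng.standard_normal((56, 1000))
coords = channel_pca_projection(X)               # one 2-D point per channel
```

On real EEG, plotting one point per channel in this plane is what reveals the scalp-like arrangement described above; latent feature maps can be projected into the same plane by reusing the Vt directions fitted on the noisy data.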

5. CONCLUSION

In this work, we have proposed CLEEGN, a novel convolutional neural network for plug-and-play automatic EEG reconstruction. The training of CLEEGN leverages the conventional offline denoising methods ASR and ICLabel, with automatic component classification, to generate abundant noiseless EEG data. The performance of CLEEGN was objectively validated using multiple evaluations, including waveform observation, reconstruction error assessment, and decoding accuracy on well-studied labeled datasets. The experimental results suggest that, even without any calibration, CLEEGN can largely preserve inherent brain activity. In terms of the decoding performance of reconstructed EEG data, CLEEGN outperforms other neural-network-based denoising methods on both ERN and SSVEP EEG decoding. Waveform visualization shows that CLEEGN removes artifacts from different sources and that its output is highly correlated with the reference data. Through the visualization of model parameters and latent features, we exhibit the model's behavior and reveal explainable insights related to existing knowledge of neuroscience. CLEEGN effectively learns the transformation for EEG reconstruction from existing techniques and even outperforms the conventional offline approaches. Future extensions of this work include incorporating other EEG denoising methods and their mixed use, and enhancing inference across datasets or even across recording montages. We foresee pervasive applications of CLEEGN in prospective work on EEG decoding and analysis.

A.1.3 1D-RESCNN

The Adam optimizer with an initial learning rate of 1e-3 and the mean squared error (MSE) loss function is adopted for 1D-ResCNN (Sun et al., 2020) training. Since 1D-ResCNN was developed for one-dimensional synthesized EEG data and no explicit instruction for using multi-channel EEG is provided, we trained 1D-ResCNN using two different methods. In the first, we trained a separate model for each EEG channel. In the second, we treated each multi-channel EEG segment as a batch with a fixed channel arrangement within the batch. In our experiments, the second method not only provides a more efficient training process but also results in better reconstruction performance. The same weight-saving strategy as in CLEEGN is adopted for 1D-ResCNN training.

A.1.4 EEGDENOISENET (SCNN, RNN)

Two further simple network architectures are adopted in the comparison: a simple convolutional structure (SCNN) and a recurrent structure (RNN). The mean squared error (MSE) loss function is adopted as the objective criterion. The weight-saving strategy in SCNN and RNN training is the same as for CLEEGN.

A.1.5 DECODING PERFORMANCE EVALUATION MODEL: EEGNET

EEGNet (Lawhern et al., 2018) is a well-known EEG decoding model widely used in the EEG literature. In the original EEGNet paper, the authors investigated their model with different numbers of kernels and denoted the model with F1 temporal filters and D spatial filters as EEGNet-F1,D. We use two different settings for the two datasets. For ERN classification, we use the EEGNet-8,2 structure suggested by the EEGNet paper. For SSVEP classification, experimental results showed that EEGNet-100,8 yields the best performance. We trained and evaluated the decoding performance individually for each subject. Within each subject, we divided the collection of event epochs into training, validation, and test sets with a ratio of 3:1:1; the ratio between classes remained the same in each set. The loss function is categorical cross-entropy (CCE), and the Adam optimizer is adopted with a learning rate of 1e-3 and zero weight decay. The batch size is set to 32 and the total number of training epochs is 200.
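The 3:1:1 per-subject split with preserved class ratios can be sketched as follows; the exact shuffling scheme used in the paper is not specified, so this is one plausible implementation:

```python
import numpy as np

def stratified_split(labels, ratios=(3, 1, 1), seed=0):
    """Return index arrays (train, val, test) preserving class proportions."""
    rng = np.random.default_rng(seed)
    splits = [[], [], []]
    total = sum(ratios)
    for cls in np.unique(labels):
        idx = rng.permutation(np.where(labels == cls)[0])
        n = len(idx)
        cut1 = n * ratios[0] // total            # end of training slice
        cut2 = cut1 + n * ratios[1] // total     # end of validation slice
        splits[0].extend(idx[:cut1])
        splits[1].extend(idx[cut1:cut2])
        splits[2].extend(idx[cut2:])
    return [np.array(sorted(s)) for s in splits]

# Toy imbalanced labels: 50 epochs of class 0, 25 of class 1.
labels = np.array([0] * 50 + [1] * 25)
train, val, test = stratified_split(labels)
```

Because the split is done per class, an imbalanced dataset keeps the same class ratio in all three sets, which matters for metrics such as AUC on the ERN data.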

A.2.1 DATASET 1: FEEDBACK ERROR-RELATED NEGATIVITY (ERN)

Error-related negativity (ERN) can be categorized as a kind of event-related potential (ERP), which occurs after an erroneous or abnormal event is perceived by the subject. One characteristic of the feedback ERN is a relatively large negative deflection approximately 350 ms, and a positive deflection approximately 500 ms, after the visual feedback triggered by the error event. In this work, we mainly use a well-studied EEG dataset from the BCI Challenge competition hosted by Kaggle to evaluate artifact removal effectiveness. This dataset includes EEG recordings of 26 subjects (16 labeled and 10 unlabeled) who participated in a P300 speller task. The P300 speller is a well-known BCI system that implements a typing application through the P300 evoked response. The ERN experiment was conducted under the assumption that an ERN occurs when the subject receives an incorrect prediction (feedback) from the P300 speller. The objective of the competition was to improve P300 speller performance by implementing error correction through ERN potentials. We used the 16 subjects with labeled data, recorded at an initial sampling rate of 200 Hz with 56 passive Ag/AgCl EEG sensors. To increase the usability of the EEG data, we applied pre-processing procedures to each recording. The EEG data were down-sampled to 128 Hz and re-referenced by the common average reference (CAR) method to eliminate common-mode noise and zero-center the data. Each recording was band-pass filtered to 1-40 Hz with the FIR filter implemented in EEGLAB to remove DC drift. For the EEG decoding evaluation, we epoch the EEG signals over the [0, 1.25] second interval to obtain correct and erroneous feedback events.
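Two of the pre-processing steps above, common average referencing and epoching into [0, 1.25] s windows, can be sketched with NumPy; the band-pass filtering and resampling, done with EEGLAB in the paper, are omitted here, and the event onsets are hypothetical:

```python
import numpy as np

def car(X):
    """Common average reference: subtract the mean across channels. X: (C, T)."""
    return X - X.mean(axis=0, keepdims=True)

def epoch(X, onsets, fs, t0=0.0, t1=1.25):
    """Cut (C, T) data into (n_events, C, L) epochs relative to onsets (s)."""
    a, b = int(t0 * fs), int(t1 * fs)
    return np.stack([X[:, int(o * fs) + a : int(o * fs) + b] for o in onsets])

fs = 128                                             # post-resampling rate
rng = np.random.default_rng(2)
X = rng.standard_normal((56, fs * 60))               # 1 min of 56-channel data
epochs = epoch(car(X), onsets=[5.0, 12.0, 30.0], fs=fs)
```

At 128 Hz, a [0, 1.25] s window yields 160 samples per epoch, so the resulting array has shape (n_events, 56, 160).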

A.2.2 DATASET 2: STEADY STATE VISUALLY EVOKED POTENTIAL (SSVEP)

Steady-state visually evoked potential (SSVEP) is another kind of ERP, characterized by periodic potentials induced by rapidly repetitive visual stimulation. The SSVEP is composed of several discrete frequency components, consisting of the fundamental frequency of the visual stimulus and its harmonics. To investigate the generalization ability of the model, we use "EEG SSVEP Dataset II" from Multimedia Authoring & Management using your Eyes & Mind (MAMEM). The dataset includes EEG data from 11 subjects and covers five stimulation frequencies (6.66, 7.50, 8.57, 10.00, and 12.00 Hz). Each subject was recorded in five sessions, and each session included 25 trials (5 trials per class). The data were recorded with a 256-channel HydroCel Geodesic Sensor Net (HCGSN) at a sampling rate of 250 Hz. Since each subject's EEG recording contains different bad channels, we preserved 20 channels common to all subjects for training the CLEEGN model. Every recording was down-sampled to 125 Hz, re-referenced by the common average reference (CAR) method, and band-pass filtered to 1-40 Hz. We epoch the EEG signals over the [1, 5] second interval after each recorded event timestamp; the first second is discarded to account for the subject's reaction delay.

A.3 INVESTIGATION OF DECODING PERFORMANCE EVALUATION: SSVEP

In this work, we mainly use decoding performance to assess different artifact removal methods. Here we provide an interpretation of the relationship between EEG quality and decoding performance. In the Section 4 SSVEP results, IC-U-Net achieves the minimal MSE among all compared artifact removal networks. However, its decoding performance is worse than that of CLEEGN and even the reference method (ICLabel), which implies that the average error over data points (MSE) is not the best assessment of EEG quality. In the SSVEP experiment, the fundamental frequency of the external rapidly repetitive visual stimulation and its harmonics are important classification features; hence, high quality in the frequency components of the EEG data is required. We use the power spectral density (PSD) to interpret the decoding results. Figure 10 shows the PSD of the EEG event data for each class denoised by the different methods. The reconstructed EEG spectrum from CLEEGN for each class is similar to the reference. As for IC-U-Net, the model appears unable to fully reconstruct several frequency bands, which leads to its low decoding performance. The PSD also shows great distortion in the power density of the data reconstructed by 1D-ResCNN. Since 1D-ResCNN is a one-dimensional structure, we hypothesize that spatial information is important in EEG artifact removal.
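The PSD comparison can be reproduced in spirit with a simple periodogram; a real analysis would typically use Welch's method, so this NumPy-only version is just a sketch on a synthetic 10 Hz SSVEP-like tone:

```python
import numpy as np

def psd(x, fs):
    """One-sided power spectral density of a 1-D signal via the FFT."""
    X = np.fft.rfft(x - x.mean())
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = (np.abs(X) ** 2) / (fs * len(x))
    return freqs, power

fs = 125                                  # post-resampling SSVEP rate
t = np.arange(fs * 4) / fs                # 4 s of data
x = np.sin(2 * np.pi * 10.0 * t)          # 10 Hz stimulation frequency
freqs, power = psd(x, fs)
peak = freqs[np.argmax(power)]            # expected near 10 Hz
```

A denoising method that preserves SSVEP-relevant information should keep such peaks (the stimulation fundamental and its harmonics) intact in the reconstructed data, which is exactly what the Figure 10 comparison probes.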



https://www.kaggle.com/c/inria-bci-challenge
https://www.mamem.eu/results/datasets/



Figure 1: Illustration of the proposed CLEEGN model architecture and the model training flow.

Table 1 notation: C = # channels, T = # time points, fs = sampling rate, B = batch size.

Figure 2: Schematic of the simulated online processing flow and the assessments of fitness between the reconstructed and reference EEG signals.

Figure 4: Reconstructed EEG signals by CLEEGN across training steps.

Figure 5: Visualization of raw (gray), reference (red), and CLEEGN (blue) EEG waveforms for the offline methods ICLabel, ASR-32, ASR-32-ICLabel, ASR-16, ASR-16-ICLabel, ASR-8, ASR-8-ICLabel, ASR-4, and ASR-4-ICLabel. Each sub-figure plots a five-second segment of the signals at Fp1, T7, Cz, T8, and O2.

Figure 6: (a) Overall fitness of the CLEEGN model across types of reference data using the ERN dataset. (b) Decoding performance of the CLEEGN-reconstructed EEG data (blue) and the corresponding reference data used for CLEEGN model training (red).

With the decrease in the number of subjects, the MSE value and the standard error (light shaded area) increase in Figure 8(a), which indicates that the generalization ability of the CLEEGN model degrades when fewer subjects are included in training. In Figure 8(b), we can observe a slight decrease in decoding performance. We consider the number of subjects an essential factor in the performance of CLEEGN, yet only a few subjects are required to achieve satisfactory performance compared to the reference data.

Figure 7: Performance of CLEEGN against the training data length per subject evaluated on the BCI-ERN dataset by (a) the fitness to the reference data and (b) the decoding performance.

Performance Comparison. We compare the performance of CLEEGN against five baseline methods: ICLabel, 1D-ResCNN (Sun et al., 2020), IC-U-Net (Chuang et al., 2022), and the simple CNN (SCNN) and RNN structures proposed in previous work (Zhang et al., 2020). Except for ICLabel, which operates offline, CLEEGN and the other neural-network-based methods perform a simulated online reconstruction based on the same training process with reference data generated by ICLabel. Results on the ERN EEG dataset are shown in Table 2. CLEEGN has the best fitness to the reference data with the minimum MSE, the highest AUC score in decoding performance, and the fewest parameters. For the SSVEP EEG data, although not achieving the minimum MSE, CLEEGN outperforms the other methods in decoding accuracy and model size. The evaluation across the two datasets suggests an overall superiority of the proposed CLEEGN model over existing neural-network-based methods in online reconstruction. CLEEGN even provides a better reconstruction than the offline ICLabel. These promising results indicate the usability of CLEEGN in online, training/calibration-free EEG reconstruction that truly meets the needs of real-world EEG-based BCI applications.

Figure 8: EEG reconstruction performance of CLEEGN against the number of subjects evaluated under the BCI-ERN dataset by (a) the fitness to the reference data; and (b) the decoding performance.

Figure 9: PCA visualization of latent features in CLEEGN for a single subject in the ERN dataset. (a) The noisy EEG data projected on the PCA space of the noisy EEG data. (b)-(e) Latent features in the first to fourth convolutional layers projected on the original noisy PCA space. (f) CLEEGN reconstructed EEG data projected on the original noisy PCA space.

Figure 10: Power spectral density of the EEG data for each class under the different denoising methods.

Overall performance over all subjects in the BCI-ERN dataset.

Overall performance over all subjects in the MAMEM-SSVEP-II dataset.

Mean square error between reference denoising method and CLEEGN (MSE), AUC score of CLEEGN (AUC) in "BCI-Challenge" ERN dataset -Part2

Mean square error between reference denoising method and CLEEGN (MSE), AUC score of CLEEGN (AUC) in "BCI-Challenge" ERN dataset -Part3

Mean square error between reference denoising method and CLEEGN (MSE), AUC score of CLEEGN (AUC) in "BCI-Challenge" ERN dataset trained by different data length -Part1

SSVEP mean squared error and top-1 accuracy of each subject for the different network structures.

A.1.1 CLEEGN

For training CLEEGN, we used the Adam optimizer with an initial learning rate of 1e-3 and no weight decay. An exponential learning-rate scheduler with a gamma of 0.8 is applied. The loss function is the mean squared error (MSE). The batch size is set to 64 and the total number of training epochs is 40. During training, the model is evaluated on the validation subset at the end of every epoch in order to save the weights that achieve the lowest validation loss.
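The training bookkeeping described above, exponential learning-rate decay with gamma 0.8 and saving the weights with the lowest validation loss, reduces to the following logic (the validation losses below are illustrative placeholders, not measured values):

```python
def exp_lr(epoch, lr0=1e-3, gamma=0.8):
    """Learning rate after `epoch` exponential decay steps."""
    return lr0 * gamma ** epoch

def best_checkpoint(val_losses):
    """Index of the epoch whose weights would be saved."""
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:                 # strictly lower: keep first best
            best_epoch, best_loss = epoch, loss
    return best_epoch

lr_epoch2 = exp_lr(2)                        # 1e-3 * 0.8**2 = 6.4e-4
best = best_checkpoint([0.9, 0.5, 0.62, 0.48, 0.55])   # epoch 3 is kept
```

Keeping the lowest-validation-loss weights rather than the final epoch's weights guards against late-epoch overfitting within the 40-epoch budget.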

A.1.2 IC-U-NET

The optimizer adopted for IC-U-Net (Chuang et al., 2022) training is SGD with an initial learning rate of 1e-2, momentum of 0.9, and weight decay of 5e-4. A multistep learning-rate scheduler is used during training. As the loss function, the novel ensemble loss proposed in IC-U-Net is adopted: a simple linear combination of the mean squared errors (MSE) in the amplitude, velocity, acceleration, and frequency components of the EEG signals.
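The ensemble loss can be sketched as follows; the equal weighting of the four MSE terms is an assumption for illustration, as the original IC-U-Net work defines its own combination:

```python
import numpy as np

def mse(a, b):
    return np.mean((a - b) ** 2)

def ensemble_loss(y_hat, y):
    """y_hat, y: (C, T) reconstructed and reference signals."""
    return (
        mse(y_hat, y)                                              # amplitude
        + mse(np.diff(y_hat, 1), np.diff(y, 1))                    # velocity
        + mse(np.diff(y_hat, 2), np.diff(y, 2))                    # acceleration
        + mse(np.abs(np.fft.rfft(y_hat)), np.abs(np.fft.rfft(y)))  # frequency
    )

rng = np.random.default_rng(3)
y = rng.standard_normal((4, 256))
loss_shifted = ensemble_loss(y + 1.0, y)     # constant offset: loss > 0
```

Velocity and acceleration are approximated here by first and second temporal differences, so the loss penalizes mismatches in waveform shape and dynamics, not just pointwise amplitude error.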

