Classification of Motor Imagery Using a Combination of User-Specific Band and Subject-Specific Band for Brain-Computer Interface

Abstract: The essential task of a Brain-Computer Interface (BCI) is to extract motor imagery features from electroencephalogram (EEG) signals in order to classify the thought process. These signals need to be analysed in both the time and frequency domains. Combining multiple algorithms has been observed to improve the performance of the feature extraction process. Although other authors have implemented different combinations of techniques, this paper identifies combinations that have not been attempted previously and that improve the accuracy of the overall process. The focus is on the feature extraction process and on two kinds of frequency bands: user-specific and subject-specific. After analysing the EEG signals with the time domain parameter (TDP), we select the frequency band and the timing using the Fisher ratio of the TDP. We use the Fisher discriminant analysis (FDA)-type F-score to simultaneously select the frequency band and time segment for multi-class classification. Once the optimal time-frequency areas have been selected for each subject, we extract subject-specific TDP features from the training trials to train the classifier. Various methods are explored for obtaining the features: Time Domain Parameters (TDP), Fast Fourier Transform (FFT), Principal Component Analysis (PCA), $R^2$, Fast Correlation Based Filter (FCBF), Empirical Mode Decomposition (EMD), and Intrinsic Time-Scale Decomposition (ITD). After the extraction process, PCA is used for dimensionality reduction. The best result was obtained with the combination of TDP, FFT, and PCA. We used multi-class Fisher's linear discriminant analysis (LDA) as the classifier, in line with the FDA-type F-score.
Applying the combined feature extraction techniques to the frequency bands selected by the Fisher ratio and the FDA-type F-score, together with Fisher's LDA classifier, gave higher accuracy than the results reported by other researchers. A kappa coefficient of 0.64 is obtained for the proposed technique. Our method leads to better classification performance when compared to state-of-the-art methods. The novelty of the approach lies in the combination of frequency bands and two feature extraction methods.


Introduction
The possibility of controlling external devices using brain activity has become feasible. Brain signals can be captured through a Brain-Computer Interface (BCI). This is done by recording various signals from the brain and analysing them to identify the type of signal based on its time and frequency characteristics [1]. The signals can be collected from the brain either internally or externally. Internal measurement is more effective, since the measured values are more accurate. The externally measured signals are of comparatively
Dimensionality reduction techniques have been investigated by García-Laencina et al. [6] using the Hjorth parameters and adaptive autoregressive coefficients. The feature selection techniques used were sequential forward selection and sequential backward selection, while the dimensionality reduction techniques used were Principal Component Analysis (PCA), Local Fisher Discriminant Analysis (LFDA), and Locality Preserving Projections (LPP). The performance of the classification process was increased using these hybrid techniques. However, the number of channels and subjects used is very small and, hence, more detailed analysis is necessary.
Wavelet transform was used by Gupta et al. [7] for extracting the features. Six types of filter methods, which are Euclidean distance (ED), Bhattacharyya distance measure (BD), Kullback-Leibler distance (KD), ratio of scatter matrices (SR), linear regression (LR), and maximum relevance minimum redundancy (mRMR), have been used for decreasing the size of feature vectors. Each of the six feature selection techniques has increased the performance of the classification. From these techniques, the combination of wavelet transform and linear regression is seen to have the best performance.
A feature reduction algorithm has been proposed by Han et al. [8] for the EEG framework. Pre-processing is performed using autoregressive coefficients. The three-dimensional EEG signals are converted to a two-dimensional matrix by compressing the feature vectors. Different techniques have been used for the ranking process, such as RFS, SSLSR, and RUFS. However, only two motor imagery tasks were considered in this work.
Jusas and Samuvel [9] classified motor imagery using combinations of feature extraction and dimensionality reduction approaches. FFT, TDP, band power, and channel variance (CV) are the feature extraction methods used by the researchers. The obtained values are combined in pairs. Different feature reduction techniques have been analysed, such as PCA, sequential selection, LPP, and LFDA. The combined techniques were observed to have higher accuracy than the individual techniques. The study compares the various combinations of techniques and recommends the FFT, CV, and PCA methods along with LS-SVM classification. Hence, this work will also combine different techniques to improve the accuracy.
Mahmoudi and Shamsi [10] proposed a method for finding the subject-specific time intervals for the classification of four-class motor imagery tasks by using the mutual information (MI) between the BCI input and output. The signal-to-noise ratio was used to compute the MI values, and the MI values were used as the feature selection criterion to select the discriminative features. The time segments and the more discriminative features were found using the training data and then applied to the evaluation data. The filter bank common spatial pattern (FBCSP) algorithm is divided into four progressive stages: filter bank, the CSP algorithm, feature selection, and classification. However, the noise and artifacts of the EEG signals were not addressed in the experiment.
Ren et al. proposed a technique that combines feature extraction and feature selection [11]. Four different feature extraction methods were used to reduce the number of features from a total of 83 features. Three different feature selection techniques were used and then compared. The Fisher score was seen to give the most accurate results. The other techniques had comparatively lower accuracy.
Rodríguez-Bermúdez et al. proposed a wrapper-based methodology [12] for selecting the features. The features used are power spectral density features along with the AAR coefficients and Hjorth parameters. The averages of these features are calculated and then compiled into a single vector. The features are then selected and subjected to different regression techniques, and the Least Angle Regression (LARS) algorithm was seen to work better than the Wilcoxon rank test.
Wang et al. [13] presented a statistical model that selects the optimal feature subset based on the Kullback-Leibler divergence measure and automatically selects the optimal subject-specific time segment. The autoregressive model and log variance are applied to the Common Spatial Patterns (CSP) for feature extraction. The extracted features are spatially and temporally correlated power spectral features. In their experiment, only binary classification was performed. In contrast, we use four-class motor imagery tasks in this research.
Yuan et al. [14] proposed FDA techniques for reducing the linear dimensionalities, where the FDA and linear discriminant analysis (LDA) were combined. A graph that is data adaptive is created with the L1 or L2 norm constraints and then merged with the LDA approach for a better analytic solution. The experimental results are implemented on different datasets to demonstrate the effectiveness. It is seen to be effective in lower dimensions and small training datasets.
Yu et al. [15] used spatial filter techniques along with PCA. It has been seen that common average reference filter performs better than other well-known spatial filter techniques. However, the feature reduction using PCA did not improve the accuracy, but the performance of the classification was maintained.
Zhang et al. combined an autoregressive model and sample entropy model [16] for extracting the features. The coefficients of these models have been used along with SVM and Radial Basis Function (RBF) for the classification. The obtained accuracy was higher than the individual autoregressive models; however, the obtained accuracies were lower than existing combined techniques.
Uktveris and Jusas [17] considered a deep learning approach based on convolutional neural network (CNN). CNN and their application to four-class motor-imagery based problem were analysed in the research. The experimental results are similar to more complex state-of-the-art EEG analysis techniques.
Dai et al. [18] proposed an approach that combines CNN and variational autoencoder (VAE) networks. Deep learning approaches were used in the experiment. A deep learning approach needs an extremely large amount of data to perform better than other methods [17]. Hence, it is difficult to obtain better results with deep learning on this task, and there is less chance of improving the classification accuracy.

Methods
In this paper, two feature extraction processes are performed to increase the number of relevant features. Using too many features leads to errors and confusion, thereby reducing the efficiency. Therefore, PCA is used for feature reduction, and CSP is used for feature decomposition. The following feature extraction methods have been combined in several ways: time domain parameters, empirical mode decomposition, fast Fourier transform, fast correlation-based filter, intrinsic time-scale decomposition, and squared Pearson's correlation.
Fisher's LDA and the Least-Squares Support Vector Machine (LS-SVM) are used for classification, as shown in Figure 1. The frequency bands are selected based on the users and subjects. The Fisher ratio of the time domain parameters is calculated to identify the dominant frequency band and timing in the signals. This is performed by applying a band-pass filter.
Combining feature extraction methods is a relatively new and promising approach to motor imagery classification. The combination of a user-specific band and a subject-specific band is a novel technique that has not previously been used with EEG. It could offer a new way to address the problem, since EEG motor imagery classification still lacks accurate solutions.

Common Spatial Patterns
CSP was first used for classifying multi-channel EEG by Ramoser [19]. It is a linear transformation that projects multi-channel EEG data into a low-dimensional spatial subspace using a projection matrix in which every row contains the weights for the channels. The transformation maximizes the variance of one class of signal matrices relative to the other. The method simultaneously diagonalizes the covariance matrices of both classes [20]. Let $X_1$ of size $(n, t_1)$ and $X_2$ of size $(n, t_2)$ be two multivariate signal windows, where $n$ is the number of signals and $t_1$ and $t_2$ are the respective numbers of samples. CSP determines the component $\mathbf{w}^T$ such that the ratio of variance between the two windows is maximized. This can be expressed as follows:
$$\mathbf{w} = \arg\max_{\mathbf{w}} \frac{\left\| \mathbf{w}^T X_1 \right\|^2}{\left\| \mathbf{w}^T X_2 \right\|^2}$$
Appl. Sci. 2019, 9, 4990 6 of 17
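The variance-ratio idea behind CSP can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `csp_filters`, the trace normalisation of the covariance matrices, and the whitening-based solution are all assumptions made for the example.

```python
import numpy as np

def csp_filters(X1, X2):
    """Return CSP spatial filters (rows of W) for two (n_channels, n_samples) windows."""
    # Trace-normalised spatial covariance of each class (normalisation is an assumption)
    C1 = X1 @ X1.T / np.trace(X1 @ X1.T)
    C2 = X2 @ X2.T / np.trace(X2 @ X2.T)
    # Whitening transform P of the composite covariance: P (C1 + C2) P^T = I
    evals, evecs = np.linalg.eigh(C1 + C2)
    P = np.diag(evals ** -0.5) @ evecs.T
    # Diagonalise the whitened class-1 covariance; class 2 shares the same eigenvectors
    d, B = np.linalg.eigh(P @ C1 @ P.T)
    order = np.argsort(d)[::-1]          # largest class-1 variance share first
    W = B[:, order].T @ P
    return W, d[order]
```

The first rows of `W` give projections with high variance for the first window and low variance for the second, and the last rows do the opposite, so a small number of rows from both ends is typically kept as spatial filters.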

Fisher Ratio
A feature is computed using the time domain parameter at each k-th window. "Along this way, whole features are obtained in all of the windowed durations of training EEG signals, and the features of each class are then ensemble averaged. Additionally, these processes are repeated, changing a frequency band to others" [21]. The frequency bands are selected as follows. Within the frequency range of 5-30 Hz, $n$ frequency points are defined, and a frequency band is delimited by two of these $n$ points. The number of candidate bands is therefore the number of 2-combinations of the frequency points, $\binom{n}{2}$. The Fisher ratio is calculated from the averaged features of the two classes in order to select the important timing and frequency band, as follows:
$$F(j, k, l) = \frac{\left( m_1(j, k, l) - m_2(j, k, l) \right)^2}{\sigma_1(j, k, l)^2 + \sigma_2(j, k, l)^2}$$
where $j$ is the filter index, $k$ is the window index, $l$ is the index of the time domain parameter, $m_i(j, k, l)$ $(i = 1, 2)$ denotes the average, and $\sigma_i(j, k, l)^2$ denotes the variance of time domain parameter $l$ of class $i$ at the $k$-th window and $j$-th filter.
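As a rough illustration of how the Fisher ratio can rank candidate band and window pairs, the sketch below scores each pair and picks the largest. The function names, the use of the population variance, and the toy values are assumptions for the example and are not taken from the paper.

```python
from statistics import mean, pvariance

def fisher_ratio(class1_values, class2_values):
    """Fisher ratio of one TDP feature for a fixed filter j, window k and parameter l."""
    m1, m2 = mean(class1_values), mean(class2_values)
    # Population variance is an assumption; the text does not name the estimator
    v1, v2 = pvariance(class1_values), pvariance(class2_values)
    return (m1 - m2) ** 2 / (v1 + v2)

def best_band_and_window(features):
    """features maps (j, k) -> (class-1 values, class-2 values); returns the top (j, k)."""
    scores = {jk: fisher_ratio(c1, c2) for jk, (c1, c2) in features.items()}
    return max(scores, key=scores.get), scores

# Toy values (hypothetical, not from the dataset)
toy = {
    (0, 0): ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),  # well separated classes
    (1, 0): ([1.0, 2.0, 3.0], [1.5, 2.5, 3.5]),  # heavily overlapping classes
}
best, scores = best_band_and_window(toy)
```

A larger ratio means the class means are far apart relative to the spread within each class, which is why the well separated pair wins in the toy data.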

FDA-Type F-Score
The FDA-type F-score is a simplified measure, based on Fisher discriminant analysis (FDA), for assessing the discriminative power of a group of features (a feature vector) [22]:
$$F = \frac{\left\| \vec{\mu}_1 - \vec{\mu}_2 \right\|_2^2}{\operatorname{tr}(\Sigma_1) + \operatorname{tr}(\Sigma_2)}$$
In the above equation, $\vec{\mu}_i$ denotes the mean of the feature vector of class $i$, $\Sigma_i$ denotes the covariance matrix of the feature vector of class $i$, and tr denotes the trace of a matrix. "Thus, FDA-type F-score depend on the Euclidean distance between class centers to evaluate the difference between classes and utilizes the trace of the covariance matrix to estimate the variance within one class". As a simplified criterion, the FDA-type F-score avoids estimating a projection direction in multi-dimensional FDA, and it has been used efficiently in two-class BCI and motor recognition studies for channel and feature selection.
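The two-class form of the FDA-type F-score described above can be sketched as follows. The helper names are hypothetical, and the population variance is assumed; the trace of a covariance matrix is computed as the sum of the per-dimension variances, so no full covariance matrix is needed. How the score is extended to more than two classes is not specified here.

```python
from statistics import mean, pvariance

def class_mean(trials):
    """Mean feature vector over trials (each trial is a list of floats)."""
    return [mean(column) for column in zip(*trials)]

def covariance_trace(trials):
    """Trace of the covariance matrix, i.e. the sum of per-dimension variances."""
    return sum(pvariance(column) for column in zip(*trials))

def fda_f_score(class1, class2):
    """Squared distance between class centres divided by the summed covariance traces."""
    mu1, mu2 = class_mean(class1), class_mean(class2)
    dist_sq = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    return dist_sq / (covariance_trace(class1) + covariance_trace(class2))

# Toy feature vectors (hypothetical)
score = fda_f_score([[0.0, 0.0], [2.0, 2.0]], [[4.0, 0.0], [6.0, 2.0]])
```

In the toy data the class centres are 4 units apart along the first dimension and each class has a covariance trace of 2, giving a score of 16 / 4.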

Feature Extraction Techniques
Many feature extraction approaches can be used to extract features from EEG signals. The extraction techniques used in this paper are TDP, FFT, and PCA.

Time Domain Parameters
Time Domain Parameters (TDP) are computed as the time-varying power of the first $k$ derivatives of the signal, as in the following equation:
$$p_i(t) = \operatorname{var}\left( \frac{d^i x(t)}{dt^i} \right), \quad i = 0, 1, \ldots, k$$
The resulting values can be smoothed using an exponential moving average window filter. Even though the TDP features are defined in the time domain, they can also be interpreted as frequency domain filters. Other available spectral approaches, such as the Fourier transform, wavelet analysis, and autoregressive spectra, can describe the rest of the spectral density function. However, this technique has the limitation that classification degrades when there are too many parameters, because training then requires a large amount of data, which raises the possibility of overfitting [23].
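A minimal sketch of the TDP computation is given below, assuming that derivatives are approximated by first-order differences and that the variance is taken over a whole window rather than tracked with a moving average. The function name and the optional logarithm are choices made for the example.

```python
import math
from statistics import pvariance

def time_domain_parameters(x, k=2, log=True):
    """Variance (optionally log-variance) of the signal and its first k discrete derivatives."""
    features = []
    derivative = list(x)
    for _ in range(k + 1):
        power = pvariance(derivative)
        # The logarithm is a common choice for power features; it fails for a constant signal
        features.append(math.log(power) if log else power)
        # First-order difference as a stand-in for the time derivative (assumption)
        derivative = [b - a for a, b in zip(derivative, derivative[1:])]
    return features

# Toy oscillation (hypothetical signal)
signal = [0.0, 1.0, 0.0, -1.0] * 8
tdp = time_domain_parameters(signal, k=2, log=False)
```

With `k=2` the result holds three values: the power of the signal itself, of its first difference, and of its second difference.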

Empirical Mode Decomposition
Empirical Mode Decomposition (EMD) is an adaptive signal analysis technique with a wide range of applications. It decomposes the signal into distinct frequency components known as Intrinsic Mode Functions (IMFs). If a clean decomposition is not possible, different mode functions will contain similar frequencies as overlapping components [24]. A component must meet two criteria to qualify as an IMF. First, the number of extrema and the number of zero crossings must be equal or differ by at most one. Second, the mean of the envelopes defined by the local maxima and the local minima must be zero at all times. EMD can decompose any signal into IMFs through a sifting process that splits the signal into narrow-band signals. The criteria are satisfied by the expression below:

Fast Fourier Transform
The Fast Fourier Transform (FFT) analyses a signal sampled over a period of time or space and splits it into its frequency components. It computes the discrete Fourier transform of a finite set of data samples. The data to be transformed are partitioned into smaller frames [9]. Each frame is transformed individually, and the result is added to a matrix. The Short-Time Fourier Transform (STFT) is a type of Fourier transform that can be represented by the following equation:
$$X(m, \omega) = \sum_{n=-\infty}^{\infty} x[n] \, w[n - m] \, e^{-j \omega n}$$
Here, $x[n]$ is the signal, $w[n]$ is the window function, $\omega$ is continuous, and $m$ is discrete. In practice, the STFT is computed using the FFT; hence, both of these variables are quantized and discrete. The FFT is consistent and robust in obtaining the most suitable features.
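The frame-by-frame transform described above can be sketched with NumPy's FFT routines. The function name, the Hann window, and the frame and hop sizes are assumptions for the example; the paper does not state which window or frame length was used.

```python
import numpy as np

def stft_magnitude(x, frame_len=64, hop=32):
    """Magnitude spectrum of each windowed frame; rows are frames, columns are frequency bins."""
    window = np.hanning(frame_len)  # Hann window (assumed, not stated in the text)
    frames = [x[start:start + frame_len] * window
              for start in range(0, len(x) - frame_len + 1, hop)]
    # One real-input FFT per frame, stacked into a matrix
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

# Toy 10 Hz sinusoid sampled at 250 Hz (the sampling rate of the dataset used later)
fs = 250
t = np.arange(500) / fs
spectrogram = stft_magnitude(np.sin(2 * np.pi * 10 * t), frame_len=250, hop=125)
```

With a frame length equal to the sampling rate, each frequency bin is 1 Hz wide, so the 10 Hz tone shows up as a peak in bin 10 of every frame.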

Fast Correlation Based Filter
FCBF is a multivariate feature selection technique that uses Symmetrical Uncertainty (SU) to calculate the dependencies between features and identifies the best subset using a backward selection technique along with a sequential search strategy. It contains internal conditions under which the process stops once the necessary criteria are satisfied. Because it is based on correlation, it generally runs faster than other subset selection techniques. Entropy and conditional entropy values are used for calculating the feature dependencies [25]. The entropy is calculated by the following expression:
$$H(X) = -\sum_{x} p(x) \log_2 p(x)$$
where $x$ is a value of the random variable $X$ and $p(x)$ is its probability.
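The entropy-based quantities that FCBF relies on can be sketched for discrete (already quantised) features as follows. The function names are hypothetical, and the symmetrical uncertainty is written in its usual form, twice the information gain divided by the sum of the two entropies.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def conditional_entropy(x, y):
    """H(X | Y): entropy of x within each group of equal y, weighted by group size."""
    n = len(x)
    total = 0.0
    for y_value, count in Counter(y).items():
        subset = [xi for xi, yi in zip(x, y) if yi == y_value]
        total += count / n * entropy(subset)
    return total

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * (H(X) - H(X | Y)) / (H(X) + H(Y)), in the range [0, 1]."""
    gain = entropy(x) - conditional_entropy(x, y)
    denominator = entropy(x) + entropy(y)
    return 2.0 * gain / denominator if denominator > 0 else 0.0
```

A value of 1 means one variable fully determines the other, and a value of 0 means they are independent, which is what lets FCBF use SU both for relevance to the class and for redundancy between features.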

Intrinsic Time-Scale Decomposition
Intrinsic time-scale decomposition (ITD) is a recently developed signal processing technique. It splits a complicated signal into multiple Proper Rotation Components (PRCs) on the basis of the local time scale of the signal characteristics. The signal is processed between local extremum points using a linear transformation; hence, more local information can be used in the process [26]. The intrinsic time-scale decomposition can be computed using the formula:
$$X_t = \mathcal{L} X_t + (1 - \mathcal{L}) X_t = L_t + H_t$$
where $\mathcal{L}$ is the baseline-extraction operator, $L_t$ is the baseline signal, and $H_t$ is a proper rotation component.

Squared Pearson's Correlation
It is the proportion of the variance in the dependent variable that can be predicted from the independent variable. It is denoted by $R^2$, and it is the square of the correlation between the actual and predicted outcomes. It is used in statistical analysis for regression and in various other types of analysis. This method independently estimates the discriminative power of every feature by calculating the square of the Pearson's correlation coefficient between the values of the $j$-th feature and the class vector [27]:
$$R_j^2 = \frac{\left( \sum_i (x_{ji} - \bar{x}_j)(y_i - \bar{y}) \right)^2}{\sum_i (x_{ji} - \bar{x}_j)^2 \, \sum_i (y_i - \bar{y})^2}$$
In the above equation, $x_{ji}$ denotes the $i$-th sample of the $j$-th feature, $y_i$ denotes the class label associated with the $i$-th sample, and the bar notation denotes the average value across all samples.
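The squared Pearson's correlation between one feature and the class labels can be sketched as below. The function name and the toy values are assumptions for the example, and numeric class labels are assumed.

```python
def squared_pearson(feature, labels):
    """Squared Pearson correlation between one feature and numeric class labels."""
    n = len(feature)
    mean_x = sum(feature) / n
    mean_y = sum(labels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(feature, labels))
    sxx = sum((x - mean_x) ** 2 for x in feature)
    syy = sum((y - mean_y) ** 2 for y in labels)
    return sxy ** 2 / (sxx * syy)

# Toy feature and two-class labels (hypothetical)
r2 = squared_pearson([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
```

Features can then be ranked by this score, with values near 1 indicating a feature that varies almost linearly with the class label.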

Principal Component Analysis
PCA is another technique that can be used for extracting features through filtering. It uses an orthogonal transformation to convert observations of correlated variables into a set of uncorrelated variables [28]. It is an unsupervised method that computes a linear mapping to a lower-dimensional representation of the original data that retains a high amount of variance. The covariance of two variables $X$ and $Y$ is obtained using the following equation:
$$\operatorname{cov}(X, Y) = E\left[ (X - E[X]) (Y - E[Y]) \right]$$
The following steps take place in PCA:
• The covariance matrix of the data points is obtained.
• The individual eigenvalues are calculated and then sorted in decreasing order.
• The first k eigenvectors are selected, which gives k dimensions.
• The original set of dimensions is modified, i.e., the dimensions are transformed into k dimensions.
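The PCA steps listed above can be sketched with NumPy as follows. The function name `pca_reduce` is hypothetical, and the sample covariance (divisor n - 1, NumPy's default) is an assumption.

```python
import numpy as np

def pca_reduce(X, k):
    """Project X of shape (n_samples, n_features) onto its top-k principal components."""
    X_centred = X - X.mean(axis=0)
    # Step 1: covariance matrix of the data points
    covariance = np.cov(X_centred, rowvar=False)
    # Step 2: eigenvalues and eigenvectors, sorted by decreasing eigenvalue
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:k]
    # Steps 3 and 4: keep the first k eigenvectors and transform the data
    return X_centred @ eigenvectors[:, order], eigenvalues[order]
```

The variance of each projected column equals the corresponding eigenvalue, so the returned eigenvalues show how much variance the k retained dimensions keep.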

Fisher's LDA
Fisher's Linear Discriminant Analysis, or just LDA, is a technique used in statistics, pattern recognition, and machine learning to find a linear combination of features that separates different classes of objects or events. This combination of features can be used as a linear classifier or for reducing the dimensionality before the final classification.
The transformation in this algorithm is based on maximizing the ratio of the between-class variance to the within-class variance, with the aim of reducing the differences within each class and increasing the differences between the classes [14]. This technique works well for multi-class problems. When there are X classes, the technique uses (X − 1) projections, with the projection vectors θ i arranged as the columns of a matrix, as expressed in the following equation:
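A minimal sketch of the multi-class projection step is given below, assuming the usual Fisher criterion in which the projection vectors are the leading eigenvectors of the within-class scatter inverse times the between-class scatter. The function name and the use of a pseudo-inverse are choices made for the example, not details from the paper.

```python
import numpy as np

def fisher_lda_projection(X, y):
    """Return a (n_features, n_classes - 1) matrix whose columns are projection vectors."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    S_w = np.zeros((n_features, n_features))  # within-class scatter
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    for c in classes:
        X_c = X[y == c]
        mu_c = X_c.mean(axis=0)
        S_w += (X_c - mu_c).T @ (X_c - mu_c)
        diff = (mu_c - overall_mean)[:, None]
        S_b += len(X_c) * (diff @ diff.T)
    # Pseudo-inverse guards against a singular within-class scatter (assumption)
    eigenvalues, eigenvectors = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigenvalues.real)[::-1][:len(classes) - 1]
    return eigenvectors[:, order].real
```

After projecting with `X @ Theta`, a simple rule such as assigning each trial to the nearest class centroid in the projected space can serve as the classifier.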

LS-SVM
The Least-Squares Support Vector Machine (LS-SVM) is a least-squares version of the SVM that analyses data and identifies patterns for classification [29]. An SVM takes data points as input and outputs a separating hyperplane, which serves as the decision boundary between the classes. Instead of the quadratic programming problem solved by conventional SVMs, the LS-SVM solves a set of linear equations; it is a kernel-based learning method [30].

Data
Dataset 2a from BCI Competition IV is used for the analysis. The data were collected from nine subjects in two sessions recorded on different days. The participants performed four different motor imagery tasks involving the imagined movement of different parts of the body, namely the hands, feet, and tongue. In total, 288 trials were performed, with 72 trials in each class in random order. The dataset contains 22 EEG channels recorded in a monopolar manner. The signals were sampled at 250 Hz, and a band-pass filter was applied to remove the low-frequency components. Figure 2 shows the structure of a single trial.

Results
The classification results are evaluated and compared using the kappa coefficient, which takes the value 0 for a random classifier and 1 for a perfect classifier. The kappa coefficient is computed using the equation below:
$$\kappa = \frac{P_0 - P_e}{1 - P_e}$$
where $P_0$ denotes the classification accuracy and $P_e$ denotes the hypothetical accuracy of a random classifier on the same data.
In the above equation, we use $P_e = 0.25$. The final performance measure of a given algorithm is the maximum kappa value over the computed time course.
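The relation between accuracy and the kappa coefficient for four balanced classes can be checked with a one-line function. The function name is hypothetical; the default chance level of 0.25 is the value stated above.

```python
def kappa_coefficient(p0, pe=0.25):
    """Kappa from classification accuracy p0 and chance-level accuracy pe."""
    return (p0 - pe) / (1.0 - pe)

# Accuracy that corresponds to the reported kappa of 0.64 when pe = 0.25
accuracy_for_reported_kappa = 0.64 * (1.0 - 0.25) + 0.25
```

Under this formula, a kappa of 0.64 corresponds to a four-class accuracy of 0.73, and chance-level accuracy of 0.25 maps to a kappa of 0.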
The Fisher ratio is used to obtain the user-specific band. After analysing the EEG signals with the time domain parameters, the most dominant frequency band and timing are identified using the Fisher ratio of the time domain parameter. The FDA-type F-score is used for the subject-specific band. Fisher's discriminant analysis is performed to identify both the dominant frequency band and timing for multi-class classification. It estimates the time and frequency areas for extracting the TDP features for each subject. The user-specific and subject-specific bands are applied individually, and they are also applied together as the proposed system. Different frequency intervals and time intervals are examined to identify the optimum intervals. To identify the best time interval, they are examined with the feature extraction techniques and with feature reduction using PCA. After this, Fisher's LDA is used as the classifier. Figure 3 shows an original EEG signal. The electroencephalogram (EEG) is the recording of the electrical activity of the brain from the scalp. The recorded waveforms reflect cortical electrical activity. EEG activity is quite small and is measured in microvolts.
Figure 4 shows the band-pass filter, which attenuates frequencies that are either too low or too high and passes frequencies within a certain range. A band-pass filter can be created by cascading a low-pass filter with a high-pass filter.
The Daubechies wavelets are a family of orthogonal wavelets that define a discrete wavelet transform. The coefficients of a one-dimensional signal are reconstructed, and they are shown in Figure 5. The brain has five distinctive categories of brain waves: gamma, theta, delta, alpha, and beta.
Figure 5. EEG brain waves.
Figure 6 shows the frequency of the brain waves. The gamma wave has its maximum frequency at 12.00 Hz, the beta wave at 6.00 Hz, the alpha wave at 3.00 Hz, the theta wave at 1.00 Hz, and the delta wave at 1.00 Hz. A band-pass filter is used for EEG signal denoising. Figure 7 shows the denoised EEG signal.
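To make the band-pass idea concrete, the sketch below removes everything outside a chosen band by zeroing FFT bins. This ideal (brick-wall) mask is an assumption chosen for brevity; the paper does not specify the filter design, and practical EEG pipelines usually use a designed filter instead. The function name and the toy signal are hypothetical.

```python
import numpy as np

def fft_bandpass(x, fs, low_hz, high_hz):
    """Keep only the frequency components of x between low_hz and high_hz."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Zero every bin outside the pass band (brick-wall response, an assumption)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

# Toy signal: a 2 Hz drift plus a 10 Hz rhythm, sampled at 250 Hz
fs = 250
t = np.arange(500) / fs
slow = np.sin(2 * np.pi * 2 * t)
fast = np.sin(2 * np.pi * 10 * t)
filtered = fft_bandpass(slow + fast, fs, low_hz=5, high_hz=30)
```

With a 5-30 Hz pass band, the 2 Hz drift is removed and the 10 Hz component is returned unchanged.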
Appl. Sci. 2019, 9, x FOR PEER REVIEW 11 of 17 The Daubechies wavelets are a family of orthogonal wavelets and identify discrete wavelet transform. The coefficients of a one-dimensional signal are reconstructed and they are shown in Figure 5. The brain has five distinctive categories of brain waves; Gamma, Theta, Delta, Alpha, and Beta brain waves. Figure 5. EEG brain waves. Figure 6 shows the frequency of brain waves. Gamma wave occurs maximum frequency at 12.00 Hz. Beta wave occurs maximum frequency at 6.00 Hz. Alpha wave occurs maximum frequency at 3.00 Hz. Theta wave occurs maximum frequency at 1.00 Hz. Delta wave occurs Maximum frequency at 1.00 Hz. Band pass filter is utilized for EEG signal denoising. Figure 7 shows the denoised EEG signal. Band pass filter is utilized for EEG signal denoising. Figure 7 shows the denoised EEG signal. Appl. Sci. 2019, 9,   This work performs a ten-fold cross validation. The TDP is utilised for identifying the optimum frequency band and timing during the training process. The band is subject to by-pass filtering to remove undesirable noises. Table 1 gives the range of the optimum frequency band and timing.  In the user specific band, the feature extraction is performed by all three discussed feature extraction techniques, which are TDP, FFT, and PCA in the BCI system. As every individual subject has different frequency and timing bands, the dominant bands must also be identified for the individual subject. Accordingly, we select the frequency band and the timing using the Fisher ratio of the time domain parameter. Hence, a band pass filter is used to eliminate the insignificant bands and only obtain the significant sections during the Event Related Synchronisation (ERS) and Event Related De-synchronisation (ERD). Hence, the frequency range is selected between 5 and 30 Hz and the selected points are 5,8,12, 14, 20, 24, and 30. 
For the subject-specific band with the FDA-type F-score, the F-score is computed over all ranges of the frequency-time region to identify the optimal parameters, namely those with the maximum F-score. Once the optimal time-frequency area is identified for each individual test subject, the subject-specific TDP features are extracted from it to train the classifier. Multi-class LDA is used for the classification of the subject-specific results. The frequency range is selected between 8 and 30 Hz.
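As a rough illustration of the joint time-frequency selection, the sketch below scores every candidate (band, window) pair with an F-score and keeps the maximum. The exact F-score formula is not given in this section, so one common FDA-type form is assumed: the summed squared distances of the class means from the overall mean, divided by the summed unbiased within-class variances. The sub-bands, time windows, and feature matrices are hypothetical, synthetic placeholders.

```python
import numpy as np

def f_score(features, labels):
    """Assumed FDA-type F-score for multi-class data.

    Numerator: squared distance of each class mean from the overall mean.
    Denominator: unbiased within-class variance, summed over classes and features.
    """
    overall = features.mean(axis=0)
    numerator, denominator = 0.0, 0.0
    for c in np.unique(labels):
        xc = features[labels == c]
        numerator += np.sum((xc.mean(axis=0) - overall) ** 2)
        denominator += np.sum(xc.var(axis=0, ddof=1))
    return numerator / denominator

def select_time_frequency(candidates, labels):
    """`candidates` maps (band, window) to a feature matrix (n_trials, n_features)."""
    scores = {key: f_score(x, labels) for key, x in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Synthetic four-class example with hypothetical sub-bands and windows.
rng = np.random.default_rng(1)
labels = np.repeat(np.arange(4), 20)
bands = [(8, 12), (12, 20), (20, 30)]   # hypothetical sub-bands within 8-30 Hz
windows = [(0.5, 2.5), (1.0, 3.0)]      # hypothetical windows in seconds
candidates = {
    (band, window): rng.normal(size=(labels.size, 3))
    for band in bands
    for window in windows
}
# Inject class-dependent means into one candidate so that it is discriminative.
candidates[((8, 12), (1.0, 3.0))] += labels[:, None]

best, scores = select_time_frequency(candidates, labels)
print("best (band, window):", best)
```

In practice each feature matrix would hold the TDP features extracted from the training trials after filtering to that band and cropping to that window.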
After the two bands are evaluated individually, they are combined to implement the proposed approach. For every subject, various time intervals are subjected to the feature extraction techniques, FFT and TDP, and then to the feature reduction technique, PCA. After this, the classification techniques, Fisher's LDA and LS-SVM, are applied. The number of frequency bands is counted to identify which classifier is optimal and has the higher accuracy. From the results, it is seen that the proposed combination of feature extraction techniques performs better than each technique individually. The frequency ranges for the user-specific band and the subject-specific band are identified and separated by applying the Discrete Fourier Transform (DFT) and a band-pass filter within the time interval. The analysis is statistically performed for both bands, and the points shown in Table 2 are selected as optimum. The best time interval has to be identified individually for each subject, and it is compared across the different frequency intervals for both bands. Additionally, LDA is also performed for the same frequency intervals in order to identify the optimum number of layers and epochs.

The traditional feature extraction methods, TDP, PCA, R², FCBF, EMD, ITD, CV, and FFT, are compared in different combinations and tabulated in Table 3. TDP and PCA are held constant while the other techniques (R², FCBF, EMD, ITD, CV, and FFT) are varied, with Fisher's LDA as the classifier. TDP is constant because we use the Fisher ratio of TDP, and PCA is constant because it is one of the most widely applied methods for feature reduction. The highest average kappa coefficient obtained is 0.56, for the combination of TDP, FFT, and PCA. From Table 3, it is observed that these combinations do not deliver high accuracy.
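The chain of concatenated TDP and FFT features, PCA reduction, and LDA classification under ten-fold cross validation can be sketched as follows. This is a minimal NumPy illustration on synthetic features, not the authors' code: PCA is computed by SVD, the classifier is the shared-covariance Gaussian form of LDA (assumed here as a stand-in for Fisher's LDA), and the feature dimensions, number of retained components, and all function names are hypothetical.

```python
import numpy as np

def pca_fit(x, n_components):
    """Return the training mean and the leading principal directions."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(x, mean, components):
    return (x - mean) @ components.T

def lda_fit(x, y):
    """Multi-class LDA with a pooled covariance matrix."""
    classes = np.unique(y)
    means = np.stack([x[y == c].mean(axis=0) for c in classes])
    centered = x - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / (len(x) - len(classes))
    cov += 1e-6 * np.eye(x.shape[1])            # small ridge for numerical stability
    weights = np.linalg.solve(cov, means.T).T   # one weight vector per class
    priors = np.array([np.mean(y == c) for c in classes])
    bias = -0.5 * np.sum(weights * means, axis=1) + np.log(priors)
    return classes, weights, bias

def lda_predict(x, model):
    classes, weights, bias = model
    return classes[np.argmax(x @ weights.T + bias, axis=1)]

def cross_validate(x, y, n_components, n_folds=10, seed=0):
    """K-fold accuracy; PCA and LDA are refit on the training part of each fold."""
    order = np.random.default_rng(seed).permutation(len(y))
    correct = 0
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        mean, comps = pca_fit(x[train_idx], n_components)
        model = lda_fit(pca_transform(x[train_idx], mean, comps), y[train_idx])
        pred = lda_predict(pca_transform(x[test_idx], mean, comps), model)
        correct += np.sum(pred == y[test_idx])
    return correct / len(y)

# Synthetic stand-ins for per-trial TDP and FFT feature vectors (four classes).
rng = np.random.default_rng(2)
labels = np.repeat(np.arange(4), 40)
tdp_feats = rng.normal(size=(160, 6)) + 2.0 * rng.normal(size=(4, 6))[labels]
fft_feats = rng.normal(size=(160, 10)) + 2.0 * rng.normal(size=(4, 10))[labels]
features = np.hstack([tdp_feats, fft_feats])  # combined feature vector

accuracy = cross_validate(features, labels, n_components=5)
print("ten-fold accuracy:", accuracy)
```

Fitting PCA inside each fold, rather than once on all trials, keeps the test trials out of the dimensionality reduction step, which avoids an optimistic bias in the cross-validated accuracy.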
The user-specific band is evaluated with the combination of TDP, FFT, and PCA using LS-SVM and Fisher's LDA individually. For the LS-SVM classifier, the average kappa coefficient over all nine subjects is 0.57, whereas, for Fisher's LDA, it is 0.60. The FDA-type F-score band is likewise evaluated with the combination of TDP, FFT, and PCA using Fisher's LDA, and an average kappa coefficient of 0.58 is obtained. The proposed technique applies the combination of TDP, FFT, and PCA to both the user-specific and FDA-type F-score bands, using the Fisher's LDA classifier. With the proposed technique, the kappa coefficient is 0.64, which is significantly higher than those of the traditional approaches. This comparison is tabulated in Table 4 and represented in Figure 8.

Table 5 shows a comparison of the results of the proposed method and other competitive methods. The proposed technique, which applies the combination of TDP, FFT, and PCA to the bands selected by both the Fisher ratio and the FDA-type F-score and classifies with Fisher's LDA, achieves an average kappa coefficient of 0.64. The combination of FFT, CV, and PCA with the LS-SVM classifier achieved a kappa coefficient of 0.56 [9]. The FBCSP algorithm, which was used to optimize subject-specific frequency bands for feature extraction with the CSP algorithm, achieved an average kappa coefficient of 0.63 [10]. The combination of CNN and VAE methods achieved a kappa coefficient of 0.56 [18]. The FBCSP algorithm that employed the MIRSR feature selection algorithm yielded a kappa coefficient of 0.57 [31]. The multiple discriminant analysis feature with the SVM classifier achieved a kappa coefficient of 0.55 [32]. The autoregressive feature with the LDA classifier achieved a kappa coefficient of 0.52 [33]. As a result, the proposed method offers a very satisfactory classification performance in comparison to the state-of-the-art methods. The average result obtained by the proposed method is 0.64, which is higher than 0.63 [10] and much higher than the other results that were obtained on the BCI competition data (Table 5).
Our proposed approach outperforms the CNN method [17,18]; for [17], the result was converted to a kappa value. Thus, applying the combination of TDP, FFT, and PCA to the frequency bands selected by the Fisher ratio and the FDA-type F-score, along with Fisher's LDA, yielded the most efficient results.
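Since all methods are compared by their kappa coefficient, a short sketch of how that value relates to classification accuracy may be useful. Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The accuracy-to-kappa conversion below assumes a balanced four-class problem with a chance level of 0.25; whether the conversion applied to [17] used exactly this assumption is not stated in the text. The confusion matrix is made up for illustration.

```python
import numpy as np

def cohen_kappa(confusion):
    """Cohen's kappa from a confusion matrix (rows: true class, columns: predicted)."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    observed = np.trace(confusion) / total
    expected = np.sum(confusion.sum(axis=0) * confusion.sum(axis=1)) / total ** 2
    return (observed - expected) / (1.0 - expected)

def accuracy_to_kappa(accuracy, n_classes=4):
    """Conversion that holds when classes and predictions are balanced."""
    chance = 1.0 / n_classes
    return (accuracy - chance) / (1.0 - chance)

# Illustrative balanced confusion matrix: 73 of 100 trials correct in every class.
confusion = np.full((4, 4), 9)
np.fill_diagonal(confusion, 73)

kappa = cohen_kappa(confusion)
print("kappa from confusion matrix:", round(kappa, 2))
print("kappa from accuracy 0.73:", round(accuracy_to_kappa(0.73), 2))
```

Under this balanced assumption, a kappa of 0.64 corresponds to a four-class accuracy of 0.73, and a kappa of 0 corresponds to chance-level accuracy of 0.25.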

Conclusions
In this paper, we present a novel method that is based on the user-specific band and the subject-specific band to select the frequency band and time segment for the classification of four-class motor imagery tasks. The method of common spatial patterns (CSP) was used for pre-processing of the signal; it reduced the number of channels from 22 to eight for data set 2a from the BCI Competition IV. Different time intervals were examined with the CSP, TDP, FFT, and PCA feature extraction methods and the Fisher's linear discriminant analysis (LDA) classifier to find the best time interval for each subject who performed motor imagery. We counted the number of frequency bands in which the Fisher's LDA classifier had the best accuracy rates in order to show that the Fisher ratio and FDA-type F-score bands are effective in improving performance. It is seen that, most of the time, a combination of the Fisher ratio and FDA-type F-score bands performs better than either the Fisher ratio bands or the FDA-type F-score bands alone. Different feature extraction techniques, namely TDP, R², PCA, FCBF, EMD, ITD, CV, and FFT, are compared in this paper. Among the different combinations of feature extraction techniques, it is observed that combining the user-specific band and the subject-specific band improves the accuracy for the combination of time domain parameters, fast Fourier transform, and principal component analysis. The combination of these algorithms significantly increased the accuracy compared to the individual approaches. Other combinations of the feature extraction techniques were also executed and compared with the proposed approach. While the other standard combinations achieved kappa coefficients between 0.50 and 0.60, the proposed algorithm achieved a kappa coefficient of 0.64, which is significantly higher than the other approaches.
The novelty of the approach lies in the combination of the user-specific band and the subject-specific band with two feature extraction methods. In the future, other combinations of feature extraction and classification approaches will be investigated to further improve the accuracy.
Author Contributions: V.J. designed the task, supervised research, analyzed the results, provided feedback and gave new ideas, revised the draft and approved the final version of the article. S.G.S. implemented test software, executed experimental work, analyzed the results and did revisions to the final article.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.