EEkNN: k-Nearest Neighbor Classifier with an Evidential Editing Procedure for Training Samples

The k-nearest neighbor (kNN) rule is one of the most popular classification algorithms, applied in many fields because it is simple to understand and easy to implement. However, one of the major problems encountered in using the kNN rule is that all of the training samples are considered equally important in the assignment of the class label to the query pattern. In this paper, an evidential editing version of the kNN rule is developed within the framework of belief function theory. The proposal is composed of two procedures. An evidential editing procedure is first proposed to reassign the original training samples with new labels represented by an evidential membership structure, which provides a general representation model for the class membership of the training samples. After editing, a classification procedure specifically designed for evidently edited training samples is developed in the belief function framework to handle the more general situation in which the edited training samples carry dependent evidential labels. Three synthetic datasets and six real datasets collected from various fields were used to evaluate the performance of the proposed method. The reported results show that the proposal achieves better performance than the other kNN-based methods considered, especially for datasets with high imprecision ratios.


Introduction
Classification of patterns is an important area of research and practical application in a variety of fields, including biology [1], psychology [2], medicine [3], electronics [4], marketing [5], military affairs [6], etc. In the past several decades, a wide variety of approaches have been developed for this task [7]. As a type of lazy learning algorithm, the k-nearest neighbor (kNN) rule introduced by Fix and Hodges [8] has been one of the most popular and successful pattern classification techniques due to its simplicity and validity. The basic idea of the kNN rule is that patterns close in feature space are likely to belong to the same class. Though the kNN rule is suboptimal, it has been shown that, as k increases, its error rate approaches the optimal Bayes error rate asymptotically in the infinite-sample situation [9].
However, in practical cases with a finite number of samples, the classical kNN rule is not always the optimal way of utilizing the information contained in the neighborhood of query patterns, and therefore a large number of research works have focused on improving this rule over the past 60 years [10][11][12][13][14][15]. One of the major concerns when using the kNN rule is that all of the training samples are considered equally important for assigning the class label of the query pattern. This limitation causes great difficulty for classification in regions where the samples from different classes overlap.
Atypical samples in overlapping regions may be assigned as much weight as those that are truly representative of the clusters. Furthermore, it may be argued that heavily noisy training samples should not be given equal weight. In order to overcome this difficulty, many editing procedures have been proposed to preprocess the original training samples and then to perform classification based on the edited training set [16][17][18][19][20][21][22][23][24][25][26][27][28][29].
Based on the different structures of the edited labels, the editing procedures can be divided into two groups: crisp editing and soft editing. An editing procedure was first developed by Wilson [17] to preprocess the training samples. In this procedure, a training sample x_i is classified using the kNN rule with the remainder of the training set and is then deleted from the original training set if its original label does not agree with the classification result. Many others followed Wilson's work and proposed variants [18][19][20][21][22]. One representative is the generalized editing procedure developed by Koplowitz and Brown [19], which aims to overcome the limitation that large numbers of samples are removed from the training set. In their work, instead of deleting all the conflicting samples as in Wilson's work, if a particular class (excluding the original class) has at least k′ ((k + 1)/2 ≤ k′ ≤ k) representatives among the k nearest neighbors, then x_i is relabeled according to that majority class. Essentially, both Wilson's editing and its variants are crisp editing procedures, in which each edited sample is either removed or assigned to a single class. In order to overcome the weakness of the crisp editing methods, a fuzzy editing procedure was then proposed to reassign a fuzzy membership to each training sample x_i based on its k nearest neighbors [25]. Several different realizations of this fuzzy editing procedure have also been developed [26][27][28]. As a type of soft editing procedure, fuzzy editing makes it possible for each edited sample to be assigned to several classes with different fuzzy memberships, which provides more detailed information about the samples' membership than the crisp editing procedures.
In real-world classification problems, different types of uncertainty may coexist due to the environment or other interference factors; e.g., fuzziness may coexist with imprecision. The fuzzy editing procedure, developed based on fuzzy set theory [30], cannot address imprecise or partial information effectively in the modeling and reasoning processes. In contrast, belief function theory [31][32][33], also known as Dempster-Shafer theory or evidence theory, offers a well-founded and effective framework to represent and combine a variety of uncertain information. This theory has already been used in kNN-based classification [34][35][36][37][38][39]. In [34], an evidential version of kNN, called EkNN, was proposed, introducing an ignorance class to model the uncertainty. This classification method was further extended to deal with uncertainty using a rejection class and meta-classes in [37]. In [38], Dempster's rule of combination used in EkNN was replaced by a class of parametric combination rules. However, neither the EkNN method nor its extensions consider any editing procedure in the classification process. Recently, an editing procedure for multi-label classification was developed in [29] based on belief function theory, but it is essentially a crisp editing procedure, as each edited sample is assigned a single set of classes without considering the membership degrees.
In this paper, an evidential editing version of the kNN classifier (EEkNN) is proposed based on belief function theory. (A preliminary version of some of the ideas introduced here was presented in [40,41]. The present paper is a deeply revised and extended version of this work, with several new results.) The proposed EEkNN classifier is composed of two procedures: evidential editing of the original training samples and classification based on the evidently edited training samples. First, an evidential editing procedure is developed to reassign the original training samples with new labels represented by an evidential membership structure. Compared with crisp labels or fuzzy memberships, the evidential membership provides more expressiveness to represent the imprecision and uncertainty of samples situated in overlapping regions or affected by strong noise. After the editing procedure, a kNN classification procedure specifically designed for evidently edited training samples is developed in the belief function framework. This classification procedure can handle the more general situation in which the edited training samples carry dependent evidential labels.
The rest of this paper is organized as follows. In Section 2, the basics of the belief function theory are recalled. Then, the evidential editing procedure is developed in Section 3. After that, the classification procedure is designed and realized based on the edited training samples in the belief function framework in Section 4. Section 5 provides several experiments to test the proposed method. Finally, Section 6 concludes the paper. To facilitate reading, Table 1 gives a list of the symbols used and their definitions.

Basics of the Belief Function Theory
In belief function theory [31][32][33], a problem domain is represented by a finite set Ω = {ω_1, ω_2, · · · , ω_M} of mutually exclusive and exhaustive hypotheses called the frame of discernment. A mass function expressing the belief committed to the elements of 2^Ω by a given source of evidence is a mapping m: 2^Ω → [0, 1], such that:

$m(\emptyset) = 0 \quad \text{and} \quad \sum_{A \subseteq \Omega} m(A) = 1.$ (1)

Elements A ⊆ Ω having m(A) > 0 are called the focal sets of the mass function m. The mass function has several special cases that encode different types of information. A mass function is said to be:

• Bayesian, if all of its focal sets are singletons. In this case, the mass function reduces to a classical probability distribution.
• categorical, if the whole mass is allocated to one focal set A. This indicates that the truth lies in A with certainty.
• certain, if the whole mass is allocated to a unique singleton. This indicates that we have complete knowledge about the truth.
• vacuous, if the whole mass is allocated to Ω. This situation corresponds to complete ignorance.
• simple, if it has at most two focal sets, one of which is Ω when it has two. It is usually denoted as A^w, where A is the focal set different from Ω and 1 − w is the confidence that the truth lies in A.
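As an aside not present in the original paper, the following minimal Python sketch illustrates one possible way to represent mass functions (as dictionaries mapping focal sets to masses) and to check the special cases listed above; all names and the chosen frame Ω = {w1, w2, w3} are our own illustrative assumptions.

```python
# Illustrative sketch (not from the paper): a mass function is stored as a dict
# mapping frozensets of class labels (focal sets) to masses that sum to one.

OMEGA = frozenset({"w1", "w2", "w3"})  # frame of discernment (assumed for illustration)

def is_valid(m, tol=1e-9):
    """All masses lie in [0, 1] and sum to one."""
    return all(0.0 <= v <= 1.0 for v in m.values()) and abs(sum(m.values()) - 1.0) < tol

def focal_sets(m):
    return [A for A, v in m.items() if v > 0]

def is_bayesian(m):
    return all(len(A) == 1 for A in focal_sets(m))

def is_categorical(m):
    return len(focal_sets(m)) == 1

def is_certain(m):
    F = focal_sets(m)
    return len(F) == 1 and len(F[0]) == 1

def is_vacuous(m):
    return focal_sets(m) == [OMEGA]

def is_simple(m):
    F = focal_sets(m)
    return len(F) <= 2 and (len(F) < 2 or OMEGA in F)

m_bayesian = {frozenset({"w1"}): 0.7, frozenset({"w2"}): 0.3}
m_simple = {frozenset({"w1"}): 0.8, OMEGA: 0.2}   # {w1}^0.2 in the A^w notation
assert is_valid(m_bayesian) and is_valid(m_simple)
print(is_bayesian(m_bayesian), is_simple(m_simple), is_certain(m_simple))  # True True False
```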
After representing the available pieces of evidence as mass functions, the next step is to combine these mass functions into a single one for decision making. Many combination rules have been developed. The differences among them mainly depend on two issues: the dependence and the conflict among the available pieces of evidence.
Dempster's rule is the most popular choice to combine several distinct pieces of evidence [31]. Its combination of two mass functions m_1 and m_2 defined on the same frame of discernment Ω is:

$(m_1 \oplus m_2)(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)}, \quad \forall A \subseteq \Omega,\ A \neq \emptyset,$ (2)

with $(m_1 \oplus m_2)(\emptyset) = 0$. To combine mass functions induced by nondistinct pieces of evidence, a cautious rule and, more generally, a family of parameterized t-norm-based rules were proposed in [42]:

$m_1 \circledast_s m_2 = \mathop{\oplus}_{\emptyset \neq A \subset \Omega} A^{\,w_1(A)\, \top_s\, w_2(A)},$ (3)

where m_1 and m_2 are separable mass functions, such that $m_i = \mathop{\oplus}_{\emptyset \neq A \subset \Omega} A^{\,w_i(A)}$ with weight functions $w_i(A) \in (0, 1]$, i = 1, 2. The operator $\top_s$ denotes Frank's parameterized family of t-norms:

$a \top_s b = \begin{cases} a \wedge b, & \text{if } s = 0, \\ a\, b, & \text{if } s = 1, \\ \log_s\!\left(1 + \frac{(s^a - 1)(s^b - 1)}{s - 1}\right), & \text{otherwise}, \end{cases}$ (4)

for all a, b ∈ [0, 1], with s being a positive parameter. When s = 0, the t-norm-based rule reduces to the cautious rule, and when s = 1, it reduces to Dempster's rule. For the above combination rules, it is assumed that the pieces of evidence to be combined are fully reliable. However, when this assumption fails, there may exist large conflicts among the pieces of evidence, in which case the performance of the above combination rules degrades greatly. Dubois and Prade [43] proposed an alternative rule for the combination of conflicting pieces of evidence:

$(m_1 \circledcirc m_2)(A) = \sum_{B \cap C = A} m_1(B)\, m_2(C) + \sum_{B \cap C = \emptyset,\ B \cup C = A} m_1(B)\, m_2(C), \quad \forall A \subseteq \Omega,\ A \neq \emptyset,$ (5)

with $(m_1 \circledcirc m_2)(\emptyset) = 0$. This rule boils down to Dempster's rule when there is no conflict between the two combined pieces of evidence.
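For illustration only (not part of the original paper), the following Python sketch implements Dempster's rule and Dubois-Prade's rule for two mass functions in the dictionary representation of the previous sketch; the function names are our own.

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule: conjunctive combination followed by normalization of the conflict."""
    combined, conflict = {}, 0.0
    for (A, v1), (B, v2) in product(m1.items(), m2.items()):
        if A & B:
            combined[A & B] = combined.get(A & B, 0.0) + v1 * v2
        else:
            conflict += v1 * v2
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

def dubois_prade(m1, m2):
    """Dubois-Prade's rule: each conflicting product mass is transferred to the union."""
    combined = {}
    for (A, v1), (B, v2) in product(m1.items(), m2.items()):
        target = A & B if A & B else A | B
        combined[target] = combined.get(target, 0.0) + v1 * v2
    return combined

omega = frozenset({"w1", "w2", "w3"})
m1 = {frozenset({"w1"}): 0.8, omega: 0.2}
m2 = {frozenset({"w2"}): 0.6, omega: 0.4}
print(dempster(m1, m2))
print(dubois_prade(m1, m2))  # the conflicting mass 0.48 stays on {w1, w2}
```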
For decision making, Smets [33] proposed the pignistic transformation to transform a mass function into a probability function:

$BetP(\omega) = \sum_{\{X \subseteq \Omega \,:\, \omega \in X\}} \frac{m(X)}{|X|}, \quad \forall \omega \in \Omega,$ (6)

where |X| is the cardinality of the set X.
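As a further illustrative sketch (our own, not the paper's), the pignistic transformation and the associated decision rule can be written as follows, using the dictionary representation introduced above.

```python
def pignistic(m):
    """Pignistic transformation BetP: the mass of each focal set is shared
    equally among the singletons it contains (m is assumed normal, m(empty) = 0)."""
    betp = {}
    for A, v in m.items():
        for w in A:
            betp[w] = betp.get(w, 0.0) + v / len(A)
    return betp

m = {frozenset({"w1"}): 0.5,
     frozenset({"w1", "w2"}): 0.3,
     frozenset({"w1", "w2", "w3"}): 0.2}
betp = pignistic(m)
decision = max(betp, key=betp.get)   # class with the maximum pignistic probability
print(betp, decision)                # w1 gets 0.5 + 0.15 + 0.2/3 ~= 0.717
```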

Evidential Editing Procedure for Training Samples
Let us consider an M-class classification problem over a predefined frame of discernment Ω = {ω_1, · · · , ω_M}. Assuming that a set of N labeled training samples T = {(x_1, ω^(1)), · · · , (x_N, ω^(N))} with input vectors x_i ∈ R^P and class labels ω^(i) ∈ Ω is available, the editing procedure aims to generate a new edited training set T′ that is more powerful than the original one for classification. In this section, we develop an evidential editing procedure for training samples in the belief function framework. First, in Section 3.1, an evidential membership structure is introduced as a general representation model for class membership. Then, in Section 3.2, an evidential editing algorithm is proposed to edit the training samples based on the evidential membership structure.

Evidential Membership Structure
The purpose of the evidential editing procedure is to assign to each sample in the training set T a new soft label represented by an evidential membership structure:

$T' = \{(x_1, m_1), (x_2, m_2), \cdots, (x_N, m_N)\},$ (7)

where m_i, i = 1, 2, · · · , N, are mass functions defined on the frame of discernment Ω.
The above evidential membership modeled by the mass function m_i provides a general representation model regarding the class membership of sample x_i:
• when m_i is a Bayesian mass function, the evidential membership reduces to the fuzzy membership as a special case;
• when m_i is a categorical mass function, the evidential membership reduces to the crisp set of labels as defined in [29];
• when m_i is a certain mass function, the evidential membership reduces to the crisp label;
• when m_i is a vacuous mass function, the sample x_i is useless for classification and can be considered as an outlier.

Example 1. Let us consider a set of N = 5 training samples x_1, · · · , x_5 with evidential labels m_1, · · · , m_5.
The mass functions for each sample are given in Table 2. They illustrate various situations: the case of sample x_1 corresponds to the situation of probabilistic uncertainty (m_1 is Bayesian), whereas the case of sample x_2 corresponds to the situation of imprecision (m_2 is categorical); the class of sample x_3 is known with precision and certainty (m_3 is certain), whereas the class of sample x_4 is completely unknown (m_4 is vacuous); finally, the mass function m_5 models the general situation in which the class of sample x_5 is both imprecise and uncertain.
Table 2. Example of the evidential membership.
As illustrated in the above example, the evidential membership is a powerful model to represent the imprecise and uncertain information existing in the training samples. In the following part, we will study how to edit each training sample with the evidential membership.

Evidential Editing Algorithm
For each training sample x_i, i = 1, 2, · · · , N, we denote the leave-it-out training set as T_i = T \ {(x_i, ω^(i))}. We now show how the evidential editing procedure works for one training sample x_i based on the other samples contained in T_i. The evidence modeling method developed in [34] is used here to generate a mass function for each neighbor x_j regarding the class membership of x_i:

$m_i(\{\omega_q\} \mid x_j) = \alpha\, \phi_q(d(x_i, x_j)), \quad m_i(\Omega \mid x_j) = 1 - \alpha\, \phi_q(d(x_i, x_j)),$ (8)

where d(x_i, x_j) is the distance between x_i and x_j, ω_q is the class label of x_j (i.e., ω^(j) = ω_q), and α is a parameter such that 0 < α < 1.
A recommended value of α = 0.95 can be used to obtain good results on average, and a good choice for φ_q is:

$\phi_q(d) = \exp(-\gamma_q\, d^2),$ (9)

where γ_q is a positive parameter associated with class ω_q; heuristically, it can be set to the inverse of the mean squared distance between training samples belonging to class ω_q. Based on the distance d(x_i, x_j), we first select the k_edit nearest neighbors of x_i in the training set T_i and construct the corresponding k_edit mass functions as above. These k_edit mass functions are then combined to form a resulting mass function m_i, synthesizing the final evidential membership regarding the class of x_i. Considering the different degrees of conflict among the constructed mass functions, we developed a hierarchical combination process that is carried out at two levels: intra-class combination and inter-class combination.
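To make the evidence model of Equations (8) and (9) concrete, here is a small illustrative Python sketch (our own; it assumes Euclidean distances, numeric feature vectors stored in NumPy arrays, and a fallback value of γ_q for classes represented by a single sample).

```python
import numpy as np

ALPHA = 0.95  # recommended value of the alpha parameter

def gamma_per_class(X, y):
    """Heuristic gamma_q: inverse of the mean squared pairwise distance
    between training samples belonging to class q (fallback 1.0 if undefined)."""
    gammas = {}
    for q in np.unique(y):
        Xq = X[y == q]
        d2 = [np.sum((a - b) ** 2) for i, a in enumerate(Xq) for b in Xq[i + 1:]]
        gammas[q] = 1.0 / np.mean(d2) if d2 else 1.0
    return gammas

def neighbor_mass(x_i, x_j, label_j, gammas, classes):
    """Mass function induced by a neighbor x_j about the class of x_i:
    a simple mass function supporting {label_j}, with the remaining mass on Omega."""
    omega = frozenset(classes)
    support = ALPHA * np.exp(-gammas[label_j] * np.sum((x_i - x_j) ** 2))
    return {frozenset({label_j}): support, omega: 1.0 - support}
```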
At the first level, we consider the combination of mass functions derived from the neighbors with the same class label. As all the mass functions to be combined support the same class, there is no conflict among them. Besides, as the training samples are usually collected independently, the items of evidence from different neighbors are independent. In this case, Dempster's rule is a good choice for its effectiveness and simplicity. Denoting by Ψ_i^q the set of those of the k_edit nearest neighbors of x_i belonging to class ω_q and assuming that Ψ_i^q is not empty, the intra-class combination of the mass functions derived from the neighbors with class label ω_q is given by:

$m_i(\cdot \mid \Psi_i^q) = \mathop{\oplus}_{x_j \in \Psi_i^q} m_i(\cdot \mid x_j).$ (10)

As shown in Equation (8), all the mass functions to be combined are simple. Thanks to this particular structure, the computational burden of Dempster's rule can be greatly reduced, and the above intra-class combination can be further formulated analytically as:

$m_i(\{\omega_q\} \mid \Psi_i^q) = 1 - \prod_{x_j \in \Psi_i^q} \big(1 - \alpha\, \phi_q(d(x_i, x_j))\big), \quad m_i(\Omega \mid \Psi_i^q) = \prod_{x_j \in \Psi_i^q} \big(1 - \alpha\, \phi_q(d(x_i, x_j))\big).$ (11)

If Ψ_i^q is an empty set, then m_i(· | Ψ_i^q) is simply a vacuous mass function satisfying m_i(Ω | Ψ_i^q) = 1.
After the intra-class combination of the mass functions derived from the neighbors belonging to each class, at the second level, we combine these sub-combination results to get a global combination result as the final evidential membership regarding the class of x_i. As these sub-combination results support different classes, large conflicts may exist among them. In this case, Dubois-Prade's rule is a good alternative combination method. However, when the number of classes is large, Dubois-Prade's combination of all the sub-combination results generates a great number of focal sets (as many as 2^M − 1), which results in excessive imprecision for the edited label. Therefore, at the inter-class combination level, if there is more than one sub-combination result having a non-zero mass on its supported class, we only combine the two having the largest such masses:

$m_i = m_i(\cdot \mid \Psi_i^{q_1}) \circledcirc m_i(\cdot \mid \Psi_i^{q_2}),$ (12)

where

$q_1 = \arg\max_{q}\, m_i(\{\omega_q\} \mid \Psi_i^q), \qquad q_2 = \arg\max_{q \neq q_1}\, m_i(\{\omega_q\} \mid \Psi_i^q).$

Noting that the sub-combination results shown in Equation (11) are also simple mass functions, the above inter-class combination can be further formulated analytically as:

$m_i(\{\omega_{q_1}\}) = m_i(\{\omega_{q_1}\} \mid \Psi_i^{q_1})\, m_i(\Omega \mid \Psi_i^{q_2}), \quad m_i(\{\omega_{q_2}\}) = m_i(\Omega \mid \Psi_i^{q_1})\, m_i(\{\omega_{q_2}\} \mid \Psi_i^{q_2}),$
$m_i(\{\omega_{q_1}, \omega_{q_2}\}) = m_i(\{\omega_{q_1}\} \mid \Psi_i^{q_1})\, m_i(\{\omega_{q_2}\} \mid \Psi_i^{q_2}), \quad m_i(\Omega) = m_i(\Omega \mid \Psi_i^{q_1})\, m_i(\Omega \mid \Psi_i^{q_2}).$ (13)

If there is only one sub-combination result having a non-zero mass on its supported class, then m_i is simply equal to m_i(· | Ψ_i^{q_1}). Algorithm 1 shows the pseudocode of the evidential editing algorithm.

Algorithm 1 Evidential editing algorithm.
Require: the original training set T = {(x_1, ω^(1)), · · · , (x_N, ω^(N))} with x_i ∈ R^P and ω^(i) ∈ {ω_1, · · · , ω_M}; the number of nearest neighbors k_edit
1: Initialize T′ ← ∅;
2: for i = 1 to N do
3:   Find the k_edit nearest neighbors of x_i in T_i = T \ {(x_i, ω^(i))};
4:   Generate a mass function m_i(· | x_j) for each neighbor x_j using Equations (8) and (9);
5:   for q = 1 to M do
6:     Combine the mass functions derived from the neighbors belonging to class ω_q to get the sub-combination result m_i(· | Ψ_i^q) using Equation (11);
7:   end for
8:   Combine the two sub-combination results with the largest masses on their supported classes to get the final evidential membership m_i using Equation (13);
9:   T′ ← T′ ∪ {(x_i, m_i)};
10: end for
11: return the edited training set T′

Example 2. Figure 1 illustrates a simplified three-class classification example in the two-dimensional plane. A total of thirteen training samples was collected, with x_1–x_5 belonging to class ω_1, x_6–x_9 belonging to class ω_2, and x_10–x_13 belonging to class ω_3. We consider the evidential editing process for sample x_1 based on the information from the other samples. In this example, the number of nearest neighbors k_edit was set to five. Based on the Euclidean distance, the five samples x_3, x_5, x_6, x_8, x_12 were selected, and the corresponding five mass functions regarding the class membership of x_1 were constructed using Equations (8) and (9). These mass functions were then combined at two levels sequentially. At the intra-class combination level, the mass functions derived from the neighbors with the same class label were combined using Equation (11), yielding the sub-combination results m_1(· | {x_3, x_5}), m_1(· | {x_6, x_8}), and m_1(· | {x_12}). Next, at the second level, these sub-combination results were combined into a global one. In this step, only the two mass functions with the largest masses on their supported classes, i.e., m_1(· | {x_3, x_5}) and m_1(· | {x_6, x_8}), were combined using Equation (13) to obtain the final evidential membership regarding the class of x_1. The focal set {ω_1, ω_2} obtained the largest mass, which indicates that the sample x_1 had a great chance of being in the overlapping region of class ω_1 and class ω_2, consistent with the actual situation.
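The following illustrative sketch (ours, not the paper's reference implementation) mirrors Algorithm 1 for a single training sample, reusing `neighbor_mass`, `gamma_per_class`, and `dubois_prade` from the previous sketches; the intra-class step uses the closed form of Equation (11) and the inter-class step uses Dubois-Prade's rule as in Equations (12) and (13).

```python
import numpy as np

def evidential_edit_one(i, X, y, gammas, classes, k_edit=5):
    """Evidential editing of sample x_i following the two-level combination scheme."""
    omega = frozenset(classes)
    # leave-it-out nearest-neighbor search
    d2 = np.sum((X - X[i]) ** 2, axis=1).astype(float)
    d2[i] = np.inf
    nbrs = np.argsort(d2)[:k_edit]
    # intra-class combination: for simple mass functions supporting the same singleton,
    # Dempster's rule reduces to a product of the Omega-masses (Equation (11))
    sub = {}
    for q in classes:
        same = [j for j in nbrs if y[j] == q]
        if not same:
            continue
        prod_omega = float(np.prod([neighbor_mass(X[i], X[j], q, gammas, classes)[omega]
                                    for j in same]))
        sub[q] = {frozenset({q}): 1.0 - prod_omega, omega: prod_omega}
    # inter-class combination: Dubois-Prade's rule applied to the two sub-results
    # with the largest masses on their supported classes (Equations (12) and (13))
    ranked = sorted(sub, key=lambda q: sub[q][frozenset({q})], reverse=True)
    if len(ranked) == 1:
        return sub[ranked[0]]
    return dubois_prade(sub[ranked[0]], sub[ranked[1]])

# e.g.: gammas = gamma_per_class(X, y); classes = list(np.unique(y))
# edited = [evidential_edit_one(i, X, y, gammas, classes) for i in range(len(X))]
```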

kNN Classification with Evidently Edited Training Samples
After the evidential editing procedure developed in Section 3, the problem becomes that of classifying a query pattern y ∈ R^P based on the evidently edited training set T′. In this section, a classification procedure specifically designed for evidently edited training samples is developed in the belief function framework. This classification procedure is composed of the following two steps: evidence representation for the edited training samples and evidence combination for decision making.

Evidence Representation for the Edited Training Samples
Assume that the k nearest neighbors of the query pattern y have been selected from the edited training set. Generally, a training sample x_i is a very reliable piece of evidence for the classification of y if it is very close to y. In contrast, if x_i is far from y, then it is not reliable evidence. In the belief function community, the discounting operation proposed by Shafer [32] is a common tool to handle partially reliable evidence.
Denote by m_i the evidential label of the training sample x_i and by β_i the confidence degree of the class membership of y with respect to the training sample x_i. The evidence provided by x_i for the class membership of y is represented by a discounted mass function ^{β_i}m_i, obtained by discounting m_i with a rate 1 − β_i as:

$^{\beta_i}m_i(A) = \beta_i\, m_i(A),\ \forall A \subset \Omega, \qquad ^{\beta_i}m_i(\Omega) = \beta_i\, m_i(\Omega) + 1 - \beta_i.$ (14)

The confidence degree β_i is determined based on the distance d_i between x_i and y. Generally, a larger distance results in a smaller confidence degree, and therefore β_i should be a decreasing function of d_i. A decreasing function similar to that of Equation (9) is used here to define the confidence degree β_i ∈ (0, 1]:

$\beta_i = \exp(-\lambda_i\, d_i^2),$ (15)

where λ_i is a positive parameter associated with the training sample x_i; as defined in Equation (16), it is computed from the evidential label m_i together with d̄, the mean distance among all training samples, and d̄_A, the mean distance among training samples belonging to class set A, ∀A ∈ 2^Ω \ Ω.

Remark 1.
In calculating the confidence degree, the parameter λ_i is designed by extending the parameter γ_q of Equation (9) to the case of evidential labels. In Equation (16), if the label of the training sample x_i is the crisp label ω_q, i.e., m_i({ω_q}) = 1 and m_i(A) = 0, ∀A ∈ 2^Ω \ {ω_q}, then the parameter λ_i simply reduces to γ_q as a special case.
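As an illustrative sketch (ours), the discounting operation of Equation (14) and the confidence degree of Equation (15) can be written as follows; `lam` stands for the parameter λ_i of Equation (16), which must be supplied by the caller.

```python
import numpy as np

def discount(m, beta, omega):
    """Shafer's discounting (Equation (14)): scale every mass by beta and
    transfer the remaining 1 - beta onto Omega."""
    out = {A: beta * v for A, v in m.items()}
    out[omega] = out.get(omega, 0.0) + (1.0 - beta)
    return out

def confidence(d, lam):
    """Confidence degree beta_i in (0, 1] (Equation (15)), decreasing with the
    distance d between the edited training sample and the query pattern."""
    return float(np.exp(-lam * d ** 2))
```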

Evidence Combination for Decision Making
In this section, we combine the k mass functions generated above into a single one in order to make a decision about the class of the query pattern y. The popular Dempster's rule of combination relies on the assumption that the items of evidence to be combined are independent. However, as illustrated in the following example, the k mass functions derived from different edited samples can no longer be regarded as fully independent.

Example 3. Figure 2 illustrates the dependence among different edited training samples. In the evidential editing process, k_edit = 2 was used to search for the nearest neighbors, and in the classification process, the number of nearest neighbors was k = 3. The samples x_1, x_2, and x_3 were the three nearest neighbors used for the classification of the query pattern y. In the evidential editing process, the training sample x_4 was used to calculate the class membership of both x_1 and x_2, so the edited training samples x_1 and x_2 were no longer independent. In contrast, the edited training sample x_3 remained independent of both x_1 and x_2, as no common training samples were used in their evidential editing.

Therefore, the items of evidence from different edited training samples may be partially dependent. To account for this partial dependence, we use the parameterized t-norm-based rule shown in Equation (3) to combine the generated k mass functions and obtain the final result for the query pattern y as:

$m = {}^{\beta_{i_1}}m_{i_1} \circledast_s {}^{\beta_{i_2}}m_{i_2} \circledast_s \cdots \circledast_s {}^{\beta_{i_k}}m_{i_k},$ (17)

where k is the number of nearest neighbors, i_1, i_2, · · · , i_k are the indices of the k nearest neighbors of y in T′, and s is the Frank t-norm parameter defined in Equation (4). Different values of the parameter s result in a series of combination rules ranging from the cautious rule (s = 0) to Dempster's rule (s = 1). The selection of s depends on the potential dependence of the edited training samples: a smaller value should be assigned to s in the case of larger dependence. In practice, cross-validation can be used to search for the optimal t-norm-based rule.
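For illustration (not from the paper), the Frank t-norm operator of Equation (4), which underlies the rule of Equation (17) through the combination of the weight functions of the separable mass functions, can be sketched as follows.

```python
import math

def frank_tnorm(a, b, s):
    """Frank's family of t-norms on [0, 1]: minimum at s = 0 and product at s = 1."""
    if s == 0:
        return min(a, b)
    if s == 1:
        return a * b
    return math.log(1.0 + (s ** a - 1.0) * (s ** b - 1.0) / (s - 1.0), s)

# s interpolates between the cautious rule (minimum of the weights)
# and Dempster's rule (product of the weights).
for s in (0, 1e-3, 0.5, 1):
    print(s, round(frank_tnorm(0.6, 0.8, s), 4))
```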
In order to make a decision based on the above combined mass function m, the pignistic probability BetP shown in Equation (6) was calculated. Finally, the query pattern y was assigned to the class with the maximum pignistic probability.

Experiments
The performance of the proposed kNN classifier with an evidential editing procedure (EEkNN) was evaluated in four different experiments. In the first experiment, the combination rules used in the classification process were evaluated under different dependence degrees of the edited samples. In the second experiment, the effects of the two main parameters, k_edit and k, in the editing and classification processes were analyzed. In the last two experiments, the performance of the EEkNN classifier was compared with that of other kNN-based methods, including the kNN classifier with the generalized editing procedure (GEkNN) [19], the kNN classifier with the fuzzy editing procedure (FEkNN) [25], and the evidential kNN classifier (EkNN) [34], using synthetic datasets and real datasets, respectively.

Evaluation of the Combination Rules
This experiment was designed to evaluate the combination rules used in the classification process of the EEkNN classifier. A two-dimensional three-class classification problem was considered. The following normal class-conditional distributions were assumed: Class A: µ A = (6, 6) T , Σ A = 4I; Class B: µ B = (14, 6) T , Σ B = 4I; Class C: µ C = (14, 14) T , Σ C = 4I.
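A minimal sketch (ours) of generating the synthetic data described above with NumPy; the random seed and function names are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

MEANS = {"A": np.array([6.0, 6.0]),
         "B": np.array([14.0, 6.0]),
         "C": np.array([14.0, 14.0])}
COV = 4.0 * np.eye(2)  # Sigma = 4I for every class

def sample(n_total):
    """Draw n_total points with equal class priors from the three normal distributions."""
    X, y = [], []
    for label, mu in MEANS.items():
        n = n_total // len(MEANS)
        X.append(rng.multivariate_normal(mu, COV, size=n))
        y += [label] * n
    return np.vstack(X), np.array(y)

X_train, y_train = sample(150)   # 50 training samples per class
X_test, y_test = sample(3000)    # 1000 test samples per class
```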
A set of 150 training samples and a set of 3000 test samples were generated from the above distributions using equal prior probabilities. The average test classification rate over 30 independent trials was calculated. In the evidential editing process, k_edit = 3, 9, 15, 21 were selected, and in the classification process, values of k ranging from 1 to 25 were investigated. The t-norm-based rules (TR) with the parameter s ranging from 0 to 1 were evaluated (the cautious rule (CR) is retrieved when s = 0, and Dempster's rule (DR) is retrieved when s = 1). Figure 3 shows the classification accuracy for the different combination rules. We note that the best combination rule varied with the value of k_edit. In other words, the k_edit value had a great influence on the dependence of the edited samples, and a larger k_edit value tended to result in larger dependence. For a specific classification problem, the selection of the best combination rule depends on the potential dependence of the edited samples, which in turn depends on the k_edit value used. Therefore, for the EEkNN classifier, the optimal t-norm-based rule should be determined separately for each specific k_edit value.

Parameter Analysis
This experiment was designed to analyze the effect of the parameters k_edit and k on the proposed EEkNN classifier. The same training and test samples as in the previous experiment were used. The difference was that in the evidential editing process, k_edit = 3, 6, 9, 12, 15, 18, 21, 24 were selected, and the optimal t-norm-based rule for each specific k_edit value was used to make the classification. The average classification accuracy over the 30 trials with values of k ranging from 1 to 25 was investigated.
From Figure 4, we can see that the classification performance improves clearly as the parameter k_edit increases within an interval ([3, 12] in this example). However, when k_edit exceeded an upper boundary (k_edit = 12 in this example), the classification performance no longer improved. In addition, when k_edit took small values, the classification performance could improve as the parameter k increased. However, when k_edit exceeded the upper boundary, the parameter k had little effect on the classification performance.

Synthetic Data Test
This experiment was designed to compare the proposed EEkNN classifier with other kNN-based classifiers using synthetic datasets with different class imprecision ratios, defined as the number of imprecise samples divided by the total number of training samples. A training sample x_i is considered to be imprecise if a non-singleton set gets the largest mass after the evidential editing procedure. A two-dimensional four-class classification problem was considered. The following normal class-conditional distributions were assumed; for comparison, we changed the variance of each distribution to control the class imprecision ratio: Class C: µ_C = (0, 5)^T, Σ_C = 3I; Class D: µ_D = (5, 5)^T, Σ_D = 3I; imprecision ratio ρ = 79%. A training set of 200 samples and a test set of 4000 samples were generated from the above distributions using equal prior probabilities. For each case, 30 trials were performed with 30 independent training sets. The average classification accuracy and the corresponding 95% confidence interval were calculated. For each trial, the best values of the parameters k_edit and s in the EEkNN classifier were determined in the sets {3, 6, 9, 12, 15, 18, 21, 24} and {1, 10^−1, 10^−2, 10^−3, 10^−4, 10^−5, 0}, respectively, by cross-validation. For all of the considered methods, values of k ranging from 1 to 25 were investigated. Figures 5-7 show the training sets and the classification results for the cases with different imprecision ratios. From the left three subfigures, we can see that the three cases correspond to slight, moderate, and severe class overlapping, respectively. The average classification accuracy rates of the different methods, as well as the corresponding 95% confidence intervals of the proposed one, are shown in the right three subfigures. It can be seen that for all three considered cases, the proposed EEkNN classifier provided better classification accuracy than the other kNN-based ones, because in the proposed EEkNN classifier, the uncertainty of samples in overlapping regions can be well characterized with the introduction of the evidential editing procedure. We also notice that the performance improvement was more significant for Case 3, where the samples from different classes overlapped severely. Furthermore, unlike the other kNN-based classifiers, the proposed one was less sensitive to the value of k, and it performed well even with a small value of k.
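For clarity (our own sketch, assuming edited labels stored as mass-function dictionaries as in the earlier sketches), the class imprecision ratio defined above can be computed as follows.

```python
def imprecision_ratio(edited_labels):
    """Fraction of edited training samples whose largest mass falls on a
    non-singleton focal set (the definition of the class imprecision ratio)."""
    def is_imprecise(m):
        best = max(m, key=m.get)
        return len(best) > 1
    return sum(is_imprecise(m) for m in edited_labels) / len(edited_labels)
```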

Real Data Test
This experiment was designed to compare the proposed EEkNN classifier with other kNN-based classifiers on real-world classification problems from the well-known UCI Machine Learning Repository [44]. These datasets cover a variety of applications in many fields, e.g., biology, medicine, phytology, and astronomy. The main characteristics of the six real datasets used in this experiment are summarized in Table 3, where "# Samples" is the number of samples in the dataset, "# Features" is the number of features, and "# Classes" is the number of classes. To assess the results, we considered the resampled paired test. A series of 30 trials was conducted. In each trial, the available samples were randomly divided into a training set and a test set of equal size. For each dataset, we calculated the average classification rate over the 30 trials and the corresponding 95% confidence interval. For the proposed EEkNN classifier, the best values of the parameters k_edit and s were determined with the same procedure as in the previous experiment. For all of the considered methods, values of k ranging from 1 to 25 were investigated. Figure 8 shows the classification results of the different methods on the real datasets. It can be seen that, for most datasets, the EEkNN classifier provided better classification performance than the other kNN-based ones. The reason is that, in the proposed EEkNN classifier, the uncertainty of samples in overlapping regions or of noisy patterns can be well characterized with the introduction of the evidential editing procedure. In the GEkNN classifier, by contrast, each uncertain sample was either removed or assigned to a single class, at great risk. Although the FEkNN classifier reassigned a fuzzy membership to each uncertain sample, it could not effectively address the imprecise information involved. The original EkNN classifier, developed based on belief function theory, used the original training set directly for classification without any editing procedure. However, for the dataset Glass, the classification performances of the different methods were quite similar. The reason is that, for this dataset, the best classification performance was obtained when k took a small value, and under this circumstance, the evidential editing procedure could not improve the classification performance.
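The evaluation protocol described above (30 random 50/50 train/test splits, mean accuracy, and a 95% confidence interval) can be sketched as follows; this is our own illustration, with `fit_predict` standing in for any of the compared classifiers and SciPy used only for the t-quantile.

```python
import numpy as np
from scipy import stats

def resampled_evaluation(X, y, fit_predict, n_trials=30, seed=0):
    """Resampled paired test: random 50/50 splits; returns the mean accuracy and
    the half-width of a 95% confidence interval based on the t-distribution."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(n_trials):
        idx = rng.permutation(len(y))
        half = len(y) // 2
        train, test = idx[:half], idx[half:]
        y_pred = fit_predict(X[train], y[train], X[test])
        accuracies.append(np.mean(y_pred == y[test]))
    accuracies = np.array(accuracies)
    half_width = stats.t.ppf(0.975, n_trials - 1) * accuracies.std(ddof=1) / np.sqrt(n_trials)
    return accuracies.mean(), half_width
```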

Conclusions
An evidential editing version of the kNN classifier (EEkNN) has been developed based on an evidential editing procedure that reassigns the original training samples new labels represented by an evidential membership structure. Thanks to this procedure, noisy patterns or those situated in overlapping regions have less influence on the decisions. In addition, in the subsequent classification procedure, the parameterized t-norm-based rule is optimized to combine the k nearest neighbors of a query pattern while taking into account the potential dependence among them. Experiments based on both synthetic and real datasets have been carried out to evaluate the performance of the proposal. From the results reported in the last section, we can conclude that the proposed EEkNN classifier can achieve higher classification accuracy than the other considered kNN-based methods, especially for datasets with high imprecision ratios. Moreover, the proposed EEkNN classifier is not too sensitive to the value of k and can achieve quite good performance even with k = 1. This is an advantage in time- or space-critical applications, in which only a small value of k is permitted in the classification process.
The proposal can be potentially used in many classification applications where the available data are imperfect. For example, in brain-computer interface (BCI) systems [45], the electroencephalogram (EEG) signals may contain great uncertainties due to the varying brain dynamics and the presence of noise. The proposed EEkNN classifier can minimize the effect of these uncertainties with the introduction of the evidential editing procedure for the raw data.