Multivariate Feature Extraction and Effective Decision Making Using Machine Learning Approaches

Abstract: Fault Detection and Isolation (FDI) in Heating, Ventilation, and Air Conditioning (HVAC) systems is an important approach for guaranteeing the safe operation of these systems. The implementation of an FDI framework is therefore required to reduce the energy needs of buildings and improve indoor environment quality. The main goal of this paper is to merge the benefits of multiscale representation, Principal Component Analysis (PCA), and Machine Learning (ML) classifiers to improve the detection and isolation of faults in Air Conditioning (AC) systems. First, multivariate statistical feature extraction and selection is achieved using the PCA method. Then, multiscale representation is applied to separate features from noise and approximately decorrelate the autocorrelation between the available measurements. Third, the extracted and selected features are fed to several machine learning classifiers for fault classification. The effectiveness and high classification accuracy of the developed Multiscale PCA (MSPCA)-based ML technique are demonstrated using two examples: synthetic data and simulated data extracted from Air Conditioning systems.


Introduction
Over the last four decades, energy consumption in the Gulf Cooperation Council (GCC) countries has been rising. The building sector alone accounts for more than 50% of all delivered energy consumption. Heating, Ventilating, and Air-Conditioning (HVAC) systems are important parts of a building system. These systems provide building occupants with a comfortable and productive environment. According to Qatar's electric utility Kahramaa [1], half of the energy consumption is due to air conditioning systems for cooling. One of the challenges in Qatar, as in the rest of the GCC, is to reduce electricity consumption and improve energy efficiency. As a substantial fraction of Qatar's economy relies on hydrocarbon resources, the increasing electricity demand has created financial stress and budget losses. This is due to the decline in oil prices, in addition to progressively degrading air quality, which arises mainly from energy generation and consumption patterns and load demands. Consequently, the authorities are examining ways to understand, monitor, manage, control and reduce electricity usage in different sectors.
All the above-mentioned issues challenge the current status of energy efficiency in buildings. Even when advanced controllers or building automation systems are applied to improve system efficiency, faults can develop during installation. Hence, scheduled preventive maintenance and routine operations help reduce energy waste [2]. This wasted energy can be reduced whenever faults are detected, isolated and identified [3].
An efficient building is one in which occupants feel comfortable and safe and enjoy an attractive living and work environment. This obviously requires high engineering and architectural skills, superior construction practices and smart structures. Increasingly, building operations will include integration with sophisticated electric utility grids, but the penalty is often increased energy consumption and operating costs.
The HVAC load varies depending on building location, type, and occupant behavior, but HVAC is among the critical subsystems and can account for up to 50% of the total energy consumption. Therefore, faults involving HVAC systems cause large energy losses [4,5]. For example, the system may heat up or cool down the supply air too much, blow too little air into rooms, etc. It is critical to detect these faults quickly and take corresponding measures to solve the problem. From an energy efficiency perspective, unnecessary energy may be spent due to malfunctioning devices. For instance, if the reheat valve is stuck, the system may heat up the supply air while it is actually trying to cool down the room. Another type of fault is caused by occupants, such as windows or doors being left open while the AC or heating is on, which also causes energy wastage [6]. Hence, it is important to build an effective and comprehensive Fault Detection and Isolation (FDI) system for HVAC systems.
Several approaches have been proposed in this field. One approach presented in [7] uses statistical machine learning techniques for FDI. An approach proposed in [8,9] for FDI of HVAC systems uses a Kalman filter, especially for valve actuator failures. An artificial intelligence approach was reported in [10] for the FDI of an air-handling unit using a dynamic fuzzy neural network. In [11], researchers reported that 20% to 30% of energy savings can be achieved by recommissioning malfunctioning HVAC systems. Their technique is used to detect and analyze faults and anomalies in building system monitoring.
Other FDI techniques, including model-based approaches [12,13], empirical approaches [14,15] and qualitative/rule-based approaches [16,17], have been proposed to detect and isolate faults in HVAC systems. In [18], the authors propose a bottom-up FDI technique that is based on a dynamic building model.
Feature extraction is the most critical step in designing a diagnosis algorithm. Data-driven diagnosis methods use sensor data as the set of features for fault detection and isolation [19]. When measurements are noisy or the dataset contains irrelevant measurements, it becomes challenging to detect and isolate faults by monitoring only the raw data. Therefore, as part of developing a diagnosis approach, we have to devise methods for deriving new features, or for selecting a group of measurements that are sensitive to faults, as the features. Principal component analysis (PCA) is a common feature extraction method in data science [20][21][22][23]. Technically, PCA finds the eigenvectors of a covariance matrix with the highest eigenvalues and then uses them to map the data into a new subspace of equal or lower dimension [24]. In order to further enhance the quality of the features, we propose to use a multiscale representation, with the aim of combining the ability of PCA to extract cross-correlation between variables with the ability of orthonormal wavelets to separate features from noise and approximately decorrelate the autocorrelation between the available measurements. The proposed feature extraction framework merges the benefits of PCA and multiscale representation [25]. Features extracted from different groups can be applied together to provide a better understanding of system behavior, especially with respect to fault modes. To obtain good performance from diagnosis-based approaches, it is important to extract statistical features via the multiscale PCA model. In the current study, the features extracted from the multiscale PCA model are based on the principal and residual subspaces. The data representing the healthy mode are used to build the multiscale PCA model, and the faulty data are then transformed through this model. Consequently, some features are extracted and adequately selected to simultaneously represent the different patterns in the two multiscale PCA subspaces. In the next step, a diagnosis scheme assigns the features to the normal operating mode or to the different fault modes. This phase is achieved using several Machine Learning (ML) classifiers, which typically apply classification methods [26] such as decision trees [27], support vector machines (SVM) [28], K-Nearest Neighbors (KNN) [29,30] and Naive Bayes (NB) [31]. They use labeled training data to learn a set of predefined fault groups.
Therefore, the main contribution of this paper is to enhance the operating performance of air conditioning systems using a novel FDI approach. The developed FDI approach combines the benefits of machine learning, multivariate feature extraction and multiscale representation; it is the so-called Multiscale PCA (MSPCA)-based Machine Learning (ML) technique. The proposed FDI approach aims to increase the reliability and safety of Air Conditioning systems and to detect and isolate their faults.
Thus, from Air Conditioning system measurements, features are appropriately extracted through the MSPCA approach, by which an optimal number of features is selected, and their statistical characteristics are added. An ML classifier is then used to classify the different faults that can occur in Air Conditioning systems operating in a noisy, random environment.
The rest of the paper is organized as follows: Section 2 presents a detailed description of the developed methodology. The results of the developed methodology are discussed in Section 3. Finally, Section 4 presents the conclusions and future work.

Fault Detection and Isolation Using MSPCA-Based ML Technique
The proposed FDI methodology includes two major steps: feature extraction and selection using multiscale PCA (MSPCA), and classification using machine learning classifiers. Once measurements representing the healthy mode and the different possible faulty modes of the process are available, an MSPCA model is built using only the data collected under normal operating conditions. These data are projected onto a lower-dimensional subspace along the most appropriate directions, keeping the most significant captured feature information. The structure of the obtained MSPCA model is given by the directions of the subspace projector, whose dimension is less than that of the original data. Through this projection, significant features are obtained for each scenario. Subsequently, a bank of different classifiers is trained using the various features as inputs and their corresponding labels as target outputs. The classifier outputs are compared with the feature labels to make effective decisions. In addition, further ML classification experiments have been performed using different feature sets extracted and selected from the MSPCA model. The steps of the proposed FDI strategy are summarized in the block diagram illustrated in Figure 1.

Feature Extraction and Selection Using Multiscale PCA
Consider the data matrix $X \in \mathbb{R}^{N \times m}$, collected from a process operating under normal conditions, with $N$ samples of $m$ variables. The data are first normalized to zero mean and unit variance. Through the PCA transformation, a new matrix $T \in \mathbb{R}^{N \times m}$ of uncorrelated variables, representing the data features, is extracted as $T = XP$, where $T = [t_1, t_2, \dots, t_k, \dots, t_N]^T$, $t_k = [t_{k1}, \dots, t_{km}]$, and $P$ is the loading matrix obtained by an orthogonal decomposition of the covariance matrix $\Phi$, which can be obtained through the eigenvalue decomposition
$$\Phi = \frac{1}{N-1} X^T X = P \Lambda P^T,$$
where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_m)$ is a diagonal matrix containing the eigenvalues sorted in decreasing order. Data dimensionality reduction can be achieved by splitting $P$ and $\Lambda$ into modelled and non-modelled variations: the first part, $P_\ell \in \mathbb{R}^{m \times \ell}$ and $\Lambda_\ell \in \mathbb{R}^{\ell \times \ell}$, spans the principal subspace, and the other part, $P_{m-\ell} \in \mathbb{R}^{m \times (m-\ell)}$ and $\Lambda_{m-\ell} \in \mathbb{R}^{(m-\ell) \times (m-\ell)}$, spans the residual subspace. The columns of $P_\ell$ are the eigenvectors of the covariance matrix associated with the $\ell$ largest eigenvalues in $\Lambda_\ell$, corresponding to most of the variation in the data, where $\ell$ is the number of principal components (PCs). The columns of $P_{m-\ell}$ are the remaining $m-\ell$ eigenvectors, related to the eigenvalues in $\Lambda_{m-\ell}$. As a result, PCA decomposes the original data set $X$ into
$$X = T_\ell P_\ell^T + E,$$
where $T_\ell$ represents the selected features, obtained through the projection of $X$ onto the first $\ell$ eigenvectors corresponding to the largest variances of the sample covariance matrix, and $E$ is the residual matrix. In summary, the PCA model is determined based on an eigen-decomposition of the covariance matrix $\Phi$. The obtained PCA model is used to extract and select significant features to be classified. Such features should be extracted in such a way as to emphasize the differences between normal and abnormal operating conditions. The MSPCA model was developed by Bakshi [32] with the aim of combining the ability of PCA to extract cross-correlation between variables with the ability of orthonormal wavelets to separate features from noise and approximately decorrelate the autocorrelation between the available measurements [32,33]. Some of the advantages of multiscale representation in process modeling and monitoring are presented in [32,34].
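As an illustration, the following minimal Python sketch implements this eigen-decomposition-based feature extraction; the function and variable names are our own, not taken from the paper's implementation.

```python
# Sketch: PCA feature extraction via eigendecomposition of the covariance
# matrix, as described above. Names are illustrative.
import numpy as np

def pca_features(X, n_components):
    """Project standardized data onto the first n_components eigenvectors."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # zero mean, unit variance
    Phi = np.cov(Xs, rowvar=False)                     # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Phi)             # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                  # sort in decreasing order
    Lam, P = eigvals[order], eigvecs[:, order]
    P_l = P[:, :n_components]                          # principal-subspace loadings
    T_l = Xs @ P_l                                     # scores (selected features)
    E = Xs - T_l @ P_l.T                               # residual part
    return T_l, E, P, Lam

# Example with synthetic correlated data
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
X = np.column_stack([base, base @ rng.normal(size=(3, 3))])
T_l, E, P, Lam = pca_features(X, n_components=3)
```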
Multiscale representation has the ability to separate noise from the important features in the data. It helps transform the data to be closer to normal at multiple scales, even when their time-domain distribution is not normal. These benefits clearly show that multiscale representation can help satisfy the assumptions of independence, normality and noise level made by the PCA model [32]. The main steps of the multiscale PCA model are presented in Algorithm 1 [32], followed by a code sketch.
•	For each column (i.e., process variable) in the data matrix, compute the wavelet decomposition.
•	For each block (matrix) of scaled and detail coefficients at each scale, compute the covariance matrix along with the number of principal components, as well as the PCA loadings and scores of those wavelet coefficients.
•	Once the appropriate number of loadings is selected, retain the wavelet coefficients larger than a certain threshold.
•	For all scales together, carry out PCA after including only the scales with significant events during reconstruction.
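The following is a minimal sketch of the multiscale preprocessing step of Algorithm 1, assuming the PyWavelets package; the wavelet (db4), decomposition level and universal-threshold rule are illustrative assumptions. For brevity, it thresholds the coefficients of each variable directly and defers the PCA step to the reconstructed data (using, e.g., pca_features above), whereas the full algorithm also computes PCA on each block of coefficients.

```python
# Simplified MSPCA preprocessing sketch (assumes PyWavelets is installed).
import numpy as np
import pywt

def multiscale_denoise(X, wavelet="db4", level=3):
    """Wavelet-threshold each process variable, then reconstruct."""
    X_rec = np.zeros_like(X, dtype=float)
    for j in range(X.shape[1]):                          # each process variable
        coeffs = pywt.wavedec(X[:, j], wavelet, level=level)
        # Noise level estimated from the finest-scale detail coefficients
        sigma = np.median(np.abs(coeffs[-1])) / 0.6745
        thr = sigma * np.sqrt(2.0 * np.log(X.shape[0]))  # universal threshold
        coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="hard")
                                for c in coeffs[1:]]     # keep significant events
        X_rec[:, j] = pywt.waverec(coeffs, wavelet)[: X.shape[0]]
    return X_rec
```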
To obtain good classification performance, it is important to extract statistical features via the MSPCA model. In the current study, the features extracted from the MSPCA model are the squared prediction error (SPE) statistic [35], the $T^2$ statistic [36], the squared weighted error (SWE) statistic [37] and the first $\ell$ retained principal components ($T_\ell$).
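A sketch of how these per-sample features can be computed from the loadings and eigenvalues of the (MS)PCA model (e.g., from the pca_features sketch above) is given below; the SWE weighting by the inverse residual eigenvalues is our reading of [37] and should be treated as an assumption.

```python
# Sketch: per-sample monitoring features (T^2, SPE, SWE, retained scores).
import numpy as np

def monitoring_features(Xs, P, Lam, l):
    """Return a feature matrix [T^2, SPE, SWE, T_l] for standardized data Xs."""
    P_l, Lam_l = P[:, :l], Lam[:l]          # principal subspace
    P_r, Lam_r = P[:, l:], Lam[l:]          # residual subspace
    T_l = Xs @ P_l                          # first l retained scores
    t2 = np.sum(T_l ** 2 / Lam_l, axis=1)   # Hotelling T^2 statistic
    E = Xs - T_l @ P_l.T                    # model residuals
    spe = np.sum(E ** 2, axis=1)            # SPE (Q) statistic
    T_r = Xs @ P_r
    swe = np.sum(T_r ** 2 / Lam_r, axis=1)  # SWE: eigenvalue-weighted residuals
    return np.column_stack([t2, spe, swe, T_l])
```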

Fault Classification Using Machine Learning (ML) Techniques
Machine learning classifiers are applied to the most informative features extracted and selected from the data for fault classification purposes. These classifiers include decision trees [27], support vector machines (SVM) [28], K-Nearest Neighbors (KNN) [29,30] and Naive Bayes (NB) [31].
Next, an overview of these classifiers is reported.

Decision Trees
A decision tree is one of the most common classifiers and has been widely used in many real-world applications [27,38]. This symbolic learning technique organizes the information obtained from a training dataset in a hierarchical structure composed of nodes and branches. The aim of this classifier is to minimize the least squares error for the next split of a node in the tree, in order to predict, for unseen instances, the average of the dependent variable over all training instances covered by a leaf. A decision tree model $T(x; \{R_j\}_{j=1}^J)$ splits the $x$-space into $J$ disjoint regions $\{R_j\}$ and predicts a separate constant value in each one:
$$T(x; \{R_j\}_{j=1}^J) = \sum_{j=1}^J \hat{y}_j \, \mathbb{1}(x \in R_j),$$
where $\hat{y}_j = \frac{1}{a_j} \sum_{i:\, x_i \in R_j} y_i$ is the mean of the response $y$ in region $R_j$ and $a_j$ is the size of region $R_j$. Hence, a tree predicts a constant value $\hat{y}_j$ within each region $R_j$. The trees are built using top-down iterative splitting based on a least squares fitting criterion. The parameters of this model are the regions $\{R_j\}_{j=1}^J$ of the partition, determined by the identities of the predictor variables used for splitting and their corresponding split points.
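In practice, a decision tree can be trained directly on the extracted features, e.g., with scikit-learn; the feature matrix F and labels y below are synthetic placeholders, not the paper's data.

```python
# Sketch: decision tree classification of extracted features.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
F = rng.normal(size=(400, 4))                    # placeholder feature matrix
y = rng.integers(0, 4, size=400)                 # placeholder classes C0..C3

tree = DecisionTreeClassifier(random_state=0)    # top-down impurity-based splits
tree.fit(F, y)                                   # learn the partition {R_j}
y_pred = tree.predict(F)                         # constant prediction per region
```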

Support Vector Machines
Next, the basic ideas of support vector machines (SVM) for classification are introduced. Consider a training set of $N$ samples $\{x_k, y_k\}_{k=1}^N$, with input data $x_k \in \mathbb{R}^m$ and outputs $y_k \in \{-1, 1\}$, representing a set of labeled training features. The SVM classifier takes the form
$$f(x) = \mathrm{sign}(w^T x + b),$$
where $w \in \mathbb{R}^m$ and $b \in \mathbb{R}$. In order to place the separating hyperplane as far from the support vectors as possible, the margin, given by $\frac{1}{\|w\|}$, should be maximized. This is equivalent to solving
$$\min_{w,\, b} \ \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_k (w^T x_k + b) \geq 1, \quad k = 1, \dots, N.$$
Assigning Lagrange multipliers $\alpha_k \geq 0$ to the constraints, this quadratic programming (QP) optimization problem becomes
$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{k=1}^N \alpha_k \left[ y_k (w^T x_k + b) - 1 \right].$$
Setting the partial derivatives of $L$ with respect to $w$ and $b$ to zero gives
$$w = \sum_{k=1}^N \alpha_k y_k x_k, \qquad \sum_{k=1}^N \alpha_k y_k = 0.$$
Substituting these relations into $L$ gives the dual optimization problem, depending only on $\alpha$:
$$\max_{\alpha} \ \sum_{k=1}^N \alpha_k - \frac{1}{2} \alpha^T H \alpha, \qquad H = \left[ y_k y_l x_k^T x_l \right]_{k,l=1}^{N}.$$
The solution of this dual problem determines $\alpha$, after which $w$ is easily calculated from the expression above. In order to determine $b$, a support vector $x_s$ satisfies
$$y_s (w^T x_s + b) = 1,$$
so that, referring to the separating hyperplane equations and the class label $y_s$, the bias is given by
$$b = y_s - w^T x_s.$$
In turn, the optimal orientation of the separating hyperplane is obtained.
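For illustration, the sketch below fits a linear SVM on toy synthetic data and reads back w and b consistent with the dual solution above.

```python
# Sketch: linear SVM; w = sum_k alpha_k y_k x_k is recovered from the fit.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)),   # class -1 cluster
               rng.normal(2.0, 1.0, (50, 2))])   # class +1 cluster
y = np.array([-1] * 50 + [1] * 50)

svm = SVC(kernel="linear", C=1.0).fit(X, y)
w = svm.coef_[0]                                 # hyperplane normal vector
b = svm.intercept_[0]                            # bias term
pred = np.sign(X @ w + b)                        # same labels as svm.predict(X)
```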

K-Nearest Neighbors
The K-nearest neighbors (KNN) algorithm [30] is one of the best-known machine learning techniques; it is a non-parametric method used to identify which of the already-known classes an unidentified sample belongs to. The assignment to a class is based on the Euclidean distance to the k nearest neighbors. Denoting the elements of a known class by $X = [x_1, x_2, \dots, x_k]$ and those of the data to be classified by $Y = [y_1, y_2, \dots, y_k]$, the distance is obtained as
$$d(X, Y) = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}.$$
A class is assigned to the sample for which this distance is minimal.
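This distance rule translates directly into a few lines of code; the majority vote over the k nearest neighbors below is the usual generalization of taking the minimal distance (names are illustrative).

```python
# Sketch: K-nearest neighbors by direct Euclidean distance computation.
import numpy as np

def knn_predict(X_known, labels, x_new, k=3):
    """Assign x_new to the majority class among its k nearest neighbors."""
    d = np.sqrt(np.sum((X_known - x_new) ** 2, axis=1))   # Euclidean distances
    nearest = np.argsort(d)[:k]                           # k smallest distances
    classes, counts = np.unique(labels[nearest], return_counts=True)
    return classes[np.argmax(counts)]                     # majority vote
```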

Naive Bayes
Naive Bayes [31] is a probabilistic classifier based on applying Bayes' theorem of conditional probability with a strong independence assumption between features. The assignment of a feature vector $x = (x_1, \dots, x_m)$ to one of $K$ possible classes $C_k$ is based on the conditional probability
$$p(C_k \mid x) = \frac{p(C_k)\, p(x \mid C_k)}{p(x)}.$$
As the features are known and mutually independent, $p(x)$ does not depend on the class $C_k$, and
$$p(x \mid C_k) = \prod_{i=1}^{m} p(x_i \mid C_k).$$
Thus, one common decision-making rule is the maximum a posteriori (MAP) probability rule. The corresponding classifier, a Bayes classifier, is the function that assigns a class label $\hat{y} = C_k$ for some $k$ as follows:
$$\hat{y} = \underset{k \in \{1, \dots, K\}}{\arg\max} \ p(C_k) \prod_{i=1}^{m} p(x_i \mid C_k).$$
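With Gaussian class-conditional densities, the MAP rule above corresponds to scikit-learn's GaussianNB; the data below are synthetic placeholders.

```python
# Sketch: Naive Bayes classification under the independence assumption.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(3)
F = rng.normal(size=(300, 4))            # placeholder features
y = rng.integers(0, 3, size=300)         # placeholder classes

nb = GaussianNB().fit(F, y)              # estimates p(C_k) and p(x_i | C_k)
posterior = nb.predict_proba(F[:5])      # p(C_k | x) for the first 5 samples
labels = nb.predict(F[:5])               # argmax_k posterior (MAP rule)
```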

Application 1: Simulated Synthetic Data
In this simulation example, in order to generate the fault database, we assumed one condition without faults (healthy mode) and conditions with three types of faults identified through manual data analysis (faulty modes). Appropriate multiscale preprocessing of the database was necessary in order to use the generated fault database as learning data for machine learning. We generated the fault database using the system simulation and labeled the faults based on the simulation results. Labeled data are required by the machine learning methods.
The simulated synthetic example replicates and extends the illustrative example carried out in the original MSPCA paper [32]. Two variables are generated as uncorrelated Gaussian measurements of zero mean and unit variance. Further variables are generated by adding and subtracting the first two variables, respectively, e.g., [32]
$$x_3 = x_1 + x_2, \qquad x_4 = x_1 - x_2.$$
The measured data matrix $X$ (of six variables) is then contaminated by white noise, i.e., uncorrelated Gaussian error of zero mean and standard deviation 0.2 [32]:
$$X = \tilde{X} + E, \qquad E \sim \mathcal{N}(0,\, 0.2^2 I).$$
Normal operation consists of 2048 equally spaced observations. Abnormal operation (also of 2048 observations) consists of a step change in the mean (of size 2) in all four variables between observations 512 and 1024 ($F_1$), a variance change between observations 1025 and 1537 ($F_2$) and an incipient fault between observations 1538 and 2048 ($F_3$) (see Figure 2). In order to evaluate the different classifiers for FDI purposes, the six simulated variables are generated as described above. In our study, the main steps of the data generation are training healthy data generation (Figure 2a), training faulty data generation (Figure 2b) and testing faulty data generation (Figure 2c).
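A hedged sketch of this data generation is shown below; since the paper's full six-variable construction is not reproduced here, only the two base variables and their sum/difference are built, and the variance-change factor and drift slope for F2 and F3 are assumptions.

```python
# Sketch of the synthetic data generation (assumed fault magnitudes for F2/F3).
import numpy as np

rng = np.random.default_rng(42)
n = 2048
z1, z2 = rng.normal(size=n), rng.normal(size=n)   # uncorrelated N(0, 1)
X = np.column_stack([z1, z2, z1 + z2, z1 - z2])   # base variables [32]
X += rng.normal(scale=0.2, size=X.shape)          # white noise, sigma = 0.2

Xf = X.copy()
Xf[512:1024] += 2.0                               # F1: mean step of size 2
Xf[1025:1537] *= 3.0                              # F2: variance change (assumed x3)
Xf[1538:2048] += 0.01 * np.arange(510)[:, None]   # F3: incipient drift (assumed slope)
```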
First, under healthy operating conditions (Figure 2a), the corresponding data set is normalized to zero mean and unit variance and used to build a PCA model. Through the eigenvalue decomposition, the variances of the transformed variables are sorted in decreasing order. Second, to generate the training faulty database, we assumed conditions with the three types of faults described above (faulty modes; see Figure 2b). The generated faulty data (also called training data) are used to train the machine learning classifiers. Third, in the testing phase, new samples are generated using the same model as in the training phase. These new data (also called testing data) are used to validate the machine learning classifiers (see Figure 2c).
These variables represent one healthy operating mode (assigned to class C0) and three different faulty operating modes of the synthetic data (assigned to Ci, i = 1, ..., 3), as reported in Table 1. To obtain good performance from the diagnosis-based approaches, it is important to extract statistical features via the PCA model. In the current study, the features extracted from the PCA model are the squared prediction error (SPE) statistic, the $T^2$ statistic, the squared weighted error (SWE) statistic and the first retained principal components. In the current work, three groups of features are used: group 1: $\{T^2, \mathrm{SPE}\}$, group 2: $\{T^2, \mathrm{SWE}\}$ and group 3: $\{T_\ell\}$ (see Table 2).

Table 2. Feature groups.

Group 1: $T^2$ and SPE statistics
Group 2: $T^2$ and SWE statistics
Group 3: the first $\ell = 4$ PCs

Next, a fault classification framework introduces the extracted and selected features to the several machine learning classifiers. These methods include decision trees (DT), support vector machines (SVM), K-Nearest Neighbors (KNN) and Naive Bayes (NB). They use labeled training data to learn a set of predefined fault classes.
To demonstrate the performance of the DT, SVM, KNN and NB classifiers, we adopted a 10-fold cross-validation approach to obtain their accuracy.
In order to have reference results, a Monte-Carlo simulation of 100 runs is performed and used to compute the classification accuracies using DT, SVM, KNN and NB.
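A sketch of this evaluation protocol is given below; the feature matrix F and labels y stand in for each extracted feature group, and the classifier settings are default placeholders rather than the paper's exact configurations.

```python
# Sketch: Monte Carlo evaluation (100 runs) with 10-fold cross-validation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

classifiers = {"DT": DecisionTreeClassifier(), "SVM": SVC(),
               "KNN": KNeighborsClassifier(), "NB": GaussianNB()}

rng = np.random.default_rng(0)
scores = {name: [] for name in classifiers}
for run in range(100):                            # Monte Carlo runs
    F = rng.normal(size=(200, 4))                 # placeholder feature group
    y = rng.integers(0, 4, size=200)              # placeholder mode labels
    for name, clf in classifiers.items():
        scores[name].append(cross_val_score(clf, F, y, cv=10).mean())

mean_acc = {name: float(np.mean(s)) for name, s in scores.items()}
```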
Next, the four classifiers (KNN, NB, DT and SVM) are simulated, and the best classifier is selected based on classification accuracy. Tables 3 and 4 present the global accuracy for the different selected feature groups and for the classifiers used. It can clearly be seen that the features of group 2 ($T^2$ and SWE statistics) and the features of group 3 (the first $\ell = 4$ PCs) give the best results in terms of classification accuracy. It is clear that PCA- and MSPCA-based SVM and KNN using group 3 as input features provide the best tradeoffs: the achieved accuracy rates are 91.06% and 92.82% (PCA-based) and 94.73% and 93.95% (MSPCA-based), respectively. The results show that MSPCA provides better tradeoffs when compared to classical PCA. Overall, Tables 3 and 4 show that MSPCA-based ML provides better classification accuracy than ML based on classical PCA; for example, MSPCA-based KNN gives 100% classification accuracy for groups 1 to 3.
To show the classification efficiency of the developed approach under different simulation conditions, we vary, for example, the size of the mean fault between 1 and 3 and evaluate the performance of MSPCA-based SVM (see Table 5). The obtained results show that the developed MSPCA-based SVM approach still provides good classification accuracy for the three groups.

Description of Air Conditioning Systems
The TRNSYS transient system simulation software, with the TRNSYS Simulation Studio graphical front-end and the TRNBuild interface, has been used to model the building and to generate Air Conditioning system data [39]. Other studies have used the TRNSYS software to generate data [40] and to simulate building faults [41]. The TRNBuild interface allows adding non-geometrical properties such as door and window properties, layer and wall material properties, thermal conductivity, different gains, etc.
For the building in TRNSYS (see Figure 3), we built a detailed three-zone building model using Type 56, which constructs all the layers of floors, walls and ceilings with their physical properties, such as conductivity, density and specific heat capacity, as well as detailed window models. Table 6 presents a summary of the set-up of the building. Based on this table, a building model was developed in TRNSYS. The Type 56 multi-zone building is a reproduction of the reference building. The building model is divided into 3 zones. Air conditioning is supplied locally in each room by electric air conditioners; the buildings have no heating system. The recommended set point for cooling is 26 °C. The TRNSYS model has been run using the existing building parameters described earlier, with a 1 h time step, using the U.S. Department of Energy (DOE) typical meteorological year version 2 (TMY2) weather data [42]. A simulation-based case study was set up to demonstrate the FDI system. Three rooms are modeled in TRNSYS and simulated with different load profiles and schedules. One room is monitored for FDI in this study. The model is assumed to be located in Doha, Qatar. The simulations cover the Air Conditioning season, given the cooling-dominated weather. One year of normal operation data is used to train and set up the FDI system.
In the current study, under healthy operating conditions, the corresponding data set is normalized to zero mean and unit variance and used to build a PCA model. Through the eigenvalue decomposition, the variances of the transformed variables are sorted in decreasing order. According to the most significant information captured in the data via its projection, a PCA model with three directions has been constructed. This selection is based on the observation that a component whose variance is less than or equal to one characterizes noise and measurement error, and it is confirmed by the minimization of the reconstruction error variance. To generate the fault database, we assumed one condition without faults and conditions with two types of faults. The two fault cases, including zone-level operation faults, are generated in TRNSYS. The faults occur individually and are implemented statically by altering existing objects, such as schedules, in TRNSYS. The two faults are described as follows:
1.	Unplanned occupancy is considered an abnormal occupation, i.e., the presence of unplanned occupants deviating from the desired occupancy profile, or more occupants present than allowed. This fault is simulated by injecting unplanned occupants.
2.	Opening a window while the Air Conditioning system is operating is an example of a fault that is caused by an occupant and wastes energy.
Usually, building operational performance is measured in terms of comfort, system efficiency, productivity, environmental quality, and functionality. Unplanned occupancy, as well as an open window, will increase energy consumption and cost. The simulation data show how extra air conditioning power is consumed because of unplanned occupancy or an open window, and also capture the variations in indoor temperature. For example, opening a window might improve the air quality but at the same time degrade indoor thermal comfort.
As the FDI problem can be seen as a classification problem, three classes of data are used here: one class of fault-free data and two classes of faulty data. The time range of the data is from 0 h to 8760 h (1 year) with a 1 h time step (the dataset can be found at https://www.kaggle.com/sondesgharsellaoui/fdi-dataset). The generated data extracted from the Air Conditioning systems are presented in Figure 4. In many cases, the data do not represent the real values of the actions. Furthermore, various factors are linked to each other, and occupants are unable to choose the right actions. For example, closing a window might improve indoor thermal comfort but at the same time reduce the air quality. Similarly, the cooling load is very sensitive to outdoor temperature and is often not connected to the future energy consumption load. Generally, the energy consumption of the air conditioner depends on several external factors such as outdoor temperature, humidity, wind velocity and atmospheric pressure (see Table 7).

Fault Classification Results
Again, to evaluate the performance of the four classifiers, 10-fold cross-validation is employed. In order to carry out the proposed FDI approach, measurements of the five simulated variables are collected (see Table 7). These variables represent one healthy operating mode (assigned to class C0) and two different faulty operating modes (assigned to Ci, i = 1, 2), as presented in Table 8. In this case study, three groups of features are used: group 1: $\{T^2, \mathrm{SPE}\}$, group 2: $\{T^2, \mathrm{SWE}\}$ and group 3: $\{T_\ell\}$ (see Table 2). Tables 9 and 10 present the global accuracy for the different selected feature groups and for the applied classifiers. From these tables, it is clear that the features of group 2 ($T^2$ and SWE statistics) and the features of group 3 (the first $\ell = 3$ PCs) give the best compromise in terms of classification accuracy. It is also clear that the results using the SVM and KNN techniques with group 3 provide the best tradeoffs: the achieved accuracy rates are 89.17% and 90.70% (PCA-based) and 99.08% and 100% (MSPCA-based), respectively.
Tables 9 and 10 also show that MSPCA-based ML provides better classification accuracy than ML based on classical PCA. For example, MSPCA-based KNN gives 100% classification accuracy for groups 1 to 3.

Conclusions
In this paper, the problem of Fault Detection and Isolation (FDI) in Air Conditioning (AC) systems has been addressed. The developed technique is based on machine learning (ML) combined with Multiscale Principal Component Analysis (MSPCA). In the developed MSPCA-based ML approach, the MSPCA technique is used for feature extraction and selection, and the ML classifiers are applied for fault diagnosis. The proposed approach was developed for monitoring systems under normal and faulty conditions. Different cases were investigated in order to show the robustness and efficiency of the developed approach. The effectiveness of the FDI approach was studied using synthetic data and simulated Air Conditioning system data. The high detection and monitoring accuracy has a positive impact on the energy performance of Air Conditioning systems. Under different simulation conditions, the developed FDI technique showed good monitoring performance and high detection accuracy. In conclusion, the proposed fault detection and isolation method can help maintain comfort and reduce energy consumption.
There are a number of threats that may have an impact on the results of this study. The fault detection and isolation approaches proposed in this study were built using default parameters; that is, we have not investigated how these approaches are affected by varying the parameters. Thus, other approaches might be better at monitoring and diagnosing the faults.
The feature extraction approach proposed in the current work is based on the PCA model, which assumes that the relationships between the variables are linear. The next research direction is to extend the current work, to achieve further improvements and widen the applicability of the developed method in practice, by using the kernel PCA method. Kernel PCA is a nonlinear extension of the PCA algorithm, and it will be investigated as a feature extraction algorithm for the task of fault detection and isolation.
Additionally, in the current work, we assumed that the features are represented as single-valued data; however, more accurate monitoring and diagnosis can be obtained by representing the uncertainties in the systems using an interval-valued data representation. As future work, we plan to extend the current work to interval-valued data feature extraction based machine learning for detection and diagnosis purposes.