#### Summary

Algorithmic Learning in a Random World describes recent theoretical and experimental developments in building computable approximations to Kolmogorov's algorithmic notion of randomness.

Based on these approximations, a new set of machine learning algorithms has been developed that can be used to make predictions and to estimate their confidence and credibility in high-dimensional spaces under the usual assumption that the data are independent and identically distributed (assumption of randomness). Another aim of this unique monograph is to outline some limits of predictions: the approach based on the algorithmic theory of randomness allows for the proof of impossibility of prediction in certain situations. The book describes how several important machine learning problems, such as density estimation in high-dimensional spaces, cannot be solved if the only assumption is randomness.

#### Summary

This volume contains the papers presented at the 21st International Conference on Algorithmic Learning Theory (ALT 2010), which was held in Canberra, Australia, October 6–8, 2010. The conference was co-located with the 13th International Conference on Discovery Science (DS 2010) and with the Machine Learning Summer School, which was held just before ALT 2010.

The technical program of ALT 2010 contained 26 papers selected from 44 submissions, as well as invited talks. The invited talks were presented in joint sessions of both conferences. ALT 2010 was dedicated to the theoretical foundations of machine learning and took place on the campus of the Australian National University, Canberra, Australia.

#### Summary

Data analysis and inference have traditionally been research areas of statistics. However, the need to electronically store, manipulate and analyze large-scale, high-dimensional data sets requires new methods and tools, new types of databases, new efficient algorithms, new data structures, etc. - in effect new computational methods.

This monograph presents new intelligent data management methods and tools, such as the support vector machine, and new results from the field of inference, in particular of causal modeling. In 11 well-structured chapters, leading experts map out the major tendencies and future directions of intelligent data analysis. The book will become a valuable source of reference for researchers exploring the interdisciplinary area between statistics and computer science as well as for professionals applying advanced data analysis methods in industry and commerce. Students and lecturers will find the book useful as an introduction to the area.

#### Summary

This book is devoted to two interrelated techniques in solving some important problems in machine intelligence and pattern recognition, namely probabilistic reasoning and computational learning.

It is divided into four parts, the first of which describes several new inductive principles and techniques used in computational learning. The second part contains papers on Bayesian and Causal Belief networks. Part three includes chapters on case studies and descriptions of several hybrid systems and the final part describes some related theoretical work in the field of probabilistic reasoning.

#### Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications

*Book Morgan Kaufmann | 2014 | ISBN-10: 0123985374*

#### Summary

The conformal prediction framework is a recent development in machine learning that can associate a reliable measure of confidence with a prediction in any real-world pattern recognition application, including risk-sensitive applications such as medical diagnosis, face recognition, and financial risk prediction.

Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications captures the basic theory of the framework, demonstrates how to apply it to real-world problems, and presents several adaptations, including active learning, change detection, and anomaly detection. As practitioners and researchers around the world apply and adapt the framework, this edited volume brings together these bodies of work, providing a springboard for further research as well as a handbook for application in real-world problems.

#### Summary

This book constitutes the refereed proceedings of the 5th International Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2016, held in Madrid, Spain, in April 2016.

The 14 revised full papers presented together with 1 invited paper were carefully reviewed and selected from 23 submissions and cover topics on theory of conformal prediction; applications of conformal prediction; and machine learning.

#### Summary

This book celebrates the work of Vladimir Vapnik, developer of the support vector machine, which combines methods from statistical learning and functional analysis to create a new approach to learning problems. He remains as active as ever in his field.

#### Summary

Twenty-five years have passed since the publication of the Russian version of the book Estimation of Dependencies Based on Empirical Data (EDBED for short). Twenty-five years is a long period of time. During these years many things have happened. Looking back, one can see how rapidly life and technology have changed, and how slow and difficult it is to change the theoretical foundation of the technology and its philosophy.

I pursued two goals in writing this Afterword: to update the technical results presented in EDBED (the easy goal) and to describe a general picture of how the new ideas developed over these years (a much more difficult goal). The picture which I would like to present is a very personal (and therefore very biased) account of the development of one particular branch of science, Empirical Inference Science. Such accounts usually are not included in the content of technical publications. I have followed this rule in all of my previous books. But this time I would like to violate it for the following reasons. First of all, for me EDBED is an important milestone in the development of empirical inference theory and I would like to explain why. Second, during these years, there were a lot of discussions between supporters of the new paradigm (now it is called the VC theory) and the old one (classical statistics).

#### Summary

This book brings together historical notes, reviews of research developments, fresh ideas on how to make VC (Vapnik–Chervonenkis) guarantees tighter, and new technical contributions in the areas of machine learning, statistical inference, classification, algorithmic statistics, and pattern recognition.

The contributors are leading scientists in domains such as statistics, mathematics, and theoretical computer science, and the book will be of interest to researchers and graduate students in these domains.

#### Summary

The aim of this book is to discuss the fundamental ideas which lie behind the statistical theory of learning and generalization. It considers learning as a general problem of function estimation based on empirical data. Omitting proofs and technical details, the author concentrates on discussing the main results of learning theory and their connections to fundamental problems in statistics.

This second edition contains three new chapters devoted to further development of the learning theory and SVM techniques. Written in a readable and concise style, the book is intended for statisticians, mathematicians, physicists, and computer scientists.

#### Summary

A new game-theoretic approach to probability and finance. Probability and Finance presents essential reading for anyone who studies or uses probability. Mathematicians and statisticians will find in it a new framework for probability: game theory instead of measure theory. Philosophers will find a surprising synthesis of the objective and the subjective. Practitioners, especially in financial engineering, will learn new ways to understand and sometimes eliminate stochastic models.

The first half of the book explains a new mathematical and philosophical framework for probability, based on a sequential game between an idealized scientist and the world. Two very accessible introductory chapters, one presenting an overview of the new framework and one reviewing its historical context, are followed by a careful mathematical treatment of probability's classical limit theorems.

The second half of the book, on finance, illustrates the potential of the new framework. It proposes greater use of the market and less use of stochastic models in the pricing of financial derivatives, and it shows how purely game-theoretic probability can replace stochastic models in the efficient-market hypothesis.

#### Summary

This book constitutes the refereed proceedings of the Third International Symposium on Statistical Learning and Data Sciences, SLDS 2015, held in Egham, Surrey, UK, in April 2015.

The 36 revised full papers presented together with 2 invited papers were carefully reviewed and selected from 59 submissions. The papers are organized in topical sections on statistical learning and its applications, conformal prediction and its applications, new frontiers in data analysis for nuclear fusion, and geometric data analysis.

#### Summary

A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.

#### Abstract

Conformal predictive systems are a recent modification of conformal predictors that output, in regression problems, probability distributions for labels of test observations rather than set predictions. The extra information provided by conformal predictive systems may be useful, e.g., in decision-making problems. Conformal predictive systems inherit the relative computational inefficiency of conformal predictors. In this paper we discuss two computationally efficient versions of conformal predictive systems, which we call split conformal predictive systems and cross-conformal predictive systems, and discuss their advantages and limitations.

#### Abstract

This note explains how conformal predictive distributions can be used for the purpose of decision-making. Namely, a major limitation of conformal predictive distributions is that, at this time, they are only applicable to regression problems, where the label is a real number; however, this does not prevent them from being used in a general problem of decision making. The resulting methodology of conformal predictive decision making is illustrated on a small benchmark data set. Our main theoretical observation is that there exists an asymptotically efficient predictive decision-making system which can be obtained by using our methodology (and therefore, satisfying the standard property of validity).

#### Abstract

Venn predictors are a distribution-free probabilistic prediction framework that transforms the output of a scoring classifier into a (multi-)probabilistic prediction that has calibration guarantees, with the only requirement of an i.i.d. assumption for calibration and test data.

In this paper, we extend the framework from classification (where probabilities are predicted for a discrete number of labels) to regression (where labels form a continuum). We show how Venn Predictors can be applied on top of any regression method to obtain calibrated predictive distributions, without requiring assumptions beyond i.i.d. of calibration and test sets. This is contrasted with methods such as Bayesian Linear Regression, for which the calibration guarant

#### Detecting seizures in EEG recordings using Conformal Prediction

*Paper Proceedings of Machine Learning Research.*

#### Abstract

This study examines the use of the Conformal Prediction (CP) framework for the provision of confidence information in the detection of seizures in electroencephalograph (EEG) recordings. The detection of seizures is an important task since EEG recordings of seizures are of primary interest in the evaluation of epileptic patients. However, manual review of long-term EEG recordings for detecting and analyzing seizures that may have occurred is a time-consuming process. Therefore a technique for automatic detection of seizures in such recordings is highly beneficial since it can be used to significantly reduce the amount of data in need of manual review. Additionally, due to the infrequent and unpredictable occurrence of seizures, having high sensitivity is crucial for seizure detection systems. This is the main motivation for this study, since CP can be used for controlling the error rate of predictions and therefore guaranteeing an upper bound on the frequency of false negatives.

#### Cover Your Cough: Detection of Respiratory Events with Confidence Using a Smartwatch

*Paper Proceedings of Machine Learning Research.*

#### Abstract

Coughing and sneezing are the most common means of spreading respiratory diseases amongst humans. Existing approaches to detect coughing and sneezing events are either intrusive or do not provide any reliability measure. This paper offers a novel proposal to reliably and non-intrusively detect such events using a smartwatch as the underlying hardware and Conformal Prediction as the underlying software. We rigorously analysed the performance of our proposal with the Harvard ESC Environmental Sound dataset, and real coughing samples taken from a smartwatch in different ambient noises.

#### Exchangeability Martingales for Selecting Features in Anomaly Detection

*Paper Proceedings of Machine Learning Research.*

#### Abstract

We consider the problem of feature selection for unsupervised anomaly detection (AD) in time-series, where only normal examples are available for training. We develop a method based on exchangeability martingales that only keeps features that exhibit the same pattern (i.e., are i.i.d.) under normal conditions of the observed phenomenon. We apply this to the problem of monitoring a Windows service and detecting anomalies it exhibits if compromised; results show that our method i) strongly improves the AD system's performance and ii) reduces its computational complexity. Furthermore, it gives results that are easy to interpret for analysts, and it potentially increases robustness against AD evasion attacks.

#### Conformal prediction based on K-nearest neighbors for discrimination of ginsengs by a home-made electronic nose

*Journal Sensors, Vol. 17, No. 8.*

#### Abstract

Estimating the reliability of predictions is essential in electronic nose applications, yet it has not received enough attention. An algorithmic framework called conformal prediction is introduced in this work for discriminating different kinds of ginsengs with a home-made electronic nose instrument. A nonconformity measure based on k-nearest neighbors (KNN) is implemented as the underlying algorithm of conformal prediction.

In offline mode, the conformal predictor achieves a classification rate of 84.44% based on 1NN and 80.63% based on 3NN, which is better than that of simple KNN. In addition, it provides an estimate of reliability for each prediction.

In online mode, the validity of predictions is guaranteed, which means that the error rate of region predictions never exceeds the significance level set by a user. The potential of this framework for detecting borderline examples and outliers in the application of E-nose is also investigated. The result shows that conformal prediction is a promising framework for the application of electronic nose to make predictions with reliability and validity.
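The idea of a KNN-based conformal predictor can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the 1NN nonconformity score (nearest same-label distance divided by nearest other-label distance), the function names, and the toy data are all assumptions made for this example, and the data must contain at least two labels.

```python
import math

def nn_nonconformity(i, xs, ys):
    """Hypothetical 1NN score: nearest same-label distance / nearest other-label distance."""
    same = [math.dist(xs[i], xs[j]) for j in range(len(xs)) if j != i and ys[j] == ys[i]]
    other = [math.dist(xs[i], xs[j]) for j in range(len(xs)) if j != i and ys[j] != ys[i]]
    d_same = min(same) if same else math.inf
    d_other = min(other) if other else math.inf
    if d_other == 0.0:
        return math.inf  # sits on top of an example with a different label
    return d_same / d_other

def conformal_p_values(train_x, train_y, x_new, labels):
    """For each candidate label, the fraction of examples at least as strange as the new one."""
    p_values = {}
    for y in labels:
        xs = train_x + [x_new]   # tentatively add the new object with label y
        ys = train_y + [y]
        scores = [nn_nonconformity(i, xs, ys) for i in range(len(xs))]
        p_values[y] = sum(s >= scores[-1] for s in scores) / len(scores)
    return p_values

def prediction_set(p_values, epsilon):
    """Region prediction: keep every label whose p-value exceeds the significance level."""
    return {y for y, p in p_values.items() if p > epsilon}

# Toy data (invented for illustration): two well-separated clusters.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (0.9, 1.2)]
train_y = ["a", "a", "a", "b", "b", "b"]
p = conformal_p_values(train_x, train_y, (0.05, 0.1), ["a", "b"])
region = prediction_set(p, epsilon=0.2)
```

With only six training examples the smallest possible p-value is 1/7, so a significance level below that can never exclude a label; this is why the sketch uses `epsilon=0.2`.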

#### Conformal Prediction of Biological Activity of Chemical Compounds

*Journal Annals of Mathematics and Artificial Intelligence.*

#### Abstract

The paper presents an application of Conformal Predictors to a chemoinformatics problem of predicting the biological activities of chemical compounds. The paper addresses some specific challenges in this domain: a large number of compounds (training examples), high dimensionality of feature space, sparseness and a strong class imbalance. A variant of conformal predictors called Inductive Mondrian Conformal Predictor is applied to deal with these challenges. Results are presented for several non-conformity measures extracted from underlying algorithms and different kernels. A number of performance measures are used in order to demonstrate the flexibility of Inductive Mondrian Conformal Predictors in dealing with such a complex set of data. This approach allowed us to identify the most likely active compounds for a given biological target and present them in a ranking order.

#### Combination of Conformal Predictors for Classification

*Paper Proceedings of Machine Learning Research.*

#### Abstract

The paper presents some possible approaches to the combination of Conformal Predictors in the binary classification case. A first class of methods is based on p-value combination techniques that have been proposed in the context of Statistical Hypothesis Testing; a second class is based on the calibration of p-values into Bayes factors. A few methods from these two classes are applied to a real-world case, namely the chemoinformatics problem of Compound Activity Prediction. Their performance is discussed, showing their different abilities to preserve validity and improve efficiency. The experiments show that p-value combination, in particular Fisher’s method, can be advantageous when ranking compounds by strength of evidence.
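As a rough illustration of Fisher's method mentioned above, the sketch below combines k p-values through the statistic -2 Σ ln p_i, which follows a chi-squared distribution with 2k degrees of freedom when the p-values are independent and uniform. The function name is hypothetical and the code is not taken from the paper. Note that p-values produced by conformal predictors trained on the same data are generally not independent, so the combined value should be read as a ranking score rather than a guaranteed-valid p-value.

```python
import math

def fisher_combine(p_values):
    """Combine p-values with Fisher's method (assumes independence under the null)."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half = -sum(math.log(p) for p in p_values)  # half of the chi-squared statistic
    # Chi-squared survival function with 2k degrees of freedom has a closed form:
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total

combined = fisher_combine([0.05, 0.05])  # roughly 0.0175
```

A single p-value is returned unchanged, which is a convenient sanity check on the closed form.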

#### Abstract

This paper argues that the requirement of measurability (imposed on trading strategies) is indispensable in continuous-time game-theoretic probability. The necessity of the requirement of measurability in measure theory is demonstrated by results such as the Banach–Tarski paradox and is inherited by measure-theoretic probability. The situation in game-theoretic probability turns out to be somewhat similar in that dropping the requirement of measurability allows a trader in a financial security with a non-trivial price path to become infinitely rich while risking only one monetary unit.

#### Valid Probabilistic Prediction of Life Status after Percutaneous Coronary Intervention procedure

*Paper 6th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2017).*

#### Abstract

#### Inductive Conformal Martingales for Change-Point Detection

*Paper 6th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2017).*

#### Abstract

We consider the problem of quickest change-point detection in data streams. Classical change-point detection procedures, such as CUSUM, Shiryaev-Roberts and Posterior Probability statistics, are optimal only if the change-point model is known, which is an unrealistic assumption in typical applied problems. Instead we propose a new method for change-point detection based on Inductive Conformal Martingales, which requires only the independence and identical distribution of observations.

We compare the proposed approach to standard methods, as well as to change-point detection oracles, which model typical practical post-change data distributions. Results of the comparison provide evidence that change-point detection based on Inductive Conformal Martingales is an efficient tool, capable of working under quite general conditions, unlike traditional approaches.
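To give a feel for martingale-based change detection, here is a simplified sketch using the classical power martingale, which multiplies its value by ε·p^(ε−1) for each incoming conformal p-value and raises an alarm once a threshold is crossed. This is an assumption-laden illustration, not the procedure proposed in the paper: the function name, the choice ε = 0.5, the threshold of 20, and the p-value sequence are all invented, and the p-values are taken as given rather than computed from a trained inductive conformal predictor.

```python
import math

def power_martingale_alarm(p_values, epsilon=0.5, threshold=20.0):
    """Return the 1-based index of the first alarm, or None if the threshold is never reached."""
    log_m = 0.0                      # log of the martingale value, starting at M_0 = 1
    log_threshold = math.log(threshold)
    for n, p in enumerate(p_values, start=1):
        p = max(p, 1e-12)            # guard against log(0)
        # Betting function epsilon * p**(epsilon - 1) integrates to 1 over [0, 1],
        # so the product is a martingale when the p-values are uniform.
        log_m += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        if log_m >= log_threshold:
            return n
    return None

# Invented stream: unremarkable p-values, then persistently small ones after a change.
stream = [0.62, 0.35, 0.81, 0.47, 0.93, 0.15, 0.58,
          0.02, 0.01, 0.03, 0.01, 0.02, 0.01]
alarm_at = power_martingale_alarm(stream)
```

In this toy stream the change happens at the eighth observation and the alarm fires a few steps later. The plain power martingale shrinks while the data look exchangeable, which delays detection; practical schemes usually modify it to avoid that.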

#### Abstract

We construct universal prediction systems in the spirit of Popper's falsifiability and Kolmogorov complexity and randomness. These prediction systems do not depend on any statistical assumptions (but under the IID assumption they dominate, to within the usual accuracy, conformal prediction). Our constructions give rise to a theory of algorithmic complexity and randomness of time containing analogues of several notions and results of the classical theory of Kolmogorov complexity and randomness.

#### Abstract

We study optimal conformity measures for various criteria of efficiency of set-valued classification in an idealised setting. This leads to an important class of criteria of efficiency that we call probabilistic and argue for; it turns out that the most standard criteria of efficiency used in the literature on conformal prediction are not probabilistic unless the problem of classification is binary. We consider both unconditional and label-conditional conformal prediction.

#### All-solid-state carbonate-selective electrode based on screen-printed carbon paste electrode

*Journal Measurement Science and Technology, Vol. 28, No. 2.*

#### Abstract

A novel disposable all-solid-state carbonate-selective electrode based on a screen-printed carbon paste electrode using poly(3-octylthiophene-2,5-diyl) (POT) as an ion-to-electron transducer has been developed. The POT was dropped on the reaction area of the carbon paste electrode covered by the poly(vinyl chloride) (PVC) membrane, which contains N,N-Dioctyl-3α,12α-bis(4-trifluoroacetylbenzoyloxy)-5β-cholan-24-amide as a carbonate ionophore. The electrode showed a near-Nernstian slope of -27.5 mV/decade with a detection limit of 3.6 × 10⁻⁵ mol/L. Generally, the detection time was 30 s. Because these electrodes are fast, convenient and low in cost, they have the potential to be mass produced and used in on-site testing as disposable sensors. Furthermore, the repeatability, reproducibility and stability have been studied to evaluate the properties of the electrodes. Measurement of the carbonate was also conducted in human blood solution and achieved good performance.

#### Nonparametric predictive distributions based on conformal prediction

*Paper Sixth Symposium on Conformal and Probabilistic Prediction and Applications.*

#### Abstract

This paper applies conformal prediction to derive predictive distributions that are valid under a nonparametric assumption. Namely, we introduce and explore predictive distribution functions that always satisfy a natural property of validity in terms of guaranteed coverage for IID observations. The focus is on a prediction algorithm that we call the Least Squares Prediction Machine (LSPM). The LSPM generalizes the classical Dempster-Hill predictive distributions to regression problems. If the standard parametric assumptions for Least Squares linear regression hold, the LSPM is as efficient as the Dempster-Hill procedure, in a natural sense. And if those parametric assumptions fail, the LSPM is still valid, provided the observations are IID.

#### Abstract

This paper gives a simple construction of the pathwise Ito integral ∫ ϕ dω for an integrand ϕ and an integrator ω satisfying various topological and analytical conditions. The definition is purely pathwise in that neither ϕ nor ω is assumed to be a path of a process, and the Ito integral exists almost surely in a non-probabilistic finance-theoretic sense. For example, one of the results shows the existence of ∫ ϕ dω for a cadlag integrand ϕ and a cadlag integrator ω with jumps bounded in a predictable manner.

#### Prescience: Probabilistic Guidance on the Retraining Conundrum for Malware Detection

*Paper ACM Workshop on Artificial Intelligence and Security. Vienna, Austria: ACM, p. 71-82.*

#### Abstract

Malware evolves perpetually and relies on increasingly sophisticated attacks to supersede defense strategies. Data-driven approaches to malware detection run the risk of becoming rapidly antiquated. Keeping pace with malware requires models that are periodically enriched with fresh knowledge, commonly known as retraining. In this work, we propose the use of Venn-Abers predictors for assessing the quality of binary classification tasks as a first step towards identifying antiquated models. One of the key benefits behind the use of Venn-Abers predictors is that they are automatically well calibrated and offer probabilistic guidance on the identification of nonstationary populations of malware. Our framework is agnostic to the underlying classification algorithm and can then be used for building better retraining strategies in the presence of concept drift. Results obtained over a timeline-based evaluation with about 90K samples show that our framework can identify when models tend to become obsolete.

#### Aggregation Algorithm vs. Average For Time Series Prediction

*Paper ECML PKDD 2016 Workshop on Large-scale Learning from Data Streams in Evolving Environments, STREAMEVOLV-2016*

#### Abstract

Learning with expert advice as a scheme of on-line learning has been very successfully applied to various learning problems due to its strong theoretical basis. In this paper, for the purpose of time series prediction, we investigate the application of the Aggregation Algorithm, which is a generalisation of the famous weighted majority algorithm. The experimental results show that the Aggregation Algorithm performs very well in comparison to simple averaging.

#### An Upper Bound for Aggregating Algorithm for Regression with Changing Dependencies

*Paper 27th International Conference, ALT 2016.*

#### Abstract

The paper presents a competitive prediction-style upper bound on the square loss of the Aggregating Algorithm for Regression with Changing Dependencies in the linear case. The algorithm is able to compete with a sequence of linear predictors provided the sum of squared Euclidean norms of differences of regression coefficient vectors grows at a sublinear rate.

#### IBC-C: A Dataset for Armed Conflict Event Analysis

*Paper The Annual Meeting of the Association for Computational Linguistics (ACL) 2016.*

#### Abstract

We describe the Iraq Body Count Corpus (IBC-C) dataset, the first substantial armed conflict-related dataset which can be used for conflict analysis. IBC-C provides a ground-truth dataset for conflict specific named entity recognition, slot filling, and event de-duplication. IBC-C is constructed using data collected by the Iraq Body Count project which has been recording casualties resulting from the ongoing war in Iraq since 2003. We describe the dataset’s creation, how it can be used for the above three tasks and provide initial baseline results for the first task (named entity recognition) using Hidden Markov Models, Conditional Random Fields, and Recursive Neural Networks.

#### Valid Probabilistic Predictions for Ginseng with Venn Machines Using Electronic Nose

*Journal Sensors.*

#### Abstract

In the application of electronic noses (E-noses), probabilistic prediction is a good way to estimate how confident we are about our prediction. In this work, a homemade E-nose system embedded with 16 metal-oxide semi-conductive gas sensors was used to discriminate nine kinds of ginsengs of different species or production places. A flexible machine learning framework, the Venn machine (VM), was introduced to make a probabilistic prediction for each sample. Three Venn predictors were developed based on three classical probabilistic prediction methods (Platt’s method, Softmax regression and Naive Bayes). The three Venn predictors and the three classical probabilistic prediction methods were compared in terms of classification rate and especially the validity of the estimated probabilities. A best classification rate of 88.57% was achieved with Platt’s method in offline mode, and the classification rate of VM-SVM (Venn machine based on Support Vector Machine) was 86.35%, just 2.22 percentage points lower. The Venn predictors showed better validity than the corresponding classical probabilistic prediction methods, and the validity of VM-SVM was superior to that of the other methods. The results demonstrated that the Venn machine is a flexible tool for making precise and valid probabilistic predictions in the application of E-noses, and VM-SVM achieved the best performance for the probabilistic prediction of ginseng samples.

#### Internet discussion forums: Maximizing choice in health-seeking behaviour during public health emergencies

*Paper 2016 International Conference On Cyber Situational Awareness, Data Analytics And Assessment (CyberSA).*

#### Abstract

This paper proposes a new approach to improving our understanding of the suitability of internet discussion forums for use by health information seekers. We consider in particular their potential use during public health emergencies when access to conventional experts and healthcare professionals may be constrained. We explore potential benefits and challenges of crowdsourcing information on health issues in online environments through the context of Computer Science theories of Collective Intelligence [1, 2], which explore how members of a group - particularly when networked by computer systems - can reach a better solution than an individual working alone. We ask if online discussion forums can provide the 'clever mechanism' Surowiecki [3] proposed is necessary to harness such potential group wisdom, and help health information seekers to identify the best option or 'maximized choice' from a set of less-than-ideal choices, thus adding value to information seeking during public health emergencies.

#### Criteria of Efficiency for Conformal Prediction

*Paper 5th International Symposium, COPA 2016 Madrid, Spain, April 20–22, 2016 Proceedings.*

#### Abstract

We study optimal conformity measures for various criteria of efficiency in an idealised setting. This leads to an important class of criteria of efficiency that we call probabilistic; it turns out that the most standard criteria of efficiency used in the literature on conformal prediction are not probabilistic.

#### Universal probability-free conformal prediction

*Paper 5th International Symposium, COPA 2016 Madrid, Spain, April 20–22, 2016 Proceedings.*

#### Abstract

We construct a universal prediction system in the spirit of Popper's falsifiability and Kolmogorov complexity. This prediction system does not depend on any statistical assumptions, but under the IID assumption it dominates, although in a rather weak sense, conformal prediction.

#### An Experimental Study of the Intrinsic Stability of Random Forest Variable Importance Measures

*Journal BMC Bioinformatics.*

#### Abstract

The stability of Variable Importance Measures (VIMs) based on random forest has recently received increased attention. Despite the extensive attention on traditional stability of data perturbations or parameter variations, few studies include influences coming from the intrinsic randomness in generating VIMs, i.e. bagging, randomization and permutation. To address these influences, in this paper we introduce a new concept of intrinsic stability of VIMs, which is defined as the self-consistence among feature rankings in repeated runs of VIMs without data perturbations and parameter variations. Two widely used VIMs, i.e., Mean Decrease Accuracy (MDA) and Mean Decrease Gini (MDG) are comprehensively investigated. The motivation of this study is two-fold. First, we empirically verify the prevalence of intrinsic stability of VIMs over many real-world datasets to highlight that the instability of VIMs does not originate exclusively from data perturbations or parameter variations, but also stems from the intrinsic randomness of VIMs. Second, through Spearman and Pearson tests we comprehensively investigate how different factors influence the intrinsic stability.

#### Comparison and data fusion of electronic nose and near-infrared reflectance spectroscopy for the discrimination of ginsengs

*Journal Analytical Methods.*

#### Abstract

This paper reports a hybrid system consisting of a homemade electronic nose system (E-nose) with a sensor array of 16 metal-oxide sensors and a near-infrared reflectance spectroscopy (NIRS) system for discriminating different kinds of ginsengs. 315 samples from 9 kinds of ginsengs were measured by using both systems with simple sample pretreatment. Six commonly used features were extracted from each sensor in the E-nose sensor array. Principal Component Analysis (PCA) was used to reduce the dimension of NIRS data. Then, models were built and trained with a support vector machine separately using datasets of the two systems. The classification performances of individual systems were optimized and compared. The advantages and disadvantages of the two systems were demonstrated by comparing empirical probability distributions in the category of predicted labels for all samples. Finally, new weighted feature-level data-fusion and Dempster–Shafer-theory based decision-level data-fusion approaches for the hybrid system were separately exploited. The results showed that the hybrid system achieved an optimal classification accuracy of 99.58% with weighted feature-level fusion and 99.24% with decision-level fusion, significantly outperforming the individual systems (90.18% by the E-nose and 97.98% by NIRS) according to Student's t-test (p < 0.001).