Classification methods
Supervised classification is the most common task in ML.
It addresses the question of assigning a class to an input,
where the class is selected from a set of categorical values.
A set of annotated examples is given, where for each example
the value of its associated class is known.
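As a minimal sketch of this setting (using a generic off-the-shelf classifier from scikit-learn, chosen only for illustration):

```python
# Minimal sketch of supervised classification: a labeled training set is used
# to learn a mapping from inputs to categorical class values, which is then
# applied to unseen examples.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                    # inputs and their known class labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(random_state=0)
clf.fit(X_train, y_train)                            # learn from the annotated examples
y_pred = clf.predict(X_test)                         # assign classes to new inputs
print("accuracy:", accuracy_score(y_test, y_pred))
```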
Work on the design of new classification methods based on vine copula models has been presented in
Carrera_et_al:2016, Carrera_et_al:2019. These vine copula classifiers exploit the capacity of
vine copulas to capture diverse patterns of interaction between variables.
Classification methods based on classifiers evolved by means of evolutionary algorithms have been
presented in Santana_et_al:2011, Santana_et_al:2012c, Roman_et_al:2019, Santana_et_al:2019.
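The specific evolutionary schemes differ across these works; the following is only a rough, simplified sketch of the general idea, where candidate classifiers are represented as binary feature masks for a fixed base model and fitness is cross-validated accuracy:

```python
# Rough sketch of evolving classifiers with an evolutionary loop
# (a simplified stand-in, not the specific methods cited above).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_features = X.shape[1]

def fitness(mask):
    # fitness = cross-validated accuracy of the base classifier
    # restricted to the features selected by the binary mask
    if not mask.any():
        return 0.0
    return cross_val_score(GaussianNB(), X[:, mask], y, cv=3).mean()

# initial random population of feature masks
population = [rng.random(n_features) < 0.5 for _ in range(10)]
for generation in range(15):
    scores = [fitness(m) for m in population]
    best = population[int(np.argmax(scores))]
    # offspring: mutated copies of the best mask (random bit flips)
    population = [best] + [best ^ (rng.random(n_features) < 0.1) for _ in range(9)]

print("best fitness:", max(fitness(m) for m in population))
```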
A variety of classification approaches to problems from neuroscience are presented in
Santana_et_al:2011f, Santana:2013a, Santana_et_al:2012, Zhang_et_al:2015, Santana_et_al:2015d,
Santana_et_al:2019. In some cases, the introduced classifiers are based on unsupervised learning
algorithms, as in the application of the affinity propagation algorithm to neuron morphology
classification in Santana_et_al:2013d.
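Affinity propagation selects exemplars among the data points themselves, so the number of clusters does not need to be fixed in advance. A minimal sketch on synthetic data (standing in for morphological feature vectors):

```python
# Minimal sketch of affinity propagation clustering (unsupervised):
# exemplars are selected among the data points and each point is assigned
# to its closest exemplar. Synthetic blobs stand in for morphology features.
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=120, centers=4, random_state=0)
ap = AffinityPropagation(random_state=0).fit(X)
print("number of clusters found:", len(ap.cluster_centers_indices_))
print("cluster labels of first 10 points:", ap.labels_[:10])
```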
Recent work on the design of multi-task prediction models based on deep neural networks has been
presented in Garciarena_et_al:2020c, Garciarena_et_al:2021b.
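The architectures in those works are specific to the tasks addressed; as a hedged, hypothetical sketch of the general multi-task idea, the following model shares a trunk between a classification head and a regression head and combines both losses during training:

```python
# Hypothetical sketch of a multi-task neural network (not the models from the
# cited works): a shared trunk feeds two task-specific heads, and the losses
# of both tasks are combined so gradients flow through the shared parameters.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, n_inputs, n_classes):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_inputs, 64), nn.ReLU())
        self.cls_head = nn.Linear(64, n_classes)   # classification task
        self.reg_head = nn.Linear(64, 1)           # regression task

    def forward(self, x):
        h = self.trunk(x)
        return self.cls_head(h), self.reg_head(h)

model = MultiTaskNet(n_inputs=10, n_classes=3)
x = torch.randn(32, 10)                            # synthetic batch of inputs
y_cls = torch.randint(0, 3, (32,))                 # synthetic class labels
y_reg = torch.randn(32, 1)                         # synthetic regression targets

logits, reg_out = model(x)
loss = nn.CrossEntropyLoss()(logits, y_cls) + nn.MSELoss()(reg_out, y_reg)
loss.backward()                                    # both tasks update the shared trunk
print("combined loss:", float(loss))
```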
Regression methods
Regression is one of the two most common tasks in ML. Given a set of inputs with the corresponding
values of one or more target variables, the problem consists of predicting the value of the target
variables for unlabeled examples. Usually, the target variables take values in a continuous domain.
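A minimal sketch of this setting with an off-the-shelf regressor:

```python
# Minimal sketch of regression: predict a continuous target for new inputs
# from examples whose target values are known.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reg = Ridge().fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, reg.predict(X_test)))
```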
Most of the research on regression methods has focused on the solution of real-world problems with
particular characteristics (e.g., feature extraction is required, multiple target variables need to
be predicted, etc.).
For example, in Murua_et_al:2018, the tool wear prediction problem for the Inconel 718 material was
addressed. For this problem, feature extraction from the cutting forces is necessary for a more
accurate prediction, and different regression methods were evaluated.
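As a rough, hypothetical sketch of that workflow (the features and signals below are illustrative placeholders, not the actual feature set or data of that study), simple summary statistics are extracted from force-like signals and several regressors are compared by cross-validation:

```python
# Hypothetical sketch: extract summary features (RMS, mean, peak) from
# synthetic force-like signals and compare several regressors.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
signals = rng.normal(size=(100, 500))        # 100 synthetic force signals
wear = rng.uniform(0, 1, size=100)           # synthetic wear targets

def extract_features(sig):
    # simple per-signal descriptors used as regression inputs
    return [np.sqrt(np.mean(sig**2)), np.mean(sig), np.max(np.abs(sig))]

X = np.array([extract_features(s) for s in signals])

for name, model in [("ridge", Ridge()),
                    ("random forest", RandomForestRegressor(random_state=0)),
                    ("SVR", SVR())]:
    score = cross_val_score(model, X, wear, cv=5, scoring="neg_mean_squared_error").mean()
    print(name, "mean neg-MSE:", round(score, 4))
```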
In Khargharia_et_al:2020,
we investigate the trade-off between the accuracy and the overall complexity of sets of RNNs that are used together
to predict the volume of vehicles in a network of gas stations. In
GarciaRodriguez_et_al:2021, isotonic regression and regressors based on different multi-layer
perceptron architectures are compared to other traditional regression methods for predicting the
winning award price in the public procurement process.
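The sketch below compares isotonic regression, an MLP regressor, and ordinary linear regression on synthetic monotone data; it only illustrates the kind of comparison performed, not the procurement dataset or the exact models of that work:

```python
# Comparison sketch on synthetic data: isotonic regression, an MLP regressor,
# and linear regression fit a one-dimensional, monotonically increasing target.
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 300))
y = np.log1p(x) + rng.normal(scale=0.1, size=x.size)   # monotone trend plus noise

iso = IsotonicRegression().fit(x, y)
mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(x.reshape(-1, 1), y)
lin = LinearRegression().fit(x.reshape(-1, 1), y)

print("isotonic MSE:", mean_squared_error(y, iso.predict(x)))
print("MLP MSE:     ", mean_squared_error(y, mlp.predict(x.reshape(-1, 1))))
print("linear MSE:  ", mean_squared_error(y, lin.predict(x.reshape(-1, 1))))
```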
More recently, in Roman_et_al:2021,
a three-objective regression problem is addressed in the context of post-editing effort estimation from
sentence embedding representation. An approach based on genetic programming is used to evolve kernels that are suitable
for predicting several metrics at the same time.
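The kernels in that work are evolved with genetic programming; the sketch below is only a stand-in that uses a fixed RBF kernel in a Gaussian process to illustrate predicting several target metrics with a single kernel-based model:

```python
# Stand-in sketch: a fixed RBF kernel (not an evolved one) in a Gaussian
# process predicts several target metrics at the same time.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(80, 4))                                   # synthetic embedding-like inputs
Y = np.column_stack([X.sum(axis=1), np.sin(X[:, 0]), X[:, 1] ** 2])    # three synthetic target metrics

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2).fit(X, Y)
print("predictions for one input:", gpr.predict(X[:1]))                # one value per target metric
```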
Imputation and data augmentation methods
The goal of imputation is to complete missing values in incomplete or corrupted examples.
Data augmentation approaches are used to generate new examples that resemble those in the training set.
Both imputation and data augmentation methods are very important in scenarios in which the data
available for training the ML model is scarce.
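A minimal imputation sketch (mean imputation with scikit-learn's SimpleImputer):

```python
# Minimal imputation sketch: missing entries (NaN) are filled with a
# per-column statistic before the data is passed to an ML model.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan]])
imputer = SimpleImputer(strategy="mean")
print(imputer.fit_transform(X))   # NaNs replaced by column means
```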
A review of the most common types of missing data, and of the imputation methods used to address
them, is presented in Garciarena_and_Santana:2017.
Methods that incorporate the automatic selection of the imputation strategy as part of the
design of ML pipelines for classification problems are introduced in
Garciarena_et_al:2018, Garciarena_et_al:2018c.
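A much-simplified illustration of the underlying idea (not the pipeline-optimization methods of those papers) is to treat the imputation strategy as one more searchable component of a classification pipeline:

```python
# Simplified illustration: the imputation strategy is a tunable component of a
# classification pipeline and is selected by cross-validated search together
# with a classifier hyperparameter.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.1] = np.nan          # introduce roughly 10% missing values

pipe = Pipeline([("impute", SimpleImputer()),
                 ("clf", LogisticRegression(max_iter=1000))])
grid = {"impute__strategy": ["mean", "median", "most_frequent"],
        "clf__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, grid, cv=5).fit(X, y)
print("best configuration:", search.best_params_)
```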
Data generation approaches for topic classification were presented in
Montenegro_et_al:2019a.
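As a generic, hedged illustration of data augmentation for text classification (a simple stand-in, not the generation approach of that work), new training sentences can be produced by randomly dropping words from existing labeled examples:

```python
# Generic stand-in sketch of text data augmentation: new training sentences
# are produced by randomly dropping words from existing labeled examples.
import random

random.seed(0)

def augment(sentence, n_copies=3, drop_prob=0.2):
    words = sentence.split()
    copies = []
    for _ in range(n_copies):
        kept = [w for w in words if random.random() > drop_prob] or words
        copies.append(" ".join(kept))
    return copies

example = "the central bank raised interest rates again this quarter"
for variant in augment(example):
    print(variant)    # each variant keeps the original topic label
```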