SVM |
It is useful for two-group classification problems. The idea is to find a separating function, called a hyperplane, by solving an optimization problem built from the instances of the training subset [40]. This hyperplane is then used to classify the instances of the test subset into two disjoint groups (see the SVM sketch after the table). |
Supervised |
NB |
It was inspired by the studies of the Reverend Bayes on conditional probability [41]. These conditional probabilities are used to identify which category, out of n possible ones, a particular instance belongs to [42] (see the NB sketch after the table). |
Supervised |
KNN |
A vector norm is a mathematical function that satisfies specific properties and associates each vector with a value greater than or equal to zero [43]. The norm of the difference between two vectors is the distance between them. KNN uses a norm to calculate the distances between all the vectors (instances) that make up the database; then, for each vector, the k vectors closest to it are determined, and inclusion in a given group is decided by a majority vote among these neighbors [44,45] (see the KNN sketch after the table). |
Supervised |
GA |
Algorithms inspired by the biological evolution of species, in which each candidate solution to the problem is modeled as a chromosome consisting of a set of genes; during the execution of the algorithm, the chromosomes undergo crossing-over and mutation operations in order to obtain better solutions than the current ones [46]. In this way, they allow a database to be separated, for example, into two distinct groups: one that has and one that does not have a particular characteristic (see the genetic algorithm sketch after the table). |
Supervised |
RF |
This method is based on the construction of several decision trees. The first step is to draw several random samples (with replacement) of instances to build new databases, a process called bootstrapping. Each of these new databases gives rise to a decision tree, built iteratively from a random subset of variables (features). After all the trees are constructed, a new instance is allocated to the group indicated by the largest number of decision trees, that is, by a majority of votes [47,48] (see the RF sketch after the table). |
Supervised |
K-means |
It allows partitioning a database into k groups with similar characteristics. To do so, a set of reference vectors, called centroids, one for each group, is updated iteratively, and the distance from each instance to each centroid is calculated; an instance is always allocated to the centroid to which its distance is shortest. The elbow chart is generally used to determine the ideal number of groups into which to separate the database [49] (see the K-means sketch after the table). |
Unsupervised |
ANN |
Inspired by biological nervous systems, it uses a structure called a graph (a set of nodes and edges) in which the nodes are arranged in layers and connected by weighted edges, each weight being assigned to a given connection. The idea is that, from a set of inputs, these weights are adjusted so as to produce the desired output. Several architectures have been proposed for neural networks, from simpler ones, such as the perceptron, to more sophisticated ones, such as radial basis function networks, convolutional networks, and deep learning. In deep learning, in addition to the input and output layers, there are hidden layers that significantly increase the number of weights to be updated and often demand a huge computational effort. Convolutional networks are a type of deep learning inspired by the visual cortex of animals and play an important role in image analysis. Autoencoders and Kohonen neural networks are examples of unsupervised learning [1,7,50-52] (see the ANN sketch after the table). |
Unsupervised or Supervised |
GB |
It is a tree-based method that uses gradients, vectors pointing in the direction of maximum increase of a mathematical function, to produce sequential decision trees that are combined to refine the prediction. Variants of this approach include stochastic gradient boosting, which incorporates random subsampling into GB [53,54] (see the GB sketch after the table). |
Supervised |
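
SVM sketch. A minimal illustration of two-group classification with a linear SVM; it assumes scikit-learn and uses synthetic data in place of real instances, so every name and parameter below is illustrative, not taken from the article.

```python
# Fit a linear SVM on a synthetic two-group problem.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear")        # the fitted model defines the separating hyperplane
clf.fit(X_train, y_train)         # trained by solving a constrained optimization problem
print(clf.score(X_test, y_test))  # accuracy on the test subset
```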
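NB sketch. A minimal example of Naive Bayes classification, again assuming scikit-learn and synthetic data; GaussianNB is one common variant, chosen here only for illustration.

```python
# Classify by comparing conditional probabilities per category.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=200, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

clf = GaussianNB().fit(X_train, y_train)
print(clf.predict_proba(X_test[:3]))  # conditional probability of each category
print(clf.predict(X_test[:3]))        # category with the highest probability
```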
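KNN sketch. A minimal example assuming scikit-learn; the parameter p=2 selects the Euclidean norm, so the distance between two instances is the norm of their difference, as described in the table.

```python
# Classify each test instance by majority vote among its k nearest neighbors.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

clf = KNeighborsClassifier(n_neighbors=5, p=2)  # k = 5, Euclidean norm
clf.fit(X_train, y_train)
print(clf.predict(X_test[:3]))  # majority vote among the 5 closest vectors
```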
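Genetic algorithm sketch. A minimal from-scratch toy, not the article's method: chromosomes are bit strings, the fitness function simply counts 1-genes, and each generation applies selection, crossing-over, and mutation.

```python
# Toy genetic algorithm: evolve bit strings toward all ones.
import random

def fitness(chrom):            # toy objective: number of 1-genes
    return sum(chrom)

def crossover(a, b):           # single-point crossing-over of two parents
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(chrom, rate=0.01):  # flip each gene with a small probability
    return [g ^ 1 if random.random() < rate else g for g in chrom]

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
for _ in range(50):                          # evolve for 50 generations
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                       # keep the fittest candidates
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(len(pop) - len(parents))]
    pop = parents + children
print(fitness(max(pop, key=fitness)))        # fitness of the best solution found
```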
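RF sketch. A minimal example assuming scikit-learn, which performs the bootstrapping and per-tree feature subsampling internally; the number of trees is illustrative.

```python
# Build 100 bootstrapped decision trees and predict by majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=10, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

clf = RandomForestClassifier(n_estimators=100, bootstrap=True, random_state=3)
clf.fit(X_train, y_train)
print(clf.predict(X_test[:3]))  # majority of votes across the 100 trees
```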
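K-means sketch. A minimal example assuming scikit-learn and synthetic unlabeled data; it prints the inertia values that one would plot to draw the elbow chart mentioned in the table.

```python
# Run k-means for several k and collect inertia for an elbow chart.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, random_state=4)

for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=4).fit(X)
    print(k, km.inertia_)  # plot k vs. inertia and look for the "elbow"
```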
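ANN sketch. A minimal feedforward network (multilayer perceptron) assuming scikit-learn's MLPClassifier; the two hidden-layer sizes are illustrative, and deeper architectures such as convolutional networks would require a dedicated framework.

```python
# Train a small multilayer perceptron; fitting updates the edge weights.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=5)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=5)

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=5)
clf.fit(X_train, y_train)         # iteratively adjusts the connection weights
print(clf.score(X_test, y_test))  # accuracy on the test subset
```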
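GB sketch. A minimal example assuming scikit-learn; setting subsample below 1.0 adds the random subsampling that turns plain gradient boosting into stochastic gradient boosting.

```python
# Sequential trees, each one correcting the current ensemble's errors.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=10, random_state=6)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=6)

clf = GradientBoostingClassifier(n_estimators=100, subsample=0.8, random_state=6)
clf.fit(X_train, y_train)         # subsample=0.8 gives the stochastic variant
print(clf.score(X_test, y_test))  # accuracy on the test subset
```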