Michel Chamat, Dimitrios Bersi Kodra
Translation of pages 3-5, 16-22, 29-30 and 34-49, BMI SKK
3.2.1.1. Regression

This supervised algorithm predicts continuous values from a historical dataset by fitting the dataset to a curve relating two sets of variables: the dependent and the independent variables. The type of regression is chosen based on the relation between the two sets. Simple regression is based on a single independent variable, linear or non-linear, from which the dependent variable is predicted, while multiple regression is the extension of the simple type and is based on multiple variables, with the capability of the algorithm increasing proportionally with the complexity of the model. Simple regression is a fast and straightforward algorithm that can predict continuous outputs from a two-dimensional curve, although with several variables regression becomes complicated, especially with a multi-dimensional curve. Moreover, regression models are divided into two major subcategories according to their linearity:

Fig. 9. Linear (first two) and non-linear fit of samples, from [21].

Linear regression applies when the dependency of the output on the input can be defined by a linear function. Non-linear regression, on the other hand, is defined by non-linear functions such as exponentials, polynomials and quadratics.

3.2.1.2. Classification

In machine learning, classification is another supervised learning approach, which can be characterized as a means of categorizing, i.e. classifying, unknown variables into a discrete set of classes. Classification attempts to learn the relationship between a set of feature variables and a target variable of interest [22]. In a classification problem there exist two major constraints, whose simultaneous fulfillment is sometimes unrealistic: first, the set of classes must cover the whole possible output space; secondly, and most importantly for this family of ML algorithms, the output value is discrete, with each example corresponding to precisely one class.
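As a brief aside on section 3.2.1.1: the difference between a linear and a non-linear (here quadratic) fit, as in Fig. 9, can be sketched with NumPy's least-squares polynomial fitting. The data below are synthetic and purely illustrative, not taken from the thesis.

```python
import numpy as np

# Synthetic, illustrative data: y is roughly quadratic in x, with noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 0.5 * x**2 - 2.0 * x + 1.0 + rng.normal(0, 1.0, size=x.size)

# Simple linear regression: fit y = a*x + b (degree-1 polynomial).
linear_coeffs = np.polyfit(x, y, deg=1)

# Non-linear (quadratic) regression: fit y = a*x^2 + b*x + c.
quad_coeffs = np.polyfit(x, y, deg=2)

# Compare the residual sum of squares of the two fits.
linear_rss = np.sum((np.polyval(linear_coeffs, x) - y) ** 2)
quad_rss = np.sum((np.polyval(quad_coeffs, x) - y) ** 2)
print(linear_rss > quad_rss)  # -> True: the quadratic model fits this curved data better
```

Since the degree-1 model is a special case of the degree-2 model, the quadratic least-squares fit can never have a larger residual error, which is why the non-linear curve tracks the samples in Fig. 9 more closely.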
There are occasions in which examples might correspond to several classes or, alternatively, examples that cannot be classified for a particular input value, a problem that "fuzzy" classifiers try to solve [21] [23].

Data classification has several applications across modern industry. Most importantly, many of the problems that developers are trying to solve can be expressed as an association between a feature and a target variable, and this relation extends the applicability of classification to a vast range of different scenarios. The most commonly used types of classification algorithms in machine learning are [24]:

· Naïve Bayes: ML algorithms that apply the principle of Maximum A Posteriori (MAP) to the classification problem.
· Decision trees: an approach that splits the training set into distinct nodes, where one node can include one, most of, or all of the different categories into which the database can be divided.
· k-Nearest Neighbors (k-NN): a non-parametric classification algorithm that measures the distance of the unknown input from every other training example [6].
· Logistic regression: a statistical ML technique that classifies the dataset records based on the values of the input fields.
· Support Vector Machines (SVM): among the most robust algorithms, based on maximizing the minimum distance from the decision line, i.e. the separator, to the nearest example.
· Neural Networks (NN): based on the log-likelihood function with respect to the network parameters, extending to the multiclass problem and yielding as output a binary or N-ary result.

Less popular algorithms, such as Extreme Learning Machines (ELM) and Linear Discriminant Analysis (LDA), show poor capabilities in evaluation [25] and a high rate of misjudgment [26], respectively.
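The k-NN entry in the list above can be illustrated with a minimal, self-contained sketch: the unknown input is assigned the majority label among its k closest training examples. The toy data and the helper name `knn_predict` are illustrative assumptions, not part of the thesis.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distances
    nearest = np.argsort(distances)[:k]                  # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Two illustrative clusters: class 0 near the origin, class 1 near (5, 5).
X_train = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # -> 0
print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # -> 1
```

Note the non-parametric character mentioned above: no model is trained, so every prediction must compute the distance to all stored training examples.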
3.2.2. Unsupervised Learning

The most important difference between supervised and unsupervised learning is that the training data are unlabeled, in the sense that the algorithm does not have any specific information about the nature or the class of the inputs. This method focuses on finding clusters of similar inputs in the data and consequently categorizing the unlabeled values into related groups. Moreover, the unsupervised approach seeks to determine the distribution of the data within the input space, known as density estimation, or to project the data from a high-dimensional space down to lower-dimensional and much simpler representations for visualization [22]. Some of the most important unsupervised ML algorithms are shown in table 1 and in fig. 10 [23].

3.2.3. Semi-Supervised Learning

Semi-supervised learning, as the name describes, is a mixture of the two methods presented in sections 3.2.1 and 3.2.2. These algorithms can operate with partially labeled data, while the rest of the data, the majority in most cases, remain unlabeled. A good example of semi-supervised algorithms is reinforcement learning, in which a so-called agent [23] observes the environment and selects its best action according to a policy defined by previous situations. These actions result in rewards or penalties, depending on whether the outcome is positive or negative, respectively. In [7] [8], semi-supervised techniques are used to explore the capability of implementing ML in the mobile network of LTE systems.
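The agent/environment loop with rewards and penalties described above can be sketched as a tiny tabular Q-learning example. The 5-state corridor environment, the reward values and the hyperparameters are all illustrative assumptions, not a method used in the cited works.

```python
import numpy as np

# Illustrative environment: a corridor of 5 states; the agent starts at state 0
# and earns a reward only upon reaching state 4. Actions: 0 = left, 1 = right.
N_STATES, GOAL = 5, 4

def step(state, action):
    """Deterministic transition: move left or right, reward 1.0 at the goal."""
    next_state = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

# Q-learning: the agent improves its action-value table from observed rewards.
rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, 2))
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for episode in range(200):
    state = 0
    while state != GOAL:
        # Epsilon-greedy policy: mostly exploit the table, sometimes explore.
        action = rng.integers(2) if rng.random() < epsilon else int(Q[state].argmax())
        next_state, reward = step(state, action)
        # Update toward the reward plus the discounted best future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

# After training, the greedy policy moves right in every non-goal state.
print(Q.argmax(axis=1)[:GOAL])
```

This mirrors the description above: the policy (the Q-table) is shaped entirely by the rewards of previous situations rather than by labeled input-output pairs.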