Fragment-Potential Machine Learning And Modeling Approach

Vyacheslav S. Pereverzev-Orlov,

Institute for Information Transmission Problems, Russian Academy of Sciences,
19 Ermolovoy st., Moscow, 101447, Russia
e-mail: peror@ippi.ac.msk.su

Machine learning methods as a modelling tool are relatively young, and so far they are not as well known and widely applied as classical mathematical modelling approaches; but they have features of their own that make them effective where classical methods fail. This applies primarily to ill-defined problems, where data are unreliable and incomplete and the target classes are mixed in the description space.

It is usually not emphasized, and therefore not widely known, that many recognition learning methods are at the same time methods of transforming the source data description into a new one that is more adequate to the underlying problem and admits a simpler solution.

"Fragment-Potential" (FP) method belongs to this class. We have been developing and investigating this method as a part of Partner System project aimed to extend the abilities of human intelligence. FP approach is based on the following ideas. At the first learning stage the source data description and classification is used. The Fragment procedure is applied to generate a set of low-dimensional piece-wise linear discriminating rules for each pair of classes. Each of these rules only partly solves the discrimination problem: some objects of the data set may be wrong classified, others may be left non- classified by this or that particular rule.

At the second stage the rules are treated as voting elements. This maps the data set into the so-called "potential space" by means of a voting matrix, or "potential matrix", B of dimensions C*C. For a fixed object and a pair of distinct classes I and J, the symmetric elements Bij and Bji give the number of rules (votes) that assigned the object to class I and to class J within this pair, respectively.
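A sketch of the voting stage, under the assumption that a rule is any callable returning a class label or None: the pairwise votes are accumulated in a C*C potential matrix B, with B[i][j] counting the rules of pair (i, j) that assigned the object to class i.

```python
# Accumulate pairwise rule votes for one object into a C x C potential
# matrix B. pair_rules maps an unordered pair (i, j), i < j, to the list
# of discriminating rules built for that pair.

def potential_matrix(x, pair_rules, n_classes):
    B = [[0] * n_classes for _ in range(n_classes)]
    for (i, j), rules in pair_rules.items():
        for rule in rules:
            vote = rule(x)
            if vote == i:
                B[i][j] += 1   # a vote for class i within the pair (i, j)
            elif vote == j:
                B[j][i] += 1   # a vote for class j within the pair
            # vote is None: the rule left the object unclassified
    return B
```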

The dimension of the potential space so created equals C*(C-1)/2, but it is reasonable to enlarge it by adding various functions of the potentials, such as the sums of Bij over rows and/or columns and their ratios. These functions proved to be highly informative with respect to the classification considered. Column sums express similarity to the corresponding classes, row sums express dissimilarity, and ratios or differences of the symmetric sums may serve as likelihood measures for the hypothesis that an object belongs to the corresponding class.
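Such derived features can be sketched as follows. Note the indexing convention: with B[i][j] storing votes *for* class i (as in the voting sketch above), the row sum counts votes for a class and the column sum votes against it; the paper's row/column roles correspond to the transposed matrix. The exact feature set chosen here is an illustrative assumption.

```python
# Enlarge the potential description of one object with functions of the
# potentials: per-class vote totals and their differences, alongside the
# raw pairwise potentials themselves.

def potential_features(B):
    n = len(B)
    votes_for = [sum(B[i]) for i in range(n)]          # row sums: votes for class i
    votes_against = [sum(B[i][j] for i in range(n))    # column sums: votes against j
                     for j in range(n)]
    # difference of the symmetric sums, a likelihood-style membership score
    score = [votes_for[k] - votes_against[k] for k in range(n)]
    pairwise = [B[i][j] for i in range(n) for j in range(n) if i != j]
    return pairwise + votes_for + votes_against + score
```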

The mapping to the potential space has the properties of a rectifying mapping: the classes are more compact in the new description, which yields a simpler solution of the recognition problem. Moreover, the larger the number of classes considered, the richer the potential space. Recognition learning problems can be solved successfully when the classes are compact or homogeneous, but this is rare in complex real-world problems. Usually there exist "hidden" subclasses that are genuinely compact and homogeneous and reflect the nature of the underlying problem; their number is much greater than the number of classes given in the data. The FP method aims to reveal these subclasses as an intermediate step towards solving the recognition problem.

The modes of the distribution density of class objects in the potential space give a good hint for revealing subclasses: the bulk of the class objects is concentrated in the vicinity of these modes. Subclasses built around them can be considered homogeneous and, at the same time, a good approximation of the distribution function.

The Potential procedure, the second part of the FP method, finds subclasses by clustering in the potential space. Then the Fragment procedure may be applied again, this time to the newly revealed subclasses instead of the source data classes. Both procedures may be iterated in this way, increasing the number, compactness, and homogeneity of the subclasses, until the desired result is reached. Thus the FP method raises the organization level of ill-organized data and simplifies the prediction problem.
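The overall iteration can be shown as a schematic loop. Here fragment_fn, map_fn and cluster_fn are stand-ins for the paper's Fragment procedure, the potential mapping, and the Potential clustering procedure; only the control flow, with subclass labels feeding the next round, is taken from the text.

```python
# Schematic of the FP iteration: build pairwise rules for the current
# labels, map objects to the potential space, cluster there, and let the
# subclass ids become the labels for the next round.

def fp_iterate(data, labels, fragment_fn, map_fn, cluster_fn, n_rounds=3):
    for _ in range(n_rounds):
        rules = fragment_fn(data, labels)               # pairwise discriminating rules
        potentials = [map_fn(x, rules) for x in data]   # mapping to the potential space
        labels = cluster_fn(potentials)                 # subclasses replace classes
    return labels, rules
```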

As a result, the data organization model is represented by the clusters and decision elements found: the decision elements define the potential space, and the clusters are in turn found in that space. New data reflecting the events to be predicted can now be mapped into this space; the mapping assigns to each object a distribution of similarity to the clusters found. The share of a cluster in the learning-sample objects, together with its "purity level" as a homogeneity measure, is used to compute the prediction. The prediction procedure can be represented as a sequence of description-space transformations and implemented effectively on a massively parallel, neural-like architecture.
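The prediction step can be sketched as a similarity-weighted vote over the stored clusters, each contributing its class shares weighted by its purity. The inverse-distance similarity and the exact weighting are illustrative assumptions, not the paper's formula.

```python
# Sketch of prediction from the stored clusters. Each cluster carries a
# centroid in the potential space, its per-class 'share' of the
# learning-sample objects, and a 'purity' homogeneity measure in [0, 1].

def predict(point, clusters):
    n_classes = len(clusters[0]["share"])
    score = [0.0] * n_classes
    for c in clusters:
        d = sum((a - b) ** 2 for a, b in zip(point, c["centroid"])) ** 0.5
        sim = 1.0 / (1.0 + d)   # hypothetical inverse-distance similarity
        for k in range(n_classes):
            score[k] += sim * c["purity"] * c["share"][k]
    total = sum(score)
    # normalize into a class-membership distribution
    return [s / total for s in score] if total > 0 else score
```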

The FP approach has been successfully applied to problems as different as medical and business ones. To appreciate the difference, we outline the specific features of both. Problems of clinical medicine are characterized by:

Business problems have the following specific features: