Improving multiclass pattern recognition by the combination of two strategies
Description

Nicolás García-Pedrajas and Domingo Ortiz-Boyer, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 6, June 2006, pp. 1001-1006 (6 pages, English).

Abstract—We present a new method of multiclass classification based on the combination of the one-vs-all method and a modification of the one-vs-one method. The proposed combination reinforces the strengths of both methods. A study of the behavior of the two methods identifies some of the sources of their failure. The performance of a classifier can be improved if the two methods are combined into one, in such a way that the main sources of their failure are partially avoided.


Extract

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 28, NO. 6, JUNE 2006, p. 1001

Improving Multiclass Pattern Recognition by the Combination of Two Strategies

Nicolás García-Pedrajas, Member, IEEE, and Domingo Ortiz-Boyer, Member, IEEE
Abstract—We present a new method of multiclass classification based on the combination of the one-vs-all method and a modification of the one-vs-one method. The proposed combination reinforces the strengths of both methods. A study of the behavior of the two methods identifies some of the sources of their failure. The performance of a classifier can be improved if the two methods are combined into one, in such a way that the main sources of their failure are partially avoided.

Index Terms—Multiclass, classification, one-vs-one, one-vs-all, neural networks, support vector machines.

1 INTRODUCTION

A classification problem of K classes and n training observations consists of a set of patterns whose class membership is known. Let S = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} be a set of n training samples, where each instance x_i belongs to a domain X \subseteq \mathbb{R}^m. Each label is an integer from the set Y = {1, ..., K}. A multiclass classifier is a function f: X \to Y that maps an instance x onto an element of Y. The task is to find a definition for the unknown function, f(x), given the set of training instances. Although many real-world problems are multiclass problems, K > 2, many of the most popular classifiers work best when facing binary problems, K = 2. Moreover, many algorithms are specifically designed for binary problems, such as Support Vector Machines (SVM). A class binarization is a mapping of a multiclass problem onto several two-class problems in a way that allows the derivation of a prediction for the multiclass problem from the predictions of the two-class classifiers. The two-class classifier is usually referred to as the base learner.

Among the proposed methods for approaching multiclass problems as many, possibly simpler, two-class problems, we can make a rough classification into three groups: one-vs-all, one-vs-one, and methods based on error correcting output codes (illustrative sketches of the three decision rules follow this list):

- One-vs-All (OVA). This method has been proposed independently by several authors [1], [2]. The OVA method constructs K binary classifiers. The ith classifier, f_i, is trained using all the patterns of class i as positive instances and the patterns of the other classes as negative instances. An example is classified in the class whose corresponding classifier has the highest output. The decision function, f, is defined as:

      f(x) = \arg\max_{j \in \{1, \dots, K\}} f_j(x).

- One-vs-One (OVO). This method, proposed in [3], constructs K(K-1)/2 classifiers [4]. Classifier ij, named f_ij, is trained using all the patterns from class i as positive instances and all the patterns from class j as negative instances, disregarding the rest. There are different methods of combining the obtained classifiers; the most common is a simple voting scheme [5]. When classifying a new instance, each one of the base classifiers casts a vote for one of the two classes used in its training.

  Another approach for combining the trained classifiers is the Decision Directed Acyclic Graph (DDAG) [6]. This method constructs a rooted binary acyclic graph using the classifiers. The nodes are arranged in a triangle, with the root node at the top, two nodes in the second layer, four in the third layer, and so on. To evaluate a DDAG on an input pattern x, the binary function at the root node is evaluated first, and the next node visited depends upon the result of this evaluation. The final answer is the class assigned by the leaf node visited at the final step.

- Error Correcting Output Codes (ECOC). Dietterich and Bakiri [7] suggested the use of error correcting codes for multiclass classification. This method uses a matrix M of {-1, 1} values of size K x F, where F is the number of binary classifiers. The ith column of the matrix induces a partition of the classes into two metaclasses. An instance x belonging to class i is a positive instance for the jth classifier if and only if M_ij = 1. If we designate f_j as the sign of the jth classifier, the decision implemented by this method, f(x), using the Hamming distance between each row of the matrix M and the outputs of the F classifiers, is given by:

      f(x) = \arg\min_{r \in \{1, \dots, K\}} \sum_{i=1}^{F} \frac{1 - \mathrm{sign}(M_{ri} f_i(x))}{2}.

  There are other approaches that use other distance measures between the outputs of the classifiers and each row of the coding matrix, or more sophisticated methods [8].
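To make the OVA and OVO decision rules concrete, here is a minimal sketch. It is not the authors' implementation: it assumes scikit-learn-style binary base learners (LogisticRegression is only a stand-in for the paper's neural-network and SVM base learners), takes f_j(x) to be the classifier's decision_function margin, and all function names are ours.

    # Minimal sketch of the OVA and OVO decision rules (not the authors' code).
    import numpy as np
    from sklearn.base import clone
    from sklearn.linear_model import LogisticRegression

    def train_ova(X, y, K, base=LogisticRegression()):
        # Classifier f_i: class i positive, every other class negative.
        return [clone(base).fit(X, (y == i).astype(int)) for i in range(K)]

    def predict_ova(classifiers, x):
        # f(x) = argmax_j f_j(x): the class whose classifier fires strongest.
        outputs = [c.decision_function(x.reshape(1, -1))[0] for c in classifiers]
        return int(np.argmax(outputs))

    def train_ovo(X, y, K, base=LogisticRegression()):
        # Classifier f_ij: class i positive, class j negative, rest disregarded.
        classifiers = {}
        for i in range(K):
            for j in range(i + 1, K):
                mask = (y == i) | (y == j)
                classifiers[(i, j)] = clone(base).fit(X[mask],
                                                      (y[mask] == i).astype(int))
        return classifiers

    def predict_ovo(classifiers, x, K):
        # Simple voting scheme [5]: each f_ij votes for class i or class j.
        votes = np.zeros(K, dtype=int)
        for (i, j), c in classifiers.items():
            votes[i if c.predict(x.reshape(1, -1))[0] == 1 else j] += 1
        return int(np.argmax(votes))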
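The DDAG of [6] admits an equivalent list formulation: keep a list of candidate classes, let the classifier for the first and last candidates eliminate one of them, and repeat K-1 times; the surviving class is the one a walk down the triangular graph would reach. A sketch, reusing the hypothetical train_ovo classifiers and NumPy feature vector x from the previous block:

    # Sketch of DDAG evaluation [6]: K-1 tests on a shrinking candidate list.
    def predict_ddag(classifiers, x, K):
        candidates = list(range(K))              # stays sorted; ends get popped
        while len(candidates) > 1:
            i, j = candidates[0], candidates[-1]
            # f_ij was trained with class i (the smaller index) as positive.
            if classifiers[(i, j)].predict(x.reshape(1, -1))[0] == 1:
                candidates.pop()                 # node prefers i: eliminate j
            else:
                candidates.pop(0)                # node prefers j: eliminate i
        return candidates[0]                     # class of the leaf node reached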
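The ECOC decision rule above is plain Hamming-distance decoding of the F classifier signs against the rows of M. A self-contained sketch; the 4 x 7 code matrix is illustrative, not taken from the paper:

    # Hamming-distance decoding for ECOC [7], per the formula above.
    import numpy as np

    def predict_ecoc(M, f_x):
        # M: K x F matrix over {-1, +1}; f_x: length-F vector of classifier
        # signs. Returns argmin_r sum_i (1 - sign(M_ri * f_i(x))) / 2.
        hamming = ((1 - np.sign(M * f_x)) / 2).sum(axis=1)
        return int(np.argmin(hamming))

    # Illustrative 4-class, 7-classifier code matrix (pairwise Hamming
    # distance 4, so any single classifier error is corrected).
    M = np.array([[ 1,  1,  1,  1,  1,  1,  1],
                  [-1, -1, -1, -1,  1,  1,  1],
                  [ 1,  1, -1, -1, -1, -1,  1],
                  [-1,  1,  1, -1, -1,  1, -1]])
    print(predict_ecoc(M, M[2]))          # 2: exact codeword match
    noisy = M[2].copy(); noisy[0] *= -1   # one classifier flips its sign
    print(predict_ecoc(M, noisy))         # still 2: the error is corrected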
The usefulness of the ECOC approach relies heavily on the independence of the classifiers [9], without which the error correcting approach would fail. Rifkin and Klautau [10] suggested that if the binary classifiers are fine-tuned more accurately, the independence of the classifiers diminishes, and so does the efficiency of this approach.

Allwein et al. [11] proposed a unifying approach for the different methods using a coding matrix with three values, {-1, 0, 1}, with 0 meaning "don't care." For example, in the one-vs-all approach, the coding matrix has K columns, with all the diagonal elements set to 1 and all other elements set to -1. For one-vs-one, we have a K x K(K-1)/2 matrix in which each column corresponds to a pair (c_1, c_2). For this column, the matrix has +1 in row c_1, -1 in row c_2, and zeros in all other rows. (A sketch that builds both matrices closes this extract.)

Moreira and Mayoraz [12] developed a combination of different classifiers, considering the output of each one as the probability that the pattern belongs to a certain class. This method needs the training of K(K+1)/2 classifiers.

The approach we propose is based on the idea that the combination of the one-vs-all and one-vs-one methods can yield a classifier that outperforms both methods separately. This belief is based on a study of these two methods in the cases where the classification they achieve for a given instance is inaccurate. This study is presented in the next section.

This paper is organized as follows: Section 2 describes our approach. Section 3 shows the experimental setup. Section 4 shows the results of the experiments carried out, and, finally, Section 5 states the conclusions of our work and future research lines.

The authors are with the Department of Computing and Numerical Analysis, Edificio Einstein, 3a Planta, Campus Universitario de Rabanales, 14071 Córdoba, Spain. E-mail: {npedrajas, dortiz}@uco.es. Manuscript received 27 May 2005; revised 19 Sept. 2005; accepted 21 Oct. 2005; published online 13 Apr. 2006. Recommended for acceptance by L. Kuncheva. For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number TPAMI-0269-0505.
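As a closing illustration of the unified view of [11], the OVA and OVO coding matrices described above can be generated mechanically; a minimal NumPy sketch (the function names are ours, not the paper's):

    # Sketch of the {-1, 0, 1} coding matrices of Allwein et al. [11].
    import numpy as np
    from itertools import combinations

    def ova_matrix(K):
        # K x K matrix: +1 on the diagonal, -1 everywhere else.
        return 2 * np.eye(K, dtype=int) - 1

    def ovo_matrix(K):
        # K x K(K-1)/2 matrix: column (c1, c2) has +1 in row c1,
        # -1 in row c2, and 0 (don't care) in every other row.
        pairs = list(combinations(range(K), 2))
        M = np.zeros((K, len(pairs)), dtype=int)
        for col, (c1, c2) in enumerate(pairs):
            M[c1, col], M[c2, col] = 1, -1
        return M

    print(ova_matrix(3))
    print(ovo_matrix(3))   # three columns, one per pair of classes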