Abstract
Real-world data are often prepared for purposes other than data mining and machine learning and are therefore represented by primitive attributes. When the data representation is primitive, the data must be preprocessed before searching for patterns. The low-level primitive representation of real-world problems allows complex interactions among attributes to arise. If a lack of domain experts prevents traditional methods from uncovering patterns in data because of complex attribute interactions, then soft computing techniques such as genetic algorithms become necessary. This article introduces MFE3/GA𝐷𝑅, a data reduction method derived from the learning preprocessing system MFE3/GA. The method restructures the primitive data representation by capturing and compacting hidden information into new features in order to highlight regularities to the learner. We thoroughly analyze the empirical results obtained on the poker hand data set. The results show that this approach successfully compacts the set of low-level primitive attributes into a smaller set of highly informative features that outline patterns to the learner; the new approach thus provides data reduction and allows a smaller and more accurate classifier to be learned.
Original language | English |
---|---|
Pages (from-to) | 1296-1303 |
Number of pages | 8 |
Journal | Applied Soft Computing Journal |
Volume | 9 |
Issue number | 4 |
Early online date | 6 May 2009 |
Publication status | Published - 30 Sept 2009 |
Funding
This work was partially supported by the Spanish Ministry of Science and Technology, under Grant numbers TSI2005-08225-C07-06 and TIN2008-02081.
Keywords
- attribute interaction
- data reduction
- feature construction
- genetic algorithm
- machine learning
- non-algebraic feature