Computer Science Faculty Publications


Information Fusion for Text Classification: An Experimental Comparison

Document Type



This article reports our experiments on how effectively different feature sets, and information fusion over certain combinations of them, classify free-text documents into a given number of categories. We use several feature sets and integrate neural network learning into the method. The feature sets are based on the “latent semantics” of a reference library, that is, a collection of documents that adequately represents the desired concepts. We found that a larger reference library is not necessarily better. Information fusion almost always gives better results than the individual constituent feature sets, with certain combinations doing better than others.

Publication Date


Publication Title

Pattern Recognition
