Information Fusion for Text Classification –– an Experimental Comparison
This article reports experiments on how effectively different feature sets, and information fusion across some combinations of them, classify free-text documents into a given number of categories. The method integrates neural network learning with these feature sets. The feature sets are based on the “latent semantics” of a reference library, that is, a collection of documents adequately representing the desired concepts. We found that a larger reference library is not necessarily better. Information fusion almost always gives better results than the individual constituent feature sets, with certain combinations doing better than others.
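As a rough illustration of the idea, the sketch below derives latent-semantic features from a reference library and fuses two such feature sets. The abstract does not specify the weighting scheme, the decomposition, the fusion rule, or the network architecture, so everything here is an assumption: raw term counts, a truncated SVD, and fusion by simple concatenation. The function names and the two toy reference libraries are hypothetical.

```python
import numpy as np

def term_document_matrix(docs, vocab):
    """Rows are terms, columns are documents; entries are raw counts (an assumption)."""
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc.lower().split():
            if token in index:
                matrix[index[token], j] += 1.0
    return matrix

def latent_projector(reference_docs, k):
    """Return a function mapping a document to k latent features of the reference library."""
    vocab = sorted({tok for d in reference_docs for tok in d.lower().split()})
    a = term_document_matrix(reference_docs, vocab)
    # Truncated SVD: keep the k leading left singular vectors as the latent term space.
    u, s, _ = np.linalg.svd(a, full_matrices=False)
    u_k = u[:, :min(k, len(s))]

    def project(doc):
        q = term_document_matrix([doc], vocab)[:, 0]
        return u_k.T @ q

    return project

def fuse(*feature_vectors):
    """Fusion by concatenation (one simple option; the source does not name its rule)."""
    return np.concatenate(feature_vectors)

# Two hypothetical reference libraries, each yielding its own feature set.
library_a = ["neural network learning", "network training data", "learning from data"]
library_b = ["text document retrieval", "document categories", "free text categories"]

project_a = latent_projector(library_a, k=2)
project_b = latent_projector(library_b, k=2)

document = "learning to sort free text into categories"
features_a = project_a(document)
features_b = project_b(document)
fused = fuse(features_a, features_b)  # would be the input to a classifier, not shown here
```

In this sketch the fused vector would serve as the input to a neural network classifier. Swapping in a different reference library changes only the projector, which makes it straightforward to compare individual feature sets against their combinations.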
Dasigi, Venu; Mann, Reinhold C.; and Protopopescu, Vladimir A., "Information Fusion for Text Classification –– an Experimental Comparison" (2001). Computer Science Faculty Publications. 4.