Computer Science Faculty Publications

Evaluation of Keyword Selection on Gene Clustering in Biomedical Literature Mining

Document Type

Conference Proceeding

Abstract

Conference proceeding from the Fifth IASTED Conference on Computational Intelligence, August 2010, pp. 119-124.

We describe two statistical metrics, Z-score and a variant of the familiar TF-IDF, for identifying keywords associated with genes by mining a collection of MEDLINE® abstracts. We then describe experiments in which genes are clustered according to the keyword features they share. Clustering quality is measured by comparing the clusters produced by a clustering algorithm against expert-defined clusters. We evaluate this quality for keyword features identified by each metric separately, as well as for combinations of the keywords derived from both metrics, and we present the results and our analysis.
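
As a rough illustration of the two kinds of keyword metric named above, the following Python sketch scores words per gene with a textbook TF-IDF and a simple Z-score. The abstract does not give the paper's formulas, so everything here is an assumption: the function names (`tfidf_scores`, `z_scores`, `top_keywords`), the input layout (one pooled token list per gene), and the exact definitions, which are standard forms rather than the authors' TF-IDF variant.

```python
import math
from collections import Counter


def tfidf_scores(gene_docs):
    """Textbook TF-IDF per gene (an assumed form, not the paper's variant).

    gene_docs maps a gene name to the pooled list of tokens taken from
    the abstracts that mention that gene.
    """
    n_genes = len(gene_docs)
    doc_freq = Counter()
    for tokens in gene_docs.values():
        doc_freq.update(set(tokens))  # count each word once per gene
    scores = {}
    for gene, tokens in gene_docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[gene] = {
            word: (count / total) * math.log(n_genes / doc_freq[word])
            for word, count in counts.items()
        }
    return scores


def z_scores(gene_docs):
    """Z-score of a word's relative frequency for one gene against the
    mean and standard deviation of that frequency across all genes
    (again an assumed definition)."""
    genes = list(gene_docs)
    rel_freq = {
        g: {w: c / len(gene_docs[g]) for w, c in Counter(gene_docs[g]).items()}
        for g in genes
    }
    vocab = set().union(*rel_freq.values())
    scores = {g: {} for g in genes}
    for word in vocab:
        values = [rel_freq[g].get(word, 0.0) for g in genes]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        if std == 0:
            continue  # word is equally frequent everywhere: uninformative
        for g, v in zip(genes, values):
            if v > 0:  # only score words that actually occur for the gene
                scores[g][word] = (v - mean) / std
    return scores


def top_keywords(word_scores, k):
    """Return the k highest-scoring words for one gene."""
    ranked = sorted(word_scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]
```

Under these assumptions, the keyword sets selected by each metric (or their union or intersection) could serve as the features on which genes are clustered; a word that appears for every gene gets a TF-IDF of zero and so is never preferred over gene-specific terms.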

Publication Date

8-2010

Publication Title

Fifth IASTED Conference on Computational Intelligence

Publisher

ACTA Press

DOI

https://doi.org/10.2316/P.2010.711-031

Start Page No.

119

End Page No.

124
