Evaluation of Keyword Selection on Gene Clustering in Biomedical Literature Mining
Document Type
Conference Proceeding
Abstract
Conference proceeding from the Fifth IASTED Conference on Computational Intelligence, August 2010, pp. 119-124.
We describe two statistical metrics, Z-score and a variant of the familiar TF-IDF, that are suited to identifying keywords associated with genes by mining a collection of MEDLINE® abstracts. We then describe experiments that cluster genes according to the keyword features they share. Clustering quality is measured by comparing the clusters produced by a clustering algorithm against expert-defined clusters. We evaluate clustering quality for keyword features identified by each metric separately and for combinations of the keywords derived from both metrics, and we present the results along with our analysis.
Repository Citation
Dasigi, Venu G.; Karam, O.; and Pydimarri, S., "Evaluation of Keyword Selection on Gene Clustering in Biomedical Literature Mining" (2010). Computer Science Faculty Publications. 15.
https://scholarworks.bgsu.edu/comp_sci_pub/15
Publication Date
8-2010
Publication Title
Fifth IASTED Conference on Computational Intelligence
Publisher
ACTA Press
DOI
https://doi.org/10.2316/P.2010.711-031
Start Page No.
119
End Page No.
124