Computer Science ETDs

Author

Jiawei Xu Jr

Publication Date

7-1-2013

Abstract

The Cognitive Paradigm Ontology (CogPO) defines an ontological relationship between academic terms and experiments in the field of neuroscience. BrainMap (www.brainmap.org) is a database of literature describing these experiments, which human experts annotate according to the ontological framework defined in CogPO. We present a stochastic approach to automating this annotation process. We begin with a gold-standard corpus of expert-annotated abstracts, model the annotations with a group of naive Bayes classifiers, and then explore the inherent relationships among the components defined by the ontology using a probabilistic decision tree model. Our solution outperforms conventional text mining approaches by taking advantage of the ontology. We consider five essential ontological components in CogPO (Stimulus Modality, Stimulus Type, Response Modality, Response Type, and Instructions). For each component, we evaluate the probability of successfully categorizing a research paper by training a basic multi-label naive Bayes classifier on a set of examples taken from the BrainMap database that have already been manually annotated by human experts. Based on the performance of these classifiers, we create a decision tree that labels the components sequentially at different levels. Each node of the decision tree is associated with a naive Bayes classifier built on a different subspace of the input universe. We first make decisions on the components whose labels are comparatively easy to predict, and then use these decisions as conditions to narrow down the input space along every tree path, thereby boosting the performance of naive Bayes classification on the components whose labels are difficult to predict. To annotate a new instance, we use the classifiers associated with the nodes to find labels for each component, starting from the root and descending the tree, possibly along multiple paths. The annotation is complete when the bottom level is reached, at which point all labels produced along the paths are collected.
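
The sequential labeling idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the toy training sentences, the class and function names (`NaiveBayes`, `build_tree`, `annotate`), the restriction to two components, and the simplification to a single label per component and a single tree path (the thesis describes multi-label classifiers and possibly multiple paths). The root classifier predicts the easier component, and a child classifier trained only on examples sharing that label predicts the harder one.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words with add-one smoothing."""

    def __init__(self, docs, labels):
        self.vocab = {w for d in docs for w in d.split()}
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())

    def predict(self, doc):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for w in doc.split():
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][w] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy corpus: (text, Stimulus Modality, Stimulus Type).
TRAIN = [
    ("subjects viewed pictures of faces", "Visual", "Faces"),
    ("participants viewed flashing checkerboard", "Visual", "Checkerboard"),
    ("subjects viewed faces on screen", "Visual", "Faces"),
    ("participants heard pure tones", "Auditory", "Tones"),
    ("subjects heard spoken words", "Auditory", "Words"),
    ("participants heard tones via headphones", "Auditory", "Tones"),
]


def build_tree(examples):
    """Root node labels modality; each child sees only its modality's subset."""
    root = NaiveBayes([d for d, _, _ in examples], [m for _, m, _ in examples])
    children = {}
    for modality in {m for _, m, _ in examples}:
        subset = [(d, t) for d, m, t in examples if m == modality]
        children[modality] = NaiveBayes(
            [d for d, _ in subset], [t for _, t in subset]
        )
    return root, children


def annotate(doc, root, children):
    """Label the easier component first, then descend to the matching child."""
    modality = root.predict(doc)
    stimulus_type = children[modality].predict(doc)
    return {"Stimulus Modality": modality, "Stimulus Type": stimulus_type}


ROOT, CHILDREN = build_tree(TRAIN)
print(annotate("subjects viewed faces", ROOT, CHILDREN))
print(annotate("participants heard tones", ROOT, CHILDREN))
```

Because each child classifier is trained only on the examples that carry its parent's label, it chooses among fewer candidate labels than a single flat classifier would, which is the narrowing effect the abstract describes.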

Language

English

Keywords

Annotation, Ontology, Machine Learning, Naive Bayes, Decision Tree

Document Type

Thesis

Degree Name

Computer Science

Level of Degree

Masters

Department Name

Department of Computer Science

First Committee Member (Chair)

Luger, George Jr

Second Committee Member

Turner, Jessica Jr

Third Committee Member

Williams, Lance Jr
