Virtual Cell Software Repository

MKEM

Multi-level Knowledge Emergence Model (MKEM)

Overview

Since Swanson proposed the Undiscovered Public Knowledge (UPK) model, there have been many approaches to uncovering UPK by mining the biomedical literature. These earlier works, however, required substantial manual intervention to reduce the number of possible connections and were mainly applied to disease-effect relations. With the advancement of biomedical science, it has become imperative to extract and combine information from multiple disjoint studies and articles to infer new hypotheses and expand knowledge. We propose MKEM, a Multi-level Knowledge Emergence Model, to discover implicit relationships using natural language processing techniques such as Link Grammar and ontologies such as the Unified Medical Language System (UMLS) MetaMap. The contributions of MKEM are as follows: First, we propose a flexible knowledge emergence model to extract implicit relationships across different levels, such as the molecular level for genes and proteins and the phenomic level for diseases and treatments. Second, we employ MetaMap for tagging biological concepts. Third, we provide an empirical and systematic approach to discovering novel relationships.
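To make the idea of an implicit relationship concrete, the following is a minimal sketch of the general pattern behind Swanson's UPK model: if A is linked to B in one article and B is linked to C in another, while A and C are never linked directly, then A-C is a candidate hypothesis. This is not MKEM's algorithm. The function name `implicit_links` and the sample pairs (loosely based on Swanson's well-known fish oil example) are illustrative assumptions only.

```python
from collections import defaultdict


def implicit_links(explicit_pairs):
    """Return {(a, c): sorted bridging terms} for pairs never stated directly.

    explicit_pairs is an iterable of (term, term) relations, each assumed to
    have been stated explicitly somewhere in the literature.
    """
    neighbors = defaultdict(set)
    for left, right in explicit_pairs:
        neighbors[left].add(right)
        neighbors[right].add(left)

    candidates = defaultdict(set)
    for bridge, linked in neighbors.items():
        for a in linked:
            for c in linked:
                # Keep each unordered pair once and skip directly stated links.
                if a < c and c not in neighbors[a]:
                    candidates[(a, c)].add(bridge)
    return {pair: sorted(bridges) for pair, bridges in candidates.items()}


# Illustrative input only, not data produced by MKEM.
pairs = [
    ("fish oil", "blood viscosity"),
    ("blood viscosity", "Raynaud's disease"),
    ("fish oil", "platelet aggregation"),
    ("platelet aggregation", "Raynaud's disease"),
]
for (a, c), bridges in implicit_links(pairs).items():
    print(a, "<->", c, "via", ", ".join(bridges))
```

A real system would also have to rank or filter the candidates, since the number of possible connections grows quickly with the size of the literature.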

The system consists of two parts, the tagger and the extractor (may require compilation).

A sentence of interest is given to the tagger, which creates rule sets from it and stores them in a folder named “ruleList”. To pass the rule sets to the extractor, copy this folder into the extractor directory.
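The copy step can be done by hand or scripted. The following is a minimal sketch that assumes the tagger and the extractor live in separate directories. The function name `copy_rule_list` and the directory paths in the usage comment are hypothetical; only the folder name “ruleList” comes from the description above.

```python
import shutil
from pathlib import Path


def copy_rule_list(tagger_dir, extractor_dir, folder_name="ruleList"):
    """Copy the tagger's rule-set folder into the extractor directory.

    Returns the path of the copied folder.
    """
    source = Path(tagger_dir) / folder_name
    if not source.is_dir():
        raise FileNotFoundError(f"no rule-set folder at {source}")

    destination = Path(extractor_dir) / folder_name
    # Replace any stale copy so the extractor only sees the current rules.
    if destination.exists():
        shutil.rmtree(destination)
    shutil.copytree(source, destination)
    return destination


# Usage (hypothetical paths, adjust to the local installation):
# copy_rule_list("/home/user/mkem/tagger", "/home/user/mkem/extractor")
```

Removing an existing copy first is a design choice of this sketch: it avoids mixing rule sets from an earlier tagger session with the current ones.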

Download

MKEM Download

Specification

System requirements

GNU/Linux Ubuntu www.ubuntu.com

Sun Java SE Development Kit (JDK) 6 http://java.sun.com/javase/6/

Abi Link Grammar http://www.abisource.com/projects/link-grammar/

LingPipe Java Library http://alias-i.com/lingpipe/

JGraph Java Library http://www.jgraph.com/

MMTx http://mmtx.nlm.nih.gov/

Usage

The tagger is GUI-based and easy to use.

Extractor <INPUTDIR> <OUTPUTDIR>

Where

<INPUTDIR> is the directory containing PubMed abstracts downloaded in XML format, and

<OUTPUTDIR> is the output directory for the extracted information.
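Before running the extractor, it can help to check that the input directory really contains PubMed XML with abstracts and that the output directory exists. The following is a minimal sketch, assuming the files use the standard PubMed XML export layout (`PubmedArticle` elements containing `AbstractText`) and end in `.xml`. The function names are hypothetical and are not part of MKEM.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def summarize_input_dir(input_dir):
    """Map each XML file name in input_dir to its number of abstracts."""
    counts = {}
    for xml_file in sorted(Path(input_dir).glob("*.xml")):
        root = ET.parse(xml_file).getroot()
        # Count articles that carry at least one <AbstractText> element.
        counts[xml_file.name] = sum(
            1
            for article in root.iter("PubmedArticle")
            if article.find(".//AbstractText") is not None
        )
    return counts


def prepare_output_dir(output_dir):
    """Create the output directory if needed and return its path."""
    path = Path(output_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path


# Usage (hypothetical paths):
# print(summarize_input_dir("/home/user/mkem/pubmed_xml"))
# prepare_output_dir("/home/user/mkem/output")
```

An empty result, or a file with a count of zero, suggests that the download did not include abstracts or was not saved in XML format.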

 

Last Updated on Tuesday, 31 August 2010 18:08