Please report any queries concerning the funding data grouped in the sections named "Externally Awarded" or "Internally Disbursed" (shown on the profile page) to
your Research Finance Administrator. You can find your Research Finance Administrator at https://www.ucl.ac.uk/finance/research/rs-contacts.php by entering your department.
Please report any queries concerning the student data shown on the profile page to:
Email: portico-services@ucl.ac.uk
Help Desk: http://www.ucl.ac.uk/ras/portico/helpdesk
Publication Detail
BioRAT: Extracting biological information from full-length papers
- Publication Type: Journal article
- Publication Sub Type: Article
- Authors: Corney DPA, Buxton BF, Langdon WB, Jones DT
- Publisher: OUP
- Publication date: 2004
- Place of publication: UK
- Pagination: 3206-3213
- Journal: Bioinformatics
- Volume: 20
- Issue: 17
- Print ISSN: 1367-4803
Abstract
Motivation: Converting the vast quantity of free-format
text found in journals into a concise, structured format makes the
researcher's quest for information easier. Recently, several
information extraction systems have been developed that attempt to
simplify the retrieval and analysis of biological and medical
data. Most of this work has used the abstract alone, owing to the
convenience of access and the quality of data. Abstracts are generally
available through central collections with easy direct access
(e.g. PubMed). The full-text papers contain more information, but are
distributed across many locations (e.g. publishers' web sites, journal
web sites, and local repositories), making access more difficult. In
this paper, we present BioRAT, a new information extraction (IE) tool,
specifically designed to perform biomedical IE, and which is able to
locate and analyse both abstracts and full-length papers. BioRAT is a
Biological Research Assistant for Text mining, and incorporates a
document search ability with domain specific IE.

Results: We show
first, that BioRAT performs as well as existing systems, when applied
to abstracts; and second, that significantly more information is
available to BioRAT through the full-length papers than via the
abstracts alone. Typically, less than half of the available
information is extracted from the abstract, with the majority coming
from the body of each paper. Overall, BioRAT recalled 20.31% of the
target facts from the abstracts with 55.07% precision, and achieved
43.6% recall with 51.25% precision on full-length papers.

Availability: The software and documentation can be found at
http://bioinf.cs.ucl.ac.uk/biorat.