UCL  IRIS
Institutional Research Information Service
Publication Detail
Subsequence Based Deep Active Learning for Named Entity Recognition
  • Publication Type:
    Conference
  • Authors:
    Radmard P, Fathullah Y, Lipani A
  • Publisher:
    Association for Computational Linguistics
  • Publication date:
    01/08/2021
  • Pagination:
    4310-4321
  • Published proceedings:
    Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing
  • Volume:
    1
  • Editors:
    Zong C, Xia F, Li W, Navigli R
  • ISBN-13:
    978-1-954085-52-7
  • Status:
    Published
  • Name of conference:
    59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing
  • Language:
    English
  • Notes:
    ©2021 Association for Computational Linguistics, licensed on a Creative Commons Attribution 4.0 International License.
Abstract
Active Learning (AL) has been successfully applied to Deep Learning in order to drastically reduce the amount of data required to achieve high performance. Previous work has shown that lightweight architectures for Named Entity Recognition (NER) can achieve optimal performance with only 25% of the original training data. However, these methods do not exploit the sequential nature of language and the heterogeneity of uncertainty within each instance, requiring the labelling of whole sentences. Additionally, this standard method requires that the annotator has access to the full sentence when labelling. In this work, we overcome these limitations by allowing the AL algorithm to query subsequences within sentences, and propagate their labels to other sentences. We achieve highly efficient results on OntoNotes 5.0, only requiring 13% of the original training data, and CoNLL 2003, requiring only 27%. This is an improvement of 39% and 37%, respectively, compared to querying full sentences.
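The abstract does not spell out how subsequences are scored or how labels are propagated, so the following is only a minimal sketch under stated assumptions: each token is assumed to carry a scalar uncertainty score, a query is assumed to be the contiguous window with the highest mean score, and propagation is assumed to be exact token matching. The names `best_subsequence` and `propagate` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple


def best_subsequence(
    scores: List[float], min_len: int, max_len: int
) -> Tuple[int, int, float]:
    """Return (start, end, mean) for the contiguous window with the highest
    mean token uncertainty. `end` is exclusive.

    Assumption: mean token uncertainty is the selection criterion.
    """
    if not 1 <= min_len <= max_len:
        raise ValueError("require 1 <= min_len <= max_len")

    # Prefix sums let each window mean be computed in constant time.
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)

    n = len(scores)
    best: Optional[Tuple[int, int, float]] = None
    for start in range(n):
        longest = min(max_len, n - start)
        for length in range(min_len, longest + 1):
            end = start + length
            mean = (prefix[end] - prefix[start]) / length
            if best is None or mean > best[2]:
                best = (start, end, mean)

    if best is None:
        raise ValueError("sentence is shorter than min_len")
    return best


def propagate(
    labelled: Dict[Tuple[str, ...], List[str]], sentence: List[str]
) -> List[Optional[str]]:
    """Copy labels of already-annotated subsequences onto exact matches in
    another sentence. Unmatched tokens stay None.

    Assumption: propagation is by exact token match.
    """
    labels: List[Optional[str]] = [None] * len(sentence)
    for tokens, tags in labelled.items():
        k = len(tokens)
        for i in range(len(sentence) - k + 1):
            if tuple(sentence[i:i + k]) == tokens:
                labels[i:i + k] = tags
    return labels


# Hypothetical per-token uncertainties for a four-token sentence.
start, end, mean = best_subsequence([0.1, 0.9, 0.8, 0.2], min_len=2, max_len=3)

# Hypothetical annotation returned for one queried subsequence.
annotated = {("New", "York"): ["B-LOC", "I-LOC"]}
spread = propagate(annotated, ["I", "love", "New", "York"])
```

In this sketch only the two most uncertain tokens of the example sentence would be sent to the annotator, and their labels would then be reused wherever the same token span appears, which is one way to read the labelling savings the abstract reports.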
UCL Researchers
Author
Dept of Civil, Environ & Geomatic Eng