Publication Detail
Creating Regular Expressions as mRNA Motifs with GP to Predict Human Exon Splitting
  • Publication Type:
    Report
  • Authors:
    Langdon WB, Rowsell J, Harrison AP
  • Publication date:
    19/03/2009
  • Place of publication:
    Strand, London, WC2R 2LS, UK
  • Report number:
    TR-09-02
  • Notes:
    Keywords: genetic algorithms, genetic programming, Gene expression and regulation, alternative splicing, Microarray analysis, Integration of genetic programming into bioinformatics, Biological interpretation of computer generated motifs, Bioinformatics, Affymetrix GeneChip, strongly typed genetic programming, grammar, regular expression, Alternative splicing of Homo sapiens exons, HDONA
    Notes: Long version of \cite{langdon:2009:gecco}
    Size: 9 pages
Abstract
Low correlation between mRNA concentrations measured at different locations for the same exon shows that many current Ensembl exon definitions are incomplete. Automatically created patterns (e.g. TCTTT) identify potential new alternative transcripts. Strongly typed grammar-based genetic programming (GP) is used to evolve regular expressions (RE) that separate gene exons with potential alternative mRNA expression from those without. RNAnet (http://bioinformatics.essex.ac.uk/users/wlangdon/rnanet) gives us correlations between Affymetrix HG-U133 Plus 2 GeneChip probe measurements for the same exon across 2757 Homo sapiens tissue samples from NCBI’s GEO database. We identify many non-atomic Ensembl exons, i.e. exons with substructure. Biological patterns can be data mined by a Backus-Naur form (BNF) context-free grammar using a strongly typed GP written in gawk and using egrep. The automatically produced DNA motifs suggest that alternative polyadenylation is not responsible. The training data is available on the internet (http://bioinformatics.essex.ac.uk/users/wlangdon/tr-09-02.tar.gz).
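The idea of deriving regular expressions from a BNF grammar and scoring them as exon classifiers can be illustrated with a minimal Python sketch. It is not the report's gawk and egrep implementation. The grammar, the sequences, the labels, the function names and the use of plain accuracy as a score are all hypothetical and chosen only for illustration; only the motif TCTTT comes from the abstract. The sketch covers random derivation and scoring only, not the evolutionary search itself.

```python
import random
import re

# Hypothetical toy BNF-style grammar for DNA regular expressions.
# This is NOT the grammar used in the report.
GRAMMAR = {
    "<re>": [["<elem>"], ["<elem>", "<re>"]],
    "<elem>": [["<base>"], ["<base>", "<quant>"], ["[", "<base>", "<base>", "]"]],
    "<base>": [["A"], ["C"], ["G"], ["T"]],
    "<quant>": [["+"], ["*"], ["?"]],
}


def derive(symbol, rng, depth=0, max_depth=8):
    """Expand a grammar symbol into a regular expression string."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, avoid self-recursive productions so that
        # the derivation is guaranteed to terminate.
        options = [p for p in options if symbol not in p] or options
    production = rng.choice(options)
    return "".join(derive(part, rng, depth + 1, max_depth) for part in production)


def classify(pattern, sequences):
    """Predict True for every sequence in which the pattern occurs."""
    regex = re.compile(pattern)
    return [regex.search(seq) is not None for seq in sequences]


def accuracy(pattern, sequences, labels):
    """Fraction of sequences classified correctly (an assumed fitness measure)."""
    predictions = classify(pattern, sequences)
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up toy data: True marks a sequence treated as "non-atomic" here.
SEQUENCES = ["GGATCTTTAC", "ACGTACGTAC", "TTTCTTTGGA", "CCCCGGGGAA"]
LABELS = [True, False, True, False]

if __name__ == "__main__":
    rng = random.Random(0)
    # Score a handful of random grammar derivations plus the motif
    # quoted in the abstract, and print them best first.
    candidates = [derive("<re>", rng) for _ in range(5)] + ["TCTTT"]
    for pattern in sorted(candidates, key=lambda p: -accuracy(p, SEQUENCES, LABELS)):
        print(pattern, accuracy(pattern, SEQUENCES, LABELS))
```

Because every candidate is derived from the grammar, each one is a syntactically valid regular expression by construction. A GP system in the style described above would keep the better-scoring candidates and vary them under the same grammar instead of sampling afresh each time.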