Event Details

Data Extraction with Mildly Context-Sensitive Island Grammars

Presenter: Derek Church - University of Victoria, Canada
Supervisor:

Date: Fri, August 22, 2003
Time: 13:30:00 - 00:00:00
Place: EOW 430

ABSTRACT

Island grammars are a methodology for identifying constructs of interest (islands) within a data source. Islands are identified using regular expressions; the remaining parts of the data source are considered 'water' and are excluded as non-essential. By first extending the definition of island grammars to include positional notation for identifying islands, and then adding a mechanism for rule-based dependencies and conditional evaluation, a limited form of context-sensitive analysis is achieved. This research seeks to implement a formal mechanism for defining such grammars, together with an engine that applies a grammar specification to a given data source. The end result of this process is a modified version of the data source with mark-up tags delineating the identified constructs. We investigate this process in the following contexts to determine its relevance and usability:

  1. Meta-data extraction from a legacy information system with implicit structure.
  2. Information capture/extraction/transformation in the domain of Health Information Grid networks.
  3. Computer Assisted Assessment for facilitating grading of student assignment submissions.
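
As a rough illustration of the approach described in the abstract, the following Python sketch marks up islands found by regular expressions, restricted by a positional constraint and a simple rule-based dependency. It is not the presenter's engine or grammar notation: the Island class, its columns and requires fields, the mark_up function, and the sample data are all hypothetical names and assumptions introduced only for this example.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Island:
    """One island rule (hypothetical structure, not the talk's notation)."""
    name: str                         # tag name used in the mark-up
    pattern: str                      # regular expression identifying the island
    columns: Optional[range] = None   # allowed start columns (positional notation)
    requires: Optional[str] = None    # island that must already have been seen


def mark_up(lines: Iterable[str], grammar: List[Island]) -> List[str]:
    """Return the lines with islands wrapped in tags; water is left unmarked."""
    seen = set()          # names of islands matched so far (dependency context)
    result = []
    for line in lines:
        spans = []
        for island in grammar:
            # Rule-based dependency: skip the rule until its prerequisite
            # island has been matched earlier in the data source.
            if island.requires is not None and island.requires not in seen:
                continue
            for m in re.finditer(island.pattern, line):
                # Positional constraint: the match must start in an allowed column.
                if island.columns is not None and m.start() not in island.columns:
                    continue
                spans.append((m.start(), m.end(), island.name))
                seen.add(island.name)
        # Emit the line left to right, dropping any overlapping matches.
        pieces, cursor = [], 0
        for start, end, name in sorted(spans):
            if start < cursor:
                continue
            pieces.append(line[cursor:start])
            pieces.append(f"<{name}>{line[start:end]}</{name}>")
            cursor = end
        pieces.append(line[cursor:])
        result.append("".join(pieces))
    return result


# Assumed sample data: a fixed-column legacy record preceded by a header line.
grammar = [
    Island("record", r"REC", columns=range(0, 1)),
    Island("id", r"\d{4}", columns=range(4, 5), requires="record"),
]
for marked in mark_up(["HDR 2003 DUMP", "REC 0042 7777"], grammar):
    print(marked)
```

In this sketch the four-digit number on the header line stays unmarked because no record island has been seen yet, and the trailing number on the record line stays unmarked because it starts in the wrong column. Both are examples of the limited context sensitivity the abstract refers to, under the assumptions stated above.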