American Society of Civil Engineers


Automated Regulatory Information Extraction from Building Codes: Leveraging Syntactic and Semantic Information


by Jiansong Zhang (Graduate Student, Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, 205 North Mathews Avenue, Urbana, IL 61801; E-mail: jzhang70@illinois.edu) and Nora El-Gohary (Assistant Professor, Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, 205 North Mathews Avenue, Urbana, IL 61801; E-mail: gohary@illinois.edu)
Section: Knowledge Management and Information Technology, pp. 622-632 (doi: http://dx.doi.org/10.1061/9780784412329.063)


Document type: Conference Proceeding Paper
Part of: Construction Research Congress 2012: Construction Challenges in a Flat World
Abstract: Manual regulatory compliance checking of construction projects is usually time-consuming and error-prone. Both academia and industry have made efforts to automate this process, but none has achieved full automation. Specifically, extracting rules from regulatory text (e.g., a building code) and representing them in a computer-processable format is still done manually or semi-automatically. Natural language processing (NLP) aims to enable computers to process natural language text in a human-like manner. It provides basic concepts and methods for text processing and analysis, such as part-of-speech (POS) tagging, tokenization, sentence splitting, named entity recognition, and semantic role labeling. This paper explores how effectively syntactic (i.e., grammatical) and semantic (i.e., meaning-descriptive) features of the text, obtained using NLP tools and techniques, can be used to automatically extract regulatory information from building codes. An automated information extraction (IE) approach based on IE rules is proposed. Chapter 12 of the 2006 International Building Code was used to develop the IE rules, while Chapter 12 of the 2009 International Fire Code was used to test the approach. An overall F-measure of 0.94 shows the potential of the proposed approach. Based on the experimental results and their analysis, we conclude the paper by identifying possible ways to improve the proposed approach.
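
For readers unfamiliar with rule-based IE, the following is a minimal sketch of what a pattern-style extraction rule and an F-measure calculation might look like. It is not the authors' implementation: the rule pattern, the slot names, the example sentence, and the helper names extract and f_measure are all hypothetical, and the F-measure is assumed to be the balanced form (the harmonic mean of precision and recall). A plain regular expression stands in for the syntactic and semantic features that the paper obtains from NLP tools.

```python
import re

# Hypothetical IE rule (an assumption for illustration, not taken from the paper):
# "<subject> shall [not] be <comparison> <number> <unit>".
RULE = re.compile(
    r"(?P<subject>.+?)\s+shall\s+(?P<negation>not\s+)?be\s+"
    r"(?P<comparison>less than|more than|at least|at most)\s+"
    r"(?P<quantity>\d+(?:\.\d+)?)\s+(?P<unit>[a-z]+)",
    re.IGNORECASE,
)


def extract(sentence):
    """Return the slots matched by RULE as a dict, or None if the rule does not fire."""
    match = RULE.search(sentence)
    if match is None:
        return None
    return {
        "subject": match.group("subject").strip(),
        "negated": match.group("negation") is not None,
        "comparison": match.group("comparison").lower(),
        "quantity": float(match.group("quantity")),
        "unit": match.group("unit").lower(),
    }


def f_measure(true_positives, false_positives, false_negatives):
    """Balanced F-measure: harmonic mean of precision and recall."""
    extracted = true_positives + false_positives   # everything the rules returned
    expected = true_positives + false_negatives    # everything in the gold standard
    precision = true_positives / extracted if extracted else 0.0
    recall = true_positives / expected if expected else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example sentence, written for this sketch (not quoted from any code).
result = extract("The room height shall not be less than 7 feet.")
```

In a sketch like this, each rule trades coverage for precision: a narrow pattern rarely extracts wrong slots but misses paraphrased requirements, which is why extraction quality is usually summarized by a combined measure over both.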


ASCE Subject Headings:
Construction management
Standards and codes
Automation