Publications / SAND Report

Survey of Current State of the Art Entity-Relation Extraction Tools

Ward, Katrina J.; Bisila, Jonathan B.; Cairns, Kelsey L.

In the area of information extraction from text data, there exist a number of tools capable of extracting entities, topics, and their relationships with one another from both structured and unstructured text sources. Such information has uses in a wide range of domains; however, the solutions for obtaining it are still in their early stages and have room for improvement. The topic has been explored from a research perspective by academic institutions and through formal tool creation by corporations, but it has not advanced much since the early 2000s. Overall, entity extraction and the related topic of entity linking are common among these tools, though with varying degrees of accuracy, while relationship extraction is more difficult to find and seems limited to same-sentence analysis. In this report, we examine the top state-of-the-art tools currently available and identify their capabilities, strengths, and weaknesses. We explore the algorithms common to the successful approaches to entity extraction and their ability to efficiently handle both structured and unstructured text data. Finally, we highlight some of the common issues among these tools and summarize the current ability to extract relationship information.