Case studies

Entity and relationship extraction from text data

Clementine supports law enforcement organizations in prevention and investigation by helping them process unstructured data from historical and online sources using data mining and text analytics.

The solution is used in the detection and prevention of organised crime.

BUSINESS PROBLEM

The success of a criminal investigation depends on its speed and thoroughness: the faster the organisation moves and the more information it has, the more likely it is to solve a case. Analysing and detecting relationships is difficult and time-consuming because the information is spread across different databases, comes from different sources and is often unstructured. A new data storage and analysis system could make investigations more time- and cost-efficient.

SUMMARY

Clementine combined text analytics and network visualisation components into a data processing flow that automatically extracts entities and relationships from text documents and visualises the resulting networks, making the work of analysts in law enforcement and crime prevention more efficient.

SOLUTION

In implementing the ClemRISK solution with text analytics, we focused on understanding the investigation workflow and making its labour-intensive manual steps more efficient:

  1. Analysing historical data
    By processing the electronic documents accumulated over decades of investigations, we developed text analytic dictionaries and processing procedures capable of extracting entities (e.g. names, license plates, companies) and identifying relationships between them.
  2. Data structure design and network visualisation
    To make the extracted information easy to interpret and analyse, and to let investigators quickly compare it with the results of previous investigations, an "intelligence database" was created to store the data extracted from each case. This allows the visualisation and analysis of networks.
  3. Developing an operational mechanism
    The data processing flows we developed are tuned for real-time, continuous operation: new text documents generated during the investigation of current cases are automatically processed and loaded into the database.
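
The three steps above can be illustrated with a minimal sketch. It is not the ClemRISK implementation, whose internals are not described here. It assumes entities are found by simple dictionary lookup and a regular expression, that two entities are related when they appear in the same sentence, and that the "intelligence database" is an in-memory adjacency map. All names, dictionaries and the plate format below are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical dictionaries; a real deployment would curate far larger ones.
PERSON_DICTIONARY = {"John Doe", "Jane Roe"}
COMPANY_DICTIONARY = {"Acme Trading Ltd"}
# Hypothetical plate format: three letters, a hyphen, three digits.
PLATE_PATTERN = re.compile(r"\b[A-Z]{3}-\d{3}\b")


def extract_entities(sentence):
    """Return the set of (entity_type, value) pairs found in one sentence."""
    entities = {("person", p) for p in PERSON_DICTIONARY if p in sentence}
    entities |= {("company", c) for c in COMPANY_DICTIONARY if c in sentence}
    entities |= {("plate", m) for m in PLATE_PATTERN.findall(sentence)}
    return entities


def extract_relationships(text):
    """Count how often two entities co-occur in the same sentence."""
    edges = defaultdict(int)
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for a, b in combinations(sorted(extract_entities(sentence)), 2):
            edges[(a, b)] += 1
    return dict(edges)


def new_network():
    """Create an empty adjacency map: entity -> neighbour -> weight."""
    return defaultdict(lambda: defaultdict(int))


def add_document(network, text):
    """Merge one document's relationships into the adjacency map."""
    for (a, b), count in extract_relationships(text).items():
        network[a][b] += count
        network[b][a] += count
    return network


if __name__ == "__main__":
    network = new_network()
    # Documents can be added one at a time as they arrive.
    add_document(network, "John Doe was seen driving ABC-123. "
                          "ABC-123 is registered to Acme Trading Ltd.")
    add_document(network, "Jane Roe met John Doe at Acme Trading Ltd.")
    for neighbour, weight in sorted(network[("person", "John Doe")].items()):
        print(neighbour, weight)
```

Sentence-level co-occurrence is only one simple way to approximate a relationship; it was chosen here because it needs no trained model. Calling `add_document` for each new text mirrors the continuous loading described in step 3, and the resulting adjacency map could be handed to any graph visualisation tool.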