Extracting Insights Using NLP

Digitize unstructured text at scale to improve data quality and deliver meaningful business insights

Unlike data stored in database tables, free text lacks a consistent structure and format. Because text-based data is so disorganized, traditional rules-based approaches struggle to extract meaningful insights from it. Furthermore, for many use cases the volume of text-based data dwarfs that of structured data, which makes extracting insights from it even more challenging.

Health plans cannot afford to miss the enormous amount of insight buried in unstructured data such as physicians’ notes and the network affiliations and rates recorded in provider contracts.

Use Cases


MCheck™ Clinical scans medical charts and compares the data in them with data for similar members the health plan serves, discovering HCC codes that should be associated with a member but are not.

Provider Network Affiliations

MCheck™ Provider extracts networks and other related provider information from provider contracts and amendments to determine if network affiliations present in the plan’s provider database are accurate.

Provider Roster Ingestion

MCheck™ Provider automaps columns from incoming provider roster files to a standard template, error checks all incoming data, and can provide feedback to the submitter or operator regarding processing success.
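As a rough illustration of what column automapping can look like, the sketch below matches incoming roster headers to a template by string similarity. It is not MCheck™ Provider's actual method: the template columns, the function names, and the similarity cutoff are all assumptions made for this example.

```python
import difflib
import re

# Hypothetical standard template; the real template's columns are not described here.
TEMPLATE_COLUMNS = ["provider_npi", "first_name", "last_name", "specialty", "network"]


def normalize(name: str) -> str:
    """Lowercase a header and collapse punctuation and whitespace to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def automap_columns(incoming, template=TEMPLATE_COLUMNS, cutoff=0.6):
    """Map each incoming header to its closest template column, or None if nothing is close."""
    mapping = {}
    for header in incoming:
        matches = difflib.get_close_matches(normalize(header), template, n=1, cutoff=cutoff)
        mapping[header] = matches[0] if matches else None
    return mapping


if __name__ == "__main__":
    # Unmapped headers (value None) could be reported back to the submitter or operator.
    print(automap_columns(["Provider NPI", "Last Name", "Fax"]))
```

Headers that map to None are the natural place to generate feedback about processing success or failure.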

Our Capabilities

Topic Modeling

Scans text and determines the concepts associated with it. For example, in HEDIS/STARs and Risk Adjustment use cases, medical charts are scanned to determine which HCC codes should be associated with members but have not been.

Sentiment Analysis

Scans all text and, within a targeted area, determines the context associated with that text. In the provider contracts use case, our NLP team looked at the context surrounding network mentions to determine whether the agreement includes, excludes, or modifies the networks a provider participates in.
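The idea of reading the context around a network mention can be sketched with a simple keyword window. This is only a toy stand-in, assuming hand-picked cue words and a fixed character window; the cue lists, the function name, and the window size are hypothetical, and a production engine would more likely rely on a trained model.

```python
import re

# Hypothetical cue words, checked in this order: excludes, modifies, includes.
CUES = {
    "excludes": re.compile(r"\b(exclud\w*|not participat\w*|except)\b", re.I),
    "modifies": re.compile(r"\b(amend\w*|modif\w*|replac\w*)\b", re.I),
    "includes": re.compile(r"\b(includ\w*|participat\w*)\b", re.I),
}


def classify_network_context(text, network, window=60):
    """Label the first mention of a network as 'includes', 'excludes', 'modifies', or None."""
    match = re.search(re.escape(network), text, re.I)
    if match is None:
        return None
    # Look only at the text within `window` characters of the mention.
    start = max(0, match.start() - window)
    context = text[start:match.end() + window]
    for label, pattern in CUES.items():
        if pattern.search(context):
            return label
    return None


if __name__ == "__main__":
    sentence = "Provider is excluded from the Silver HMO network."
    print(classify_network_context(sentence, "Silver HMO"))
```

Checking exclusion cues first is a deliberate choice in this sketch, so that a phrase such as "not participating" is not mistaken for inclusion.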

Table Extraction

This module detects and extracts information presented in tabular format. In the provider contracts use case, it has been used to extract the provider information associated with networks and the fee schedules associated with those providers.

Data Ingestion

This module, used across all of HiLabs’ use cases, contains the data pipeline that ingests various formats of semi-structured data, such as Word documents, PDFs, and scanned images, at scale.

An advanced, healthcare-specific NLP engine extracting insights from unstructured data