Developing Data Extraction API Software for Health Insurance Cards

A US healthcare client wanted to extract and process insurance details from diverse card layouts. Our AI solution processed more than 80,000 cards in one day with 99% accuracy and a 2-second processing time.

Contributors
Shilpa Ramaswamy

The Client & the Challenge

Our US client from the health industry needed an AI solution for extracting and processing insurance details from various card layouts. The objective was to save time, reduce costs, and eliminate errors in the insurance domain. Although OCR technology effectively extracted printed information, it faced limitations when claim-payer IDs were not visible on the cards. The ultimate goal of this project was to empower healthcare providers by streamlining insurance-related tasks and allowing them to prioritise patient care. The project marked a significant advancement beyond OCR, improving the handling of healthcare insurance data.

Industry Overview

About the US Health Insurance Industry

The individual health insurance market grew significantly in 2023, with over 3.6 million new consumers and a wide selection of 88 plans. Over the last 10 years, consumer participation has risen by 25% to approximately 16 million, driven by extended enrollment periods and improved subsidies. Consumer choice has also increased markedly: 87% of individuals had access to three or more insurers in 2023, across plan types such as HMOs and EPOs.

Business Challenge

Traditional OCR Systems

Traditional insurance processing is slow and resource-intensive. The problem is compounded by the volume of denied health insurance claims in the United States, which totals over $262 billion annually. On top of that, a 27% error rate in patient registration and insurance processing costs an additional $71 billion. These erroneous transactions account for 1/60th of US healthcare spending and a third of hospital administrative costs.


Solution

We used a blend of vision AI and Deep Learning to solve the customer's challenge. Here is a breakdown of the steps we used:

Step 1: Extracting Card from the Image

We used the YOLOv5 object detection model to locate the card in each image and crop it out, removing background clutter.
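The cropping step can be illustrated with a minimal sketch. It assumes the detector has already returned candidate boxes as (x1, y1, x2, y2, confidence) tuples in pixel coordinates; that tuple format, the function name crop_best_card, and the 0.5 threshold are illustrative assumptions, not details of the delivered system. The image is modelled as a plain list of pixel rows so the example runs without any imaging library.

```python
from typing import List, Sequence, Tuple

# (x1, y1, x2, y2, confidence) in pixel coordinates.
# Hypothetical format chosen for this sketch.
Detection = Tuple[int, int, int, int, float]


def crop_best_card(image: Sequence[Sequence[int]],
                   detections: List[Detection],
                   min_conf: float = 0.5) -> List[List[int]]:
    """Return the image region covered by the most confident detection."""
    candidates = [d for d in detections if d[4] >= min_conf]
    if not candidates:
        raise ValueError("no card detected above the confidence threshold")

    x1, y1, x2, y2, _ = max(candidates, key=lambda d: d[4])

    # Clamp the box to the image bounds before slicing.
    height, width = len(image), len(image[0])
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)

    # Keep rows y1..y2-1 and columns x1..x2-1; everything else is background.
    return [list(row[x1:x2]) for row in image[y1:y2]]
```

Keeping only the highest-confidence box is one simple policy for photos that contain a single card; images with several cards would need a different selection rule.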

Step 2: Text Extraction via Azure OCR

We processed the front and back card images simultaneously through the Azure OCR API, which returned the printed text from both sides quickly and accurately.
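One way to send both sides of a card to an OCR service at the same time is a small thread pool, sketched below. The actual Azure call is deliberately left out: ocr_fn is a hypothetical placeholder for whatever function wraps the OCR request, and ocr_both_sides is a name invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def ocr_both_sides(front: bytes, back: bytes,
                   ocr_fn: Callable[[bytes], str]) -> Dict[str, str]:
    """Run the OCR call for the front and back images concurrently.

    ocr_fn is an assumed wrapper around the OCR request: it takes the
    image bytes and returns the recognised text.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Submit both requests before waiting on either result.
        front_future = pool.submit(ocr_fn, front)
        back_future = pool.submit(ocr_fn, back)
        return {"front": front_future.result(), "back": back_future.result()}
```

Because an OCR request is mostly network wait, threads are enough here; the total latency is then close to that of the slower of the two calls rather than their sum.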

Step 3: KV Extractor Model

We trained the LiLT model on a dataset of around 80,000 insurance cards to classify the words on each card into their respective labels. Regular expressions handled the fields with predictable formats, such as phone numbers and website addresses. To further improve accuracy, we used a MobileNetV3 model to classify the insurance cards themselves.
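The regular-expression part of this step can be sketched as follows. The patterns, the sample text, and the function name extract_contacts are assumptions made for illustration; they cover common US phone formats and a handful of top-level domains, and are not the expressions used in the project.

```python
import re
from typing import Dict, List

# US-style phone numbers: (800) 555-0199, 800-555-0199, 800.555.0199
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b")

# Website addresses with an optional scheme and "www." prefix.
# The list of top-level domains is illustrative, not exhaustive.
WEBSITE_RE = re.compile(
    r"\b(?:https?://)?(?:www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*"
    r"\.(?:com|org|net|gov)\b(?:/\S*)?",
    re.IGNORECASE,
)


def extract_contacts(ocr_text: str) -> Dict[str, List[str]]:
    """Pull phone numbers and website addresses out of raw OCR text."""
    return {
        "phones": PHONE_RE.findall(ocr_text),
        "websites": WEBSITE_RE.findall(ocr_text),
    }


sample = ("Member Services: (800) 555-0199  Visit www.example.com for claims. "
          "Provider line 800.555.0123")
contacts = extract_contacts(sample)
```

A pattern this loose will also match any standalone 10-digit number, so in practice it makes sense to apply it only to text the labelling model has not already assigned to another field, such as a member ID.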


Impact Delivered

  • We automated the insurance data processing with unprecedented speed and accuracy using machine learning techniques.
  • We developed an end-to-end pipeline that completes the entire process in 2 to 2.5 seconds.
  • We eliminated the need for manual labelling on insurance cards.
  • We provided our clients with an API that allows seamless integration of the enhanced solution into their workflow.
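An end-to-end pipeline of this kind can be pictured as a chain of stages with per-stage timing, which makes it easy to see where the seconds go. The sketch below is a generic pattern, not the delivered API: run_pipeline, the Stage type, and the stand-in stage functions are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (name, function) pair; each function takes the previous
# stage's output and returns the input for the next one.
Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(payload: Any,
                 stages: List[Stage]) -> Tuple[Any, Dict[str, float]]:
    """Run the stages in order and record how long each one takes."""
    timings: Dict[str, float] = {}
    start = time.perf_counter()
    for name, fn in stages:
        stage_start = time.perf_counter()
        payload = fn(payload)
        timings[name] = time.perf_counter() - stage_start
    timings["total"] = time.perf_counter() - start
    return payload, timings


# Stand-in stages for illustration only; real stages would wrap the
# card detector, the OCR call, and the key-value extractor.
demo_stages: List[Stage] = [
    ("crop", lambda text: text.strip()),
    ("ocr", lambda text: text.upper()),
    ("extract", lambda text: {"text": text}),
]
result, timings = run_pipeline("  member id  ", demo_stages)
```

Logging the timings dictionary for each request is a simple way to check that the total stays inside a latency target.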

Top Benefits

  • By reducing the processing time from 7 seconds to just 2 seconds, we have significantly improved the operational efficiency and user experience.
  • By replacing two KV extractor models, we increased model accuracy to 99%, giving our clients more reliable outcomes.
  • Our system has successfully processed over 80,000 cards in one day, showcasing its extensive capability to manage large volumes effectively.
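The figures above can be sanity-checked with a little arithmetic. Assuming cards are processed back to back with no idle time (a simplification made for this sketch), 80,000 cards at 2 seconds each is 160,000 seconds of work, while a day has only 86,400 seconds, so the load implies at least two cards in flight at once. The function name min_parallel_workers is hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def min_parallel_workers(cards_per_day: int, seconds_per_card: float) -> int:
    """Smallest number of parallel workers that can finish the load in a day.

    Assumes every worker is busy the whole day and all cards take the
    same time, so this is a lower bound rather than a capacity plan.
    """
    total_seconds = cards_per_day * seconds_per_card
    return math.ceil(total_seconds / SECONDS_PER_DAY)


workers_at_2s = min_parallel_workers(80_000, 2.0)    # 160,000 s of work -> 2
workers_at_2_5s = min_parallel_workers(80_000, 2.5)  # 200,000 s of work -> 3
workers_at_7s = min_parallel_workers(80_000, 7.0)    # 560,000 s of work -> 7
```

Under the same assumptions, the earlier 7-second processing time would have required at least seven parallel workers for the same daily volume.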

The Akaike Edge

Inbuilt libraries, DL models with transfer learning capabilities