Financial Information Hub for Claims Analytics (Insurance)

Client

One of the largest health insurers

The client is one of the largest health insurers in the US: a customer-owned health insurance company with more than 4 million members.

Solution

Financial Information Hub

Datametica created a cost-effective and scalable framework called Financial Information Hub (FIH) to acquire, clean, and prepare data for Strategic Data Warehouse (SDW) consumption.

  • A layered data lake architecture on the Hadoop Distributed File System (HDFS) provides flexibility to store and process data.
  • Data is loaded from the three subject areas (membership, claims, and income), processed in Hadoop, and then loaded into the enterprise Atomic Teradata Warehouse (eADW).
  • Kafka, Storm, Camus, and HBase were used for real-time data collection and data aggregation.

  • Gold Data tables are exposed through the Hive interface so that business users can query data in Hue according to their authorization. Hue provides a web application interface for Apache Hadoop. It supports a file browser, JobTracker interface, Hive, Pig, Oozie, HBase, and more.
  • Batch views can be computed using Hadoop batch jobs to show the entire data history for reporting.
  • An audit process, metadata management, and data lineage were part of the deliverables to support data governance.
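The layered flow described above (acquire, clean, prepare, then expose a batch view with an audit trail) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the FIH implementation: the record fields, the sample data, and the function names `clean`, `build_gold_view`, and `audit` are all hypothetical, and an in-memory list stands in for HDFS and Hive.

```python
from collections import defaultdict

# Hypothetical raw claim records; field names are illustrative only.
RAW_CLAIMS = [
    {"claim_id": "C1", "member_id": "M1", "amount": "120.50"},
    {"claim_id": "C2", "member_id": "M1", "amount": " 79.50 "},
    {"claim_id": "C3", "member_id": "M2", "amount": "not-a-number"},
    {"claim_id": "C2", "member_id": "M1", "amount": "79.50"},  # duplicate
]


def clean(raw_records):
    """Raw -> cleansed layer: parse amounts, reject bad rows and duplicates."""
    seen, cleaned, rejected = set(), [], []
    for rec in raw_records:
        try:
            amount = float(rec["amount"].strip())
        except ValueError:
            rejected.append(rec)  # unparseable amount
            continue
        if rec["claim_id"] in seen:
            rejected.append(rec)  # claim already loaded
            continue
        seen.add(rec["claim_id"])
        cleaned.append({**rec, "amount": amount})
    return cleaned, rejected


def build_gold_view(cleaned):
    """Cleansed -> gold layer: batch view of total claim amount per member."""
    totals = defaultdict(float)
    for rec in cleaned:
        totals[rec["member_id"]] += rec["amount"]
    return dict(totals)


def audit(raw_records, cleaned, rejected):
    """Row-count reconciliation: every raw row is either loaded or rejected."""
    return {
        "raw": len(raw_records),
        "loaded": len(cleaned),
        "rejected": len(rejected),
        "balanced": len(raw_records) == len(cleaned) + len(rejected),
    }


cleaned, rejected = clean(RAW_CLAIMS)
gold = build_gold_view(cleaned)
report = audit(RAW_CLAIMS, cleaned, rejected)
print(gold)
print(report)
```

Keeping rejected rows alongside loaded rows is one simple way to make the audit count balance; a production pipeline would typically persist both sets and record lineage per table rather than per run.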

Benefits

Processing power

The distributed processing power of the Hadoop framework leads to scalability, security and fault tolerance

Real-time views

Real-time views into data can be computed for proactive analysis

Hadoop performance

Hadoop can ingest structured and unstructured data

Business drivers

Data in three subject areas (membership, claims, and income) was processed using fragmented tools. Legacy tools such as the Extract, Transform, and Load (ETL) tool DataStage, Java, DB2, CDC, mainframe, and Teradata were used to perform the batch and near-real-time ETL.

Teradata could not keep up with real-time processing, and adding nodes to scale storage and compute was expensive.