SUCCESS STORY

Integrated Customer Database (ICDB)

A centralized customer data platform delivering a unified and business-ready Data Mart.

Company Name

Hyundai

Industry

Automotive

Impact

Enterprise-Wide Customer Data Governance and Deduplication Platform

Application

Hyundai Platforms

Background

Hyundai Motor India operates one of the largest automotive customer databases in the country, with customer data flowing in from multiple channels - dealer management systems, call centres, connected car services, e-commerce platforms, and web portals. Over 18 years of operations, this data had accumulated across six separate, non-interconnected systems: NDMS (New Dealer Management System), GCRM (Global CRM), AWS (Alexa/Amazon integration), Blue Link (connected car platform), the MyHyundai App, and CTB (Click to Buy portal). Each system created its own customer records independently, resulting in widespread duplication - the same customer appearing multiple times across platforms with slightly different data. With the impending expiration of Hyundai's Siebel CRM license and a planned migration to Salesforce, the bloated, redundant data posed a serious financial and operational risk: Salesforce licensing and storage costs are tied to the volume of records, making deduplication a prerequisite. Cubastion, as a trusted data and CRM implementation partner, identified the need for a purpose-built solution that could aggregate, cleanse, deduplicate, and standardize this data before the Salesforce migration, and sustain clean data flows on an ongoing basis thereafter.

Challenges Faced

Business Challenges

Data Redundancy Across Systems

Data redundancy across six non-interconnected systems - the same customer appearing with multiple entries across dealerships, web portals, and call centres

Lack of Unified Customer Data Source

Absence of a single source of truth for customer data, with records for the same individual scattered across customer, invoice, leads, and service tables in separate systems

Rising Data Storage Costs

High and growing storage costs driven by data duplication, compounded by the impending move to Salesforce, where per-record costs are significant

Limited Reporting & Analytics Flexibility

Inflexible, slow reporting in the existing Siebel/OBIEE environment - limiting the business's ability to generate custom KPI reports and campaign analytics

Non-Standardized Data Structure

No standardized data template to align Hyundai India's data with Salesforce's global schema requirements

Technical Challenges

Unified Multi-Source Data Ingestion

Designing a unified ingestion and deduplication pipeline across six structurally different source systems - Oracle-based (NDMS, GCRM) and web/cloud-based (AWS, Blue Link, MyHyundai, CTB)

Duplicate Customer Resolution Engine

Building Python-based customer identification logic capable of resolving duplicate identities across 85 million records with varying data quality and formats

Scalable Three-Layer Data Architecture

Architecting a three-layer data pipeline (Data Lake → Data Warehouse → Data Mart) within MariaDB while keeping daily incremental processing under a two-hour window

Real-Time & Batch Data Integration

Developing 80+ APIs via webMethods and 150+ PL/SQL procedure logics to handle real-time and batch data transmission without disrupting source system operations

High-Accuracy Data Deduplication

Ensuring 80%+ deduplication accuracy on daily runs while maintaining a monthly full deduplication cycle for complete data integrity

Solutions

Cubastion designed and implemented the Integrated Customer Database (ICDB) - a comprehensive data aggregation, transformation, and governance platform built on MariaDB, using Enterprise Application Integration (EAI) via webMethods to ingest data from all six source systems and a three-layer processing architecture to deliver a clean, Salesforce-ready Data Mart.

Multi-Source Data Aggregation

Configured webMethods integrations to accommodate all six source systems - NDMS, GCRM, AWS, Blue Link, MyHyundai App, and CTB - with 80+ APIs developed for seamless, real-time data transmission into ICDB.

Three-Layer MariaDB Architecture

Data flows through a Data Lake (raw consolidated ingestion), Data Warehouse (cleansing, standardization, and entity relationship mapping), and Data Mart (final deduplicated, structured, business-ready tables), ensuring progressive refinement.
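
As an illustration only, the three-stage flow can be sketched in plain Python. The function names, field names (`name`, `mobile`), and the choice of mobile number as the customer key are assumptions made for this example; the actual ICDB layers are MariaDB tables and their schema is not described here.

```python
def load_data_lake(source_batches):
    """Layer 1 (Data Lake): consolidate raw rows as-is, tagged with their source."""
    lake = []
    for source, rows in source_batches.items():
        for row in rows:
            lake.append({"source": source, **row})
    return lake


def build_warehouse(lake):
    """Layer 2 (Data Warehouse): cleanse and standardize field formats."""
    warehouse = []
    for row in lake:
        digits = "".join(ch for ch in row.get("mobile", "") if ch.isdigit())
        warehouse.append({
            "source": row["source"],
            # Collapse repeated whitespace and normalize capitalization.
            "name": " ".join(row.get("name", "").split()).title(),
            # Hypothetical rule: compare on the last 10 digits only.
            "mobile": digits[-10:],
        })
    return warehouse


def build_data_mart(warehouse):
    """Layer 3 (Data Mart): one row per customer, keyed on standardized mobile."""
    mart = {}
    for row in warehouse:
        entry = mart.setdefault(
            row["mobile"],
            {"name": row["name"], "mobile": row["mobile"], "sources": []},
        )
        entry["sources"].append(row["source"])
    return list(mart.values())


# Made-up sample rows, not real customer data.
batches = {
    "NDMS": [{"name": "  priya   nair ", "mobile": "+91-90000 11111"}],
    "MyHyundai": [{"name": "Priya Nair", "mobile": "9000011111"}],
    "CTB": [{"name": "arjun rao", "mobile": "9000022222"}],
}
mart = build_data_mart(build_warehouse(load_data_lake(batches)))
```

In this sketch the two Priya Nair rows only collapse into one Data Mart row because the warehouse stage standardized the mobile number first, which is the point of refining the data in stages.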

Python-based Customer Deduplication Engine

Custom Python identification logic maps and resolves duplicate customer records across all source systems - reducing 85 million entries to 45 million unique, verified customer records.
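
The matching rules used in ICDB are not detailed here. As a hedged sketch of what identity resolution of this kind can look like, the example below links records that share a normalized phone number or email address using a union-find structure. The field names, the normalization rules, and the choice of phone and email as match keys are all assumptions for illustration.

```python
import re
from collections import defaultdict


def normalize_phone(phone):
    """Keep digits only and compare on the last 10 (assumed mobile format)."""
    digits = re.sub(r"\D", "", phone or "")
    return digits[-10:] if len(digits) >= 10 else ""


def normalize_email(email):
    return (email or "").strip().lower()


def resolve_duplicates(records):
    """Cluster records that share a normalized phone or email."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # match key -> index of the first record carrying it
    for idx, rec in enumerate(records):
        keys = [
            ("phone", normalize_phone(rec.get("phone"))),
            ("email", normalize_email(rec.get("email"))),
        ]
        for key in keys:
            if not key[1]:
                continue  # skip empty values so blanks never match each other
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx, rec in enumerate(records):
        clusters[find(idx)].append(rec)
    return list(clusters.values())


# Made-up sample rows, not real customer data.
records = [
    {"source": "NDMS", "name": "R. Sharma", "phone": "+91 98765 43210", "email": ""},
    {"source": "GCRM", "name": "Rahul Sharma", "phone": "09876543210", "email": "rahul@example.com"},
    {"source": "CTB", "name": "Rahul S", "phone": "", "email": "Rahul@Example.com "},
    {"source": "Blue Link", "name": "A. Mehta", "phone": "9123456780", "email": ""},
]
clusters = resolve_duplicates(records)
```

Here the first three rows end up in one cluster even though the NDMS and CTB rows share no field directly: the GCRM row links them through its phone and email. Exact-key matching like this is the simplest option; handling the varying data quality mentioned above would likely need fuzzier comparison as well.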

Automated Daily Incremental Processing

With 150+ procedure logics and 150 entities updated daily, the full incremental data pipeline completes within two hours - ensuring the Data Mart stays current without manual intervention.
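
How ICDB detects changed rows is not stated. One common way to keep a daily run short is to process only rows modified since the previous run, tracked with a watermark timestamp. The sketch below assumes a hypothetical `updated_at` column on each entity; it illustrates the general technique, not the PL/SQL procedures themselves.

```python
from datetime import datetime


def select_incremental(rows, last_watermark):
    """Return rows changed since the previous run and the new watermark."""
    changed = [r for r in rows if r["updated_at"] > last_watermark]
    # If nothing changed, keep the old watermark so no rows are skipped later.
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return changed, new_watermark


# Made-up sample rows, not real data.
rows = [
    {"entity": "customer", "id": 1, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"entity": "invoice", "id": 7, "updated_at": datetime(2024, 1, 2, 14, 30)},
    {"entity": "lead", "id": 3, "updated_at": datetime(2024, 1, 2, 18, 45)},
]
last_watermark = datetime(2024, 1, 2, 0, 0)
changed, new_watermark = select_incremental(rows, last_watermark)
```

Only the invoice and lead rows are picked up in this run, and the returned watermark becomes the starting point for the next one.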

Salesforce-Ready Data Mart

A standardized global template covering customer, services, invoice, and leads entities - each linked to a unique customer identifier - aligned to Salesforce's schema for clean, cost-efficient migration and ongoing CRM sync.

Enhanced Reporting & Analytics

Salesforce integration, combined with the optimized Data Mart, delivers significantly faster KPI report generation compared to the legacy OBIEE environment - enabling custom campaign analytics and business insights on demand.

Business Outcome

Data Consolidation

47% reduction in customer records

Data Accuracy & Governance

90%+ data accuracy achieved

Workflow Automation

90% automation of data & transformation workflows

Performance Improvement

80% reduction in data processing latency

Operational Cost Savings

50% labor cost savings

Rework Reduction

20–30 hrs/month saved

Infrastructure Optimization

75–80% overall operational savings

System Reliability

99%+ system uptime delivered