SUCCESS STORY
Integrated Customer Database (ICDB)
A centralized customer data platform delivering a unified and business-ready Data Mart.
Company Name
Hyundai
Industry
Automotive
Impact
Enterprise-Wide Customer Data Governance and Deduplication Platform
Application
Hyundai Platforms
Background
Hyundai Motor India operates one of the largest automotive customer databases in the country, with customer data flowing in from multiple channels - dealer management systems, call centres, connected car services, e-commerce platforms, and web portals. Over 18 years of operations, this data had accumulated across six separate, non-interconnected systems: NDMS (New Dealer Management System), GCRM (Global CRM), AWS (Alexa/Amazon integration), Blue Link (connected car platform), the MyHyundai App, and CTB (Click to Buy portal). Each system created its own customer records independently, resulting in widespread duplication - the same customer appearing multiple times across platforms with slightly different data. With the impending expiration of Hyundai's Siebel CRM license and a planned migration to Salesforce, the bloated, redundant data posed a serious financial and operational risk: Salesforce licensing and storage costs are tied to the volume of records, making deduplication a prerequisite. Cubastion, as a trusted data and CRM implementation partner, identified the need for a purpose-built solution that could aggregate, cleanse, deduplicate, and standardize this data before the Salesforce migration - and sustain clean data flows on an ongoing basis thereafter.
Challenges Faced
Business Challenges
Data Redundancy Across Systems
Data redundancy across six non-interconnected systems - the same customer appearing with multiple entries across dealerships, web portals, and call centres
Lack of Unified Customer Data Source
Absence of a single source of truth for customer data, with records for the same individual scattered across customer, invoice, leads, and service tables in separate systems
Rising Data Storage Costs
High and growing storage costs driven by data duplication, compounded by the impending move to Salesforce, where per-record costs are significant
Limited Reporting & Analytics Flexibility
Inflexible, slow reporting in the existing Siebel/OBIEE environment - limiting the business's ability to generate custom KPI reports and campaign analytics
Non-Standardized Data Structure
No standardized data template to align Hyundai India's data with Salesforce's global schema requirements
Technical Challenges
Unified Multi-Source Data Ingestion
Designing a unified ingestion and deduplication pipeline across six structurally different source systems - Oracle-based (NDMS, GCRM) and web/cloud-based (AWS, Blue Link, MyHyundai, CTB)
Duplicate Customer Resolution Engine
Building Python-based customer identification logic capable of resolving duplicate identities across 85 million records with varying data quality and formats
Scalable Three-Layer Data Architecture
Architecting a three-layer data pipeline (Data Lake → Data Warehouse → Data Mart) within MariaDB while keeping daily incremental processing under a two-hour window
Real-Time & Batch Data Integration
Developing 80+ APIs via webMethods and 150+ PL/SQL procedures to handle real-time and batch data transmission without disrupting source system operations
High-Accuracy Data Deduplication
Ensuring 80%+ deduplication accuracy on daily runs while maintaining a monthly full deduplication cycle for complete data integrity
Solutions
Cubastion designed and implemented the Integrated Customer Database (ICDB) - a comprehensive data aggregation, transformation, and governance platform built on MariaDB, using Enterprise Application Integration (EAI) via webMethods to ingest data from all six source systems and a three-layer processing architecture to deliver a clean, Salesforce-ready Data Mart.
Multi-Source Data Aggregation
webMethods integrations structured to accommodate all six source systems - NDMS, GCRM, AWS, Blue Link, MyHyundai App, and CTB - with 80+ APIs developed for seamless, real-time data transmission into ICDB.
Three-Layer MariaDB Architecture
Data flows through a Data Lake (raw consolidated ingestion), Data Warehouse (cleansing, standardization, and entity relationship mapping), and Data Mart (final deduplicated, structured, business-ready tables) - ensuring progressive refinement.
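As an illustration only, the three layers can be pictured as successive functions over the same records. This is a minimal sketch in plain Python, not the ICDB implementation (which runs inside MariaDB); the field names `name`, `phone`, and `source`, the sample records, and the choice of phone number as the matching key are all assumptions made for the example.

```python
# Hypothetical sketch of the three layers; field names and the phone-based
# matching key are illustrative assumptions, not the ICDB schema.

def to_data_lake(batches):
    """Data Lake: consolidate raw records from every source, tagged by origin."""
    return [dict(rec, source=src) for src, recs in batches.items() for rec in recs]

def to_data_warehouse(lake):
    """Data Warehouse: cleanse and standardize each raw record."""
    cleaned = []
    for rec in lake:
        digits = "".join(ch for ch in rec.get("phone", "") if ch.isdigit())
        cleaned.append({
            "source": rec["source"],
            "name": " ".join(rec.get("name", "").split()).title(),
            "phone": digits[-10:],  # keep the last 10 digits, dropping a country prefix
        })
    return cleaned

def to_data_mart(warehouse):
    """Data Mart: one business-ready row per standardized phone number."""
    mart = {}
    for rec in warehouse:
        row = mart.setdefault(
            rec["phone"], {"name": rec["name"], "phone": rec["phone"], "sources": []}
        )
        row["sources"].append(rec["source"])
    return list(mart.values())

# Made-up sample input: two sources describing the same customer differently.
batches = {
    "NDMS": [{"name": "ravi  kumar", "phone": "+91 98765-43210"}],
    "CTB": [
        {"name": "Ravi Kumar", "phone": "9876543210"},
        {"name": "anita shah", "phone": "9123456780"},
    ],
}
mart = to_data_mart(to_data_warehouse(to_data_lake(batches)))
```

Each layer only reads the output of the one before it, so the raw data stays available in the lake while the mart holds the refined view.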
Python-based Customer Deduplication Engine
Custom Python identification logic maps and resolves duplicate customer records across all source systems - reducing 85 million entries to 45 million unique, verified customer records.
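The matching rules used in ICDB are not described here, so the following is only a sketch of one common approach to identity resolution: records that share a normalized phone number or email address are linked, and linked records are merged into one group with a union-find structure. The helper names, the two matching keys, and the sample records are assumptions made for the example.

```python
from collections import defaultdict

def normalize_phone(phone):
    """Keep the last 10 digits; return '' when fewer than 10 digits are present."""
    digits = "".join(ch for ch in (phone or "") if ch.isdigit())
    return digits[-10:] if len(digits) >= 10 else ""

def normalize_email(email):
    return (email or "").strip().lower()

def resolve_customers(records):
    """Group record indices that share a normalized phone or email (assumed rule)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps lookups short
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # lowest index becomes the group root

    first_seen = {}
    for idx, rec in enumerate(records):
        keys = [
            ("phone", normalize_phone(rec.get("phone"))),
            ("email", normalize_email(rec.get("email"))),
        ]
        for key in keys:
            if not key[1]:
                continue  # skip empty identifiers so blanks never match each other
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return list(groups.values())

# Made-up records: the first three describe one customer through different keys.
records = [
    {"source": "NDMS", "phone": "+91 98765 43210", "email": ""},
    {"source": "GCRM", "phone": "9876543210", "email": "Ravi@Example.com"},
    {"source": "CTB", "phone": "", "email": "ravi@example.com"},
    {"source": "Blue Link", "phone": "9123456780", "email": "anita@example.com"},
]
groups = resolve_customers(records)
```

In this sample the first and third records share no identifier directly; they are joined through the second record, which is the transitive case that simple pairwise key matching misses. For scale, going from 85 million entries to 45 million is a reduction of (85 - 45) / 85, or about 47%.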
Automated Daily Incremental Processing
With 150+ procedures and 150 entities updated daily, the full incremental data pipeline completes within two hours - ensuring the Data Mart stays current without manual intervention.
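How ICDB detects daily changes is not specified here. One common pattern for incremental processing is a watermark: each run picks up only the rows modified since the previous run and then advances the marker. The sketch below assumes a hypothetical `updated_at` column and uses made-up timestamps.

```python
from datetime import datetime

def incremental_batch(rows, watermark):
    """Return rows modified after the watermark, plus the advanced watermark."""
    changed = [r for r in rows if r["updated_at"] > watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_watermark

# Made-up rows; `updated_at` is an assumed column name.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"id": 3, "updated_at": datetime(2024, 1, 3, 9, 0)},
]
last_run = datetime(2024, 1, 1, 23, 59)

changed, last_run = incremental_batch(rows, last_run)  # picks up ids 2 and 3
rerun, last_run = incremental_batch(rows, last_run)    # nothing new on a second pass
```

Processing only the changed rows is what lets a daily run stay inside a fixed time window as the total record count grows.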
Salesforce-Ready Data Mart
A standardized global template covering customer, services, invoice, and leads entities - each linked to a unique customer identifier - aligned to Salesforce's schema for clean, cost-efficient migration and ongoing CRM sync.
Enhanced Reporting & Analytics
Salesforce integration, combined with the optimized Data Mart, delivers significantly faster KPI report generation compared to the legacy OBIEE environment - enabling custom campaign analytics and business insights on demand.
Business Outcome
Data Consolidation
47% reduction in customer records
Data Accuracy & Governance
90%+ data accuracy achieved
Workflow Automation
90% automation of data & transformation workflows
Performance Improvement
80% reduction in data processing latency
Operational Cost Savings
50% labor cost savings
Rework Reduction
20–30 hrs/month saved
Infrastructure Optimization
75–80% overall operational savings
System Reliability
99%+ system uptime delivered