This is my course project for IE6750: Data Warehousing and Integration. It is an ongoing project.
Policy lapsation is a major challenge in the life insurance industry. It happens when customers do not pay their premiums on time, leading to the termination of their policy. This affects both customers and companies. Customers lose financial protection, while companies lose revenue and face higher costs to regain customers.
The main issue is that customer and policy data is spread across multiple sources such as web sales, agents, and banks. Without a single consolidated view, companies struggle to identify at-risk customers early and take action.
This project aims to create a centralized data engineering solution to integrate and analyze data from multiple sources. The goal is to improve decision-making and reduce lapsation by providing accurate, consistent, and actionable insights.
The solution has four main components:
- ETL Pipeline: Collects and processes both static reference data (e.g., agent details, branch details, policy types) and dynamic transactional data (e.g., premium payments, policy statuses).
- Synthetic Data: Simulates real-world scenarios such as customer policy purchases, agent interactions, and payment updates.
- Data Warehousing: Stores cleaned and transformed data in a structured format for analysis and reporting.
- Business Insights: Supports customer segmentation, agent performance analysis, payment method optimization, and long-term strategic planning.
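To make the components above more concrete, here is a minimal sketch of how synthetic data, an ETL step, and a warehouse table could fit together. It is not the project's actual pipeline: the table name `fact_premium_payment`, the column names, the 25% unpaid rate, and the use of an in-memory SQLite database are all assumptions chosen for illustration.

```python
import random
import sqlite3
from datetime import date, timedelta

# Source channels named in the text, with inconsistent formatting on purpose
# so the transform step has something to clean (assumed, illustrative values).
RAW_CHANNELS = [" Web", "AGENT", "bank "]


def generate_payments(n_policies, seed=42):
    """Stand-in for the extract step: synthetic premium payment records."""
    rng = random.Random(seed)
    rows = []
    for policy_id in range(1, n_policies + 1):
        due = date(2024, 1, 1) + timedelta(days=rng.randint(0, 364))
        # Assumption: roughly 25% of premiums are never paid (paid_date is None).
        if rng.random() < 0.25:
            paid = None
        else:
            paid = due + timedelta(days=rng.randint(-5, 20))
        rows.append({
            "policy_id": policy_id,
            "channel": rng.choice(RAW_CHANNELS),
            "due_date": due,
            "paid_date": paid,
        })
    return rows


def transform(rows):
    """Normalize channel names and derive days_late and a payment status."""
    records = []
    for r in rows:
        paid = r["paid_date"]
        if paid is None:
            days_late, status = None, "UNPAID"
        else:
            days_late = max((paid - r["due_date"]).days, 0)
            status = "ON_TIME" if days_late == 0 else "LATE"
        records.append((
            r["policy_id"],
            r["channel"].strip().lower(),
            r["due_date"].isoformat(),
            paid.isoformat() if paid else None,
            days_late,
            status,
        ))
    return records


def load(conn, records):
    """Write the cleaned records into a single (hypothetical) fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_premium_payment ("
        "policy_id INTEGER PRIMARY KEY, channel TEXT, due_date TEXT, "
        "paid_date TEXT, days_late INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO fact_premium_payment VALUES (?, ?, ?, ?, ?, ?)",
        records,
    )
    conn.commit()


def run_pipeline(n_policies=100, seed=42):
    """Run extract, transform, and load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(generate_payments(n_policies, seed)))
    return conn


if __name__ == "__main__":
    conn = run_pipeline()
    query = (
        "SELECT channel, SUM(status = 'UNPAID'), COUNT(*) "
        "FROM fact_premium_payment GROUP BY channel ORDER BY channel"
    )
    for channel, unpaid, total in conn.execute(query):
        print(f"{channel}: {unpaid} unpaid of {total} premiums")
```

Running the script prints the number of unpaid premiums per channel, which is the kind of consolidated view that a reporting layer could build on. A real implementation would replace `generate_payments` with extracts from the actual source systems and SQLite with the chosen warehouse platform.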
By the end of this project, we will have:
- A complete ETL pipeline that integrates data from multiple sources.
- A structured data warehouse for accurate reporting.
- Interactive dashboards to analyze lapsation trends.
- Predictive insights to identify high-risk customers and improve retention.
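As an illustration of the last deliverable, the sketch below flags at-risk policies with a simple rule-based heuristic. This is a stand-in for a predictive model, not the project's method: the `PolicySnapshot` fields, the 30-day grace period, and the thresholds are all assumed values for demonstration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

GRACE_PERIOD_DAYS = 30  # assumed grace period; real policies may differ


@dataclass
class PolicySnapshot:
    """Hypothetical per-policy summary derived from warehouse data."""
    policy_id: int
    oldest_unpaid_due: Optional[date]  # None means all premiums are paid
    late_payments_last_year: int


def risk_level(policy: PolicySnapshot, today: date) -> str:
    """Classify a policy as 'lapsed', 'high', 'medium', or 'low' risk."""
    if policy.oldest_unpaid_due is None:
        days_overdue = 0
    else:
        days_overdue = max((today - policy.oldest_unpaid_due).days, 0)

    if days_overdue > GRACE_PERIOD_DAYS:
        return "lapsed"
    # Currently overdue, or a pattern of late payments (assumed threshold: 3).
    if days_overdue > 0 or policy.late_payments_last_year >= 3:
        return "high"
    if policy.late_payments_last_year >= 1:
        return "medium"
    return "low"


if __name__ == "__main__":
    today = date(2024, 6, 30)
    portfolio = [
        PolicySnapshot(1, None, 0),
        PolicySnapshot(2, None, 2),
        PolicySnapshot(3, date(2024, 6, 20), 0),
        PolicySnapshot(4, date(2024, 4, 1), 1),
    ]
    for p in portfolio:
        print(p.policy_id, risk_level(p, today))
```

A rule like this is easy to explain to business users and can serve as a baseline. A trained model could later replace `risk_level` while keeping the same inputs and output labels.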
This project demonstrates how data-driven strategies can reduce policy lapsation and improve business outcomes in the life insurance industry.