We built an end-to-end system that performs ETL and integrates with BI tools. It features:
- Intelligent scheduling and on-demand, auto-scaled Hadoop clusters on AWS
- User events data stream processing with Amazon Kinesis and Apache Spark
- Event tagging with Lambda functions
- Data marts based on Hive, HBase, PostgreSQL and Infobright
- Visualization with WebFOCUS and Tableau
- Cloud-cost optimization with proprietary algorithms
- Creation of traditional Business Intelligence data marts
- Creation of specialized data marts for AI, as described in the following sections.
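As an illustration of the event-tagging item in the list above, here is a minimal sketch. It assumes that the tagging ran as AWS Lambda handlers fed by a Kinesis stream, which delivers each record's payload base64-encoded under `Records[].kinesis.data`. The tagging rules, the `type` field, and the function names are hypothetical and are not taken from the actual system.

```python
import base64
import json

# Hypothetical tagging rules (assumption): event type -> tag.
TAG_RULES = {
    "play": "engagement",
    "pause": "engagement",
    "subscribe": "conversion",
    "cancel": "churn_signal",
}


def tag_event(event: dict) -> dict:
    """Return a copy of the event with a 'tag' field added."""
    tagged = dict(event)
    tagged["tag"] = TAG_RULES.get(event.get("type"), "untagged")
    return tagged


def handler(event, context):
    """Lambda-style entry point for a batch of Kinesis records."""
    tagged = []
    for record in event["Records"]:
        # Kinesis payloads arrive base64-encoded; here they are assumed to be JSON.
        payload = base64.b64decode(record["kinesis"]["data"])
        tagged.append(tag_event(json.loads(payload)))
    return tagged
```

In a real pipeline the handler would forward the tagged events to a downstream store or stream rather than return them; returning the list keeps the sketch easy to test locally.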
Good customers should mostly be left alone, but appropriate reminders, thank-yous, and special offers are appreciated if done in moderation. For this client, on top of the data warehouses, we created an integrated module to determine whether customers respond positively to various types of communications. The machine learning models dynamically learned customer behavior and were able to:
- Predict, on a regular basis, which customers are going to churn within a predefined period (e.g., in 2–6 weeks).
- Identify which newly subscribed customers still in trial mode are more likely to be retained.
- Identify long-term customers who are likely to churn.
- Drill down into the analysis and identify the factors that distinguish retained customers from churned ones.
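To make the prediction window concrete, the sketch below shows one possible way to derive training labels and customer segments for such models. It is not the client's implementation. The function names, the 30-day trial length, and the 365-day long-term threshold are assumptions chosen for illustration; only the 2 to 6 week window comes from the text above.

```python
from datetime import date, timedelta
from typing import Optional


def churn_label(snapshot: date, cancelled_on: Optional[date],
                min_weeks: int = 2, max_weeks: int = 6) -> int:
    """Return 1 if the customer cancels inside the prediction window, else 0.

    The window runs from snapshot + min_weeks to snapshot + max_weeks, inclusive.
    """
    if cancelled_on is None:
        return 0
    window_start = snapshot + timedelta(weeks=min_weeks)
    window_end = snapshot + timedelta(weeks=max_weeks)
    return int(window_start <= cancelled_on <= window_end)


def customer_segment(subscribed_on: date, snapshot: date,
                     trial_days: int = 30, long_term_days: int = 365) -> str:
    """Bucket a customer by tenure; both thresholds are hypothetical."""
    tenure = (snapshot - subscribed_on).days
    if tenure < trial_days:
        return "trial"
    if tenure >= long_term_days:
        return "long_term"
    return "regular"
```

Recomputing the labels at each snapshot date is what allows the prediction to run on a regular basis, and training a separate model per segment is one way to treat trial and long-term customers differently.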
Automatic text classification is important for analyzing surveys, customer support transcripts, and similar free-text sources. It can detect dissatisfaction with the system that isn’t easily identifiable with traditional reports. We performed hierarchical classification and sentiment analysis of free-text surveys and customer support transcripts from existing customers. As a result, the client could obtain a real-time understanding of the reasons for customer dissatisfaction and could preemptively respond and help their customers. Ultimately, this improved customer satisfaction and lowered customer turnover. This was implemented using state-of-the-art machine learning and natural language processing algorithms.
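The idea behind hierarchical classification can be sketched with a deliberately simple model: one classifier picks the top-level category, and a second classifier, trained only on that category's examples, picks the subcategory. The sketch below uses a small multinomial Naive Bayes written from scratch as a stand-in. The production system used more advanced models that are not described here, and the class names and the categories in the usage example are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        words = [w for w in text.lower().split() if w in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class HierarchicalClassifier:
    """Predict a (top-level, subcategory) pair in two stages."""

    def fit(self, texts, paths):
        self.top = NaiveBayes().fit(texts, [top for top, _ in paths])
        by_top = defaultdict(list)
        for text, (top, sub) in zip(texts, paths):
            by_top[top].append((text, sub))
        # One subcategory model per top-level category.
        self.sub = {
            top: NaiveBayes().fit([t for t, _ in rows], [s for _, s in rows])
            for top, rows in by_top.items()
        }
        return self

    def predict(self, text):
        top = self.top.predict(text)
        return top, self.sub[top].predict(text)
```

A usage example with made-up survey snippets:

```python
texts = [
    "app crashes on my tv",
    "app login fails on my phone",
    "video keeps buffering and freezing",
    "stream quality is poor and buffering",
    "charged twice this month",
    "unexpected charge on my card",
    "price is too high",
    "subscription price went up again",
]
paths = [
    ("technical", "app"), ("technical", "app"),
    ("technical", "streaming"), ("technical", "streaming"),
    ("billing", "charge"), ("billing", "charge"),
    ("billing", "price"), ("billing", "price"),
]
clf = HierarchicalClassifier().fit(texts, paths)
print(clf.predict("the stream keeps buffering"))
```

Sentiment analysis can follow the same pattern by fitting a separate classifier on sentiment labels instead of category paths.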
We built a recommendation engine that combines content-based and collaborative filtering, driven by users’ implicit feedback. The system integrated input streams from users’ watch history, including various parameters for modeling their feedback:
- whether the users completely watched a particular item
- whether they interrupted something and never completed it
- whether they started watching something on one device and continued on another, etc.
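The signals listed above can be turned into numeric implicit-feedback scores and fed to a collaborative filter. The following is a minimal sketch of that idea using item-to-item cosine similarity. It covers only the collaborative part, not the content-based part, and it is not the deployed engine. The `WatchEvent` fields, the score thresholds, and the cross-device bonus are all hypothetical.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class WatchEvent:
    user: str
    item: str
    fraction_watched: float  # 0.0 to 1.0
    devices: int             # distinct devices used for this item


def implicit_score(e: WatchEvent) -> float:
    """Map a watch event to a preference weight (hypothetical weighting)."""
    if e.fraction_watched >= 0.9:
        score = 1.0                  # watched to completion
    elif e.fraction_watched < 0.2:
        score = 0.1                  # interrupted early and never completed
    else:
        score = e.fraction_watched
    if e.devices > 1:
        score += 0.25                # continued on another device
    return score


def build_matrix(events):
    """Return item -> user -> score, keeping the strongest signal per pair."""
    matrix = defaultdict(dict)
    for e in events:
        previous = matrix[e.item].get(e.user, 0.0)
        matrix[e.item][e.user] = max(previous, implicit_score(e))
    return matrix


def cosine(a: dict, b: dict) -> float:
    dot = sum(a[u] * b[u] for u in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(matrix, user, top_n=3):
    """Rank unseen items by similarity to the items the user already watched."""
    seen = {item: users[user] for item, users in matrix.items() if user in users}
    scores = defaultdict(float)
    for candidate, cand_users in matrix.items():
        if candidate in seen:
            continue
        for item, weight in seen.items():
            scores[candidate] += weight * cosine(matrix[item], cand_users)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, score in ranked[:top_n] if score > 0]
```

Keeping an abandoned view as a small positive weight rather than dropping it is a design choice: it lets a weak signal still contribute without outweighing completed views.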
Evaluation on an independent set of users and content provided compelling evidence that the system was effective. Furthermore, after deployment, we observed a clear increase in the total volume of watched content.