Real World Data for Novel Health-Insurance products pilot: Diving into the system

Personalization of health insurance products needs to be based on continuous risk assessment of the individual, since lifestyle and behaviour cannot be assessed at a single point in time; they involve people’s habits and how those habits change. Health insurance products that continuously assess customers’ lifestyle and behaviour are dynamically personalized.

Behavioural assessments, much like their clinical counterparts, rely on data. For behaviour, data collection needs to be continuous, facilitated by software tools that capture the important aspects of lifestyle and behaviour. In Pilot #12 of INFINITECH, “Real World Data for Novel Health-Insurance products”, insurance experts define the data to be collected, and the Healthentia e-Clinical platform facilitates the collection. Continuous risk assessment services are provided to health insurance professionals by training machine learning (ML) prediction models for the required health parameters.

The pilot focuses on health insurance and risk analysis by developing two AI-powered services: risk assessment and fraud detection. The risk assessment service allows the insurance company to adapt prices by classifying individuals according to their lifestyle. The fraud detection service is based partly on outlier analysis of the data, but mainly on a virtual coach that advises individuals on their lifestyle choices, aiming to improve their health and also to persuade them to use the system correctly. Both services rely on a model of health outlook that is trained on the collected data.
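
The pilot's actual outlier method is not described here. As a minimal sketch of what outlier analysis on self-reported data can look like, the example below assumes a hypothetical list of daily step counts and flags values using a median-absolute-deviation rule (the function name, threshold and data are illustrative assumptions, not part of Healthentia):

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Assumption: a median/MAD rule is used for illustration only; the pilot
    may rely on a different technique.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (almost) identical, so nothing stands out.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical week of step counts with one implausible entry.
daily_steps = [7200, 6900, 7500, 7100, 48000, 7300, 6800]
suspicious_days = flag_outliers(daily_steps)
print(suspicious_days)
```

A flagged day would not prove fraud on its own; in the pilot's design the virtual coach carries most of the weight in encouraging correct use of the system.
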
The pilot involves two systems:

The Pilot #12 testbed, built within the INFINITECH project and deployed on the NOVA sandbox.
The Healthentia e-Clinical platform, provided by Innovation Sprint.

Data is collected by the Healthentia mobile app, a companion app for insured clients, and presented to the experts through the Healthentia portal application. Model-based inference for both services also takes place within Healthentia, and the resulting decisions are visualized in the same way. This is the online part of the pilot.

Model training, on the other hand, takes place on the Pilot #12 testbed in a secure and privacy-preserving way. This offline part is facilitated by the following INFINITECH components, deployed on the testbed:

The Data Collection Tool continuously queries the Healthentia platform for new data and stores it in the miniIO internal storage.
The Regulatory Compliance Tools include the Data Protection Orchestrator, which is invoked to start a new model training process. It contacts the Data Collection Tool to retrieve the list of files in miniIO that should be included in the training process and then forwards this list to the Anonymization Tool.
The Anonymization Tool loads the designated data from the miniIO storage and performs different levels of anonymization. It then stores the results in LeanXcale.
The LeanXcale database hosts the data repository for model training. Data anonymized at different levels is stored here.
The Model Trainer is a collection of Python classes handling (a) creation of training, validation and testing datasets for different combinations of input attributes and output outcomes, (b) model training, (c) model evaluation and (d) model export to Healthentia.
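
The flow through these components can be sketched roughly as follows. This is not the INFINITECH code: the dictionaries stand in for the miniIO storage and the LeanXcale repository, and every function name, record field, salt and split ratio is a hypothetical chosen for illustration.

```python
import hashlib
import random

# Hypothetical in-memory stand-ins for miniIO and LeanXcale.
object_store = {}   # file name -> list of raw records
repository = []     # anonymized records available for training


def collect(file_name, records):
    """Data Collection Tool stand-in: store newly fetched records as a file."""
    object_store[file_name] = records


def anonymize(record, salt="demo-salt"):
    """Anonymization Tool stand-in: pseudonymize the id and coarsen the age."""
    pseudo_id = hashlib.sha256((salt + record["user_id"]).encode()).hexdigest()[:12]
    decade = record["age"] // 10 * 10
    return {
        "pid": pseudo_id,
        "age_band": f"{decade}-{decade + 9}",
        "steps": record["steps"],
        "outcome": record["outcome"],
    }


def start_training_preparation(seed=0, percentages=(60, 20, 20)):
    """Orchestrator stand-in: list files, anonymize, store, then split."""
    for file_name in sorted(object_store):
        repository.extend(anonymize(r) for r in object_store[file_name])
    rows = list(repository)
    random.Random(seed).shuffle(rows)  # fixed seed for a reproducible split
    n_train = len(rows) * percentages[0] // 100
    n_val = len(rows) * percentages[1] // 100
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


# Hypothetical records, as if fetched from the platform.
collect("batch-001.json", [
    {"user_id": f"u{i}", "age": 20 + i, "steps": 5000 + 100 * i, "outcome": i % 2}
    for i in range(10)
])
train, val, test = start_training_preparation()
print(len(train), len(val), len(test))
```

A salted hash and decade-wide age bands are only one possible anonymization level; the text above notes that the Anonymization Tool supports several, and the real tool's techniques are not specified here.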

The pilot is now starting a study with external participants. Stay tuned for the first use of its testbed for model training in early 2022!