Fighting Fraud with Machine Learning

Late Tuesday afternoon a client calls his insurance agent to submit a claim: a golf ball has gone through his window. The insurance agent knows that the client lives near a golf course and submits the claim to the carrier. However, details of the claim don’t completely make sense to the agent, and she’s admittedly a bit suspicious. But it is still possible for a golf ball to have gone through her client’s window, so she proceeds with caution.

The carrier receives the claim, and it is automatically reviewed by fraud detection software. The software is powered by machine learning using data populated by the carrier’s engineers. Data like satellite maps, demographics, and historic fraud cases are entered into the system to “teach” the software to identify and flag possible cases of fraud based on a variety of factors. When the claim is reviewed by the software, it is flagged as potential fraud.

How machine learning tech pinpoints “insur-fraud”

The software concludes that this “golf ball-gate” claim is likely fraudulent for several reasons. First, the machine learning tool uses satellite imagery to identify areas near golf courses where such accidents are plausible, and in this case it determined that the home is too far from the course to reasonably be in the path of an errant ball. Second, data from past fraudulent claims of broken windows near golf courses helped flag this case. That data includes the end-insured’s demographics and past claim history, along with hundreds of other variables that humans likely would not detect without the aid of a computer.
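To make the distance check concrete, here is a minimal sketch in Python of how a "too far from the course" signal might be computed. It is an illustration only, not the carrier's or any vendor's actual method: the function names, the coordinates, and the 300-meter range are all hypothetical assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

# Hypothetical threshold: how far an errant ball could plausibly travel
# beyond the course boundary. A real system would tune or learn this.
MAX_PLAUSIBLE_BALL_RANGE_M = 300


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def too_far_from_course(home, course_points,
                        max_range_m=MAX_PLAUSIBLE_BALL_RANGE_M):
    """Return True if the home is farther than max_range_m from every
    sampled point on the course boundary (course_points must be non-empty)."""
    nearest = min(haversine_m(home[0], home[1], p[0], p[1])
                  for p in course_points)
    return nearest > max_range_m
```

In practice a signal like this would be one input among the many variables the model weighs, not a verdict on its own.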

As a result, the claims adjuster reviews the flagged case with the information provided by the machine learning tool and then further investigates. If the claim is ultimately denied, the fraud detection software saves the carrier money and frustration, and the information from the denied claim can be added to the software to continue to help flag similar cases of fraud in the future.

A nationwide problem in insurance

According to the FBI, the total cost of non-medical insurance fraud in the United States is more than $40 billion per year. This cost directly impacts insureds, increasing their premiums by $400 to $700 each year. As carriers continue to improve the data used by their machine learning software to detect fraudulent claims, less money will be paid out on fraudulent claims, which should result in lower premiums for consumers over time.

Humans are still needed, but not their biases

Machine learning begins and ends with humans. Humans are not correct 100 percent of the time, so fraud detection models built on human-supplied data are vulnerable to incorrect conclusions. That’s why combining objective human involvement with sound technology is the best way to build truly effective machine learning software. The software flags a claim as potentially fraudulent using complex data analysis, but it takes a human being to make a final determination.

Machine learning software is configured based on data sets accumulated by humans as well as real-life examples of fraud cases caught by humans. During these processes, unconscious human biases can be introduced into the data, potentially causing valid claims to be flagged as fraudulent and causing a ripple effect for future fraud detection in the system. Therefore, before new data is added to machine learning software, it should be reviewed for potential bias, and data teams should be educated on the kinds of bias, namely race and gender, that are commonly introduced unintentionally.
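One simple way a data team might review for bias is to compare how often claims are flagged across groups. The Python sketch below is an illustrative assumption, not a description of any particular product: the record fields ("group", "flagged") and both function names are hypothetical, and a large gap between groups is a prompt for human review rather than proof of bias.

```python
from collections import defaultdict


def flag_rates_by_group(records, group_key):
    """Return {group: share of that group's claims that were flagged}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [flagged, total]
    for record in records:
        tally = counts[record[group_key]]
        if record["flagged"]:
            tally[0] += 1
        tally[1] += 1
    return {group: flagged / total
            for group, (flagged, total) in counts.items()}


def disparity_ratio(rates):
    """Lowest group flag rate divided by the highest.
    1.0 means all groups are flagged at the same rate; values near 0
    mean one group is flagged far more often than another."""
    lowest, highest = min(rates.values()), max(rates.values())
    return lowest / highest if highest > 0 else 1.0


# Hypothetical sample: group A has 1 of 4 claims flagged, group B has 2 of 4.
sample = (
    [{"group": "A", "flagged": f} for f in (True, False, False, False)]
    + [{"group": "B", "flagged": f} for f in (True, True, False, False)]
)
rates = flag_rates_by_group(sample, "group")
ratio = disparity_ratio(rates)
```

A check like this can run both on historical training data and on the software's live output, so that skew is caught before it feeds back into the next round of training.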

Navigating data, security, and more

Bias may not be the only issue facing machine learning. Over time, it is necessary for insurance data professionals to evaluate the efficacy and stability of the models that are being used by the machine learning software. Factors such as shifting neighborhood demographics, population fluctuations, and other physical and social environmental changes should be considered and regularly monitored to ensure objective software performance.
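As one example of what regular monitoring could look like, the Python sketch below computes a population stability index (PSI), a common measure of how far a variable's current distribution has drifted from the distribution the model was trained on. The function name, the bin edges, and the sample values are hypothetical, and this is a simplified illustration rather than a production monitoring setup.

```python
import math


def population_stability_index(expected, actual, edges, eps=1e-6):
    """PSI between a baseline sample (expected) and a recent sample (actual).

    edges: sorted bin boundaries; values fall into len(edges) + 1 bins.
    eps:   floor applied to empty bins so the logarithm stays defined.
    Both samples are assumed to be non-empty.
    """
    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for value in values:
            # Bin index = number of boundaries at or below the value.
            counts[sum(value >= edge for edge in edges)] += 1
        return [max(count / len(values), eps) for count in counts]

    base, recent = bin_shares(expected), bin_shares(actual)
    return sum((r - b) * math.log(r / b) for b, r in zip(base, recent))


# Hypothetical example: a variable's values at training time vs. today.
baseline = [1, 2, 3, 4]
shifted = [3, 4, 5, 6]
psi_same = population_stability_index(baseline, baseline, edges=[2.5, 4.5])
psi_shifted = population_stability_index(baseline, shifted, edges=[2.5, 4.5])
```

A PSI of zero means the two distributions match across the chosen bins. A frequently cited rule of thumb treats values above roughly 0.25 as a significant shift worth investigating, though the right threshold depends on the variable and the model.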

Additionally, there is the issue of collecting and accessing safe and accurate data sets beyond what is available in-house. Data sets can be extremely expensive to purchase and must be reputably collected and managed, with all private personal information omitted and protected. Further, when building out machine learning software to detect fraud, data professionals may have to add new data gradually if their budget doesn’t allow many data sets to be purchased at once. This bottleneck leaves the software working from limited data and inconsistent input, which can hamper its ability to accurately detect fraud.

A key player in fraud detection

Despite a steep learning curve and potential pitfalls, machine learning is essential to the insurance industry moving forward. As time goes on and more data points are accumulated, machine learning software will become more accurate and efficient in helping the industry detect fraud. This will result in lower monetary losses for carriers, leading to more competitive rates for agencies and lower premiums for end-insureds. Cases like the “golf ball-gate” example above will increasingly train machine minds to detect fraudulent claims, with human minds responsibly nurturing them in the process.
