Guidewire’s Approach to Predictive Analytics, Part Two: Data Strategy

The Big Picture

Data is a purpose-driven asset. It is collected for specific reasons to meet specific needs, and the structure of a given set of data is driven by its immediate purpose as well. Data models designed for efficient daily processing are not those designed for efficient storage, and both differ from the structures that facilitate easy access for reporting and business intelligence.

Data strategies that focus on processing, storage, and reporting can also fail miserably when confronted with the needs of actuaries and predictive modelers. Historically, these needs have been an afterthought, leaving actuaries and data scientists to pull their data themselves. Even for companies that have accounted for them, costs have often limited the application of the data, and as soon as a new use case arose, one with new data requirements, the existing structures proved costly and cumbersome to modify.

In this installment we consider the unique needs of predictive analytics with respect to data.

Data Requirements for Predictive Analytics

Any comprehensive approach to providing data for predictive analytics needs to grapple with the following issues:

  • Historical data
  • Flexible access to new or third-party data – for new tasks
  • Efficient extraction of data – for common tasks

Predictive analytics, like many common actuarial tasks such as pricing and loss reserving, requires historical data. How much is required depends on the line of business, but three to ten years of history should be expected. The consequences of this need are several: both storage and extraction of data are relevant, and the need for history also brings legacy systems into the picture.

Over any multi-year span, there is rarely a single platform from which data is generated and in which data is stored. This could be due to the recent replacement of legacy processing systems or due to acquisitions. In either case, the different formats and quality of the data provide a fundamental challenge. In a sense, the inclusion of third-party data is part of the same challenge because it is simply another data source with different structures to incorporate.
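One way to picture this challenge is as a normalization step: records from each source are mapped onto a single common schema before any modeling begins. The sketch below is purely illustrative; the field names (`POL_NO`, `EFF_DT`, `policy_id`, and so on) are hypothetical stand-ins for whatever a legacy system, a modern platform, or a third-party feed actually produces.

```python
from datetime import date

# Hypothetical raw records from two sources with different schemas
# (field names are illustrative, not from any specific system).
legacy_row = {"POL_NO": "A-1001", "EFF_DT": "20150301", "WRIT_PREM": "1200"}
modern_row = {"policy_id": "B-2002", "effective_date": "2019-07-15",
              "written_premium": 950.0}

def from_legacy(row):
    # Map legacy column names and formats onto the common schema.
    raw = row["EFF_DT"]
    return {
        "policy_id": row["POL_NO"],
        "effective_date": date(int(raw[:4]), int(raw[4:6]), int(raw[6:])),
        "written_premium": float(row["WRIT_PREM"]),
    }

def from_modern(row):
    # The modern source already matches the target names; only types change.
    y, m, d = map(int, row["effective_date"].split("-"))
    return {
        "policy_id": row["policy_id"],
        "effective_date": date(y, m, d),
        "written_premium": float(row["written_premium"]),
    }

# A unified, analysis-ready view across both sources.
unified = [from_legacy(legacy_row), from_modern(modern_row)]
```

Third-party data slots into the same pattern: it is one more `from_*` adapter onto the shared schema.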

Another issue is the tension between repeatable, efficient data extracts and the flexibility that predictive analytics requires. It should be stressed that, at its core, predictive analytics is a creative process. For any business problem, the predictive modeler needs to decide how existing, accessible data might be used to provide relevant information for solving the problem. Pre-packaged data extractions limit this creativity and blunt potential new ideas that could prove to be game-changing.

And yet, while the predictive modeler would prefer complete flexibility, many models and tasks are by now well known. For example, a policy-level data structure is needed for an underwriting model, while coverage-level data is preferred for pricing. Extracts at these levels will be needed regardless of other predictive goals, so producing them should be made as efficient as possible: efficient extraction of data to support known predictive models is a must.
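The relationship between the two grains mentioned above is a simple roll-up: coverage-level rows aggregate into one record per policy. A minimal sketch, with hypothetical coverage codes and amounts:

```python
from collections import defaultdict

# Hypothetical coverage-level rows: one record per coverage on a policy.
coverages = [
    {"policy_id": "A-1001", "coverage": "BI", "premium": 700.0, "incurred_loss": 0.0},
    {"policy_id": "A-1001", "coverage": "PD", "premium": 500.0, "incurred_loss": 250.0},
    {"policy_id": "B-2002", "coverage": "BI", "premium": 950.0, "incurred_loss": 1200.0},
]

def to_policy_level(rows):
    # Roll coverage-level premium and loss up to one record per policy,
    # the grain an underwriting model would typically consume.
    totals = defaultdict(lambda: {"premium": 0.0, "incurred_loss": 0.0, "n_coverages": 0})
    for r in rows:
        agg = totals[r["policy_id"]]
        agg["premium"] += r["premium"]
        agg["incurred_loss"] += r["incurred_loss"]
        agg["n_coverages"] += 1
    return dict(totals)

policy_level = to_policy_level(coverages)
```

A pricing model would skip the roll-up and consume the coverage-level rows directly, which is why both extracts are worth standardizing.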

Guidewire’s Approach to Data

Guidewire has integrated a cloud-based data platform into its core systems. This creates an ecosystem that solves both sides of the data problem by:

  • Providing curated data extracts needed for common predictive models
  • Providing access to build pipelines defining custom extractions for new models

In addition, the Guidewire Cyence data-listening platform can bring unique and innovative data to supplement data from the policy and claims processing systems.
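To make the second bullet above concrete, a custom extraction pipeline can be thought of as a chain of small, composable steps. This is only a conceptual sketch, not Guidewire's actual API; the step names and fields are hypothetical.

```python
# Minimal sketch of a composable extraction pipeline: each step is a
# function that consumes and produces an iterable of records.
def pipeline(*steps):
    def run(rows):
        for step in steps:
            rows = step(rows)
        return list(rows)
    return run

def only_line(line):
    # Filter step: keep records for a single line of business.
    return lambda rows: (r for r in rows if r["line"] == line)

def add_loss_ratio(rows):
    # Transform step: derive a modeling feature from raw fields.
    for r in rows:
        yield {**r, "loss_ratio": r["incurred_loss"] / r["premium"]}

rows = [
    {"line": "auto", "premium": 1000.0, "incurred_loss": 600.0},
    {"line": "home", "premium": 800.0, "incurred_loss": 200.0},
]

# A new model with new requirements just means a new composition of steps.
extract = pipeline(only_line("auto"), add_loss_ratio)
result = extract(rows)
```

The point of the design is that curated extracts for known models and ad-hoc extracts for new ideas can share the same building blocks.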

Of course, since predictive modeling can require a decade or more of data, and with technology evolving as it has, it has seemed as if there would always be some legacy system complicating the data problem. Legacy systems will remain a challenge wherever they exist, but Guidewire’s cloud-based approach allows for regular updates and the potential for continued relevancy.

Guidewire appreciates the needs of actuaries and predictive modelers and has built its data strategy accordingly. Stay tuned for Part Three of this series, on building predictive models, and check out Part One, on the need for a comprehensive approach.
