Dynamic Insurance Pricing – Telematics Analytics & Behavioural Monitoring

In the auto insurance industry, the terms telematics and usage-based insurance (UBI) are often used interchangeably – but they are actually two different concepts. Usage-based insurance is the broader concept and can be broken down into two categories: self-reporting policies and telematics-based policies. There are two main types of telematics-based insurance products: Pay-as-you-drive (PAYD) and Pay-how-you-drive (PHYD). Vehicle telematics refers to automobile systems that combine Global Positioning System (GPS) tracking with other wireless communications for services such as automatic roadside assistance and remote diagnostics; more broadly, telematics covers any solution based on information flowing to and/or from a vehicle. When implemented, telematics has the potential to increase operational efficiency and improve driver safety in a number of ways; for example:

  • GPS technology tracks a vehicle’s location, mileage, and speed. Fleet and distribution companies can use this information to optimize routes and scheduling.
  • Communications technology promotes connectivity between drivers and dispatch.
  • Sensors monitor vehicle diagnostics, which can then be used to streamline vehicle maintenance.
  • Accelerometers measure changes in speed and direction, while cameras monitor road conditions and drivers’ actions. This information can be used to improve driver performance through one-on-one or in-vehicle coaching programs.

Insurers want access to a database of telematics data to help them set personalized premiums for individual drivers, but arrangements governing how that information is gathered, managed and accessed could be subject to scrutiny by competition regulators. Telematics data can constitute personal data, and therefore fall subject to data protection laws, because it records the activities of individual drivers or of a number of individuals. Insurers will need to make sense of this data through a model with predictive capabilities based on frequency of driving, hard braking, sharp turns, time of day, and a handful of other factors, to determine a personalized premium for each customer based on their risk profile. UBI programs offer many advantages to insurers, consumers and society. Linking insurance premiums more closely to actual individual vehicle or fleet performance allows insurers to price premiums more accurately. This increases affordability for lower-risk drivers, many of whom are also lower-income drivers. It also gives consumers the ability to control their premium costs by incentivizing them to reduce miles driven and adopt safer driving habits. Fewer miles and safer driving also help reduce accidents, congestion, and vehicle emissions, which benefits society.
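
As a rough illustration of the kind of scoring model described above, here is a minimal Python sketch that turns a handful of telematics factors into a premium multiplier. The factor names, weights, caps and base rate are hypothetical assumptions for illustration, not any insurer’s actual rating plan.

```python
# Minimal sketch of a telematics-based pricing model.
# All weights, factor names, bounds and the base rate are hypothetical.

HYPOTHETICAL_WEIGHTS = {
    "hard_brakes_per_100mi": 0.04,   # frequent hard braking raises risk
    "sharp_turns_per_100mi": 0.02,
    "night_miles_share": 0.30,       # share of miles driven late at night
    "avg_daily_miles": 0.005,
}

def risk_score(trip_summary: dict) -> float:
    """Combine behavioural factors into a relative risk score (1.0 = baseline)."""
    score = 1.0
    for factor, weight in HYPOTHETICAL_WEIGHTS.items():
        score += weight * trip_summary.get(factor, 0.0)
    return score

def monthly_premium(base_rate: float, trip_summary: dict,
                    floor: float = 0.7, cap: float = 2.0) -> float:
    """Scale a base rate by the risk score, bounded to keep premiums stable."""
    multiplier = min(max(risk_score(trip_summary), floor), cap)
    return round(base_rate * multiplier, 2)

# Example: a driver with moderate mileage and a few hard-braking events.
driver = {"hard_brakes_per_100mi": 3.0, "sharp_turns_per_100mi": 1.5,
          "night_miles_share": 0.1, "avg_daily_miles": 25}
print(monthly_premium(base_rate=80.0, trip_summary=driver))
```

In practice an insurer would fit such weights with a statistical or machine-learning model on historical claims data rather than hand-pick them.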

Auto insurance is going through transformational changes due to weather, telematics, social data and advances in auto technology. The next generation of insurance pricing will look much like the utility industry's pricing model: driven by actual usage, with rate tiers based on time of day, location and the customer's risk profile. As the auto industry matures with driverless-car technology, insurance will increasingly be driven by the risk profile of the car and its technology rather than the risk profile of the individual customer, and data from self-driving and driverless cars will be fed back to insurance companies as telematics data. Three insurance suppliers and an auto parts maker have warned in their most recent annual reports that driverless cars and the technology behind them could one day disrupt the way they do business. The industry collected $107.4 billion in passenger car auto insurance premiums in 2013, the latest year for which figures were available, according to the Insurance Information Institute. Self-driving cars could have other effects as well. Insurers expect car- and software-makers to face litigation when crashes do happen, shifting at least some of the expense from consumer auto insurance to commercial liability policies. A 2013 analysis by PricewaterhouseCoopers suggested that driverless cars won't affect insurers' bottom lines anytime soon, but that they eventually will. The use of telematics helps insurers more accurately estimate accident damages and reduce fraud by enabling them to analyze driving data (such as hard braking, speed, and time) during an accident. This additional data can also be used by insurers to refine or differentiate UBI products. Additionally, the ancillary safety benefits offered in conjunction with many telematics-based UBI programs help lower accident- and vehicle-theft-related costs by improving accident response time, allowing stolen vehicles to be tracked and recovered, and monitoring driver safety. Telematics also allows fleets to determine the most efficient routes, saving costs related to personnel, gas, and maintenance.
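
To make the utility-style analogy concrete, the following sketch charges per mile at a rate that depends on the time-of-day band and the driver's risk multiplier. The bands and per-mile rates are invented for illustration and are not drawn from any real pricing schedule.

```python
# Sketch of utility-style, usage-driven pricing: per-mile rates vary by
# time-of-day band. The bands and rates below are illustrative only.

HYPOTHETICAL_RATES_PER_MILE = {
    "night":    0.05,   # 22:00-06:00, low traffic
    "peak":     0.12,   # rush hours, higher accident frequency
    "off_peak": 0.08,   # everything else
}

def band_for_hour(hour: int) -> str:
    if hour >= 22 or hour < 6:
        return "night"
    if 7 <= hour <= 9 or 16 <= hour <= 19:
        return "peak"
    return "off_peak"

def trip_charge(miles: float, start_hour: int, risk_multiplier: float = 1.0) -> float:
    """Charge for a single trip: usage x time-of-day rate x driver risk profile."""
    rate = HYPOTHETICAL_RATES_PER_MILE[band_for_hour(start_hour)]
    return round(miles * rate * risk_multiplier, 2)

print(trip_charge(miles=12.4, start_hour=8, risk_multiplier=1.3))   # peak-hour commute
print(trip_charge(miles=12.4, start_hour=13))                       # off-peak errand
```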

According to SMA Research, approximately 36 percent of all auto insurance carriers are expected to use telematics UBI by 2020. Based on a May 2014 CIPR survey of 47 U.S. state and territory insurance departments, insurers currently offer telematics UBI policies in all but five jurisdictions – California, New Mexico, Puerto Rico, the Virgin Islands, and Guam. In twenty-three states, more than five insurance companies are active in the telematics UBI market.

As the move toward dynamic pricing becomes more prominent, some firms will likely go further than others. A great deal of analysis and research will be needed to determine how much of the risk can effectively be explained by static versus dynamic factors.


Addressing Big Data Security!

Data security rules have changed in the age of Big Data. The V-Force (Volume, Veracity and Variety) has changed the landscape for data processing and storage in many organizations. Organizations are collecting, analyzing, and making decisions based on analysis of massive data sets from various sources – web logs, clickstream data and social media content – to gain better insights about their customers, and securing data throughout this process is becoming increasingly important. IBM estimates that 90 percent of the data that now exists has been created in the past two years.

A recent study conducted by Ponemon Institute LLC in May 2013 showed that the average number of breached records was 23,647. German and US companies had the most costly data breaches ($199 and $188 per record, respectively), and these countries also experienced the highest total cost (US at $5.4 million and Germany at $4.8 million). On average, Australian and US companies had data breaches that exposed or compromised the greatest number of records (34,249 and 28,765 records, respectively).

A Forrester report, the “Future of Data Security and Privacy: Controlling Big Data”, observes that security professionals apply most controls at the very edges of the network. However, if attackers penetrate your perimeter, they will have full and unrestricted access to your big data. The report recommends placing controls as close as possible to the data store and the data itself, in order to create a more effective line of defense. Thus, if the priority is data security, then the cluster must be highly secured against attacks.

According to ISACA’s white paper Privacy and Big Data, published in August 2013, enterprises must ask and answer 16 important questions, including these five key questions, which, if ignored, expose the enterprise to greater risk and damage:

  • Can the company trust its sources of Big Data?
  • What information can the company collect without exposing the enterprise to legal and regulatory battles?
  • How will the company protect its sources, processes and decisions from theft and corruption?
  • What policies are in place to ensure that employees keep stakeholder information confidential during and after employment?
  • What actions is the company taking that create trends that can be exploited by its rivals?

Hadoop, like many open source technologies such as UNIX and TCP/IP, wasn’t originally built with the enterprise in mind, let alone enterprise security. Hadoop’s original purpose was to manage publicly available information such as Web links, and it was designed to process large amounts of unstructured data within a distributed computing environment modeled on Google’s architecture. It was not written to support hardened security, compliance, encryption, policy enablement or risk management.

Here are some specific steps you can take to secure your Big Data:

  • Use Kerberos authentication to validate inter-service communication and to validate application requests for MapReduce (MR) and similar functions.
  • Use file/OS-layer encryption to protect data at rest, ensure administrators or other applications cannot gain direct access to files, and prevent leaked information from exposure. File encryption protects against two attacker techniques for circumventing application security controls: it protects data if malicious users or administrators gain access to data nodes and directly inspect files, and it renders stolen files or copied disk images unreadable.
  • Use key/certificate management to store your encryption keys safely and separately from the data you’re trying to protect.
  • Use automation tools like Chef and Puppet to help you validate nodes during deployment and stay on top of patching, application configuration, updating the Hadoop stack, and managing trusted machine images, certificates and platform discrepancies.
  • Log transactions, anomalies, and administrative activity to validate usage and provide forensic system logs.
  • Use SSL or TLS network security to authenticate and ensure privacy of communications between nodes, name servers, and applications. Implement secure communication between nodes, and between nodes and applications. This requires an SSL/TLS implementation that actually protects all network communications rather than just a subset.
  • Anonymize data to remove anything that can be uniquely tied to an individual. Although this technique can protect personal identification, and hence privacy, you need to be careful about how much information you strip out.
  • Use tokenization to protect sensitive data by replacing it with random tokens or alias values that mean nothing to someone who gains unauthorized access to the data (a minimal sketch of tokenization and anonymization follows this list).
  • Leverage cloud database controls, where access controls are built into the database itself, to protect the whole database.
  • Harden the operating system on which the data is processed to lock down the data. The four main protection focus areas should be users, permissions, services, and logging.
  • Use in-line remediation to update configuration, restrict applications and devices, and restrict network access in response to non-compliance.
  • Use the Knox Gateway (“Gateway” or “Knox”) that provides a single point of authentication and access for Apache Hadoop services in a cluster. The goal is to simplify Hadoop security for both users (i.e. who access the cluster data and execute jobs) and operators (i.e. who control access and manage the cluster).
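
As a concrete illustration of the anonymization and tokenization items above, here is a minimal Python sketch: a toy in-memory token vault plus a salted one-way hash. In production the vault would be a hardened, separately managed service (or HSM-backed); the field names and salt handling here are assumptions for illustration.

```python
import hashlib
import secrets

# Toy token vault: maps random tokens back to the original sensitive values.
# In production this mapping would live in a hardened, access-controlled store,
# separate from the analytics cluster.
_vault = {}

def tokenize(value: str) -> str:
    """Replace a sensitive value with a random token that carries no meaning."""
    token = secrets.token_hex(8)
    _vault[token] = value
    return token

def detokenize(token: str) -> str:
    """Recover the original value; only privileged services should call this."""
    return _vault[token]

def anonymize(value: str, salt: bytes) -> str:
    """One-way, salted hash for fields that never need to be recovered."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()

# Example record before loading into the cluster (field names are hypothetical).
record = {"ssn": "123-45-6789", "email": "jane@example.com", "hard_brakes": 3}
salt = secrets.token_bytes(16)              # manage the salt outside the cluster
safe_record = {
    "ssn": tokenize(record["ssn"]),
    "email": anonymize(record["email"], salt),
    "hard_brakes": record["hard_brakes"],   # non-sensitive analytics field
}
print(safe_record)
```

The key design point is separation: the token vault and the salt live outside the Hadoop cluster, so a compromised data node yields only meaningless tokens and hashes.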

A study conducted by Voltage Security showed that 76% of senior-level IT and security respondents are concerned about the inability to secure data across big data initiatives. The study further showed that more than half (56%) admitted that these security concerns have kept them from starting or finishing cloud or big data projects. Built-in Apache Hadoop security still has significant gaps that keep enterprises from leveraging it as-is; to address them, vendors of Hadoop distributions such as Cloudera, Hortonworks and IBM have bolstered security in a few powerful ways.

Cloudera’s Hadoop Distribution now offers Sentry, a new role-based security access control project that will enable companies to set rules for data access down to the level of servers, databases, tables, views and even portions of underlying files.

Sentry’s support for role-based authorization, fine-grained authorization, and multi-tenant administration allows Hadoop operators to do the following (a short policy sketch follows this list):

  • Store more sensitive data in Hadoop,
  • Give more end-users access to that data in Hadoop,
  • Create new use cases for Hadoop,
  • Enable multi-user applications, and
  • Comply with regulations (e.g., SOX, PCI, HIPAA, EAL3)
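
To make the role-based model concrete, the sketch below issues Sentry-style role and grant statements through HiveServer2 using the pyhive client. It assumes a Sentry-enabled cluster; the host, role, group, database and table names are placeholders.

```python
# Sketch: defining a role and granting table-level read access through
# HiveServer2 on a Sentry-enabled cluster. All names are hypothetical, and a
# real deployment would authenticate via Kerberos or LDAP rather than a bare
# username.
from pyhive import hive

conn = hive.Connection(host="hive-gateway.example.com", port=10000,
                       username="hive_admin")
cursor = conn.cursor()

# Create a role for claims analysts and tie it to an OS/LDAP group.
cursor.execute("CREATE ROLE claims_analyst")
cursor.execute("GRANT ROLE claims_analyst TO GROUP claims_team")

# Allow read-only access to a single table rather than the whole warehouse.
cursor.execute("USE insurance")
cursor.execute("GRANT SELECT ON TABLE claims TO ROLE claims_analyst")

cursor.close()
conn.close()
```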

RSA NetWitness and HP ArcSight ESM now serve as weapons against advanced persistent threats that can’t be stopped by traditional defenses such as firewalls or antivirus systems.

Figure 1: Cloudera Sentry Architecture

Hortonworks partner Voltage Security offers data protection solutions that protect data from any source in any format, before it enters Hadoop. Using Voltage Format-Preserving Encryption™ (FPE), structured, semi-structured or unstructured data can be encrypted at source and protected throughout the data life cycle, wherever it resides and however it is used. Protection travels with the data, eliminating security gaps in transmission into and out of Hadoop and other environments. FPE enables data de-identification to provide access to sensitive data while maintaining privacy and confidentiality for certain data fields such as social security numbers that need a degree of privacy while remaining in a format useful for analytics.
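
The idea behind format-preserving encryption can be sketched with the open-source pyffx library (this illustrates the concept only and is not Voltage's product; the key and field below are placeholders):

```python
# Format-preserving encryption sketch using the open-source pyffx library
# (pip install pyffx). The key below is a placeholder and would normally come
# from a key-management service.
import pyffx

key = b"replace-with-a-managed-secret"

# An SSN keeps its 9-digit shape after encryption, so downstream schemas and
# analytics that expect the format continue to work on de-identified data.
ssn_cipher = pyffx.Integer(key, length=9)

ssn = 123456789
protected = ssn_cipher.encrypt(ssn)      # still a 9-digit integer
recovered = ssn_cipher.decrypt(protected)

print(protected, recovered == ssn)
```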

Figure 2: Hortonworks Security Architecture

IBM’s BigInsights provides built-in features that can be configured during the installation process. Authorization is supported in BigInsights by defining roles. InfoSphere BigInsights provides four options for authentication: No Authentication, Flat File authentication, LDAP authentication and PAM authentication. In addition, the BigInsights installer provides the option to configure HTTPS to potentially provide more security when a user connects to the BigInsights web console.

Figure 3: IBM BigInsights Security Architecture

Intel, one of the latest entrants in the distribution-vendor category, came out with a wish list for Hadoop security under the name Project Rhino.

Although today the focus is on technology and technical security issues around big data (and they are important), big data security is not just a technical challenge. Many other domains are also involved, such as legal, privacy, operations, and staffing. Not all big data is created equal, and depending on an organization's data security requirements and risk appetite/profile, different security controls for big data are required.

Regulation – a class of Big Data apps

There are bad guys out there!

Going back to the gist of my last post, one of the pillars that underpinned de-regulation was the idea that companies would work in a ‘correct’ manner and regulate themselves. The truth is that this worked, and still works, very well for 95% of companies, but there are always bad pennies committing fraud or simply not being careful in their accounting practices. Thanks to a few well-known financial disasters, even before the global meltdown, the concept of re-regulation loomed large across many industries. Many sets of rules are now in place to bring governance to company business; some of the better known include Sarbanes-Oxley and Basel II and III, which have been around for a little while now. We might ask what they have in common, and the answer is that these and many more such initiatives demand that very accurate and accountable numbers are produced quickly from very complex underlying data. The need for Business Intelligence rears its head once again, and the term ‘Big Data’ can certainly be applied to some of these initiatives.

Re-regulation demands that some very complex numbers are delivered:

  • Quickly
  • Accurately
  • Transparently

Throw into the pot that the data needed as often as not comes from tens or even hundreds of operational systems distributed across the world, and that some of these initiatives need very complex predictive modelling and detailed segmentation, and we see a new class of Big Data applications.

Big Data = more diverse data

Many ‘mega-trends’ are in place today – globalisation, re-regulation, internet shopping, disengaged customers and more take-overs day by day. The need to have accurate information is paramount simply to survive, let alone grow.

Too often we use big words without thinking about what they mean, and Globalisation is one of them. Now I am not going to write about globalisation here, but it is useful to consider it as a phenomenon: is it real, is it important?

Let’s consider some facts:

  • 70% of the world’s shoes come from one town in China – now if you produce shoes in the UK this fact should be very worrisome.
  • To all intents and purposes the UK no longer has a car industry – we used to have one, but in the end British Leyland, amongst others, proved a tad slow and not too smart. There’s nothing left anymore.
  • In the space of just a few years Vodafone has penetrated nearly the entire known world with its mobile services. Unless you take active steps to prevent it, you are almost guaranteed to end up paying some money, one way or another, to Vodafone this year.
  • Most holiday companies now make a sizable proportion of their revenue from banking products or shipping cargo.

Globalisation is the force behind the break-down of trading barriers, but globalisation is partly a result of another massive change in business practices over the last twenty years that we call de-regulation. Basically, in the ‘old days’ there were rules about what a company (or type of company) could sell. For example, building societies could not lend savers’ money to borrowing customers directly – you had to have a banking license to do that. Retailers could not sell insurance products. Insurance companies could not provide savings accounts. This all changed in the process of de-regulation, and so now retailers can sell banking products, banks can sell insurance products and, by and large, anything goes. When you put the two things together, globalisation and de-regulation, we have another world: a world in which the biggest retailer ever seen, Wal-Mart, can presumably sell banking services in the UK, thus becoming a competitor of Barclays Bank!

Note: Wal-Mart owns ASDA – I’m not sure if they sell banking products, but I guess so.

So what does that mean today in terms of Big Data? Well, now your average retailer knows a lot more about you than ever before. They used to know what you eat; now they know what you wear, where you go on holiday, how much you spend and get paid a month, and so on, and it’s by combining all of this information that a 360-degree view of a consumer can be constructed. By trawling social media feeds they can find out who your friends and family are and what you are saying about their products… scary!

Will Hadoop replace or augment your Enterprise Data Warehouse?

There is a lot of buzz about Big Data and Hadoop these days and their potential for replacing the Enterprise Data Warehouse (EDW). The promise of Hadoop has been the ability to store and process massive amounts of data using commodity hardware that scales extremely well and at very low cost. Hadoop is good for batch-oriented work and not really good at OLTP workloads.

The logical question then is: do enterprises still need the EDW? Why not simply get rid of the expensive warehouse and deploy a Hadoop cluster with HBase and Hive? After all, you never hear about Google or Facebook using data warehouse systems from Oracle, Teradata or Greenplum.

Before we get into that, a little bit of an overview of how Hadoop stores data. Hadoop comprises two components: the Hadoop Distributed File System (HDFS) and the MapReduce engine. HDFS enables you to store all kinds of data (structured as well as unstructured) on commodity servers. Data is divided into blocks and distributed across data nodes, and it is processed using MapReduce programs that are typically written in Java. Layers on top of HDFS such as HBase (a NoSQL database) and Hive (a SQL-on-Hadoop engine) let end users work with the data through higher-level or SQL-like interfaces. In addition, BI reporting, visualization and analytical tools like Cognos, Business Objects, Tableau, SPSS, R etc. can now connect to Hadoop/Hive.
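
As a small illustration of the MapReduce model described above, here is a word-count-style Hadoop Streaming job written in Python that counts trips per vehicle ID. The input format, paths and job invocation are assumptions for illustration.

```python
#!/usr/bin/env python3
# Minimal Hadoop Streaming job in Python: count trips per vehicle ID from
# log lines of the form "<vehicle_id>\t<timestamp>\t<miles>". The input
# format and paths are hypothetical. Invoked roughly like:
#
#   hadoop jar hadoop-streaming.jar \
#     -input /data/trips -output /data/trip_counts \
#     -mapper "trip_count.py map" -reducer "trip_count.py reduce" \
#     -file trip_count.py
import sys

def mapper():
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if fields and fields[0]:
            print(f"{fields[0]}\t1")          # emit (vehicle_id, 1)

def reducer():
    current, count = None, 0
    for line in sys.stdin:                    # input arrives sorted by key
        key, value = line.rstrip("\n").split("\t")
        if key != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = key, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[-1] == "map" else reducer()
```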

A traditional EDW stores structured data from OLTP and back office ERP systems into a relational database using expensive storage arrays with RAID disks. Examples of this structured data may be your customer orders, data from your financial systems, sales orders, invoices etc.  Reporting tools like Cognos, Business Objects, SPSS etc are used to run reports and perform analysis on the data.

So are we ready to dump the EDW and move to Hadoop for all our warehouse needs? There are some things the EDW does very well that Hadoop is still not very good at:

  • Hadoop and HBase/Hive are still very IT-focused. They need people with a lot of expertise in writing MapReduce programs in Java, Pig, etc. Business users who actually need the data are not in a position to run ad-hoc queries and analytics easily without involving IT. Hadoop is still maturing and needs a lot of IT hand-holding to make it work.
  • The EDW is well suited for many common business processes, such as monitoring sales by geography, product or channel; extracting insight from customer surveys; and cost and profitability analyses. The data is loaded into pre-defined schemas/data marts and business users can use familiar tools to perform analysis and run ad-hoc SQL queries.
  • Most EDWs come with pre-built adapters for various ERP systems and databases. Companies have built complex ETL, data marts, analytics, reports, etc. on top of these warehouses. It would be extremely expensive, time-consuming and risky to recode all of that in a new Hadoop environment, and people with Hadoop/MapReduce expertise are not readily available and are in short supply.

Augment your EDW with Hadoop to add new capabilities and insight. For the next couple of years, as the Hadoop/Big Data landscape evolves, augment and enhance your EDW with a Hadoop/Big Data cluster as follows:

  • Continue to store summary structured data from your OLTP and back office systems into the EDW.
  • Store unstructured data that does not fit nicely into “tables” in Hadoop. This means all the communication with your customers – phone logs, customer feedback, GPS locations, photos, tweets, emails, text messages and so on – can be stored in Hadoop, and far more cost-effectively.
  • Correlate data in your EDW with the data in your Hadoop cluster to get better insight about your customers, products, equipment and more (a small sketch follows this list). You can now use this data for analytics that are computation-intensive, like clustering and targeting, and run ad-hoc analytics and models against your data in Hadoop while you are still transforming and loading your EDW.
  • Do not build Hadoop capabilities within your enterprise in a silo. Big Data/Hadoop technologies should work in tandem with and extend the value of your existing data warehouse and analytics technologies.
  • Data warehouse vendors are adding Hadoop and MapReduce capabilities to their offerings. When adding Hadoop capabilities, I would recommend going with a vendor that supports and enhances the open source Hadoop distribution.

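One simple way to correlate the two stores, as suggested in the list above, is to pull a structured EDW extract and an aggregated Hive/Hadoop result into a common frame. The file names and columns in this pandas sketch are hypothetical; in practice the data might move via Sqoop, a Hive export or a connector.

```python
# Sketch: correlating an EDW customer extract with behavioural aggregates
# computed in Hadoop/Hive. File names and column names are hypothetical.
import pandas as pd

edw_customers = pd.read_csv("edw_customer_extract.csv")      # customer_id, region, annual_premium
hadoop_behaviour = pd.read_csv("hive_driving_summary.csv")   # customer_id, avg_hard_brakes, night_miles_share

combined = edw_customers.merge(hadoop_behaviour, on="customer_id", how="left")

# Example insight: compare premiums of the riskiest decile of drivers with the rest.
threshold = combined["avg_hard_brakes"].quantile(0.9)
risky = combined["avg_hard_brakes"] >= threshold
print(combined.loc[risky, "annual_premium"].mean(),
      combined.loc[~risky, "annual_premium"].mean())
```
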
In a few years, as newer and better analytical and reporting capabilities develop on top of Hadoop, it may eventually become a good platform for all your warehousing needs. Solutions like IBM’s Big SQL and Cloudera’s Impala will make it easier for business users to move more of their warehousing needs to Hadoop by improving query performance and SQL capabilities.