Making the most of what you have

Many, many companies have built very sophisticated Data Warehouses. They should start using what they’ve got a little more effectively before moving on to tougher things!

So there I was in an ICA store in Stockholm, with a huge trolley of goods for the weekend and dead pleased that I had eventually got to the front of the queue. It was Saturday, and everyone was in a hurry to get home after queuing for ages on the Stockholm motorways. My partner was diligently packing the goods because it was my turn to pay, so imagine my horror when my debit card was rejected – not once, but three times. Crikey, everyone was looking at me as if I was some sort of crook. Luckily my partner’s AMEX card came to the rescue, but imagine my concern. I kept thinking of the £20k balance in my account and wondering what had happened to it.

In a panic on the way home I missed an incoming SMS but got the second when I got back, and was horrified to see the number of my bank come up – or so I assumed; in fact it was some random call centre somewhere on planet Earth. I answered it (at my cost, as I was roaming) to be told that this was a routine security check because the behaviour on my card had proved concerning (to whom and why is a mystery, as you will see). I was asked to confirm the last few transactions on my card to verify that they were correct and not fraudulent. They were:

Currency exchange (at Heathrow)

A purchase at Heathrow of around £30 (two bottles of champers)

Purchase of an airline ticket – UK to Sweden.

Well I confirmed all of this and was simply informed that my card would now start working again – no explanation, no nothing – unbelievable. My card had been refused at a grocery but imagine what could have happened!

Now you might ask yourself: why is this guy moaning about this? Well, I’m moaning because for the two years before this incident I had been travelling to Sweden at least once every six weeks. I invariably change money, always buy champagne and always buy an air ticket, so why did my bank see this as unusual? Why weren’t they using some system to check that this was in fact quite a usual pattern of activity, with nothing unusual about it? Why has this bank got the authority to arbitrarily stop me using my own money, let alone in such a preposterous manner?

Well, the bank I am talking about was a pioneer in Data Warehousing so I’m just wondering why this event happened when I know that they diligently record all my transactions and store them in a DW whilst apparently failing to understand their meaning. No need for Hadoop here!!!!
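
To make the point concrete, here is a minimal sketch of the kind of check I have in mind. It is not how my bank (or any bank) actually works; the record layout, the `is_habitual` function, the threshold and the dates are all made up for illustration. It simply asks whether a transaction of the same type, in the same country, has been seen several times before.

```python
from collections import Counter
from datetime import date

def is_habitual(history, candidate, min_occurrences=3):
    """Return True if the candidate's (category, country) pair appears
    at least `min_occurrences` times in the customer's past transactions."""
    seen = Counter((t["category"], t["country"]) for t in history)
    return seen[(candidate["category"], candidate["country"])] >= min_occurrences

# Hypothetical history: six earlier trips, each with the same three purchases.
history = []
for month in (1, 3, 4, 6, 7, 9):
    history += [
        {"date": date(2011, month, 10), "category": "currency_exchange", "country": "GB"},
        {"date": date(2011, month, 10), "category": "airline_ticket", "country": "GB"},
        {"date": date(2011, month, 12), "category": "grocery", "country": "SE"},
    ]

# The transaction that was refused: groceries in Sweden.
new_txn = {"date": date(2011, 10, 22), "category": "grocery", "country": "SE"}
print(is_habitual(history, new_txn))
```

A real system would look at amounts, timing and merchants as well, but even a lookup this crude uses data that is already sitting in the warehouse.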


Regulation – a class of Big Data apps

There are bad guys out there!


Going back to the gist of my last post, one of the pillars that underpinned de-regulation was the idea that companies would work in a ‘correct’ manner and regulate themselves. The truth is that this worked, and still does work, very well for 95% of companies, but there are always bad pennies committing fraud or simply not being careful in their accounting practices. Thanks to a few well-known financial disasters, even before the global meltdown, the concept of re-regulation loomed large across many industries. Many sets of rules are now in place to bring governance to company business; some of the better known include Sarbanes-Oxley and Basel II and III, which have been around for a little while now. We might ask ourselves what they have in common, and the answer is that these and many more such initiatives demand that very accurate and accountable numbers are produced quickly from very complex underlying data. The need for Business Intelligence rears its head once again, and the term ‘Big Data’ can certainly be applied to some of these initiatives.


Re-regulation demands that some very complex numbers are delivered:


  • Quickly
  • Accurately
  • Transparently



Throw into the pot that the data needed often as not comes from tens or even hundreds of operational systems distributed across the world and that some of these initiatives need very complex predictive modelling and detailed segmentation and we see a new class of Big Data applications.

Big Data = more diverse data



Many ‘mega-trends’ are in place today – Globalisation, re-regulation, internet shopping, disengaged customers and more take-overs day by day. The need to have accurate information is paramount simply to survive let alone grow.


Too often we use big words without thinking about what they mean, and Globalisation is one of them. Now I am not going to write about globalisation here, but it is useful to consider it as a phenomenon: is it real, and is it important?


Let’s consider some facts:


  • 70% of the world’s shoes come from one town in China – now if you produce shoes in the UK this fact should be very worrisome.
  • To all intents and purposes the UK no longer has a car industry – we used to have, but in the end British Leyland amongst others proved a tad slow and not too smart. There’s nothing left anymore.
  • In the space of just a few years Vodafone has penetrated nearly the entire known world with its mobile services. Unless you take active steps to prevent it, you are almost guaranteed to end up paying some money, one way or another, to Vodafone this year.
  • Most holiday companies now make a sizable proportion of their revenue from banking products or shipping cargo.



Globalisation is the force behind the break-down of trading barriers, but globalisation is partly a result of another massive change in business practices over the last twenty years that we call de-regulation. Basically, in the ‘old days’ there were rules about what a company (or type of company) could sell. For example, Building Societies could not lend savers’ money to borrowing customers directly – you had to have a banking license to do that. Retailers could not sell insurance products. Insurance companies could not provide savings accounts. This all changed in the process of de-regulation, so now retailers can sell banking products, banks can sell insurance products and, by and large, anything goes. When you put the two things together, globalisation and de-regulation, we have another world, a world in which the biggest retailer ever seen, Walmart, can presumably sell banking services in the UK, thus becoming a competitor of Barclays Bank!


Note: Walmart owns ASDA – I’m not sure if they sell banking products but I guess so.


So what does that mean today in terms of Big Data? Well, now your average retailer knows a lot more about you than ever before. They used to know what you eat; now they know what you wear, where you go on holiday, how much you spend and get paid a month, and so on. It’s by combining all of this information that a 360-degree view of a consumer can be constructed. By trawling social media feeds they can find out who your friends and family are and what you are saying about their products... scary!!

Natural Selection in Business – Does using Big Data provide a sustainable advantage?


In nature, when resources are plentiful, species live together quite amicably. Even predator and prey reach a satisfactory balance whereby there is always food for both. However, when resources are scarce, species that were once happy together often turn into bitter enemies. The strong, big guys fight each other, determined to completely obliterate their competitor, often resulting in mortal damage being inflicted on both. Whilst this is happening, the intelligent guys, who are inevitably smaller and physically weaker, get to work. Firstly, they take advantage of the preoccupation of the others by amassing their basic requirements quickly. They then diversify and find a niche for themselves, knowing that competition will come, but being determined to foresee it and avoid it where possible.


Most people accept that this is the way of the natural world and business dynamics tend to follow the same basic rules. Intelligent companies will not measure themselves by numbers of employees, amount of real estate or revenue alone, but will instead increasingly judge themselves on different values:


  • The average life time value of their key customers
  • The elapsed time for a new customer to become profitable
  • Public image
  • Customer retention
  • Knowledge, expertise and willingness of the work force
  • Brand awareness and flexibility
  • Environmental friendliness
  • Efficient and focused work practices
  • Customer satisfaction


Note: be aware that the little guys don’t always have to take on the big guys directly, and in fact it’s usually best not to. Those of you who know the story of David and Goliath should be clear that this was not a simple big guy versus little guy competition in which David shows the world not to be afraid of a ‘larger’ opponent. The fact is that Goliath, although big, had no noticeable weaponry, whilst David had what was, in those days, the equivalent of a sawn-off shotgun. My guess is that if the two guys had met with equal weapons the result would have been rather less romantic, but David showed some real common sense here. He knew that if he wasn’t prepared for the fight he had no chance, so he fought the battle very much on his own terms.


I wonder if exploiting Big Data will enable big companies to grow even bigger or whether it will enable smaller companies to compete with them to level the playing field?


As companies move forward, whilst it will undoubtedly remain an advantage to be rich and powerful, size in itself may not be such an important plus point. Size most certainly brings coverage and reach, but it also breeds cost and inflexibility, and we will instead see the proliferation of many smaller companies that have replaced the advantages of size with the advantages of intelligence.


What will intelligence bring to a company that might give it sustainable market value?

Well it might enable it to:


  • Sell more diverse products to its customer base thereby increasing margin and perhaps even loyalty.
  • Acquire only those customers who will likely be low risk and high value.
  • Only execute marketing campaigns in geographies where the ability to provide service and product actually exists
  • Remove the need for inventory completely by direct collaboration with suppliers.
  • Reduce the cash to cash cycle by getting customers to pay for goods prior to manufacturing them.
  • Eliminate the need for a direct sales force altogether.
  • Make fraud so unprofitable for the fraudster that they give up.


So what is the major business driver that is set to change our ways of doing business? It can be summed up in one phrase – natural selection.


Note: Now I fancy myself as something of a biologist and there are several points in Darwin’s theories of evolution that concern me but maybe we can save that discussion till later?



Our Favorite 40+ Big Data use-cases. What’s yours?

One of the key best practices for the successful implementation of a big data analytics solution is to validate the business use case for big data. It will help an organization with two important aspects of success:

1. Keeping the scope limited

2. Helping to measure the success of a solution that addresses a key business problem

If the same data set addresses multiple use cases, an organization may need to prioritize its use cases and apply an iterative, phased approach. It’s the theory of getting the biggest bang for the buck, both tactical and strategic. Think Big and Act small!

While there are extensive industry-specific use cases, here are some for handy reference:

EDW Use Cases

  • Augment EDW by offloading processing and storage
  • Serve as a preprocessing hub before data reaches the EDW

Retail/Consumer Use Cases

Financial Services Use Cases

  • Compliance and regulatory reporting
  • Risk analysis and management
  • Fraud detection and security analytics
  • CRM and customer loyalty programs
  • Credit risk, scoring and analysis
  • High speed arbitrage trading
  • Trade surveillance
  • Abnormal trading pattern analysis

Web & Digital Media Services Use Cases

  • Large-scale clickstream analytics
  • Ad targeting, analysis, forecasting and optimization
  • Abuse and click-fraud prevention
  • Social graph analysis and profile segmentation
  • Campaign management and loyalty programs

Health & Life Sciences Use Cases

  • Clinical trials data analysis
  • Disease pattern analysis
  • Campaign and sales program optimization
  • Patient care quality and program analysis
  • Medical device and pharma supply-chain management
  • Drug discovery and development analysis

Telecommunications Use Cases

  • Revenue assurance and price optimization
  • Customer churn prevention
  • Campaign management and customer loyalty
  • Call detail record (CDR) analysis
  • Network performance and optimization
  • Mobile user location analysis

Government Use Cases

  • Fraud detection
  • Threat detection
  • Cybersecurity
  • Compliance and regulatory analysis

New Application Use Cases

  • Online dating
  • Social gaming

Fraud Use-Cases

  • Credit and debit payment card fraud
  • Deposit account fraud
  • Technical fraud and bad debt
  • Healthcare fraud
  • Medicaid and Medicare fraud
  • Property and casualty (P&C) insurance fraud
  • Workers’ compensation fraud

E-Commerce and Customer Service Use-Cases

  • Cross-channel analytics
  • Event analytics
  • Recommendation engines using predictive analytics
  • Right offer at the right time
  • Next best offer or next best action

These are some of my favorites and ones that I have come across. Please add your favorites in the comments section. I would like to know from readers what they are seeing in their organizations.


Big Data Use-Cases in Healthcare – Provider, Payer and Care Management

In this part, we will discuss use cases specific to the Healthcare industry. In general, the Healthcare industry has been a late adopter of technology compared to other industry verticals such as Banking and Finance, Retail and Insurance. According to McKinsey’s June 2011 report on Big Data, “…if US health care could use big data creatively and effectively to drive efficiency and quality, we estimate that the potential value from data in the sector could be more than $300 billion in value every year, two-thirds of which would be in the form of reducing national health care expenditures by about 8 percent…”.

Some of the key use cases for the Provider industry are:

a. Reduce Medicaid Re-admissions – One of the major costs for Medicaid is readmissions caused by a lack of sufficient follow-up and proactive engagement with patients. Follow-up appointments and tests are often documented only as free text in patients’ hospital discharge summaries and notes. This unstructured data can be mined using text analytics so that timely alerts can be sent, appointments scheduled and education materials dispatched. This proactive engagement can potentially reduce readmission rates by over 30%.
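
As a rough illustration of mining free-text discharge notes, here is a minimal sketch using a regular expression. It is a toy, not a real clinical text analytics pipeline: the pattern, the `find_follow_ups` function and the sample note are all assumptions made for this example, and real notes would need far more robust language processing.

```python
import re

# Matches phrases such as "Follow-up with cardiology in 2 weeks"
# or "Repeat blood panel in 10 days" within a single sentence.
FOLLOW_UP = re.compile(
    r"(?:follow[- ]?up|appointment|repeat)\b[^.]*?\bin\s+(\d+)\s+(day|week|month)s?",
    re.IGNORECASE,
)

def find_follow_ups(note):
    """Return a list of (amount, unit) pairs for each follow-up mentioned."""
    return [(int(amount), unit.lower()) for amount, unit in FOLLOW_UP.findall(note)]

# Hypothetical discharge note.
note = (
    "Patient discharged in stable condition. "
    "Follow-up with cardiology in 2 weeks. "
    "Repeat blood panel in 10 days."
)
print(find_follow_ups(note))
```

Each extracted pair could then feed whatever alerting or scheduling system the provider already has.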

b. Patient Monitoring – Inpatient, Out-Patient, Emergency Visits, Intensive Care Units…

With rapid progress in technology, sensors are now embedded in weighing scales, glucose devices, wheelchairs, patient beds and X-ray machines. These large streams of data, generated in real time, can provide real insights into patient health and behavior. This will improve the accuracy of information and significantly reduce costs for healthcare providers. It will also significantly enhance the patient experience at a healthcare facility by providing proactive risk monitoring, improved quality of care and personalized attention. Big Data can enable complex event processing (CEP), providing real-time insights to doctors and nurses in the control room.

c. Preventive care for ACO

One of the key goals of an ACO (Accountable Care Organization) is to provide preventive care to its members. Disease identification and risk stratification will be a crucial business function. Real-time feeds coming in through a Health Information Exchange (HIE) from pharmacists, providers and payers will be key inputs for risk stratification and predictive modeling techniques. In the past, companies were limited to historical claims and HRA/survey data, but with HIE the whole dynamic of data availability for health analytics has changed. Big Data tools can significantly enhance the speed of processing and data mining.

d. Provider Sentiment Analysis 

With social media growing at a rapid pace, members are sharing their experiences of providers through social channels such as Facebook, Twitter and other media. These experiences, shared through comments, Twitter feeds, blogs and surveys, can be mined to gain rich insights into the quality of services.

e. Epidemiology

Through HIE, most providers, payers and pharmacists will be connected through a network in the months to come. This will allow hospitals and health agencies to track disease outbreaks, patterns and trends in health issues across geographies, helping them to determine sources and draw up containment plans.

f. Patient care quality and program analysis

Naturally, with the growth of data and insight into new information comes the challenge of processing this volume and variety of information to produce metrics and KPIs for patient care quality and programs. Big data provides the architecture, tools and techniques to process terabytes and petabytes of data and deliver deep health care analytics capabilities to stakeholders.

Some of the key use cases for the Payer industry are:

a. Clinical Data analysis for improved predictable outcomes

Payers, Health Plans and Insurance companies can significantly reduce the cost of care by reducing readmissions, improving outcomes and monitoring patients proactively. There is a huge amount of existing clinical data residing within the organization, and a myriad of unstructured data arriving at a rapid pace. Big data is a candidate for processing these complex events and data to provide clinical insights to the payer organization. Some of the areas that can be immediately addressed by Big data solutions:

  • Longitudinal analysis of care across patients and diagnoses; time sequencing
  • Cluster Analysis around influencers on treatment, physicians, therapist; patient social relationships
  • Analyze clinical notes (multi-structured data); no longer limited by the dimensional structure of a relational database
  • Analyze clickstream data and clinical outcomes; look for patterns and trends in the quality of care delivered.
  • Clinical outcomes can be integrated with financial information to understand performance

b. Claims Fraud Detection

Although no precise dollar amount can be determined, some authorities contend that insurance fraud constitutes a $100-billion-a-year problem. The United States Government Accountability Office (GAO) estimates that $1 out of every $7 spent on Medicare is lost to fraud. Some of the fraud examples are:

  • Billing for services, procedures, and/or supplies that were not provided.
  • Misrepresentation of what was provided; when it was provided; the condition or diagnosis; the charges involved; and/or the identity of the provider recipient.
  • Providing unnecessary services or ordering unnecessary tests
  • Billing separately for procedures that normally are covered by a single fee.
  • Charging more than once for the same service.
  • Upcoding: Charging for a more complex service than was performed. This usually involves billing for longer or more complex office visits
  • Miscoding: Using a code number that does not apply to the procedure.
  • Kickbacks: Receiving payment or other benefit for making a referral.

With Health Information Exchanges playing a pivotal role in real-time information sharing, Payer organizations will have the power of information to proactively detect fraud using pattern analysis, graph analysis of cohort networks and social media insights.
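
One of the simpler patterns in the list above, charging more than once for the same service, can be sketched in a few lines. This is only an illustration under my own assumptions: the claim fields, the `find_duplicate_claims` function and the sample records are hypothetical, and a real payer system would also need to handle legitimate repeat services.

```python
from collections import defaultdict

def find_duplicate_claims(claims):
    """Group claims by provider, member, procedure and service date,
    and return only the groups that contain more than one claim."""
    groups = defaultdict(list)
    for claim in claims:
        key = (
            claim["provider_id"],
            claim["member_id"],
            claim["procedure_code"],
            claim["service_date"],
        )
        groups[key].append(claim["claim_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Hypothetical claims: C1 and C3 bill the same service twice.
claims = [
    {"claim_id": "C1", "provider_id": "P9", "member_id": "M1",
     "procedure_code": "99213", "service_date": "2012-03-01"},
    {"claim_id": "C2", "provider_id": "P9", "member_id": "M2",
     "procedure_code": "99213", "service_date": "2012-03-01"},
    {"claim_id": "C3", "provider_id": "P9", "member_id": "M1",
     "procedure_code": "99213", "service_date": "2012-03-01"},
]
duplicates = find_duplicate_claims(claims)
print(duplicates)
```

Patterns such as upcoding or kickbacks need statistical or graph techniques rather than a simple grouping like this.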

c. Member Engagement

Like companies in any industry, Payer organizations such as Health Insurance companies are battling to win member business. Companies are monitoring the behavior of members and prospects on their websites and on social media.

d. Payer Sentiment Analysis 

As with Provider sentiment analysis, members are sharing their experiences of insurance benefits and customer service through social channels such as Facebook, Twitter and other media. These experiences, shared through comments, Twitter feeds, blogs and surveys, can be mined to gain rich insights and improve the quality of services.

e. Call Center Analysis

Payer organizations are capturing information from Call Centers using call recording. These call records provide valuable information on:

  • staffing model – by demographic preferences, hours of services
  • member feedback using voice pattern and recognition
  • member experience using metrics – Average speed to answer, abandonment rate, dropped calls, unable to reach member
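
Two of the metrics named above are simple to compute once the call records are in one place. The sketch below assumes a made-up record layout (`answered`, `wait_seconds`) and a hypothetical `call_center_metrics` function; it treats every unanswered call as abandoned, which is a simplification.

```python
def call_center_metrics(calls):
    """Compute average speed to answer (seconds) and abandonment rate."""
    answered = [c for c in calls if c["answered"]]
    abandoned = [c for c in calls if not c["answered"]]
    # Average speed to answer: mean wait of the calls that were answered.
    asa = sum(c["wait_seconds"] for c in answered) / len(answered) if answered else None
    # Abandonment rate: share of all calls that were never answered.
    rate = len(abandoned) / len(calls) if calls else None
    return {"average_speed_to_answer": asa, "abandonment_rate": rate}

# Hypothetical call records.
calls = [
    {"answered": True, "wait_seconds": 10},
    {"answered": True, "wait_seconds": 20},
    {"answered": True, "wait_seconds": 30},
    {"answered": False, "wait_seconds": 95},
]
metrics = call_center_metrics(calls)
print(metrics)
```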

Finally, a few of the use cases for the Care Management industry – Disease Management, Utilization Management and Behavioral Health Management – are:

a. Disease Identification and Risk Stratification

Care management companies constantly collect data from various sources: claims, prior authorizations, biometric screenings, health risk assessments and survey data. Disease identification and risk stratification is a key function that helps an organization with limited resources to focus on the top 5-10% of the high-risk population, which accounts for 60-80% of medical cost. Processing tens of years of historical information, combined with real-time information from various sources, adds huge complexity and processing challenges. Big data can alleviate such challenges by not only providing access to unstructured data but also providing robustness and speed of processing.
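
The arithmetic behind "a small share of members drives most of the cost" can be sketched as follows. Note the assumptions: ranking by past cost is only a crude stand-in for a proper predictive risk model, and the `top_risk_members` function and the member figures are invented for illustration.

```python
def top_risk_members(costs, fraction=0.10):
    """Rank members by cost and return the top `fraction` of members
    together with their share of total cost."""
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    count = max(1, round(len(ranked) * fraction))
    top = ranked[:count]
    total = sum(costs.values())
    share = sum(cost for _, cost in top) / total if total else 0.0
    return [member for member, _ in top], share

# Hypothetical annual medical cost per member.
costs = {"m1": 73000}
costs.update({f"m{i}": 3000 for i in range(2, 11)})

members, share = top_risk_members(costs, fraction=0.10)
print(members, share)
```

With these made-up numbers, one member in ten accounts for 73% of the cost, which is the kind of concentration the paragraph above describes.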

b. Member Sentiment Analysis 

With social media growing at a rapid pace, members are sharing their experiences of providers through social channels such as Facebook, Twitter and other media. These experiences, shared through comments, Twitter feeds, blogs and surveys, can be mined to gain rich insights into the quality of services.

c. Member care quality and program analysis

Naturally, with the growth of data and insight into new information comes the challenge of processing this volume and variety of information to produce metrics and KPIs for member care quality and programs. Big data provides the architecture, tools and techniques to process terabytes and petabytes of data and deliver deep health care analytics capabilities to stakeholders.

While these are just a few of the generic use cases in the Healthcare industry, there are a lot of unique use cases specific to your line of business, organization and department. I will reiterate that assessing and prioritizing the business use case for Big data based on value is key to its success and will have a significant impact on your organization in years to come. Think Big, start Small!