Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world has been created in the last two years alone. Every organization is now accumulating terabytes and petabytes of data from various sources: machines, mobile devices, users, web logs and cookies, social media, application servers, transactional logs, and so on. Organizations are rushing to store this wealth of information for fear of missing opportunities. The challenge is not storing this information but finding ways to use it for competitive advantage.
Leading organizations are rapidly reorganizing themselves and building Data Science teams. This is very similar to the pattern of the early to mid 1990s, when organizations launched Business Intelligence teams by rebranding their MIS or DSS teams. The key to competitive advantage lies not in storing as much data as possible but in deriving value and insight from it and tying those insights to a business plan that can drive tangible business outcomes. The evolving Data Science team primarily needs four tiers of experts: Data Science Champion, Business/Product Managers, Analytical Data Modeler, and Big Data Engineer. For the Data Science team to succeed, it is key that it operates under a Data Science Champion, such as a Chief Data Officer or Chief Data Scientist, and not under a traditional IT organization.
Let us define what each of these layers means to an organization.
- Data Science Champion: This is an executive-level sponsor such as a Chief Data Officer or Chief Data Scientist, who lays out the vision and leads the mission for the team. The champion is a domain expert in a data science field. A large organization may have several Chief Data Scientists, each specific to a line of business (LOB), working for a Chief Data Officer.
- Business/Product Managers: These are the product managers from the business who partner closely with the CDO or Chief Data Scientist.
- Analytical Data Modeler: These are the advanced mathematicians and statisticians who apply their computer skills to create complex analytical models that contribute substantially to the mission.
- Big Data Engineer: These are the computer science engineers who apply sophisticated engineering skills to process large volumes of data using big data tools.
It’s hard to find a single Data Scientist who has all four of the above skills: leadership, business knowledge, analytical experience, and big data processing skills. A successful Data Science team is a partnership of the four distinct roles defined above. When building your team, it’s important to focus on these key data science skills: an analytical and curious mind, creativity, domain knowledge, advanced math skills (including a solid background in calculus, geometry, linear algebra, and statistics), and a computer science background. Leadership, analytical, and computer science skills are particularly important for the first members chosen for your data science team.
A new concept is slowly evolving: a Business Engineering Unit, under the leadership of the CDO or Chief Data Scientist, that houses the Data Science team. This has been seen at some large retailers and at telecom and media/entertainment companies. Data Science deals with identifying the real problem or a business opportunity. Unlike IT, which is a support organization in many companies, the Business Engineering Unit needs to be a profit center that contributes to the company’s top line or bottom line. To prove the value of Data Science, the Champion needs to focus initially on the hardest business problems within the organization or on unique business opportunities that have the highest return for key stakeholders. Organizational boundaries and the internal political environment are often the biggest challenges facing a data science team.
Companies that have made investments in big-data computing will reap extraordinary near-term and long-term benefits. Data Science is perhaps the biggest innovation in computing in the last decade. The benefits from data science have already been proven in some industry sectors; the challenge is to extend the technology and to apply it more widely and in all facets of interaction between humans and machines.
In January 2014, IDG published its latest big data enterprise survey and predictions for 2014, finding that, on average, enterprises will spend $8M on big data-related initiatives in 2014. The study also found that 70% of enterprise organizations have either deployed or are planning to deploy big data-related projects and programs.
Here are our top 10 big data trends for 2014:
- Big Data as a Service and Big Data Analytics will go mainstream
- More companies will implement Predictive Analytics and Machine Learning
- Data Science and Big Data Analytics will be embedded in BI to bring actionable insights into Operational Reports and Executive Dashboards
- Cloud computing and Big Data will be tightly integrated with BI solutions
- Enterprises will use big data techniques to secure IT infrastructure
- Hadoop will be used for operational systems and transactional applications
- Hadoop will be implemented as an extension to, or as part of, Enterprise Information Management solutions
- The shortage of Big Data and Data Scientist skills will grow as companies ramp up hiring for big data and data science projects
- Rise in M&A activity in Big Data space with legacy BI companies acquiring niche big data vendors
- Companies will create new roles such as Chief Data Scientist, Chief Data Officer, and Chief Analytics Officer
TechNavio’s analysts, in the report Global Big Data Market 2014-2018, forecast that the global big data market will grow at a CAGR of 34.17 percent over the period 2013-2018.
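To get a feel for what a compound annual growth rate of that size implies, the short sketch below compounds the quoted 34.17 percent figure. It assumes 2013 is the base year and 2018 the end year, giving five compounding periods. The function name and the period count are illustrative choices, not figures taken from the report, and no absolute market size is assumed.

```python
def growth_multiple(cagr: float, periods: int) -> float:
    """Total growth factor implied by compounding `cagr` over `periods` years."""
    return (1.0 + cagr) ** periods


CAGR = 0.3417  # rate quoted in the forecast
PERIODS = 5    # assumption: 2013 as base year, 2018 as end year

multiple = growth_multiple(CAGR, PERIODS)
print(f"Implied market size relative to base year: {multiple:.2f}x")
```

Under these assumptions, the market at the end of the period would be a little more than four times its base-year size.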
The key vendors dominating this market space are IBM Corp., Hewlett-Packard Co., Oracle Corp., and Teradata Corp.
Other vendors are 1010data Ltd., 10gen Inc., Accenture Inc., Amazon Web Services, Attivio Inc., Calpont Corp., Capgemini Inc., ClickFox Inc., Cloudera Inc., Computer Sciences Corp., Couchbase Inc., Datameer Inc., DataStax Inc., Dell Inc., Digital Reasoning Systems Inc., EMC Corp., Fractal Analytics Inc., Fujitsu Ltd., Hitachi Ltd., Hortonworks Inc., HPCC Systems Inc., Huawei Technologies Co. Ltd., Informatica Corp., Intel Corp., Karmasphere Inc., Logica plc, MapR Technologies Inc., MarkLogic Inc., Microsoft Corp., Mu Sigma Inc., NetApp Inc., Opera Solutions Inc., ParAccel Inc., Pervasive Software Inc., QlikTech Ltd., RainStor Inc., Red Hat Inc., SAP AG, SAS Institute Inc., Seagate Inc., Siemens Information Systems Ltd., Splunk Inc., Supermicro Computer Inc., Tableau Software Inc., Tata Consultancy Services Ltd., Think Big Analytics Inc., and Xerox Corp.