Wonder how to start your Big Data POC / Lab?

As a follow-up to my previous post, 6 Steps to Start Your Big Data Journey, I want to address how you should start your Big Data journey with the Lab.

What is the Big Data Lab?


The Big Data Lab is a dedicated development environment, within your current technology infrastructure, that can be created explicitly for experimentation with emerging technologies and approaches to Big Data and Analytics.

Key Activities within the Big Data Lab:

  • Assemble a selected set of technologies to be evaluated during the lab's 2-3 month run
  • Test permutations against high value use cases
  • Develop recommendations from the testing scenarios to drive future architecture and usage

What should be the Big Data Lab’s objectives?

  • Deliver 2-3 “Quick Wins” to demonstrate the value of these technologies from both an IT and business perspective
  • Create a “Proof-of-Concept” that shows how these technologies can be integrated into your existing enterprise architecture
  • Provide future-state AI architecture recommendations
  • Deliver low-cost, high-performance agile BI and data discovery, with a focus on Big Data technologies
  • Pilot new analytical capabilities and use cases to prove business value and inform a long-term roadmap for competing on analytics
  • Establish a permanent “Innovation Hub” within your architecture and center for Big Data and analytics skill-building

What components should you consider for your Big Data Lab?


Lab components and their functions:
Big Data Storage and Processing
  • Use Hadoop and Big Data tools as a pre-processing platform for structured and unstructured data before loading it into the enterprise data warehouse (EDW)
  • Use the Hadoop platform for storing and analyzing unstructured and high-volume data
Real-Time Ingestion
  • Use real-time data ingestion into Hadoop
  • Filter data in real time during collection, and run ETL on high-level data for real-time analysis
Data Virtualization and Federation
  • Enable near-real-time reporting through the operational data store (ODS) and self-service visualization tools
BI, Reporting and Visualization
  • Structured reporting to enable Business Intelligence reporting and self-serve capability
  • Visualization tools to make insights operational
  • Predictive analytics and scenario modeling capabilities to improve audience measurement and campaign management
ETL / ELT – Data Integration
  • Custom ETL and data modeling to aggregate multiple high-volume data sources in disparate formats
Data Discovery and Exploration
  • Discovery environment that allows for the combination of enterprise data with external data sets
Data Governance
  • Establish a data governance and change management model to ensure that analytics are embraced across the organization
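
To make the ingestion and integration rows above more concrete, here is a minimal sketch in plain Python (no Hadoop required) of filtering records as they are collected and aggregating two disparate formats into one schema. The sample data, the field names (user, action, bytes) and the "drop heartbeat events" rule are all assumptions made up for illustration; a real lab would swap in its own sources and tools.

```python
import csv
import io
import json
from collections import defaultdict
from itertools import chain

# Hypothetical sample inputs: the same kind of event arriving in two formats.
CSV_EVENTS = "user,action,bytes\nalice,view,120\nbob,heartbeat,0\nalice,click,80\n"
JSON_EVENTS = (
    '{"user": "bob", "action": "view", "bytes": 300}\n'
    '{"user": "carol", "action": "heartbeat", "bytes": 0}\n'
)


def read_csv(text):
    """Yield CSV rows normalized to the common schema."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"user": row["user"], "action": row["action"], "bytes": int(row["bytes"])}


def read_json_lines(text):
    """Yield JSON-lines records normalized to the common schema."""
    for line in text.splitlines():
        if line.strip():
            rec = json.loads(line)
            yield {"user": rec["user"], "action": rec["action"], "bytes": int(rec["bytes"])}


def keep(event):
    """Filter rule applied during collection (assumption: heartbeats are noise)."""
    return event["action"] != "heartbeat"


def aggregate(events):
    """Sum bytes per user over the events that pass the filter."""
    totals = defaultdict(int)
    for event in events:
        if keep(event):
            totals[event["user"]] += event["bytes"]
    return dict(totals)


# Both readers are generators, so records are filtered one at a time as they arrive.
totals = aggregate(chain(read_csv(CSV_EVENTS), read_json_lines(JSON_EVENTS)))
print(totals)
```

Because each reader emits the same dictionary shape, adding another source format only means writing one more reader; the filter and aggregation steps stay unchanged.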

What is the proposed Hadoop infrastructure for the Big Data Lab?


The Big Data Lab’s research mission is to identify, engineer and evaluate innovative technologies that address current and future data-intensive challenges. To unlock the potential of Big Data, you need to overcome a significant number of research challenges, including managing diverse sources of unstructured data with no common schema, removing the complexity of writing auto-scaling algorithms, delivering real-time analytics, and finding suitable visualization techniques for petabyte-scale data sets. The Big Data Lab gives you a platform to test your hypotheses and integrate your Big Data efforts across your organization.


