Introduction and Overview
Quantifying the amount of digital information that exists in the world is hard. What is clear is that there is an awful lot of it, and it is growing at a terrific rate. Modern businesses are no exception: they accumulate an astonishing amount of digital data, which may be leveraged to unlock new sources of economic value or to provide fresh insights into business trends. The real challenge lies in designing the computing and storage infrastructure and the algorithms needed to handle this “big data” problem. Because the Web is the largest collection of digital data, Internet companies have contributed several exceptional technologies for handling it efficiently.
GSTF TRAINING and CERTIFICATIONS PROGRAM
The Big Data Workshop by GSTF Singapore covers the concepts of business analytics and big data technologies and their strategic importance to any organization. Participants will be introduced to business analytics with the big data technologies Hadoop, Hive and HBase. The course covers the basic principles, concepts, and techniques used for big data and business analytics, including Hadoop (HDFS and MapReduce), Apache HBase and Apache Hive. Participants will gain a good picture of these concepts and of how they are interconnected in an organizational context.
“Modern enterprises are drowning in data and starving for information”
Duration: 24 Hours (3 Days)
Who Should Attend?
This comprehensive training and certification program will be of specific interest to:
- Senior Executives
- CIOs and CTOs
- Business Intelligence Executives
- Marketing Executives
- Data & Business Analytics Specialists
- Innovation Specialists & Entrepreneurs
- Academics and others interested in Big Data
Course Objectives
- Understand business analytics and big data technologies and their impact on enterprises
- Understand the role of big data technologies (Hadoop, HBase, Hive) in business analytics
- Acquire the knowledge and learn to use Hadoop (HDFS and MapReduce), HBase and Hive
The concept of Business Analytics
- Data, Information, Knowledge and Wisdom
- Data as Unique Enterprise Asset
- Data, Information and Analytics Lifecycle
Introduction to Big Data
- What is Big Data? Why Big Data?
- The 3 Vs of Big Data
- The Rapid Growth of Unstructured Data
- Big Data Market Forecast
Business Analytics – Current Context
Types of Analytics
- Descriptive Analytics
- Predictive Analytics
- Prescriptive Analytics
- Data Mining
- Enterprise Reporting
Introduction to Hadoop
- Big Data – Current Industry Trends
- Why Process Big Data?
- Challenges in Data Processing
- Why Hadoop?
- What does Hadoop offer?
- Hadoop Network Structure
- Hadoop Eco-System
- Hadoop Core Components
- What does HDFS Facilitate?
- HDFS Architecture
- Hadoop Network and Server Infrastructure
- NameNode, Secondary NameNode and DataNode
- Ensuring Data Correctness
- Data Pipelining while Loading Data
- fs Operations
- MapReduce Conceptualization
- MapReduce – Overview
- MapReduce – Programming Model
- MapReduce – Execution Overview
- Hadoop – Application Examples
- Word Count – Example
Big Data in Business
Big Data Types & Architecture
- What is HBase?
- Why HBase?
- How does HBase work?
- HBase Architecture
- HBase Data model
- Advantages and Disadvantages
- What is Hive?
- Why Hive?
- Where to use Hive?
- Hive Architecture
- Hive: Benefits
- Hive: Tradeoffs
- What is Pig?
- Comparison with RDBMS
- Execution of Pig
- Pig Latin
- What is Sqoop?
- Why Sqoop?
- Sqoop 1 and Sqoop 2
- What is Flume?
- Sample topology