Big Data Workshop

Introduction and Overview

Quantifying the amount of digital information that exists in the world is hard. What is clear is that there is an awful lot of it, and it is growing at a terrific rate. Modern businesses are no exception: they accumulate an astonishing amount of digital data, which can be leveraged to unlock new sources of economic value or to provide fresh insights into business trends. The real challenge lies in designing the computing and storage infrastructure and the algorithms needed to handle this “big data” problem. Because the Web is the largest collection of digital data, Internet companies have contributed several exceptional technologies for handling it efficiently.

This course covers business analytics and big data technologies and their strategic importance to any organization. It deals with the basic principles, concepts, and techniques used for big data and business analytics, including Hadoop (HDFS and MapReduce), Apache HBase and Apache Hive. Participants will get a good picture of these concepts and of how they are interconnected in an organizational context.

“Modern enterprises are drowning in data and starving for information”

Duration

  • 24 Hours (3 Days)

Who Should Attend?

This comprehensive training and certification program will be of specific interest to:

  • Senior Executives
  • CIOs and CTOs
  • Business Intelligence Executives
  • Marketing Executives
  • Data & Business Analytics Specialists
  • Innovation Specialists & Entrepreneurs
  • Academics, and other people interested in Big Data

Pre-Requisites

Nil

Assessment

Nil

Workshop Outcome

  • Understand business analytics and big data technologies and their impact on enterprises
  • Understand the role of big data technologies (Hadoop, HBase, Hive) in business analytics
  • Acquire the knowledge and learn to use Hadoop (HDFS and MapReduce), HBase and Hive

Workshop Outline

  • The concept of Business Analytics
    • Data, Information, Knowledge and Wisdom
    • Data as Unique Enterprise Asset
    • Data, Information and Analytics Lifecycle
  • Introduction to Big Data
    • What is Big Data? Why Big Data?
    • 3V’s of Big Data
    • The Rapid Growth of Unstructured Data
    • Big Data Market Forecast
  • Business Analytics – Current Context
  • Types of Analytics
    • Descriptive Analytics
    • Predictive Analytics
    • Prescriptive Analytics
  • Business Intelligence
    • Data Mining
    • Enterprise Reporting
    • Enterprise Performance Management (EPM)
  • Introduction to Hadoop
    • Big Data – Current Industry Trends
    • Why Process Big Data?
    • Challenges in Data Processing
    • Why Hadoop?
    • What does Hadoop offer?
    • Hadoop Network Structure
    • Hadoop Eco-System
    • Hadoop Core Components
  • Hadoop HDFS
    • What does HDFS Facilitate?
    • HDFS Architecture
    • Hadoop Network and Server Infrastructure
    • NameNode, Secondary NameNode and DataNode
    • Ensuring Data Correctness
    • Data Pipelining while Loading Data
    • fs Operations
  • Hadoop MapReduce
    • MapReduce Conceptualization
    • MapReduce – Overview
    • MapReduce – Programming Model
    • MapReduce – Execution Overview
    • Hadoop – Application Examples
    • Word Count – Example
  • Big Data in Business
  • Big Data Types & Architecture
  • Apache HBase
    • What is HBase?
    • Why HBase?
    • How does HBase work?
    • HBase Architecture
    • HBase Data model
    • Advantages and Disadvantages
  • Apache Hive
    • What is Hive?
    • Why Hive?
    • Where to use Hive?
    • Hive Architecture
    • Hive: Benefits
    • Hive: Tradeoffs
  • Apache Pig
    • What is Pig?
    • Comparison with RDBMS
    • Execution of Pig
    • Pig Latin
  • Apache Sqoop
    • What is Sqoop?
    • Why Sqoop?
    • Architecture
    • Sqoop 1 and Sqoop 2
  • Apache Flume
    • What is Flume?
    • Sample topology
    • Properties
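
To give a flavour of the Word Count example listed under Hadoop MapReduce in the outline above, here is a minimal sketch of the map, shuffle and reduce steps in plain Python. It is an illustration only, not the workshop's own material: it runs in memory on a single machine and does not use Hadoop, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical names chosen for this sketch.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group emitted values by key.

    In a real MapReduce framework this grouping happens between the map
    and reduce steps; here it is simulated with a dictionary.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three steps in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data big", "data Hadoop"]
    # Prints [('big', 2), ('data', 2), ('hadoop', 1)]
    print(sorted(word_count(sample).items()))
```

In a Hadoop job the map and reduce functions would run in parallel on many nodes over data stored in HDFS, and the framework would perform the grouping; the sketch only shows the shape of the programming model.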