Hands-On Big Data: Getting Started with NoSQL and Hadoop

DATE: Thursday the 27th of November.

LANGUAGE
Italian

LEVEL
Intermediate

DURATION
The workshop is a full day (8 hours), from 9:00 to 18:00, with a one-hour lunch break.

LOCATION
Room BL 27.1.5 – Politecnico di Milano – Bovisa | Building BL 27 – Via R. Lambruschini, 4 – Milano

CHECK-IN: 8:30 – 9:00

PRICES:
Super Early Bird: 105 €, from 10 September to 30 September;
Early Bird: 125 €, from 1 October to 5 November;
Regular Ticket: 145 €, from 6 November to the end of sales.

MARIO CARTIA
Mario Cartia is the Chief Technology Officer of an Italian company that is a market leader in software for schools of every grade. He has more than 15 years of experience with enterprise architectures and has delivered training at several multinational companies on topics such as distributed architectures, performance tuning, system scalability and information security. He is also a founder and board member of various communities in the open-source ecosystem, and a Red Hat Certified Professional.

ABSTRACT
What is Big Data? Gartner analyst Doug Laney has characterized Big Data as “data that’s an order of magnitude greater than data you’re accustomed to”. IBM’s chief executive, Virginia Rometty, estimates that “there will be 5,200 gigabytes of data for every human on the planet by 2020”. The workshop introduces the topic of Big Data by providing practical knowledge of the tools and techniques most commonly used to handle it, and includes four “hands-on” labs focused on the use of the Hadoop framework.

TABLE OF CONTENTS
- What is Big Data?
- The strategic relevance of Big Data in the context of social networks, the “Internet of Things” and the wearable-device market
- Other case studies: using Big Data in scientific research
- Historical evolution of databases and data warehouses from the 1970s to today
- Architectural prototype of a “Big Data oriented” system
- Introduction to NoSQL databases
- Analysis of the various NoSQL databases available on the market: which one to choose according to your needs?
- Lab1: setup and use of a document-oriented database (Couchbase); a minimal Java sketch follows this list
- Introduction to the Apache Hadoop framework
- Hadoop: architecture and modules
- The Hadoop Common module
- HDFS: a distributed filesystem for quick access to large amounts of data (see the HDFS sketch after this list)
- The YARN framework for job scheduling and cluster resource management
- The Hadoop MapReduce module for parallel data processing (a word-count sketch follows this list)
- Making queries using Hive and Shark (a Hive JDBC sketch follows this list)
- Lab2: Hadoop setup
- Lab3: importing data into Hadoop from a MySQL database
- Lab4: processing data stored on Hadoop and exporting it into the NoSQL database created during Lab1
- Introduction to Machine Learning
- Data Visualization examples
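
As a preview of Lab1, here is a minimal Java sketch of storing and retrieving a JSON document with the Couchbase Java SDK 2.x. The host address, bucket name, document key and fields are illustrative assumptions, not official lab material.

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

public class CouchbaseHello {
    public static void main(String[] args) {
        // Connect to a local Couchbase node (hostname is an assumption).
        Cluster cluster = CouchbaseCluster.create("127.0.0.1");
        Bucket bucket = cluster.openBucket("default");

        // Build a JSON document and store it under the key "user::1".
        JsonObject user = JsonObject.create()
                .put("name", "Mario")
                .put("role", "attendee");
        bucket.upsert(JsonDocument.create("user::1", user));

        // Read the document back by key and print its content.
        JsonDocument found = bucket.get("user::1");
        System.out.println(found.content());

        cluster.disconnect();
    }
}
```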
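
The HDFS item above can be made concrete with a short sketch against Hadoop's Java FileSystem API: it writes a small text file and reads it back. The NameNode URL and file path assume a local pseudo-distributed installation.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        // fs.defaultFS assumes a local pseudo-distributed NameNode.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fs = FileSystem.get(conf);

        // Write a small file (overwriting if it already exists).
        Path path = new Path("/tmp/hello.txt");
        FSDataOutputStream out = fs.create(path, true);
        out.writeBytes("hello, HDFS\n");
        out.close();

        // Read the same file back line by line.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(fs.open(path)));
        System.out.println(in.readLine());
        in.close();
        fs.close();
    }
}
```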
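
To illustrate the MapReduce programming model named above, here is the classic word-count job written against the org.apache.hadoop.mapreduce API. It is the canonical introductory example, not necessarily the job used in the labs.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in the input split.
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the partial counts emitted for each word.
  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    // Job wiring: input and output paths come from the command line.
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

A job like this is typically packaged as a jar and launched on the cluster with the hadoop jar command.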
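
Finally, a minimal sketch of the kind of querying the Hive item refers to, issued from Java through the HiveServer2 JDBC driver. The connection URL, credentials and the "logs" table are hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // Register the HiveServer2 JDBC driver.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Assumes HiveServer2 on localhost:10000 and the "default" database.
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
        Statement stmt = conn.createStatement();

        // Aggregate query over a hypothetical "logs" table.
        ResultSet rs = stmt.executeQuery(
                "SELECT level, COUNT(*) AS n FROM logs GROUP BY level");
        while (rs.next()) {
            System.out.println(rs.getString("level") + "\t" + rs.getLong("n"));
        }

        rs.close();
        stmt.close();
        conn.close();
    }
}
```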

TRAINING OBJECTIVES
Understand the typical architecture of a Big Data system; set up, configure and use Hadoop; import data from a SQL database into Hadoop.

WHO IS THE WORKSHOP DEDICATED TO?
CIOs, CTOs, DevOps engineers, SysAdmins, DBAs and developers.

PREREQUISITES NEEDED FROM ATTENDEES
Basic knowledge of *nix operating systems, relational databases, distributed systems and scalable architectures.

HARDWARE AND SOFTWARE REQUIREMENTS
Participants are required to bring their own notebook running Linux (RHEL or CentOS recommended).
