Next-Generation Big Data

This book serves as a practical guide on how to utilize big data to store, process, and analyze structured data, focusing on three of the most popular Apache projects in the Hadoop ecosystem: Apache Spark, Apache Impala, and Apache Kudu (incubating). Together, these three Apache projects can rival most commercial data warehouse platforms in terms of performance and scalability at a fraction of the cost. Most next-generation big data and data science use cases are driven by structured data, and this book will serve as your guide.

572 pages, published in 2018

About the Author

Butch Quinto is Chief Data Officer at Lykuid, Inc., an advanced analytics company that provides an AI-powered infrastructure monitoring platform. As Chief Data Officer, Butch serves as the head of AI and data engineering, leading product innovation, strategy, research and development. Butch was previously Director of Analytics at Deloitte, where he led strategy, solutions development and delivery, technology innovation, business development, vendor alliances and venture capital due diligence. While at Deloitte, Butch founded and developed several key big data, IoT and artificial intelligence applications, including Deloitte’s IoT Framework, Smart City Platform and Geo-Distributed Telematics Platform. Butch was also the co-founder and lead lecturer of Deloitte’s national data science and big data training programs.

Butch has more than 20 years of experience in various technical and leadership roles at start-ups and Global 2000 corporations in several industries, including banking and finance, telecommunications, government, utilities, transportation, e-commerce, retail, technology, manufacturing, and bioinformatics. Butch is a recognized thought leader and a frequent speaker at conferences and events. Butch is a contributor to the Apache Spark and Apache Kudu open source projects, founder of the Cloudera Melbourne User Group, and was Deloitte’s Director of Alliance for Cloudera.

About the Technical Reviewer

Irfan Elahi has years of multidisciplinary experience in Data Science and Machine Learning. He has worked in a range of settings, including consultancy firms, his own start-ups, and an academic research lab. Over the years he has worked on a number of data science and machine learning projects in different niches such as telecommunication, retail, Web, public sector, and energy, with the goal of enabling businesses to derive immense value from their data assets.

Butch Quinto
