O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

rcpbayindir/data-algorithms-with-spark

 
 


Data Algorithms with Spark by Mahmoud Parsian

"... This book will be a great resource for
both readers looking to implement existing
algorithms in a scalable fashion and readers
who are developing new, custom algorithms
using Spark. ..."

Dr. Matei Zaharia
Original Creator of Apache Spark

Foreword by Dr. Matei Zaharia (Original Creator of Apache Spark)

Goal of this book: to enable readers to write simpler, more efficient PySpark code for data algorithms.




Software:

  • Apache Spark 3.2.0
  • Python 3.7.2
  • Scala 2.13
  • Java 8

Table of Contents

Chapter Title
Bonus Chapters
Chapter 1 Introduction to Data Algorithms
Chapter 2 Transformations in Action
Chapter 3 Mapper Transformations
Chapter 4 Reductions in Spark
Chapter 5 Partitioning Data
Chapter 6 Graph Algorithms
Chapter 7 Interacting with External Data Sources
Chapter 8 Ranking Algorithms
Chapter 9 Fundamental Data Design Patterns
Chapter 10 Common Data Design Patterns
Chapter 11 Join Design Patterns
Chapter 12 Feature Engineering in PySpark


Languages

  • Python 53.7%
  • Scala 37.1%
  • Shell 9.2%