  Module 5: Advanced Topics in Spark

  • Comparisons with MapReduce
  • Key terminology
  • Shuffles
  • Data partitioning
  • Best practices and optimizations
  • Resource tuning
  • Tuning parallelism