MapReduce: Simplified Data Processing on Large Clusters
http://static.googleusercontent.com/media/research.google.com/en/us/archive/mapreduce-osdi04.pdf
MapReduce: The programming model and practice
https://ai.google/research/pubs/pub36249
Google’s MapReduce programming model — Revisited
https://www.sciencedirect.com/science/article/pii/S0167642307001281
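The papers above describe the MapReduce model: a map function emits key/value pairs, a shuffle step groups values by key, and a reduce function merges each group. A minimal single-machine word-count sketch of that model (an illustration only, not the papers' distributed implementation):

```python
from collections import defaultdict

def map_phase(documents):
    # map: emit a (word, 1) pair for every word in every document
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    # shuffle: group emitted values by key
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # reduce: merge each group (here, by summing the counts)
    return {key: sum(values) for key, values in grouped.items()}

docs = ["the quick brown fox", "the lazy dog"]
counts = reduce_phase(map_phase(docs))
# counts["the"] == 2, counts["fox"] == 1
```

In the real system, map tasks run in parallel across the cluster and the shuffle moves data between machines; the programmer still only writes the two functions.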
Exploring Wikipedia with Apache Spark: A Live Coding Demo
https://www.infoq.com/presentations/wikipedia-apache-spark
Besides in-memory data, Apache Spark borrows ideas from functional programming (immutable data, operations expressed as functional transformations…), which explains why Scala is, for the moment, the lingua franca of Apache Spark and Big Data.
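The functional style mentioned here can be illustrated without Spark at all: start from an immutable value and build new values through chained transformations, never mutating in place. A small stand-alone sketch using plain Python (the same pattern Spark applies to RDDs, but with ordinary collections):

```python
from functools import reduce

data = (1, 2, 3, 4, 5)  # immutable input; it is never modified in place

# each transformation returns a new value instead of mutating `data`
squared = tuple(map(lambda x: x * x, data))
evens = tuple(filter(lambda x: x % 2 == 0, squared))
total = reduce(lambda a, b: a + b, evens)
# squared == (1, 4, 9, 16, 25), evens == (4, 16), total == 20
```

Because transformations are side-effect free, a system like Spark can distribute them across a cluster, recompute lost partitions from the lineage of transformations, and defer execution until a result is actually needed.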
Matei Zaharia, the creator of Spark, explains the history of Spark, starting from MapReduce and Hadoop.
A major driver of Big Data is Machine Learning: MapReduce -> Hadoop -> Big Data (Spark) -> Machine Learning
Apache Spark Architecture – Spark Cluster Architecture Explained
https://www.edureka.co/blog/spark-architecture/