A very informative podcast: http://softwareengineeringdaily.com/2015/11/11/apache-flink-with-stephan-ewen/
It’s an interview with Stephan Ewen, a committer to the Flink project and the CTO of Data Artisans.
Streaming is a must-have in modern data engineering. One must be careful to choose the right tool for the job, though; in practice, the choice comes down to Flink versus Spark.
- event time and processing time
- “Flink works on pages of bytes and Spark works on collections of objects” [this may be obsolete – Spark is in rapid development]
- Spark may coexist with Flink; each has its sweet spots
- snapshot algorithm based on https://en.wikipedia.org/wiki/Chandy-Lamport_algorithm
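
To make the first point a bit more concrete, here is a minimal sketch of the difference between event time and processing time. It is plain Python, not the Flink API, and everything in it (the `EVENTS` data, `WINDOW_SIZE`, `window_start`, `group_by_window`) is made up for illustration. The assumption is that each event carries two timestamps: when it actually happened and when the system got around to processing it.

```python
from collections import defaultdict

# Hypothetical events: (event_time_s, processing_time_s, value).
# Event "b" happened at t=4 but arrived late and was processed at t=12.
EVENTS = [
    (1, 2, "a"),
    (4, 12, "b"),
    (11, 11, "c"),
    (13, 14, "d"),
]

WINDOW_SIZE = 10  # tumbling window length in seconds (assumed for the example)


def window_start(timestamp, size=WINDOW_SIZE):
    """Return the start of the tumbling window that contains the timestamp."""
    return (timestamp // size) * size


def group_by_window(events, time_index):
    """Group event values into windows using the chosen timestamp field."""
    windows = defaultdict(list)
    for event in events:
        windows[window_start(event[time_index])].append(event[2])
    return dict(windows)


# Index 0 = event time, index 1 = processing time.
by_event_time = group_by_window(EVENTS, 0)
by_processing_time = group_by_window(EVENTS, 1)

print("event time:     ", by_event_time)
print("processing time:", by_processing_time)
```

With event time, the late event "b" still lands in the window where it actually happened; with processing time, it slips into the next window. A real streaming engine also has to decide how long to wait for late events, which this sketch ignores.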