Very informative podcast http://softwareengineeringdaily.com/2015/11/11/apache-flink-with-stephan-ewen/
It’s an interview with Stephan Ewen, a committer to the Flink project and the CTO of Data Artisans.
Streaming is a must-have in modern data engineering. You have to be careful to choose the right tool for the job, though, and in practice the choice usually comes down to Flink or Spark.
Key takeaways:
- event time (when an event actually happened) vs. processing time (when the system gets to handle it)
- “Flink works on pages of bytes and Spark works on collections of objects” [this may be obsolete – Spark is in rapid development]
- Spark may coexist with Flink; each one has its sweet spots
- Flink's snapshot algorithm is based on Chandy-Lamport: https://en.wikipedia.org/wiki/Chandy-Lamport_algorithm
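
To make the first takeaway concrete, here is a tiny plain-Python sketch (not Flink code) that counts the same events in 10-second tumbling windows, once by event time and once by processing time. The event data, the window size, and the helper names `window_start` and `count_per_window` are all made up for illustration.

```python
from collections import Counter

WINDOW_SECONDS = 10  # hypothetical tumbling-window size

# Made-up (event_time, processing_time) pairs, in seconds.
# event_time: when the event happened at the source.
# processing_time: when the stream processor actually saw it.
events = [
    (1, 2),
    (4, 5),
    (9, 12),   # happened in window 0, arrived during window 10
    (11, 13),
    (8, 21),   # a late event: happened in window 0, arrived in window 20
]

def window_start(ts, size=WINDOW_SECONDS):
    """Return the start of the tumbling window that contains ts."""
    return ts - ts % size

def count_per_window(events, use_event_time):
    """Count events per window, keyed by the chosen notion of time."""
    index = 0 if use_event_time else 1
    return dict(Counter(window_start(e[index]) for e in events))

by_event_time = count_per_window(events, use_event_time=True)
by_processing_time = count_per_window(events, use_event_time=False)

print("event time:     ", by_event_time)
print("processing time:", by_processing_time)
```

With event time, the late and delayed events still land in the window where they actually happened; with processing time, the counts depend on when the data showed up. A real engine also has to decide how long to wait for late events, which this sketch ignores.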