The problems with the Lambda Architecture raised by Jay Kreps (@jaykreps) are also worth noting.
"First, maintaining code that needs to produce the same result in two complex distributed systems is exactly as painful as it seems to be. One proposed approach to fix this is to have a language or framework that abstracts over both the real-time and batch framework. Summingbird is a framework that does this."
"Second, even if we only have to write the code once, the operational burden of running and debugging two systems is going to be very high."
The reason we are still interested in the Lambda Architecture is:
"What they have at their disposal are two things that don’t quite solve their problem: a scalable high-latency batch system that can process historical data and a low-latency stream processing system that can’t reprocess results."
"In this sense, even though it can be painful, I think the Lambda Architecture solves an important problem that was otherwise generally ignored. But I don’t think this is a new paradigm or the future of big data. It is just a temporary state driven by the current limitation of off-the-shelf tools. I also think there are better alternatives."
Some references:
- Website, Lambda Architecture. http://lambda-architecture.net/
- Book, Big Data. http://www.manning.com/marz/
- Framework, Kafka
- Framework, Samza
- Website, How to beat the guys that say they beat CAP, Beating the CAP Theorem Checklist