Druid Real-Time Analytics Architecture Design: Imply, Part 3

1. STREAMING SETUP
   [Diagram components: Event Streams, Kafka, Samza, Druid, Insight; Hadoop (pre-processing and storage)]
   2015
2. STREAMING-ONLY INGESTION
   ‣ Stream processing isn’t perfect
   ‣ Difficult to handle corrections of existing data
   ‣ Windows may be too small for fully accurate operations
   ‣ Hadoop was actually good at these things
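The window limitation on this slide can be illustrated with a toy sketch. It assumes a hypothetical 10-minute window around the current time; events outside it are rejected by the streaming path and would need a later batch job to be counted. The names `WINDOW`, `accept_event`, and `split_events` are illustrative only and are not Druid or Samza APIs.

```python
from datetime import datetime, timedelta

# Hypothetical window: the streaming path only accepts events this close to "now".
WINDOW = timedelta(minutes=10)

def accept_event(event_time: datetime, now: datetime, window: timedelta = WINDOW) -> bool:
    """Return True if the event timestamp falls inside the ingestion window."""
    return abs(now - event_time) <= window

def split_events(events, now):
    """Partition (timestamp, payload) pairs into on-time and late lists."""
    on_time, late = [], []
    for ts, payload in events:
        (on_time if accept_event(ts, now) else late).append((ts, payload))
    return on_time, late

now = datetime(2015, 1, 1, 12, 0)
events = [
    (datetime(2015, 1, 1, 11, 55), "click"),  # 5 minutes old: inside the window
    (datetime(2015, 1, 1, 9, 30), "click"),   # 2.5 hours old: outside the window
]
on_time, late = split_events(events, now)
```

In this sketch the late event is simply set aside, which is why a streaming-only setup can end up with incomplete aggregates.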
3. OPEN SOURCE LAMBDA ARCHITECTURE
   [Diagram components: Event Streams, Kafka, Samza, Hadoop, Druid, Insight]
   Samza
   ‣ Real-time
   ‣ Only on-time data
   Hadoop
   ‣ Some hours later
   ‣ All data
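A minimal sketch of the lambda idea on this slide, assuming that batch results replace real-time results for any interval the batch job has already covered. The function `serve` and the hourly counts are hypothetical; this is a toy model of the serving layer, not Druid's actual segment handling.

```python
def serve(realtime: dict, batch: dict) -> dict:
    """Merge per-interval results, preferring batch values where they exist."""
    merged = dict(realtime)   # start from the fast, on-time-only view
    merged.update(batch)      # overwrite with the complete batch view
    return merged

# Hypothetical hourly event counts keyed by interval start.
realtime = {"2015-01-01T10": 98, "2015-01-01T11": 101, "2015-01-01T12": 40}
batch = {"2015-01-01T10": 100, "2015-01-01T11": 103}  # arrives some hours later

view = serve(realtime, batch)
```

The most recent hour is served from the real-time path alone until the batch path catches up and corrects it.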
4. TAKE-AWAYS
   ‣ When Druid?
     • You want to power user-facing data applications
     • You want to analyze data as it’s happening (real time)
     • You want arbitrary data exploration with sub-second ad-hoc queries
     • OLAP, BI, Pivot (anything involving aggregates)
     • You need availability, extensibility, and flexibility
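As a rough illustration of an ad-hoc aggregate query, the sketch below builds a Druid native `timeseries` query as JSON. The data source name `events`, the metric `count`, and the interval are assumptions made up for the example; the query would typically be sent as an HTTP POST to a broker's `/druid/v2/` endpoint, which is not done here.

```python
import json

# Hypothetical data source and metric names; adjust to the actual schema.
query = {
    "queryType": "timeseries",
    "dataSource": "events",
    "granularity": "hour",
    "intervals": ["2015-01-01T00:00:00Z/2015-01-02T00:00:00Z"],
    "aggregations": [
        {"type": "longSum", "name": "total_events", "fieldName": "count"}
    ],
}

# Serialize to the JSON body that would be posted to the broker.
body = json.dumps(query, indent=2)
```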
