Druid Real-Time Analytics Architecture Design Approach: Imply, Part 3

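From the "office automation" that began in the 1970s to today's "mobile internet era", human technological evolution has once again reached a crossroads. Virtual reality, artificial intelligence, augmented reality, the Internet of Things, connected vehicles... Networks and technology are gradually changing every aspect of daily life that we take for granted, and a new wave of technological breakthroughs can be expected in the near future. Data will be the foundation of that next wave.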

1. STREAMING SETUP
[Diagram labels: Event Streams, Kafka, Samza, Druid, Insight, Hadoop (pre-processing and storage)]
2015
2. STREAMING ONLY INGESTION
‣ Stream processing isn’t perfect
‣ Difficult to handle corrections of existing data
‣ Windows may be too small for fully accurate operations
‣ Hadoop was actually good at these things
3. OPEN SOURCE LAMBDA ARCHITECTURE
[Diagram labels: Event Streams, Kafka, Samza, Hadoop, Druid, Insight]
Samza
‣ Real-time
‣ Only on-time data
Hadoop
‣ Some hours later
‣ All data
4. TAKE-AWAYS
‣ When Druid?
  • You want to power user-facing data applications
  • You want to analyze data as it arrives (real time)
  • Arbitrary data exploration with sub-second ad-hoc queries
  • OLAP, BI, Pivot (anything involving aggregates)
  • You need availability, extensibility and flexibility
5. DRUID IS OPEN SOURCE
WWW.DRUID.IO
twitter @druidio
irc.freenode.net #druid-dev
6. MY INFORMATION
FJ@IMPLY.IO
twitter @fangjin
LinkedIn fangjin
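
The trade-off described on slides 2 and 3 can be illustrated with a small toy model. The sketch below is not the Samza, Hadoop, or Druid API. It is a plain in-memory Python illustration, and every name in it (`Event`, `realtime_view`, `batch_view`, `serve`, the 600-second window) is a hypothetical chosen for this example. It assumes the streaming path drops events that arrive outside its window, that the batch path later recomputes complete hours from all data, and that the serving layer prefers the batch result wherever one exists.

```python
from collections import defaultdict
from dataclasses import dataclass

HOUR = 3600  # seconds


@dataclass(frozen=True)
class Event:
    event_time: int    # when the event happened (epoch seconds)
    arrival_time: int  # when the pipeline received it
    value: int


def hour_of(ts: int) -> int:
    """Truncate a timestamp to the start of its hour."""
    return ts - ts % HOUR


def realtime_view(events, window=600):
    """Streaming path: aggregate only events that arrive within the window."""
    totals = defaultdict(int)
    for e in events:
        if e.arrival_time - e.event_time <= window:  # late events are dropped
            totals[hour_of(e.event_time)] += e.value
    return dict(totals)


def batch_view(events, up_to):
    """Batch path: aggregate all events for hours that ended by `up_to`."""
    totals = defaultdict(int)
    for e in events:
        if hour_of(e.event_time) + HOUR <= up_to:
            totals[hour_of(e.event_time)] += e.value
    return dict(totals)


def serve(realtime, batch):
    """Serving layer: the batch result replaces the real-time one per hour."""
    merged = dict(realtime)
    merged.update(batch)
    return merged


events = [
    Event(event_time=100, arrival_time=105, value=1),    # on time
    Event(event_time=200, arrival_time=3000, value=2),   # late, missed by streaming
    Event(event_time=3700, arrival_time=3705, value=4),  # on time, current hour
]

rt = realtime_view(events)           # {0: 1, 3600: 4}
bt = batch_view(events, up_to=3600)  # {0: 3}
print(serve(rt, bt))                 # {0: 3, 3600: 4}
```

In this toy run the streaming path undercounts the first hour because one event arrived late, and the batch pass corrects it, while the current hour is still answered from the real-time result.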