Li Yanchao (李彦超): Alibaba's Globalized Technical Architecture

零声学院

Published 2019/06/29 in the Technology category

Technical architecture

Text content
1. SACC 2017
2. Alibaba's Globalized Technical Architecture
3. About Me
• Joined Alibaba Group in 2008
• Currently Chief Infrastructure Architect of AliExpress
• Responsibilities of my team:
  • High Availability
  • High Performance
  • Engineering Efficiency
  • Big Data Architecture
4. About AliExpress: Some Numbers
• Our website ranks 44th on Alexa.
• Our iOS app is No. 1 by downloads across all app categories in over 28 countries.
• Our Android app is No. 1 by downloads in the shopping category in over 88 countries.
5. Challenges
• Buyers and sellers from any country can transact with each other.
• The long distance between buyers and sellers adds latency, which lowers the purchase rate.
• On the last Singles' Day there were over 2,000 global transactions per second, and this number grows every year.
• Our buyers are spread across the world, so traffic has no off-peak valley and the site must be available 24/7.
6. Solution: The Infrastructure We Built
• Global users: US-Users, RU-Users, EU-Users, Asia-Users
• Static content delivery: Akamai Edges
• Dynamic content delivery: the GTR Service, with near-user POPs, transfer POPs, and near-IDC POPs, alongside Akamai SureRoute
• Realtime network resource combination and monitoring
  • Objectives: performance, stability
  • Combination selection, realtime switching, realtime monitoring, big data model
• Akamai Intelligence Networks
• Static and dynamic sources: regional IDCs (US-IDC, RU-IDC, EU-IDC, SH-IDC)
• Nearest serving, global failover, data consistency
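To make "nearest serving, global failover" concrete, here is a minimal sketch of one way such a decision could be made. It is not the GTR Service's actual logic: the `Idc` class, the `pick_idc` function, and every latency number are assumptions made up for illustration. Only the IDC and user-region names come from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Idc:
    name: str
    healthy: bool = True


# Hypothetical latencies in milliseconds from a user region to each IDC.
# The numbers are invented for this sketch, not measurements.
LATENCY_MS = {
    "US": {"US-IDC": 20, "EU-IDC": 90, "RU-IDC": 140, "SH-IDC": 180},
    "EU": {"US-IDC": 90, "EU-IDC": 15, "RU-IDC": 50, "SH-IDC": 200},
    "RU": {"US-IDC": 140, "EU-IDC": 50, "RU-IDC": 10, "SH-IDC": 120},
    "Asia": {"US-IDC": 180, "EU-IDC": 200, "RU-IDC": 120, "SH-IDC": 25},
}


def pick_idc(user_region: str, idcs: List[Idc]) -> str:
    """Return the lowest-latency healthy IDC for a user region.

    Nearest serving: prefer the IDC with the smallest latency.
    Global failover: unhealthy IDCs are skipped, so traffic moves
    to the next-nearest healthy one.
    """
    candidates = [idc for idc in idcs if idc.healthy]
    if not candidates:
        raise RuntimeError("no healthy IDC available")
    nearest = min(candidates, key=lambda idc: LATENCY_MS[user_region][idc.name])
    return nearest.name
```

With all four IDCs healthy, an EU user would be served from EU-IDC under these assumed latencies. Marking EU-IDC unhealthy moves the same user to the next-nearest healthy IDC.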
7. Solution: Multi-Tenancy Perspective
• Tenants: AliExpress, AliExpress RU, PayTM, Lazada
• Goals: performance, availability, efficiency, cost
• Multi-tenancy platform layers: platform-centric applications; platform-independent business services; IaaS services (the slide marks the services "By AliExpress")
• IaaS services: Global Traffic Routing Service, Global Service Routing Service, Global Failover Service, Global Data Routing & Replicate Service
• Routing rules, customized by tenant: experience based, capacity based, regulation based, other rules based on big data
• Routing table
• Global tech tools: performance tools, SRE tools, efficiency tools, cost tools
• Big data model: rule based, machine learning
• Big data: user access data, mobile trace route, AGP, realtime network metrics, application logs
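The idea of routing rules customized by tenant can be sketched as a per-tenant chain of filters. This is only an illustration under assumptions. The tenant keys `tenant_a` and `tenant_b`, the request fields, the 0.9 load threshold, and all function names are hypothetical. The slide names the rule categories but does not say how they are implemented or which tenant uses which rule.

```python
from typing import Callable, Dict, List

Candidates = List[str]
Rule = Callable[[dict, Candidates], Candidates]


def regulation_rule(request: dict, candidates: Candidates) -> Candidates:
    # Regulation based: keep only IDCs the request is allowed to use, if restricted.
    allowed = request.get("allowed_idcs")
    if allowed is None:
        return candidates
    return [c for c in candidates if c in allowed]


def capacity_rule(request: dict, candidates: Candidates) -> Candidates:
    # Capacity based: drop IDCs at or above an assumed 90% load threshold.
    load = request["load"]
    return [c for c in candidates if load.get(c, 0.0) < 0.9]


def experience_rule(request: dict, candidates: Candidates) -> Candidates:
    # Experience based: order the remaining IDCs by latency, best first.
    latency = request["latency_ms"]
    return sorted(candidates, key=lambda c: latency[c])


# Hypothetical per-tenant rule chains; each tenant composes its own.
TENANT_RULES: Dict[str, List[Rule]] = {
    "tenant_a": [capacity_rule, experience_rule],
    "tenant_b": [regulation_rule, capacity_rule, experience_rule],
}


def route(tenant: str, request: dict, idcs: Candidates) -> str:
    """Apply the tenant's rules in order and return the first surviving IDC."""
    candidates = list(idcs)
    for rule in TENANT_RULES[tenant]:
        candidates = rule(request, candidates)
    if not candidates:
        raise LookupError("no IDC satisfies the tenant's routing rules")
    return candidates[0]
```

Keeping each rule as a small function with the same signature lets a tenant add, drop, or reorder rules without touching the router itself.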
8. Key Technology: Global Regionalized Deployment
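9. Global Regionalized Deployment
• Buy globally, sell globally, nearest access, global disaster recovery, on-demand data synchronization, data consistency guarantees
• Diagram: five IDCs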
10. Global Regionalized Deployment Architecture: Infrastructure as a Service
• Regional data centers (two shown), each with services: Global Traffic Routing Service, Global Service Routing Service, Global Failover Service, Global Data Replicate Service
• Global consistent data and regional replicas
• Data routing rules: experience based, regulation based, capacity based, other rules based on big data
• Global consistent routing table
11. Global Regionalized Deployment: Implementation
12. SRE@AliExpress
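13. SRE Strategy
• Tiered governance: invest first where ROI is highest
• Pilot with processes and standards first; once validated, implement through tooling and intelligent automation
• High ROI = availability return / (cost invested and the loss of R&D efficiency)
• Diagram labels: availability, cost, R&D efficiency; basic governance, application governance; strong process standards, process standards, tooling and intelligence; iteration direction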
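The ROI ratio on this slide can be turned into a simple ranking. The sketch below assumes the two denominator terms are added together and that all three quantities are scored on one comparable scale; the slide states neither. The `Initiative` class, its field names, and the `prioritize` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Initiative:
    name: str
    availability_gain: float  # expected availability return
    cost: float               # cost invested
    efficiency_loss: float    # loss of R&D efficiency

    def roi(self) -> float:
        # Assumption: cost and efficiency loss are summed in the denominator,
        # and their sum is positive.
        return self.availability_gain / (self.cost + self.efficiency_loss)


def prioritize(initiatives: List[Initiative]) -> List[Initiative]:
    """Order initiatives so the highest-ROI work is invested in first."""
    return sorted(initiatives, key=lambda i: i.roi(), reverse=True)
```

Under this reading, a cheap process change with a modest availability gain can outrank an expensive tool with a larger gain, which matches the strategy of piloting with processes before building tooling.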
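14. SRE Governance Architecture
• Processes and standards: stability standards, stability standards exam, "Flying Tigers" team operations handbook, tiered disaster recovery drills, failure drills, holidays, etc.
• SRE intelligent tools platform
  • Basic governance tools: capacity monitoring -> optimization; disaster recovery link monitoring -> disaster recovery execution
  • Application governance tools: stability measurement -> optimization; data quality measurement -> optimization; strong/weak dependency measurement -> governance
  • Small-and-beautiful toolset: problem diagnosis, intelligent fault diagnosis, routing information query, incident handling collaboration tool, network analysis tool, etc.
• Intelligent services: capacity service, disaster recovery service, network service, diagnosis service
• Intelligent models: rule based, RF, SVM, GBDT, LR
• Raw data: Springboot customized logs, CDN, AGP, network monitoring, application logs, 鹰眼 (EagleEye), XFlush, AliMonitor, Ali360, change logs, NPS and public sentiment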
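15. AE SRE Basic Governance Solution
• Problems: change problems, capacity problems, disaster recovery problems, Internet problems, other problems
• Governance steps: change, select data center, release per data center, regional traffic splitting, problem occurs, rollback, data center monitoring, disaster recovery switchover, drain traffic, load test, capacity optimization, capacity assurance, network monitoring
• Services: disaster recovery switchover service, disaster recovery service, GTR network switching service, network service, realtime statistics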
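One possible reading of the change-related labels on this slide (select a data center, release per data center, roll back when a problem occurs) is a staged rollout. The sketch below is a generic version of that technique, not AliExpress's tooling. The `staged_release` function and its `deploy`, `healthy`, and `rollback` callbacks are hypothetical names.

```python
from typing import Callable, List, Tuple


def staged_release(
    datacenters: List[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> Tuple[bool, List[str]]:
    """Release to one data center at a time and roll back on the first problem.

    Returns (success, data centers that received the release).
    """
    released: List[str] = []
    for dc in datacenters:
        deploy(dc)
        released.append(dc)
        if not healthy(dc):
            # A problem occurred: undo the change everywhere it was applied,
            # most recent data center first.
            for done in reversed(released):
                rollback(done)
            return False, released
    return True, released
```

Releasing one data center at a time limits the blast radius of a bad change to the data centers already touched, and the rest keep serving the previous version.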
16. Thank you!