Global Architect Summit (ArchSummit 2018)


1. TiDB on Kubernetes Huang Dongxu CTO, PingCAP
3. About Me ● Huang Dongxu (黄东旭) ● CTO, co-founder of PingCAP ● Infrastructure engineer / open-source advocate ● Co-author of Codis / TiDB / TiKV ● MSRA => Netease => WandouLabs => PingCAP ● h@pingcap.com
4. Agenda ● TiDB introduction ○ TiDB architecture ○ TiDB ecosystem ● Why combine TiDB & Kubernetes ○ Cloud vendor agnostic ○ Automation ● How we make it possible ○ TiDB Operator architecture & features ○ How we manage state ○ How we schedule stateful app
5. Part I - Intro to TiDB
6. Why we want to build a NewSQL database ● From the beginning ● What's wrong with the existing DBs? ○ RDBMS ○ NoSQL & middleware ● NewSQL: F1 & Spanner (Timeline: RDBMS, 1970s: MySQL, PostgreSQL, Oracle, DB2...; NoSQL, 2010: Redis, HBase, Cassandra, MongoDB; NewSQL, 2015 to present: Google Spanner, Google F1, TiDB)
7. TiDB architecture (Diagram: MySQL clients connect to a cluster of stateless tidb-servers; Syncer replicates data in from MySQL; the PD cluster holds cluster metadata and serves TSO / data location; the TiKV cluster is the storage layer; TiSpark lets a Spark cluster (driver + workers) read TiKV directly through the DistSQL API.)
8. TiDB: Computing ● Stateless SQL layer ○ Clients can connect to any existing tidb-server instance ○ TiDB *will not* re-shuffle data across tidb-servers ● Full-featured SQL layer ○ Speaks the MySQL wire protocol ■ Why not reuse MySQL? ○ Homemade parser & lexer ○ RBO & CBO ○ Secondary index support ○ DML & DDL (Diagram: SQL -> AST -> Logical Plan -> Optimized Logical Plan -> Cost Model + Statistics -> Selected Physical Plan, executed against the TiKV cluster)
9. TiKV: The Storage ● The storage layer for TiDB ● Distributed key-value store ○ Supports ACID transactions ○ Replicates logs with Raft ○ Range partitioning ■ Regions split / merge dynamically ○ Coprocessor support for SQL operator pushdown (Diagram: clients send data to the TiKV nodes; the Placement Driver cluster (PD x3) manages metadata)
10. TiKV: The Storage (Diagram: four TiKV nodes, each holding one Store with several Regions; each Region, e.g. Region 1 appearing on multiple nodes, forms a Raft group replicated across nodes; clients reach Stores over RPC; the Placement Driver cluster (PD 1-3) tracks placement.)
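The range-partitioning model on this slide can be sketched in a few lines: a client looks up which Region's key range covers a key before issuing the RPC. This is a simplified toy (the `region` type and `locate` function are illustrative names, not TiKV's actual API, which looks keys up via PD and caches the result):

```go
package main

import (
	"fmt"
	"sort"
)

// region models a Raft-replicated key range [StartKey, EndKey).
// Field names are illustrative, not TiKV's real types.
type region struct {
	ID       uint64
	StartKey string
	EndKey   string // "" means +infinity
}

// locate returns the region covering key, mimicking the "data location"
// lookup a client performs before each RPC. regions must be sorted by
// StartKey and cover the whole key space.
func locate(regions []region, key string) region {
	i := sort.Search(len(regions), func(i int) bool {
		return regions[i].EndKey == "" || key < regions[i].EndKey
	})
	return regions[i]
}

func main() {
	regions := []region{
		{ID: 1, StartKey: "", EndKey: "b"},
		{ID: 2, StartKey: "b", EndKey: "m"},
		{ID: 3, StartKey: "m", EndKey: ""},
	}
	// "cat" falls in ["b", "m"), so it belongs to region 2.
	fmt.Println(locate(regions, "cat").ID)
}
```

When a Region grows past a size threshold it splits into two adjacent ranges, which is why clients must be able to re-resolve locations at any time.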
11. TiDB ecosystem ● Core: PD, TiKV, TiDB ● Data migration: Lightning, Syncer, Loader ● Binlog ● Monitoring/Logging ● ...
12. Part II - Why TiDB on Kubernetes
13. Cloud-native applications ● Microservice architecture ● Easy deployment on any cloud ● Elastic scaling ● Highly available ● Automated operations
14. Kubernetes: the standard platform ● De facto container orchestration system (Google-sponsored) ● A distributed, cloud-provider-agnostic OS ○ Manages CPU, memory, storage and other devices across all nodes ○ Container <==> process ○ Docker image <==> executable artifact ○ Deployment, StatefulSet <==> systemd/supervisor ... ○ Helm / charts <==> apt/yum, deb/rpm
15. Kubernetes: powerful extensibility ● Standard interface: CNI, CRI, CSI ● Scheduler: scheduler extender ● Controller: CRD ● APIServer: Aggregated APIServer ● Kubelet: virtual kubelet ● Cloud Provider: LoadBalancer, PersistentVolume ● ...
16. Part III - How we make it possible
17. TiDB Operator https://github.com/pingcap/tidb-operator
18. Features ● Manage multiple TiDB clusters ● Safely scale the TiDB cluster ● Easily installed with Helm charts ● Network/Local PV support ● Automatically monitoring the TiDB cluster ● Seamlessly perform rolling updates to the TiDB cluster ● Automatic failover ● TiDB related tools integration
19. Architecture ● TiDB Operator ○ TiDB Controller Manager: TiDB Cluster Controller, containing the PD / TiKV / TiDB controllers ○ TiDB Scheduler: a scheduler extender in front of the kube-scheduler ● Kubernetes core: Controller Manager, API Server
20. How we manage state ● Kubernetes built-in controllers ○ Deployment: Start ✅ Scale ✅ Upgrade ✅ Failover ✅ ○ StatefulSet: Start ⍻ Scale ⍻ Upgrade ⍻ Failover ❌
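The reason custom controllers can go beyond StatefulSet is the level-triggered reconcile pattern: repeatedly diff desired state against observed state and emit corrective actions until the diff is empty. A minimal in-memory sketch (the `state` type and action strings are illustrative stand-ins for the TidbCluster spec/status):

```go
package main

import "fmt"

// state is a toy snapshot of one component: just a replica count.
type state struct{ Replicas int }

// reconcile returns the actions needed to converge observed toward desired.
// A real controller re-runs this on every watch event; each pass moves the
// cluster one step closer, and a converged cluster yields no actions.
func reconcile(desired, observed state) []string {
	var actions []string
	switch {
	case observed.Replicas < desired.Replicas:
		for i := observed.Replicas; i < desired.Replicas; i++ {
			actions = append(actions, fmt.Sprintf("create pod %d", i))
		}
	case observed.Replicas > desired.Replicas:
		for i := observed.Replicas - 1; i >= desired.Replicas; i-- {
			actions = append(actions, fmt.Sprintf("delete pod %d", i))
		}
	}
	return actions
}

func main() {
	// Spec says 3 replicas but only 1 pod is running: scale up by two.
	fmt.Println(reconcile(state{Replicas: 3}, state{Replicas: 1}))
}
```

The domain-specific value the operator adds is in *what* those actions are: for TiDB, "delete pod" must first be preceded by the PD/TiKV membership steps on the next slide.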
21. TiDB operations ● Cluster bootstrap: initial PD -> new PDs join the existing cluster ● Safely delete PD ○ Remove the member via the PD API ○ Stop pd-server ● Safely delete TiKV ○ Take the store offline via the PD API ○ Stop tikv-server ● Graceful upgrade ○ PD: transfer the Raft leader ○ TiKV: evict Raft leaders ○ TiDB: evict the DDL owner
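The ordering in "safely delete PD" is the whole point: the member must leave the Raft membership through the PD API *before* its process stops, otherwise quorum accounting is wrong. A sketch with hypothetical interfaces (`pdAPI`, `processAPI`, and `safeDeletePD` are illustrative names; the real operator calls PD's HTTP API and deletes the pod):

```go
package main

import "fmt"

// pdAPI is a hypothetical stand-in for PD's membership API.
type pdAPI interface {
	RemoveMember(name string) error
}

// processAPI is a hypothetical stand-in for stopping the pod/process.
type processAPI interface {
	Stop(name string) error
}

// safeDeletePD mirrors the slide's order: first remove the member from the
// Raft membership, and only then stop the process. If removal fails, the
// process is left running so the cluster keeps its quorum.
func safeDeletePD(pd pdAPI, proc processAPI, name string) error {
	if err := pd.RemoveMember(name); err != nil {
		return err
	}
	return proc.Stop(name)
}

// recorder is a fake that records the call order for demonstration.
type recorder struct{ calls []string }

func (r *recorder) RemoveMember(name string) error {
	r.calls = append(r.calls, "remove-member "+name)
	return nil
}

func (r *recorder) Stop(name string) error {
	r.calls = append(r.calls, "stop "+name)
	return nil
}

func main() {
	r := &recorder{}
	_ = safeDeletePD(r, r, "pd-2")
	fmt.Println(r.calls) // membership change happens before the stop
}
```

TiKV deletion follows the same shape with "offline store" in place of "remove member", plus waiting for the store's Regions to migrate away.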
22. Custom controller ● Domain operation logic ● ThirdPartyResource (TPR) / CustomResourceDefinition (CRD): ○ Simple & easy ○ Lacks schema & versioning (added in newer versions) ● Aggregated APIServer (AA): ○ Powerful but complicated ○ Coupled with the built-in APIServer, hard to deploy
23. Custom controller (Diagram: the controller Syncs the Spec (component, image, replicas, ...) into the Status (image, replicas, state).)
24. Custom controller

    type Manager interface {
        Sync(*TidbCluster) error
    }

    apiVersion: pingcap.com/v1alpha1
    kind: TidbCluster
    metadata:
      name: demo
    spec:
      pd:
        image: pingcap/pd:v2.1.0
        replicas: 3
        requests:
          cpu: "4"
          memory: "8Gi"
      ...
      tikv:
        image: pingcap/tikv:v2.1.0
    ...
    status:
      tikv:
        stores:
          "5":
            podName: demo-tikv-2
            state: Up
      ...
25. Custom controller ● StatefulSet with Local PV failover: 1. Increase replicas when a failure occurs 2. Decrease replicas when the node comes back (ordinal limitations of StatefulSet) (Diagram: TiKV-2 sits on failed Node-B ❌; TiKV-4 is created on Node-D to restore the replica count.)
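The failover rule above reduces to simple arithmetic over the store states: run one extra replica for every store the operator has marked failed, and shrink back once the stores recover. A toy model (`replicasFor` and the state strings are illustrative; the real operator records failed members in the TidbCluster status, and because StatefulSet ordinals only shrink from the top, scaling back down needs extra care):

```go
package main

import "fmt"

// replicasFor returns the StatefulSet size to request: the spec's replica
// count plus one replacement for every store currently marked "Down".
func replicasFor(specReplicas int, storeStates map[string]string) int {
	extra := 0
	for _, s := range storeStates {
		if s == "Down" {
			extra++
		}
	}
	return specReplicas + extra
}

func main() {
	// Node-B is down, taking one TiKV store with it: run a 5th replica.
	states := map[string]string{
		"tikv-0": "Up", "tikv-1": "Down", "tikv-2": "Up", "tikv-3": "Up",
	}
	fmt.Println(replicasFor(4, states))
}
```

The hard part the slide alludes to is the reverse direction: when Node-B returns, the pod to remove may not be the highest-ordinal one, which StatefulSet's ordinal semantics cannot express directly.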
26. How we schedule stateful apps ● Scheduling considers existing pod topology (Diagram: a node that already holds 3 TiKV pods is rejected ❌ for another TiKV pod; nodes holding fewer TiKV pods pass ✅.)
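A scheduler-extender filter for this spread rule can be sketched as: among candidate nodes, keep only those currently holding the fewest TiKV pods. This is a simplified illustration, not tidb-scheduler's exact HA predicate (which also accounts for replica counts and node limits):

```go
package main

import (
	"fmt"
	"sort"
)

// filterNodes keeps only the nodes whose current TiKV pod count is the
// minimum among candidates, so a new TiKV pod spreads evenly instead of
// piling onto a node that already hosts several replicas.
func filterNodes(tikvPodsPerNode map[string]int) []string {
	min := -1
	for _, n := range tikvPodsPerNode {
		if min == -1 || n < min {
			min = n
		}
	}
	var passed []string
	for node, n := range tikvPodsPerNode {
		if n == min {
			passed = append(passed, node)
		}
	}
	sort.Strings(passed) // deterministic order for display
	return passed
}

func main() {
	// Node-A already holds 3 TiKV pods, as on the slide: it is filtered out.
	fmt.Println(filterNodes(map[string]int{"Node-A": 3, "Node-B": 1, "Node-C": 1}))
}
```

In Kubernetes terms, this logic runs as an HTTP filter hook that the kube-scheduler calls via its extender configuration, so the default scheduler still handles resources and affinity.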
27. How we schedule stateful apps ● Scheduling considers virtual resources for local volumes (Diagram: on a node with 10 CPU / 10 memory / 10 storage, TiKV-3 (4 CPU, 3 memory, 4 storage) is being upgraded; its resources stay reserved, so creating TiKV-0 (8 CPU, 5 memory, 6 storage) there is rejected ❌.)
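The trap the slide illustrates: during a rolling upgrade, TiKV-3's pod is briefly deleted, but its local PV pins it to this node, so its resources must still count as used or another pod could steal them. A capacity-check sketch under that assumption (the `res` type and `fits` function are illustrative names):

```go
package main

import "fmt"

// res is a toy resource vector: CPU cores, memory units, storage units.
type res struct{ CPU, Mem, Storage int }

// fits reports whether pod fits on a node after subtracting both the
// running pods and the "virtual" requests of pods that are temporarily
// absent (e.g. a TiKV being upgraded whose local PV pins it here).
func fits(capacity res, running, pinned []res, pod res) bool {
	used := res{}
	for _, r := range append(running, pinned...) {
		used.CPU += r.CPU
		used.Mem += r.Mem
		used.Storage += r.Storage
	}
	return used.CPU+pod.CPU <= capacity.CPU &&
		used.Mem+pod.Mem <= capacity.Mem &&
		used.Storage+pod.Storage <= capacity.Storage
}

func main() {
	capacity := res{CPU: 10, Mem: 10, Storage: 10}
	upgrading := []res{{CPU: 4, Mem: 3, Storage: 4}} // TiKV-3, pod gone but PV pinned here
	newPod := res{CPU: 8, Mem: 5, Storage: 6}        // TiKV-0
	// 4 + 8 CPU exceeds the node's 10 cores: the placement is rejected.
	fmt.Println(fits(capacity, nil, upgrading, newPod))
}
```

Without the `pinned` term the check would pass, TiKV-0 would land on the node, and TiKV-3's new pod could never come back up next to its data.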
28. TiDB Operator Open-sourced! ヾ(=^▽^=)ノ https://github.com/pingcap/tidb-operator
