CICD K8s And DBs Better Together

1. CI/CD, Kubernetes, and Databases: Better Together
   Niraj Tolia @nirajtolia
   Tom Manville @tdmanv
2. about us
   • Niraj Tolia: Co-founder & CEO @ Kasten. Previously at EMC, Maginatics, HP, CMU.
   • Tom Manville: Founding Engineer @ Kasten. Previously at Dropbox, Maginatics, U. Mich.
3. our goal: move fast and test with real data
4. what we will not cover in this talk
   • Kubernetes Ready for Production Stateful Apps
   • Implementing a Data Protection Strategy
   Presented at SNIA’s 2018 Storage Developer Conference
   KubeCon Seattle, Wednesday, December 12, 2:35pm
5. current state of databases in a cloud-native world
6. cloud-native and databases: why is there so much fear and risk?
   • Snowflakes: Databases are isolated from the application, might have manual changes applied, and are treated as pets.
   • Automation Gap: Not built into CI/CD pipelines. Test datasets have manual imports and get stale quickly.
   • DBAs and Ops: Still see database groups isolated from both dev and infra ops groups. Not part of app dev.
7. What should the future look like?
9. increasing agility with databases in a cloud-native environment
   • Source Control: Include all schema changes, upgrade changes, tools, etc. in the application repository
   • CI/CD Pipeline: Automate testing of all database changes and modifications
   • Database Infrastructure: Deliver database infrastructure and configuration as code
   Kubernetes to tie it all together!
10. how kubernetes makes a difference
   • Enforces Good DevOps Hygiene: Immutability, config as code, and automation make repeatable and reliable testing easy
   • Efficient, High Resource Utilization: Declarative systems approach supports reliable use of multiple testing environments to test at scale
   • Universal Control Plane: Use the same management plane as you use for all other components of your application
11. ci/cd advantages for databases
   Automated testing
   • Enforces that the app and DB are always in sync
   • Higher-confidence releases
   Engineering agility
   • Faster change iteration with automated testing
   • High-velocity prod DB deployments
   Catch issues early
   • Unit tests for coverage
   • Integration and staging environments for behavioral testing
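
A minimal sketch of what "the app and DB are always in sync" could look like as an automated test: schema migrations live next to the application code, and the pipeline applies them to a throwaway database and checks the resulting schema. The migration statements, the `schema_version` table, and the function names are all hypothetical, and SQLite stands in for whatever database the application actually uses.

```python
import sqlite3

# Hypothetical migrations, as they might be checked into the application repository.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN created_at TEXT",
]


def apply_migrations(conn, migrations):
    """Apply any migrations not yet recorded and return the schema version reached."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0  # MAX() is NULL when nothing has been applied yet
    for version, statement in enumerate(migrations, start=1):
        if version <= current:
            continue  # already applied on an earlier run
        conn.execute(statement)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
    conn.commit()
    return max(current, len(migrations))


def column_names(conn, table):
    """Return the column names of a table, in definition order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(apply_migrations(conn, MIGRATIONS), column_names(conn, "users"))
```

Running the same check on every commit is one way to catch a schema change that the application code has not caught up with, before it reaches a shared environment.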
12. But, it’s a database! So, what about the data?
13. Need to safely test with production data (but not in production!)
14. data-based testing: a number of integration challenges
   • Storage Integration: Might need to integrate with volume-level storage APIs for efficiency.
   • Database Integration: For consistent data capture, including with eventually consistent data stores.
   • Application Integration: Polyglot persistence in microservice-based applications needs app-level coordination. So does data masking to protect sensitive data.
15. Supporting Data Mobility
16. kanister: A Kubernetes-native framework for application-level data management
   • Supports complex data management workflows
   • Easy to integrate into your CI/CD pipeline
   • Actions invoked via Custom Resources (CRs)
   • Easy to extend via simple “recipes” or Blueprints
17. kanister: the highlights
   Data Capture/Export
   • File/Block integration via native API and CSI v1.0
   • S3 API support for object stores
   Database Manipulation
   • Filters
   • Masking
   • Incremental Capture
   Control Plane Integration
   • Ties K8s and DB control planes
   • Library support for complex workflows (e.g., scale up/down)
   Visit for more information
18. kanister workflow
   Steps shown in the diagram:
   1. ActionSet Creation
   2. Blueprint Discovery
   3. Action Execution
   4. Status Update
   Components shown: Kanister Controller, ActionSet (Custom K8s Resource), Blueprint (Custom K8s Resource), Stateful Application, KubeExec / KubeTask
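
The four workflow steps can be illustrated with a toy model. This is not Kanister's controller code: the `Blueprint` and `ActionSet` classes, the `reconcile` function, and the status strings are all hypothetical, and each phase is just a Python callable standing in for something like KubeExec.

```python
from dataclasses import dataclass, field


@dataclass
class Blueprint:
    name: str
    actions: dict  # action name -> list of phase callables


@dataclass
class ActionSet:
    action: str
    blueprint: str
    target: str
    status: str = "pending"  # status values are illustrative only
    output: list = field(default_factory=list)


def reconcile(action_set, blueprints):
    """One controller pass over an ActionSet that has just been created (step 1)."""
    blueprint = blueprints.get(action_set.blueprint)  # step 2: blueprint discovery
    if blueprint is None or action_set.action not in blueprint.actions:
        action_set.status = "failed"
        return action_set
    action_set.status = "running"
    for phase in blueprint.actions[action_set.action]:  # step 3: action execution
        action_set.output.append(phase(action_set.target))
    action_set.status = "complete"  # step 4: status update
    return action_set


if __name__ == "__main__":
    registry = {
        "postgresql": Blueprint("postgresql", {"backup": [lambda t: f"backed up {t}"]})
    }
    done = reconcile(ActionSet("backup", "postgresql", "postgresql-cluster"), registry)
    print(done.status, done.output)
```

The point of the split is that the ActionSet only names what to do and to which object, while the Blueprint holds how to do it, so the same Blueprint can serve many ActionSets.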
19. kanister actionset (abridged)
   apiVersion:
   kind: ActionSet
   spec:
     actions:
     - name: backup
       blueprint: postgresql
       object:
         kind: StatefulSet
         name: postgresql-cluster
         namespace: default
       configMaps: ...
20. kanister blueprint (abridged)
   apiVersion:
   kind: Blueprint
   actions:
     backup:
       type: StatefulSet
       phases:
       - func: KubeExec
         args:
         - '{{ .StatefulSet.Namespace }}'
         - '{{ index .StatefulSet.Pods 0 }}'
         - postgresql-tools-sidecar
         - bash
         - -c
         - wal-e ...
       - func: ...
     restore: ...
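
The `{{ ... }}` entries in the blueprint's args are template placeholders that get filled in from the object the ActionSet points at. The sketch below is a simplified Python stand-in for that substitution, not Kanister's template engine: it handles only the two placeholder forms that appear on the slide, and the `render` and `lookup` helpers and the pod names are hypothetical.

```python
import re

# Matches '{{ expr }}' and captures the expression without surrounding spaces.
PLACEHOLDER = re.compile(r"\{\{\s*(.*?)\s*\}\}")


def lookup(path, params):
    """Resolve a dotted path such as '.StatefulSet.Namespace' against nested dicts."""
    value = params
    for key in path.lstrip(".").split("."):
        value = value[key]
    return value


def render(arg, params):
    """Substitute '{{ .Path }}' and '{{ index .Path N }}' placeholders in one arg."""
    def replace(match):
        tokens = match.group(1).split()
        if tokens[0] == "index":  # {{ index .Path N }} picks element N of a list
            return str(lookup(tokens[1], params)[int(tokens[2])])
        return str(lookup(tokens[0], params))  # {{ .Path }}
    return PLACEHOLDER.sub(replace, arg)


if __name__ == "__main__":
    # Hypothetical values for the StatefulSet named in the ActionSet.
    params = {
        "StatefulSet": {
            "Namespace": "default",
            "Pods": ["postgresql-cluster-0", "postgresql-cluster-1"],
        }
    }
    args = ["{{ .StatefulSet.Namespace }}", "{{ index .StatefulSet.Pods 0 }}", "bash", "-c"]
    print([render(a, params) for a in args])
```

Keeping the namespace and pod name as placeholders is what lets one blueprint be reused across clusters and namespaces without edits.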
21. Demo!
22. demo: pipeline setup
   [Diagram. Labels: Application Code, Config Definition, Database Schema, Source Control, Integration Pipeline, Deployment Pipeline, Production Cluster, Data Mobility]
23. integration demo: data flow setup
   Steps shown in the diagram:
   ⓵ App Export
   ⓶ App Import
   ⓷ Test Invocation
   ⓸ Data Population
   [Diagram. Labels: Production Kubernetes Cluster, Integration Kubernetes Cluster, Namespace: demo, Namespace: test, NS: kio, App, DB, Pod, App + Data Snapshot, Object Storage, Firewall]
   K10: Policy and Orchestration (e.g., Periodic Import or Export) + Kanister: Data Manipulation and Mobility
24. end-to-end demo
25. advanced topics (hopefully) coming soon to a conf. near you
   • CD w/ schema changes: Deploying schema changes (and rollbacks) can be a lot more involved. Backup/recovery is a critical part of this.
   • Managed Services: Apart from cost, these slides apply to managed services too, but do track emerging best practices.
   • Masking and Sampling: Kanister has support for injecting your own code to mask sensitive data or extract only a subset.
   • Dataset Promotion: There are situations where you might want to promote data from dev → staging → prod.
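
As a rough illustration of the kind of user-supplied masking and sampling code mentioned above, the sketch below pseudonymizes an email column and keeps only every second row. It is independent of Kanister: the function names, the salt, the `example.invalid` domain, and the sampling rule are all assumptions chosen for the example.

```python
import hashlib


def mask_email(email, salt="demo-salt"):
    """Replace an email with a stable pseudonym so joins across tables still line up."""
    digest = hashlib.sha256((salt + email.lower()).encode("utf-8")).hexdigest()
    return f"user-{digest[:12]}@example.invalid"


def mask_and_sample(rows, sensitive=("email",), keep_every=2, salt="demo-salt"):
    """Keep every `keep_every`-th row and mask the sensitive columns in each kept row."""
    result = []
    for i, row in enumerate(rows):
        if i % keep_every != 0:
            continue  # sampling: skip rows outside the subset
        masked = dict(row)  # copy so the source rows are left untouched
        for column in sensitive:
            if masked.get(column) is not None:
                masked[column] = mask_email(masked[column], salt)
        result.append(masked)
    return result


if __name__ == "__main__":
    rows = [
        {"id": 1, "email": "alice@corp.test"},
        {"id": 2, "email": "bob@corp.test"},
        {"id": 3, "email": "carol@corp.test"},
        {"id": 4, "email": "dave@corp.test"},
    ]
    for row in mask_and_sample(rows):
        print(row)
```

A salted hash keeps the mapping deterministic, so the same production value masks to the same test value on every export. A real deployment would need to keep the salt secret and decide whether hashing alone is strong enough for its data.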
26. kubernetes, ci/cd, and databases: wrapping up
   Build & Standardize your DB Pipeline on Kubernetes!
   01 Automate your DB Pipeline: Deploy database updates and changes with increased confidence
   02 Leverage Kubernetes: Deliver greater agility to your dev teams by allowing easy and reliable testing
   03 Use Real Data: Test on production data to reduce the code-quality risk that comes from testing against synthetic or stale data
   04 Make DB Engineering Agile: Integrate database teams into your DevOps and Agile journey. Break apart the silos!
27. Questions?
   You can also find us at: Booth S/E15, @kastenhq, @nirajtolia, @tdmanv
   Image is the cover art from Better Together, a Jack Johnson song