Crushing Application Latency with Vert.x, by Paulo Lopes


Published 2019/10/19 in the Technology category

Transcript
1. Crushing Latency with Vert.x. Paulo Lopes, Principal Software Engineer. @pml0pes | https://www.linkedin.com/in/pmlopes/ | pmlopes
4. latency noun la·ten·cy \ ˈlā-tᵊn(t)-sē \ Network latency is the term used to indicate any kind of delay that happens in data communication over a network. (techopedia.com)
5. Latency by the numbers. Amazon: every 100ms of latency costs 1% in sales (http://home.blarg.net/~glinden/StanfordDataMining.2006-11-29.ppt). Google: an extra 0.5 seconds in search page generation time dropped traffic by 20% (http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html). A broker: could lose $4 million in revenues per millisecond if their electronic trading platform is 5 milliseconds behind the competition (http://www.tabbgroup.com/PublicationDetail.aspx?PublicationID=346).
6. Latency is not the problem, it's the symptom!
7. 2007, Dan Pritchett: Loosely Couple Components; Use Asynchronous Interfaces; Horizontally Scale from the Start; Create an Active/Active Architecture; Use a BASE instead of ACID Shared Storage Model. www.infoq.com/articles/pritchett-latency
8. 2011 (Tim Fox): Vert.x. Loosely Couple Components (event bus); Use Asynchronous Interfaces (non-blocking I/O); Horizontally Scale from the Start (clustered). "Eclipse Vert.x is a tool-kit for building reactive applications on the JVM." https://vertx.io/
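As an illustration of the loosely coupled, asynchronous style the event bus enables, here is a minimal sketch in plain Vert.x (3.8+) Java; it is not from the deck, and the "greetings" address and messages are my own placeholders.

import io.vertx.core.Vertx;

public class EventBusExample {
  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();

    // Consumer: knows nothing about who sends the message,
    // only the address it listens on (loose coupling).
    vertx.eventBus().consumer("greetings", msg ->
        msg.reply("Hello, " + msg.body() + "!"));

    // Producer: sends asynchronously and handles the reply
    // whenever it arrives, never blocking the event loop.
    vertx.eventBus().request("greetings", "Vert.x", reply -> {
      if (reply.succeeded()) {
        System.out.println(reply.result().body());
      }
    });
  }
}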
9. Why Non-Blocking I/O?
10. 5ms / req time. In optimal circumstances: 1 thread => 200 req/sec; 8 cores => 1600 req/sec.
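As a back-of-the-envelope check on those numbers, a tiny Java sketch (illustrative only; the 5 ms figure comes from the slide, one busy thread per core is an assumption):

public class BlockingThroughput {
  public static void main(String[] args) {
    double requestTimeMs = 5.0; // each request keeps a thread busy/blocked for 5 ms
    int cores = 8;              // one thread per core, in optimal circumstances

    double perThread = 1000.0 / requestTimeMs; // 200 req/sec for a single thread
    double total = perThread * cores;          // 1600 req/sec across 8 cores

    System.out.printf("1 thread => %.0f req/sec%n", perThread);
    System.out.printf("%d cores => %.0f req/sec%n", cores, total);
  }
}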
11. Request time grows as threads fight for execution time.

# PROCESS STATE CODES
# D = Uninterruptible sleep (usually IO)
ps aux | awk '$8 ~ /D/ { print $0 }'
root 9324 0.0 0.0 8316 436 ? D< Oct15 0:00 /usr/bin/java...
12. When load is higher than max threads, queuing builds up.

# git@github.com:tsuna/contextswitch.git
./cpubench.sh
model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
1 physical CPUs, 4 cores/CPU, 2 hardware threads/core
2000000 thread context switches in 2231974869ns (1116.0ns/ctxsw)
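To put the 1116 ns/ctxsw figure in perspective, a rough Java sketch of the arithmetic (my illustration, not from the deck; the switch rate is a hypothetical value for a heavily contended box):

public class ContextSwitchCost {
  public static void main(String[] args) {
    double nsPerSwitch = 1116.0;      // measured by cpubench.sh above
    double switchesPerSec = 50_000;   // hypothetical rate with many runnable threads

    // Time per second spent only on switching, not on useful work.
    double overheadMs = nsPerSwitch * switchesPerSec / 1_000_000.0;
    System.out.printf("~%.1f ms/s lost to context switches (%.1f%% of one core)%n",
        overheadMs, overheadMs / 10.0);
  }
}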
13. grep 'CONFIG_HZ=' /boot/config-$(uname -r) # CONFIG_HZ=1000
14. Practical example: Tomcat 9.0. Default maxThreads: 200. Avg req time: 5ms. Hypothetical high load: 1000 req. Wasted wait/queue time: (1000 / 200 - 1) * 5 = 0~20ms. https://tomcat.apache.org/tomcat-9.0-doc/config/executor.html
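A small Java sketch of that wait-time estimate (illustrative; the helper is mine, the numbers are the slide's):

public class QueueWaitEstimate {
  // Worst-case wait: every earlier "batch" of maxThreads requests
  // must finish before ours gets a thread.
  static double worstCaseWaitMs(int concurrentRequests, int maxThreads, double avgReqMs) {
    int batchesAhead = (int) Math.ceil((double) concurrentRequests / maxThreads) - 1;
    return batchesAhead * avgReqMs;
  }

  public static void main(String[] args) {
    // Tomcat defaults from the slide: maxThreads = 200, avg request = 5 ms
    System.out.println(worstCaseWaitMs(1000, 200, 5.0) + " ms"); // 20.0 ms
  }
}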
15. At max utilization the CPU is mostly waiting.
16. Non-Blocking I/O
17. Vert.x events: Request handler, AUTH handler, DB handler, JSON handler.
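A minimal Vert.x Web sketch of such a handler chain (my example, not the talk's code; the route, the auth check, and the fake DB step are placeholders):

import io.vertx.core.Vertx;
import io.vertx.core.json.JsonObject;
import io.vertx.ext.web.Router;

public class HandlerChain {
  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();
    Router router = Router.router(vertx);

    // AUTH handler: runs first, passes control on with next()
    router.route("/api/*").handler(ctx -> {
      if (ctx.request().getHeader("Authorization") == null) {
        ctx.fail(401);
      } else {
        ctx.next();
      }
    });

    // DB handler (placeholder): pretend we fetched a row asynchronously
    router.get("/api/user").handler(ctx -> {
      ctx.put("user", new JsonObject().put("name", "alice"));
      ctx.next();
    });

    // JSON handler: serialize and end the response
    router.get("/api/user").handler(ctx ->
        ctx.response()
           .putHeader("Content-Type", "application/json")
           .end(((JsonObject) ctx.get("user")).encode()));

    vertx.createHttpServer().requestHandler(router).listen(8080);
  }
}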
18. 1 CPU core fully used!
19. Vert.x Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
20. 100% of CPU cores used!
21. Benchmarking is hard; meaningful benchmarks are even harder. TechEmpower Framework Benchmarks: Contributors: 528, Pull Requests: 4022, Commits: 11095. https://github.com/TechEmpower/FrameworkBenchmarks
22. Baseline: JAX-RS (blocking API, thread-based, Java).
23. jax-rs:

@GET
@Path("/queries")
World[] queries(@QueryParam("queries") int queries) {
  World[] worlds = new World[queries];
  Session session = emf.createEntityManager();
  for (int i = 0; i < queries; i++) {
    worlds[i] = session
      .byId(World.class)
      .load(randomWorld());
  }
  return worlds;
}

vert.x:

void queriesHandler(final RoutingContext ctx) {
  final int queries = getQueries(ctx);
  World[] worlds = new World[queries];
  AtomicInteger cnt = new AtomicInteger();
  for (int i = 0; i < queries; i++) {
    db.preparedQuery(FIND_WORLD, ..., res -> {
      final Row row = res.result()
        .iterator()
        .next();
      worlds[cnt.getAndIncrement()] = new World(row);
      if (cnt.get() == queries) {
        ctx.response()
          .end(Json.encodeToBuffer(worlds));
      }
    });
  }
}

The JAX-RS version blocks its worker thread on each database lookup; the Vert.x version issues the queries asynchronously and uses the counter to end the response only after the last row has arrived.
25. Simple results: Vert.x: 37,157 req/s; JAX-RS: 14,493 req/s.
27. Polyglot: English - 简体中文 (Simplified Chinese) - Português (Portuguese)
28. What happens when you say Hello?

function handler (context) {
  // the exchange context
  context
    // get the response object
    .response()
    // send the message and end
    // the response
    .end('你好');
}
29. Getting the response object

// The JS binding wraps the Java HttpServerResponse in a JS object
// on first access and caches it.
this.response = function() {
  var __args = arguments;
  if (__args.length === 0) {
    if (that.cachedresponse == null) {
      that.cachedresponse = utils.convReturnVertxGen(
        HttpServerResponse,
        j_routingContext["response()"]());
    }
    return that.cachedresponse;
  } else if (typeof __super_response != 'undefined') {
    return __super_response.apply(this, __args);
  } else {
    throw new TypeError('invalid arguments');
  }
};
30. In a nutshell: lots of conversions (GC); constant switching from the JS engine to Java code (somewhat similar to context switching); not suited for performance; JIT optimization stops at the language crossing.
31. Node.js
32. JIT can't optimize it all
33. Where's Node?
35. 🤔 Can we make polyglot fast?
38. GraalVM, in a nutshell: lots of conversions (GC); constant switching from the JS engine to Java code (somewhat similar to context switching); not suited for performance; JIT optimization stops at the language crossing.
39. ES4X: GraalVM based; Vert.x for I/O; CommonJS and ESM loaders; debug/profile with chrome-devtools. https://reactiverse.io/es4x
40. ES4X design principles: GraalJS (for a fast JS runtime); Vert.x (for fast I/O + event loops); GraalVM (for full JIT); .d.ts (for IDE support).

github.com/AlDanial/cloc v 1.72  T=0.09 s (401.5 files/s, 51963.8 lines/s)

Language      files    blank    comment    code
Java             26      389        683    1778
JavaScript        9      201        253    1226
SUM:             35      590        936    3004
41. Node.js vs ES4X

Node.js:

const cluster = require('cluster'),
      numCPUs = require('os').cpus().length,
      express = require('express');

if (cluster.isMaster) {
  for (let i = 0; i < numCPUs; i++) cluster.fork();
} else {
  const app = module.exports = express();
  app.get('/plaintext', (req, res) => res
    .header('Content-Type', 'text/plain')
    .send('Hello, World!'));
}

ES4X:

import { Router } from '@vertx/web';

const app = Router.router(vertx);

app.get("/plaintext").handler(ctx => {
  ctx.response()
    .putHeader("Content-Type", "text/plain")
    .end('Hello, World!');
});
44. Polyglot GraalVM is fast. What about latency?
46. Conclusion: latency is not a problem, it's a symptom; use non-blocking I/O to fully use the CPU; use Vert.x ;-); polyglot is fine; use GraalVM for polyglot JIT optimizations; Node can be slow for server applications; use ES4X ;-)