Server to Servlet/Spring and vice versa (Async Support)
In the previous article, we discussed how servlet containers have evolved and turned client-to-server communication into a non-blocking paradigm. In this article, we focus on the evolution of Java Servlets (and hence Spring) towards the non-blocking, reactive world.
Let’s recall the flow of a request when it is received by the NIO connector:
1. A few threads (1-4, depending on the number of cores) poll the selector, looking for IO activity on a channel (and hence on a connection).
2. When the selector sees IO activity, it calls a handle method on the connection, and a thread from the pool is allocated to process it.
3. The thread attempts to read the connection and parse it. For an HTTP connection, if the request headers are complete, the thread goes on to call the handling of the request (eventually this gets to the servlet) without waiting for any content.
4. Once a thread is dispatched to a servlet, the servlet IO looks blocking to it, so any attempt to read or write data through HttpInputStream/HttpOutputStream should block. But because we are using the NIO connector, the IO operations behind HttpInputStream and HttpOutputStream are actually asynchronous with callbacks. Because the Servlet API is blocking by nature, the container uses a special blocking callback to make these operations appear blocking.
Step 4 above should clarify the term ‘Simulated Blocking’ used in the previous article.
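To make the idea of a blocking callback more concrete, here is a minimal sketch in plain Java. It is not the container's actual implementation: the names BlockingCallback, asyncRead and blockingRead are hypothetical, and the asynchronous read is simulated with a CompletableFuture task instead of real NIO.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

final class BlockingCallbackSketch {

    /** Callback that an I/O thread completes while the request thread waits on it. */
    static final class BlockingCallback<T> implements Consumer<T> {
        private final CountDownLatch done = new CountDownLatch(1);
        private final AtomicReference<T> result = new AtomicReference<>();

        @Override
        public void accept(T value) {
            result.set(value);
            done.countDown(); // wake up the waiting request thread
        }

        T block() {
            try {
                done.await(); // park the request thread until the callback fires
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for I/O", e);
            }
            return result.get();
        }
    }

    /** Hypothetical async read: hands the data to the callback from another thread. */
    static void asyncRead(Consumer<String> callback) {
        CompletableFuture.runAsync(() -> callback.accept("request body"));
    }

    /** Looks like a blocking read to the caller, but is asynchronous underneath. */
    static String blockingRead() {
        BlockingCallback<String> callback = new BlockingCallback<>();
        asyncRead(callback);
        return callback.block();
    }

    public static void main(String[] args) {
        System.out.println(blockingRead());
    }
}
```

The request thread only sees blockingRead(); the waiting happens on a latch rather than on the socket itself, which is roughly what "simulated blocking" means here.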
Challenges Prior to Servlet 3.0
Coming back to the challenges posed by the one-thread-per-request model: the actual request processing, which is blocking in nature, is done by a thread (we will call it the request thread) from a pool managed by the servlet container. With the NIO connector, the default maximum thread pool size is 200 (Tomcat's default maxThreads), which means that only 200 requests can be processed concurrently. The problem with synchronous processing is that the threads doing the heavy lifting run for a long time before the response goes out. If this happens at scale, the servlet container eventually runs out of threads: long-running threads lead to thread starvation.
This size can be increased (within hardware constraints), but more threads also bring the overhead of context switching, cache flushes, and so on. Increasing the thread count to serve more concurrent requests is not a bad idea in itself, but when an application requires high concurrency we need a more suitable approach. Let’s read on to see how more concurrent users can be handled without increasing the container thread pool size.
Server thread is blocked during Http Request Processing
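The effect of a bounded pool of blocked request threads can be reproduced in a few lines of plain Java. This is only a sketch under simplifying assumptions: a fixed thread pool stands in for the container's request threads, Thread.sleep stands in for a blocking downstream call, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadStarvationSketch {

    /** Runs `requests` blocking tasks on `poolSize` threads; returns elapsed milliseconds. */
    static long elapsedMillis(int poolSize, int requests, long blockMillis) {
        ExecutorService requestThreads = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(requestThreads.submit(() -> {
                    Thread.sleep(blockMillis); // stand-in for a blocking downstream call
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every "request" to finish
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            requestThreads.shutdownNow();
        }
    }
}
```

With 2 threads and 4 requests that each block for 100 ms, the requests are served in two waves, so the total time is at least about 200 ms: the extra requests simply wait for a free thread.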
Async Servlets in 3.0
An async servlet enables an application to process incoming requests asynchronously: an HTTP request thread accepts an incoming request and then passes it to a background thread, which is responsible for processing the request and sending the response back to the client. The initial HTTP request thread returns to the HTTP thread pool as soon as it has handed the request over, so it becomes available to process another request.
Server thread is released during Http Request Processing
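Below is a code snippet showing how this can be achieved in Servlet 3.0:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(name = "myServlet", urlPatterns = {"/asyncprocess"}, asyncSupported = true)
public class MyServlet extends HttpServlet {
    // Application-managed thread pool that does the actual work (size is illustrative)
    private final ExecutorService poolExecutor = Executors.newFixedThreadPool(10);

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) {
        final AsyncContext aCtx = request.startAsync(request, response);
        // Process the request on a different thread; doGet() returns immediately
        poolExecutor.submit(() -> {
            try {
                String json = "json string";
                aCtx.getResponse().getOutputStream().write(json.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                log("Async processing failed", e);
            } finally {
                aCtx.complete(); // ends the async cycle and commits the response
            }
        });
    }
}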
When the asyncSupported attribute is set to true and startAsync() has been called, the response object is not committed when the method exits. Calling startAsync() returns an AsyncContext object that holds the request/response object pair. The AsyncContext can then be handed to application code, for example by storing it in an application-scoped queue or, as in the snippet above, by capturing it in a task submitted to a thread pool. The doGet() method returns without delay, and the original request thread is recycled. We can configure a thread pool executor at server startup, which will be used to process the request. After a request is processed, you have the option of calling HttpServletResponse.getOutputStream().write(...) and then complete() to commit the response, or calling forward() to direct the flow to a JSP page to be displayed as the result. Note that JSP pages are servlets with an asyncSupported attribute that defaults to false. complete() tells the servlet container to return the response to the client.
Note: The behaviour described above for servlets can be achieved in Spring by returning a Callable, DeferredResult or CompletableFuture from a controller method.
This approach by itself may solve the problem of HTTP thread pool exhaustion, but it does not solve the problem of system resource consumption. After all, another background thread is used to process the request, so the number of simultaneously active threads does not decrease and resource consumption does not improve. One might therefore think that this is not much of an improvement over the existing stack. Let’s first look at its implementation in Spring and then figure out in which scenarios it is desirable and really outperforms synchronous servlets.
We will use a Spring Boot project to expose two endpoints: blockingRequestProcessing, and asyncBlockingRequestProcessing, which uses the async servlet feature.
@GetMapping(value = "/blockingRequestProcessing")
public String blockingRequestProcessing() {
    logger.debug("Blocking Request processing Triggered");
    String url = "http://localhost:8090/sleep/1000";
    // Blocks the Tomcat request thread until the sleeping service responds
    new RestTemplate().getForObject(url, Boolean.TYPE);
    return "blocking...";
}

@GetMapping(value = "/asyncBlockingRequestProcessing")
public CompletableFuture<String> asyncBlockingRequestProcessing() {
    return CompletableFuture.supplyAsync(() -> {
        logger.debug("Async Blocking Request processing Triggered");
        String url = "http://localhost:8090/sleep/1000";
        // Still a blocking call, but it runs on an asyncTaskExecutor thread
        new RestTemplate().getForObject(url, Boolean.TYPE);
        return "Async blocking...";
    }, asyncTaskExecutor);
}
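The asyncTaskExecutor used above is not shown in the article. As a rough illustration of the hand-off it enables, here is a plain-Java sketch with no Spring involved: the executor is a hypothetical stand-in built with java.util.concurrent, the pool size and thread names are assumptions, and Thread.sleep replaces the blocking RestTemplate call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class AsyncExecutorSketch {

    /** Hypothetical stand-in for the asyncTaskExecutor used by the controller. */
    static ExecutorService newAsyncTaskExecutor(int threads) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "async-task-" + counter.incrementAndGet());
            t.setDaemon(true); // do not keep the JVM alive in this sketch
            return t;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }

    /** Starts the blocking work on the executor and returns to the caller immediately. */
    static CompletableFuture<String> handle(ExecutorService asyncTaskExecutor, long blockMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(blockMillis); // stand-in for the blocking RestTemplate call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "Async blocking... on " + Thread.currentThread().getName();
        }, asyncTaskExecutor);
    }
}
```

The calling thread gets the CompletableFuture back right away, and the returned string shows that the blocking work ran on one of the async-task threads rather than on the caller.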
Both services above call a REST endpoint of another service, the sleeping service. We can assume that the sleeping service has enough resources and won't be our bottleneck.
I have set the number of Tomcat threads for the sleeping service to 1000. Our service will have only 10, so that scale issues can be reproduced quickly.
Through this setup we want to examine the performance of our blockingRequestProcessing service.
In blockingRequestProcessing, an external sleeping service is called, which sleeps for 1 second. Our service has a maximum of 10 Tomcat threads. We use JMeter to trigger 20 requests per second for 60 seconds. While all the Tomcat threads (10 in our case) are busy processing requests, Tomcat holds the waiting requests in a request queue. When a thread becomes available, a request is retrieved from the queue and processed by that thread. If the queue is full, we get a "Connection Refused" error, but since I didn't change the default size (10,000 for the NIO connector) and we inject only 1,200 requests in total (20 requests per second for 60 seconds), we won't see that. The client timeout set in the JMeter configuration is 60 seconds. These are the results from JMeter:
Many of the clients got timeouts. Why? JMeter sends 20 requests per second, while our service can process only 10 requests per second, so 10 requests accumulate in the Tomcat request queue every second. This means that at second 60 the request queue holds at least 600 requests. Can the service process all the requests with 10 threads in 60 seconds (the client timeout)? The answer is no.
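The arithmetic behind this can be written down as a small calculation. It is an idealized model that ignores network and scheduling overhead; the class and method names are made up for illustration, while the numbers are the ones used in the test.

```java
final class QueueBacklogSketch {

    /** Requests still waiting after `seconds` of sustained load. */
    static long backlogAfter(long seconds, long arrivalsPerSecond, long servedPerSecond) {
        return Math.max(0, (arrivalsPerSecond - servedPerSecond) * seconds);
    }

    /** Seconds needed to serve `totalRequests` at a fixed service rate. */
    static long secondsToServeAll(long totalRequests, long servedPerSecond) {
        return (totalRequests + servedPerSecond - 1) / servedPerSecond; // ceiling division
    }

    public static void main(String[] args) {
        long arrivals = 20;  // requests per second sent by JMeter
        long served = 10;    // 10 Tomcat threads, 1 second per request
        long duration = 60;  // seconds of load
        System.out.println("Backlog at 60s: " + backlogAfter(duration, arrivals, served));
        System.out.println("Seconds to serve all: " + secondsToServeAll(arrivals * duration, served));
    }
}
```

Under these assumptions the backlog reaches 600 requests by the end of the load, and serving all 1,200 requests takes about 120 seconds, twice as long as the load itself.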
Let's run the same test with the same code, but return a CompletableFuture instead of a String, as in the asyncBlockingRequestProcessing service. (This makes use of the async servlet feature explained above, together with a thread pool executor.)
Everything looks good. All requests were successful, and the response time even went down. What happened? As mentioned before, returning a CompletableFuture (or a Callable) releases the Tomcat thread, and the processing is executed on another thread. That thread is managed by the Spring MVC task executor which we have configured.
We actually improved performance by adding resources, namely the threads of the executor thread pool. Note that the request to the sleeping service is still blocking, but it now blocks a different thread (a Spring MVC executor thread). The question arises whether we could also have improved performance without the async servlet API, simply by increasing the Tomcat max-threads configuration of the NIO connector. The answer is yes, but only for specific use cases.
So, where can we use the Servlet 3.0 async feature?
Servlet 3.0 async is really useful when the request-processing code uses a non-blocking API (in the example above, we used a blocking API to call the other service), as shown in the sample code below.
@GetMapping(value = "/asyncNonBlockingRequestProcessing")
public CompletableFuture<String> asyncNonBlockingRequestProcessing() {
    // getRequest is assumed to be a prepared AsyncHttpClient request to the sleeping service
    ListenableFuture<String> listenableFuture = getRequest.execute(new AsyncCompletionHandler<String>() {
        @Override
        public String onCompleted(Response response) throws Exception {
            logger.debug("Async Non Blocking Request processing completed");
            return "Async Non blocking...";
        }
    });
    return listenableFuture.toCompletableFuture();
}
In the code above, we use AsyncHttpClient, which calls the sleeping service in a non-blocking way. Hence, with a minimal number of threads, we can scale our service to serve many more clients concurrently.
The benefit of releasing Tomcat threads is clear when a single Tomcat server has several WARs deployed. For example, if I deploy two services and service1 needs 10 times the resources of service2, Servlet 3.0 async allows us to release the Tomcat threads and maintain a separate thread pool in each service as needed.
With this, we conclude our discussion of the Servlet 3.0 async feature. We have seen that this feature changed the way applications are designed, and it acts as a solid foundation for Spring Reactive. Stay tuned for the next article on this!
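Why a non-blocking client needs so few threads can be illustrated without any HTTP library. The sketch below does not use AsyncHttpClient: it simulates a non-blocking call with a single timer thread that completes a CompletableFuture after a delay, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class NonBlockingCallsSketch {

    /** Simulated non-blocking call: completes after delayMillis without parking a thread. */
    static CompletableFuture<String> callSleepingService(ScheduledExecutorService timer, long delayMillis) {
        CompletableFuture<String> response = new CompletableFuture<>();
        timer.schedule(() -> response.complete("Async Non blocking..."), delayMillis, TimeUnit.MILLISECONDS);
        return response;
    }

    /** Issues `calls` concurrent calls served by a single thread; returns how many completed. */
    static long runConcurrentCalls(int calls, long delayMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> responses = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                responses.add(callSleepingService(timer, delayMillis));
            }
            long completed = 0;
            for (CompletableFuture<String> response : responses) {
                response.join(); // wait for each simulated response
                completed++;
            }
            return completed;
        } finally {
            timer.shutdownNow();
        }
    }
}
```

Because no thread is parked while a call is in flight, one thread can have 100 calls outstanding at once, and they all finish in roughly the time of a single call instead of 100 times that.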
Source code for this article could be found at: