Skip to main content

Garbage Collection and Application Performance

Automatic Garbage Collection is one of the finest features of the Java programming language. You can find more information about Garbage Collection concepts from the below link.


Even though GC is a cool feature in the JVM, it comes at a cost. Your application will stop working (Stop the World) when GC happens in the JVM level. Which means that, GC events will affect the performance of your java application. Due to this, you should have a proper understanding about the impact of GC for your application. 

There are two general ways to reduce garbage-collection pause time and the impact it has on application performance:


  • The garbage collection itself can leverage the existence of multiple CPUs and be executed in parallel. Although the application threads remain fully suspended during this time, the garbage collection can be done in a fraction of the time, effectively reducing the suspension time.
  • The second approach is leave the application running, and execute garbage collection concurrently with the application execution.


These two logical solutions have led to the development of serial, parallel, and concurrent garbage-collection strategies , which represent the foundation of all existing Java garbage-collection implementations.


In the above diagram, serial collector suspends the application threads and executes the mark-and-sweep algorithm in a single thread. It is the simplest and oldest form of garbage collection in Java and is still the default in the Oracle HotSpot JVM.

The parallel collector uses multiple threads to do its work. It can therefore decrease the GC pause time by leveraging multiple CPUs. It is often the best choice for throughput applications.

The concurrent collector does the majority of its work concurrent with the application execution. It has to suspend the application for only very short amounts of time. This has a big benefit for response-time for sensitive applications, but is not without drawbacks. 

Concurrent Mark and Sweep algorithm

Concurrent garbage-collection strategies complicate the relatively simple mark-and-sweep algorithm a bit. The mark phase is usually sub-divided into some variant of the following:


  • In the initial marking, the GC root objects are marked as alive. During this phase, all threads of the application are suspended.


  • During concurrent marking, the marked root objects are traversed and all reachable objects are marked. This phase is fully concurrent with application execution, so all application threads are active and can even allocate new objects. For this reason there might be another phase that marks objects that have been allocated during the concurrent marking. This is sometimes referred to as pre-cleaning and is still done concurrent to the application execution.


  • In the final marking, all threads are suspended and all remaining newly allocated objects are marked as alive. This is indicated in Figure 2.6 by the re-mark label.


The concurrent mark works mostly, but not completely, without pausing the application. The tradeoff is a more complex algorithm and an additional phase that is not necessary in a normal stop-the-world GC: the final marking.

The Oracle JRockit JVM improves this algorithm with the help of a keep area, which, if you're interested, is described in detail in the JRockit documentation . New objects are kept separately and not considered garbage during the first GC. This eliminates the need for a final marking or re-mark.

In the sweep phase of the CMS, all memory areas not occupied by marked objects are found and added to the free list. In other words, the objects are swept by the GC. This phase can run at least partially concurrent to the application. For instance, JRockit divides the heap into two areas of equal size and sweeps one then the other. During this phase, no threads are stopped, but allocations take place only in the area that is not actively being swept. 

There is only one way to make garbage collection faster: ensure that as few objects as possible are reachable during the garbage collection. The fewer objects that are alive, the less there is to be marked. This is the rationale behind the generational heap.

Young Generation vs Old Generation

In a typical application most objects are very short-lived. On the other hand, some objects last for a very long time and even until the application is terminated. When using generational garbage collection, the heap area is divided into two areas—a young generation and an old generation—that are garbage-collected via separate strategies.

Objects are ussually created in the young area. Once an object has survived a couple of GC cycles it is tenured to the old generation. After the application has completed its initial startup phase (most applications allocate caches, pools, and other permanent objects during startup), most allocated objects will not survive their first or second GC cycle. The number of live objects that need to be considered in each cycle should be stable and relatively small.

Allocations in the old generation should be infrequent, and in an ideal world would not happen at all after the initial startup phase. If the old generation is not growing and therefore not running out of space, it requires no garbage-collection at all. There will be unreachable objects in the old generation, but as long as the memory is not needed, there is no reason to reclaim it.

To make this generational approach work, the young generation must be big enough to ensure that all temporary objects die there. Since the number of temporary objects in most applications depends on the current application load, the optimal young generation size is load-related. Therefore, sizing the young generation, known as generation-sizing, is the key to achieving peak load.

Unfortunately, it is often not possible to reach an optimal state where all objects die in the young generation, and so the old generation will often often require a concurrent garbage collector. Concurrent garbage collection together with a minimally growing old generation ensures that the unavoidable, stop-the-world events will at least be very short and predictable. 

Comments

Popular posts from this blog

Understanding Threads created in WSO2 ESB

WSO2 ESB is an asynchronous high performing messaging engine which uses Java NIO technology for its internal implementations. You can find more information about the implementation details about the WSO2 ESB’s high performing http transport known as Pass-Through Transport (PTT) from the links given below. [1] http://soatutorials.blogspot.com/2015/05/understanding-wso2-esb-pass-through.html [2] http://wso2.com/library/articles/2013/12/demystifying-wso2-esb-pass-through-transport-part-i/ From this tutorial, I am going to discuss about various threads created when you start the ESB and start processing requests with that. This would help you to troubleshoot critical ESB server issues with the usage of a thread dump. You can monitor the threads created by using a monitoring tool like Jconsole or java mission control (java 1.7.40 upwards). Given below is a list of important threads and their stack traces from an active ESB server.  PassThroughHTTPSSender ( 1 Thread )

How to configure timeouts in WSO2 ESB to get rid of client timeout errors

WSO2 ESB has defined some configuration parameters which controls the timeout of a particular request which is going out of ESB. In a particular  scneario, your client sends a request to ESB, and then ESB sends a request to another endpoint to serve the request. CLIENT->WSO2 ESB->BACKEND The reason for clients getting timeout is that ESB timeout is larger than client's timeout. This can be solved by either increasing the timeout at client side or by decreasing the timeout in ESB side. In any of the case, you can control the timeout in ESB using the below properties. 1) Global timeout defined in synapse.properties (ESB_HOME\repository\conf\) file. This will decide the maximum time that a callback is waiting in the ESB for a response for a particular request. If ESB does not get any response from Back End, it will drop the message and clears out the call back. This is a global level parameter which affects all the endpoints configured in ESB. synapse.global_timeout_inte

WSO2 ESB tuning performance with threads

I have written several blog posts explaining the internal behavior of the ESB and the threads created inside ESB. With this post, I am talking about the effect of threads in the WSO2 ESB and how to tune up threads for optimal performance. You can refer [1] and [2] to understand the threads created within the ESB. [1] http://soatutorials.blogspot.com/2015/05/understanding-threads-created-in-wso2.html [2] http://wso2.com/library/articles/2012/03/importance-performance-wso2-esb-handles-nonobvious/ Within this blog post, I am discussing about the "worker threads" which are used for processing the data within the WSO2 ESB. There are 2 types of worker threads created when you start sending the requests to the server 1) Server Worker/Client Worker Threads 2) Mediator Worker (Synapse-Worker) Threads Server Worker/Client Worker Threads These set of threads will be used to process all the requests/responses coming to the ESB server. ServerWorker Threads will be used to pr