Pros and Cons of Automatic Garbage Collection

March 6, 2010

Take any group of application developers and ask them their opinion of automatic garbage collection. Undoubtedly, at least one person will have some very strong reasons for disliking it. While some of those reasons may be legitimate, most are more likely misconceptions, impressions based on older versions of garbage collectors, or just myths. The purpose of this essay is to examine the pros and cons of automatic garbage collection while disregarding the prevailing myths.


The advantages of automatic garbage collection (hereafter AGC) are not always apparent. The most obvious is increased speed of development. With AGC, developers can spend more time solving the actual problem at hand, and less time allocating, reclaiming, and keeping track of their memory usage.


Another advantage of AGC is a reduction in memory leaks (allocated memory that is never released and so cannot be reused). Of course, an expert in their chosen language could succeed in writing manual memory management that runs perfectly. But the fact remains that manual memory management requires human intervention, and humans inevitably make mistakes. Therefore, the potential for memory leaks is lower when a developer simply allows AGC to do its job.
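As a small illustration of the bookkeeping a collector takes over, here is a minimal sketch in Python. It assumes CPython's standard gc and weakref modules; the Node class and make_cycle function are hypothetical names invented for this example. Two objects point at each other, so neither can be released by simple reference counting, yet the cycle collector reclaims both without the developer writing any cleanup code.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def make_cycle():
    """Build two nodes that reference each other, then let go of both."""
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a
    # Return only a weak reference, which does not keep the node alive.
    return weakref.ref(a)


gc.disable()                         # keep the collector from running on its own mid-demo
probe = make_cycle()
alive_before = probe() is not None   # the cycle still holds both nodes in memory
gc.collect()                         # the cycle collector finds and frees the pair
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

In a manually managed language, forgetting to break or free such a pair is exactly the sort of human mistake described above.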


The next advantage counters a well-propagated myth about garbage collection: that it is performance intensive. In fact, some researchers have noted that AGC can actually use fewer CPU cycles than manual reclamation. Microsoft researcher Benjamin Zorn noted that the CPU overhead of manual memory reclamation is “comparable” to that of AGC (Zorn, 1993). AGC has come a long way since that publication. Additionally, with AGC the system decides when collection happens, whereas a destructor called manually at the wrong time can hurt performance. While garbage collectors of the past may have significantly hindered performance (Java prior to version 1.2, for example), most modern collectors are quite unobtrusive.
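To make the idea of the system choosing when to collect a little more concrete, here is a rough sketch, again assuming CPython and its standard gc module. The count_collections callback and the junk list are hypothetical names used only for illustration. The program never asks for a collection; the runtime starts one on its own whenever its allocation thresholds are crossed.

```python
import gc

automatic_runs = 0


def count_collections(phase, info):
    """Callback invoked by the collector; count each collection that starts."""
    global automatic_runs
    if phase == "start":
        automatic_runs += 1


gc.enable()
gc.callbacks.append(count_collections)
try:
    # Allocate many container objects without ever calling gc.collect().
    junk = [[] for _ in range(100_000)]
finally:
    gc.callbacks.remove(count_collections)

print("thresholds:", gc.get_threshold())
print("collections triggered by allocation:", automatic_runs)
```

The exact thresholds and the number of collections vary between Python versions, so the printed figures should be read as indicative only.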


With all of the myths and misconceptions out there, listing the disadvantages of AGC is a difficult task. The biggest that I can see is that the lower levels of memory management are abstracted away and hidden from the developer. This leaves little to no opportunity to tune resource usage. Additionally, if a memory leak does occur (however rare), debugging and fixing it becomes very difficult.
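One way a leak can still arise under AGC is through a reference that the program keeps by accident. The following is a minimal sketch, assuming CPython; the Request class, the _history list, and the handle function are all hypothetical. Because every object is still reachable, the collector is behaving correctly when it refuses to free them, which is part of what makes this kind of leak hard to track down.

```python
import gc
import weakref


class Request:
    """Hypothetical unit of work handled by a long-running program."""

    def __init__(self, payload):
        self.payload = payload


# Hypothetical history list that is appended to but never trimmed.
_history = []


def handle(payload):
    request = Request(payload)
    _history.append(request)      # the accidental long-lived reference
    return weakref.ref(request)   # weak reference used only to observe liveness


probes = [handle(i) for i in range(1000)]
gc.collect()
retained = sum(1 for p in probes if p() is not None)

_history.clear()                  # the fix: drop the forgotten references
gc.collect()
retained_after_fix = sum(1 for p in probes if p() is not None)

print(retained, retained_after_fix)
```

Nothing in the collector reports that _history is the culprit; the developer has to discover which reference is keeping the objects alive.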


There are also some applications for which AGC is not desirable. A daemon, or program that runs in the background of a server, would not do well with AGC. This is because servers are meant to run all of the time, which in turn means that the daemon will run all of the time. If there happened to be a memory leak that AGC could not resolve, the program has a very good chance of running long enough to consume all available memory. That would also degrade performance for everything else running on that server.


A daemon with a memory leak is potentially an even bigger problem on a Windows server. Windows servers are usually rebooted on a regular or semi-frequent basis. This means that the daemon would most likely be restarted periodically (forcibly reclaiming its memory), making the problem more difficult to find. By comparison, a Linux server could potentially run for several months without a reboot (required only for updates to the Linux kernel). Thus, a similar daemon with the same problem is more likely to have an opportunity to fail and be discovered in a Linux environment.


While a memory leak in a program written in a language with good AGC is quite rare, there is another reason why AGC is not preferable for server daemons. While in the background, a daemon can sit idle for extended periods of time. During this time, there is a good chance that the daemon's memory will be swapped out to disk. If AGC were to run while that memory is in swap space, the collector would have to page it back in to examine it, and performance could degrade noticeably, as reading from disk is significantly slower than reading from RAM.


In conclusion, I believe that there is a time and a place for both manual reclamation and automatic garbage collection. During my research for this essay, I noted that there is more misguided opinion out there than actual fact. As application development progresses, we will see further advances in garbage collection algorithms and techniques, making AGC even faster and safer than it already is. Sure, there are some who will still hold to their opinions and believe that they can manage memory reclamation better themselves. As for me, I believe that it is best not to get in the way of letting the system do what it was designed to do.


Aaron Ploetz


References:

Stephen Leibowitz, (2004), "Automatic Garbage Collection in Java and C++", https://googledrive.com/host/0B8HeaZzKMqdWZ3Z4bDRFUXg1ekU/CPP_Java/AGC.html


Brian Goetz, (Jan 2004), "Java Theory and Practice: Garbage Collection and Performance", http://www.ibm.com/developerworks/java/library/j-jtp01274.html


Ian Kaplan, (Dec 2005), "Memory Allocation and Garbage Collection", http://www.bearcave.com/software/garbage.htm


Ravenbrook Ltd., (2001), "Memory Management Reference FAQ", http://www.memorymanagement.org/faq.html


Benjamin Zorn, (July 1993), "The Measured Cost of Conservative Garbage Collection", http://eprints.kfupm.edu.sa/70498/1/70498.pdf


Copyright © Aaron Ploetz 2010
All corporate trademarks are property of their respective owners, and are shown here for reference only.