gcc uses garbage collection internally during the course of a compilation. I wasn’t too involved with gcc at the time that garbage collection was introduced, but I certainly saw the problems that it was trying to solve. Before garbage collection, gcc allocated memory using obstacks. An obstack is a memory allocation pool in which if you free an object you also free all more recently allocated objects. Obstacks are still used in gcc today. The issue before garbage collection was introduced was that all allocation was done on the current default obstack. There were several different obstacks with different lifetimes, and code would change the default back and forth. There was also code to copy data structures from one obstack to another in order to make it more permanent. It was very easy to introduce errors in which objects in a more permanent obstack pointed to objects in a more temporary obstack. This then lead to unpredictable memory crashes.
We can draw some lessons from this. If you are going to use memory pools in a complicated program, you must not use a default memory pool. All memory allocation must use an explicitly named pool. Otherwise you are going to get confused and the result will be dangling pointers.
Another lesson is that memory allocation is hard, but then we already knew that. Using a garbage collector resolves the memory allocation problem. However, you pay a fairly heavy price. Is it a price which gcc can afford to pay?
One answer is to observe that gcc’s garbage collector includes a ggc_free function. This function frees memory in the garbage collector. It is basically an assertion that there are no pointers to the memory. Any use of gcc_free in gcc’s code base is a case where the price of garbage collection was considered to be too high. Note that systems which are fully dedicated to garbage collection do not provide any equivalent to ggc_free. I count 24 uses of ggc_free in gcc’s code base today. Clearly in at least some cases developers feel that the cost of garbage collection is too high.
The question gcc faces is: is the garbage collector too expensive in all cases? Should we go back to explicit memory management–but this time with explicitly named pools, to avoid the problems of default obstacks? The answers are complicated by the fact that the implementation of pre-compiled headers is tied into the implementation of garbage collection.
Another way to ask the question is: if garbage collection is not too expensive, then why not write gcc in Java? The natural response would be that using Java would be too expensive–the compiler would be too slow. But the main reason that Java is slower than C++ is not the bytecode system–modern JIT technology is competitive with compilation to machine code; the slowness of Java is a result of garbage collection.
Leave a Reply
You must be logged in to post a comment.