Why do languages such as C and C++ not have garbage collection, while Java does?

Question

Well, I know that there are things like malloc/free for C, and new/using-a-destructor for memory management in C++, but I was wondering why there aren't "new updates" to these languages that allow the user to have the option to manually manage memory, or for the system to do it automatically (garbage collection)?

Somewhat of a newb-ish question, but only been in CS for about a year.

We've got a module in iPhone development this semester. After coding apps for Android for 2 years, this question struck most of the class pretty hard. Only now do we see how many hours Java has actually saved us in not having to track down nasty memory management errors and not having to write boilerplate code. — siamii, Oct 08 '11 at 23:28
D has this capability (both GC and manual memory management). C and C++ can have it through Boehm GC. — deadalnix, Oct 09 '11 at 02:04
@NullUserException, since it doesn't specify a way to reclaim memory that pretty much implies a GC. — Winston Ewert, Oct 09 '11 at 04:48
@bizso09: Did you look at ARC yet? No need for slow/fat/non-deterministic GC when you've got system-support for reference counting: http://developer.apple.com/technologies/ios5/ — JBRWilkinson, Oct 10 '11 at 12:01
For all those saying that C cannot have a garbage collector, why not use one which supports the malloc-free API? — , Oct 14 '11 at 09:58
@acidzombie24 I wouldn't say GC is bad. It eliminates a common source of mistakes and takes work away from the programmer, at a price. Every language is a child of its time and intended use. — Mark, Oct 28 '14 at 07:32
The answers to this pretty good question are full of religious bullshit. — abatishchev, Apr 06 '15 at 03:47
In C and C++ it is possible to take a pointer, cast it to int and add a number to it. Later, subtract the number from the int and cast the result back to a pointer. You will get the same pointer as before. Good luck implementing a GC which does NOT collect that memory while its address is stored only in a variable that holds a different value. I know the example is silly, but an XOR linked list uses something similar. I would post this as an answer but the question is closed. — Marian Spanik, Jun 18 '16 at 09:26
And how would you implement a GC on platforms with only 256 bytes (yes, bytes, like the ATTINY4) of RAM? There you cannot use dynamic memory at all. How would you write real-time software with a GC? There you most likely do not use dynamic memory either. And how would you implement a GC without a language like C or C++ when you do not want to use assembler? — 12431234123412341234123, Jul 05 '17 at 08:02
@MarianSpanik This can run into UB if sizeof(int) < sizeof(yourtype*), or when adding the number results in an overflow. To avoid this, use uintptr_t. — 12431234123412341234123, Jul 05 '17 at 08:11
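
The pointer-hiding trick from the comments above can be sketched roughly as follows. This is only an illustration, assuming a mainstream platform where a pointer survives a round trip through uintptr_t; the names hide, reveal, round_trip_value and the mask value kMask are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Arbitrary illustrative mask (an assumption, not taken from the thread).
constexpr std::uintptr_t kMask = 0x5A5A5A5Au;

// Keep only an obfuscated integer; no variable holds the real address.
std::uintptr_t hide(int* p) {
    return reinterpret_cast<std::uintptr_t>(p) ^ kMask;
}

// Undo the XOR to recover the original pointer value.
int* reveal(std::uintptr_t hidden) {
    return reinterpret_cast<int*>(hidden ^ kMask);
}

// Round trip: after hide(), the only record of the allocation is `hidden`.
int round_trip_value() {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    assert(p != nullptr);
    *p = 42;
    std::uintptr_t hidden = hide(p);
    p = nullptr;              // a scanning collector would now find no pointer to the block
    int* q = reveal(hidden);  // yet the program can still reach it
    int value = *q;
    std::free(q);
    return value;
}
```

A collector that scanned memory between hide() and reveal() would see nothing that looks like a reference to the block, which is why such code cannot safely be combined with a collector that frees whatever it cannot find.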

score 79 · Accepted Answer · edited Sep 06 '15 at 04:55


Garbage collection requires data structures for tracking allocations and/or reference counting. These create overhead in memory, performance, and the complexity of the language. C++ is designed to be "close to the metal"; in other words, it takes the higher-performance side of the tradeoff against convenience features. Other languages make that tradeoff differently. Which emphasis you prefer is one of the considerations in choosing a language.

That said, there are a lot of schemes for reference counting in C++ that are fairly lightweight and performant, but they are in libraries, both commercial and open source, rather than part of the language itself. Reference counting to manage object lifetime is not the same as garbage collection, but it addresses many of the same kinds of issues, and is a better fit with C++'s basic approach.
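
As a rough illustration of the library-based reference counting described above, here is a minimal sketch using std::shared_ptr, which was standardized in C++11. The Tracked type and the function name are hypothetical and exist only to make the moment of destruction observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that records when its destructor runs.
struct Tracked {
    explicit Tracked(bool& destroyed_flag) : destroyed(destroyed_flag) {}
    ~Tracked() { destroyed = true; }
    bool& destroyed;
};

// Returns true if the object was destroyed exactly when the last owner let go.
bool destroyed_at_last_release() {
    bool destroyed = false;
    auto first = std::make_shared<Tracked>(destroyed);
    {
        std::shared_ptr<Tracked> second = first;  // count goes 1 -> 2
        assert(first.use_count() == 2);
    }                                             // `second` ends: count goes 2 -> 1
    assert(first.use_count() == 1 && !destroyed);
    first.reset();                                // count goes 1 -> 0: destructor runs here
    return destroyed;
}
```

The point of the sketch is the timing: the object goes away at a known statement, not at some later collection pass.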


answered Oct 08 '11 at 20:59

kylben


A secondary issue is that GC is non-deterministic. The object may or may not still be in memory long after the program has "dropped" it. Refcount lifecycles are deterministic: when the last reference is dropped, the memory is dropped. This has implications not only for memory efficiency, but for debugging as well. A common programming error is the "zombie" object, a reference to memory that has theoretically been dropped. GC is much more likely to mask this effect, and produce bugs that are intermittent and extremely difficult to track down. – kylben Oct 08 '11 at 22:17
2

If you want to experiment with a language that supports both, try Objective-C. It can do reference counting or garbage collection. – Oct 08 '11 at 22:28
24
Modern GCs neither track allocations nor count references. They build a graph from everything currently on the stack and just condense and wipe everything else (simplified), and GC normally results in reduced language complexity. Even the performance benefit of avoiding GC is questionable. – Joel Coehoorn Oct 09 '11 at 00:56

14

Er, @kylben, the whole point of having automatic GC baked into the language is that it becomes impossible to reference zombie objects, because the GC only frees objects that are impossible to reference! You get the sort of hard-to-track bugs you're talking about when you make mistakes with manual memory management. – Ben Oct 09 '11 at 01:08

14

-1, GC does not count references. Plus, depending on your memory usage and allocation scheme, a GC can be faster (with an overhead in memory usage). So the argument about performance is a fallacy too. Only the "close to the metal" point is actually valid. – deadalnix Oct 09 '11 at 02:07

1

My understanding was that both Java and C# count references AND do graph-traversal. This is because reference-counting has much, much less overhead, and works correctly about 95% of the time. Also, I think kylben meant increased complexity in implementing the language, not programming for it. – BlueRaja - Danny Pflughoeft Oct 09 '11 at 02:44

14

Neither Java nor C# use reference counting: GC schemes based on reference counting are pretty primitive in comparison and perform much worse than modern garbage collectors (mainly because they need to do memory writes to alter reference counts whenever you copy a reference!) – mikera Oct 09 '11 at 02:59

2

Ben, that's true if the language is designed to lock that and force all references through certain channels. C++ does not, so this argument still holds against grafting it on.

Danny, yes, that is what I meant.

As far as all the comments on implementation, I'm not familiar with the nitty gritty and was not aware of the stack-inspection method. That's good to know, but I don't think it changes the argument too much, it's still overhead compared to traditional C/C++ ownership-based explicit deletion, or compared to refcounting, which is basically an atomic exchange and an if (jz).

– kylben Oct 09 '11 at 06:43

1

@kylben You explicitly drew a distinction between reference counting schemes in C/C++ and GC, and then wrote a comment stating that GC has the risk of masking zombie object bugs. This is severely misleading; it's only true if you're using a library based GC such as Boehm and you circumvent it. GC was specifically invented to make bugs such as zombie objects impossible. – Ben Oct 09 '11 at 23:00

7

@kylben Also, as others have said, reference counting is hugely inefficient compared to sophisticated modern GCs, and even hand-tuned manual free can be sub-par for some workloads. The advantage of C/C++ isn't that they're always more efficient than more managed environments, it's that they make it possible to write more efficient code when the managed environment is sub-par for your use-case. – Ben Oct 09 '11 at 23:04

4

@Ben: You can prevent access to zombie objects, or you can free unreachable subgraphs with cycles, or you can support finalizers, pick two. If you free an unreachable subgraph, finalizers running on components of that subgraph may still reference now-destroyed portions of said subgraph. – Ben Voigt Oct 15 '11 at 03:49

@BenVoigt I don't know that that's inherently true. I can imagine schemes where you run all the finalizers before deallocating any of the objects and only afterwards free anything that's still unreachable. I have gotten the impression that some modern languages are moving away from finalizers being a good idea because of complications with GC though, so your point has a measure of truth. I'd thought it was impossible to do the GC/finalizers combo efficiently, rather than impossible to do it at all. – Ben Oct 15 '11 at 05:25

1

@Ben, which again gets at what the question is about. It's not whether GC is better or worse, it's why C++ in particular doesn't have it. The design of C++ works against reliable and efficient GC. – kylben Oct 15 '11 at 13:01

5

@Ben: But if there are cycles in the subgraph being finalized, there is no "safe" order to run the finalizers. You won't be accessing memory that's repurposed, but you will have access to zombie objects (those for which the finalizer has already run). – Ben Voigt Oct 15 '11 at 13:27

1

@BenVoigt, and cycles are always a problem vis-a-vis object lifetimes, no matter what approach you take to cleanup. It's why the company I work for now has a fairly strict policy of object models always being acyclic (a very large C++ app). It's made life a hell of a lot easier, and dealing with the legacy code a hell of a lot more hellish, by comparison. A lot of this Gordian knot is cleaved by just not allowing cycles. The assumption that there could be cycles is baked into most languages' designs, and it is an assumption that is almost never questioned. It should be. – kylben Oct 15 '11 at 13:37

3

@kylben: Agreed. But "they deal with cycles" is one of the most touted features of generational GC. In fact, there are still landmines. – Ben Voigt Oct 15 '11 at 13:38

@kylben - in the most general case, statically disallowing reference cycles while allowing linked lists, binary trees etc. would be impossible, and immediately detecting and disallowing the creation of cycles at runtime would be a potentially problematic overhead. Banning reference cycles in some data structures (e.g. double-linked lists) is impossible by definition, though that's usually an issue you can leave to your library. It's a non-issue too, if nodes don't have destructors. I certainly believe high-level code (and a lot of low-level code) can work with a reference-cycle ban. – Oct 15 '11 at 16:24

@kylben - a slightly more permissive option might be to ban destructors for recursive data structure nodes (containing direct or indirect pointers to the same type - should be statically checkable). This shouldn't be a hassle - the destructor cleanup just moves up a layer to the container-class level instead of the data-structure node level. Of course the problem is that this is a language-design option - not applicable to C++, Java or any other current language I'm aware of. – Oct 15 '11 at 16:30

@Steve314, sure, that's why I hedged a bit with "fairly strict", there are cases in libs and frameworks where the implementations have local cycles, but those are generally subject to much higher scrutiny and careful implementation. For a double-linked list, just don't have your operational objects hold the links, link together very lightweight objects that hold references to them, and don't let the reference holders manage lifetime in any way. I believe STL containers do the first part of that at least. – kylben Oct 15 '11 at 16:37

@Steve314, some reference counting schemes in C++ do just that.

There's no programming problem that can't be solved with another layer of indirection. :-)

– kylben Oct 15 '11 at 17:40

This answer is out of date as of C++11 since C++11 introduced smart pointers. – Pharap Jun 17 '16 at 00:07

score 51 · Answer 2 · answered Oct 08 '11 at 22:12


Strictly speaking, there is no memory management at all in the C language. malloc() and free() are not keywords in the language, but just functions that are called from a library. This distinction may be pedantic now, because malloc() and free() are part of the C standard library, and will be provided by any standard compliant implementation of C, but this wasn't always true in the past.

Why would you want a language with no standard for memory management? This goes back to C's origins as 'portable assembly'. There are many cases of hardware and algorithms that can benefit from, or even require, specialized memory management techniques. As far as I know, there is no way to completely disable Java's native memory management and replace it with your own. This is simply not acceptable in some high performance/minimal resource situations. C provides almost complete flexibility to choose exactly what infrastructure your program is going to use. The price paid is that the C language provides very little help in writing correct, bug free code.
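
To make "specialized memory management techniques" a bit more concrete, here is a minimal sketch of a fixed-size arena that never touches malloc() or free(), the sort of thing one might use on a small embedded target. The class name Arena, its 256-byte capacity and its interface are assumptions chosen for illustration only. The same code is valid C++ and would translate to C with little change.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size arena: hands out memory from one static buffer
// and releases everything at once. No heap involved.
class Arena {
public:
    // Returns `size` bytes aligned to `align`, or nullptr if the arena is full.
    // `align` must be a power of two no larger than alignof(std::max_align_t).
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t start = (used_ + align - 1) & ~(align - 1);  // round up
        if (start > kCapacity || size > kCapacity - start) return nullptr;
        used_ = start + size;
        return buffer_ + start;
    }

    // Frees every allocation in one step; individual frees do not exist.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    static constexpr std::size_t kCapacity = 256;  // illustrative size
    alignas(std::max_align_t) unsigned char buffer_[kCapacity];
    std::size_t used_ = 0;
};
```

Allocation here is a few arithmetic operations and "deallocation" is a single assignment, which is the kind of control the answer is describing and that a fixed, built-in collector would take away.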


Charles E. Grant


3

+1 for the overall good answer, but also especially for "The price paid is that the C language provides very little help in writing correct, bug free code" – Shivan Dragon Jul 13 '13 at 11:46
3

C does have memory management - but it just works, so people barely notice it. There's static memory, registers and the stack. Until you start allocating out of the heap, you're fine and dandy. It's the heap allocation that messes things up. As for Java, everyone can write their own Java runtime - there's plenty to choose from, including what could be called "System's Java". .NET can do pretty much everything C can - it only lags behind C++'s native capabilities (e.g. classes are managed only in .NET). Of course, you also have C++.NET, which has everything C++ does, and everything .NET does. – Luaan Jun 17 '16 at 08:10
1

@Luaan I'd say that's a very generous definition of having "memory management" "Until you start allocating out of the heap, you're fine and dandy. It's the heap allocation that messes things up", That's like saying a car is a perfectly good airplane, it just isn't able to fly. – Charles E. Grant Jun 17 '16 at 16:16
2

@CharlesE.Grant Well, a purely functional language can do everything with that kind of memory management. Just because heap allocation is a good trade off in some use cases doesn't mean that it's the benchmark for all languages/runtimes. It's not like memory management stops being "memory management" just because it's simple, straight-forward, hidden behind the scenes. Designing static memory allocation is still memory management, as is a good use of the stack and whatever else you have available. – Luaan Jun 17 '16 at 19:29
1

"any standard compliant implementation" is not true; it holds only for standard-compliant hosted implementations. Some platforms/standard libraries, mostly for 8- or 16-bit embedded microcontrollers, do not provide malloc() or free() (for example, the MLAP-Compilers for PIC). – 12431234123412341234123 Jul 05 '17 at 07:50

Daniel Pryden · Answer 3 · 2015-09-06T23:02:40.453

38

The real answer is that the only way to make a safe, efficient garbage collection mechanism is to have language-level support for opaque references. (Or, conversely, a lack of language-level support for direct memory manipulation.)

Java and C# can do it because they have special reference types that cannot be manipulated. This gives the runtime the freedom to do things like move allocated objects in memory, which is crucial to a high-performance GC implementation.
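
One way to picture why opaque references give the runtime freedom to move objects is a handle table. The sketch below is a toy under stated assumptions: MovingHeap, Handle and relocate_all are invented names, and real collectors typically rewrite the references themselves instead of going through a table. The table is just the simplest way to show the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy heap of ints. Client code only ever holds a Handle,
// never an address, so the heap is free to relocate the stored values.
class MovingHeap {
public:
    struct Handle { std::size_t slot; };

    Handle allocate(int value) {
        storage_.push_back(value);
        table_.push_back(storage_.size() - 1);  // slot -> current position
        return Handle{table_.size() - 1};
    }

    int& get(Handle h) {
        assert(h.slot < table_.size());
        return storage_[table_[h.slot]];
    }

    // Stand-in for a compacting collector: reverses the storage order,
    // then fixes up the table so every handle still resolves correctly.
    void relocate_all() {
        std::vector<int> moved(storage_.rbegin(), storage_.rend());
        for (std::size_t& pos : table_) pos = storage_.size() - 1 - pos;
        storage_.swap(moved);
    }

    std::size_t position_of(Handle h) const { return table_[h.slot]; }

private:
    std::vector<int> storage_;        // where the objects currently live
    std::vector<std::size_t> table_;  // indirection owned by the "runtime"
};
```

If client code were allowed to keep a raw int* into storage_, relocate_all() would silently invalidate it, which is the problem raw C and C++ pointers pose for a moving collector.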

For the record, no modern GC implementation uses reference counting, so that is completely a red herring. Modern GCs use generational collection, where new allocations are treated essentially the same way that stack allocations are in a language like C++, and then periodically any newly allocated objects that are still alive are moved to a separate "survivor" space, and an entire generation of objects is deallocated at once.

This approach has pros and cons: the upside is that heap allocations in a language that supports GC are as fast as stack allocations in a language that doesn't support GC, and the downside is that objects that need to perform cleanup before being destroyed either require a separate mechanism (e.g. C#'s using keyword) or else their cleanup code runs non-deterministically.

Note that one key to a high-performance GC is that there must be language support for a special class of references. C doesn't have this language support and never will; because C++ has operator overloading, it could emulate a GC'd pointer type, although it would have to be done carefully. In fact, when Microsoft invented their dialect of C++ that would run under the CLR (the .NET runtime), they had to invent a new syntax for "C#-style references" (e.g. Foo^) to distinguish them from "C++-style references" (e.g. Foo&).

What C++ does have, and what is regularly used by C++ programmers, is smart pointers, which are really just a reference-counting mechanism. I wouldn't consider reference counting to be "true" GC, but it does provide many of the same benefits, at the cost of slower performance than either manual memory management or true GC, but with the advantage of deterministic destruction.
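
One known gap between reference counting and "true" GC, raised elsewhere in this thread, is cycles. The following is a rough sketch of the problem and of the usual std::weak_ptr workaround; the Node type, the g_live_nodes counter and both function names are hypothetical.

```cpp
#include <cassert>
#include <memory>

int g_live_nodes = 0;  // hypothetical counter of Node instances still alive

struct Node {
    Node() { ++g_live_nodes; }
    ~Node() { --g_live_nodes; }
    std::shared_ptr<Node> strong_peer;  // owning link
    std::weak_ptr<Node> weak_peer;      // non-owning link
};

// Two nodes that own each other: neither count reaches zero at scope exit.
int live_after_strong_cycle() {
    int before = g_live_nodes;
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;
        assert(a.use_count() == 2);
        observer = a;
    }
    int still_alive = g_live_nodes - before;  // both nodes outlived their scope
    // Break the cycle by hand so the demo itself does not leak.
    if (auto a = observer.lock()) a->strong_peer.reset();
    return still_alive;
}

// Same shape, but the back link is weak: both nodes die at scope exit.
int live_after_weak_back_link() {
    int before = g_live_nodes;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;
    }
    return g_live_nodes - before;
}
```

A tracing collector would reclaim the first pair without help; with reference counting, the programmer has to decide which link does not own.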

At the end of the day, the answer really boils down to a language design feature. C made one choice, C++ made a choice that enabled it to be backward-compatible with C while still providing alternatives that are good enough for most purposes, and Java and C# made a different choice that is incompatible with C but is also good enough for most purposes. Unfortunately, there is no silver bullet, but being familiar with the different choices out there will help you to pick the correct one for whatever program you're currently trying to build.

edited Sep 06 '15 at 23:02

answered Oct 09 '11 at 20:28

Daniel Pryden


6

This is the actual answer to the question – coredump Sep 06 '15 at 07:11
1

For the c++ part, nowadays you should look at std::unique_ptr and std::move :) – Niclas Larsson Mar 31 '16 at 07:42
@NiclasLarsson: I'm not sure I understand your point. Are you saying that std::unique_ptr is "language-level support for opaque references"? (It wasn't the kind of support I meant, and I also don't think that's sufficient unless support for direct memory manipulation were also removed from C++.) I do mention smart pointers in my answer, and I would consider std::unique_ptr a smart pointer, since it does actually do reference counting, it just only supports the special cases where the number of references is either zero or one (and std::move is the reference count updating mechanism). – Daniel Pryden Mar 31 '16 at 15:13
std::unique_ptr does not have a reference count and std::move has nothing to do with references at all (so "no" performance hit). I see your point though, as std::shared_ptr does have a reference count that is implicitly updated by std::move :) – Niclas Larsson Apr 01 '16 at 08:17
@NiclasLarsson: OK, we're quibbling over details. Fundamentally, though, all this is beside the point: while there are some systems where an ownership model like std::unique_ptr is all you need, in many more systems you do need something more robust, and that means you either need reference counting or garbage collection of some kind. It's no accident that GC was first invented for Lisp, since functional programming idiomatically uses closures as the mechanism for allocating memory, and you can't model the lifecycle of closed-over state using std::unique_ptr. – Daniel Pryden Apr 01 '16 at 19:00
2

"no modern GC implementation uses reference counting": CPython uses both reference counting and automatic collection. – Chris Smith Jun 18 '16 at 18:43
@ChrisSmith: True, but I wouldn't call CPython's implementation particularly modern. And the Python devs would love to change it, but can't due to legacy FFI APIs that they can't get rid of. – Daniel Pryden Jun 18 '16 at 21:13
1

Is reference counting really that slow compared to true GC? You have to consider that ref.counting gets compiled directly into the program, so there is no need for a complex run-time environment that manages the heap. – Mike76 Aug 21 '16 at 17:38
2

@Mike76: On the allocation side, a GC allocator will work about as fast as stack allocation, and the GC can deallocate thousands of objects at the same time. No matter what you do with a ref-counting implementation, allocation and deallocation will never be faster than malloc and free. So yes, a GC can be substantially faster. (Note that I said "can be" -- of course the exact performance of each program is affected by many factors.) – Daniel Pryden Aug 22 '16 at 03:05
malloc and free can be implemented as a buddy allocator, which has O(1) allocations with the help of prepared free-block pointers. Although the buddy allocator wastes some memory due to internal fragmentation, garbage collectors also tend to waste a lot of memory. Of course these O(1) allocations are still slower than stack allocations, but this should be compensated by the much faster free. – Mike76 Aug 22 '16 at 13:15
1

"no modern GC implementation uses reference counting": Swift uses automatic reference counting. – Aleksandr Dubinsky Jun 20 '17 at 15:09
1

@AleksandrDubinsky: At the time I wrote this answer, Swift did not exist. And I wouldn't consider Swift's memory management "GC": it's reference counting, which is something different. – Daniel Pryden Jun 20 '17 at 16:46
And now we have Rust, which made yet another choice. – Timo Huovinen Nov 02 '20 at 22:14

score 28 · Answer 4 · edited May 23 '23 at 12:57


Because, when using the power of C++, there is no need.

Herb Sutter: "I haven't written delete in years."

see Writing modern C++ code: how C++ has evolved over the years 21:10

It may surprise many experienced C++ programmers.
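
For readers who have not seen that style, here is a small sketch of what code without delete can look like, assuming C++14 or later for std::make_unique. The Shape hierarchy and total_area are invented purely for the example and are not taken from the talk.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical class hierarchy, used only to have something to allocate.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
    double side_;
};

struct Rect : Shape {
    Rect(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }
    double w_, h_;
};

// Heap-allocates polymorphic objects without a single `new` or `delete`.
double total_area() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Square>(2.0));
    shapes.push_back(std::make_unique<Rect>(2.0, 3.0));
    assert(!shapes.empty());
    double sum = 0.0;
    for (const auto& shape : shapes) sum += shape->area();
    return sum;
}  // the vector and every Shape it owns are released here
```

Ownership is expressed in the types (the vector owns the pointers, each pointer owns its shape), so cleanup follows from scope exit.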


Christianidis Vasilis


answered Oct 08 '11 at 21:04

Lior Kogan


Interesting. My reading material for today. – surfasb Oct 08 '11 at 23:03
Bah, a video. But nevertheless, interesting already. – surfasb Oct 08 '11 at 23:07
2

interesting video. 21 minutes in, and 55 minutes in were the best bits. Too bad the WinRT calls still looked to be C++/CLI bumpf. – gbjbaanb Oct 08 '11 at 23:39
This half-answers the question, but there is still very much a need for manual memory management in C. – dan04 Oct 09 '11 at 02:17
3

@dan04: That's true. But then, if you write in C, you get what you ask for. – DeadMG Oct 14 '11 at 09:16
9

Managing the smart pointers is not any more demanding than making sure you don't have unnecessary references in a garbage collected environment. Because GC can't read your mind, it's not magic either. – Tamás Szelei Oct 14 '11 at 11:56

ChrisF · Answer 5 · 2011-10-08T21:09:43.233

16

"All" a garbage collector is, is a process that runs periodically, checking whether there are any unreferenced objects in memory and deleting any it finds. (Yes, I know this is a gross oversimplification.) This is not a property of the language, but of the framework.
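
To make that oversimplified picture concrete, here is a toy mark-and-sweep collector. It is a sketch under generous assumptions: every object is allocated through the heap, all references live in an explicit refs list, and roots are registered by hand. The names ToyHeap, Object, add_root and collect are made up. Real collectors for C and C++ have to discover roots and references on their own, which is the hard part.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Object {
    std::vector<Object*> refs;  // outgoing references to other heap objects
    bool marked = false;
};

class ToyHeap {
public:
    Object* allocate() {
        objects_.push_back(std::make_unique<Object>());
        return objects_.back().get();
    }

    void add_root(Object* o) {
        assert(o != nullptr);
        roots_.push_back(o);
    }

    // Marks everything reachable from the roots, then frees the rest.
    // Returns how many objects were freed.
    std::size_t collect() {
        std::vector<Object*> pending(roots_);
        while (!pending.empty()) {  // mark phase
            Object* o = pending.back();
            pending.pop_back();
            if (o->marked) continue;
            o->marked = true;
            for (Object* r : o->refs) pending.push_back(r);
        }
        std::size_t before = objects_.size();
        std::vector<std::unique_ptr<Object>> survivors;
        for (auto& o : objects_) {  // sweep phase
            if (o->marked) {
                o->marked = false;  // clear the mark for the next cycle
                survivors.push_back(std::move(o));
            }
        }
        objects_.swap(survivors);   // unmarked objects are destroyed with the old vector
        return before - objects_.size();
    }

    std::size_t live() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<Object>> objects_;
    std::vector<Object*> roots_;
};
```

Note that an unreachable pair of objects referring to each other is freed like any other garbage, since only reachability from the roots matters.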

There are garbage collectors written for C and C++ - this one for example.

One reason why one hasn't been "added" to the language could be because of the sheer volume of existing code that would never use it as they use their own code for managing memory. Another reason could be that the types of applications written in C and C++ don't need the overhead associated with a garbage collection process.

edited Oct 08 '11 at 21:09

answered Oct 08 '11 at 20:55

ChrisF


1

But future programs written would begin to use the garbage collector, no? – Dark Templar Oct 08 '11 at 21:07
@DarkTemplar - possibly, however there are other reasons - see kylben's answer – ChrisF Oct 08 '11 at 21:08
5

While garbage collection is theoretically independent from any programming language, it is pretty hard to write a useful GC for C/C++, and even impossible to make a fool-proof one (at least as foolproof as Java's) - the reason Java can pull it off is because it runs in a controlled virtualized environment. Conversely, the Java language is designed for GC, and you'll have a hard time writing a Java compiler that doesn't do GC. – tdammers Oct 08 '11 at 21:55
4

@tdammers: I agree that garbage collection needs to be supported by the language to be possible. However, the main point is not virtualization and a controlled environment, but strict typing. C and C++ are weakly typed, so they allow things like storing a pointer in an integer variable, reconstructing pointers from offsets, and similar things that prevent the collector from being able to tell reliably what is reachable (C++11 prohibits the latter to allow at least conservative collectors). In Java you always know what is a reference, so you can collect it precisely, even if compiled to native. – Jan Hudec Oct 10 '11 at 07:50
A C garbage collector could be transparently hooked to malloc-free. Why would such one not be used? – Oct 14 '11 at 09:55
3

@ThorbjørnRavnAndersen: I can write a valid C program that stores pointers in such a way that no garbage collector could ever find them. If you then hook a garbage collector to malloc and free, you would break my correct program. – Ben Voigt Oct 15 '11 at 03:52
@ben, if you have a pointer to memory that has been free'd, you break the API. – Oct 16 '11 at 12:50
4

@ThorbjørnRavnAndersen: No, I wouldn't call free until I was done with it. But your proposed garbage collector that doesn't free the memory until I explicitly call free isn't a garbage collector at all. – Ben Voigt Oct 16 '11 at 14:47

score 12 · Answer 6 · answered Oct 08 '11 at 20:56


I don't have the exact quotes, but both Bjarne and Herb Sutter say something along the lines of:

C++ doesn't need a garbage collector, because it has no garbage.

In modern C++ you use smart pointers and therefore have no garbage.


ronag


1

what are smart pointers? – Dark Templar Oct 08 '11 at 21:03
@DarkTemplar, http://en.wikipedia.org/wiki/Smart_pointer – riwalk Oct 08 '11 at 21:59
13

if it was that simple, nobody would have implemented any GC. – deadalnix Oct 09 '11 at 02:15
9

@deadalnix: Right, because nobody ever implements anything overly complicated, slow, bloated, or unnecessary. All software is 100% efficient all the time, right? – Zach Oct 09 '11 at 05:54
7

@deadalnix - The C++ approach to memory management is newer than garbage collectors. RAII was invented by Bjarne Stroustrup for C++. Destructor cleanup is an older idea, but the rules for ensuring exception-safety are key. I don't know exactly when the idea itself was first described, but the first C++ standard was finalized in 1998, Stroustrup's "Design and Evolution of C++" wasn't published until 1994, and exceptions were a relatively recent addition to C++ - after the publication of the "Annotated C++ Reference Manual" in 1990, I believe. GC was invented in 1959 for Lisp. – Oct 15 '11 at 12:25
Well, newer doesn't mean perfect. For example, this way of doing things has terrible performance in a multithreaded environment. – deadalnix Oct 15 '11 at 12:40
Why would it be terrible in a multithreaded environment? ConcRT seems to run fine... – ronag Oct 15 '11 at 13:45
2

@deadalnix - are you aware that at least one Java VM used a reference-counting GC that could (almost) be implemented using C++-style RAII using a smart pointer class - precisely because it was more efficient for multithreaded code than existing VMs? See www.research.ibm.com/people/d/dfb/papers/Bacon01Concurrent.pdf. One reason you don't see this in C++ in practice is the usual GC collection - it can collect cycles, but can't choose a safe destructor order in the presence of cycles, and thus cannot ensure reliable destructor cleanup. – Oct 15 '11 at 14:10
2

@deadalnix - don't read this as "RAII is perfect". There are issues in each approach that don't occur in the other. C++ prioritizes safe and timely destructor cleanup, but loses some kinds of automatic cleanup as a result. Java goes the other way. Horses for courses, and IMO not worth the religious war - a good programmer should know how to work both ways. I only end up looking anti-GC at times because there's so many religiously pro-GC people around. – Oct 15 '11 at 14:16
@Steve314 > I know. This is exactly what I meant by « if it was that simple, nobody would have implemented any GC ». GC has some advantages, RAII has others, and any good programmer must know what they are. This is why I like D: it allows both. Both can be achieved in C or C++ using Boehm GC. – deadalnix Oct 15 '11 at 14:44
@deadalnix - my (possibly dated) issue with D is described in comments to another answer here - http://programmers.stackexchange.com/questions/113177/why-do-languages-such-as-c-and-c-not-have-garbage-collection-while-java-does/114476#114476 – Oct 15 '11 at 14:55
D's current GC doesn't compact the heap, and it looks into registers for pointers. You can disable it for a given period of time and/or manage memory manually. As for your example, you can allocate the structure with the GC and malloc the nodes. This has been possible for ages in D. New allocators are currently being worked on by the community around the language. – deadalnix Oct 15 '11 at 15:05

score 12 · Answer 7 · answered Oct 08 '11 at 22:19

C was designed in an era when garbage collection was barely an option. It was also intended for uses where garbage collection would not generally work - bare metal, real-time environments with minimal memory and minimal runtime support. Remember that C was the implementation language for the first Unix, which ran on a PDP-11 with 64 KB of memory. C++ was originally an extension to C - the choice had already been made, and it's very hard to graft garbage collection onto an existing language. It's the kind of thing that has to be built in from the ground floor.

score 9 · Answer 8 · answered Oct 09 '11 at 04:53

You ask why these languages haven't been updated to include an optional garbage collector.

The problem with optional garbage collection is that you can't mix code that uses the different models. That is, if I write code that assumes you are using a garbage collector you can't use it in your program which has garbage collection turned off. If you do, it'll leak everywhere.

score 8 · Answer 9 · edited May 23 '17 at 11:33

There are various issues, including...

Although GC was invented before C++, and possibly before C, both C and C++ were implemented before GCs were widely accepted as practical.
You can't easily implement a GC language and platform without an underlying non-GC language.
Although GC is demonstrably more efficient than non-GC for typical applications code developed in typical timescales etc, there are issues where more development effort is a good trade-off and specialized memory management will outperform a general-purpose GC. Besides, C++ is typically demonstrably more efficient than most GC languages even without any extra development effort.
GC is not universally safer than C++-style RAII. RAII allows resources other than memory to be automatically cleaned up, basically because it supports reliable and timely destructors. These cannot be combined with conventional GC methods because of issues with reference cycles.
GC languages have their own characteristic kinds of memory leaks, particularly relating to memory that will never be used again, but where existing references existed that have never been nulled out or overwritten. The need to do this explicitly is no different in principle than the need to delete or free explicitly. The GC approach still has an advantage - no dangling references - and static analysis can catch some cases, but again, there's no one perfect solution for all cases.
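
The RAII point in the list above can be sketched like this. It is a minimal illustration in which a plain counter stands in for a non-memory resource such as a file handle, socket or lock; g_open_handles, HandleGuard and use_resource are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

int g_open_handles = 0;  // stand-in for a scarce non-memory resource

// Acquires the resource in the constructor and releases it in the destructor.
class HandleGuard {
public:
    HandleGuard() { ++g_open_handles; }
    ~HandleGuard() {
        assert(g_open_handles > 0);
        --g_open_handles;
    }
    HandleGuard(const HandleGuard&) = delete;             // exactly one owner
    HandleGuard& operator=(const HandleGuard&) = delete;
};

// Returns the number of handles open while the work is in progress.
// The handle is released on both the normal and the exceptional path.
int use_resource(bool fail) {
    HandleGuard guard;
    if (fail) throw std::runtime_error("simulated failure");
    return g_open_handles;
}
```

The release happens during stack unwinding, at a known point, with no finally block and no dependence on when a collector happens to run.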

Basically, partly it's about the age of the languages, but there will always be a place for non-GC languages anyway - even if it is a bit of a nichey place. And seriously, in C++, the lack of GC isn't a big deal - your memory is managed differently, but it isn't unmanaged.

Microsoft's managed C++ has at least some ability to mix GC and non-GC in the same application, allowing a mix-and-match of the advantages from each, but I don't have the experience to say how well this works in practice.

score 7 · Answer 10 · answered Oct 14 '11 at 08:42

Can you imagine writing a device handler in a language with garbage collection? How many bits could come down the line while the GC was running?

Or an operating system? How could you start the garbage collection running before you even start the kernel?

C is designed for low-level, close-to-the-hardware tasks. The problem is that it is such a nice language that it's a good choice for many higher-level tasks as well. The language czars are aware of these uses, but they need to support the requirements of device drivers, embedded code and operating systems as a priority.

answered Oct 14 '11 at 08:42

James Anderson

C good for high level? I snorted my drink all over my keyboard. – DeadMG Oct 14 '11 at 09:18
Well, he did say "many higher level tasks". He could be troll-counting (one, two, many...). And he didn't actually say higher than what. Jokes aside, though, it's true - the evidence being that many significant higher-level projects have been successfully developed in C. There may be better choices now for a lot of those projects, but a working project is stronger evidence than speculation about what might have been. – Oct 15 '11 at 03:11
There are some managed operating systems, and they work rather well. In fact, when you make the whole system managed, the performance hit from using managed code drops even lower, up to being faster than unmanaged code in real-life scenarios. Of course, those are all "research OS" - there's pretty much no way to make them compatible with existing unmanaged code besides making a fully virtualised unmanaged OS within the managed OS. Microsoft did suggest at some point that they might replace Windows Server with one of those, though, as more and more server code is written on .NET. – Luaan Jun 17 '16 at 08:17

score 7 · Answer 11 · edited Oct 04 '14 at 19:51

The short and boring answer to this question is that there needs to be a non-garbage collected language out there for the people that write the garbage collectors. It's not conceptually easy to have a language that at the same time allows for very precise control over the memory layout and has a GC running on top.

The other question is why C and C++ don't have garbage collectors. Well, I know C++ has a couple of them around but they aren't really popular because they are forced to deal with a language that wasn't designed to be GC-ed in the first place, and the people that still use C++ in this age aren't really the kind that misses a GC.

Also, instead of adding GC to an old non-GC-ed language, it is actually easier to create a new language that has most of the same syntax while supporting a GC. Java and C# are good examples of this.

edited Oct 04 '14 at 19:51

Jamal

answered Oct 15 '11 at 01:05

hugomg

Somewhere on programmers.se or SO, there's a claim someone made to me that someone was working on a self-bootstrapping garbage-collected thingy - IIRC basically implementing the VM using a GC language, with a bootstrapping subset used to implement the GC itself. I forget the name. When I looked into it, it turned out that they'd basically never achieved the leap from the subset-without-GC to the working-GC level. This is possible in principle, but AFAIK it has never been achieved in practice - it's certainly a case of doing things the hard way. – Oct 15 '11 at 01:52
@Steve314: I'd love to see that if you ever remember where you found it! – hugomg Oct 15 '11 at 02:08
found it! See the comments to http://stackoverflow.com/questions/3317329/what-language-is-used-to-write-operating-systems-windows/3317622#3317622 referring to the Klein VM. Part of the problem finding it - the question was closed. – Oct 15 '11 at 02:54
BTW - I seem unable to start my comments with @missingno - what gives? – Oct 15 '11 at 02:56
@steve314: Having written the answer this thread is attached to, I already receive a notification for all comments. Doing an @-post in this case would be redundant and is not allowed by SE (don't ask me why though). (The real cause though is because my number is missing) – hugomg Oct 15 '11 at 03:09

score 5 · Answer 12 · answered Oct 15 '11 at 03:59

Garbage collection is fundamentally incompatible with a systems language used for developing drivers for DMA-capable hardware.

It's entirely possible that the only pointer to an object would be stored in a hardware register in some peripheral. Since the garbage collector wouldn't know about this, it would think the object was unreachable and collect it.

This argument holds double for compacting GC. Even if you were careful to maintain in-memory references to objects used by hardware peripherals, when the GC relocated the object, it wouldn't know how to update the pointer contained in the peripheral config register.

So now you'd need a mixture of immobile DMA buffers and GC-managed objects, which means you have all the disadvantages of both.
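
The "only pointer lives in a hardware register" problem can be sketched with a toy mark phase. Everything here is an assumption made for illustration: Object, FakeDmaController and dma_buffer_survives are hypothetical names, the "device" is just a struct holding an integer, and a real collector is far more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy heap object: the collector can only follow references listed here.
struct Object {
    std::vector<Object*> refs;
    bool marked = false;
};

// Simulated peripheral (hypothetical): it holds a raw address in a
// "register" that the collector never scans.
struct FakeDmaController {
    std::uintptr_t buffer_address = 0;
};

// Mark everything reachable from obj through visible references.
void mark(Object* obj) {
    if (obj == nullptr || obj->marked) return;
    obj->marked = true;
    for (Object* child : obj->refs) mark(child);
}

// Returns true if the DMA buffer would survive a collection that scans
// only the in-memory roots.
bool dma_buffer_survives(bool keep_in_memory_reference) {
    Object root;
    Object dma_buffer;
    FakeDmaController device;

    // The device is programmed with the buffer's address.
    device.buffer_address = reinterpret_cast<std::uintptr_t>(&dma_buffer);

    // Optionally keep a reference the collector can actually see.
    if (keep_in_memory_reference) root.refs.push_back(&dma_buffer);

    std::vector<Object*> roots{&root};
    for (Object* r : roots) mark(r);

    // An unmarked object would be reclaimed by the sweep phase.
    return dma_buffer.marked;
}
```

dma_buffer_survives(false) returns false: the buffer is still in use by the device, but nothing the collector scans refers to it. dma_buffer_survives(true) returns true, which is the manual bookkeeping the answer describes, and it still does nothing about a compacting collector moving the buffer.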

answered Oct 15 '11 at 03:59

Ben Voigt

Arguably all the disadvantages of both, but fewer instances of each disadvantage, and the same for advantages. Clearly there is complexity in having more kinds of memory management to deal with, but there may also be complexity avoided by choosing the right horse for each course within your code. Unlikely, I imagine, but there's a theoretical gap there. I've speculated about mixing GC and non-GC in the same language before, but not for device drivers - more for having a mostly GC application, but with some manually memory-managed low level data structure libraries. – Oct 15 '11 at 05:07
@Steve314: Wouldn't you say that remembering which objects need to be manually freed is as onerous as remembering to free everything? (Of course, smart pointers can help with either, so neither one is a huge problem) And you need different pools for manually managed objects vs collected/compactible objects, since compaction doesn't work well when there are fixed objects scattered throughout. So a lot of extra complexity for nothing. – Ben Voigt Oct 15 '11 at 13:32
Not if there's a clear divide between the high-level code which is all GC, and the low-level code that opts out of GC. I mainly developed the idea while looking at D some years ago, which allows you to opt out of GC but doesn't allow you to opt back in. Take for example a B+ tree library. The container as a whole should be GC, but the data structure nodes probably not - it's more efficient to do a customized scan through the leaf nodes only than to make the GC do a recursive search through the branch nodes. However, that scan does need to report the contained items to the GC. – Oct 15 '11 at 13:53
The point is, that's a contained piece of functionality. Treating the B+ tree nodes as special WRT memory management is no different to treating them as special WRT being B+ tree nodes. It's an encapsulated library, and the application code doesn't need to know that GC behaviour has been bypassed/special-cased. Except that, at least at the time, that was impossible in D - as I said, no way to opt back in and report the contained items to the GC as potential GC roots. – Oct 15 '11 at 13:58

score 3 · Answer 13 · answered Oct 09 '11 at 04:38

Because C and C++ are relatively low-level, general-purpose languages that are meant to run even on, for example, a 16-bit processor with 1 MB of memory in an embedded system, which can't afford to waste memory on GC.

answered Oct 09 '11 at 04:38

Petruza

"Embedded system"? At the time C was standardized (1989), it needed to be able to handle PCs with 1 MB of memory. – dan04 Oct 17 '11 at 07:35
I agree, I was citing a more current example. – Petruza Oct 17 '11 at 18:53
1MB??? Holy schmoley, who would ever need that much RAM? – Mark K Cowan Jul 08 '15 at 19:15

score 2 · Answer 14 · edited May 23 '17 at 12:40

There are garbage collectors in C++ and C. Not sure how this works in C, but in C++ you can leverage RTTI to dynamically discover your object graph and use that for garbage collection.

To my knowledge, you cannot write Java without a garbage collector. A little search turned up this.

The key difference between Java and C/C++ is that in C/C++ the choice is always yours, whereas in Java you're often left without options by design.

edited May 23 '17 at 12:40

Community

answered Oct 08 '11 at 21:25

back2dos

And also that the dedicated garbage collectors are better implemented, more efficient and fit better into the language. :) – Max Oct 14 '11 at 12:05
No, you can't use RTTI to dynamically discover the object graph in C/C++: It's the plain old data objects that spoil everything. There is simply no RTTI information stored in a plain old data object that would allow a garbage collector to differentiate between pointers and non-pointers within that object. Even worse, pointers do not need to be perfectly aligned on all hardware, so, given a 16 byte object, there are 9 possible locations a 64 bit pointer can be stored, only two of which don't overlap. – cmaster - reinstate monica Sep 06 '15 at 06:52
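
The arithmetic in the comment above can be checked with a small sketch. The function names candidate_offsets and disjoint_slots are hypothetical, and the code only counts byte positions; it assumes no alignment requirement at all, as the comment does.

```cpp
#include <cassert>
#include <cstddef>

// Number of byte offsets at which a pointer of pointer_size bytes fits
// entirely inside an object of object_size bytes when no alignment rule
// applies.
constexpr std::size_t candidate_offsets(std::size_t object_size,
                                        std::size_t pointer_size) {
    return pointer_size > object_size ? 0 : object_size - pointer_size + 1;
}

// Maximum number of such placements that can coexist without overlapping.
// Assumes pointer_size is greater than zero.
constexpr std::size_t disjoint_slots(std::size_t object_size,
                                     std::size_t pointer_size) {
    return object_size / pointer_size;
}
```

For a 16-byte object and an 8-byte (64-bit) pointer, candidate_offsets(16, 8) is 9 (offsets 0 through 8) and disjoint_slots(16, 8) is 2 (offsets 0 and 8), matching the figures in the comment.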

score 2 · Answer 15 · answered Oct 09 '11 at 19:36

It's a trade off between performance and safety.

There is no guarantee that your garbage will be collected in Java, so it may be hanging around using up space for a long time, while the scanning for unreferenced objects (i.e. garbage) also takes longer than explicitly deleting or freeing an unused object.

The advantage is, of course, that one can build a language without pointers or without memory leaks, so one is more likely to produce correct code.

There can be a slight 'religious' edge to these debates sometimes - be warned!

score 2 · Answer 16 · answered Sep 06 '15 at 07:23

Here is a list of inherent problems of GC, which make it unusable in a system language like C:

The GC has to run below the level of the code whose objects it manages. There is simply no such level in a kernel.
A GC has to stop the managed code from time to time. Now think about what would happen if it did that to your kernel. All processing on your machine would stop for, say, a millisecond, while the GC scans all existing memory allocations. This would kill all attempts to create systems that operate under strict real-time requirements.
A GC needs to be able to distinguish between pointers and non-pointers. That is, it must be able to look at every memory object in existence, and be able to produce a list of offsets where its pointers can be found.

This discovery must be perfect: The GC must be able to chase all the pointers it discovers. If it dereferenced a false positive (a value that merely looks like a pointer), it would likely crash. If it missed a real pointer (a false negative), it would likely destroy an object that's still in use, crashing the managed code or silently corrupting its data.

This absolutely requires that type information is stored in every single object in existence. However, both C and C++ allow for plain old data objects which contain no type information.
GC is an inherently slow business. Programmers that have been socialized with Java may not realize this, but programs can be orders of magnitude faster when they are not implemented in Java. And one of the factors that make Java slow is GC. This is what precludes GCed languages like Java from being used in supercomputing. If your machine costs a million a year in power consumption, you don't want to pay even 10% of that for garbage collection.
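
The pointer-identification item in the list above can be made concrete with a rough sketch. It assumes a mainstream platform where a pointer and std::uintptr_t have the same size and representation and the struct has no padding. Pod, conservative_scan and count_suspected_pointers are hypothetical names, and the "scan every word and guess" strategy shown is a swapped-in illustration, not something the answer proposes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Plain old data: nothing in the object says which field is a pointer.
struct Pod {
    std::uintptr_t id;   // an ordinary integer
    int* real_pointer;   // a genuine pointer
};

// Report every word-aligned offset whose value falls in [heap_lo, heap_hi).
// Without type information this is all a scanner can do with raw memory.
std::vector<std::size_t> conservative_scan(const void* object, std::size_t size,
                                           std::uintptr_t heap_lo,
                                           std::uintptr_t heap_hi) {
    std::vector<std::size_t> hits;
    const unsigned char* bytes = static_cast<const unsigned char*>(object);
    for (std::size_t off = 0; off + sizeof(std::uintptr_t) <= size;
         off += sizeof(std::uintptr_t)) {
        std::uintptr_t word;
        std::memcpy(&word, bytes + off, sizeof word);
        if (word >= heap_lo && word < heap_hi) hits.push_back(off);
    }
    return hits;
}

// Builds a Pod whose integer field happens to hold a heap-like value and
// returns how many fields the scan flags as possible pointers.
std::size_t count_suspected_pointers() {
    int heap[4] = {0, 0, 0, 0};  // stand-in for a managed heap region
    const auto lo = reinterpret_cast<std::uintptr_t>(&heap[0]);
    const auto hi = lo + sizeof heap;

    Pod pod{};
    pod.id = lo + 1;              // integer that merely looks like an address
    pod.real_pointer = &heap[2];  // actual reference into the "heap"

    return conservative_scan(&pod, sizeof pod, lo, hi).size();
}
```

Under the stated assumptions, both fields are flagged: the scan cannot tell the integer from the pointer. Treating the integer as a pointer is the false positive described above, and a scanner that skipped such words to avoid it would risk the false negative instead.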

C and C++ are languages that are created to support all possible use cases. And, as you see, many of these use cases are precluded by garbage collection. So, in order to support these use cases, C/C++ cannot be garbage collected.
