
I did a bit of an experiment to try to understand references in C++:

#include <iostream>
#include <vector>
#include <set>

struct Description {
  int a = 765;
};

class Resource {
public:
  Resource(const Description &description) : mDescription(description) {}

  const Description &mDescription;
};

void print_set(const std::set<Resource *> &resources) {
  for (auto *resource : resources) {
    std::cout << resource->mDescription.a << "\n";
  }
}

int main() {
  std::vector<Description> descriptions;
  std::set<Resource *> resources;

  descriptions.push_back({ 10 });
  resources.insert(new Resource(descriptions.at(0)));

  // Same as description (prints 10)
  print_set(resources);

  // Same as description (prints 20)
  descriptions.at(0).a = 20;
  print_set(resources);

  // Why? (prints 20)
  descriptions.clear();
  print_set(resources);

  // Object is written to the same address (prints 50)
  descriptions.push_back({ 50 });
  print_set(resources);

  // Create new array
  descriptions.reserve(100);

  // Invalid address
  print_set(resources);

  for (auto *res : resources) {
    delete res;
  }

  return 0;
}

https://godbolt.org/z/TYqaY6Tz8

I don't understand what is going on here. I found this excerpt in the C++ FAQ:

Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. It is neither a pointer to the object, nor a copy of the object. It is the object. There is no C++ syntax that lets you operate on the reference itself separate from the object to which it refers.

This raises some questions for me. If a reference is the object itself and I create a new object at the same memory address, does this mean that the reference "becomes" the new object? In the example above, the vector is a linear array, so as long as the array points to the same memory range, the object will be valid. However, this becomes a lot trickier with other data structures (e.g. sets, maps, linked lists), because each "node" typically points to a different part of memory.

Should I treat references as undefined if the original object is destroyed? If yes, is there a way to identify that the reference is destroyed other than a custom mechanism that tracks the references?

Note: Tested this with GCC, LLVM, and MSVC

Gasim
  • It depends how you replace the object. With placement `new` old references refer to the new object (in most cases). If you `clear()` and `push_back()` it is technically Undefined Behavior as `clear()` invalidates all references to the elements, even though it will very likely look like it works every time you try it. – François Andrieux Mar 18 '22 at 14:29
  • "A reference is the object" is sloppy language, though IMHO it is better than thinking of references as pointers. A reference isn't really the object, but you can think of it that way as long as the object is alive; after that, the reference is dangling. – 463035818_is_not_a_number Mar 18 '22 at 14:30
  • related/dupe: https://stackoverflow.com/questions/6438086/iterator-invalidation-rules-for-c-containers – NathanOliver Mar 18 '22 at 14:30
  • Still not perfectly accurate, but maybe better: "a valid reference is the object". – 463035818_is_not_a_number Mar 18 '22 at 14:31
  • "Should I treat references as undefined if the original object is destroyed?" Yes. "is there a way to identify that the reference is destroyed" No. – Quimby Mar 18 '22 at 14:32
  • Thank you all for the replies. My main problem was understanding the validity of references but it is much clearer now. – Gasim Mar 18 '22 at 14:37
  • IMO the note is more misleading than clarifying. You might find it easier to understand if you forgot about it. – Passer By Mar 18 '22 at 14:47
  • @PasserBy is there another reference book that I can refer to about specification for references? – Gasim Mar 18 '22 at 14:47
  • @Gasim I don't know of a good book to learn specifically about references. But you might want to read [cppreference](https://en.cppreference.com/w/cpp/language/reference). – Passer By Mar 18 '22 at 14:49
  • Thank you, this is what I was looking for. There is even a section about dangling references. – Gasim Mar 18 '22 at 14:56
  • A normal name can also be invalidated, for example when you explicitly call its destructor. The alias name becomes invalid at the same time as the aliased object. – apple apple Mar 18 '22 at 15:03
  • Oh, and a dangling reference is (usually) found when returning a local variable by reference, which doesn't need to have anything to do with pointers. – apple apple Mar 18 '22 at 15:05
  • you can also see https://en.cppreference.com/w/cpp/language/lifetime – apple apple Mar 18 '22 at 15:09

1 Answer

The note is misleading. Treating references as syntactic sugar for pointers is fine as a mental model: in every way a pointer might dangle, a reference will dangle too. Accessing a dangling pointer or reference is undefined behaviour (UB).

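void f(int);

void example() {
  int* p = new int{42};
  int& i = *p; // i refers to the heap-allocated int
  delete p;    // the int's lifetime ends here; p and i both dangle

  f(*p); // UB: reads through a dangling pointer
  f(i);  // UB for exactly the same reason: reads through a dangling reference
}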

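This also extends to the standard containers and their rules about pointer/reference invalidation. In your example, `clear()` invalidates every reference to the vector's elements, so each later read through `mDescription` is UB. That is the whole reason for the surprising behaviour.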

Passer By