I have the following problem: I have a data structure that is parsed from a buffer and contains some references into this buffer, so the parsing function looks something like

fn parse_bar<'a>(buf: &'a [u8]) -> Bar<'a>

So far, so good. However, to avoid certain lifetime issues I'd like to put the data structure and the underlying buffer into a struct as follows:

struct BarWithBuf<'a> {bar: Bar<'a>, buf: Box<[u8]>}
// not even sure if these lifetime annotations here make sense,
// but it won't compile unless I add some lifetime to Bar

However, now I don't know how to actually construct a BarWithBuf value.

fn make_bar_with_buf<'a>(buf: Box<[u8]>) -> BarWithBuf<'a> {
    let my_bar = parse_bar(&*buf);
    BarWithBuf {buf: buf, bar: my_bar}
}

doesn't work, since buf is moved in the construction of the BarWithBuf value, but we borrowed it for parsing.

I feel like it should be possible to do something along the lines of

fn make_bar_with_buf<'a>(buf: Box<[u8]>) -> BarWithBuf<'a> {

    let mut bwb = BarWithBuf {buf: buf};
    bwb.bar = parse_bar(&*bwb.buf);
    bwb
}

to avoid moving the buffer after parsing the Bar, but I can't do that because the whole BarWithBuf struct has to be initialised in one go. I suspect that I could use unsafe code to partially construct the struct, but I'd rather not do that. What would be the best way to solve this problem? Do I need unsafe code? If I do, would it be safe to do this here? Or am I completely on the wrong track, and is there a better way to tie a data structure and its underlying buffer together?

fjh
  • I never figured out if it is possible to have internal references to another member of a struct without unsafe code. I can't see how the borrow checker could follow a borrow like this one... – Levans Nov 23 '14 at 19:10
  • 2
    This question is old enough that I don't want to close it as a duplicate, but most people visiting this should probably check out [Why can't I store a value and a reference to that value in the same struct?](https://stackoverflow.com/q/32300132/155423) instead. – Shepmaster Feb 18 '20 at 13:23

1 Answer

I think you're right in that it's not possible to do this without unsafe code. I would consider the following options:

  1. Change the reference in Bar to an index. The contents of the box won't be protected by a borrow, so the index might become invalid if you're not careful. However, an index might convey the meaning of the reference in a clearer way.

  2. Move Box<[u8]> into Bar, and add a function buf() -> &[u8] to the implementation of Bar; instead of references, store indices in Bar. Now Bar is the owner of the buffer, so it can control its modification and keep the indices valid (thereby avoiding the problem of option #1).

  3. As per DK's suggestion below, store indices in BarWithBuf (or in a helper struct BarInternal) and add a function fn bar(&self) -> Bar to the implementation of BarWithBuf, which constructs a Bar on-the-fly.
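As a rough illustration of option #2, here is a minimal sketch in which Bar owns the buffer and stores byte ranges instead of references. The toy format (a length byte, then a header, then a payload) and the names `parse`, `header` and `payload` are assumptions made up for this example; they are not part of the original question.

```rust
use std::ops::Range;

/// Hypothetical owning variant of `Bar`: it holds the buffer itself and
/// stores byte ranges into it instead of references.
pub struct Bar {
    buf: Box<[u8]>,
    header: Range<usize>,
    payload: Range<usize>,
}

impl Bar {
    /// Toy parser (assumed format): the first byte is the header length,
    /// the header follows, and everything after it is the payload.
    pub fn parse(buf: Box<[u8]>) -> Option<Bar> {
        let header_len = *buf.first()? as usize;
        let header_end = 1 + header_len;
        if header_end > buf.len() {
            return None; // buffer too short for the declared header
        }
        let header = 1..header_end;
        let payload = header_end..buf.len();
        // No borrow of `buf` is alive here, so moving it is fine.
        Some(Bar { buf, header, payload })
    }

    /// The `buf()` accessor mentioned in option #2.
    pub fn buf(&self) -> &[u8] {
        &self.buf
    }

    /// The ranges were checked in `parse`, and `buf` is never mutated
    /// afterwards, so these slices cannot go out of bounds.
    pub fn header(&self) -> &[u8] {
        &self.buf[self.header.clone()]
    }

    pub fn payload(&self) -> &[u8] {
        &self.buf[self.payload.clone()]
    }
}
```

Because the buffer is private and only exposed immutably, the stored ranges stay valid for the lifetime of the value, which is the point of this option.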

Which of these options is the most appropriate one depends on the actual problem context. I agree that some form of "member-by-member construction" of structs would be immensely helpful in Rust.
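For option #3, a sketch along the following lines might work: BarWithBuf keeps the buffer plus a BarInternal holding only positions, and `bar()` builds a borrowed Bar on demand, tied to the lifetime of `&self`. The fields `name` and `body` and the `name:body` toy format are assumptions for illustration only.

```rust
use std::ops::Range;

/// Borrowed view, as in the question (the two fields are assumed).
pub struct Bar<'a> {
    pub name: &'a [u8],
    pub body: &'a [u8],
}

/// "Portable" form of a `Bar`: positions only, so it borrows nothing.
struct BarInternal {
    name: Range<usize>,
    body: Range<usize>,
}

pub struct BarWithBuf {
    buf: Box<[u8]>,
    internal: BarInternal,
}

impl BarWithBuf {
    /// Toy format (assumed): name and body are separated by the first b':'.
    pub fn new(buf: Box<[u8]>) -> Option<BarWithBuf> {
        let sep = buf.iter().position(|&b| b == b':')?;
        let internal = BarInternal {
            name: 0..sep,
            body: sep + 1..buf.len(),
        };
        Some(BarWithBuf { buf, internal })
    }

    /// Constructs a `Bar` on the fly; it cannot outlive `self`.
    pub fn bar(&self) -> Bar<'_> {
        Bar {
            name: &self.buf[self.internal.name.clone()],
            body: &self.buf[self.internal.body.clone()],
        }
    }
}
```

Note that BarWithBuf needs no lifetime parameter in this design; the lifetime only appears on the temporary Bar returned by `bar()`.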

  • 1
    I've considered the first option, but I don't particularly like it. I have trouble understanding your second suggestion; could you maybe elaborate on that? – fjh Nov 23 '14 at 22:15
  • 1
    @fjh I believe Adrian is suggesting that you wrap the `Box` in a type which has a method that returns a temporary `Bar`. The idea is that you store a "portable" form of the `Bar` (like indices into the boxed data), and the method constructs a `Bar` on-demand, using the lifetime of `&self` to keep things safe. – DK. Nov 24 '14 at 02:52
  • @DK. Not quite, but I clarified option #2 and added your suggestion as option #3. – Adrian Willenbücher Nov 24 '14 at 08:22
  • Thank you! I'll probably just bite the bullet and use `String` and `Vec` instead of referring to the buffer to sidestep this whole issue. Index-based solutions feel a bit hacky and wouldn't really work for me since some of the references into the buffer are `&str`, so I'd have to repeatedly verify the UTF-8 encoding or use unsafe code. – fjh Nov 24 '14 at 19:33