186

I am trying to write a simple TCP/IP client in Rust and I need to print out the buffer I got from the server.

How do I convert a Vec<u8> (or a &[u8]) to a String?

Peter Mortensen
Athabaska Dick

4 Answers

195

To convert a slice of bytes to a string slice (assuming a UTF-8 encoding):

use std::str;

//
// pub fn from_utf8(v: &[u8]) -> Result<&str, Utf8Error>
//
// Assuming buf: &[u8]
//

fn main() {

    let buf = &[0x41u8, 0x41u8, 0x42u8];

    let s = match str::from_utf8(buf) {
        Ok(v) => v,
        Err(e) => panic!("Invalid UTF-8 sequence: {}", e),
    };

    println!("result: {}", s);
}

The conversion is in-place and does not require an allocation, although it does scan the bytes to validate them. You can create a String from the string slice if necessary by calling .to_owned() on it (.to_string() and String::from also work).
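As a minimal sketch of the two steps above, assuming the bytes arrive in a Vec<u8>: the vector is borrowed as a slice, validated, and then copied into an owned String. The helper name `bytes_to_string` is hypothetical, not something from the standard library.

```rust
use std::str;

// Hypothetical helper: validates the bytes as UTF-8 and copies them into a String.
// It returns the Utf8Error instead of panicking, so the caller decides what to do.
fn bytes_to_string(buf: &[u8]) -> Result<String, str::Utf8Error> {
    // from_utf8 scans the whole slice to validate it, but does not allocate.
    let s: &str = str::from_utf8(buf)?;
    // to_owned() is the step that allocates and copies.
    Ok(s.to_owned())
}

fn main() {
    let received: Vec<u8> = vec![0x41, 0x41, 0x42];
    // &received coerces from &Vec<u8> to &[u8].
    match bytes_to_string(&received) {
        Ok(s) => println!("result: {}", s),
        Err(e) => println!("Invalid UTF-8 sequence: {}", e),
    }
}
```

Returning a Result here is only one design choice; panicking, as in the example above, is reasonable when invalid input really is a bug.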

The library reference for the conversion function is the documentation for std::str::from_utf8.

Shepmaster
gavinb
  • You may want to add that this is possible because Vec coerces to slices – torkleyy Apr 13 '17 at 07:49
  • 4
    Although it's true that `from_utf8` doesn't allocate, it may be worth mentioning that it needs to scan the data to validate utf-8 correctness. So this is not an O(1) operation (which one may think at first) – Zargony Jan 24 '19 at 12:16
125

I prefer String::from_utf8_lossy:

fn main() {
    let buf = &[0x41u8, 0x41u8, 0x42u8];
    let s = String::from_utf8_lossy(buf);
    println!("result: {}", s);
}

It replaces invalid UTF-8 sequences with � (U+FFFD), so no error handling is required. That's good for when you don't need to detect invalid input, and I rarely do. Note that you get a Cow<str> from this rather than a String, but it prints the same way, so it should make printing out what you're getting from the server a little easier.

If you need an owned String, call the into_owned() method on the result, since it's clone-on-write.
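To make the clone-on-write behaviour concrete, here is a small sketch. It assumes nothing beyond the standard library; the helper name `lossy_string` is made up for illustration.

```rust
use std::borrow::Cow;

// Hypothetical helper: always returns an owned String, replacing invalid bytes.
fn lossy_string(buf: &[u8]) -> String {
    String::from_utf8_lossy(buf).into_owned()
}

fn main() {
    // Valid UTF-8: the Cow just borrows the input, so nothing is allocated.
    let valid = b"AAB";
    let s = String::from_utf8_lossy(valid);
    assert!(matches!(s, Cow::Borrowed(_)));

    // Invalid UTF-8: a new String is built, with U+FFFD in place of the bad byte.
    let invalid = [0x41u8, 0xff, 0x42];
    let t = String::from_utf8_lossy(&invalid);
    assert!(matches!(t, Cow::Owned(_)));
    println!("result: {}", t);

    // into_owned() turns either variant into a plain String that can be returned.
    let owned: String = lossy_string(&invalid);
    println!("result: {}", owned);
}
```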

Shepmaster
Bjorn
  • 5
    Thanks a lot for the `into_owned()` suggestion! Was exactly what I was looking for (this makes it become a proper `String` which you can return as a return value from a method, for example). – Per Lundberg Nov 25 '16 at 16:34
  • 1
    � is Unicode U+FFFD (UTF-8 sequence 0xEF 0xBF 0xBD (octal 357 277 275)), '[REPLACEMENT CHARACTER](https://www.utf8-chartable.de/unicode-utf8-table.pl?start=65280)'. In some text editors it can be searched for in regular expression mode by `\x{FFFD}`. – Peter Mortensen Apr 10 '21 at 19:18
76

If you actually have a vector of bytes (Vec<u8>) and want to convert it to a String, the most efficient approach is to reuse the allocation with String::from_utf8:

fn main() {
    let bytes = vec![0x41, 0x42, 0x43];
    let s = String::from_utf8(bytes).expect("Found invalid UTF-8");
    println!("{}", s);
}
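If panicking on bad input is not acceptable, the error returned by String::from_utf8 hands the original bytes back. The following is a sketch under that assumption; `try_into_string` is a hypothetical helper name, and the pointer comparison only illustrates that the buffer is reused rather than copied.

```rust
// Hypothetical helper: on failure, returns the untouched bytes instead of panicking.
fn try_into_string(bytes: Vec<u8>) -> Result<String, Vec<u8>> {
    // FromUtf8Error::into_bytes gives back the Vec that was passed in.
    String::from_utf8(bytes).map_err(|e| e.into_bytes())
}

fn main() {
    let bytes: Vec<u8> = vec![0x41, 0x42, 0x43];
    let ptr_before = bytes.as_ptr();
    let s = try_into_string(bytes).expect("Found invalid UTF-8");
    // The String took over the Vec's buffer rather than copying it.
    assert_eq!(ptr_before, s.as_ptr());
    println!("{}", s);

    // Invalid input: the raw bytes are still available for other handling.
    match try_into_string(vec![0x41, 0xff]) {
        Ok(s) => println!("{}", s),
        Err(raw) => println!("not UTF-8, raw bytes: {:?}", raw),
    }
}
```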
Shepmaster
  • 2
    Edit: Note that, as mentioned by @Bjorn Tipling, you might think you can use `String::from_utf8_lossy` instead here, so that you don't need the `expect` call, but the input to that is a slice of bytes (`&'a [u8]`). OTOH, there's also `from_utf8_unchecked`. "If you are sure that the byte slice is valid UTF-8, and you don't want to incur the overhead of the conversion, there is an unsafe version of this function [`from_utf8_lossy`], `from_utf8_unchecked`, which has the same behavior but skips the checks." – James Ray Jan 23 '19 at 09:22
  • Note that you can use `&vec_of_bytes` to convert back into a slice of bytes, as listed in the examples of `from_utf8_lossy`: https://doc.rust-lang.org/std/string/struct.String.html#method.from_utf8_lossy – James Ray Jan 23 '19 at 09:32
  • @JamesRay is there a way to get the behavior of `from_utf8_lossy` without reallocating? If I start with a `Vec` and then take a reference to it before converting it to a string as in `String::from_utf8_lossy(&my_vec)` I will end up reallocating memory when I don't actually need to. – Michael Dorst Dec 07 '21 at 06:49
  • Oh nevermind. `from_utf8_lossy` returns a `Cow`, not a String. If there are no invalid characters then it won't reallocate, but if there are it will. – Michael Dorst Dec 07 '21 at 06:55
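One way to combine the two approaches discussed in these comments is sketched below: take ownership of the Vec so that valid input is converted without copying, and fall back to the lossy conversion only when validation fails. The helper name `vec_to_string_lossy` is hypothetical.

```rust
// Hypothetical helper: lossy conversion that reuses the Vec's buffer when it can.
fn vec_to_string_lossy(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        // Valid UTF-8: the String reuses the Vec's allocation.
        Ok(s) => s,
        // Invalid UTF-8: build a new String with U+FFFD replacing the bad sequences.
        Err(e) => String::from_utf8_lossy(e.as_bytes()).into_owned(),
    }
}

fn main() {
    println!("{}", vec_to_string_lossy(vec![0x41, 0x42, 0x43]));
    println!("{}", vec_to_string_lossy(vec![0x41, 0xff, 0x42]));
}
```

The cost of this design is that invalid input is scanned twice, once by from_utf8 and once by from_utf8_lossy.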
5

In my case I just needed to turn the numbers themselves into a string, not decode them into letters according to some encoding, so I did:

fn main() {
    let bytes = vec![0x41, 0x42, 0x43];
    let s = format!("{:?}", &bytes);
    println!("{}", s);
}
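For debugging a network buffer, a hex rendering is often easier to read than decimal. As a sketch along the same lines, the standard formatter also accepts `{:x?}`, and a small helper (the name `to_hex` is made up here) can produce a space-separated dump.

```rust
// Hypothetical helper: renders each byte as two lowercase hex digits, separated by spaces.
fn to_hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<String>>()
        .join(" ")
}

fn main() {
    let bytes = vec![0x41u8, 0x42, 0x43];
    println!("{:?}", bytes); // prints [65, 66, 67]
    println!("{:x?}", bytes); // prints [41, 42, 43]
    println!("{}", to_hex(&bytes)); // prints 41 42 43
}
```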
PPP