I tried using round-robin bonding (Linux's balance-rr mode) to connect two Linux machines together, as they both support it. While I did get more available bandwidth than a single link, the result was not spectacular.
I bonded together two 10G links, in theory allowing 20 Gbps between the hosts. Sometimes I would indeed get close to this, but a lot of the time I would only get 11-15 Gbps.
The problem is that the packets don't always arrive in order, since each link delivers them with slightly different timing. For TCP, reordering can look like loss: the receiver thinks a packet has gone missing and asks for a retransmit, which lowers overall bandwidth and increases latency. For bulk transfers like file servers you still end up with a higher overall transfer rate than a single link, so it's not the end of the world.
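To make that concrete, here's a rough Python sketch of the duplicate-ACK idea (a toy model, nothing like a real TCP stack): the receiver keeps acknowledging the next packet it still needs, and after three duplicate ACKs the sender retransmits a packet that was merely delayed, not lost.

```python
# Toy model of TCP's duplicate-ACK heuristic (illustrative only, not real TCP).
# Segments are identified by sequence number; the receiver ACKs the next
# sequence number it still needs. Three duplicate ACKs trigger a retransmit.

def simulate(arrival_order):
    expected = 0            # next in-order sequence number the receiver wants
    buffered = set()        # out-of-order segments held by the receiver
    last_ack, dup_acks, retransmits = 0, 0, 0

    for seq in arrival_order:
        if seq == expected:
            expected += 1
            while expected in buffered:        # drain anything already buffered
                buffered.discard(expected)
                expected += 1
        else:
            buffered.add(seq)

        ack = expected                         # cumulative ACK
        if ack == last_ack:
            dup_acks += 1
            if dup_acks == 3:                  # fast-retransmit threshold
                retransmits += 1               # resend a segment that was only
                dup_acks = 0                   # delayed, not actually lost
        else:
            last_ack, dup_acks = ack, 0

    return retransmits

# Segment 0 is delayed behind four later segments (say it went over the other
# link), so the sender sees enough duplicate ACKs to retransmit it anyway.
print(simulate([1, 2, 3, 4, 0]))   # -> 1 spurious retransmit
print(simulate([0, 1, 2, 3, 4]))   # -> 0 when everything arrives in order
```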
However, for protocols that don't guarantee delivery, like UDP, and for applications such as realtime video streaming, it's a disaster. To keep latency as low as possible, these receivers tend to process the latest packet available and discard anything earlier that hasn't arrived yet; they don't want to wait for all the packets to arrive first, as that can introduce a lot of delay. While a lost packet here and there is no problem, with round-robin you can get extremely high numbers of packets arriving out of order, which can completely corrupt the data stream and, for video, make it unwatchable!
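As a sketch of how badly that discard policy interacts with heavy reordering, here's a hypothetical latency-first receiver in Python that plays the newest packet available and drops anything older:

```python
# Hypothetical latency-first receiver: play the newest packet available and
# drop anything that arrives with an older sequence number (typical of
# realtime video/audio pipelines that refuse to wait for stragglers).

def play_stream(arrival_order):
    newest_played = -1
    played, dropped = [], []
    for seq in arrival_order:
        if seq > newest_played:
            newest_played = seq
            played.append(seq)
        else:
            dropped.append(seq)      # too late, the stream has moved on
    return played, dropped

print(play_stream([0, 1, 2, 3, 4, 5, 6, 7]))   # everything plays, nothing dropped
print(play_stream([3, 2, 1, 0, 7, 6, 5, 4]))   # only 3 and 7 play: 75% discarded
```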
Imagine the source machine is sending a single stream out of four bonded NICs in the order NIC1, NIC2, NIC3, NIC4, NIC1, and so on. Now imagine the receiving machine processes the packets in the opposite order, first checking NIC4, then NIC3, NIC2, NIC1, then NIC4 again. The packet it picks up first will be from NIC4, which is the newest, and the next three packets off NIC3, NIC2 and NIC1 will all be older. That's 75% of the packets arriving out of order, which could cause a realtime video stream to discard 75% of its packets, easily making it unwatchable.
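A quick simulation of that exact scenario (hypothetical NIC queues, with the receiver polling them in the reverse order described above) shows where the 75% figure comes from:

```python
from collections import deque

# The sender deals packets 0..N-1 out across four NIC queues round-robin:
# NIC1 gets 0, 4, 8, ...; NIC2 gets 1, 5, 9, ...; and so on.
NUM_NICS, N = 4, 20
nics = [deque() for _ in range(NUM_NICS)]
for seq in range(N):
    nics[seq % NUM_NICS].append(seq)

# The receiver services the NICs in the opposite order: NIC4, NIC3, NIC2, NIC1, ...
received = []
while any(nics):
    for nic in reversed(nics):
        if nic:
            received.append(nic.popleft())

print(received)   # [3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, ...]

# Count packets that turn up after a newer packet has already been seen.
newest, late = -1, 0
for seq in received:
    if seq < newest:
        late += 1
    else:
        newest = seq
print(f"{late / N:.0%} of packets arrive after a newer one")   # -> 75%
```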
The only way this can work is if the receiving hardware is designed to store the packets until a complete set has arrived and then forward them on in order, which increases latency. It would also mean a switch has to decide how long to wait before declaring a packet lost and giving up on it, so before you know it you're implementing a cut-down, TCP-style guaranteed-delivery protocol on top of Ethernet. Then you have to consider what happens when someone tries to bond one 10G link and four 1G links, hoping to get 14G. The 10G link will deliver packets so much faster that the ones coming in over the 1G links look lost, even though they are just taking ten times longer to transmit. So how do you tell the difference between a packet that has been lost and one that is simply taking ages to arrive across a slower link?
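To see how quickly that turns into a mini reliability protocol, here's a hypothetical reorder-buffer sketch in Python: it holds packets back until the missing sequence numbers turn up, and it has to pick an arbitrary timeout after which it declares them lost and moves on.

```python
import time

# Hypothetical reorder buffer a switch/NIC would need for round-robin bonding:
# hold out-of-order packets, release them in sequence, and give up on a missing
# sequence number after a timeout -- at which point you are halfway to
# reimplementing TCP's sequencing and loss detection on top of Ethernet.

class ReorderBuffer:
    def __init__(self, loss_timeout=0.010):   # 10 ms is an arbitrary guess:
        self.loss_timeout = loss_timeout      # too short for slow links,
        self.next_seq = 0                     # too long for realtime traffic
        self.pending = {}                     # seq -> packet held back
        self.waiting_since = None             # when we started waiting for next_seq

    def push(self, seq, packet):
        """Accept a packet and return whatever is now safe to forward, in order."""
        self.pending[seq] = packet
        return self._drain()

    def _drain(self):
        out = []
        while True:
            if self.next_seq in self.pending:
                out.append(self.pending.pop(self.next_seq))
                self.next_seq += 1
                self.waiting_since = None
            elif self.pending:                 # a gap: start (or keep) waiting
                if self.waiting_since is None:
                    self.waiting_since = time.monotonic()
                if time.monotonic() - self.waiting_since > self.loss_timeout:
                    self.next_seq += 1         # declare it lost... or was it
                    self.waiting_since = None  # just slow on a 1G link?
                    continue
                break
            else:
                break
        return out

buf = ReorderBuffer()
print(buf.push(1, "pkt1"))   # []                 -> held back, waiting for pkt0
print(buf.push(0, "pkt0"))   # ['pkt0', 'pkt1']   -> released in order
```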
Long story short, it's complicated to make it work.
The 40G QSFP standard uses four 10 Gbps lanes in parallel to achieve 40 Gbps of bandwidth, but this only works because the splitting isn't done at the packet level. The sending switch splits each packet across the four lanes in parallel and the receiving switch reassembles it at the far end, which keeps every packet in order and the latency low. This is really the only way to do it properly, but it means coming up with a new standard, as they did with QSFP, because the data going over the wire is no longer standard Ethernet and isn't backwards compatible.
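As a heavily simplified illustration (as I understand it, real 40G Ethernet stripes 64b/66b encoded blocks across the lanes with alignment markers, not raw bytes as done here), splitting each frame below the packet level makes ordering a non-issue, because every lane carries slices of the same packet and nothing is forwarded until it has been reassembled:

```python
# Simplified picture of lane striping: each frame is chopped into fixed-size
# chunks that are dealt out across four lanes, and the far end re-interleaves
# them in the same order, so whole packets come out exactly as they went in.

LANES = 4
CHUNK = 8  # bytes per chunk per lane (arbitrary for this sketch)

def stripe(frame: bytes) -> list[list[bytes]]:
    """Deal the frame's chunks out across the lanes, round-robin."""
    chunks = [frame[i:i + CHUNK] for i in range(0, len(frame), CHUNK)]
    lanes = [[] for _ in range(LANES)]
    for i, chunk in enumerate(chunks):
        lanes[i % LANES].append(chunk)
    return lanes

def reassemble(lanes: list[list[bytes]]) -> bytes:
    """Re-interleave the chunks in the order they were dealt out."""
    out = bytearray()
    for i in range(max(len(lane) for lane in lanes)):
        for lane in lanes:
            if i < len(lane):
                out += lane[i]
    return bytes(out)

frame = bytes(range(100))
assert reassemble(stripe(frame)) == frame   # the packet survives intact, in order
```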