
I know that Swift uses copy-on-write for arrays, but does it do this for all structs? For example:

struct Point {
   var x:Float = 0
}

var p1 = Point()
var p2 = p1 //p1 and p2 share the same data under the hood
p2.x += 1 //p2 now has its own copy of the data
Hamish
gloo
  • @vadian how do you know? – matt Apr 19 '17 at 04:40
  • Nitpick: This behaviour is a property of the Swift compiler, not of the Swift language. So long as the program behaviour is in line with the language specification, the compiler is free to do what it sees fit – Alexander Apr 19 '17 at 04:56

2 Answers


Array is implemented with copy-on-write behaviour – you'll get it regardless of any compiler optimisations (although of course, optimisations can decrease the number of cases where a copy needs to happen).

At a basic level, Array is just a structure that holds a reference to a heap-allocated buffer containing the elements – therefore multiple Array instances can reference the same buffer. When you come to mutate a given array instance, the implementation will check if the buffer is uniquely referenced, and if so, mutate it directly. Otherwise, the array will perform a copy of the underlying buffer in order to preserve value semantics.
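
The buffer sharing described above can be observed with a small sketch. The helper `storageAddress(of:)` is a hypothetical name introduced here for illustration, not part of the standard library; it only reads the address of the array's element storage through `withUnsafeBufferPointer`. The exact addresses will vary, so only equality is compared.

```swift
// Hypothetical helper: returns the address of the array's element storage
// as an integer (0 if the array has no storage to point at).
func storageAddress(of array: [Int]) -> Int {
    array.withUnsafeBufferPointer { Int(bitPattern: $0.baseAddress) }
}

let a = [1, 2, 3]
var b = a  // no element copy yet: both arrays reference the same buffer

let sharedBefore = storageAddress(of: a) == storageAddress(of: b)

b.append(4)  // the buffer is not uniquely referenced, so b copies before mutating

let sharedAfter = storageAddress(of: a) == storageAddress(of: b)

print(sharedBefore, sharedAfter)  // expected: true false
print(a, b)                       // [1, 2, 3] [1, 2, 3, 4]
```

Comparing storage addresses is only a debugging aid; program logic should rely on the value semantics of `Array`, not on when a copy happens.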

However, with your Point structure – you're not implementing copy-on-write at a language level. Of course, as @Alexander says, this doesn't stop the compiler from performing all sorts of optimisations to minimise the cost of copying whole structures about. These optimisations needn't follow the exact behaviour of copy-on-write though – the compiler is simply free to do whatever it wishes, as long as the program runs according to the language specification.

In your specific example, both p1 and p2 are global, therefore the compiler needs to make them distinct instances, as other .swift files in the same module have access to them (although this could potentially be optimised away with whole-module optimisation). However, the compiler still doesn't need to copy the instances – it can just evaluate the floating-point addition at compile-time and initialise one of the globals with 0.0, and the other with 1.0.

And if they were local variables in a function, for example:

struct Point {
    var x: Float = 0
}

func foo() {
    var p1 = Point()
    var p2 = p1
    p2.x += 1
    print(p2.x)
}

foo()

The compiler doesn't even have to create two Point instances to begin with – it can just create a single floating-point local variable initialised to 1.0, and print that.

Regarding passing value types as function arguments, for large enough types and (in the case of structures) functions that utilise enough of their properties, the compiler can pass them by reference rather than copying. The callee can then make a copy of them only if needed, such as when needing to work with a mutable copy.

In other cases where structures are passed by value, it's also possible for the compiler to specialise functions in order to only copy across the properties that the function needs.

For the following code:

struct Point {
    var x: Float = 0
    var y: Float = 1
}

func foo(p: Point) {
    print(p.x)
}

var p1 = Point()
foo(p: p1)

Assuming foo(p:) isn't inlined by the compiler (it will be in this example, but once a function's implementation reaches a certain size, the compiler won't consider inlining worthwhile) – the compiler can specialise the function as:

func foo(px: Float) {
    print(px)
}

foo(px: 0)

It only passes the value of Point's x property into the function, thereby saving the cost of copying the y property.

So the compiler will do whatever it can in order to reduce the copying of value types. But with so many various optimisations in different circumstances, you cannot simply boil the optimised behaviour of arbitrary value types down to just copy-on-write.

Community
Hamish
  • So in Xcode with whole module optimization turned on, if I create a struct with `var` and then pass it around to a bunch of functions that do NOT mutate the struct will Xcode optimize away all those copies? – gloo Apr 19 '17 at 22:30
  • @gloo It depends on the functions and the structure, but yes, it's fully possible – just found out (by going through the IR for an optimised build) that for large enough structures, Swift can pass them by reference to functions, therefore fully eliminating the copying (that is, until the callee mutates a copy). But with so many various optimisations and corner cases where they cannot be applied, you cannot simply boil the behaviour down to copy-on-write. Is there an actual performance bottleneck you're worried about, or are you just curious? – Hamish Apr 19 '17 at 22:57
  • Well I wrote a game engine in swift/metal. I pass around a lot of structs that represent drawing commands to be consumed by the GPU and current frame data. At the time I thought all my structures would employ COW to avoid wasted copies, but then I learned that there was actually a lot of disagreement over what Xcode actually does. So I became worried my engine was not as optimized as I thought. My game runs at 60fps so right now it is not an issue, just worried it won't scale well for future projects. – gloo Apr 19 '17 at 23:38
  • @gloo If it's not currently a performance bottleneck – I really wouldn't worry about it. As said, the compiler is able to perform lots of optimisations to reduce the amount of copying of value types. If it becomes a problem later down the line, you can relatively easily refactor your structure(s) to use copy-on-write; but you should only do so after identifying it as an issue when profiling, and after seeing that making the change actually boosts performance... – Hamish Apr 20 '17 at 09:58
  • as implementing copy-on-write at a language level requires references, and therefore comes with the cost of both heap allocation and reference counting. Attempting to change your logic now without knowing for certain whether you're making things better or worse would be counterproductive. – Hamish Apr 20 '17 at 09:58
  • @Hamish CoW also requires adding a branch (to check if a copy is necessary) on every mutating method. Between branch prediction and speculative execution, I'm not sure how it would play out, but I'm reasonably certain that it would be slower than unconditionally copying all small structs. – Alexander Jun 14 '19 at 17:07

Swift COW

By default, value types do not get a COW (copy-on-write) mechanism. Some standard library types, such as the collections, implement it themselves.

How to check an address

import Foundation // needed for NSString

// Returns the memory address behind the pointer as a hex string
func address(_ object: UnsafeRawPointer) -> String {
    let address = Int(bitPattern: object)
    return NSString(format: "%p", address) as String
}

Default behaviour

struct A {
    var value: Int = 0
}

//Default behavior(COW is not used)
var a1 = A()
var a2 = a1

//different addresses
print(address(&a1)) //0x7ffee48f24a8
print(address(&a2)) //0x7ffee48f24a0

//COW is not used
a2.value = 1
print(address(&a2)) //0x7ffee48f24a0

Collection as exception

//collection (COW is implemented)
var collection1 = [A()]
var collection2 = collection1

//same addresses (for an array, the pointer refers to the shared element buffer)
print(address(&collection1)) //0x600000c2c0e0
print(address(&collection2)) //0x600000c2c0e0

//COW is used
collection2.append(A())
print(address(&collection2)) //0x600000c2c440

The next problem: if a value type holds a reference type, copying the value type copies only the reference, so the heap object it points to is shared between the copies

//problem with reference to heap
class C {
    var value: Int = 0
}

struct B {
    var c: C
}

var b1 = B(c: C())
var b2 = b1

print(address(&b1)) //0x7ffeebd443d0
print(address(&b2)) //0x7ffeebd443c8

b2.c.value = 1
print(address(&b2)) //0x7ffeebd443c8

print(b1.c.value) //1 //<- is changed also
print(b2.c.value) //1

The solution is to write your own implementation of COW

//Adding COW
final class Ref<T> {
    var value: T
    init(value: T) {
        self.value = value
    }
}

struct Box<T> {
    var reference: Ref<T>
    init(interior: T) {
        self.reference = Ref(value: interior)
    }
    
    var value: T {
        get {
            return reference.value
        }
        
        set {
            //true when this box is the only owner of the reference,
            //e.g. for `box1.value = 1` before box1 has been copied
            if (isKnownUniquelyReferenced(&self.reference)) {
                self.reference.value = newValue
            } else {
                self.reference = Ref(value: newValue)
            }
        }
    }
}

var box1 = Box(interior: 0)
var box2 = box1

//different addresses: box1 and box2 are separate structs that share one Ref
print(address(&box1)) //0x7ffee11b53d0
print(address(&box2)) //0x7ffee11b53c0

//COW is used
box2.value = 1
print(address(&box2)) //0x7ffee11b53c0

print(box1.value) //0 // <-
print(box2.value) //1
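
The struct addresses printed above do not show whether two boxes share their heap object, since each struct always has its own location. A minimal sketch that checks the shared reference directly with `===` is given here. The names `Storage`, `CowBox` and `sharesStorage(with:)` are hypothetical, chosen for this illustration so they do not clash with `Ref` and `Box` above; the write path follows the same `isKnownUniquelyReferenced` pattern.

```swift
// Hypothetical heap-allocated storage, playing the role of Ref above.
final class Storage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

// Hypothetical copy-on-write wrapper, playing the role of Box above.
struct CowBox<T> {
    private var storage: Storage<T>

    init(_ value: T) { storage = Storage(value) }

    var value: T {
        get { storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                // Sole owner: mutate the existing heap object in place.
                storage.value = newValue
            } else {
                // Shared: allocate fresh storage so other copies are unaffected.
                storage = Storage(newValue)
            }
        }
    }

    // True when both boxes currently point at the same heap object.
    func sharesStorage(with other: CowBox<T>) -> Bool {
        storage === other.storage
    }
}

let first = CowBox(0)
var second = first

let sharedBeforeWrite = first.sharesStorage(with: second)
second.value = 1  // storage is shared, so the setter allocates a new Storage
let sharedAfterWrite = first.sharesStorage(with: second)

print(sharedBeforeWrite, sharedAfterWrite)  // true false
print(first.value, second.value)            // 0 1
```

Keeping the storage `private` is a design choice in this sketch: it stops callers from mutating the shared object directly and bypassing the uniqueness check.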
yoAlex5
  • also, see: https://github.com/apple/swift/blob/main/docs/OptimizationTips.rst#advice-use-copy-on-write-semantics-for-large-values – Honghao Zhang Mar 30 '22 at 20:54