
My professor asked me the following question:

Using SIMD instructions and pure assembly, write a function that efficiently adds integer vectors of any size. Compare the results with those obtained from the C++ code. The max evaluation for this work is 16 values.

I already have the C++ code; I just need help writing the assembly part.

#include <cstdlib>
#include <iostream>
using namespace std;

// Assembly routine still to be written; it is not called yet.
extern "C" {
    void conta(int*, int*, int*, int);
}

int main()
{
    int arrayA[] = { 2, 2, 2, 2 };
    int arrayB[] = { 3, 3, 3, 3 };
    int elementos = sizeof(arrayA) / sizeof(arrayA[0]);
    int* Finalarray = (int*)malloc(elementos * sizeof(int));

    // Scalar reference sum, printed one element at a time.
    for (int i = 0; i < elementos; i++) {
        Finalarray[i] = arrayA[i] + arrayB[i];
        cout << Finalarray[i] << ' ';
    }
    cout << '\n';

    free(Finalarray);
    return 0;
}
  • Very good, you're halfway done! – Eljay May 10 '22 at 13:39
  • Paste it to godbolt.org and enable vectorization flags then look at assembly codes. Then write them yourself with proper parameters. – huseyin tugrul buyukisik May 10 '22 at 13:42
  • Same as [How to sum two integer vectors in asm](https://stackoverflow.com/q/72187160) - likely the same assignment. See comments there: You didn't specify what ISA. Hand-written asm is obviously different for x86 with AVX-512 vs. AArch64 with ASIMD vs. AArch64 with SVE or RISC-V with its similar scalable vector-length extension. Also, obviously printing inside the sum loop will stop a compiler from auto-vectorizing so unlike your classmate(?), this one isn't quite a duplicate of [How to vectorize with gcc?](https://stackoverflow.com/q/409300). It's still homework with no attempt made. – Peter Cordes May 10 '22 at 13:43
  • _"integer vectors"_ There are no vectors in your code. _"I already have the C++ code"_ No, you have C code which happens to be compatible with C++. – Revolver_Ocelot May 10 '22 at 13:44
  • The max array size to optimize for is 16 values? That's very short so hand-written asm that minimizes overhead for odd lengths (and shorter than a full vector width) is going to be important. That's a potentially interesting problem, since it's less trivial to solve. [SIMD array add for arbitrary array lengths](https://stackoverflow.com/q/10167564) is an x86 intrinsics version of that. – Peter Cordes May 10 '22 at 13:45
  • @Revolver_Ocelot: What C implementations do you know that accept `extern "C"`, or have `cout < – Peter Cordes May 10 '22 at 13:47
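
The comments point out two things: no ISA was specified, and short arrays need a tail path for lengths that are not a multiple of the vector width. As a starting point, here is a minimal sketch of that structure, assuming x86 with SSE2 (128-bit vectors, four 32-bit ints each) and written with C++ intrinsics rather than the pure assembly the assignment asks for. The name `conta_sketch` is hypothetical, and the argument order (two inputs, output, element count) is an assumption, since the question's `conta` declaration does not name its parameters. In hand-written asm the three intrinsics would roughly correspond to `movdqu`, `paddd`, and `movdqu`.

```cpp
#include <cassert>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical stand-in for the assembly routine `conta`.
// Assumed argument order: (a, b, out, n); the question does not name them.
extern "C" void conta_sketch(const int* a, const int* b, int* out, int n) {
    assert(n >= 0);
    int i = 0;
#if defined(__SSE2__)
    // Main loop: add four 32-bit ints per 128-bit vector, unaligned loads/stores.
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i),
                         _mm_add_epi32(va, vb));
    }
#endif
    // Scalar tail: the remaining n % 4 elements (or all of them without SSE2).
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Calling it with two 7-element arrays, for example, takes one pass through the vector loop and then three iterations of the scalar tail. Comparing its output element by element against the plain C++ loop from the question is one way to do the requested comparison.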

0 Answers