How can I pass a str value (containing 3000 {'0', '1'} bytes) obtained using python code as an argument to a python c extended function (extended using SWIG) which requires int * (fixed length int array) as an input argument? My code is such:

int *exposekey(int *bits) {
    int a[1000];
    for (int j=2000; j < 3000; j++) {
        a[j - 2000] = bits[j];
    }
    return a;
}

What I've tried was to use ctypes (see below code):

import ctypes
ldpc = ctypes.cdll.LoadLibrary('./_ldpc.so')
arr = (ctypes.c_int * 3072)(<mentioned below>)
ldpc.exposekey(arr)

with 3072 {0, 1} entered in the position. Python returns syntax error : more than 255 arguments. This still doesn't help me to pass assigned str value instead of the initialized ctypes int array.

Other suggestion included using SWIG typemaps but how would that work for converting a str into int * ? Thanks in advance.

CristiFati
4am
  • Returning local arrays from functions is a tricky thing since they reside on the stack, and get destroyed when going out of scope, and thus later on when the returned address will be dereferenced, you'll most likely get a _segfault_. Either make it `static`, or dynamically allocate it, or add it as a 2nd (output) argument to your function. I'd go with #3. – CristiFati Nov 14 '17 at 14:29

1 Answer

Regarding my comment, here are some more details about returning arrays from functions: [SO]: Returning an array using C. In short, there are 3 ways to handle this:

  1. Make the returned variable static
  2. Dynamically allocate it (using malloc (family) or new)
  3. Turn it into an additional argument for the function

Getting that piece of C code to run within the Python interpreter is possible in 2 ways: ctypes and swig.

Since they both do the same thing, mixing them together makes no sense. So, pick the one that best fits your needs.


1. ctypes

  • This is what you started with
  • It's one of the ways of doing things using ctypes

ctypes_demo.c:

#include <stdio.h>

#if defined(_WIN32)
#  define CTYPES_DEMO_EXPORT_API __declspec(dllexport)
#else
#  define CTYPES_DEMO_EXPORT_API
#endif


CTYPES_DEMO_EXPORT_API int exposekey(char *bitsIn, char *bitsOut) {
    int ret = 0;
    printf("Message from C code...\n");
    for (int j = 0; j < 1000; j++)
    {
        bitsOut[j] = bitsIn[j + 2000];
        ret++;
    }
    return ret;
}

Notes:

  • Based on comments, I changed the types in the function from int* to char*, because it's 4 times more compact (although it still wastes 7 of the 8 bits in each char, since only one bit per char carries information; that can be fixed, but requires bitwise processing)
  • I turned a into the 2nd argument (bitsOut). I think this is best because it makes the caller responsible for allocating and deallocating the array (the 3rd option from the beginning)
  • I also modified the index range (without changing functionality), because it makes more sense to work with low index values and add an offset in one place, instead of working with high index values and subtracting the same offset in another place
  • The return value is the number of elements copied (obviously, 1000 in this case), but it's just an example
  • printf is just a dummy call, to show that the C code gets executed
  • When dealing with such arrays, it's recommended to pass their dimensions as well, to avoid out of bounds errors. Also, error handling is an important aspect
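The "bitwise processing" mentioned in the first note can be sketched on the Python side as follows. This is only an illustration, not part of the code above: pack_bits and unpack_bits are hypothetical helper names, the MSB-first bit order is an assumption, and the C function would have to be changed to read individual bits before it could consume such a buffer.

```python
def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes, 8 bits per byte (MSB first)."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    # Right-pad with zeros so the length becomes a multiple of 8.
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def unpack_bits(data: bytes, count: int) -> str:
    """Inverse of pack_bits: recover the first `count` bits as a '0'/'1' string."""
    return "".join(format(byte, "08b") for byte in data)[:count]


bits_string = "010011000110101110101110101010010111011101101010101"
packed = pack_bits(bits_string)
print(len(bits_string), "bits ->", len(packed), "bytes")
print(unpack_bits(packed, len(bits_string)) == bits_string)
```

The bit count has to travel alongside the packed buffer, because the zero padding in the last byte is otherwise indistinguishable from real data.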

test_ctypes.py:

from ctypes import CDLL, c_char_p, c_int, create_string_buffer


bits_string = "010011000110101110101110101010010111011101101010101"


def main():
    dll = CDLL("./ctypes_demo.dll")
    exposekey = dll.exposekey

    exposekey.argtypes = [c_char_p, c_char_p]
    exposekey.restype = c_int

    # Size the buffer to 3000 bytes, because the C code reads indexes 2000..2999
    bits_in = create_string_buffer(b"\0" * 2000 + bits_string.encode(), 3000)
    bits_out = create_string_buffer(1000)
    print("Before: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    ret = exposekey(bits_in, bits_out)
    print("After: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    print("Return code: {}".format(ret))


if __name__ == "__main__":
    main()

Notes:

  • 1st, I want to mention that running your code didn't raise the error you got
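  • Specifying the function's argtypes and restype is strongly recommended (without them ctypes assumes an int return value and has to guess how to convert the arguments), and it also makes things easier (documented in the ctypes tutorial)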
  • I am printing the bits_out array (only the first - and relevant - part, as the rest are 0) in order to prove that the C code did its job
  • I initialize bits_in array with 2000 dummy 0 at the beginning, as those values are not relevant here. Also, the input string (bits_string) is not 3000 characters long (for obvious reasons). If your bits_string is 3000 characters long you can simply initialize bits_in like: bits_in = create_string_buffer(bits_string.encode())
  • Do not forget to initialize bits_out to an array with a size large enough (in our example 1000) for its purpose, otherwise segfault might arise when trying to set its content past the size
  • For this (simple) function, the ctypes variant was easier (at least for me, since I don't use swig frequently), but for more complex functions / projects ctypes becomes cumbersome and switching to swig would be the right thing to do
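If the C function keeps its original int* signature, a Python str can still be converted into a ctypes int array without writing thousands of literal arguments. The sketch below is a hedged illustration: bits_to_int_array, KEY_OFFSET and TOTAL_BITS are names made up for this example, and the values 2000 / 3000 are taken from the index range in the question.

```python
import ctypes

KEY_OFFSET = 2000   # assumed index where the C function starts reading
TOTAL_BITS = 3000   # assumed array length (the C code reads indexes 2000..2999)


def bits_to_int_array(bits: str, offset: int = KEY_OFFSET, total: int = TOTAL_BITS):
    """Return a zero-filled ctypes int array holding the 0/1 values of `bits` at `offset`."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    if offset + len(bits) > total:
        raise ValueError("bits do not fit in the array")
    arr = (ctypes.c_int * total)()  # zero-initialized, no literal values needed
    for i, char in enumerate(bits):
        arr[offset + i] = ord(char) - ord("0")
    return arr


arr = bits_to_int_array("0101")
print(len(arr), list(arr[1998:2006]))
```

Such an array could then be passed to a function whose first entry in argtypes is ctypes.POINTER(ctypes.c_int), since ctypes accepts an array instance where a pointer to its element type is expected.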

Output (running with Python3.5 on Win):

c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_ctypes.py
Before: [                                                   ]
Message from C code...
After: [010011000110101110101110101010010111011101101010101]
Return code: 1000


2. swig

  • Almost everything from the ctypes section applies here as well

swig_demo.c:

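#include <stdio.h>
#include <stdlib.h>
#include "swig_demo.h"


char *exposekey(char *bitsIn) {
    /* One extra byte for the terminating NUL, so the result is a valid C string */
    char *bitsOut = (char*)malloc(sizeof(char) * 1001);
    if (bitsOut == NULL)
        return NULL;
    printf("Message from C code...\n");
    for (int j = 0; j < 1000; j++) {
        bitsOut[j] = bitsIn[j + 2000];
    }
    bitsOut[1000] = '\0';
    return bitsOut;
}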

swig_demo.i:

%module swig_demo
%{
#include "swig_demo.h"
%}

%newobject exposekey;
%include "swig_demo.h"

swig_demo.h:

char *exposekey(char *bitsIn);

Notes:

  • Here I'm allocating the array and returning it (the 2nd option from the beginning)
  • The .i file is a standard swig interface file
    • Defines the module, and its exports (via %include)
    • One thing that is worth mentioning is the %newobject directive, which tells swig that exposekey returns newly allocated memory, so the generated wrapper deallocates the returned pointer after converting it, avoiding memory leaks
  • The .h file just contains the function declaration, in order to be included by the .i file (it's not mandatory, but things are more elegant this way)
  • The rest is pretty much the same

test_swig.py:

from swig_demo import exposekey

bits_in = "010011000110101110101110101010010111011101101010101"


def main():
    bits_out = exposekey("\0" * 2000 + bits_in)
    print("C function returned: [{}]".format(bits_out))


if __name__ == "__main__":
    main()

Notes:

  • Things make much more sense from a Python programmer's point of view
  • Code is a lot shorter (that is because swig did some "magic" behind the scenes):
    • The .c wrapper file generated from the .i file is ~120K
    • The swig_demo.py generated module has ~3K
  • I used the same technique with 2000 0 at the beginning of the string
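Since the C function always reads indexes 2000..2999, the string handed to it should be at least 3000 characters long. A small helper can take care of the padding and reject oversized input. This is a sketch under assumptions: call_exposekey is a hypothetical helper, and fake_exposekey is a pure Python stand-in used only so the example runs without the compiled module; with the real module, the imported exposekey would be passed instead.

```python
def call_exposekey(func, bits, offset=2000, total=3000):
    """Pad `bits` with NUL characters to `total` length (starting at `offset`) and call `func`."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    if offset + len(bits) > total:
        raise ValueError("bits do not fit in the expected input length")
    padded = ("\0" * offset + bits).ljust(total, "\0")
    return func(padded)


def fake_exposekey(bits_in):
    # Stand-in for the C function: take 1000 characters starting at index 2000,
    # then cut at the first NUL, as a C string conversion would.
    return bits_in[2000:3000].split("\0", 1)[0]


print("Result: [{}]".format(call_exposekey(fake_exposekey, "0101")))
```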

Output:

c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_swig.py
Message from C code...
C function returned: [010011000110101110101110101010010111011101101010101]


3. Plain Python C API

  • I added this part as a personal exercise
  • This is what swig does, but "manually"

capi_demo.c:

#include "Python.h"
#include "swig_demo.h"

#define MOD_NAME "capi_demo"


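static PyObject *PyExposekey(PyObject *self, PyObject *args) {
    PyObject *bitsInArg = NULL, *bitsInBytes = NULL, *bitsOutArg = NULL;
    char *bitsOut = NULL;
    if (!PyArg_ParseTuple(args, "O", &bitsInArg))
        return NULL;
    /* Keep a reference to the encoded bytes object so it can be released later */
    bitsInBytes = PyUnicode_AsEncodedString(bitsInArg, "ascii", "strict");
    if (bitsInBytes == NULL)
        return NULL;
    bitsOut = exposekey(PyBytes_AS_STRING(bitsInBytes));
    Py_DECREF(bitsInBytes);
    if (bitsOut == NULL)
        return PyErr_NoMemory();
    bitsOutArg = PyUnicode_FromString(bitsOut);
    free(bitsOut);
    return bitsOutArg;
}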


static PyMethodDef moduleMethods[] = {
    {"exposekey", (PyCFunction)PyExposekey, METH_VARARGS, NULL},
    {NULL}
};


static struct PyModuleDef moduleDef = {
    PyModuleDef_HEAD_INIT, MOD_NAME, NULL, -1, moduleMethods
};


PyMODINIT_FUNC PyInit_capi_demo(void) {
    return PyModule_Create(&moduleDef);
}

Notes:

  • It requires swig_demo.h and swig_demo.c (not going to duplicate their contents here)
  • It only works with Python 3 (actually I got quite some headaches making it work, especially because I was used to PyString_AsString which is no longer present)
  • Error handling is poor
  • test_capi.py is similar to test_swig.py with one (obvious) difference: from swig_demo import exposekey should be replaced by from capi_demo import exposekey
  • The output is also the same as for test_swig.py (again, not going to duplicate it here)
CristiFati
  • Thank you so much, this works perfectly. But like you said, I'm trying to apply this logic to complex functions/projects, hence the use of SWIG. – 4am Nov 15 '17 at 06:07
  • In the above code you are defining the input array to be passed. If I have an input string of 0's and 1's (type = Python str), how do I pass that as input to this ctypes function? Thanks again. – 4am Nov 15 '17 at 06:24
  • If you want to pass a string, you need to change the function (so its `bits` argument is a `char*`), and also from _Python_, its `argtypes`. It's definitely better to store a **0/1** value (1bit) in a `char` (8bits) than in an `int` (32bits), however it's still 700% inefficient from the storage's *PoV* (as the `char` 7 other bits aren't used). While on this subject, if changing `bits` wouldn't it be a good idea to also change `out` (to `char*`)? Or you could go the easy way (initialize `bits` like this): `for i, char in enumerate(your_string):` `bits[2000 + i] = ord(char) - ord("0")`. – CristiFati Nov 15 '17 at 08:26
  • Did you give the _swig_ variant a try? – CristiFati Nov 17 '17 at 08:51
  • I tried what you suggested: passing a char * and doing the bit conversion at the C end using atoi(), instead of ord() at the Python end. Thank you. – 4am Nov 17 '17 at 12:04