Angry Bits

Words on bytes and bits

Fear and loathing with ctypes

Yesterday I did some hacking with ctypes. ctypes is a nice library, but I felt the doc wasn't that good. My task was to wrap librsync to make it available to Python code. I was quite surprised to find that no suitable wrapper existed yet: the ones I found were outdated and didn't fit my needs.

While writing the wrapper I picked up a few tips and ran into a few troubles that I think are worth sharing, since the doc didn't help much.

FILE struct

The first bug I ran into involved the C FILE struct. The Python C API has a nice function to convert a file object to a FILE pointer, but it's better not to use it: the FILE struct is implementation-dependent, and your library may well have been built with a different compiler than your Python interpreter. A better way to pass files around is to hand over the value returned by the fileno method and attach to it from the C code using the fdopen function.
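To make the fileno/fdopen idea concrete, here is a minimal sketch. It assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library, and it calls `fdopen` from Python only for illustration; in a real wrapper the `fdopen` call would live inside the C code. The helper name `write_via_c_stream` is made up for this example.

```python
import ctypes
import os
import tempfile

# Assumption: POSIX. CDLL(None) loads the symbols of the running process,
# which include the C library's stdio functions.
libc = ctypes.CDLL(None)
libc.fdopen.argtypes = [ctypes.c_int, ctypes.c_char_p]
libc.fdopen.restype = ctypes.c_void_p      # treat FILE* as an opaque pointer
libc.fputs.argtypes = [ctypes.c_char_p, ctypes.c_void_p]
libc.fputs.restype = ctypes.c_int
libc.fclose.argtypes = [ctypes.c_void_p]
libc.fclose.restype = ctypes.c_int


def write_via_c_stream(pyfile, data):
    """Hypothetical helper: write bytes to pyfile through a C FILE stream."""
    pyfile.flush()                    # don't leave Python-side buffered data behind
    fd = os.dup(pyfile.fileno())      # the C side gets its own descriptor to close
    stream = libc.fdopen(fd, b"w")
    if stream is None:                # fdopen returned NULL
        os.close(fd)
        raise OSError("fdopen failed")
    libc.fputs(data, stream)
    libc.fclose(stream)               # flushes and closes the duplicated descriptor


with tempfile.TemporaryFile() as f:
    write_via_c_stream(f, b"written from C\n")
    f.seek(0)
    content = f.read()
```

Duplicating the descriptor with `os.dup` keeps ownership simple: `fclose` closes the copy, and the Python file object still closes its own descriptor later.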

The fdopen approach wasn't an option for me: going down that path would have meant changing the library anyway. Fortunately librsync supports callbacks to customize the streams...

Callbacks

Given the FILE issue, we decided to use a lower-level API provided by librsync itself. ctypes lets us write callback functions in Python, and that's awesome.

Many libraries built around callbacks let you pass a custom parameter through to the callback function, and librsync is no exception. I took advantage of this to hand a Python object directly to the callback used in librsync's stream processing.

The custom parameter is a plain void pointer, so all the callback has to do to get the original Python object back is cast it:

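from ctypes import cast, py_object

# Keep references to both objects for as long as the C code may use them.
custom_param = py_object((filein, buffer))
callback = CBFUNCTYPE(my_python_function)
api_call(callback, custom_param)

...
def my_python_function(job_descr, status_descr, custom_param):
    # custom_param arrives as a void pointer: cast it back to the tuple.
    filein, buffer = cast(custom_param, py_object).value
    ...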
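The snippet above depends on librsync, so here is a self-contained sketch of the same round trip that runs without any C library. The callback signature (`CALLBACK`) and the `on_data` function are assumptions made up for this example, and the "library" is simulated by invoking the function pointer directly from Python. It also relies on the CPython detail that `id()` returns the object's address.

```python
import ctypes

# Hypothetical C signature: int callback(void *opaque)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p)


def on_data(opaque):
    # A c_void_p argument reaches Python as a plain integer address.
    # Casting it to py_object gives back the original Python object.
    state = ctypes.cast(opaque, ctypes.py_object).value
    state["calls"] += 1
    return 0


state = {"calls": 0}
callback = CALLBACK(on_data)   # keep this reference while C may call it

# Stand-in for the C library: call the function pointer ourselves.
# In CPython, id(obj) is the address of obj, i.e. the void* a C caller
# would hand back to us.
result = callback(id(state))
second = callback(id(state))
```

After the two calls `state["calls"]` is 2, which shows that the callback worked on the very same dictionary rather than on a copy.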

Pointer Arithmetic

Sometimes you want to do some pointer arithmetic. I certainly did, and the first thing I found on the web was this patch submitted to the Python issue tracker: http://bugs.python.org/issue6259

Unluckily the patch is not yet merged in Python and probably won't be. In fact the solution is quite easy and involves yet another cast. Given buff, a character buffer, and buff_end, a pointer to the end of this buffer, we want to compute the length in bytes:

from ctypes import cast, c_void_p

# .value exposes each pointer's raw address as a Python int
start_addr = cast(buff, c_void_p)
end_addr = cast(buff_end, c_void_p)
length = end_addr.value - start_addr.value

Casting to a void pointer lets us use the raw address of each pointer as an integer, making it possible to do all kinds of arithmetic on it.
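For a version you can run as-is, here is a small sketch with made-up data: `buff` is an ordinary ctypes string buffer, and `buff_end` is built by hand to point five bytes into it, standing in for a pointer that a C library would normally give back.

```python
import ctypes

buff = ctypes.create_string_buffer(b"hello world")   # 11 chars + trailing NUL

# Raw address of the start of the buffer.
start_addr = ctypes.cast(buff, ctypes.c_void_p)

# Assumption for the demo: pretend a C call left a pointer 5 bytes in.
buff_end = ctypes.cast(
    ctypes.c_void_p(start_addr.value + 5), ctypes.POINTER(ctypes.c_char)
)

# Pointer difference, computed on plain integers.
end_addr = ctypes.cast(buff_end, ctypes.c_void_p)
length = end_addr.value - start_addr.value

# Integer addresses can be turned back into data with string_at.
consumed = ctypes.string_at(start_addr.value, length)
rest = ctypes.string_at(end_addr.value, len(buff) - 1 - length)
```

Here `length` comes out as 5, `consumed` holds the first five bytes of the buffer, and `rest` holds the remaining text without the trailing NUL.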

Conclusions

Writing the ctypes wrapper was quite straightforward until I made a trivial mistake that took me some time to fix: once again I met the horror of an invalid memory access! The problem was a Python object passed to a Python callback that had been collected by Python's gc... My fault, and to be fair the ctypes doc does point this risk out nicely.
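One way to avoid that kind of mistake, sketched here under assumptions rather than taken from my wrapper, is to tie the lifetime of the callback and of the object it receives to a single Python object that lives as long as the job does. The `StreamJob` class, the `READ_CB` signature and the `run_once` method are all hypothetical, and `run_once` only simulates the C library calling back.

```python
import ctypes
import io

# Hypothetical C signature: int read_cb(void *opaque)
READ_CB = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p)


class StreamJob:
    """Owns everything the C side may touch, so nothing is collected early."""

    def __init__(self, fileobj):
        # Both the state and the callback stay alive as long as the job does.
        self.state = {"file": fileobj, "chunks": []}
        self._callback = READ_CB(StreamJob._on_read)

    @staticmethod
    def _on_read(opaque):
        # Recover the state dictionary from the raw address.
        state = ctypes.cast(opaque, ctypes.py_object).value
        state["chunks"].append(state["file"].read(4))
        return 0

    def run_once(self):
        # Stand-in for the C library invoking the callback.
        # CPython detail: id() is the address of the object.
        return self._callback(id(self.state))


job = StreamJob(io.BytesIO(b"abcdefgh"))
job.run_once()
job.run_once()
```

As long as `job` is referenced, neither the callback wrapper nor the state dictionary can be collected while the C code still holds pointers to them.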
