Reading Python File-Like Objects From C



Code # 1:

#include <Python.h>
#include <unistd.h>   /* write() */

#define CHUNK_SIZE 8192

/* Consume a "file-like" object and write its contents to standard output */
static PyObject *py_consume_file(PyObject *self, PyObject *args)
{
    PyObject *obj;
    PyObject *read_meth;
    PyObject *result = NULL;
    PyObject *read_args;

    if (!PyArg_ParseTuple(args, "O", &obj)) {
        return NULL;
    }

    /* Get the read method of the passed object */
    if ((read_meth = PyObject_GetAttrString(obj, "read")) == NULL) {
        return NULL;
    }

    /* Build the argument list to read() */
    read_args = Py_BuildValue("(i)", CHUNK_SIZE);
    if (read_args == NULL) {
        Py_DECREF(read_meth);
        return NULL;
    }
    while (1) {
        PyObject *data;
        PyObject *enc_data;
        char *buf;
        Py_ssize_t len;

        /* Call read() */
        if ((data = PyObject_Call(read_meth, read_args, NULL)) == NULL) {
            goto final;
        }

        /* Check for EOF */
        if (PySequence_Length(data) == 0) {
            Py_DECREF(data);
            break;
        }

        /* Encode Unicode as bytes for C */
        if ((enc_data = PyUnicode_AsEncodedString(data,
                                                  "utf-8", "strict")) == NULL) {
            Py_DECREF(data);
            goto final;
        }

        /* Extract the underlying buffer data */
        PyBytes_AsStringAndSize(enc_data, &buf, &len);

        /* Write to stdout (replace with something more useful) */
        write(1, buf, len);

        /* Cleanup */
        Py_DECREF(enc_data);
        Py_DECREF(data);
    }
    result = Py_BuildValue("");   /* an empty format string yields None */

final:
    /* Cleanup */
    Py_DECREF(read_meth);
    Py_DECREF(read_args);
    return result;
}

To test the code, prepare a file-like object such as a StringIO instance and pass it to the function:

Code # 2:

import io
import sample

f = io.StringIO('Hello World')
sample.consume_file(f)

Output:

Hello World

Unlike a regular system file, a file-like object is not necessarily built on a low-level file descriptor, so normal C library functions cannot be used to access it. Instead, the Python C API is used to manipulate the file-like object in much the same way as you would in Python.
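As a rough illustration of this point, the Python sketch below shows that an io.StringIO object has no underlying descriptor, and that any object exposing a read() method can serve as a file-like object. The OneShot class is hypothetical and exists only for this example.

```python
import io

f = io.StringIO('Hello World')

# StringIO lives entirely in memory, so there is no descriptor
# that could be handed to the C library's read()
try:
    f.fileno()
    has_descriptor = True
except io.UnsupportedOperation:
    has_descriptor = False


class OneShot:
    """Hypothetical file-like object that only implements read()."""

    def __init__(self, text):
        self._text = text

    def read(self, size=-1):
        # Return everything on the first call, then '' to signal EOF
        text, self._text = self._text, ''
        return text


g = OneShot('Hello World')
first = g.read(8192)
second = g.read(8192)
```

Code that only relies on calling read(), as the C function in this article does, would accept either object.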
Accordingly, the read() method is first retrieved from the passed object. An argument list is built with Py_BuildValue() and then passed repeatedly to PyObject_Call() to invoke the method. To detect end of file (EOF), PySequence_Length() is used to check whether the returned result has zero length.
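The steps above can be sketched at the Python level, which may make the C calls easier to follow. This is only an illustrative equivalent, not part of the extension module; the consume_file name mirrors the C function, while the out parameter is an assumption added here so the result can be captured instead of going to standard output.

```python
import io
import sys

CHUNK_SIZE = 8192


def consume_file(obj, out=None):
    """Python-level equivalent of the C loop (illustrative sketch only)."""
    if out is None:
        out = sys.stdout.buffer
    read_meth = getattr(obj, 'read')      # PyObject_GetAttrString(obj, "read")
    read_args = (CHUNK_SIZE,)             # Py_BuildValue("(i)", CHUNK_SIZE)
    while True:
        data = read_meth(*read_args)      # PyObject_Call(read_meth, read_args, NULL)
        if len(data) == 0:                # PySequence_Length(data) == 0 means EOF
            break
        # PyUnicode_AsEncodedString(data, "utf-8", "strict")
        enc_data = data.encode('utf-8', 'strict')
        out.write(enc_data)               # stands in for write(1, buf, len)


sink = io.BytesIO()
consume_file(io.StringIO('Hello World'), sink)

# Input longer than CHUNK_SIZE is read in several passes
big_sink = io.BytesIO()
consume_file(io.StringIO('x' * 20000), big_sink)
```

The argument tuple is built once and reused on every iteration, just as read_args is in the C version.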
In all I/O operations, the underlying encoding and the distinction between bytes and Unicode are a concern. This recipe shows how to read a file in text mode and encode the resulting text into bytes that can be used from C. If the file is to be read in binary mode instead, only minor changes are needed, as shown in the code below.

Code # 3:

/* Call read() */
if ((data = PyObject_Call(read_meth, read_args, NULL)) == NULL) {
    goto final;
}

/* Check for EOF */
if (PySequence_Length(data) == 0) {
    Py_DECREF(data);
    break;
}

/* Reject anything that is not a bytes object */
if (!PyBytes_Check(data)) {
    Py_DECREF(data);
    PyErr_SetString(PyExc_IOError, "File must be in binary mode");
    goto final;
}

/* Extract the underlying buffer data */
PyBytes_AsStringAndSize(data, &buf, &len);
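
For comparison, here is a minimal Python sketch of the same binary-mode logic. The consume_binary function is a hypothetical name used only for this illustration, and it collects the chunks rather than writing them out. In Python 3, IOError is an alias of OSError, which is why OSError is raised here.

```python
import io

CHUNK_SIZE = 8192


def consume_binary(obj):
    """Collect all chunks from a binary-mode file-like object (illustrative)."""
    chunks = []
    while True:
        data = obj.read(CHUNK_SIZE)
        if len(data) == 0:                # EOF
            break
        if not isinstance(data, bytes):   # mirrors PyBytes_Check(data)
            raise OSError('File must be in binary mode')
        chunks.append(data)               # already bytes, so no encoding step
    return b''.join(chunks)


payload = consume_binary(io.BytesIO(b'Hello World'))

# A text-mode object returns str from read() and is rejected
try:
    consume_binary(io.StringIO('Hello World'))
    rejected = False
except OSError:
    rejected = True
```

The only difference from the text-mode loop is that the encoding step is replaced by a type check.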