To illustrate the solution, consider two C functions that operate on string data and print it for debugging and experimentation.
Code # 1: Uses bytes, passed in the form char *, int
Code # 2: Uses wide characters, passed in the form wchar_t *, int
Python strings must be converted to a suitable byte encoding, such as UTF-8, before they are passed to the byte-oriented function print_chars(). The code below is a simple extension function that does this.
Code # 3:
For library functions that work with the machine-native wchar_t type, the C extension code can be written as follows.
Code # 4:
The code below checks how the extension functions work. Observe how the byte-oriented function print_chars() receives the data encoded as UTF-8, while print_wchars() receives the Unicode code point values.
Code # 5:
53 70 69 63 79 20 4a 61 6c 61 70 65 c3 b1 6f
53 70 69 63 79 20 4a 61 6c 61 70 65 f1 6f
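The two hex dumps above can be reproduced without the extension module. What follows is a minimal pure-Python sketch, not the extension code itself: the helper names hex_bytes and hex_code_points are hypothetical stand-ins for the data that print_chars() and print_wchars() would receive, assuming UTF-8 for the byte form and one wchar_t per code point (which holds for this sample string).

```python
def hex_bytes(s):
    """Hex dump of the UTF-8 bytes a char * based function would receive."""
    return ' '.join(format(b, 'x') for b in s.encode('utf-8'))


def hex_code_points(s):
    """Hex dump of the code points a wchar_t * based function would receive."""
    return ' '.join(format(ord(ch), 'x') for ch in s)


s = 'Spicy Jalape\u00f1o'
utf8_dump = hex_bytes(s)              # the n-with-tilde becomes two bytes: c3 b1
code_point_dump = hex_code_points(s)  # the same character is one value: f1

print(utf8_dump)
print(code_point_dump)
```

The only difference between the two lines is the non-ASCII character: UTF-8 needs two bytes for it, while the wide-character form carries the single code point f1.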
First, consider the nature of the C library being accessed. For many C libraries, it makes more sense to pass bytes instead of a string. The conversion code below does this.
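On the Python side, passing bytes means the caller chooses the encoding explicitly. A rough sketch of that calling convention follows. The function require_bytes_like is a hypothetical pure-Python stand-in for an extension function that accepts only bytes-like objects; it is not part of the sample extension.

```python
def require_bytes_like(data):
    """Return the raw bytes of a bytes-like object and reject str.

    Hypothetical helper: it mimics, in Python, a C function that only
    accepts byte data.
    """
    # memoryview() raises TypeError for str, so text must be encoded first.
    return memoryview(data).tobytes()


s = 'Spicy Jalape\u00f1o'

# The caller encodes explicitly before handing the data over.
payload = require_bytes_like(s.encode('utf-8'))

# Passing the str itself is rejected instead of being converted silently.
try:
    require_bytes_like(s)
    rejected = False
except TypeError:
    rejected = True

print(payload)
print(rejected)
```

Making the encoding step visible at the call site keeps the choice of encoding in the caller's hands.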
Code # 6:
If you still want to pass strings, be aware that Python 3 uses an adaptable string representation that does not map directly onto C libraries using the standard char * or wchar_t * types. Thus, to present string data to C, some kind of conversion is almost always necessary. The format codes s# and u# for PyArg_ParseTuple() safely perform such conversions. Note that u# relies on a legacy representation and was removed in Python 3.12.
Whenever a conversion is performed, a copy of the converted data is attached to the original string object so that it can be reused later. This makes the string's reported size grow, as shown in the code below.
Code # 7:
Size: 87
53 70 69 63 79 20 4a 61 6c 61 70 65 c3 b1 6f
Size: 103
53 70 69 63 79 20 4a 61 6c 61 70 65 f1 6f
Size: 163