Python does not currently have an sscanf() equivalent. Regular expressions are generally more powerful, albeit more verbose, than scanf() format strings. The table below shows some more or less equivalent mappings between scanf() format tokens and regular expressions.
scanf() Token | Regular Expression |
To extract the filename and numbers from a string like
/usr/sbin/sendmail - 0 errors, 4 warnings
you would use a scanf() format like
%s - %d errors, %d warnings
The equivalent regular expression would be
(\S+) - (\d+) errors, (\d+) warnings
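As a minimal sketch, the regular expression above can be applied with the standard `re` module. Unlike scanf(), the captured groups come back as strings, so the numeric fields are converted explicitly; the variable names here are illustrative only.

```python
import re

line = "/usr/sbin/sendmail - 0 errors, 4 warnings"

# (\S+) plays the role of %s; (\d+) plays the role of %d (unsigned, for brevity).
match = re.match(r"(\S+) - (\d+) errors, (\d+) warnings", line)

if match:
    program = match.group(1)
    # Regex groups are always strings, so convert the counts by hand.
    n_errors = int(match.group(2))
    n_warnings = int(match.group(3))
    print(program, n_errors, n_warnings)
```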
sscanf in Python
Stack Overflow question
I’m looking for an equivalent to sscanf() in Python. I want to parse /proc/net/* files; in C I could do something like this:
int matches = sscanf(
    buffer,
    "%*d: %64[0-9A-Fa-f]:%X %64[0-9A-Fa-f]:%X %*X %*X:%*X %*X:%*X %*X %*d %*d %ld %*512s\n",
    local_addr, &local_port, rem_addr, &rem_port, &inode);
At first I thought of using str.split; however, it does not split on each of the given characters, but on the sep string as a whole:
>>> import string
>>> lines = open("/proc/net/dev").readlines()
>>> for l in lines[2:]:
...     cols = l.split(string.whitespace + ":")
...     print(len(cols))
...
1
This should be returning 17, not 1.
Is there a Python equivalent to sscanf (not RE), or a string splitting function in the standard library that splits on any of a range of characters that I’m not aware of?
Answer:
There is also the parse module. parse() is designed to be the opposite of format() (the newer string formatting function in Python 2.6 and higher).
>>> from parse import parse
>>> parse('{} fish', '1')
>>> parse('{} fish', '1 fish')
<Result ('1',) {}>
>>> parse('{} fish', '2 fish')
<Result ('2',) {}>
>>> parse('{} fish', 'red fish')
<Result ('red',) {}>
>>> parse('{} fish', 'blue fish')
<Result ('blue',) {}>
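For the splitting half of the question, two standard-library approaches may be worth sketching. `re.split` with a character class splits on any of the listed characters, and replacing the colon with a space before a plain `str.split()` avoids regular expressions entirely. The sample line below is made up in the style of /proc/net/dev; it is an assumption, not real output.

```python
import re

# Hypothetical line in the style of /proc/net/dev (values are invented).
line = "  eth0: 1500 10 0 0 0 0 0 0 2500 12 0 0 0 0 0 0\n"

# Option 1: a character class splits on runs of whitespace and/or colons.
cols_re = re.split(r"[\s:]+", line.strip())

# Option 2: no regular expression. Turn the colon into a space, then let
# str.split() with no argument split on any run of whitespace.
cols_plain = line.replace(":", " ").split()

print(len(cols_re), len(cols_plain))
```

Both variants yield the interface name followed by sixteen counters, which is the 17 columns the question expects.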
Scanf in Python implementation
Python has powerful regular expressions, but they might be overkill for many simpler situations. In addition, some of the common number formats require fairly complex regular expressions to match reliably. This Python implementation of scanf internally translates the simple scanf format into regular expressions and then returns the parsed values.
Scanf in Python on GitHub
Pattern | Meaning |
---|---|
%c | One character |
%5c | 5 characters |
%d, %i | int value |
%7d, %7i | int value with length 7 |
%f | float value |
%o | octal value |
%X, %x | hex value |
%s | string terminated by whitespace |
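To illustrate the idea of translating a scanf format into a regular expression, here is a minimal sketch covering a handful of the patterns from the table. It is not the library's actual implementation: `mini_scanf`, `TOKENS`, and `TOKEN_RE` are hypothetical names, and field widths (`%5c`, `%7d`), suppression (`%*s`), `%i`, and flexible whitespace matching are deliberately left out.

```python
import re

# Each scanf token maps to (regex fragment, converter for the captured text).
TOKENS = {
    "d": (r"[-+]?\d+", int),
    "f": (r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?", float),
    "x": (r"[-+]?(?:0[xX])?[\dA-Fa-f]+", lambda s: int(s, 16)),
    "o": (r"[-+]?[0-7]+", lambda s: int(s, 8)),
    "s": (r"\S+", str),
    "c": (r".", str),
}

# Only the bare tokens above are recognised; any other '%' is treated literally.
TOKEN_RE = re.compile(r"%([dfxosc])")


def mini_scanf(fmt, text):
    """Return a tuple of converted values, or None if text does not match."""
    pattern = []
    converters = []
    pos = 0
    for m in TOKEN_RE.finditer(fmt):
        # Literal text between tokens is escaped so that it matches itself.
        pattern.append(re.escape(fmt[pos:m.start()]))
        regex, convert = TOKENS[m.group(1)]
        pattern.append("(" + regex + ")")
        converters.append(convert)
        pos = m.end()
    pattern.append(re.escape(fmt[pos:]))
    match = re.match("".join(pattern), text)
    if match is None:
        return None
    return tuple(conv(g) for conv, g in zip(converters, match.groups()))


print(mini_scanf("%s - %d errors, %d warnings",
                 "/usr/sbin/sendmail - 0 errors, 4 warnings"))
# ('/usr/sbin/sendmail', 0, 4)
```

The fragments inside each token use non-capturing groups, so every token contributes exactly one capture group and the converters line up with `match.groups()`.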
Python Scanf Example
>>> from scanf import scanf
>>> scanf("%s - %d errors, %d warnings", "/usr/sbin/sendmail - 0 errors, 4 warnings")
('/usr/sbin/sendmail', 0, 4)
>>> scanf("%o %x %d", "0123 0x123 123")
(83, 291, 123)
>>> pattern = 'Power: %f [%], %s, Stemp: %f'
>>> text = 'Power: 0.0 [%], Cool, Stemp: 23.73'
>>> scanf(pattern, text)
(0.0, 'Cool', 23.73)
>>> pattern = 'Power: %f [%], %*s, Stemp: %f'  # note the '*' in %*s
>>> scanf(pattern, text)
(0.0, 23.73)