# Python | cmp() function


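The built-in `cmp()` function in Python 2 compares two values (integers in the examples here) and returns -1, 0, or 1 depending on whether the first is less than, equal to, or greater than the second.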

Syntax: `cmp(a, b)`

Parameters: `a` and `b` are the two numbers being compared.

Returns: `-1` if `a < b`, `0` if `a == b`, `1` if `a > b`.

``````
# Python program demonstrating the cmp() function

# when a < b
a = 1
b = 2
print(cmp(a, b))

# when a == b
a = 2
b = 2
print(cmp(a, b))

# when a > b
a = 3
b = 2
print(cmp(a, b))
``````

Output:

``````
-1
0
1
``````
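Python 3 no longer provides `cmp()`. A minimal sketch of an equivalent, assuming both operands support `<` and `>` (the name `cmp3` is a hypothetical helper, not a standard function):

```python
def cmp3(a, b):
    """Return -1, 0, or 1, mirroring Python 2's cmp() for orderable values."""
    # Booleans behave as integers, so True - False == 1 and False - True == -1.
    return (a > b) - (a < b)


print(cmp3(1, 2))  # a < b  -> -1
print(cmp3(2, 2))  # a == b -> 0
print(cmp3(3, 2))  # a > b  -> 1
```

The subtraction trick relies only on the two comparison operators, so it works for any pair of mutually orderable values, such as strings or tuples.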

Practical use: A program to check even or odd numbers using the cmp function.

Approach: compare 0 with `n % 2`. If `cmp` returns 0, the number is even; otherwise it is odd.

Below is the implementation of the above program in Python:

``````
# Python program to check whether a number is
# odd or even using the cmp function

# check 12
n = 12
if cmp(0, n % 2):
    print("odd")
else:
    print("even")

# check 13
n = 13
if cmp(0, n % 2):
    print("odd")
else:
    print("even")
``````

Output:

``````
even
odd
``````

## __lt__ instead of __cmp__

### Question by Roger Pate

Python 2.x has two ways to overload comparison operators, `__cmp__` or the "rich comparison operators" such as `__lt__`. The rich comparison overloads are said to be preferred, but why is this so?

Each rich comparison operator is simple to implement on its own, but you must implement several of them with nearly identical logic. However, if you can use the builtin `cmp` and tuple ordering, then `__cmp__` gets quite simple and fulfills all the comparisons:

``````class A(object):
    def __init__(self, name, age, other):
        self.name = name
        self.age = age
        self.other = other
    def __cmp__(self, other):
        assert isinstance(other, A) # assumption for this example
        return cmp((self.name, self.age, self.other),
                   (other.name, other.age, other.other))
``````

This simplicity seems to meet my needs much better than overloading all 6(!) of the rich comparisons. (However, you can get it down to "just" 4 if you rely on the "swapped argument"/reflected behavior, but that results in a net increase of complication, in my humble opinion.)

Are there any unforeseen pitfalls I need to be made aware of if I only overload `__cmp__`?

I understand the `<`, `<=`, `==`, etc. operators can be overloaded for other purposes, and can return any object they like. I am not asking about the merits of that approach, but only about differences when using these operators for comparisons in the same sense that they mean for numbers.

Update: As Christopher pointed out, `cmp` is disappearing in 3.x. Are there any alternatives that make implementing comparisons as easy as the above `__cmp__`?
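One possible answer to the update, offered as a sketch rather than a definitive recipe: `functools.total_ordering` derives the remaining rich comparisons from `__eq__` plus one ordering method, so the tuple trick can still carry all the logic. The `_key` method is a hypothetical helper name introduced for this illustration.

```python
from functools import total_ordering


@total_ordering
class A(object):
    def __init__(self, name, age, other):
        self.name = name
        self.age = age
        self.other = other

    def _key(self):
        # Hypothetical helper: the tuple that defines equality and ordering.
        return (self.name, self.age, self.other)

    def __eq__(self, other):
        if not isinstance(other, A):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, A):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ removes the inherited __hash__ in Python 3,
        # so hashing is restored from the same key.
        return hash(self._key())


people = [A("bob", 30, 0), A("alice", 41, 0), A("alice", 25, 0)]
print([(p.name, p.age) for p in sorted(people)])
```

Only `__eq__` and `__lt__` are written by hand; `<=`, `>`, and `>=` come from the decorator, at the cost of slightly slower derived comparisons.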

Use the source, Luke!

In CPython, `range(...).__contains__` (a method wrapper) will eventually delegate to a simple calculation which checks if the value can possibly be in the range. The reason for the speed here is we're using mathematical reasoning about the bounds, rather than a direct iteration of the range object. To explain the logic used:

1. Check that the number is between `start` and `stop`, and
2. Check that the stride value doesn't "step over" our number.

For example, `994` is in `range(4, 1000, 2)` because:

1. `4 <= 994 < 1000`, and
2. `(994 - 4) % 2 == 0`.

The full C code is included below, which is a bit more verbose because of memory management and reference counting details, but the basic idea is there:

``````static int
range_contains_long(rangeobject *r, PyObject *ob)
{
    int cmp1, cmp2, cmp3;
    PyObject *tmp1 = NULL;
    PyObject *tmp2 = NULL;
    PyObject *zero = NULL;
    int result = -1;

    zero = PyLong_FromLong(0);
    if (zero == NULL) /* MemoryError in int(0) */
        goto end;

    /* Check if the value can possibly be in the range. */

    cmp1 = PyObject_RichCompareBool(r->step, zero, Py_GT);
    if (cmp1 == -1)
        goto end;
    if (cmp1 == 1) { /* positive steps: start <= ob < stop */
        cmp2 = PyObject_RichCompareBool(r->start, ob, Py_LE);
        cmp3 = PyObject_RichCompareBool(ob, r->stop, Py_LT);
    }
    else { /* negative steps: stop < ob <= start */
        cmp2 = PyObject_RichCompareBool(ob, r->start, Py_LE);
        cmp3 = PyObject_RichCompareBool(r->stop, ob, Py_LT);
    }

    if (cmp2 == -1 || cmp3 == -1) /* TypeError */
        goto end;
    if (cmp2 == 0 || cmp3 == 0) { /* ob outside of range */
        result = 0;
        goto end;
    }

    /* Check that the stride does not invalidate ob's membership. */
    tmp1 = PyNumber_Subtract(ob, r->start);
    if (tmp1 == NULL)
        goto end;
    tmp2 = PyNumber_Remainder(tmp1, r->step);
    if (tmp2 == NULL)
        goto end;
    /* result = ((int(ob) - start) % step) == 0 */
    result = PyObject_RichCompareBool(tmp2, zero, Py_EQ);
  end:
    Py_XDECREF(tmp1);
    Py_XDECREF(tmp2);
    Py_XDECREF(zero);
    return result;
}

static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}
``````

The "meat" of the idea is mentioned in the line:

``````/* result = ((int(ob) - start) % step) == 0 */
``````

As a final note - look at the `range_contains` function at the bottom of the code snippet. If the exact type check fails then we don't use the clever algorithm described, instead falling back to a dumb iteration search of the range using `_PySequence_IterSearch`! You can check this behaviour in the interpreter (I'm using v3.5.0 here):

``````>>> x, r = 1000000000000000, range(1000000000000001)
>>> class MyInt(int):
...     pass
...
>>> x_ = MyInt(x)
>>> x in r  # calculates immediately :)
True
>>> x_ in r  # iterates for ages.. :(
^Quit (core dumped)
``````
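A gentler way to observe the same effect on CPython, without waiting for an enormous range to be iterated, is to time both lookups on a smaller range. This is only a sketch: the range size and repeat count are arbitrary choices, `MyInt` is a hypothetical subclass, and the absolute timings will differ from machine to machine.

```python
import timeit


class MyInt(int):
    """Hypothetical int subclass; it fails the exact type check."""


r = range(1000000)
exact = 999999
subclassed = MyInt(999999)

# Both lookups give the same answer.
print(exact in r, subclassed in r)

# Only the exact int is expected to take the arithmetic shortcut;
# the subclass falls back to iterating over the range.
fast = timeit.timeit(lambda: exact in r, number=5)
slow = timeit.timeit(lambda: subclassed in r, number=5)
print("exact int: %.6f s, subclass: %.6f s" % (fast, slow))
```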

As I mentioned to David Wolever, there's more to this than meets the eye; both methods dispatch to `is`; you can prove this by doing

``````from timeit import Timer

min(Timer("x == x", setup="x = 'a' * 1000000").repeat(10, 10000))
#>>> 0.00045456900261342525

min(Timer("x == y", setup="x = 'a' * 1000000; y = 'a' * 1000000").repeat(10, 10000))
#>>> 0.5256857610074803
``````

The first can only be so fast because it checks by identity.

To find out why one would take longer than the other, let's trace through execution.

They both start in `ceval.c`, from `COMPARE_OP`, since that is the bytecode involved:

``````TARGET(COMPARE_OP) {
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *res = cmp_outcome(oparg, left, right);
    Py_DECREF(left);
    Py_DECREF(right);
    SET_TOP(res);
    if (res == NULL)
        goto error;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH();
}
``````

This pops the values from the stack (technically it only pops one)

``````PyObject *right = POP();
PyObject *left = TOP();
``````

and runs the compare:

``````PyObject *res = cmp_outcome(oparg, left, right);
``````

`cmp_outcome` is this:

``````static PyObject *
cmp_outcome(int op, PyObject *v, PyObject *w)
{
    int res = 0;
    switch (op) {
    case PyCmp_IS: ...
    case PyCmp_IS_NOT: ...
    case PyCmp_IN:
        res = PySequence_Contains(w, v);
        if (res < 0)
            return NULL;
        break;
    case PyCmp_NOT_IN: ...
    case PyCmp_EXC_MATCH: ...
    default:
        return PyObject_RichCompare(v, w, op);
    }
    v = res ? Py_True : Py_False;
    Py_INCREF(v);
    return v;
}
``````

This is where the paths split. The `PyCmp_IN` branch does

``````int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
    if (sqm != NULL && sqm->sq_contains != NULL)
        return (*sqm->sq_contains)(seq, ob);
    result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}
``````

Note that a tuple is defined as

``````static PySequenceMethods tuple_as_sequence = {
    ...
    (objobjproc)tuplecontains,                  /* sq_contains */
};

PyTypeObject PyTuple_Type = {
    ...
    &tuple_as_sequence,                         /* tp_as_sequence */
    ...
};
``````

So the branch

``````if (sqm != NULL && sqm->sq_contains != NULL)
``````

will be taken and `*sqm->sq_contains`, which is the function `(objobjproc)tuplecontains`, will be called.

This does

``````static int
tuplecontains(PyTupleObject *a, PyObject *el)
{
    Py_ssize_t i;
    int cmp;

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyTuple_GET_ITEM(a, i),
                                       Py_EQ);
    return cmp;
}
``````

...Wait, wasn't that `PyObject_RichCompareBool` what the other branch took? Nope, that was `PyObject_RichCompare`.

That code path was short so it likely just comes down to the speed of these two. Let's compare.

``````int
PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)
{
    PyObject *res;
    int ok;

    /* Quick result when objects are the same.
       Guarantees that identity implies equality. */
    if (v == w) {
        if (op == Py_EQ)
            return 1;
        else if (op == Py_NE)
            return 0;
    }

    ...
}
``````

The code path in `PyObject_RichCompareBool` pretty much immediately terminates. For `PyObject_RichCompare`, it does

``````PyObject *
PyObject_RichCompare(PyObject *v, PyObject *w, int op)
{
    PyObject *res;

    assert(Py_LT <= op && op <= Py_GE);
    if (v == NULL || w == NULL) { ... }
    if (Py_EnterRecursiveCall(" in comparison"))
        return NULL;
    res = do_richcompare(v, w, op);
    Py_LeaveRecursiveCall();
    return res;
}
``````

The `Py_EnterRecursiveCall`/`Py_LeaveRecursiveCall` combo is not taken in the previous path, but these are relatively quick macros that'll short-circuit after incrementing and decrementing some globals.

`do_richcompare` does:

``````static PyObject *
do_richcompare(PyObject *v, PyObject *w, int op)
{
    richcmpfunc f;
    PyObject *res;
    int checked_reverse_op = 0;

    if (v->ob_type != w->ob_type && ...) { ... }
    if ((f = v->ob_type->tp_richcompare) != NULL) {
        res = (*f)(v, w, op);
        if (res != Py_NotImplemented)
            return res;
        ...
    }
    ...
}
``````

This does some quick checks to call `v->ob_type->tp_richcompare` which is

``````PyTypeObject PyUnicode_Type = {
    ...
    PyUnicode_RichCompare,      /* tp_richcompare */
    ...
};
``````

which does

``````PyObject *
PyUnicode_RichCompare(PyObject *left, PyObject *right, int op)
{
    int result;
    PyObject *v;

    if (!PyUnicode_Check(left) || !PyUnicode_Check(right))
        Py_RETURN_NOTIMPLEMENTED;

    if (PyUnicode_READY(left) == -1 ||
        PyUnicode_READY(right) == -1)
        return NULL;

    if (left == right) {
        switch (op) {
        case Py_EQ:
        case Py_LE:
        case Py_GE:
            /* a string is equal to itself */
            v = Py_True;
            break;
        case Py_NE:
        case Py_LT:
        case Py_GT:
            v = Py_False;
            break;
        default:
            ...
        }
    }
    else if (...) { ... }
    else { ...}
    Py_INCREF(v);
    return v;
}
``````

Namely, this shortcuts on `left == right`... but only after doing

``````    if (!PyUnicode_Check(left) || !PyUnicode_Check(right))

    if (PyUnicode_READY(left) == -1 ||
        PyUnicode_READY(right) == -1)
``````

All in all the paths then look something like this (manually recursively inlining, unrolling and pruning known branches)

``````POP()                           # Stack stuff
TOP()                           #
#
case PyCmp_IN:                  # Dispatch on operation
#
sqm != NULL                     # Dispatch to builtin op
sqm->sq_contains != NULL        #
*sqm->sq_contains               #
#
cmp == 0                        # Do comparison in loop
i < Py_SIZE(a)                  #
v == w                          #
op == Py_EQ                     #
++i                             #
cmp == 0                        #
#
res < 0                         # Convert to Python-space
res ? Py_True : Py_False        #
Py_INCREF(v)                    #
#
Py_DECREF(left)                 # Stack stuff
Py_DECREF(right)                #
SET_TOP(res)                    #
res == NULL                     #
DISPATCH()                      #
``````

vs

``````POP()                           # Stack stuff
TOP()                           #
#
default:                        # Dispatch on operation
#
Py_LT <= op                     # Checking operation
op <= Py_GE                     #
v == NULL                       #
w == NULL                       #
Py_EnterRecursiveCall(...)      # Recursive check
#
v->ob_type != w->ob_type        # More operation checks
f = v->ob_type->tp_richcompare  # Dispatch to builtin op
f != NULL                       #
#
!PyUnicode_Check(left)          # ...More checks
!PyUnicode_Check(right))        #
PyUnicode_READY(left) == -1     #
PyUnicode_READY(right) == -1    #
left == right                   # Finally, doing comparison
case Py_EQ:                     # Immediately short circuit
Py_INCREF(v);                   #
#
res != Py_NotImplemented        #
#
Py_LeaveRecursiveCall()         # Recursive check
#
Py_DECREF(left)                 # Stack stuff
Py_DECREF(right)                #
SET_TOP(res)                    #
res == NULL                     #
DISPATCH()                      #
``````

Now, `PyUnicode_Check` and `PyUnicode_READY` are pretty cheap since they only check a couple of fields, but it should be obvious that the top one is a smaller code path: it has fewer function calls, only one switch statement, and is just a bit thinner.

### TL;DR:

Both dispatch to `if (left_pointer == right_pointer)`; the difference is just how much work they do to get there. `in` just does less.
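The identity shortcut taken by containment checks can be observed from Python itself. A small illustration using NaN, which is never equal to itself under `==`:

```python
nan = float("nan")

# == goes through the rich comparison, and NaN never compares equal to itself.
print(nan == nan)      # False

# Containment checks identity before equality, so the very same object is found.
print(nan in (nan,))   # True
print(nan in [nan])    # True

# A different NaN object is neither identical nor equal, so it is not found.
print(float("nan") in (nan,))  # False
```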

To add to Martijn's answer, this is the relevant part of the source (in C, as the range object is written in native code):

``````static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}
``````

So for `PyLong` objects (which is `int` in Python 3), it will use the `range_contains_long` function to determine the result. And that function essentially checks if `ob` is in the specified range (although it looks a bit more complex in C).

If it's not an `int` object, it falls back to iterating until it finds the value (or not).

The whole logic could be translated to pseudo-Python like this:

``````def range_contains (rangeObj, obj):
    if isinstance(obj, int):
        return range_contains_long(rangeObj, obj)

    # default logic by iterating
    return any(obj == x for x in rangeObj)

def range_contains_long (r, num):
    if r.step > 0:
        # positive step: r.start <= num < r.stop
        cmp2 = r.start <= num
        cmp3 = num < r.stop
    else:
        # negative step: r.start >= num > r.stop
        cmp2 = num <= r.start
        cmp3 = r.stop < num

    # outside of the range boundaries
    if not cmp2 or not cmp3:
        return False

    # num must be on a valid step inside the boundaries
    return (num - r.start) % r.step == 0
``````

This function works on any OS (Unix, Linux, macOS, and Windows) and in both Python 2 and Python 3.

EDITS:
By @radato, `os.system` was replaced by `subprocess.call`. This avoids a shell injection vulnerability in cases where your hostname string might not be validated.

``````import platform    # For getting the operating system name
import subprocess  # For executing a shell command

def ping(host):
    """
    Returns True if host (str) responds to a ping request.
    Remember that a host may not respond to a ping (ICMP) request even if the host name is valid.
    """

    # Option for the number of packets as a function of the operating system
    param = "-n" if platform.system().lower()=="windows" else "-c"

    # Building the command. Ex: "ping -c 1 google.com"
    command = ["ping", param, "1", host]

    return subprocess.call(command) == 0
``````

Note that, according to @ikrase, on Windows this function will still return `True` if you get a `Destination Host Unreachable` error.

Explanation

The command is `ping` in both Windows and Unix-like systems.
The option `-n` (Windows) or `-c` (Unix) controls the number of packets which in this example was set to 1.

`platform.system()` returns the platform name. Ex. `"Darwin"` on macOS.
`subprocess.call()` performs a system call. Ex. `subprocess.call(["ls","-l"])`.
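If ping's output should stay off the console and the call should not block indefinitely, one possible variation is sketched below. It assumes Python 3.3 or later (for `subprocess.DEVNULL` and the `timeout` argument); `build_ping_command` and `quiet_ping` are hypothetical names, and the 5-second default is an arbitrary choice.

```python
import platform
import subprocess


def build_ping_command(host, system_name=None):
    """Return the argument list for a single-packet ping (hypothetical helper)."""
    system_name = system_name or platform.system()
    count_flag = "-n" if system_name.lower() == "windows" else "-c"
    return ["ping", count_flag, "1", host]


def quiet_ping(host, timeout=5):
    """Like ping(), but silent and bounded in time (hypothetical helper)."""
    try:
        return subprocess.call(
            build_ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        ) == 0
    except (subprocess.TimeoutExpired, OSError):
        # The call timed out, or the ping executable could not be started.
        return False
```

Passing the command as a list, as in the original function, keeps the host name out of any shell, so the injection concern mentioned in the edit note still does not apply.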

This is more than a bit late, but you can extend the regex to account for scientific notation too.

``````import re

# Format is [(<string>, <expected output>), ...]
ss = [("apple-12.34 ba33na fanc-14.23e-2yapple+45e5+67.56E+3",
       ["-12.34", "33", "-14.23e-2", "+45e5", "+67.56E+3"]),
      ("hello X42 I'm a Y-32.35 string Z30",
       ["42", "-32.35", "30"]),
      ("he33llo 42 I'm a 32 string -30",
       ["33", "42", "32", "-30"]),
      ("h3110 23 cat 444.4 rabbit 11 2 dog",
       ["3110", "23", "444.4", "11", "2"]),
      ("hello 12 hi 89",
       ["12", "89"]),
      ("4",
       ["4"]),
      ("I like 74,600 commas not,500",
       ["74,600", "500"]),
      ("I like bad math 1+2=.001",
       ["1", "+2", ".001"])]

for s, r in ss:
    rr = re.findall(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[.]?\d*(?:[eE][-+]?\d+)?", s)
    if rr == r:
        print("GOOD")
    else:
        print("WRONG", rr, "should be", r)
``````

Gives all good!
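The matches are still strings. If numeric values are needed, one way to convert them, sketched here under the assumption that commas are only thousands separators, is to strip the commas and call `float()`. The names `NUMBER_RE` and `extract_floats` are hypothetical.

```python
import re

# Same pattern as above, compiled once for reuse.
NUMBER_RE = re.compile(r"[-+]?[.]?[\d]+(?:,\d\d\d)*[.]?\d*(?:[eE][-+]?\d+)?")


def extract_floats(text):
    """Find number-like substrings in text and return them as floats."""
    return [float(match.replace(",", "")) for match in NUMBER_RE.findall(text)]


print(extract_floats("I like 74,600 commas not,500"))
print(extract_floats("hello 12 hi 89"))
```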

Additionally, you can look at the AWS Glue built-in regex

``````mylist = ["b", "C", "A"]
mylist.sort()
``````

This modifies your original list (i.e. sorts in-place). To get a sorted copy of the list, without changing the original, use the `sorted()` function:

``````for x in sorted(mylist):
    print(x)
``````

However, the examples above are a bit naive, because they don't take locale into account, and perform a case-sensitive sorting. You can take advantage of the optional parameter `key` to specify custom sorting order (the alternative, using `cmp`, is a deprecated solution, as it has to be evaluated multiple times - `key` is only computed once per element).

So, to sort according to the current locale, taking language-specific rules into account (`cmp_to_key` is a helper function from functools):

``````sorted(mylist, key=cmp_to_key(locale.strcoll))
``````

And finally, if you need, you can specify a custom locale for sorting:

``````import locale
from functools import cmp_to_key

locale.setlocale(locale.LC_ALL, "en_US.UTF-8") # vary depending on your lang/locale
assert sorted((u"Ab", u"ad", u"aa"),
              key=cmp_to_key(locale.strcoll)) == [u"aa", u"Ab", u"ad"]
``````
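`cmp_to_key` accepts any old-style comparison function, not only `locale.strcoll`. The sketch below uses a hand-written comparator (a hypothetical example that orders by length, then alphabetically) so that it runs without any particular locale installed, and shows the same ordering expressed directly as a `key`:

```python
from functools import cmp_to_key


def by_length_then_alpha(a, b):
    """Old-style comparison: negative, zero, or positive."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)


words = ["pear", "fig", "apple", "kiwi"]

# Wrap the two-argument comparator so that sorted() can use it as a key.
print(sorted(words, key=cmp_to_key(by_length_then_alpha)))

# The same ordering as a plain key function, computed once per element.
print(sorted(words, key=lambda w: (len(w), w)))
```

When an ordering can be written as a key tuple like this, the plain `key` form is usually the simpler choice; `cmp_to_key` is mainly useful for comparators that already exist.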

Last note: you will see examples of case-insensitive sorting which use the `lower()` method - those are incorrect, because they work only for the ASCII subset of characters. Those two are wrong for any non-English data:

``````# this is incorrect!
mylist.sort(key=lambda x: x.lower())
# alternative notation, a bit faster, but still wrong
mylist.sort(key=str.lower)
``````

You should implement the method `__eq__`:

``````class MyClass:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def __eq__(self, other):
        if not isinstance(other, MyClass):
            # don't attempt to compare against unrelated types
            return NotImplemented

        return self.foo == other.foo and self.bar == other.bar
``````

Now it outputs:

``````>>> x == y
True
``````

Note that implementing `__eq__` will automatically make instances of your class unhashable, which means they can't be stored in sets and dicts. If you're not modelling an immutable type (i.e. if the attributes `foo` and `bar` may change the value within the lifetime of your object), then it's recommended to just leave your instances as unhashable.

If you are modelling an immutable type, you should also implement the data model hook `__hash__`:

``````class MyClass:
    ...

    def __hash__(self):
        # necessary for instances to behave sanely in dicts and sets.
        return hash((self.foo, self.bar))
``````
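Putting the two hooks together, a self-contained sketch of how such a class behaves with `==`, sets, and unrelated types (the attribute values are arbitrary):

```python
class MyClass:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def __eq__(self, other):
        if not isinstance(other, MyClass):
            # Let Python try the other operand, then fall back to identity.
            return NotImplemented
        return self.foo == other.foo and self.bar == other.bar

    def __hash__(self):
        # Equal objects must hash equally, so hash the same fields.
        return hash((self.foo, self.bar))


x = MyClass("foo", "bar")
y = MyClass("foo", "bar")

print(x == y)        # True: compared field by field
print(len({x, y}))   # 1: equal and equally hashed, so the set keeps one
print(x == "foo")    # False: unrelated type
```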

A general solution, like the idea of looping through `__dict__` and comparing values, is not advisable - it can never be truly general because the `__dict__` may have uncomparable or unhashable types contained within.

N.B.: be aware that before Python 3, you may need to use `__cmp__` instead of `__eq__`. Python 2 users may also want to implement `__ne__`, since a sensible default behaviour for inequality (i.e. inverting the equality result) will not be automatically created in Python 2.

Use

``````a = sorted(a, key=lambda x: x.modified, reverse=True)
#             ^^^^
``````

On Python 2.x, the `sorted` function takes its arguments in this order:

``````sorted(iterable, cmp=None, key=None, reverse=False)
``````

so without the `key=`, the function you pass in will be considered a `cmp` function which takes 2 arguments.
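A runnable illustration of the keyword form, using a hypothetical `Entry` record with a `modified` field standing in for whatever objects the list really holds:

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical record type; only the 'modified' attribute matters here.
Entry = namedtuple("Entry", ["name", "modified"])

a = [Entry("old", 10), Entry("new", 30), Entry("mid", 20)]

# Passing the function by keyword makes it a key function, not a cmp function.
a = sorted(a, key=lambda x: x.modified, reverse=True)
print([e.name for e in a])

# operator.attrgetter builds the same key function without a lambda.
print([e.name for e in sorted(a, key=attrgetter("modified"))])
```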

`==` is an equality test. It checks whether the right hand side and the left hand side are equal objects (according to their `__eq__` or `__cmp__` methods.)

`is` is an identity test. It checks whether the right hand side and the left hand side are the very same object. No method calls are done, objects can't influence the `is` operation.

You use `is` (and `is not`) for singletons, like `None`, where you don't care about objects that might want to pretend to be `None` or where you want to protect against objects breaking when being compared against `None`.
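A short illustration of the difference, using a hypothetical class whose instances claim to be equal to everything:

```python
class AlwaysEqual:
    """Hypothetical class that pretends to equal any other object."""

    def __eq__(self, other):
        return True


impostor = AlwaysEqual()
print(impostor == None)   # True: __eq__ decides the result
print(impostor is None)   # False: identity cannot be overridden

first = [1, 2, 3]
second = [1, 2, 3]
print(first == second)    # True: same contents
print(first is second)    # False: two distinct list objects
```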

EDIT:

Indeed there was a patch which included `sign()` in math, but it wasn't accepted, because they didn't agree on what it should return in all the edge cases (+/-0, +/-nan, etc).

So they decided to implement only copysign, which (although more verbose) can be used to delegate to the end user the desired behavior for edge cases - which sometimes might require the call to `cmp(x,0)`.

I don't know why it's not a built-in, but I have some thoughts.

``````copysign(x,y):
Return x with the sign of y.
``````

Most importantly, `copysign` is a superset of `sign`! Calling `copysign` with x=1 is the same as a `sign` function. So you could just use `copysign` and forget about it.

``````>>> math.copysign(1, -4)
-1.0
>>> math.copysign(1, 3)
1.0
``````

If you get sick of passing two whole arguments, you can implement `sign` this way, and it will still be compatible with the IEEE stuff mentioned by others:

``````>>> sign = functools.partial(math.copysign, 1) # either of these
>>> sign = lambda x: math.copysign(1, x) # two will work
>>> sign(-4)
-1.0
>>> sign(3)
1.0
>>> sign(0)
1.0
>>> sign(-0.0)
-1.0
>>> sign(float("nan"))
-1.0
``````

Secondly, usually when you want the sign of something, you just end up multiplying it with another value. And of course that's basically what `copysign` does.

``````s = sign(a)
b = b * s
``````

When `b` is non-negative, you can just do:

``````b = copysign(b, a)
``````

And yes, I'm surprised you've been using Python for 7 years and think `cmp` could be so easily removed and replaced by `sign`! Have you never implemented a class with a `__cmp__` method? Have you never called `cmp` and specified a custom comparator function?

In summary, I've found myself wanting a `sign` function too, but `copysign` with the first argument being 1 will work just fine. I disagree that `sign` would be more useful than `copysign`, as I've shown that it's merely a subset of the same functionality.