  # itertools.islice in Python

Creates an iterator over a limited subset of the elements of the passed object.

itertools.islice(iterable, [start,] stop [, step]) -> iterator

iterable - The object to get the subset from.

start = None - Integer. The index of the first element of the subset. Since Python 2.5, None is treated as 0.

stop - Integer. The index at which the subset stops; the element at this index is not included. If None, all remaining elements are returned.

step = None - Integer. The step with which to go through the subset. Since Python 2.5, None is interpreted as 1.

Attention

Unlike an ordinary slice, this function does not support negative values for `start`, `stop`, or `step`.

```
from itertools import islice

letters = 'ABCDEFG'

list(islice(letters, 2))           # ['A', 'B']
list(islice(letters, 2, 4))        # ['C', 'D']
list(islice(letters, 2, None))     # ['C', 'D', 'E', 'F', 'G']
list(islice(letters, 0, None, 2))  # ['A', 'C', 'E', 'G']
```

# Slicing a list

```
top5 = array[:5]
```
• To slice a list, there's a simple syntax: `array[start:stop:step]`
• You can omit any parameter. These are all valid: `array[start:]`, `array[:stop]`, `array[::step]`
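These forms can be checked with a small made-up list (`array` here is just sample data):

```python
# Hypothetical sample list, for illustration only.
array = [10, 20, 30, 40, 50, 60]

top5 = array[:5]    # first five elements -> [10, 20, 30, 40, 50]
tail = array[2:]    # from index 2 to the end -> [30, 40, 50, 60]
evens = array[::2]  # every second element -> [10, 30, 50]
```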

# Slicing a generator

```
import itertools
top5 = itertools.islice(my_gen, 5)  # grab the first five elements
```

• You can't slice a generator directly in Python. `itertools.islice()` will wrap a generator in a new slicing iterator using the syntax `itertools.islice(generator, start, stop, step)`

• Remember, slicing a generator will exhaust it partially. If you want to keep the entire generator intact, perhaps turn it into a tuple or list first, like: `result = tuple(generator)`
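A quick sketch of that caveat, using a throwaway generator expression:

```python
from itertools import islice

# A throwaway generator, just to illustrate partial exhaustion.
numbers = (n for n in range(10))

first_three = list(islice(numbers, 3))  # consumes 0, 1 and 2 from the generator
rest = list(numbers)                    # the generator resumes at 3
# first_three == [0, 1, 2]; rest == [3, 4, 5, 6, 7, 8, 9]
```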

Note: this post assumes Python 3.x syntax.

A generator is simply a function which returns an object on which you can call `next`, such that for every call it returns some value, until it raises a `StopIteration` exception, signaling that all values have been generated. Such an object is called an iterator.

Normal functions return a single value using `return`, just like in Java. In Python, however, there is an alternative, called `yield`. Using `yield` anywhere in a function makes it a generator. Observe this code:

```
>>> def myGen(n):
...     yield n
...     yield n + 1
...
>>> g = myGen(6)
>>> next(g)
6
>>> next(g)
7
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
```

As you can see, `myGen(n)` is a function which yields `n` and `n + 1`. Every call to `next` yields a single value, until all values have been yielded. `for` loops call `next` in the background, thus:

```
>>> for n in myGen(6):
...     print(n)
...
6
7
```

Likewise there are generator expressions, which provide a means to succinctly describe certain common types of generators:

```
>>> g = (n for n in range(3, 5))
>>> next(g)
3
>>> next(g)
4
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
```

Note that generator expressions are much like list comprehensions:

```
>>> lc = [n for n in range(3, 5)]
>>> lc
[3, 4]
```

Observe that a generator object is generated once, but its code is not run all at once. Only calls to `next` actually execute (part of) the code. Execution of the code in a generator stops once a `yield` statement has been reached, upon which it returns a value. The next call to `next` then causes execution to continue in the state in which the generator was left after the last `yield`. This is a fundamental difference with regular functions: those always start execution at the "top" and discard their state upon returning a value.
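A minimal sketch makes this pausing visible; the `events` list here is made up purely to record when the generator body actually runs:

```python
events = []  # records when the generator body executes

def tracked():
    events.append('started')
    yield 1
    events.append('resumed')
    yield 2

g = tracked()     # creating the generator runs none of its body yet
first = next(g)   # executes up to the first yield, appends 'started'
second = next(g)  # resumes after the first yield, appends 'resumed'
# events is now ['started', 'resumed']; first == 1, second == 2
```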

There are more things to be said about this subject. It is e.g. possible to `send` data back into a generator (reference). But that is something I suggest you do not look into until you understand the basic concept of a generator.

Now you may ask: why use generators? There are a couple of good reasons:

• Certain concepts can be described much more succinctly using generators.
• Instead of creating a function which returns a list of values, one can write a generator which generates the values on the fly. This means that no list needs to be constructed, meaning that the resulting code is more memory efficient. In this way one can even describe data streams which would simply be too large to fit in memory.
• Generators allow for a natural way to describe infinite streams. Consider for example the Fibonacci numbers:

```
>>> def fib():
...     a, b = 0, 1
...     while True:
...         yield a
...         a, b = b, a + b
...
>>> import itertools
>>> list(itertools.islice(fib(), 10))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

This code uses `itertools.islice` to take a finite number of elements from an infinite stream. You are advised to have a good look at the functions in the `itertools` module, as they are essential tools for writing advanced generators with great ease.
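As a small illustration of combining those tools, `itertools.takewhile` cuts an infinite stream by a predicate rather than by count (the variable names here are arbitrary):

```python
import itertools

# Squares below 100, drawn from the infinite stream itertools.count(1).
squares = (n * n for n in itertools.count(1))
small_squares = list(itertools.takewhile(lambda s: s < 100, squares))
# [1, 4, 9, 16, 25, 36, 49, 64, 81]
```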

About Python <= 2.6: in the above examples, `next` is a function which calls the method `__next__` on the given object. In Python <= 2.6 one uses a slightly different technique, namely `o.next()` instead of `next(o)`. Python 2.7 has `next()` call `.next()`, so you need not use the following in 2.7:

```
>>> g = (n for n in range(3, 5))
>>> g.next()
3
```

Python 2:

```
with open("datafile") as myfile:
    head = [next(myfile) for x in xrange(N)]
```

Python 3:

```
with open("datafile") as myfile:
    head = [next(myfile) for x in range(N)]
```

Here"s another way (both Python 2 & 3):

```
from itertools import islice

with open("datafile") as myfile:
    head = list(islice(myfile, N))
```

## Explain Python"s slice notation

In short, the colons (`:`) in subscript notation (`subscriptable[subscriptarg]`) make slice notation - which has the optional arguments, `start`, `stop`, `step`:

```
sliceable[start:stop:step]
```

Python slicing is a computationally fast way to methodically access parts of your data. In my opinion, to be even an intermediate Python programmer, it's one aspect of the language you need to be familiar with.

## Important Definitions

To begin with, let's define a few terms:

start: the beginning index of the slice; it will include the element at this index unless it is the same as stop. Defaults to 0, i.e. the first index. If it's negative, it means to start `n` items from the end.

stop: the ending index of the slice, it does not include the element at this index, defaults to length of the sequence being sliced, that is, up to and including the end.

step: the amount by which the index increases, defaults to 1. If it's negative, you're slicing over the iterable in reverse.
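These definitions can be checked against a small made-up list:

```python
letters = ['a', 'b', 'c', 'd', 'e']

last_two = letters[-2:]  # negative start counts from the end -> ['d', 'e']
trimmed = letters[1:-1]  # negative stop excludes the last item -> ['b', 'c', 'd']
odds = letters[::2]      # step of 2 takes every other item -> ['a', 'c', 'e']
```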

## How Indexing Works

You can make any of these positive or negative numbers. The meaning of the positive numbers is straightforward, but for negative numbers, just like indexes in Python, you count backwards from the end for `start` and `stop`, and for `step`, a negative value simply decrements the index. This example is from the documentation's tutorial, but I've modified it slightly to indicate which item in a sequence each index references:

```
 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
   0   1   2   3   4   5
  -6  -5  -4  -3  -2  -1
```
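Reading indices off the diagram:

```python
word = 'Python'

first, last = word[0], word[-1]  # 'P' and 'n'
mid = word[1:4]                  # 'yth'
mid_neg = word[-5:-2]            # also 'yth': -5 and -2 name the same positions as 1 and 4
```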

## How Slicing Works

To use slice notation with a sequence that supports it, you must include at least one colon in the square brackets that follow the sequence (the brackets actually invoke the sequence's `__getitem__` method, according to the Python data model).

Slice notation works like this:

```
sequence[start:stop:step]
```

And recall that there are defaults for start, stop, and step, so to access the defaults, simply leave out the argument.

Slice notation to get the last nine elements from a list (or any other sequence that supports it, like a string) would look like this:

```
my_list[-9:]
```

When I see this, I read the part in the brackets as "9th from the end, to the end." (Actually, I abbreviate it mentally as "-9, on")

## Explanation:

The full notation is

```
my_list[-9:None:None]
```

and to substitute the defaults (actually, when `step` is negative, `stop`'s default is `-len(my_list) - 1`, so `None` for stop really just means it goes to whichever end step takes it to):

```
my_list[-9:len(my_list):1]
```

The colon, `:`, is what tells Python you're giving it a slice and not a regular index. That's why the idiomatic way of making a shallow copy of lists in Python 2 is

```
list_copy = sequence[:]
```

And clearing them is with:

```
del my_list[:]
```

(Python 3 gets a `list.copy` and `list.clear` method.)

### When `step` is negative, the defaults for `start` and `stop` change

By default, when the `step` argument is empty (or `None`), it defaults to `+1`.

But you can pass in a negative integer, and the list (or most other standard slicables) will be sliced from the end to the beginning.

Thus a negative slice will change the defaults for `start` and `stop`!
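For example:

```python
s = 'Python'

whole = s[::-1]     # 'nohtyP': start defaults to the last index, stop to before the first
partial = s[4::-1]  # 'ohtyP': explicit start of 4, stop still defaults past the beginning
```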

### Confirming this in the source

I like to encourage users to read the source as well as the documentation. The source code for slice objects and this logic is found in CPython's `Objects/sliceobject.c`. First we determine if `step` is negative:

```
step_is_negative = step_sign < 0;
```

If so, the lower bound is `-1` meaning we slice all the way up to and including the beginning, and the upper bound is the length minus 1, meaning we start at the end. (Note that the semantics of this `-1` is different from a `-1` that users may pass indexes in Python indicating the last item.)

```
if (step_is_negative) {
    lower = PyLong_FromLong(-1L);
    if (lower == NULL)
        goto error;

    upper = PyNumber_Add(length, lower);
    if (upper == NULL)
        goto error;
}
```

Otherwise `step` is positive, and the lower bound will be zero and the upper bound (which we go up to but not including) the length of the sliced list.

```
else {
    lower = _PyLong_Zero;
    Py_INCREF(lower);
    upper = length;
    Py_INCREF(upper);
}
```

Then, we may need to apply the defaults for `start` and `stop` - the default then for `start` is calculated as the upper bound when `step` is negative:

```
if (self->start == Py_None) {
    start = step_is_negative ? upper : lower;
    Py_INCREF(start);
}
```

and `stop`, the lower bound:

```
if (self->stop == Py_None) {
    stop = step_is_negative ? lower : upper;
    Py_INCREF(stop);
}
```

# Give your slices a descriptive name!

You may find it useful to separate forming the slice from passing it to the `list.__getitem__` method (that's what the square brackets do). Even if you're not new to it, it keeps your code more readable so that others that may have to read your code can more readily understand what you're doing.

However, you can"t just assign some integers separated by colons to a variable. You need to use the slice object:

```
last_nine_slice = slice(-9, None)
```

The second argument, `None`, is required so that the first argument is interpreted as the `start` argument; otherwise it would be the `stop` argument.

You can then pass the slice object to your sequence:

```
>>> list(range(100))[last_nine_slice]
[91, 92, 93, 94, 95, 96, 97, 98, 99]
```

It"s interesting that ranges also take slices:

```
>>> range(100)[last_nine_slice]
range(91, 100)
```

# Memory Considerations:

Since slices of Python lists create new objects in memory, another important function to be aware of is `itertools.islice`. Typically you'll want to iterate over a slice, not just have it created statically in memory. `islice` is perfect for this. A caveat: it doesn't support negative arguments to `start`, `stop`, or `step`, so if that's an issue you may need to calculate indices or reverse the iterable in advance.

```
import itertools

length = 100
last_nine_iter = itertools.islice(list(range(length)), length - 9, None, 1)
list_last_nine = list(last_nine_iter)
```

and now:

```
>>> list_last_nine
[91, 92, 93, 94, 95, 96, 97, 98, 99]
```

The fact that list slices make a copy is a feature of lists themselves. If you're slicing advanced objects like a Pandas DataFrame, it may return a view on the original, and not a copy.
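A quick check that a list slice really is an independent copy:

```python
original = [1, 2, 3]
copied = original[:]  # slicing a list builds a new list object
copied[0] = 99
# mutating the copy leaves the original untouched:
# original == [1, 2, 3], copied == [99, 2, 3]
```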

I"m surprised nobody has thought of using `iter`"s two-argument form:

```
from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())
```

Demo:

```
>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
```

This works with any iterable and produces output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn't pad; if you want padding, a simple variation on the above will suffice:

```
from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)
```

Demo:

```
>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]
```

Like the `izip_longest`-based solutions, the above always pads. As far as I know, there"s no one- or two-line itertools recipe for a function that optionally pads. By combining the above two approaches, this one comes pretty close:

```
from itertools import islice, chain, repeat

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    if padval == _no_padding:
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)
```

Demo:

```
>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]
```

I believe this is the shortest chunker proposed that offers optional padding.

As Tomasz Gandor observed, the two padding chunkers will stop unexpectedly if they encounter a long sequence of pad values. Here's a final variation that works around that problem in a reasonable way:

```
from itertools import islice

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    it = iter(it)
    chunker = iter(lambda: tuple(islice(it, size)), ())
    if padval == _no_padding:
        yield from chunker
    else:
        for ch in chunker:
            yield ch if len(ch) == size else ch + (padval,) * (size - len(ch))
```

Demo:

```
>>> list(chunk([1, 2, (), (), 5], 2))
[(1, 2), ((), ()), (5,)]
>>> list(chunk([1, 2, None, None, 5], 2, None))
[(1, 2), (None, None), (5, None)]
```

The `grouper()` recipe from the `itertools` documentation's recipes comes close to what you want:

```
from itertools import izip_longest  # Python 2; use zip_longest in Python 3

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
```

It will fill up the last chunk with a fill value, though.

A less general solution that only works on sequences but does handle the last chunk as desired is

```
[my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size)]
```
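For example:

```python
my_list = list(range(10))
chunk_size = 3
chunks = [my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size)]
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]] -- the short last chunk is kept as-is
```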

Finally, a solution that works on general iterators and behaves as desired is

```
import itertools

def grouper(n, iterable):
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, n))
        if not chunk:
            return
        yield chunk
```
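For example, grouping a short string (the definition is repeated here so the snippet runs on its own):

```python
import itertools

def grouper(n, iterable):
    # Yield successive n-tuples from the iterable; the last may be short.
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, n))
        if not chunk:
            return
        yield chunk

groups = list(grouper(3, 'ABCDEFG'))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```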

There"s no such thing a the "first n" keys because a `dict` doesn"t remember which keys were inserted first.

You can get any n key-value pairs though:

```
n_items = take(n, d.iteritems())
```

This uses the implementation of `take` from the `itertools` recipes:

```
from itertools import islice

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))
```


Update for Python 3.6

```
n_items = take(n, d.items())
```
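Putting it together on Python 3.7+, where dictionaries are guaranteed to preserve insertion order (the dictionary here is made-up sample data):

```python
from itertools import islice

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))

# Hypothetical sample dict, for illustration.
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
first_two = take(2, d.items())  # [('a', 1), ('b', 2)] in insertion order
```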
``````