# Iterator Functions in Python | Set 2 (islice(), starmap(), tee(), ...)


1. islice(iterable, start, stop, step): this iterator selectively yields values from the iterable passed as its first argument. It takes four arguments: the iterable, the start position, the stop position, and the step.

2. starmap(func, list of tuples): this iterator takes a function and a list of tuples, unpacks each tuple into the function's arguments, and yields the function's result for each tuple in the list.

    # Python code to demonstrate how
    # islice() and starmap() work

    # import "itertools" for iterator operations
    import itertools

    # initializing list
    li = [2, 4, 5, 7, 8, 10, 20]

    # initializing list of tuples
    li1 = [(1, 10, 5), (8, 4, 1), (5, 4, 9), (11, 10, 1)]

    # using islice() to slice the list as needed:
    # starts at index 1, stops before index 6, takes every 2nd item
    print("The sliced list values are: ", end="")
    print(list(itertools.islice(li, 1, 6, 2)))

    # using starmap() to apply a function to each tuple:
    # selects the minimum of each tuple's values
    print("The values acc. to function are: ", end="")
    print(list(itertools.starmap(min, li1)))

Output:

    The sliced list values are: [4, 7, 10]
    The values acc. to function are: [1, 1, 4, 1]

3. takewhile(func, iterable): this iterator is the opposite of dropwhile(); it yields values until the function returns false for the first time.

4. tee(iterator, count): this iterator splits the given iterator into the number of independent iterators given by count.

    # Python code to demonstrate how
    # takewhile() and tee() work

    # import "itertools" for iterator operations
    import itertools

    # initializing list
    li = [2, 4, 6, 7, 8, 10, 20]

    # storing the list in an iterator
    iti = iter(li)

    # using takewhile() to print values until the condition is false
    print("The list values till 1st false value are: ", end="")
    print(list(itertools.takewhile(lambda x: x % 2 == 0, li)))

    # using tee() to make a tuple of 3 iterators
    # that all yield the same values
    it = itertools.tee(iti, 3)

    # printing the values of the iterators
    print("The iterators are:")
    for i in range(0, 3):
        print(list(it[i]))

Output:

    The list values till 1st false value are: [2, 4, 6]
    The iterators are:
    [2, 4, 6, 7, 8, 10, 20]
    [2, 4, 6, 7, 8, 10, 20]
    [2, 4, 6, 7, 8, 10, 20]
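Because takewhile() is described as the counterpart of dropwhile(), a small side-by-side sketch may help. The helper `is_even` and the variable names are illustrative and not part of the original article.

```python
import itertools

li = [2, 4, 6, 7, 8, 10, 20]


def is_even(x):
    # hypothetical predicate used only for this illustration
    return x % 2 == 0


# takewhile() yields items until the predicate fails for the first time
taken = list(itertools.takewhile(is_even, li))

# dropwhile() skips items until the predicate fails for the first time,
# then yields everything that remains (including later even numbers)
dropped = list(itertools.dropwhile(is_even, li))

print(taken)    # [2, 4, 6]
print(dropped)  # [7, 8, 10, 20]
```

With the same predicate, the two results together reproduce the original list.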

5. zip_longest(iterable1, iterable2, fillvalue): this iterator pairs up the values of the iterables, one position at a time. If one of the iterables is exhausted first, the missing values are filled with the value assigned to fillvalue.

    # Python code to demonstrate how
    # zip_longest() works

    # import "itertools" for iterator operations
    import itertools

    # using zip_longest() to combine two iterables
    print("The combined values of iterables is:")
    print(*(itertools.zip_longest('GesoGes', 'ekfrek', fillvalue='_')))

Output:

    The combined values of iterables is:
    ('G', 'e') ('e', 'k') ('s', 'f') ('o', 'r') ('G', 'e') ('e', 'k') ('s', '_')

Combinatorial iterators

1. product(iter1, iter2): this iterator yields the Cartesian product of the two iterables passed as arguments.

2. permutations(iter, group_size): this iterator yields all possible permutations of the elements of the iterable. The size of each permutation group is determined by the group_size argument.

    # Python code to demonstrate how
    # product() and permutations() work

    # import "itertools" for iterator operations
    import itertools

    # using product() to print the Cartesian product
    print("The cartesian product of the containers is:")
    print(list(itertools.product('AB', '12')))

    # using permutations() to compute all possible permutations
    print("All the permutations of the given container is:")
    print(list(itertools.permutations('GfG', 2)))

Output:

    The cartesian product of the containers is:
    [('A', '1'), ('A', '2'), ('B', '1'), ('B', '2')]
    All the permutations of the given container is:
    [('G', 'f'), ('G', 'G'), ('f', 'G'), ('f', 'G'), ('G', 'G'), ('G', 'f')]
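The repeated tuples in the permutations output come from the two 'G' characters being treated as different elements because they sit at different positions. A short sketch of that behaviour, with illustrative variable names:

```python
import itertools

# permutations() distinguishes elements by position, not by value,
# so the two 'G' characters produce tuples that look identical
perms = list(itertools.permutations('GfG', 2))
distinct = set(perms)

print(len(perms))     # 6
print(len(distinct))  # 3
```

Wrapping the result in set() is one way to collapse such duplicates when only distinct values matter.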

3. combinations(iterable, group_size): this iterator yields all possible combinations (without replacement) of the container passed as argument, in groups of the specified size. The tuples follow the order of the input, so they come out in sorted order when the input is sorted.

4. combinations_with_replacement(iterable, group_size): this iterator yields all possible combinations (with replacement) of the container passed as argument, in groups of the specified size. The tuples follow the order of the input, so they come out in sorted order when the input is sorted.

    # Python code to demonstrate how
    # combinations() and combinations_with_replacement() work

    # import "itertools" for iterator operations
    import itertools

    # using combinations() to print each combination
    # (without replacement)
    print("All the combination of container in sorted order (without replacement) is:")
    print(list(itertools.combinations('1234', 2)))

    # using combinations_with_replacement() to print each combination
    # (with replacement)
    print("All the combination of container in sorted order (with replacement) is:")
    print(list(itertools.combinations_with_replacement('GfG', 2)))

Output:

    All the combination of container in sorted order (without replacement) is:
    [('1', '2'), ('1', '3'), ('1', '4'), ('2', '3'), ('2', '4'), ('3', '4')]
    All the combination of container in sorted order (with replacement) is:
    [('G', 'G'), ('G', 'f'), ('G', 'G'), ('f', 'f'), ('f', 'G'), ('G', 'G')]

Infinite iterators

1. count(start, step): this iterator starts at "start" and yields values indefinitely. If step is given, the values advance by that step; otherwise the step defaults to 1.

Example:

itertools.count(5, 2) yields 5, 7, 9, 11, ... indefinitely
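Since count() never stops on its own, a runnable version of this example needs something to cut it off. The sketch below uses islice() for that; the variable name is illustrative.

```python
from itertools import count, islice

# count(5, 2) is infinite, so islice() takes only the first four values
first_four = list(islice(count(5, 2), 4))

print(first_four)  # [5, 7, 9, 11]
```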

2. cycle(iterable): this iterator yields all values in order from the passed container. When the container is exhausted, it starts again from the beginning.

Example:

itertools.cycle([1, 2, 3, 4]) yields 1, 2, 3, 4, 1, 2, 3, 4, 1, ... indefinitely
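As with count(), cycle() has to be bounded before it can be turned into a list. A minimal sketch, again using islice() and an illustrative variable name:

```python
from itertools import cycle, islice

# take the first nine values of the endless 1, 2, 3, 4, 1, 2, ... sequence
first_nine = list(islice(cycle([1, 2, 3, 4]), 9))

print(first_nine)  # [1, 2, 3, 4, 1, 2, 3, 4, 1]
```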

3. repeat(val, num): this iterator yields the passed value repeatedly, an infinite number of times. If the optional num is given, the value is repeated only num times.

    # Python code to demonstrate how
    # repeat() works

    # import "itertools" for iterator operations
    import itertools

    # using repeat() to repeat the number 4 times
    print("Printing the numbers repeatedly:")
    print(list(itertools.repeat(25, 4)))

Output:

    Printing the numbers repeatedly:
    [25, 25, 25, 25]

This article is courtesy of Manjeet Singh.

# Slicing a list

top5 = array[:5]
• To slice a list, there's a simple syntax: array[start:stop:step]
• You can omit any parameter. These are all valid: array[start:], array[:stop], array[::step]
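A few concrete cases of the syntax above, as a sketch. The list `array` holds sample data chosen for this illustration.

```python
array = list(range(10))  # sample data: [0, 1, 2, ..., 9]

top5 = array[:5]          # stop only: first five items
tail = array[7:]          # start only: from index 7 to the end
every_third = array[::3]  # step only: every third item
middle = array[2:8:2]     # all three: indices 2, 4, 6

print(top5)         # [0, 1, 2, 3, 4]
print(tail)         # [7, 8, 9]
print(every_third)  # [0, 3, 6, 9]
print(middle)       # [2, 4, 6]
```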

# Slicing a generator

import itertools
top5 = itertools.islice(my_list, 5) # grab the first five elements
• You can't slice a generator directly in Python. itertools.islice() will wrap an object in a new slicing generator using the syntax itertools.islice(generator, start, stop, step)

• Remember, slicing a generator will exhaust it partially. If you want to keep the entire generator intact, perhaps turn it into a tuple or list first, like: result = tuple(generator)
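The partial exhaustion mentioned above can be seen in a short sketch. The generator `squares` and the other names are made up for the illustration.

```python
import itertools

squares = (n * n for n in range(6))

# islice() pulls the first three items out of the generator...
first_three = list(itertools.islice(squares, 3))

# ...so only the later items are left in it afterwards
remaining = list(squares)

print(first_three)  # [0, 1, 4]
print(remaining)    # [9, 16, 25]

# materializing first keeps everything available for repeated slicing
all_squares = tuple(n * n for n in range(6))
print(all_squares[:3])  # (0, 1, 4)
print(len(all_squares))  # 6
```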

Note: this post assumes Python 3.x syntax.

A generator is simply a function which returns an object on which you can call next, such that for every call it returns some value, until it raises a StopIteration exception, signaling that all values have been generated. Such an object is called an iterator.

Normal functions return a single value using return, just like in Java. In Python, however, there is an alternative, called yield. Using yield anywhere in a function makes it a generator. Observe this code:

>>> def myGen(n):
...     yield n
...     yield n + 1
...
>>> g = myGen(6)
>>> next(g)
6
>>> next(g)
7
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, myGen(n) is a function which yields n and n + 1. Every call to next yields a single value, until all values have been yielded. for loops call next in the background, thus:

>>> for n in myGen(6):
...     print(n)
...
6
7
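What a for loop does in the background can be approximated by hand. The sketch below is a rough equivalent, not the interpreter's actual implementation; the list `seen` is only there to collect the values.

```python
def myGen(n):
    yield n
    yield n + 1


# roughly what "for n in myGen(6): ..." does behind the scenes
it = iter(myGen(6))
seen = []
while True:
    try:
        n = next(it)       # ask for the next value
    except StopIteration:  # the generator has nothing left
        break
    seen.append(n)

print(seen)  # [6, 7]
```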

Likewise there are generator expressions, which provide a means to succinctly describe certain common types of generators:

>>> g = (n for n in range(3, 5))
>>> next(g)
3
>>> next(g)
4
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Note that generator expressions are much like list comprehensions:

>>> lc = [n for n in range(3, 5)]
>>> lc
[3, 4]

Observe that a generator object is generated once, but its code is not run all at once. Only calls to next actually execute (part of) the code. Execution of the code in a generator stops once a yield statement has been reached, upon which it returns a value. The next call to next then causes execution to continue in the state in which the generator was left after the last yield. This is a fundamental difference with regular functions: those always start execution at the "top" and discard their state upon returning a value.
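The pause-and-resume behaviour can be made visible by recording when each part of the generator body runs. This is a small sketch; the function `traced` and the list `log` are hypothetical names used only here.

```python
def traced(log):
    log.append("start")     # runs on the first next()
    yield 1
    log.append("resumed")   # runs on the second next()
    yield 2
    log.append("finished")  # runs when the generator is asked again


log = []
g = traced(log)
after_creation = list(log)  # nothing has run yet

first = next(g)
after_first = list(log)

second = next(g)
after_second = list(log)

last = next(g, None)        # default avoids the StopIteration exception
after_last = list(log)

print(after_creation)  # []
print(after_first)     # ['start']
print(after_second)    # ['start', 'resumed']
print(after_last)      # ['start', 'resumed', 'finished']
```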

There are more things to be said about this subject. It is e.g. possible to send data back into a generator (reference). But that is something I suggest you do not look into until you understand the basic concept of a generator.

Now you may ask: why use generators? There are a couple of good reasons:

• Certain concepts can be described much more succinctly using generators.
• Instead of creating a function which returns a list of values, one can write a generator which generates the values on the fly. This means that no list needs to be constructed, meaning that the resulting code is more memory efficient. In this way one can even describe data streams which would simply be too large to fit in memory.
• Generators allow for a natural way to describe infinite streams. Consider for example the Fibonacci numbers:

>>> def fib():
...     a, b = 0, 1
...     while True:
...         yield a
...         a, b = b, a + b
...
>>> import itertools
>>> list(itertools.islice(fib(), 10))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

This code uses itertools.islice to take a finite number of elements from an infinite stream. You are advised to have a good look at the functions in the itertools module, as they are essential tools for writing advanced generators with great ease.

About Python 2: in the above examples next is a function which calls the method __next__ on the given object. In Python 2 that method is named next instead, and before the built-in next() was added in Python 2.6 one wrote o.next() instead of next(o). Python 2.6 and 2.7 have next() call .next, so you need not use the following there:

>>> g = (n for n in range(3, 5))
>>> g.next()
3

Python 2:

with open("datafile") as myfile:
    head = [next(myfile) for x in xrange(N)]

Python 3:

with open("datafile") as myfile:
    head = [next(myfile) for x in range(N)]

Here's another way (both Python 2 & 3):

from itertools import islice

with open("datafile") as myfile:
    head = list(islice(myfile, N))

## Explain Python's slice notation

In short, the colons (:) in subscript notation (subscriptable[subscriptarg]) make slice notation - which has the optional arguments, start, stop, step:

sliceable[start:stop:step]

Python slicing is a computationally fast way to methodically access parts of your data. In my opinion, to be even an intermediate Python programmer, it's one aspect of the language that it is necessary to be familiar with.

## Important Definitions

To begin with, let's define a few terms:

start: the beginning index of the slice, it will include the element at this index unless it is the same as stop, defaults to 0, i.e. the first index. If it's negative, it means to start n items from the end.

stop: the ending index of the slice, it does not include the element at this index, defaults to length of the sequence being sliced, that is, up to and including the end.

step: the amount by which the index increases, defaults to 1. If it's negative, you're slicing over the iterable in reverse.

## How Indexing Works

You can make any of these positive or negative numbers. The meaning of the positive numbers is straightforward, but for negative numbers, just like indexes in Python, you count backwards from the end for the start and stop, and for the step, you simply decrement your index. This example is from the documentation's tutorial, but I've modified it slightly to indicate which item in a sequence each index references:

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
  0   1   2   3   4   5
 -6  -5  -4  -3  -2  -1
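The diagram can be checked directly against a string. The following sketch uses the same word and a few slices picked for illustration:

```python
s = "Python"

# each position has a non-negative and a negative index
print(s[0], s[-6])  # P P
print(s[5], s[-1])  # n n

# negative start/stop count from the end; a negative step walks backwards
print(s[1:4])       # yth
print(s[-2:])       # on
print(s[:-2])       # Pyth
print(s[::-1])      # nohtyP
```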

## How Slicing Works

To use slice notation with a sequence that supports it, you must include at least one colon in the square brackets that follow the sequence (which actually implement the __getitem__ method of the sequence, according to the Python data model.)

Slice notation works like this:

sequence[start:stop:step]

And recall that there are defaults for start, stop, and step, so to access the defaults, simply leave out the argument.

Slice notation to get the last nine elements from a list (or any other sequence that supports it, like a string) would look like this:

my_list[-9:]

When I see this, I read the part in the brackets as "9th from the end, to the end." (Actually, I abbreviate it mentally as "-9, on")

## Explanation:

The full notation is

my_list[-9:None:None]

and to substitute the defaults (actually when step is negative, stop's default is -len(my_list) - 1, so None for stop really just means it goes to whichever end step takes it to):

my_list[-9:len(my_list):1]

The colon, :, is what tells Python you're giving it a slice and not a regular index. That's why the idiomatic way of making a shallow copy of lists in Python 2 is

list_copy = sequence[:]

And clearing them is with:

del my_list[:]

(Python 3 gets a list.copy and list.clear method.)

### When step is negative, the defaults for start and stop change

By default, when the step argument is empty (or None), it is assigned to +1.

But you can pass in a negative integer, and the list (or most other standard sliceables) will be sliced from the end to the beginning.

Thus a negative slice will change the defaults for start and stop!

### Confirming this in the source

I like to encourage users to read the source as well as the documentation. The source code for slice objects and this logic is found here. First we determine if step is negative:

step_is_negative = step_sign < 0;

If so, the lower bound is -1, meaning we slice all the way up to and including the beginning, and the upper bound is the length minus 1, meaning we start at the end. (Note that this -1 has different semantics from a -1 that users may pass as an index in Python to indicate the last item.)

if (step_is_negative) {
    lower = PyLong_FromLong(-1L);
    if (lower == NULL)
        goto error;

    upper = PyNumber_Add(length, lower);
    if (upper == NULL)
        goto error;
}

Otherwise step is positive, and the lower bound will be zero and the upper bound (which we go up to but not including) the length of the sliced list.

else {
    lower = _PyLong_Zero;
    Py_INCREF(lower);
    upper = length;
    Py_INCREF(upper);
}

Then, we may need to apply the defaults for start and stop - the default then for start is calculated as the upper bound when step is negative:

if (self->start == Py_None) {
    start = step_is_negative ? upper : lower;
    Py_INCREF(start);
}

and stop, the lower bound:

if (self->stop == Py_None) {
    stop = step_is_negative ? lower : upper;
    Py_INCREF(stop);
}
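The same defaults can be observed from Python without reading the C source, using the slice.indices() method, which returns the concrete (start, stop, step) for a sequence of a given length. The length 5 below is an arbitrary choice for the sketch.

```python
# defaults filled in for a sequence of length 5
forward = slice(None, None, 1).indices(5)
backward = slice(None, None, -1).indices(5)

print(forward)   # (0, 5, 1)
print(backward)  # (4, -1, -1)

# the backward triple describes the indices 4, 3, 2, 1, 0
print(list(range(*backward)))  # [4, 3, 2, 1, 0]
```

For the negative step, start becomes length - 1 and stop becomes -1, matching the upper and lower bounds described above.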

# Give your slices a descriptive name!

You may find it useful to separate forming the slice from passing it to the list.__getitem__ method (that's what the square brackets do). Even if you're not new to it, it keeps your code more readable so that others that may have to read your code can more readily understand what you're doing.

However, you can't just assign some integers separated by colons to a variable. You need to use the slice object:

last_nine_slice = slice(-9, None)

The second argument, None, is required so that the first argument is interpreted as the start argument; otherwise it would be the stop argument.
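The difference between the one-argument and two-argument forms can be checked on the slice objects themselves. A brief sketch with illustrative names:

```python
data = list(range(100))

only_stop = slice(-9)          # a single argument is taken as stop
start_only = slice(-9, None)   # explicit None stop makes -9 the start

print(only_stop.start, only_stop.stop)    # None -9
print(start_only.start, start_only.stop)  # -9 None

print(len(data[only_stop]))  # 91 (everything except the last nine)
print(data[start_only])      # [91, 92, 93, 94, 95, 96, 97, 98, 99]
```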

You can then pass the slice object to your sequence:

>>> list(range(100))[last_nine_slice]
[91, 92, 93, 94, 95, 96, 97, 98, 99]

It's interesting that ranges also take slices:

>>> range(100)[last_nine_slice]
range(91, 100)

# Memory Considerations:

Since slices of Python lists create new objects in memory, another important function to be aware of is itertools.islice. Typically you'll want to iterate over a slice, not just have it created statically in memory. islice is perfect for this. A caveat, it doesn't support negative arguments to start, stop, or step, so if that's an issue you may need to calculate indices or reverse the iterable in advance.

import itertools

length = 100
last_nine_iter = itertools.islice(list(range(length)), length-9, None, 1)
list_last_nine = list(last_nine_iter)

and now:

>>> list_last_nine
[91, 92, 93, 94, 95, 96, 97, 98, 99]
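The caveat about negative arguments can be seen directly. When the length is not known in advance, one alternative (a swapped-in technique, not from the text above) is collections.deque with maxlen, which keeps only the most recent items while consuming the iterator. The names below are illustrative.

```python
import itertools
from collections import deque

numbers = iter(range(100))  # an iterator, so there is no len() to compute from

# islice() rejects negative indices with a ValueError
try:
    itertools.islice(numbers, -9, None)
    negative_ok = True
except ValueError:
    negative_ok = False

# a bounded deque discards older items as new ones arrive
last_nine = list(deque(numbers, maxlen=9))

print(negative_ok)  # False
print(last_nine)    # [91, 92, 93, 94, 95, 96, 97, 98, 99]
```

This still walks the whole iterator, but it holds at most nine items in memory at a time.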

The fact that list slices make a copy is a feature of lists themselves. If you're slicing advanced objects like a Pandas DataFrame, it may return a view on the original, and not a copy.

I'm surprised nobody has thought of using iter's two-argument form:

from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())

Demo:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
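For readers who have not met the two-argument form before: iter(callable, sentinel) calls the callable repeatedly and stops as soon as it returns a value equal to the sentinel. In chunk() the sentinel is the empty tuple. The sketch below shows the same mechanism on an in-memory byte stream, chosen only as an illustration.

```python
import io

stream = io.BytesIO(b"abcdefghij")

# read(4) returns b"" once the stream is exhausted, which ends the iteration
blocks = list(iter(lambda: stream.read(4), b""))

print(blocks)  # [b'abcd', b'efgh', b'ij']
```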

This works with any iterable and produces output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn't pad; if you want padding, a simple variation on the above will suffice:

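from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)

Demo:

>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]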

Like the izip_longest-based solutions, the above always pads. As far as I know, there's no one- or two-line itertools recipe for a function that optionally pads. By combining the above two approaches, this one comes pretty close:

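_no_padding = object()

def chunk(it, size, padval=_no_padding):
    if padval is _no_padding:
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)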

Demo:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, "a"))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

I believe this is the shortest chunker proposed that offers optional padding.

As Tomasz Gandor observed, the two padding chunkers will stop unexpectedly if they encounter a long sequence of pad values. Here's a final variation that works around that problem in a reasonable way:

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    it = iter(it)
    chunker = iter(lambda: tuple(islice(it, size)), ())
    if padval is _no_padding:
        yield from chunker
    else:
        for ch in chunker:
            yield ch if len(ch) == size else ch + (padval,) * (size - len(ch))

Demo:

>>> list(chunk([1, 2, (), (), 5], 2))
[(1, 2), ((), ()), (5,)]
>>> list(chunk([1, 2, None, None, 5], 2, None))
[(1, 2), (None, None), (5, None)]

The grouper() recipe from the itertools documentation's recipes comes close to what you want:

from itertools import izip_longest  # Python 2 name; Python 3 calls it zip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

It will fill up the last chunk with a fill value, though.

A less general solution that only works on sequences but does handle the last chunk as desired is

[my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size)]

Finally, a solution that works on general iterators and behaves as desired is

import itertools

def grouper(n, iterable):
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, n))
        if not chunk:
            return
        yield chunk

There's no such thing as the "first n" keys because a dict doesn't remember which keys were inserted first.

You can get any n key-value pairs though:

n_items = take(n, d.iteritems())

This uses the implementation of take from the itertools recipes:

from itertools import islice

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))


Update for Python 3.6

n_items = take(n, d.items())
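On Python 3.7 and later, dicts are guaranteed to keep insertion order (in CPython 3.6 this was an implementation detail), so the pairs returned are the first ones inserted. A self-contained sketch with a sample dict `d` made up for the illustration:

```python
from itertools import islice


def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))


# sample data; assumes Python 3.7+ where insertion order is preserved
d = {"a": 1, "b": 2, "c": 3, "d": 4}
n_items = take(2, d.items())

print(n_items)  # [('a', 1), ('b', 2)]
```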