NumPy in Python | Bundle 2 (Advanced)

This article discusses some of the more advanced techniques available in NumPy.

  1. Stacking: multiple arrays can be stacked together along different axes.
    • np.vstack: stacks arrays along the vertical axis.
    • np.hstack: stacks arrays along the horizontal axis.
    • np.column_stack: stacks one-dimensional arrays as columns into a two-dimensional array.
    • np.concatenate: joins arrays along the specified axis (the axis is passed as an argument).

    import numpy as np

    a = np.array([[1, 2],
                  [3, 4]])

    b = np.array([[5, 6],
                  [7, 8]])

    # vertical stacking
    print("Vertical stacking:", np.vstack((a, b)))

    # horizontal stacking
    print("Horizontal stacking:", np.hstack((a, b)))

    c = [5, 6]

    # column stacking
    print("Column stacking:", np.column_stack((a, c)))

    # concatenation method
    print("Concatenating to 2nd axis:", np.concatenate((a, b), axis=1))

    Output:

    Vertical stacking: [[1 2]
     [3 4]
     [5 6]
     [7 8]]
    Horizontal stacking: [[1 2 5 6]
     [3 4 7 8]]
    Column stacking: [[1 2 5]
     [3 4 6]]
    Concatenating to 2nd axis: [[1 2 5 6]
     [3 4 7 8]]
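As a small sketch of how the stacking functions relate to each other, the snippet below checks that np.vstack and np.hstack match np.concatenate along axis 0 and axis 1 for these 2-D inputs, and contrasts np.column_stack with np.hstack on 1-D inputs. The names x, y, columns and flat are illustrative only.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# For 2-D inputs, vstack/hstack match concatenate along axis 0/axis 1
same_vertical = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))
same_horizontal = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1))
print(same_vertical, same_horizontal)

# column_stack turns 1-D inputs into columns; hstack joins them end to end
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
columns = np.column_stack((x, y))  # 2-D result, one column per input
flat = np.hstack((x, y))           # 1-D result
print(columns.shape, flat.shape)
```

The difference matters mostly for 1-D inputs: for 2-D inputs, np.column_stack and np.hstack behave the same way.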
  2. Splitting: the following functions split an array into parts:
    • np.hsplit: splits the array along the horizontal axis.
    • np.vsplit: splits the array along the vertical axis.
    • np.array_split: splits the array along the specified axis.

    import numpy as np

    a = np.array([[1, 3, 5, 7, 9, 11],
                  [2, 4, 6, 8, 10, 12]])

    # horizontal splitting
    print("Splitting along horizontal axis into 2 parts:", np.hsplit(a, 2))

    # vertical splitting
    print("Splitting along vertical axis into 2 parts:", np.vsplit(a, 2))

    Output:

    Splitting along horizontal axis into 2 parts: [array([[1, 3, 5], [2, 4, 6]]), array([[7, 9, 11], [8, 10, 12]])]
    Splitting along vertical axis into 2 parts: [array([[1, 3, 5, 7, 9, 11]]), array([[2, 4, 6, 8, 10, 12]])]
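np.array_split is listed above but not demonstrated. The sketch below, which reuses the same array, suggests one way it differs from np.hsplit: it accepts an axis argument and tolerates a number of parts that does not divide the axis evenly. The variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 3, 5, 7, 9, 11],
              [2, 4, 6, 8, 10, 12]])

# Split the 6 columns into 4 parts; array_split allows uneven pieces
parts = np.array_split(a, 4, axis=1)
widths = [p.shape[1] for p in parts]
print(widths)

# hsplit raises here, because 6 columns cannot be divided evenly by 4
try:
    np.hsplit(a, 4)
    even_split_ok = True
except ValueError:
    even_split_ok = False
print(even_split_ok)
```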
  3. Broadcasting: The term broadcasting describes how NumPy handles arrays of different shapes during arithmetic operations. Subject to certain constraints, the smaller array is "broadcast" across the larger array so that the two have compatible shapes.

    Broadcasting provides a means of vectorizing array operations, so that the looping happens in C instead of Python. It does this without making unnecessary copies of the data and usually leads to an efficient implementation of the algorithm. There are also cases where broadcasting is a bad idea because it leads to inefficient use of memory, which slows down computation.

    NumPy operations are usually element-wise, which requires two arrays to have exactly the same shape. NumPy's broadcasting rule relaxes this constraint when the arrays' shapes meet certain conditions.

    Broadcasting rule: Shapes are compared axis by axis, starting from the trailing (rightmost) axis. For two arrays to be broadcast together, the sizes of each pair of trailing axes must either be equal, or one of them must be 1.

    Let's see some examples:

      A      (2-D array):  4 x 3
      B      (1-D array):      3
      Result (2-D array):  4 x 3

      A      (4-D array):  7 x 1 x 6 x 1
      B      (3-D array):      3 x 1 x 5
      Result (4-D array):  7 x 3 x 6 x 5

    But this would be a mismatch:

      A:  4 x 3
      B:      4
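The shape examples above can be checked directly. This is a minimal sketch that uses arrays of ones as stand-ins (only the shapes matter) and catches the ValueError NumPy raises for the mismatched case.

```python
import numpy as np

# Compatible: the trailing axes are 3 and 3
small = np.ones((4, 3)) + np.ones(3)
print(small.shape)

# Compatible: every pair of trailing axes is equal or contains a 1
result = np.ones((7, 1, 6, 1)) + np.ones((3, 1, 5))
print(result.shape)

# Incompatible: the trailing axes are 3 and 4
try:
    np.ones((4, 3)) + np.ones(4)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```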

    The simplest broadcasting example occurs when an array and a scalar value are combined in an operation.
    Consider the example below:

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])

    # Example 1: multiply by a scalar
    b = 2.0
    print(a * b)

    # Example 2: multiply by a sequence of the same length
    c = [2.0, 2.0, 2.0]
    print(a * c)

    Output:

    [2. 4. 6.]
    [2. 4. 6.]

    We can think of the scalar b as being stretched during the arithmetic operation into an array with the same shape as a. The new elements in b are simply copies of the original scalar. The stretching analogy is only conceptual, however.
    NumPy is smart enough to use the original scalar value without actually making copies, so that broadcasting operations are as memory- and computationally efficient as possible. Because example 1 moves less memory during the multiplication (b is a scalar, not an array), it is about 10% faster than example 2, as measured with standard NumPy on Windows 2000 with arrays of one million elements.
    The figure below makes the concept clearer:

    In the above example, the scalar b is stretched to become an array with the same shape as a, so that the shapes are compatible for element-wise multiplication.
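One way to see that the stretching involves no copying is np.broadcast_to, which returns a read-only view with the requested shape. In this sketch (the name stretched is illustrative), a stride of 0 indicates that every element of the view reads the same underlying value.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0

# A read-only view that looks like b repeated to match a's shape
stretched = np.broadcast_to(b, a.shape)
print(stretched)
print(stretched.strides)  # stride 0: all elements share one memory location

# Multiplying by the view gives the same result as multiplying by the scalar
product = a * stretched
print(product)
```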

    Now let's look at an example where both arrays are stretched.

    import numpy as np

    a = np.array([0.0, 10.0, 20.0, 30.0])
    b = np.array([0.0, 1.0, 2.0])

    # np.newaxis turns a into a column of shape (4, 1)
    print(a[:, np.newaxis] + b)

    Output:

    [[ 0.  1.  2.]
     [10. 11. 12.]
     [20. 21. 22.]
     [30. 31. 32.]]


    In some cases, broadcasting stretches both arrays to form an output array larger than either of the initial arrays.
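To make the shapes in this example explicit, the following sketch prints the shape of each operand and of the result, then compares the result with np.add.outer, which builds the same table of pairwise sums. The names column, table and same are illustrative.

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([0.0, 1.0, 2.0])

column = a[:, np.newaxis]
print(column.shape, b.shape)

# The size-1 axis of column stretches to 3, and b is repeated for each of the 4 rows
table = column + b
print(table.shape)

# The same table, built as an explicit outer sum
same = np.array_equal(table, np.add.outer(a, b))
print(same)
```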

  4. Working with datetime: NumPy has core array data types that natively support datetime functionality. The data type is called "datetime64", so named because "datetime" is already taken by the datetime library included in Python.
    The example below shows some of its uses:

    import numpy as np

    # create a date
    today = np.datetime64('2017-02-12')
    print("Date is:", today)
    print("Year is:", np.datetime64(today, 'Y'))

    # create an array of all dates in one month
    dates = np.arange('2017-02', '2017-03', dtype='datetime64[D]')
    print("Dates of February, 2017:", dates)
    print("Today is February:", today in dates)

    # date arithmetic
    dur = np.datetime64('2017-05-22') - np.datetime64('2016-05-22')
    print("No. of days:", dur)
    print("No. of weeks:", np.timedelta64(dur, 'W'))

    # sorting dates
    a = np.array(['2017-02-12', '2016-10-13', '2019-05-22'], dtype='datetime64')
    print("Dates in sorted order:", np.sort(a))

    Output:

    Date is: 2017-02-12
    Year is: 2017
    Dates of February, 2017: ['2017-02-01' '2017-02-02' '2017-02-03' '2017-02-04' '2017-02-05'
     '2017-02-06' '2017-02-07' '2017-02-08' '2017-02-09' '2017-02-10'
     '2017-02-11' '2017-02-12' '2017-02-13' '2017-02-14' '2017-02-15'
     '2017-02-16' '2017-02-17' '2017-02-18' '2017-02-19' '2017-02-20'
     '2017-02-21' '2017-02-22' '2017-02-23' '2017-02-24' '2017-02-25'
     '2017-02-26' '2017-02-27' '2017-02-28']
    Today is February: True
    No. of days: 365 days
    No. of weeks: 52 weeks
    Dates in sorted order: ['2016-10-13' '2017-02-12' '2019-05-22']
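A few more datetime64 operations, sketched under the assumption that the same kind of day-resolution dates are in use: shifting a date by a timedelta64, truncating a date to a coarser unit with astype, and taking differences between neighbouring dates. The sample dates in the array are made up for illustration.

```python
import numpy as np

today = np.datetime64('2017-02-12')

# Adding a timedelta64 shifts the date
later = today + np.timedelta64(30, 'D')
print(later)

# Casting to a coarser unit truncates the date to that unit
month = today.astype('datetime64[M]')
print(month)

# Differences between neighbouring dates are timedelta64 values
dates = np.array(['2017-02-12', '2017-02-15', '2017-03-01'], dtype='datetime64[D]')
gaps = np.diff(dates)
print(gaps)
```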
  5. Linear algebra in NumPy: The linear algebra module in NumPy offers various methods for applying linear algebra to any NumPy array.
    You can find:
    • the rank, determinant, trace, etc. of an array
    • eigenvalues of matrices
    • matrix and vector products (dot, inner, outer, etc.) and matrix exponentiation
    • solutions of linear or tensor equations, and much more!

    The example below shows how NumPy can be used for some matrix operations.

    import numpy as np

    A = np.array([[6, 1, 1],
                  [4, -2, 5],
                  [2, 8, 7]])

    print("Rank of A:", np.linalg.matrix_rank(A))

    print("Trace of A:", np.trace(A))

    print("Determinant of A:", np.linalg.det(A))

    print("Inverse of A:", np.linalg.inv(A))

    print("Matrix A raised to power 3:", np.linalg.matrix_power(A, 3))

    Output:

    Rank of A: 3
    Trace of A: 11
    Determinant of A: -306.0
    Inverse of A: [[ 0.17647059 -0.00326797 -0.02287582]
     [ 0.05882353 -0.13071895  0.08496732]
     [-0.11764706  0.1503268   0.05228758]]
    Matrix A raised to power 3: [[336 162 228]
     [406 162 469]
     [698 702 905]]
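The list above also mentions eigenvalues and vector products, which the example does not cover. The following sketch uses the same matrix A with np.linalg.eig and checks the defining relation A v = lambda v, then shows np.dot, np.inner and np.outer on two small vectors u and v chosen only for illustration.

```python
import numpy as np

A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])

# Eigenvalues and eigenvectors: column i of `vectors` pairs with values[i]
values, vectors = np.linalg.eig(A)
eig_ok = np.allclose(A @ vectors, vectors * values)
print(eig_ok)

# Vector products
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
print(np.dot(u, v))    # dot product: 1*4 + 2*5 + 3*6
print(np.inner(u, v))  # inner product, the same as dot for 1-D arrays
print(np.outer(u, v))  # outer product, a 3 x 3 matrix
```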

    Suppose we want to solve this set of linear equations:

      x + 2 * y = 8
      3 * x + 4 * y = 18

    This problem can be solved using the linalg.solve method, as shown in the example below:

    import numpy as np

    # coefficients
    a = np.array([[1, 2], [3, 4]])

    # constants
    b = np.array([8, 18])

    print("Solution of linear equations:", np.linalg.solve(a, b))

    Output:

     Solution of linear equations: [2. 3.] 
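A quick way to sanity-check such a solution is to substitute it back into the system. The sketch below does that with np.allclose, and also compares against multiplying by the explicit inverse, which gives the same answer for this small system.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([8, 18])

x = np.linalg.solve(a, b)

# Substituting the solution back should reproduce the constants
check = np.allclose(a @ x, b)
print(check)

# Multiplying by the explicit inverse gives the same answer here
x_via_inverse = np.linalg.inv(a) @ b
print(np.allclose(x, x_via_inverse))
```

For larger or poorly conditioned systems, np.linalg.solve is generally the preferred route, since it avoids forming the inverse explicitly.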

    Finally, let's look at an example that shows how to perform linear regression using the least squares method.

    The linear regression line has the form w1 * x + w2 = y, and it is the line that minimizes the sum of the squared distances from each data point to the line. So, given n data pairs (xi, yi), the parameters we are looking for are w1 and w2, which minimize the error:

      E = sum over i = 1..n of (yi - (w1 * xi + w2))^2

    Let's see an example below:

    import numpy as np
    import matplotlib.pyplot as plt

    # x coordinates
    x = np.arange(0, 9)
    A = np.array([x, np.ones(9)])

    # linearly generated sequence
    y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]

    # getting regression line parameters (slope w[0], intercept w[1])
    w = np.linalg.lstsq(A.T, y, rcond=None)[0]

    # drawing the line
    line = w[0] * x + w[1]  # regression line
    plt.plot(x, line, 'r-')
    plt.plot(x, y, 'o')
    plt.show()

    Output: a plot of the data points (circles) together with the fitted regression line (red).
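As a cross-check on the least squares fit, the sketch below computes the line parameters in three ways that should agree: np.linalg.lstsq, np.polyfit with degree 1, and the normal equations (A^T A) w = A^T y. The normal-equations route is an addition for comparison and is not part of the example above. Plotting is omitted so that the snippet runs without matplotlib.

```python
import numpy as np

x = np.arange(0, 9)
y = np.array([19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24])

# Design matrix with one column for x and one column of ones
A = np.column_stack((x, np.ones(len(x))))

w, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
print("slope, intercept:", w)

# np.polyfit with degree 1 fits the same line
slope, intercept = np.polyfit(x, y, 1)
print(np.allclose(w, [slope, intercept]))

# The normal equations give the same parameters
w_normal = np.linalg.solve(A.T @ A, A.T @ y)
print(np.allclose(w, w_normal))
```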

This concludes this series of tutorials on NumPy.

NumPy is a widely used general-purpose library that underlies many other computational libraries such as scipy, scikit-learn, tensorflow, matplotlib, opencv, etc. A basic understanding of NumPy helps you work efficiently with these higher-level libraries!
