Pandas | Time Series Manipulation Basics


Purposes of Time Series Analysis

  • Create a date series
  • Work with timestamp data
  • Convert string data to timestamps
  • Slice data using a timestamp
  • Resample a time series to a different time period to compute summary statistics or aggregates
  • Deal with missing data

Now let's do some hands-on analysis of some data to demonstrate the use of pandas time series.

    Code # 1:

    import pandas as pd

    # One timestamp per minute from 1 Jan 2019 to 8 Jan 2019 (both ends included)
    range_date = pd.date_range(start='1/1/2019', end='1/08/2019',
                               freq='min')
    print(range_date)

    Output:

    DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 00:01:00',
                   '2019-01-01 00:02:00', '2019-01-01 00:03:00',
                   '2019-01-01 00:04:00', '2019-01-01 00:05:00',
                   '2019-01-01 00:06:00', '2019-01-01 00:07:00',
                   '2019-01-01 00:08:00', '2019-01-01 00:09:00',
                   ...
                   '2019-01-07 23:51:00', '2019-01-07 23:52:00',
                   '2019-01-07 23:53:00', '2019-01-07 23:54:00',
                   '2019-01-07 23:55:00', '2019-01-07 23:56:00',
                   '2019-01-07 23:57:00', '2019-01-07 23:58:00',
                   '2019-01-07 23:59:00', '2019-01-08 00:00:00'],
                  dtype='datetime64[ns]', length=10081, freq='T')

    Explanation:
    This code creates a range of timestamps, one per minute, from 01/01/2019 to 01/08/2019. The frequency can be changed to hours, minutes, seconds, and so on. A minute-level index like this is useful for data that is recorded once per minute. As the output shows, the index has length 10081: seven days of 1,440 minutes each, plus the final midnight timestamp, because both endpoints are included. Note that pandas stores these values with the dtype datetime64[ns].
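To make the effect of the frequency concrete, here is a minimal sketch that builds the same one-week window at three frequencies and compares the lengths. The variable names are illustrative and not part of the original article. The lowercase aliases "h" and "min" are used because recent pandas releases deprecate the older "H" and "T" spellings.

```python
import pandas as pd

start, end = "2019-01-01", "2019-01-08"

# Same one-week window at three different frequencies
daily = pd.date_range(start=start, end=end, freq="D")
hourly = pd.date_range(start=start, end=end, freq="h")
minutely = pd.date_range(start=start, end=end, freq="min")

# Print the number of timestamps each frequency produces
for name, idx in [("daily", daily), ("hourly", hourly), ("minutely", minutely)]:
    print(name, len(idx))
```

The lengths are 8, 169, and 10081: in each case the number of steps in seven days plus one, since the end timestamp is included.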

    Code # 2:

    import pandas as pd

    range_date = pd.date_range(start='1/1/2019', end='1/08/2019',
                               freq='min')

    # Inspect the type of a single element of the index
    print(type(range_date[110]))

    Output:

    <class 'pandas._libs.tslibs.timestamps.Timestamp'>

    Explanation:
    We check the type of a single element of range_date (the one at position 110). Each element of the index is a pandas Timestamp.
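As a small follow-up sketch, the snippet below contrasts the type of the whole range with the type of one element and reads a few fields from that element. The name `ts` is introduced here only for illustration.

```python
import pandas as pd

range_date = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")

# The container is a DatetimeIndex; a single element is a Timestamp
print(type(range_date))
ts = range_date[110]
print(type(ts))

# A Timestamp exposes its components as attributes
print(ts, ts.hour, ts.minute, ts.day_name())
```

Position 110 is 110 minutes after midnight, so the element is 2019-01-01 01:50:00.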

    Code # 3:

    import pandas as pd
    import numpy as np

    range_date = pd.date_range(start='1/1/2019', end='1/08/2019',
                               freq='min')

    # Put the timestamps in a 'date' column
    df = pd.DataFrame(range_date, columns=['date'])
    # Add one random integer in [0, 100) per timestamp
    df['data'] = np.random.randint(0, 100, size=len(range_date))

    print(df.head(10))

    Output:

                     date  data
    0 2019-01-01 00:00:00    49
    1 2019-01-01 00:01:00    58
    2 2019-01-01 00:02:00    48
    3 2019-01-01 00:03:00    96
    4 2019-01-01 00:04:00    42
    5 2019-01-01 00:05:00     8
    6 2019-01-01 00:06:00    20
    7 2019-01-01 00:07:00    96
    8 2019-01-01 00:08:00    48
    9 2019-01-01 00:09:00    78

    Explanation:

    First we create the time series, then build a data frame from it and use NumPy's random integer function to generate one random value per timestamp, stored in a new column named data. Then we print the first ten rows to check the result.
    To manipulate the time series, the data frame needs a datetime index, so that each row is indexed by its timestamp. At this point the timestamps are still an ordinary column named date.
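The difference between a timestamp column and a timestamp index can be checked directly. This is a minimal sketch: the seeded generator and the name `indexed` are assumptions made for reproducibility and illustration, while the article itself uses np.random.randint.

```python
import numpy as np
import pandas as pd

range_date = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")

# Seeded generator (an assumption, so that reruns give the same values)
rng = np.random.default_rng(seed=0)
values = rng.integers(0, 100, size=len(range_date))

df = pd.DataFrame({"date": range_date, "data": values})
print(type(df.index))       # default integer index

# Move the timestamps from a column into the index
indexed = df.set_index("date")
print(type(indexed.index))  # now a DatetimeIndex
print(indexed.head())
```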

    Code # 4:

    import pandas as pd
    import numpy as np

    range_date = pd.date_range(start='1/1/2019', end='1/08/2019',
                               freq='min')

    df = pd.DataFrame(range_date, columns=['date'])
    df['data'] = np.random.randint(0, 100, size=len(range_date))

    # Convert every timestamp to its string representation
    string_data = [str(x) for x in range_date]
    print(string_data[1:11])

    Output:

    ['2019-01-01 00:01:00', '2019-01-01 00:02:00', '2019-01-01 00:03:00',
     '2019-01-01 00:04:00', '2019-01-01 00:05:00', '2019-01-01 00:06:00',
     '2019-01-01 00:07:00', '2019-01-01 00:08:00', '2019-01-01 00:09:00',
     '2019-01-01 00:10:00']

    Explanation:
    This code converts each element of range_date to a string. Because there is a lot of data, we slice the resulting list and print only ten values, starting from the second element of string_data. The list comprehension iterates over every value in range_date. When we use date_range, the range must be fully determined, for example by a start date, an end date, and a frequency, or by a start date, a number of periods, and a frequency.
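Going in the other direction, strings can be parsed back into timestamps with pd.to_datetime. The sketch below round-trips the list and also shows a range defined by a number of periods rather than an end date. The names `parsed` and `by_periods` are illustrative only.

```python
import pandas as pd

range_date = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")
string_data = [str(x) for x in range_date]

# Parse the strings back into a DatetimeIndex
parsed = pd.to_datetime(string_data)
print(parsed[:3])

# A range defined by start, number of periods, and frequency (no end date)
by_periods = pd.date_range(start="2019-01-01", periods=10, freq="min")
print(by_periods[-1])
```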

    Example:

    import pandas as pd
    import numpy as np

    range_data = pd.date_range(start='1/1/2019', end='1/08/2019',
                               freq='min')

    df = pd.DataFrame(range_data, columns=['date'])
    df['data'] = np.random.randint(0, 100, size=len(range_data))

    # Use the timestamps as the index so rows can be selected by date
    df['datetime'] = pd.to_datetime(df['date'])
    df = df.set_index('datetime')
    df.drop(['date'], axis=1, inplace=True)

    # Select all rows for 5 January with .loc, then take rows 1 to 10 of that day
    print(df.loc['2019-01-05'][1:11])

    Output:

                         data
    datetime
    2019-01-05 00:01:00    99
    2019-01-05 00:02:00    21
    2019-01-05 00:03:00    29
    2019-01-05 00:04:00    98
    2019-01-05 00:05:00     0
    2019-01-05 00:06:00    72
    2019-01-05 00:07:00    69
    2019-01-05 00:08:00    53
    2019-01-05 00:09:00     3
    2019-01-05 00:10:00    37
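With a datetime index in place, two of the purposes listed at the top become short operations: slicing by timestamp and resampling to a coarser period. The following is a rough sketch rather than part of the original walkthrough. It assumes deterministic values (0, 1, 2, ...) in place of random ones so the results are easy to verify, and the names `one_day`, `window`, and `hourly_mean` are illustrative.

```python
import numpy as np
import pandas as pd

idx = pd.date_range(start="2019-01-01", end="2019-01-08", freq="min")
# Deterministic values instead of random ones (an assumption for checkability)
df = pd.DataFrame({"data": np.arange(len(idx))}, index=idx)

# Partial-string selection: every row that falls on 5 January
one_day = df.loc["2019-01-05"]

# Label-based slice between two timestamps; both endpoints are included
window = df.loc["2019-01-05 00:01":"2019-01-05 00:10"]

# Downsample from minutes to hours, averaging the values within each hour
hourly_mean = df["data"].resample("h").mean()

print(len(one_day), len(window), len(hourly_mean))
print(hourly_mean.head(3))
```

The slice with explicit timestamps selects the same ten rows as the positional `[1:11]` above, but it states the intended time window directly.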



