Clean string data in a given Pandas DataFrame


Suppose we are working with data from an e-commerce site, and the product names are not formatted consistently. We want to clean the data so that the names have no leading or trailing spaces and each product name starts with a capital letter, with the remaining letters in lowercase.

Solution #1: In many cases, we need to write our own custom function suited to the task at hand.

# Import pandas as pd
import pandas as pd

# Create the data frame
df = pd.DataFrame({'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
                   'Product': ['UMbreLla', ' maTress', 'BaDmintoN', 'Shuttle '],
                   'Updated_Price': [1250, 1450, 1550, 400],
                   'Discount': [10, 8, 15, 10]})

# Print the data frame
print(df)

Output:

Now we will write our own custom function to solve this problem.

def Format_data(df):
    # Iterate over all rows
    for i in range(df.shape[0]):
        # Reassign the value in the Product column (column position 1):
        # first remove surrounding spaces with strip(),
        # then capitalize the result with capitalize()
        df.iat[i, 1] = df.iat[i, 1].strip().capitalize()

# Call the function
Format_data(df)

# Print the data frame
print(df)

Output:

Solution #2: Now we will see a more concise and usually more efficient approach using the Pandas apply() function. Because it is called on a single column here, the method actually used is Series.apply().

# Import pandas as pd
import pandas as pd

# Create the data frame
df = pd.DataFrame({'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
                   'Product': ['UMbreLla', ' maTress', 'BaDmintoN', 'Shuttle'],
                   'Updated_Price': [1250, 1450, 1550, 400],
                   'Discount': [10, 8, 15, 10]})

# Print the data frame
print(df)

Output:

Let's use apply() to put the Product names in the desired format. Inside the call, we pass a lambda function that strips and capitalizes each value.

# Use apply() on the Product column
df['Product'] = df['Product'].apply(lambda x: x.strip().capitalize())

# Print the data frame
print(df)

Output:
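As a possible third variant, the same cleanup can be written with the vectorized string accessor Series.str instead of a lambda. This is only a sketch, not part of the original solutions; the small data frame below reuses the article's Product values and drops the other columns for brevity. One practical difference, as far as I can tell, is that the .str methods pass missing values through, whereas calling x.strip() inside a lambda would fail on a non-string value such as NaN.

```python
import pandas as pd

# Reduced version of the article's data frame (assumption: only the
# columns needed for the demonstration are kept).
df = pd.DataFrame({
    "Product": ["UMbreLla", " maTress", "BaDmintoN", "Shuttle "],
    "Updated_Price": [1250, 1450, 1550, 400],
})

# Vectorized string methods: strip surrounding whitespace,
# then capitalize each value.
df["Product"] = df["Product"].str.strip().str.capitalize()

print(df["Product"].tolist())
# ['Umbrella', 'Matress', 'Badminton', 'Shuttle']
```

Chaining the two .str calls keeps the intent readable and avoids writing a Python-level function for each row.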

InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately

3 answers

I tried to perform a REST GET through python requests with the following code and got an error.

Code snip:

import requests
header = {"Authorization": "Bearer..."}
url = az_base_url + az_subscription_id + "/resourcegroups/Default-Networking/resources?" + az_api_version
r = requests.get(url, headers=header)

Error:

/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:79: 
          InsecurePlatformWarning: A true SSLContext object is not available. 
          This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. 
          For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning

My python version is 2.7.3. I tried to install urllib3 and requests[security] as some other thread suggests, but I still got the same error.

Wonder if anyone can provide some tips?

334

Answer #1

The docs give a fair indication of what's required; however, requests allows us to skip a few steps:

You only need to install the security package extras (thanks @admdrew for pointing it out)

$ pip install requests[security]

or, install them directly:

$ pip install pyopenssl ndg-httpsclient pyasn1

Requests will then automatically inject pyopenssl into urllib3


If you're on Ubuntu, you may run into trouble installing pyopenssl; you'll need these dependencies:

$ apt-get install libffi-dev libssl-dev
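To see whether an interpreter is affected in the first place, a quick check of the standard ssl module can help. This is a rough diagnostic sketch, not something from the answer above: the function name ssl_support_summary is made up for illustration, and it rests on my understanding that the warning shows up on Python 2 releases older than 2.7.9, where ssl.SSLContext is missing.

```python
import ssl
import sys


def ssl_support_summary():
    """Report what the interpreter's ssl module offers (hypothetical helper)."""
    return {
        "python_version": sys.version.split()[0],
        # Missing on old Python 2 builds such as 2.7.3.
        "has_sslcontext": hasattr(ssl, "SSLContext"),
        "has_default_context": hasattr(ssl, "create_default_context"),
        "openssl_version": ssl.OPENSSL_VERSION,
    }


for key, value in ssl_support_summary().items():
    print(key, "=", value)
```

If has_sslcontext comes back False, installing the extras above or upgrading Python is probably the way to go.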


Dynamic instantiation from string name of a class in dynamically imported module?

3 answers

In python, I have to instantiate a certain class, knowing its name in a string, but this class "lives" in a dynamically imported module. An example follows:

loader-class script:

import sys
class loader:
  def __init__(self, module_name, class_name): # both args are strings
    try:
      __import__(module_name)
      modul = sys.modules[module_name]
      instance = modul.class_name() # obviously this doesn't work; this is my main problem!
    except ImportError:
      pass # manage import error

some-dynamically-loaded-module script:

class myName:
  # etc...

I use this arrangement so that any dynamically loaded module can be used by the loader class, following certain predefined behaviours in the dyn-loaded modules...

222

Answer #1

You can use getattr

getattr(module, class_name)

to access the class. More complete code:

module = __import__(module_name)
class_ = getattr(module, class_name)
instance = class_()

Alternatively, we may use importlib:

import importlib
module = importlib.import_module(module_name)
class_ = getattr(module, class_name)
instance = class_()
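Put together, the importlib variant could be wrapped in a small helper like the one below. This is just a sketch: the name instantiate is hypothetical, and standard-library classes stand in for the asker's dynamically loaded module. One reason to prefer importlib.import_module here is that __import__("json.decoder") returns the top-level json package rather than the submodule, so getattr would look in the wrong place for dotted names.

```python
import importlib


def instantiate(module_name, class_name, *args, **kwargs):
    """Import module_name and return an instance of class_name from it.

    Hypothetical helper: raises ImportError if the module is missing and
    AttributeError if the module has no attribute called class_name.
    """
    module = importlib.import_module(module_name)
    class_ = getattr(module, class_name)
    return class_(*args, **kwargs)


# Standard-library classes stand in for a dynamically loaded plugin class.
counter = instantiate("collections", "Counter", "banana")
print(counter["a"])  # 3

# Dotted module paths resolve to the submodule itself.
decoder = instantiate("json.decoder", "JSONDecoder")
print(decoder.decode("[1, 2]"))  # [1, 2]
```

Extra positional and keyword arguments are passed straight to the class constructor, which keeps the helper usable for classes that need configuration.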


How to get all of the immediate subdirectories in Python

3 answers

I'm trying to write a simple Python script that will copy an index.tpl to index.html in all of the subdirectories (with a few exceptions).

I'm getting bogged down by trying to get the list of subdirectories.

184

Answer #1

import os
def get_immediate_subdirectories(a_dir):
    return [name for name in os.listdir(a_dir)
            if os.path.isdir(os.path.join(a_dir, name))]
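For the task described in the question, the function above might be combined with shutil roughly as follows. This is a sketch under assumptions: copy_templates and its skip parameter are names invented here to represent the "few exceptions", and the demo runs on a temporary directory tree so that no real files are touched.

```python
import os
import shutil
import tempfile


def get_immediate_subdirectories(a_dir):
    return [name for name in os.listdir(a_dir)
            if os.path.isdir(os.path.join(a_dir, name))]


def copy_templates(root, skip=()):
    """Copy index.tpl to index.html in each immediate subdirectory of root.

    `skip` is a hypothetical collection of directory names to leave alone.
    Returns the sorted names of the subdirectories that were updated.
    """
    updated = []
    for name in sorted(get_immediate_subdirectories(root)):
        template = os.path.join(root, name, "index.tpl")
        if name in skip or not os.path.isfile(template):
            continue
        shutil.copyfile(template, os.path.join(root, name, "index.html"))
        updated.append(name)
    return updated


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    for name in ("blog", "docs", "private"):
        os.mkdir(os.path.join(root, name))
        with open(os.path.join(root, name, "index.tpl"), "w") as fh:
            fh.write("<h1>template</h1>")
    print(copy_templates(root, skip={"private"}))  # ['blog', 'docs']
```

Subdirectories without an index.tpl are skipped silently, which seemed the least surprising behaviour for a batch copy.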
