
FuzzyWuzzy Python library


There are many ways to compare strings in Python. Some of the main methods are:

  1. Using regular expressions
  2. Simple comparison
  3. Using difflib

But one of the simplest methods is to use the fuzzywuzzy library, which returns a similarity score from 0 to 100, where 100 means the two strings are equal. This article explains how to get started with the fuzzywuzzy library.

FuzzyWuzzy is a Python library used for string matching. Fuzzy string matching is the process of finding strings that approximately match a given pattern. It mainly uses the Levenshtein distance to calculate the differences between sequences.
FuzzyWuzzy was developed and released by SeatGeek, a ticket finder for sports and concert events. Their original use case is described on their blog.
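
To make the idea of Levenshtein distance concrete, here is a minimal pure-Python sketch of the classic edit-distance calculation. The function name `levenshtein` is chosen for illustration only; it is not part of FuzzyWuzzy, and the library's own scoring may weight edits differently.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The smaller the distance relative to the string lengths, the more similar the two strings are.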

Requirements

  • Python 2.4 or higher
  • python-Levenshtein (optional, used for speed)

Install via pip:

  pip install fuzzywuzzy
  pip install python-Levenshtein

How to use this library?

First, import these modules:

from fuzzywuzzy import fuzz

from fuzzywuzzy import process

Simple ratio usage:

fuzz.ratio('pythonengineering', 'geeksgeeks')

# Exact match, so the score is 100
fuzz.ratio('GeeksforGeeks', 'GeeksforGeeks')

# Same words, different capitalization: the simple ratio is case-sensitive
fuzz.ratio('geeks for geeks', 'Geeks For Geeks')


Partial ratio usage:

# The second string has an extra exclamation mark, but the first string
# matches a substring of it exactly, so the score is 100
fuzz.partial_ratio("geeks for geeks", "geeks for geeks!")

# Lower score, because the first string has an extra token in the middle
fuzz.partial_ratio("geeks for geeks", "geeks geeks")
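
The partial-ratio idea can be sketched with the standard library alone: slide the shorter string across the longer one and keep the best score. This is a simplified illustration under that assumption; `sliding_partial_ratio` is a hypothetical name, and FuzzyWuzzy's own implementation may select candidate windows differently and so return different scores.

```python
from difflib import SequenceMatcher


def sliding_partial_ratio(s1, s2):
    """Best similarity (0-100) between the shorter string and any
    equally long window of the longer string."""
    shorter, longer = sorted((s1, s2), key=len)
    if not shorter:
        return 0  # nothing to compare against
    width = len(shorter)
    best = 0.0
    for start in range(len(longer) - width + 1):
        window = longer[start:start + width]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return int(round(100 * best))


print(sliding_partial_ratio("geeks for geeks", "geeks for geeks!"))  # 100
```

Because the first string appears verbatim inside the second, one window matches it exactly and the score is 100.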

Next come the token-based ratios:

# Token Sort Ratio

fuzz.token_sort_ratio("geeks for geeks", "for geeks geeks")

# This gives 100, since every word is the same regardless of position
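
A rough sketch of why word order stops mattering: split each string into tokens, sort them, rejoin, and compare the results. The helper name `token_sort_sketch` is an assumption for illustration, and the comparison uses difflib rather than the library's own scorer.

```python
from difflib import SequenceMatcher


def token_sort_sketch(s1, s2):
    """Compare two strings after lowercasing and sorting their words."""
    def normalize(text):
        # split() also drops leading, trailing and repeated whitespace
        return " ".join(sorted(text.lower().split()))

    matcher = SequenceMatcher(None, normalize(s1), normalize(s2))
    return int(round(100 * matcher.ratio()))


print(token_sort_sketch("geeks for geeks", "for geeks geeks"))  # 100
```

Both inputs normalize to "for geeks geeks", so they compare as identical.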

# Token Set Ratio

fuzz.token_sort_ratio("geeks for geeks", "geeks for for geeks")

88

fuzz.token_set_ratio("geeks for geeks", "geeks for for geeks")

# The score is 100 in the second case, because token_set_ratio
# treats duplicate words as a single word.
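
The set-based comparison can be sketched as follows: turn each string into a set of words (which removes duplicates), then compare the shared words against each string's shared-plus-unique words and keep the best score. This is an illustrative approximation built on difflib; `token_set_sketch` is a hypothetical name and edge cases may differ from the library.

```python
from difflib import SequenceMatcher


def _ratio(a, b):
    """Similarity of two strings as an integer from 0 to 100."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_sketch(s1, s2):
    """Compare two strings as sets of words, ignoring order and duplicates."""
    tokens1 = set(s1.lower().split())
    tokens2 = set(s2.lower().split())

    common = " ".join(sorted(tokens1 & tokens2))
    only1 = " ".join(sorted(tokens1 - tokens2))
    only2 = " ".join(sorted(tokens2 - tokens1))

    combined1 = (common + " " + only1).strip()
    combined2 = (common + " " + only2).strip()

    return max(
        _ratio(common, combined1),
        _ratio(common, combined2),
        _ratio(combined1, combined2),
    )


print(token_set_sketch("geeks for geeks", "geeks for for geeks"))  # 100
```

Both strings reduce to the same word set {"for", "geeks"}, so the repeated "for" no longer lowers the score.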

Now suppose we have a list of candidate strings and want to find the closest matches to a query. For that we can use the process module:

query = 'geeks for geeks'

choices = ['geek for geek', 'geek geek', 'g. for geeks']

# Get a list of matches ordered by score, the default limit is 5
process.extract(query, choices)

[('g. for geeks', 95), ('geek for geek', 93), ('geek geek', 86)]

# If we only want the top one
process.extractOne(query, choices)

('g. for geeks', 95)
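
Conceptually, extracting matches means scoring every choice against the query and sorting by score. The sketch below shows that idea with the standard library only; `rank_choices` and `best_choice` are hypothetical helpers, and the scores will not match those of process.extract, which uses a different scorer.

```python
from difflib import SequenceMatcher


def score(a, b):
    """Case-insensitive similarity as an integer from 0 to 100."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))


def rank_choices(query, choices, limit=5):
    """Return up to `limit` (choice, score) pairs, best score first."""
    scored = [(choice, score(query, choice)) for choice in choices]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


def best_choice(query, choices):
    """Return the single best (choice, score) pair, or None if there are no choices."""
    ranked = rank_choices(query, choices, limit=1)
    return ranked[0] if ranked else None


choices = ['geek for geek', 'geek geek', 'g. for geeks']
for choice, value in rank_choices('geeks for geeks', choices):
    print(choice, value)
```

Swapping in a different `score` function changes the ranking behavior without touching the ranking code.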

There is also another ratio, called WRatio. It is sometimes better to use WRatio instead of the simple ratio, because WRatio handles lowercase and uppercase and some other differences, such as punctuation.

fuzz.WRatio('geeks for geeks', 'Geeks For Geeks')

fuzz.WRatio('geeks for geeks !!!', 'geeks for geeks')

# whereas the simple ratio gives a lower score for the same pair

fuzz.ratio('geeks for geeks !!!', 'geeks for geeks')
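
One reason case and punctuation can stop mattering is that the strings are normalized before they are compared. The sketch below illustrates only that normalization step, as an assumption about the general approach; `normalize` and `normalized_ratio` are hypothetical helpers, and WRatio itself combines several ratios in ways not shown here.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and collapse runs of non-word characters
    (punctuation, whitespace) into single spaces."""
    return re.sub(r"\W+", " ", text).strip().lower()


def normalized_ratio(s1, s2):
    """Similarity (0-100) of the two strings after normalization."""
    matcher = SequenceMatcher(None, normalize(s1), normalize(s2))
    return int(round(100 * matcher.ratio()))


print(normalize("geeks for geeks !!!"))                          # geeks for geeks
print(normalized_ratio("geeks for geeks", "Geeks For Geeks"))    # 100
```

After normalization, "Geeks For Geeks" and "geeks for geeks !!!" both become "geeks for geeks", so they compare as equal.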


Full code

# Python code showing all the ratios together
# Make sure the fuzzywuzzy module is installed
# Written for Python 3, where print is a function

from fuzzywuzzy import fuzz
from fuzzywuzzy import process

s1 = "I love GeeksforGeeks"
s2 = "I am loving GeeksforGeeks"

print("FuzzyWuzzy Ratio:", fuzz.ratio(s1, s2))
print("FuzzyWuzzy PartialRatio:", fuzz.partial_ratio(s1, s2))
print("FuzzyWuzzy TokenSortRatio:", fuzz.token_sort_ratio(s1, s2))
print("FuzzyWuzzy TokenSetRatio:", fuzz.token_set_ratio(s1, s2))
print("FuzzyWuzzy WRatio:", fuzz.WRatio(s1, s2), "\n")

# For the process module
query = 'geeks for geeks'
choices = ['geek for geek', 'geek geek', 'g. for geeks']

print("List of ratios:")
print(process.extract(query, choices), "\n")
print("Best among the above list:", process.extractOne(query, choices))


Output:

FuzzyWuzzy Ratio: 84
FuzzyWuzzy PartialRatio: 85
FuzzyWuzzy TokenSortRatio: 84
FuzzyWuzzy TokenSetRatio: 86
FuzzyWuzzy WRatio: 84

List of ratios:
[('g. for geeks', 95), ('geek for geek', 93), ('geek geek', 86)]

Best among the above list: ('g. for geeks', 95)

FuzzyWuzzy is built on top of the difflib library, and python-Levenshtein is used for speed. This makes it a simple and practical way to do fuzzy string matching in Python.
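
Since the library builds on difflib, the simple ratio can be approximated with the standard library alone. This is a sketch under that assumption; `simple_ratio` is a hypothetical name, and results may differ slightly from fuzz.ratio, particularly when python-Levenshtein is installed.

```python
from difflib import SequenceMatcher


def simple_ratio(s1, s2):
    """Scale difflib's similarity ratio (0.0-1.0) to an integer from 0 to 100."""
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))


print(simple_ratio("GeeksforGeeks", "GeeksforGeeks"))  # 100
print(simple_ratio("abcd", "abef"))                    # 50
```

In the second call, two of the four characters match in each string, so difflib reports 2 * 2 / 8 = 0.5, which scales to 50.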
