
Extracting email addresses using regular expressions in Python


Suppose we only need to pull the email addresses out of a given input string. A regular expression can do this in a single call.
Examples:

Input: Hello [email protected] Rohit [email protected]
Output: [email protected] [email protected]
Here we have selected only the email addresses from the input string.

Input: My 2 favorite numbers are 7 and 10
Output: 2 7 10
Here we have selected only the digits.

A regular expression is a sequence of characters primarily used to find and replace patterns in a string or file.
Finding and extracting text is such a common task that Python includes a powerful regular expression module, re, that handles many of these jobs elegantly.
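As a minimal sketch of the "find" and "replace" roles described above, the snippet below uses re.search and re.sub from the standard re module. The sample sentence is made up for illustration and does not come from the article.

```python
import re

# Hypothetical sample text, used only for this illustration
text = "Order 66 shipped on day 12"

# Find: re.search returns a match object for the first match, or None
first = re.search(r"[0-9]+", text)
print(first.group())   # 66

# Replace: re.sub substitutes every match of the pattern
masked = re.sub(r"[0-9]+", "#", text)
print(masked)          # Order # shipped on day #
```

re.search stops at the first match, while re.findall (used in the programs in this article) collects all of them into a list.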

Symbol      Usage
$           Matches the end of the line
\s          Matches whitespace
\S          Matches any non-whitespace character
*           Repeats a character zero or more times
*?          Repeats a character zero or more times (non-greedy)
+           Repeats a character one or more times
+?          Repeats a character one or more times (non-greedy)
[aeiou]     Matches a single character in the listed set
[^XYZ]      Matches a single character not in the listed set
[a-z0-9]    The set of characters can include a range
(           Indicates where string extraction is to start
)           Indicates where string extraction is to end
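The symbols in the table can be tried out one at a time with re.findall. The following is a small sketch; the sample strings and the address alice@example.org are invented for illustration.

```python
import re

# Hypothetical sample line, used only for this illustration
line = "From alice@example.org on day 5"

# \S+ : runs of non-whitespace characters; \s : single whitespace characters
tokens = re.findall(r"\S+", line)        # ['From', 'alice@example.org', 'on', 'day', '5']
spaces = re.findall(r"\s", line)         # four single spaces

# + is greedy and takes as much as it can; +? takes as little as it can
greedy = re.findall(r"a+", "caaat")      # ['aaa']
lazy = re.findall(r"a+?", "caaat")       # ['a', 'a', 'a']

# Character sets, negated sets, and ranges
vowels = re.findall(r"[aeiou]", "regex")       # ['e', 'e']
non_vowels = re.findall(r"[^aeiou]", "regex")  # ['r', 'g', 'x']
lower_or_digit = re.findall(r"[a-z0-9]+", "Room 42b")  # ['oom', '42b']

# Parentheses select which part of the match is returned
domain = re.findall(r"@(\S+)", line)     # ['example.org']

# $ anchors the match to the end of the line
last_number = re.findall(r"[0-9]+$", line)  # ['5']

print(tokens, greedy, lazy, domain, last_number)
```

Note that when a pattern contains parentheses, re.findall returns only the parenthesized part, which is why the "@" is missing from the domain result.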

# Python program to extract numeric digits
# from a line using a regular expression

# The re module provides regular expression support
import re

# Example line
s = 'My 2 favorite numbers are 7 and 10'

# findall returns every match of the pattern:
# [0-9] matches a single digit from 0 to 9
# + repeats it one or more times
lst = re.findall('[0-9]+', s)

# Print the list
print(lst)

Output:

['2', '7', '10']
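To see what the + contributes, it may help to compare the pattern with and without it, and to convert the matched strings to integers. This is only a sketch built on the same example line; the variable names are chosen here for illustration.

```python
import re

s = 'My 2 favorite numbers are 7 and 10'

# Without +, each digit is matched on its own, so 10 is split in two
single_digits = re.findall('[0-9]', s)
print(single_digits)   # ['2', '7', '1', '0']

# With +, consecutive digits are kept together as one match
numbers = [int(token) for token in re.findall('[0-9]+', s)]
print(numbers)         # [2, 7, 10]
print(sum(numbers))    # 19
```

re.findall always returns strings, so the int conversion is needed before doing any arithmetic on the results.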

# Python program to extract email addresses
# from a string using a regular expression

# The re module provides regular expression support
import re

# Example string
s = 'Hello from shubhamg199630@gmail.com to priya@yahoo.com about the meeting @2PM'

# \S matches any non-whitespace character
# @ is the literal "at" sign in an email address
# + repeats a character one or more times
lst = re.findall(r'\S+@\S+', s)

# Print the list
print(lst)

Output:

['shubhamg199630@gmail.com', 'priya@yahoo.com']
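One limitation of matching on non-whitespace alone is that punctuation touching an address becomes part of the match. The sketch below contrasts that pattern with a tighter one; the stricter pattern and the sample sentence are assumptions made here for illustration, not part of the original program, and the pattern is not a complete email validator.

```python
import re

# Sample sentence with punctuation touching the addresses (illustrative)
s = 'Write to priya@yahoo.com, or (shubhamg199630@gmail.com).'

# Loose pattern: everything that is not whitespace around the @
loose = re.findall(r'\S+@\S+', s)
print(loose)    # ['priya@yahoo.com,', '(shubhamg199630@gmail.com).']

# Tighter pattern (an assumption for this sketch): restrict the allowed
# characters and require a dot followed by at least two letters
strict = re.findall(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}', s)
print(strict)   # ['priya@yahoo.com', 'shubhamg199630@gmail.com']
```

The tighter pattern trades generality for cleaner results, so it may reject unusual but valid addresses.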
