Python | Categorizing input data in lists

A list created by writing its values directly in code produces output like the following:
Code :

List = ['GeeksForGeeks', 'VenD', 5, 9.2]
print('List:', List)

Output :

List: ['GeeksForGeeks', 'VenD', 5, 9.2]

In the example above, the list combines string and numeric values. The interpreter treats 'GeeksForGeeks' and 'VenD' as strings, while 5 and 9.2 are treated as an integer and a floating point value, respectively. We can perform common arithmetic operations on the integer and floating point values as follows.
Code :

# Common arithmetic operations on 5 and 9.2:
List = ['GeeksForGeeks', 'VenD', 5, 9.2]

print('List[2] + 2, Answer: ', end='')
print(List[List.index(5)] + 2)

print('List[3] + 8.2, Answer: ', end='')
print(List[List.index(9.2)] + 8.2)

Output :

List[2] + 2, Answer: 7
List[3] + 8.2, Answer: 17.4

In addition, string-specific operations such as concatenation can be performed on the string elements:
Code :

 

# String concatenation operation
# List: ['GeeksForGeeks', 'VenD', 5, 9.2]
# Combining List[0] and List[1]
List = ['GeeksForGeeks', 'VenD', 5, 9.2]
print(List[0] + ' ' + List[1])

A list can hold elements of different data types: strings, integers, floats, tuples, dictionaries, or even other lists (a list of lists). This no longer holds, however, when the list is built from user input. Consider the example below:
Code :

# All resulting list2 elements will be of string type
list2 = []  # This list will hold the elements entered by the user

element_count = int(input('Enter Number of Elements you wish to enter: '))

for i in range(element_count):
    element = input(f'Enter Element {i + 1}: ')
    list2.append(element)

print('List 2:', list2)

Output :

Enter Number of Elements you wish to enter: 4
Enter Element 1: GeeksForGeeks
Enter Element 2: VenD
Enter Element 3: 5
Enter Element 4: 9.2
List 2: ['GeeksForGeeks', 'VenD', '5', '9.2']

Notice that list2, built from user input, now contains only string values. The numeric elements can no longer take part in arithmetic operations, because they are stored as strings. This contradicts the usual behavior of lists, which can hold values of several types.
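The effect can be checked without any user interaction. The following is a minimal sketch that hard-codes the same four strings shown in the output above (the variable names `result`, `converted` and `joined` are only illustrative):

```python
# Elements read with input() are strings, so arithmetic fails until they are converted.
list2 = ['GeeksForGeeks', 'VenD', '5', '9.2']  # same values as in the output above

try:
    result = list2[2] + 2  # str + int is not defined
except TypeError as error:
    result = None
    print('TypeError:', error)

# An explicit conversion restores numeric behavior
converted = int(list2[2]) + 2
print(converted)  # 7

# On strings, + concatenates instead of adding
joined = list2[2] + list2[3]
print(joined)  # 59.2
```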

As programmers, we need to process user data and store it in an appropriate format so that operations on the resulting dataset work as intended.
In this approach, we sort the data received from the user into three categories: integer, string and floating point. To do this, we use a small piece of code that performs the appropriate type casting.

Method to overcome this limitation:

Code:

import re

def checkInt(string):
    # An integer is one or more digits and nothing else
    string_to_integer = re.compile(r'\d+')
    matches = string_to_integer.findall(string)
    # The first match must cover the whole string
    if len(matches) != 0 and len(matches[0]) == len(string):
        return 1
    return 0

def checkFloat(string):
    # A float has digits before and/or after a single decimal point
    string_to_float = re.compile(r'\d*\.\d*')
    matches = string_to_float.findall(string)
    # The first match must cover the whole string; a lone '.' is not a number
    if len(matches) != 0 and len(matches[0]) == len(string) and string != '.':
        return 1
    return 0

List2 = []
element_count = int(input('Enter number of elements: '))

for i in range(element_count):
    input_element = input(f'Enter Element {i + 1}: ')

    if checkInt(input_element):
        List2.append(int(input_element))
    elif checkFloat(input_element):
        List2.append(float(input_element))
    else:
        List2.append(input_element)

print(List2)

Output :

Enter number of elements: 4
Enter Element 1: GeeksForGeeks
Enter Element 2: VenD
Enter Element 3: 5
Enter Element 4: 9.2
['GeeksForGeeks', 'VenD', 5, 9.2]

The method above uses the regular expression library to determine the data type of each entered element. Once the type has been identified, the element is converted to that type.

For example, consider the following cases:

  1. Case 1: all characters in the string are digits. For user input 7834, the regular expression determines that every character is a digit from 0 to 9, so the string '7834' is cast to the equivalent integer value and added to the list as an integer.

    Expression used for integer identification: r'\d+'

  2. Case 2: the string represents a floating point number. A floating point value is identified by a sequence of digits preceding or following a full stop ('.'). For example: 567., .056, 6.7, etc.
    Expression used to identify a floating point value: r'\d*\.\d*'
  3. Case 3: the string contains letters, special characters and digits. In this case, the element is stored as a string. No dedicated regular expression is required, because both the integer check and the floating point check return false for such input.
    Example: '155B, Baker Street!', '12B72C_?', 'I am agent 007', '_GeeksForGeeks_', etc.
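The three cases can be tried on the sample inputs listed above. This is a minimal sketch rather than the article's code: it uses `re.fullmatch` to test the whole string in one step, the function name `classify` is hypothetical, and the float pattern is a slightly stricter variant that requires at least one digit so that a lone '.' is treated as a string.

```python
import re

# Patterns for the first two cases; anything else falls through to "string".
INT_PATTERN = re.compile(r'\d+')
# Assumption: stricter than r'\d*\.\d*', at least one digit must be present.
FLOAT_PATTERN = re.compile(r'\d+\.\d*|\.\d+')

def classify(text):
    """Return 'integer', 'float' or 'string' for one input string."""
    if INT_PATTERN.fullmatch(text):
        return 'integer'
    if FLOAT_PATTERN.fullmatch(text):
        return 'float'
    return 'string'

samples = ['7834', '567.', '.056', '6.7',
           '155B, Baker Street!', '12B72C_?', 'I am agent 007', '_GeeksForGeeks_']

labels = {sample: classify(sample) for sample in samples}
for sample, label in labels.items():
    print(f'{sample!r}: {label}')
```

Only '7834' is labelled as an integer, the three inputs containing a decimal point are labelled as floats, and the remaining four stay strings.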

Conclusion: This method is only a small prototype for determining the type of raw input and converting it before it is stored in a list. Even so, it overcomes the limitation on list input described above. With more advanced regular expressions and some improvements to the algorithm, it should be possible to parse and store further forms of data such as tuples, dictionaries, etc.
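As one possible direction for tuples and dictionaries, the sketch below swaps the regular expressions for the standard library function `ast.literal_eval`, which evaluates a string only if it is a plain Python literal. This is a different technique from the one described in this article, and the helper name `parse_literal` is hypothetical.

```python
import ast

def parse_literal(text):
    """Return the Python value written in text, or text itself if it is not a literal."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a valid literal (for example a bare word), so keep it as a string
        return text

inputs = ['5', '9.2', '(1, 2)', "{'a': 1}", '[1, [2, 3]]', 'GeeksForGeeks']
parsed = [parse_literal(item) for item in inputs]
print(parsed)
# [5, 9.2, (1, 2), {'a': 1}, [1, [2, 3]], 'GeeksForGeeks']
```

Note that this approach does not classify input exactly as the regular expressions do: for instance, it accepts negative numbers and also converts literals such as True and None.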

Benefits of Dynamic TypeCasting:

  • Since the data is stored in an appropriate format, type-specific operations can be performed on it. Example: concatenation for strings; addition, subtraction and multiplication for numeric values; and other operations defined for the corresponding data types.
  • The type conversion happens when the data is stored. Therefore, the programmer does not need to worry about type errors later when performing operations on the dataset.
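A short sketch of the first benefit, assuming the list has already been converted as in the output shown earlier (the names `numbers`, `strings`, `total` and `joined` are only illustrative):

```python
# A list whose elements were stored with their proper types
List2 = ['GeeksForGeeks', 'VenD', 5, 9.2]

# Select elements by type; bool is a subclass of int, so it is excluded explicitly
numbers = [x for x in List2
           if isinstance(x, (int, float)) and not isinstance(x, bool)]
strings = [x for x in List2 if isinstance(x, str)]

total = sum(numbers)         # arithmetic on the numeric elements
joined = ' '.join(strings)   # concatenation on the string elements

print(total)   # 14.2
print(joined)  # GeeksForGeeks VenD
```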

Limitations of Dynamic TypeCasting:

  • Items that do not need any conversion, such as plain strings, still pass through every check, which costs unnecessary computation.
  • The repeated conditional checks and function calls add overhead each time a new item is added.
  • Handling additional data types may require several new additions to the existing code to suit the needs of the developer.
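One way to keep such additions small is to hold the checks in a table of rules, so that supporting a new type means adding one entry. The following is a minimal sketch under that assumption; `CONVERTERS`, `convert` and the boolean rule are hypothetical and not part of the code above.

```python
import re

# Each rule pairs a pattern with the function that converts a full match.
CONVERTERS = [
    (re.compile(r'\d+'), int),
    (re.compile(r'\d+\.\d*|\.\d+'), float),
]

def convert(text, converters=CONVERTERS):
    """Apply the first rule whose pattern matches the whole string."""
    for pattern, caster in converters:
        if pattern.fullmatch(text):
            return caster(text)
    return text  # no rule matched: keep the element as a string

# Supporting another type is one more rule (booleans, as an example)
extended = CONVERTERS + [(re.compile(r'True|False'), lambda s: s == 'True')]

values = [convert(s, extended) for s in ['5', '9.2', 'True', 'VenD']]
print(values)
# [5, 9.2, True, 'VenD']
```

Every element still passes through the rules in order, so this addresses only the third limitation, not the cost of the checks themselves.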



