Common String Methods

Overview

Strings are one of the most commonly used data types, and most programmers work with string data every day. Previously we have talked a lot about how to work with various types of data in Pandas; this article covers string data.

Introduction

Pandas is known for data manipulation, and working with string data is a major part of it. Why exactly do we need string manipulation? Real-world data is often not in a form we can work with directly, or it does not make clear what the data is about.

Common String Methods

Many different operations can be performed on string data. Some of them come up very often, for example changing a string from lowercase to uppercase and vice versa, or finding the length of a string. Pandas provides common string methods for such operations, and we will look at them one by one. Before we move to these methods, we will create string-typed data, and all the methods will be applied to the same data so that the differences among them are easy to see.
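
The setup described above can be sketched as follows. The sample values and the variable name `data` are hypothetical, chosen only for illustration, and a Series is used because the `.str` accessor is defined on Series and Index objects.

```python
import pandas as pd

# Hypothetical sample values; they are not taken from the original article.
data = pd.Series(
    ["pandas", "DATA SCIENCE", "Machine Learning", "python 3"],
    dtype="string",  # pandas' dedicated string dtype
)

print(data)
```

The later sketches rebuild the same Series so that each one can be run on its own.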

Series.str.upper()

As the name suggests, this method converts every string in the Series to uppercase.
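
A minimal sketch of `Series.str.upper()`, assuming the hypothetical sample values below:

```python
import pandas as pd

# Hypothetical sample values for illustration.
data = pd.Series(["pandas", "DATA SCIENCE", "Machine Learning", "python 3"], dtype="string")

upper = data.str.upper()  # returns a new Series; `data` is left unchanged
print(upper.tolist())
# ['PANDAS', 'DATA SCIENCE', 'MACHINE LEARNING', 'PYTHON 3']
```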

Series.str.lower()

As the name suggests, this method converts every string in the Series to lowercase.
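
A minimal sketch of `Series.str.lower()`, again on hypothetical sample values:

```python
import pandas as pd

# Hypothetical sample values for illustration.
data = pd.Series(["pandas", "DATA SCIENCE", "Machine Learning", "python 3"], dtype="string")

lower = data.str.lower()  # returns a new Series with every string lowercased
print(lower.tolist())
# ['pandas', 'data science', 'machine learning', 'python 3']
```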

Series.str.isupper()

It checks whether all cased characters in each string of the Series or Index are uppercase, and returns a Boolean value for each element.
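
A short sketch of the check, using hypothetical sample values. Only the entry written entirely in capitals passes; the space in it does not matter because spaces have no case.

```python
import pandas as pd

# Hypothetical sample values for illustration.
data = pd.Series(["pandas", "DATA SCIENCE", "Machine Learning", "python 3"], dtype="string")

is_upper = data.str.isupper()  # one Boolean per element
print(is_upper.tolist())
# [False, True, False, False]
```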

Series.str.islower()

It checks whether all cased characters in each string of the Series or Index are lowercase, and returns a Boolean value for each element.
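
A short sketch with hypothetical sample values. Note that "python 3" still counts as lowercase, since the space and the digit have no case.

```python
import pandas as pd

# Hypothetical sample values for illustration.
data = pd.Series(["pandas", "DATA SCIENCE", "Machine Learning", "python 3"], dtype="string")

is_lower = data.str.islower()  # one Boolean per element
print(is_lower.tolist())
# [True, False, False, True]
```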

Series.str.title()

It converts the first letter of each word to uppercase and the remaining letters to lowercase, and returns the result. Only the first letter of each word ends up in uppercase; the rest are in lowercase.
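
A minimal sketch of title-casing, assuming the hypothetical sample values below:

```python
import pandas as pd

# Hypothetical sample values for illustration.
data = pd.Series(["pandas", "DATA SCIENCE", "Machine Learning", "python 3"], dtype="string")

titled = data.str.title()  # capitalize each word, lowercase the rest
print(titled.tolist())
# ['Pandas', 'Data Science', 'Machine Learning', 'Python 3']
```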

Series.str.len()

It returns the length of each string. An empty string has a length of 0; a missing value (NaN) yields a missing value in the result.
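
The sketch below, on hypothetical values, contrasts an empty string with a missing value, since the two are easy to confuse:

```python
import pandas as pd

# Hypothetical values: a regular string, an empty string, and a missing value.
data = pd.Series(["pandas", "", None], dtype="string")

lengths = data.str.len()
print(lengths)
# Element 0 has length 6, the empty string has length 0,
# and the missing value stays missing in the result.
```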

Series.str.isdigit()

It checks whether all the characters in each string are digits. String data often contains other characters, such as whitespace and punctuation, and the method returns False for any string that includes them.
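
A brief sketch with hypothetical values showing how a single space, a decimal point, or a letter makes the check fail:

```python
import pandas as pd

# Hypothetical values for illustration.
data = pd.Series(["123", "12 3", "1.5", "abc"], dtype="string")

is_digit = data.str.isdigit()
print(is_digit.tolist())
# [True, False, False, False]
```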

Series.str.isnumeric()

It checks whether all characters in each string of the Series or Index are numeric. It returns True if they are and False otherwise.
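
A brief sketch with hypothetical values. A sign, a decimal point, or a space is not a numeric character, so strings such as "-7" and "3.14" fail the check even though they look like numbers.

```python
import pandas as pd

# Hypothetical values for illustration.
data = pd.Series(["2023", "20 23", "3.14", "-7"], dtype="string")

is_numeric = data.str.isnumeric()
print(is_numeric.tolist())
# [True, False, False, False]
```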

Difference between isdigit() and isnumeric()

| isdigit() | isnumeric() |
|---|---|
| It accepts decimal digits, subscripts, and superscripts. | It also supports vulgar fractions, Roman numerals, and currency numerators. |
| Syntax: '123'.isdigit() | Syntax: '123'.isnumeric() |
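
The difference can be illustrated with a few Unicode characters. This is a sketch on hypothetical sample values; `dtype=object` is an assumption made here so that pandas applies Python's own `str.isdigit` and `str.isnumeric` to each element.

```python
import pandas as pd

# Hypothetical sample values, written as escapes to avoid encoding problems:
#   "\u00b2" = superscript two, "\u00bd" = vulgar fraction one half,
#   "\u2167" = Roman numeral eight.
samples = pd.Series(["123", "\u00b2", "\u00bd", "\u2167", "1.5"], dtype=object)

comparison = pd.DataFrame({
    "value": samples,
    "isdigit": samples.str.isdigit(),
    "isnumeric": samples.str.isnumeric(),
})
print(comparison)
# isdigit:   [True, True, False, False, False]
# isnumeric: [True, True, True,  True,  False]
```

The fraction and the Roman numeral are numeric but not digits, while "1.5" fails both checks because of the decimal point.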

Conclusion

In this article, we learned the common string methods in Pandas and what they do:

  • upper() and lower() - return the string converted to uppercase and lowercase, respectively.
  • isupper() and islower() - check the case of the string and return a Boolean value: True if the string is all uppercase (or all lowercase), False otherwise.
  • title() - returns the string with the first letter of each word in uppercase and the remaining letters in lowercase.
  • len() - returns the length of the string.
  • isdigit() - checks whether all the characters in the string are digits and returns a Boolean value.
  • isnumeric() - checks whether all the characters in the string are numeric and returns a Boolean value accordingly.