This tutorial explores how to split a string on the tab character in Python, using methods such as str.split() and str.rstrip().
Consider the string below:
databases = "MySQL\tPostgreSQL\tCassandra"
Suppose we wish to split the string above by the tab character. We can use the methods below.
Method 1 - Using the split() Function
The split() function in Python takes a source string and splits it on a given separator. Hence, we can use this function and specify the tab character, denoted as \t, as the separator.
The function will then split the string on each occurrence of the tab character.
databases = "MySQL\tPostgreSQL\tCassandra"
db_list = databases.split('\t')
print(db_list)
In the example above, we call the str.split() method and pass the \t character as the delimiter. Because str.split() is a built-in string method, it does not require the re module. The function splits the string on each occurrence of the tab character and returns the resulting values in a list.
$ python main.py
['MySQL', 'PostgreSQL', 'Cassandra']
As we can see from the output above, we get a list containing each of the tab-separated values.
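Passing '\t' explicitly matters when a field itself contains spaces. The sketch below uses a hypothetical string, tools (not part of the example above), to contrast split('\t') with split() called without arguments, which splits on any run of whitespace.

```python
# Hypothetical sample data: one field ("SQL Server") contains a space.
tools = "SQL Server\tMySQL\tPostgreSQL"

# Splitting on the tab character keeps "SQL Server" intact.
by_tab = tools.split('\t')

# With no separator, split() breaks on any whitespace, including the space.
by_whitespace = tools.split()

print(by_tab)         # ['SQL Server', 'MySQL', 'PostgreSQL']
print(by_whitespace)  # ['SQL', 'Server', 'MySQL', 'PostgreSQL']
```

If the data is known to be tab-delimited, the explicit separator is usually the safer choice.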
Consider what happens if the source string has a tab character at the start or end:
databases = "\tMySQL\tPostgreSQL\tCassandra\t"
If we apply the split() function to such a string:
databases = "\tMySQL\tPostgreSQL\tCassandra\t"
db_list = databases.split('\t')
print(db_list)
In such a case, the function returns the list:
['', 'MySQL', 'PostgreSQL', 'Cassandra', '']
In such an instance, the resulting list contains empty string values at the start and end of the list. We can resolve this by using the filter() function.
databases = "\tMySQL\tPostgreSQL\tCassandra\t"
db_list = list(filter(None, databases.split('\t')))
print(db_list)
The filter() function removes the empty strings from our list and returns:
['MySQL', 'PostgreSQL', 'Cassandra']
The filter() function takes two parameters: a function and an iterable. In our case, we pass None as the function parameter and the list as the iterable.
When the function parameter is None, filter() removes all falsy elements, including empty strings, from the list.
NOTE: Since the filter() function returns a filter object, we use the list() function to convert the object into a list.
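The same filtering can be written as a list comprehension. The sketch below uses a hypothetical string, record, that also has an empty field in the middle, to show that every empty string is dropped, not only those at the ends.

```python
# Hypothetical input: leading/trailing tabs plus an empty field in the middle.
record = "\tMySQL\t\tCassandra\t"

# A plain split keeps every empty field.
parts = record.split('\t')

# Keep only non-empty strings; for a list of strings this gives
# the same result as list(filter(None, parts)).
non_empty = [part for part in parts if part]

print(parts)      # ['', 'MySQL', '', 'Cassandra', '']
print(non_empty)  # ['MySQL', 'Cassandra']
```

Because interior empty fields are removed as well, this approach is best avoided when the position of each column matters.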
Method 2 - Using the str.rstrip() Function
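Another common way to split a string without leaving an empty string at the end of the list is the rstrip() function, which removes the given characters from the end of the string before we split it.
Take the code below:
import re
databases = "MySQL\tPostgreSQL\tCassandra\t"
db_list = re.split(r'\t+', databases.rstrip('\t'))
print(db_list)
Output:
['MySQL', 'PostgreSQL', 'Cassandra']
Note that rstrip() only trims the end of the string. If the string can also start with a tab, use strip('\t') instead, which trims both ends.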
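Keep in mind that rstrip() only trims the right-hand end of a string. A minimal sketch, assuming a hypothetical input with tabs on both ends and a doubled tab inside, compares it with strip():

```python
import re

# Hypothetical input with tabs on both ends and a doubled tab inside.
databases = "\tMySQL\t\tPostgreSQL\tCassandra\t"

# rstrip('\t') trims only the right side, so the leading tab still yields ''.
right_only = re.split(r'\t+', databases.rstrip('\t'))

# strip('\t') trims both ends; the pattern \t+ treats "\t\t" as one separator.
both_ends = re.split(r'\t+', databases.strip('\t'))

print(right_only)  # ['', 'MySQL', 'PostgreSQL', 'Cassandra']
print(both_ends)   # ['MySQL', 'PostgreSQL', 'Cassandra']
```

The \t+ pattern is what collapses consecutive tabs; a plain split('\t') would return an empty string between them.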