Python Convert String to Binary

In this tutorial, we will explore several Python built-in functions and techniques for converting a string into its binary representation.
Captain Salem 5 min read

Strings are a fundamental building block in Python and other programming languages. In Python, a string is an immutable sequence of Unicode characters. We write a string literal by enclosing it in single ('), double ("), or triple (''' or """) quotes.

Binary, by contrast, is a base-2 number system that consists only of 0s and 1s. It is an essential data format because computers store and process all data in binary form.

When working with Python programs, you might encounter instances where you need to convert a string type into its binary representation.

Method 1 - Using bytearray + bin

The first method we can use is to convert the input string into a bytearray object. The bytearray data type in Python is a built-in data type that represents a mutable sequence containing bytes in the range of 0-255.

To convert a string into a bytearray object, we call the bytearray() constructor as shown:

bytearray(string, encoding)

This returns the string as a mutable sequence of bytes in the given encoding.

For example:

>>> string = "Hello, world!"
>>> print(bytearray(string, 'utf-8'))
bytearray(b'Hello, world!')
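As a minimal sketch of why the encoding argument matters (the sample strings and variable names here are our own, chosen only for illustration), the snippet below lists the integer value of each byte. Note that a non-ASCII character can occupy more than one byte in UTF-8:

```python
# Each element of a bytearray is an integer in the range 0-255.
ascii_bytes = bytearray("Hi", "utf-8")
print(list(ascii_bytes))  # [72, 105]

# "\u00e9" is the single character e-acute; UTF-8 encodes it as two bytes.
accented_bytes = bytearray("\u00e9", "utf-8")
print(list(accented_bytes))  # [195, 169]
```

This is why the number of binary values in the result can be larger than the number of characters in the input string.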

Once we have converted the string into bytes, we can use a for loop to iterate over the bytes and call the bin() function on each one to get its binary representation. We then append each result to a list.

The code is as shown below:

>>> string = "Hello, world!"
>>> byte_array = bytearray(string, 'utf-8')
>>> bin_list = []
>>> for byte in byte_array:
...     bin_rep = bin(byte)
...     bin_list.append(bin_rep)
... 
>>> print(bin_list)

Output:

['0b1001000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b101100', '0b100000', '0b1110111', '0b1101111', '0b1110010', '0b1101100', '0b1100100', '0b100001']

As the output above shows, each byte of the string Hello, world! is converted into its binary representation, and the results are stored in a list.

How to Remove the 0b Prefix

As you may notice from the example above, each resulting value starts with the prefix 0b, which denotes that the number is written in binary rather than decimal.

Since we already know this, the prefix is unnecessary and makes the output harder to read. We can remove it by slicing each binary string from index 2 onward.

We can then use the join() method to combine the binary values into a single string.

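>>> bin_list = []  # start from an empty list so earlier results are not repeated
>>> for byte in byte_array:
...     bin_rep = bin(byte)
...     bin_list.append(bin_rep[2:])  # drop the '0b' prefix
... 
>>> print(' '.join(bin_list))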

Output:

1001000 1100101 1101100 1101100 1101111 101100 100000 1110111 1101111 1110010 1101100 1100100 100001
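The values above have different lengths because bin() drops leading zeros. If fixed-width, 8-bit groups are preferred, one possible approach is to pad each value with str.zfill(). The helper name to_padded_binary below is hypothetical and used only for illustration:

```python
def to_padded_binary(text, encoding="utf-8"):
    """Return the bytes of text as space-separated 8-bit binary groups."""
    padded = []
    for byte in bytearray(text, encoding):
        bits = bin(byte)[2:]          # strip the '0b' prefix
        padded.append(bits.zfill(8))  # left-pad with zeros to 8 digits
    return ' '.join(padded)


print(to_padded_binary("Hi"))  # 01001000 01101001
```

Fixed-width groups are easier to compare visually and can be split back into bytes without relying on the spaces.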

Method 2 - Using format and bytearray

We can also use Python's bytearray() to convert a given string into a sequence of bytes, where each element is an integer in the range 0-255.

Next, we can call the format(x, 'b') function on each byte to convert it to its binary representation.

An example is as shown below:

>>> string = "GeekBits"
>>> result = ' '.join(format(x, 'b') for x in bytearray(string, 'utf-8'))
>>> print(result)

In the code above, we use the bytearray(string, 'utf-8') method to convert the string into a sequence of bytes.

Next, we use the generator expression format(x, 'b') for x in bytearray(...) to convert each byte into its binary form.

Output:

1000111 1100101 1100101 1101011 1000010 1101001 1110100 1110011
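The second argument to format() is a format specification, and 'b' is only one of the options available for integers. As a brief sketch (the variable names are ours), the snippet below compares a few specifications on the value 71, which is the byte for the letter G:

```python
byte = 71  # the UTF-8 byte value of "G"

plain = format(byte, 'b')      # binary digits only
padded = format(byte, '08b')   # zero-padded to a width of 8
prefixed = format(byte, '#b')  # '#' keeps the 0b prefix
fstring = f"{byte:08b}"        # the same specification inside an f-string

print(plain, padded, prefixed, fstring)
# 1000111 01000111 0b1000111 01000111
```

Unlike bin(), format() with 'b' never adds the 0b prefix unless asked, so no slicing is needed.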

Method 3 - Using Python Ord and Format Methods

If you are unfamiliar with it, Python provides the ord() function, which returns the Unicode code point of an input character as an integer.

We can use this function instead of the bytearray() method to convert the characters of the input string into their Unicode values.

Finally, we can use the format() method to convert them into binary and join() to combine them into a single string, as we did in the previous example.

>>> string = "GeekBits"
>>> result = ' '.join(format(ord(x), 'b') for x in string)
>>> print(result)

Output:

1000111 1100101 1100101 1101011 1000010 1101001 1110100 1110011

In this case, the generator expression loops over the string and passes each character to the ord() function, which returns that character's Unicode code point as an integer.
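For plain ASCII text, ord() and bytearray() produce the same numbers, but they can differ for other characters: ord() returns a single code point, while UTF-8 may encode that character as several bytes. A small sketch, assuming the sample character e-acute (written here as the escape \u00e9):

```python
char = "\u00e9"  # e-acute, Unicode code point 233

# ord() works on the code point itself.
from_code_point = format(ord(char), 'b')

# Encoding to UTF-8 first yields two separate bytes (195 and 169).
from_utf8_bytes = ' '.join(format(b, 'b') for b in char.encode('utf-8'))

print(from_code_point)  # 11101001
print(from_utf8_bytes)  # 11000011 10101001
```

Which result is appropriate depends on whether you need the code points or the encoded bytes.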

Method 4 - Using Python Bin, Map, and Bytearray() Methods

Another technique for converting a string into binary combines three built-ins: bin(), map(), and bytearray().

How it works

We use the map() function to apply bin() to each byte of the object returned by bytearray(). This gives us the binary equivalent of each byte.

Because map() returns an iterator, we can pass it to the list() constructor to convert it into a list. We can then use a list comprehension to process each item before building a single binary string.

An example is as shown:

>>> string = "GeekBits"
>>> result = ' '.join([x[2:] for x in list(map(bin, bytearray(string, 'utf-8')))])
>>> print(result)

In the example above, we start by converting the string into a byte object using the bytearray(string, 'utf-8') method.

Next, we convert each byte into its binary string using the map(bin, ...) method.

We also strip the 0b prefix from each binary string by slicing from index 2 in a list comprehension: [x[2:] for x in ...].

And lastly, we join the binary strings with spaces: ' '.join(...) and print the result.

Output:

1000111 1100101 1100101 1101011 1000010 1101001 1110100 1110011
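Because map() returns a lazy iterator, the intermediate list() call is optional: join() can consume the iterator directly. The following sketch (variable names are ours) shows this, along with the fact that the iterator is exhausted after one pass:

```python
string = "GeekBits"

# map() does no work until something iterates over it.
binary_iter = map(bin, bytearray(string, 'utf-8'))

# Strip the '0b' prefix from each item while joining.
result = ' '.join(item[2:] for item in binary_iter)
print(result)

# The iterator has been consumed, so nothing is left.
leftover = list(binary_iter)
print(leftover)  # []
```

If the binary values are needed more than once, converting the map object to a list first is the safer choice.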

Method 5 - Using the Bitarray Library

We can also use the bitarray library to convert a string into its binary representation.

Start by installing the bitarray library using pip:

pip install bitarray

Next, import the library and use it to convert a string into binary as shown:

>>> from bitarray import bitarray
>>> string = "GeekBits"
>>> a = bitarray()
>>> a.frombytes(string.encode('utf-8'))
>>> print(a.to01())

Output:

0100011101100101011001010110101101000010011010010111010001110011

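Unlike the previous methods, to01() keeps every byte at its full 8 bits, so leading zeros are preserved and the result contains no separators.

You can learn more about the bitarray library at the link below: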

https://pypi.org/project/bitarray

Method 6 - Using BinAscii

We can also use the binascii module in Python, which contains several methods to convert between binary and various ASCII-encoded binary representations.

One of those functions is hexlify(), which returns the hexadecimal representation of binary data. We can pass that result to int() with a base of 16 to convert it into an integer object.

Lastly, we can convert the integer into binary using the bin() function.

>>> import binascii
>>> string = "GeekBits"
>>> b = bytes(string, 'utf-8')
>>> result = bin(int(binascii.hexlify(b), 16))
>>> print(result[2:])

Output:

100011101100101011001010110101101000010011010010111010001110011
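Note that the output above is 63 digits long rather than 64: the whole string is treated as one large integer, so the leading zero of the first byte is lost. As a hedged sketch, the snippet below restores it with zfill() and also shows int.from_bytes() as an alternative to hexlify() (the big-endian byte order is our choice, made so that the result matches the hexlify() approach):

```python
import binascii

data = "GeekBits".encode('utf-8')

# The hexlify-based approach from the example above.
via_hexlify = bin(int(binascii.hexlify(data), 16))[2:]

# int.from_bytes() builds the same integer without the hex step.
via_from_bytes = bin(int.from_bytes(data, 'big'))[2:]

# Pad to 8 bits per byte to restore any leading zeros.
padded = via_from_bytes.zfill(len(data) * 8)

print(via_hexlify == via_from_bytes)  # True
print(len(via_hexlify), len(padded))  # 63 64
```

The padded form is the one to use if the binary string must later be split back into 8-bit groups.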

Conclusion

In this tutorial, you learned various powerful methods and techniques that you can use to convert a string into its binary representation in Python. You can try each of the methods and see which suits you best.

If you enjoyed our tutorials, be sure to subscribe and share.
