
String in Python

Welcome to this tutorial on strings in Python. We have already used and discussed strings in the examples of previous tutorials, so this is not the first time we deal with them. In this tutorial, however, you will learn more about how strings are used, manipulated, and implemented in Python, along with some string functions for manipulating them.

 

What is a String?

In Python, a string is a sequence of characters. Two terms in this definition matter: sequence and character. As discussed in our previous tutorial on the sequence data types, a sequence is an ordered collection of elements, each of which can be reached by its position. Lists and tuples can hold elements of any type, such as integers, floats, or other strings, while a string holds only characters.

Note that Python strings use Unicode, a standard that assigns a unique code to every character. Unicode covers the characters of almost every written language, as well as emoji (emoji are characters too).

Thus, we can refer to a string as a special type of sequence in which every element is a character. For instance, the string "Hello, World" is the sequence ['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd']. Its length is the number of characters in the sequence, which is 12.

Also note that a space, a comma, and any other symbol inside the quotes each count as one character.

Many programming languages have a separate data type dedicated to single characters. Python does not; a character is simply a string of length 1.
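The points above can be checked with a short script. This is only a minimal sketch; the variable names are chosen here for illustration, and ord() and chr() are the built-in functions that convert between a character and its Unicode code.

```python
# A string behaves like a sequence whose elements are single characters.
greeting = "Hello, World"

characters = list(greeting)  # split the string into its characters
length = len(greeting)       # the comma and the space count too

# Python has no separate character type: one character is a str of length 1.
first = greeting[0]
first_type = type(first).__name__

# Every character has a Unicode code; ord() and chr() convert both ways.
code_point = ord("H")
back_to_char = chr(code_point)

print(characters)
print(length)        # 12
print(first_type)    # str
print(code_point)    # 72
print(back_to_char)  # H
```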

 

Declaration of Strings

>>> mystring = "This is String"
>>> print(mystring)
This is String

 

Each character of a string can be accessed just like an element of any other sequence, by using its index number. Check out the example below to learn how to access the first character of mystring.

>>> print(mystring[0])
T

 

From the above example, T is the first character of the string This is String, so it has an index number of 0 (zero). For the following characters we use indexes 1, 2, 3, and so on. In general, to access the ith character we use the index i-1.

In addition, there is a different way to access elements of the sequence from its end. Check out the example below to see how to access the last element of the sequence.

>>> print(mystring[-1])
g

In the example above, the index -1 asks for the 1st element from the end, which is g. To access the 2nd last element use -2 as the index, for the 3rd last use -3, and so on. In general, for the ith element from the end use -i as the index. This settles how to access each character from both the forward and the backward side of a string.

Take note that the positive index number implies you are accessing the character from the forward side, and the negative index number means you're accessing it from the rear end.

The table below summarizes what we have learned so far. If we consider the string PYTHON, each character can be accessed in two ways: from the front with a forward index, or from the rear with a backward index.

 

Characters        P    Y    T    H    O    N
Forward Index     0    1    2    3    4    5
Backward Index   -6   -5   -4   -3   -2   -1
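The table can be reproduced in code. The sketch below uses an illustrative variable called word and prints, for each character, its forward index and the matching backward index.

```python
word = "PYTHON"

# Forward indexes run from 0 to len(word) - 1.
first_forward = word[0]
last_forward = word[5]

# Backward indexes run from -1 (last character) down to -len(word) (first character).
last_backward = word[-1]
first_backward = word[-6]

# For any forward index i, the matching backward index is i - len(word).
for i in range(len(word)):
    print(i, i - len(word), word[i])
```

Both indexes in each printed row point at the same character, so word[2] and word[-4] are both T.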

 

Escape Sequence

Now, suppose you want to store a quote by Mahatma Gandhi in a string and display this exact line in the console, including the double quotes around the sentence: "You must be the change you wish to see in the world" - Gandhi. This is not as easy as it looks. If you simply wrap the whole line in another pair of double quotes and print it, Python immediately returns a syntax error because of the extra double quotes inside the string.

Syntax highlighting makes the problem visible. In IDLE, for example, the characters inside a string are highlighted in green, and here only - Gandhi is green while the quoted sentence stays black. The exact colors depend on the text editor, the OS, the Python version, and so on.

This shows that Python is not treating You must be the change you wish to see in the world as part of a string. Once a quote that opens a string has been closed, anything written after the closing quote is read as Python code, not as text.

In the line above, the string starts with two double quotes in a row: the first opens the string and the second immediately closes it, leaving an empty string. The sentence You must be the change you wish to see in the world comes after that closing quote, so Python tries to read it as code and cannot make sense of it. After the sentence, another double quote opens a new string that contains - Gandhi and ends at the final double quote, so that part is perfectly legitimate.

 

There are two ways to put quotes inside a string.

  1. For the first one, we can use single quotes inside of double quotes. Check the example below.
>>> print("'You must be the change you wish to see in the world' - Gandhi")

'You must be the change you wish to see in the world' - Gandhi

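So, it is legitimate to use single quotes inside double quotes, and the reverse works as well: double quotes can be used inside single quotes. Check out the example below.

>>> '"You must be the change you wish to see in the world" - Gandhi'
'"You must be the change you wish to see in the world" - Gandhi'

No error is raised. The error only appears when the quote character inside the string is the same as the one that surrounds it.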

 

2. The second one, in contrast to the first method, uses double quotes both around and inside the string. To make this work, we need an escape sequence, which is written with a backslash (\). Check out the example below.

>>> print("\"You must be the change you wish to see in the world\" - Gandhi")

Notice that we used a backslash in two places, just before each of the quotes that we want printed. The backslash tells the interpreter that the quote following it is an ordinary character to print, not the end of the string.

Note that each escape sequence covers only one character. For example, to print 5 double quotes we need 5 backslashes, one before each quote. Check the example below.

>>> print("\"\"\"\"\"")
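To compare the approaches side by side, here is a small sketch. The variable names are made up for illustration. It stores the same quote using quotes of the other kind and using backslash escapes, and then builds the string of five double quotes.

```python
# Single quotes inside a double-quoted string.
single_inside = "'You must be the change you wish to see in the world' - Gandhi"

# Double quotes inside a single-quoted string.
double_inside = '"You must be the change you wish to see in the world" - Gandhi'

# Double quotes inside a double-quoted string, each one escaped with a backslash.
escaped = "\"You must be the change you wish to see in the world\" - Gandhi"

# Five double quotes need five backslashes, one before each quote.
five_quotes = "\"\"\"\"\""

print(single_inside)
print(double_inside)
print(escaped)
print(five_quotes)
print(double_inside == escaped)  # True: the backslashes are not stored in the string
print(len(five_quotes))          # 5
```

The last two lines show that the backslash is only a way of writing the quote in source code. It is not one of the characters of the resulting string.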

 

Input and Output for String 

We discussed input and output methods in our previous tutorial. You can go over that tutorial again to refresh your knowledge of the topic.

 

Operations on String 

String handling in Python takes very little effort, because the most common operations are built into the language and need only a short expression. Let's look at some of the ways we can handle strings.

  1. Concatenation: This simply refers to the joining of two strings, such as joining "Hello" with "World" to get "HelloWorld". Check out the example below.
>>> print("Hello" + "World")

HelloWorld

From the example above, the plus sign + joins the two strings exactly as they are, so no space is added between them. If you want a space, it has to be part of one of the strings. Check out the example below:

>>> s1 = "Amazon "
>>> s2 = "Alexa "
>>> s3 = "Have Good AI"
>>> print (s1 + s2 + s3)

Amazon Alexa Have Good AI

2. Repetition: This is used when we want to write the same text multiple times on the console, such as repeating "Hi!" 100 times. Check out the example below.

>>> print ("Hi!"*100)

Now, what if we want the user to enter a number n and then print a text on the console n times? We can read the number with the input() function and multiply the text by n. Since input() returns a string, the value must be converted with int() first.

>>> n = int(input("Number of times you want the text to repeat: "))

Number of times you want the text to repeat: 5

>>> print("Text" * n)

TextTextTextTextText

3. Check existence of a character or a sub-string in a string: The in keyword is used for this. Check out the example below. 

>>> "AI" in "Is alexa having good AI ?"

True

Boolean is one of the data types in Python, and it has only two possible values: True and False. Since we are checking whether something exists in a string, the answer is either yes, it exists, or no, it doesn't, so True or False is returned. This also gives an idea of where the Boolean data type is useful in programming.

4. not in keyword: This keyword is the opposite of the in keyword, and it is used in the same way.

>>> "Google" not in "Is alexa having good AI ?"

True
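The four operations can be tried together in one script. The sketch below is only illustrative. repeat_text is a hypothetical helper written for this example, and the string "5" stands in for what input() would return, so the script runs without waiting for the user.

```python
def repeat_text(text, count_text):
    """Repeat text as many times as the numeric string count_text says."""
    count = int(count_text)  # input() returns a string, so convert it first
    return text * count


joined = "Hello" + " " + "World"     # concatenation; the space is added explicitly
repeated = repeat_text("Text", "5")  # repetition with a count given as text
has_ai = "AI" in "Is alexa having good AI ?"
lacks_google = "Google" not in "Is alexa having good AI ?"

print(joined)        # Hello World
print(repeated)      # TextTextTextTextText
print(has_ai)        # True
print(lacks_google)  # True
```

In a real program the second argument would come from input(), for example repeat_text("Text", input("Number of times: ")).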

 

Converting String to Int or Float datatype and vice versa 

Many beginners are confused about whether a number enclosed in quotes is still a number in Python. It is not: it becomes a string, and mixing it with numbers in a mathematical operation, such as adding an integer to it, raises an error.

numStr = '123'

In the above example, '123' is not a number; it is a string.

In such a case, the float() and int() functions can be used to convert a numeric string into a float or an int.

numStr = '123'
numFloat = float(numStr)
numInt = int(numFloat)

In addition to the above example, to convert an int or float variable to a string, the str() function can be used.

num = 123
numStr = str(num)
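Not every string can be converted. int() and float() raise a ValueError when the text is not a valid number. The sketch below wraps the conversions in a hypothetical helper named to_number, which is an assumption made for this example and not a built-in function, and also shows how + behaves differently on strings and on numbers.

```python
def to_number(text):
    """Return an int if possible, otherwise a float, otherwise None."""
    try:
        return int(text)
    except ValueError:
        pass  # not a whole number, try float next
    try:
        return float(text)
    except ValueError:
        return None  # the text is not numeric at all


print(to_number("123"))   # 123
print(to_number("12.5"))  # 12.5
print(to_number("abc"))   # None

# The same + sign concatenates strings but adds numbers.
print("123" + str(4))     # 1234
print(int("123") + 4)     # 127
```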

 

 

Slicing 

Slicing is another string operation in Python. It lets us extract a part of a string based on a start index and an end index. For example, if we have the string This is a Python tutorial and want to extract a part of it, or just a single character, we can use slicing. Below is the general syntax for slicing.

string_name[starting_index : finishing_index : character_iterate]

string_name: This is the name of the variable holding the string.

starting_index: This is the index of the first character that you want in your substring.

finishing_index: This is one more than the index of the last character that you want in your substring.

character_iterate: This is the step, that is, how many positions to move forward for each character taken. Suppose we have the string Hello Brother! and we want to use the slicing operation to extract a substring from it. This can be done as shown below:

 

>>> text = "Hello Brother!"
>>> print(text[0:10:2])

From the above example, text[0:10:2] means that we want to extract a substring starting from index 0 (i.e. the beginning of the string) up to, but not including, index 10, and the last parameter means we want every second character, counting from the starting index. Thus, our output will be HloBo.

H is at index 0. The e after it is skipped and the l at index 2 is taken; then the second l is skipped and the o at index 4 is taken; and so on.

Let's take more examples, for better understanding. We will take a string with 10 characters, ABCDEFGHIJ, and our index number will start from 0 and end at 9. Check the table below:

A B C D E F G H I J
0 1 2 3 4 5 6 7 8 9

With the string stored in a variable s, let's try the command below:

>>> s = "ABCDEFGHIJ"
>>> print(s[0:5:1])

In this case, slicing is done from the 0th character to the 4th character (5-1), moving 1 character in each jump, so the output is ABCDE.

Now let's remove the last number and the colon before it, and write the code below.

>>> print(s[0:5])

Notice that the output is the same for both, because the step is 1 when it is left out.

To understand this better, practice by changing the value of character_iterate to some number n. The slice will then contain every nth character, starting at the starting index and stopping before the finishing index.
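As a starting point for that practice, the sketch below slices the 10-character string with a few different step values. The variable names are illustrative.

```python
s = "ABCDEFGHIJ"

first_five = s[0:5]                 # step left out, so it is 1
first_five_with_step = s[0:5:1]     # the same slice with the step written out

print(first_five)                   # ABCDE
print(first_five_with_step)         # ABCDE

# Every nth character from index 0 up to, but not including, index 10.
for n in (1, 2, 3):
    print(n, s[0:10:n])
```

With n = 2 the slice is ACEGI, and with n = 3 it is ADGJ.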