Strings

In programming, a string is a sequence of characters that can be stored in a variable or used as a constant. Strings are among the most widely used data types in Python. In fact, the growing prevalence of Python in bioinformatics is largely due to the ease with which it performs operations on strings. A string variable can be assigned using single or double quotes.

var1 = 'String A'
var2 = "String B"
print (var1)
print (var2)
String A
String B
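
As a small illustration of the two quote styles, the sketch below (with hypothetical variable names) shows that both produce the same kind of string, and that choosing one style lets the other quote character appear inside the string without escaping.

```python
# Single and double quotes create identical string objects.
single = 'String A'
double = "String A"
print(single == double)    # True

# Using one quote style lets the other appear inside the string unescaped.
quote1 = "It's a string"
quote2 = 'She said "hello"'
print(quote1)
print(quote2)
```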

The value of a string variable can be changed simply by reassigning it a new value.

var1 = 'String number two'
print (var1)
print("Another string")
String number two
Another string

The length of a string can be determined using the len function.

len(var2)
8
var1 = 'Hello'
var2 = 'World'
print(var1+var2)
HelloWorld

String concatenation

The arithmetic operators + and * can be used directly with strings to concatenate (addition) or repeat (multiplication).

"Hello"+"World"
var1 = "Hello"
var2 = "World!"
print (var1+var2)
var3 = var1*3
print(var3)
HelloWorld!
HelloHelloHello

5*3
5+5+5
15
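
The following is a minimal sketch of concatenation and repetition using hypothetical example values (short DNA-like strings, which are not taken from the text above). It also shows that + joins strings only with other strings, so a number has to be converted with str() first.

```python
# Hypothetical example values chosen for illustration.
first = "ATG"
second = "CCA"

joined = first + second      # concatenation
repeated = first * 3         # repetition
print(joined)                # ATGCCA
print(repeated)              # ATGATGATG

# + only joins strings with strings; numbers must be converted first.
count = 3
message = "Repeats: " + str(count)
print(message)               # Repeats: 3
```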

Slice of a string

Slicing is another very useful operation for manipulating strings. The slice operator [] returns the characters between the start and end positions, which are separated by a colon. Characters within a string are numbered from 0. Note that the character at the start position is included in the output but the character at the end position is not. Slicing effectively returns a substring of the given string. The general syntax is string[start:end], for example:

var4 = "ABCDEFG"
var4[1:5]
'BCDE'
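
To make the "start included, end excluded" rule concrete, here is a short sketch using the same letters under a hypothetical variable name. The length of a slice works out to end minus start.

```python
seq = "ABCDEFG"           # hypothetical name; same letters as the example above
middle = seq[1:5]         # indices 1, 2, 3 and 4; index 5 is excluded
head = seq[0:3]           # indices 0, 1 and 2

print(middle)             # BCDE
print(head)               # ABC
print(len(middle))        # 4, which equals 5 - 1
```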

If no value is specified before or after the colon, the slice starts from the beginning or runs to the end of the string, respectively.

print(var3)
HelloHelloHello
var3[:7]
'HelloHe'
var3[3:]
'loHelloHello'
print(var3)
var3[2::2]
HelloHelloHello
'loelHlo'
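
The last example above uses a third value after a second colon, which acts as a step. As a hedged sketch with a hypothetical variable name, the full form string[start:end:step] can be explored like this, including negative values, which count from the end of the string:

```python
text = "HelloHelloHello"

every_second = text[2::2]   # start at index 2, take every second character
last_five = text[-5:]       # negative start counts from the end
reversed_text = text[::-1]  # a step of -1 walks the string backwards

print(every_second)         # loelHlo
print(last_five)            # Hello
print(reversed_text)        # olleHolleHolleH
```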

To get a single character from a string, use the index of that character within the square brackets.

var3
##Exercise 
## write a command that outputs 'HHH' given a string 'HelloHelloHello'
## Solution is given below

print(var3[::5])
HHH
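
Retrieving a single character by its index, as described before the exercise, can be sketched as follows. The variable name is hypothetical.

```python
word = "Hello"          # hypothetical example string

first_char = word[0]    # indexes start at 0
fifth_char = word[4]
last_char = word[-1]    # negative indexes count from the end

print(first_char)       # H
print(fifth_char)       # o
print(last_char)        # o
```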

To declare a variable whose value is a long string that spans multiple lines, triple quotes can be used. Triple quotes allow newlines and tabs to be included in the string verbatim.

var4 = """This is an example of
a long string that spans 
three lines."""
print (var4)
This is an example of
a long string that spans 
three lines.
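
To see what "verbatim" means here, the sketch below (hypothetical variable name) prints the internal representation of a triple-quoted string. The line break typed in the source is stored as a newline character, \n.

```python
multi = """line one
line two"""

print(repr(multi))                      # 'line one\nline two'
print(multi == "line one\nline two")    # True
print(multi.splitlines())               # ['line one', 'line two']
```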

String comparison

String comparison is one of the most frequently required tasks in programming. In Python, the comparison operators can be used to compare two strings.

var1 = 'Hello' 
var2 = 'Hello'
var3 = 'Hi'
print(var1 == var2)
print(var1 == var3)
True
False

String comparison is case sensitive.

var3 = 'hello'
print(var1 == var3)
False
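
When case should be ignored, one common approach is to convert both strings to the same case before comparing. The sketch below uses hypothetical variable names and the lower method of strings. It also shows that the operators != and < work on strings, with < comparing character by character.

```python
a = 'Hello'
b = 'hello'

print(a == b)                    # False: comparison is case sensitive
print(a.lower() == b.lower())    # True: both become 'hello'
print(a != b)                    # True
print('apple' < 'banana')        # True: 'a' comes before 'b'
```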

Strings are immutable

In Python, strings are immutable, i.e. their contents cannot be changed once they have been created. A variable can, however, be reassigned. The variable or label can be pointed at a different string, but the data in the original string can't be changed.

var1 = 'Hi'
var2 = var1
var1 = var1*3
print(var1)
print(var2)
var2 = 'Hello'
print(var2)
HiHiHi
Hi
Hello
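
A minimal sketch of what immutability means in practice, using hypothetical variable names: trying to change one character in place raises a TypeError, so a modified version has to be built as a new string.

```python
greeting = 'Hello'

try:
    greeting[0] = 'J'          # attempt to change a character in place
except TypeError as err:
    print("Cannot modify in place:", err)

# Build a new string instead; the original is left untouched.
new_greeting = 'J' + greeting[1:]
print(new_greeting)            # Jello
print(greeting)                # Hello
```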

Splitting strings

Sometimes a string needs to be split on a certain delimiter, and the split method is designed for that task. Python strings have a split method that returns a list of elements after splitting the string at the delimiter (the default is whitespace).

s1 = "This is a sentence."
words1 = s1.split()
print(words1)

s2 = "This is an another sentence, a longer one."
words2 = s2.split(',')
print(words2)
['This', 'is', 'a', 'sentence.']
['This is an another sentence', ' a longer one.']
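
As a further sketch, split accepts any delimiter string and an optional maximum number of splits, and the join method reverses the operation. The tab-delimited record below is a made-up example and is not taken from the text above.

```python
record = "gene1\t100\t200"         # hypothetical tab-delimited record

fields = record.split("\t")        # split on the tab character
print(fields)                      # ['gene1', '100', '200']

rejoined = ",".join(fields)        # join the pieces with a comma
print(rejoined)                    # gene1,100,200

# The optional second argument limits the number of splits.
limited = "a b c d".split(" ", 1)
print(limited)                     # ['a', 'b c d']
```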

Quiz

What would be the output when “is” is used as the delimiter for splitting the above sentence?

Python strings have the methods upper and lower to change the case of a string. These methods act on the string and return a new string with the case changed.

s1 = 'apple'
#print(s1)
new_s1 = s1.upper()
print(s1)
print(new_s1)
#s2 = new_s1.lower()
#print(s2)
apple
APPLE
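
One place where these methods are handy is normalising text of mixed case before further processing. The sketch below uses a made-up mixed-case sequence. Because each method returns a new string, the original value is unchanged.

```python
raw = "acgtACGT"                  # hypothetical mixed-case sequence

upper_seq = raw.upper()
lower_seq = raw.lower()

print(upper_seq)                  # ACGTACGT
print(lower_seq)                  # acgtacgt
print(raw)                        # acgtACGT, the original is unchanged
```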
a = (3,4,4)
print(type(a))
b = [2,3,4]
print(type(b))
print(not(3>4))
print(22.0/7)
print((3>4) and (3>5))

User input

The input functions differ between Python 2.x and 3.x: the former has raw_input() while the latter has input(). By default, the data type of the input is string. If you need the input as an integer or a float, convert it using the appropriate function.

name = input("Please enter your name: ")
print("The name entered is ", name)
Please enter your name: Manish
The name entered is  Manish
a = "2"
print(a)
print(type(a))
2
<class 'str'>
number1 = input("Please enter a number: ")
number2 = int(number1)
print(number2+1)
Please enter a number: 5
6
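
Conversion fails with a ValueError when the text is not a valid number. The sketch below shows one possible way to handle that. The helper to_number is a hypothetical function written for this illustration and is not part of Python. It takes the string that input() would return, so it can be tried without typing anything.

```python
def to_number(text):
    """Return text as an int if possible, else as a float, else None."""
    try:
        return int(text)
    except ValueError:
        pass                      # not an integer; try float next
    try:
        return float(text)
    except ValueError:
        return None               # neither an integer nor a float

print(to_number("5"))             # 5
print(to_number("2.5"))           # 2.5
print(to_number("five"))          # None
```

In a program, the argument would typically come from a call such as input("Please enter a number: ").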

Exercise

Write a program that takes two numbers as input and prints the result of their comparison.

num1 = int(input("Enter first number: "))
num2 = int(input("Enter second number: "))
if num1 < num2:
    print("First number is less than the second number")
elif num1 > num2:
    print("First number is greater than the second number")
else:
    print("Both numbers are equal")