5.12. The basic data types#
Think of how you interpret the following:
“According to Pythagoras, with sides of a right angle triangle equal to 3 and 4, the hypotenuse is 5.”
Based on the context, you infer the numerical meaning of the characters 3, 4 and 5. But to a computer, everything within the quotes is just a series of characters. We need to define our data explicitly in order for the computer to be able to operate on it effectively.
This leads directly to the notion of data types. Python comes with a number of core data types that I define below.
- string
Specified using either '' or "" around the content of interest. This is just a series of characters. It can be empty (has length 0) or much longer than that. For more detailed discussion of using strings, see Dealing with strings. (An immutable data type [1].)
a = "a string"
a
'a string'
e = ""
e
''
Note that e is an empty string.
- int
An integer. Specified by writing a number without a decimal point (.). This is a numeric data type.
i = 4
- float
A floating point number. Specified by writing a number with a decimal point (.). This is a numeric data type.
pi = 3.14
f = 1.0
f
1.0
Note
A floating point number is NOT the same as a decimal number! Floats are stored in binary, so many decimal values (such as 0.1) can only be approximated.
- bool
A boolean, which can be either True or False. These are special values that are produced by the relational operators.
a = 2
a > 3
b = True
- None
A special value, the only instance of the type NoneType, which is often used as a default value.
a = None
a is None
True
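To make the earlier note about floats concrete, here is a minimal sketch of the approximation at work. The use of math.isclose and the decimal module is my own choice for illustration; both are in the standard library.

```python
import math
from decimal import Decimal

# 0.1 and 0.2 cannot be stored exactly in binary, so the sum is slightly off
total = 0.1 + 0.2
print(total)          # 0.30000000000000004
print(total == 0.3)   # False

# compare floats using a tolerance instead of ==
print(math.isclose(total, 0.3))   # True

# Decimal values built from strings do exact decimal arithmetic
exact = Decimal("0.1") + Decimal("0.2")
print(exact == Decimal("0.3"))    # True
```

In practice, this means you should avoid testing floats for exact equality.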
Now we get to “collection” data types [2]. Collections contain a number of elements and those elements can be of different types. Collection types are extremely powerful and wind up being a foundation for sophisticated algorithms.
In defining instances of collection types, different elements are delimited using a , separator.
Sometimes strings, lists and tuples are referred to as “sequence” types. In this grouping, strings are distinguished from tuples and lists since every element of a string is of the same type by definition. This constraint does not apply to lists or tuples.
- list
As the name implies, it is a series with zero or more (≥ 0) elements. These elements do not have to be the same type (as I illustrate) [3].
Lists are mutable, meaning they can be modified after creation.
l = [0, "text"]
l
[0, 'text']
- tuple
Almost the same as a list, but immutable [1] and defined using round brackets () rather than square brackets [].
t = (0, "text")
t
(0, 'text')
- dict
A dictionary. Like a conventional one, we look up entries in it using some “key” and get a “value” in return. Note the curly braces {} used in the definition and the use of : to separate the key and value. Subsequent key-value pairs are separated by a , character. As with tuples and lists, they can contain different data types. The keys for a dictionary must always be of an immutable [1] data type (so str, tuple, int, float) but the values can be of any data type. dicts are mutable: you can add or remove keys, and you can modify the value for a key as you want. For more detailed discussion of using dicts, see Dealing with dictionaries.
d = {"a": "first character", "b": 2}
d
{'a': 'first character', 'b': 2}
Add another key
d["new key"] = "some text"
d
{'a': 'first character', 'b': 2, 'new key': 'some text'}
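The difference between mutable and immutable collections described above can be sketched as follows. The try/except construct is used here only to keep the example running past the expected error; the variable tuple_changed is a hypothetical name I introduce for illustration.

```python
# lists are mutable: we can change an element and add new ones
l = [0, "text"]
l[0] = 99
l.append("more")
print(l)  # [99, 'text', 'more']

# tuples are immutable: assigning to an element raises a TypeError
t = (0, "text")
try:
    t[0] = 99
except TypeError:
    tuple_changed = False
else:
    tuple_changed = True
print(tuple_changed)  # False

# dicts are mutable: we can modify a value and remove a key
d = {"a": "first character", "b": 2}
d["b"] = 3
del d["a"]
print(d)  # {'b': 3}
```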
5.12.1. How to tell the type of a variable#
Well that’s easy!
a = 4
type(a)
int
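As a slightly larger sketch, the same type() function can be applied to an instance of each data type introduced above. I also show the built-in isinstance(), which is not covered in the text, as one way to check whether a value is of a particular type.

```python
values = ["a string", 4, 1.0, True, None, [0, "text"], (0, "text"), {"a": 1}]

# print each value alongside the name of its type
for v in values:
    print(repr(v), type(v).__name__)

# isinstance() answers a yes/no question about a value's type
print(isinstance(4, int))     # True
print(isinstance(4.0, int))   # False
print(isinstance(True, int))  # True, since bool is a subclass of int
```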
5.12.2. Type casting#
In programming, this has the explicit meaning of converting one data type into another. Of course, this is not always possible. For instance, it makes no sense to try and convert a dict into a float.
Casting is done using functions with names matching the data type.
5.12.2.1. int to float#
i = 4
f = float(i)
f
4.0
5.12.2.2. float to int#
f = 4.8
i = int(f)
i
4
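Note that the result above is 4, not 5. The following sketch illustrates that int() discards the fractional part (truncating towards zero) rather than rounding. The comparison with the built-in round() and with math.floor() is my addition for contrast.

```python
import math

# int() drops everything after the decimal point
print(int(4.8))    # 4
print(int(-4.8))   # -4

# round() goes to the nearest integer instead
print(round(4.8))   # 5
print(round(-4.8))  # -5

# math.floor() always goes down, which differs from int() for negatives
print(math.floor(-4.8))  # -5
```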
5.12.2.3. string to float#
The built-in type functions can handle strings that contain appropriate content for the designated type (meaning the text contains a number), even if the text has flanking whitespace.
s = " 4.45"
f = float(s)
f, type(f)
(4.45, float)
or
s = "\t4.45\n"
f = float(s)
f, type(f)
(4.45, float)
But casting from a string can require multiple steps. For instance, you cannot directly cast a string that contains a decimal number, like s, to an int.
i = int(s)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[16], line 1
----> 1 i = int(s)
ValueError: invalid literal for int() with base 10: '\t4.45\n'
5.12.2.4. string to list, tuple#
Casting between the collection types is similar.
l = list(s)
l
t = tuple(s)
t
('\t', '4', '.', '4', '5', '\n')
Casting to a dict requires more work, as the original data type must have a shape that matches the required key, value pair pattern.
5.12.2.5. Objects to strings#
This is an extremely common task, not least because of the need to convert data to strings for writing to file. I will show two basic approaches.
“C-style” format strings#
So-called because this is the approach used in the C programming language. Here we use the % sign in two different ways. First, we define a template string with placeholders for whichever data we need to convert. These placeholders are indicated by %<c>, where the following character (which I’ve indicated by <c>) specifies the type of data that will be put there. After the closing quote, a second % precedes the actual variables to be cast.
In the following I convert to a string: an int (using %d); a float to two places (using %.2f); a dict (using the generic %s, which can be applied to any object).
i = 24
s = "%d" % i
s
'24'
f = 3.14678
s = "%.2f" % f
s
'3.15'
d = {1: ["some text", 4, "in a list!"]}
s = "%s" % d
s
"{1: ['some text', 4, 'in a list!']}"
You can of course have multiple elements in a single statement.
s = "%d\t%.2f\n" % (i, f)
s
'24\t3.15\n'
Note
For multiple data to be converted, they must be enclosed within () after the %.
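One consequence of this rule is worth a short sketch: because a tuple after the % is read as a set of separate values, formatting a tuple as a single object requires wrapping it in another, one-element, tuple. The variable names here are hypothetical, and the try/except is used only so the example runs past the expected error.

```python
i = 24
f = 3.14678

# two placeholders, so two values enclosed in ()
row = "%d\t%.2f\n" % (i, f)
print(repr(row))  # '24\t3.15\n'

t = (0, "text")

# one placeholder but two values in t, so this fails
try:
    s = "%s" % t
    failed = False
except TypeError:
    failed = True
print(failed)  # True

# wrapping t in a one-element tuple treats it as a single value
s = "%s" % (t,)
print(s)  # (0, 'text')
```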
Using “format” strings#
These are formatted string literals (commonly called f-strings), available since Python 3.6. I’ll bundle the int and float into a single statement.
i = 20
x = 420000.134
s = f"{i}\t{x:,.2f}\n"
s
'20\t420,000.13\n'
Note
The f preceding the quotes is what indicates this is a format string. You indicate where a variable should go using the {variable name} syntax. The formatting of numbers happens after the :. The :, indicates that thousands are separated by “,”, and the .2f means float to 2 places.
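The same formatting codes work outside of f-strings too. As a minimal sketch, the following shows the str.format() method and the built-in format() function producing the same result, plus an f-string that contains an expression and a field width. These alternatives are my additions and are not otherwise covered in this section.

```python
i = 20
x = 420000.134

# the f-string from above
a = f"{i}\t{x:,.2f}\n"

# the same template using the str.format() method
b = "{}\t{:,.2f}\n".format(i, x)
print(a == b)  # True

# format() applies a format code to a single value
c = format(x, ",.2f")
print(c)  # 420,000.13

# the braces can hold an expression; >6d right-aligns an int in 6 characters
w = f"{i * 2:>6d}"
print(repr(w))  # '    40'
```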
5.13. Exercises#
Rewrite the code from string to float so that you successfully convert the string into an integer.
What happens when you cast the following to a dict using the dict() command?
data = [0, "a", 1, "b"]
What happens when you cast the following to a dict using the dict() command?
data = [[0, "a"], [1, "b"]]
Try creating a dict using different data types as keys. Do they all work?
Make a really large int. Format it as a string with a thousands separator.
Create a float and convert it to a string. Repeat this, but change the displayed precision (how many decimal places are shown).