Chapter 3 Lists
In Chapter 2, we saw that strings can be thought of as sequences of characters in a particular order. In this chapter, we’ll learn about the list data type, which is the general Python container for a list of arbitrary elements in a particular order. Python lists are similar to the array data type in other languages (such as JavaScript and Ruby), so programmers familiar with other languages can probably guess a lot about how Python lists behave. (Although Python’s standard library does include an array type, in this tutorial “array” always refers to the ndarray data type defined by the NumPy library, which is covered in Section 11.2.)
We’ll start by explicitly connecting strings and lists via the split() method (Section 3.1), and then learn about various other list methods and techniques throughout the rest of the chapter. In Section 3.6, we’ll also take a quick look at two closely related data types, Python tuples and sets.
3.1 Splitting
So far we’ve spent a lot of time understanding strings, and there’s a natural way to get from strings to lists via the split() method:
$ source venv/bin/activate
(venv) $ python3
>>> "ant bat cat".split(" ") # Split a string into a three-element list.
['ant', 'bat', 'cat']
We see from this result that split() returns a list of the strings that are separated from each other by a space in the original string.
Splitting on space is one of the most common operations, but we can split on nearly anything else as well (Listing 3.1).
>>> "ant,bat,cat".split(",")
['ant', 'bat', 'cat']
>>> "ant, bat, cat".split(", ")
['ant', 'bat', 'cat']
>>> "antheybatheycat".split("hey")
['ant', 'bat', 'cat']
Many languages support this sort of splitting, but note that when a string ends with the separator, Python includes an empty string at the end of the result, which some languages (such as Ruby) trim automatically. We can avoid this extra string in the common case of splitting on newlines by using splitlines() instead (Listing 3.2).
>>> s = "This is a line.\nAnd this is another line.\n"
>>> s.split("\n")
['This is a line.', 'And this is another line.', '']
>>> s.splitlines()
['This is a line.', 'And this is another line.']
Many languages allow us to split a string into its component characters by splitting on the empty string, but this doesn’t work in Python:
>>> "badger".split("")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: empty separator
In Python, the best way to do this is to use the list() function directly on the string:
>>> list("badger")
['b', 'a', 'd', 'g', 'e', 'r']
Because Python can naturally iterate over a string’s characters, this technique is rarely needed explicitly; instead, we’ll typically use iterators, which we’ll learn about in Section 5.3.
Perhaps the most common use of split() is with no arguments; in this case, the default behavior is to split on whitespace (such as spaces, tabs, or newlines):
>>> "ant bat cat".split()
['ant', 'bat', 'cat']
>>> "ant bat\t\tcat\n duck".split()
['ant', 'bat', 'cat', 'duck']
We’ll investigate this case more closely when discussing regular expressions in Section 4.3.
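The difference between split(" ") and split() with no arguments is easiest to see on a string with irregular spacing. The following is a minimal sketch; the variable names (text, on_space, on_whitespace) are illustrative and not part of the tutorial’s own examples.

```python
# A string with a double space, a tab, and a trailing newline.
text = "ant  bat\tcat\n"

# Splitting on a single space treats every space as a separator,
# so the double space yields an empty string, and the tab and
# newline stay inside the last element.
on_space = text.split(" ")

# Splitting with no arguments treats any run of whitespace as one
# separator and ignores leading and trailing whitespace.
on_whitespace = text.split()

print(on_space)       # ['ant', '', 'bat\tcat\n']
print(on_whitespace)  # ['ant', 'bat', 'cat']
```

For text typed by humans, the no-argument form is usually the one we want.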
3.1.1 Exercises
- Assign a to the result of splitting the string “A man, a plan, a canal, Panama” on comma-space. How many elements does the resulting list have?
- Can you guess the method to reverse a in place? (Google around if necessary.)
3.2 List access
Having connected strings with lists via the split() method, we’ll now discover a second close connection as well. Let’s start by assigning a variable to a list of characters created using list():
>>> a = list("badger")
>>> a
['b', 'a', 'd', 'g', 'e', 'r']
Here we’ve followed tradition and called the variable a, both because it’s the first letter of the alphabet and as a nod to the array type that lists so closely resemble.
We can access particular elements of a using the same bracket notation we first encountered in the context of strings in Section 2.6, as seen in Listing 3.3.
>>> a[0]
'b'
>>> a[1]
'a'
>>> a[2]
'd'
We see from Listing 3.3 that, as with strings, lists are zero-offset, meaning that the “first” element has index 0, the second has index 1, and so on. This convention can be confusing, and in fact it’s common to refer to the initial element for zero-offset lists as the “zeroth” element as a reminder that the indexing starts at 0. This convention can also be confusing when using multiple languages (some of which start list indexing at 1), as illustrated in the xkcd comic strip “Donald Knuth”.
So far we’ve dealt exclusively with lists of characters, but Python lists can contain all types of elements (Listing 3.4).
>>> soliloquy = "To be, or not to be, that is the question:"
>>> a = ["badger", 42, "To be" in soliloquy]
>>> a
['badger', 42, True]
>>> a[2]
True
>>> a[3]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
We see here that the square bracket access notation works as usual for a list of mixed types, which shouldn’t come as a surprise. We also see that trying to access an index outside the defined range raises an IndexError.
Another convenient feature of Python bracket notation is supporting negative indices, which count from the end of the list:
>>> a[-2]
42
Among other things, negative indices give us a compact way to select the last element in a list. Because len() (Section 2.4) works on lists as well as strings, we could do it directly by subtracting 1 from the length (which we have to do because lists are zero-offset):
>>> a[len(a) - 1]
True
But it’s even easier like this:
>>> a[-1]
True
A final common case is where we want to access the final element and remove it at the same time. We’ll cover the method for doing this in Section 3.4.3.
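To make the relationship between negative and ordinary indices concrete, here is a short sketch (the loop variable k is just an illustrative name). It checks that a[-k] and a[len(a) - k] pick out the same element, and shows that counting back past the front of the list raises the same IndexError as running off the end.

```python
a = ["badger", 42, True]

# For k from 1 up to len(a), a[-k] is the same element as a[len(a) - k].
for k in range(1, len(a) + 1):
    print(a[-k] == a[len(a) - k])  # prints True three times

# Counting back past the front of the list raises IndexError,
# just like reading past the end.
try:
    a[-4]
except IndexError as error:
    print(f"IndexError: {error}")
```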
By the way, starting in Listing 3.4, we used a literal square-bracket syntax to define lists by hand. This notation is so natural that you probably didn’t even notice it, and indeed it’s the same format the REPL uses when printing out lists.
We can use this same notation to define the empty list [], which just evaluates to itself:
>>> []
[]
You may recall from Section 2.4.2 that empty or nonexistent things like "", 0, and None are False in a boolean context. This pattern holds for the empty list as well:
>>> bool([])
False
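In practice, this means we can test whether a list has any elements just by using it in a condition. The sketch below assumes a hypothetical helper called describe(), written only for illustration (functions are covered later in the tutorial).

```python
def describe(items):
    """Return a short description of a list (hypothetical helper)."""
    if not items:  # True exactly when items is empty
        return "empty list"
    return f"list with {len(items)} element(s)"

print(describe([]))               # empty list
print(describe(["badger", 42]))   # list with 2 element(s)
```

Writing "if not items" is generally considered more idiomatic than comparing len(items) with 0.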
3.2.1 Exercises
- We’ve seen that list(str) returns a list of the characters in a string. How can we make a list consisting of the numbers in the range 0–4? Hint: Recall the range() function first encountered in Listing 2.24.
- Show that you can create a list of numbers in the range 17–41 using list() with range(17, 42).
3.3 List slicing
In addition to supporting the bracket notation described in Section 3.2, Python excels at a technique known as list slicing (Figure 3.1) for accessing multiple elements at a time. In anticipation of learning to sort in Section 3.4.2, let’s redefine our list a to have purely numerical elements:
>>> a = [42, 8, 17, 99]
>>> a
[42, 8, 17, 99]

One way to slice a list is to use the slice() function and provide two arguments corresponding to the index number where the slice should start and where it should end. For example, slice(2, 4) lets us pull out the elements with index 2 and 3, ending at 4:
>>> a[slice(2, 4)] # Not Pythonic
[17, 99]
This can be a little tricky to understand since there is no element with index 4 due to lists being zero-offset. We can understand this better by imagining a pointer that moves one element to the right as it creates the slice; it starts at 2, selects element 2 as it moves to 3, and then selects element 3 as it moves to 4.
The explicit slice() notation is rarely used in real Python code; far more common is the equivalent notation using colons, like this:
>>> a[2:4] # Pythonic
[17, 99]
Note that the index convention is the same: to select elements with indices 2 and 3, we give an end index that is one more than the index of the final element in the slice (in this case, 3+1=4).
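As a quick check on this convention, the following sketch compares the two notations with a list built by hand from indices 2 and 3. The variable names (explicit, with_colon, by_hand) are illustrative only.

```python
a = [42, 8, 17, 99]

explicit = a[slice(2, 4)]  # slice object passed inside the brackets
with_colon = a[2:4]        # equivalent colon notation
by_hand = [a[2], a[3]]     # the elements with indices 2 and 3

print(explicit == with_colon == by_hand)  # True

# A handy consequence of the "one past the end" convention:
# the length of a[start:end] is end - start (for indices in range).
print(len(a[2:4]) == 4 - 2)  # True
```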
In the case of our current list, 4 is the length of the list, so in effect we are slicing from the element with index 2 to the end. This is such a common task that Python has a special notation for it, in which we just leave the second index off entirely:
>>> a[2:] # Pythonic
[17, 99]
As you might guess, the same basic notation works to slice from the front of the list:
>>> a[:2] # Pythonic
[42, 8]
The general pattern here is that a[start:end] selects from index start to index end-1, where either can be omitted to select from the start or to the end. Python also supports an extension to this syntax taking the form a[start:end:step], which is the same as regular list slicing except taken step at a time. For example, we can select numbers from a range 3 at a time as follows:
>>> numbers = list(range(20))
>>> numbers
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> numbers[0:20:3] # Not Pythonic
[0, 3, 6, 9, 12, 15, 18]
Or we could start at, say, index 5 and end at index 17:
>>> numbers[5:17:3]
[5, 8, 11, 14]
As with regular slicing, we can omit values if we want the start or the end:
>>> numbers[:10:3] # Goes from the beginning to 10-1
[0, 3, 6, 9]
>>> numbers[5::3] # Goes from 5 to the end
[5, 8, 11, 14, 17]
We can replicate the result of numbers[0:20:3] more Pythonically by omitting both 0 and 20:
>>> numbers[::3] # Pythonic
[0, 3, 6, 9, 12, 15, 18]
We can even go backward using a negative step:
>>> numbers[::-3]
[19, 16, 13, 10, 7, 4, 1]
This suggests a (perhaps too clever) way to reverse a list, which is to use a step of -1. Applying this idea to our original list looks like this:
>>> a[::-1]
[99, 17, 8, 42]
You may encounter this [::-1] construction in real-life Python code, so it’s important to know what it does, but there are more convenient and readable ways to reverse a list, as discussed in Section 3.4.2.
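One detail worth knowing about [::-1] is that, like every slice, it builds a new list and leaves the original alone. Here is a small sketch (the name backward is an illustrative choice):

```python
a = [42, 8, 17, 99]

backward = a[::-1]  # a new list with the elements in reverse order

print(backward)       # [99, 17, 8, 42]
print(a)              # [42, 8, 17, 99] (the original is unchanged)
print(backward is a)  # False: the slice is a separate object
```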
3.3.1 Exercises
- Define a list with the numbers 0 through 9. Use slicing and len() to select the third element through the third-to-last. Accomplish the same task using a negative index.
- Show that strings also support slicing by selecting just "bat" from the string "ant bat cat". (You might have to experiment a little to get the indices just right.)
3.4 More list techniques
There are many other things we can do with lists other than accessing and selecting elements. In this section we’ll discuss element inclusion, sorting and reversing, and appending and popping.
3.4.1 Element inclusion
As with strings (Section 2.5), lists support testing for element inclusion using the in keyword:
>>> a = [42, 8, 17, 99]
>>> a
[42, 8, 17, 99]
>>> 42 in a
True
>>> "foo" in a
False
3.4.2 Sorting and reversing
Python has powerful facilities for sorting and reversing lists. They come in two general types: in-place methods, which change the list itself, and functions that return a new object and leave the original alone. Let’s take a look at some examples to see what this means.
We’ll start by sorting a list in place, an excellent trick that in ye olden days of C often required a custom implementation. In Python, we just call sort():
>>> a = [42, 8, 17, 99]
>>> a.sort()
>>> a # mutated list
[8, 17, 42, 99]
As you might expect for a list of integers, a.sort() sorts the list numerically (unlike, e.g., JavaScript, which confusingly sorts them “alphabetically”, so that 17 comes before 8). We also see that (unlike Ruby but like JavaScript) sorting a list changes, or mutates, the list itself. (We’ll see in a moment that it returns None.)
We can use reverse() to reverse the elements in a list:
>>> a.reverse()
>>> a
[99, 42, 17, 8]
As with sort(), note that reverse() mutates the list itself.
Such mutating methods can help demonstrate a common gotcha about Python lists involving list assignment. Suppose we have a list a1 and want a copy called a2 (Listing 3.5).
>>> a1 = [42, 8, 17, 99]
>>> a2 = a1 # Dangerous!
The assignment in the second line is dangerous because a2 points to the same location in the computer’s memory as a1, which means that if we mutate a1 it changes a2 as well:
>>> a1.sort()
>>> a1
[8, 17, 42, 99]
>>> a2
[8, 17, 42, 99]
We see here that a2 has changed even though we didn’t do anything to it directly. (You can avoid this using the list() function or the copy() method, as in a2 = list(a1) or a2 = a1.copy().)
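The following sketch puts the dangerous assignment and the two safe alternatives side by side. The names alias, copy_a, and copy_b are illustrative, not from the listing above.

```python
a1 = [42, 8, 17, 99]

alias = a1          # a second name for the same list
copy_a = list(a1)   # an independent copy made with list()
copy_b = a1.copy()  # an independent copy made with copy()

a1.sort()           # mutate the original

print(alias)   # [8, 17, 42, 99] (changed along with a1)
print(copy_a)  # [42, 8, 17, 99] (unaffected)
print(copy_b)  # [42, 8, 17, 99] (unaffected)
```

Both copies here are shallow: the new list is a separate object, which is all we need for a list of numbers.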
Python in-place methods are highly efficient, but usually more convenient are the related sorted() and reversed() functions. For example, we can obtain a sorted list as follows:
>>> a = [42, 8, 17, 99]
>>> sorted(a) # Pythonic
[8, 17, 42, 99]
>>> a
[42, 8, 17, 99]
Here, unlike the case of sort(), the original list is unchanged.
Similarly, we can (almost) obtain a reversed list using reversed():
>>> a
[42, 8, 17, 99]
>>> reversed(a)
<list_reverseiterator object at 0x109561910>
Unfortunately, the parallel structure with sorted() is slightly broken, at least as of this writing. Rather than returning a list, the reversed() function returns an iterator, which is a special type of Python object designed to be (you guessed it) iterated over. This isn’t usually a problem because we’ll usually be joining or looping over the reversed elements, in which case the iterator will serve just fine (Section 5.3), but when we really need a list we can call the list() function directly (Section 3.1):
>>> list(reversed(a))
[99, 17, 8, 42]
As noted, this minor wart rarely makes a difference since the iterator’s behavior is effectively identical to the list version when being iterated over.
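There is one practical difference worth seeing once: the object returned by reversed() can be consumed only a single time, whereas a list can be traversed as often as we like. A minimal sketch, with illustrative variable names:

```python
a = [42, 8, 17, 99]

backward = reversed(a)        # an iterator, not a list

first_pass = list(backward)   # consumes the iterator
second_pass = list(backward)  # nothing is left to consume

print(first_pass)   # [99, 17, 8, 42]
print(second_pass)  # []
```

If the reversed elements are needed more than once, converting to a list up front (or calling reversed() again) avoids the surprise.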
Comparison
Lists support the same basic equality and inequality comparisons as strings (Chapter 2):
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a == b
True
>>> a != b
False
Python also supports is, which tests whether two variables represent the same object. Because a and b, although they contain the same elements, are not the same object in Python’s memory system, == and is return different results in this case:
>>> a == b
True
>>> a is b
False
In contrast, the lists a1 and a2 from Listing 3.5 are equal using both comparisons:
>>> a1 == a2
True
>>> a1 is a2
True
The second True value follows because a1 and a2 truly are the exact same object. This behavior is effectively the same as that of === applied to objects in JavaScript or of the equal? method in Ruby.
According to the PEP 8 style guide, is should always be used when comparing with None. For example, we can use is to confirm that the list methods for reversing and sorting in place return None:
>>> a.reverse() == None # Not Pythonic
True
>>> a.sort() == None # Not Pythonic
True
>>> a.reverse() is None # Pythonic
True
>>> a.sort() is None # Pythonic
True
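The fact that the in-place methods return None leads to a slip that is easy to make: assigning the result of sort() back to a variable. The sketch below (variable names are illustrative) contrasts this with sorted(), whose return value is the new list.

```python
a = [42, 8, 17, 99]
result = a.sort()      # sorts a in place and returns None

print(result is None)  # True
print(a)               # [8, 17, 42, 99]

b = [42, 8, 17, 99]
b = sorted(b)          # sorted() returns a list, so assignment is fine
print(b)               # [8, 17, 42, 99]
```

Writing b = b.sort() instead would leave b equal to None, which typically shows up later as a confusing error.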
3.4.3 Appending and popping
One useful pair of list methods is append() and pop(): append() lets us append an element to the end of a list, while pop() removes the final element and returns its value:
>>> a = sorted([42, 8, 17, 99])
>>> a
[8, 17, 42, 99]
>>> a.append(6) # Appending to a list
>>> a
[8, 17, 42, 99, 6]
>>> a.append("foo")
>>> a
[8, 17, 42, 99, 6, 'foo']
>>> a.pop() # Popping an element off
'foo'
>>> a
[8, 17, 42, 99, 6]
>>> a.pop()
6
>>> a.pop()
99
>>> a
[8, 17, 42]
Note that pop() returns the value of the final element (while removing it as a side effect), while append() returns None (as indicated by nothing being printed after an append).
We are now in a position to appreciate the comment made in Section 3.2 about obtaining the last element of the list, as long as we don’t mind mutating it:
>>> the_answer_to_life_the_universe_and_everything = a.pop()
>>> the_answer_to_life_the_universe_and_everything
42
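Taken together, append() and pop() let a list act as a simple last-in, first-out stack. The following is a sketch under that interpretation; the names stack and popped are illustrative, and the while loop relies on the empty list being False in a boolean context (Section 3.2).

```python
stack = []
for word in ["ant", "bat", "cat"]:
    stack.append(word)           # push onto the end

popped = []
while stack:                     # stops once the list is empty
    popped.append(stack.pop())   # pop from the end

print(popped)  # ['cat', 'bat', 'ant']
print(stack)   # []
```

Calling pop() on an empty list raises an IndexError, which is why the loop checks the list before each pop.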
3.4.4 Undoing a split
A final example, one that brings us full circle from Section 3.1, is join(). Strictly speaking, join() is a string method: we call it on the separator and pass the list as an argument. Just as split() splits a string into list elements, join() joins list elements into a string (Listing 3.6).
>>> a = ["ant", "bat", "cat", "42"]
>>> a
['ant', 'bat', 'cat', '42']
>>> "".join(a) # Join on the empty string.
'antbatcat42'
>>> ", ".join(a) # Join on comma-space.
'ant, bat, cat, 42'
>>> " -- ".join(a) # Join on double dashes.
'ant -- bat -- cat -- 42'
Note that in all cases shown in Listing 3.6 the lists we’re joining consist wholly of strings. What if we wanted a list containing, say, the number 42 rather than the string "42"? It doesn’t work by default:
>>> a = ["ant", "bat", "cat", 42]
>>> ", ".join(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: sequence item 3: expected str instance, int found
I mention this mainly because many languages, including JavaScript and Ruby, automatically convert objects to strings when joining, so this could be considered a minor gotcha in Python for people familiar with such languages.
One solution in Python is to use the str() function, which we’ll see again in Section 4.1.2:
>>> str(42)
'42'
Then to complete the join() we can use a generator expression that returns str(e) for each element in the list:
>>> ", ".join(str(e) for e in a)
'ant, bat, cat, 42'
This somewhat advanced construction is related to comprehensions, which we will cover more in Chapter 6.
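To see in what sense join() undoes split(), here is a short round-trip sketch. The variable names are illustrative; the second half shows that converting with str() is a one-way trip, since the number 42 comes back as a string.

```python
original = "ant, bat, cat"
pieces = original.split(", ")
rebuilt = ", ".join(pieces)
print(rebuilt == original)  # True

mixed = ["ant", "bat", "cat", 42]
joined = ", ".join(str(e) for e in mixed)
back = joined.split(", ")
print(back)           # ['ant', 'bat', 'cat', '42']
print(back == mixed)  # False: '42' is a string, 42 is a number
```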
3.4.5 Exercises
- To sort a list in reverse order, it’s possible to sort and then reverse, but the combined operation is so useful that both sort() and sorted() support a keyword argument (Section 5.1.2) that does it automatically. Confirm that a.sort(reverse=True) and sorted(a, reverse=True) both have the effect of sorting and reversing at the same time.
- Using the list documentation, figure out how to insert an element at the beginning of a list.
- Combine the two lists shown in Listing 3.7 into a single list using the extend() method. Does extend() mutate a1? Does it mutate a2?
>>> a1 = ["a", "b", "c"]
>>> a2 = [1, 2, 3]
>>> FILL_IN
>>> a1
['a', 'b', 'c', 1, 2, 3]
3.5 List iteration
One of the most common tasks with lists is iterating through their elements and performing an operation with each one. This might sound familiar, since we solved the exact same problem with strings in Section 2.6, and indeed the solution is virtually the same. All we need to do is adapt the for loop from Listing 2.27 to lists, i.e., replace soliloquy with a, as shown in Listing 3.8.
>>> a = ["ant", "bat", "cat", 42]
>>> for i in range(len(a)): # Not Pythonic
... print(a[i])
...
ant
bat
cat
42
That’s convenient, but it’s not the best way to iterate through lists, and Mike Vanier still wouldn’t be happy (Figure 3.2).

Luckily, looping the Right Way™ is easier than it is in most other languages, so we can actually cover it here (unlike in, e.g., Learn Enough JavaScript to Be Dangerous, when we had to wait until Chapter 5). The trick is knowing that, as with strings, the default behavior of for...in is to return each element in sequence, as shown in Listing 3.9.
Listing 3.9: Using for to iterate over a list the Right Way™.
>>> for e in a: # Pythonic
... print(e)
...
ant
bat
cat
42
Using this style of for loop, we can iterate directly through the elements in a list, thereby avoiding having to type out Mike Vanier’s bête noire, “for (i = 0; i < N; i++)”. The result is cleaner code and a happier programmer (Figure 3.3).

Figure 3.3: Avoiding range(len()) has made Mike Vanier a little happier.
By the way, we can use enumerate() if for some reason we need the index itself, as shown in Listing 3.10. (If you solved the exercise corresponding to Listing 2.29, the code in Listing 3.10 might look familiar.)
>>> for i, e in enumerate(a): # Pythonic
... print(f"a[{i}] = {e}")
...
a[0] = ant
a[1] = bat
a[2] = cat
a[3] = 42
Note the final results in Listing 3.10 aren’t quite right because we really should show, say, the first element as "ant" instead of as ant. Fixing this minor blemish is left as an exercise.
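It can help to look at what enumerate() actually hands to the loop. The sketch below converts its result to a list, revealing a sequence of (index, element) pairs, which are tuples (Section 3.6). The names pairs and numbered are illustrative; the start keyword shown at the end is an optional argument of enumerate().

```python
a = ["ant", "bat", "cat", 42]

pairs = list(enumerate(a))
print(pairs)  # [(0, 'ant'), (1, 'bat'), (2, 'cat'), (3, 42)]

# Counting can begin somewhere other than 0 via the start argument.
numbered = list(enumerate(a, start=1))
print(numbered[0])  # (1, 'ant')
```

The "for i, e in enumerate(a)" form simply splits each pair into its two parts on every pass through the loop.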
Finally, it’s possible to break out of a loop early using the break keyword (Listing 3.11).
Listing 3.11: Using break to interrupt a for loop.
>>> for i, e in enumerate(a):
... if e == "cat":
... print(f"Found the cat at index {i}!")
... break
... else:
... print(f"a[{i}] = {e}")
...
a[0] = ant
a[1] = bat
Found the cat at index 2!
>>>
In this case the execution of the loop stops at index 2 and doesn’t proceed to any subsequent indices. We’ll see a similar construction using the return keyword in Section 5.1.
3.5.1 Exercises
- Use reversed() to print out a list’s elements in reverse order.
- We saw in Listing 3.10 that interpolating the values of the list into the string led to printing out, say, ant instead of "ant". We could put the quote marks in by hand, but then that would print 42 out as "42", which is also wrong. Solve this conundrum using the repr() function (Section 2.3) to interpolate a representation of each list element, as shown in Listing 3.12.
>>> for i, e in enumerate(a):
... print(f"a[{i}] = {repr(e)}")
...
???
3.6 Tuples and sets
In addition to lists, Python also supports tuples, which are basically lists that can’t be changed (i.e., tuples are immutable). By the way, I generally say “toople”, though you will also hear “tyoople” and “tupple”.
We can create literal tuples in much the same way that we created literal lists. The only difference is that tuples use parentheses instead of square brackets:
>>> t = ("fox", "dog", "eel")
>>> t
('fox', 'dog', 'eel')
>>> for e in t:
... print(e)
...
fox
dog
eel
We see here that iterating over a tuple uses the same for...in syntax used for lists (Listing 3.9).
Because tuples are immutable, trying to change them raises an error:
>>> t.append("goat")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'
>>> t.sort()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'sort'
Otherwise, tuples support many of the same operations as lists, such as slicing or non-mutating sorting:
>>> t[1:]
('dog', 'eel')
>>> sorted(t)
['dog', 'eel', 'fox']
Note in the second case that sorted() can take a tuple as an argument but that it returns a list.
By the way, we can also leave off parentheses when defining tuples:
>>> u = "fox", "dog", "eel"
>>> u
('fox', 'dog', 'eel')
>>> t == u
True
I think this notation is potentially confusing and generally prefer to use parentheses when defining tuples, but you should know about it in case you see it in other people’s code. The main exceptions are when simply displaying several variables in the REPL or when doing assignment via so-called tuple unpacking, which lets you make multiple assignments at once:
>>> a, b, c = t # Very Pythonic; works for lists, too
>>> a
'fox'
>>> a, b, c # Tuple to show the variable values
('fox', 'dog', 'eel')
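Tuple unpacking also gives a compact way to swap two variables without a temporary one. The sketch below uses illustrative names (first, second, third) and also shows that the number of names on the left must match the number of elements, or Python raises a ValueError.

```python
t = ("fox", "dog", "eel")

first, second, third = t         # one name per element
first, second = second, first    # swap via an implicit tuple
print(first, second, third)      # dog fox eel

# Too few names for the three elements raises ValueError.
try:
    x, y = t
except ValueError as error:
    print(f"ValueError: {error}")
```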
Finally, it’s worth noting that defining a tuple of one element requires a trailing comma because an object in parentheses alone is just the object itself:
>>> ("foo")
'foo'
>>> ("foo",)
('foo',)
Python also has native support for sets, which correspond closely to the mathematical definition and can be thought of as lists of elements where repeat values are ignored and the order doesn’t matter. Sets can be initialized literally using curly braces or by passing a list or a tuple (or in fact any iterable) to the set() function:
>>> s1 = {1, 2, 3, 4}
>>> s2 = {3, 1, 4, 2}
>>> s3 = set([1, 2, 2, 3, 4, 4])
>>> s1, s2, s3
({1, 2, 3, 4}, {1, 2, 3, 4}, {1, 2, 3, 4})
Set equality can be tested with == as usual:
>>> s1 == s2
True
>>> s2 == s3
True
>>> s1 == s3
True
>>> {1, 2, 3} == {3, 1, 2}
True
Sets can also mix types (and can be initialized with a tuple instead of a list):
>>> set(("ant", "bat", "cat", 1, 1, "cat"))
{'bat', 1, 'ant', 'cat'}
Note that in all cases duplicate values are ignored.
Python sets support many common set operations, such as union and intersection:
>>> s1 = {1, 2, "ant", "bat"}
>>> s2 = {2, 3, "bat", "cat"}
>>> s1 | s2 # Set union
{'bat', 1, 2, 'ant', 3, 'cat'}
>>> s1 & s2 # Set intersection
{'bat', 2}
See “Sets in Python” for more information.
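Two further uses of sets come up often enough to sketch here: removing duplicates from a list, and the difference and symmetric-difference operators, which complement union and intersection. The variable names are illustrative. Because set ordering is arbitrary, the example compares against expected sets rather than printing the sets themselves.

```python
# Remove duplicates, then sort to get a predictable order.
words = "ant bat cat bat ant".split()
unique = sorted(set(words))
print(unique)  # ['ant', 'bat', 'cat']

s1 = {1, 2, "ant", "bat"}
s2 = {2, 3, "bat", "cat"}

only_in_s1 = s1 - s2      # set difference
in_exactly_one = s1 ^ s2  # symmetric difference

print(only_in_s1 == {1, "ant"})                # True
print(in_exactly_one == {1, 3, "ant", "cat"})  # True
```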
Because they are unordered, set elements can’t be selected directly (how would Python know which set element to pick?) but can be tested for inclusion or iterated over:
>>> s = {1, 2, 3, 4}
>>> s[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable
>>> 3 in s
True
>>> for e in s:
... print(f"{e} is an element of the set")
...
1 is an element of the set
2 is an element of the set
3 is an element of the set
4 is an element of the set
Finally, it’s worth noting that, like the empty list, the empty tuple and the empty set are both False in a boolean context:
>>> bool(())
False
>>> bool(set())
False
Note here that, perhaps counterintuitively, we can’t use {} for the empty set because that combination is reserved for the empty dictionary, which we’ll discuss in Section 4.4. We also don’t have to include a trailing comma in (), which is the empty tuple as required.
We can confirm these statements using the type() function:
>>> type(())
<class 'tuple'>
>>> type({})
<class 'dict'>
>>> type(set())
<class 'set'>
Here we see that (), {}, and set() are of class tuple, dictionary, and set, respectively. (We’ll discuss more about what a class is in Chapter 7.)
3.6.1 Exercises
- Confirm the existence of a tuple() function by converting sorted(t) from a list to a tuple.
- Create a set with numbers in the range 0–4 by combining set() with range(). (Recall the use of range() in Listing 2.24.) Confirm that the pop() method mentioned in Section 3.4.3 allows you to remove one element at a time.