In this article by Steven F. Lott, the author of the book Python Essentials, we’ll look at the break and continue statements; these modify a for or while loop to allow skipping items or exiting before the loop has processed all items. This is a fundamental change in the semantics of a collection-processing statement.

Processing collections with the for statement

The for statement is an extremely versatile way to process every item in a collection. We do this by defining a target variable, a source of items, and a suite of statements. The for statement will iterate through the source of items, assigning each item to the target variable, and also execute the suite of statements. All of the collections in Python provide the necessary methods, which means that we can use anything as the source of items in a for statement.
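
As a small sketch of that last claim, the same for statement can walk a list, a tuple, a string, a dictionary, and a set. The visit() helper and the sample values here are made up for illustration; they aren't part of the original example.

```python
def visit(source):
    """Collect every item a for statement takes from source."""
    seen = []
    for item in source:
        seen.append(item)
    return seen

from_list = visit([1, 2, 3])
from_tuple = visit(('a', 'b'))
from_string = visit("raven")             # a string yields its characters
from_dict = visit({'poe': 3, 'e': 1})    # a dictionary yields its keys
from_set = sorted(visit({5, 6, 9}))      # set order is arbitrary, so sort for display

print(from_list, from_tuple, from_string, from_dict, from_set)
```

The target variable, item, simply takes whatever the source hands over: elements, characters, or keys.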

Here’s some sample data that we’ll work with. This is part of Mike Keith’s poem, Near a Raven. We’ll remove the punctuation to make the text easier to work with:

>>> text = '''Poe, E.
...     Near a Raven
...
... Midnights so dreary, tired and weary.'''
>>> text = text.replace(",","").replace(".","").lower()

This assigns the poem’s text to the text variable, then replaces it with a cleaned copy: the commas and periods are removed, and all of the letters are converted to lowercase.

When we use text.split(), we get a sequence of individual words. The for loop can iterate through this sequence of words so that we can process each one. The syntax looks like this:

>>> cadaeic = {}
>>> for word in text.split():
...     cadaeic[word] = len(word)

We’ve created an empty dictionary, and assigned it to the cadaeic variable. The expression in the for loop, text.split(), will create a sequence of substrings. Each of these substrings will be assigned to the word variable. The for loop body—a single assignment statement—will be executed once for each value assigned to word.

The resulting dictionary might look like this (irrespective of ordering):

{'raven': 5, 'midnights': 9, 'dreary': 6, 'e': 1,
'weary': 5, 'near': 4, 'a': 1, 'poe': 3, 'and': 3,
'so': 2, 'tired': 5}

There’s no guaranteed order for sets. Dictionaries preserve insertion order as of Python 3.7, but earlier versions make no such promise, so your results may be ordered differently.

In addition to iterating over a sequence, we can also iterate over the keys in a dictionary.

>>> for word in sorted(cadaeic):
...   print(word, cadaeic[word])

When we use sorted() on a tuple or a list, an interim list is created with sorted items. When we apply sorted() to a mapping, the sorting applies to the keys of the mapping, creating a sequence of sorted keys. This loop will print a list in alphabetical order of the various pilish words used in this poem.
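
Here’s a minimal sketch of sorted() applied to a mapping. The lengths dictionary is a made-up subset of the poem’s words, and the key=lengths.get variation is an extra that the text above doesn’t cover.

```python
lengths = {'raven': 5, 'near': 4, 'a': 1, 'poe': 3}

keys_in_order = sorted(lengths)                 # sorts the keys, returns a new list
by_length = sorted(lengths, key=lengths.get)    # keys ordered by their values

for word in keys_in_order:
    print(word, lengths[word])
```

In both cases the original dictionary is left unchanged; sorted() always builds a new list.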

Pilish is a subset of English where the word lengths are important: they’re used as mnemonic aids.

A for statement corresponds to the “for all” logical quantifier, ∀. At the end of a simple for loop we can assert that all items in the source collection have been processed. In order to build the “there exists” quantifier, ∃, we can either use the while statement, or the break statement inside the body of a for statement.
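
To make the two quantifiers concrete, here’s a sketch that uses the built-in all() and any() functions alongside an explicit loop. The word_lengths data is a made-up subset of the poem’s words, used only for illustration.

```python
word_lengths = {'poe': 3, 'e': 1, 'near': 4, 'a': 1, 'raven': 5}

# "For all": every word is shorter than ten letters.
all_short = all(length < 10 for length in word_lengths.values())

# "There exists": at least one word has exactly five letters.
some_five = any(length == 5 for length in word_lengths.values())

# An explicit loop form of "there exists", stopping at the first match.
first_five = None
for word, length in word_lengths.items():
    if length == 5:
        first_five = word
        break

print(all_short, some_five, first_five)
```

The explicit loop does a little more than any(): it also tells us which item satisfied the condition.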

Using literal lists in a for statement

We can apply the for statement to a sequence of literal values. One of the most common ways to present literals is as a tuple. It might look like this:

for scheme in 'http', 'https', 'ftp':
   do_something(scheme)

This will assign three different values to the scheme variable. For each of those values, it will evaluate the do_something() function.

From this, we can see that, strictly speaking, the () are not required to delimit a tuple object. If the sequence of values grows, however, and we need to span more than one physical line, we’ll need to add (), which also makes the tuple literal more explicit.
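
A sketch of the multi-line form might look like this. The extra 'sftp' value and the schemes_seen list are illustrative additions, standing in for whatever do_something() would really do.

```python
schemes_seen = []

# Parentheses let the tuple literal span several physical lines.
for scheme in (
    'http',
    'https',
    'ftp',
    'sftp',
):
    schemes_seen.append(scheme.upper())

print(schemes_seen)
```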

Using the range() and enumerate() functions

The range() object will provide a sequence of numbers, often used in a for loop. The range() object is iterable and produces its items only when they’re required, so it never stores the whole run of numbers. Strictly speaking, it’s a lazy, immutable sequence rather than a generator: it supports len() and indexing, but it isn’t a list. If we need the values outside a for statement, we use a function like list(range(x)) or tuple(range(a,b)) to consume all of the generated values and create a new sequence object.

The range() object has three commonly-used forms:

  • range(n) produces ascending numbers including 0 but not including n itself. This is a half-open interval. We could say that range(n) produces numbers, x, such that 0 ≤ x < n. The expression list(range(5)) returns [0, 1, 2, 3, 4]. This produces n values including 0 and n – 1.
  • range(a,b) produces ascending numbers starting from a but not including b. The expression tuple(range(-1,3)) will return (-1, 0, 1, 2). This produces b – a values including a and b – 1.
  • range(x,y,z) produces ascending numbers in the sequence x, x + z, x + 2z, ..., stopping before y. For a positive step z and x < y, this produces (y – x + z – 1)//z values, which is the same as (y – x)//z when z divides y – x evenly.
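
The three forms can be checked with a short sketch like the following. The specific bounds are arbitrary examples.

```python
first_five = list(range(5))          # one argument: 0 up to, not including, 5
window = tuple(range(-1, 3))         # two arguments: -1 up to, not including, 3
stepped = list(range(0, 10, 3))      # three arguments: 0, 3, 6, 9

# len() works directly on a range() object; no list has to be built first.
sizes = (len(range(5)), len(range(-1, 3)), len(range(0, 10, 3)))

print(first_five, window, stepped, sizes)
```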

We can use the range() object like this:

for n in range(1, 21):
   status = str(n)
   if n % 5 == 0: status += " fizz"
   if n % 7 == 0: status += " buzz"
   print(status)

In this example, we’ve used a range() object to produce values, n, such that 1 ≤ n < 21.

We use the range() object to generate the index values for all items in a list:

for n in range(len(some_list)):
   print(n, some_list[n])

We’ve used the range() function to generate values from 0 up to, but not including, the length of the sequence object named some_list.

The for statement allows multiple target variables. The rules for multiple target variables are the same as for a multiple variable assignment statement: a sequence object will be decomposed and items assigned to each variable. Because of that, we can leverage the enumerate() function to iterate through a sequence and assign the index values at the same time. It looks like this:

for n, v in enumerate(some_list):
     print(n, v)

The enumerate() function returns an iterator which steps through the items in the source sequence and yields a two-tuple of the index and the item for each one. Since we’ve provided two variables, the two-tuple is decomposed and assigned to each variable.

There are numerous use cases for this multiple-assignment for loop. We often have list-of-tuples data structures that can be handled very neatly with this multiple-assignment feature.
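
As a sketch of the list-of-tuples case, the pairs list below is a hypothetical structure of (word, length) values taken from the poem’s opening words. The second loop combines enumerate() with nested unpacking, which goes slightly beyond what’s described above.

```python
# Hypothetical (word, length) pairs in a list-of-tuples structure.
pairs = [('poe', 3), ('e', 1), ('near', 4), ('a', 1), ('raven', 5)]

digits = []
for word, length in pairs:           # each two-tuple is unpacked into two targets
    digits.append(str(length))

# enumerate() plus nested unpacking gives position, word, and length together.
report = []
for position, (word, length) in enumerate(pairs, start=1):
    report.append(f"{position}: {word} ({length})")

pi_digits = "".join(digits)
print(pi_digits)
print(report)
```

The word lengths spell out the leading digits of pi, which is the point of pilish text.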

Iterating with the while statement

The while statement is a more general iteration than the for statement. We’ll use a while loop in two situations. We’ll use this in cases where we don’t have a finite collection to impose an upper bound on the loop’s iteration; we may suggest an upper bound in the while clause itself. We’ll also use this when writing a “search” or “there exists” kind of loop; we aren’t processing all items in a collection.

A desktop application that accepts input from a user, for example, will often have a while loop. The application runs until the user decides to quit; there’s no upper bound on the number of user interactions. For this, we could use a while True: loop with a break statement, but an infinite iteration like that hides the termination rule; a loop with an explicit condition is generally easier to reason about.

If we want to write a character-mode user interface, we could do it like this:

quit_received = False
while not quit_received:
   command = input("prompt> ")
   quit_received = process(command)

This will iterate until the quit_received variable is set to True. This will process indefinitely; there’s no upper boundary on the number of iterations.

This process() function might use some kind of command processing. This should include a statement like this:

if command.lower().startswith("quit"): return True

When the user enters “quit”, the process() function will return True. This will be assigned to the quit_received variable. The while expression, not quit_received, will become False, and the loop ends.
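
Here’s one way the pieces might fit together, as a sketch rather than a complete application. The run() wrapper, the scripted commands, and the print() placeholder are all assumptions; a real program would pass a function that calls input("prompt> ") instead of the scripted iterator.

```python
def process(command):
    """Return True when the user asks to quit, otherwise False."""
    if command.lower().startswith("quit"):
        return True
    print("processing", command)     # placeholder for real command handling
    return False

def run(read_command):
    """Loop until process() reports a quit; return how many commands were read."""
    count = 0
    quit_received = False
    while not quit_received:
        command = read_command()
        count += 1
        quit_received = process(command)
    return count

# Scripted commands stand in for input("prompt> ") so the sketch runs unattended.
scripted = iter(["help", "list", "QUIT now", "never reached"])
commands_read = run(lambda: next(scripted))
print(commands_read)
```

The loop stops after the third command, so the fourth one is never read.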

A “there exists” loop will iterate through a collection, stopping at the first item that meets certain criteria. This can look complex because we’re forced to make two details of loop processing explicit.

Here’s an example of searching for the first value that meets a condition. This example assumes that we have a function, condition(), which will eventually be True for some number. Here’s how we can use a while statement to locate the minimum value of n, from 1 through 100, for which this function is True:

>>> n = 1
>>> while n != 101 and not condition(n):
...     n += 1
>>> assert n == 101 or condition(n)

The while statement will terminate when n == 101 or condition(n) is True. While both of these are False, we advance the n variable to the next value in the sequence of values. Since we’re iterating through the values in order from the smallest to the largest, we know that, if the loop stops before n reaches 101, n will be the smallest value for which the condition() function is true.

At the end of the while statement we have included a formal assertion that either n is 101 or the condition() function is True for the given value of n. Writing an assertion like this can help in design as well as debugging because it will often summarize the loop invariant condition.
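
To try this search out, we need a concrete condition() function. The one below is purely hypothetical, chosen so that the answer is easy to verify by hand; the found variable is also an addition that turns the “ran out of candidates” case into None.

```python
def condition(n):
    """Hypothetical test: True when n is divisible by both 6 and 8."""
    return n % 6 == 0 and n % 8 == 0

n = 1
while n != 101 and not condition(n):
    n += 1

# Post-condition: the loop ran out of candidates or stopped on a match.
assert n == 101 or condition(n)
found = n if n != 101 else None
print(found)
```

The smallest number divisible by both 6 and 8 is 24, so the loop stops there.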

We can also write this kind of loop using the break statement in a for loop, something we’ll look at in the next section.

The continue and break statements

The continue statement is helpful for skipping items without writing deeply-nested if statements. The effect of executing a continue statement is to skip the rest of the loop’s suite. In a for loop, this means that the next item will be taken from the source iterable. In a while loop, this must be used carefully to avoid an otherwise infinite iteration.

We might see file processing that looks like this:

for line in some_file:
   clean = line.strip()
   if len(clean) == 0:
       continue
   data, _, _ = clean.partition("#")
   data = data.rstrip()
   if len(data) == 0:
       continue
   process(data)

In this loop, we’re relying on the way files act like sequences of individual lines. For each line in the file, we’ve stripped whitespace from the input line, and assigned the resulting string to the clean variable. If the length of this string is zero, the line was entirely whitespace, and we’ll continue the loop with the next line. The continue statement skips the remaining statements in the body of the loop.

We’ll partition the line into three pieces: a portion in front of any “#”, the “#” (if present), and the portion after any “#”. We’ve assigned the “#” character and any text after the “#” character to the same easily-ignored variable, _, because we don’t have any use for these two results of the partition() method. We can then strip any trailing whitespace from the string assigned to the data variable. If the resulting string has a length of zero, then the line is entirely filled with “#” and any trailing comment text. Since there’s no useful data, we can continue the loop, ignoring this line of input.

If the line passes the two if conditions, we can process the resulting data. By using the continue statement, we have avoided complex-looking, deeply-nested if statements.

It’s important to note that a continue statement is only useful as part of the suite inside an if statement, inside a for or while loop; an unconditional continue would make the rest of the loop’s suite unreachable. The condition on that if statement becomes a filter condition that applies to the collection of data being processed. continue always applies to the innermost loop.
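
The file-processing loop can be exercised without a real file, as in this sketch. The process_lines() wrapper, the io.StringIO stand-in for some_file, and the sample text are assumptions made so that the example runs on its own.

```python
import io

def process_lines(source):
    """Return the non-blank, non-comment data from an iterable of lines."""
    results = []
    for line in source:
        clean = line.strip()
        if len(clean) == 0:
            continue                      # blank line: skip to the next one
        data, _, _ = clean.partition("#")
        data = data.rstrip()
        if len(data) == 0:
            continue                      # comment-only line
        results.append(data)
    return results

sample = io.StringIO("# settings\n\nname = raven   # inline comment\n   \nsize = 5\n")
cleaned = process_lines(sample)
print(cleaned)
```

Only the two lines that carry data survive the two filters.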

Breaking early from a loop

The break statement is a profound change in the semantics of the loop. An ordinary for statement can be summarized by “for all.” We can comfortably say that “for all items in a collection, the suite of statements was processed.”

When we use a break statement, a loop is no longer summarized by “for all.” We need to change our perspective to “there exists”. A break statement asserts that at least one item in the collection matches the condition that leads to the execution of the break statement.

Here’s a simple example of a break statement:

for n in range(1, 100):
   factors = []
   for x in range(1,n):
       if n % x == 0: factors.append(x)
   if sum(factors) == n:
       break

We’ve written a loop that is bound by 1 ≤ n < 100. This loop includes a break statement, so it will not process all values of n. Instead, it will determine the smallest value of n for which n is equal to the sum of its factors. Since the loop doesn’t examine all values, it shows that at least one such number exists within the given range.

We’ve used a nested loop to determine the factors of the number n. This nested loop creates a sequence, factors, for all values of x in the range 1 ≤ x < n, such that x is a factor of the number n. This inner loop doesn’t have a break statement, so we are sure it examines all values in the given range.

The least value for which this is true is the number six.

It’s important to note that a break statement should always be part of the suite inside an if statement inside a for or while loop. If the break isn’t in an if suite, the loop will always terminate while processing the first item. The condition on that if statement becomes the “there exists” condition that summarizes the loop as a whole. Clearly, multiple if statements with multiple break statements mean that the overall loop can have a potentially confusing and difficult-to-summarize post-condition.
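
One way to keep the post-condition easy to state is to wrap the search in a function with a single break. This is a sketch; the first_perfect() name and its limit parameter are made up for illustration.

```python
def first_perfect(limit):
    """Return the smallest n in range(1, limit) equal to the sum of its
    factors below n, or None when no such n exists."""
    found = None
    for n in range(1, limit):
        factors = [x for x in range(1, n) if n % x == 0]
        if sum(factors) == n:
            found = n
            break          # "there exists": stop at the first match
    return found

smallest = first_perfect(100)
none_below_six = first_perfect(6)
print(smallest, none_below_six)
```

Here the post-condition is simply that found is either None or the smallest match below limit.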

Using the else clause on a loop

Python’s else clause can be used on a for or while statement as well as on an if statement. The else clause executes after the loop body if there was no break statement executed. To see this, here’s a contrived example:

>>> for item in 1,2,3:
...     print(item)
...     if item == 2:
...         print("Found",item)
...         break
... else:
...     print("Found Nothing")

The for statement here will iterate over a short tuple of literal values. When a specific target value has been found, a message is printed. Then, the break statement will end the loop, avoiding the else clause.

When we run this, we’ll see three lines of output, like this:

1
2
Found 2

The value of three isn’t shown, nor is the “Found Nothing” message in the else clause.

If we change the target value in the if statement from two to a value that won’t be seen (for example, zero or four), then the output will change. If the break statement is not executed, then the else clause will be executed.

The idea here is to allow us to write contrasting break and non-break suites of statements. An if statement suite that includes a break statement can do some processing in the suite before the break statement ends the loop. An else clause allows some processing at the end of the loop when none of the break-related suites were executed.
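
A slightly less contrived sketch of the same pattern is a search that needs a “not found” result. The smallest_factor() function is a hypothetical example, not something from the original text.

```python
def smallest_factor(n):
    """Return the smallest factor of n in range(2, n), or None if there is none."""
    for candidate in range(2, n):
        if n % candidate == 0:
            result = candidate     # processing done just before leaving the loop
            break
    else:
        # Runs only when the loop finished without executing break.
        result = None
    return result

print(smallest_factor(15), smallest_factor(13))
```

The break suite handles the “found” case and the else suite handles the “not found” case, so result is always assigned exactly once.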

Summary

In this article, we’ve looked at the for statement, which is the primary way we’ll process the individual items in a collection. A simple for statement assures us that our processing has been done for all items in the collection. We’ve also looked at the general purpose while loop.
