3.1 Introduction to Generators
A generator function is a function that returns an iterator, called a generator object. This iterator produces values on demand, so potentially large datasets can be processed without loading them fully into memory.
There are several ways to create generators, and we'll look at the most popular ones below.
Function-based Generators
Function-based generators are created by using the yield keyword inside a function. Calling such a function returns a generator object without running any of the function body. The body runs only when next() is called on the generator (which invokes its __next__() method): execution proceeds to the next yield expression, hands back the yielded value, and pauses there until the following call.
def count_up_to(max):
    count = 1
    while count <= max:
        yield count
        count += 1

counter = count_up_to(5)
print(next(counter))  # Output: 1
print(next(counter))  # Output: 2
print(next(counter))  # Output: 3
print(next(counter))  # Output: 4
print(next(counter))  # Output: 5
If a function body contains yield, calling the function creates a generator object that manages the function's execution state instead of running the body straight through.
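To make the deferred execution visible, the sketch below adds print calls to a small generator function (the name announce is made up for illustration). It suggests that calling the function runs none of its body, and that next() raises StopIteration once the function returns.

```python
def announce():
    # Hypothetical example function, not part of the lesson's own code.
    print("body started")
    yield 1
    print("resumed after first yield")
    yield 2
    print("body finished")


gen = announce()       # nothing is printed yet: the body has not started
first = next(gen)      # prints "body started", then pauses at the first yield
second = next(gen)     # prints "resumed after first yield", pauses at the second

try:
    next(gen)          # prints "body finished", then the function returns
except StopIteration:
    exhausted = True   # an exhausted generator signals the end this way

print(first, second, exhausted)
```

A for loop handles StopIteration automatically, which is why generators are usually consumed with for rather than with explicit next() calls.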
Generator Expressions
Generator expressions are similar to list comprehensions but use parentheses instead of square brackets. They also return a generator object.
squares = (x ** 2 for x in range(10))
print(next(squares)) # Output: 0
print(next(squares)) # Output: 1
print(next(squares)) # Output: 4
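As a small illustration of how generator expressions are typically used, the sketch below passes one directly to sum() and then shows that a generator can be consumed only once. The variable names are chosen for this example only.

```python
# A generator expression can be passed straight to a function that consumes
# an iterable; the extra parentheses may be omitted when it is the sole argument.
total = sum(x ** 2 for x in range(10))
print(total)  # 285

# Unlike a list, a generator can be consumed only once.
squares = (x ** 2 for x in range(3))
first_pass = list(squares)   # [0, 1, 4]
second_pass = list(squares)  # [] because the generator is already exhausted
print(first_pass, second_pass)
```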
Which method do you prefer?
3.2 Advantages of Generators
Efficient Memory Usage
Generators compute values on the fly, so only the current item needs to be held in memory at any time. This makes them a good fit for large datasets and data streams.
def large_range(n):
    for i in range(n):
        yield i

for value in large_range(1000000):
    # Process values one by one
    print(value)
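The same idea applies to data streams. The sketch below chains two generators to filter lines from a file-like object one at a time. The function names and the sample log text are assumptions made for illustration, and io.StringIO stands in for a real file.

```python
import io


def read_records(stream):
    # Yield one stripped line at a time; the whole input is never held in memory.
    for line in stream:
        yield line.rstrip("\n")


def only_errors(records):
    # A second generator stage that filters the records lazily.
    for record in records:
        if record.startswith("ERROR"):
            yield record


# io.StringIO stands in for a large log file here (the sample data is made up).
fake_log = io.StringIO("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
errors = list(only_errors(read_records(fake_log)))
print(errors)  # ['ERROR disk full', 'ERROR timeout']
```

Each line passes through both stages before the next line is read, so memory use stays roughly constant regardless of the input size.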
Lazy Evaluation
Generators perform lazy evaluation, meaning they compute each value only when it is requested. This avoids unnecessary computation and can improve performance when only part of a sequence is actually used.
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
for _ in range(10):
    print(next(fib))
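One common way to take a fixed number of items from a lazy, possibly infinite generator is itertools.islice from the standard library. The sketch below repeats the fibonacci generator so that it runs on its own.

```python
from itertools import islice


def fibonacci():
    # Same infinite generator as in the lesson, repeated so the block runs on its own.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# islice stops after 10 items, so only 10 Fibonacci numbers are ever computed.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```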
Convenient Syntax
Generators provide convenient syntax for creating iterators, making code easier to write and read.
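To illustrate the difference in effort, here is a rough side-by-side sketch of a hand-written iterator class and an equivalent generator function. Both names (CountDown and count_down) are invented for this comparison.

```python
class CountDown:
    # Hand-written iterator: state lives in attributes, and the protocol
    # methods __iter__ and __next__ must be implemented explicitly.
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def count_down(start):
    # Generator version: the local variable and the paused frame hold the state.
    while start > 0:
        yield start
        start -= 1


print(list(CountDown(3)))   # [3, 2, 1]
print(list(count_down(3)))  # [3, 2, 1]
```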
3.3 Using Generators
Examples of Generators in the Standard Library
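Many functions in Python's standard library produce values lazily. For example, range() returns a range object that computes each number on demand instead of storing the whole sequence. Strictly speaking, a range is a lazy sequence rather than a generator (it supports len(), indexing, and repeated iteration), while functions such as os.walk() are implemented as true generators.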
for i in range(10):
    print(i)
Yeah, the world will never be the same again.
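One way to check whether a call really returns a generator is inspect.isgenerator from the standard library. The sketch below compares a range object, the result of os.walk() (a generator in CPython), and a generator expression. The variable names are chosen for this example only.

```python
import inspect
import os

numbers = range(10)
walker = os.walk(".")                 # os.walk is written as a generator function
squares = (x * x for x in numbers)    # generator expression

print(inspect.isgenerator(numbers))   # False: range is a lazy sequence
print(inspect.isgenerator(walker))    # True
print(inspect.isgenerator(squares))   # True

# A range supports len() and indexing and can be iterated repeatedly,
# which a generator cannot do.
print(len(numbers), numbers[3], list(numbers) == list(numbers))  # 10 3 True
```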
Creating Infinite Sequences
Generators allow for the creation of infinite sequences, which can be useful in scenarios like generating endless data streams.
def natural_numbers():
    n = 1
    while True:
        yield n
        n += 1

naturals = natural_numbers()
for _ in range(10):
    print(next(naturals))
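An infinite generator is usually paired with something that decides when to stop. As one possible approach, the sketch below uses itertools.takewhile to stop on a condition instead of a fixed count. The generator is repeated here so the block is self-contained.

```python
from itertools import takewhile


def natural_numbers():
    # Infinite generator from the lesson, repeated so the block is self-contained.
    n = 1
    while True:
        yield n
        n += 1


# takewhile stops pulling values as soon as the condition fails,
# so the infinite generator is never run to completion.
squares_below_50 = [n * n for n in takewhile(lambda n: n * n < 50, natural_numbers())]
print(squares_below_50)  # [1, 4, 9, 16, 25, 36, 49]
```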
Using send() and close()
Generator objects support the send() and close() methods. send() passes a value into the generator, where it becomes the result of the paused yield expression, and close() terminates the generator by raising GeneratorExit inside it.
def echo():
    while True:
        received = yield
        print(received)

e = echo()
next(e)  # Start the generator
e.send("Hello, world!")  # Output: Hello, world!
e.close()
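send() also returns the next value the generator yields, which allows two-way communication. The sketch below is a hypothetical running-average generator (the name running_average is an assumption for this example) that receives numbers and hands back the average so far.

```python
def running_average():
    # Hypothetical example: each send() delivers a number and gets the
    # average of everything sent so far as the result of the send() call.
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # yields the current average, waits for the next value
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)              # prime the generator: run it to the first yield
print(avg.send(10))    # 10.0
print(avg.send(20))    # 15.0
print(avg.send(30))    # 20.0
avg.close()            # raises GeneratorExit inside the generator and stops it
```

The initial next() call is needed because a value can only be sent to a generator that is already paused at a yield.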
3.4 Generators in Practice
Generators and Exceptions
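Generators can catch exceptions raised inside them, including the GeneratorExit that close() raises at the paused yield. This lets a generator react to early termination, which helps in writing more resilient code.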
def controlled_execution():
    try:
        yield "Start"
        yield "Working"
    except GeneratorExit:
        print("Generator closed")

gen = controlled_execution()
print(next(gen))  # Output: Start
print(next(gen))  # Output: Working
gen.close()  # Output: Generator closed
We'll cover exception handling in upcoming lectures, but it's useful to know that generators handle them well.
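A related pattern is to put cleanup code in a finally block, so it runs whether the generator finishes normally or is closed early. The sketch below records events in a list instead of opening a real resource. The names read_items and events are made up for illustration.

```python
events = []  # records what happened, purely for illustration


def read_items(items):
    events.append("open")
    try:
        for item in items:
            yield item
    finally:
        # Runs when the generator finishes, is closed, or fails with an error.
        events.append("cleanup")


gen = read_items(["a", "b", "c"])
print(next(gen))   # a
gen.close()        # stops early; the finally block still runs
print(events)      # ['open', 'cleanup']
```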
Nested Generators
Generators can be nested, allowing for the creation of complex iterative structures.
def generator1():
    yield from range(3)
    yield from "ABC"

for value in generator1():
    print(value)

# Output
0
1
2
A
B
C
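Explanation:
yield from: This construct delegates iteration to another iterable and yields each of its items in turn. In the example the iterables are a range and a string, but they can equally be other generators. It replaces an explicit for loop around yield, which simplifies the code and improves readability.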
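Delegating to another generator is where yield from is most useful. As a minimal sketch, the hypothetical flatten function below calls itself recursively and uses yield from to pass along the values of each nested list.

```python
def flatten(items):
    # Recursively yield the leaf values of arbitrarily nested lists.
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)   # delegate to the recursive generator
        else:
            yield item


nested = [1, [2, [3, 4]], [5], 6]
print(list(flatten(nested)))  # [1, 2, 3, 4, 5, 6]
```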
Generators and Performance
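Using generators can greatly reduce a program's memory usage, because values are produced one at a time instead of being stored together. Generators are not automatically faster, however: in the sample run below the generator version takes longer, partly because its timing also includes summing the values.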
Comparing Lists and Generators Example
import sys
import time

def memory_usage(obj):
    # Size of the object itself; for a list this excludes the elements it references
    return sys.getsizeof(obj)

n = 10_000_000

# Using a list
start_time = time.time()
list_comp = [x ** 2 for x in range(n)]
list_time = time.time() - start_time
list_memory = memory_usage(list_comp)

# Using a generator
start_time = time.time()
gen_comp = (x ** 2 for x in range(n))
gen_result = sum(gen_comp)  # Consume the generator so every value is actually computed
gen_time = time.time() - start_time
gen_memory = memory_usage(gen_comp)

print("List:")
print(f"  Time: {list_time:.2f} sec")
print(f"  Memory: {list_memory:,} bytes")
print("\nGenerator:")
print(f"  Time: {gen_time:.2f} sec")
print(f"  Memory: {gen_memory:,} bytes")
# Output
List:
  Time: 0.62 sec
  Memory: 89,095,160 bytes

Generator:
  Time: 1.13 sec
  Memory: 200 bytes
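sys.getsizeof reports only the size of the container object, so it understates what the list really costs. As an alternative sketch, the standard tracemalloc module can measure the peak memory allocated while each version runs. The helper name peak_bytes and the smaller value of n are assumptions made for this example, and both versions compute the sum so that they do the same work.

```python
import tracemalloc


def peak_bytes(func):
    # Return the peak number of bytes allocated while func() runs.
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


n = 200_000  # smaller than in the lesson so the sketch runs quickly

# Both versions compute the same sum; only the intermediate storage differs.
list_peak = peak_bytes(lambda: sum([x ** 2 for x in range(n)]))
gen_peak = peak_bytes(lambda: sum(x ** 2 for x in range(n)))

print(f"List peak:      {list_peak:,} bytes")
print(f"Generator peak: {gen_peak:,} bytes")
```

The exact numbers depend on the Python version and platform, but the generator's peak should be far smaller because no intermediate list is built.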