2.1 Module threading
Multithreading in Python is a way of running multiple threads concurrently within one program. It helps most with I/O operations and other tasks that spend time waiting, because other threads can run in the meantime.
Key concepts of multithreading in Python:
Thread: the smallest unit of execution that can run concurrently with other threads within the same process. All threads in a process share the same memory space, which allows them to exchange data.
Process: an instance of a program that runs within the operating system with its own address space and resources. Unlike threads, processes are isolated from each other and exchange data through interprocess communication (IPC).
GIL (Global Interpreter Lock): a mechanism in the standard Python interpreter (CPython) that allows only one thread to execute Python code at a time. The GIL ensures the safe execution of Python code but limits the performance of multi-threaded programs on multi-core processors.
Important! Keep in mind that due to the Global Interpreter Lock (GIL), multithreading in Python might not provide a significant performance boost for CPU-bound tasks because the GIL prevents the simultaneous execution of multiple Python threads on multi-core processors.
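The difference between I/O-bound and CPU-bound work can be illustrated with a small sketch. It assumes that time.sleep() is a reasonable stand-in for a blocking I/O call (it releases the GIL while waiting); the names fake_io, run_sequential and run_threaded are made up for this illustration.

```python
import threading
import time

def fake_io(delay):
    # time.sleep stands in for a blocking I/O call (assumption for this sketch);
    # the GIL is released while the thread sleeps
    time.sleep(delay)

def run_sequential(n, delay):
    # Run the "I/O" calls one after another and return the elapsed time
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start

def run_threaded(n, delay):
    # Run the same calls in n threads and return the elapsed time
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

sequential = run_sequential(4, 0.2)
threaded = run_threaded(4, 0.2)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The waits overlap in the threaded version, so it should finish in roughly the time of one call. A pure-Python calculation in place of fake_io would not be expected to show the same gain, because of the GIL.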
Module threading
The threading module in Python provides a high-level interface for working with threads. It allows you to create and manage threads, synchronize them, and organize interaction between them. Let's take a closer look at the key components and functions of this module.
Key components of the threading module
Entities for working with threads:
- Thread: the primary class for creating and managing threads.
- Timer: a timer for executing a function after a specified interval.
- local (often called ThreadLocal): a class for creating thread-local data, used as threading.local().
Thread synchronization mechanisms:
- Lock: a synchronization primitive that prevents concurrent access to shared resources.
- Condition: a condition variable for more complex thread synchronization.
- Event: a primitive for thread notification.
- Semaphore: a primitive that limits the number of threads that can execute a specific section simultaneously.
- Barrier: synchronizes a specified number of threads, blocking them until all threads have reached the barrier.
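To give a feel for what these primitives do, here is a minimal sketch of the simplest one, Lock, protecting a shared counter. The names counter, counter_lock and increment are invented for this illustration, and the thread and iteration counts are arbitrary.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # The with-statement acquires the lock and releases it on exit,
        # so only one thread at a time updates the counter
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```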
The 3 classes for working with threads are covered below. You won't need the thread synchronization mechanisms in the near future, so they are left for later.
2.2 Class Thread
The Thread class is the primary class for creating and managing threads. It has 4 main methods:
- start(): starts the thread's execution.
- join(): suspends the current thread until the spawned thread completes.
- is_alive(): returns True if the thread is still running.
- run(): the method containing the code to be executed in the thread. It is overridden when inheriting from the Thread class.
It's much simpler than it seems. Here's an example of using the Thread class.
Starting a simple thread
import threading

def worker():
    print("Worker thread is running")

# Create a new Thread object
t = threading.Thread(target=worker)
t.start()  # Start the thread
t.join()   # Wait for the thread to complete
print("Main thread is finished")
After start() is called, the worker function begins executing in a new thread. More accurately, the thread is added to the list of active threads and the operating system schedules it, so start() returns without waiting for worker to finish.
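The life cycle of a thread can be observed with is_alive() and threading.enumerate(), which returns the currently active threads. This is only a sketch: the Event named release is a helper introduced here to keep the worker running long enough to be inspected.

```python
import threading

release = threading.Event()  # helper for this sketch: keeps the worker alive

def worker():
    release.wait()  # block until the main thread allows the worker to finish

t = threading.Thread(target=worker)
alive_before_start = t.is_alive()     # not started yet
t.start()
alive_after_start = t.is_alive()      # running (blocked on the event)
listed = t in threading.enumerate()   # present in the list of active threads
release.set()                         # let the worker finish
t.join()
alive_after_join = t.is_alive()       # finished

print(alive_before_start, alive_after_start, listed, alive_after_join)
# False True True False
```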
Using arguments
import threading

def worker(number, text):
    print(f"Worker {number}: {text}")

# Creating a new thread with arguments
t = threading.Thread(target=worker, args=(1, "Hello"))
t.start()
t.join()
To pass parameters to the new thread, put them in a tuple and pass it as the args parameter. When the function specified in target is called, the parameters are passed to it automatically.
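Besides args, the Thread constructor also accepts a kwargs dictionary for keyword arguments. The sketch below combines both. The repeat parameter and the results list are made up for this illustration.

```python
import threading

results = []

def worker(number, text, repeat=1):
    # Store the message so the main thread can inspect it after join()
    results.append(f"Worker {number}: " + text * repeat)

# Positional arguments go in args, keyword arguments in kwargs
t = threading.Thread(target=worker, args=(1, "Hi"), kwargs={"repeat": 2})
t.start()
t.join()

print(results)  # ['Worker 1: HiHi']
```

Note that a single positional argument still needs a one-element tuple, for example args=(1,).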
Overriding the run method
import threading

class MyThread(threading.Thread):
    def run(self):
        print("Custom thread is running")

# Creating and starting the thread
t = MyThread()
t.start()
t.join()
There are two ways to specify the function to run in a new thread: you can pass it through the target parameter when creating a Thread object, or inherit from the Thread class and override the run method. Both approaches are legit and commonly used.
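One situation where inheritance can be convenient is when the thread needs its own input and has to keep a result. Below is a minimal sketch; the class SquareThread and its attributes number and result are hypothetical names chosen for the example.

```python
import threading

class SquareThread(threading.Thread):
    def __init__(self, number):
        # The base class constructor must be called before the thread is started
        super().__init__()
        self.number = number
        self.result = None

    def run(self):
        # Executed in the new thread after start() is called
        self.result = self.number ** 2

t = SquareThread(7)
t.start()
t.join()          # wait until run() has finished before reading the result
print(t.result)   # 49
```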
2.3 Class Timer
The Timer class in the threading module is designed to start a function after a specified interval. This class is useful for executing delayed tasks in a multithreaded environment.
A timer is created and initialized with the function to be called and the delay time in seconds.
- The start() method starts the timer, which counts down the specified time interval and then calls the specified function.
- The cancel() method stops the timer if it hasn't triggered yet. This is helpful for preventing the execution of the function if the timer is no longer needed.
Usage examples:
Starting a function with delay
In this example, the hello function will be called 5 seconds after starting the timer.
import threading

def hello():
    print("Hello, world!")

# Creating a timer that will call the hello function after 5 seconds
t = threading.Timer(5.0, hello)
t.start()  # Starting the timer
Stopping the timer before execution
Here the timer will be stopped before the hello function can be executed, and hence nothing will be printed.
import threading

def hello():
    print("Hello, world!")

# Creating the timer
t = threading.Timer(5.0, hello)
t.start()  # Starting the timer

# Stopping the timer before execution
t.cancel()
Timer with arguments
In this example, the timer will call the greet function after 3 seconds and pass it the argument "Alice".
import threading

def greet(name):
    print(f"Hello, {name}!")

# Creating a timer with arguments
t = threading.Timer(3.0, greet, args=["Alice"])
t.start()
The Timer class is handy for scheduling task execution after a specific time. However, timers don't guarantee absolutely accurate execution time as it depends on the system load and the thread scheduler.
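The actual delay can be measured with a short sketch like the one below. It relies on the fact that Timer is a subclass of Thread, so join() waits for the timer to fire. The names fired_at and record are invented for this illustration, and the measured value will vary from run to run.

```python
import threading
import time

fired_at = []

def record():
    # Remember the moment the timer actually fired
    fired_at.append(time.perf_counter())

started = time.perf_counter()
t = threading.Timer(0.2, record)
t.start()
t.join()  # Timer is a Thread subclass, so join() waits until it has fired

delay = fired_at[0] - started
print(f"requested 0.200s, actual {delay:.3f}s")
```

The printed delay is typically slightly longer than the requested interval.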
2.4 Class ThreadLocal
Thread-local data in Python is created with the threading.local class, referred to here as ThreadLocal. It is designed for giving each thread its own local data. This is useful in multithreaded applications when each thread should have its own version of the data, to avoid conflicts and synchronization issues.
Each thread using a threading.local object will have its own independent copy of the data. Attributes stored on the object are unique to each thread and not shared with other threads. This is convenient for storing data used only within the context of a single thread, such as the current user in a web application or the current database connection.
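The isolation can be checked directly with a small sketch: an attribute set in the main thread is not visible in another thread, and a value set in that thread does not overwrite the main thread's value. The names seen_in_worker and worker are made up for this illustration.

```python
import threading

local_data = threading.local()
local_data.value = "set in main thread"

seen_in_worker = []

def worker():
    # The attribute assigned in the main thread does not exist in this thread
    seen_in_worker.append(hasattr(local_data, "value"))
    # This assignment affects only the worker thread's copy
    local_data.value = "set in worker"

t = threading.Thread(target=worker)
t.start()
t.join()

print(seen_in_worker[0])  # False
print(local_data.value)   # set in main thread
```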
Usage examples:
Main usage
In this example, each thread assigns its name to a thread-local variable value and prints it. The value is unique to each thread.
import threading

# Creating a thread-local object
local_data = threading.local()

def process_data():
    # Assigning value to thread-local variable
    local_data.value = threading.current_thread().name
    # Accessing thread-local variable
    print(f'Value in {threading.current_thread().name}: {local_data.value}')

threads = []
for i in range(5):
    t = threading.Thread(target=process_data)
    threads.append(t)
    t.start()

for t in threads:
    t.join()
Storing user data in a web application
In this example, each thread processes a request for its own user. The user_data.user value is unique to each thread.
import threading

# Creating a thread-local object
user_data = threading.local()

def process_request(user):
    # Assigning value to thread-local variable
    user_data.user = user
    handle_request()

def handle_request():
    # Accessing thread-local variable
    print(f'Handling request for user: {user_data.user}')

threads = []
users = ['Alice', 'Bob', 'Charlie']
for user in users:
    t = threading.Thread(target=process_request, args=(user,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()
These were the 3 most useful classes in the threading module. You'll likely use them in your work, but probably won't need the other classes. Nowadays, a lot of new code is written with asynchronous functions and the asyncio library instead. That's what we will be talking about in the near future.