8.1 What is Serialization
Now that you've learned how to work with files and how to save and read data, it's time to take the next step. Today, we'll start diving into serialization.
Serialization is the process of converting an object into a sequence of bytes or a format that can be saved in a file, sent over a network, or stored in a database. Deserialization is the reverse process where the original object is restored from this sequence of bytes.
In other words, to save an object to a file or send it over a network, you first convert it into a string or a sequence of bytes that can be written, transmitted, and later read back.
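The round trip described above can be sketched in a few lines. This is only a minimal illustration, assuming the standard-library pickle module (covered later in this lesson) as the serializer; the variable names are made up for the example.

```python
import pickle

# A hypothetical object used only for this illustration
original = {'name': 'Alice', 'scores': [90, 85]}

# Serialization: the object becomes a sequence of bytes
payload = pickle.dumps(original)
print(type(payload))         # <class 'bytes'>

# Deserialization: the bytes are turned back into an equal object
restored = pickle.loads(payload)
print(restored == original)  # True
print(restored is original)  # False: a new object with the same content
```

Note that deserialization produces a copy: the restored object has the same content as the original but is a separate object in memory.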
Here are 4 main areas where serialization is used:
- Saving object state: To preserve the state of an object between program runs.
- Data transfer: To send objects over a network between different system components or between different systems.
- Caching: To store objects in cache for quick access.
- Databases: To store complex data structures in databases.
There are many libraries designed for serialization, each created for specific needs. We'll get acquainted with five of them, and with two of them, we'll do a deep dive.
Let's take a look at the most popular ones:
- Module pickle
- Module json
- Module yaml
- Module marshal
- Module shelve
A brief overview of each is below:
8.2 Module pickle
pickle is a built-in module for serialization and deserialization of Python objects. It can save and restore almost any Python object, including instances of custom classes. Only load pickle data from sources you trust, because unpickling can execute arbitrary code.
Example of using pickle:
import pickle
# Example object for serialization
data = {'name': 'Alice', 'age': 30, 'is_student': False}
# Serialize object to file
with open('data.pkl', 'wb') as file:
    pickle.dump(data, file)
# Deserialize object from file
with open('data.pkl', 'rb') as file:
    loaded_data = pickle.load(file)
print(loaded_data)
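The example above uses a plain dictionary. To illustrate the claim about custom classes, here is a minimal sketch; the Point class is hypothetical and exists only for this illustration.

```python
import pickle

class Point:
    """A hypothetical class used only for this illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

point = Point(3, 4)

# Serialize the instance to bytes, then restore it
payload = pickle.dumps(point)
restored = pickle.loads(payload)

print(type(restored).__name__)  # Point
print(restored.x, restored.y)   # 3 4
```

pickle stores the instance's attributes and a reference to its class, not the class code itself, so the program that loads the data must be able to import the same class definition.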
8.3 Module json
json is a built-in module for handling JSON (JavaScript Object Notation). JSON is a language-independent text format that is widely used for exchanging data, for example between a client and a server.
Example of using json:
import json
# Example object for serialization
data = {'name': 'Bob', 'age': 25, 'is_student': True}
# Serialize object to JSON string
json_string = json.dumps(data)
print(json_string)
# Serialize object to JSON file
with open('data.json', 'w') as file:
    json.dump(data, file)
# Deserialize object from JSON string
loaded_data = json.loads(json_string)
print(loaded_data)
# Deserialize object from JSON file
with open('data.json', 'r') as file:
    loaded_data = json.load(file)
print(loaded_data)
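Because JSON is a text format with its own small set of types, a round trip does not always give back exactly the Python object you started with. The sketch below, using a made-up record, shows two common cases: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

# A hypothetical record with types that JSON has no direct equivalent for
record = {'point': (1, 2), 1: 'one', 'active': True, 'nickname': None}

text = json.dumps(record)
print(text)      # {"point": [1, 2], "1": "one", "active": true, "nickname": null}

restored = json.loads(text)
print(restored)  # {'point': [1, 2], '1': 'one', 'active': True, 'nickname': None}

# The restored object is not equal to the original
print(restored == record)  # False
```

If the exact Python types matter, they have to be converted back by hand after loading, or a Python-specific format such as pickle may be a better fit.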
8.4 Module yaml
YAML (YAML Ain't Markup Language) is a human-readable data serialization format. Python has no built-in YAML module; the third-party PyYAML library (installed with pip install pyyaml and imported as yaml) is used to work with it.
Example of using yaml:
import yaml
# Example object for serialization
data = {'name': 'Carol', 'age': 27, 'is_student': False}
# Serialize object to YAML string
yaml_string = yaml.dump(data)
print(yaml_string)
# Serialize object to YAML file
with open('data.yaml', 'w') as file:
    yaml.dump(data, file)
# Deserialize object from YAML string
loaded_data = yaml.safe_load(yaml_string)
print(loaded_data)
# Deserialize object from YAML file
with open('data.yaml', 'r') as file:
    loaded_data = yaml.safe_load(file)
print(loaded_data)
8.5 Module marshal
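marshal is a built-in module for serializing Python objects. Python itself uses it mainly to serialize compiled code, for example in .pyc files. It is often faster than pickle but supports fewer object types and is less flexible, and its format may change between Python versions.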
Example of using marshal:
import marshal
# Example object for serialization
data = {'name': 'Dave', 'age': 35, 'is_student': True}
# Serialize object to file
with open('data.marshal', 'wb') as file:
    marshal.dump(data, file)
# Deserialize object from file
with open('data.marshal', 'rb') as file:
    loaded_data = marshal.load(file)
print(loaded_data)
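The two points made about marshal, that it serializes Python code and that it supports fewer object types, can be illustrated with a short sketch. The source string and the Point class here are hypothetical and exist only for this example.

```python
import marshal

# Compile a small piece of source code into a code object
source = "result = 6 * 7"
code = compile(source, "<example>", "exec")

# Serialize the code object to bytes and restore it
payload = marshal.dumps(code)
restored_code = marshal.loads(payload)

# Run the restored code in a fresh namespace
namespace = {}
exec(restored_code, namespace)
print(namespace["result"])  # 42

# Instances of custom classes are not supported by marshal
class Point:
    pass

unsupported = False
try:
    marshal.dumps(Point())
except ValueError as error:
    unsupported = True
    print("marshal failed:", error)
```

For ordinary application data, pickle or json is usually the safer choice; marshal is best left to the job Python uses it for internally.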
8.6 Module shelve
shelve is a built-in module that provides an easy way to store Python objects in a file through a dictionary-like interface: each object is saved under a string key.
Example of using shelve:
import shelve
# Example object for serialization
data = {'name': 'Eve', 'age': 28, 'is_student': False}
# Serialize object to file
with shelve.open('data.shelve') as db:
    db['person'] = data
# Deserialize object from file
with shelve.open('data.shelve') as db:
    loaded_data = db['person']
print(loaded_data)
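To show what "dictionary-like" means in practice, here is a minimal sketch that stores several objects and then uses ordinary dictionary operations on the shelf. The file name and the records are made up, and a temporary directory is used so the example leaves no files behind.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'people')  # hypothetical file name

    # Store two objects under different string keys
    with shelve.open(path) as db:
        db['eve'] = {'name': 'Eve', 'age': 28}
        db['frank'] = {'name': 'Frank', 'age': 41}

    # Reopen the shelf and work with it like a dictionary
    with shelve.open(path) as db:
        keys = sorted(db.keys())
        has_eve = 'eve' in db
        del db['frank']
        remaining = len(db)

print(keys)       # ['eve', 'frank']
print(has_eve)    # True
print(remaining)  # 1
```

Unlike a single pickle file, a shelf lets you read or replace one stored object by key without loading everything else.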
Although the modules and data storage formats differ, working with them is quite similar from a programmer's perspective. Below, we'll take a closer look at the pickle and json modules.