Serialization


8.1 What is Serialization

Now that you know how to work with files and how to save and read data, it's time to go a step further. In this lesson, we'll start exploring serialization.

Serialization is the process of converting an object into a sequence of bytes (or a text format) that can be saved to a file, sent over a network, or stored in a database. Deserialization is the reverse process: the original object is restored from that sequence of bytes.

In other words, to save an object to a file or send it over a network, you first convert it into a string or a sequence of bytes that can be easily written, read back, or transmitted.
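As a minimal sketch of this round trip, the snippet below uses the standard pickle module to turn a dictionary into bytes and back. The point variable and its contents are made up for illustration.

```python
import pickle

# Hypothetical sample object used only for this illustration
point = {'x': 1, 'y': 2, 'tags': ['a', 'b']}

raw = pickle.dumps(point)      # serialization: object -> bytes
restored = pickle.loads(raw)   # deserialization: bytes -> new object

print(type(raw).__name__)      # bytes
print(restored == point)       # True: same content
print(restored is point)       # False: a separate copy was created
```

Note that deserialization builds a new object with the same content; it does not return the original object itself.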

Serialization is used in four main areas:

  • Saving object state: To preserve the state of an object between program runs.
  • Data transfer: To send objects over a network between different system components or between different systems.
  • Caching: To store objects in cache for quick access.
  • Databases: To store complex data structures in databases.

There are many libraries designed for serialization, each created for specific needs. We'll get acquainted with five of them and take a deep dive into two.

Let's take a look at the most popular ones:

  • Module pickle
  • Module json
  • Module yaml
  • Module marshal
  • Module shelve

A brief overview of each follows.

8.2 Module pickle

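pickle is a standard library module for serializing and deserializing Python objects. It can save and restore almost any Python object, including instances of custom classes. Keep in mind that the pickle format is specific to Python and that unpickling data from an untrusted source is unsafe, because it can execute arbitrary code.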

Example of using pickle:


import pickle

# Example object for serialization
data = {'name': 'Alice', 'age': 30, 'is_student': False}
            
# Serialize object to file
with open('data.pkl', 'wb') as file:
    pickle.dump(data, file)
            
# Deserialize object from file
with open('data.pkl', 'rb') as file:
    loaded_data = pickle.load(file)
            
print(loaded_data)
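To illustrate what "almost any Python object" means in practice, here is a small sketch showing that pickle preserves types such as tuples, sets, and dates, which a plain text format would not keep as-is. The record contents are invented for the example.

```python
import pickle
from datetime import date

# Hypothetical record mixing several built-in and standard library types
record = {
    'name': 'Alice',
    'birthday': date(1994, 5, 17),
    'scores': (90, 85),          # tuple
    'tags': {'python', 'sql'},   # set
}

restored = pickle.loads(pickle.dumps(record))

print(restored == record)                    # True
print(type(restored['scores']).__name__)     # tuple
print(type(restored['tags']).__name__)       # set
print(type(restored['birthday']).__name__)   # date
```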

8.3 Module json

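json is a standard library module for working with JSON (JavaScript Object Notation). JSON is a language-independent text format that is widely used for exchanging data, for example between a client and a server. Out of the box, json handles basic types such as dictionaries, lists, strings, numbers, booleans, and None.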

Example of using json:


import json

# Example object for serialization
data = {'name': 'Bob', 'age': 25, 'is_student': True}
            
# Serialize object to JSON string
json_string = json.dumps(data)
print(json_string)
            
# Serialize object to JSON file
with open('data.json', 'w') as file:
    json.dump(data, file)
            
# Deserialize object from JSON string
loaded_data = json.loads(json_string)
print(loaded_data)
            
# Deserialize object from JSON file
with open('data.json', 'r') as file:
    loaded_data = json.load(file)
print(loaded_data)
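JSON only knows a handful of types, so some Python objects need help. The sketch below shows what happens with a tuple and a date; the encode_extra function is a hypothetical helper written for this example and passed through the default parameter of json.dumps.

```python
import json
from datetime import date

data = {'name': 'Bob', 'scores': (90, 85), 'enrolled': date(2024, 9, 1)}

# Without help, json does not know how to encode a date
try:
    json.dumps(data)
except TypeError as error:
    print('Not serializable:', error)

def encode_extra(value):
    """Hypothetical fallback: store dates as ISO 8601 strings."""
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f'Cannot serialize {type(value).__name__}')

json_string = json.dumps(data, default=encode_extra)
loaded = json.loads(json_string)

print(loaded)
# {'name': 'Bob', 'scores': [90, 85], 'enrolled': '2024-09-01'}
```

After the round trip, the tuple comes back as a list and the date as a string, so converting them back to their original types is up to your code.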

8.4 Module yaml

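YAML (YAML Ain't Markup Language) is a human-readable data serialization format. Python's standard library does not include a YAML module; the third-party PyYAML library (installed with pip install pyyaml and imported as yaml) is commonly used instead.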

Example of using yaml:


import yaml

# Example object for serialization
data = {'name': 'Carol', 'age': 27, 'is_student': False}
            
# Serialize object to YAML string
yaml_string = yaml.dump(data)
print(yaml_string)
            
# Serialize object to YAML file
with open('data.yaml', 'w') as file:
    yaml.dump(data, file)
            
# Deserialize object from YAML string
loaded_data = yaml.safe_load(yaml_string)
print(loaded_data)
            
# Deserialize object from YAML file
with open('data.yaml', 'r') as file:
    loaded_data = yaml.safe_load(file)
print(loaded_data)

8.5 Module marshal

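marshal is a standard library module that Python itself uses internally, for example to read and write compiled code in .pyc files. It is often faster than pickle, but it supports only a limited set of built-in types and code objects, cannot serialize instances of custom classes, and its format may change between Python versions, so it is not suited to long-term storage.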

Example of using marshal:


import marshal

# Example object for serialization
data = {'name': 'Dave', 'age': 35, 'is_student': True}

# Serialize object to file
with open('data.marshal', 'wb') as file:
    marshal.dump(data, file)
            
# Deserialize object from file
with open('data.marshal', 'rb') as file:
    loaded_data = marshal.load(file)
            
print(loaded_data)
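The following sketch illustrates both sides of marshal: it handles compiled code objects, but rejects an instance of a custom class. The Point class and the compiled snippet are made up for the example.

```python
import marshal

# Code objects are the kind of data marshal was designed for
code = compile('result = 6 * 7', '<example>', 'exec')
restored_code = marshal.loads(marshal.dumps(code))

namespace = {}
exec(restored_code, namespace)
answer = namespace['result']
print(answer)  # 42

class Point:
    """Hypothetical custom class; marshal does not support such objects."""

rejected = False
try:
    marshal.dumps(Point())
except ValueError as error:
    rejected = True
    print('marshal cannot handle this object:', error)
```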

8.6 Module shelve

shelve is a standard library module that provides an easy way to store Python objects in a file through a dictionary-like interface: keys are strings, and values can be any object that pickle can handle.

Example of using shelve:


import shelve

# Example object for serialization
data = {'name': 'Eve', 'age': 28, 'is_student': False}
            
# Serialize object to file
with shelve.open('data.shelve') as db:
    db['person'] = data
            
# Deserialize object from file
with shelve.open('data.shelve') as db:
    loaded_data = db['person']
            
print(loaded_data)
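Because a shelf behaves like a dictionary, the usual dictionary operations work on it. Here is a minimal sketch that writes to a temporary directory so it leaves no files behind; the file name people and the stored records are assumptions made for the example.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'people')  # hypothetical file name

    # Store several objects under different string keys
    with shelve.open(path) as db:
        db['eve'] = {'name': 'Eve', 'age': 28}
        db['frank'] = {'name': 'Frank', 'age': 41}

    # Reopen the shelf and use it like a dictionary
    with shelve.open(path) as db:
        names = sorted(db.keys())
        has_eve = 'eve' in db
        del db['frank']
        remaining = len(db)

print(names)      # ['eve', 'frank']
print(has_eve)    # True
print(remaining)  # 1
```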

Although the modules and storage formats differ, working with them is quite similar from a programmer's perspective: most of them expose dump/load functions for files and dumps/loads functions for strings or bytes. Next, we'll take a closer look at the pickle and json modules.
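As a rough illustration of how similar these interfaces are, the sketch below runs the same round trip through three standard library modules using only their dumps and loads functions. The results list is just a helper for this example.

```python
import json
import marshal
import pickle

data = {'name': 'Eve', 'age': 28, 'is_student': False}

results = []
for module in (pickle, json, marshal):
    encoded = module.dumps(data)     # same call for every module
    decoded = module.loads(encoded)  # and the same call to restore
    results.append((module.__name__, type(encoded).__name__, decoded == data))

for row in results:
    print(row)
# ('pickle', 'bytes', True)
# ('json', 'str', True)
# ('marshal', 'bytes', True)
```

The only visible difference here is that json produces text while pickle and marshal produce bytes.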
