1. Introduction: why optimize serialization?
In modern applications, serialization is everywhere — from network protocols to distributed caches and data exchange between services.
The speed and size of serialization are critical here. If serialization is slow, the application starts to “lag” when saving or loading data, while the network or disk sits idle. If objects turn out too large, they take up a lot of disk space, take longer to transfer over the network, and put extra load on memory and bandwidth.
Typical tasks include saving a large object graph to a file or cache, transmitting objects over the network with minimal latency, and fast serialization/deserialization of data in a multithreaded system.
The conclusion is simple: serialization optimization is not a “premium feature”, but a necessary practice for high-performance and scalable applications.
2. Optimizing the size of serialized data
Excluding unnecessary data: the transient keyword
By default, all non-static fields of an object are serialized except those marked as transient. If a field doesn’t need to be saved (for example, caches, temporary data, references to services), mark it as transient:
public class User implements Serializable {
    private String name;
    private transient String sessionToken; // will not be serialized
}
Pros:
- Smaller serialized object size.
- No redundant/dangerous data in the file or over the network.
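To see the effect, you can serialize a user to memory and read it back: the transient field comes back as its default value. A minimal sketch, mirroring the User class above (the in-memory round trip and the roundTrip helper are illustrative, not part of any API):

```java
import java.io.*;

public class TransientDemo {
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient String sessionToken; // excluded from serialization

        User(String name, String sessionToken) {
            this.name = name;
            this.sessionToken = sessionToken;
        }
    }

    // serializes the user to a byte array and reads it back
    static User roundTrip(User user) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(user);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (User) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        User restored = roundTrip(new User("alice", "tok-123"));
        System.out.println(restored.name);         // alice
        System.out.println(restored.sessionToken); // null: transient fields come back as defaults
    }
}
```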
Manual serialization: the Externalizable interface
If you need full control over what is serialized and how, implement the Externalizable interface and describe the process explicitly in the writeExternal/readExternal methods. Note that deserialization creates the object through its public no-argument constructor, so the class must provide one:
public class Person implements Externalizable {
    private String name;
    private int age;
    private String secret; // no transient needed: only explicitly written fields are serialized

    public Person() {} // required: deserialization calls the public no-arg constructor

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(age);
        // secret is deliberately not written
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        name = in.readUTF();
        age = in.readInt();
    }
}
Pros:
- Only the required fields are serialized.
- You can change the serialization format without breaking compatibility.
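One common way to keep the format evolvable is to write an explicit version number first, so readExternal can branch on it when fields are added later. A hedged sketch (the VersionedPerson class and the "email added in version 2" scenario are hypothetical):

```java
import java.io.*;

public class VersionedPerson implements Externalizable {
    private static final int FORMAT_VERSION = 2;

    String name;
    int age;
    String email; // field added in format version 2

    public VersionedPerson() {} // required by Externalizable

    VersionedPerson(String name, int age, String email) {
        this.name = name;
        this.age = age;
        this.email = email;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(FORMAT_VERSION); // version tag goes first
        out.writeUTF(name);
        out.writeInt(age);
        out.writeUTF(email);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        int version = in.readInt();
        name = in.readUTF();
        age = in.readInt();
        email = (version >= 2) ? in.readUTF() : ""; // data written by version 1 has no email
    }

    // in-memory round trip used for a quick check
    static VersionedPerson roundTrip(VersionedPerson p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (VersionedPerson) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        VersionedPerson p = roundTrip(new VersionedPerson("bob", 30, "bob@example.com"));
        System.out.println(p.name + ", " + p.age + ", " + p.email);
    }
}
```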
Compression: compressing serialized data
Serialized objects often take up a lot of space, especially if they contain repeating strings and large collections. You can reduce the size with compression.
Example with GZIPOutputStream:
try (ObjectOutputStream out = new ObjectOutputStream(
        new GZIPOutputStream(new FileOutputStream("data.gz")))) {
    out.writeObject(bigObject);
}
Example with ZipOutputStream:
try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream("data.zip"))) {
    zip.putNextEntry(new ZipEntry("object"));
    ObjectOutputStream out = new ObjectOutputStream(zip);
    out.writeObject(bigObject);
    out.flush(); // flush, but don't close: closing out would also close the zip stream
    zip.closeEntry();
}
Pros:
- The file size can shrink multiple times over (especially for large object graphs).
- Less traffic when transferring over the network.
Cons:
- Compression/decompression requires additional time (CPU).
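Reading compressed data back is symmetric: wrap the input stream in a GZIPInputStream. A self-contained sketch that also shows the size difference on repetitive data (the GzipSizeDemo class and its helper names are illustrative):

```java
import java.io.*;
import java.util.*;
import java.util.zip.*;

public class GzipSizeDemo {
    // serializes an object to a plain byte array
    static byte[] serializePlain(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream()) {
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // serializes an object through a GZIP stream
    static byte[] serializeGzip(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream()) {
            try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
                out.writeObject(obj);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // reads the object back, decompressing on the fly
    static Object deserializeGzip(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(data)))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // repetitive data compresses especially well
        ArrayList<String> data = new ArrayList<>(Collections.nCopies(10_000, "repeating string"));
        byte[] plain = serializePlain(data);
        byte[] packed = serializeGzip(data);
        System.out.println("plain: " + plain.length + " bytes, gzip: " + packed.length + " bytes");
    }
}
```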
3. Optimizing serialization speed
Buffering: why BufferedOutputStream and BufferedInputStream are needed
Problem:
Without buffering, every call to write() or read() results in a system call to disk or network — this is very slow!
Solution:
Use buffered streams:
try (ObjectOutputStream out = new ObjectOutputStream(
        new BufferedOutputStream(new FileOutputStream("data.bin")))) {
    out.writeObject(bigObject);
}

try (ObjectInputStream in = new ObjectInputStream(
        new BufferedInputStream(new FileInputStream("data.bin")))) {
    Object obj = in.readObject();
}
Pros:
- Significantly speeds up writing/reading large objects.
- Reduces the number of disk/network accesses.
How does it work?
The buffer accumulates data in memory and writes it in chunks rather than byte by byte.
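This effect can be made visible by counting how often the underlying stream is actually written to, as a stand-in for system calls. A sketch under the assumption that each underlying write models one disk/network access (the CountingOutputStream helper and BufferCountDemo class are hypothetical):

```java
import java.io.*;

public class BufferCountDemo {
    // wraps a stream and counts how many times it is actually written to
    static class CountingOutputStream extends FilterOutputStream {
        long writes = 0;
        CountingOutputStream(OutputStream out) { super(out); }
        @Override public void write(int b) throws IOException { writes++; out.write(b); }
        @Override public void write(byte[] b, int off, int len) throws IOException {
            writes++;
            out.write(b, off, len);
        }
    }

    static long countWrites(boolean buffered) {
        CountingOutputStream target = new CountingOutputStream(new ByteArrayOutputStream());
        OutputStream sink = buffered ? new BufferedOutputStream(target, 8192) : target;
        try (DataOutputStream out = new DataOutputStream(sink)) {
            for (int i = 0; i < 100_000; i++) {
                out.writeInt(i); // four one-byte writes per call when unbuffered
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.writes;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered writes: " + countWrites(false));
        System.out.println("buffered writes:   " + countWrites(true));
    }
}
```

With an 8 KB buffer, hundreds of thousands of tiny writes collapse into a few dozen chunk writes.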
Fast copying: FileChannel.transferTo
If you need to quickly copy a large serialized file, use NIO and the transferTo method:
try (FileChannel src = new FileInputStream("data.bin").getChannel();
     FileChannel dest = new FileOutputStream("copy.bin").getChannel()) {
    src.transferTo(0, src.size(), dest);
}
Pros:
- Copying happens at the OS level, bypassing extra buffering in Java — very fast for large files.
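One caveat: transferTo may transfer fewer bytes than requested (the OS decides how much to move per call), so robust code loops until the whole file is copied. A sketch (the ChannelCopy class and demoCopy helper are illustrative):

```java
import java.io.*;
import java.nio.channels.FileChannel;
import java.nio.file.Files;

public class ChannelCopy {
    // loops because a single transferTo call may copy only part of the file
    static void copy(File src, File dest) {
        try (FileChannel in = new FileInputStream(src).getChannel();
             FileChannel out = new FileOutputStream(dest).getChannel()) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // writes n zero bytes to a temp file, copies it, and returns the copy's size
    static long demoCopy(int n) {
        try {
            File src = File.createTempFile("data", ".bin");
            File dest = File.createTempFile("copy", ".bin");
            src.deleteOnExit();
            dest.deleteOnExit();
            Files.write(src.toPath(), new byte[n]);
            copy(src, dest);
            return dest.length();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("copied " + demoCopy(1_000_000) + " bytes");
    }
}
```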
4. Profiling serialization
Simple time measurement: System.nanoTime()
For a quick assessment of serialization performance you can use System.nanoTime():
long start = System.nanoTime();
try (ObjectOutputStream out = new ObjectOutputStream(
        new BufferedOutputStream(new FileOutputStream("data.bin")))) {
    out.writeObject(bigObject);
}
long end = System.nanoTime();
System.out.println("Serialization time: " + (end - start) / 1_000_000 + " ms");
Pros:
- Simple and fast.
- You can compare different variants (with buffer, without buffer, with compression, etc.).
Cons:
- Results can “jump around” due to GC activity and background processes.
- Not suitable for precisely comparing microscopic differences.
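A cheap way to make nanoTime() measurements less noisy, without reaching for a full benchmarking framework, is to warm up first and report the median of several runs. A sketch (the QuickBench class and medianNanos helper are illustrative):

```java
import java.io.*;
import java.util.*;

public class QuickBench {
    // warm up so the JIT compiles the hot path, then take the median of
    // several runs; the median is robust against one-off GC pauses
    static long medianNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long[] times = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            times[i] = System.nanoTime() - start;
        }
        Arrays.sort(times);
        return times[runs / 2];
    }

    public static void main(String[] args) {
        List<String> bigObject = new ArrayList<>(Collections.nCopies(100_000, "payload"));
        long ns = medianNanos(() -> {
            try (ObjectOutputStream out = new ObjectOutputStream(
                    new BufferedOutputStream(new ByteArrayOutputStream()))) {
                out.writeObject(bigObject);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, 5, 11);
        System.out.println("median serialization time: " + ns / 1_000_000 + " ms");
    }
}
```

This is still far from JMH-grade rigor, but it removes the worst of the warm-up and GC noise.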
Accurate profiling: JMH (Java Microbenchmark Harness)
For more accurate measurement, use JMH — a dedicated microbenchmarking library.
Example of a simple benchmark:
@Benchmark
public void serializeWithBuffer() throws Exception {
    // in a real benchmark, bigObject would be a field of a @State-annotated class
    try (ObjectOutputStream out = new ObjectOutputStream(
            new BufferedOutputStream(new FileOutputStream("data.bin")))) {
        out.writeObject(bigObject);
    }
}
Pros:
- Accounts for JVM warm-up, GC impact, and OS “noise”.
- Provides reliable and reproducible results.
Cons:
- Requires setup and an understanding of JMH methodology.
- Overkill for a quick “eyeball” comparison.
5. Practice: comparing serialization time and size
Let’s run a mini experiment: serialize a large object graph (for example, a list of 100_000 objects with nested collections) in different ways and compare the time and file size.
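The snippets below assume a bigList variable holding such a graph. One way to build it, so the experiment is reproducible (the Item class and buildBigList helper are hypothetical):

```java
import java.io.Serializable;
import java.util.*;

public class TestData {
    // a simple element type with a nested collection
    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
        int id;
        String label;
        List<String> tags;

        Item(int id) {
            this.id = id;
            this.label = "item-" + id;
            this.tags = new ArrayList<>(Arrays.asList("alpha", "beta", "gamma"));
        }
    }

    static List<Item> buildBigList(int n) {
        List<Item> bigList = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            bigList.add(new Item(i));
        }
        return bigList;
    }

    public static void main(String[] args) {
        List<Item> bigList = buildBigList(100_000);
        System.out.println("built " + bigList.size() + " items");
    }
}
```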
Serialization without buffering and compression
long start = System.nanoTime();
try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("data1.bin"))) {
    out.writeObject(bigList);
}
long end = System.nanoTime();
System.out.println("No buffer: " + (end - start) / 1_000_000 + " ms, size: " +
        new File("data1.bin").length() + " bytes");
Serialization with buffering
long start = System.nanoTime();
try (ObjectOutputStream out = new ObjectOutputStream(
        new BufferedOutputStream(new FileOutputStream("data2.bin")))) {
    out.writeObject(bigList);
}
long end = System.nanoTime();
System.out.println("With buffer: " + (end - start) / 1_000_000 + " ms, size: " +
        new File("data2.bin").length() + " bytes");
Serialization with compression (GZIP)
long start = System.nanoTime();
try (ObjectOutputStream out = new ObjectOutputStream(
        new GZIPOutputStream(new FileOutputStream("data3.gz")))) {
    out.writeObject(bigList);
}
long end = System.nanoTime();
System.out.println("With compression: " + (end - start) / 1_000_000 + " ms, size: " +
        new File("data3.gz").length() + " bytes");
Results analysis
When testing serialization, you can clearly see how much buffering and compression matter. Compressed files usually become 2–10 times smaller (the exact ratio depends on the data structure). With buffering, serialization goes noticeably faster, and compression slows the process a bit, but the space savings are often worth it.
Conclusion: for large volumes of data, always use buffering, and if size is critical — enable compression.
6. Common mistakes when optimizing serialization
Mistake 1: Not using buffering — serialization of large objects becomes many times slower.
Mistake 2: Serializing unnecessary or sensitive data (e.g., passwords, temporary tokens) — always use transient for such fields.
Mistake 3: Expecting compression to always speed up serialization — in reality, compression reduces size but can slow the process down (especially on weak CPUs).
Mistake 4: Measuring time without accounting for JVM warm-up and GC impact — use JMH for accurate benchmarks.
Mistake 5: Comparing only time or only size — always look at both metrics to choose the optimal balance for your task.