1. Why choose a format at all? Quick reminder
Any .NET object in memory is a set of addresses, references, fields, internal structures and other "engine room" stuff. You can't just dump that into a file: files must be organized in a format that not only we, but other programs can read (and preferably write). That's where the choice of a serialization format comes in.
Main goals of a serialization format:
- Compactness: how much disk space the data takes and how quickly it travels over the network.
- Readability (by humans!): can you open the file in a text editor and understand what's going on.
- Universality: is it suitable for integration with other systems and languages.
- Support for complex structures: nesting, collections, references, types.
- Security and compatibility.
2. Binary Serialization
Binary serialization represents data as a "raw" stream of bytes. Think of it like not just packing things into a box, but cementing them — fast and compact, but you won't find your socks again without a jackhammer.
// Example for historical reference only: BinaryFormatter is obsolete since .NET 5
// and throws PlatformNotSupportedException in .NET 9.
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
// Create an object (the User class must be marked [Serializable])
var user = new User { Name = "Ivan", Age = 42 };
// Serialization
using var fs = new FileStream("user.bin", FileMode.Create);
var formatter = new BinaryFormatter();
formatter.Serialize(fs, user);
Pros:
- Compactness: minimal disk footprint, high speed.
- Built-in support for "native" .NET types.
Cons:
- The format is completely unreadable — try opening the file in Notepad and you'll get the Matrix (i.e. gibberish).
- Absolutely non-universal: only for .NET, and the payload is tied to the exact types and assembly versions that wrote it.
- Vulnerable to compatibility issues between versions.
Where was it used?
Inside .NET, when you needed to quickly "dump" objects between processes on the same machine. Today it is strongly discouraged: deserializing untrusted data with BinaryFormatter can lead to remote code execution, the type has been marked obsolete since .NET 5, and in .NET 9 its implementation was removed from the runtime. Do not use BinaryFormatter in modern projects.
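If a compact binary file is still needed, one option that avoids BinaryFormatter is to write the fields by hand with BinaryWriter and read them back in the same order with BinaryReader. The sketch below is only an illustration under stated assumptions: the file layout (a string followed by an int) and the small User record are made up for this example and are not defined in the text above.

```csharp
using System;
using System.IO;

var user = new User("Ivan", 42);

// Write the fields one by one; the reader must know this exact order.
using (var fs = new FileStream("user.bin", FileMode.Create))
using (var writer = new BinaryWriter(fs))
{
    writer.Write(user.Name); // length-prefixed UTF-8 string
    writer.Write(user.Age);  // 4-byte little-endian int
}

// Read the fields back in the same order they were written.
User restored;
using (var fs = new FileStream("user.bin", FileMode.Open))
using (var reader = new BinaryReader(fs))
{
    restored = new User(reader.ReadString(), reader.ReadInt32());
}

Console.WriteLine($"{restored.Name}, {restored.Age}");
Console.WriteLine($"File size: {new FileInfo("user.bin").Length} bytes");

// Hypothetical type used only for this sketch.
record User(string Name, int Age);
```

The trade-off is the same one described above: the file is tiny, but nothing in it says what the bytes mean, so any change to the field order or types has to be versioned by hand.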
3. XML (Extensible Markup Language)
XML is a human- and machine-readable format based on tags, like HTML, but without the silly jokes about <body>.
<User>
  <Name>Ivan</Name>
  <Age>42</Age>
</User>
Serialization example:
using System.IO;
using System.Xml.Serialization;
// XmlSerializer requires a public type with a public parameterless constructor
var user = new User { Name = "Ivan", Age = 42 };
var serializer = new XmlSerializer(typeof(User));
using var fs = new FileStream("user.xml", FileMode.Create);
serializer.Serialize(fs, user);
Pros:
- Readability: you can open it and visually see the structure.
- Universality: many languages and tools can read XML.
- Flexibility (support for schemas, validation, namespaces, etc.).
- Well suited for complex structures and nested objects.
Cons:
- Verbose and heavyweight: takes more space and bandwidth.
- Parsing is slower than binary formats.
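- Some types need care to round-trip exactly (e.g., dates and times with their time-zone information).
- Requires extra setup for some collections or nonstandard objects (for example, XmlSerializer cannot serialize Dictionary<TKey, TValue> out of the box).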
Where is it used?
Data exchange between enterprise systems, configs, integration where a strict structure is required.
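The serialization example above only writes XML. A minimal round-trip sketch, assuming a User class shaped like the one below (the original text never shows it), could look like this. It serializes to a string rather than a file so it can run on its own.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

var serializer = new XmlSerializer(typeof(User));

// Serialize to a string instead of a file to keep the sketch self-contained.
string xml;
using (var writer = new StringWriter())
{
    serializer.Serialize(writer, new User { Name = "Ivan", Age = 42 });
    xml = writer.ToString();
}
Console.WriteLine(xml);

// Deserialize returns object, so the result has to be cast back to User.
User restored;
using (var reader = new StringReader(xml))
{
    restored = (User)serializer.Deserialize(reader)!;
}
Console.WriteLine($"{restored.Name}, {restored.Age}");

// Assumed shape of the User class: public, with a parameterless constructor
// and public read/write properties, which is what XmlSerializer expects.
public class User
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}
```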
4. JSON (JavaScript Object Notation)
JSON is a compact, lightweight, easily readable format that came from the JavaScript world and conquered almost the entire data-exchange landscape because of its conciseness and simplicity.
{
  "Name": "Ivan",
  "Age": 42
}
Serialization example:
using System.IO;
using System.Text.Json;
var user = new User { Name = "Ivan", Age = 42 };
string json = JsonSerializer.Serialize(user);
File.WriteAllText("user.json", json);
Pros:
- Very readable by both humans and machines.
- Easily integrates with web technologies (APIs, JavaScript and beyond).
- More compact than XML.
- Fast serialization/deserialization in modern libraries (System.Text.Json, Newtonsoft.Json).
Cons:
- No support for comments (your JSON can't joke about itself).
- Limitations on structures: JSON has no native way to express references between objects, and object keys are always strings, so cyclic object graphs and dictionaries with non-string keys depend on serializer-specific workarounds.
- Differences in date/time formats across implementations.
Where is it used?
Almost everywhere you need cross-platform data exchange: web services, mobile apps, REST APIs, etc.
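To make the limitations listed above more concrete, here is a hedged sketch of how System.Text.Json (in .NET 5 and later) deals with a round trip, a dictionary with int keys, and a self-referencing object. The User class and its Manager property are assumptions made up for this example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Round trip: object -> JSON text -> object.
string json = JsonSerializer.Serialize(new User { Name = "Ivan", Age = 42 });
User restored = JsonSerializer.Deserialize<User>(json)!;
Console.WriteLine($"{json} -> {restored.Name}, {restored.Age}");

// Non-string dictionary keys: supported key types such as int are written
// as JSON property names, that is, as strings.
var byId = new Dictionary<int, string> { [1] = "Ivan", [2] = "Olga" };
string byIdJson = JsonSerializer.Serialize(byId);
Console.WriteLine(byIdJson);

// Object cycles: the default settings throw a JsonException.
var boss = new User { Name = "Olga", Age = 27 };
boss.Manager = boss; // deliberately self-referencing
try { JsonSerializer.Serialize(boss); }
catch (JsonException) { Console.WriteLine("Default settings reject the cycle."); }

// ReferenceHandler.Preserve adds $id/$ref metadata so the cycle can be written
// and restored, at the cost of a serializer-specific JSON shape.
var options = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
string cyclicJson = JsonSerializer.Serialize(boss, options);
User copy = JsonSerializer.Deserialize<User>(cyclicJson, options)!;
Console.WriteLine(cyclicJson);
Console.WriteLine(ReferenceEquals(copy, copy.Manager));

// Hypothetical type used only for this sketch.
public class User
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
    public User? Manager { get; set; }
}
```

Other JSON consumers will not necessarily understand the $id/$ref metadata, which is why reference support counts as a workaround rather than a feature of the format.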
5. CSV (Comma-Separated Values)
CSV is a simple text format representing data as a table (rows and columns) where fields are separated by commas or another delimiter.
Name,Age
Ivan,42
Olga,27
Example of writing CSV manually:
using System.Collections.Generic;
using System.IO;
// For simplicity, build the lines by hand and write them with File.WriteAllLines
var lines = new List<string>
{
    "Name,Age",
    "Ivan,42",
    "Olga,27"
};
File.WriteAllLines("users.csv", lines);
Pros:
- Supported by tons of programs (Excel, Google Sheets, DBs, etc.).
- Concise and convenient for tabular data.
Cons:
- Not suitable for nested or complex objects (only "flat" structures).
- Doesn't store data types (everything is a string).
- Escaping issues when data contains commas, quotes, or newlines.
Where is it used?
Export/import between databases, exchanging simple record sets, data dumps from applications.
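The escaping problem mentioned in the cons is usually handled the way RFC 4180 describes: wrap a field in double quotes when it contains a comma, a quote or a line break, and double any quotes inside it. A minimal sketch follows; the helper names EscapeCsvField and ToCsvLine are invented for this example, and a dedicated CSV library is likely the better choice for real data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Quote a field only when it contains a comma, a quote or a line break;
// embedded quotes are doubled.
static string EscapeCsvField(string field)
{
    bool needsQuotes = field.Contains(',') || field.Contains('"')
                       || field.Contains('\n') || field.Contains('\r');
    return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
}

// Join already-escaped fields into one CSV record.
static string ToCsvLine(IEnumerable<string> fields) =>
    string.Join(",", fields.Select(EscapeCsvField));

var lines = new List<string>
{
    ToCsvLine(new[] { "Name", "Comment" }),
    ToCsvLine(new[] { "Ivan", "likes commas, a lot" }),
    ToCsvLine(new[] { "Olga", "said \"hi\"" })
};
File.WriteAllLines("users.csv", lines);
lines.ForEach(Console.WriteLine);
```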
6. Strongly-typed exchange protocols and formats
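Protocol Buffers (protobuf) and MessagePack are binary serialization formats that make serialization even more compact and faster. Protobuf relies on a contract: a predefined schema (a .proto file) describing the data structures. MessagePack itself is schemaless, closer to a binary JSON, although .NET libraries for it typically describe the contract with attributes on your classes.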
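Protobuf example (simplified):
Protobuf is not part of the .NET base class library; you need the Google.Protobuf package, plus the protoc compiler to generate C# classes from the schema.
syntax = "proto3";
message User {
  string name = 1;
  int32 age = 2;
}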
Pros:
- Maximum speed and compactness.
- Cross-platform (many languages support protobuf).
Cons:
- Harder to learn and set up.
- Protobuf normally requires code generation from the schema (a .proto file compiled by protoc).
- Often unnecessary unless you're dealing with millions of messages per second.
Where is it used?
High-load systems, data exchange between microservices, games, IoT.
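To see where the compactness comes from, the sketch below hand-encodes the User message from the schema above using the protobuf wire format: each field is a small tag (field number plus wire type) followed by the value. This is for illustration only. The helpers WriteVarint and EncodeUser are invented here, real code would use the classes generated by protoc, and a generated proto3 serializer would also skip fields that hold default values.

```csharp
using System;
using System.IO;
using System.Text;

// Varint: 7 bits per byte, high bit set on every byte except the last.
static void WriteVarint(Stream output, ulong value)
{
    while (value >= 0x80)
    {
        output.WriteByte((byte)(value | 0x80));
        value >>= 7;
    }
    output.WriteByte((byte)value);
}

// Encodes: message User { string name = 1; int32 age = 2; }
static byte[] EncodeUser(string name, int age)
{
    using var ms = new MemoryStream();
    byte[] nameBytes = Encoding.UTF8.GetBytes(name);

    WriteVarint(ms, (1 << 3) | 2);            // field 1, wire type 2 (length-delimited)
    WriteVarint(ms, (ulong)nameBytes.Length); // byte length of the string
    ms.Write(nameBytes, 0, nameBytes.Length);

    WriteVarint(ms, (2 << 3) | 0);            // field 2, wire type 0 (varint)
    WriteVarint(ms, (ulong)(long)age);        // int32 is sign-extended to 64 bits

    return ms.ToArray();
}

byte[] payload = EncodeUser("Ivan", 42);
Console.WriteLine(BitConverter.ToString(payload)); // 0A-04-49-76-61-6E-10-2A
Console.WriteLine($"{payload.Length} bytes");
```

The same user takes 8 bytes here versus 24 bytes as the compact JSON text {"Name":"Ivan","Age":42}, because field names are replaced by numeric tags from the schema.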
7. Useful nuances
Comparison table of formats
| Format | Readability | Compactness | Universality | Complex objects | Built into .NET 9 | Security |
|---|---|---|---|---|---|---|
| Binary (.NET) | No | Yes | No | Yes | No | No |
| XML | Yes | No | Yes | Yes | Yes | Yes |
| JSON | Yes | Yes | Yes | Partially | Yes | Yes |
| CSV | Yes | Yes | Yes | No | No | Yes |
| Protobuf | No | Best | Yes | Yes | No | Yes |
How to choose a serialization format for your app?
If you need to save complex data structures and human readability matters — use XML or JSON.
If you need to quickly transfer a large amount of data between services you control — consider protobuf or MessagePack.
If you're exporting reports for analysis in Excel or a DB — use CSV.
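If you're writing configs for services or CI/CD, YAML will serve you well (it is not covered above, and .NET has no built-in YAML serializer, so you would need a third-party library such as YamlDotNet).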
Compactness, speed and support for references between objects are critical? You'll have to look at binary formats, but remember the security and compatibility downsides.
8. Tricky points and common mistakes
If you decide to store data in a binary format for "obscurity", remember: a binary dump is not security. An attacker can still "unpack" that stream. Use encryption when needed.
A common problem is forgetting about encoding when working with text formats (XML/JSON/CSV): write them as UTF-8 explicitly, and remember that not every editor or system handles a byte order mark (BOM) well.
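The BOM point is easy to check in code. The sketch below writes the same text twice, once with Encoding.UTF8 (which emits the three BOM bytes EF BB BF) and once with a UTF8Encoding configured to omit them. The file names are arbitrary choices for this example.

```csharp
using System;
using System.IO;
using System.Text;

string json = "{\"Name\":\"Ivan\"}";

// Encoding.UTF8 emits a 3-byte BOM at the start of the file.
File.WriteAllText("with-bom.json", json, Encoding.UTF8);

// UTF8Encoding(false) is UTF-8 without the BOM.
File.WriteAllText("no-bom.json", json,
    new UTF8Encoding(encoderShouldEmitUTF8Identifier: false));

long withBom = new FileInfo("with-bom.json").Length;
long noBom = new FileInfo("no-bom.json").Length;
Console.WriteLine($"with BOM: {withBom} bytes, without BOM: {noBom} bytes");
```

Calling File.WriteAllText without an encoding argument also writes UTF-8 without a BOM, so passing Encoding.UTF8 explicitly is the case that adds one.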
XML, although it may seem "old", is great for contract-driven protocols where a strict validation description matters (e.g., XSD schemas), while JSON is good for quick, flexible exchanges.
When serializing collections of complex objects, not all formats properly support nested or recursive structures. JSON and XML are friendly with arrays and lists; CSV is not.