- docx (Microsoft Word format);
- pdf (Adobe format);
- mobi (commonly used on Amazon Kindle devices);
- and much more (ePub, djvu, fb2, etc.).
JSON
JavaScript Object Notation. You already know a little about this format! We talked about it in an earlier lesson, where we also covered serialization into JSON. It got its name for a reason: Java objects converted to JSON look almost exactly like objects in JavaScript. You don't need to know JavaScript to understand our object:
{
  "title": "War and Peace",
  "author": "Lev Tolstoy",
  "year": 1869
}
We're not limited to sending a single object. The JSON format can also represent an array of objects:
[
  {
    "title": "War and Peace",
    "author": "Lev Tolstoy",
    "year": 1869
  },
  {
    "title": "Demons",
    "author": "Fyodor Dostoyevsky",
    "year": 1872
  },
  {
    "title": "The Seagull",
    "author": "Anton Chekhov",
    "year": 1896
  }
]
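To make the link between a Java object and its JSON form concrete, here is a minimal sketch that writes the JSON by hand. The Book class and the BookJsonSketch helper are hypothetical names invented for this illustration, and the escaping is deliberately incomplete. In a real project you would normally let a JSON library do this work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class names, for illustration only.
class BookJsonSketch {

    static final class Book {
        final String title;
        final String author;
        final int year;

        Book(String title, String author, int year) {
            this.title = title;
            this.author = author;
            this.year = year;
        }
    }

    // Escapes backslashes and double quotes only; control characters are not handled.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // One Java object becomes one JSON object: field names turn into keys.
    static String toJson(Book b) {
        return "{\"title\": " + quote(b.title)
                + ", \"author\": " + quote(b.author)
                + ", \"year\": " + b.year + "}";
    }

    // A list of Java objects becomes a JSON array.
    static String toJson(List<Book> books) {
        List<String> parts = new ArrayList<>();
        for (Book b : books) {
            parts.add(toJson(b));
        }
        return "[" + String.join(", ", parts) + "]";
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book("War and Peace", "Lev Tolstoy", 1869),
                new Book("Demons", "Fyodor Dostoyevsky", 1872));
        System.out.println(toJson(books.get(0)));
        System.out.println(toJson(books));
    }
}
```

Notice that strings are quoted while the year is written as a bare number, just as in the examples above.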
Because JSON represents JavaScript objects, it supports the following JavaScript data types:
- strings;
- numbers;
- objects;
- arrays;
- booleans (true and false);
- null.
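One way to picture these six types from the Java side is a small writer that picks the JSON form based on the Java value it receives. This is only a sketch: the class name JsonTypesSketch and the sample fields (inPrint, translator, genres) are made up for the example, and special cases such as NaN or control characters in strings are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: shows which Java values line up with the six JSON types.
class JsonTypesSketch {

    // Escapes backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String write(Object value) {
        if (value == null) {
            return "null";                          // null
        }
        if (value instanceof String) {
            return quote((String) value);           // string
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();                // number, boolean
        }
        if (value instanceof List) {                // array
            List<String> items = new ArrayList<>();
            for (Object item : (List<?>) value) {
                items.add(write(item));
            }
            return "[" + String.join(", ", items) + "]";
        }
        if (value instanceof Map) {                 // object
            List<String> fields = new ArrayList<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                fields.add(quote(String.valueOf(e.getKey())) + ": " + write(e.getValue()));
            }
            return "{" + String.join(", ", fields) + "}";
        }
        throw new IllegalArgumentException("No JSON type for " + value.getClass());
    }

    public static void main(String[] args) {
        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "War and Peace");                     // string
        book.put("year", 1869);                                 // number
        book.put("inPrint", true);                              // boolean
        book.put("translator", null);                           // null
        book.put("genres", Arrays.asList("novel", "history"));  // array
        System.out.println(write(book));                        // the map itself is an object
    }
}
```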
What are the advantages of the JSON format?
Human-readable format. This is an obvious advantage if your end user is a human. For example, suppose your server has a database with a schedule of flights. A customer sitting at a computer at home requests data from this database through a web application. Because you need to provide the data in a format that a person can understand, JSON is a great solution.
Simplicity. It's super simple :) Above, we gave two examples of JSON files. Even if you haven't heard of JavaScript (let alone JavaScript objects), you can easily understand the sort of objects described there. The whole of the JSON documentation consists of a webpage with a couple of pictures.
Widespread use. JavaScript is the dominant front-end language, and JSON is its native data format, so using JSON is practically a must. As a result, a huge number of web services use JSON as their data exchange format. Every modern IDE supports the JSON format (including IntelliJ IDEA), and libraries for working with JSON have been written for all sorts of programming languages.
YAML
Initially, YAML stood for "Yet Another Markup Language". When it began, it was positioned as a competitor to XML. Now, with the passage of time, YAML has come to mean "YAML Ain't Markup Language". What is it exactly? Let's imagine that we need to create 3 classes to represent characters in a computer game: Warrior, Mage, and Thief. They will have the following characteristics: power, agility, stamina, and a set of weapons. Here's what a YAML file describing our classes would look like:
classes:
  class-1:
    title: Warrior
    power: 8
    agility: 4
    stamina: 7
    weapons:
      - sword
      - spear
  class-2:
    title: Mage
    power: 5
    agility: 7
    stamina: 5
    weapons:
      - magic staff
  class-3:
    title: Thief
    power: 6
    agility: 6
    stamina: 5
    weapons:
      - dagger
      - poison
A YAML file has a tree structure: some elements are nested in others. Nesting is controlled by indentation: each level is marked by a fixed number of leading spaces.
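To see how indentation alone can carry the tree structure, here is a small sketch that walks a nested Java Map and writes each level two spaces deeper than its parent. The class YamlIndentSketch is a hypothetical helper written for this lesson, not part of any YAML library, and it assumes simple values that need no quoting.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: handles nested maps, flat lists and plain scalar values.
class YamlIndentSketch {

    // Two spaces per nesting level (an assumption; any consistent width works).
    static String indent(int level) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < level * 2; i++) {
            sb.append(' ');
        }
        return sb.toString();
    }

    static void write(Map<String, Object> node, int level, StringBuilder out) {
        for (Map.Entry<String, Object> e : node.entrySet()) {
            Object v = e.getValue();
            out.append(indent(level)).append(e.getKey()).append(":");
            if (v instanceof Map) {
                out.append("\n");
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) v;
                write(child, level + 1, out);   // children go one level deeper
            } else if (v instanceof List) {
                out.append("\n");
                for (Object item : (List<?>) v) {
                    out.append(indent(level + 1)).append("- ").append(item).append("\n");
                }
            } else {
                out.append(" ").append(v).append("\n");
            }
        }
    }

    static String toYaml(Map<String, Object> root) {
        StringBuilder out = new StringBuilder();
        write(root, 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> warrior = new LinkedHashMap<>();
        warrior.put("title", "Warrior");
        warrior.put("power", 8);
        warrior.put("weapons", Arrays.asList("sword", "spear"));

        Map<String, Object> classes = new LinkedHashMap<>();
        classes.put("class-1", warrior);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("classes", classes);

        System.out.print(toYaml(root));
    }
}
```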
What are the advantages of the YAML format?
Human-readable. Again, even if you see a YAML file without a description, you can easily understand the objects that it describes. YAML is so human-readable that the website yaml.org is an ordinary YAML file :)
Compactness. The file structure is created using spaces: there's no need to use brackets or quotation marks.
Support for native data structures for programming languages. The huge advantage of YAML over JSON and many other formats is that it supports various data structures. They include:
- !!map: an unordered set of key-value pairs that cannot have duplicates;
- !!omap: an ordered sequence of key-value pairs that cannot have duplicates;
- !!pairs: an ordered sequence of key-value pairs that can have duplicates;
- !!set: an unordered set of values that are not equal to each other;
- !!seq: a sequence of arbitrary values.
You will recognize some of these structures from Java! :) This means that various data structures from programming languages can be serialized into YAML.
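As a rough guide, here is how these tags could line up with familiar Java collections. The pairing below is an analogy chosen for this sketch, not a statement of what any particular YAML library produces, and the class name YamlTagsSketch is hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: Java collections with the same properties as the YAML tags.
class YamlTagsSketch {

    // !!map: unique keys, no guaranteed order.
    static Map<String, Integer> map() {
        Map<String, Integer> m = new HashMap<>();
        m.put("power", 8);
        m.put("power", 9); // same key: the old value is replaced, not duplicated
        return m;
    }

    // !!omap: unique keys, insertion order is kept.
    static Map<String, Integer> omap() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("stamina", 7);
        m.put("agility", 4);
        m.put("power", 8);
        return m;
    }

    // !!pairs: ordered, and the same key may appear more than once.
    static List<Map.Entry<String, String>> pairs() {
        List<Map.Entry<String, String>> list = new ArrayList<>();
        list.add(new AbstractMap.SimpleEntry<>("weapon", "sword"));
        list.add(new AbstractMap.SimpleEntry<>("weapon", "spear"));
        return list;
    }

    // !!set: unordered, no equal values.
    static Set<String> set() {
        Set<String> s = new HashSet<>();
        s.add("dagger");
        s.add("dagger"); // ignored: already present
        s.add("poison");
        return s;
    }

    // !!seq: ordered values of any type, duplicates allowed.
    static List<Object> seq() {
        List<Object> l = new ArrayList<>();
        l.add("sword");
        l.add(8);
        l.add("sword");
        return l;
    }

    public static void main(String[] args) {
        System.out.println("!!map   " + map());
        System.out.println("!!omap  " + omap());
        System.out.println("!!pairs " + pairs());
        System.out.println("!!set   " + set().size() + " distinct values");
        System.out.println("!!seq   " + seq());
    }
}
```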
Ability to use anchors and aliases. These markers let you label an element in a YAML file and then refer to it elsewhere in the file if it occurs repeatedly. An anchor is created using the symbol &, and an alias is created using *.
Suppose we have a file describing books by Leo Tolstoy. In order to avoid writing out the author's name for each book, we simply create the leo anchor and refer to it using an alias when we need it:
books:
  book-1:
    title: War and Peace
    author: &leo Leo Tolstoy
    year: 1869
  book-2:
    title: Anna Karenina
    author: *leo
    year: 1873
  book-3:
    title: Family Happiness
    author: *leo
    year: 1859
When this file is parsed, the value "Leo Tolstoy" is substituted in the right places where we have our aliases.
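The substitution step can be pictured with a toy resolver. This sketch is not a YAML parser: it assumes one "key: value" pair per line and anchors placed only on simple text values, and the class name AnchorAliasSketch is invented for the example. A real YAML library handles anchors on whole nested structures as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of anchor (&) and alias (*) substitution on simple lines.
class AnchorAliasSketch {

    static List<String> resolve(List<String> lines) {
        Map<String, String> anchors = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            int colon = line.indexOf(": ");
            if (colon < 0) {            // e.g. "books:" has no inline value
                out.add(line);
                continue;
            }
            String key = line.substring(0, colon);   // keeps the leading indentation
            String value = line.substring(colon + 2).trim();
            if (value.startsWith("&")) {
                int space = value.indexOf(' ');
                if (space > 0) {
                    // "&leo Leo Tolstoy": remember the value under the name "leo"
                    String name = value.substring(1, space);
                    value = value.substring(space + 1);
                    anchors.put(name, value);
                }
            } else if (value.startsWith("*")) {
                // "*leo": look up the value remembered earlier
                String name = value.substring(1);
                if (!anchors.containsKey(name)) {
                    throw new IllegalArgumentException("Unknown alias: " + name);
                }
                value = anchors.get(name);
            }
            out.add(key + ": " + value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> yaml = Arrays.asList(
                "books:",
                "  book-1:",
                "    title: War and Peace",
                "    author: &leo Leo Tolstoy",
                "  book-2:",
                "    title: Anna Karenina",
                "    author: *leo");
        for (String line : resolve(yaml)) {
            System.out.println(line);
        }
    }
}
```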
Ability to embed data in other formats. For example, JSON:
books:
  [
    { "title": "War and Peace", "author": "Leo Tolstoy", "year": 1869 },
    { "title": "Anna Karenina", "author": "Leo Tolstoy", "year": 1873 },
    { "title": "Family Happiness", "author": "Leo Tolstoy", "year": 1859 }
  ]
Other serialization formats
XML
This format is based on a tag tree.
<book>
  <title>Harry Potter and the Philosopher’s Stone</title>
  <author>J. K. Rowling</author>
  <year>1997</year>
</book>
Each element consists of an opening tag and a closing tag (for example, <title> and </title>). Each element can have nested elements. In real projects, XML is a common format that holds its own alongside JSON and YAML. We have a separate lesson about XML.
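Reading such a tag tree in Java can be sketched with the DOM parser that ships with the JDK. The helper childText and the class name XmlBookSketch are hypothetical names for this example. The sketch assumes a small, trusted document with the requested tag directly inside the root element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Illustrative only: reads the text of one nested element from a small XML string.
class XmlBookSketch {

    static final String BOOK_XML =
            "<book>"
            + "<title>Harry Potter and the Philosopher\u2019s Stone</title>"
            + "<author>J. K. Rowling</author>"
            + "<year>1997</year>"
            + "</book>";

    static String childText(String xml, String tag) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        // The root element is <book>; look for the first nested element with this tag.
        Node node = doc.getDocumentElement().getElementsByTagName(tag).item(0);
        if (node == null) {
            throw new IllegalArgumentException("No <" + tag + "> element found");
        }
        return node.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(childText(BOOK_XML, "author"));
        System.out.println(Integer.parseInt(childText(BOOK_XML, "year")));
    }
}
```

Note that XML itself has no number type: the year arrives as text and has to be converted by the program.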
BSON (binary JSON)
As its name implies, BSON is very similar to JSON, but it is not human-readable and stores data in binary form. As a result, it is well suited to storing and transferring images and other attachments. In addition, BSON supports some data types not available in JSON. For example, a BSON document can include a date (stored as the number of milliseconds since the Unix epoch) or even a piece of JavaScript code. The popular MongoDB NoSQL database stores information in BSON format.
Position-based protocol
In some situations, we need to drastically reduce the amount of data sent (for example, if we have a lot of data and need to reduce the load). In this situation, we can use a position-based protocol, that is, send parameter values without the names of the parameters themselves.
"Leo Tolstoy" | "Anna Karenina" | 1873
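Data in this format takes noticeably less space than a full JSON file, because the parameter names are never transmitted. The trade-off is that the sender and the receiver must agree in advance on the order of the values.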
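Here is a minimal sketch of the idea in Java. It assumes a fixed field order (author, title, year) and a "|" separator with no quotes or padding, and it does not handle values that themselves contain "|". The class name PositionalSketch and its methods are hypothetical.

```java
// Illustrative only: values are identified by position, not by name.
class PositionalSketch {

    // The field order is the whole contract: author, then title, then year.
    static String encode(String author, String title, int year) {
        return author + "|" + title + "|" + year;
    }

    // The receiver relies on the same order to know what each value means.
    static String[] decode(String message) {
        String[] parts = message.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields, got " + parts.length);
        }
        return parts;
    }

    // The same record as compact JSON, for a size comparison.
    static String asJson(String author, String title, int year) {
        return "{\"title\":\"" + title + "\",\"author\":\"" + author + "\",\"year\":" + year + "}";
    }

    public static void main(String[] args) {
        String positional = encode("Leo Tolstoy", "Anna Karenina", 1873);
        String json = asJson("Leo Tolstoy", "Anna Karenina", 1873);
        System.out.println(positional + " -> " + positional.length() + " characters");
        System.out.println(json + " -> " + json.length() + " characters");
        System.out.println("title = " + decode(positional)[1]);
    }
}
```

How much space you save depends on how long the parameter names are compared to the values: the shorter the values, the bigger the share of the message that the names would have taken.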
Of course, there are other serialization formats, but you don't need to know all of them right now :) It's good if you are familiar with the current industry standard formats when developing applications, and remember their advantages and how they differ from one another.
And with this, our lesson comes to an end :)
Don't forget to solve a couple of tasks today!
Until next time! :)