1. Data streams

Rarely does a program exist as an island unto itself. Programs usually somehow interact with the "outside world". This can happen through reading data from the keyboard, sending messages, downloading pages from the Internet, or, conversely, uploading files to a remote server.

We can refer to all of these behaviors in one word: data exchange between the program and the outside world. Wait, that's not just one word.

Of course, data exchange itself can be divided into two parts: receiving data and sending data. For example, you read data from the keyboard using a Scanner object — this is receiving data. And you display data on the screen using a System.out.println() command — this is sending data.

In programming, the term "stream" is used to describe data exchange. Where did that term come from?

In real life, you can have a stream of water or a stream of consciousness. In programming, we have data streams.

Streams are a versatile tool. They allow the program to receive data from anywhere (input streams) and send data anywhere (output streams). Thus, there are two types:

  • An input stream is for receiving data
  • An output stream is for sending data

To make streams 'tangible', Java's creators wrote two classes: InputStream and OutputStream.

The InputStream class has a read() method that lets you read data from it. And the OutputStream class has a write() method that lets you write data to it. They have other methods as well, but more on that later.
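As a minimal sketch of these two methods, the example below writes three bytes with write() and reads them back with read(). ByteArrayOutputStream and ByteArrayInputStream are in-memory descendant classes from the JDK, chosen here only so the example needs no files; the class name StreamBasicsDemo and the method sumRoundTrip are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class StreamBasicsDemo {
    // Writes three bytes to an output stream, reads them back, and returns their sum.
    static int sumRoundTrip() {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            OutputStream output = sink;
            output.write(10);
            output.write(20);
            output.write(30);

            InputStream input = new ByteArrayInputStream(sink.toByteArray());
            int sum = 0;
            int value;
            while ((value = input.read()) != -1) { // -1 means the stream has ended
                sum += value;
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumRoundTrip()); // prints 60
    }
}
```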

Byte streams

What kind of data are we talking about? What format does it take? In other words, what data types do these classes support?

These are general-purpose classes, so they work with the most basic unit of data: the byte. An OutputStream can write bytes (and byte arrays), and an InputStream object can read bytes (or byte arrays). That's it: they don't support any other data types.

As a result, these streams are also called byte streams.

One feature of streams is that their data can only be read (or written) sequentially. You can't read data from the middle of a stream without reading all the data that comes before it.

This is how reading data from the keyboard works through the Scanner class: you read data from the keyboard sequentially, line by line. We read a line, then the next line, then the next line, and so on. Fittingly, the method for reading lines is called nextLine().

Writing data to an OutputStream also happens sequentially. A good example of this is console output. You output a line, followed by another and another. This is sequential output. You can't output the first line, then the tenth, and then the second. All data is written to an output stream only sequentially.

Character streams

You recently learned that strings are the second most popular data type, and indeed they are. A lot of information is passed around in the form of characters and whole strings. A computer excels at sending and receiving everything as bytes, but humans aren't that perfect.

Accounting for this fact, Java programmers wrote two more classes: Reader and Writer. The Reader class is analogous to the InputStream class, but its read() method reads characters (char) rather than bytes. The Writer class corresponds to the OutputStream class. And just like the Reader class, it works with characters (char), not bytes.

If we compare these four classes, we get the following picture:

               Bytes (byte)    Characters (char)
Reading data   InputStream     Reader
Writing data   OutputStream    Writer
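To see the character-oriented pair in action, here is a small sketch that reads chars from a Reader one at a time and writes them, upper-cased, to a Writer. StringReader and StringWriter are in-memory JDK classes picked for convenience; CharStreamDemo and upperCaseCopy are hypothetical names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

public class CharStreamDemo {
    // Copies text char by char from a Reader to a Writer, converting to upper case.
    static String upperCaseCopy(String text) {
        try (Reader reader = new StringReader(text);
             StringWriter writer = new StringWriter()) {
            int c;
            while ((c = reader.read()) != -1) { // read() returns a char as an int, or -1 at the end
                writer.write(Character.toUpperCase((char) c));
            }
            return writer.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(upperCaseCopy("stream")); // prints STREAM
    }
}
```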

Practical application

The InputStream, OutputStream, Reader and Writer classes are abstract, so nobody creates objects of them directly: on their own they are not associated with any concrete source from which data can be read (or destination into which data can be written). But these four classes have plenty of descendant classes that can do a lot.
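One practical consequence is that a method written against the parent class works with any descendant. The sketch below counts bytes through an InputStream parameter and is fed once from memory (ByteArrayInputStream) and once from a temporary file (FileInputStream). The class and method names are invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SubclassDemo {
    // Works with any InputStream descendant, because read() is declared in the parent class.
    static long countBytes(InputStream input) throws IOException {
        long count = 0;
        while (input.read() != -1) {
            count++;
        }
        return count;
    }

    // Counts bytes using an in-memory stream.
    static long countInMemory(byte[] data) {
        try (InputStream input = new ByteArrayInputStream(data)) {
            return countBytes(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts bytes using a file stream over a temporary file that is deleted afterwards.
    static long countInTempFile(byte[] data) {
        try {
            Path file = Files.createTempFile("subclass-demo", ".bin");
            try {
                Files.write(file, data);
                try (InputStream input = new FileInputStream(file.toFile())) {
                    return countBytes(input);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println(countInMemory(data));   // prints 3
        System.out.println(countInTempFile(data)); // prints 3
    }
}
```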


2. InputStream class

The InputStream class is interesting because it is the parent class for hundreds of descendant classes. It doesn't have any data of its own, but it does have methods that all of its derived classes inherit.

In general, it is rare for stream objects to store data internally. A stream is a tool for reading/writing data, but not storage. That said, there are exceptions.

Methods of the InputStream class and all its descendant classes:

Method                    Description
int read()                Reads one byte from the stream
int read(byte[] buffer)   Reads an array of bytes from the stream
byte[] readAllBytes()     Reads all the bytes from the stream
long skip(long n)         Skips n bytes in the stream (reads and discards them)
int available()           Estimates how many bytes can be read without blocking
void close()              Closes the stream

Let's briefly go through these methods:

read() method

The read() method reads one byte from the stream and returns it. You might be confused by the int return type. It was chosen so that the method can return every possible byte value (0 to 255, with the upper three bytes of the int set to zero) and still have a distinct value, -1, to signal that the stream has ended.
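A short sketch of what read() actually returns: the byte 0xFF comes back as 255 rather than as a negative number, and -1 appears only once the data has run out. The in-memory ByteArrayInputStream and the names ReadByteDemo and readAll are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class ReadByteDemo {
    // Calls read() once per byte plus one extra time, and returns every result.
    static int[] readAll(byte[] data) {
        try (InputStream input = new ByteArrayInputStream(data)) {
            int[] values = new int[data.length + 1];
            for (int i = 0; i < values.length; i++) {
                values[i] = input.read(); // 0..255 for data, -1 after the last byte
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] values = readAll(new byte[] {1, (byte) 0xFF});
        System.out.println(Arrays.toString(values)); // prints [1, 255, -1]
    }
}
```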

read(byte[] buffer) method

This is the second variant of the read() method. It lets you read a byte array from an InputStream all at once. The array that will store the bytes must be passed as an argument. The method returns a number: the number of bytes actually read, or -1 if the stream has already ended.

Let's say you have a 10 kilobyte buffer and you're reading data from a file using the FileInputStream class. If the file contains only 2 kilobytes, all the data will normally be loaded into the buffer array in one call, and the method will return the number 2048 (2 kilobytes).
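The same situation can be sketched without a file. Here 2048 bytes sit in an in-memory ByteArrayInputStream (an assumption that replaces the FileInputStream from the text), and the buffer holds 10 kilobytes. The names BufferReadDemo and readTwice are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class BufferReadDemo {
    // Reads twice into a 10 KB buffer from a source of the given size; returns both results.
    static int[] readTwice(int sourceSize) {
        try (InputStream input = new ByteArrayInputStream(new byte[sourceSize])) {
            byte[] buffer = new byte[10 * 1024];
            int first = input.read(buffer);  // number of bytes actually placed into the buffer
            int second = input.read(buffer); // nothing left, so this call signals the end
            return new int[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readTwice(2048))); // prints [2048, -1]
    }
}
```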

readAllBytes() method

A very convenient method (available since Java 9). It reads all the data from the InputStream until it runs out and returns it as a single byte array. This is very handy for reading small files. Large files may not physically fit in memory, in which case the method throws an OutOfMemoryError.
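A minimal sketch of readAllBytes(), assuming Java 9 or later and a small in-memory source; ReadAllBytesDemo and readText are illustrative names, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReadAllBytesDemo {
    // Reads the whole stream into one array and decodes it as UTF-8 text.
    static String readText(byte[] data) {
        try (InputStream input = new ByteArrayInputStream(data)) {
            byte[] all = input.readAllBytes(); // requires Java 9+
            return new String(all, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(readText(data)); // prints hello
    }
}
```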

skip(long n) method

This method allows you to skip the next n bytes in the InputStream object. Because the data is read strictly sequentially, the default implementation simply reads n bytes from the stream and discards them (some descendant classes can skip more efficiently).

The method returns the number of bytes that were actually skipped, which may be less than n (for example, if the stream ended before n bytes were skipped).
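The return value of skip() is easy to overlook, so here is a small sketch with five bytes in memory. SkipDemo and skipAndRead are made-up names, and ByteArrayInputStream stands in for any real source.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class SkipDemo {
    // Skips 2 bytes, reads one byte, then tries to skip far past the end of the data.
    static long[] skipAndRead() {
        byte[] data = {10, 20, 30, 40, 50};
        try (InputStream input = new ByteArrayInputStream(data)) {
            long skippedFirst = input.skip(2);   // skips 10 and 20
            int next = input.read();             // reads 30
            long skippedLast = input.skip(100);  // only 40 and 50 are left to skip
            return new long[] {skippedFirst, next, skippedLast};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(skipAndRead())); // prints [2, 30, 2]
    }
}
```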

int available() method

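The method returns an estimate of the number of bytes that can be read from the stream without blocking. For a file this is usually the number of bytes remaining, but the method is not a reliable way to detect the end of a stream.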
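A quick sketch of available() on an in-memory stream, where the value happens to be exact. For network or other slow sources the number may be smaller than what eventually arrives, so treat this only as an illustration; AvailableDemo and beforeAndAfterRead are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class AvailableDemo {
    // Reports available() before and after reading a single byte from a 4-byte source.
    static int[] beforeAndAfterRead() {
        try (InputStream input = new ByteArrayInputStream(new byte[] {1, 2, 3, 4})) {
            int before = input.available();
            input.read(); // consume one byte
            int after = input.available();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(beforeAndAfterRead())); // prints [4, 3]
    }
}
```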

void close() method

The close() method closes the data stream and releases the external resources associated with it. Once a stream is closed, no more data can be read from it.
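The sketch below shows both halves of that statement for a FileInputStream: a try-with-resources block calls close() automatically, and a read attempted afterwards fails with an IOException. The temporary file and the names CloseDemo and readFailsAfterClose exist only for this example. Note that not every descendant behaves this way: for ByteArrayInputStream, close() has no effect.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseDemo {
    // Returns true if reading from an already closed file stream throws an IOException.
    static boolean readFailsAfterClose() {
        try {
            Path file = Files.createTempFile("close-demo", ".bin");
            try {
                Files.write(file, new byte[] {1, 2, 3});
                InputStream input = new FileInputStream(file.toFile());
                try (InputStream resource = input) {
                    resource.read(); // works: the stream is still open
                } // close() is called automatically here
                try {
                    input.read();
                    return false;
                } catch (IOException expected) {
                    return true; // reading from a closed stream fails
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsAfterClose()); // prints true
    }
}
```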

Let's write an example program that copies a very large file. We cannot use the readAllBytes() method to read the entire file into memory. Example:

// Uses java.io.FileInputStream and java.io.FileOutputStream; the enclosing method must handle IOException
String src = "c:\\projects\\log.txt";
String dest = "c:\\projects\\copy.txt";

try (FileInputStream input = new FileInputStream(src);       // InputStream for reading from the file
     FileOutputStream output = new FileOutputStream(dest))   // OutputStream for writing to the file
{
   byte[] buffer = new byte[65536]; // 64 KB buffer into which we will read the data
   int real;
   while ((real = input.read(buffer)) != -1) // read data into the buffer until the stream ends
   {
      output.write(buffer, 0, real); // write the first real bytes from the buffer to the second stream
   }
}

In this example, we used two classes: FileInputStream is a descendant of InputStream for reading data from a file, and FileOutputStream is a descendant of OutputStream for writing data to a file. We will talk about the second class a little later.

Another interesting point here is the real variable. When the last block of data is read from a file, it could easily have less than 64KB of data. Accordingly, we need to output not the entire buffer, but only part of it — the first real bytes. This is exactly what happens in the write() method.
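To watch the partial last block happen, here is a sketch of the same copy loop over in-memory streams with a deliberately tiny buffer: ten bytes copied through a 4-byte buffer arrive as blocks of 4, 4 and 2. This variant ends the loop when read() returns -1. CopyDemo, copy and copyInMemory are illustrative names, not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class CopyDemo {
    // Copies everything from input to output through a reusable buffer; returns the byte count.
    static long copy(InputStream input, OutputStream output, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int real;
        while ((real = input.read(buffer)) != -1) {
            output.write(buffer, 0, real); // only the first real bytes are valid
            total += real;
        }
        return total;
    }

    // Convenience wrapper that copies a byte array through in-memory streams.
    static byte[] copyInMemory(byte[] data, int bufferSize) {
        try {
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            copy(new ByteArrayInputStream(data), output, bufferSize);
            return output.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        byte[] copied = copyInMemory(data, 4);
        System.out.println(Arrays.equals(data, copied)); // prints true
    }
}
```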



3. Reader class

The Reader class is a complete analogue of the InputStream class. The only difference is that it works with characters (char), not with bytes. Just like the InputStream class, the Reader class is not used anywhere on its own: it is the parent class for hundreds of descendant classes and defines common methods for all of them.

Methods of the Reader class (and all its descendant classes):

Method                    Description
int read()                Reads one char from the stream
int read(char[] buffer)   Reads a char array from the stream
long skip(long n)         Skips n chars in the stream (reads and discards them)
boolean ready()           Checks whether the stream can be read without blocking
void close()              Closes the stream

The methods are very similar to those of the InputStream class, although there are slight differences.

int read() method

This method reads one char from the stream and returns it. The char is widened to an int, so the upper two bytes of the result are always zero. When the stream has ended, the method returns -1 instead.
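A small sketch of the values Reader.read() produces, using an in-memory StringReader. The second character, written as \u00e9, is a single char even though it would occupy two bytes in UTF-8. ReaderReadDemo and readChars are hypothetical names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class ReaderReadDemo {
    // Calls read() once per char plus one extra time, and returns every result.
    static int[] readChars(String text) {
        try (Reader reader = new StringReader(text)) {
            int[] values = new int[text.length() + 1];
            for (int i = 0; i < values.length; i++) {
                values[i] = reader.read(); // a char code (0..65535), or -1 at the end
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readChars("A\u00e9"))); // prints [65, 233, -1]
    }
}
```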

int read(char[] buffer) method

This is the second variant of the read() method. It lets you read a char array from a Reader all at once. The array that will store the characters must be passed as an argument. The method returns a number: the number of characters actually read, or -1 if the stream has already ended.

skip(long n) method

This method allows you to skip the first n characters from the Reader object. It works exactly the same as the analogous method of the InputStream class. Returns the number of characters that were actually skipped.

boolean ready() method

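Returns true if characters can be read from the stream without blocking. Note that this is not an end-of-stream check: to detect the end, look for -1 from read().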
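The difference between "ready" and "has more data" shows up clearly with StringReader, whose ready() returns true for any open reader, even an empty one. The sketch below is only an illustration of that behavior; ReadyDemo and readyButEmpty are made-up names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadyDemo {
    // Returns true if an empty reader reports ready() while read() already signals the end.
    static boolean readyButEmpty() {
        try (Reader reader = new StringReader("")) {
            boolean ready = reader.ready();        // true: a read() call would not block
            boolean atEnd = reader.read() == -1;   // true: there is nothing to read
            return ready && atEnd;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readyButEmpty()); // prints true
    }
}
```

This is why a copy loop is safer when it stops on the -1 returned by read() than when it relies on ready().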

void close() method

The close() method closes the data stream and releases the external resources associated with it. Once a stream is closed, no more data can be read from it.

For comparison, let's write a program that copies a text file:

// Uses java.io.FileReader and java.io.FileWriter; the enclosing method must handle IOException
String src = "c:\\projects\\log.txt";
String dest = "c:\\projects\\copy.txt";

try (FileReader reader = new FileReader(src);     // Reader for reading from a file
     FileWriter writer = new FileWriter(dest))    // Writer for writing to a file
{
   char[] buffer = new char[65536]; // 128 KB buffer (each char takes 2 bytes)
   int real;
   while ((real = reader.read(buffer)) != -1) // read data into the buffer until the stream ends
   {
      writer.write(buffer, 0, real); // write the first real chars from the buffer to the second stream
   }
}