Streams for data input

Java Syntax Zero
Level 16, Lesson 2

1. Data streams

Rarely does a program exist as an island unto itself. Programs usually somehow interact with the "outside world". This can happen through reading data from the keyboard, sending messages, downloading pages from the Internet, or, conversely, uploading files to a remote server.

We can refer to all of these behaviors in one word: data exchange between the program and the outside world. Wait, that's not just one word.

Of course, data exchange itself can be divided into two parts: receiving data and sending data. For example, you read data from the keyboard using a Scanner object — this is receiving data. And you display data on the screen using a System.out.println() command — this is sending data.

In programming, the term "stream" is used to describe data exchange. Where did that term come from?

In real life, you can have a stream of water or a stream of consciousness. In programming, we have data streams.

Streams are a versatile tool. They allow the program to receive data from anywhere (input streams) and send data anywhere (output streams). Thus, there are two types:

  • An input stream is for receiving data
  • An output stream is for sending data

To make streams 'tangible', Java's creators wrote two classes: InputStream and OutputStream.

The InputStream class has a read() method that lets you read data from it. And the OutputStream class has a write() method that lets you write data to it. They have other methods as well, but more on that later.
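As a minimal sketch of these two methods, the example below moves bytes one at a time from an input stream to an output stream. It uses the in-memory classes ByteArrayInputStream and ByteArrayOutputStream, which this lesson does not cover; they are an assumption made here only so that the example runs without any files. The class and method names (ReadWriteDemo, copyOneByte) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ReadWriteDemo {
    // Moves a single byte from the input stream to the output stream.
    // Returns false once the input stream has no more data.
    static boolean copyOneByte(InputStream in, OutputStream out) throws IOException {
        int value = in.read();   // the next byte, or -1 at the end of the stream
        if (value == -1) {
            return false;
        }
        out.write(value);        // writes the low 8 bits of value
        return true;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {72, 105}); // ASCII codes for "Hi"
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (copyOneByte(in, out)) {
            // keep going until the input is exhausted
        }
        System.out.println(new String(out.toByteArray(), StandardCharsets.US_ASCII)); // prints "Hi"
    }
}
```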

Byte streams

What kind of data are we talking about? What format does it take? In other words, what data types do these classes support?

These are general-purpose classes, so they work with the most basic unit of data: the byte. An OutputStream can write bytes (and byte arrays), and an InputStream object can read bytes (or byte arrays). That's it: they don't support any other data types.

As a result, these streams are also called byte streams.

One feature of streams is that their data can only be read (or written) sequentially. You can't read data from the middle of a stream without reading all the data that comes before it.

This is how reading data from the keyboard works through the Scanner class: you read data from the keyboard sequentially, line by line. We read a line, then the next line, then the next line, and so on. Fittingly, the method for reading lines is called nextLine().

Writing data to an OutputStream also happens sequentially. A good example of this is console output. You output a line, followed by another and another. This is sequential output. You can't output the first line, then the tenth, and then the second. All data is written to an output stream only sequentially.

Character streams

You recently learned that strings are the second most popular data type, and indeed they are. A lot of information is passed around in the form of characters and whole strings. A computer excels at sending and receiving everything as bytes, but humans aren't that perfect.

Accounting for this fact, Java programmers wrote two more classes: Reader and Writer. The Reader class is analogous to the InputStream class, but its read() method reads characters (char) rather than bytes. The Writer class corresponds to the OutputStream class and, just like the Reader class, it works with characters (char), not bytes.

If we compare these four classes, we get the following picture:

                Bytes (byte)     Characters (char)
Reading data    InputStream      Reader
Writing data    OutputStream     Writer
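To make the difference between bytes and characters concrete, here is a small sketch that counts the same data twice: once as bytes through an InputStream and once as characters through a Reader. It assumes the text is encoded as UTF-8 and uses InputStreamReader and ByteArrayInputStream, neither of which is introduced in this lesson. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class BytesVersusChars {
    // Counts the bytes delivered by an InputStream over the data.
    static int countBytes(byte[] data) throws IOException {
        int count = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    // Counts the chars delivered by a Reader that decodes the same data as UTF-8.
    static int countChars(byte[] data) throws IOException {
        int count = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "\u00e9" is the letter e with an acute accent; it takes two bytes in UTF-8.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data) + " bytes, " + countChars(data) + " chars"); // 6 bytes, 5 chars
    }
}
```

The byte stream sees six separate values, while the character stream combines the two bytes of the accented letter into one char.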

Practical application

The InputStream, OutputStream, Reader and Writer classes are abstract, so nobody creates objects of these types directly: on their own they are not associated with any concrete source from which data can be read (or destination into which data can be written). But these four classes have plenty of descendant classes that can do a lot.


2. InputStream class

The InputStream class is interesting because it is the parent class for hundreds of descendant classes. It doesn't have any data of its own, but it does have methods that all of its derived classes inherit.

In general, it is rare for stream objects to store data internally. A stream is a tool for reading/writing data, but not storage. That said, there are exceptions.

Methods of the InputStream class and all its descendant classes:

Method                     Description
int read()                 Reads one byte from the stream
int read(byte[] buffer)    Reads an array of bytes from the stream
byte[] readAllBytes()      Reads all the bytes from the stream
long skip(long n)          Skips n bytes in the stream (reads and discards them)
int available()            Returns an estimate of how many bytes can be read right now without waiting
void close()               Closes the stream

Let's briefly go through these methods:

read() method

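The read() method reads one byte from the stream and returns it. You might be confused by the int return type. The byte is returned as a value from 0 to 255, so the upper three bytes of the int are always zero. The int type also leaves room for a special value: when the stream has ended and there is nothing left to read, the method returns -1.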
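The sketch below shows what read() actually returns. It collects the returned int values from an in-memory ByteArrayInputStream (used here only for illustration) and stops when read() signals the end of the stream with -1. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class ReadReturnValue {
    // Collects every value returned by read() until the stream reports its end with -1.
    static List<Integer> readAll(InputStream in) throws IOException {
        List<Integer> values = new ArrayList<>();
        int value;
        while ((value = in.read()) != -1) {
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // (byte) 0xFF is -1 as a Java byte, but read() returns it as 255.
        InputStream in = new ByteArrayInputStream(new byte[] {0, 127, (byte) 0xFF});
        System.out.println(readAll(in)); // prints [0, 127, 255]
    }
}
```

Because a data byte always comes back in the range 0 to 255, it can never be confused with the end-of-stream marker -1.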

read(byte[] buffer) method

This is the second variant of the read() method. It lets you read a byte array from an InputStream all at once. The array that will store the bytes must be passed as an argument. The method returns the number of bytes actually read, or -1 if the stream has already ended.

Let's say you have a 10 kilobyte buffer and you're reading data from a file using the FileInputStream class. If the file contains only 2 kilobytes, all the data will be loaded into the buffer array, and the method will return the number 2048 (2 kilobytes).
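The same situation can be sketched on a small scale: a 10-byte buffer and a source that holds only 4 bytes. An in-memory ByteArrayInputStream stands in for the file here (an assumption for the sake of a runnable example), and the names PartialBufferRead and readTwice are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class PartialBufferRead {
    // Calls read(buffer) twice in a row and returns both results.
    static int[] readTwice(InputStream in, byte[] buffer) throws IOException {
        int first = in.read(buffer);   // number of bytes placed at the start of the buffer
        int second = in.read(buffer);  // -1 if nothing is left
        return new int[] {first, second};
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30, 40});
        byte[] buffer = new byte[10]; // larger than the available data
        System.out.println(Arrays.toString(readTwice(in, buffer))); // prints [4, -1]
    }
}
```

Only the first 4 elements of the buffer hold data after the first call; the remaining 6 keep whatever they contained before.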

readAllBytes() method

A very good method (available since Java 9). It just reads all the data from the InputStream until it runs out and returns it as a single byte array. This is very handy for reading small files. Large files may not physically fit in memory, in which case the method fails with an OutOfMemoryError.
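A minimal sketch of reading a small file in one call might look like this. It assumes Java 9 or later (where readAllBytes() exists) and creates a temporary file only so that the example can run anywhere; the class and method names are hypothetical.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadAllBytesDemo {
    // Reads the whole file into memory; only suitable for small files.
    static byte[] readSmallFile(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("read-all-demo", ".txt"); // hypothetical sample file
        Files.write(temp, "hello".getBytes(StandardCharsets.UTF_8));
        byte[] content = readSmallFile(temp.toString());
        System.out.println(new String(content, StandardCharsets.UTF_8)); // prints "hello"
        Files.delete(temp);
    }
}
```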

skip(long n) method

This method lets you skip the next n bytes of the InputStream. Because the data is read strictly sequentially, you can think of it as reading the next n bytes from the stream and discarding them.

It returns the number of bytes that were actually skipped, which may be less than n (for example, if the stream ended before n bytes were skipped).
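Here is a short sketch of skip() on a five-byte in-memory stream (ByteArrayInputStream is used only to keep the example self-contained; the names SkipDemo and skipReadSkip are hypothetical). It shows both a full skip and a skip that runs into the end of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class SkipDemo {
    // Skips 2 bytes, reads one byte, then asks to skip more bytes than remain.
    static long[] skipReadSkip(InputStream in) throws IOException {
        long skipped = in.skip(2);        // bytes actually skipped
        int next = in.read();             // the byte right after the skipped ones
        long skippedAtEnd = in.skip(10);  // fewer than 10 bytes are left
        return new long[] {skipped, next, skippedAtEnd};
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        System.out.println(Arrays.toString(skipReadSkip(in))); // prints [2, 3, 2]
    }
}
```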

int available() method

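The method returns an estimate of the number of bytes that can be read from the stream right now without waiting. For a file this is normally the number of bytes remaining, but for other sources (such as a network connection) it can be 0 even though more data will arrive later.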
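As an illustration, the sketch below watches available() shrink while bytes are read from an in-memory ByteArrayInputStream, where the value is exact. That exactness is a property of this particular class; other streams may report only part of what remains. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class AvailableDemo {
    // Records available() before reading, after one read, and after reading everything.
    static int[] track(InputStream in) throws IOException {
        int atStart = in.available();
        in.read();
        int afterOne = in.available();
        while (in.read() != -1) {
            // drain the rest of the stream
        }
        int atEnd = in.available();
        return new int[] {atStart, afterOne, atEnd};
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {7, 8, 9});
        System.out.println(Arrays.toString(track(in))); // prints [3, 2, 0]
    }
}
```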

void close() method

The close() method closes the data stream and releases the external resources associated with it. Once a stream is closed, no more data can be read from it.
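The following sketch illustrates both points with a FileInputStream: a try-with-resources statement that closes the stream automatically, and an attempt to read after an explicit close(), which fails with an IOException. The temporary file and the names CloseDemo, firstByte and readAfterCloseFails are hypothetical and exist only for this example.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class CloseDemo {
    // try-with-resources calls close() automatically when the block ends.
    static int firstByte(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            return in.read();
        }
    }

    // Returns true if reading from an already closed file stream throws an IOException.
    static boolean readAfterCloseFails(String path) throws IOException {
        InputStream in = new FileInputStream(path);
        in.close();
        try {
            in.read();
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("close-demo", ".bin"); // hypothetical sample file
        Files.write(temp, new byte[] {42});
        System.out.println(firstByte(temp.toString()));           // prints 42
        System.out.println(readAfterCloseFails(temp.toString())); // prints true
        Files.delete(temp);
    }
}
```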

Let's write an example program that copies a very large file. We cannot use the readAllBytes() method to read the entire file into memory. Example:

// Requires: import java.io.FileInputStream; import java.io.FileOutputStream;
String src = "c:\\projects\\log.txt";
String dest = "c:\\projects\\copy.txt";

try (FileInputStream input = new FileInputStream(src);       // InputStream for reading from the file
     FileOutputStream output = new FileOutputStream(dest))   // OutputStream for writing to the file
{
   byte[] buffer = new byte[65536];        // buffer into which we will read the data (64 KB)
   while (input.available() > 0)           // as long as there is data in the stream
   {
      int real = input.read(buffer);       // read data into the buffer
      output.write(buffer, 0, real);       // write the data from the buffer to the second stream
   }
}

In this example, we used two classes: FileInputStream is a descendant of InputStream for reading data from a file, and FileOutputStream is a descendant of OutputStream for writing data to a file. We will talk about the second class a little later.

Another interesting point here is the real variable. When the last block of data is read from a file, it could easily have less than 64KB of data. Accordingly, we need to output not the entire buffer, but only part of it — the first real bytes. This is exactly what happens in the write() method.
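A common alternative way to write the same loop is to let read() itself decide when to stop: it returns -1 once the stream has ended, so the loop does not need available() at all. The sketch below is one possible version under that approach, not the lesson's own solution. It keeps the real variable from the example above, and the class name, method name, and temporary files are hypothetical.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class CopyUntilEnd {
    // Copies src to dest in 64 KB blocks and returns the total number of bytes copied.
    static long copy(String src, String dest) throws IOException {
        long total = 0;
        try (InputStream input = new FileInputStream(src);
             OutputStream output = new FileOutputStream(dest)) {
            byte[] buffer = new byte[65536];
            int real;
            // read() returns -1 when the stream has ended, which stops the loop
            while ((real = input.read(buffer)) != -1) {
                output.write(buffer, 0, real); // only the first real bytes are valid
                total += real;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("copy-src", ".bin");   // hypothetical sample files
        Path dest = Files.createTempFile("copy-dest", ".bin");
        Files.write(src, new byte[130 * 1024]); // 130 KB: two full buffers plus a 2 KB tail
        System.out.println(copy(src.toString(), dest.toString())); // prints 133120
        Files.delete(src);
        Files.delete(dest);
    }
}
```

This form also works for sources where available() can legitimately return 0 before the data has ended, such as network streams.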



3. Reader class

The Reader class is a complete analogue of the InputStream class. The only difference is that it works with characters (char), not with bytes. Just like the InputStream class, the Reader class is not used anywhere on its own: it is the parent class for hundreds of descendant classes and defines common methods for all of them.

Methods of the Reader class (and all its descendant classes):

Method                     Description
int read()                 Reads one char from the stream
int read(char[] buffer)    Reads a char array from the stream
long skip(long n)          Skips n chars in the stream (reads and discards them)
boolean ready()            Checks whether the stream can be read right now without waiting
void close()               Closes the stream

The methods are very similar to those of the InputStream class, although there are slight differences.

int read() method

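This method reads one char from the stream and returns it. The char is returned as an int with a value from 0 to 65535, so the upper two bytes of the result are always zero. When the stream has ended and there is nothing left to read, the method returns -1.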
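In practice, the int returned by read() is cast back to char once it has been checked against -1. The sketch below does this with a StringReader, an in-memory Reader that is not part of this lesson and is used only to keep the example self-contained. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReaderReadDemo {
    // Reads characters one at a time until read() returns -1 and joins them into a String.
    static String readAllChars(Reader reader) throws IOException {
        StringBuilder result = new StringBuilder();
        int value;
        while ((value = reader.read()) != -1) {
            result.append((char) value); // safe cast: value is in the range 0..65535 here
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAllChars(new StringReader("Hi"))); // prints "Hi"
    }
}
```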

int read(char[] buffer) method

This is the second variant of the read() method. It lets you read a char array from a Reader all at once. The array that will store the characters must be passed as an argument. The method returns the number of characters actually read, or -1 if the stream has already ended.

skip(long n) method

This method lets you skip the next n characters of the Reader object. It works exactly the same as the analogous method of the InputStream class. It returns the number of characters that were actually skipped.

boolean ready() method

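Returns true if the stream is ready to be read, that is, if the next read() call is guaranteed not to wait for input. For a reader attached to a file, this normally means there are still unread characters in the stream.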
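One caveat is worth a short sketch: what ready() reports depends on the kind of Reader. With the in-memory StringReader (used here purely as an illustration, it is not covered in this lesson), ready() stays true even after every character has been consumed, because reading from a string never has to wait. The value returned by read() is the dependable end-of-stream signal. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadyCaveat {
    // Consumes the whole reader, then reports what ready() and read() say at the end.
    static String stateAtEnd(Reader reader) throws IOException {
        while (reader.read() != -1) {
            // drain all characters
        }
        return "ready=" + reader.ready() + ", read=" + reader.read();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(stateAtEnd(new StringReader("abc"))); // prints ready=true, read=-1
    }
}
```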

void close() method

The close() method closes the data stream and releases the external resources associated with it. Once a stream is closed, no more data can be read from it.

For comparison, let's write a program that copies a text file:

// Requires: import java.io.FileReader; import java.io.FileWriter;
String src = "c:\\projects\\log.txt";
String dest = "c:\\projects\\copy.txt";

try (FileReader reader = new FileReader(src);     // Reader for reading from a file
     FileWriter writer = new FileWriter(dest))    // Writer for writing to a file
{
   char[] buffer = new char[65536];        // buffer into which we will read the data (65,536 chars = 128 KB)
   while (reader.ready())                  // as long as there is data in the stream
   {
      int real = reader.read(buffer);      // read data into the buffer
      writer.write(buffer, 0, real);       // write the data from the buffer to the second stream
   }
}
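FileReader and FileWriter created this way convert between bytes and characters using the platform's default charset. When the encoding of the file is known, it can be stated explicitly. The sketch below is one possible variant, assuming the file is UTF-8 and using InputStreamReader and OutputStreamWriter, which this lesson does not cover. It also ends the loop when read() returns -1 rather than calling ready(). The class name, method name, and temporary files are hypothetical.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TextCopyUtf8 {
    // Copies a UTF-8 text file and returns the number of chars copied.
    static long copyText(String src, String dest) throws IOException {
        long total = 0;
        try (Reader reader = new InputStreamReader(new FileInputStream(src), StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(new FileOutputStream(dest), StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int real;
            while ((real = reader.read(buffer)) != -1) { // -1 means the text has ended
                writer.write(buffer, 0, real);           // write only the chars actually read
                total += real;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("text-src", ".txt");   // hypothetical sample files
        Path dest = Files.createTempFile("text-dest", ".txt");
        String text = "na\u00efve caf\u00e9\n"; // 11 chars, 13 bytes in UTF-8
        Files.write(src, text.getBytes(StandardCharsets.UTF_8));
        System.out.println(copyText(src.toString(), dest.toString())); // prints 11
        Files.delete(src);
        Files.delete(dest);
    }
}
```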

Comments (11)
Jose Javier Level 68 Expert
1 August 2024
Let me share my experience: So far, I have not been able to complete the "Mixed-up bytes" exercise, it is so difficult (the others are still pending). Honestly, even after I checked the solution, it was not completely clear. The second challenge was to understand why the solution was not finding my files ("FileNotFoundException"). Then I realized the problem was the location of the files, even though they were in the same path as Solution.class. I resolved this with the terminal in IntelliJ: I used "pwd" to check my current path/location, "ls" to check directories, and "cd" to move from one directory to another. That way I was able to find the correct path and use it in the program. Note: I created the files (file1.txt and file2.txt) manually and wrote some random lines in file1.txt so that I could see the results in file2.txt after running the program.
Evgeniia Shabaeva Level 40, Budapest, Hungary
31 May 2024
So, I've finished this lesson and what really bothers me after that is why not teach us properly about the Files class and its methods. Paths would be of great help, too, as long as we want to fully understand how to do the suggested tasks.
Rene Level 28, Netherlands
12 March 2024
For the 'Mixed-up bytes' task: It did not pass validation in IntelliJ (5 failed requirements), but DID pass validation in the web editor. So upload or copy there and validate from the site, not from IntelliJ is my advice.
Parsa Level 83, Bangalore, India Expert
5 January 2024
It was pretty confusing, especially this part:

Another interesting point here is the real variable.
I'll just elaborate on it a bit in case there are others like me: Suppose the file you're trying to copy is 130KB. Here's what happens in this block:

while (input.available() > 0)
   {
      int real = input.read(buffer);
      output.write(buffer, 0, real);
   }
First pass: real = 65536 (the buffer size, 64KB), so we write the first 64KB. Second pass: real = 65536 again, so we write the next 64KB. On the third pass real is only 2048 (2KB), so instead of writing 2KB of data followed by 62KB left over from the previous pass, we write only the 2KB. That's why the real variable is used here and we don't say

      output.write(buffer, 0, size);
Rene Level 28, Netherlands
12 March 2024
To add to this: The write() method used here takes these parameters (byte[] buffer, int offset, int length). So you write from the 0th position until the end of the received input (int real gives back the number of bytes actually read)
Pavel281185 Level 38, Prague, Czech Republic
21 June 2024
It is fine to sout the "real" int to see how much was in buffer and change buffer size and file size.
Xm Level 24, Israel
22 November 2023
why in "Mixed-up bytes" task, in correct solution, there is no closing streams...
Parsa Level 83, Bangalore, India Expert
5 January 2024
Because they're using try-with-resources. Declaring all three of these in the parentheses after the try keyword will automatically invoke their close methods at the end of the try block.

try (Scanner scanner = new Scanner(System.in);
     FileInputStream fis = new FileInputStream(scanner.nextLine());
     FileOutputStream fos = new FileOutputStream(scanner.nextLine())) { ... }
Matar29 Level 16, Italy Expert
5 April 2023
I suggest trying the second task before the first one. It's easier and could help to understand how to use particular classes, such as Files and Paths.
Ravi sahu Level 30, CodeGym University in India, India
11 March 2023
Overall, it can be a good understanding for novice learners.
Hiyo Level 24
29 January 2023
Not very good. Most of the tasks are hardcoded to pass only if you followed the exact solution... For people looking to get familiar with FileIO in Java, I strongly suggest trying to build your own File Explorer system using Java with Console log and do simple things like show files in given directory path, read text-files in directory, change their file names etc ( plenty of ways to do this )