1. Reading a file in chunks: ByteBuffer and encoding
Nowadays we rarely deal with small text files. Typically these are huge server logs, reports, CSV files, or gigabyte data dumps. Therefore, it’s important not just to read a file, but to do it efficiently and without the application “freezing”.
An asynchronous approach helps with exactly this: it doesn’t block the main thread—be it the UI or server logic—allows you to read and write large volumes of data in parallel, and makes the application scalable when you need to work with several files at once.
The key thing to understand: asynchronous I/O doesn’t make the disk itself faster—there are no miracles. It simply lets your program avoid idling while the disk performs an operation and do other work in the meantime.
How does asynchronous reading work?
An asynchronous channel (AsynchronousFileChannel) reads not strings but blocks of bytes into a ByteBuffer object. It’s like carrying boxes of letters rather than individual words. After reading, you need to turn those bytes into strings—with the right charset!
Example: asynchronous file reading in blocks
Let’s write the simplest example of asynchronous reading of a file in 4096-byte blocks and printing the contents to the console.
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

public class AsyncTextReadExample {
    public static void main(String[] args) throws Exception {
        Path path = Path.of("bigfile.txt");
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            long position = 0; // file offsets are long, so files over 2 GB also work
            Future<Integer> future = channel.read(buffer, position);
            int bytesRead;
            // get() waits for the read to finish; it returns -1 at the end of the file
            while ((bytesRead = future.get()) > 0) {
                buffer.flip();
                // Convert bytes to a string (UTF-8)
                String chunk = StandardCharsets.UTF_8.decode(buffer).toString();
                System.out.print(chunk);
                buffer.clear();
                // Advance by the bytes actually read, not by the re-encoded string length
                position += bytesRead;
                future = channel.read(buffer, position);
            }
        }
    }
}
Important points:
- We read the file in parts (by buffer), not all at once.
- After reading, bytes are decoded into a string using Charset.
- Don’t forget buffer.clear(): after decoding, the buffer has no free space left, so without it the next read returns 0 bytes and the loop stops.
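The role of flip() and clear() is easier to see on a tiny buffer. The sketch below is only an illustration: the class BufferStateDemo and the helper state() are hypothetical names, and put()/get() stand in for what a channel read and a decoder do to the buffer.

```java
import java.nio.ByteBuffer;

public class BufferStateDemo {
    // Formats the three numbers that decide what the next operation can do.
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit() + " remaining=" + b.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);

        buffer.put(new byte[] {1, 2, 3});   // like a read: fills the buffer from position
        System.out.println(state(buffer));  // position=3 limit=8 remaining=5

        buffer.flip();                      // switch to "consume what was written"
        System.out.println(state(buffer));  // position=0 limit=3 remaining=3

        buffer.get(new byte[3]);            // like decoding: consumes up to limit
        System.out.println(state(buffer));  // position=3 limit=3 remaining=0

        buffer.clear();                     // make the whole buffer writable again
        System.out.println(state(buffer));  // position=0 limit=8 remaining=8
    }
}
```

After the data has been consumed, remaining is 0, which is exactly why a read into a buffer that was not cleared has nowhere to put new bytes.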
Why isn’t simply decoding bytes enough?
The trouble is that a character can be “split” between two blocks, especially when using a multibyte charset (for example, "UTF-8"). If the last byte in the buffer is half of a character, the next block will start with the “remainder” of that character. Without special handling you’ll get gibberish or even a decoding error.
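A small experiment makes the problem visible without any file at all. This is a minimal sketch: SplitCharDemo and decodeNaively are hypothetical names, and the byte array is cut into chunks by hand to imitate block boundaries.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SplitCharDemo {
    // Decodes every chunk on its own, the way the simple example above does.
    static String decodeNaively(byte[] bytes, int chunkSize) {
        StringBuilder out = new StringBuilder();
        for (int start = 0; start < bytes.length; start += chunkSize) {
            int length = Math.min(chunkSize, bytes.length - start);
            ByteBuffer chunk = ByteBuffer.wrap(bytes, start, length);
            out.append(StandardCharsets.UTF_8.decode(chunk));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "n\u00E9" is 'n' followed by e-acute, which takes two bytes in UTF-8: C3 A9
        byte[] bytes = "n\u00E9".getBytes(StandardCharsets.UTF_8);

        // All three bytes in one chunk: the text comes back intact
        System.out.println(decodeNaively(bytes, 3));

        // The boundary falls between C3 and A9: each half is invalid on its own
        // and is replaced with U+FFFD, the replacement character
        System.out.println(decodeNaively(bytes, 2));
    }
}
```

Charset.decode() replaces invalid input instead of throwing, so in this form the damage is silent: the program keeps running and prints replacement characters.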
2. Converting bytes to strings: handling splits
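The problem of split lines and characters
Suppose you have the string "Hello\nWorld\n", and the buffer ended at "Hel", while "lo\nWorld\n" ended up in the next block. If you treat each block as a set of complete lines, "Hel" and "lo" become two separate lines. The same thing happens one level lower: the bytes of a single multibyte character can end up in different blocks.
Solution: use CharsetDecoder
Java provides the CharsetDecoder class, which can correctly handle such cases. When you call it with endOfInput set to false, it decodes every complete character and leaves the bytes of an unfinished character in the input buffer. If you keep those bytes (buffer.compact() instead of buffer.clear()), the character is reconstructed correctly once the next block arrives.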
Example of using CharsetDecoder
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
ByteBuffer buffer = ... // your bytes
CharBuffer charBuffer = CharBuffer.allocate(buffer.capacity());
// false = more input may follow, so an unfinished trailing character is not an error
decoder.decode(buffer, charBuffer, false);
charBuffer.flip();  // Now charBuffer contains correctly decoded characters
buffer.compact();   // Keep the bytes of an unfinished character for the next read
In a real task you will keep a “leftover” between reads and decode with this leftover taken into account.
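One way to keep that leftover is to let the ByteBuffer itself hold it. The sketch below is an assumption about how such a loop could look, not the only possible design: ChunkedDecodeDemo and decodeInChunks are hypothetical names, and buffer.put(...) stands in for a channel read so the example runs without a file.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ChunkedDecodeDemo {
    // Feeds the bytes to ONE decoder in pieces of chunkSize, as separate reads would.
    static String decodeInChunks(byte[] bytes, int chunkSize) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Room for one chunk plus up to 3 bytes of an unfinished UTF-8 character
        ByteBuffer buffer = ByteBuffer.allocate(chunkSize + 4);
        CharBuffer chars = CharBuffer.allocate(chunkSize + 4);
        StringBuilder out = new StringBuilder();

        for (int start = 0; start < bytes.length; start += chunkSize) {
            int length = Math.min(chunkSize, bytes.length - start);
            buffer.put(bytes, start, length);      // stands in for channel.read(buffer, ...)
            buffer.flip();
            decoder.decode(buffer, chars, false);  // false: more input will follow
            chars.flip();
            out.append(chars);
            chars.clear();
            buffer.compact();                      // undecoded bytes move to the front
        }

        // End of input: decode whatever is left and flush the decoder
        buffer.flip();
        decoder.decode(buffer, chars, true);
        decoder.flush(chars);
        chars.flip();
        out.append(chars);
        return out.toString();
    }

    public static void main(String[] args) {
        // e-acute (2 bytes), euro sign (3 bytes), an emoji (4 bytes)
        String text = "n\u00E9 \u20AC \uD83D\uDE00";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        for (int size = 1; size <= 4; size++) {
            System.out.println(size + ": " + decodeInChunks(bytes, size).equals(text));
        }
    }
}
```

The important detail is compact() in place of clear(): it moves the undecoded bytes to the start of the buffer, so the next read appends to them instead of overwriting them.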
3. Asynchronous writing of text files
Reading is only half the story. Writing is also performed in blocks of bytes, which you must first obtain from strings (encode).
Example: asynchronous writing of a string to a file
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

public class AsyncTextWriteExample {
    public static void main(String[] args) throws Exception {
        Path path = Path.of("output.txt");
        String text = "Hello, world!\n";
        ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        try (AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                path, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            Future<Integer> future = channel.write(buffer, 0);
            // For demonstration, wait for completion (you usually shouldn't!)
            future.get();
            System.out.println("Data written asynchronously.");
        }
    }
}
Comment: In real asynchronous scenarios, you shouldn’t call future.get() on the main thread—it turns asynchronous code into synchronous code. It’s better to use CompletionHandler (see the previous lecture).
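As an illustration of the non-blocking variant, the write can be wrapped in a CompletableFuture that a CompletionHandler completes. This is a sketch under assumptions: AsyncWriteAllDemo, writeAll and writeAndReadBack are hypothetical names, and the handler also covers the case where a single write transfers only part of the buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

public class AsyncWriteAllDemo {
    // Writes the whole buffer starting at startPosition.
    // The returned future completes with the number of bytes written.
    static CompletableFuture<Long> writeAll(AsynchronousFileChannel channel,
                                            ByteBuffer buffer, long startPosition) {
        CompletableFuture<Long> done = new CompletableFuture<>();
        // The attachment carries the file offset at which the current write started
        channel.write(buffer, startPosition, startPosition,
                new CompletionHandler<Integer, Long>() {
                    @Override
                    public void completed(Integer written, Long position) {
                        long next = position + written;
                        if (buffer.hasRemaining()) {
                            // A write may be partial, so continue from the new offset
                            channel.write(buffer, next, next, this);
                        } else {
                            done.complete(next - startPosition);
                        }
                    }

                    @Override
                    public void failed(Throwable exc, Long position) {
                        done.completeExceptionally(exc);
                    }
                });
        return done;
    }

    // Hypothetical demo helper: writes text to the file and reads it back.
    static String writeAndReadBack(Path path, String text) {
        ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        try (AsynchronousFileChannel channel = AsynchronousFileChannel.open(path,
                StandardOpenOption.WRITE, StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            writeAll(channel, buffer, 0)
                    .thenAccept(n -> System.out.println("Wrote " + n + " bytes"))
                    .join(); // the demo has nothing else to do, so it waits here
            return Files.readString(path, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("async-write", ".txt");
        System.out.print(writeAndReadBack(path, "Hello, world!\n"));
        Files.delete(path);
    }
}
```

In an application with its own event loop or UI thread, the join() call would be dropped and the follow-up work chained with thenAccept or thenRun.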
4. Practice: asynchronously reading a large text file and counting lines
Let’s implement a practical task: asynchronously read a large text file, count its lines (by counting "\n" characters), and print the result to the console.
Example using CompletionHandler
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicLong;

public class AsyncLineCounter {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("bigfile.txt");
        AsynchronousFileChannel channel = AsynchronousFileChannel.open(path, StandardOpenOption.READ);
        ByteBuffer buffer = ByteBuffer.allocate(4096);
        AtomicLong position = new AtomicLong(0);
        // Replace invalid bytes instead of stopping on them
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        StringBuilder leftover = new StringBuilder();
        AtomicLong lines = new AtomicLong(0);

        channel.read(buffer, position.get(), null, new CompletionHandler<Integer, Object>() {
            @Override
            public void completed(Integer result, Object attachment) {
                if (result == -1) {
                    // File read to the end
                    if (leftover.length() > 0) lines.incrementAndGet();
                    System.out.println("Lines in file: " + lines.get());
                    try { channel.close(); } catch (IOException e) { e.printStackTrace(); }
                    return;
                }
                buffer.flip();
                CharBuffer charBuffer = CharBuffer.allocate(buffer.remaining());
                // false: more input follows, so an unfinished character stays in buffer
                decoder.decode(buffer, charBuffer, false);
                charBuffer.flip();
                String chunk = leftover.toString() + charBuffer.toString();
                leftover.setLength(0);
                // Count lines
                int last = 0;
                int idx;
                while ((idx = chunk.indexOf('\n', last)) != -1) {
                    lines.incrementAndGet();
                    last = idx + 1;
                }
                // Remainder (part of the line after the last \n)
                if (last < chunk.length()) {
                    leftover.append(chunk.substring(last));
                }
                // compact() keeps the bytes of a character split between blocks;
                // clear() would throw them away
                buffer.compact();
                position.addAndGet(result);
                channel.read(buffer, position.get(), null, this);
            }

            @Override
            public void failed(Throwable exc, Object attachment) {
                System.err.println("Read error: " + exc.getMessage());
                try { channel.close(); } catch (IOException e) { e.printStackTrace(); }
            }
        });

        // So the program doesn't exit too early (for demo only!)
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
- We use CompletionHandler for truly async code.
- After each read the buffer is decoded using CharsetDecoder.
- The remainder of a line that didn’t end with "\n" is carried over to the next block.
- After reaching the end of the file, if something remains in leftover, that also counts as a line.
- For simplicity, the example “sleeps” for 2000 ms so the asynchronous operation can complete (in real applications this isn’t needed—you usually have a main loop or UI).
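The fixed pause can be avoided by letting the handler signal completion itself. The sketch below is one possible shape, not a canonical one: AsyncLineCountFuture, countLines and countLinesInText are hypothetical names, and the result is delivered through a CompletableFuture. To stay short it counts "\n" characters directly and remembers only whether the text so far ends with a newline, instead of keeping the leftover string.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

public class AsyncLineCountFuture {
    // Starts the count and returns at once; the future completes with the line total.
    static CompletableFuture<Long> countLines(Path path) {
        final AsynchronousFileChannel channel;
        try {
            channel = AsynchronousFileChannel.open(path, StandardOpenOption.READ);
        } catch (IOException e) {
            return CompletableFuture.failedFuture(e);
        }
        CompletableFuture<Long> result = new CompletableFuture<>();
        ByteBuffer buffer = ByteBuffer.allocate(4096);
        CharBuffer chars = CharBuffer.allocate(4096);
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);

        channel.read(buffer, 0L, null, new CompletionHandler<Integer, Object>() {
            // Reads are chained one after another, so only one callback runs at a time
            private long position = 0;
            private long lines = 0;
            private boolean endsWithNewline = true; // an empty file has no last line

            @Override
            public void completed(Integer bytesRead, Object attachment) {
                boolean endOfFile = bytesRead == -1;
                buffer.flip();
                chars.clear();
                decoder.decode(buffer, chars, endOfFile);
                if (endOfFile) {
                    decoder.flush(chars);
                }
                chars.flip();
                while (chars.hasRemaining()) {
                    endsWithNewline = chars.get() == '\n';
                    if (endsWithNewline) {
                        lines++;
                    }
                }
                if (endOfFile) {
                    closeQuietly(channel);
                    // Text after the last '\n' counts as one more line
                    result.complete(endsWithNewline ? lines : lines + 1);
                    return;
                }
                buffer.compact(); // keep the bytes of a character split between blocks
                position += bytesRead;
                channel.read(buffer, position, null, this);
            }

            @Override
            public void failed(Throwable exc, Object attachment) {
                closeQuietly(channel);
                result.completeExceptionally(exc);
            }
        });
        return result;
    }

    private static void closeQuietly(AsynchronousFileChannel channel) {
        try {
            channel.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Hypothetical demo helper: writes text to a temp file and counts its lines.
    static long countLinesInText(String text) {
        try {
            Path path = Files.createTempFile("lines", ".txt");
            try {
                Files.writeString(path, text, StandardCharsets.UTF_8);
                return countLines(path).join(); // a real app would chain thenAccept(...)
            } finally {
                Files.delete(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Lines in file: " + countLinesInText("first\nsecond\nthird"));
    }
}
```

The caller decides how to wait: a console demo can call join(), while a server or UI would attach a callback and return immediately.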
5. Asynchronous writing of results to a file
Suppose we want to write the result (for example, the number of lines) to a new file—asynchronously as well.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AsyncWriteResult {
    public static void main(String[] args) throws IOException {
        String result = "Lines in file: 12345\n";
        ByteBuffer buffer = ByteBuffer.wrap(result.getBytes(StandardCharsets.UTF_8));
        Path path = Path.of("result.txt");
        AsynchronousFileChannel channel = AsynchronousFileChannel.open(
                path, StandardOpenOption.WRITE, StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING);

        channel.write(buffer, 0, null, new CompletionHandler<Integer, Object>() {
            @Override
            public void completed(Integer written, Object attachment) {
                System.out.println("Result written asynchronously!");
                try { channel.close(); } catch (IOException e) { e.printStackTrace(); }
            }

            @Override
            public void failed(Throwable exc, Object attachment) {
                System.err.println("Write error: " + exc.getMessage());
                try { channel.close(); } catch (IOException e) { e.printStackTrace(); }
            }
        });

        // So the program doesn't exit too early (for demo only!)
        try { Thread.sleep(500); } catch (InterruptedException e) {}
    }
}
6. Tips for handling partial data and charsets
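Partial characters and lines between blocks
If a character is split between two blocks, do not try to “glue” bytes together manually! Use one CharsetDecoder for the whole file and keep the undecoded bytes in the buffer (buffer.compact()), and no character will be lost. A line split between two blocks is handled one level higher: carry the text after the last "\n" over to the next block.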
Working with different charsets
"UTF-8" is the standard for modern applications, but if the file uses a different charset (for example, "Windows-1251" or "UTF-16"), use the corresponding Charset:
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
Charset charset = Charset.forName("Windows-1251");
CharsetDecoder decoder = charset.newDecoder();
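A decoder can also be asked to report a charset mismatch instead of silently producing gibberish. The sketch below is an illustration with hypothetical names (CharsetMismatchDemo, decodesCleanly); it uses ISO-8859-1 as the "other" charset only because StandardCharsets guarantees it on every JVM, and the same idea applies to Windows-1251.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Returns true if the bytes form valid text in the given charset.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "caf\u00E9" written as ISO-8859-1: the last letter is the single byte E9
        byte[] bytes = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(decodesCleanly(bytes, StandardCharsets.ISO_8859_1)); // true
        System.out.println(decodesCleanly(bytes, StandardCharsets.UTF_8));      // false

        // The lenient path does not fail; it substitutes U+FFFD for the bad byte
        String lenient = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(lenient.indexOf('\uFFFD') >= 0);                      // true
    }
}
```

A strict check like this only proves that the bytes are invalid in one charset. Many single-byte charsets accept any byte sequence, so a clean decode does not prove the charset is the right one.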
Using CharsetDecoder and CharsetEncoder
When you read or write data in parts, it’s important to handle the charset correctly. A character may be “split” between two blocks, and without extra handling you’ll get a mess of bytes.
To avoid this, use CharsetDecoder and CharsetEncoder.
When reading, call decode(ByteBuffer, CharBuffer, endOfInput), and when writing—encode(CharBuffer, ByteBuffer, endOfInput).
They ensure that even if a character ends up split between two blocks, it will still be assembled and handled correctly.
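For the writing side, a sketch of the same idea with CharsetEncoder might look as follows. The names ChunkedEncodeDemo and encodeInBlocks are hypothetical, and collecting the blocks in a list stands in for handing each one to channel.write(...). It relies on the encoder reporting overflow when the next character does not fit, so a block is expected to hold only whole characters.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ChunkedEncodeDemo {
    // Encodes text into blocks of at most blockSize bytes.
    static List<byte[]> encodeInBlocks(String text, int blockSize) {
        if (blockSize < 4) {
            // The longest UTF-8 sequence is 4 bytes; a smaller block could never hold it
            throw new IllegalArgumentException("blockSize must be at least 4");
        }
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer chars = CharBuffer.wrap(text);
        ByteBuffer block = ByteBuffer.allocate(blockSize);
        List<byte[]> blocks = new ArrayList<>();

        CoderResult result;
        do {
            // true: 'chars' already holds all the input there is
            result = encoder.encode(chars, block, true);
            if (result.isError()) {
                throw new IllegalArgumentException("Cannot encode input: " + result);
            }
            drain(block, blocks);
        } while (result.isOverflow()); // overflow = block was full, more text remains

        do {
            result = encoder.flush(block);
            drain(block, blocks);
        } while (result.isOverflow());
        return blocks;
    }

    // Copies the filled part of the block into the list and empties the block.
    private static void drain(ByteBuffer block, List<byte[]> blocks) {
        block.flip();
        if (block.hasRemaining()) {
            byte[] bytes = new byte[block.remaining()];
            block.get(bytes);
            blocks.add(bytes); // in a real program: channel.write(block, position, ...)
        }
        block.clear();
    }

    public static void main(String[] args) {
        // e-acute (2 bytes), euro sign (3 bytes), an emoji (4 bytes), then 'x'
        List<byte[]> blocks = encodeInBlocks("n\u00E9\u20AC\uD83D\uDE00x", 4);
        for (byte[] b : blocks) {
            System.out.println(b.length + " bytes: " + new String(b, StandardCharsets.UTF_8));
        }
    }
}
```

When the text itself arrives in pieces, the first loop would be called with endOfInput set to false for every piece except the last, and any characters the encoder leaves unread in the CharBuffer would be carried over with compact().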
7. Common mistakes in asynchronous processing of text files
Mistake No. 1: Ignoring leftover line fragments. If you don’t keep the “tail” of a line between blocks, some lines may be lost or decoded incorrectly.
Mistake No. 2: Incorrect buffer handling. If you forget buffer.flip() before decoding, or buffer.clear() (buffer.compact() when undecoded bytes must be kept) after processing, the next read won’t work or the data will be incorrect.
Mistake No. 3: Using the wrong charset. If bytes are decoded with a different Charset than was used when writing the file, you may get gibberish or even errors.
Mistake No. 4: Blocking the main thread. If you call future.get() or Thread.sleep() on the UI thread, you lose the point of asynchrony. Use CompletionHandler and reactive approaches.
Mistake No. 5: Not closing the channel after completion. Always close the channel (channel.close()) after all operations finish, even if an error occurred.