1. When multithreading helps
Multithreading helps when there’s plenty of work that can be done at the same time. For example, if you need to process dozens of files (copy them, recompute, or analyse them), it’s often faster to hand different files to different threads than to do everything sequentially. It’s like asking five friends to go through your photo archive instead of just one: the work gets done faster and is a bit more fun.
It’s especially useful for batch processing of files, downloading or copying large files in parts, or when, after reading data, you need to compute something in parallel for different parts.
But multithreading doesn’t always help. If you have one small file, spinning up a dozen threads for it is pointless. If the disk or network is already saturated, new threads will only slow things down. And if multiple threads write to the same file at the same time without synchronization — you can get real chaos with corrupted data.
Simply put, multithreading is a tool. Like a hammer: you can drive a nail with it — or hit your finger. The key is knowing when and how to use it.
2. Java tools for multithreaded I/O
You already know that Java has several ways to run tasks in parallel:
- Classic Thread — manual thread creation.
- Thread pools via ExecutorService — a modern, flexible, and convenient approach.
- CompletableFuture and parallel streams (Stream API) — for advanced tasks (we’ll cover them in later lectures).
Let’s start with the simplest: processing multiple files in different threads.
Example 1: Classic Thread
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

public class FileCopyTask extends Thread {
    private final Path source;
    private final Path target;

    public FileCopyTask(Path source, Path target) {
        this.source = source;
        this.target = target;
    }

    @Override
    public void run() {
        try {
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            System.out.println("File copied: " + source);
        } catch (IOException e) {
            System.err.println("Copy error for " + source + ": " + e.getMessage());
        }
    }
}

// Start several copies in separate threads (place this inside a method, e.g. main)
List<Path> filesToCopy = List.of(
    Path.of("log1.txt"), Path.of("log2.txt"), Path.of("log3.txt")
);
for (Path file : filesToCopy) {
    // one new thread per file; nothing here limits how many run at once
    new FileCopyTask(file, Path.of("backup_" + file.getFileName())).start();
}
Pros: Simple and clear.
Cons: Managing lots of threads by hand is inconvenient; no control over how many threads run at the same time.
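The loop above starts the threads but never waits for them. A minimal sketch of waiting with Thread.join() could look like this. The class name JoinAllCopies and the helper copyAll are made up for illustration, and the demo in main uses temporary directories rather than real logs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class JoinAllCopies {

    // Hypothetical helper: copies each source into targetDir on its own thread
    // and returns only after every thread has finished.
    public static void copyAll(List<Path> sources, Path targetDir) throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        for (Path source : sources) {
            Thread t = new Thread(() -> {
                try {
                    Files.copy(source, targetDir.resolve(source.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                } catch (IOException e) {
                    System.err.println("Copy error for " + source + ": " + e.getMessage());
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join(); // blocks until this thread's run() has returned
        }
    }

    public static void main(String[] args) throws Exception {
        Path srcDir = Files.createTempDirectory("src");
        Path dstDir = Files.createTempDirectory("dst");
        Path a = Files.writeString(srcDir.resolve("log1.txt"), "one");
        Path b = Files.writeString(srcDir.resolve("log2.txt"), "two");
        copyAll(List.of(a, b), dstDir);
        System.out.println("log1 copied: " + Files.exists(dstDir.resolve("log1.txt")));
        System.out.println("log2 copied: " + Files.exists(dstDir.resolve("log2.txt")));
    }
}
```

Keeping the list of threads and joining each one is exactly the kind of manual bookkeeping that a thread pool takes off your hands.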
Example 2: ExecutorService — thread pool
ExecutorService lets you hand tasks to a thread pool: you choose the maximum number of worker threads, and the pool queues the tasks and reuses those threads to run them.
import java.io.IOException;
import java.nio.file.*;
import java.util.List;
import java.util.concurrent.*;

public class MultiFileCopier {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(4); // max 4 threads
        List<Path> filesToCopy = List.of(
            Path.of("log1.txt"), Path.of("log2.txt"), Path.of("log3.txt")
        );
        for (Path file : filesToCopy) {
            executor.submit(() -> {
                try {
                    Files.copy(file, Path.of("backup_" + file.getFileName()), StandardCopyOption.REPLACE_EXISTING);
                    System.out.println("Copied: " + file);
                } catch (IOException e) {
                    System.err.println("Error: " + file + " " + e.getMessage());
                }
            });
        }
        executor.shutdown(); // stop accepting new tasks; submitted ones still run
        // wait up to 1 minute; false means the timeout expired before all tasks finished
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            System.err.println("Timed out before all files were copied");
        }
    }
}
Pros:
- Easy to scale (you can set the number of threads).
- Convenient control over task completion (methods shutdown(), awaitTermination(...)).
- Suitable for processing hundreds and thousands of files.
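Printing errors inside each task is fine for a demo, but often the caller needs to know which files failed. One possible sketch uses invokeAll(), which returns one Future per task. The class CopyWithResults and its copyAll helper are hypothetical names, not part of any library.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CopyWithResults {

    // Hypothetical helper: copies all sources into targetDir and
    // returns the sources that could NOT be copied.
    public static List<Path> copyAll(List<Path> sources, Path targetDir, int threads)
            throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Path>> tasks = new ArrayList<>();
            for (Path source : sources) {
                // A Callable may throw checked exceptions, so no try/catch is needed here
                tasks.add(() -> Files.copy(source, targetDir.resolve(source.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING));
            }
            // invokeAll blocks until every task has completed, normally or with an exception;
            // the futures come back in the same order as the tasks
            List<Future<Path>> futures = executor.invokeAll(tasks);
            List<Path> failed = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    futures.get(i).get(); // rethrows the task's exception, wrapped
                } catch (ExecutionException e) {
                    failed.add(sources.get(i));
                }
            }
            return failed;
        } finally {
            executor.shutdown();
        }
    }
}
```

A call such as copyAll(files, Path.of("backup"), 4) then returns an empty list when every copy succeeded, assuming the backup directory already exists.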
3. Problems and constraints of multithreaded I/O
Contention for resources
If you try to read or write the same file from multiple threads simultaneously without synchronization, the result will be unpredictable. It’s like two people writing on the same page of a book at the same time: you get a mess. To coordinate, use, for example, synchronized, explicit locks, or a dedicated writer thread.
File system and OS limitations
- Not all file systems handle concurrent writes to a single file well.
- The operating system may limit the number of simultaneously open files.
- Disks, and HDDs in particular, perform poorly under heavy random access.
Synchronization when writing to a shared resource
If multiple threads write to one file (for example, a log), always use synchronization (for instance, via synchronized, locks, or dedicated writer threads).
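The "dedicated writer thread" option can be sketched roughly as follows: worker threads only put messages into a queue, and a single thread owns the file. QueueLogger, the STOP marker, and the thread name are illustrative choices, not an established API.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueLogger implements AutoCloseable {
    // Unique marker object that tells the writer thread to stop (compared by identity)
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writerThread;

    public QueueLogger(Path file) {
        writerThread = new Thread(() -> {
            // Only this thread ever touches the file, so no locking is needed around write()
            try (BufferedWriter writer = Files.newBufferedWriter(file,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
                while (true) {
                    String message = queue.take(); // waits until a message is available
                    if (message == STOP) {
                        break;
                    }
                    writer.write(message);
                    writer.newLine();
                }
            } catch (IOException | InterruptedException e) {
                System.err.println("Log writer stopped: " + e.getMessage());
            }
        }, "log-writer");
        writerThread.start();
    }

    // Safe to call from any thread: the queue itself is thread-safe
    public void log(String message) {
        queue.add(message);
    }

    // Asks the writer to finish the queued messages, then waits for it
    @Override
    public void close() {
        queue.add(STOP);
        try {
            writerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Callers never block on disk I/O here. The trade-off is that messages still sitting in the queue are lost if the process dies before close() is called.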
Inefficiency with small files
For small files, the overhead of creating threads and context switching can exceed the gains from parallelism.
4. Practical examples
Parallel file copying
Suppose we have a folder full of logs to copy to an archive directory.
import java.io.IOException;
import java.nio.file.*;
import java.util.List;
import java.util.concurrent.*;

public class ParallelFileCopier {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Files.copy does not create missing parent directories, so create the target first
        Files.createDirectories(Path.of("archive"));
        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Path> filesToCopy = List.of(
            Path.of("log1.txt"), Path.of("log2.txt"), Path.of("log3.txt")
            // ... add as many files as you want
        );
        for (Path file : filesToCopy) {
            executor.submit(() -> {
                try {
                    Path target = Path.of("archive", file.getFileName().toString());
                    Files.copy(file, target, StandardCopyOption.REPLACE_EXISTING);
                    System.out.println("Copied: " + file);
                } catch (IOException e) {
                    System.err.println("Error: " + file + " " + e.getMessage());
                }
            });
        }
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.MINUTES);
    }
}
Comment:
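- We use a pool of 4 threads as a starting point. The right number depends on the storage device, so measure before changing it.
- More files do not by themselves call for more threads: the pool simply queues the extra tasks. Raising the pool to 8 may help on fast storage, but don’t make it too large.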
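The example hardcodes three file names, while the task was about a whole folder. A sketch that lists the folder first might look like this. ArchiveFolder and archive(...) are hypothetical names, and counting successes with an AtomicInteger is just one way to report the result.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ArchiveFolder {

    // Hypothetical helper: copies every regular file from sourceDir into archiveDir
    // and returns how many files were copied successfully.
    public static int archive(Path sourceDir, Path archiveDir, int threads)
            throws IOException, InterruptedException {
        Files.createDirectories(archiveDir);

        List<Path> files;
        // Files.list keeps a directory handle open, so close it with try-with-resources
        try (Stream<Path> entries = Files.list(sourceDir)) {
            files = entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicInteger copied = new AtomicInteger(); // safe to update from several threads
        for (Path file : files) {
            executor.submit(() -> {
                try {
                    Files.copy(file, archiveDir.resolve(file.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                    copied.incrementAndGet();
                } catch (IOException e) {
                    System.err.println("Error: " + file + " " + e.getMessage());
                }
            });
        }
        executor.shutdown();
        if (!executor.awaitTermination(10, TimeUnit.MINUTES)) {
            executor.shutdownNow(); // give up on whatever is still running
        }
        return copied.get();
    }
}
```

Because the thread count is a parameter, it is easy to time the same folder with 1, 4, and 8 threads and keep whichever is fastest on your hardware.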
Parallel processing of file lines with the Stream API
Since Java 8 you can use parallel streams to process file contents:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class ParallelLineProcessing {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("biglog.txt");
        // Files.lines returns a Stream<String> backed by an open file, so close it with try-with-resources
        try (Stream<String> lines = Files.lines(path)) {
            lines.parallel() // turn into a parallel stream
                 .filter(line -> line.contains("ERROR"))
                 // forEach on a parallel stream prints in no particular order
                 .forEach(line -> System.out.println("Error: " + line));
        }
    }
}
Important:
- Parallel streams speed up processing if it’s compute-heavy (CPU-bound), not the reading itself (I/O-bound).
- If line processing is trivial (e.g., just printing via System.out.println), there may be no speedup.
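To make the CPU-bound case concrete, here is a sketch where each line gets a deliberately expensive computation (a SHA-256 digest stands in for real work). ParallelLineHasher and both method names are invented for this example. The file is read sequentially, and only the computation runs in parallel.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelLineHasher {

    // CPU-heavy stand-in for real per-line work: the SHA-256 digest as a hex string
    public static String sha256Hex(String line) {
        try {
            // MessageDigest is not thread-safe, so each call creates its own instance
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(line.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    public static List<String> hashLines(Path path) throws IOException {
        List<String> lines = Files.readAllLines(path); // the I/O part stays sequential
        return lines.parallelStream()                  // only the computation is parallel
                .map(ParallelLineHasher::sha256Hex)
                .collect(Collectors.toList());         // result keeps the original line order
    }
}
```

Collecting into a list instead of printing from inside the stream also keeps the results in file order, which forEach on a parallel stream does not guarantee.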
Reading/writing different parts of a single large file
Java lets you read or write different regions of the same file concurrently using FileChannel and positional methods. This is advanced, but the idea is simple: each thread works with its own chunk of the file.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FileChunkReader implements Runnable {
    private final Path path;
    private final long position;
    private final long size;

    public FileChunkReader(Path path, long position, long size) {
        this.path = path;
        this.position = position;
        this.size = size;
    }

    @Override
    public void run() {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate((int) size); // the chunk size must fit in an int
            // A single read() may return fewer bytes than requested,
            // so keep reading until the chunk is full or the file ends
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position + buffer.position());
                if (n < 0) {
                    break; // end of file
                }
            }
            buffer.flip(); // switch the buffer from filling to reading
            System.out.println("Read " + buffer.remaining() + " bytes from position " + position);
            // You can process the buffer here
        } catch (IOException e) {
            System.err.println("Chunk read error: " + e.getMessage());
        }
    }
}

// Example run: read the file in 1 MB chunks with 4 threads
// (place this inside a method that declares IOException and InterruptedException)
Path file = Path.of("bigdata.bin");
long fileSize = Files.size(file);
long chunkSize = 1024 * 1024; // 1 MB
int chunks = (int) Math.ceil((double) fileSize / chunkSize);
ExecutorService executor = Executors.newFixedThreadPool(4);
for (int i = 0; i < chunks; i++) {
    long position = i * chunkSize;
    long size = Math.min(chunkSize, fileSize - position); // the last chunk may be shorter
    executor.submit(new FileChunkReader(file, position, size));
}
executor.shutdown();
executor.awaitTermination(10, TimeUnit.MINUTES);
Comment:
- Each thread reads its own chunk of the file without interfering with others.
- This approach is used, for example, in torrents and download managers.
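The same chunking idea works for writing. In the sketch below each task opens its own channel and writes its block at its own offset with the positional write(buffer, position). ParallelChunkWriter and fill(...) are hypothetical names, and the test pattern (block i is filled with the byte value i) exists only so the result is easy to check.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelChunkWriter {

    // Hypothetical helper: writes `chunks` blocks of `chunkSize` bytes into target,
    // one task per block, each block at its own non-overlapping offset.
    public static void fill(Path target, int chunks, int chunkSize, int threads)
            throws IOException, InterruptedException {
        // Create (or truncate) the file up front so every task can open it for writing
        Files.write(target, new byte[0]);

        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < chunks; i++) {
            final long position = (long) i * chunkSize;
            final byte value = (byte) i;
            executor.submit(() -> {
                byte[] data = new byte[chunkSize];
                Arrays.fill(data, value);
                try (FileChannel channel = FileChannel.open(target, StandardOpenOption.WRITE)) {
                    ByteBuffer buffer = ByteBuffer.wrap(data);
                    long offset = position;
                    // write() may accept fewer bytes than offered, so loop until the buffer is empty
                    while (buffer.hasRemaining()) {
                        offset += channel.write(buffer, offset);
                    }
                } catch (IOException e) {
                    System.err.println("Chunk write error at " + position + ": " + e.getMessage());
                }
            });
        }
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.MINUTES);
    }
}
```

The ranges never overlap, so no locking is needed: the offsets themselves keep the tasks apart.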
Synchronization when writing to a shared file
If several threads write to the same file (e.g., a log), you must synchronize access to avoid a “hodgepodge” of lines:
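import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

public class SafeLogger implements Closeable {
    private final Writer writer;

    public SafeLogger(String filename) throws IOException {
        // true = append to the file instead of overwriting it
        this.writer = new BufferedWriter(new FileWriter(filename, true));
    }

    // synchronized: one thread at a time writes a whole message, so lines never interleave
    public synchronized void log(String message) throws IOException {
        writer.write(message);
        writer.write(System.lineSeparator());
        writer.flush();
    }

    // synchronized as well, so close() cannot run in the middle of a log() call
    @Override
    public synchronized void close() throws IOException {
        writer.close();
    }
}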
Comment:
- The log method is marked synchronized so only one thread writes to the file at a time.
- This works, but with many threads it can become a bottleneck — it’s better to write to separate files and then merge them.
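The "separate files, then merge" alternative might be sketched like this: each worker writes its own part file with no locking at all, and the parts are concatenated once everyone is done. PerThreadLogs, the part-N.log naming, and merged.log are assumptions made for the example.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PerThreadLogs {

    // Hypothetical helper: every worker writes linesPerWorker lines into its own file,
    // then all part files are concatenated into dir/merged.log in worker order.
    public static Path writeAndMerge(Path dir, int workers, int linesPerWorker)
            throws IOException, InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(workers);
        List<Path> parts = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int id = w;
            Path part = dir.resolve("part-" + id + ".log");
            parts.add(part);
            executor.submit(() -> {
                // No synchronization needed: this file belongs to one task only
                try (BufferedWriter writer = Files.newBufferedWriter(part)) {
                    for (int i = 0; i < linesPerWorker; i++) {
                        writer.write("worker " + id + " line " + i);
                        writer.newLine();
                    }
                } catch (IOException e) {
                    System.err.println("Write error for " + part + ": " + e.getMessage());
                }
            });
        }
        executor.shutdown();
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("Workers did not finish in time");
        }

        // Merge only after all workers are done
        Path merged = dir.resolve("merged.log");
        try (OutputStream out = Files.newOutputStream(merged)) {
            for (Path part : parts) {
                Files.copy(part, out); // appends the whole part to the open stream
            }
        }
        return merged;
    }
}
```

The merged file is grouped by worker rather than ordered by time. If chronological order matters, each line would need a timestamp and the merge step would have to sort.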
5. When not to use multithreading
Multithreading is tempting: more threads means everything should fly! In practice, not always. If you’re processing a couple of small files, it’s simpler and more reliable to do it sequentially. The time you spend starting threads and coordinating them just won’t pay off.
Sometimes the problem isn’t the disk at all, but the network — then adding threads won’t speed anything up because the bottleneck is elsewhere. Another trap is parallel writes to the same file. If you don’t have much experience with synchronization, it’s better not to try: the chances of ending up with corrupted data are high.
And finally, if your disk or file system doesn’t like being hammered by dozens of threads, multithreading won’t save you — it will only make things worse.
Simply put, if it seems like “the more threads, the better,” that’s usually not true. Sometimes one calm thread does the job cleaner, faster, and more reliably than a dozen rushed ones.
6. A quick intro to FileChannel for advanced tasks
FileChannel from the java.nio.channels package is a low-level tool for working with files, allowing you to read and write data at arbitrary positions. This enables, for example, parallel downloading or processing of large files in chunks.
Example:
try (FileChannel channel = FileChannel.open(Path.of("bigfile.bin"), StandardOpenOption.READ)) {
    ByteBuffer buffer = ByteBuffer.allocate(1024);
    long position = 0;
    // Reads up to 1024 bytes starting at position 0;
    // returns the number of bytes actually read, or -1 at end of file
    int bytesRead = channel.read(buffer, position);
    // process the buffer
}
Important:
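- FileChannel is safe to share between threads. Positional calls such as read(buffer, position) and write(buffer, position) do not touch the channel’s current position, so several threads can use them on one channel concurrently.
- The relative calls read(buffer) and write(buffer) all advance one shared position. If several threads use them on the same channel, each gets whichever bytes come next, so coordinate those calls yourself or simply open a separate channel per thread.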
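A small sketch can show why the positional methods suit chunked work: a positional read leaves the channel's own position untouched, while a relative read advances it. PositionDemo and readBothWays are illustrative names, and the offset 2048 is arbitrary.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionDemo {

    // Returns { position after a positional read, bytes returned by a relative read,
    //           position after that relative read }
    public static long[] readBothWays(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);

            channel.read(buffer, 2048);             // positional: explicit offset
            long afterPositional = channel.position();

            buffer.clear();
            int n = channel.read(buffer);           // relative: uses and advances the channel position
            long afterRelative = channel.position();

            return new long[] {afterPositional, n, afterRelative};
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("position-demo", ".bin");
        Files.write(file, new byte[4096]);
        long[] result = readBothWays(file);
        System.out.println("position after positional read: " + result[0]);
        System.out.println("bytes read by relative read:    " + result[1]);
        System.out.println("position after relative read:   " + result[2]);
    }
}
```

The first value stays at 0 even though data was read from offset 2048, while the last value equals the number of bytes the relative read returned.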
7. Common mistakes with multithreaded I/O
Error No. 1: Unsynchronized writes to a single file.
As a result — corrupted data, strange characters, sometimes the file won’t open at all. Always synchronize access or write to separate files.
Error No. 2: Too many threads.
If you open 1000 threads just to copy 1000 files, your machine may protest (OutOfMemoryError, sluggishness, crashes). Use a thread pool (ExecutorService) and limit the count.
Error No. 3: Streams/files aren’t closed.
Every open stream is an OS resource. If you don’t close them, you can hit the “Too many open files” error. Use try-with-resources or remember to call close().
Error No. 4: Premature program termination.
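If you don’t wait for the tasks to finish (e.g., forget to call executor.awaitTermination(...)), the main thread moves on while files are still being copied: it may report success, read incomplete copies, or call System.exit() too early. Don’t forget executor.shutdown() either: without it, the pool’s worker threads can keep the JVM running after main returns.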
Error No. 5: Parallel writes to the same file region without positional control.
If several threads write to the same area of a file, the data will get mixed up. For positional writes, use channels and clearly separated ranges.