1. Method distinct: removing duplicates
In real-world tasks, you often need not only to filter and transform data, but also to select only unique elements from a collection, limit the result size, or, conversely, skip the first few elements. For example:
- Get a list of unique user names.
- Take only the first 10 entries to display on a page.
- Skip the first 5 elements (for example, when implementing pagination — “show from the 2nd page”).
For such tasks, the Stream API provides special methods: distinct, limit, skip.
How does distinct work?
The distinct() method returns a new stream where all duplicate elements are removed. Duplicates are determined using the equals and hashCode methods of the corresponding class.
Imagine you are a numismatist, and it’s important that your main collection has no duplicates. All repeated coins go to the exchange fund. That’s exactly what distinct() can do — filter out duplicates from the main collection.
Example: unique user names
List<String> names = List.of(
"Alice", "Bob", "Alice", "Eva", "Bob", "Denis", "Gleb", "Eva"
);
Get the list of unique names:
List<String> uniqueNames = names.stream()
.distinct()
.collect(Collectors.toList());
System.out.println(uniqueNames);
// Output: [Alice, Bob, Eva, Denis, Gleb]
Example: unique user emails
Suppose we have a User class:
public class User {
    String name;
    String email;
    // Constructor and toString() for convenience
    public User(String name, String email) {
        this.name = name;
        this.email = email;
    }
    @Override
    public String toString() {
        return name + " <" + email + ">";
    }
}
A list of users with duplicate emails:
List<User> users = List.of(
new User("Alice", "alice@mail.com"),
new User("Bob", "bob@mail.com"),
new User("Eva", "eva@mail.com"),
new User("Alice2", "alice@mail.com"), // duplicate email!
new User("Gleb", "gleb@mail.com"),
new User("Eva2", "eva@mail.com") // duplicate email!
);
If you just call users.stream().distinct(), duplicates won’t be removed, because User doesn’t override equals and hashCode. In this case, distinct falls back to the implementations inherited from Object, which compare references: two User objects count as duplicates only if they are the very same instance.
Solution: Override equals and hashCode so that uniqueness is determined by email.
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
User user = (User) o;
return Objects.equals(email, user.email);
}
@Override
public int hashCode() {
return Objects.hash(email);
}
Now:
List<User> uniqueUsers = users.stream()
.distinct()
.collect(Collectors.toList());
uniqueUsers.forEach(System.out::println);
// Alice <alice@mail.com>
// Bob <bob@mail.com>
// Eva <eva@mail.com>
// Gleb <gleb@mail.com>
Important: For your own classes, always override equals and hashCode if you want distinct to work “properly”!
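To make the difference concrete, here is a minimal, self-contained sketch that runs distinct over two objects with the same email, once for a class without equals/hashCode and once for a class that compares by email. The class names PlainUser and EmailUser and the helper methods are hypothetical and exist only for this illustration.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class DistinctEqualsDemo {

    // Hypothetical class WITHOUT equals/hashCode: only reference equality applies.
    static class PlainUser {
        final String name;
        final String email;
        PlainUser(String name, String email) { this.name = name; this.email = email; }
    }

    // Hypothetical class WITH equals/hashCode based on email only.
    static class EmailUser {
        final String name;
        final String email;
        EmailUser(String name, String email) { this.name = name; this.email = email; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(email, ((EmailUser) o).email);
        }

        @Override
        public int hashCode() { return Objects.hash(email); }
    }

    // Both objects survive distinct, because they are different instances.
    static long countDistinctPlain() {
        return List.of(
                new PlainUser("Alice", "alice@mail.com"),
                new PlainUser("Alice2", "alice@mail.com"))
            .stream().distinct().count();
    }

    // Only the first object per email survives distinct.
    static List<String> distinctNamesByEmail() {
        return List.of(
                new EmailUser("Alice", "alice@mail.com"),
                new EmailUser("Alice2", "alice@mail.com"))
            .stream().distinct().map(u -> u.name).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countDistinctPlain());   // 2
        System.out.println(distinctNamesByEmail()); // [Alice]
    }
}
```

For an ordered source such as a List, distinct keeps the first occurrence of each duplicate, which is why "Alice" stays and "Alice2" is dropped.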
2. Method limit: limiting the number of elements
The method limit(long maxSize) returns a new stream that contains no more than the first maxSize elements of the original stream.
Analogy: You walk into a pastry shop and want to try only 3 pastries out of 100. limit(3) — and you get exactly the first three; the rest — “not today”.
Example: first 3 names
List<String> firstThree = names.stream()
.limit(3)
.collect(Collectors.toList());
System.out.println(firstThree);
// Output: [Alice, Bob, Alice]
Example with our app: the first 3 users in the list
List<User> firstUsers = users.stream()
.limit(3)
.collect(Collectors.toList());
firstUsers.forEach(System.out::println);
// Alice <alice@mail.com>
// Bob <bob@mail.com>
// Eva <eva@mail.com>
Example with sorting
You can combine with sorting — for example, take the top‑2 users with the shortest email:
List<User> top2ShortEmail = users.stream()
.sorted(Comparator.comparingInt(u -> u.email.length()))
.limit(2)
.collect(Collectors.toList());
top2ShortEmail.forEach(System.out::println);
// Bob <bob@mail.com>
// Eva <eva@mail.com>
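Because limit is a short-circuiting operation, it can also cut off a stream that would otherwise never end. The sketch below assumes a hypothetical helper firstPowersOfTwo built on Stream.iterate; it is an illustration, not part of the user example above.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LimitInfiniteDemo {

    // Stream.iterate produces an endless sequence 1, 2, 4, 8, ...
    // limit(count) stops the pipeline after the first `count` elements.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2)
                .limit(count)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5)); // [1, 2, 4, 8, 16]
    }
}
```

Without the limit call, collecting this stream would never finish.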
3. Method skip: skipping the first elements
The method skip(long n) returns a new stream where the first n elements of the original stream are skipped.
Example: skip the first 2 names
List<String> afterTwo = names.stream()
.skip(2)
.collect(Collectors.toList());
System.out.println(afterTwo);
// Output: [Alice, Eva, Bob, Denis, Gleb, Eva]
Example: pagination
Often on a website you need to show, for example, “3 users per page.” For the second page, you need to skip the first 3 and take the next 3:
int pageSize = 3;
int pageNumber = 2; // Second page
List<User> page = users.stream()
.skip(pageSize * (pageNumber - 1))
.limit(pageSize)
.collect(Collectors.toList());
page.forEach(System.out::println);
// Alice2 <alice@mail.com>
// Gleb <gleb@mail.com>
// Eva2 <eva@mail.com>
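The skip-then-limit pattern can be wrapped in a small reusable method. The following is a minimal sketch, assuming a hypothetical helper named page with a 1-based page number; the name, signature, and argument checks are illustrative choices.

```java
import java.util.List;
import java.util.stream.Collectors;

class PageDemo {

    // Hypothetical helper: pageNumber is 1-based, as in the example above.
    static <T> List<T> page(List<T> items, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        // Compute the offset as a long so that large page numbers cannot overflow int.
        long toSkip = (long) pageSize * (pageNumber - 1);
        return items.stream()
                .skip(toSkip)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "Alice", "Bob", "Alice", "Eva", "Bob", "Denis", "Gleb", "Eva");
        System.out.println(page(names, 1, 3)); // [Alice, Bob, Alice]
        System.out.println(page(names, 3, 3)); // [Gleb, Eva]
        System.out.println(page(names, 4, 3)); // []
    }
}
```

The last page may be shorter than pageSize, and a page past the end of the data is simply empty.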
4. Combining distinct, limit, skip
These methods can and should be combined — depending on the task.
Example: get 2 unique names starting from the third in order
List<String> result = names.stream()
.distinct() // Remove duplicates: [Alice, Bob, Eva, Denis, Gleb]
.skip(2) // Skip Alice and Bob: [Eva, Denis, Gleb]
.limit(2) // Take only two: [Eva, Denis]
.collect(Collectors.toList());
System.out.println(result);
// Output: [Eva, Denis]
Example with filtering
Suppose you need to get the first 2 unique emails that contain the letter “a”:
List<String> emails = users.stream()
.map(user -> user.email)
.filter(email -> email.contains("a"))
.distinct()
.limit(2)
.collect(Collectors.toList());
System.out.println(emails);
// Output: [alice@mail.com, bob@mail.com]
// (every address passes the filter, because the domain "mail.com" itself contains an "a")
5. Practice: exercises
Task 1. Get a list of unique user names whose length is greater than 3 characters
List<String> longUniqueNames = names.stream()
.filter(name -> name.length() > 3)
.distinct()
.collect(Collectors.toList());
System.out.println(longUniqueNames);
// Output: [Alice, Denis, Gleb]
Task 2. Get the 3rd and 4th unique email from the user list
List<String> thirdAndFourthEmail = users.stream()
.map(user -> user.email)
.distinct()
.skip(2)
.limit(2)
.collect(Collectors.toList());
System.out.println(thirdAndFourthEmail);
// Output: [eva@mail.com, gleb@mail.com]
Task 3. Get the first 5 unique numbers greater than 10
List<Integer> numbers = List.of(5, 12, 17, 5, 23, 17, 42, 19, 12, 8);
List<Integer> result = numbers.stream()
.filter(n -> n > 10)
.distinct()
.limit(5)
.collect(Collectors.toList());
System.out.println(result);
// Output: [12, 17, 23, 42, 19]
6. Visual diagram: order of operations
graph TD
A[Source list] --> B[filter]
B --> C[distinct]
C --> D[skip]
D --> E[limit]
E --> F[collect]
Note:
Usually you apply filter first, then remove duplicates with distinct, then use skip and limit, and finally — collect. But you can change the order if the task requires it.
7. Common mistakes when working with distinct, limit, skip
Error #1: Expecting that distinct removes duplicates by “any” criterion.
In reality, distinct relies on the object’s equals and hashCode methods. If you want uniqueness by a specific field (for example, only by email) rather than by the whole object, you need to either override equals/hashCode accordingly or use a workaround, such as collecting with Collectors.toMap() keyed by that field, or additional filtering.
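One possible form of the Collectors.toMap() workaround is sketched below. The helper distinctBy is hypothetical (it is not part of the Stream API), and the nested User class is a minimal stand-in for the one from the lesson, deliberately left without equals/hashCode.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DistinctByKeyDemo {

    // Minimal stand-in for the User class; no equals/hashCode override.
    static class User {
        final String name;
        final String email;
        User(String name, String email) { this.name = name; this.email = email; }
    }

    // Hypothetical helper: keeps the first element seen for each key.
    static <T, K> List<T> distinctBy(List<T> items, Function<T, K> keyOf) {
        Map<K, T> firstByKey = items.stream().collect(Collectors.toMap(
                keyOf,                    // map key: the field that defines uniqueness
                Function.identity(),      // map value: the element itself
                (first, second) -> first, // on a key clash keep the earlier element
                LinkedHashMap::new));     // preserve encounter order
        return new ArrayList<>(firstByKey.values());
    }

    static List<String> uniqueNamesByEmail() {
        List<User> users = List.of(
                new User("Alice", "alice@mail.com"),
                new User("Bob", "bob@mail.com"),
                new User("Eva", "eva@mail.com"),
                new User("Alice2", "alice@mail.com"),
                new User("Gleb", "gleb@mail.com"),
                new User("Eva2", "eva@mail.com"));
        return distinctBy(users, u -> u.email).stream()
                .map(u -> u.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(uniqueNamesByEmail()); // [Alice, Bob, Eva, Gleb]
    }
}
```

This approach leaves the class itself untouched, which is useful when equality by email would be wrong elsewhere in the program.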
Error #2: Mixing up the order of operations.
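If you apply limit first and then distinct, you may get fewer unique elements than you expected: limit keeps the first N elements, duplicates included, and distinct then shrinks that already truncated selection. To get N unique elements, call distinct before limit.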
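A short sketch shows how swapping limit and distinct changes the result on the names list from the first section. The class and method names are made up for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class OrderMattersDemo {

    static final List<String> NAMES = List.of(
            "Alice", "Bob", "Alice", "Eva", "Bob", "Denis", "Gleb", "Eva");

    // limit first: takes [Alice, Bob, Alice], then distinct leaves only two names.
    static List<String> limitThenDistinct() {
        return NAMES.stream()
                .limit(3)
                .distinct()
                .collect(Collectors.toList());
    }

    // distinct first: removes duplicates, then limit takes three unique names.
    static List<String> distinctThenLimit() {
        return NAMES.stream()
                .distinct()
                .limit(3)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(limitThenDistinct()); // [Alice, Bob]
        System.out.println(distinctThenLimit()); // [Alice, Bob, Eva]
    }
}
```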
Error #3: Skipping more elements than exist.
If you try to skip more elements than the stream contains, the result will simply be an empty list — there will be no error, but the outcome may surprise you.
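The behavior is easy to check with a small sketch. The class and method names are illustrative; the second method additionally shows that a negative argument is rejected with an IllegalArgumentException, as documented for skip.

```java
import java.util.List;
import java.util.stream.Collectors;

class SkipTooManyDemo {

    // Skipping more elements than the stream has yields an empty list, not an exception.
    static List<String> skipBeyondEnd() {
        return List.of("Alice", "Bob", "Eva").stream()
                .skip(10)
                .collect(Collectors.toList());
    }

    // A negative argument, in contrast, is rejected immediately.
    static boolean negativeSkipIsRejected() {
        try {
            List.of("Alice").stream().skip(-1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(skipBeyondEnd());          // []
        System.out.println(negativeSkipIsRejected()); // true
    }
}
```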
Error #4: Overlooked performance.
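The methods distinct, limit, and skip are stateful, and they can become expensive on very large streams. distinct has to remember every element it has already seen, and in ordered parallel pipelines all three must preserve encounter order, which requires extra buffering. In most everyday tasks this isn’t a problem, but if you work with millions of records, it’s worth considering.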
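The Stream documentation notes that in parallel pipelines, dropping the ordering constraint with unordered() may make distinct cheaper. The sketch below only illustrates where the call goes; the method name is hypothetical, and whether it actually helps depends on the data and should be measured.

```java
import java.util.stream.IntStream;

class UnorderedDistinctDemo {

    // Counts distinct remainders; the count does not depend on encounter order,
    // so it is safe to tell the pipeline that order is irrelevant.
    static long countDistinctRemainders(int upTo, int modulus) {
        return IntStream.range(0, upTo)
                .parallel()
                .map(i -> i % modulus)
                .boxed()
                .unordered() // release the ordering constraint before distinct
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctRemainders(1_000_000, 1000)); // 1000
    }
}
```

Use unordered() only when the order of the results really doesn’t matter, for example when counting or collecting into a set.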
Error #5: Forgotten to override equals/hashCode for your classes.
If you’re working with custom objects (for example, User), then without overriding these methods, distinct will treat objects as different even if they are “logically” the same.