Stream API subsets: distinct, limit, skip

JAVA 25 SELF
Level 30 , Lesson 2

1. Method distinct: removing duplicates

In real-world tasks, you often need not only to filter and transform data, but also to select only unique elements from a collection, limit the result size, or, conversely, skip the first few elements. For example:

  • Get a list of unique user names.
  • Take only the first 10 entries to display on a page.
  • Skip the first 5 elements (for example, when implementing pagination — “show from the 2nd page”).

For such tasks, the Stream API provides special methods: distinct, limit, skip.

How does distinct work?

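The distinct() method returns a new stream with duplicate elements removed: of each group of equal elements, only one is kept (for an ordered stream, the first one encountered). Equality is decided by the element's equals method; because duplicate detection in practice relies on hashing, hashCode must be consistent with equals.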

Imagine you are a numismatist, and it’s important that your main collection has no duplicates. All repeated coins go to the exchange fund. That’s exactly what distinct() can do — filter out duplicates from the main collection.

Example: unique user names

List<String> names = List.of(
    "Alice", "Bob", "Alice", "Eva", "Bob", "Denis", "Gleb", "Eva"
);

Get the list of unique names:

List<String> uniqueNames = names.stream()
    .distinct()
    .collect(Collectors.toList());

System.out.println(uniqueNames);
// Output: [Alice, Bob, Eva, Denis, Gleb]

Example: unique user emails

Suppose we have a User class:

public class User {
    String name;
    String email;

    // Constructor and toString() for convenience (the fields are package-private, so the examples read them directly)
    public User(String name, String email) {
        this.name = name;
        this.email = email;
    }
    @Override
    public String toString() {
        return name + " <" + email + ">";
    }
}

A list of users with duplicate emails:

List<User> users = List.of(
    new User("Alice", "alice@mail.com"),
    new User("Bob", "bob@mail.com"),
    new User("Eva", "eva@mail.com"),
    new User("Alice2", "alice@mail.com"), // duplicate email!
    new User("Gleb", "gleb@mail.com"),
    new User("Eva2", "eva@mail.com")      // duplicate email!
);

If you just call users.stream().distinct(), the users with duplicate emails won't be removed. User does not override equals and hashCode, so it inherits the versions from Object, which compare references. In that case distinct only treats two elements as duplicates when they are the very same object.
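The behavior described above can be checked with a small sketch. PlainUser is a hypothetical stand-in for the lesson's User class, used here only to keep the example self-contained; it deliberately does not override equals or hashCode.

```java
import java.util.List;

public class DistinctWithoutEquals {
    // Hypothetical stand-in for User: equals/hashCode are NOT overridden on purpose.
    static class PlainUser {
        final String name;
        final String email;

        PlainUser(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    // Counts how many elements survive distinct().
    static long countDistinct(List<PlainUser> users) {
        return users.stream().distinct().count();
    }

    public static void main(String[] args) {
        PlainUser alice = new PlainUser("Alice", "alice@mail.com");
        List<PlainUser> users = List.of(
            alice,
            new PlainUser("Alice2", "alice@mail.com"), // same email, but a different object
            alice                                      // the very same object again
        );
        // Only the repeated reference is dropped; the equal email is not noticed.
        System.out.println(countDistinct(users)); // 2
    }
}
```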

Solution: Override equals and hashCode in the User class so that uniqueness is determined by email (Objects is java.util.Objects).

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    User user = (User) o;
    return Objects.equals(email, user.email);
}

@Override
public int hashCode() {
    return Objects.hash(email);
}

Now:

List<User> uniqueUsers = users.stream()
    .distinct()
    .collect(Collectors.toList());

uniqueUsers.forEach(System.out::println);
// Alice <alice@mail.com>
// Bob <bob@mail.com>
// Eva <eva@mail.com>
// Gleb <gleb@mail.com>

Important: For your own classes, always override equals and hashCode if you want distinct to work “properly”!

2. Method limit: limiting the number of elements

The method limit(long maxSize) returns a new stream that contains no more than the first maxSize elements of the original stream.

Analogy: You walk into a pastry shop and want to try only 3 pastries out of 100. limit(3) — and you get exactly the first three; the rest — “not today”.

Example: first 3 names

List<String> firstThree = names.stream()
    .limit(3)
    .collect(Collectors.toList());

System.out.println(firstThree);
// Output: [Alice, Bob, Alice]

Example with our app: the first 3 users

List<User> firstUsers = users.stream()
    .limit(3)
    .collect(Collectors.toList());

firstUsers.forEach(System.out::println);
// Alice <alice@mail.com>
// Bob <bob@mail.com>
// Eva <eva@mail.com>

Example with sorting

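You can combine limit with sorting, for example to take the 2 users with the shortest emails. For an ordered stream sorted is stable, so users whose emails are equally long keep their original relative order: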

List<User> top2ShortEmail = users.stream()
    .sorted(Comparator.comparingInt(u -> u.email.length()))
    .limit(2)
    .collect(Collectors.toList());

top2ShortEmail.forEach(System.out::println);
// Bob <bob@mail.com>
// Eva <eva@mail.com>

3. Method skip: skipping the first elements

The method skip(long n) returns a new stream where the first n elements of the original stream are skipped.

Example: skip the first 2 names

List<String> afterTwo = names.stream()
    .skip(2)
    .collect(Collectors.toList());

System.out.println(afterTwo);
// Output: [Alice, Eva, Bob, Denis, Gleb, Eva]

Example: pagination

Often on a website you need to show, for example, “3 users per page.” For the second page, you need to skip the first 3 and take the next 3:

int pageSize = 3;
int pageNumber = 2; // Second page

List<User> page = users.stream()
    .skip(pageSize * (pageNumber - 1))
    .limit(pageSize)
    .collect(Collectors.toList());

page.forEach(System.out::println);
// Alice2 <alice@mail.com>
// Gleb <gleb@mail.com>
// Eva2 <eva@mail.com>
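
The skip + limit pair above can be wrapped in a small reusable method. This is only a sketch: the name page and the 1-based page numbering are assumptions made for illustration, not part of the Stream API.

```java
import java.util.List;
import java.util.stream.Collectors;

public class Pagination {
    // Hypothetical helper: returns the elements of the given 1-based page.
    static <T> List<T> page(List<T> items, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        // Compute the offset as long to avoid int overflow on large page numbers.
        long offset = (long) (pageNumber - 1) * pageSize;
        return items.stream()
            .skip(offset)
            .limit(pageSize)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Eva", "Denis", "Gleb");
        System.out.println(page(names, 1, 2)); // [Alice, Bob]
        System.out.println(page(names, 3, 2)); // [Gleb]  (last page is shorter)
        System.out.println(page(names, 4, 2)); // []      (page past the end)
    }
}
```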

4. Combining distinct, limit, skip

These methods can and should be combined — depending on the task.

Example: get 2 unique names starting from the third in order

List<String> result = names.stream()
    .distinct() // Remove duplicates: [Alice, Bob, Eva, Denis, Gleb]
    .skip(2)    // Skip Alice and Bob: [Eva, Denis, Gleb]
    .limit(2)   // Take only two: [Eva, Denis]
    .collect(Collectors.toList());

System.out.println(result);
// Output: [Eva, Denis]

Example with filtering

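Suppose you need to get the first 2 unique emails whose name part (the text before “@”) contains the letter “a”. Checking the whole address would not filter anything here, because every address ends in “mail.com”, which itself contains an “a”:

List<String> emails = users.stream()
    .map(user -> user.email)
    // Look only at the part before '@'
    .filter(email -> email.substring(0, email.indexOf('@')).contains("a"))
    .distinct()
    .limit(2)
    .collect(Collectors.toList());

System.out.println(emails);
// Output: [alice@mail.com, eva@mail.com]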

5. Practice: exercises

Task 1. Get a list of unique user names whose length is greater than 3 characters

List<String> longUniqueNames = names.stream()
    .filter(name -> name.length() > 3)
    .distinct()
    .collect(Collectors.toList());

System.out.println(longUniqueNames);
// Output: [Alice, Denis, Gleb]

Task 2. Get the 3rd and 4th unique email from the user list

List<String> thirdAndFourthEmail = users.stream()
    .map(user -> user.email)
    .distinct()
    .skip(2)
    .limit(2)
    .collect(Collectors.toList());

System.out.println(thirdAndFourthEmail);
// Output: [eva@mail.com, gleb@mail.com]

Task 3. Get the first 5 unique numbers greater than 10

List<Integer> numbers = List.of(5, 12, 17, 5, 23, 17, 42, 19, 12, 8);

List<Integer> result = numbers.stream()
    .filter(n -> n > 10)
    .distinct()
    .limit(5)
    .collect(Collectors.toList());

System.out.println(result);
// Output: [12, 17, 23, 42, 19]

6. Visual diagram: order of operations

graph TD
    A[Source list] --> B[filter]
    B --> C[distinct]
    C --> D[skip]
    D --> E[limit]
    E --> F[collect]

Note:
Usually you apply filter first, then remove duplicates with distinct, then use skip and limit, and finally — collect. But you can change the order if the task requires it.
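
The order from the diagram can be sketched as a single pipeline. The numbers are the ones from Task 3; the class and method names are chosen only for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PipelineOrder {
    // filter -> distinct -> skip -> limit -> collect, as in the diagram.
    static List<Integer> pipeline(List<Integer> numbers) {
        return numbers.stream()
            .filter(n -> n > 10) // 12, 17, 23, 17, 42, 19, 12
            .distinct()          // 12, 17, 23, 42, 19
            .skip(1)             // 17, 23, 42, 19
            .limit(3)            // 17, 23, 42
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(5, 12, 17, 5, 23, 17, 42, 19, 12, 8);
        System.out.println(pipeline(numbers)); // [17, 23, 42]
    }
}
```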

7. Common mistakes when working with distinct, limit, skip

Error #1: Expecting distinct to remove duplicates by “any” criterion.
In reality, distinct relies on the object's equals method. If you want uniqueness by a specific field (for example, only by email) rather than by the whole object, you need to either override equals/hashCode accordingly, collect into a map keyed by that field with Collectors.toMap(), or filter against a set of keys you have already seen.
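
One possible way to use Collectors.toMap() for this is sketched below. It assumes a User class like the one in this lesson but without overridden equals/hashCode; the method name uniqueByEmail is hypothetical. The merge function keeps the first user seen for each email, and LinkedHashMap preserves the order in which emails first appear.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DistinctByEmail {
    // Stand-in for the lesson's User; equals/hashCode are not overridden.
    static class User {
        final String name;
        final String email;

        User(String name, String email) {
            this.name = name;
            this.email = email;
        }

        @Override
        public String toString() {
            return name + " <" + email + ">";
        }
    }

    // Hypothetical helper: keeps the first user for every distinct email.
    static List<User> uniqueByEmail(List<User> users) {
        Map<String, User> byEmail = users.stream()
            .collect(Collectors.toMap(
                user -> user.email,       // key: the field that defines uniqueness
                Function.identity(),      // value: the user itself
                (first, second) -> first, // on a duplicate key keep the earlier user
                LinkedHashMap::new));     // keep first-seen order of the keys
        return new ArrayList<>(byEmail.values());
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Alice", "alice@mail.com"),
            new User("Bob", "bob@mail.com"),
            new User("Eva", "eva@mail.com"),
            new User("Alice2", "alice@mail.com"),
            new User("Gleb", "gleb@mail.com"),
            new User("Eva2", "eva@mail.com"));
        uniqueByEmail(users).forEach(System.out::println);
        // Alice <alice@mail.com>
        // Bob <bob@mail.com>
        // Eva <eva@mail.com>
        // Gleb <gleb@mail.com>
    }
}
```

Compared with overriding equals, this keeps the notion of "same email" local to one pipeline, which may be preferable when User objects should not be considered equal elsewhere in the program.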

Error #2: Mixing up the order of operations.
If you apply limit first and then distinct, you may end up with fewer elements than you asked for: limit takes the first N elements, some of which may be identical, and distinct then removes the repeats. To get N unique elements, call distinct before limit.
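
A short sketch with the names list from this lesson shows how swapping the two calls changes the result (the class and method names are only for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;

public class LimitVsDistinctOrder {
    // limit first: duplicates among the first n elements shrink the result.
    static List<String> limitThenDistinct(List<String> items, long n) {
        return items.stream().limit(n).distinct().collect(Collectors.toList());
    }

    // distinct first: up to n unique elements are returned.
    static List<String> distinctThenLimit(List<String> items, long n) {
        return items.stream().distinct().limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of(
            "Alice", "Bob", "Alice", "Eva", "Bob", "Denis", "Gleb", "Eva");
        System.out.println(limitThenDistinct(names, 3)); // [Alice, Bob]
        System.out.println(distinctThenLimit(names, 3)); // [Alice, Bob, Eva]
    }
}
```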

Error #3: Skipping more elements than exist.
If you try to skip more elements than the stream contains, the result will simply be an empty list — there will be no error, but the outcome may surprise you.
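
A minimal sketch of this edge case (names chosen for illustration). It also shows that, unlike an overly large argument, a negative argument to skip is rejected with an IllegalArgumentException:

```java
import java.util.List;
import java.util.stream.Collectors;

public class SkipPastEnd {
    static List<String> skipFirst(List<String> items, long n) {
        return items.stream().skip(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Eva");

        System.out.println(skipFirst(names, 0));  // [Alice, Bob, Eva]
        System.out.println(skipFirst(names, 10)); // []  (no exception)

        try {
            skipFirst(names, -1);
        } catch (IllegalArgumentException e) {
            System.out.println("negative skip is rejected");
        }
    }
}
```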

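Error #4: Overlooking performance.
distinct is a stateful operation: it has to remember every element it has already seen, so its memory use grows with the number of unique elements. limit and skip are cheap on sequential streams but can be expensive on ordered parallel streams, because they must respect the encounter order. In most everyday tasks this isn't a problem, but if you work with millions of records, it's worth considering.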

Error #5: Forgetting to override equals/hashCode for your classes.
If you're working with custom objects (for example, User), then without overriding these methods, distinct will treat objects as different even if they are “logically” the same.
