Course Module 1. Java Syntax - Lecture: Convenience classes for working with strings in Java

1. `StringTokenizer` class

And now a few more common scenarios involving working with strings. How do you split a string into several parts? There are several ways to do this.

split() method

The first way to split a string into multiple parts is to use the split() method. You pass it a regular expression that describes the delimiter, and it returns an array of the parts found between the delimiters. You will learn what a regular expression is in the Java Multithreading quest.

Example:

Code	Result
`String str = "Good news everyone!"; String[] strings = str.split("ne"); System.out.println(Arrays.toString(strings));`	The result will be an array of three strings: `["Good ", "ws everyo", "!"]`

Simple, but sometimes this approach is excessive. If there are a lot of delimiters (for example, spaces, newline characters, tabs, periods), then you have to construct a rather complex regular expression. It is difficult to read and therefore difficult to modify.
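To illustrate what such a regular expression can look like, here is a minimal sketch. The class name SplitManyDelimiters, the method name splitWords, the sample text, and the pattern "[\\s.,]+" are all made up for this illustration; other patterns would work just as well.

```java
import java.util.Arrays;

class SplitManyDelimiters {
    // Splits on runs of whitespace (spaces, tabs, newlines), periods, and commas.
    // The pattern "[\\s.,]+" is only one possible choice for this sketch.
    static String[] splitWords(String text) {
        return text.split("[\\s.,]+");
    }

    public static void main(String[] args) {
        String text = "Good news,\teveryone.\nGood news";
        System.out.println(Arrays.toString(splitWords(text)));
        // prints [Good, news, everyone, Good, news]
    }
}
```

Even this small pattern needs a character class, an escaped backslash, and a quantifier, which is the readability cost the paragraph above describes.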

StringTokenizer class

Java has a special class whose whole job is to split a string into substrings.

This class doesn't use regular expressions: instead, you simply pass in a string consisting of delimiters. The advantage of this approach is that it does not break the entire string into pieces all at once, but instead moves from beginning to end one step at a time.

The class has a constructor and two important methods. We pass the constructor the string to be split into parts and a second string made up of the delimiting characters.

Methods	Description
`String nextToken()`	Returns the next substring
`boolean hasMoreTokens()`	Checks whether there are more substrings.

This class is somewhat reminiscent of the Scanner class, which also has nextLine() and hasNextLine() methods.

You can create a StringTokenizer object with this command:

StringTokenizer name = new StringTokenizer(string, delimiters);

Here string is the string to be divided into parts, and delimiters is a string in which each character is treated as a separate delimiter. Example:

Code:

String str = "Good news everyone!";

StringTokenizer tokenizer = new StringTokenizer(str, "ne");
while (tokenizer.hasMoreTokens())
{
   String token = tokenizer.nextToken();
   System.out.println(token);
}

Console output:

Good 
ws 
v
ryo
!

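Note that each character in the string passed as the second argument to the StringTokenizer constructor is treated as a separate delimiter. That is why "ne" here splits on every n and every e, unlike split("ne"), which splits only on the two-character sequence "ne".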
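As a rough sketch of the same loop with several delimiter characters at once, the example below collects the tokens into a list. The class name TokenizerDemo, the helper method tokens, and the sample text are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects every token into a list, one nextToken() call at a time.
    static List<String> tokens(String text, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, delimiters);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Space, tab, newline, period, and comma are each a separate delimiter.
        System.out.println(tokens("Good news,\teveryone.\nGood news", " \t\n.,"));
        // prints [Good, news, everyone, Good, news]
    }
}
```

No regular expression is needed here: the delimiter string simply lists the characters to split on.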

2. `String.format()` method and `Formatter` class

Another interesting method of the String class is format().

Let's say you have various variables storing data. How do you display them on the screen in one line? For example, we have some data (left column) and desired output (right column):

Code	Console output
`String name = "Amigo"; int age = 12; String friend = "Diego"; int weight = 200;`	`User = {name: Amigo, age: 12 years, friend: Diego, weight: 200 kg.}`

Your code will probably look something like this:

Program code

String name = "Amigo";
int age = 12;
String friend = "Diego";
int weight = 200;

System.out.println("User = {name: " + name + ", age: " + age + " years, friend: " + friend + ", weight: " + weight + " kg.}");

Such code is not very readable. And if the variable names were longer, then the code would become even more difficult:

Program code


class User {
    ......
    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public List<String> getFriends() {
        return friends;
    }

    public ExtraInformation getExtraInformation() {
        return extraInformation;
    }
}

User user = new User();

System.out.println("User = {name: " + user.getName() + ", age: " + user.getAge() + " years, friend: " + user.getFriends().get(0) + ", weight: " + user.getExtraInformation().getWeight() + " kg.}");

Not very readable, is it?

But this is a common situation in real-world programs, so I want to tell you about a way to write this code more simply and more concisely.

String.format

The String class has a static format() method: it lets you specify a pattern for assembling a string with data. The general appearance of the command is as follows:

String name = String.format(pattern, parameters);

Example:

Code	Result
`String.format("Age=%d, Name=%s", age, name);`	`Age=12, Name=Amigo`
`String.format("Width=%d, Height=%d", width, height);`	`Width=20, Height=10`
`String.format("Fullname=%s", friend);`	`Fullname=Diego`

The format() method's first parameter is a format string that contains all the desired text along with special characters called format specifiers (such as %d and %s) in the places where you need to insert data.

The format() method replaces these %s and %d format specifiers with the parameters that follow the format string in the parameter list. If we want to insert a string, then we write %s. If we want to insert a number, then the format specifier is %d. Example:

Code	Result
`String s = String.format("a=%d, b=%d, c=%d", 1, 4, 3);`	`s` is equal to `"a=1, b=4, c=3"`
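Applied to the user example from the beginning of this section, the concatenation could be rewritten roughly like this. The class name FormatUserDemo and the method name describe are made up for this sketch; the values are the ones from the earlier example.

```java
class FormatUserDemo {
    // Builds the whole line from one pattern instead of a chain of + operators.
    static String describe(String name, int age, String friend, int weight) {
        return String.format(
                "User = {name: %s, age: %d years, friend: %s, weight: %d kg.}",
                name, age, friend, weight);
    }

    public static void main(String[] args) {
        System.out.println(describe("Amigo", 12, "Diego", 200));
        // prints User = {name: Amigo, age: 12 years, friend: Diego, weight: 200 kg.}
    }
}
```

The pattern now reads almost like the desired output, and the data is listed separately after it.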

Here is a short list of format specifiers that can be used inside the format string:

Specifier	Meaning
`%s`	`String`
`%d`	integer: `byte`, `short`, `int`, `long`
`%f`	real number: `float`, `double`
`%b`	`boolean`
`%c`	`char`
`%t`	date/time prefix, followed by a conversion letter (for example, `%tY` for the year): `Date`
`%%`	`%` character
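Here is a small sketch that uses several of these specifiers in one pattern. The class name SpecifierDemo, the method name demo, and the sample values are assumptions for this illustration. The Locale.ROOT argument is also an addition: it keeps the decimal separator a period regardless of the system settings.

```java
import java.util.Locale;

class SpecifierDemo {
    static String demo() {
        // Arguments are consumed in order: %s, %d, %f, %b, %c. The %% takes no argument.
        return String.format(Locale.ROOT,
                "%s | %d | %f | %b | %c | 100%%",
                "text", 42, 3.5, true, 'x');
    }

    public static void main(String[] args) {
        System.out.println(demo());
        // prints text | 42 | 3.500000 | true | x | 100%
    }
}
```

Note that a plain %f prints six digits after the decimal point.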

These specifiers indicate the type of data, but there are also specifiers that indicate the order of the data. To get an argument by its number (the numbering starts from one), you need to write "%1$d" instead of "%d". Example:

Code	Result
`String s = String.format("a=%3$d, b=%2$d, c=%d", 11, 12, 13);`	`s` is equal to `"a=13, b=12, c=11"`

%3$d will get the third argument, %2$d will get the second argument, and %d will get the very first argument. Plain specifiers such as %s and %d are counted on their own, starting from the first argument, regardless of any indexed specifiers like %3$d or %2$s that appear in the same format string.
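A brief sketch of argument indexes in action follows. The class name ArgumentIndexDemo and the method names swap and mixed are hypothetical and exist only for this illustration.

```java
class ArgumentIndexDemo {
    // %2$s reads the second argument and %1$s the first, so the order is reversed.
    static String swap(String first, String second) {
        return String.format("%2$s, %1$s", first, second);
    }

    // The plain %d ignores the explicit indexes and starts from the first argument.
    static String mixed() {
        return String.format("a=%3$d, b=%2$d, c=%d", 11, 12, 13);
    }

    public static void main(String[] args) {
        System.out.println(swap("Amigo", "Diego")); // prints Diego, Amigo
        System.out.println(mixed());                // prints a=13, b=12, c=11
    }
}
```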

3. String Pool

Every string specified in code as a string literal is stored in an area of memory called the StringPool while the program is running. You can think of the StringPool as a special array for storing strings. Its purpose is to optimize string storage:

First, the strings specified in code must be stored somewhere, right? Code consists of commands, but data (especially, large strings) must be stored in memory separately from the code. Only references to string objects appear in code.

Second, all identical string literals must be stored in memory only once. And that's just how it works. When your class code is loaded by the Java machine, all string literals are added to the StringPool if they are not already there. If they are already there, then we simply use a string reference from the StringPool.

Accordingly, if you assign the same literal to several String variables in your code, then these variables will contain the same reference. A literal will be added to the StringPool only once. In all other cases, the code will get a reference to the string already loaded in the StringPool.

Here's roughly how it works:

Code	Working with the StringPool
`String a = "Hello"; String b = "Hello"; String c = "Bye";`	`String[] pool = {"Hello", "Bye"}; a = pool[0]; b = pool[0]; c = pool[1];`

That is why the a and b variables will store the same references.
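The effect can be observed by comparing references with ==. This is a minimal sketch; the class name PoolDemo and its method names are made up for this illustration.

```java
class PoolDemo {
    // Two identical literals refer to the same pooled object.
    static boolean sameLiteral() {
        String a = "Hello";
        String b = "Hello";
        return a == b;
    }

    // new String(...) always creates a separate object, so the references differ.
    static boolean literalVsNew() {
        String a = "Hello";
        String c = new String("Hello");
        return a == c;
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());  // prints true
        System.out.println(literalVsNew()); // prints false
    }
}
```

In everyday code, strings should still be compared with equals(); the == comparison is used here only to show which variables share a reference.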

intern() method

And the best part is that you can programmatically add any string to the StringPool. To do this, you just need to call the String variable's intern() method.

The intern() method will add the string to the StringPool if it is not already there, and will return a reference to the string in the StringPool.

If two identical strings are added to the StringPool using the intern() method, the method returns the same reference for both. This can be used to compare strings by reference. Example:

Code:

String a = new String("Hello");
String b = new String("Hello");
System.out.println(a == b);

Console output:

false

Code:

String a = new String("Hello");
String b = new String("Hello");

String t1 = a.intern();
String t2 = b.intern();
System.out.println(a == b);
System.out.println(t1 == t2);

Console output:

false
true
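As a further sketch, a string assembled at run time can be interned and then compared by reference with a literal. The class name InternDemo and its method names are assumptions for this illustration.

```java
class InternDemo {
    // Builds "Hello" at run time, so the resulting object is not the pooled literal.
    static String build() {
        return new StringBuilder("Hel").append("lo").toString();
    }

    // Before interning, the reference differs from the literal's reference.
    static boolean builtIsLiteral() {
        return build() == "Hello";
    }

    // intern() returns the pooled string, which is the same object as the literal.
    static boolean internedIsLiteral() {
        return build().intern() == "Hello";
    }

    public static void main(String[] args) {
        System.out.println(builtIsLiteral());    // prints false
        System.out.println(internedIsLiteral()); // prints true
    }
}
```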

You're unlikely to use this method often, but people love to ask about it in interviews. So it's better to know about it than to not know.
