Level 23

Szekesfehervar

21.09.2021

Does anyone know the hidden CG requirements for this task?

Question about the task

Resolved

Is it possible that CG is deliberately hiding some details of its requirements? I mean how exactly to count the word "world". CG says the words are separated by punctuation marks, but some cases are still unclear to me. E.g. if the file contains:

"world'abcd(world!worldly,worlds;" // is that 2 hits or 4?
"world:WORLD" // is that 1 or 2?

It is also unclear what exactly is meant by a punctuation mark. Of the 65536 possible Java char values, which ones exactly should I treat as the punctuation marks that separate words? According to one description on the net, these are the characters considered punctuation marks in ASCII:

! " # $ % & ' ( ) * + , - . / : ; ? @ [ \ ] ^ _ ` { | } ~

However, if I use string.replaceAll("\\p{Punct}","") or string.split("\\p{Punct}"), which are supposed to remove punctuation, then more characters are removed than just the ones listed above. It also removes characters that, according to the Character.getType() method, are for example MATH_SYMBOL (and not "anyType"_PUNCTUATION); see https://docs.oracle.com/javase/8/docs/api/java/lang/Character.html.

In summary, in the example below I have not used typical punctuation to separate the 11 words "world", but various other characters from the ASCII code table (but not letters). (And then we are only talking about 256 characters, where are the other 65280....) So how do I handle the string below? Is this 11 hits of "world", or is it just a single string with a lot of meaningless characters that counts as 0 hits?...

"worldworldworldworldworldworldworldworld«world»world¿world"
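One way to take the guesswork out of the \p{Punct} part of the question is to ask the regex engine directly. The sketch below is only an illustration: the class name PunctProbe and its methods are made up here and are not part of the task. It relies on the java.util.regex.Pattern Javadoc, which defines \p{Punct} (without the UNICODE_CHARACTER_CLASS flag) as the 32 ASCII characters !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~, so it also covers < = > and several characters that Character.getType() reports as symbols rather than punctuation.

```java
import java.util.regex.Pattern;

// Hypothetical probe class; the name and methods are illustrative, not part of the task.
class PunctProbe {
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}");

    // True if the single character c is matched by the regex class \p{Punct}.
    static boolean isRegexPunct(char c) {
        return PUNCT.matcher(String.valueOf(c)).matches();
    }

    // Counts how many of the 128 ASCII characters \p{Punct} matches.
    static int countAsciiPunct() {
        int n = 0;
        for (char c = 0; c < 128; c++) {
            if (isRegexPunct(c)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // List every ASCII character the class matches and flag the MATH_SYMBOL ones.
        for (char c = 0; c < 128; c++) {
            if (isRegexPunct(c)) {
                boolean math = Character.getType(c) == Character.MATH_SYMBOL;
                System.out.println(c + (math ? "  MATH_SYMBOL" : ""));
            }
        }
        System.out.println("ASCII characters matched: " + countAsciiPunct());
        // Characters outside ASCII, here U+00AB and U+00BF, are checked the same way.
        System.out.println(isRegexPunct('\u00AB') + " " + isRegexPunct('\u00BF'));
    }
}
```

If the probe reports that a character is not matched, split("\\p{Punct}") will not treat it as a separator either, whatever Character.getType() says about it.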

Read a file name from the console.
The file contains words separated by punctuation marks.
Output to the console the number of times the word "world" appears in the file.
Close the streams.

Requirements:

The program must read the file name from the console (use BufferedReader).
The BufferedReader used for reading input from the console must be closed after use.
The program must read the file's contents (use the FileReader constructor that takes a String argument).
The file input stream (FileReader) must be closed.
The program must output to the console the number of times the word "world" appears in the file.

package com.codegym.task.task19.task1907;

/*
Counting words
*/

import java.io.*;
//import java.util.*;

public class Solution {
    public static void main(String[] args) throws Exception {
        BufferedReader readConsole = new BufferedReader(new InputStreamReader(System.in));
        String fileName = readConsole.readLine();
        readConsole.close();

        FileReader fread = new FileReader(fileName);
        CharArrayWriter chArrWrite = new CharArrayWriter();
        while (fread.ready()) {
            chArrWrite.write(fread.read());
        }
        fread.close();
        chArrWrite.close();

        String[] wordsInFile = chArrWrite.toString().split("\\p{Punct}");
        // System.out.println(Arrays.toString(wordsInFile));
        int w = 0; // number of "world"
        for (String st : wordsInFile) {
            if (st.equals("world")) w++;
        }
        System.out.println(w);
    }
}

Comments (9)

Lisa

Level 41

21 September 2021, 10:49

solution

Hi Mr. Gellert... may I ask you (before I try to help you) why you use CharArrayWriter? You get the same result if you create a byte array, read everything into the array and turn it into a String. That's basically three lines of code. Something like this:

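// Needs java.io.*; fileName is the name read from the console.
InputStream in = new BufferedInputStream(new FileInputStream(fileName));
byte[] buffer = new byte[in.available()]; // available() is only an estimate of the remaining bytes
in.read(buffer); // may read fewer bytes than buffer.length
in.close();
String data = new String(buffer); // decodes with the platform default charset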

But of course your way works as well. You could also read the file with a BufferedReader and check each line (without reading the whole file into a buffer). OK, now my guess why this does not pass validation: you didn't consider whitespace. CG just looks for lowercase world, exactly world, no worldx. So

world'abcd(world!worldly,worlds

should result in 2 (as your code does). But

world'abcd ( world    !worldly, worlds

should also spit out 2 as the result. That's a guess, because I had only written down splitting by \\p{Punct} as a possible solution (along with 4 others) but validated with one of the other approaches (a pattern matcher that finds world between regex word boundaries). And word boundaries take care of spaces as well (even if a punct isn't really necessary then, but CG didn't mind/test that). yuhiii, yuhaaaa, yeah, yeah... 🤪😜
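The word-boundary approach mentioned above could look roughly like the sketch below. The exact code that was validated is not shown in this thread, so the class name WordBoundaryCount, the method countWorld and the pattern \bworld\b are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the code that was actually submitted for validation.
class WordBoundaryCount {
    // \b matches the position between a word character and a non-word character.
    private static final Pattern WORLD = Pattern.compile("\\bworld\\b");

    // Counts occurrences of the exact lowercase word "world" in text.
    static int countWorld(String text) {
        Matcher matcher = WORLD.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWorld("world'abcd(world!worldly,worlds"));
        System.out.println(countWorld("world'abcd ( world    !worldly, worlds"));
        System.out.println(countWorld("world:WORLD"));
    }
}
```

One difference from splitting on \\p{Punct} is worth keeping in mind: the underscore counts as a word character for \b, so "world_x" is not a hit here, whereas the split approach treats _ as a separator.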

Gellert Varga

Level 23 , Szekesfehervar, Hungary

21 September 2021, 11:20

Thanks for the quick reply:)

1) inpStr.read(byteBuffer); // yes, I like this solution too:) But requirement 3 forces the use of FileReader, so I didn't even try InputStream for now. I prefer to use FileReader straight away. And since it reads characters instead of bytes, I wanted to put the whole file content into some char[] array, and I had no better idea for that than CharArrayWriter.

2) I'm not familiar with this damn regex thing :/ But am I getting the point right? CG should correct the task description like this:
original: "The file contains words separated by punctuation marks."
more precisely: "The file contains words separated by punctuation marks and/or spaces."
(Neither from my childhood schooling nor based on Java's getType() would I have thought that a space should be considered a punctuation mark...)

Lisa

Level 41

21 September 2021, 11:33

solution

Ah, oki, I didn't read the requirements, just your code and your text. And I don't think CG considers a space to be a punct. But a punct can e.g. be followed by a space. If you split using punct, the space stays, and if you then compare the remaining " world" to "world", they are not equal. You can either remove all spaces before you split or trim each word before comparing. But adding space to the split has the same effect :) Yeah, yeah... you did it 🤪😜
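To make the difference visible, here is a small sketch of the three variants described above: splitting on punctuation only, trimming each token, and splitting on punctuation or whitespace. The class and method names are invented for this example, and the character class [\\p{Punct}\\s]+ is just one possible way to "add space to the split".

```java
// Hypothetical comparison class; names are illustrative only.
class SplitAndTrim {
    // Splits on punctuation only: a token such as " world" keeps its space and is missed.
    static int countSplitOnly(String text) {
        int count = 0;
        for (String token : text.split("\\p{Punct}")) {
            if (token.equals("world")) {
                count++;
            }
        }
        return count;
    }

    // Same split, but each token is trimmed before the comparison.
    static int countWithTrim(String text) {
        int count = 0;
        for (String token : text.split("\\p{Punct}")) {
            if (token.trim().equals("world")) {
                count++;
            }
        }
        return count;
    }

    // Treats any run of punctuation and/or whitespace as one separator.
    static int countSplitPunctOrSpace(String text) {
        int count = 0;
        for (String token : text.split("[\\p{Punct}\\s]+")) {
            if (token.equals("world")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "world'abcd ( world    !worldly, worlds";
        System.out.println(countSplitOnly(text));         // misses the padded " world    "
        System.out.println(countWithTrim(text));
        System.out.println(countSplitPunctOrSpace(text));
    }
}
```

The last two variants are not fully equivalent: trimming still treats "hello world" as a single token, while splitting on whitespace turns the space itself into a separator.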

Gellert Varga

Level 23 , Szekesfehervar, Hungary

21 September 2021, 11:56

Thanx:) I inserted a replaceAll() call before split(), replacing all whitespace with a '.' char, and the task passed:) Here's a link to it, in case someone reads this later: https://stackoverflow.com/questions/5455794/removing-whitespace-from-strings-in-java
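For later readers, the fix described here might be sketched as follows. The submitted code itself is not shown in the thread, so the regex "\\s", the class name ReplaceThenSplit and the method countWorld are assumptions; only the idea (turn whitespace into '.', then split on \\p{Punct}) comes from the comment above.

```java
// Hypothetical reconstruction of the described fix; not the actual submitted solution.
class ReplaceThenSplit {
    static int countWorld(String text) {
        // Turn every whitespace character into '.', which \p{Punct} then treats as a separator.
        String normalized = text.replaceAll("\\s", ".");
        int count = 0;
        for (String token : normalized.split("\\p{Punct}")) {
            if (token.equals("world")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWorld("world'abcd ( world    !worldly, worlds"));
        System.out.println(countWorld("world\nworld"));
    }
}
```

A side effect of this choice is that whitespace alone now separates words, so line breaks inside the file no longer glue two words together.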

Guadalupe Gagnon

Level 37 , Tampa, United States

21 September 2021, 13:27

You two are awesome!

Gellert Varga

Level 23 , Szekesfehervar, Hungary

21 September 2021, 14:25

:-) but when we run into something really difficult we always can learn from Guadalupe!:)

Lisa

Level 41

21 September 2021, 15:40

Hey, Gua, you're around again... 🥳🍕🎈🥂 We had to take this over because you didn't show up for so long 😔😩 Puh, but now all is good, yeah, yeah 🤪😜

Guadalupe Gagnon

Level 37 , Tampa, United States

21 September 2021, 15:58

I never disappeared. I go through the help section quite often during the week, but by the time I log in you guys have already answered everything just as well as I would have. Also, if you want to shorten my name, I actually go by Lupe.

Lisa

Level 41

21 September 2021, 16:53

Yes, we are so good, excellent, amazing 😎👏 (just the tasks here don't give me that feeling) But now I need to be quiet for a few days so that I can read a little bit more from you again. Started to miss that, Lupe (<- and as you can see I'm able to learn 😁) Yeah, yeah, getting smart from reading smart comments... soon again, yeah, yeah 🤪😜