What you see is not my most recent version. If I set things up to read just the first page of the CodeGym cached files, everything works as expected:
Change while(true) to while(page == 0) in getJobPostings of both IndeedStrategy and LinkedinStrategy,
and for URL_FORMAT use:
http://codegym.cc/testdata/big28data.html?text=Java+%s&pageNum=%d
https://codegym.cc/testdata/big28data2.html?q=java&l=%s&start=%d
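For reference, here is a minimal sketch of how those two format strings expand, assuming getDocument builds its URL with String.format (its body is not shown in this post). The class name, constant names, method name and the "Seattle" search string are hypothetical and only there for illustration.

```java
class UrlFormatSketch {
    // The two URL_FORMAT values quoted above, copied verbatim.
    static final String FORMAT_A = "http://codegym.cc/testdata/big28data.html?text=Java+%s&pageNum=%d";
    static final String FORMAT_B = "https://codegym.cc/testdata/big28data2.html?q=java&l=%s&start=%d";

    // Hypothetical helper: %s receives the search string, %d receives the page number.
    static String buildUrl(String format, String searchString, int page) {
        return String.format(format, searchString, page);
    }

    public static void main(String[] args) {
        // Prints http://codegym.cc/testdata/big28data.html?text=Java+Seattle&pageNum=0
        System.out.println(buildUrl(FORMAT_A, "Seattle", 0));
        // Prints https://codegym.cc/testdata/big28data2.html?q=java&l=Seattle&start=1
        System.out.println(buildUrl(FORMAT_B, "Seattle", 1));
    }
}
```

Printing the built URL for each page is a quick way to confirm that the page number really changes in the request, even if the server returns the same content.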
As it stands, however, both strategies read the same page over and over, even though the page variable increments and is used in getDocument. I think this is because the websites do not behave the way the CodeGym requirements anticipate. So the loop runs forever. Currently, the evaluator fails to complete, saying "the files being sent are too large."
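One way to keep a while(true) paging loop from running forever is to stop on an empty page, on a page identical to the previous one, or at a hard page cap. The following is only a sketch of that idea, not what the CodeGym task requires: the class, the MAX_PAGES cap and the fetchPage function are hypothetical stand-ins, and the real getJobPostings would parse job postings from a jsoup Document instead of returning strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

class PagingSketch {
    // Hypothetical safety cap so the loop ends even if the site misbehaves.
    static final int MAX_PAGES = 50;

    // fetchPage is a stand-in for "download page N and extract its postings".
    static List<String> collect(IntFunction<List<String>> fetchPage) {
        List<String> all = new ArrayList<>();
        List<String> previous = null;
        for (int page = 0; page < MAX_PAGES; page++) {
            List<String> current = fetchPage.apply(page);
            // Stop when the page has no postings at all.
            if (current == null || current.isEmpty()) {
                break;
            }
            // Stop when the server returns the same page again.
            if (current.equals(previous)) {
                break;
            }
            all.addAll(current);
            previous = current;
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake source that ignores the page number, like the behavior described above.
        IntFunction<List<String>> samePageForever = page -> Arrays.asList("job A", "job B");
        System.out.println(collect(samePageForever)); // prints [job A, job B]
    }
}
```

With a source that always returns the same page, the loop ends after the second request instead of accumulating duplicates until the output becomes too large.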
Somehow I managed to pass validation up to this stage, but now I am stuck. This is a frustrating task. From the very start things did not work as expected (e.g. there are no examples in the jsoup package, and LinkedIn does not show just 25 items per page). I have never seen it work on the actual LinkedIn/Indeed websites. I think this is because even the fields have changed relative to the CodeGym cache. But then, I could be seriously confused!!
package com.codegym.task.task28.task2810;
import com.codegym.task.task28.task2810.model.LinkedinStrategy;
import com.codegym.task.task28.task2810.model.Model;
import com.codegym.task.task28.task2810.model.Provider;
import com.codegym.task.task28.task2810.view.HtmlView;
import com.codegym.task.task28.task2810.view.View;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
/* This is not returning any values: LinkedIn has changed its field names. I would need to put a
   breakpoint at line 25 of LinkedinStrategy, debug, and study the format of the document.
*/
public class Aggregator {
    public static void main(String[] args) {
        HtmlView view = new HtmlView();
        Model model = new Model(view, new Provider(new LinkedinStrategy()));
        Controller controller = new Controller(model);
        view.setController(controller);
        view.emulateCitySelection();
    }
}