Skip to content Skip to sidebar Skip to footer

Extract Text Between Two
Tags In Css-less Html

Using Jsoup, what would be an optimal approach to extract text, of which its pattern is known ([number]%%[number]) but resides in an HTML page that uses neither CSS nor divs, spans

Solution 1:

How about this?

Document document = Jsoup.connect(url).get();
Elements hrs = document.select("hr");
Pattern pattern = Pattern.compile("(\\d+%%\\d+)");

for (Element hr : hrs) {
    String textAfterHr = hr.nextSibling().toString();
    Matcher matcher = pattern.matcher(textAfterHr);

    while (matcher.find()) {
        System.out.println(matcher.group(1)); // <-- There, your data.
    }
}

Post a Comment for "Extract Text Between Two
Tags In Css-less Html"