Extract Text Between Two <hr> Tags In CSS-less HTML
Using Jsoup, what would be an optimal approach to extract text whose pattern is known ([number]%%[number]) but which resides in an HTML page that uses neither CSS nor divs, spans
Solution 1:
How about this?
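import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.select.Elements;

Document document = Jsoup.connect(url).get();
Elements hrs = document.select("hr");
Pattern pattern = Pattern.compile("(\\d+%%\\d+)");
for (Element hr : hrs) {
    Node next = hr.nextSibling(); // null when the <hr> is the last node in its parent
    if (next == null) continue;
    Matcher matcher = pattern.matcher(next.toString());
    while (matcher.find()) {
        System.out.println(matcher.group(1)); // <-- There, your data.
    }
}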
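If it helps, the regular expression can be checked on its own, without Jsoup or a network request. The following is only a minimal sketch using the standard library: the class name PercentPairExtractor, the method extract, and the sample string are made up for illustration and stand in for whatever text follows an <hr> on the real page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PercentPairExtractor {
    // Same pattern as in the answer: digits, a literal "%%", digits.
    private static final Pattern PAIR = Pattern.compile("(\\d+%%\\d+)");

    /** Returns every [number]%%[number] occurrence found in the text, in order. */
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = PAIR.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for the text that follows an <hr>.
        String sample = "Results: 12%%34 and later 5%%678, but not 9% or %%7.";
        System.out.println(extract(sample)); // prints [12%%34, 5%%678]
    }
}
```

In the loop above, the same method could be called with the sibling node's text in place of the sample string.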