Selective Screen Scraping With HtmlAgilityPack And XPath

[This question is related to: Screen scraping with HtmlAgilityPack and XPath] I have some HTML to parse, which has the following general appearance: ...

Solution 1:

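The following code selects the first two and the last two <td> nodes of each row. For each selected cell it returns the link's href if the cell contains a link with a non-empty href, and the cell's text otherwise: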

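// For each row, take the first two and last two <td> cells;
// yield the link's href when the cell contains a link, otherwise the cell text.
html.DocumentNode.Descendants("tr")
    .Select(tr => 
       // SelectNodes returns null when a row has no matching <td> (e.g. header rows)
       from td in tr.SelectNodes("td[position() < 3 or position() > last() - 2]")
                  ?? Enumerable.Empty<HtmlNode>()
       let a = td.SelectSingleNode("a[@href!='']")
       select a == null ? td.InnerText : a.Attributes["href"].Value);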

This XPath expression filters the <td> nodes by their position within the row:

td[position() < 3 or position() > last() - 2]
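
To see the position filter in action, here is a minimal, self-contained sketch. The sample HTML, the class name RowScraper and the method name ExtractEdgeCells are made up for illustration, since the question's actual markup is not shown here. It assumes the HtmlAgilityPack NuGet package is referenced and that SelectNodes returns null when nothing matches, which is the library's default behaviour as far as I know.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class RowScraper
{
    // Hypothetical helper: for each <tr>, collect the values of its
    // first two and last two <td> cells.
    public static List<List<string>> ExtractEdgeCells(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<List<string>>();
        foreach (var tr in doc.DocumentNode.Descendants("tr"))
        {
            // position() and last() are evaluated among the <td> children of this row.
            var cells = tr.SelectNodes("td[position() < 3 or position() > last() - 2]");
            if (cells == null) continue; // no matching <td>, e.g. a header row

            rows.Add(cells.Select(td =>
            {
                // Prefer the href of a link inside the cell, else fall back to the text.
                var a = td.SelectSingleNode("a[@href!='']");
                return a == null ? td.InnerText.Trim() : a.GetAttributeValue("href", "");
            }).ToList());
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up sample row with six cells; the second one holds a link.
        const string sample =
            "<table><tr>" +
            "<td>A</td><td><a href=\"/b\">B</a></td><td>C</td>" +
            "<td>D</td><td>E</td><td>F</td>" +
            "</tr></table>";

        foreach (var row in ExtractEdgeCells(sample))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

For the six-cell sample row, the predicate keeps cells 1, 2, 5 and 6. When a row has fewer than four cells, the two conditions overlap, but each cell is still returned only once because the XPath result is a node set.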
