
XPath: "exclude" Tag In "innerhtml" (<a>innerhtml<span>excludeme</span></a>)

I am using XPath to query HTML sites, which has worked pretty well so far, but now I've hit a (brick) wall and can't find a solution :-) The HTML looks like this:
  • Solution 1:

    This gets you the first direct text node child of <a>:

    /ul/li/a/text()[1]

    and this would get you any direct text node child (separately):

    /ul/li/a/text()
    

    Both of the above return "TextX", but if you had:

    <li><a href="">Text4<span>AnotherText3</span>TrailingText</a></li>

    then the latter would return: ["Text4", "TrailingText"], while the former would return "Text4" only.

    Your expression /ul/li/a selects the <a> element itself, and its string value is defined as the concatenation of all of its descendant text nodes in document order, so you get "TextXAnotherTextX".
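
To make the difference concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the asker's actual setup: the sample markup is modeled on the <li> example above, and because the standard library's ElementTree supports only a subset of XPath (no text() step), the helpers direct_text_nodes and string_value are hypothetical names that emulate text() and the element's string value using the .text and .tail attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, modeled on the <li> example above.
SAMPLE = '<ul><li><a href="">Text4<span>AnotherText3</span>TrailingText</a></li></ul>'


def direct_text_nodes(elem):
    """Return the text nodes that are direct children of elem.

    Emulates the XPath step text(): the element's own leading text plus
    the tail text that follows each child element.
    """
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]


def string_value(elem):
    """Return the text of elem and all its descendants, concatenated.

    Emulates the XPath string value of an element node.
    """
    return "".join(elem.itertext())


root = ET.fromstring(SAMPLE)  # root is the <ul> element
a = root.find("./li/a")       # mirrors /ul/li/a

print(direct_text_nodes(a))       # like /ul/li/a/text()
print(direct_text_nodes(a)[:1])   # like /ul/li/a/text()[1]
print(string_value(a))            # like the string value of /ul/li/a
```

With a full XPath engine the expressions from the answer can be used directly; the emulation above only shows which pieces of text each expression picks up.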
