C# - Html Agility Pack - Remove Tags by ID or Class
Here is a simplified version of my HTML:
<html>
<body>
  <div id="maindiv">
    <div id="divtoremove"></div>
    <div id="divtokeep"></div>
    <div class="divtoremove"></div>
    <div class="divtoremove"></div>
  </div>
</body>
</html>
I want to remove the divs whose id or class is "divtoremove", and then select the div called "maindiv" (as an HtmlNode).
The result should be:
<div id="maindiv"> <div id="divtokeep"></div> </div>
How can I do this using Html Agility Pack?
Thanks!
The following code is adapted from an Html Agility Pack forum page to fit your needs. Essentially, grab all the divs, loop through them, and check whether the class or id matches. If it does, remove the node.
var divs = htmldoc.DocumentNode.SelectNodes("//div");
if (divs != null)
{
    foreach (var tag in divs)
    {
        // Remove the div if its class matches (case-insensitive).
        if (tag.Attributes["class"] != null && string.Compare(tag.Attributes["class"].Value, "divtoremove", StringComparison.InvariantCultureIgnoreCase) == 0)
        {
            tag.Remove();
        }
        // Otherwise, remove it if its id matches.
        else if (tag.Attributes["id"] != null && string.Compare(tag.Attributes["id"].Value, "divtoremove", StringComparison.InvariantCultureIgnoreCase) == 0)
        {
            tag.Remove();
        }
    }
}
You can combine these if statements into one large if statement, but I thought it read better this way for the answer.
Finally, select the node you are looking for...
var maindiv = htmldoc.DocumentNode.SelectSingleNode("//div[@id='maindiv']");
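For reference, here is a rough end-to-end sketch that puts the steps above together, with the two checks combined into one condition. It assumes the HtmlAgilityPack NuGet package is referenced. The class name DivCleaner, the method RemoveAndSelect and its parameters are made up for this example and are not part of the library.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class DivCleaner
{
    // Hypothetical helper (not part of Html Agility Pack): removes every <div>
    // whose id or class equals `name`, then returns the <div> whose id equals
    // `keepId`, or null if no such div exists.
    public static HtmlNode RemoveAndSelect(string html, string name, string keepId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var divs = doc.DocumentNode.SelectNodes("//div");
        if (divs != null) // SelectNodes returns null when nothing matches
        {
            foreach (var tag in divs)
            {
                // GetAttributeValue returns the supplied default ("") when the attribute is missing.
                var cls = tag.GetAttributeValue("class", "");
                var id = tag.GetAttributeValue("id", "");

                // Both checks combined into a single condition.
                if (string.Equals(cls, name, StringComparison.InvariantCultureIgnoreCase) ||
                    string.Equals(id, name, StringComparison.InvariantCultureIgnoreCase))
                {
                    tag.Remove();
                }
            }
        }

        return doc.DocumentNode.SelectSingleNode("//div[@id='" + keepId + "']");
    }

    public static void Main()
    {
        var html = "<html><body><div id=\"maindiv\">" +
                   "<div id=\"divtoremove\"></div>" +
                   "<div id=\"divtokeep\"></div>" +
                   "<div class=\"divtoremove\"></div>" +
                   "<div class=\"divtoremove\"></div>" +
                   "</div></body></html>";

        var mainDiv = RemoveAndSelect(html, "divtoremove", "maindiv");
        if (mainDiv != null)
        {
            Console.WriteLine(mainDiv.OuterHtml);
        }
    }
}
```

Note that this compares the whole class attribute, same as the code above, so a div with class="divtoremove other" would not be removed. If your real markup uses multiple classes per element, you would need to split the attribute value on whitespace first.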