
Htmlcxx is a simple non-validating css1 and html parser for c++. although there are several other html parsers available, htmlcxx has some characteristics that make it unique:
* stl like navigation of dom tree, using excellent tree.hh library from
kasper peeters
* it is possible to reproduce exactly, character by character, the original
document from the parse tree
* bundled css parser
* optional parsing of attributes
* c++ code that looks like c++ (not so true anymore)
* offsets of tags/elements in the original document are stored in the nodes
of the dom tree
the parsing politics of htmlcxx were created trying to mimic mozilla firefox (http://www.mozilla.org) behavior. so you should expect parse trees similar to those create by firefox. however, differently from firefox, htmlcxx does not insert non-existent stuff in your html. therefore, serializing the dom tree gives exactly the same bytes contained in the original html document.
this package contains the c++ runtime library for css parsing.