
Documentation for TagSoup, a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design.
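
Because the parser is SAX-compliant, application code can be written against the standard org.xml.sax interfaces and need not know which parser feeds it. The following is a minimal sketch of that idea, not code taken from this project: the class name ElementLister and the method elementNames are hypothetical, and the demo uses the JDK's built-in XML reader on well-formed input so that it runs without extra jars. It assumes that the parser class (org.ccil.cowan.tagsoup.Parser, as far as I am aware) implements XMLReader and could be passed in instead to handle messy HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of any library.
class ElementLister {

    // Returns the qualified name of every start tag the reader reports, in document order.
    static List<String> elementNames(XMLReader reader, String markup) throws Exception {
        List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        });
        reader.parse(new InputSource(new StringReader(markup)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The JDK's own SAX reader: it accepts well-formed XML only.
        // Assumption: with the parser jar on the classpath, an instance of
        // org.ccil.cowan.tagsoup.Parser could be supplied here instead.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println(elementNames(reader, "<html><body><p>hi</p></body></html>"));
        // prints [html, body, p]
    }
}
```

The handler depends only on XMLReader and ContentHandler, so swapping the parser is a one-line change and the rest of the application design stays the same.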