
XML::RSSLite attempts to extract the maximum amount of content from available documents and is less concerned with XML compliance than the alternatives. Rather than rely on XML::Parser, it uses heuristics and good old-fashioned Perl regular expressions. It stores the data in a simple hash structure and "aliases" certain tags so that, when it is done, you can count on having the minimal data necessary for reconstructing a valid RSS file. This means you get the basic title, description, and link for a channel and its items.
This module extracts more usable links by parsing the "scriptingNews" and "weblog" formats in addition to RDF and RSS. It also "sanitizes" the output for best results.
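
To make the regex-plus-hash idea concrete, here is a minimal sketch of the general technique. It is not the module's actual code: the names sketch_parse, first_tag, and trim are hypothetical helpers invented for this illustration, and the sketch handles only plain RSS-style tags, with no aliasing, sanitizing, or support for the other formats.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::RSSLite): strip surrounding whitespace.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

# Hypothetical helper: return the body of the first <tag>...</tag> pair found
# in a chunk of markup, tolerating attributes and mixed case, or undef.
sub first_tag {
    my ($chunk, $tag) = @_;
    return $chunk =~ m{<$tag\b[^>]*>(.*?)</$tag\s*>}is ? trim($1) : undef;
}

# Hypothetical entry point: build a simple hash holding the channel's
# title, description, and link, plus a list of item hashes with the same keys.
sub sketch_parse {
    my ($xml) = @_;
    my %feed = (item => []);

    # Collect each item body with a regex instead of a real XML parser.
    while ($xml =~ m{<item\b[^>]*>(.*?)</item\s*>}gis) {
        my $body = $1;
        my %item;
        $item{$_} = first_tag($body, $_) for qw(title description link);
        push @{ $feed{item} }, \%item;
    }

    # Remove the items so channel-level tags are not confused with item tags.
    (my $rest = $xml) =~ s{<item\b[^>]*>.*?</item\s*>}{}gis;
    $feed{$_} = first_tag($rest, $_) for qw(title description link);

    return \%feed;
}

my $sample = <<'XML';
<rss><channel>
<title>Example Feed</title>
<link>http://example.com/</link>
<description>A sample channel</description>
<item><title>First</title><link>http://example.com/1</link></item>
</channel></rss>
XML

my $feed = sketch_parse($sample);
print "$feed->{title}: ", scalar @{ $feed->{item} }, " item(s)\n";
```

Because nothing here validates the document, malformed or mixed-case markup still yields whatever title, description, and link can be found, which is the trade-off the description above makes in favor of content over compliance.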