| Wed, 2007-05-16 18:08 — David Herron |
ROME offers the SyndFeed object (and its implementation, SyndFeedImpl) as the in-memory representation of an RSS or ATOM feed. This object abstracts away the differences between the two formats, so it is the same object regardless of the format the feed uses. The feed-aggregator-tools pass SyndFeed objects from one processing step to the next.
Let's look at how to get a feed into memory.
The feedget shows an example of reading and saving a feed. The first line simply reads the feed into memory; the source can be either a URL or a file name. The second line writes it to a file, atom.xml. If you prefer to save the feed in RSS format, use 'saveToRSS' instead.
feed thefeed = new feed(args[0]);
thefeed.saveToAtom("atom.xml");
The dumpfeed script uses feedget and then prints the result, allowing you to quickly check whether a given feed is functional. Its verbose output is mainly useful for confirming that the toolchain can process the feed and for inspecting the feed's details.
The feedarchive script approaches this a little differently. In addition to retrieving the feed from its source, it maintains an archive file of the prior feed contents. The publisher of a feed usually limits the number of entries in it, for example to 30 entries, or to entries less than a week old. But what if you want to maintain a longer-term list of items? This is where feedarchive helps: it lets you archive the feed items over a longer period.
The feedarchive script takes two arguments:
groovy feedarchive.groovy feed-url archive-file
It retrieves the specified feed and then adds entries from that feed to the contents of archive-file. The archive-file must be writable, and it is stored in ATOM format. An entry is added to archive-file only if its URI does not match the URI of an item already in archive-file.
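The "add only what isn't already archived" rule can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the actual feedarchive code: the Entry class and the merge method are hypothetical stand-ins (the real tools work with ROME entry objects), and appending new entries at the end of the archive is an assumption about ordering.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ArchiveMerge {
    // Hypothetical stand-in for a feed entry; only the URI matters for the merge.
    static final class Entry {
        final String uri;
        final String title;
        Entry(String uri, String title) { this.uri = uri; this.title = title; }
    }

    // Returns the archived entries followed by each fetched entry whose URI
    // is not already present. Duplicate URIs within the fetched list are
    // also added only once.
    static List<Entry> merge(List<Entry> archive, List<Entry> fetched) {
        Set<String> seen = new HashSet<>();
        for (Entry e : archive) {
            seen.add(e.uri);
        }
        List<Entry> merged = new ArrayList<>(archive);
        for (Entry e : fetched) {
            if (seen.add(e.uri)) {   // add() returns false if the URI was already seen
                merged.add(e);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Entry> archive = new ArrayList<>();
        archive.add(new Entry("urn:example:1", "First post"));
        archive.add(new Entry("urn:example:2", "Second post"));

        List<Entry> fetched = new ArrayList<>();
        fetched.add(new Entry("urn:example:2", "Second post"));
        fetched.add(new Entry("urn:example:3", "Third post"));

        for (Entry e : merge(archive, fetched)) {
            System.out.println(e.uri + " " + e.title);
        }
    }
}
```

Running the script repeatedly against the same feed therefore leaves the archive unchanged until the publisher adds something new, which is what lets the archive grow past the publisher's own entry limit.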