Informa and Custom XML Namespaces in RSS

While integrating a custom search application into a Java-based web application, I needed to access properties in custom namespaces through the Informa RSS library. Informa had been used for RSS parsing in the previous versions of the web application, and the people who developed the original version had decided to fork the library into their own version, adding several methods such as .get<NameOfCustomProperty>. After thinking about this for approximately two seconds, I decided that having to support and modify a custom version of Informa was not the right track for us.

My initial thought was that their decision to customize Informa had to come from the idea that Informa did not support custom namespaces out of the box. I did a few searches over at Google and found nothing useful. Reading through the documentation for Informa didn’t do me any good either, so I tried to find an alternative library instead. I did a bit of searching for that too, and stumbled across a hit for one of the util classes for Informa (again). That class did support custom namespaces, so the backend support was there at least. Then it struck me while reading the documentation for Informa and ChannelIF again: Informa did support it, as ChannelIF inherits the methods from further up in the hierarchy. The getElementValue and getElementValues methods of the ChannelIF and ItemIF interfaces allow you to fetch the contents of elements with custom namespaces in a very straightforward manner.

System.out.println(item.getElementValue("exampleNS:field"));

This simply returns the string contained between <exampleNS:field> and </exampleNS:field>.
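To make the lookup concrete without requiring Informa on the classpath, here is a minimal sketch that performs the same kind of lookup with only the JDK's DOM parser. It is not Informa code: the class name, the sample feed and the namespace URI are made up for illustration, and whether Informa trims whitespace the way this sketch does is an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Stand-alone illustration using only the JDK; this is NOT the Informa API.
class NamespacedFieldSketch {

    // Hypothetical sample feed; the exampleNS prefix and its URI are made up.
    static final String SAMPLE_FEED =
        "<rss version=\"2.0\" xmlns:exampleNS=\"http://example.com/ns\">"
      + "<channel><title>Search results</title>"
      + "<item><title>First hit</title>"
      + "<exampleNS:field>custom value</exampleNS:field>"
      + "</item></channel></rss>";

    // Returns the trimmed text of the first element with the given qualified
    // name inside the first <item>, or null if there is no such element.
    static String firstItemValue(String xml, String qualifiedName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagName("item");
            if (items.getLength() == 0) {
                return null;
            }
            Element item = (Element) items.item(0);
            // getElementsByTagName matches on the qualified name, prefix included.
            NodeList fields = item.getElementsByTagName(qualifiedName);
            return fields.getLength() == 0
                ? null
                : fields.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the feed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstItemValue(SAMPLE_FEED, "exampleNS:field"));
    }
}
```

With Informa itself, the one-liner above does this work for you once the feed has been parsed into items.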

Hooray! We now have support for these additional fields, and we do not have to manually keep a forked copy of Informa in our application in sync with the official releases. Why the original developers decided to fork the Informa library to add their own properties I may never know, but I’ll update this post if they decide to step forward!

5 thoughts on “Informa and Custom XML Namespaces in RSS”

  1. Hi,
    I’m looking at something like this myself and came to a fairly similar conclusion (“Hooray! there’s a way to get ‘itemImage’ field after all!”).
    Now my problem is: can I publish an RSS feed from Informa that includes the ‘itemImage’ field? I have yet to find a way to _add_ custom fields to Item objects or ItemIF implementers, and I suspect it’s impossible without modifying the source (or maybe subclassing – haven’t checked that one yet).

    Ta!

    M0les.

  2. My best suggestion for this after reading through the Informa API just now is to use the constructor for the basic Item which takes an org.jdom.Element as the first argument:

    Item(org.jdom.Element itemElement, ChannelIF channel, String title, String description, URL link)

    I haven’t tried it out and couldn’t find any documentation regarding the argument, but I guess you’ll be able to provide other, custom child elements through that element (as it is probably used as the base element for the item). I could try to write some more code testing it later tonight.

  3. Yup, was actually walking down that path, but hit a snag when I realised the RSS_2_0_Exporter ignores the underlying JDom structure you build (well, it’s hidden inside the Items, so the Exporter can only call the “known child element” getter methods).

    To me, the “best” way to solve this problem would be to redesign the Exporter, Item (and probably most other classes) so that they better support subclassing and extension. Specifically, the Exporters should allow subclasses to hook in and modify the JDom structure for the channel and also for each item prior to outputting.

    However that is not a small or simple mod, so for the time being, I’m going to attempt to build my own JDom structure for my output RSS stream.

    Thanks for your thinking about this!

    M0les.

  4. OK, I went-ahead and actually worked-out the redesign to Informa that would allow extension through subclassing to support “special” properties for channels and items (like the “itemImageURL” I’m after). I’ll see if I can inject it into the Informa mailing lists, but here’s what it is for the record (And it’s not that drastic):

    Each of the existing ChannelExporterIF implementers should call out to a “protected” (or at least not “private”) method to generate each “item” JDOM Element. I implemented this for the RSS 2.0 exporter simply by taking the entire block of code inside the write() method’s item-iterator loop and putting it into the new method:

    protected Element getItemElement( ItemIF item )

    (Plus minor tweaks for instantiation/return of the resultant Element object).

    Anyway, having made this change, the Informa library behaves identically to the way it did before. However, now I can subclass the Item class to allow further parameter parsing. I could already subclass the ChannelBuilder class to allow reading/recognition of the relevant JDom tags to produce the new Item subclass instances. The new functionality is that now I can subclass the RSS 2.0 Exporter to produce a “customised” JDOM “item” Element from the new Item subclass instances sent to it.
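The shape of that change can be sketched with stand-in classes. This is only an illustration under stated assumptions, not the actual Informa patch: SketchItem, ImageItem, SketchExporter and ImageExporter are hypothetical names, the JDK's org.w3c.dom is used instead of JDOM so the sketch compiles without extra libraries, and the hook therefore takes an extra Document argument that the JDOM-based method above would not need.

```java
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-in for an Informa item; not the real ItemIF.
class SketchItem {
    final String title;
    SketchItem(String title) { this.title = title; }
}

// Item subclass carrying the extra "itemImage" property.
class ImageItem extends SketchItem {
    final String imageUrl;
    ImageItem(String title, String imageUrl) {
        super(title);
        this.imageUrl = imageUrl;
    }
}

// Stand-in for the RSS 2.0 exporter after the refactoring.
class SketchExporter {
    Document write(List<? extends SketchItem> items) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
        Element rss = doc.createElement("rss");
        rss.setAttribute("version", "2.0");
        Element channel = doc.createElement("channel");
        doc.appendChild(rss);
        rss.appendChild(channel);
        for (SketchItem item : items) {
            // The former loop body now lives in an overridable hook.
            channel.appendChild(getItemElement(doc, item));
        }
        return doc;
    }

    // Default behaviour: only the "known" child elements are written.
    protected Element getItemElement(Document doc, SketchItem item) {
        Element element = doc.createElement("item");
        Element title = doc.createElement("title");
        title.setTextContent(item.title);
        element.appendChild(title);
        return element;
    }
}

// Subclass that adds the custom field without touching the base exporter.
class ImageExporter extends SketchExporter {
    @Override
    protected Element getItemElement(Document doc, SketchItem item) {
        Element element = super.getItemElement(doc, item);
        if (item instanceof ImageItem) {
            Element image = doc.createElement("itemImage");
            image.setTextContent(((ImageItem) item).imageUrl);
            element.appendChild(image);
        }
        return element;
    }
}
```

The base exporter keeps producing exactly the same output as before, and only callers that instantiate the subclass get the extra element.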

    It was tempting to put the RSS 2.0 Exporter change into the ChannelExporterIF interface definition (so all implementers must implement the new method), but that would mandate the new method be public, which I’m not sure is sensible: There’s no _need_ for an external entity to be able to render an ItemIF object into a JDom “item” Element within the context of a particular ChannelExporterIF instance. Nevertheless the extensibility benefits may outweigh this slight inelegance (I’ll leave it up to “the mailing list” to decide).

    Thanks again

  5. Simply awesome work.

    Thanks for sharing the information, this is bound to be helpful for quite a number of people. I’ll be sure to keep an eye on the mailing list in the future, as it’s been quite some time since the last Informa release.

    I also have another issue that I’d like to add support for, which is based on extending the current RSS schema to support several channels in one document. This should in reality be quite easy to do, as the channel element currently is just defined as a single element in the DTD. We have a custom Informa version in one of our projects which supports this, and it allows you to use RSS to return results from several sources in one single response. I’ll see if I can get together proper support for this later.

    Yet again, thanks for your great work.
