XML feeds – Format Selection

I have been spending a fair amount of time lately thinking about the middleware that will power the applications of the ’20s and beyond. Of the wave of metaphors in “Web 2.0” parlance, one dominant theme is the pervasive use of XML feeds to communicate information to consumers.

As Ray Ozzie claims, XML feeds are the equivalent of Unix pipes for the Internet. If Unix pipes fueled composite applications on the workstation, XML feeds are expected to provide the glue for Internet-scale composite applications. In another article, Adam Bosworth claims that XML feeds will save the world.

There are two main flavors of XML feeds as I write – RSS and Atom. I am sure you have read several articles comparing the two. Many such articles end up being technical comparisons for their own sake, often written to promote Atom.

If you are thinking “Why does it matter how many formats there are or that I am using one and not the other?”, read on.

Of course, anything can be computed (within the limits of Turing computability). Therefore, anything that is in RSS can be written as Atom and vice versa, correct? It turns out the answer is no. To quote DeWitt Clinton, the lead engineer of Amazon’s A9 search:

the problem with the RSS syndication format is that it is lossy

Hopefully, this statement gets your attention and persuades you to read his entire article. The lossiness is invisible in simple syndication use cases. However, once you start aggregating feeds and carrying semi-structured information in the feed payload, the problem becomes increasingly apparent.
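To make "lossy" a little more concrete, here is a minimal sketch in Python. It is my own illustration and is not taken from Clinton's article. It relies on one commonly cited difference: Atom's content element carries a type attribute (for example "text" or "html"), while the RSS 2.0 description element has no such attribute. The function names, the sample entry, and the example.org URL are all made up for the illustration, and the converter is deliberately naive.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
NS = {"a": ATOM_NS}


def make_atom_entry(content_type):
    """Build a small, hypothetical Atom entry whose <content> has the given type."""
    return ET.fromstring(
        f'<entry xmlns="{ATOM_NS}">'
        "<title>Markup tips</title>"
        '<link href="http://example.org/posts/1"/>'
        f'<content type="{content_type}">Use &lt;b&gt;bold&lt;/b&gt; sparingly</content>'
        "</entry>"
    )


def atom_to_rss_item(entry):
    """Naive Atom-to-RSS conversion (an assumption, not a standard mapping)."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = entry.findtext("a:title", namespaces=NS)
    ET.SubElement(item, "link").text = entry.find("a:link", NS).get("href")
    # RSS 2.0 <description> has nowhere to record the Atom content type,
    # so that piece of information is dropped at this step.
    ET.SubElement(item, "description").text = entry.findtext("a:content", namespaces=NS)
    return item


# Same characters, different meaning: literal text versus escaped HTML.
as_text = make_atom_entry("text")
as_html = make_atom_entry("html")

rss_from_text = ET.tostring(atom_to_rss_item(as_text), encoding="unicode")
rss_from_html = ET.tostring(atom_to_rss_item(as_html), encoding="unicode")

print(rss_from_text)
print(rss_from_text == rss_from_html)  # both entries collapse to the same item
```

Two Atom entries that mean different things end up as the same RSS item, so a consumer of the RSS version has to guess whether the payload is literal text or markup. With a single feed that guess is usually harmless; an aggregator mixing many feeds has to make it for every source.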
