In the last article I gave a simple introduction to what an RSS feed is and showed how to create XML files in PHP. Now it’s time to explain the RSS file structure along with some basic history.
RSS was not the first format used for site summaries. In 1997 the Channel Definition Format, created by Microsoft, showed up alongside other lesser-known formats, but it wasn’t until RSS that one specific format became widely popular and used by the “masses”. The first RSS version, known as RDF Site Summary, was created by Dan Libby in 1999 to be used in Netscape’s portal. This format became known as RSS 0.9 and was followed by RSS 0.91, based on changes suggested by public feedback.
Around that time, due to Netscape’s lack of interest in the standard, a battle for its ownership started, causing what is now called the RSS Fork. To make a long story short, Dave Winer kept the 0.91 line going while a working group drew up RSS 1.0, published in 2000 and incorporating ideas from Tristan Louis. This standard had a more modular structure but was still RDF based.
Dave Winer stuck with the RSS 0.91 line and, after some bumps and turbulence, finally published RSS 2.0 in 2002, renaming it Really Simple Syndication. This version got rid of RDF and made the syntax much simpler. Another format, Atom, appeared in 2003. Its selling point was that a standards body (the IETF) took it on and kept it evolving, whereas RSS 2.0 was frozen in time.
The dispute between RSS 2.0 and Atom still persists. For this article I chose to implement the RSS 2.0 format, as it’s the one used in my blog’s feeds and in the RSS I implemented over at ComuniWEB.
This is the basic structure of an RSS 2.0 file:
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture, language and protocol at Russia's Star City.</description>
    </item>
    <item>
      <title>Space Exploration</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Sky watchers in Europe, Asia, and parts of Alaska and Canada will experience a partial eclipse of the Sun on Saturday, May 31st.</description>
    </item>
    <item>
      <title>The Engine That Does More</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp</link>
      <description>Before man travels to Mars, NASA hopes to design new engines that will let us fly through the Solar System more quickly. The proposed VASIMR engine would do that.</description>
    </item>
  </channel>
</rss>
Notice that the structure has an XML declaration and an “rss” root element. Under the rss element we have a “channel”, which in a newspaper would be like a section: sports, politics, and so on. Next come the items, which represent the articles themselves.
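To make the rss > channel > item nesting concrete, here is a minimal sketch that walks it. It is written in Python with the standard library's xml.etree.ElementTree, which is my choice for illustration and not the PHP used in part 1. The feed is a trimmed copy of the sample (one item, shortened description), and the summarize function is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample feed: one item, shortened description.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans get ready to work with Russians?</description>
    </item>
  </channel>
</rss>"""


def summarize(feed_xml):
    """Return (channel title, list of item titles) for an RSS 2.0 string."""
    root = ET.fromstring(feed_xml)    # root is the <rss> element
    channel = root.find("channel")    # the <channel> sits directly under <rss>
    item_titles = [item.findtext("title") for item in channel.findall("item")]
    return channel.findtext("title"), item_titles


channel_title, item_titles = summarize(FEED)
print(channel_title)
print(item_titles)
```

The items are looked up on the channel element and not on the root, which mirrors the newspaper picture: articles belong to a section, not directly to the paper.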
<channel>
  <title>Liftoff News</title>
  <link>http://liftoff.msfc.nasa.gov/</link>
  <description>Liftoff to Space Exploration.</description>
  <language>en-us</language>
  <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
  <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Weblog Editor 2.0</generator>
  <managingEditor>email@example.com</managingEditor>
  <webMaster>firstname.lastname@example.org</webMaster>
</channel>
The channel node has three required fields and a number of optional ones. I’ll stick to the important ones:
- title (required): the channel’s title, e.g. ComuniWEB – Last Minute
- link (required): URL of the parent site
- description (required): simple description of the feed and its content
- language (optional): language of the content, e.g. “pt-br”, “en-us”, …
- ttl (optional): “time to live”, the number of minutes a feed may be cached before it is refreshed
To find out about the other sub-nodes, visit the RSS 2.0 specification.
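As a rough sketch of how those fields map to code, the snippet below builds a channel with the three required sub-elements and adds language and ttl only when they are supplied. It uses Python's xml.etree.ElementTree for illustration; build_channel and its parameters are hypothetical names, and the ttl value of 60 is just an example.

```python
import xml.etree.ElementTree as ET


def build_channel(title, link, description, language=None, ttl=None):
    """Build an <rss><channel> tree with the three required sub-elements."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Required sub-elements.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    # Optional sub-elements, written only when a value is supplied.
    if language is not None:
        ET.SubElement(channel, "language").text = language
    if ttl is not None:
        ET.SubElement(channel, "ttl").text = str(ttl)  # minutes
    return rss


rss = build_channel(
    "Liftoff News",
    "http://liftoff.msfc.nasa.gov/",
    "Liftoff to Space Exploration.",
    language="en-us",
    ttl=60,
)
feed_xml = ET.tostring(rss, encoding="unicode")
print(feed_xml)
```

Making the required fields positional and the optional ones keyword arguments means a channel cannot be built without its title, link and description.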
<item>
  <title>The Engine That Does More</title>
  <link>http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp</link>
  <description>Before man travels to Mars, NASA hopes to design new engines that will let us fly through the Solar System more quickly. The proposed VASIMR engine would do that.</description>
  <pubDate>Tue, 27 May 2003 08:37:32 GMT</pubDate>
  <guid>http://liftoff.msfc.nasa.gov/2003/05/27.html#item571</guid>
</item>
This is the key node of an RSS feed. It represents each of the small pieces of information that will be shared, be it an article, an event’s details, comments, or any other type of info. This node has a different policy for its sub-nodes: all of them are optional, but at least one of title or description must be present.
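In the wording of the RSS 2.0 specification, every sub-element of an item is optional as long as a title or a description is present. A minimal sketch of that check follows, again in Python for illustration. The is_valid_item helper is hypothetical, and treating an empty element as missing is my own assumption, since the specification only talks about presence.

```python
import xml.etree.ElementTree as ET


def is_valid_item(item):
    """True when the <item> has a non-empty <title> or <description>."""
    for tag in ("title", "description"):
        text = item.findtext(tag)
        # Assumption: an element with no text counts as missing.
        if text is not None and text.strip():
            return True
    return False


good = ET.fromstring("<item><title>Star City</title></item>")
bad = ET.fromstring("<item><link>http://liftoff.msfc.nasa.gov/</link></item>")
print(is_valid_item(good), is_valid_item(bad))
```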
Here the link element points to the actual article, not just the site. The other fields are straightforward:
- author: author of the item, credits
- category: identifies a category and may have a “domain” attribute that points to the URL listing everything in that category
- comments: URL that points to the comments page
- enclosure: allows external media to be attached, like images, mp3, etc.
- guid: item’s unique identifier
- pubDate: date of publication in this format: Sun, 19 May 2002 15:21:36 GMT
- source: the RSS channel the item originally came from
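The pubDate format is the one detail here that is easy to get wrong by hand. The sketch below, in Python for illustration, builds an item and lets the standard library's email.utils.format_datetime produce the date string. build_item is a hypothetical helper, and the guid and enclosure values in the usage example are made up for the example. The enclosure element takes url, length (in bytes) and type attributes.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_item(title, link, description, published, guid=None, enclosure=None):
    """Build an <item>; `published` must be a timezone-aware UTC datetime."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, "description").text = description
    # usegmt=True ends the string with "GMT" instead of "+0000".
    ET.SubElement(item, "pubDate").text = format_datetime(published, usegmt=True)
    if guid is not None:
        ET.SubElement(item, "guid").text = guid
    if enclosure is not None:
        url, length, mime_type = enclosure
        ET.SubElement(item, "enclosure", url=url, length=str(length), type=mime_type)
    return item


item = build_item(
    "The Engine That Does More",
    "http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp",
    "The proposed VASIMR engine.",
    datetime(2002, 5, 19, 15, 21, 36, tzinfo=timezone.utc),
    guid="http://example.com/items/571",  # hypothetical identifier
    enclosure=("http://example.com/audio/vasimr.mp3", 123456, "audio/mpeg"),  # hypothetical file
)
print(ET.tostring(item, encoding="unicode"))
```

Passing a datetime object and formatting it in one place keeps the weekday and month names consistent, which is harder to guarantee when the string is assembled manually.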
So that’s the rundown on the RSS 2.0 file structure. It is flexible enough to be used with different content types, so you can adapt it to whatever you wish to publish. I hope I shed some more light on the subject and that this article pleases you readers like the first part did. By the way, thanks for the diggs!
Next time around, in part 3, I’ll wrap this up by combining everything we have discussed and publishing the feed.
Part 1: What is RSS and how do I build an XML using XML DOM?