Understanding RSS - Part Five - How the RSS Feed Works & Some Programming Constructs

This is a continuation of my series of articles on RSS.

In my last article, on the major aspects of the "Channel" element, I promised to continue with the sub-elements of the Channel. I will do that in the next article. First, the time has come to explain how an RSS feed works, because it is critical to understand just what some of the RSS template commands want from us, as well as from our readers (in their options command). No doubt I will return to this information towards the end of the series. However, before getting into some possibilities in the template file, it is important to understand just what is going on "behind the scenes."

Years ago, before the Internet, Windows and HTML hit our PC universe, those of us who were programmers were plugging (and possibly blogging) away at our computers. I certainly was one of those nutty programmers trying to decipher the innards of dBase II. At that time, instead of Unicode and the various language sets, you were basically limited to the 256-code table commonly called "ASCII" (strictly speaking, ASCII itself defines only the first 128 codes). What that meant was that any special characters sat in the table above the 128 mark, as everything below it was reserved for English letters and the standard symbols. That was great for those who wanted to see only English. But in languages such as Hebrew, not only did you have a right-to-left orientation, there was also what came to be known, for years, as the infamous ALT-141 character problem. Alt-141 was assigned to the Hebrew character "mem-sofit". The problem was that it was also assigned to a "back space". So instead of getting a "mem-sofit" when they hit the character, users would invariably erase the letter before it! If you were programming a database, for instance, and you wanted the user to input information, you literally had to write an entire key-map utility to trap keys and re-map them while the user was typing in order to display the correct character. It was a royal pain in the butt.

Why do I suddenly go into nostalgia for ASCII and the pre-Windows days? You think we have come a long way? Think again!

One of the most annoying, impossible, crazy conventions is what HTML does with four specific characters. (Indeed, to format this document correctly as an article I must do a great deal more typing.) These are:

  1. < (less-than sign), which is created by typing "&"+"lt;" (The plus was added because otherwise the system would interpret the sequence as a < less-than sign and reproduce it. Ignore the quotation marks and the plus, but don't forget the semicolon ; at the end!)
  2. > (greater-than sign), which is created by typing "&"+"gt;" (The plus was added because otherwise the system would interpret the sequence as a > greater-than sign and reproduce it. Ignore the quotation marks and the plus, but don't forget the semicolon ; at the end!)
  3. & (the ampersand sign itself), which is created by typing "&"+"amp;" (The plus was added because otherwise the system would interpret the sequence as an & and reproduce it. Ignore the quotation marks and the plus, but don't forget the semicolon ; at the end!)
  4. " (quote character), which is created by typing "&"+"quot;" (The plus was added because otherwise the system would interpret the sequence as a " quote and reproduce it. Ignore the quotation marks and the plus, but don't forget the semicolon ; at the end!)
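
For readers who generate their feed text with a script rather than by hand, the four replacements in the list above can be sketched in a few lines of Python. This is only an illustration: the function name escape_feed_text is made up for this article, and it assumes the text has not been escaped already (running it twice would escape the ampersands a second time).

```python
# Hypothetical helper for illustration; the name is not part of any RSS tool.
# The ampersand must be replaced first, otherwise the "&" inside the
# entities produced by the later replacements would be escaped again.
ENTITIES = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),
]


def escape_feed_text(text):
    """Return text with the four special characters replaced by entities."""
    for char, entity in ENTITIES:
        text = text.replace(char, entity)
    return text


print(escape_feed_text('Tom & Jerry <"live">'))
# prints: Tom &amp; Jerry &lt;&quot;live&quot;&gt;
```

Python's standard library offers similar ready-made helpers (html.escape, for example), so in practice you would rarely write this loop yourself. It is spelled out here only to show what happens to each character.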

The ordinary ampersand "&", which we use so often, is not beloved by HTML, and certainly not by RSS. Indeed, try putting the innocent & into the "text" of your RSS file without the normal conventions and the feed won't validate. If you are like me and use the & every other word, you will "grrrrr" and curse up a storm. HTML authors and programmers are familiar with this "little annoyance". Most other people are not. After all, we do see the & all over the web.

So, an IMPORTANT RULE FOR NON-PROGRAMMERS: do not put a plain "&" into the "TEXT" areas of your document, nor the left and right "<" and ">" signs (which are often used in programming). Use the forms listed above instead.
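
To see the rule in action, here is a minimal sketch that feeds two tiny made-up documents to the XML parser in Python's standard library. The feed snippets and the function name is_well_formed are invented for this example, and a real feed validator checks far more than this. It only shows that a plain & is enough to make a parser give up.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Two made-up feeds that differ only in how the ampersand is written.
bad_feed = '<rss version="2.0"><channel><title>News & Views</title></channel></rss>'
good_feed = '<rss version="2.0"><channel><title>News &amp; Views</title></channel></rss>'

print(is_well_formed(bad_feed))   # prints: False
print(is_well_formed(good_feed))  # prints: True
```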

The Cache

Okay, the NEXT piece of information which you really should understand about RSS and RSS feeds. You come across one of those beautiful little orange buttons and say, "Oh boy! GREAT! Here goes another feed into my parser!" Or you are even more tricky and smart, and incorporate one of those feeds into your web pages (we will discuss how to do this in a later article!). Before you just go on your happy way, there is one term you should understand and know: CACHE.

You see, every time you happily tell your parser to re-read the RSS file, it says to itself, "Okay. This owner of ours is a real nuisance. Once again we have to travel the Net, find the file at the web site, make a connection and download the information." And of course, having no ability to tell you to have patience, it goes on its happy way. So it shakes hands with the file on the web and downloads that information. HOWEVER, a few thousand other people are also shaking hands with that same file, and every handshake adds to bandwidth usage. Now, the creator of the file knows this. What the creator also expects is that your RSS reader sets its cache to something normal, such as "reloading" its memory only once every 60 minutes, or 120 minutes, or even only once a day. That is the reason, by the way, that parsers have a cache command, and that RSS templates have date commands and even, as you will learn, a "TTL" ("time to live") command.
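
For the programmers in the audience, the idea can be sketched as follows. This is not how any particular reader is written. The class name CachedFeed and its parameters are invented for this article, and the fetch function is passed in from outside so the sketch stays free of real network code.

```python
import time


class CachedFeed:
    """Hypothetical sketch: re-download a feed only after the cache time has passed."""

    def __init__(self, fetch, ttl_minutes=60, clock=time.time):
        self.fetch = fetch                    # function that downloads the feed
        self.ttl_seconds = ttl_minutes * 60   # how long a downloaded copy stays fresh
        self.clock = clock                    # replaceable so the class can be tested
        self._content = None
        self._fetched_at = None

    def read(self):
        """Return the feed, going out to the web site only when the copy is stale."""
        now = self.clock()
        stale = (
            self._fetched_at is None
            or now - self._fetched_at >= self.ttl_seconds
        )
        if stale:
            self._content = self.fetch()
            self._fetched_at = now
        return self._content
```

With ttl_minutes=60, calling read() a hundred times in one hour "shakes hands" with the web site only once. With ttl_minutes=0, every call downloads the file again.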

And if you are one of those who put an RSS feed up on your web page and set the cache to "0", then every single time someone hits that web page, the page has to go out, find the feed and update the contents. Thus you are adding to bandwidth usage, and some RSS farms request that you be careful how and when you set your cache.
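
A little arithmetic shows the difference. The sketch below is a rough upper bound only: the function name and the figure of 5,000 page views a day are made up for illustration, and it assumes that with a cache of N minutes the page fetches the feed at most once per N-minute window.

```python
import math

MINUTES_PER_DAY = 24 * 60


def max_feed_fetches_per_day(page_views_per_day, cache_minutes):
    """Rough upper bound on how often a page downloads the feed in one day."""
    if cache_minutes <= 0:
        # No cache: every page view goes out and fetches the feed again.
        return page_views_per_day
    refresh_windows = math.ceil(MINUTES_PER_DAY / cache_minutes)
    return min(page_views_per_day, refresh_windows)


print(max_feed_fetches_per_day(5000, 0))    # prints: 5000
print(max_feed_fetches_per_day(5000, 60))   # prints: 24
print(max_feed_fetches_per_day(5000, 120))  # prints: 12
```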

Now, maybe this fifth piece should have waited. But for our next piece, on sub-elements, and the one following it, on the "items", understanding the nature of the text and how RSS works is critical. I hope this helps.
