ASP, CGI and PHP Scripts and Record-Locking: What Every Webmaster Needs To Know

Many of us install server-side (ASP, CGI or PHP) scripts on our web sites, and many of these scripts store data on the server. However, poorly designed scripts can experience performance problems and sometimes even data corruption on busy (and not so busy) web sites.

If you're not a programmer, why should this matter to you? Answer: Even if you're just installing and using server-side scripts, you'll want to make sure that the scripts that you choose don't randomly break or corrupt your data.

First, some examples of the types of scripts which store data on web servers. (Of course, many scripts in each of these (and other) categories are well-designed, and run perfectly well even on very busy web sites.)

1. Follow-up autoresponders typically store the list of subscribers to the autoresponder, as well as where in the sequence of messages each subscriber is. Examples of autoresponder scripts: http://www.scriptcavern.com/scr_email_auto.php

2. Classified ad scripts store (at least) a list of all the classified ads placed by visitors. Examples of this type of script: http://www.scriptcavern.com/scr_classified.php

3. Free for all links scripts store a list of all links posted by visitors. See some example scripts listed at: http://www.scriptcavern.com/scr_ffa.php

4. Top site scripts usually store a list of the members of the top site as well as information about the number of "votes" that each has received. For examples of this type of script, see http://www.scriptcavern.com/scr_topsite.php

So what kind of scripts have problems? And what sort of problems am I talking about? Well, the principal problems all relate to what happens when bits of data from multiple users need to be stored or updated at the same time. Some scripts handle these situations well, but others don't...

DATA CORRUPTION

Here's a common data corruption problem that can occur with many scripts:

1.
When some bit of data needs to be updated, a copy of the server-side script starts running, and then starts updating it.

2. If another user comes along and does an update before the first copy of the script has finished, a second copy of the script starts running at the same time.

3. There are a number of ways things can now go wrong, for example:

   (a) What if the first copy of the script reads in the data, then the second copy reads the same data, then the first copy updates the data, then the second copy updates the data? Answer: any changes made by the first copy of the script can get lost, because the second copy writes back a version based on the old data.

   (b) What if the first and second copies of the script are both adding multiple bits of new data to the store at the same time? For example, imagine each needs to store the headline, description and the name of the person posting a classified ad. Well, what can happen (with some scripts) is the two classified ads can get intermingled, so you might get (for example) HEADLINE-1, DESCRIPTION-1, HEADLINE-2, PERSON-1, DESCRIPTION-2, PERSON-2. Or worse yet, you might get bits of each part of each classified ad, mixed with the bits of the other. This type of thing is usually really bad news, as your data may consequently become unusable from that point on.

Does this sound too unlikely a problem to worry about? Don't bank on it... even if it happens only 1 time in 1,000, or 1 in 10,000, eventually it will happen: You need a solution.

So the real question is: is it possible for programmers to create scripts without these kinds of problems? Fortunately the answer is yes, and there are a number of ways that programmers can address it:

1. They can store each bit of data in a separate file.
This isn't necessarily a total solution by itself (in particular, a script which just does this could still have problems if multiple copies of a script update the same file at the same time), but it does make data corruption less likely, and if corruption does occur, at least it won't corrupt the entire data store in one go.

2. They can use file-locking. This means that if one copy of a script is working with a file, another copy of the script is prevented from working on that file, until the first copy has finished. File-locking works if done correctly, but programming it into a script needs to be done very carefully and precisely, for every single possible case... even a tiny bug or omission can allow the possibility of data-corruption in through the backdoor!

3. They can use a database (such as MySQL) to store the data. Provided the data is properly structured in the database, the database handles the locking automatically. And, as the programmer doesn't have to write their own special locking routines, the possibility of bugs and omissions is much reduced.

PERFORMANCE PROBLEMS

Of course, avoiding having your data corrupted should be the paramount consideration in choosing a script, but is there anything else we need to be concerned about? Answer: Performance.

Of course, all webmasters are aiming to build busy, high-traffic web sites... but will your scripts be able to handle the load?

Go back and re-read the paragraph on file-locking. Now think about what would happen if all the classified ads on your classified page were stored in a single file (or all the links on your top site, or all the subscribers to your autoresponder, etc.). What would happen? Answer: Because each update can only be performed after the previous update has been completely finished, your site may be slow, or even unable to handle all your users' requests.

So what's the solution? There are two options that programmers can use:

1.
They can use lots of small files and file-lock each individually (for example, one per classified ad, one per top site listing, etc.). Of course, this needs to be handled very carefully...

2. They can use a database (like MySQL), as databases can allow one individual record ("row") to be updated even while another is also being updated. (In MySQL this depends on the storage engine: InnoDB locks individual rows, whereas MyISAM locks the whole table.)

IN CONCLUSION

Now, let's summarise:

1. Scripts that store data in files need to use file-locking to avoid data-corruption, and they also need to break the data into separately updateable chunks to avoid performance problems on busy web sites.

2. Scripts that store data in databases (like MySQL), provided of course that they have been properly coded, are usually less likely to suffer from data-corruption or performance problems.

And one additional point:

3. Even the best script is not immune to hard-disk hardware failures, your web host being struck by lightning, and all the other snafus that can happen. So, do take regular back-ups of any data that you can't afford to lose!

In short, even if you're not a script programmer, you need to be aware of data storage issues. In future, when considering a script for your web site, don't be afraid to ask some hard questions about how it stores data and how well it handles multiple users.

This article is Copyright (C) 2005, Answers 2000 Limited.

About the Author: This article was written by Sunil Tanna of Answers 2000. For a directory of ASP, CGI, PHP and Remotely hosted scripts, please visit http://www.scriptcavern.com - and for scripts written by Answers 2000 please visit http://www.scriptrocket.com

-----------------------------------------------------------------

Publication Terms And Conditions:

Answers 2000 Limited grants you a free non-exclusive permission (license) to publish a copy of this article on your web site or opt-in ezine, subject to you complying with ALL of the following:

1.
You must publish the article in full and unedited (except that you may omit this Terms and Conditions section, you may omit the word count, and you may correct any typos that you might find).

2. If you publish on a web site: (i) you must make ALL links clickable, (ii) you may format the article to fit within your web site's design, (iii) you must include the copyright notice and "About the Author" section at the end.

3. If you publish in an ezine: (i) your ezine must be opt-in with your users having specifically elected to subscribe to your ezine and with the ability to unsubscribe at any time, (ii) you must include all link URLs unedited and in full, (iii) you may format the article to your ezine's layout, (iv) you must include the copyright notice and "About the Author" section at the end.

4. To the maximum extent permissible under law, this article is provided "AS IS" without warranties of any kind whether express or implied.

5. These terms and conditions shall be governed by and construed in accordance with the laws of England and Wales. Any disputes arising from matters relating to this article shall be exclusively subject to the jurisdiction of the courts of England and Wales. You agree that any legal action against Answers 2000 Limited (or its directors, officers, or employees) relating to this article or this agreement will be brought in the courts of London, England, however Answers 2000 Limited reserves the right to pursue breach of these terms in any jurisdiction.

There are 1225 words in this article (including title and About the Author section).

-----------------------------------------------------------------
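
To make the file-locking idea from the DATA CORRUPTION section a little more concrete, here is a minimal sketch of a "top site" style vote counter that takes an exclusive lock on its data file before reading and rewriting it. It is only an illustration under several assumptions: it is written in Python rather than ASP, CGI or PHP, it relies on fcntl.flock, which is an advisory lock available on Unix-like systems only, and it uses threads (each opening the file separately) to stand in for several copies of a script running at once. The names add_vote, run_demo and votes.txt are hypothetical and do not come from any script mentioned in the article.

```python
import fcntl
import os
import tempfile
import threading


def add_vote(path):
    """Add one vote to the counter stored in the file at `path`.

    The whole read-modify-write cycle happens while holding an exclusive
    lock, so two copies cannot both read the same old value.
    """
    # Open for reading and writing, creating the file if it does not exist
    # (and without truncating it if it does).
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # wait until no other copy holds the lock
        try:
            text = f.read().strip()
            count = int(text) if text else 0
            f.seek(0)
            f.truncate()               # discard the old value
            f.write(str(count + 1))
            f.flush()                  # make sure the data is written before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def run_demo(workers=8, votes_each=25):
    """Run several simulated 'copies of the script' and return the final count."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "votes.txt")  # hypothetical data file

        def worker():
            for _ in range(votes_each):
                add_vote(path)

        threads = [threading.Thread(target=worker) for _ in range(workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        with open(path) as f:
            return int(f.read())


if __name__ == "__main__":
    print(run_demo())
```

Because every update waits for the lock, no vote should be lost, but all updates to this one file are also forced to happen one after another. That is the trade-off described under PERFORMANCE PROBLEMS, and the reason a script might keep one small file per record or use a database instead.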