Due to the nature of my job a lot of script writing (SQL in my case) is done on the fly and is often written in an ad hoc fashion. This is fine for my personal use (that is until I look back an indeterminate time later and have to decipher my own work whilst trying to remember my reasoning) but if a bus that has my name on it were ever to strike me down my successors would undoubtedly struggle with some of my scripts. So, here are my top 5 tips on how to check and maintain the integrity of scripts and to make them accessible for other people to read, run and adapt if necessary.
1. Write as few lines of code as possible
It may sound obvious but the more lines of code to read the harder it is for another person to follow what has been written. If the script is the type which needs to be run frequently with changes made to it for each new iteration then it also follows the more code there is, the more likely it can be corrupted without even realising (for example, find and replace can have unforeseen and disastrous consequences). There are some online tools that can help you with formatting your scripts more efficiently e.g.
My usual method for long scripts is to pull any code that will never change into a stored procedure as well as to create my own functions (reducing all that code to just one line) to compartmentalise everything which means code should never be accidentally altered and lines of code should never be missed if the script is run bit by bit which could severely affect the results. Of course, the following rules that apply to the main script should also apply to the stored procedures and functions. Writing scripts more efficiently will also allow them to be run faster.
2. Comments
The boring but essential element to script writing where you add an inert description of the purpose of each line of code so future users will know exactly what each command should do without having to work it out blindly for themselves. Adding comments will also make it easier to find problems should something go wrong.
3. Output results and Error Checking
Whilst comments are hidden within the code it is also advisable to include plenty of outputs for crucial results so each step can be checked as the script is run before moving on to the next part. For example in the case of an SQL script, if you create two tables of email addresses which should have no duplicates between them then you would include an extra query to output any emails that appear in both tables and you would only move on to the next command when the result is zero – otherwise you will have to go back through the script to spot your mistake. Adding in such safety protocols is actively encouraged.
4. Testing, testing, 1… 2… 3
Testing is crucial to ensuring a script is suitable for use. There is nothing worse than setting a script to run, especially overnight, only to find it has failed in the morning. For commands that perhaps delete or alter things you can run select commands first to evaluate what the command does to make sure you are operating on your desired criteria. When logic is involved, such as selecting on multiple criteria with ANDs and ORs, always perform a sanity check first to make sure your logic is sound. Other issues may arise if field types are not correct, for example running on dates which are saved as variable characters and not actual dates will give erroneous results. With lots of things that can potentially go wrong – testing is your friend.
5. Backups
If whatever you are running has the potential to alter data it always advised to create a backup copy of any tables or data sources so if you do irreparably destroy anything you can just revert back and won’t have single-handedly brought down your organisation.