Bounce back from typos

In the world of email marketing you want your newsletters and offers to reach as many people as possible, but the sad fact is people often suffer inexplicably from fat fingers and as a result your database may well be riddled with annoying, yet easy to fix typos. By creating and running an “email hygiene” procedure on email addresses you can easily repair a number of these typos which include such things as accidentally hitting “,” instead of “.” or typing “co.ku” instead of “co.uk” and a whole assortment of other crazy mistakes. Looking across a spectrum of clients we estimate around 0.5%-1% of emails can potentially be fixed by employing such a procedure. Now, this may not sound like a lot, but the bigger your database, the bigger impact this will have.

The most laborious part to this whole procedure is compiling both a list of incorrectly spelt suffixes e.g. couk instead of co.uk and also a list of known incorrectly spelt domains by analysing thousands of hard bounces. For example, if you get the message “5.1.2 (bad destination system: no such domain)” and you see an address like “xxx@hhotmail.com” then you know it should be @hotmail.com. So you can now create your tables in MySQL e.g. named EmailHygieneSuffixes and EmailHygieneDomains which contain these corrections and any others that you find during your analysis (the advantage of storing the corrections in tables rather than running separate queries being you can simply append any new corrections you find to these tables without the need to update any procedures themselves):

EmailHygieneSuffixes:

EmailHygieneDomains:

So far we have generated a list of nearly 700 fixable domains (and counting) in the war against typos (mostly from webmail domains like yahoo, outlook, gmail etc. as corporate domains are much harder to validate) but we are constantly adding to the list as people find new and clever ways to misspell things.

So, what steps are required in creating this “email hygiene” procedure in MySQL?

Ensure you have a table of emails including the following fields:

Email
OriginalEmail
Domain
NewDomain
Prefix

Set OriginalEmail to Email to keep a record of the original spelling.

lrim() and rtrim() your email address to remove any extraneous characters.

Extract the domain: substring(email FROM POSITION(‘@’ IN email)+1).

Extract the email prefix: left(email, POSITION(‘@’ IN email)-1).

Remove inner spaces and replace commas with full stops using the replace() procedure as well as any other characters that do not belong in an email.

Replace any matching incorrect Old Prefixes with the correct NewPrefix using a while loop to cycle through the EmailHygieneSuffixes table pulling in each old and new prefix as part of a replace() query.

Match Domain to IncorrectDomain in EmailHygieneDomains and replace NewDomain with CorrectDomain.

Rebuild your email using concat() to add together the prefix, an @ sign, then NewDomain.

Find out how many addresses you’ve corrected by counting where Email <> OriginalEmail.

It is definitely a project for a rainy day as there is an awful lot of manual work required to compile the lists of corrections, but imagine you have a database of 1 million records, you could be missing out on as many as 10,000 addresses that are easily fixable. Not to mention, this also has the added benefit of lowering the number of hard bounces in your sends which should also help towards keeping your sender reputation a healthy one.