Geocoding in the UK
Sunday, August 16th, 2009
The art of geocoding addresses in the UK, as I previously explained, is a soul-destroying process, fraught with inaccuracy, bugs and convoluted workarounds. And for all that work you end up with a set of points, many of which are probably somewhat inaccurate and at least some of which are completely wrong. UK addresses (and probably those elsewhere in the world) are complicated creatures, which Google’s geocoding engine often interprets wrongly.
Postcodes, on the other hand, are rather easier; there is a well-defined relationship between a UK postcode and its corresponding (usually pretty small) piece of the British countryside. But Google’s geocoding API will only return a geocode for the postcode sector (i.e. it will give a geocode for LL12 5 when you searched for LL12 5TH). However, someone did figure out a way of using Google’s local search API combined with Google Maps to geocode full UK postcodes. Since he blogged about it the API has changed, so below is an outline of how to geocode a batch of postcodes in the UK using just some simple PHP, the current Google AJAX Search API and a little JavaScript (jQuery isn’t essential, but cuts down on coding a bit). The JavaScript is the crucial step.
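To make the difference between a sector and a full postcode concrete, here is a minimal sketch that splits a postcode into its parts. The function name splitPostcode is my own invention for illustration, and the pattern assumes the usual shape of a full postcode (an outward code, then one digit and two letters); it is not a complete validator.

```javascript
// Split a UK postcode into outward code, sector and unit.
// Assumption: a full postcode ends in one digit followed by two letters.
function splitPostcode(raw) {
  // Drop all whitespace and normalise case, e.g. "ll12 5th" -> "LL125TH"
  const compact = String(raw).replace(/\s+/g, "").toUpperCase();
  const match = /^([A-Z]{1,2}[0-9][A-Z0-9]?)([0-9])([A-Z]{2})$/.exec(compact);
  if (!match) {
    return null; // not a full postcode (for example, only a sector)
  }
  const outward = match[1];
  const sectorDigit = match[2];
  const unit = match[3];
  return {
    outward: outward,                           // "LL12"
    sector: outward + " " + sectorDigit,        // "LL12 5"
    unit: unit,                                 // "TH"
    full: outward + " " + sectorDigit + unit    // "LL12 5TH"
  };
}
```

So for LL12 5TH the sector is LL12 5, which is as far as the plain geocoder gets; the final two letters are what the local search trick recovers.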
Assuming you have a database full of postcodes and id numbers, and 2 empty columns to store latitude and longitude values, this is how it’s done. (Download source geocode.zip).
1. Create an HTML page geocode.html with the following content:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title></title>
<meta name="description" content="" />
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<link rel="stylesheet" href="" type="text/css" media="screen" />
<script type="text/javascript" src="jquery-1.3.2.js"></script>
<script src="http://www.google.com/jsapi" type="text/javascript"></script>
<script type="text/javascript" src="geocode.js"></script>
</head>
<body>
<div id="counter"></div>
</body>
</html>
(Make sure you specify the correct location for your local JavaScript files)
2. Create a PHP file (in the same directory), geocode.php, with the following rough structure (it will only be accessed via ajax, so it is very stripped down):
<?php
require_once ('mysqlConnect.php'); //or other database connection details
if($_GET)
{
//var_dump($_GET);
update_record();
send_new_data();
}
//gets the next record without a geocode and sends the id and postcode to the browser
function send_new_data() {
$query = @mysql_query("SELECT id, postcode FROM geocode_table WHERE lat = '' AND postcode != '' ORDER BY id LIMIT 1");
if(($query) && mysql_num_rows($query)) {
$row = mysql_fetch_array($query, MYSQL_ASSOC);
echo $row['id'].','.$row['postcode'];
} else {
echo 'stop';
}
}
//updates the last record with data sent from browser
function update_record() {
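$id = (int) $_GET['id']; //cast to an integer so it is safe to embed in the query
$lat = mysql_real_escape_string($_GET['lat']); //escape browser-supplied values before building SQL
$lng = mysql_real_escape_string($_GET['lng']);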
if($id > 0)
{
$update = "UPDATE geocode_table SET lat = '".$lat."', lng = '".$lng."' WHERE id = ".$id;
$result = @mysql_query($update);
if (!$result) {
die('Invalid query: ' . mysql_error());
}
}
}
?>
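The PHP above implies a very small wire format: the browser sends id, lat and lng as GET parameters, and the server replies with either "id,postcode" or the word "stop". As a rough sketch of the browser side of that exchange (parseReply and buildUpdateQuery are hypothetical names of mine, not taken from the downloadable geocode.js), something like this would do:

```javascript
// Parse the plain-text reply echoed by send_new_data().
// Returns null for "stop" (or anything unparseable), otherwise {id, postcode}.
function parseReply(text) {
  const body = String(text).trim();
  if (body === "" || body === "stop") {
    return null;
  }
  const comma = body.indexOf(",");
  if (comma === -1) {
    return null;
  }
  const id = parseInt(body.slice(0, comma), 10);
  if (isNaN(id)) {
    return null; // e.g. an error message rather than a record
  }
  return { id: id, postcode: body.slice(comma + 1).trim() };
}

// Build the query string that update_record() reads from $_GET.
function buildUpdateQuery(id, lat, lng) {
  return "id=" + encodeURIComponent(id) +
    "&lat=" + encodeURIComponent(lat) +
    "&lng=" + encodeURIComponent(lng);
}
```

Treating an empty or malformed reply the same as "stop" is my assumption; it simply means the loop halts rather than hammering the server if something goes wrong.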
3. Create a JavaScript file geocode.js, saved in the same directory again (I would paste it here but it keeps breaking WordPress, so grab it from the download above)
4. Running the code
Once you’ve altered the database connection details and the SQL queries to suit your setup, simply open geocode.html in your browser. A counter will tell you which record you’re on. To stop the code, simply close your browser or browser tab.
How it all works
In a nutshell (ignoring the special case of starting off the loop) the code repeatedly performs the following process:
….in geocode.php, send_new_data() finds a record which has no latitude value and sends its id number and postcode as an ajax response to set_and_get_next(). This keeps track of the id in a global variable and sends the postcode to getPointFromPostcode(), which uses Google’s local search to get a geocode. Once it’s found a geocode it passes it to set_and_get_next(), which sends it to geocode.php in an ajax request. There update_record()… well… updates the record, and send_new_data() finds a record which has no la….
Compared to my previous approach of iterating a script over large sets of data, using ajax is very sleek. As with a pure PHP script, I can launch it from a browser, with much of the resource-intensive work taking place on my server or Google’s. But with ajax there’s no problem with the browser timing out from time to time, or baulking at the number of times a page is requested. It’s a little harder to code, and probably less efficient… but I like it. And I’ll definitely be using my shiny new geocoded postcode data.
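The loop described above can be sketched roughly as follows. This is not the geocode.js from the download, just my own outline of the same control flow: set_and_get_next() and getPointFromPostcode() keep the names used above, while runGeocodeLoop and its server, geocode, onProgress and onFinish parameters are hypothetical stand-ins for the ajax call to geocode.php, the Google local search lookup, the counter update and the end of the run.

```javascript
// Drive the loop: fetch a record, geocode it, send the result back, repeat.
// server(update, done): sends {id, lat, lng} and calls done({id, postcode}) or done(null)
// geocode(postcode, done): calls done({lat, lng}) or done(null) if nothing was found
function runGeocodeLoop(server, geocode, onProgress, onFinish) {
  let currentId = 0; // id of the record currently being geocoded

  function set_and_get_next(point) {
    // On the very first call there is nothing to save, so send id 0,
    // which update_record() ignores because of its "$id > 0" check.
    const update = point
      ? { id: currentId, lat: point.lat, lng: point.lng }
      : { id: 0, lat: "", lng: "" };
    server(update, function (record) {
      if (!record) {
        onFinish(null); // server said "stop": no records left
        return;
      }
      currentId = record.id;
      onProgress(record.id);
      getPointFromPostcode(record.postcode);
    });
  }

  function getPointFromPostcode(postcode) {
    geocode(postcode, function (point) {
      if (!point) {
        // Halt instead of asking for the same ungeocoded record forever.
        onFinish("no result for " + postcode);
        return;
      }
      set_and_get_next(point);
    });
  }

  set_and_get_next(null); // special case: start the loop
}
```

In the page itself, server would presumably wrap a jQuery ajax request to geocode.php and onProgress would write the id into the counter div. Stopping on a failed lookup is a design choice of mine: because the query always picks the first record with an empty lat, a postcode that can’t be geocoded would otherwise be served up again and again.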

