Even Google Maps Enterprise restricts its geocoding/reverse-geocoding services (100k, if my memory serves me correctly). So I had to roll out our own service to allow reverse geocoding of millions of lonlats. Have a look at Nominatim; yes, it's open source. If you need to get it up and running, have a read of my Nominatim installation via Homebrew on OS X.

The Nominatim web interface, which returns XML or JSON depending on the format parameter, is written in PHP.

Anyway, I wanted to use this web service from our Rails 3 app. It would also be good to skip the Nominatim web service when a lonlat has already been requested, i.e. caching.

Python-Nominatim https://github.com/rdeguzman/python-nominatim

This project was forked from Austin Gabel's python-nominatim. I added the ability to pass a base_url to the classes and added reverse_geocode.py. So, assuming you have Python installed, you can do a reverse geocode like this…

from nominatim import ReverseGeocoder
client = ReverseGeocoder("http://127.0.0.1/nominatim/reverse.php?format=json")
response = client.geocode(-37.856206, 145.233980)
 
print response['full_address']
#Amesbury Avenue, Wantirna, City of Knox, 3152, Australia
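
For readers curious about what a client like ReverseGeocoder has to do, here is a minimal sketch in Python 3. It is not the library's actual code. It assumes the reverse endpoint accepts lat and lon query parameters and returns JSON with a display_name field; build_reverse_url and parse_full_address are hypothetical names, and the response body is a hand-written sample so nothing touches the network.

```python
import json
from urllib.parse import urlencode


def build_reverse_url(base_url, latitude, longitude):
    """Append lat/lon query parameters to a base URL that may already carry a query string."""
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode({"lat": latitude, "lon": longitude})


def parse_full_address(body):
    """Return the human-readable address from a JSON response body, or None if it is missing."""
    return json.loads(body).get("display_name")


url = build_reverse_url(
    "http://127.0.0.1/nominatim/reverse.php?format=json", -37.856206, 145.233980
)
# Hand-written sample body (assumption), shaped like a reverse-geocoding JSON reply.
sample_body = '{"display_name": "Amesbury Avenue, Wantirna, City of Knox, 3152, Australia"}'

print(url)
print(parse_full_address(sample_body))
```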

PL/PYTHON
Now we wrap this Python code in PL/Python so Postgres can call it. Check out setup.sql.

CREATE PROCEDURAL LANGUAGE 'plpythonu' HANDLER plpython_call_handler;
 
CREATE OR REPLACE FUNCTION reverse_geocode(geocoding_url text, latitude FLOAT, longitude FLOAT) RETURNS
  text
  AS
  $$
    import nominatim
    client = nominatim.ReverseGeocoder(geocoding_url)
    response = client.geocode(latitude, longitude)
    return response['full_address']
  $$
  LANGUAGE 'plpythonu';

With the snippet above, we can now call this with a regular SELECT statement…

SELECT reverse_geocode('http://127.0.0.1/nominatim/reverse.php?format=json', -37.856206, 145.233980); 
                       reverse_geocode
  ----------------------------------------------------------
   Amesbury Avenue, Wantirna, City of Knox, 3152, Australia
  (1 row)

Rails ActiveRecord

    create_table :locations do |t|
      t.float    :latitude
      t.float    :longitude
      t.text     :address
    end

In ActiveRecord, I created a Location model for the table above and exposed a reverse_geocode method, shown below.

class Location < ActiveRecord::Base
 
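  # Returns the first Location row built from the SQL function's result.
  def self.reverse_geocode(geocode_url, lat, lon)
    # Bind parameters so ActiveRecord quotes the URL and coordinates for us.
    sql = ["SELECT reverse_geocode(?, ?, ?) AS address, ? AS latitude, ? AS longitude",
           geocode_url, lat, lon, lat, lon]
    loc_array = self.find_by_sql sql
    loc_array[0]
  end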
 
end

So now, in one of my models, I could simply do..

class ActiveSession < ActiveRecord::Base
...
  def location_address
    if self.has_gps?
      loc = Location.reverse_geocode('http://path/to/reverse.php?format=json', self.gps_latitude, self.gps_longitude)
      loc.address
    else
      nil
    end
  end
...
end

In the view, we can simply call model.location_address to retrieve the location details. Below is a code snippet which creates a Google marker and adds the location details to the infoWindow.

<% location = active_session.location_address %>
 
var latlong = new google.maps.LatLng(<%= active_session.gps_latitude %>, <%= active_session.gps_longitude %>);
 
var content = '<div style="width: 300px;">';
content = content + '<p><%= escape_javascript location %></p>';
content = content + '<p><%= active_session.gps_longitude %>,<%= active_session.gps_latitude %></p>';

marker.png

Caching
Our last step is to improve performance via caching. I have opted to do this on the PL/Python side, but backed by a Rails ActiveRecord model/table. This way, the Rails code has no idea whether the result came from the cache when it calls model.location_address. Below, I wrap the new reverse_geocode PL/Python function in a Rails migration.

class CreateFunctionReverseGeocoder < ActiveRecord::Migration
  ActiveRecord::Base.connection.schema_search_path = "public"
 
  def self.up
    execute 'CREATE OR REPLACE FUNCTION reverse_geocode(geocoding_url text, latitude float, longitude float) RETURNS
      text
      AS
      $$
        plan = plpy.prepare("SELECT address FROM locations WHERE latitude = $1 AND longitude = $2", [ "float", "float" ])
        rv = plpy.execute(plan, [ latitude, longitude ], 1)
 
        if rv.nrows() > 0:
          result = rv[0]["address"]
        else:
          import nominatim
          client = nominatim.ReverseGeocoder(geocoding_url)
          response = client.geocode(latitude, longitude)
          result = response["full_address"]
          insert_plan = plpy.prepare("INSERT INTO locations(latitude, longitude, address) VALUES($1, $2, $3)", ["float", "float", "text"])
          plpy.execute(insert_plan, [ latitude, longitude, result ])
 
        return result
      $$
      language plpythonu;'
  end
 
  def self.down
    execute 'DROP FUNCTION IF EXISTS reverse_geocode(text, double precision, double precision);'
  end
end
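
The lookup-then-insert flow in the function above can be tried outside Postgres. The following is a rough Python 3 sketch of the same idea, not the PL/Python function itself. It assumes an in-memory SQLite table standing in for the locations table, and a hypothetical fake_fetch function standing in for the Nominatim call, so the number of "remote" calls can be counted.

```python
import sqlite3


def make_cache():
    """Create an in-memory stand-in for the locations table (assumption: SQLite, not Postgres)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE locations (latitude REAL, longitude REAL, address TEXT)")
    return conn


def cached_reverse_geocode(conn, fetch_address, latitude, longitude):
    """Return the cached address if present; otherwise fetch it, store it, and return it."""
    row = conn.execute(
        "SELECT address FROM locations WHERE latitude = ? AND longitude = ? LIMIT 1",
        (latitude, longitude),
    ).fetchone()
    if row is not None:
        return row[0]
    address = fetch_address(latitude, longitude)
    conn.execute(
        "INSERT INTO locations (latitude, longitude, address) VALUES (?, ?, ?)",
        (latitude, longitude, address),
    )
    conn.commit()
    return address


calls = []


def fake_fetch(latitude, longitude):
    """Hypothetical stand-in for the remote geocoder; records each call it receives."""
    calls.append((latitude, longitude))
    return "Amesbury Avenue, Wantirna, City of Knox, 3152, Australia"


conn = make_cache()
first = cached_reverse_geocode(conn, fake_fetch, -37.856206, 145.233980)
second = cached_reverse_geocode(conn, fake_fetch, -37.856206, 145.233980)
print(first == second, len(calls))
```

Note that the cache key is an exact match on both coordinates, as in the SQL above, so two points that differ in the last decimal place are cached separately. Rounding coordinates before the lookup would be one way to raise the hit rate, at the cost of some precision.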

Benchmarks
I plotted 1000 records on my MBP (an old early-2009 Core 2 Duo with 4GB RAM). The initial request takes about 110 seconds (~2 minutes), of which roughly 108 seconds is ActiveRecord. Subsequent requests load in under 2 seconds.

For 1000 records:
Completed 200 OK in 110478ms (Views: 1608.8ms | ActiveRecord: 108674.6ms)
Completed 200 OK in 1744ms (Views: 1110.7ms | ActiveRecord: 443.3ms)
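
To put the two log lines in per-record terms, here is a small back-of-the-envelope calculation in Python 3. It assumes the ActiveRecord time is dominated by the 1000 reverse_geocode calls, which the logs do not strictly prove. Under that assumption it works out to roughly 109 ms per uncached lookup versus about 0.44 ms per cached one, a speedup of around 245x on the database side.

```python
records = 1000
cold_ar_ms = 108674.6  # ActiveRecord time on the first (uncached) request
warm_ar_ms = 443.3     # ActiveRecord time on a later (cached) request

cold_per_record = cold_ar_ms / records
warm_per_record = warm_ar_ms / records
speedup = cold_ar_ms / warm_ar_ms

print(f"uncached: {cold_per_record:.1f} ms per record")
print(f"cached:   {warm_per_record:.2f} ms per record")
print(f"speedup:  {speedup:.0f}x")
```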

Below is an architecture diagram of how the systems talk to each other. The locations cache lives inside the geo_app_development db. Of course, the Nominatim database (gazetteer_au) is separate from our domain, so it can go into a different db or server, wherever suits.
archi.png