Introducing RGeo: A Geospatial Data Library For Ruby

At GeoPage, we’ve been building location-aware and geospatial applications using Rails for several years now. I’m pleased to announce that we’ve released one of our core components, the RGeo library, under a liberal BSD-style license. This library should be very useful for developers working with location data in a Rails application.

What is RGeo?

RGeo is, at its heart, a representation of spatial vector data in Ruby. It implements the industry standard OGC Simple Feature Access Specification, providing classes for points, lines, polygons, and other geometric objects, as well as a full set of geometric operations such as finding intersections, creating buffers, calculating distances, etc. It also provides the tools needed to work with geographic coordinate systems and projections, and to read and write data in the well-known formats understood by spatial databases.

Some examples:

require 'rgeo'

# Support Cartesian (flat) geometry
factory = RGeo::Cartesian.factory

# Create points
p1 = factory.point(1, 2)
p2 = factory.point(3, 4)
p3 = factory.point(5, 2)

# Create more complex shapes
line1 = factory.line_string([p1, p2, p3])
polygon = factory.polygon(factory.linear_ring([p1, p2, p3, p1]))

# Parse "well-known text" format
line2 = factory.parse_wkt('LINESTRING(1 4, 4 1)')

# Perform geometric operations and calculations
line3 = polygon.intersection(line2)
dist = line3.distance(p3)

# Support geographic calculations
geofactory = RGeo::Geographic.spherical_factory

# Longitude-latitude coordinates
loc1 = geofactory.point(-122.33, 47.62)  # (Seattle)
loc2 = geofactory.point(-73.97, 40.78)   # (New York)

# Uses spherical (haversine) formulas
dist = loc1.distance(loc2)
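
For readers curious about what a spherical distance calculation involves, here is a minimal plain-Ruby sketch of the haversine formula. It does not use RGeo. The method name haversine_distance and the Earth radius constant are assumptions made for illustration; RGeo's spherical factory may use a different radius, so its results can differ slightly from this sketch.

```ruby
# Mean Earth radius in meters. This is an assumed value for illustration;
# RGeo's spherical factory may use a different radius.
EARTH_RADIUS_M = 6_371_000.0

# Great-circle distance in meters between two points given as
# (longitude, latitude) in degrees, using the haversine formula.
def haversine_distance(lon1, lat1, lon2, lat2)
  to_rad = Math::PI / 180.0
  phi1 = lat1 * to_rad
  phi2 = lat2 * to_rad
  dphi = (lat2 - lat1) * to_rad
  dlambda = (lon2 - lon1) * to_rad

  a = Math.sin(dphi / 2)**2 +
      Math.cos(phi1) * Math.cos(phi2) * Math.sin(dlambda / 2)**2
  2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a))
end

# Same Seattle and New York coordinates as in the example above
meters = haversine_distance(-122.33, 47.62, -73.97, 40.78)
puts format('%.0f km', meters / 1000.0)
```

With these inputs the sketch gives a little under 3,900 km. Note that the argument order is longitude first, matching the point constructor used above.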

In addition to the core RGeo library, I’ve released add-on modules for working with the GeoJSON format and reading Shapefiles. I’ve also written a set of ActiveRecord connection adapters for the MySQL Spatial, SpatiaLite, and PostGIS spatial database systems. These connection adapters subclass and extend the existing standard adapters distributed with ActiveRecord, providing tools for managing spatial columns and indexes, and accessing spatial data as RGeo objects.

Example migrations:

create_table(:locations) do |t|
  t.string :name
  t.point :latlon, :srid => 4326  # geometric column
end
add_index(:locations, :latlon, :spatial => true)  # spatial index

Example data usage:

loc = Location.new
factory = RGeo::Geographic.spherical_factory

# Set data from an RGeo object
loc.latlon = factory.point(-122.33, 47.62)
# Or automatically convert from WKT
loc.latlon = 'POINT(-122.33 47.62)'

# Returns an RGeo data object
p = loc.latlon
puts "latitude=#{p.latitude} / longitude=#{p.longitude}"
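
To give a feel for what converting a WKT string into coordinates involves, here is a small plain-Ruby sketch that handles only simple 2D points. The helper parse_wkt_point and its regular expressions are hypothetical and are not part of RGeo or the adapters, which rely on RGeo's own, far more complete WKT parser.

```ruby
# Hypothetical helper, not part of RGeo: parse a simple 2D WKT point such as
# "POINT(-122.33 47.62)" into an [x, y] pair of floats.
WKT_NUMBER = /[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?/
WKT_POINT = /\A\s*POINT\s*\(\s*(#{WKT_NUMBER})\s+(#{WKT_NUMBER})\s*\)\s*\z/i

# Returns [x, y], or nil if the text is not a simple 2D point.
def parse_wkt_point(text)
  match = WKT_POINT.match(text)
  return nil unless match

  [match[1].to_f, match[2].to_f]
end

lon, lat = parse_wkt_point('POINT(-122.33 47.62)')
puts "latitude=#{lat} / longitude=#{lon}"
```

As in the examples above, the first WKT coordinate is treated as the longitude and the second as the latitude.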

You can find links to these add-ons below under “How do I get started”.

How is RGeo different from GeoRuby and spatial_adapter?

GeoRuby and spatial_adapter are two older gems that cover some of the same functionality, and were also recently discussed in a RubyConf 2010 presentation. We evaluated these libraries a while back, but ran into a number of limitations that eventually led us to write our own.

  • GeoRuby implements a few parts of the OGC Simple Features Specification, but there are large gaps. You cannot compute sizes; take intersections, unions, or differences; evaluate any of the relational predicates such as “contains” or “touches”; or in fact perform most of the operations that require some geometric analysis. RGeo, on the other hand, provides a high-performance implementation of the entire specification using the GEOS library as its backend.
  • GeoRuby provides minimal support for coordinate systems and none for geographic projections. These concepts are built in to RGeo. Each geometric object knows which coordinate system it lives in, and the library provides facilities for transforming and projecting as needed, utilizing the standard Proj4 library.
  • GeoRuby includes support for some data interchange formats, but is subject to a number of limitations, largely because its core library does not implement the geometric analysis algorithms. For example, to correctly read polygons from a Shapefile requires evaluating ring directionality and containment. These operations are not available in GeoRuby; as a result, its Shapefile reader is forced to make some potentially incorrect assumptions when evaluating polygon-type shapefiles, and it does not support multipatch-type shapefiles at all. RGeo’s shapefile module does not have these limitations.
  • RGeo is designed from the inside out to be extended. It provides several implementations of its core interfaces, and extension writers can subclass these or even write completely new implementations for specific geospatial use cases.
  • spatial_adapter works by modifying (monkey patching) the existing mysql and postgresql adapters in place. I’ve chosen instead to subclass the adapters and make the subclasses available as separate adapters using the accepted ActiveRecord adapter gem naming system (e.g. activerecord-postgis-adapter). I believe this is a less invasive and more maintainable approach to the implementation.
  • The RGeo-based adapters are tied to Rails 3, whereas spatial_adapter also supports Rails 2.3. I made this trade-off so that the RGeo-based adapters could support Arel-specific features in the future, for example constructing spatial queries using Arel. It also lets them avoid a few limitations present in spatial_adapter; for example, spatial_adapter does not let ActiveRecord objects cache the converted value for a field (that is, it re-parses the data every time you access it, which can be slow for complex objects), whereas the RGeo-based adapters are able to cache the value because they can define a separate Arel type for spatial fields.
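
As an illustration of the ring directionality test mentioned in the Shapefile point above, here is a minimal plain-Ruby sketch based on the shoelace (signed area) formula. The method names signed_area and clockwise? are assumptions for this example and do not show how RGeo or its shapefile module implements the check. In the Shapefile format, outer rings are stored clockwise and holes counterclockwise, which is why a reader needs this kind of test.

```ruby
# Signed area of a ring given as [[x, y], ...], using the shoelace formula.
# In a y-up Cartesian plane the result is positive for counterclockwise rings
# and negative for clockwise rings. The ring may or may not repeat its first
# vertex at the end; a repeated vertex contributes nothing to the sum.
def signed_area(ring)
  sum = 0.0
  ring.each_with_index do |(x1, y1), i|
    x2, y2 = ring[(i + 1) % ring.length]
    sum += x1 * y2 - x2 * y1
  end
  sum / 2.0
end

# True if the ring's vertices run clockwise.
def clockwise?(ring)
  signed_area(ring) < 0
end

# The triangle used in the first example of this post
triangle = [[1, 2], [3, 4], [5, 2], [1, 2]]
puts signed_area(triangle)
puts clockwise?(triangle)
```

Containment of one ring inside another is a separate test, for example a point-in-polygon check on one vertex of the candidate hole, and is not covered by this sketch.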

Finally, RGeo is a core piece of the technology stack used by GeoPage for its products. GeoPage’s engineers are working with the code daily, and I believe RGeo will be a very active project moving forward.

How do I get started?

Install the RGeo core as a gem:

gem install rgeo

Note that RGeo utilizes two C libraries, GEOS and Proj4, to handle some of the geometric analysis and projection computation. These libraries aren’t strictly necessary for RGeo to install and function, but they are highly recommended since some features of RGeo will not be available without them. See RGeo’s readme for more information.
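
One way to check whether GEOS support was picked up at install time is sketched below. The wrapper method geos_status and its return symbols are made up for this example; the sketch assumes RGeo::Geos.supported? is available in your installed version of the gem, so consult the readme if it is not.

```ruby
# Report whether RGeo and its GEOS binding are usable in this Ruby process.
# geos_status and its return values are hypothetical names for illustration.
def geos_status
  require 'rgeo'
  RGeo::Geos.supported? ? :geos_available : :geos_missing
rescue LoadError
  # The rgeo gem itself is not installed
  :rgeo_not_installed
end

puts geos_status
```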

For documentation, see http://virtuoso.rubyforge.org/rgeo.

Source code is on Github at http://github.com/dazuma/rgeo.

Report bugs on Github at http://github.com/dazuma/rgeo/issues.

Distributed as part of RGeo is a technical paper entitled An Introduction to Spatial Programming With RGeo, covering GIS concepts and tools that you will often encounter when writing geospatial web applications. This paper is currently evolving along with the library, but you can read the current version online here as part of RGeo’s online RDocs.

We consider RGeo to be alpha quality right now. We are using it in production in a limited capacity at GeoPage, but some of the more advanced functionality is not yet production-hardened.

See also the RGeo add-on modules:

And the RGeo-based ActiveRecord spatial databases adapters:

GIS for Rails?

Location is one of the hottest technology trends right now, in both desktop web applications and especially mobile applications. So far, however, support for GIS technologies in the Ruby/Rails stack has been somewhat fragmented and not very mature. I hope RGeo will help move the platform forward. It should now have a solid start, and I already have lots of ideas for enhancements to the core library as well as further add-on modules and extensions that can be written. Some examples:

  • Implementations of the OGC Coordinate Transformation spec and WKT for representing coordinate systems.
  • Integration with spatial reference databases.
  • Ellipsoidal geography implementation, possibly utilizing geographiclib.
  • JRuby support via the JTS library.
  • Windows build support.
  • Support for GeoRSS, KML, and other formats.
  • Integration with the SimpleGeo API and potentially other external APIs.
  • Arel extensions for building spatial queries.
  • Possible additional ActiveRecord adapters.
  • Raster and tiling data support?

The real goal, however, is to make Ruby on Rails a world-class platform for location-aware application development, and for that we need an even broader vision. Some of our inspiration comes from the GeoDjango project, which provides a mature set of GIS tools for Django applications, and was integrated into the main Django framework with its 1.0 release. The Rails community similarly needs its “GeoRails”. This begins with the geospatial data tools provided by RGeo. But we also need stable integration with key location services such as geocoding, place databases, social location, and data storage and search, as well as support for visualization technologies beyond Google Maps, not to mention spatial analysis and statistics. Location opens a world of possibilities, if we have the tools to support it.

18 thoughts on “Introducing RGeo: A Geospatial Data Library For Ruby”

  2. Daniel,

    I’m a little creeped out. I was literally thinking about needing to find this sort of library just last night.

    One thing I’m not 100% clear about (it is 6am my time, after all). If I have a publicly available shapefile representing neighborhoods, can I load that data into my database and then use Arel to find neighborhoods that contain a particular point? It strikes me as being a common use case so I imagine so. Still, an idiot’s guide to doing that would be appreciated.

    • You still have to construct the spatial database queries manually for now. Although an Arel extension for generating queries is definitely on the radar. I haven’t thought much yet about how that syntax ought to look. If you have any ideas, let me know. As for a tutorial or beginner’s guide, that’s also a good idea. I’ll see if I can throw one together.

  3. Awesome work! I’ve been using GeoRuby & SpatialAdapter for years now since Guilhem first developed them. They’ve become a bit antiquated, so the new approach is very welcome.

    In particular, supporting Arel methods is exciting. I’ve been experimenting a lot with Arel and how to build some pretty spectacular capabilities using the chaining and lazy evaluation.

    We’d love to help support the development of RGeo and push out some of the work we’ve done with spatial analysis in Ruby working on GeoCommons. In particular the additional format support, API integration, and full Arel support.

    It’s been a bit quiet – but you should post this on the GeoRuby google group. There are definitely a lot of other Ruby + Geo developers that will welcome this work.

    • Thanks Andrew! I sent a post to the google group for moderation. On Arel, I agree: it opens up powerful possibilities that I’m very interested in exploring. I’d love to speak with you more about collaborating on some of these things. I’ll send you an email.

  4. I look forward to seeing how you dealt with some of the problems related to ActiveRecord integration. As the maintainer (not the original author) of spatial_adapter I have been frustrated with the prevalence of monkey-patching in its source and have toyed with rewriting using the inheritance method you used here. I just haven’t had the time to work on it recently. Best wishes!

    • The ActiveRecord integration is not easy. With subclassing, you still run into some of the same difficulties that spatial_adapter does, largely because ActiveRecord doesn’t provide a way to customize the factories for the secondary objects such as Column and TableDefinition, so we’re forced to completely replace methods more often than we want. I’m still thinking about different approaches to the problem, including not subclassing but adding mixins after the fact (a performance hit), or even overriding the “new” method (ugh!).

  6. Great Work!

    Correct me if this is already the case, i am wondering if different approaches should be taken per SQL DB. Obviously PostGIS is far superior to what MySQL and others offer in terms of geospatial features and functions and PostGIS relies on GEOS and Proj internally to perform operations. So in the case of PostGIS, wouldn’t it be most appropriate to just extend the active record adapter to wrap this functionality. It’s already been implemented in C within PostGIS and it ties in nicely with DB backed records already.

    For in memory calculations you could wrap the functions with an SQL SELECT, for DB backed records, simply select the DB columns that are appropriate etc…

    For example a transform that relies on Proj:
    SELECT ST_AsText(ST_Transform(ST_GeomFromText('POLYGON((743238 2967416,743238 2967450,743265 2967450,743265.625 2967416,743238 2967416))',2249),4326));

    And for a union (GEOS dependency)
    SELECT ST_AsText(ST_Union(ST_GeomFromText('POINT(1 2)'),
    ST_GeomFromText('POINT(1 2)') ) );

    • For ActiveRecord integration, I agree, it’s important to expose the features of the database, and when doing spatial queries against spatial data in database-backed records, you’ll definitely want to use the functions defined by, e.g. PostGIS. RGeo’s current ActiveRecord adapters abstract the schema creation and the data conversion, but they don’t try to generate spatial queries yet. For Rails developers, I think the important feature moving forward is going to be some kind of Arel-based syntax for constructing those queries in ActiveRecord, although that does seem daunting because of the sheer number of functions defined.

      The core of RGeo is intended mostly for in-memory data representation and calculations, and for those cases, it seems odd to make a round trip to a shared resource like the database when you could just call GEOS directly. Unless I’m misunderstanding you?

  7. Yep, what you say makes sense Daniel. It’s not a criticism, it’s just a question.

    Your library is a really exciting development and i would like to contribute if there is room to.

    Arel would be pretty cool, but i think writing the SQL by hand is pretty simple, depends on your use case of course.

    My question is, when do you normally perform (exclusively) in memory calculations. Even if you were using shapefiles as the datastore, then it doesn’t really make sense to load all those objects into memory as well does it? You are accessing the GEOS API and so is PostGIS, it’s just another level of abstraction. When do you do geospatial calculations that are not based on lots and lots of real data?

    Most of the time in a spatial application you do have DB backed records even if the application is not a rails one or a web one, you need somewhere to store geospatial data. ArcGIS is a good example, it uses an object-relational database on the backend which you can import into and manipulate from there.

    That said, there are many features in GEOS that are not in PostGIS.

    Anyway, sorry if I’m not getting it :)

    • Okay, I think I see what you’re getting at. Yes, the data will usually live in a large database-backed dataset, and no, I’m not advocating bringing everything into memory for RGeo’s sake. If you want to compute the “union of all geometries in my_huge_table.geom_column”, then write SQL by all means. RGeo is not trying to abstract that kind of operation yet, but doing so as an Arel extension is, I think, a very interesting project.

      The in-memory cases are, I think, more typical of Rails-based web applications where the common pattern is: (1) load a small number of relevant database rows into memory, (2) perform some operations on the data, which may include spatial operations among others, and then (3) save the results back to the database and/or render to html.

      • Yep. Ok. So I guess it’s similar to georuby, but of course much more feature rich and instead of chugging along and doing the in memory calcs in ruby, it makes use of GEOS and Proj to do the heavy lifting. I also like the idea that you propose for building out the ability to produce/consume georss and kml etc…

        Anyway, I’ll start playing with it once I update to rails 3.

        Thanks again for releasing it!

  8. Looks really cool.

    Minor correction:
    add_index(:locations, :loc, :spatial => true)

    should probably be:
    add_index(:locations, :latlon, :spatial => true)
