The Building Blocks for Vector AR

Sean Gorman

Entrepreneur and Geographer

Published Dec 10, 2024

As we've been testing out our AR capabilities, using just the GPS and IMU, I've started to wonder what data we'd need to enable a purely vector AR experience. Given all the wonderful new location based datasets that are being open sourced, like Overture and Foursquare, I thought it might be worth diving into.

What is Vector AR?

Traditionally, location based AR is powered by cameras. They use raster based techniques to match pixels. The localization model takes the pixels it sees from your phone's camera and matches them to a database of features (clustered pixels) that were mapped out ahead of time. The feature database is typically populated using street view imagery (e.g. Apple, Google) or crowdsourced camera phone images (Niantic, Meta/Mapillary) that are turned into point clouds through photogrammetric pipelines. The "points" from the "cloud" become the features you match the camera phone's pixels to. Using the match between the feature database and the camera phone you can determine a geographic pose for the user. I think of this as an image based system we could broadly call a raster based AR approach. While amazing mathematics and engineering have gone into the raster based approach, it has some significant drawbacks. The system needs imagery of all the places you want to provide AR experiences for, and you need to compute accurate feature databases from all that imagery. Both of these endeavors are labor, compute and cost intensive.

Alternatively, you can try to solve the problem using just vector data. Instead of using the camera on your phone we use the GPS and IMU to determine your location and orientation to derive a geographic pose. Previously this approach had been limited by the accuracy of the GPS and IMU, and was insufficient for a good user experience. We believe the improvement our team has been making with GPS and IMU accuracy for consumer devices could change this.
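To make the idea of a geographic pose concrete, here is a minimal sketch in Python. It assumes the GPS supplies latitude and longitude and the IMU/compass supplies a heading in degrees clockwise from true north. The names GeoPose, bearing_to and in_field_of_view, and the 60 degree field of view, are illustrative assumptions and not part of our actual pipeline.

```python
import math
from dataclasses import dataclass


@dataclass
class GeoPose:
    lat: float      # degrees, from GPS
    lon: float      # degrees, from GPS
    heading: float  # degrees clockwise from true north, from the IMU/compass


def bearing_to(pose: GeoPose, lat: float, lon: float) -> float:
    """Initial great-circle bearing from the pose to a target, in [0, 360)."""
    phi1, phi2 = math.radians(pose.lat), math.radians(lat)
    dlon = math.radians(lon - pose.lon)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def in_field_of_view(pose: GeoPose, lat: float, lon: float,
                     fov_deg: float = 60.0) -> bool:
    """True if the target's bearing is within half the FOV of the heading."""
    # Signed angular difference wrapped into [-180, 180).
    diff = (bearing_to(pose, lat, lon) - pose.heading + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0
```

For example, a user at the equator facing due east (heading 90) would have a target slightly to the east in view, while a target due north would fall outside a 60 degree field of view. Any error in the heading shifts this test directly, which is why IMU accuracy matters so much here.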

We still have work to do on anchoring and stabilizing overlays, but assuming it is feasible, what additional datasets would we need to enable a vector AR approach?

Open Data Building Blocks

While open source raster imagery data can be hard to come by (except Mapillary - you are awesome!), open vector data seems to be growing daily. It is exciting to see that the foundational work done by OpenStreetMap (OSM) for individual contributors and Overture for corporate contributors has turned the problem of a global vector map into a community solution. This is a big part of what makes the vector AR approach interesting: so much of the data is now open.

Places Data

OSM started the ball rolling, as with most open vector data sets, creating tags like "amenity=restaurant" to denote places. Adding to this foundation, Overture has open sourced 55.5 million places, with 49.6 million coming from Meta and 5.9 million coming from Microsoft.

Recently Foursquare open sourced another huge chunk of Places data. Sina Kashuk has a great post using the Foursquare data to aggregate it by Overture Maps Foundation building footprints.

Foursquare Places Intersected with Overture Building Footprints

Having "places" that are well aligned to building footprints was a critical piece to the AR demo with just GPS+IMU we posted a couple of weeks ago. We used the Google Places API because of the excellent work they do aligning "Places" with building footprints. Foursquare has also put a lot of work into aligning check-ins, buildings and places. They generated one of my favorite infographics comparing FSQ check-ins with GPS accuracy from smartphones:

Foursquare Checkin Locations vs. GPS Reported User Positions

It will be interesting to see how well aligned the Foursquare data is with building footprints, since it is really easy to generate false positives and false negatives with Places data. Many geocoders are not based on building footprints and instead use address ranges to determine the location of a Place. These inferred locations can cause the derived latitude and longitude to miss every building footprint or to intersect the wrong one. The latter is especially prevalent if the geocoder fails to match the address string and falls back to the zip code or the state, to take a US centric example.

The good news is that crowdsourcing is one of the best methods for correcting this data and Overture has a nice schema (GERS) for connecting Places with building footprints explicitly. Also there are effective computational approaches to conflating Places and footprint data like Hootenanny. I believe we'll see a lot of activity around this work by the community in coming years since it is foundational for several compelling use cases.
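The alignment problem described above can be checked with a simple point-in-polygon test. The sketch below is only an illustration: it uses the standard ray-casting algorithm on plain (lon, lat) tuples, treats coordinates as planar (reasonable at building scale), and the function names and sample data are hypothetical. It does not use the Overture, Foursquare or Hootenanny tooling.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon given by ring?

    ring is a list of (x, y) vertices; repeating the first vertex at
    the end is optional.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def match_places_to_footprints(places, footprints):
    """Map each place id to the first footprint id containing it, or None.

    A None result flags a Place that falls outside every footprint,
    for example a geocode interpolated along an address range.
    """
    return {
        pid: next((fid for fid, ring in footprints.items()
                   if point_in_polygon(lon, lat, ring)), None)
        for pid, (lon, lat) in places.items()
    }


footprints = {
    "b1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "b2": [(3, 0), (5, 0), (5, 2), (3, 2)],
}
places = {"cafe": (1, 1), "street_geocode": (2.5, 1), "shop": (4, 1)}
matches = match_places_to_footprints(places, footprints)
```

Note that this only catches Places that land outside all footprints. A Place that lands inside the wrong building still passes, which is where crowdsourcing and attribute based conflation come in.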

Urban Tree Inventories

Trees are another useful open data set for mapping a user's field of view. Fortunately there are some wonderful open tree location databases. Not surprisingly, OpenStreetMap provides a great baseline dataset in one format that covers many cities around the world. If you want to provide a more sophisticated level of modeling there is the Global Urban Tree Inventory. It has 4,734 tree species planted in 473 urban areas in 73 countries. I think that they have a thing for numeric similarity :-) The additional data on tree species can be particularly useful for programmatically generating 3D vector models. The data can be downloaded here.

Building Footprint Data

There has been a wonderful explosion in available building footprint data since Microsoft open sourced 130 million footprints in the US back in 2018. As ML/CV techniques have improved, so have the quality and scale of the data. Overture has graciously aggregated the various programmatic and manual building footprint initiatives into their data repository.

Arguably an overlooked but massive value add is the conflation that Overture has provided across these data sources to create one high quality data set. Below is a nice representation of this conflation across the four data sources in downtown San Diego, the geo-epicenter every July.

Overture Building Footprints Color Coded by Provider

While there is certainly work to be done to keep expanding this data set and keeping footprints current, it is arguably the most complete of the group.

Building Heights

For several use cases building height is a critical attribute, and there is still considerable work to be done. While the nadir-looking satellite and aerial imagery used for base mapping lends itself quite well to extracting the 2D footprint of a building, it comes up short for determining a building's height. This leaves manual processes, remote sensing or sensor fusion to determine building heights. Here are a few examples that have been constructive but not fully successful in creating universal height attributes.

OpenStreetMap - users can add a "height" or "building:levels" attribute to a building footprint, which has made for some compelling 3D visualization. Unfortunately, a study by Lao et al. (2018) reported that globally, less than 3% of buildings have a "height" value and less than 4% have a "building:levels" value. While these statistics have improved since 2018, the majority of buildings in OSM are still missing a height type variable.
Overture + 3DEP - another effort to add height to building footprint data is Overture's work fusing 3DEP aerial LiDAR measurements to add heights to 6 million buildings. While there is a growing number of open LiDAR datasets, many geographies don't have them.
Global SAR - while aerial LiDAR doesn't have global coverage, Sentinel-1 SAR data does. This makes the recent work by Kakooei and Baaleghi (2024) to use SAR data to infer building heights particularly exciting. The challenge is that the compute needed to run this approach globally is not trivial.
Space Borne Altimetry - space borne altimetry, or space LiDAR, has also recently been used to create quasi-global scale building height datasets. Specifically, the GLObal dataset from the University of Texas uses data from GEDI and ICEsat-1 to generate building heights for 1200 urban areas. The code used to derive building heights is also available. While the accuracy is a little below other methods, the readily available open data is a wonderful asset.
Earth Observation - last but not least is the Copernicus work to use multimodal Earth observation data sets to generate global scale building heights. It is another wonderful addition. The accuracy is roughly on par with GLObal, and 3D-GloBFP is also available for open download.
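As a rough illustration of how sparse height attributes like the OSM tags above might be consumed, here is a minimal Python sketch. It prefers an explicit "height" tag (OSM heights default to metres) and falls back to "building:levels" multiplied by an assumed storey height. The 3 metres per level figure is a rule-of-thumb assumption, and the function name is hypothetical.

```python
def estimate_height_m(tags, metres_per_level=3.0):
    """Return (height_in_metres, source) for a dict of OSM style tags.

    source is "height", "building:levels" or "missing".
    metres_per_level is an assumed average storey height, not a
    measured value.
    """
    raw = tags.get("height")
    if raw is not None:
        try:
            # Accept "12.5" and "12.5 m"; anything else falls through.
            value = float(str(raw).strip().removesuffix("m").strip())
            return value, "height"
        except ValueError:
            pass
    levels = tags.get("building:levels")
    if levels is not None:
        try:
            return float(levels) * metres_per_level, "building:levels"
        except ValueError:
            pass
    return None, "missing"
```

Buildings that come back as "missing" are the ones where the LiDAR, SAR or altimetry derived datasets would need to fill the gap. Keeping the source alongside the value makes it easy to weight an explicit measurement above an inferred one later.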

It is really exciting to see how much growth there has been in modeling to create these datasets, as well as how many are readily available for download. I did a similar exercise at Snap a few years ago and the progress since then has been remarkable. So, how does this all help AR?

Enabling Vector AR

The fundamental concept of vector AR is using the GPS+IMU to determine your geographic pose and then using open vector data to determine what you are looking at. There are a couple of aspects to this concept, and we'll provide a brief breakdown of each.

What can the user see? To understand what is in a user's field of view, we need to know which features of the built environment intersect the view from their geographic pose. Most critical of these are the buildings, which largely drive a user's immediate field of view in an urbanized area. A handy way to think of this mathematically is an isovist: the area visible from a given vantage point, which can be used to build a visibility graph for a user. Isovists are a great way to use building footprints to determine what is in a user's viewshed.

Varying Isovist Results Across Urban Morphologies

What Place is the user looking at? This is the core problem we were solving with our current AR demo. By setting the location of a Place to the centroid of a building, an AR overlay or spatial detection can be generally accurate when aligned to a user's field of view. Ideally we'd have door or entrance locations, but the building centroid is a great start. When the Place isn't aligned with a building, the user experience degrades significantly. Interestingly, this Place data is critical for both raster and vector based AR experiences. Places aren't encoded in pixels for a VPS but are instead anchored based on their latitude, longitude and altitude.
What obstructs a user's view of an AR overlay? In AR work this is called occlusion: "The objective of occlusion in AR scenarios is to ensure compliance with the laws of line of sight. This means that virtual objects positioned behind real-life objects should be concealed or hidden from view to provide a more authentic experience for viewers and enhance their perception of depth." In raster based AR systems this is handled by depth mapping and semantic segmentation or object recognition to realistically occlude AR effects. In a vector approach, 3D buildings and trees make a reasonable 80% solution to occlusion. It doesn't have the fidelity of a raster based approach, but it requires far fewer computational resources and the data is readily available.
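The visibility and occlusion ideas above can be sketched with nothing more than building footprints. The Python below is a simplified 2D illustration, assuming coordinates have already been projected to local metres (x east, y north). It treats a Place as occluded if the straight line of sight from the user to the Place's building centroid crosses an edge of any other footprint. The function names are hypothetical, and building and tree heights are deliberately ignored here.

```python
def _orient(a, b, c):
    """Twice the signed area of triangle abc; the sign gives the turn direction."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p, q, a, b):
    """True if segment pq properly crosses segment ab.

    Touching at an endpoint or overlapping collinearly is not counted.
    """
    d1, d2 = _orient(p, q, a), _orient(p, q, b)
    d3, d4 = _orient(a, b, p), _orient(a, b, q)
    return d1 * d2 < 0 and d3 * d4 < 0


def is_occluded(user, target, footprints, ignore=None):
    """True if the sight line user -> target crosses another footprint.

    ignore is the id of the target's own building: its walls always cut
    the line to its centroid, so it must be skipped.
    """
    for fid, ring in footprints.items():
        if fid == ignore:
            continue
        n = len(ring)
        for i in range(n):
            if segments_cross(user, target, ring[i], ring[(i + 1) % n]):
                return True
    return False


footprints = {
    "target": [(9, -1), (11, -1), (11, 1), (9, 1)],
    "blocker": [(4, -1), (6, -1), (6, 1), (4, 1)],
    "side": [(4, 2), (6, 2), (6, 4), (4, 4)],
}
user, centroid = (0, 0), (10, 0)
hidden = is_occluded(user, centroid, footprints, ignore="target")
```

Sweeping the same test over a fan of rays around the user's heading gives a crude isovist. Adding heights would allow the check to distinguish a low building that only partly hides an overlay from a tall one that hides it entirely.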

Integrating all this into a viable system is still a considerable amount of work. That said, the progress the community has made building the open data foundation is really exciting. I could be alone in thinking this makes sense. There is a lot of capital invested in the raster based approach, but given the success of smart glasses without waveguides, a vector approach could hold promise. Ultimately I think the future is a hybrid system where highly active urban areas use a raster approach and less trafficked geographies use a low cost vector approach. We are looking forward to experimenting with it all and seeing how the underlying open data evolves.

Kevin Bullock

Partnerships @ Development Seed

Fantastic post, as usual, Sean! Great insights here along with a nice summary of amazing open data sources. I’m really looking forward to seeing apps integrating Vector AR and VPS, it will be interesting to see what this tech unlocks!


Nick Kaufmann

Community Manager, inCitu

Dana Chermesh-Reshef Greg Lindsay Krisztián Tóth


Tee Barr

Geospatial Disastertech | Insurtech | Climatetech | Leadership

Leave some creativity for the rest of us Gorman :)

