By Richard Hwang and Aamir Manasawala, Software Engineers
One of our goals at DoorDash is to surface to consumers a wide range of stores that are quickly deliverable to their given address. This process involves calculating accurate road distances for each store-consumer pair in our real-time search pipeline. Our earlier blog post about recommendations for search primarily focuses on the ranking component of search at DoorDash. This blog post describes how we architected our search system using open source technologies to help determine consumer selection.
Calculating accurate driving distance in real time is critical to the selection that a DoorDash consumer sees. A mere straight line circle-based distance could be inaccurate and would result in very long Dasher drive times, especially when the topology of the region has unevenness due to barriers like mountains, lakes, bridges, parks, etc.
Figure 1 depicts a circle with a nine mile radius centered around an address in Southern California. If a consumer orders from a store on the edge of this circle, it will take at least half an hour (as shown in Figure 2) just for the Dasher to get from the store to the consumer.
Before we delve into the system architecture, let us define some terms:
The following diagram describes the overall architecture to determine if a store is in the consumer’s delivery address to determine its selection.
The offline component involves an isochrone service responsible for computing isochrones for a given location (lat and lng, which is converted to a level seven geohash) and parameters (eg: travel time).
To compute isochrones, we use our custom fork of Galton, an open source project. Galton is built on top of OSRM, an open source routing engine, and concaveman, a fast implementation of a concave hull algorithm. Galton first generates a grid of coordinates of configurable size and granularity around the input coordinate. OSRM then computes travel times from the input coordinate to each of the grid coordinates. Grid coordinates with travel times greater than the input travel time are filtered out. Finally, the concave hull algorithm generates an outline of the remaining coordinates, producing the appropriate isochrone as shown in Figure 6, which is the nine mile isochrone for the same address in Figure 1.
The service caches isochrones in DynamoDB, as simple key-value lookups for speedy retrieval. Further, we key by geohash, precision 7, rather than exact coordinate, to reduce the number of entries we need to store. Precision 7 geohashes have an error of ±0.076 km; isochrones for coordinates within these bounds will not vary drastically. We store isochrones in order of millions and with lookup time under ten milliseconds.
For each request, the service queries for DynamoDB: if the isochrone is present then it is returned. If the isochrone is absent then an asynchronous job is launched to generate and store it, returning a null response. On subsequent requests for that address and parameters, the generated isochrone will be returned. When we launch a new market, we bootstrap it by running a script to pre populate isochrone entries for all geohashes in the market, to get the market up to speed for accurate selection upfront.
Our current implementation accounts for the topology of the region via driving distance addressing the inaccurate selection problem in Figure 1 by isochrone selection as shown in Figure 6. Furthermore, this architecture allows flexibility to configure and control selection logic based on regionality, to dynamically change selection logic based on supply/demand curves, and to run selection experiments.
Some potential areas that we will be working on in the future include getting more accurate real-time traffic and road condition updates into the system.
If you are passionate about solving challenging problems in this space, we are hiring for our Search & Relevance team and our Data Infrastructure team. If you are interested in working on any other areas at DoorDash check out our careers page. Let us know for any comments or suggestions.