By Mitchell Koch, Software Engineer
At DoorDash, we want to make it as easy as possible for people to discover and order from great restaurants in their neighborhoods. As part of that goal, one fascinating problem we’re tackling is how to create a personalized experience for each user on the DoorDash platform by surfacing recommendations for restaurants and other merchants based on his or her order history. We’ve been experimenting with a variety of techniques for personalizing the DoorDash experience, and though we’ve just scratched the surface, we’re always looking for ways to improve our models.
For a recent hackathon project we wanted to model the similarities between cuisines on DoorDash, both to better understand the problem, as well as to use this data to improve our recommendations. Here, the question is, if we know you like certain cuisines, what other cuisines can we recommend?
To do this, we used Bayesian network structure learning over order data. A Bayesian network is a probabilistic graphical model that represents the relationships between variables and their dependencies in a graph structure. In our case, we have variables for cuisines such as “Spanish” and “Tapas,” and we want to represent the probability distribution between those, capturing that if you like Spanish food, you’re very likely to like Tapas as well. A Bayesian network structure learning algorithm finds a compact representation of the joint probability distribution between all of the variables in the data. Put simply, it allows us to find the probability of ordering one type of cuisine given that you’ve already ordered from a different cuisine. While using interventional data would allow us to get true causality, in this case we can see the relationships between cuisines but we can’t be confident in the causality of the arrows. To learn the network structure, we used a hill climbing approach, which resulted in the graph below. You can try this technique on your own data using the bnlearn R package. (While Bayes nets can represent inverse relationships, we only showed positive relationships in the graph below.)
From a quick look at the results, the graph shows some interesting relationships. As expected, if you like Italian, you’re likely to enjoy Pizza or Pasta. But there are some surprising results here as well: for example, if you like Fast Food, in turns out you’re likely to enjoy Chinese food too.
In the future we hope to use this type of model to determine which specific users should be informed about certain promotions or new restaurants. For example, if we want to feature a particular Thai restaurant, we can see here that it could be very relevant to anyone who likes Asian food. Incorporating this type of information into our recommendation models could further be helpful for suggesting restaurants of similar cuisines. We’re looking forward to implementing this kind of information into the DoorDash recommendation engine to better personalize your experience on the service and make it easier than ever for people to order from new restaurants.