How DoorDash uses Data Science to Drive Revenue
Interviewing Daniel Parris, one of the first 150 employees at DoorDash, on the power of lead scoring models
Run the Numbers welcomed its first public company CFO - Sonalee Parekh, the Chief Financial Officer of RingCentral - to go deep on earnings calls. To many, earnings calls are a black box. We demystified what goes on before, during, and after these quarterly events, and discussed the most granular details, as well as pitfalls to avoid.
As the CFO of a marketplace business, who recently inherited a Data Science team, I jumped at the chance to speak with Daniel Parris, who helped scale DoorDash’s DS team from scrappy startup through public company behemoth.
As any connoisseur of cheeseburgers and fried chicken sandwiches knows, DoorDash connects you with the edible world around you. And what drew me to Daniel’s story is how he harnessed the power of data to improve the restaurants available to consumers on the platform. More specifically, he helped build a valuable lead scoring model that can be emulated at other companies.
If that wasn’t enough, what made Daniel an even more fascinating person to speak to is how he now uses his data background post-DoorDash. He writes Stat Significant, a weekly newsletter featuring data-centric essays about movies, music, sports, and more. He’s taken on some wild questions over the years.
Daniel also does strategy and analytics consulting work and can be reached at firstname.lastname@example.org if you're interested.
Enjoy this masterclass on how companies can use data and predictive analytics to increase revenue.
Part I: What is lead scoring
Defining lead scoring
“Lead scoring involves quantifying the value of sales targets.”
Figuring out which data points to feed the model
“There is a temptation to throw every conceivable datapoint into a black-box model.”
Using third party data vs finding data on your own
“Data acquisition is typically dependent on the type of business.”
Part II: Lead scoring in action
Figuring out which customers to go after
“Many restaurants excel on food delivery platforms but are not popular on Yelp and Google—especially chain restaurants.”
Metrics to determine if you’ve been successful
“You train a model from historical data—in this case, restaurants that had already launched on DoorDash—so we compared our predictions to the actual performance of these merchants.”
“Our biggest mistake was not investing in data quality upfront.”
Part III: Incentivizing Sales Reps
Structuring comp plans
“Predicted restaurant order volume became the global currency of the sales org.”
“Our compensation structure demanded an unrealistic degree of perfection from the selection intelligence team.”
Part IV: Lightning Round
Why people hate Nickelback (statistically speaking)
“There was a lot more to this internet meme than I initially expected.”
Messing up A16Z’s order at DoorDash
“DoorDash legend is that I made Marc Andreessen personally angry, but I doubt this to be the case.”
A message you’d put on a billboard
“Don't be afraid to fail.”
Part I: What’s lead scoring?
Can you define lead scoring?
Lead scoring involves quantifying the value of sales targets. These scores are then used to prioritize prospective leads, ensuring teams focus on accounts that generate more value for the business. Typically, sales organizations tie their compensation models to lead scores, paying sales reps for closing accounts of greater worth.
How do you figure out which data points are fed into a lead scoring model?
As with any predictive model, exploratory analysis usually precedes the model-building process. There is a temptation to throw every conceivable datapoint into a black-box model (ChatGPT or a tree-based model) and lean on these algorithms to solve the problem for you.
The downside is interpretability—you'll have no idea why the model made certain choices.
Before model training, you should understand the business problem and dataset at the lowest level of detail. You want to engineer predictive variables, and then ascertain correlations between these variables and the thing you're predicting. You can then build a model around your most predictive features and better understand what drives its outputs.
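To make the exploratory step concrete, here is a minimal sketch of ranking candidate variables by how strongly they correlate with the outcome you want to predict. The feature names, the toy numbers, and the choice of Pearson correlation are all assumptions for illustration; the interview does not describe DoorDash's actual features or method.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def rank_features(features, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scored = [(name, pearson(values, target)) for name, values in features.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)

# Hypothetical toy data: one entry per restaurant that has already launched.
features = {
    "customer_requests": [5, 20, 8, 40, 15],
    "search_hits": [50, 180, 90, 420, 160],
    "review_count": [300, 120, 800, 150, 90],
}
weekly_orders = [60, 210, 95, 430, 170]

ranking = rank_features(features, weekly_orders)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

A ranking like this is only a starting point. Correlation captures linear relationships one variable at a time, so it works best as a sanity check before fitting a model on the shortlisted features.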
How much of the data do you buy from third parties vs produce internally by brute force?
Data acquisition is typically dependent on the type of business. SaaS businesses often rely on external datasets to source prospective clients and to evaluate a customer's size and revenue potential. Marketplaces and consumer-facing applications rely less on external datasets, given they can mine user data and quantify patterns that may prove valuable for lead identification.
Part II: Lead scoring in action
You helped build the lead scoring model at DoorDash to determine which restaurants to go after. Where did you start?
The first step was data-gathering, mainly collecting data on the preferences of DoorDash users in a given market. Food delivery customers are a distinctive audience.
Many restaurants excel on food delivery platforms but are not popular on Yelp and Google—especially chain restaurants. Data on cuisine preference was helpful, but only to a point. You don't want to identify that people like Greek food in Houston; you need to identify the best Greek food places in Houston. We had to collect data directly from our customers on the specific restaurant locations they wanted.
Over time, the selection intelligence team developed mechanisms for directly capturing consumer preference. We launched a feature that allowed consumers to request restaurants they wanted on the platform. In hindsight, this tool sounds rather obvious, but it was a major breakthrough at the time. Customer search data also proved predictive in projecting merchant popularity. Ultimately, we were able to build a decently accurate model around these variables.
Which metrics did you look at to determine if your model was successful?
We looked at classic error metrics like RMSE and R² when building the model. You train a model from historical data—in this case, restaurants that had already launched on DoorDash—so we compared our predictions to the actual performance of these merchants.
We also ran an experiment before rolling the model out to the sales team. We compared the performance of the model to hand-picked lead lists from operators (which was how we were selecting sales targets at this point). Each group curated a "best-of" set across a few markets, and then we picked a winner between the two. Ultimately, the model won, which gave us the confidence to roll it out more broadly.
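For readers less familiar with the error metrics mentioned above, here is a minimal sketch of computing RMSE and R² by comparing predicted order volumes with actual ones. The numbers are made up, and the sketch assumes the comparison is run on merchants held out from training, which the interview does not specify.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: typical size of a prediction miss, in order units."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

def r_squared(actual, predicted):
    """Share of the variance in actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical weekly order volumes for four already-launched restaurants.
actual_orders = [100, 200, 300, 400]
predicted_orders = [110, 190, 310, 390]

print(f"RMSE: {rmse(actual_orders, predicted_orders):.1f}")      # RMSE: 10.0
print(f"R^2:  {r_squared(actual_orders, predicted_orders):.3f}")  # R^2:  0.992
```

RMSE is expressed in the same units as the target (orders), which makes it easy to explain to a sales team, while R² gives a scale-free sense of how much of the variation between merchants the model captures.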
What mistakes did you make during the process?
Our biggest mistake was not investing in data quality upfront. Ultimately, DoorDash's selection intelligence initiative focused on two distinct problems: 1) curate a database of every restaurant on planet Earth and 2) predict how these restaurants would perform if they were on DoorDash. At the beginning of the project, we focused more on model-building than database curation.
Our initial restaurant database came from a stale dataset of extremely low quality. Before we rolled the model out, I wrongly assumed that prediction accuracy would matter more than data quality. I was sorely mistaken. The first time we distributed leads, sales reps were pissed. Roughly ten to twenty percent of lead lists were unworkable, either because these merchants were permanently closed or not restaurants at all. The sales team compiled a spreadsheet of the worst offenders to underscore their frustration. This Google Sheet featured over 4,000 offensively bad leads, including strip clubs, consulting firms (McKinsey, Deloitte), dialysis clinics, and other garbage. I had to review the document with the head of sales to explain how this had happened. We spent the next year working on nothing but database quality. That problem was way more challenging than building a model.
Part III: Incentivizing Reps
How did you compensate sales reps? Was it per restaurant? Per order?
Predicted restaurant order volume became the global currency of the sales org. Each account executive was assigned a quarterly target for total predicted order volume. Reps would be compensated based on their performance vis-à-vis this quarterly target.
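As a rough illustration of the arithmetic behind a target like this, the sketch below sums the predicted order volume of the accounts a rep closed and divides by the quarterly target. The restaurant names, volumes, and target are hypothetical, and the interview does not describe how attainment translated into actual pay.

```python
def quota_attainment(closed_accounts, quarterly_target):
    """Fraction of the quarterly target covered by closed accounts.

    closed_accounts is a list of (restaurant_name, predicted_order_volume) pairs.
    """
    if quarterly_target <= 0:
        raise ValueError("quarterly_target must be positive")
    closed_volume = sum(predicted for _, predicted in closed_accounts)
    return closed_volume / quarterly_target

# Hypothetical accounts closed by one rep in a quarter.
closed = [
    ("Greek Spot", 1200),
    ("Burger Barn", 2500),
    ("Taco Stand", 800),
]

attainment = quota_attainment(closed, quarterly_target=5000)
print(f"Attainment: {attainment:.0%}")  # Attainment: 90%
```

Because the numerator uses predicted rather than realized volume, any error in the model's predictions flows directly into how reps are measured.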
Were there any unintended consequences of how you set up the incentive structure?
Our compensation structure demanded an unrealistic degree of perfection from the selection intelligence team. Account executives could only close restaurants on their lead list and were incentivized based on projected values. As a result, our restaurant database was expected to be a near-perfect representation of the physical world, and our predictions needed to be highly accurate. Sometimes, sales reps wanted to close restaurants that weren't in our database, and other times, they rightly claimed that a given prediction underestimated a high-caliber merchant. Error is expected when making predictions. Unfortunately, this system design left us with little room for error.
Part IV: Lightning Round
Based on your extensive research, why do people hate Nickelback? Give us the TL;DR
A few months ago, I published a data essay on why everyone hates Nickelback so much. There was a lot more to this internet meme than I initially expected. Musically, Nickelback is a highly repetitive artist that achieved heavy exposure in the early 2000s. Their music predates streaming, which means people reluctantly absorbed copious amounts of Nickelback via radio play, sowing seeds of discontent among disgruntled listeners. Nickelback appraisal also represents a significant cultural divide, as the band is highly popular amongst right-leaning audiences. There is a strong correlation between state-level YouTube search interest in the band and partisan leaning. Much of the dismay surrounding Nickelback involves confusion around their massive popularity.
Furthermore, Nickelback was subject to various PR disasters that crystallized their status as a meme. Detroit Lions fans petitioned to have the band replaced as halftime entertainment, and Comedy Central aired a commercial, in heavy rotation, that featured a vicious Nickelback joke. These events formalized a long-dormant discontent with the band's music and cultural footprint.
To sum up the phenomenon, I wrote: "Nickelback's mimetic staying power doesn't arise from any one idea but rather an amalgamation of spirited grievances. The band is a Rorschach test for everything you find wrong with modern music—you see what you want to dislike." I stand by this.
I heard someone at DoorDash got an email from the prominent Silicon Valley firm a16z?
In my first year at DoorDash, I ran a few experiments on merchant radii, a mechanism that determines a customer's restaurant selection. For example, you may open the DoorDash app today and see three hundred restaurants within a five-mile radius of your house. You will see significantly fewer restaurants if we collapse that radius to three miles. I was trying to shorten Dasher trips by shrinking radii so that drivers would travel shorter distances to drop off food. Such a change also meant customers would lose access to restaurants farther away.
Almost immediately after shrinking radii in a few cities, someone in DoorDash's C-suite got an email from the prominent Silicon Valley firm a16z. They could no longer order from their favorite restaurant, and they explicitly asked, "Please extend our radii," as if they perfectly understood the mechanism. Ultimately, I reverted all changes, and the project was shelved. The experience taught me the importance of selecting success metrics that benefit the business holistically (like profitability) vs. something myopically specific (driver efficiency).
DoorDash legend is that I made Marc Andreessen personally angry, but I doubt this to be the case.
What business related message would you put on a billboard that everyone has to drive by every day?
Don't be afraid to fail. Christopher Payne, the former COO of DoorDash, once said he learned the most when far outside his comfort zone. As such, he actively sought situations where he felt uncomfortable. People don't like discomfort and often associate this emotion with an existential threat to career stability (which is understandable). That said, if you never try something new, you'll never learn. My failures have facilitated the most impactful learning moments of my career.
Also, failure begets better stories. Nobody really wants to hear of somebody's triumphs (which often resonate as thinly veiled bragging). It's easier to connect over the things that make us human, which is our imperfections.
A word from our sponsor - Tropic
True story - I was on the horn with Tropic's VP of Marketing last week, talking newsletter stuff and shooting the breeze. It came up that Q4 is a busy season for more than just sales teams - as a CFO, I find myself buying a lot of software. And at that moment I had a proposal from Gong on the table.
The next thing you know, we were running a price check together on the deal. Fast forward and a few drop downs later (like literally under 90 seconds), I was looking at benchmarks for what someone like me should be paying for the tool, given the package and number of licenses that we wanted.
I’ve always said that he or she who holds the data holds the power. So I went back to Gong and got another 15% chopped off.
Don’t buy before you run a check - you can run one for free today.
Quote I’ve Been Pondering:
"I don't want to be a part of any club that will have me as a member."