As unemployment reaches record lows, employers in almost every industry are struggling to fill open positions. The competition for labor has been fierce as companies try to lure in candidates with higher pay, increased sign-on bonuses, and added benefits. The fight for workers is even more intense in the gig economy, where workers can leave the workforce with a single click of a button. Companies such as Uber, Lyft, DoorDash, and Instacart are all struggling to find enough couriers and drivers to satisfy the increased demand.
The effects of the labor undersupply are easily visible to customers. In the past year, a growing number of consumers have experienced longer-than-normal wait times and higher prices for food delivery and car rides. Stories of rideshare and food delivery prices increasing threefold within a short timeframe have become common. In response, gig economy companies have been trying new methods to attract service workers, including larger referral bonuses, surge payments, and more weekly and hourly incentives. In the second quarter of 2021, Uber spent more than $250 million on a one-time stimulus aimed at attracting more drivers to its platform.
Testing and understanding the effectiveness of such strategies can help businesses thrive and prepare for future unpredictable times. Were the techniques (e.g., increasing pay) used by companies such as Uber and DoorDash effective? Did they result in more service workers joining the platforms? How was productivity affected? Did workers spend more time on the platform? How can rideshare and other gig economy companies measure their success when it comes to driver and courier incentives? This article outlines one possible methodology to help gig economy companies evaluate the effectiveness of their incentive spending.
What are the Incentives Available to Gig Economy Workers?
While rideshare workers do earn some predictable income (e.g., drivers receive fixed per-minute and per-mile amounts), the additional pay they earn can vary greatly. For example, rideshare companies frequently pay so-called “quest” incentives that allow drivers to earn extra money after completing a certain number of trips in a given time period, such as one week. They also offer additional pay if a driver starts a trip in a specified location at a previously communicated time and does not reject or cancel the next two trips (known as a “consecutive trip” bonus). The majority of platforms offer some type of surge pay: a bonus or increase in pay given to drivers and couriers during times when demand exceeds supply. In addition, gig economy companies guarantee certain pay during a worker’s first few weeks on the job and have referral bonuses for drivers who bring their friends or colleagues to the platform. Many food delivery companies also offer guaranteed earnings incentives that give additional bonuses to workers who complete a minimum number of deliveries but do not reach the previously advertised minimum pay.
These may be the most common incentives offered to gig economy workers, but it is important to note that there are unlimited ways for rideshare and delivery companies to reward their workforce. Given the abundant compensatory options available, companies can offer bonuses at any time for any behavior they deem desirable. If workers are needed at the airport at 4 AM on a Sunday, current app designs allow rideshare companies to easily communicate which times, days, and locations will be incentivized with extra payment. If more supply is required at times that are not easily predictable, live surge algorithms will adjust driver and courier payments while also showing workers the areas where they can earn the most money.
Measure Twice, Spend Once
So, how can the performance of such incentives be measured? First, it is important to know that apples-to-apples comparisons cannot always be made. A control group will be needed to help develop metrics. Instinctively, one may think that excluding a portion of drivers or couriers from the incentive would be enough to form a control group. However, when companies are fighting for workers in an extremely competitive labor market, it is not practical to deprive workers of any compensation they depend upon. Imagine if Lyft chose not to pay some drivers their usual surge income. How would drivers respond? How many of these drivers would turn off the app and sign up with Uber instead? As a result, excluding any number of workers from the incentive is neither feasible nor recommended.
Instead, experimenting with varied sizes of incentives for different, randomly chosen worker groups is more practical. For example, experimenters can divide all workers into two groups: Group A (roughly 50% of the sample) would receive 10% more than the average incentive payment, and Group B (the remaining 50%) would receive 10% less than the average incentive payment. How many workers are included and the actual difference in incentives for each group will vary based on the experimenter’s budget and resource availability.
If this specific incentive is run on a weekly basis, drivers must be randomly assigned to a group every week, so that no workers are treated unfairly by constantly being placed in the “low incentive pay” group. Re-randomizing both groups every time a new incentive starts minimizes the chances that the same workers repeatedly end up in one group. Once the distinct levels of incentives are determined and assigned, the total expenditure must be calculated and the benefits delivered by each group must be documented.
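The weekly re-randomization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production assignment service: the function name `assign_groups`, the `seed` parameter, and the worker identifiers are all hypothetical, and the split is fixed at roughly 50/50.

```python
import random


def assign_groups(worker_ids, week, seed=0):
    """Randomly split workers into a high- and a low-incentive group for one week.

    The random generator is seeded with both a base seed and the week number
    (an assumption made for reproducibility), so a given week's assignment can
    be reproduced later while each new week gets a fresh shuffle.
    """
    rng = random.Random(f"{seed}-{week}")
    shuffled = list(worker_ids)      # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"high": shuffled[:half], "low": shuffled[half:]}


# Hypothetical worker identifiers, for illustration only.
workers = [f"driver_{i}" for i in range(10)]
for week in (1, 2, 3):
    groups = assign_groups(workers, week)
    print(week, sorted(groups["high"]))
```

Because the draw is independent each week, the chance that a given worker lands in the low-incentive group several weeks in a row shrinks quickly as the number of weeks grows.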
The type of benefit being tested also matters. Were incentives provided to increase supply hours? If so, the change in total supply hours will need to be calculated. Were incentives provided to decrease customer wait times? If so, the difference in wait times achieved by each group of workers must be determined. One sample use case involves using supply hours as the main benefit of the incentive. Table 1 below shows the breakdown.
Table 1 - An Experiment to Improve Driver Supply Hours
The above example shows a 50-50 split between workers, which may be hard to replicate in reality. For instance, some workers may be on vacation the week of the incentive, others may be sick or otherwise unable to drive or deliver, and others may switch to a competitor’s platform or leave the gig economy altogether. In addition, the experimenter may intentionally choose a different split between the two groups. To account for these differences in group size, the spend and benefit numbers need to be adjusted based on the number of drivers in each group. The easiest way to accomplish this is to calculate “per worker” rates for each of the metrics. Modifying the above example based on this technique leads to Table 2.
Table 2 - Standardized Results of the Experiment
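The per-worker standardization can be sketched as follows. The class name `GroupResult`, its field names, and every figure below are hypothetical and chosen only to illustrate the arithmetic; they are not the values from Table 1 or Table 2.

```python
from dataclasses import dataclass


@dataclass
class GroupResult:
    workers: int         # workers who were active in the group that week
    total_spend: float   # incentive dollars paid to the group
    supply_hours: float  # total hours the group spent on the platform

    @property
    def spend_per_worker(self) -> float:
        return self.total_spend / self.workers

    @property
    def hours_per_worker(self) -> float:
        return self.supply_hours / self.workers


# Hypothetical results for unequal group sizes (illustration only).
group_a = GroupResult(workers=480, total_spend=26_400.0, supply_hours=12_000.0)
group_b = GroupResult(workers=520, total_spend=23_400.0, supply_hours=11_960.0)

for name, group in (("A", group_a), ("B", group_b)):
    print(name, group.spend_per_worker, group.hours_per_worker)
```

Dividing by the number of workers in each group makes the two groups comparable even when attrition or a deliberate design choice leaves them unequal in size.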
Cost Per Incremental Benefit and Marginal Benefits
The next step requires calculating the cost per incremental benefit or, in the case of the example, cost per incremental supply hour (CPISH). The formula shown below can guide us through the process:
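One way to express this calculation in code is sketched below. It assumes CPISH is the difference in per-worker spend between the higher- and lower-incentive groups divided by the difference in their per-worker supply hours. That reading follows from the metric's name but is an assumption here, as are the function name and the input figures, which are hypothetical and not taken from the tables.

```python
def cost_per_incremental_supply_hour(spend_high, hours_high, spend_low, hours_low):
    """Extra dollars spent per extra supply hour gained (all inputs per worker).

    Assumed definition: (spend_high - spend_low) / (hours_high - hours_low).
    """
    extra_hours = hours_high - hours_low
    if extra_hours <= 0:
        # No additional supply was generated, so the ratio is not meaningful.
        raise ValueError("higher incentive produced no additional supply hours")
    return (spend_high - spend_low) / extra_hours


# Hypothetical per-worker figures (illustration only).
cpish = cost_per_incremental_supply_hour(
    spend_high=55.0, hours_high=25.0,
    spend_low=45.0, hours_low=22.5,
)
print(cpish)  # 4.0
```

The guard against a non-positive difference matters in practice: if the higher-paid group did not work more hours, the incentive bought no incremental supply and the metric should not be reported as a cost.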
Using the numbers from the example leads to the following result:
This indicates that the experimenter is spending $4.83 to get one extra supply hour. Understanding what this means requires knowing the marginal benefit of a supply hour (MBSH). MBSH can be taken from the company’s existing production function. If the company does not have a production function, one can be estimated from historical supply hours and gross bookings by finding the formula that best predicts gross bookings from the number of supply hours, for example through a regression analysis. Once the function is obtained, the user should take its derivative with respect to supply hours and evaluate it at the company’s current total supply hours to get the MBSH. To keep the example simple, MBSH is assumed to be $5 after such calculations.
Based on that knowledge, spending $4.83 on an incremental supply hour yields $5 in additional gross bookings, which is acceptable because the return outweighs the cost. However, CPISH, as well as the marginal benefit of a supply hour, can change from week to week and from one incentive to the next. An experimenter who runs multiple incentives and wants to compare their effectiveness needs a unified metric. Dividing MBSH by CPISH provides exactly that:
The example leads to a result of approximately 1.04. Any number above 1 indicates that the incentive is running efficiently. A number below 1 indicates that the incentive costs more than the benefit it provides. This number can also be compared to the same metric calculated for other incentives, or for the same incentive run at various times, providing valuable information on how various incentives perform and how this performance changes over time.
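The ratio can be computed directly from the two figures used in the example, an MBSH of $5 and a CPISH of $4.83. The function name below is hypothetical.

```python
def incentive_efficiency(mbsh, cpish):
    """Return MBSH / CPISH; values above 1 mean the benefit exceeds the cost."""
    if cpish <= 0:
        raise ValueError("CPISH must be positive")
    return mbsh / cpish


ratio = incentive_efficiency(mbsh=5.00, cpish=4.83)
print(round(ratio, 2))  # 1.04
print("efficient" if ratio > 1 else "inefficient")
```

Because the result is a unitless ratio, it can be tracked week over week or placed side by side with the ratios of other incentives.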
Evaluating the performance of rideshare and delivery worker incentives is not an easy task. However, with millions of dollars spent each week on extra compensation, it is necessary to work through all the steps needed to understand how effective that compensation is, so that businesses can make smarter decisions. Given the unique and constantly changing nature of work platforms, an equally flexible metric is needed. A methodology similar to the one presented here can give teams a clearer picture of which incentives are worth their cost.