Supply Conditioning in Online Marketplaces: Learning and Incentives
thesis
posted on 2023-08-01, 00:00authored byEren Ozbay
We consider the problem of ensuring efficient matchmaking in online marketplaces where the qualities of value-providing supply units (e.g., sellers, workers, service providers) are unknown when they first join. We first model this as a stochastic multi-armed bandit problem: Given K arms, instead of maximizing the expected total reward from T pulls (the traditional “sum” objective), we consider the vector of total rewards earned from each of the K arms at the end of T pulls and aim to maximize the expected highest total reward across arms (the “max” objective). For this objective, we show that any policy must incur an instance-dependent asymptotic regret of Ω(logT) (with a higher instance-dependent constant compared to the traditional objective) and a worst-case regret of Ω(K^(1/3) T^(2/3)). We then generalize our algorithmic insights to the problem of maximizing the expected value of the average total reward of the top m arms with the highest total rewards. For both models, we design adaptive explore-then-commit policies and present numerical experiments that demonstrate their efficacy.
We then consider a model where there are no centralized incentives that subsidize matchings which would allow the discovery of the qualities of new supply units and we inquire whether competitive forces in the market may generate adequate incentives for such (exploratory) behavior. We consider a fluid market in continuous time where new supply units arrive at some base information state and transition between information states as they get matched to a steady inflow of customers. (These information states could represent their publicly observable ratings, skill levels, etc.) We define a new market equilibrium notion called the local market equilibrium (LME), which characterizes the equilibrium prices and matchings arising in such markets. After proving the existence of the LME along with several structural and computational results, we present a comprehensive analysis of the performance under LME benchmarked against the optimal reward achieved by a centralized matching policy. We tightly characterize a novel market regularity condition under which the relative loss of the LME vs. the optimum is bounded below by the ratio of demand and supply. We define two broad classes of market instances in which the unique LME corresponds to system optimality, and additionally address the problem of designing optimal targeted price interventions that align the LME with the system-optimal solution.
History
Advisor
Kamble, Vijay
Chair
Kamble, Vijay
Department
Business Administration
Degree Grantor
University of Illinois at Chicago
Degree Level
Doctoral
Degree name
PhD, Doctor of Philosophy
Committee Member
Nadarajah, Selvaprabu
Soheili, Negar
Tulabandhula, Theja
Kash, Ian