Automated Inference of Shilling Behavior in Online Auction Systems
Thesis posted on 2012-12-13, 00:00, authored by Fei Dong
Auction fraud has grown as online auction platforms have expanded in use. Shill bidding is one of the most prevalent forms of auction fraud and undermines the integrity of online auctions. A shill is a person who pretends to be a legitimate buyer and feigns enthusiasm for an auctioned item in order to bid up its price. Although the punishment for auction fraud can be severe (e.g., several years in prison with fines), shill bidding remains widespread. One primary reason is the lack of effective shill detection techniques in current online auction systems. Shill inference in online auctions is a difficult problem because shill bidding activities are concealed and online applications are anonymous by nature. Shill bidding usually occurs without leaving obvious direct physical evidence, so victims cannot easily notice it. In addition, because online auction users do not deal with each other face to face, any acquired “hints” or evidence of shilling behavior generally involve uncertainty, which makes the investigation even more challenging. We propose an automated and effective approach to infer shills in online auction systems. To assist the investigation, we conducted an empirical study of the relationship between final auction price and shill bidding activity. Comparing the actual final price against a predicted price can help distinguish trustworthy auctions from likely shill-infected auctions. To identify the specific shills, we propose to formalize various auction-level and bid-level indicators that provide evidence for shill bidding as well as for innocent bidding. Since each indicator can involve uncertainty, we employ a formal reasoning technique, Dempster-Shafer (D-S) theory, to model the uncertainties associated with different indicators that pertain to varied aspects of an auction.
This allows us to explicitly represent the uncertainties and combine knowledge from different sources to produce an aggregated assessment of trustworthiness.
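As an illustration of how D-S theory can combine uncertain evidence from two indicators, the following is a minimal sketch of Dempster's rule of combination over the two-hypothesis frame {shill, innocent}. It is not the thesis's implementation: the abstract does not specify the indicators or how their masses are assigned, so the names `combine`, `belief`, and `plausibility` are hypothetical, and any mass values passed in are assumed to come from some indicator-specific scoring step that is not shown here.

```python
from itertools import product
import math

# Frame of discernment (an assumption for this sketch): a bidder is
# either a shill or innocent. UNKNOWN is the whole frame and holds the
# mass an indicator leaves uncommitted, i.e. its uncertainty.
SHILL = frozenset({"shill"})
INNOCENT = frozenset({"innocent"})
UNKNOWN = SHILL | INNOCENT


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses to a
    non-negative mass, with the masses summing to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Contradictory evidence contributes to the conflict mass K.
            conflict += wa * wb
    if math.isclose(conflict, 1.0):
        raise ValueError("sources are totally conflicting; rule is undefined")
    # Normalize by 1 - K so the combined masses sum to 1 again.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


def belief(m, hypothesis):
    """Lower bound: total mass committed to subsets of the hypothesis."""
    return sum(w for h, w in m.items() if h <= hypothesis)


def plausibility(m, hypothesis):
    """Upper bound: total mass not contradicting the hypothesis."""
    return sum(w for h, w in m.items() if h & hypothesis)
```

For example, with made-up masses `{SHILL: 0.6, UNKNOWN: 0.4}` from one indicator and `{SHILL: 0.3, INNOCENT: 0.2, UNKNOWN: 0.5}` from another, `combine` yields a conflict mass of 0.12 and a combined belief in the shill hypothesis of 0.60 / 0.88, about 0.68. The gap between `belief` and `plausibility` is the uncertainty that remains after combination, which is what lets the assessment stay explicit about how much the evidence actually supports.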