Posted on 2016-05-04, 00:00. Authored by J Wang, CT Yu, PS Yu, B Liu, WY Meng
There has been a recent swell of interest in the analysis of blog comments. However, much of the work focuses on detecting comment spam in the blogosphere. An important issue that has been neglected so far is the identification of diversionary comments. Diversionary comments are defined as comments that divert the topic from the original post. A possible purpose is to distract readers from the original topic and draw attention to a new topic. We categorize diversionary comments into five types based on our observations and propose an effective framework to identify and flag them. To the best of our knowledge, the problem of detecting diversionary comments has not been studied so far. We solve the problem in two different ways: (i) by ranking all comments in descending order of how likely they are to be diversionary and (ii) by treating it as a classification problem. Our evaluation on 4,179 comments under 40 different blog posts from Digg and Reddit shows that the proposed method achieves a high mean average precision of 91.9% when the problem is treated as a ranking problem and an F-measure of 84.9% when it is treated as a classification problem. Sensitivity analysis indicates that the effectiveness of the method is stable under different parameter settings.
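For readers unfamiliar with the two evaluation measures named in the abstract, the following is a minimal sketch of how mean average precision (for the ranking setting) and F-measure (for the classification setting) are conventionally computed. It is not the authors' code; the function names and the toy labels are hypothetical, and the balanced F1 form of the F-measure is assumed.

```python
from typing import Sequence


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision for one post.

    ranked_labels[i] is 1 if the comment at rank i+1 is truly
    diversionary and 0 otherwise (ranks ordered most to least
    diversionary according to the system being evaluated).
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(posts: Sequence[Sequence[int]]) -> float:
    """Mean of the per-post average precision values."""
    return sum(average_precision(p) for p in posts) / len(posts)


def f_measure(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Balanced F-measure (F1) for binary labels, 1 = diversionary."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels only, invented for illustration (not data from the paper).
    toy_rankings = [[1, 0, 1, 0], [1, 1, 0]]
    print(mean_average_precision(toy_rankings))
    print(f_measure([1, 1, 0, 0], [1, 0, 1, 0]))
```

In the ranking setting each blog post contributes one ranked list, and the per-post scores are averaged; in the classification setting every comment receives a binary label and a single F-measure is computed over them.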
Funding
This work was partially supported by NSF through grants CNS-1115234 and IIS-1407927, a Google Research Award, and the Pinnacle Lab at Singapore Management University.
Publisher Statement
This is a non-final version of an article published in final form in Wang, J., Yu, C. T., Yu, P. S., Liu, B. and Meng, W. Y. Diversionary Comments under Blog Posts. ACM Transactions on the Web. 2015. 9(4). DOI: 10.1145/2789211.