MOSAIC: Modeling Online Sharing of Animal Image Collections
Thesis posted on 2019-08-06, 00:00, authored by Lorenzo Semeria
Predicting how people share images on social media is crucial to understanding the bias that affects any image collection found online. This work therefore aims to provide a better understanding of how animal pictures are shared, with the ultimate goal of improving future estimates based on images extracted from online sources, particularly social networks. The focus on images is driven by the availability of effective tools, namely Wildbook™, that allow individual animals to be identified in images. Obtaining a rich dataset of pictures can be challenging, however. Using online sources of images, such as social networks, can make data collection both cheaper and more extensive. Unfortunately, because users choose arbitrarily which images to share online, the resulting dataset will inevitably be biased; for example, younger individuals may be overrepresented. Understanding this bias is at the core of this work, and to do so we built models that predict which images in a collection will be shared. Our models are designed to take into account both image-specific features and collection-specific ones, for example the structure of the SD card from which the images are chosen (ordering, distribution of species, ...). Introducing features that account for the collection in addition to the single image is a novelty, and it improved the models’ performance.