Predicting the Popularity of Pet Images

Thousands of animals live in shelters, and many more live on the streets. To reduce cruelty toward these animals and prevent their euthanasia, we need to increase adoption rates. Animals with cute pictures are more likely to get adopted, so shelters need a way to estimate and improve the “cuteness” of their animals' photos to get them adopted faster. The goal of our project is to use machine learning to predict “cuteness” accurately and thereby help increase adoption rates from shelters.

Problem definition

For CS7641, our project aims to estimate the cuteness/popularity of images of shelter animals. This is an open Kaggle challenge. The dataset contains raw images of shelter animals along with metadata. The metadata consists of a set of binary features, such as the presence of eyes or a face.

In this project, we use both supervised and unsupervised learning to estimate the popularity/cuteness of images. In particular, we use representation learning to learn features from the raw images, and PCA to reduce the metadata to its most prominent components. Finally, we plan to demonstrate the effectiveness of our solution by plotting training and validation loss and by running an ablation study.
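To make the representation-learning step concrete, the following is a minimal sketch of one common contrastive objective for self-supervised training, the NT-Xent loss. The proposal does not fix a specific loss, so the choice of NT-Xent, the function name `nt_xent_loss`, and the temperature value are assumptions. The random arrays stand in for encoder outputs of two augmented views of the same images.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for paired embeddings; row i of z1 and z2 come from the same image."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit norm, so dot product = cosine
    sim = z @ z.T / temperature                        # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                     # a sample is never its own negative
    # The positive partner of row i is row i + N (and vice versa).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Row-wise log-softmax, computed stably.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())

# Placeholder embeddings (assumption: real ones would come from the image encoder).
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
aligned_loss = nt_xent_loss(view_a, view_a + 0.01 * rng.normal(size=view_a.shape))
unrelated_loss = nt_xent_loss(view_a, rng.normal(size=view_a.shape))
print(aligned_loss, unrelated_loss)
```

The loss is lower when the two views of each image map to nearby embeddings than when the pairs are unrelated, which is the behavior the encoder is trained to produce.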

Methods

  1. Self-supervised/unsupervised contrastive learning on the image dataset. [3,5]
  2. PCA to extract the most prominent components of the metadata features provided with the dataset.
  3. Merge the PCA-reduced metadata with the encodings from self-supervised training for the final regression task, using either a linear or a non-linear regression head.
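Steps 2 and 3 can be sketched as follows. This is only an illustration under stated assumptions: the array shapes, the number of retained components, the helper names (`pca_fit_transform`, `fit_linear_head`, `predict`), and the randomly generated data are all placeholders, and the image encodings would in practice come from the contrastively trained encoder.

```python
import numpy as np

def pca_fit_transform(X, n_components):
    """Project X onto its top principal components, computed via SVD."""
    mean = X.mean(axis=0)
    centered = X - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]                 # rows are principal directions
    return centered @ components.T, mean, components

def fit_linear_head(features, targets):
    """Least-squares linear regression head with a bias term."""
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def predict(features, weights):
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    return design @ weights

# Placeholder data (assumption: shapes and values are illustrative only).
rng = np.random.default_rng(0)
n_images = 200
metadata = rng.integers(0, 2, size=(n_images, 12)).astype(float)  # binary flags
embeddings = rng.normal(size=(n_images, 32))                      # stand-in encodings
scores = rng.uniform(1, 100, size=n_images)                       # stand-in targets

meta_pca, _, _ = pca_fit_transform(metadata, n_components=4)      # step 2
merged = np.hstack([embeddings, meta_pca])                        # step 3: merge
weights = fit_linear_head(merged, scores)                         # step 3: linear head
rmse = float(np.sqrt(np.mean((predict(merged, weights) - scores) ** 2)))
print(merged.shape, rmse)
```

Simple concatenation is used here as the merge operation; other fusion schemes would fit the same outline.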

Potential results and discussion

We will train a model whose predicted score is reasonably close to the target score. We then plan to run ablation studies on model performance along the following axes:

  1. Feature reduction through PCA vs. LDA
  2. Linear vs. non-linear regression head on top of the contrastively trained model
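The second ablation could be set up along the following lines. This is a sketch on synthetic data, not our actual experiment: the non-linear head is approximated by a fixed random ReLU layer followed by a ridge readout (a stand-in for a trained MLP head), and the helper names, data, and hyperparameters are assumptions.

```python
import numpy as np

def add_bias(X):
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_ridge(X, y, alpha):
    """Ridge regression with a bias column, solved in closed form."""
    A = add_bias(X)
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y)

def rmse(pred, y):
    return float(np.sqrt(np.mean((pred - y) ** 2)))

# Synthetic stand-in for merged features and a non-linear target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(2500, 8))
y = np.abs(X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=2500)
X_train, X_val = X[:2000], X[2000:]
y_train, y_val = y[:2000], y[2000:]

# Linear head.
w_linear = fit_ridge(X_train, y_train, alpha=1e-3)
rmse_linear = rmse(add_bias(X_val) @ w_linear, y_val)

# Non-linear head: fixed random hidden layer + ReLU, then a linear readout.
W_hidden = rng.normal(size=(8, 128))
b_hidden = rng.normal(size=128)
H_train = np.maximum(X_train @ W_hidden + b_hidden, 0.0)
H_val = np.maximum(X_val @ W_hidden + b_hidden, 0.0)
w_nonlinear = fit_ridge(H_train, y_train, alpha=1.0)
rmse_nonlinear = rmse(add_bias(H_val) @ w_nonlinear, y_val)

print(f"validation RMSE  linear: {rmse_linear:.3f}  non-linear: {rmse_nonlinear:.3f}")
```

Both heads are scored on the same held-out split, so the comparison isolates the effect of the head; the same pattern applies to the PCA vs. LDA comparison by swapping the feature-reduction step instead.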

We will then plot the results for easy visualization and inference.
Finally, we also want to explore which pet pictures the model assigns high scores to and whether its predictions show any bias.

Responsibilities

Timeline