calista-ai/website-aesthetics-research

Deep Learning models to evaluate Website Aesthetics

Website aesthetics play an important role in attracting users and customers as well as in enhancing user experience.

In this repository, we develop deep learning models that evaluate the aesthetics of a website and achieve high correlation with human perception.

Paper: "Calista: A deep learning-based system for understanding and evaluating website aesthetics"

Cite as:
@article{DELITZAS2023,
	title = {Calista: A deep learning-based system for understanding and evaluating website aesthetics},
	journal = {International Journal of Human-Computer Studies},
	volume = {175},
	pages = {103019},
	year = {2023},
	issn = {1071-5819},
	doi = {https://doi.org/10.1016/j.ijhcs.2023.103019},
	url = {https://www.sciencedirect.com/science/article/pii/S1071581923000253},
	author = {Alexandros Delitzas and Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis}
}

Table of Contents

  1. Datasets

  2. Rating-based models

    2.1. Rating-based approach I

    2.2. Rating-based approach II

    2.3. Rating-based approach III

  3. Comparison-based models

    3.1. Comparison-based approach I

  4. Requirements

Datasets

For the training process, we rely on two different datasets.

  • Rating-based dataset: Users were asked to rate a webpage screenshot by providing an explicit numerical value on a scale.

  • Comparison-based dataset: Users were asked to compare two webpage screenshots at a time and choose which of the two they preferred.

You can find more about these datasets here.

Rating-based models

Models trained and evaluated using the Rating-based dataset.

Evaluation method: Hold-out split, with 75.4% of the data used for training and 24.6% for testing

Results synopsis:

| Rating-based model | Pearson correlation coefficient | RMSE | Accuracy (2 classes) |
| --- | --- | --- | --- |
| Approach I | 0.78 [0.69, 0.85] | 0.616 | 88.78% |
| Approach II | 0.76 [0.66, 0.83] | 0.662 | 84.69% |
| Approach III | 0.78 [0.68, 0.85] | 0.628 | 83.67% |
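For readers who want to reproduce the three metrics in the table, the following is a minimal sketch in plain Python. The function names are hypothetical and not taken from this repository. In particular, the threshold used to split scores into two classes is an assumption (the midpoint of the 1-9 scale); the README does not state how the two classes were defined.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def rmse(y_true, y_pred):
    """Root mean squared error between ground-truth and predicted scores."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def two_class_accuracy(y_true, y_pred, threshold=5.0):
    """Fraction of samples placed on the same side of the threshold.

    ASSUMPTION: threshold=5.0 (midpoint of the 1-9 scale) is illustrative;
    the actual class boundary is not given in this README.
    """
    hits = sum((t >= threshold) == (p >= threshold)
               for t, p in zip(y_true, y_pred))
    return hits / len(y_true)
```

A call such as `pearson(human_scores, model_scores)` would then take the mean human rating per test webpage and the corresponding model outputs.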

Rating-based approach I

Description: In this approach, the model was trained using the mean value of the user ratings for each website. The model's output is an aesthetics score on the scale 1-9.

Transfer-learning: Flickr-Style was used as a base network

Rating-based approach II

Description: In this approach, the model was trained on the distribution of the user ratings for each website, expressed as an empirical probability mass function. The model's output is a predicted distribution over the aesthetics scores. The final score is the mean of the predicted distribution (scale 1-9).
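As an illustration of the two steps described above, the sketch below builds an empirical probability mass function from a list of ratings and reduces a distribution to its mean. It assumes integer ratings on the 1-9 scale; the function names are hypothetical and are not part of this repository.

```python
def empirical_pmf(ratings, scale=range(1, 10)):
    """Share of ratings falling on each score of the scale (sums to 1).

    ASSUMPTION: ratings are integers that lie on the given scale.
    """
    counts = {score: 0 for score in scale}
    for rating in ratings:
        counts[rating] += 1
    total = len(ratings)
    return [counts[score] / total for score in scale]


def distribution_mean(pmf, scale=range(1, 10)):
    """Expected score under a distribution over the scale."""
    return sum(score * prob for score, prob in zip(scale, pmf))


# Hypothetical ratings given by four users to one webpage.
target = empirical_pmf([7, 7, 8, 6])
final_score = distribution_mean(target)
```

During training the PMF would serve as the target; at inference time the same `distribution_mean` reduction would be applied to the predicted distribution.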

Transfer-learning: NIMA-MobileNet was used as a base network

Rating-based approach III

Description: In this approach, the model was trained using all individual (webpage, user rating) pairs. The model's output is an aesthetics score on the scale 1-9.
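To make the difference between the training targets concrete, the sketch below builds two target lists from the same ratings: one sample per individual rating, as described for this approach, and one sample per webpage using the mean rating. The data, file names, and function names are hypothetical.

```python
from statistics import mean

# Hypothetical ratings: screenshot file name -> ratings given by users.
ratings = {"site_a.png": [6, 7, 8], "site_b.png": [3, 4]}


def pair_targets(ratings_by_page):
    """One training sample per individual (webpage, user rating) pair."""
    return [(page, rating)
            for page, page_ratings in ratings_by_page.items()
            for rating in page_ratings]


def mean_targets(ratings_by_page):
    """One training sample per webpage, labelled with its mean rating."""
    return [(page, mean(page_ratings))
            for page, page_ratings in ratings_by_page.items()]
```

With per-pair targets the same screenshot appears several times with different labels, so the model sees the spread of opinions rather than a single aggregate.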

Transfer-learning: Flickr-Style was used as a base network

Comparison-based models

Models trained and evaluated using the Comparison-based dataset.

Evaluation method: Leave-one-out Cross Validation
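Leave-one-out cross validation holds out one sample at a time, trains on the rest, and predicts the held-out sample. A minimal sketch of the loop is shown below; `mean_baseline` is only a stand-in for the actual model, and all names are hypothetical.

```python
def leave_one_out(n_samples):
    """Yield (train_indices, test_index) for every sample in turn."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out


def loo_predictions(targets, fit_predict):
    """Collect one prediction per sample, each made without seeing it."""
    predictions = []
    for train_idx, test_idx in leave_one_out(len(targets)):
        train_targets = [targets[i] for i in train_idx]
        predictions.append(fit_predict(train_targets, test_idx))
    return predictions


def mean_baseline(train_targets, test_idx):
    """Stand-in for a trained model: predicts the mean training score."""
    return sum(train_targets) / len(train_targets)
```

The collected predictions can then be compared against the ground-truth scores with the Pearson correlation coefficient and RMSE.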

Results synopsis:

| Comparison-based model | Pearson correlation coefficient | RMSE |
| --- | --- | --- |
| Approach I | 0.70 [0.58, 0.79] | 1.353 |

Comparison-based approach I

Description: The dataset contains an aesthetics score for each website, calculated with the Bradley-Terry model from the pairwise comparisons collected from the users. In this approach, the model was trained using the Bradley-Terry ratings. The model's output is an aesthetics score on the scale 1-10.
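For context, the Bradley-Terry model assigns each item a positive strength so that the probability of item i being preferred over item j is p_i / (p_i + p_j). The sketch below fits these strengths with the standard iterative (minorization-maximization) update. It is an illustration only: the fitting procedure used for the dataset and the mapping of strengths onto the 1-10 scale are not described in this README, and the function name and input layout are assumptions.

```python
def bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths from a matrix of pairwise wins.

    wins[i][j] is the number of times item i was preferred over item j.
    ASSUMPTION: every item has at least one win and one loss, so all
    strengths stay positive and finite.
    """
    n = len(wins)
    strengths = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                        for j in range(n) if j != i)
            updated.append(total_wins / denom if denom > 0 else strengths[i])
        # Strengths are only defined up to a constant factor,
        # so rescale them to have mean 1 after every pass.
        scale = n / sum(updated)
        strengths = [value * scale for value in updated]
    return strengths


# Hypothetical comparison counts for three webpages.
example = bradley_terry([[0, 3, 4],
                         [1, 0, 3],
                         [0, 1, 0]])
```

Here the first webpage wins most of its comparisons and therefore receives the largest strength.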

Transfer-learning: Calista Rating-Based was used as a base network

Requirements

  • Step 1: Download the datasets

      git submodule update --init
    
  • Step 2: Download the pretrained models into the folder pretrained-models/. Links can be found here.

The code was tested using Python 3.6.