Chapter 1 Introduction

In this project, we use data visualization for exploratory data analysis. The data comes from the 2006 Speed Dating dataset gathered by three professors of Columbia Business School: Sheena Iyengar, Emir Kamenica, and Itamar Simonson.

The dataset and its data key are available at https://data.world/annavmontoya/speed-dating-experiment. An archived web page describing the experiment is at https://www8.gsb.columbia.edu/researcharchive/articles/867.

Data Source: https://data.world/annavmontoya/speed-dating-experiment

The data was collected in an experiment designed to identify the main factors that determine whether men and women would consider someone of the opposite sex as a partner. In the experiment, each participant met every participant of the opposite gender in turn and rated each one based on the impression formed during a four-minute conversation. The full article is available at https://www0.gsb.columbia.edu/mygsb/faculty/research/pubfiles/867/fisman%20iyengar.pdf.

The full citation for the data is:

Fisman, Raymond, Sheena Iyengar, Emir Kamenica, and Itamar Simonson. “Gender Differences in Mate Selection: Evidence from a Speed Dating Experiment.” Quarterly Journal of Economics 121, no. 2 (May 2006): 673-97.

The main conclusions of the experiment were as follows:

  • Women generally place more weight on the intelligence and the race of their partner.
  • Men are more responsive to physical attractiveness.
  • Most men do not value women’s intelligence or ambition when it exceeds their own.
  • Women tend to prefer men who grew up in affluent neighborhoods.
  • Male selectivity does not vary with group size, whereas female selectivity increases strongly with group size.

We aim to visualize and verify these results using a variety of graphs and plots. We also want to look for underlying patterns in the dataset that have not yet been explored.
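
As a rough illustration of how the last conclusion in the list above (selectivity versus group size) might be checked, the sketch below computes the share of "yes" decisions per gender and group size. It is only a sketch under assumptions: the column names `gender` (0 = female, 1 = male), `round` (number of people met in a session), and `dec` (1 = yes) are assumed to match the data key and should be confirmed against it, and the toy values are invented placeholders, not real results.

```python
import pandas as pd

# Toy stand-in for the real data: one row per decision a participant makes
# about a partner. Column names are ASSUMED to follow the data key; the
# values are made up purely to show the computation.
toy = pd.DataFrame({
    "gender": [0, 0, 0, 0, 1, 1, 1, 1],          # assumed: 0 = female, 1 = male
    "round":  [10, 10, 20, 20, 10, 10, 20, 20],  # assumed: group size of the session
    "dec":    [1, 0, 0, 0, 1, 1, 1, 0],          # assumed: 1 = yes, 0 = no
})


def yes_rate_by_group_size(df):
    """Share of 'yes' decisions per group size and gender.

    A lower yes rate corresponds to higher selectivity.
    """
    return (
        df.groupby(["gender", "round"])["dec"]
          .mean()
          .unstack("gender")
          .rename(columns={0: "female", 1: "male"})
    )


rates = yes_rate_by_group_size(toy)
print(rates)
```

Applied to the real dataset instead of `toy`, a table like `rates` could then be plotted as one line per gender, with group size on the horizontal axis, to see whether the two lines behave differently.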