Attractiveness, Matching and Network Science (part I)
In this post we analyze a speed dating dataset using a Network Science approach. This kind of analysis is a hot topic nowadays, given the many dating apps that have appeared in recent years and the way society is evolving alongside technology. The dataset consists of the questionnaires answered by around 200 participants in a few speed dating events.
After some data cleaning and preprocessing, we end up with two vectors that represent every participant, each with five components. One of the vectors represents the personal characteristics of participant i, and the other indicates what this person looks for in the opposite sex. The components of the vectors are ranks for beauty, sincerity, intelligence, fun and ambition. By establishing a similarity measure between what person i looks for and the characteristics of person j, we can measure the interest of person i in meeting person j for coffee. These relationships define a bipartite network with directed links between people of different genders.
Each link has a weight (between 0 and 1) that expresses the strength of the attraction. The network is directed because romantic interest may not be reciprocated. The total number of links in the network is determined by a numerical threshold on the similarity between any pair of users: only pairs above the threshold are kept, so that the links represent true intentions of meeting for coffee.
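To make the construction above more concrete, here is a minimal sketch in Python. It rests on several assumptions that are not part of the original analysis: the similarity measure is taken to be cosine similarity (which stays between 0 and 1 for non-negative ranks), the function names and the threshold value of 0.9 are hypothetical, and the two participants are invented toy data.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity; lies in [0, 1] when all components are non-negative."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def build_directed_links(women, men, threshold):
    """women/men map a name to (traits, looks_for). Returns {(i, j): weight}.

    A link i -> j is kept when what i looks for is similar enough to j's traits.
    Links only connect people of different gender, so the network is bipartite.
    """
    links = {}
    for group_a, group_b in ((women, men), (men, women)):
        for i, (_, looks_for_i) in group_a.items():
            for j, (traits_j, _) in group_b.items():
                weight = cosine_similarity(looks_for_i, traits_j)
                if weight >= threshold:
                    links[(i, j)] = weight
    return links

# Invented toy data; component order: beauty, sincerity, intelligence, fun, ambition
women = {"w1": ((5, 1, 1, 1, 1), (1, 5, 1, 1, 1))}
men = {"m1": ((1, 5, 1, 1, 1), (1, 1, 5, 1, 1))}
links = build_directed_links(women, men, threshold=0.9)
```

In this toy example w1 is looking for exactly the profile of m1, so the link w1 -> m1 survives the threshold, while m1 is looking for something w1 does not offer and the reverse link is dropped. Raising or lowering the threshold directly controls how many links the network keeps.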
Here you can explore the network using the BeGraph visualizer. You can find technical details on how we created the network in the appendix at the end of this post.
We can see that there are a few reciprocal links between women (crosses) and men (triangles). A reciprocal link means that the attraction is mutual, so the system or the event organizers should definitely recommend that those two people meet for coffee. From the Network Science point of view, people with high InDegree (number of incoming links) are attractive people, while people with high OutDegree (number of outgoing links) are attracted to profiles that are common in the opposite sex.
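As an illustration of these three quantities, the following sketch counts InDegree and OutDegree and lists the reciprocal pairs from a set of directed links. The function name and the link list are hypothetical toy data, not the real network.

```python
from collections import Counter

def degree_summary(links):
    """links: iterable of (source, target) directed pairs."""
    links = set(links)
    in_degree = Counter(target for _, target in links)
    out_degree = Counter(source for source, _ in links)
    # A pair is reciprocal when both directions are present; store each pair once.
    mutual = {tuple(sorted((i, j))) for i, j in links if (j, i) in links}
    return in_degree, out_degree, mutual

# Invented toy links: w1 and m1 like each other, w2 likes m1, m2 likes w1
links = [("w1", "m1"), ("m1", "w1"), ("w2", "m1"), ("m2", "w1")]
in_degree, out_degree, mutual = degree_summary(links)
```

Here m1 and w1 both receive two links, and the only pair the organizers would recommend outright is (m1, w1), because it is the only one with links in both directions.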
We may identify the most important, successful or influential people in the network using the PageRank centrality measure. PageRank is used by Google to rank webpages in its search engine based on each page's position in the link structure of the whole Web. In this dating network it identifies people who attract other people who in turn are also very attractive. We can say that it is a global centrality measure, in contrast to the InDegree, which is purely local (restricted to the neighborhood of a node).
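The difference between the local InDegree and the global PageRank can be seen in a small sketch. This is a plain power-iteration version of weighted PageRank written for illustration only; it is not necessarily what BeGraph computes, and the damping factor of 0.85, the iteration count and the toy links are assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links: dict {(source, target): weight}. Returns dict node -> score."""
    nodes = sorted({node for edge in links for node in edge})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (src, _), weight in links.items():
        out_weight[src] += weight
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0)
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for (src, dst), weight in links.items():
            if out_weight[src] > 0:
                # Each node passes its rank along its links, proportionally to weight.
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank

# Invented toy links: w1 and m2 both have InDegree 1, but w1 is liked by
# the popular m1, while m2 is liked by w3, whom nobody likes.
links = {("w1", "m1"): 1.0, ("w2", "m1"): 1.0, ("m1", "w1"): 1.0, ("w3", "m2"): 1.0}
rank = pagerank(links)
```

In this example w1 and m2 are indistinguishable by InDegree, yet w1 gets a higher PageRank because her single admirer is himself attractive. That is the sense in which the measure is global.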
Combining the reciprocal weights leads to an undirected version of the network that is easier to analyze and visualize (although we lose some information). In this network a link means that at least one of the two parties has some interest in meeting for coffee. This actually makes sense: who hasn’t met for coffee with someone who did not look very attractive at first sight but showed a lot of interest in you?
In the undirected network, node popularity is represented by the Degree, and the Eigenvector Centrality ranks the most important nodes. The Eigenvector Centrality plays a role here similar to that of PageRank in the directed network.
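The undirected version and its Eigenvector Centrality can be sketched as follows. The post does not specify how the two reciprocal weights are combined, so summing them is an assumption made here for illustration, as are the function names and the toy weights. The iteration uses the adjacency matrix plus the identity because a bipartite network has eigenvalues in positive and negative pairs, which would make a plain power iteration oscillate.

```python
import math

def to_undirected(links):
    """Merge i -> j and j -> i into one undirected weight (assumed here: their sum)."""
    undirected = {}
    for (i, j), weight in links.items():
        key = tuple(sorted((i, j)))
        undirected[key] = undirected.get(key, 0.0) + weight
    return undirected

def eigenvector_centrality(undirected, iterations=200):
    """Power iteration on (A + I); returns a unit-length dict node -> score."""
    nodes = sorted({node for edge in undirected for node in edge})
    x = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        new_x = dict(x)  # the identity term keeps each node's own score
        for (i, j), weight in undirected.items():
            new_x[i] += weight * x[j]
            new_x[j] += weight * x[i]
        norm = math.sqrt(sum(value * value for value in new_x.values()))
        x = {v: value / norm for v, value in new_x.items()}
    return x

# Invented toy links forming the chain w2 - m1 - w1 - m2 once merged
links = {("w1", "m1"): 0.9, ("m1", "w1"): 0.8, ("w2", "m1"): 0.7, ("m2", "w1"): 0.6}
undirected = to_undirected(links)
centrality = eigenvector_centrality(undirected)
```

In this chain the two people in the middle, m1 and w1, come out as the most central, since each is strongly tied to the other central node. The people at the ends score lower even though each of them has a link of similar weight.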
In conclusion, we have created and represented with BeGraph a network of possible matches for a dating application. Identifying the key nodes is straightforward and computationally very cheap. From an individual point of view, finding yourself in the network and studying your surroundings could give you an idea of which aspects of yourself you should reinforce to become more successful at meeting people from the opposite sex for coffee. Or it could just tell you whether the man or woman of your dreams is easy to find or almost unique!
In the next post we will deal with more interesting calculations: partitioning the network and the propagation of STDs.