Recent Research Publications

Kosinski, M., Stillwell, D.J., Graepel, T. (2013) Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences (PNAS).
Abstract
We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait “Openness,” prediction accuracy is close to the test–retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy.
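The pipeline described in this abstract (dimensionality reduction, then regression) can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the reduction is a plain truncated SVD, the classifier is a hand-written gradient-descent logistic regression, and all function names and parameter values (`svd_reduce`, `fit_logistic`, `k=5`, the Like probabilities) are assumptions made for illustration. Accuracy is measured in-sample for brevity.

```python
import numpy as np

def svd_reduce(likes, k):
    """Project a user-by-Like matrix onto its top-k singular directions."""
    centered = likes - likes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]

def predict_proba(x, w, b):
    """Logistic model: probability that the binary trait equals 1."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def fit_logistic(x, y, lr=0.1, steps=2000):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        err = predict_proba(x, w, b) - y
        w -= lr * (x.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

# Synthetic stand-in for Likes data (hypothetical, not the study's dataset):
# users with trait == 1 are more likely to Like the first 10 pages.
rng = np.random.default_rng(0)
n_users, n_likes = 200, 50
trait = rng.integers(0, 2, size=n_users)
shift = np.zeros(n_likes)
shift[:10] = 0.6
like_prob = 0.1 + np.outer(trait, shift)
likes = (rng.random((n_users, n_likes)) < like_prob).astype(float)

components = svd_reduce(likes, k=5)
w, b = fit_logistic(components, trait)
accuracy = np.mean((predict_proba(components, w, b) > 0.5) == trait)
print(f"in-sample accuracy: {accuracy:.2f}")
```

A real analysis would evaluate on held-out users rather than on the training data; the sketch only shows how the two stages connect.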
Schwartz, H. A., Eichstaedt, J.C., Kern, M.L., Dziurzynski, L., Ramones, S.M., Agrawal, M., Shah, A., Kosinski, M., Stillwell, D.S., Seligman, M.E.P., Ungar, L.H. (2013) Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach. PLOS ONE.
Kosinski, M., Stillwell, D.J., Kohli, P., Bachrach, Y., Graepel, T. (2012) Personality and Website Choice. Web Science (Evanston, IL)
Abstract
We find that preference for websites, like preference for objects in the offline world, is influenced by personality. We combine personality profiles and website choices of more than 160,000 users and investigate whether different websites attract audiences with different personalities. Using two independent sources of website choices, we show that website audiences often have distinct personality profiles, that there is a psychologically meaningful relationship between personality and preferences for websites and website categories, and that results are stable across independent data sources. Our findings are useful for researchers interested in website content personalization, text search, search result optimization and online marketing. (view paper)
Kosinski, M., Bachrach, Y., Kasneci, G., Van Gael, J., Graepel, T. (2012) Crowd IQ: Measuring the Intelligence of Crowdsourcing Platforms. Web Science (Evanston, IL)
Abstract
We measure crowdsourcing performance based on a standard IQ questionnaire, and examine Amazon’s Mechanical Turk (AMT) performance under different conditions. These include variations of the payment amount offered, the way incorrect responses affect workers’ reputations, threshold reputation scores of participating AMT workers, and the number of workers per task. We show that crowds composed of workers of high reputation achieve higher performance than low reputation crowds, and that the effect of the amount of payment is non-monotone: both paying too much and paying too little affect performance. Furthermore, higher performance is achieved when the task is designed such that incorrect responses can decrease workers’ reputation scores. Using majority vote to aggregate multiple responses to the same task can significantly improve performance, which can be further boosted by dynamically allocating workers to tasks in order to break ties. (view paper)
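As an illustration of the aggregation step mentioned in this abstract, majority voting over several workers' responses can be sketched as follows. This is a hypothetical minimal example, not the paper's implementation; the function names and the convention of returning `None` on a tie (to signal that another worker could be allocated to break it) are assumptions.

```python
from collections import Counter

def majority_vote(responses):
    """Return the most common answer, or None if there is no strict winner."""
    ranked = Counter(responses).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: a caller could request one more response
    return ranked[0][0]

def aggregate(responses_by_question):
    """Apply majority voting independently to each question."""
    return {q: majority_vote(r) for q, r in responses_by_question.items()}

# Hypothetical responses from three workers on two questions.
crowd = {"q1": ["A", "B", "A"], "q2": ["C", "D"]}
print(aggregate(crowd))
```

Returning `None` rather than picking an arbitrary winner keeps the tie visible, which is the case where dynamically adding a worker would help.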
Bachrach, Y., Kosinski, M., Graepel, T., Kohli, P., Stillwell, D.J. (2012) Personality and Patterns of Facebook Usage. Web Science (Evanston, IL)
Abstract
We show how users’ activity on Facebook relates to their personality, as measured by the standard Five Factor Model. Our dataset consists of the personality profiles and Facebook profile data of 180,000 users. We examine correlations between users’ personality and the properties of their Facebook profiles, such as the size and density of their friendship network, the number of uploaded photos, the number of events attended, the number of group memberships, and the number of times the user has been tagged in photos. Our results show significant relationships between personality traits and various features of Facebook profiles. We then show how multivariate regression allows prediction of the personality traits of an individual user given their Facebook profile. The best accuracy of such predictions is achieved for Extraversion and Neuroticism; the lowest accuracy is obtained for Agreeableness, with Openness and Conscientiousness lying in the middle. (view paper)
Wang, N., Kosinski, M., Stillwell, D.J. & Rust, J. (2012) Can well-being be measured using Facebook status updates? Validation of Facebook’s Gross National Happiness Index. Social Indicators Research.
Abstract
Facebook's Gross National Happiness (FGNH) indexes the positive and negative words used in the millions of status updates submitted daily by Facebook users. FGNH has face validity: it shows a weekly cycle and increases on national holidays. Also, happier individuals use more positive words and fewer negative words in their status updates (Kramer, 2010). We examined the validity of FGNH in measuring mood and well-being by comparing it with scores on Diener's Satisfaction with Life Scale (SWLS), administered to an average of 34 Facebook users every day for a year, then aggregated by day, week, month, quarter and half year. FGNH and SWLS were not significantly correlated, with a negative correlation coefficient. Also, aggregated SWLS scores showed a positive relationship with numbers of negative words in status updates. We conclude that FGNH is a valid measure for neither mood nor well-being; however, it may play a role in mood regulation. This challenges the assumption that linguistic analysis of internet messages is related to underlying psychological states. (view paper)
Bachrach, Y., Graepel, T., Kasneci, G., Kosinski, M., Van Gael, J. (2012) Crowd IQ - Aggregating Opinions to Boost Performance. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (Valencia, June).
Abstract
We show how the quality of decisions based on the aggregated opinions of the crowd can be conveniently studied using a sample of individual responses to a standard IQ questionnaire. We aggregated the responses to the IQ questionnaire using simple majority voting and a machine learning approach based on a probabilistic graphical model. The score for the aggregated questionnaire, Crowd IQ, serves as a quality measure of decisions based on aggregating opinions, which also allows quantifying individual and crowd performance on the same scale. We show that Crowd IQ grows quickly with the size of the crowd but saturates, and that for small homogeneous crowds the Crowd IQ significantly exceeds the IQ of even their most intelligent member. We investigate alternative ways of aggregating the responses and the impact of the aggregation method on the resulting Crowd IQ. We also discuss Contextual IQ, a method of quantifying the individual participant’s contribution to the Crowd IQ based on the Shapley value from cooperative game theory. (view paper)
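To make the Shapley-value idea in this abstract concrete, the sketch below computes exact Shapley values for a tiny crowd by averaging each member's marginal contribution over all join orders. It is a generic illustration, not the paper's Contextual IQ definition: the coalition score used here (number of questions answered correctly by strict majority vote) and the names `crowd_score` and `shapley_values` are assumptions, and the answer data are made up.

```python
from collections import Counter
from itertools import permutations

def crowd_score(members, answers, key):
    """Questions the coalition answers correctly by strict majority vote."""
    score = 0
    for q, correct in enumerate(key):
        ranked = Counter(answers[m][q] for m in members).most_common()
        if not ranked:
            continue  # an empty coalition answers nothing
        strict = len(ranked) == 1 or ranked[0][1] > ranked[1][1]
        if strict and ranked[0][0] == correct:
            score += 1
    return score

def shapley_values(answers, key):
    """Average marginal contribution of each member over all join orders."""
    players = list(answers)
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = []
        for p in order:
            before = crowd_score(coalition, answers, key)
            coalition.append(p)
            totals[p] += crowd_score(coalition, answers, key) - before
    return {p: t / len(orders) for p, t in totals.items()}

# Hypothetical three-question test with three participants.
key = "AAB"
answers = {"ann": "AAB", "bob": "ABB", "cy": "BBB"}
values = shapley_values(answers, key)
print(values)
```

Enumerating all permutations is only feasible for very small crowds; larger ones would need sampling of join orders.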
Stillwell, D.J. & Tunney, R.J. (2012) Effects of measurement methods on the relationship between smoking and delay reward discounting. Addiction.
Abstract
Aims: Delay reward discounting (DRD) measures the degree to which a person prefers smaller rewards soon or larger rewards later. People who smoke have been shown to have higher DRD. There are several ways of measuring DRD and the method used might influence the association between smoking and DRD. The key differences are the order that the items are presented in, the delays used, and the magnitude of the delayed amount.
Setting: An international online study running from September 2010 to June 2011.
Participants: N = 9454; 38% male, mean age = 23.1.
Design and Measurements: Users completed a multi-item DRD task. They were randomly presented the immediate rewards in an ascending, descending, or randomized order. The delays were between 1 week and 5 years. The delayed amounts were $1000 for all delays, and $100 for 1 month. Users also self-reported their smoking status.
Findings: A hyperbolic DRD function fit better than an exponential function. There were differences in the derived DRD function based on methodology used. Items presented in a randomized order, longer delays and smaller rewards showed steeper discounting. However, these did not interact with smoking status, as for all methodologies used daily smokers showed the steepest discounting, followed by non-daily smokers, then non-smokers.
Conclusions: Smokers discount more steeply irrespective of which method is used. However, the methods of assessing DRD influence the parameters, which means that parameters estimated with different methods cannot be compared. (view paper)
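The two discounting functions compared in this abstract are not written out there. The sketch below assumes the standard textbook forms, hyperbolic V = A / (1 + kD) and exponential V = A·exp(−kD), where A is the delayed amount, D the delay, and k the discount rate. The grid search, the function names, and the indifference points are hypothetical and only illustrate how the fit of the two forms could be compared.

```python
import math

def hyperbolic(amount, delay, k):
    """Assumed hyperbolic form: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def exponential(amount, delay, k):
    """Assumed exponential form: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)

def sse(model, points, amount, k):
    """Sum of squared errors between model values and indifference points."""
    return sum((model(amount, d, k) - v) ** 2 for d, v in points)

def best_k(model, points, amount, grid):
    """Discount rate on the grid that minimizes the squared error."""
    return min(grid, key=lambda k: sse(model, points, amount, k))

# Made-up indifference points (delay in days, present value in dollars),
# generated from the hyperbolic form with k = 0.05 for illustration only.
amount = 1000
delays = [7, 30, 180, 365, 1825]
points = [(d, hyperbolic(amount, d, 0.05)) for d in delays]
grid = [i / 1000 for i in range(1, 201)]

k_hyp = best_k(hyperbolic, points, amount, grid)
k_exp = best_k(exponential, points, amount, grid)
print(k_hyp, sse(hyperbolic, points, amount, k_hyp))
print(k_exp, sse(exponential, points, amount, k_exp))
```

Because k depends on the delay unit and on the amounts used, rates estimated under different task designs are not directly comparable, which is the point made in the abstract's conclusions.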
Quercia, D., Lambiotte, R., Kosinski, M., Stillwell, D. & Crowcroft, J. (2011) The Personality of Popular Facebook Users. ACM CSCW 2012.
Abstract
We study the relationship between Facebook popularity (number of contacts) and personality traits on a large number of subjects. We test to which extent two prevalent viewpoints hold. That is, popular users (those with many social contacts) are the ones whose personality traits either predict many offline (real world) friends or predict propensity to maintain superficial relationships. We find that the predictor for number of friends in the real world (Extraversion) is also a predictor for number of Facebook contacts. We then test whether people who have many social contacts on Facebook are the ones who are able to adapt themselves to new forms of communication, present themselves in likable ways, and have propensity to maintain superficial relationships. We show that there is no statistical evidence to support such a conjecture. (view paper)
Quercia, D., Kosinski, M., Stillwell, D. & Crowcroft, J. (2011) Our Twitter profiles, our selves: Predicting personality with Twitter. IEEE SocialCom.
Abstract
Psychological personality has been shown to affect a variety of aspects: preferences for interaction styles in the digital world and for music genres, for example. Consequently, the design of personalized user interfaces and music recommender systems might benefit from understanding the relationship between personality and use of social media. Since there has not been a study of the relationship between personality and use of Twitter at large, we set out to analyze the relationship between personality and different types of Twitter users, including popular users and influentials. For 335 users, we gather personality data, analyze it, and find that both popular users and influentials are extroverts and emotionally stable (low in the trait of Neuroticism). Interestingly, we also find that popular users are ‘imaginative’ (high in Openness), while influentials tend to be ‘organized’ (high in Conscientiousness). We then show a way of accurately predicting a user’s personality simply based on three counts publicly available on profiles: following, followers, and listed counts. Knowing these three quantities about an active user, one can predict the user’s five personality traits with a root-mean-squared error below 0.88 on a [1, 5] scale. Based on these promising results, we argue that being able to predict user personality goes well beyond our initial goal of informing the design of new personalized applications as it, for example, expands current studies on privacy in social media. (view paper)
Rust, J. (2012) Psychometrics. In Research Methods in Psychology, (Eds: G. M. Breakwell, J. A. Smith & D. B. Wright) 141-162.
Golombok, S., Rust, J., Zervoulis, K., Golding, J. & Hines, M. (2012) Continuity in sex-typed behavior from Preschool to Adolescence: A Longitudinal population study of boys and girls aged 3-12 years. Archives of Sexual Behavior 41(3), 591-597.
Rust, J. & Golombok, S. (2008) Modern Psychometrics: The science of psychological assessment (3rd Edition). Routledge, London. (also in Chinese, Ren Min University Press, Beijing, 2011)
Golombok, S., Rust, J., Zervoulis, K., Croudace, T., Golding, J. & Hines, M. (2008) Developmental trajectories of sex-typed behaviour in boys and girls: A longitudinal population study of children aged 2.5 to 8 years. Child Development. 79(5) 1583-159.