Evaluating Voice Samples as a Potential Source of Information About Personality
Speech is a powerful medium through which a variety of psychologically relevant phenomena are expressed. Here we take a first step in evaluating the potential of voice samples as non-self-report measures of personality. In particular, we examine the extent to which linguistic and vocal information extracted from semi-structured voice samples can be used to predict conventional measures of personality. We extracted 94 linguistic features (using Linguistic Inquiry and Word Count, LIWC 2015) and 272 vocal features (using pyAudioAnalysis) from 614 voice samples of at least 50 words each. Using a two-stage, fully automatable machine learning pipeline, we evaluated the extent to which these features predicted scores on self-report personality scales (Big Five Inventory). For comparison, we also examined how well the same voice features predicted depression, age, and gender. Voice samples accounted for 10.67% of the variance in personality traits on average, and the same samples also predicted depression, age, and gender. These results provide a conservative estimate of the degree to which features derived from voice samples can be used to predict personality traits. They also suggest a number of opportunities to optimize personality prediction and to better understand how voice samples carry information about personality.
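To make the headline metric concrete, the following is a minimal sketch of how "variance accounted for" could be computed per trait and then averaged across traits. It assumes the metric is the coefficient of determination, R^2 = 1 - SS_res / SS_tot, computed on held-out predictions; the abstract does not specify the exact computation, and the function names below are hypothetical rather than part of the reported pipeline.

```python
from statistics import mean


def variance_explained(observed, predicted):
    """Return R^2 = 1 - SS_res / SS_tot for paired observed/predicted scores.

    Assumption: this is the usual coefficient of determination; the
    abstract does not state which variance-explained metric was used.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    grand_mean = mean(observed)
    # Total variability of the observed scores around their mean.
    ss_tot = sum((y - grand_mean) ** 2 for y in observed)
    # Variability left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed scores have zero variance")
    return 1.0 - ss_res / ss_tot


def average_variance_explained(per_trait):
    """Average R^2 over traits.

    per_trait maps a trait name to an (observed, predicted) pair of
    sequences, e.g. one entry per Big Five trait (hypothetical layout).
    """
    return mean(variance_explained(obs, pred) for obs, pred in per_trait.values())
```

Under this definition, a model that always predicts the mean observed score yields R^2 = 0, perfect predictions yield R^2 = 1, and predictions on held-out data can yield negative values when they do worse than the mean. Multiplying the average by 100 expresses it as a percentage of variance.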