Can We Predict Suicide From Twitter Language?
We found that analyzing Twitter data in bulk does not add to our understanding of geographical variations in health outcomes.
Can we predict county-level death by suicide from Twitter data? We tried. Our surprising results reinforce the findings of our reanalysis of Twitter data that had been used to predict death from heart disease: analyzing Twitter data in bulk does not add to our understanding of geographical variations in health outcomes.
Nick Brown and I (*) recently posted a preprint:
No Evidence That Twitter Language Reliably Predicts Heart Disease: A Reanalysis of Eichstaedt et al. (2015a)
We reanalyze Eichstaedt et al.’s (2015a) claim to have shown that language patterns among Twitter users, aggregated at the level of U.S. counties, predicted county-level mortality rates from atherosclerotic heart disease (AHD), with “negative” language being associated with higher rates of death from AHD and “positive” language associated with lower rates… We conclude that there is no evidence that analyzing Twitter data in bulk in this way can add anything useful to our ability to understand geographical variation in …