Measuring calories by reading tweets can improve public health

Monday, February 20, 2017
Peter Dodds, a scientist at the University of Vermont, who co-led the invention of the new device, believes the Lexicocalorimeter can become a powerful public health tool. "It's a bit like having a satellite image of how people in a state or city are eating and exercising."

People don't consume tweets, of course. Instead, the Lexicocalorimeter gathers tens of millions of geo-tagged Twitter posts from across the country and fishes out thousands of food words, such as "apples," "ice cream," and "green beans." At the same time, it finds thousands of activity-related terms -- like "watching TV," "skiing," and even "alligator hunting" and "pole dancing." These words are all scored -- based on data about the typical calorie content of foods and activity burn rates -- and then compiled into two measures: "caloric input" and "caloric output."

The ratio of the two measures is meant to paint a picture that might be of interest not just to athletes or weight-watchers, but also to mayors, public health officials, epidemiologists, or others interested in "public policy and collective self-awareness," the team of scientists writes in its new study, published on February 10 in the journal PLOS ONE.
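The scoring described above can be sketched in a few lines of Python. This is only an illustration, not the team's implementation: the word lists, the per-word scores, the frequency-weighted averaging, the naive substring matching, and the choice to express the ratio as output divided by input are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical scores, for illustration only (not the study's values).
FOOD_CALORIES = {"apples": 52.0, "ice cream": 207.0, "green beans": 31.0}
ACTIVITY_BURN = {"watching tv": 1.0, "skiing": 7.0, "running": 8.0}


def count_phrases(tweets, phrases):
    """Count how often each phrase appears across all tweets (case-insensitive)."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for phrase in phrases:
            # Naive substring match; a real system would need proper tokenizing.
            counts[phrase] += text.count(phrase)
    return counts


def average_score(counts, scores):
    """Frequency-weighted average score of the phrases that were found."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[p] * scores[p] for p in counts) / total


def caloric_measures(tweets):
    """Return (caloric_input, caloric_output, ratio) for a batch of tweets."""
    food_counts = count_phrases(tweets, FOOD_CALORIES)
    activity_counts = count_phrases(tweets, ACTIVITY_BURN)
    caloric_input = average_score(food_counts, FOOD_CALORIES)
    caloric_output = average_score(activity_counts, ACTIVITY_BURN)
    # Assumed convention: ratio = output / input; None when no food words appear.
    ratio = caloric_output / caloric_input if caloric_input else None
    return caloric_input, caloric_output, ratio


if __name__ == "__main__":
    sample = ["Ice cream after skiing", "Watching TV with ice cream", "apples"]
    print(caloric_measures(sample))
```

Computed separately for each state's tweets, such a ratio would only be useful for comparing one state with another, not as an absolute calorie count.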

Open to public

The Lexicocalorimeter is open to the public, and the current version gives a portrait of each of the contiguous US states. For example, the tweets flowing into the device suggest that Vermont consumes more calories, per capita, than the US average. The word that does the most to push the Green Mountain State to the gourmand's side of the ledger is "bacon," where the state ties for second in the US when states are ranked by bacon's contribution to caloric balance. "We love to tweet about bacon," says Chris Danforth, a UVM scientist and mathematician who co-led the new study.

However, Vermont also expends more calories than average, the Lexicocalorimeter indicates, because of relatively frequent appearances of the words "skiing," "running," "snowboarding," and, yes, "sledding." And why does the Lexicocalorimeter suggest that New Jersey expends fewer calories than the US average? The state is below average on "running," while the top of its low-intensity activity list is "getting my nails done."

Real-time health measurement

The PLOS ONE study suggests that the Lexicocalorimeter could provide a real-time measure of the population's health, for now only in the US. The study shows that the remotely sensed results correlate very closely with traditional measures of US well-being, like obesity and diabetes rates. For the study, the team of scientists explored about 50 million geo-tagged tweets from 2011 and 2012 and reports that "pizza" was the dominant contributor to the measure of "calories in" in nearly every state. The dominant contributor to calories out: "watching TV or movies."

Part of larger effort

The Lexicocalorimeter is part of a larger effort by the University of Vermont team to build a series of online instruments that can quantify health-related behaviors from social media. The nine scientists involved -- led by professors and students at the University of Vermont's Computational Story Lab as well as researchers at the University of California Berkeley, WIC in East Boston, MIT, University of Adelaide, and Drexel University -- point out that the ratio of calories in to calories out in the new study is not meaningful as an absolute number, but rather has power for comparisons.

"Given the right tools, our mobile phones will very soon know more about us than we know about ourselves," states UVM's Chris Danforth. "While the Lexicocalorimeter is focused on eating and exercise, and the Hedonometer is measuring happiness, the methodology we're building is far more general, and will eventually contribute to a dashboard of public health measures to complement traditional sources of data." Similar tools the scientists are thinking about are an Insomniameter and a Hangoverometer.