
Looking At The World Through Twitter Data

94 points| arashdelijani | 14 years ago |arashd.scripts.mit.edu | reply

26 comments

[+] Anon84|14 years ago|reply
If you're interested in how information diffuses through social networks like Twitter, take a look at Truthy (one of my projects):

http://truthy.indiana.edu

     Truthy is a system to analyze and visualize the
     diffusion of information on Twitter. The Truthy system
     evaluates thousands of tweets an hour to identify new
     and emerging bursts of activity around memes of various 
     flavors. The data and statistics provided by Truthy are
     designed to aid in the study of social epidemics: How do
     memes propagate through the Twittersphere? What causes a
     burst of popularity?
[+] arashdelijani|14 years ago|reply
Sounds like a cool thing to tackle. We'll definitely look at it!
[+] TravisPe|14 years ago|reply
A friend and I started playing around with Twitter data back in early 2010. We currently have over 587 million tweets collected (we stopped collecting earlier this year). We only pulled English tweets that described what someone was feeling (Im, I am, I feel, I am feeling, etc., along with the negatives I don't feel, I do not feel, etc.).
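
A filter like the one described above could be sketched roughly as follows. This is only an illustration: the phrase list is taken from the examples in this comment, and the names `FEELING_PATTERN` and `is_feeling_tweet` are made up here, not the project's actual code. Language detection (the English-only part) is left out.

```python
import re

# Assumed phrase list, based only on the examples given in the comment;
# the project's real list and matching rules are not known.
FEELING_PATTERN = re.compile(
    r"\b(?:i\s+am\s+feeling|i\s+feel|i\s+am|i'?m"
    r"|i\s+do\s+not\s+feel|i\s+don'?t\s+feel)\b",
    re.IGNORECASE,
)


def is_feeling_tweet(text: str) -> bool:
    """Return True if the tweet text contains one of the feeling phrases."""
    # Normalize curly apostrophes so that "I’m" is treated like "I'm".
    normalized = text.replace("\u2019", "'")
    return FEELING_PATTERN.search(normalized) is not None


# Made-up sample tweets, for illustration only.
tweets = [
    "I feel anxious about the news",
    "Im so tired today",
    "I don't feel well",
    "Traffic on I-95 is terrible",
]
feeling_tweets = [t for t in tweets if is_feeling_tweet(t)]
```

A real pipeline would apply a check like this to each tweet as it arrives from the stream and store only the matches.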

We were able to see some interesting events happen during that time, though. This is a graph of the anxiety levels on Twitter on March 11th; the bottom axis is the hour of the day in EST. The earthquake hit Japan @ 1:46 EST.

http://i.imgur.com/BeBwa.jpg

There is a strange dip around noon that we are unsure of how to account for as our servers did not report any failures.

It was a fun project to play around with.

[+] cpeterso|14 years ago|reply
> There is a strange dip around noon that we are unsure of how to account for as our servers did not report any failures.

Maybe people are away from their computer at lunch.

What do the blue and green line colors indicate? It would also be interesting to track emoticons. :)

[+] Permit|14 years ago|reply
Out of curiosity, is Twitter data such as this freely available to anyone, or was this specially acquired for this set of students? I can imagine a number of interesting projects that might arise out of such a data set.
[+] tmostak|14 years ago|reply
I've also been collecting twitter data for a bit. I developed a heatmapping application that runs on the GPU to produce time-animated heatmaps in real-time for any user-generated query over a Solr database of hundreds of millions of geotagged tweets. You can see a rough demo at http://youtu.be/4_v2EZGiA7w . Hopefully I'll release it as a web app when I get time this summer.
[+] akshaykarthik|14 years ago|reply
Wow... This is awesome. I actually did a project for my high school science fair that focused on analyzing Twitter. It was nowhere near as sophisticated, but it really opened my eyes to the massive amount of data and the availability of commodity hardware that can actually handle terabytes of data.
[+] joejohnson|14 years ago|reply
> But, this is because non-English tweets that we have discarded are much more frequent during the night in our time zone, and they often don’t contain the word ‘a’ as often as English tweets do.

This doesn't make sense; are they only discarding the non-English tweets during certain times?

[+] arashdelijani|14 years ago|reply
We just mean that there's more tweeting going on in non-English speaking countries when it's night-time here.
[+] jermaink|14 years ago|reply
Hi, if you like that kind of stuff, I could introduce you to Peter Gloor, the author of swarmcreativity.net, who is at the MIT Center for Collective Intelligence. Tag #Twitter, Stock Prediction, Mood etc. You might meet on campus :)
[+] arashdelijani|14 years ago|reply
We know Peter, actually. He's a great guy and we've been talking to him about this. Thanks though! :)
[+] roarktoohey|14 years ago|reply
It would be cool, possibly profitable, to see stock symbols and their price change mapped vs. mentions of the ticker (like IBM).
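One rough way to try this, assuming you already have tweets as (day, text) pairs and a daily price change per day: count ticker mentions per day and compute a Pearson correlation against the price changes. Everything below is a sketch; the function names and the sample data are made up for illustration and say nothing about real IBM prices.

```python
import re
from collections import Counter
from math import sqrt


def count_mentions(tweets, symbol):
    """Count tweets per day that mention the ticker as SYMBOL or $SYMBOL."""
    # Whole-word match, with an optional leading "$" (cashtag style).
    pattern = re.compile(r"(?<![\w$])\$?" + re.escape(symbol) + r"\b")
    counts = Counter()
    for day, text in tweets:
        if pattern.search(text):
            counts[day] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up tweets and price changes (percent), for illustration only.
tweets = [
    ("2012-05-01", "IBM earnings look solid"),
    ("2012-05-02", "Buying $IBM today"),
    ("2012-05-02", "IBM and Oracle both up"),
    ("2012-05-02", "$IBM breaking out"),
    ("2012-05-03", "Watching IBM closely"),
    ("2012-05-03", "long $IBM"),
    ("2012-05-03", "Nothing about stocks here"),
]
price_change = {"2012-05-01": 0.5, "2012-05-02": 1.5, "2012-05-03": 1.0}

mentions = count_mentions(tweets, "IBM")
days = sorted(price_change)
r = pearson([mentions[d] for d in days], [price_change[d] for d in days])
```

Correlation on a toy sample like this means nothing by itself; with real data you would also want to check whether mentions lead or merely follow the price move.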
[+] tzm|14 years ago|reply
Great work. I'll be following your updates. I'm building a platform for developers to crunch such APIs / data sets.
[+] mrlinx|14 years ago|reply
Is any of this data available? It would be great to have access to it.