We've been running our startup [1] in this "media research industry" for a little while now. We're on the classic media use case side.
It is true that the vast majority of "research" is done by non-academics. Lots of companies doing market research want to mine media data.
Still, I believe that this "social media research" is a bit overvalued. There was this wave of "social media is the primary source where information appears". But now many have realized how freaking difficult it is to separate this data from the noise, compared to traditional news published by journalists.
Also, take a look at this article [2] about how Dataminr sells insights from Twitter data to foreign governments (2017). Seems like just a way to punish the opposition channels.
It's a shame access to this API is limited to academic institutions, as many social media/misinformation researchers are now independent or affiliated with journalistic institutions.
In what research contexts is API usage valid instead of scraping a view more similar to what people experience? If the Twitter site and API are retrospectively cleared of removed/suspended accounts with large impact, how does that affect retrospective studies?
Are there ethical implications of working with Twitter to gather data? Even if the Twitter TOS, legal, and the IRB are all OK with it, are there informed consent issues in studying the artifacts of social media use?
Until now, none, I think. The API only gave a partial view, while scraping offered all tweets for a particular search term. The scraper had to be clever to juke the anti-scraping systems, but you would get a more complete data set than using the API.
And the streaming API was terrible. Even if there was no data on the stream you could consume tens of gigabytes of bandwidth a day. Dreadful.
One easy example is language: tracking the spread of new words or other language constructs. You don’t care how the site looks, you care about the text that was previously input.
I wonder how they are going to enforce their rules, e.g. non-commercial use. I assume this will require some monitoring to be effective. Large-scale Twitter API access is typically pricey, so malicious actors might try to buy or steal researchers' credentials to cut costs.
I recently tried to sign up for the Twitter API and the process is nothing like what it used to be. You have to give them a lot of information to even qualify, such as what you're going to use it for. It used to be that those were just some fields you needed to fill out and you could sign up immediately. But nowadays the application process requires direct approval from their team, which means they're monitoring every API account like Apple does with their App Store. And if you lie about your usage, you are probably liable.
> You are either a master’s student, doctoral candidate, post-doc, faculty, or research-focused employee at an academic institution or university.
This is gross. Rather than using the internet as a democratizing force for education, they restrict the program to those already inside credential-granting institutions. So much great research has been done from outside the institution and yet Twitter is actively pushing outsiders to resort to scraping.
I always wondered: how many of these tweets are just Justin Bieber fandom type posts, bots, spam, or other dross? Twitter is infamous for its bad signal-to-noise ratio. These researchers need to write algos to filter out all the noise.
What's wrong with that? If you wanted to investigate, say, the rise of The Beatles - wouldn't you love to have access to the random thoughts of their fans in the 1960s?
Similarly, if you're researching bots and spam and how they manipulate people & markets - this is still useful.
Where are your sources, numbers, methodology? I mean, anyone can make a knee-jerk statement based on their perception (read: bubble), but that's not science.
I'd think libertarians would love Twitter: deplatforming is the free market at work, and the government has no right to make them do business with anyone they choose not to.
You would think, from the number of Republican/conservative accounts they have banned, that they have some AI or parser dedicated to banning these types of voices.
artembugara|5 years ago
[1] https://newscatcherapi.com/
[2] https://www.theverge.com/2017/1/27/14412014/dataminr-twitter...
minimaxir|5 years ago
trident5000|5 years ago
adolph|5 years ago
fnord123|5 years ago
dharmab|5 years ago
thih9|5 years ago
cocktailpeanuts|5 years ago
AlchemistCamp|5 years ago
guerrilla|5 years ago
beefman|5 years ago
https://www.reuters.com/article/us-twitter-product/twitter-g...
jansenmac|5 years ago
tester34|5 years ago
There are a lot of strong people, especially in CS, who do not work with academia and still work on interesting stuff.
leephillips|5 years ago
anonymousDan|5 years ago
adolph|5 years ago
blindm|5 years ago
edent|5 years ago
adolph|5 years ago
sigil|5 years ago
Triv888|5 years ago
Jkvngt|5 years ago
williesleg|5 years ago
[deleted]
waheoo|5 years ago
Cthulhu_|5 years ago
ceejayoz|5 years ago
wlesieutre|5 years ago
jijji|5 years ago