Data dump: Meta killed CrowdTangle. What does it mean for researchers, reporters?
By Joe Arney
In Brian C. Keegan’s telling, the loss of tools like CrowdTangle and Pushshift, which allow researchers to study user behavior and how information is shared on social media, is like particle physicists one day waking up to find out they can no longer access the Large Hadron Collider.
Earlier this month, Meta announced it was shutting down CrowdTangle, one of the most effective tools for understanding how Facebook and Instagram’s algorithms work and how disinformation is created and spread on the company’s platforms.
That’s a blow to researchers, watchdogs and journalists who will be less able to track how disinformation, hate speech and other poisons pollute the social media atmosphere. But viewed as a business decision, there are strong financial and reputational benefits for Meta in obfuscating its operations. Not only is the platform sitting on mountains of data that can be licensed to companies training generative artificial intelligence models, Keegan said, “it’s easy to imagine a world where Meta doesn’t want its name attached to a paper about how neo-Nazis are using Facebook groups to organize themselves.”
The economic case for ‘privacy washing’
“The loss of these data tools imperils our ability to do that kind of scholarship and is ultimately a detriment to democracy and civic institutions.”
Brian C. Keegan, assistant professor, information science
It’s becoming a more common story, as platforms that once made their data public are increasingly erecting paywalls, blocking APIs or cutting deals with A.I. companies. Often, those platforms mask their motivations behind what Keegan calls “privacy washing,” citing concerns about safeguarding user data in justifying the removal of key features for research labs, newsrooms and the public.
This particular example comes at an inauspicious time, with digital disinformation ratcheting up ahead of Election Day and more Americans than ever getting their news from social media.
“To address the challenges we’re up against, that are happening in real time, that we see journalists trying to grapple with, requires different models of publicly engaged scholarship, beyond just academic papers that take a year or two to publish,” Keegan said. “The loss of these data tools imperils our ability to do that kind of scholarship and is ultimately a detriment to democracy and civic institutions.”
It’s not just the media or public at large that are affected. When these tools are taken offline, it hurts the quality of online communities as well. Keegan has volunteered as a moderator on Reddit, and said Pushshift, which Reddit began limiting access to last summer, was vital to forming context about user behavior that could determine whether someone was having a bad day, or whether that person was truly a bad actor.
Classroom impact
That’s a challenge as a moderator, but it’s having a bigger impact on his professional life, both as a researcher and teacher. He can use case studies from the 2016 U.S. presidential election cycle to show how fake news circulated, and the role of actors like Cambridge Analytica, “but that data and those strategies are now eight years old, and those contexts no longer exist; we’re in a different world now,” Keegan said. “Can we prepare our students to be better engineers, managers, artists and citizens with such old case studies?”
Meta purchased CrowdTangle in 2016, and Keegan acknowledged that the tech platform isn’t required to make its data publicly available. “But researchers have built our careers, infrastructure and programs on assumptions that we’d have access to these tools, so to have that rug pulled from under us has been profoundly disruptive to our ability to provide transparency, engage and ask critical questions,” he said.
Keegan hopes to learn more through a grant he’s pursuing from the National Science Foundation. If it is awarded, he plans to study the consequences of actions like Meta’s for the scientific research community.
“When that data disappears, how does that impact scholarship?” he asked. “Can we measure how research methods are changing, the way we collaborate, the strategies we’ll need to develop to make sure we’re able to ask critical questions?”