After outing several 'online haters' at home, which caused several resignations from the populist, far-right Sweden Democrat party, the Swedish investigative journalists behind the revelations said they had accessed the identities of several million commenters using the popular Disqus system.
Martin Fredriksson and his colleagues started collecting Disqus data back in February 2013 as part of a project to more closely analyze anonymous online comments. They hoped to understand more about who was behind hateful and racist comments on far-right websites in Sweden. They unearthed some 6,000 anonymous accounts in Sweden on commission from the tabloid Expressen, which published the data on Tuesday.
Fredriksson told The Local on Thursday that the unmasking of a few thousand users behind pseudonyms used on far-right sites in Sweden could just be the tip of the iceberg.
There were millions of Disqus users whose identity is at risk of exposure, said Fredriksson, responsible publisher (ansvarig utgivare) for the Research Group (Researchgruppen, who said his group's database contained a total of 29 million comments from Disqus users around the world.
"We used an open Disqus API protocol to obtain the data," he said, using a common acronym for "application protocol interface", which specifies how software components should interact with one another. In order to obtain the data more efficiently, Fredriksson wrote a programme that automated the data download requests sent to Disqus servers.
"You usually get around 100 comments with one request, but our system was able to send ten requests at once," he explained.
While the thrust of the research focused on far-right sites in Sweden, data was also collected from news sites elsewhere in the world, including CNN, The Telegraph, ABC News, and The Jerusalem Post, as well as from mainstream Swedish news site such as Svenska Dagbladet, SVT Debatt as well as The Local.
Members of the Research Group quickly realized, however, that the data they received also came with metadata that included the email addresses tied to anonymous Disqus accounts.
"It came as something as a shock," he said. "We got a lot of data we probably weren't supposed to get."
Fredriksson emphasized that the group didn't use any illicit methods in obtaining the data, but that the information was included in their trawl due to a security flaw at Disqus.
"When you leave a comment as a Disqus user, there is information about the date, username, and the comment itself which is open data," he said. "But (Disqus) also sent us data with coding that made it possible to identify people's email addresses."
After it emerged that Disqus users has been identified in the Expressen news stories, the company was quick to take action.
"Disqus has not been cracked. No emails were leaked by Disqus," vice president for marketing Stephen Roy said in a statement released on Tuesday.
He explained that Disqus offers API services that include "MD5 hashes" of email addresses that allow users to access third-party services such as Gravatar, which in turn permits users to display a consistent avatar across platforms.
"This appears to be a targeted attack on a group of individuals using pattern matching of their activity across the web, associated with email addresses used by those individuals," said Roy, calling the actions a breach of Disqus privacy regulations. "As in all such cases, we are terminating the account."
Roy added that Disqus was disabling use of the Gravatar service and removing the MD5 hash email from its API.
"We will evaluate any further changes that will need to be made based on these actions," he said. Inquiries from The Local for further comment were not immediately returned.
Fredriksson took exception to the Research Group being painted as wrongdoers by Disqus, explaining that he and his time "didn't even use any account for this, and never had to agree on any terms of service"
"We are researchers and they cannot blame us for researching openly available data. I think the bad guys are those who handle our personal information so carelessly," he said.
Fredriksson went on to admit that he and his colleagues aren't sure what to do with the data now in their possession, but expressed fears about who else might have similar technology that could unmask Disqus users.
"You can imagine a lot of unseemly scenarios," he said. "Perhaps the authorities in Iran, for example, have data like this from Israeli media sites and might use it to find out who is behind the comments."
Fredriksson said the incident is a wake-up call for news sites and online commenters everywhere to be more aware that their data may not be as safe as they had previously thought.
"People need to know more about the risks that arise when third-parties get access to their data," he told The Local. "It shows how much uncertainty there is in systems like this."