Thinking about trolls

The Mona Lisa
This is the troll face.

We had a lab meeting this week where we talked a bit about some of the issues surrounding blogging; in particular, we talked about trolls and their annoying trolling. I should be careful here, because the term ‘troll’ has evolved a bit over the last few years. My understanding was that a troll was someone who posted contentious material (inflammatory or offensive) to get a rise out of people or to derail a conversation. The most important part of that early definition was intent: the person was posting in order to derail the conversation, and often did not believe what they were saying. Under what I think of as the newer definition, a troll is basically anyone posting inflammatory comments, whether they believe them or not. I’m going to use that second, newer definition from now on.

In the case of climate change denial, I’ve had an idea for a while that, while some trolls are genuine jerks who feel the need to vent, some are paid jerks. If that’s the case, then it should show up in the IP logs of blogs, or of emails sent to individuals whose only crime was being a climate scientist. Has anyone done this kind of analysis? I would assume it’s pretty easy to do. The second thing I’ve been a bit curious about is whether an analysis of troll posts might show similarities across posts.
Could we actually identify trolls in the wild using word frequencies, or are their posts too short?

Just an idea. I’ve looked at some R packages for text analysis, and it might be a fun project. Anyone have a bunch of hate mail?

To outline:
1. Build a corpus of hate mail (do you think Mike Mann would share?)
2. Use a package like tm in R to build some clusters, then take a look at the clusters, their strength and their geographic coherence.
3. Somewhere in there you’d have to learn about text mining too. 🙂
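
To make step 2 a bit more concrete, here is a minimal sketch of the word-frequency idea. It is written in Python with only the standard library as a stand-in, not with the tm package in R, and it compares messages by cosine similarity of raw word counts, which is just one of many possible measures. The function names and the three example messages are made up for illustration; they are not real hate mail.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Lower-case the text, split it into words, and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 = no shared words)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(messages):
    """Pairwise similarity between every pair of messages."""
    vectors = [term_frequencies(m) for m in messages]
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Made-up placeholder messages, purely for illustration.
messages = [
    "climate science is a hoax and you know it",
    "you know climate science is a hoax",
    "thanks for sharing the pollen data",
]
sim = similarity_matrix(messages)
```

A matrix like `sim` is the kind of input a clustering routine would take. Short posts share few words, so the similarities can get noisy, which is exactly the "are their posts too short?" worry.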

Map of TOR server outlets.

On my bike in to work today I came up with some hypotheses:
H0: There is no spatial structure to the hate mail.
H1: The hate mail represents genuine emails from people who are upset that climate science produces the results it does. These people are representative of the general population in the USA, so the spatial structure of the mail should be similar to the structure of public opinion on climate change in the USA.
H2: The hate mail represents efforts to simulate grassroots opposition, so the spatial structure should represent the distribution of lobbyist groups that might support industries with vested interests against climate change science.
H3: Lobbyist groups may be involved with these emails, but they use technology to anonymize their IP addresses, so the addresses will mimic the distribution of the TOR network.
H4: The final distribution will mimic both H3 and H1 to some degree.
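
One rough way to compare hypotheses like these is to count messages per region and see how far the counts sit from the shares each hypothesis predicts. The sketch below uses a Pearson chi-square statistic in plain Python. The region labels, the counts and the expected shares are all hypothetical placeholders, and a real analysis would need actual opinion, lobbying and TOR distributions plus a proper significance test.

```python
def chi_square_statistic(observed, expected_share):
    """Pearson chi-square statistic for observed counts against expected shares.

    Assumes every expected share is positive and that the shares sum to 1.
    """
    total = sum(observed.values())
    stat = 0.0
    for region, share in expected_share.items():
        expected = total * share
        stat += (observed.get(region, 0) - expected) ** 2 / expected
    return stat


# Hypothetical message counts per region (placeholders, not real data).
observed = {"A": 10, "B": 10, "C": 80}

# Hypothetical expected shares under two of the hypotheses (placeholders).
expected_shares = {
    "H1_public_opinion": {"A": 0.4, "B": 0.4, "C": 0.2},
    "H2_lobbyist_offices": {"A": 0.1, "B": 0.1, "C": 0.8},
}

scores = {
    name: chi_square_statistic(observed, shares)
    for name, shares in expected_shares.items()
}
# A smaller statistic means the observed counts sit closer to that hypothesis.
best_fit = min(scores, key=scores.get)
```

With these made-up numbers the counts match the H2 shares exactly, so `best_fit` picks H2. H4 would correspond to a weighted mixture of two share tables.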

I actually suspect that the spatial distribution will vary based on whether we look at hate mail sent directly to researchers or at annoying posts on blog sites, but . . .

Awesome, who wants in? Who would publish this? How much time would this take away from otherwise productive science?


Published by


Assistant scientist in the Department of Geography at the University of Wisconsin, Madison. Studying paleoecology and the challenges of large data synthesis.

5 thoughts on “Thinking about trolls”

  1. I think that this is a really cool idea for a few reasons. First, a good textual analysis of the common language used might give some insight into how to combat or engage trolls. Second, it would be really worthwhile to get a sense of what the actual troll population is, and of that population’s demography; I’ve seen anecdotes to this effect, but no good statistics.

    It’s hard to know when to engage based on the different kinds of trolls you mention: when is someone a genuine climate skeptic who might be swayed by a well-formed argument with backing data, and when are they politically motivated to spread disinformation? During the Wisconsin protests, several ads were uncovered recruiting people to troll blogs and Twitter, with compensation to be proportional to the number of counter-engagements in the form of Tweets and comments (reinforcing the “don’t feed the trolls” argument, which is becoming increasingly unsatisfactory).

    Also, trolls get away with saying a lot of really abusive things on the internet that they would never be allowed to say to someone in a phone call or a letter without getting arrested. It’s been notoriously difficult to get law enforcement to treat the internet as a real place when it comes to verbal abuse. Getting a better handle on these kinds of statistics would go a long way towards changing that. Certainly, it would help to know that there are really only about two dozen well-paid cranks, or what-have-you.

    1. I don’t see any reason why individuals would have to be directly identified. Honestly, I’m not out to hunt people down, I’m interested in informing people about the nature of dissent. Is it genuine as some believe, is it dominated by vested interests, as others believe, or is it some combination of the two?

      It’s all a bit of a thought experiment anyway; my guess is no one really wants to share their hateful emails, and, ultimately, I don’t really have the time.

  2. That sounds like a great communications, social geography, or poli sci paper idea… “Patterns of troll rhetorical construction and IP distribution in climate change and science blog commentary (or emails).”
