About a week ago, a Tweet I was mentioned in received a dozen or so “likes” over a very short time period (about two minutes). I happened to be on my computer at the time, and quickly took a look at the accounts that generated those likes. They all followed a similar pattern. Here’s an example of one of the accounts’ profiles:
All of the accounts I checked contained similar phrases in their description fields. Here’s a list of common phrases I identified:
All of the accounts also contained links to URLs in their description field that pointed to domains such as the following:
It turns out these are all shortened URLs, and the service behind each of them has the exact same landing page:
My colleague, Sean, checked a few of the links and found that they landed on “adult dating” sites. Using a VPN to change the browser’s exit node, he noticed that the landing pages varied slightly by region. In Finland, the links ended up on a site called “Dirty Tinder”.
Checking further, I noticed that some of the accounts either followed, or were being followed by other accounts with similar traits, so I decided to write a script to programmatically “crawl” this network, in order to see how large it is.
The script I wrote was rather simple. It was seeded with the dozen or so accounts that I originally witnessed, and was designed to iterate friends and followers for each user, looking for other accounts displaying similar traits. Whenever a new account was discovered, it was added to the query list, and the process continued. Of course, due to Twitter API rate limit restrictions, the whole crawler loop was throttled so as to not perform more queries than the API allowed for, and hence crawling the network took quite some time.
My script recorded a graph of which accounts were following/followed by which other accounts. After a few hours I checked the output and discovered an interesting pattern:
The discovered accounts seemed to be forming independent “clusters” (through follow/friend relationships). This is not what you’d expect from a normal social interaction graph.
After running for several days the script had queried about 3000 accounts, and discovered a little over 22,000 accounts with similar traits. I stopped it there. Here’s a graph of the resulting network.
Pretty much the same pattern I’d seen after one day of crawling still existed after one week. Just a few of the clusters weren’t “flower” shaped. Here’s a few zooms of the graph.
Since I’d originally noticed several of these accounts liking the same tweet over a short period of time, I decided to check if the accounts in these clusters had anything in common. I started by checking this one:
Oddly enough, there were absolutely no similarities between these accounts. They were all created at very different times and all Tweeted/liked different things at different times. I checked a few other clusters and obtained similar results.
One interesting thing I found was that the accounts were created over a very long time period. Some of the accounts discovered were over eight years old. Here’s a breakdown of the account ages:
As you can see, this group has less new accounts in it than older ones. That big spike in the middle of the chart represents accounts that are about six years old. One reason why there are fewer new accounts in this network is because Twitter’s automation seems to be able to flag behaviors or patterns in fresh accounts and automatically restrict or suspend them. In fact, while my crawler was running, many of the accounts on the graphs above were restricted or suspended.
Here are a few more breakdowns – Tweets published, likes, followers and following.
Here’s a collage of some of the profile pictures found. I modified a python script to generate this – far better than using one of those “free” collage making tools available on the Internets. 🙂
So what are these accounts doing? For the most part, it seems they’re simply trying to advertise the “adult dating” sites linked in the account profiles. They do this by liking, retweeting, and following random Twitter accounts at random times, fishing for clicks. I did find one that had been helping to sell stuff:
Individually the accounts probably don’t break any of Twitter’s terms of service. However, all of these accounts are likely controlled by a single entity. This network of accounts seems quite benign, but in theory, it could be quickly repurposed for other tasks including “Twitter marketing” (paid services to pad an account’s followers or engagement), or to amplify specific messages.
If you’re interested, I’ve saved a list of both screen_name and id_str for each discovered account here. You can also find the scraps of code I used while performing this research in that same github repo.