Twitter Forensics From The 2017 German Election

Over the past month, I’ve pointed Twitter analytics scripts at a set of search terms relevant to the German elections in order to study trends and look for interference.

Germans aren’t all that into Twitter. During European waking hours Tweets in German make up less than 0.5% of all Tweets published.

Data collected from the 1% sample stream (gardenhose)

Over the last month, Twitter activity around German election keywords has hovered between 2 and 5 Tweets per second. Exceptions only occurred during the TV debate (Sunday September 3rd 2017) and the day of voting (Sunday September 24th 2017). Surprisingly, Tweet volumes were still low on Saturday September 23rd 2017 – the day before votes were cast.

Here’s how things looked on Friday and Saturday:

Prior to polls closing on Sunday, Tweet volumes reached a sustained 10 Tweets per second. Once exit polls were announced, volumes exploded.

That sudden drop is my script not handling the volume all that well.

Over the past month, topics related to the AfD party were pushed rather heavily on Twitter. This snapshot, taken on Thursday 21st September, is pretty similar to every one observed during the whole month.

Here’s an example of another trend I observed throughout the entire month regarding #afd hashtag volumes:

Notice that the Tweet volumes follow an organic pattern, trailing off during night hours. This contrasts with what I observed during the French elections earlier this year, where the #macronleaks hashtag was pushed by bots and maintained a constant volume regardless of the time of day. Despite the high volume of AfD-related Twitter content being posted, AfD didn’t show up in Twitter’s own German trends at any point.

The terms “migrant”, “refugee”, “islam” were mentioned a fair bit in Tweets. Here’s what happened over the weekend.

Ben Nimmo of @DFRLab noticed that a hashtag named #wahlbetrug (election fraud) was being amplified by commercial Twitter bots on Saturday. This story was also picked up by the German publication Bild. My scripts also saw the hashtag briefly enter the top 10 during that day.

Here’s a Tweet timeline of this hashtag since its appearance.

Looking at a timeline of Tweets using this hashtag reveals a number of particularly active accounts pushing this message.

Accounts retweeting Tweets containing the #wahlbetrug hashtag

Accounts replying to Tweets containing #wahlbetrug

Shortly after exit polls were published, the hashtag #fckafd surfaced.

The timeline of this particular hashtag is distinctly different.

However, a number of highly active amplifiers were involved in both cases.

Highly active amplifiers of the #fckafd hashtag

Highly active amplifiers of the #wahlbetrug hashtag

This data illustrates how tricky it is to automate the discovery of artificially amplified Tweets. While the #wahlbetrug hashtag was indeed amplified by paid commercial botnets, it didn’t make a splash, and Twitter users would most likely have needed to go searching for pro-AfD Tweets to find it.

A few Twitter users posted very actively during the campaign. Over the weekend, @Teletubbies007, @Jensjehagen, and @nanniag were very active.

@Teletubbies007 was the top tweeter of the #AfD, #btw17, #gehwaehlen, #reconquista, #traudichdeutschland, #wahlbeobachter, and #weidel hashtags.

@Jensjehagen published the most retweets over the weekend.

The @AfD account was the most mentioned Twitter account.

Tools for automating Twitter activity (such as IFTTT) appeared in the top 10 of sources captured during the weekend.

Tweets published by highly active accounts made up about 3.5% of all traffic during the weekend. I had no accurate way of measuring the number of Tweets originating from commercial botnets, and hence can’t give an estimate as to how much traffic they were responsible for.

My scripts were also configured to look for specific patterns in Tweets and metadata associated with users’ accounts in order to calculate how much activity originated from “alt-right” groups pushing a right-wing agenda. Rough calculations from this data suggest that as much as 15% of all Twitter traffic associated with the German election fit this pattern.
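As a rough sketch of what that kind of matching logic might look like – note that the term lists below are invented placeholders for illustration, not the patterns actually used in the analysis:

```python
import re

# Hypothetical example patterns -- the real lists were not published.
TEXT_PATTERNS = [re.compile(p, re.IGNORECASE)
                 for p in (r"#wahlbetrug", r"#traudichdeutschland")]
PROFILE_PATTERNS = [re.compile(p, re.IGNORECASE)
                    for p in (r"\bMAGA\b", r"\bpatriot\b")]

def matches_profile(tweet_text: str, user_description: str) -> bool:
    """True if the Tweet text or the author's profile description
    matches any of the configured patterns."""
    return (any(p.search(tweet_text) for p in TEXT_PATTERNS)
            or any(p.search(user_description) for p in PROFILE_PATTERNS))
```

In practice a function like this would be called once per Tweet arriving from the stream, tallying matched Tweets and the accounts that posted them.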

Of the roughly 1.2 million Tweets processed between Friday afternoon and Sunday night, about 170,000 Tweets were matched by that bit of logic. These Tweets originated from about 3,300 accounts. This traffic was enough to generate the results seen in this article, but it was really only visible to those watching Twitter streams with automated tools, such as the ones I’m using.

Of note were a few videos and URLs that received a fair amount of retweets. The most obvious of these was a video posted by @V_of_Europe showing an immigrant removing election campaign posters.

This was the second most retweeted Tweet I could find from the weekend (pertaining to the election itself). This Tweet was also pushed in other languages.

Another notable Tweet that showed up earlier in the weekend was a story about Merkel being booed at her final campaign rally in Munich. It didn’t get a whole lot of traction, though.

Plenty of links were shared to non-authoritative news sources. Here’s just one example…

…which was shared by this account…

Pay close attention to this user’s profile description

And on the same note, during the weekend, I saw plenty of non-German accounts Tweeting in German, and pushing links to questionable news sources.

I also captured plenty of pro-Trump accounts posting in English.

Another interesting story that failed to gain traction was one about an election results leak prior to the end of voting.

Across the duration of my analysis, 200,000 individual URLs were shared, appearing in around 418,000 Tweets. Of those URLs, only some 500 linked to questionable political content, and those were shared in about 4,000 Tweets. Of the 400,000 unique Twitter users observed participating in this discussion, about 3,000 were responsible for sharing “fake news” links. By and large, the accounts sharing these links didn’t look like bots.

Merkel was the most common word in Tweets that shared links to political agenda articles.

Overall, German language Tweets made up roughly 60% of all Tweets during the run-up to the election.

At the time of writing, the most retweeted Tweet I can find pertaining to the election is this one, which has already received over 10,000 retweets.

Also, it’s nice to see that Russians themselves have a sense of humor when it comes to all the allegations of election interference.

Given the lack of German participation on Twitter, it seems to me that the heavy right-wing messaging push that’s been going on during the German election cycle has been more about recruiting new members into the alt-right than it’s been about election interference.

TrickBot In The Nordics, Episode II

The banking trojan TrickBot is not retired yet. Not in the least. In a seemingly never-ending series of spam campaigns – not via the Necurs botnet this time – we’ve spotted mails written in Norwegian that appear to have been sent by DNB, Norway’s largest bank.

Trickbot Mail

The mail wants the recipient to believe that they have received an important “decision letter” and that they should open the attached document for more information. It also suggests that, if they have problems reading the content, they should click the “Enable Content” button… uh oh, where have we heard that before?

Anyway, let’s take a look at the attachment “SikreDokumenter.doc” (“Secure Documents”). Not that much to see here though.

Sikre Dokumenter

“Laster Innhold” translates to “Loading content”, but that content never appears – as if it’s waiting for the user to click “Enable Content”, as the mail suggests. Unfortunately, clicking the button still never reveals anything (how disappointing!). Instead, a Visual Basic macro launches a PowerShell script which downloads and executes the TrickBot loader.

And just like last time we wrote about TrickBot, a large spam campaign often goes hand in hand with a malware update. Now the authors are “celebrating” a brand new list of targets. Here’s a short summary:

  • More targeting of finance-related sites that are not traditional banks: American Express, Amazon, …
  • A few banks in Mexico, Argentina, and Chile. Central and South America are among the last parts of the world that TrickBot hadn’t yet visited.
  • New European countries: Croatia, Slovenia, Hungary, Turkey, …
  • More banks in countries targeted before, such as Belgium, The Netherlands, Luxembourg, Germany, Spain, Italy, Poland, Singapore, Australia, New Zealand, …
  • And last but not least: the Nordic countries are back in the game.

Wait, the Nordic banks were gone? That’s right! They appeared in June, but were removed again in early August. Our guess was that attacking the Nordics turned out to be not all that profitable – but now they’re back, which immediately explains the localized spam.

But fear not, our security products were already protecting you against this latest campaign.

Special thanks to Päivi for the help.

Working Around Twitter API Restrictions To Identify Bots

Twitter is by far the easiest social media platform to work with programmatically. The Twitter API provides developers with a clean and simple interface to query Twitter’s objects (Tweets, users, timelines, etc.) and bindings to this API exist for many languages. As an example, I’ve been using Tweepy to write Python scripts that work with Twitter data.

While seemingly powerful at first, developers will inevitably bump into one of the many restrictions imposed on usage of Twitter’s API, likely not all that long after they start using it. And if they’re like me, they’ll probably put a great deal of time and effort into figuring out if they can circumvent those restrictions. Here are a few that I’ve bumped into:

  • The Twitter API imposes rate limits on every action you can perform. These rate limits vary depending on what it is you’re trying to do. Here’s a table that lists them. Rate limits almost completely destroy one’s ability to create forensic tools that iterate follower/following lists looking for associations between accounts (that can sometimes be useful for mapping out bot networks.)
  • The number of results returned by queries is capped. A while back I tried to retrieve lists of followers for Twitter’s top 100 most followed accounts (all of which have millions of followers). The API only let me retrieve the most recent 5,000 followers. Likewise, if you want to iterate Tweets published by a specific user, the API will only return about 3,200 items, even if the user’s Tweet history contains more.
  • Searches will only retrieve data from the last 7 days. This prevented me from creating a tool to retrieve the first Tweet that contains a specific string, URL, or hashtag (which would be useful for forensic purposes). In order to see back past 7 days, you’d have to save all Tweets, all the time. And that would be expensive, if it were even possible, but…
  • You can’t listen to a stream of all Tweets that are happening. That stream is referred to as “the firehose”, and Twitter only grants a few customers access to it. You can, however, listen to a stream of 1% of all Tweets. That’s referred to as “the garden hose”. Alternatively, and this is the method I use, you can listen to a stream based on a set of search terms. This is a more targeted approach, and ends up being more useful than listening to a ton of noise in most cases.
  • As I mentioned in a previous post, some objects aren’t returned as you might expect. In the case of a “Quote Tweet”, you can access a static representation of the quoted Tweet, but not the full Tweet object. This prevents a script from iterating through nested quote Tweets without performing multiple queries and, yeah, you guessed it, hitting the rate limit wall.
  • Some information is missing. At the time of querying, I’d like to see how many replies a Tweet has accrued. Not possible. You can retrieve an integer corresponding to the number of times a Tweet has been liked or Retweeted, but you’d have to perform additional queries to get a list of users or Tweets associated with that action.

Regardless of the numerous restrictions, working with the Twitter API is fun. And figuring out how to retrieve the data you need while working under these restrictions is often a nice challenge. Saying that, I can’t help but feel like I’ll never operate at a level above “hobbyist”. The data that Twitter themselves have access to puts them in a much better position to find patterns associated with bots. For instance, they’ll likely have direct access to data about when accounts followed/unfollowed other accounts, name changes, Tweet deletions, and perhaps even the IP addresses of clients connecting to Twitter. A recent Brian Krebs article alluded to the fact that Twitter do have automation in place to detect bot-like behavior. Twitter’s back end logic appears to, in some cases, take automatic action against bots. It makes sense that they can’t reveal the logic behind their bot-detection algorithms, but you can definitely see it in play. In a recent example, Joseph Cox’s Twitter account was automatically restricted after bots targeted some of his Tweets.

Here’s another example. While writing this article, I pointed a script at the garden hose (1% of all Tweets) and collected some metadata about each Tweet I encountered. That metadata included a count of all hashtags seen in Tweets. Here are the top 10 hashtags my script encountered during the run.

Top 10 hashtags seen from Twitter’s garden hose between 11:00 and 12:00 EEST on 31st August 2017.

Right at the top of that list is the hashtag #izmirescort, a tag used in predominantly Turkish language Tweets to advertise escort services. However, that hashtag doesn’t show up in global trends. During the last several months, every time I’ve run a script against the garden hose, #izmirescort was the top hashtag. So it seems obvious that Twitter has some behind-the-scenes filtering going on to prevent certain hashtags from showing up in trends.
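The counting itself is simple. Here’s a minimal sketch of how such a tally can be built from streamed Tweet objects, using the `entities` structure the API attaches to each Tweet:

```python
from collections import Counter

def top_hashtags(tweets, n=10):
    """Tally hashtags from the `entities` structure of each Tweet
    (case-folded, since hashtags are case-insensitive) and return
    the n most common."""
    counts = Counter(
        h["text"].lower()
        for t in tweets
        for h in t.get("entities", {}).get("hashtags", [])
    )
    return counts.most_common(n)
```

In a real script the Counter would be updated incrementally as Tweets arrive from the stream, rather than over a list held in memory.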

The Twitter streaming API supplies a Tweet object for every Tweet retrieved from the stream. That object doesn’t just contain information about the Tweet, it also contains information about the user who published the Tweet. Hence, by listening to a stream, a script can harvest information about both Tweets and users. This is one of the best ways of getting around rate limiting. For the cost of one API transaction, you can listen to a stream as long as the connection holds, and gather interesting data. While attached to the garden hose stream, I configured my script to fetch a few pieces of metadata associated with the user who posted each Tweet. By obtaining the account creation date and number of Tweets that account has published, I can calculate an average value for Tweets per day over the lifetime of that account. There are some accounts out there that post a phenomenal number of Tweets per day. Here’s an example.
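That per-account calculation can be sketched in a few lines, using the `created_at` and `statuses_count` fields from the user object embedded in each streamed Tweet:

```python
from datetime import datetime, timezone

def tweets_per_day(created_at: str, statuses_count: int,
                   now: datetime) -> float:
    """Average Tweets per day over the lifetime of an account.

    `created_at` uses the Twitter API's timestamp format,
    e.g. "Thu Aug 31 11:00:00 +0000 2017".
    """
    created = datetime.strptime(created_at, "%a %b %d %H:%M:%S %z %Y")
    # Guard against brand-new accounts (avoid division by zero).
    age_days = max((now - created).total_seconds() / 86400, 1 / 86400)
    return statuses_count / age_days
```

Any account whose lifetime average exceeds, say, a few hundred Tweets per day is worth a closer look.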

A snapshot of high activity Twitter users obtained from a few minutes of listening to the garden hose stream.

If you listen to a stream for long enough, you’ll observe some accounts Tweeting multiple times. Recording the time interval between Tweets allows you to build up an “interarrival” map. You can also build an interarrival map for an individual user by obtaining previous Tweets from that user and examining the timestamp of each Tweet. Here’s an interarrival map of the last 3,200 Tweets from the top listed account above (Love_McD).

0 | 1396
1 | 1249
2 | 341
3 | 99
4 | 28
5 | 6
6 | 1
7 | 1

The above data shows that 1396 Tweets were published with less than one second interval between them, 1249 Tweets one second apart, 341 Tweets 2 seconds apart, and so on. This account literally tweets every few seconds, non-stop.

By performing a standard deviation calculation on the counts in the second column, you can obtain a floating point number that represents how “machine-like” that account’s behavior is. Normal accounts tend to have a standard deviation value very close to zero. This account’s value was 549.79.
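Here’s a minimal sketch of both steps – building the interarrival map from a list of Tweet timestamps, and deriving the “machine-like” score from the histogram’s counts:

```python
from collections import Counter
from statistics import pstdev

def interarrival_map(timestamps):
    """Histogram of whole-second gaps between consecutive Tweets.

    `timestamps` are Unix times (seconds), oldest first.
    """
    gaps = (int(b - a) for a, b in zip(timestamps, timestamps[1:]))
    return Counter(gaps)

def machine_score(hist):
    """Population standard deviation of the histogram's counts.

    Spiky histograms -- most gaps clustered on a few values, as with
    the account above -- score high; organic accounts score near zero.
    """
    return pstdev(hist.values())
```

For a per-user map, the timestamps would come from the user’s timeline (the `created_at` field of each Tweet, converted to Unix time).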

Of course, by visiting the above account’s Twitter page, you’ll notice that it’s a verified account belonging to McDonald’s Japan. Simple numerical analysis on account activity isn’t enough to determine whether it’s a “bad” bot. And there are plenty of legitimate bots on Twitter.

Some bots attempt to hide their activity, while pushing an agenda, by replying to other users. Anyone with a high-profile enough Twitter account has probably had a random Tweet of theirs replied to by a p0rn bot. Thus, using a script to track the percentage of Tweets from an account that were replies to other Tweets is a useful way of determining suspiciousness.
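A sketch of that reply-percentage check, using the `in_reply_to_status_id` field present in each Tweet object:

```python
def reply_ratio(tweets):
    """Fraction of an account's Tweets that are replies to other Tweets.

    Each item is a Tweet dict from the API; `in_reply_to_status_id`
    is None for ordinary (non-reply) Tweets.
    """
    if not tweets:
        return 0.0
    replies = sum(1 for t in tweets
                  if t.get("in_reply_to_status_id") is not None)
    return replies / len(tweets)
```

An account that does almost nothing but reply to strangers’ Tweets is a reasonable candidate for further scrutiny.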

As Ben Nimmo has pointed out, some Twitter botnets utilize multiple accounts to Retweet and Like specific Tweets in an attempt to modify SEO on those posts. This is the tactic one botnet owner used to attack Joseph Cox’s account. Again, a script can be used to examine Retweet and Like behavior of an individual account, and by listening to a stream, one can build a list of suspicious accounts on-the-fly, as data arrives.

Another rather easy way of finding bots is to look at the “source” field in a Tweet. This field is set by the application that posted the Tweet. If you publish a Tweet from the Twitter app on your iPhone, source will be set to “Twitter for iPhone”, for example. If you are using the Twitter API to publish Tweets, you’ll create a name for the source field when you set up your API keys. Not all bots use the Twitter API to post Tweets, though, since it’s an obvious giveaway. The bots that recently harassed Ben Nimmo and others all report “legitimate” values in their source fields, indicating that the bot master has automated Tweeting from iPhones and web clients. However, since examining the source field of a Tweet is trivial, it’s still a nice way to determine suspiciousness. Here are some interesting-looking source fields I picked up, in conjunction with high-volume Tweeters.

Non-standard source fields (rightmost column) from high-volume Tweeters.

Note that IFTTT (If This Then That) is a legitimate service used by many companies and individuals to automate social media activities. Hence it’s also a great place for not so legitimate operations to hide.
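This check is easy to automate by comparing each Tweet’s source against a whitelist of official clients. The whitelist below is a partial assumption for illustration, and “MassTweeter3000” in the usage example is invented:

```python
import re

# Sources produced by official Twitter clients. Assumed partial list,
# for illustration only.
KNOWN_SOURCES = {
    "Twitter for iPhone",
    "Twitter for Android",
    "Twitter Web Client",
    "TweetDeck",
}

def source_name(tweet: dict) -> str:
    """The API returns `source` as an HTML anchor, e.g.
    '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
    so strip the markup before comparing."""
    return re.sub(r"<[^>]+>", "", tweet.get("source", "")).strip()

def suspicious_source(tweet: dict) -> bool:
    return source_name(tweet) not in KNOWN_SOURCES
```

As noted above, a non-standard source isn’t proof of anything on its own – legitimate services such as IFTTT would also be flagged – but combined with high Tweet volume it’s a useful signal.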

As an aside, here’s a breakdown of the languages seen while my script was running (which can provide information about the popularity of Twitter in different regions at that hour of the day).

Language breakdown from the Twitter garden hose stream between 11:00 and 12:00 EEST August 31st 2017. The large brown segment is “ja”, but the legend got cut off (I’m still tweaking my visualization implementation).

The botnet recently examined by Ben Nimmo was evidently being used to promote content in multiple languages. Hence, an examination of the languages used in Tweets published by a single account may help determine suspiciousness. However, that particular form of analysis is somewhat at the whim of Twitter’s own language-determination algorithms. I did a language distribution analysis of my own Tweets and found this.

1: cs
1: de
807: en
3: es
3: fi
2: fr
1: ht
1: in
1: no
2: pt
1: tl
1: tr
36: und

“und” means “undetermined”. As you can see, Twitter may have incorrectly categorized the language of some of my Tweets. However, en is overwhelmingly represented. Accounts that have been used to push content in multiple languages may have double-digit percentage values for multiple languages, indicating that the account holder is either multilingual, or that the account is automated. Of course, Retweets should be factored into this calculation.
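A sketch of the language-distribution calculation, using the `lang` field Twitter attaches to each Tweet; the 10% threshold mirrors the double-digit-percentage heuristic mentioned above:

```python
from collections import Counter

def language_distribution(tweets):
    """Percentage of an account's Tweets per detected language.

    Uses the `lang` field from each Tweet object ("und" when
    Twitter could not determine the language).
    """
    counts = Counter(t.get("lang", "und") for t in tweets)
    total = sum(counts.values())
    return {lang: 100.0 * n / total for lang, n in counts.items()}

def looks_multilingual(dist, threshold=10.0):
    """True if two or more languages each exceed `threshold` percent."""
    return sum(1 for pct in dist.values() if pct >= threshold) >= 2
```

As noted above, Retweets should ideally be filtered out (or weighted separately) before running this, since retweeting foreign-language content is common for genuine users too.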

The use of scripts to analyze Twitter data opens up many ways to search for suspicious activity. This post has touched upon some of the simplest techniques one can use to build Twitter bot analysis scripts. I’ll cover some more complex techniques in future posts.

Trump Hating South Americans Hacked HBO

Last week, I read the message “Mr. Smith” reportedly sent to HBO… and it brought up a few questions. It also offered some “answers” to questions that I’m often asked, such as “how much money do cyber criminals make?”

Here’s the start of the message.

It took about 6 months

First, let’s examine Mr. Smith and his colleagues’ persistence. How long did it take to infiltrate HBO? According to Mr. Smith, it took ~6 months and they considered it to be “difficult”.

Next, how much money do cyber criminals make? And how busy are they?

we are IT professionals

According to Mr. Smith’s claims…

  • Annual expenses: ~400 to 500 thousand dollars for 0-day exploits.
  • Annual income: ~12 to 15 million dollars.
  • Annual number of targets: often two major operations.
  • Total number of targets: 17.
  • Number who paid the extortion: 14.

But of course, it’s supposedly not all about the money.

Even we hate trump like other Americans do

Near the end of the message, Mr. Smith warns “Don’t go to FBI or other f–––ing IT Idiots. They are so busy or shoe makers!”

Now, I haven’t come across the term “shoe makers” before so I was curious. In what country or countries is that expression commonly used? I mentioned it to Andy and he checked with our language localization team, and they took a look at the entire message.

So what’s their best guess as to where in the world is Mr. Smith?

They suggested Spanish-speaking parts of South America, based on the punctuation, capitalization, phraseology, and style of communication… maybe Argentina? There’s also the Latin American insult about shoemakers (aha!) and the inclusive use of “Americans”.

Inclusive use of “Americans”? I didn’t notice this on my first reading, but the localization team did. Look above and you’ll notice Mr. Smith’s “Even we hate trump like other Americans do.”

And there you go: HBO was hacked by Trump-hating South Americans.

Break your own product, and break it hard

Hello readers,

I am Andrea Barisani, founder of Inverse Path, which is now part of F-Secure. I lead the Hardware Security consulting team within F-Secure’s Cyber Security Services.

You may have heard of our USB armory product, an innovative compact computer for security applications that is 100% open hardware, open source and Made in Italy.

USB armory

We’ve recently published a Proof-of-Concept (PoC) relating to the product and a security advisory, and I’d like to take this opportunity to discuss a traditional (but always relevant) topic that I consider of critical importance in information security.

The information security community has a long-established tradition of publishing PoC code to demonstrate security vulnerabilities. This is considered a welcome and essential practice in security research for many reasons. Publishing PoC code encourages further investigation and testing of issues among vendors or other affected parties; it promotes security research; and it empowers other skilled parties to further verify the scope and impact of vulnerabilities.

The most important and compelling reason to take this approach, however, is this: In scenarios where detailed technical information has already been made public, the lack of a working PoC does not, and should not, constitute any form of “protection.”

In other words, if there is enough public information describing a security vulnerability, we must always assume that it is being actively exploited, particularly by parties holding working exploits but not so keen on their release.

This general rule applies to software, hardware, and all kinds of security vulnerabilities, and it holds even when only minimal proof of a valid finding has been made public.

Earlier this year our friends at Quarkslab coordinated the release of two secure boot vulnerabilities for the NXP i.MX6 series of application processors.

We immediately began working to convert their advisory into a working PoC so that we could assess applicability on other processors, such as the USB armory i.MX53, as well as fully understand the impact of the resulting secure boot bypass.

In little more than a man-day we were able to patch our own Open Source code signing tool to integrate a working PoC for one of the reported vulnerabilities.

This allowed us to:

  • validate the applicability of the reported issues on a new target
  • assess the complexity and time effort required to develop the PoC from the advisory contents
  • fully understand the vulnerability impact and scope
  • notify affected customers with full confidence of impact and mitigations
  • as always when developing PoCs, learn a lot!

One might argue that we are putting customers who leverage the secure boot feature at risk by releasing the PoC, raising the decades-old controversy between information security researchers and parties not familiar with this kind of disclosure process.

In short: We truly are not putting customers at risk, and in fact we are doing quite the opposite.

The illusion of protection, given by the lack of a PoC and while technical details are out there, is far more dangerous than our disclosure.

In fact, protection by obscurity is never an option, especially when dealing with safety-critical industries such as avionics and automotive – a distinguishing part of the Inverse Path team’s background, now integrated into F-Secure CSS.

The responsible disclosure of this PoC is a benefit to customers, vendors and the research community. We are proud to show that we follow such principles even when it comes to breaking our own products.

It should be emphasized that the reporting, investigation, and disclosure of these vulnerabilities were fully coordinated among the affected parties: ourselves, the applicable CERTs, the vendor, and the original reporters.

If you are interested in the full details and impact of the vulnerabilities please read our full advisory.

If you are interested in knowing more about X.509 ASN.1 certificate parsing manipulation, you can find the gory details in our PoC function.

You might be interested to know that F-Secure promotes security research against our products and services with our own Vulnerability Reward Program, a program that is also made available to company employees who want to volunteer to hack company products. The VRP effort is a complement, not a replacement, to the ongoing company effort to secure our products and services without compromise.

I would like to personally thank our friends at Quarkslab for the findings and coordination in handling the vulnerabilities.

It is only with this kind of harmonized cooperation, sharing the same tradition and principles, that we can help vendors as well as ourselves in improving security without relying on obscurity.

Andrea Barisani
Head of Hardware Security – F-Secure – Inverse Path

Retefe Banking Trojan Targets Both Windows And Mac Users

Based on our telemetry, customers (mainly in Switzerland and Germany) are being targeted by a Retefe banking trojan campaign which uses both Windows and macOS-based attachments. Its massive spam run started earlier this week and peaked yesterday afternoon (Helsinki time).

TrendMicro did a nice writeup on this threat earlier this week. The new campaign, which just started yesterday, made some updates to the malware payload.

Instead of storing the installation strings and Onion proxy domain in the binary as plain text, the authors made an effort to hide the interesting strings by XORing them with 0xFF.

Original encrypted…

…and decrypted!
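Since the obfuscation is just an XOR with 0xFF, undoing it is trivial. Here’s a one-function Python sketch; note that the same operation both encodes and decodes:

```python
def xor_ff(data: bytes) -> bytes:
    """Retefe's string obfuscation: XOR every byte with 0xFF.

    XOR is its own inverse, so this both encrypts and decrypts.
    """
    return bytes(b ^ 0xFF for b in data)
```

Running the encrypted string blobs from the binary through this function yields the plain-text configuration shown above.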

The spam message looks like it’s coming from “Mein A1” info@ from different .ch TLDs with subject lines such as “Ihre Rechnung #123456-AB123456 vom 13/07/2017”. The mail itself is (signed) by A1 Telekom Austria AG. The mail contains two attachments: a zipped Mach-O application, and a .xlsx or .docx document file. The first attachment targets macOS systems, whereas the latter document file installs the malware on Windows systems.

The mail itself doesn’t give any social engineering cues to the victim as to which file to open; moreover, having an Austrian-based telecom company sending Swiss International Airlines related documents is probably more confusing than intriguing.

The text explains that double-clicking opens a larger view of the image – but actually, it runs the malware.

Though the malware mainly targets Switzerland via .ch TLD domains, we also found a target configuration for Austrian banks.

List of Austrian-based targets:

‘*’, ‘*’, ‘*’, ‘*’, ‘’, ‘*’, ‘’, ‘*’

List of Swiss-based targets:

‘*’, ‘’, ‘*’, ‘*’, ‘’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘’, ‘*’, ‘*’, ‘’, ‘*’, ‘’, ‘*’, ‘’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’

Though the Retefe banking trojan has previously operated in other European countries, such as Sweden and the UK, countries other than Switzerland and Austria were not seen in this campaign.

As a note of historical interest, here’s a list of UK-based banks that were targeted in June 2016:

‘*’, ‘*’, ‘*’, ‘’, ‘’,’*’, ‘*’, ‘*’, ‘’,’*’,’*’, ‘*’, ‘’, ‘*’, ‘*’,’*’, ‘*’


Indicators of compromise:

  • https: //www[.]dropbox[.]com/s/azkkyzzo41tk84i/FzF7sEBlz128859.exe?dl=1
    https: //www[.]dropbox[.]com/s/96q0qkrusk5gkp6/HwJoS9VDWh570254.exe?dl=1
  • 6aaoqcl2leiptpvn.onion
  • 2cac780f6de5a8acc3506586c06b1218c33b21b0

How EternalPetya Encrypts Files In User Mode

On Thursday of last week (June 29th 2017), just after writing about EternalPetya, we discovered that the user-mode file encryption-decryption mechanism would be functional, provided a victim could obtain the correct key from the malware’s author. Here’s a description of how that mechanism works.

EternalPetya malware uses the standard Win32 crypto API to encrypt data.

For every fixed drive in the system which is assigned to a drive letter (C:, D: etc) the following is performed:

  • Initialize a context by calling CryptAcquireContext. Provider is “Microsoft Enhanced RSA and AES Cryptographic Provider”
  • CryptGenKey is called to generate an AES-128 key (AES is a symmetric algorithm, so there is no public/private key pair at this stage). Afterwards, CryptSetKeyParam is called to set the padding to PKCS5 and the mode to CBC.
  • All files on the drive are enumerated. Files in C:\Windows and subfolders are skipped. The file extension is checked against a fixed list of 65 extensions.

Petya public key, exclusions list, and file extensions list.

  • If there is a match, the file will be encrypted:
    • The file is opened via the Windows file mapping API.
    • If the file is larger than 1 MB, only the first MB will be encrypted.
    • Call to CryptEncrypt is made to encrypt the selected data.
    • The encrypted file is NOT renamed.
    • Note: there is a “bug” in this function: if the file is larger than 1 MB, the Initialization Vector will not be reset for the next file (i.e. the encryption “continues” there), making decryption more prone to failure.

In order to decrypt the files successfully, the files should be enumerated in the exact same order as during encryption, and with the same “bug” in place.

  • After all selected files have been encrypted, preparations are made to create the README.TXT contents. Here’s what happens:
    • The malware contains a hardcoded public encryption key in base64 format: “MIIBCgKCAQEAxP/VqKc0yLe9JhVqFMQGwUITO6…..” (see above screenshot). This key is first decoded using CryptDecodeObjectEx. The result is then passed to CryptImportKey to create the corresponding public key to be used in the next step.
    • CryptExportKey is called to export the AES file encryption key. The output is encrypted using the public key imported in the previous step.
    • Finally, CryptBinaryToString is called to transform this encrypted key into a string representation.
  • Just before rebooting, the README.TXT – containing this encrypted string – is generated and written to the disk’s root folder.
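The key-wrapping flow above can be sketched as a toy hybrid scheme: the symmetric file key is encrypted under the attacker’s embedded public key, so only the holder of the matching private key can recover it. The sketch below uses textbook RSA with tiny primes purely for illustration – the real malware performs this via the Win32 API calls named above, with a full-size RSA key:

```python
# Illustration only: textbook RSA with tiny primes standing in for
# the CryptImportKey/CryptExportKey/CryptBinaryToString flow above.
# Do not use schoolbook RSA for anything real.
import base64

# Attacker's key pair. In the malware, only the public half (e, n) is
# embedded; the private exponent d never leaves the author's hands.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def wrap_key(session_key):
    # "CryptExportKey": encrypt the file key under the public key,
    # then "CryptBinaryToString": render it as text for README.TXT.
    enc = b"".join(pow(b, e, n).to_bytes(2, "big") for b in session_key)
    return base64.b64encode(enc).decode()

def unwrap_key(note_blob):
    # Only someone holding d (the attacker) can perform this step.
    enc = base64.b64decode(note_blob)
    return bytes(pow(int.from_bytes(enc[i:i + 2], "big"), d, n)
                 for i in range(0, len(enc), 2))

file_key = bytes(range(16))          # stand-in for the AES-128 file key
ransom_note_key = wrap_key(file_key)
assert unwrap_key(ransom_note_key) == file_key
```

Without d, the base64 string in the ransom note cannot be reversed – which is why any decryption effort hinges on the malware’s authors publishing their private key.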

File decryption should be possible, provided that:

  1. We have been provided with a private key to decrypt the file decryption key. (As of writing, the malware authors haven’t published it.)
  2. Files are enumerated in the exact same order as during encryption, i.e. no files were added, moved, or deleted between encryption and decryption phases.
  3. The disk’s MFT (master file table) hasn’t been destroyed by the other malware components.
  4. File encryption was only performed once. As we previously noted, propagation techniques in the malware may end up encrypting files a second time, with a different key. This would make the files absolutely unrecoverable.

Note that the malware does not include this decryption functionality. A separate decryptor tool would need to be provided to victims.


What Good Is A Not For Profit (Eternal) Petya?

Following up on our post from yesterday, as an intellectual thought experiment, let’s take the position that there’s something to the idea of (Eternal) Petya not being motivated by money/profit. Let’s also just go ahead and imagine that it’s been developed by a nation state.

In my mind, it raises the following question: WTF WHY? Why build a tool such as (Eternal) Petya? Or as Andy puts it in this post: if someone wanted to build a “wiper”, why build an almost functional ransomware?

First, having written/edited numerous malware descriptions over the years, I’m a bit pedantic about proper categorization – so let’s be clear, (Eternal) Petya is not a wiper. A wiper is something such as Shamoon. (Eternal) Petya is almost fully functional ransomware, and the question is: what more is it? If this is a prototype, what is it moving towards?

Say you’re developing tools of (cyber) warfare…

How useful is an indiscriminate, scorched-earth tool? Sure, it would have its uses, and it’s probably the first thing you would develop, but in the end, it’s a pretty blunt tool. Deploying any such tool with clear attribution only escalates the situation. Use it, and you’ve immediately crossed a line. The response is going to be very severe, and will probably be in kind. Think mutual assured destruction (MAD) severe. A world of nothing but indiscriminate tools/weapons is limited (and very dangerous).

So what you need is a discriminating tool; something more refined. You want/need something that can remediate collateral damage; something that can take you up to a line, but not cross completely over it. Perhaps what you want is to “weaponize” encryption. That would allow you to disable your adversary while putting you in a position to negotiate your next move.

There are undoubtedly already nations with cyber warfare tools that can cripple critical infrastructure without completely disabling/destroying it. Which is to say, their tools are far more precise and thus they can more easily be deployed without crossing over too many lines.

That makes for asymmetry. And if you’re a nation state trying to quickly close a gap, you might decide to test things in-the-wild. But you wouldn’t just test in-the-clear; you need some plausible deniability – and crypto-ransomware provides very good deniability. If you want a tool that effectively acts like a wiper, delay remediation – or simply don’t respond. And if your goal is something else, your tool is reversible without your having to (publicly) admit guilt.

End of thought experiment.

And of course, remember, it could just be ransomware in-development.

(Eternal) Petya From A Developer’s Perspective

In our previous post about Petya, we speculated that the short-cuts, design flaws, and non-functional mechanisms observed in the malware could have arisen due to it being developed under a tight deadline. I’d now like to elaborate a little on what we meant by that.

As a recap, this is what the latest version of Petya looks like (to us).

Since the previous post, we’ve determined that the user-mode encryption-decryption functionality is actually working as intended. That’s the mechanism used to encrypt and decrypt files on the system.

However, the MBR encryption-decryption functionality doesn’t work. This is because the “personal installation key” displayed in the MBR ransom screen is just a randomly generated string. This is one of the main reasons why some people are calling Petya a “wiper” (or “WTFware”).

Naturally, this unimplemented functionality has led to much discussion on the Internet. And also here at F-Secure.

We discussed the “illusory” MBR functionality in some detail. Here are a few points that were made.

  1. This is a new piece of crimeware when it comes to the user-mode component. But the MBR code was taken from v3 of Petya, known as “GoldenEye”. Current consensus seems to be that only a guy called “Janus” has the source code of Petya, so if this operation isn’t being run by “Janus”, it’s possible that this new ransomware is just using a hex-edited dump of the infected MBR.
  2. How would a third-party use the Petya binary? First off, the way this dump is used tells us that the author knows exactly where to patch the Petya encryption key and nonce, and also the “User ID”. The author also knows how to edit the MBR banner message, and how to make other minor changes. This involves reverse engineering and taking a very close look at what people have written about Petya over the past year or so.
  3. In the “real” Petya, the “User ID” is a real cryptographic entity that makes it possible to recover the actual disk encryption key, which is discarded in the process of locking up the disk. In this new Petya, the “User ID” is totally random. I mean literally: it’s purposefully generated using Windows crypto APIs as a random string.
  4. Now ask yourself: if you know how to strip off the Petya bootloader, how to hex edit it, where exactly to put all these variables, and how to change the banner, would you think it will work if you just put some random data there? Really? Everybody who follows these things knows that the “User ID” is a complicated piece of very precisely formatted information that includes, for example, an Elliptic-curve public key generated by the infected machine. So, even if someone tripping on mushrooms thinks that random data will magically turn into something useful that can restore the encryption key, you need to ask next: would you in that case even once try to reboot and test whether your magical random ID can be of some use? And would you ship it after this revelation?

Here we establish that the author of this malware obviously knows his code isn’t going to work. The author also knows that members of the community have meticulously studied prior versions of Petya, especially the MBR code, which is of interest, since it’s unique to Petya. So the malware author would be aware of the fact that it wouldn’t take reverse engineers all that long to figure out that the encryption-decryption mechanism is bogus.

However, consider the possibility that this malware was developed on a tight deadline, and released ahead of a reasonable schedule. I’m sure there isn’t a single software developer in the world who hasn’t been through that “process”.

But let me put on my “developer hat”. If I were to develop a piece of malware that includes Petya, I’d most likely start by embedding it in my project, perhaps filling all the variables with random data at first – just to make sure that the overall concept of embedding it works, that the Petya MBR code kicks in after reboot, and so on.

After everything seems to work, then I’d move on to the next step, where I’d implement the actual Elliptic-curve stuff that would ultimately replace the placeholder random data.

Let’s note at this point that all the Elliptic-curve-related code is completely absent in this new variant. It’s a complicated piece of code that would probably take over a week to develop and test properly.

So, what if my friend, who’s supposed to “ship” this thing, gets the wrong version, or doesn’t bother to wait for it to be ready, or whatever? He will just package everything up, run it to see that everything looks like it works, and ship it. And off we go.

Putting together a proper build process for software isn’t easy. Tracking changes in different modules, making sure your final package contains the right things, and testing it thoroughly enough to catch discrepancies or wrong versions is time-consuming. Many “real” software companies have problems with build processes and versioning. The theory that the final build of Petya contained an old version of one of the components is not at all far-fetched. Neither is the theory that they shipped a “minimum viable product”.

Plenty of other evidence points towards this piece of software being developed in a hurry, and not thoroughly tested. For instance, a machine can re-infect itself and encrypt files twice. There’s also this bug:

And I’m sure we’ll find more bugs. (Whoever wrote this is getting a lot of free QA!)

At the end of the day, if someone wanted to build a “wiper”, why build an almost functional ransomware, save for a few bugs and a possibly misconfigured final package?


My colleague, Sean Sullivan, wrote a follow-up post to this:

If this attack was aimed purely at Ukraine, then given the collateral damage we’ve seen, and the emerging information about the aggressive lateral propagation mechanisms employed by this malware, I’d add that (network propagation) logic to the list of bugs/design flaws present in this malware.


Petya: “I Want To Believe”

There’s been a lot of speculation and conjecture around this “Petya” outbreak. A great deal of it seems to have been fueled by confirmation bias (to us, at least).

Many things about this malware don’t add up (at first glance). But it wouldn’t be the first time that’s happened.

And yet everyone seems to have had answers to a variety of burning questions – within mere moments of this whole thing exploding. It’s either a case of “wisdom of the crowd” (definitely good) or “group think” (definitely bad).

We prefer to avoid being pulled into group think, so we’ve stepped aside, exercised patience, and tried to apply some healthy skepticism to the matter. We realized that our questions could only be answered after a thorough analysis of the available material. So we took the methodical approach.

There’s a large risk of jumping the gun in this particular case, and it’s too important to us to risk that. Yeah, we get it. This is a topic of very high interest, and people have been awaiting our say on the matter. We’ve erred on the side of caution. In the current media landscape, narratives can quickly spin beyond any one person’s or organization’s ability to course-correct. We don’t want to be the ones steering the ship in the wrong direction.

Needless to say, our analysts have been working long, hard hours on this case over the past few days, and ordering in so they can keep working.

Lunch pizza

Pizza: the lunch of champions!

So, taking the default position of “it does what it says on the tin”, what evidence would convince us to change our minds? (We’re fans of the scientific method.) We didn’t really know what evidence was needed until we started looking. And so, over the past 48 hours, we’ve bombarded our colleagues with questions (a lot of which we’ve seen others asking). So, without further ado, let’s start.

“Can it even be ransomware if it doesn’t have a good payment pipeline?”

Lots of ransomware uses email. There are only two ways to communicate with your customers/victims: use email or create a service portal. They each have pros and cons. Starting with email doesn’t mean you’ve ruled out creating a portal later on, because in a case such as this, if you build it, they will come (if everything’s working properly).

So, is everything working properly?


Aha! So that’s evidence of it not being ransomware, right?


But why?

Malfunctioning malware isn’t rare. It’s possibly evidence of nothing more than a bug in the code, a design flaw, or issues with supporting infrastructure. It’s typically not enough evidence for us to attribute anything in particular.

So what doesn’t work?

Decryption of files is not possible.


For many reasons. We’ll get into that below.

Many reasons? So there’s lots of bugs? Isn’t that evidence that it’s not real ransomware?

To be honest, who knows. It’s evidence of a mess, and we’re still working to untangle all the knots. It’s time-consuming.

This line of questioning is getting us nowhere. Let’s move on.

Nation state malware is advanced and sophisticated, right? Is this sophisticated?

Yes and no. As you might have guessed from above, part of it most certainly isn’t sophisticated. But… part of it is. We’ve identified three main components. Two of them are pretty shoddy and seem kinda cobbled together. But the third component, the bit that allows the malware to spread laterally across networks, seems very sophisticated and well-tested.

So it’s a paradox then?

Kinda. You can probably see why we’re trying not to rush to any judgements (we hope!). For the sake of this post, let’s call these three components the “user mode component”, the “network propagation component”, and the “MBR component”. Here’s a diagram.

Bad handwriting by yours truly (Andy).

So what’s the deal with this “sophisticated” part?

Aha! So here’s the interesting bit. It appears to be well designed and well tested, and there’s evidence that the network propagation component was already in development in February.²

Update: see below ².

“We won’t be able to determine the timestamp for the use of NSA tools since it’s part of the main DLL code which has the June timestamp.” ²

“Also, in this particular Petya sample, the shellcode is in a way coupled with the exploits. That is, they didn’t simply plug the shellcode in without properly testing it with their version of the SMB exploit.” ²

What’s interesting about that?

February is many weeks before the exploits EternalBlue and EternalRomance (both of which this module utilizes) were released to the public (in April) by the Shadow Brokers. And those exploits fit this component like a glove.

Note: this isn’t rock solid evidence, but it’s far more compelling to us than any of the other reasoning we’ve seen so far.

How does it compare to WannaCry (which also used these exploits)?

WannaCry clearly picked these exploits up after the Shadow Brokers dumped them into the public domain in April. Also, WannaCry didn’t do the best job at implementing these exploits correctly.

By comparison, this “Petya” looks well-implemented, and seems to have seen plenty of testing. It’s fully-baked.

So, if the network propagation component is fully-baked, why aren’t the other two?

Here’s our theory. WannaCry, again.

WannaCry burst onto the scene in May and started trashing up the joint, causing everyone to scramble to patch SMB vulnerabilities. Microsoft even patched XP! The result was a sudden drop in the effectiveness of carefully crafted network propagation components (such as the one we’re talking about here). Whatever project these guys were working on suddenly got its deadline adjusted. And hence everything else was done in a bit of a hurry.

Do you have anything else to add to your timeline?

Kaspersky Lab pointed to a Ukraine-based watering hole, and it turns out the MBR component was actually pushed out via that site in late May, post-WannaCry. We feel this might also be consistent with an “adjusted” deadline.

So, can you sum this up for us?

  • January 2017. The Shadow Brokers advertised an “auction” which revealed the names of all the exploits they had for sale.
    • The NSA, upon noticing their exploits being advertised, hurriedly contacted Microsoft (reportedly with hat in hand), and owned up to their shenanigans.
  • February 2017. Microsoft then had a very busy patch cycle and actually missed patch Tuesday that month.
    • Meanwhile, “friends of the Shadow Brokers” were busy finishing up development of a rather nifty network propagation component, which ended up utilizing these exploits. ²
  • March 2017. Microsoft rolls out patches. Many fixes (to NSA-exploited vulnerabilities) made.
  • April 2017. The Shadow Brokers dump a whole bunch of exploits into the public domain.
    • Somebody with possible connections to North Korea notices.
  • May 2017. WannaCry, ’nuff said.
    • Either the “friends of the Shadow Brokers” had something they felt they needed to get done, and their deadline was stepped up because of WannaCry…
    • Or they figured they could “join the party” as yet another ransomware, as long as they capitalized on it within a reasonable amount of time.
    • The MBR component of this malware was alpha tested using a watering hole attack.
  • June 2017. You are here.

Are you still skeptics?


But are you still skeptical about this malware being “nation state”?

Less and less so. We don’t think any current attribution is rock solid (attribution never really is). We feel this is definitely worth deeper investigation. And more pizza.

We’ve changed our minds on some of our earlier conclusions. Please note this if you’re reading any previous F-Secure analysis. And, of course, this is subject to further revision, as new facts come to light.

What other thoughts would you like to share?

As we mentioned earlier, two of the components in this malware are quite shoddy. Here are some interesting/confusing things we found.

The generated “personal installation key” displayed on the MBR version of the ransom page is 60 bytes of randomly generated data. This wouldn’t be a problem if it were sent upstream to the attacker, as a customer ID, but it isn’t (there’s no C&C functionality at all). It’s basically a placeholder that makes the ransomware look legit.
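For illustration, here’s how trivially such a decoy can be produced. This is a hypothetical sketch – the character set and dash-grouping are assumptions for readability, and the malware itself draws its randomness from the Windows crypto API:

```python
# Hypothetical sketch: a "personal installation key" built from pure
# randomness carries no recoverable information, so no decryptor
# could ever act on it. Charset and grouping here are assumptions.
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits

def fake_installation_key(n_chars=60, group=6):
    chars = [secrets.choice(ALPHABET) for _ in range(n_chars)]
    return "-".join("".join(chars[i:i + group])
                    for i in range(0, n_chars, group))

key = fake_installation_key()
assert len(key.replace("-", "")) == 60   # looks official, means nothing
```

One call to a random generator and a bit of formatting is all it takes to produce something that looks like a cryptographic identifier but encodes nothing at all.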

Why the authors of this malware failed to add proper decryption functionality to the MBR lock screen remains an open question. Was it intentionally left out, did they make a huge mistake, or did they run out of time?

One of our analysts noted that implementation of the Elliptic curve Diffie–Hellman functionality necessary to enable proper encryption-decryption services in the MBR portion of the malware would take upwards of a week. If the developers were in a hurry, this could be one of the reasons why they opted for the “illusory” functionality we’re seeing.

This malware encrypts files on the user’s system and then, if it can elevate to admin, rewrites the MBR and reboots into a lock screen. Why encrypt files on the machine if you’re going to ultimately render the whole machine unusable? The user-mode encryption step is actually a fallback mechanism for cases where the malware can’t attain the admin rights it needs to modify the MBR and execute that phase. Essentially, it’s a way of increasing the author’s chances of receiving a payment.

In cases where the malware fails to elevate, and only encrypts files in user-mode, a ransom note is left for the victim. This ransom note contains a different, much longer key than the one seen in the MBR lock screen. We’re currently looking into whether that key is generated in a way that might allow decryption to happen¹.

Petya user mode ransom note.

You can get infected multiple times.

This malware also does other stuff that indicates poor testing practices. For instance, a machine infected with this malware can re-infect itself via one of its own propagation mechanisms. In this case, user-mode encryption will run a second time (most likely with elevated privileges), making decryption impossible.

It has a vendetta against Kaspersky Lab.

If this malware finds running Kaspersky processes on the system, it writes junk to the first 10 sectors of the disk, and then reboots, bricking the machine completely.

One final thing.

We know of victims who don’t use M.E.Doc and have no obvious connections to Ukraine. Yet they were infected during Tuesday’s outbreak. This mystery is one of the factors that have kept us from jumping on the conspiracy train. And we still don’t have answers here.


¹ Edited on Thursday

We’ve confirmed that the user-mode encryption-decryption logic is functional and does work. Details here.

² Edited on Friday

See also our latest posts on the subject:

Some of the payloads utilized by the network propagation component have compilation timestamps from February 2017. The compilation dates on these payloads don’t have any bearing on when the Eternal* exploits were implemented in the network propagation code.


Processing Quote Tweets With Twitter API

I’ve been writing scripts to process Twitter streaming data via the Twitter API. One of those scripts looks for patterns in metadata and associations between accounts, as streaming data arrives. The script processes retweets, and I decided to add functionality to also process quote Tweets. Retweets “echo” the original by embedding a copy of the […]


Super Awesome Fuzzing, Part One

An informative guide on using AFL and libFuzzer. Posted on behalf of Atte Kettunen (Software Security Expert) & Eero Kurimo (Lead Software Engineer) – Security Research and Technologies. The point of security software is to make a system more secure. When developing software, one definitely doesn’t want to introduce new points of failure, or to […]


TrickBot Goes Nordic… Once In A While

We’ve been monitoring the banking trojan TrickBot since its appearance last summer. During the past few months, the malware underwent several internal changes and improvements, such as more generic info-stealing, support for Microsoft Edge, and encryption/randomization techniques to make analysis and detection more difficult. Unlike the very fast expansion of banks targeted during the first […]


OSINT For Fun And Profit: Hung Parliament Edition

The 2017 UK general election just concluded, with the Conservatives gaining the most votes out of all political parties. But they didn’t win enough seats to secure a majority. The result is a hung parliament. Both the Labour and Conservative parties gained voters compared to the previous general election. Some of those wins came from […]


Why Is Somebody Creating An Army Of Twitter Bots?

There’s been some speculation this week regarding Donald Trump’s Twitter account. Why? Because its follower count “dramatically” increased (according to reports) due to a bunch of bots. Since Twitter analytics are my thing at the moment, I decided to do some digging. Sean examined some of Trump’s new followers and found they had something in […]


Now Hiring: Developers, Researchers, Data Scientists

We’re hiring right now, and if you check out our careers page, you’ll find over 30 new positions ranging from marketing (meh) to malware analysis (woot!). A select number of these new positions are in F-Secure Labs. If you’re on the lookout for a job in cyber security, you might find one of these jobs […]


WannaCry, Party Like It’s 2003

Let’s take a moment to collect what we know about WannaCry (W32/WCry) and what we can learn from it. When looked at from a technical perspective, WCry (in its two binary components) has the following properties. Comprised of two Windows binaries. mssecsvc.exe: a worm that handles spreading and drops the payload. tasksche.exe: a ransomware trojan […]


WCry: Knowns And Unknowns

WCry, WannaCry, Wana Decrypt0r. I’m sure at this point you’ve heard something about what the industry has dubbed the largest crypto ransomware outbreak in history. Following its debut yesterday afternoon, a lot of facts have been flying around. Here’s what we know, and don’t know. WCry has currently made a measly $25,000 They now made […]


OSINT For Fun And Profit: #Presidentielle2017 Edition

As I mentioned in a previous post, I’m writing scripts designed to analyze patterns in Twitter streams. One of the goals of my research is to follow Twitter activity around a newsworthy event, such as an election. For example, last weekend France went to the polls to vote for a new president. And so I […]


Unicode Phishing Domains Rediscovered

There is a variant of phishing attack that nowadays is receiving much attention in the security community. It’s called IDN homograph attack and it takes advantage of the fact that many different Unicode characters look alike. The use of Unicode in domain names makes it easier to spoof websites as the visual representation of an […]