Retefe Banking Trojan Targets Both Windows And Mac Users

Based on our telemetry, customers (mainly in Switzerland and Germany) are being targeted by a Retefe banking trojan campaign which uses both Windows- and macOS-based attachments. Its massive spam run started earlier this week and peaked yesterday afternoon (Helsinki time).

Trend Micro did a nice writeup on this threat earlier this week. The new campaign, which started just yesterday, makes some updates to the malware payload.

Instead of storing the installation strings and Onion proxy domain in the binary as plain text, the authors made an effort to hide the interesting strings by XORing them with 0xFF.

Original encrypted…

…and decrypted!
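Undoing this obfuscation is trivial. Here's a minimal Python sketch (the sample string below is illustrative, not taken from the actual binary):

```python
def xor_ff(data):
    """Reverse the sample's string obfuscation: XOR every byte with 0xFF."""
    return bytes(b ^ 0xFF for b in data)

# XOR with a fixed byte is its own inverse, so the same routine
# both obfuscates and deobfuscates.
plain = b".onion"
obfuscated = xor_ff(plain)
assert xor_ff(obfuscated) == plain
```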

The spam message looks like it’s coming from “Mein A1” info@ addresses at various .ch domains, with subject lines such as “Ihre Rechnung #123456-AB123456 vom 13/07/2017”. The mail itself is (signed) by A1 Telekom Austria AG. The mail contains two attachments: a zipped Mach-O application, and a .xlsx or .docx document file. The first attachment targets macOS systems, whereas the latter document file installs the malware on Windows systems.

The mail itself doesn’t give the victim any social engineering cues as to which file to open; moreover, having an Austria-based telecom company sending Swiss International Airlines-related documents is probably more confusing than intriguing.

The text explains that double-clicking opens a larger view of the image – but actually, it runs the malware.

Though the malware mainly targets Switzerland (the .ch TLD), we also found a target configuration for Austrian banks.

List of Austrian-based targets:

‘*’, ‘*’, ‘*’, ‘*’, ‘’, ‘*’, ‘’, ‘*’

List of Swiss-based targets:

‘*’, ‘’, ‘*’, ‘*’, ‘’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘’, ‘*’, ‘*’, ‘’, ‘*’, ‘’, ‘*’, ‘’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’, ‘*’

Though the Retefe banking trojan has previously operated in other European countries, such as Sweden and the UK, we saw no countries other than Switzerland and Austria targeted in this campaign.

As a note of historical interest, here’s a list of UK-based banks that were targeted in June 2016:

‘*’, ‘*’, ‘*’, ‘’, ‘’,’*’, ‘*’, ‘*’, ‘’,’*’,’*’, ‘*’, ‘’, ‘*’, ‘*’,’*’, ‘*’


IOCs:

  • https://www[.]dropbox[.]com/s/azkkyzzo41tk84i/FzF7sEBlz128859.exe?dl=1
  • https://www[.]dropbox[.]com/s/96q0qkrusk5gkp6/HwJoS9VDWh570254.exe?dl=1
  • 6aaoqcl2leiptpvn.onion
  • 2cac780f6de5a8acc3506586c06b1218c33b21b0

How EternalPetya Encrypts Files In User Mode

On Thursday of last week (June 29th, 2017), just after writing about EternalPetya, we discovered that its user-mode file encryption-decryption mechanism is functional, provided a victim could obtain the correct key from the malware’s author. Here’s a description of how that mechanism works.

EternalPetya malware uses the standard Win32 crypto API to encrypt data.

For every fixed drive in the system which is assigned to a drive letter (C:, D: etc) the following is performed:

  • A context is initialized by calling CryptAcquireContext, using the “Microsoft Enhanced RSA and AES Cryptographic Provider”.
  • CryptGenKey is called to generate an AES-128 session key. Afterwards, CryptSetKeyParam is called to set the padding to PKCS5 and the cipher mode to CBC.
  • All files on the drive are enumerated. Files in C:\Windows and subfolders are skipped. The file extension is checked against a fixed list of 65 extensions.
Petya configuration

Petya public key, exclusions list, and file extensions list.

  • If there is a match, the file will be encrypted:
    • The file is opened via the Windows file mapping API.
    • If the file is larger than 1 MB, only the first MB will be encrypted.
    • Call to CryptEncrypt is made to encrypt the selected data.
    • The encrypted file is NOT renamed.
    • Note: there is a “bug” in this function: if the file is larger than 1 MB, the Initialization Vector will not be reset for the next file (i.e. the encryption “continues” there), making decryption more prone to failure.

In order to decrypt the files successfully, the files should be enumerated in the exact same order as during encryption, and with the same “bug” in place.
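To make that concrete, here's a toy illustration of the IV problem. One caveat: this uses a trivial XOR “block cipher” in CBC mode purely for demonstration, not the AES-128 the malware actually uses; the point is only how an unreset IV chains one file's state into the next:

```python
# Toy CBC mode with an XOR "block cipher" (illustration only).
BLOCK = 16

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(blocks, key, iv):
    out, prev = [], iv
    for p in blocks:
        c = xor(xor(p, prev), key)   # "encrypt" = XOR with key
        out.append(c)
        prev = c
    return out, prev                 # prev = leftover chaining state

def cbc_decrypt(blocks, key, iv):
    out, prev = [], iv
    for c in blocks:
        out.append(xor(xor(c, key), prev))
        prev = c
    return out

key = b'K' * BLOCK
iv = b'\x00' * BLOCK
file1 = [b'A' * BLOCK]
file2 = [b'B' * BLOCK]

# Buggy encryptor: the chaining state from file1 leaks into file2.
c1, leaked_iv = cbc_encrypt(file1, key, iv)
c2, _ = cbc_encrypt(file2, key, leaked_iv)

# Naive decryption (fresh IV per file) corrupts file2's first block...
assert cbc_decrypt(c2, key, iv) != file2
# ...but replaying the same order, with the same "bug", recovers it.
assert cbc_decrypt(c2, key, c1[-1]) == file2
```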

  • After all selected files have been encrypted, preparations are made to create the README.TXT contents. Here’s what happens:
    • The malware contains a hardcoded public encryption key in base64 format: “MIIBCgKCAQEAxP/VqKc0yLe9JhVqFMQGwUITO6…..” (see above screenshot). This key is first decoded using CryptDecodeObjectEx. The result is then passed to CryptImportKey to create the corresponding public key to be used in the next step.
    • CryptExportKey is called to export the private (file decryption) key. The output is encrypted using the public key generated in the previous step.
    • Finally, CryptBinaryToString is called to transform this encrypted key into a string representation.
  • Just before rebooting, the README.TXT – containing this encrypted string – is generated and written to the disk’s root folder.

File decryption should be possible, provided that:

  1. We have been provided with a private key to decrypt the file decryption key. (As of writing, the malware authors haven’t published it.)
  2. Files are enumerated in the exact same order as during encryption, i.e. no files were added, moved, or deleted between encryption and decryption phases.
  3. The disk’s MFT (master file table) hasn’t been destroyed by the other malware components.
  4. File encryption was only performed once. As we previously noted, propagation techniques in the malware may end up encrypting files a second time, with a different key. This would make the files absolutely unrecoverable.

Note that the malware does not include this decryption functionality. A separate decryptor tool would need to be provided to victims.


What Good Is A Not For Profit (Eternal) Petya?

Following up on our post from yesterday, as an intellectual thought experiment, let’s take the position that there’s something to the idea of (Eternal) Petya not being motivated by money/profit. Let’s also just go ahead and imagine that it’s been developed by a nation state.

In my mind, it raises the following question: WTF WHY? Why build a tool such as (Eternal) Petya? Or as Andy puts it in this post: if someone wanted to build a “wiper”, why build an almost functional ransomware?

First, having written/edited numerous malware descriptions over the years, I’m a bit pedantic about proper categorization – so let’s be clear, (Eternal) Petya is not a wiper. A wiper is something such as Shamoon. (Eternal) Petya is almost fully functional ransomware, and the question is: what more is it? If this is a prototype, what is it moving towards?

Say you’re developing tools of (cyber) warfare…

How useful is an indiscriminate, scorched-earth tool? Sure, it would have its uses, and it’s probably the first thing that you would develop, but in the end, it’s a pretty blunt tool. Deploying any such tool with clear attribution only escalates the situation. Use it, and you’ve immediately crossed a line. The response is going to be very severe, and will probably be something in kind. Think mutual assured destruction (MAD) severe. A world of nothing but indiscriminate tools/weapons is limited (and very dangerous).

So what you need is a discriminating tool; something more refined. You want/need something that can remediate collateral damage; something that can take you up to a line, but not completely over it. Perhaps what you want is to “weaponize” encryption. That would allow you to disable your adversary while putting you in a position to negotiate your next move.

There are undoubtedly already nations with cyber warfare tools that can cripple critical infrastructure without completely disabling/destroying it. Which is to say, their tools are far more precise and thus they can more easily be deployed without crossing over too many lines.

That makes for asymmetry. And if you’re a nation state trying to quickly close the gap, you might decide to test things in-the-wild. But you wouldn’t just test in-the-clear; you’d need some plausible deniability – and crypto-ransomware is very good deniability. If you want a tool that effectively acts like a wiper, delay remediation – or simply don’t respond. And if your goal is something else, your tool is reversible without you having to (publicly) admit guilt.

End of thought experiment.

And of course, remember, it could just be ransomware in-development.

(Eternal) Petya From A Developer’s Perspective

In our previous post about Petya, we speculated that the short-cuts, design flaws, and non-functional mechanisms observed in the malware could have arisen due to it being developed under a tight deadline. I’d now like to elaborate a little on what we meant by that.

As a recap, this is what the latest version of Petya looks like (to us).

Since the previous post, we’ve determined that the user-mode encryption-decryption functionality is actually working as intended. That’s the mechanism used to encrypt and decrypt files on the system.

However, the MBR encryption-decryption functionality doesn’t work. This is because the “personal installation key” displayed in the MBR ransom screen is just a randomly generated string. This is one of the main reasons why some people are calling Petya a “wiper” (or “WTFware”).

Naturally, this unimplemented functionality has led to much discussion on the Internet. And also here at F-Secure.

We discussed the “illusory” MBR functionality in some detail. Here are a few points that were made.

  1. This is a new piece of crimeware when it comes to the user-mode component, but the MBR code was taken from v3 of Petya, known as “GoldenEye”. The current consensus seems to be that only a guy called “Janus” has the source code of Petya, so if this operation isn’t being run by “Janus”, it’s possible that this new ransomware is just using a hex-edited dump of an infected MBR.
  2. How would a third-party use the Petya binary? First off, the way this dump is used tells us that the author knows exactly where to patch the Petya encryption key and nonce, and also the “User ID”. The author also knows how to edit the MBR banner message, and other minor changes. This involves reverse engineering and taking a very close look at what people have written about Petya over the past year or so.
  3. In the “real” Petya, the “User ID” is a real cryptographic entity that makes it possible to recover the actual disk encryption key, which is discarded in the process of locking up the disk. In this new Petya, the “User ID” is totally random. I mean literally: it’s purposefully generated using Windows crypto APIs as a random string.
  4. Now ask yourself: if you know how to strip off the Petya bootloader, how to hex edit it, where exactly to put all these variables, and how to change the banner, would you think it will work after just putting some random data there? Really? Everybody who follows these things knows that the “User ID” is a complicated piece of very precisely formatted information that includes, for example, an Elliptic-curve public key generated by the infected machine. So, even if someone tripping on mushrooms thinks that generating random data will magically turn into something useful that can restore the encryption key, you need to ask next: would you in that case even once try to reboot and test whether your magical random ID can be of some use? And would you ship it after this revelation?

Here we establish that the author of this malware obviously knows his code isn’t going to work. The author also knows that members of the community have meticulously studied prior versions of Petya, especially the MBR code, which is of interest, since it’s unique to Petya. So the malware author would be aware of the fact that it wouldn’t take reverse engineers all that long to figure out that the encryption-decryption mechanism is bogus.

However, consider the possibility that this malware was developed on a tight deadline, and released ahead of a reasonable schedule. I’m sure there isn’t a single software developer in the world who hasn’t been through that “process”.

But let me put on my “developer hat”. If I were to develop a piece of malware that includes Petya, I’d most likely start by embedding it in my project, perhaps filling all the variables with random data at first – just to make sure the overall embedding concept works, that the Petya MBR code kicks in after reboot, and so on.

After everything seems to work, then I’d move on to the next step, where I’d implement the actual Elliptic-curve stuff that would ultimately replace the placeholder random data.

Let’s note at this point that all the Elliptic-curve-related code is completely absent in this new variant. It’s a complicated piece of code that would probably take over a week to develop and test properly.

So, what if my friend, who’s supposed to “ship” this thing, gets the wrong version, or doesn’t bother to wait for it to be ready, or whatever? He will just package everything up, run it to see that everything looks like it works, and ship it. And off we go.

Putting together a proper build process for software isn’t easy. Tracking changes in different modules, making sure your final package contains the right things, and testing it thoroughly enough to catch discrepancies or wrong versions, is also time-consuming. Many “real” software companies have problems with build processes and versioning. The theory that the final build of Petya contained an old version of one of the components is not at all far-fetched. Neither is the theory that they shipped a “minimum viable product”.

Plenty of other evidence points towards this piece of software being developed in a hurry, and not thoroughly tested. For instance, a machine can re-infect itself and encrypt files twice. There’s also this bug:

And I’m sure we’ll find more bugs. (Whoever wrote this is getting a lot of free QA!)

At the end of the day, if someone wanted to build a “wiper”, why build an almost functional ransomware, save for a few bugs and a possibly misconfigured final package?


My colleague Sean Sullivan wrote a follow-up post to this:

If this attack was aimed purely at Ukraine, given the collateral damage we’ve seen, and the information emerging about the aggressive lateral propagation mechanisms employed by this malware, I’d add that (network propagation) logic to the list of bugs/design flaws present in this malware.


Petya: “I Want To Believe”

There’s been a lot of speculation and conjecture around this “Petya” outbreak. A great deal of it seems to have been fueled by confirmation bias (to us, at least).

Many things about this malware don’t add up (at first glance). But it wouldn’t be the first time that’s happened.

And yet everyone seems to have had answers to a variety of burning questions – within mere moments of this whole thing exploding. It’s either a case of “wisdom of the crowd” (definitely good) or “group think” (definitely bad).

We prefer to avoid being pulled into group think, so we’ve stepped aside, exercised patience, and tried to apply some healthy skepticism to the matter. We realized that our questions could only be answered after a thorough analysis of the available material. So we took the methodical approach.

There’s a large risk of jumping the gun in this particular case, and it’s too important to us to risk that. Yeah, we get it: this is a topic of very high interest, and people have been awaiting our say on the matter. We’ve erred on the side of caution. In the current media landscape, narratives can quickly spin beyond any one person’s or organization’s ability to course-correct. We don’t want to be the ones steering the ship in the wrong direction.

Needless to say, our analysts have been working long, hard hours on this case over the past few days, and ordering in so they can keep working.

Lunch pizza

Pizza: the lunch of champions!

So, taking the default position of “it does what it says on the tin”, what evidence would convince us to change our minds? (We’re fans of the scientific method.) We didn’t really know what evidence was needed until we started looking. And so, over the past 48 hours, we’ve bombarded our colleagues with questions (many of which we’ve seen others asking). So, without further ado, let’s start.

“Can it even be ransomware if it doesn’t have a good payment pipeline?”

Lots of ransomware uses email. There are only two ways to communicate with your customers/victims: use email, or create a service portal. Each has its pros and cons. Starting with email doesn’t mean you’ve ruled out creating a portal later on, because in a case such as this, if you build it, they will come (if everything’s working properly).

So, is everything working properly?


Aha! So that’s evidence of it not being ransomware, right?


But why?

Malfunctioning malware isn’t rare. It’s possibly evidence of nothing more than a bug in the code, a design flaw, or issues with supporting infrastructure. It’s typically not enough evidence for us to attribute anything in particular.

So what doesn’t work?

Decryption of files is not possible.


For many reasons. We’ll get into that below.

Many reasons? So there’s lots of bugs? Isn’t that evidence that it’s not real ransomware?

To be honest, who knows. It’s evidence of a mess, and we’re still working to untangle all the knots. It’s time-consuming.

This line of questioning is getting us nowhere. Let’s move on.

Nation state malware is advanced and sophisticated, right? Is this sophisticated?

Yes and no. As you might have guessed from above, part of it most certainly isn’t sophisticated. But… part of it is. We’ve identified three main components. Two of them are pretty shoddy and seem kinda cobbled together. But the third component, the bit that allows the malware to spread laterally across networks, seems very sophisticated and well-tested.

So it’s a paradox then?

Kinda. You can probably see why we’re trying not to rush to any judgements (we hope!). For the sake of this post, let’s call these three components the “user mode component”, the “network propagation component”, and the “MBR component”. Here’s a diagram.

Bad handwriting by yours truly (Andy).

So what’s the deal with this “sophisticated” part?

Aha! So here’s the interesting bit. It appears to be well designed and well tested, and there’s evidence that the network propagation component was probably already in development in February. ²

Update: see below ².

“We won’t be able to determine the timestamp for the use of NSA tools since it’s part of the main DLL code which has the June timestamp.” ²

“Also, in this particular Petya sample, the shellcode is in a way coupled with the exploits. That is, they didn’t simply plug the shellcode in without properly testing it with their version of the SMB exploit.” ²

What’s interesting about that?

February is many weeks before the exploits EternalBlue and EternalRomance (both of which this module utilizes) were released to the public (in April) by the Shadow Brokers. And those exploits fit this component like a glove.

Note: this isn’t rock solid evidence, but it’s far more compelling to us than any of the other reasoning we’ve seen so far.

How does it compare to WannaCry (which also used these exploits)?

WannaCry clearly picked these exploits up after the Shadow Brokers dumped them into the public domain in April. Also, WannaCry didn’t do the best job at implementing these exploits correctly.

By comparison, this “Petya” looks well-implemented, and seems to have seen plenty of testing. It’s fully-baked.

So, if the network propagation component is fully-baked, why aren’t the other two?

Here’s our theory. WannaCry, again.

WannaCry burst onto the scene in May, and started trashing up the joint, causing everyone to scramble to patch SMB vulnerabilities. Microsoft even patched XP! The result was a sudden drop in the effectiveness of carefully crafted network propagation components (such as the one we’re talking about here). Whatever project these guys were working on suddenly got its deadline adjusted. And hence everything else was done in a bit of a hurry.

Do you have anything else to add to your timeline?

Kaspersky Lab pointed to a Ukraine-based watering hole, and it turns out the MBR component was actually pushed out via that site in late May, post-WannaCry. We feel this might also be consistent with an “adjusted” deadline.

So, can you sum this up for us?

  • January 2017. The Shadow Brokers advertised an “auction” which revealed the names of all the exploits they had for sale.
    • The NSA, upon noticing their exploits being advertised, hurriedly contacted Microsoft (reportedly with hat in hand), and owned up to their shenanigans.
  • February 2017. Microsoft then had a very busy patch cycle and actually missed Patch Tuesday that month.
    • Meanwhile, “friends of the Shadow Brokers” were busy finishing up development of a rather nifty network propagation component, which ended up utilizing these exploits. ²
  • March 2017. Microsoft rolls out patches. Many fixes (to NSA-exploited vulnerabilities) made.
  • April 2017.  The Shadow Brokers dump a whole bunch of exploits into the public domain.
    • Somebody with possible connections to North Korea notices.
  • May 2017. WannaCry, ’nuff said.
    • Either the “friends of the Shadow Brokers” had something they felt they needed to get done, and their deadline was stepped up because of WannaCry…
    • Or they figured they could “join the party” as yet another ransomware, as long as they capitalized on it within a reasonable amount of time.
    • The MBR component of this malware was alpha tested using a watering hole attack.
  • June 2017. You are here.

Are you still skeptics?


But are you still skeptical about this malware being “nation state”?

Less and less so. We don’t think any current attribution is rock solid (attribution never really is). We feel this is definitely worth deeper investigation. And more pizza.

We’ve changed our minds on some of our earlier conclusions. Please note this if you’re reading any previous F-Secure analysis. And, of course, this is subject to further revision, as new facts come to light.

What other thoughts would you like to share?

As we mentioned earlier, two of the components in this malware are quite shoddy. Here are some interesting/confusing things we found.

The generated “personal installation key” displayed on the MBR version of the ransom page is 60 bytes of randomly generated data. This wouldn’t be a problem if it were sent upstream to the attacker, as a customer ID, but it isn’t (there’s no C&C functionality at all). It’s basically a placeholder that makes the ransomware look legit.
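In other words, the displayed “key” is indistinguishable from something like the following hypothetical sketch (the alphabet here is our own guess, purely for illustration; only the 60-character length comes from the sample):

```python
import os
import string

# Hypothetical sketch: 60 characters of random data dressed up as a
# "personal installation key". Nothing here encodes any secret.
ALPHABET = string.ascii_uppercase + string.digits
fake_key = ''.join(ALPHABET[b % len(ALPHABET)] for b in os.urandom(60))
assert len(fake_key) == 60
```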

Why the authors of this malware failed to add proper decryption functionality to the MBR lock screen remains an open question. Was it intentionally left out, did they make a huge mistake, or did they run out of time?

One of our analysts noted that implementation of the Elliptic curve Diffie–Hellman functionality necessary to enable proper encryption-decryption services in the MBR portion of the malware would take upwards of a week. If the developers were in a hurry, this could be one of the reasons why they opted for the “illusory” functionality we’re seeing.

This malware encrypts files on the user’s system and then, if it can elevate to admin, rewrites the MBR and reboots into a lock screen. Why encrypt files on the machine if you’re going to ultimately render the whole machine unusable? The user-mode encryption step is actually a fallback mechanism for cases where the malware can’t attain the admin rights it needs to modify the MBR and execute that phase. Essentially, it’s a way of increasing the author’s chances of receiving a payment.

In cases where the malware fails to elevate, and only encrypts files in user-mode, a ransom note is left for the victim. This ransom note contains a different, much longer key than the one seen in the MBR lock screen. We’re currently looking into whether that key is generated in a way that might allow decryption to happen¹.

Petya user mode ransom note.

You can get infected multiple times.

This malware also does other stuff that indicates poor testing practices. For instance, a machine infected with this malware can re-infect itself via one of its own propagation mechanisms. In this case, user-mode encryption will run a second time (most likely with elevated privileges), making decryption impossible.

It has a vendetta against Kaspersky Lab.

If this malware finds running Kaspersky processes on the system, it writes junk to the first 10 sectors of the disk, and then reboots, bricking the machine completely.

One final thing.

We know of victims who don’t use M.E.Doc and have no obvious connections to Ukraine. Yet they were infected during Tuesday’s outbreak. This mystery is one of the factors that have kept us from jumping on the conspiracy train. And we still don’t have answers here.


¹ Edited on Thursday

We’ve confirmed that the user-mode encryption-decryption logic is functional and does work. Details here.

² Edited on Friday

See also our latest posts on the subject:

Some of the payloads utilized by the network propagation component have compilation timestamps from February 2017. The compilation dates on these payloads don’t have any bearing on when the Eternal* exploits were implemented in the network propagation code.


Processing Quote Tweets With Twitter API

I’ve been writing scripts to process Twitter streaming data via the Twitter API. One of those scripts looks for patterns in metadata and associations between accounts, as streaming data arrives. The script processes retweets, and I decided to add functionality to also process quote Tweets.

Retweets “echo” the original by embedding a copy of the Tweet in a field called retweeted_status:


Twitter’s API reference entry for retweeted_status

According to Twitter’s own API documentation, a quote Tweet should work in a similar way. (A quote Tweet is like wrapping your tweet around somebody else’s.) A Tweet object containing the quoted Tweet should be available in the quoted_status field.


Twitter’s API reference entry for quoted_status

I wrote some code to fetch and process quoted_status in a similar way to how I was already processing retweeted_status, but it didn’t work. I “asked” Google for answers, but didn’t really find anything, so I decided to dig into what the API was actually returning in the quoted_status field.

It turns out it’s not a Tweet object. Here’s what a quoted_status field actually looks like:

{u'contributors': None, 
 u'truncated': False, 
 u'text': u'', 
 u'is_quote_status': False, 
 u'in_reply_to_status_id': None, 
 u'id': 0, 
 u'favorite_count': 0, 
 u'source': u'<a href="" rel="nofollow">Twitter Web Client</a>', 
 u'retweeted': False, 
 u'coordinates': None, 
 u'entities': {u'user_mentions': [], 
               u'symbols': [], 
               u'hashtags': [], 
               u'urls': []}, 
 u'in_reply_to_screen_name': None, 
 u'id_str': u'', 
 u'retweet_count': 0, 
 u'in_reply_to_user_id': None, 
 u'favorited': False, 
 u'user': {u'follow_request_sent': None, 
           u'profile_use_background_image': True, 
           u'default_profile_image': False, 
           u'id': 0, 
           u'verified': True, 
           u'profile_image_url_https': u'', 
           u'profile_sidebar_fill_color': u'FFFFFF', 
           u'profile_text_color': u'FFFFFF', 
           u'followers_count': 0, 
           u'profile_sidebar_border_color': u'FFFFFF', 
           u'id_str': u'0', 
           u'profile_background_color': u'FFFFFF', 
           u'listed_count': 0, 
           u'profile_background_image_url_https': u'', 
           u'utc_offset': -18000, 
           u'statuses_count': 0, 
           u'description': u"", 
           u'friends_count': 0, 
           u'location': None, 
           u'profile_link_color': u'FFFFFF', 
           u'profile_image_url': u'', 
           u'following': None, 
           u'geo_enabled': True, 
           u'profile_banner_url': u'', 
           u'profile_background_image_url': u'', 
           u'name': u'', 
           u'lang': u'en', 
           u'profile_background_tile': False, 
           u'favourites_count': 0, 
           u'screen_name': u'', 
           u'notifications': None, 
           u'url': None, 
           u'created_at': u'Fri Nov 27 23:14:06 +0000 2009', 
           u'contributors_enabled': False, 
           u'time_zone': u'', 
           u'protected': False, 
           u'default_profile': True, 
           u'is_translator': False}, 
 u'geo': None, 
 u'in_reply_to_user_id_str': None, 
 u'lang': u'en', 
 u'created_at': u'Thu Jun 22 00:33:13 +0000 2017', 
 u'filter_level': u'low', 
 u'in_reply_to_status_id_str': None, 
 u'place': None}

So, it’s a data structure that contains some of the information you might find in a Tweet object, but it’s not an actual Tweet object. This kinda makes sense if you think about it: a quote Tweet can quote other quote Tweets, which can quote other quote Tweets. (Some folks created rather long quote Tweet chains when the feature was first introduced.) So, if the API returned a fully-hydrated Tweet object for a quoted Tweet, that object could contain another Tweet object in its own quoted_status field, and so on, and so on.
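Since quoted_status arrives as a plain dictionary rather than a Tweet object, dictionary-style access is the safe way to dig values out of it. Here's a small self-contained sketch (the helper name is our own):

```python
def quoted_screen_name(status_json):
    """Safely extract the quoted user's screen_name from raw status JSON.
    quoted_status is a plain dict, not a Tweet object, so attribute
    access won't work; chained .get() calls handle missing fields."""
    quoted = status_json.get('quoted_status') or {}
    user = quoted.get('user') or {}
    return user.get('screen_name')

assert quoted_screen_name({'quoted_status': {'user': {'screen_name': 'alice'}}}) == 'alice'
assert quoted_screen_name({'text': 'no quote here'}) is None
```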

Here’s a small piece of python code that looks for retweets and quote Tweets in a stream and retrieves the screen_name of the user who published the original Tweet, if it finds one. It illustrates the differences between handling retweets and quote Tweets.

from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy import API

consumer_key="add your own key here"
consumer_secret="add your own secret here"
access_token="add your own token here"
access_token_secret="add your own secret here"

class StdOutListener(StreamListener):
    def on_status(self, status):
        screen_name = status.user.screen_name

        if hasattr(status, 'retweeted_status'):
            retweet = status.retweeted_status
            if hasattr(retweet, 'user'):
                if retweet.user is not None:
                    if hasattr(retweet.user, "screen_name"):
                        if retweet.user.screen_name is not None:
                            retweet_screen_name = retweet.user.screen_name
                            print screen_name + " retweeted " + retweet_screen_name

        if hasattr(status, 'quoted_status'):
            quote_tweet = status.quoted_status
            if 'user' in quote_tweet:
                if quote_tweet['user'] is not None:
                    if "screen_name" in quote_tweet['user']:
                        if quote_tweet['user']['screen_name'] is not None:
                            quote_tweet_screen_name = quote_tweet['user']['screen_name']
                            print screen_name + " quote tweeted " + quote_tweet_screen_name
        return True

    def on_error(self, status):
        print status

if __name__ == '__main__':
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    auth_api = API(auth)
    print "Signing in as: " + auth_api.me().name
    print "Preparing stream"

    stream = Stream(auth, l, timeout=30.0)
    searches = ['donald', 'trump', ]
    while True:
        if 'searches' in locals():
            print "Filtering on: " + str(searches)
            stream.filter(track=searches)
        else:
            print "Getting 1% sample"
            stream.sample()

Super Awesome Fuzzing, Part One

An informative guide on using AFL and libFuzzer.

Posted on behalf of Atte Kettunen (Software Security Expert) & Eero Kurimo (Lead Software Engineer) – Security Research and Technologies.

The point of security software is to make a system more secure. When developing software, one definitely doesn’t want to introduce new points of failure, or to increase the attack surface of the system the software is running on. So naturally, we take secure coding practices and software quality seriously. One good example of how we strive to improve software quality and security at F-Secure is our Vulnerability Reward Program that’s been running for almost two years now. And it’s still running, so do participate if you have a chance! Earlier this year, we posted an article detailing what we learned during the first year. It goes without saying that we have many processes in-house to catch potential bugs and vulnerabilities in our software. In this article, we’d like to explain one of the many processes we use in-house to find vulnerabilities before they reach our customers, and our dear bug bounty hunters.

One method for bug hunting that has proven to be very effective is a technique called fuzzing, where the target program is injected with unexpected or malformed data in order to reveal input handling errors that lead to, for example, an exploitable memory corruption. To create fuzz test cases, a typical fuzzer will either mutate existing sample inputs, or generate test cases based on a defined grammar or ruleset. An even more effective way of fuzzing is coverage guided fuzzing, where program execution paths are used to guide the generation of more effective input data for test cases. Coverage guided fuzzing tries to maximize the code coverage of a program, such that every code branch present in the program is tested. With the emergence of open source coverage guided fuzzing tools such as American Fuzzy Lop (AFL), LLVM libFuzzer, and HonggFuzz, using coverage guided fuzzing has never been easier or more approachable. You no longer need to master arcane arts, spend countless hours writing test case generator rules, or collect input samples that cover all functionality of the target. In the simplest cases you can just compile your existing tool with a different compiler, or isolate the functionality you want to fuzz, write just a few lines of code, and then compile and run the fuzzer. The fuzzer will execute thousands or even tens of thousands of test cases per second, and collect a set of interesting results from triggered behaviors in the target.
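
To make the mutation approach concrete, here is a minimal sketch in Python of the mutate-and-run loop that a naive (non-coverage-guided) fuzzer performs. The `target` callable is a stand-in for whatever parser you are testing; this is an illustration of the idea, not how AFL or libFuzzer are implemented.

```python
import random

def mutate(sample):
    """Return a mutated copy of the sample: flip a bit, insert or delete a byte."""
    data = bytearray(sample)
    op = random.choice(("flip", "insert", "delete"))
    if op == "flip" and data:
        i = random.randrange(len(data))
        data[i] ^= 1 << random.randrange(8)   # flip one random bit
    elif op == "insert":
        data.insert(random.randrange(len(data) + 1), random.randrange(256))
    elif op == "delete" and data:
        del data[random.randrange(len(data))]
    return bytes(data)

def fuzz(target, seed, iterations=1000):
    """Run mutated inputs through the target, collecting inputs that crash it."""
    crashes = []
    for _ in range(iterations):
        case = mutate(seed)
        try:
            target(case)
        except Exception:
            crashes.append(case)
    return crashes
```

A coverage guided fuzzer extends this loop by keeping any mutated case that exercises new code paths and mutating those cases further, instead of always starting from the same seed.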

If you want to get started with coverage guided fuzzing yourself, here are a couple of examples showing how you could fuzz libxml2, a widely used XML parsing toolkit library, with the two fuzzers we prefer in-house: AFL and LLVM libFuzzer.

Fuzzing with AFL

Using AFL on a real world example is straightforward. On Ubuntu 16.04 Linux, you can start fuzzing libxml2 via its xmllint utility with just seven commands.

First we install AFL and get the source code of libxml2-utils.

$ apt-get install -y afl
$ apt-get source libxml2-utils

Next we configure libxml2 build to use AFL compilers and compile the xmllint utility.

$ cd libxml2/
$ ./configure CC=afl-gcc CXX=afl-g++
$ make xmllint

Lastly, we create an input directory with a sample file containing “<a></a>” for AFL to start from, and run afl-fuzz.

$ mkdir in && echo "<a></a>" > in/sample
$ LD_LIBRARY_PATH=./.libs/ afl-fuzz -i ./in -o ./out -- ./.libs/lt-xmllint -o /dev/null @@

AFL will continue fuzzing indefinitely, writing inputs that trigger new code coverage to ./out/queue/, crash-triggering inputs to ./out/crashes/, and inputs causing hangs to ./out/hangs/. For more information on how to interpret AFL’s status screen, see:

Fuzzing with LLVM libFuzzer

Let’s now fuzz libxml2 with the LLVM libFuzzer. To start fuzzing, you’ll first need to introduce a target function, LLVMFuzzerTestOneInput, that receives the fuzzed input buffer from libFuzzer. The code looks like this.

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DoSomethingInterestingWithMyAPI(Data, Size);
  return 0;  // Non-zero return values are reserved for future use.
}

For fuzzing libxml2, Google’s fuzzer test suite provides a good example fuzzing function.

// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
#include "libxml/xmlversion.h"
#include "libxml/parser.h"
#include "libxml/HTMLparser.h"
#include "libxml/tree.h"

void ignore (void * ctx, const char * msg, ...) {}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  xmlSetGenericErrorFunc(NULL, &ignore);
  if (auto doc = xmlReadMemory(reinterpret_cast<const char *>(data), size, "noname.xml", NULL, 0)) {
    xmlFreeDoc(doc);
  }
  return 0;
}

Before compiling our target function, we need to compile all dependencies with clang and -fsanitize-coverage=trace-pc-guard, to enable SanitizerCoverage coverage tracing. It is a good idea to also use -fsanitize=address,undefined in order to enable both the AddressSanitizer(ASAN) and the UndefinedBehaviorSanitizer(UBSAN) that catch many bugs that otherwise might be hard to find.

$ git clone libxml2
$ cd libxml2
$ FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -g -fsanitize=address,undefined -fsanitize-coverage=trace-pc-guard"
$ ./
$ CXX="clang++-5.0 $FUZZ_CXXFLAGS" CC="clang-5.0 $FUZZ_CXXFLAGS" CCLD="clang++-5.0 $FUZZ_CXXFLAGS" ./configure
$ make

As of writing this post, libFuzzer is not shipped with the precompiled clang-5.0 packages, so you’ll still need to check out and compile libFuzzer.a yourself, but this might change in the near future.

The second step is to compile our target function, with the same flags, and link it with both the libFuzzer runtime and the libxml2 we compiled earlier.

$ clang++-5.0 -std=c++11 $FUZZ_CXXFLAGS -lFuzzer ./ -I ./include ./.libs/libxml2.a -lz -llzma -o libxml-fuzzer

Now we are ready to run our fuzzer.

$ mkdir ./output
$ ./libxml-fuzzer ./output/

We didn’t use any sample inputs, so libFuzzer starts by generating random data in order to find inputs that trigger new code paths in our libxml2 target function. All inputs that trigger new coverage are stored as sample files in ./output. As libFuzzer runs in-process, if a bug is found, it saves the test case and exits. On a high-end laptop, a single instance of libFuzzer reached over 5000 executions per second, slowing down to around 2000 once it started to generate test cases with more coverage. For more information on how to interpret the output see:

Creating a corpus

If your target is fast, meaning hundreds or even thousands of executions per second, you can try generating a base corpus out of thin air. With coverage guided fuzzing it is possible to do this even with more complex formats like the AFL author Michał Zalewski did with JPEG-files, but to save time, you should get a good representation of typical files for the application that are as small as possible. The smaller the files, the faster they are to fuzz.

AFL does not offer any additional flags to tinker with when generating a corpus out of thin air. Just give it a small sample input, for example “<a></a>” as an XML sample, and run AFL as you normally would.

With libFuzzer you have more flags to experiment with. For example, for XML you might want to try with ‘-only_ascii=1‘. One good technique for most formats is to execute multiple short runs while incrementing the maximum sample size of our fuzzer on each round and then merge all the results to form the output corpus.

$ for foo in 4 8 16 32 64 128 256 512; do \
    ./libxml-fuzzer -max_len=$foo -runs=500000 ./temp-corpus-dir; \
  done
$ ./libxml-fuzzer -merge=1 ./corpus ./temp-corpus-dir

With this approach, we first collect interesting inputs with a maximum length of 4 bytes; the second run analyses the 4-byte inputs and uses those as a base for 8-byte inputs, and so on. This way we discover “easy” coverage with smaller, faster inputs, and when we move to larger files we have a better initial set to start with.

To get some numbers for this technique, we did three runs with the example script.

On average, running the corpus generation script took about 18 minutes on our laptop. LibFuzzer was still frequently discovering new coverage at the end of iterations where -max_len was larger than 8 bytes, which suggests that, for those lengths, libFuzzer should be allowed to run longer.

For comparison, we also took the libFuzzer with default settings and ran it for three rounds, which took about 18 minutes.

$ ./libxml-fuzzer -max_total_time=1080 ./temp-corpus-dir
$ ./libxml-fuzzer -merge=1 ./corpus ./temp-corpus-dir;

From these results we see that our runs with the corpus generation script on average executed more test cases, and generated a larger set of files that triggers more coverage and features than the set generated with the default values. This is due to the size of test cases generated by libFuzzer using default settings. Previously libFuzzer used a default -max_len of 64 bytes, but at the time of writing libFuzzer was just updated to have a default -max_len of 4096 bytes. In practice, sample sets generated by this script have been very good starting points for fuzzing, but we haven’t collected data on how the results differ from the default settings in long continuous fuzzing.

Corpus generation out of thin air is an impressive feat, but if we compare these results to the coverage from the W3C XML test suite, we see that it is a good idea to also include sample files from different sources in your initial corpus, as you’ll get much better coverage before you’ve even fuzzed the target.

$ wget -O - | tar -xz
$ ./libxml-fuzzer -merge=1 ./samples ./xmlconf
$ ./libxml-fuzzer -runs=0 ./samples
  #950        DONE   cov: 18067 ft: 74369 corp: 934/2283Kb exec/s: 950 rss: 215Mb

Merging our generated corpus into the W3C test suite increased the block coverage to 18727, so not that much, but we still got a total of 83972 features, increasing the total throughput of these test cases. Both improvements are most probably due to small samples triggering error conditions that were not covered by the W3C test suite.

Trimming your corpus

After fuzzing the target for a while, you’ll end up with a huge set of fuzzed files. A lot of these files are unnecessary, and trimming them to a much smaller set will provide you with the same code coverage of the target. To achieve this, both projects provide corpus minimization tools.
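
The idea behind these minimization tools can be thought of as a greedy set cover over the coverage each sample triggers: keep a sample only if it exercises something the already-kept samples don’t. A toy Python sketch of that idea, where the `coverage` callback is an assumption standing in for real instrumentation:

```python
def minimize_corpus(samples, coverage):
    """Greedy corpus minimization: keep only samples that add new coverage.

    `samples` is an iterable of inputs; `coverage` maps an input to the set
    of coverage points (e.g. branch IDs) it triggers in the target.
    """
    # Process samples with the most coverage first, so fewer files are kept.
    ordered = sorted(samples, key=lambda s: len(coverage(s)), reverse=True)
    seen, kept = set(), []
    for sample in ordered:
        cov = coverage(sample)
        if not cov <= seen:        # this sample triggers something new
            kept.append(sample)
            seen |= cov
    return kept
```

The real tools work on the same principle, except that the coverage sets come from running the instrumented target on each file.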

AFL gives you the afl-cmin shell script that you can use to minimize your corpus. Continuing the previous example, to minimize the corpus generated in the ./out directory, you can generate a minimized set of files into the ./output_corpus directory.

$ afl-cmin -i ./out/queue -o ./output_corpus -- ./.libs/lt-xmllint -o /dev/null @@

AFL also offers another tool, afl-tmin, that can be used to minimize individual files while maintaining the same coverage as observed initially. Be aware that running afl-tmin on a large set of files can take a very long time, so first do a couple of iterations with afl-cmin before trying afl-tmin.

LibFuzzer doesn’t have an external trimming tool – its corpus minimization feature, called merge, is built in.

$ ./libxml-fuzzer -merge=1 <output directory> <input directory 1> <input directory 2> ... <input directory n>

LibFuzzer merge is a little easier to use, since it looks for files recursively from any number of input directories. Another nice feature of libFuzzer merge is the -max_len flag. Using -max_len=X, libFuzzer will only use the first X bytes from each sample file, so you can collect random samples without worrying about their sizes. Without the -max_len flag, libFuzzer uses a default maximum length of 1048576 bytes when doing a merge.

With libFuzzer merge, you can use the same technique as you did to generate a corpus out of thin air.

$ for foo in 4 8 16 32 64 128 256 512 1024; do \
    mkdir ./corpus-max_len-$foo; \
    ./libxml-fuzzer -merge=1 -max_len=$foo ./corpus-max_len-$foo ./corpus-max_len-* <input-directories>; \
  done
$ mkdir ./output_corpus
$ ./libxml-fuzzer -merge=1 ./output_corpus ./corpus-max_len-*

With this trimming strategy, libFuzzer will first collect new coverage triggering 4-byte chunks from each input sample, then merge those samples into 8-byte chunks, and so on, until you have an optimized set built from all the different length chunks.

A simple merge won’t always help you with performance issues. Sometimes your fuzzer can stumble upon very slow code paths, causing collected samples to start degrading your fuzzing throughput. If you don’t mind sacrificing a few code blocks for performance, libFuzzer can easily be used to remove overly slow samples from your corpus. When libFuzzer is run with a list of files as an argument instead of a folder, it will execute every file individually and print out the execution time for each file.

$ ./libxml-fuzzer ./corpus-dir/*
INFO: Seed: 3825257193
INFO: Loaded 1 modules (237370 guards): [0x13b3460, 0x149b148), 
./libxml2/libxml-fuzzer: Running 1098 inputs 1 time(s) each.
Running: ./corpus-dir/002ade626996b33f24cb808f9a948919799a45da
Executed ./corpus-dir/002ade626996b33f24cb808f9a948919799a45da in 1 ms
Running: ./corpus-dir/0068e3beeeaecd7917793a4de2251ffb978ef133
Executed ./corpus-dir/0068e3beeeaecd7917793a4de2251ffb978ef133 in 0 ms

With a snippet of awk, this feature can be used to print out the names of files that took too long to run, in our example over 100 milliseconds, and then we can simply remove those files.

$ ./libxml-fuzzer ./corpus-dir/* 2>&1 | awk '$1 == "Executed" && $4 > 100 {print $2}' | xargs -r -I '{}' rm '{}'

Running both fuzzers in parallel

Now that you have a good base corpus, and you know how to maintain it, you can kick off some continuous fuzzing runs. You could run your favorite fuzzer alone, or run both fuzzers separately, but if you’ve got enough hardware available you can also easily run multiple fuzzers simultaneously on the same corpus. That way you get to combine the best of both worlds, while the fuzzers share all the new coverage they find.

It’s easy to implement a simple script that will run both fuzzers simultaneously, while restarting the fuzzers every hour to refresh their sample corpus.

$ mkdir ./libfuzzer-output; echo "" > ./libfuzzer-output/1
$ while true; do \
    afl-fuzz -d -i ./libfuzzer-output/ -o ./afl-output/ -- ./libxml/afl-output/bin/xmllint -o /dev/null @@ 1>/dev/null & \
    ./libxml/libxml-fuzzer -max_total_time=3600 ./libfuzzer-output/; \
    pkill -15 afl-fuzz; \
    sleep 1; \
    mkdir ./libfuzzer-merge; \
    ./libxml/libxml-fuzzer -merge=1 ./libfuzzer-merge ./libfuzzer-output/ ./afl-output/; \
    rm -rf ./afl-output ./libfuzzer-output; \
    mv ./libfuzzer-merge ./libfuzzer-output; \
  done
Because the example script only runs one hour per iteration, AFL is used in “quick & dirty mode” to skip all the deterministic steps. Even one large file can cause AFL to spend hours, or even days, on deterministic steps, so it’s more reliable to run AFL without them when on a time budget. Deterministic steps can be run manually, or automatically on another instance that copies new samples to ‘./libfuzzer-output‘.


Dictionaries

You have your corpus, and you’re happily fuzzing and trimming. Where do you go from here?

Both AFL and libFuzzer support user-provided dictionaries. These dictionaries should contain keywords, or other interesting byte patterns, that would be hard for the fuzzer to determine. For some useful examples, take a look at Google libFuzzer’s XML dictionary and this AFL blog post about dictionaries.
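
As a sketch of what such a dictionary file looks like, here is a small hand-written XML dictionary in the keyword="value" format that both AFL (via -x) and libFuzzer (via -dict) accept. The entries below are just illustrative:

```
# Small illustrative XML dictionary
tag_open="<a>"
tag_close="</a>"
header="<?xml version=\"1.0\"?>"
cdata_start="<![CDATA["
cdata_end="]]>"
entity_lt="&lt;"
```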

Since these tools are quite popular nowadays, some good base dictionaries can already be found online. For example, Google has collected quite a few dictionaries. Also, the AFL source code contains a few example dictionaries; if you don’t have the source code, you can check out the AFL mirror on GitHub.

Both AFL and libFuzzer also collect dictionary entries during execution. AFL collects entries when performing deterministic fuzzing steps, while libFuzzer’s approach is to instrument comparisons in the target.

When running libFuzzer with time or test case limit, libFuzzer will output a recommended dictionary upon exit. This feature can be used to collect interesting dictionary entries, but it is recommended to do manual sanity checks over all automatically collected entries. libFuzzer builds those dictionary entries as it discovers new coverage, so those entries often build up towards the final keyword.
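
To illustrate, such a recommended-dictionary dump typically contains partial prefixes alongside the eventual full token, something like the following (hypothetical entries for an XML target):

```
"<!["
"<![C"
"<![CDA"
"<![CDATA["
```

Keeping only the final, complete keyword and discarding the intermediate fragments is the kind of manual sanity check meant above.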


We tested dictionaries with three 10-minute runs: without a dictionary, with the recommended dictionary from the first run, and with Google’s libFuzzer XML dictionary. Results can be seen in the table below.

Surprisingly, there was no significant difference between the results from the run without a dictionary and the run with the recommended dictionary from the first run, but with a “real” dictionary there was a dramatic change in the amount of coverage discovered during the run.

Dictionaries can really change the effectiveness of fuzzing, at least on short runs, so they are worth the investment. Shortcuts, like the libFuzzer recommended dictionary, can help, but you still need to put in some manual effort to leverage the full potential of dictionaries.

Fuzzing experiment

Our goal was to do a weekend-long run on a couple of laptops. We ran two instances each of AFL and libFuzzer, fuzzing the above example. The first instance of each was started without any corpus, and the second with the trimmed corpus from the W3C XML Test Suite. The results could then be compared by performing a dry run with the minimized corpus from all four sets. Results from these fuzzers are not directly comparable, since both fuzzers use different instrumentation to detect executed code paths and features. libFuzzer measures two things for assessing new sample coverage: block coverage, that is, isolated blocks of code visited, and feature coverage, a combination of different code path features such as transitions between code blocks and hit counts. AFL doesn’t offer a direct count of observed coverage, so we use the overall coverage map density in our comparisons. The map density indicates how many branch tuples we have hit, in proportion to how many tuples the coverage map can hold.

Our first run didn’t go quite as expected. After 2 days and 7 hours, we were reminded of the downsides of using deterministic fuzzing on large files. Our afl-cmin minimized corpus contained a couple of over-100kB samples that caused AFL to slow to a crawl after processing just under 38% of the first round. It would have taken days for AFL to get through a single file, and we had four of those in our sample set, so we decided to restart the instances after removing all samples over 10kB. Sadly, on Sunday night at 11PM, “backup first” wasn’t the first thing on our minds, and the AFL plot data was accidentally overwritten, so no cool plots from the first round. We did manage to save the AFL UI before aborting.

The full results of our 2-day fuzzing campaign can be found in the image/table below.

We had actually never pitted these fuzzers against each other before. Both fuzzers were surprisingly even in our experiment. Starting from the W3C samples, the difference in discovered coverage, as measured by libFuzzer, was only 1.4%. Both fuzzers also found pretty much the same coverage. When we merged all the collected files from the four runs, along with the original W3C samples, the combined coverage was only 1.5% higher than the coverage discovered by libFuzzer alone. Another notable thing is that without initial samples, even after 2 days, neither libFuzzer nor AFL had discovered more coverage than our earlier corpus-generation-out-of-thin-air demonstration repeatedly achieved in 10 minutes.

We also generated a chart of coverage discovery during the libFuzzer fuzzing run with the W3C samples.

Which one should I use?

As we detailed, AFL is really simple to use, and can be started with virtually no setup. AFL takes care of handling found crashes and stuff like that. However, if you don’t have a ready-made command line tool like xmllint, and would need to write some code to enable fuzzing, it often makes sense to use libFuzzer for its superior performance.

In comparison to AFL, libFuzzer has built-in support for sanitizers, such as AddressSanitizer and UndefinedBehaviorSanitizer, which help in finding subtle bugs during fuzzing. AFL has some support for sanitizers, but depending on your target there might be some serious side effects. The AFL documentation suggests running fuzzing without sanitizers and then running the output queue separately against a sanitizer-enabled build, but there is no actual data available to determine whether that technique can catch the same issues as ASAN-enabled fuzzing. For more info about AFL and ASAN, check docs/notes_for_asan.txt in the AFL sources.

In many cases, however, it makes sense to run both fuzzers, as their fuzzing, crash detection, and coverage strategies are slightly different.

If you end up using libFuzzer, you really should check out Google’s great libFuzzer tutorial.

Happy fuzzing!
Atte & Eero

TrickBot Goes Nordic… Once In A While

We’ve been monitoring the banking trojan TrickBot since its appearance last summer.

During the past few months, the malware underwent several internal changes and improvements, such as more generic info-stealing, support for Microsoft Edge, and encryption/randomization techniques to make analysis and detection more difficult. After the very fast expansion of targeted banks during the first few months of activity, the number of targets remained rather constant… until two weeks ago.

Initially, we saw PayPal appearing in the configuration, the first and, so far, only financial transaction website targeted by TrickBot that is not a traditional bank. A surprising development, but apparently just a little taste of what was coming next. Last Wednesday, we observed a change in the list of targeted banks, probably the largest expansion in TrickBot’s history thus far.

Those familiar with TrickBot will know that the trojan features two different MitB injection techniques, similar to those seen in the Dyre trojan: “Static Injection” to replace login pages with rogue ones, and “Dynamic Injection” to redirect browser requests to the C&C. Both injection configurations now contain banks located in at least 9 countries that were not previously part of TrickBot’s list of victims.

In the Dynamic Injection list, the following French banks were added:

  • banque-*.fr
  • ca-*.fr

And one bank located in Bahrain:


The Static Injection list suddenly tripled from 109 bank login URLs to a whopping 333, and these are not only added entries – the list is in fact entirely different. A closer look reveals that everything in Australia, New Zealand, Singapore, India, and Canada disappeared – the only leftovers are banks from the UK and Ireland. Instead, new countries include Switzerland, France, Lithuania, the Netherlands, and Luxembourg, but particularly interesting for us as a Finnish company are the 40 new Nordic banks. These are the targeted Finnish domains:








The complete Static Injection configuration can be found here:

The Static Injection technique replaces the actual login page with a rogue version created by the attackers. Here are a few examples – left is the original page, right is the TrickBot version.

There are only some very subtle differences: the Chrome icon on the upper right indicating that some elements on the page are not from a secure source, the slightly different date format at the bottom of the Nordea page… not exactly things that an average user pays attention to.

But just when you thought that the TrickBot authors had provided enough surprises… nothing could be further from the truth. Last Friday, all new entries in the Static Injection list had disappeared again, which basically reverted the list to its previous state of 109 URLs. And the story is not over. Yesterday evening, another new version popped up, this time with 235 URLs, about 100 fewer than before. Several UK banks that were added last week didn’t make it into the new list, but all Nordic banks did. In other words, TrickBot’s attack on the Nordic banks started last Wednesday, but was suspended over the weekend.

So why the rollback on Friday? Was the updated configuration a mistake by the authors? A test? Could the C&C not handle the sudden rise in traffic? Or perhaps they just wanted an easy weekend? We can only guess, but it will be interesting to see which tricks this bot still has in store.

By the way, these recent changes in the configuration are not a coincidence. New malware versions are often accompanied by a campaign, and this time was no different. On Wednesday we observed large spam campaigns delivering TrickBot, which can be seen in the graph below. The spam was spread using the Necurs botnet, which is also quite remarkable, as we have previously seen it distributing only a very limited number of malware families, such as Dridex and Jaff.

TrickBot Linegraph

Graph, and screenshots below, courtesy of Päivi.

Again, the emails have a rather generic subject, but enough to attract the victim’s attention. A few examples of the spam content.

Opening the attached document eventually leads to launching a script which downloads the TrickBot binary, an infection chain we also found in recent campaigns delivering the Jaff ransomware. Since we already had detections in place for these documents, customers of our security products were protected.

OSINT For Fun And Profit: Hung Parliament Edition

The 2017 UK general election just concluded, with the Conservatives gaining the most votes out of all political parties. But they didn’t win enough seats to secure a majority. The result is a hung parliament.

Both the Labour and Conservative parties gained voters compared to the previous general election. Some of those wins came from defecting UKIP supporters. The rest, most of which went to Labour, came from young voters. And that was definitely reflected in social media.

The #VoteLabour hashtag was immensely popular in the lead-up to the elections.

#VoteLabour and #VoteConservative hashtags in the two weeks leading up to the 2017 UK election.

#VoteLabour continued to make a strong appearance during the week of the election and increased significantly as election day approached.

#VoteLabour and #VoteConservative hashtags during the election week.

The #VoteLabour hashtag completely overshadowed all other party hashtags all the way until polls closed.

Party hashtags on election day 2017.

On the day of the election, the #voted hashtag trended. Of those that tweeted the hashtag (in conjunction with election-related hashtags such as #GE2017), a majority of tweets referenced the Labour party. Here are the numbers when the polls closed (recorded during the day of the election).

Labour = 530
Conservative = 50
Libdems = 44
SNP = 111
UKIP = 19

Did we see any obvious external interference in the 2017 UK elections? Nope.

The top URLs shared over the two weeks leading up to the election included the following:

  • BBC’s Election Coverage website (3 links)
  • A number of pages, including the following headlines:
  • Labour party campaign site (
  • A YouTube video about Tory NHS cuts
  • A guide for tactically voting against the Tories (

Most of the popular URLs shared on Twitter were supportive of the Labour party, a reflection of Labour’s strong social media campaign. These findings support the fact that young voter turnout was, across the board, higher than in previous elections. Labour-run campaigns encouraging young people to vote were clearly successful, and in some constituencies, the youth vote actually changed the outcome.

Non-authoritative opinion-piece articles made up less than 10% of all URLs shared during the same time period. Notable examples included:

  • Sputnik: “Labour’s Poll Surge Has Establishment ‘Pundits’ in a Flap” (pro-Labour)
  • RT: “Tories ‘gagged’ us to prevent criticism of Theresa May, charities claim” (pro-Labour)
  • Daily Express: “Corbyn ready to hit homes with new garden tax which could TREBLE average council tax bills” (pro-Conservative)
  • RT: “BBC presenter confesses broadcaster ignores complaints of bias” (pro-Conservative)
  • RT: “Tory record on terrorism ‘very weak, deeply worrying,’ security expert tells RT” (pro-Labour)
  • RT: “Revealed! Big money bankrolling Tory campaign linked to claims of fraud, tax dodging” (pro-Labour)

Although the headlines look sensational, they’re nothing compared to what politically-oriented UK tabloids (such as The Sun and The Mirror) usually print.

In general, “non-authoritative” articles linked on Twitter weren’t politically biased towards one particular party. This is in stark contrast to the French presidential elections, where the majority of URLs shared on Twitter pointed to anti-Macron articles.

Articles from “US alt-right” sources (such as Breitbart) that dominated Twitter during the French elections were notably absent this time.

I couldn’t find any popular hashtags exhibiting bot-like behavior. An insignificant number of the top Twitter posters during the past two weeks were from outside the UK. Those Twitter users who did post at regular intervals were news agencies and self-confessed bots designed to Tweet on regular schedules.

No blaming “outside interference” for this election outcome.

Why Is Somebody Creating An Army Of Twitter Bots?

There’s been some speculation this week regarding Donald Trump’s Twitter account. Why? Because its follower count “dramatically” increased (according to reports) due to a bunch of bots. Since Twitter analytics are my thing at the moment, I decided to do some digging.

Sean examined some of Trump’s new followers and found they had something in common. They aren’t just following Donald Trump, they’re following lots of popular accounts.

Popular person, Barack Obama

So, I wrote and ran a script that queried Twitter for the last 5,000 accounts to follow the “top 100” Twitter accounts (Twitter accounts with the highest number of followers). The output of that script was a list of roughly 200,000 unique accounts.

Of those 200,000, over 20,000 accounts follow 5 or more of the top 100 Twitter accounts. Roughly 8,000 of those 20,000 accounts were created on the 1st of June 2017, have a default profile, no profile picture, and haven’t Tweeted.

947 of those accounts follow @realDonaldTrump.

Over 2,000 accounts, roughly a quarter of the above 8,000, follow exactly 21 Twitter users (436 of those follow @realDonaldTrump).
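
The counting step of that script boils down to simple set arithmetic. Here’s a sketch in Python: assuming we have already fetched, for each top account, the set of IDs that most recently followed it (the fetching itself, and the data shapes, are assumptions for illustration), we tally how many of the monitored accounts each new user follows:

```python
from collections import Counter

def follower_overlap(recent_followers, threshold=5):
    """Given a map of top-account name -> set of recent follower IDs,
    return the follower IDs that appear under `threshold` or more of
    the top accounts, along with their counts."""
    counts = Counter()
    for followers in recent_followers.values():
        counts.update(followers)  # each set contributes at most 1 per ID
    return {account: n for account, n in counts.items() if n >= threshold}
```

From the resulting set, further filters (creation date, default profile, zero Tweets) whittle the candidates down to the suspicious cluster described above.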

New Twitter Bots

My scripts harvested tons of this… stuff in just a few hours.

What do these accounts have in common?

  • Many of the accounts are named using Arabic or Chinese characters.
  • Most of the accounts have no followers. Those that do have followers have picked up p0rnbots that automatically follow new Twitter accounts. The p0rnbot accounts don’t appear to be affiliated with the group creating these new Twitter accounts.
  • Some of the accounts are “themed”. For instance, I came across a few that were following NASA and a number of science-related Twitter accounts. I found others following mostly celebrities or musicians. I also found African and Indian themes (accounts following politicians/groups in those regions).
  • Checking back at a later time, I noticed that these accounts are slowly being “evolved” to look more “natural” (by liking Tweets and adding followers/followed).

Apparently somebody’s real busy cultivating a huge number of Twitter accounts at this very moment. As to why they’re doing it, we can only speculate.

  • The accounts will be sold off at a later date.
  • They’re being prepared for use by follower-boosting services.
  • They’re being cultivated for later “political” use.

Whatever the reason, this stuff isn’t being done in a very stealthy manner. And creating new Twitter accounts is easily automated. (I just created a new account using a Gmail alias.)

I assume the folks at Twitter must see this activity. And I’m just wondering why they’re not doing anything about it. Creating accounts doesn’t even require a CAPTCHA.

P.S. – As an added bonus for those who like numbers p0rn, I checked which of the 200,000 unique accounts followed at least 10 of the top 100 accounts. It turns out roughly 7,000 of them do. Of these 7,000, around 3,000 were created on 1st June 2017, have a default profile, no profile picture, and haven’t Tweeted. 367 of those users follow @realDonaldTrump.

About 1,000 of those accounts follow exactly 21 other Twitter accounts. 160 of those follow @realDonaldTrump.

Now Hiring: Developers, Researchers, Data Scientists

We’re hiring right now, and if you check out our careers page, you’ll find over 30 new positions ranging from marketing (meh) to malware analysis (woot!). A select number of these new positions are in F-Secure Labs. If you’re on the lookout for a job in cyber security, you might find one of these jobs […]


WannaCry, Party Like It’s 2003

Let’s take a moment to collect what we know about WannaCry (W32/WCry) and what we can learn from it. Looked at from a technical perspective, WCry is comprised of two Windows binaries: mssecsvc.exe, a worm that handles spreading and drops the payload; and tasksche.exe, a ransomware trojan […]


WCry: Knowns And Unknowns

WCry, WannaCry, Wana Decrypt0r. I’m sure at this point you’ve heard something about what the industry has dubbed the largest crypto ransomware outbreak in history. Following its debut yesterday afternoon, a lot of facts have been flying around. Here’s what we know, and don’t know. WCry has currently made a measly $25,000. They now made […]


OSINT For Fun And Profit: #Presidentielle2017 Edition

As I mentioned in a previous post, I’m writing scripts designed to analyze patterns in Twitter streams. One of the goals of my research is to follow Twitter activity around a newsworthy event, such as an election. For example, last weekend France went to the polls to vote for a new president. And so I […]


Unicode Phishing Domains Rediscovered

There is a variant of phishing attack that is currently receiving much attention in the security community. It’s called the IDN homograph attack, and it takes advantage of the fact that many different Unicode characters look alike. The use of Unicode in domain names makes it easier to spoof websites, as the visual representation of an […]


F-Secure XFENCE (Little Flocker)

I use Macs both at home and at work, and as a nerd, I enjoy using interesting stand-alone tools and apps to keep my environment secure. Some of my favorites are knockknock, ransomwhere?, and taskexplorer, from the objective-see website. I’ve also recently been playing around with (and enjoying) a tool from FireEye. When I heard that […]


Ransomware Timeline: 2010 – 2017

I’ve seen numerous compliments for this graphic by Micke, so… here’s a high-res version. Enjoy! Source: State of Cyber Security 2017


The Callisto Group

We’ve published a White Paper today titled: The Callisto Group. And who/what is the Callisto Group? A good question; here’s the paper’s summary: heavy use of spear phishing, and malicious attachments sent via legitimate, but compromised, email accounts. Don’t click “OK”.


OSINT For Fun & Profit: @realDonaldTrump Edition

I’ve just started experimenting with Tweepy to write a series of scripts attempting to identify Twitter bots and sockpuppet rings. It’s been a while since I last played around with this kind of stuff, so I decided to start by writing a couple of small test scripts. In order to properly test it, I needed to point […]


“Cloud Hopper” Example Of Upstream Attack

There’s news today of a BAE/PWC report detailing a Chinese-based hacking group campaign dubbed “Operation Cloud Hopper”. Chinese Group Is Hacking Cloud Providers to Reach Into Secure Enterprise Networks — News from the Lab (@FSLabs) April 5, 2017 This operation is what’s known as an upstream attack, a method of compromise that we […]