Fun With Internet Metadata (AKA The Deep Web)

Our Cyber Security Services (CSS) division spends a fair amount of time working with companies on threat assessments. They’ve been doing this stuff for several years, and during that time, they developed some useful tools to make their jobs easier.

One of those tools is Riddler. It’s a web crawler that makes Internet metadata available via a search interface, and it’s useful for looking at relationships between domains, hosts, and IP addresses. It also lists metadata associated with sites that can give you clues as to any potential security issues. I got a hold of Riddler about half a year ago, and have had quite a bit of fun playing around with it since then.

Riddler has been available to the public for a while now, but as a company we’ve not really made much noise about it. You can access it through a web interface. The free version only returns ten results per query, so it has limited use, but the subscription version is a lot more interesting. With that, you get access to a command-line interface and an API, which makes it pretty easy to build your own mapping and monitoring tools.

I got quite addicted to digging through internet queries using the Riddler command-line interface.

I just finished writing a white paper about Riddler, which is available here. The paper tells the story behind Riddler – why and how we built it, a short guide on how to use it, and some ideas about what it can be used to do. If you’re interested in doing threat assessments, or like myself, just enjoy digging through Internet metadata, give it a look!

What’s The Deal With Non-Signature-Based Anti-Malware Solutions?

Gartner recently published an insightful report entitled “The Real Value of a Non-Signature-Based Anti-Malware Solution to Your Organization”. In this report, it discusses the ways in which non-signature technologies can be used to augment an organization’s endpoint protection strategy.

Let’s take a look at how Gartner has defined non-signature malware detection solutions. Here’s a clip directly from the report.

Gartner’s list of non-signature technologies, taken from the report “The Real Value of a Non-Signature-Based Anti-Malware Solution to Your Organization”, 22nd September 2016, by Eric Ouellet and Peter Firstbrook

So, how do our endpoint protection technologies stand up against these competitor solutions?

Hardening — typically application control

This is a feature we include in our business products that’s coincidentally called “Application Control”. It’s something I haven’t specifically blogged about (yet). This feature works great in corporate environments, where the IT department can create a defined list of software or Authenticode certificates that are allowed in the organization. This white list is then applied to each endpoint, and only software defined on this list is allowed to execute.
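To make the white list idea concrete, here is a minimal sketch in Python. It is not how our product implements Application Control; it assumes a list keyed on SHA-256 file hashes (certificate-based rules are left out), and the names `ALLOWED_SHA256` and `may_execute` are made up for illustration.

```python
import hashlib

# Hypothetical white list of SHA-256 digests approved by the IT department.
# Real deployments may also key on signing certificates, which is not shown.
ALLOWED_SHA256 = {
    hashlib.sha256(b"approved-tool-v1").hexdigest(),
}


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a binary's contents."""
    return hashlib.sha256(data).hexdigest()


def may_execute(binary: bytes, allowed=ALLOWED_SHA256) -> bool:
    """Default deny: only binaries whose digest is on the list may run."""
    return sha256_of(binary) in allowed


print(may_execute(b"approved-tool-v1"))   # True
print(may_execute(b"world-of-warcraft"))  # False
```

The important design choice is the default: anything not explicitly listed is refused, which is why the approach suits environments with a small, well-defined set of software.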

Application control is especially useful in hardened environments such as embedded devices (think ATMs or bank teller terminals) where the list of allowed software is small and very well-defined. In other corporate environments, it can be overly restrictive to the end-user. This is why it’s a business feature. We leave it to the local IT department to define how they want to use the feature, based on how restrictive their policies are.

I’m actually not sure how long we’ve had application control in our products. As far as I remember, the feature was already there when I started at F-Secure over 11 years ago. (I tried to install World of Warcraft on my work laptop, for after hours fun, and was promptly disallowed.)

Hardening can also include patch management. We have a component we call “Software Updater”, which enumerates all software on the system, checks for the latest patch versions, and automatically updates the software in the background, without the user needing to do anything themselves. Unpatched vulnerabilities are one of the most common ways an attacker can infect a system, so patch management is extremely useful, and automating it frees up the admin for other important tasks.

Memory protection (exploit prevention)

Our own exploit prevention methods are the same as those used in non-signature products. We hook application and system processes in order to analyze memory and execution traces, spot suspicious behavior, and shut down offending processes. This allows us to prevent exploits against browsers, browser plugins, and common applications (such as PDF readers and Microsoft Office). It’s also useful for catching scripted attacks. This is the same technology used in our activity/behavior monitoring, which is covered below.


Isolation

Isolation technologies protect the system by sandboxing processes and allowing them limited access to the operating system. Bromium is the first product that comes to my mind when I think of isolation technologies. This is something we don’t do, because it’s a radically different approach to securing the endpoint, akin to taking Windows and making it work like iOS. Isolation is a really cool way of protecting a system (if you can solve the non-trivial usability issues that it presents). Done right, isolation technologies can negate the need for most other types of protection.

The closest thing we’re doing to this is on-client sandbox analysis. When we hit certain suspicious-looking samples, we launch a sandbox, run the executable in question, examine its execution trace, and make a determination as to whether the sample is malicious. This analysis approach can tax system performance, so it’s not something we’ll do on every file we encounter. Malware writers tend to add new anti-emulation tricks that defeat sandboxing, and this forces us to update the components and rules once in a while.

Activity/behavior monitoring

I’ve covered our behavioral analysis protection technologies in a few of my explainer posts. In fact, there’s one entirely dedicated to that topic here. I won’t bother reiterating what’s in that post except to say that we’ve been doing endpoint behavioral analysis for a decade already, and it comes as standard on every Windows product we ship. Familiar with Locky? The behavioral rules that caught that particular ransomware family were in our product for over half-a-year before it was in the wild.

Algorithmic file classification

I recently wrote about how we use machine learning techniques in a variety of our protection and detection technologies here. As that explainer states, we’ve been using machine learning techniques to train endpoint components to identify suspiciousness on both the structural and behavioral level. And, again, we’ve been shipping these technologies in our Windows products for ten years.

We ticked four out of the five boxes. What does that make F-Secure?

Gartner is an authoritative and influential player in the cybersecurity industry. Many enterprises go to them for advice when it comes to choosing a new product or solution. We understand that terminology is needed to distinguish between pure-play technology providers and established endpoint protection players. In its report Gartner uses the terms “non-signature” and “signature-based” to differentiate between the two. The problem as I see it is that “next-gen” marketing departments have perverted the term “signature-based” into “signature-only”.

All technically minded people know that there aren’t any signature-only endpoint protection products on the market. But “signature-based” also seems to imply that this category of products is overly reliant on signatures to protect against threats. This is most definitely not the case. For instance, we actually have internal test configurations with signature-based technologies disabled and our products still do a great job at blocking emerging threats.

Most of the mentioned pure-play vendors use a single technology from that list of “non-signature” technologies as the basis for their entire protection stack (something which some industry analysts refer to as “feature-as-a-product”). Our product utilizes four of those technologies at the same time. Given that a list of “non-signature” vendors was supplied in the report, but a corresponding list of “signature-based” vendors wasn’t, we’re wondering exactly how our products would be classified, because we clearly don’t fall into either category.

Or at least, we don’t think so and reject the label… signature-based.

Definitely Not Cerber

At the beginning of last week we noticed a spam campaign delivering a double zipped JScript file. The campaign started on September 8th. The email had the subject line of “RE: [name of recipient]” with an empty body, and an attached zip file named “[recipient name][a-z]{4}.zip”.
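For anyone who wants to hunt for this naming scheme in their own mail logs, the attachment pattern above can be turned into a regular expression. This is a rough sketch only; the function names are mine, and it assumes the recipient name appears in the filename exactly as it does in the address book.

```python
import re


def attachment_pattern(recipient_name: str) -> re.Pattern:
    """Build a regex for '[recipient name][a-z]{4}.zip'."""
    # re.escape keeps dots or other special characters in the name literal.
    return re.compile(re.escape(recipient_name) + r"[a-z]{4}\.zip")


def looks_like_campaign_attachment(recipient_name: str, filename: str) -> bool:
    """True if the filename is the recipient name plus four lowercase letters."""
    return attachment_pattern(recipient_name).fullmatch(filename) is not None


print(looks_like_campaign_attachment("jsmith", "jsmithqwer.zip"))  # True
print(looks_like_campaign_attachment("jsmith", "invoice.zip"))     # False
```

A match on the filename alone proves nothing, of course. It is just a cheap first filter before looking at the attachment itself.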

The characteristics of the mail, naming of the attached item, and obfuscation used in the sample were similar to what has been previously seen with the distribution of Cerber ransomware. Testing one of the samples led to an unpleasant surprise that looked nothing like Cerber.


Definitely not Cerber

The final payload of that particular sample was Locky ransomware. It was an odd discovery, especially as Locky is known to be distributed by the Necurs botnet in totally different campaigns with higher prevalence. This campaign spanned over a week, with no more than a few dozen samples per day. Further analysis of the campaign revealed minor tweaks and updates to the attached item during the week.


The first attachment type, delivered on the evening of the 8th, was an obfuscated JScript downloader. Distribution of this type continued for a few days. The next surge, two days later, delivered a similarly obfuscated JScript downloader in a JScript encoded script file (.jse). Later, the campaign continued by spamming encrypted JScript files, but changed the obfuscation to support custom XOR encryption on critical strings. In the last update, the size of the downloader was doubled with comments, and the distribution spiked a little.
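If you haven’t run into XOR string obfuscation before, the general idea fits in a few lines. The sketch below is a generic repeating-key XOR in Python, not the actual scheme from these samples (which were JScript, and whose details I’m not reproducing here). The key and the URL are invented for the example.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte with a repeating key. Applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical key and "critical string", for illustration only.
key = b"k3y"
original = b"http://example.test/?f=1.bin"

hidden = xor_bytes(original, key)    # what a static scan of the script would see
restored = xor_bytes(hidden, key)    # what the script recovers at run time

print(hidden != original)    # True
print(restored == original)  # True
```

Because XOR is its own inverse, the downloader only needs to carry one tiny routine and the key, which is exactly why this trick is so popular for hiding URLs from simple string matching.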

The contacted URLs also followed the format observed in previous Cerber campaigns. In total, the samples contacted 7 domains registered under the .top top-level domain (TLD), resolving to two IP addresses, each with 7 different query parameters in the format ?f=[1-7]{1}.bin. The query was hard-coded in the distributed samples, and 25% of the samples contacted the domains with query parameter 1. (By comparison, if the parameters were randomly generated, the distribution share would be 14% instead of 25%.)
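The 14% figure is simply one in seven. Here is the arithmetic as a quick Python check, assuming each of the seven parameters would be equally likely if they were chosen at random; the variable names are mine.

```python
from fractions import Fraction

PARAMETER_COUNT = 7                # query parameters 1 through 7
observed = Fraction(25, 100)       # share of samples using parameter 1

expected = Fraction(1, PARAMETER_COUNT)   # uniform share per parameter
ratio = observed / expected               # how over-represented parameter 1 is

print(f"Expected share: {float(expected):.1%}")       # Expected share: 14.3%
print(f"Over-representation: {float(ratio):.2f}x")    # Over-representation: 1.75x
```

So parameter 1 showed up 1.75 times as often as a uniform spread would predict.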

Further analysis of the URLs revealed that the same sample of Locky was delivered on all domains with query parameters from 2 to 7. Query parameter 1 was allocated to serve Cerber ransomware.


Probably Cerber

This is not the first time Cerber has been distributed in the same campaign as other nasty malware. Last May, Cerber shared a distribution framework with the Dridex banking trojan. The multiple minor updates to the dropper during the week suggest this campaign is in a test phase, but even so, seeing two different ransomware families in the same campaign is unusual.


Seriously, Put Away The Foil

I was scanning the headlines this morning, as I do, and came across this article by YLE Uutiset (News): “Finnish police: Keep your car keys in the fridge”

Finnish police: keep your car keys in the fridge

From YLE’s article:

“These so-called smart keys work by emitting a signal when the driver touches the door handle. The lock opens when it recognises the key’s signal. Criminals have technology that can strengthen that signal even from a hundred metres away—well inside the residential property where most owners keep their keys, according to Eero Heino of the If insurance company.”

So, should you keep your keys in a refrigerator?

Car key in a fridge

Don’t. (Cold can damage some batteries.)

Well, what about foil?

No. Put away the foil…

Look, if you have a car that’s actually valuable enough to be concerned about – get yourself a Faraday bag. Here’s one designed to fit a phone.

Car key in a Faraday bag

A very handy item to have when traveling abroad to “certain countries”.

Wickr branded Faraday bag

I got mine from the fine folks at Wickr. A quick search on Amazon yields results starting at about 10 bucks.

Or hey, here’s an idea, perhaps insurance companies could start giving customers Faraday bags when insuring an expensive car?

Just a thought.

0ld 5ch00l MBR Malware

I recently installed Audacity, an open source audio editor.

Audacity UI

And while verifying the current version to download, I came across an interesting security notification. Before I read the details, I fully expected to discover yet another case of some crypto-ransomware group hijacking and trojanizing an application installer.

But not so!

Audacity’s download partner was infiltrated via compromised accounts and Audacity’s Windows installer was replaced by purely destructive malware, an MBR-overwriting trojan. That’s really something of a throwback in this age of malware-for-profit.

Those who installed the trojanized installers saw this message on reboot.

MBR message: It is a sad thing your adventures have ended here!

Classic Shell was also affected; here are file details from its forum.

And here’s a video by @danooct1 demoing the malware, and how to repair the overwritten MBR.

Infected Classic Shell/Audacity Trojan

Great stuff. And check out the view statistics… it seems there’s a decent audience for malware analysis videos.

What’s The Deal With Machine Learning?

We’ve recently received quite a few questions regarding the use of machine learning techniques in cyber security. I figured it was time for a blog post. Interestingly, while I was writing this post, we got asked even more questions, so the timing couldn’t be better.

It seems that there are quite a few companies out there making noise about using machine learning techniques in their security products like it’s a new thing. It’s not. We’ve been using machine learning techniques since 2005, and nowadays you’ll find machine learning being used almost everywhere.

Machine learning techniques were first used by the security industry to train anti-spam engines. That fact prompted us to experiment with machine learning in an attempt to identify malicious files. In late 2005, we developed an engine designed to rate the suspiciousness of files based on both structural and behavioral characteristics. This engine was originally designed to suppress false positives generated by our new behavioral blocking technology, but since then has cemented itself as a solid piece of detection technology. Both of these components were introduced into our product line in 2006.


As I mentioned, we’re using machine learning all over the place. Here are a few examples of what we’re doing with it.

Sample analysis and categorization – We’re using expert systems and machine learning to automatically categorize the 500,000 new samples we receive each day. These systems generate a lot of high-quality metadata that is transformed into actionable threat intelligence.

URL reputation and categorization – We feed content from URLs into a machine learning system in order to categorize sites both for maliciousness and type of content (such as adult content, shopping, bank, et cetera).

Client-side detection logic – We use machine learning to train client-side components to identify suspicious files based on file structure and behavioral characteristics. We refer to these components as heuristic engines. On August 25th, Sven Krasser at CrowdStrike published an informative and detailed blog post on how these techniques work that I recommend reading if you’d like to know more.

Breach detection – This is something I haven’t covered much yet, but plan to in the future. We use machine learning techniques to identify suspicious behavior on networks. These signals are sent to security experts working in our Rapid Detection Center, who investigate the incident and alert the customer if the information is valid. Naturally, the same techniques that uncover signs of breaches can also alert us to malicious insider activity.

Machine learning can be quite false-positive prone. This is why we prefer to use a hybrid approach that utilizes both human and machine. Combining machine learning with expert-developed rules and extensive automation allows us to reduce false positives and make much more accurate determinations of threats and suspicious behavior. For instance, in our sample categorization systems, machine learning techniques do a good job clustering incoming samples. However, for samples the system has never seen before, we still use real humans to identify, label, and categorize those clusters.

We’ve found machine learning to be extremely useful. However, it’s not a substitute for real human expertise just yet. As one colleague of mine put it, if you treat machine learning as a silver bullet, you’ll very quickly find that bullet in your foot. And that’s our advice to everyone out there – it’s critical that you don’t rely solely on machine-learning to protect your systems, and especially not solutions that can only identify file-based threats.

And there are a couple of reasons why you shouldn’t do that. Firstly, you’ll not be protected against scams, phishing, and social engineering. For that, you need a URL blocking component. If you don’t have one, you can still easily end up on a site designed to steal your credentials, identity, or banking information. A solution designed to identify malicious files won’t be enough to keep you properly protected on the Internet.

Secondly, you definitely want protection against exploits. Exploits are the choke-point in the kill chain. There are hundreds of thousands of compromised or malicious sites out there, and hundreds of thousands of unique malicious files. However, there aren’t all that many unique exploits. Blocking all known exploits is much easier than ensuring every bad site out there and every single payload is handled. Here at F-Secure, we frequently gather the threat intelligence needed to find these exploits from in-house automation that relies on machine learning. However, the rules are still hand-written by our experts. This is one example of a client-side protection technology that simply doesn’t lend itself all that well to machine learning.

Finally, here are some questions @kevtownsend asked us, and my answers.

Will machine learning make jobs in the cyber security industry obsolete?

Absolutely not! Attackers, be they malware writers or actors looking to breach corporate networks, are humans. They think creatively and design attacks that can easily bypass purely automated solutions. Because of this, defenders need to be able to think creatively, too. Until artificial intelligence is capable of human-level creativity, humans will continue to be crucial in the field.

If machine-learning engines can be integrated into Virus Total, why can’t behavioral analysis engines be integrated?

Behavioral engines are difficult to integrate into Virus Total’s system. Every sample run through their system would need to be executed in an environment containing each vendor’s protection solution. Practically speaking, this means bringing up a virtual machine, installing or updating a vendor’s product, injecting the sample into the VM, executing it, extracting the product’s verdict, and then destroying the VM. This all has to happen under special network conditions to ensure malware is not spread further.

This whole process is not only super-resource intensive, it’s hell to maintain, especially when you consider that VT’s systems already contain over 50 products. Even if VT had the infrastructure available to do this for 500,000 samples times 50 vendors per day, they’d still need to hire a fleet of people to maintain the environment and keep the products up to date.
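To put a number on that, here is the back-of-the-envelope math in Python. It uses the figures above and assumes the simplest possible setup: every sample is run once against every product in its own VM, with the load spread evenly over 24 hours. Real infrastructure would look different, so treat this as an order-of-magnitude estimate.

```python
SAMPLES_PER_DAY = 500_000
PRODUCTS = 50
SECONDS_PER_DAY = 24 * 60 * 60

# One VM run (create, install or update, execute, collect verdict, destroy)
# per sample per product.
runs_per_day = SAMPLES_PER_DAY * PRODUCTS
runs_per_second = runs_per_day / SECONDS_PER_DAY

print(f"{runs_per_day:,} VM runs per day")            # 25,000,000 VM runs per day
print(f"about {runs_per_second:.0f} VM runs per second")  # about 289 VM runs per second
```

That is roughly 289 complete VM lifecycles every second, around the clock, before you count retries or product updates.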

Is there an intrinsic difference between machine learning detection engines and behavioral detection engines?

This is an apples and oranges comparison. Machine learning techniques are used to “train” client-side detection logic. The actual machine learning process is run on heavy back end infrastructure, since it requires large volumes of samples and a significant amount of processing power. The logic bundle, once generated, is delivered to the client via product updates. Although some vendors don’t specifically talk about rules, signatures, or databases, you can be sure their products do contain them, one way or another. If a database is bundled into the binary itself, it’s still a database. Machine learning can be used to train logic designed to detect suspiciousness based on the structure of a file or its behavior, or both.

We strongly warn people against reading into the marketing hype out there. Most “AV” vendors have been using machine learning techniques to create rules and logic for years already.

Coming Soon: iOS 10

I’ve been testing iOS 10 Beta for several weeks (on a secondary iPad mini 2 of mine) and so far, so good. I’m enjoying Swift Playgrounds and looking forward to the final release.

Most of the changes I’ve noticed have been surface (i.e., UI) changes. But today I read an interesting blog post by @nabla_c0d3, regarding iOS 10 security and privacy. Under the hood stuff that sounds very promising.

Full post here: Security and Privacy Changes in iOS 10

If you don’t already use “Limit Ad Tracking”, you’ll find the option from: Settings > Privacy > Advertising > Limit Ad Tracking.

Enabling the option in iOS 10 will cause apps to see your Advertiser ID as all 0s, putting a limit on third-party tracking.
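As a rough illustration of what “all 0s” means on the receiving end, here is a small Python sketch of how a backend might recognize a zeroed identifier. The function name is made up, and this is not Apple’s API; on the device itself, apps read the identifier through Apple’s own frameworks, which aren’t shown here.

```python
import uuid


def is_zeroed_identifier(value: str) -> bool:
    """True if the UUID string is all zeros, i.e. carries no tracking value."""
    return uuid.UUID(value).int == 0


print(is_zeroed_identifier("00000000-0000-0000-0000-000000000000"))  # True
print(is_zeroed_identifier("123e4567-e89b-12d3-a456-426614174000"))  # False
```

Since every user with the option enabled reports the same value, the identifier can no longer be used to tell those users apart.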

Apps on iOS have long been designed to ask for various permissions as needed, rather than all up front (à la Android), but with iOS 10, Apple will enforce the use of “purpose strings”, which should be used to communicate why each permission is needed.

Got Ransomware? Negotiate

ICYMI: we recently published a customer service study of various crypto-ransomware families. Communication being a crucial element of ransomware schemes, we decided to put it to a comparative test.

The biggest takeaway? If you find yourself compromised – negotiate.

Our Findings – In A Nutshell

You have little to lose: the majority of extortionists appear to be willing to work with their “customers”.

Our report (download) also contains a fascinating email conversation as an appendix…

NanHaiShu: RATing the South China Sea

Since last year, we have been following a threat that we refer to as NanHaiShu, which is a Remote Access Trojan. The threat actors behind this malware target government and private-sector organizations that were directly or indirectly involved in the international territorial dispute centering on the South China Sea. Hence, the name nán hǎi shǔ (南海鼠) which means South China Sea rat.

Based on our observations, the timings of the attacks indicated political motivation, as they occurred either within a month following notable news reports related to the dispute, or within a month leading up to publicly-known political events featuring the said issue.

Timeline of events

The white paper is the culmination of our research to understand the motivation behind NanHaiShu. To learn more about our analysis and other interesting details, please read our white paper here.

nanhaishu whitepaper cover

Bye Bye Flash! Part 2.5. Microsoft Edge Is Going “Click To Flash”

After last Thursday’s article on how Firefox will start reducing support for Flash, I received some comments pointing me to an announcement from Microsoft, back in April, where they stated that their Edge browser would also move towards a “Click to Flash” approach. The announcement notes that Flash plugins not central to the web page will be intelligently paused, and that content such as games and video will continue to run normally. This change to Edge will be delivered in the anniversary update of Windows 10.

I’d like to point out that we did notice this news back in April, and kudos to Microsoft, and the Edge team, for making this happen.

Microsoft Edge Logo

Why didn’t we talk about this at the time? Well, Edge only works on newer Windows versions. It seems that Microsoft won’t make their 1 billion target for Windows 10 installs, and at current count, Windows 7 still has about 50% market share. So, we’re still waiting for that all-important announcement about Flash and Microsoft Internet Explorer.

Bye Bye Flash! Part 2 – Firefox Plans To “Reduce” Support For Flash

Earlier this year, in our 2015 Threat Report, our own Sean Sullivan predicted that Chrome, Firefox, and Microsoft would announce an iterative shift away from supporting Flash in the browser by 2017. Last month, we covered the announcement made by Google. As predicted, just yesterday, the Firefox developers made a similar announcement on their blog. […]


Malware History: Code Red

Fifteen years (5479 days) ago… Code Red hit its peak. An infamous computer worm, Code Red exploited a vulnerability in Microsoft Internet Information Server (IIS) to propagate. Infected servers displayed the following message. See @mikko’s Tweet below for a visualization. (Tweet mentioning @FSLabs, @FSecure, and @5ean5ullivan; Mikko Hypponen, @mikko, July 18, 2016)


A New High For Locky

After seeing a drop during the first weeks of June, the spam campaigns distributing Locky crypto-ransomware have returned as aggressive as ever. Normally we have seen around 4,000-10,000 spam hits a day during spam campaigns. Last week from Wednesday to Friday we observed a notable increase in the amount of spam distributing Locky. At most we saw […]


Black Hat USA 2016 Briefings

We get a fair number of requests from journalists and media organizations asking our opinion on a whole range of tech topics. And when Black Hat rolls around, the pace of those requests often picks up considerably. So, I spent some time last week reading through the Black Hat USA 2016 briefings. That was a […]


What’s The Deal With Detection Logic?

Detection logic is used by a variety of different mechanisms in modern endpoint protection software. It is also known by many different names in the cyber security industry. Similar to how the term “virus” is used by laypeople to describe what security people call “malware” (technically, “virus” is the term used to describe a program […]


What’s The Deal With Network Reputation?

Drive-by downloads or, more accurately, drive-by installations are some of the scariest threats on the Internet. Exploit kits provide the underlying mechanisms for this behavior. They work by examining your browser’s environment – browser type, browser version, installed plugins, and plugin versions, looking for a vulnerable piece of software. If the exploit kit finds any […]


Out of Office OPSEC

A “found object” from my Inbox (with sundry modifications). A vacation greeting from our CSS OPSEC experts! It’s absolutely fantastic that you’re soon going on holiday and are not at the office. And we’re sure it’s very well deserved! But before you go, consider this – you don’t have to tell the world where you […]


What’s The Deal With Threat Intelligence

The term “threat intelligence” is quite trendy right now. For many, threat intelligence is a term used to describe IOC feeds that are plugged into security infrastructure to identify suspicious or malicious activity. For us, it describes a whole lot more. As a company, we’ve been actively gathering and assimilating threat intelligence for over 25 […]


What’s The Deal With Prevalence

We use the word “prevalence” a lot at F-Secure Labs. And what’s prevalence? The prevalence of an executable file is defined as the number of times it’s been seen across our entire customer base. Malicious executables tend to be rare over time, most live and die quickly, and thus the number of times we’ve seen […]


Qarallax RAT: Spying On US Visa Applicants

Travelers applying for a US Visa in Switzerland were recently targeted by cyber-criminals linked to a malware called QRAT. Twitter user @hkashfi posted a Tweet saying that one of his friends received a file (US Travel Docs Information.jar) from someone posing as USTRAVELDOCS.COM support personnel using the Skype account ustravelidocs-switzerland (notice the “i” between “travel” […]