Although machine learning security research is still in its early stages, it’s clear that unrestricted input channels widen the threat landscape. You no longer need to touch a keyboard to fool a machine learning system. Software security expert Balázs Kiss highlights a few points in this new field and gives advice on basic protection measures.

Just like software in general, machine learning systems are vulnerable. “On the one hand, they’re pretty much like newborn babies that rely entirely on their parents to learn how the world works – including ‘backdoors’ such as fairy tales, or Santa Claus,” says security expert Balázs Kiss from Cydrill, a company specializing in software security. “On the other hand, machine learning systems are like old cats with poor eyesight – when a mouse learns how the cat hunts, it can easily avoid being seen and caught.”

Things don’t look good, according to Kiss. “Machine learning security is becoming a critical topic.” He points out that most software developers and experts in machine learning are unaware of the attack techniques. “Not even those that have been known to the software security community for a long time. Neither do they know about the corresponding best practices. This should change.”


Security expert and experienced software trainer Balázs Kiss recently developed a new course on machine learning security to be rolled out shortly by High Tech Institute in the Netherlands.

Machine learning (ML) solutions – like software systems in general – are vulnerable in various ways, and they add to the overall security needs. Last year, this was pointed out in a quite embarrassing and simple way by two students from Leuven. They easily managed to mislead Yolo (You Only Look Once), one of the most popular algorithms for detecting objects and people. By carrying a cardboard sign with a colorful 40 by 40 cm print in front of their bodies, Simen Thys and Wiebe Van Ranst made themselves undetectable as humans. Another example comes from McAfee researchers, who fooled Tesla’s autopilot into misclassifying a speed limit sign, making the car accelerate well past 35 mph.

Know your enemy

“An essential cybersecurity prerequisite is: know your enemy,” states Kiss, who is also an experienced software trainer and recently developed a brand new course on ML security to be rolled out shortly by High Tech Institute in the Netherlands. “Most importantly, you have to think with the head of an attacker,” he says.

Let’s take a look at what the attackers are going to target in machine learning. It all starts with exploring what security experts call “the attack surface,” the combination of all the different points in a software environment where an unauthorized user can try to enter or extract data. Keeping the attack surface as small as possible is a basic security measure. As the students from Leuven proved: to fool an ML system you don’t even have to touch a keyboard.

'Garbage in, garbage out.'

A common saying in the machine learning world is “garbage in, garbage out.” All algorithms use training data to establish and refine their behavior, and bad data results in unexpected behavior. This can happen because the model performs well on the training data but fails to generalize to other examples (overfitting), because the model cannot capture the underlying trends in the data (underfitting), or because of problems with the dataset itself. Biased, faulty or ambiguous training data are of course accidental problems, and there are ways to deal with them – for instance, by using appropriate testing and validation datasets. However, an adversary feeding in such bad input intentionally is a completely different scenario, one that calls for its own protection approaches.
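
To make the overfitting point concrete, here is a minimal sketch – with an assumed synthetic dataset and scikit-learn, neither of which appears in the article – showing how a held-out validation set exposes a model that memorizes its training data:

```python
# Minimal sketch (illustrative, not from the article): a held-out validation
# set exposes overfitting on an assumed synthetic dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)   # noisy labels

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = DecisionTreeClassifier()   # unconstrained tree, deliberately prone to overfitting
model.fit(X_train, y_train)

print("train accuracy:     ", model.score(X_train, y_train))   # close to 1.0
print("validation accuracy:", model.score(X_val, y_val))       # noticeably lower
# A large gap between the two scores is the classic symptom of overfitting.
```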

Attackers are smart

Kiss: “We simply must assume that there will be malicious users. These attackers don’t even need to have any particular privileges within the system, but they can provide raw input as training data and see the system’s output, typically the classification value. This already means that they can send purposefully bad or malicious data to trigger inadvertent ML errors.”

'Attackers can learn how the model works and refine their inputs to adapt the attack.'

“But that’s just the tip of the iceberg,” says Kiss. “Keep in mind that attackers are always working towards a goal. They will target specific aspects of the ML solution. By choosing the right input, they can do a lot of damage to the model, the generated prediction and even the various bits of code that process this input. Attackers are smart. They aren’t restricted to sending static inputs – they can learn how the model works and refine their inputs to adapt the attack.”

In the case of supervised learning, the attack surface encompasses all three major steps of the ML workflow. For training, an attacker may be able to provide input data. For classification, an attacker can provide input data and read the classification result. If the ML system has feedback functionality, an attacker may also be able to give false feedback (“wrong” for a good classification and “correct” for a bad one) to confuse the system.

Crafted inputs

Many attacks make use of so-called adversarial examples. These crafted inputs either exploit the implicit trust an ML system puts in the training data received from the user to damage its security (poisoning) or trick the system into mis-categorizing its input (evasion). No foolproof method currently exists that can automatically detect and filter these examples; even the best solution – teaching the system to recognize adversarial examples – is limited in scope.
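
To illustrate what an evasion-style adversarial example looks like in code, here is a minimal NumPy sketch against a toy logistic-regression classifier; the model weights, the input and the perturbation budget are all assumptions made for illustration, not taken from any of the attacks described in the article:

```python
# Minimal FGSM-style sketch (illustrative, not from the article): crafting an
# evasion input against a toy logistic-regression classifier in plain NumPy.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "trained" model: weights and bias are assumed, not learned here.
w = np.array([1.5, -2.0, 0.5])
b = 0.1

x = np.array([0.4, -0.3, 0.8])   # a legitimate input, classified as class 1
y = 1                            # its true label

p = sigmoid(w @ x + b)
grad_x = (p - y) * w             # gradient of the cross-entropy loss w.r.t. the input

eps = 0.5                        # perturbation budget
x_adv = x + eps * np.sign(grad_x)  # FGSM step: nudge each feature against the model

print("original score:   ", sigmoid(w @ x + b))      # well above 0.5
print("adversarial score:", sigmoid(w @ x_adv + b))  # pushed below 0.5 (misclassified)
# The perturbed input stays close to the original but crosses the decision
# boundary, which is the essence of an evasion attack.
```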


By carrying a cardboard sign with a colorful 40 by 40 cm print in front of their bodies, Simen Thys and Wiebe Van Ranst made themselves undetectable as humans. Credit: KU Leuven/Eavise

There are defenses for detecting or mitigating adversarial examples, of course. However, an intelligent attacker can defeat solutions like obfuscation by producing adversarial examples in an adaptive way. Kiss points to some excellent papers that demonstrate this, like those from Nicholas Carlini and his colleagues at Google Brain.

All in all, ML security research is still in its early stages. Current studies mostly focus on image recognition, and some defense techniques that work well for images may not be effective for text or audio. “That said, there are plenty of things you can still do to protect yourself in practice,” says Kiss. “Unfortunately, none will protect you completely from malicious activities. All of them will however add layers of protection, making the attacks harder to carry out.”

Most important, maintains the Cydrill expert, is that you think with the head of an attacker. “You have to train neural networks with adversarial samples to make them explicitly recognize this information as incorrect.” According to Kiss, it’s a good idea to create and use adversarial samples from all currently known attack techniques. A test framework can generate such samples to make the process easier. Existing security testing tools can help with this – like the ML fuzz testers TensorFuzz and DeepTest, which automatically generate invalid or unexpected input.
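
As a rough illustration of this kind of adversarial training, the sketch below augments a toy scikit-learn classifier’s training set with FGSM-style perturbed copies of its own data. The synthetic dataset, the craft_adversarial helper and the chosen perturbation size are all assumptions for illustration, not part of Kiss’ course material:

```python
# Minimal adversarial-training sketch (illustrative assumptions throughout):
# augment the training set with perturbed copies that keep their true labels,
# so the retrained model learns to classify them correctly as well.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

model = LogisticRegression().fit(X, y)

def craft_adversarial(model, X, y, eps=0.3):
    """FGSM-style perturbation for a linear model (hypothetical helper)."""
    p = model.predict_proba(X)[:, 1]
    grad = (p - y)[:, None] * model.coef_       # d(loss)/dx for logistic regression
    return X + eps * np.sign(grad)

X_adv = craft_adversarial(model, X, y)

# Retrain on the union of clean and adversarial samples, both correctly labeled.
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, y])
robust_model = LogisticRegression().fit(X_aug, y_aug)

print("clean model on adversarial inputs:  ", model.score(X_adv, y))
print("robust model on adversarial inputs: ", robust_model.score(X_adv, y))
```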

Sanity checks

Limiting the attacker’s ability to send adversarial samples is always a good mitigation technique. A straightforward way to achieve this is to limit the rate of inputs accepted from a single user. Of course, detecting that the same user is behind a set of inputs might not be easy. “This is the same challenge as in the case of distributed denial-of-service attacks, but the same solutions might work as well.”
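
A minimal sketch of such a rate limit, assuming a hypothetical per-user submission endpoint and an in-memory sliding window (a real deployment would likely use shared storage behind a load balancer):

```python
# Sliding-window rate limiter (sketch): cap how many samples one user may feed
# into the training or feedback pipeline per time window.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_SUBMISSIONS = 10

_history = defaultdict(deque)  # user_id -> timestamps of recent submissions

def allow_submission(user_id, now=None):
    now = time.monotonic() if now is None else now
    window = _history[user_id]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()                 # drop timestamps outside the window
    if len(window) >= MAX_SUBMISSIONS:
        return False                     # over the limit: reject or queue for review
    window.append(now)
    return True

# Usage: check before accepting a training sample or a feedback event.
if allow_submission(user_id="user-42"):
    pass  # accept the input
```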

As always in software security, input validation can help. It may not be trivial to automatically tell good inputs from bad ones, but it’s definitely worth trying. We can also use machine learning itself to identify anomalous patterns in the input. “In the simplest case, if data received from an untrusted user is consistently closer to the classification boundary than to the average, we can flag the data for manual review, or just omit it.”
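
The boundary check Kiss describes could look roughly like this; the margin threshold and the scikit-learn-style predict_proba interface are assumptions made for illustration:

```python
# Sketch: flag untrusted inputs that sit close to the decision boundary of a
# binary classifier (assumed predict_proba API and margin value).
import numpy as np

def flag_boundary_samples(model, X_untrusted, margin=0.1):
    """Return a boolean mask of samples within `margin` of the 0.5 boundary."""
    proba = model.predict_proba(X_untrusted)[:, 1]
    return np.abs(proba - 0.5) < margin

# Usage sketch:
# suspicious = flag_boundary_samples(model, X_from_user)
# X_accepted = X_from_user[~suspicious]   # flag the rest for manual review, or omit it
```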

Applying regular sanity checks with test data can also help. Running the same test dataset against the model upon each retraining cycle can uncover poisoning attempts. Kiss: “Reject On Negative Impact, or RONI, is a typical defense here, detecting whether the system’s capability to classify the test dataset degrades after the retraining.”
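
A RONI-style check could be sketched as follows, assuming a trusted, fixed test set and a tolerance threshold chosen by the operator (both are assumptions, not specifics from the article):

```python
# RONI-style sanity check (sketch): retrain with the newly collected batch and
# reject the batch if accuracy on the trusted test set degrades.
import numpy as np
from sklearn.base import clone

def roni_accepts(base_model, X_train, y_train, X_new, y_new,
                 X_test, y_test, tolerance=0.01):
    before = clone(base_model).fit(X_train, y_train).score(X_test, y_test)
    X_aug = np.vstack([X_train, X_new])
    y_aug = np.concatenate([y_train, y_new])
    after = clone(base_model).fit(X_aug, y_aug).score(X_test, y_test)
    return after >= before - tolerance   # reject the batch on negative impact

# Usage sketch (hypothetical handling of a rejected batch):
# if not roni_accepts(model, X_train, y_train, X_batch, y_batch, X_test, y_test):
#     quarantine(X_batch, y_batch)
```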

The most obvious fact about ML security is often overlooked, notes Kiss. “Machine learning solutions are software systems. We program them in Python – or possibly C++ – and thus they potentially carry all the common security weaknesses that apply to those languages.” The Cydrill trainer especially advises us to be aware of point 9 of the OWASP Top Ten, the document from the Open Web Application Security Project that summarizes the ten most critical security issues in web applications to raise awareness and help minimize the risk of attacks. Point 9 warns developers about using components with known vulnerabilities. “Any vulnerability in a widespread ML framework such as TensorFlow or one of its many dependencies can have far-reaching consequences for all of the applications that use it.”

Potential attack targets

The attackers interact with the ML system by feeding in data through the attack surface. Start to think with the head of the attacker and ask questions. How does the application digest the information? What kind of data? Does the system accept images as well as audio and video files, or are there restrictions – and if so, how does it check the types? Does the program do any parsing itself, or does it delegate that entirely to an open-source or commercially available media library? After preprocessing the data, does the program make any assumptions about it (empty fields, requirements on values)? Is the data stored in a relational database, or in XML or JSON? If so, what operations does the code perform on this data when it gets processed? Where are the hyperparameters stored, and are they modifiable at runtime? Does the application use third-party libraries, frameworks, middleware or web service APIs as part of the workflow that handles user input? If so, which ones?

Kiss: “Each of these questions can indicate potential attack targets. Each of them can hide vulnerabilities that attackers can exploit to achieve their original goals.”

These vulnerability types are not related to machine learning as much as to the underlying technologies: the programming language itself (probably Python), the deployment environment (mobile, desktop, cloud) and the operating system. But the dangers they pose are just as critical as the adversarial examples – successful exploitation can lead to a full compromise of the ML system. This isn’t restricted to the application code itself: researcher Rock Stevens from the University of Maryland explored vulnerabilities in commonly used platforms such as TensorFlow and PyTorch.

Real threats

Kiss’ main message is that ML security covers many real threats. It isn’t an isolated subset of cybersecurity; it shares many traits with software security in general. We should be concerned about malicious samples and adversarial learning, but also about all the common software security weaknesses. Machine learning is software, after all.

ML security is a new discipline. Research has only just begun; we’re just starting to understand the threats, the possible weaknesses and the vulnerabilities. Nevertheless, ML experts can learn a lot from software security – the last couple of decades have taught us plenty of lessons there.

This article is written by René Raaijmakers, tech editor of Bits&Chips.

High Tech Institute and Cydrill invite you to a 45-minute session on October 6 at 17:00 that will give you a thorough overview of how ML applications can be hacked and what you can do about it.