
EvilModel: Hiding Malware Inside of Neural Network Models [pdf]

85 points | Hard_Space | 4 years ago | arxiv.org

28 comments

[+] maffydub|4 years ago|reply
I'm not sure why this is specific to malware. Isn't this just steganography? You could equally hide malware in a compressed image.

Maybe the amount of data you can hide is higher, but that's primarily because they're storing all their weights as 32-bit floats, which is overkill for inference.

...and I guess the fact you can retrain after hiding your malware to increase your inference accuracy again is maybe interesting?
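
For illustration, here's roughly what squatting in the low mantissa bits of float32 weights looks like (a toy NumPy sketch, with my own names and an arbitrary 8-bits-per-weight choice, not necessarily the paper's exact scheme):

    import numpy as np

    def embed_bytes(weights, payload):
        # Overwrite the low 8 mantissa bits of each float32 weight
        # with one payload byte apiece.
        bits = weights.astype(np.float32).view(np.uint32)
        data = np.frombuffer(payload, dtype=np.uint8)
        assert len(data) <= len(bits), "payload too large for this tensor"
        stego = bits.copy()
        stego[:len(data)] = (stego[:len(data)] & 0xFFFFFF00) | data
        return stego.view(np.float32)

    w = np.random.randn(1_000_000).astype(np.float32)
    w2 = embed_bytes(w, b"payload goes here")
    print(np.abs(w - w2).max())  # tiny: the low mantissa bits barely matter

Each weight moves by less than about 2^-15 of its own magnitude, which is why inference barely notices.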

[+] nuclearnice1|4 years ago|reply
> I'm not sure why this is specific to malware. Isn't this just steganography? You could equally hide malware in a compressed image.

Correct.

Paper. 3rd paragraph, page 1: “For delivering large-sized malware, some attackers attach the malware to benign-look carriers, like images, documents, compressed files, etc. [5] The malware is attached to the back of the carrier while keeping the carrier’s structure not damaged. Although they are often invisible to ordinary users, it is easy to detect them by antivirus engines. Another way to hide messages is steganography.”

[+] joosters|4 years ago|reply
Maybe because of the density of the payload? Hiding 37MB of data inside 178MB of images would not be possible without severely degrading the image quality - and images are simple to check. Whereas the NN model continued to work with very little quality loss.

(You could easily append 37MB of hidden data to some images - e.g. adding an invisible extra layer to the image - but this paper details a technique where you don't alter the file size.)
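
The append trick really is one line, for comparison (filenames made up):

    # Most PNG/JPEG decoders stop at the end-of-image marker and ignore
    # trailing bytes, so the result still opens as a normal image.
    with open("cat.png", "rb") as f:
        image = f.read()
    with open("payload.bin", "rb") as f:
        payload = f.read()
    with open("cat_plus_payload.png", "wb") as f:
        f.write(image + payload)  # but the file grows by len(payload)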

[+] api|4 years ago|reply
Neural networks can be Turing-complete, which could in theory allow embedded malware to actually run and do things. I can imagine a compiler that targeted neural networks and allowed programs to be compiled to run silently within them.

What it could do is of course highly dependent on what the neural network is doing and how it's embedded in an application. In many cases it would not be able to do much, but if the neural network is controlling something or has any mechanism to feed back into the app and execute commands...

Then "attacks only get better."

[+] xyzzy21|4 years ago|reply
The fact that you can't easily "explain" how a NN/ML model arrives at its answers perhaps makes it a bit different, but yeah, mostly. With steganography, your overall statistics change in ways that can sometimes indicate it (though those changes can be spoofed away).
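
As a toy example of a statistic that shifts: the low bits of trained float32 weights look like near-uniform noise, so stuffing a low-entropy payload into them shows up immediately (my own illustration, not a real detector):

    import numpy as np

    def low_byte_entropy(w):
        # Shannon entropy of the low 8 bits of each float32 weight;
        # close to 8 bits when those bits look like random noise.
        low = w.view(np.uint32) & 0xFF
        p = np.bincount(low.astype(np.int64), minlength=256) / low.size
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    w = np.random.randn(100_000).astype(np.float32)
    print(low_byte_entropy(w))                      # ~8.0 bits: looks like noise
    u = w.view(np.uint32)
    u[:50_000] = (u[:50_000] & 0xFFFFFF00) | 0x41   # embed a repetitive payload
    print(low_byte_entropy(w))                      # drops sharply

An encrypted payload would pass this particular test, of course.
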
[+] Imnimo|4 years ago|reply
Why would I bother hiding the malware in weights that are actually used by the model? Couldn't I just define a big weight tensor of the appropriate size to hide the malware, declare it part of my model, but never actually make use of it? The paper says:

>To ensure more malware can be embedded, the attacker can introduce more neurons.

It seems like it would be extremely foolish to put the malware in neurons that impact the output if I can just make my own neurons off to the side and get a 0% drop in accuracy!
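
In PyTorch terms, something like this (names are mine; the paper embeds into live weights instead):

    import torch
    import torch.nn as nn

    class Smuggler(nn.Module):
        def __init__(self, payload: bytes):
            super().__init__()
            self.fc = nn.Linear(784, 10)  # the real, working model
            # Serialized with the checkpoint, never used in forward(),
            # so accuracy is untouched by construction.
            self.register_buffer(
                "unused_weights",
                torch.frombuffer(bytearray(payload), dtype=torch.uint8))

        def forward(self, x):
            return self.fc(x)  # "unused_weights" plays no part

    m = Smuggler(b"payload goes here")
    torch.save(m.state_dict(), "model.pt")  # payload rides along in the file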

[+] admax88q|4 years ago|reply
Consider how you would defend against this attack.

If there are extra unconnected neurons, it would be trivial to just remove them, or for an antivirus to flag them.
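
A crude version of that check (a sketch; assumes a PyTorch model in eval() mode and some representative input x):

    import torch

    @torch.no_grad()
    def find_dead_tensors(model, x):
        # Perturb each state_dict tensor in place; if the output doesn't
        # move at all, that tensor is disconnected from the computation.
        baseline = model(x)
        dead = []
        for name, t in model.state_dict().items():
            saved = t.clone()
            t.add_(1)  # crude perturbation (wraps for integer dtypes)
            if torch.equal(model(x), baseline):
                dead.append(name)  # candidate payload carrier
            t.copy_(saved)         # restore
        return dead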

[+] emtrnn|4 years ago|reply
Hi, great hackers.

I'm the author of the paper. I also think it is steganography, just applied to neural networks and malware. However, it may happen one day, so maybe it deserves the attention. The latest experiments used other models from public repositories. The results show that it also works on different models and tasks, including ResNet on ImageNet. I'll update the paper in the next few days.

After I finished the work in this paper, I found another similar work: "StegoNet: Turn Deep Neural Network into a Stegomalware" [1]. They proposed a more practical scenario, a supply chain attack on MLaaS or similar services. ([1] https://dl.acm.org/doi/10.1145/3427228.3427268)

I also found another embedding method that achieves an embedding rate (defined as malware size / model size) of nearly 50%, with less loss of testing accuracy. I'm working on the results and will publish them later.

Thanks for your discussions. All your questions and advice are welcome!

[+] rob_c|4 years ago|reply
Black boxes of computing magic good...

Please install 30GB+ of "trustworthy" tools to run a magical analysis over 100MB of data that was cleaned by hand to optimise signal over noise without selection bias...

Don't get me wrong, GPUs and FPGAs are amazing. But why people implicitly trust huge stacks of code to do all their work for them without at least verifying they work properly is beyond me.

I've seen papers that should have been retracted due to "minor" numerical bugs in upstream code. (minor here because it's a "rare" side effect of using a tool that doesn't cause a security problem)

Fine if you're working in sales or computing or an insurance company; crappy when you're trying to do science or healthcare with the same tools, imo.

[+] mirker|4 years ago|reply
Paper is stretching by picking favorable benchmarks. It’s not ResNet on ImageNet, folks. It’s AlexNet on Fashion-MNIST.

AlexNet (2012) is out of date. The performance each parameter brings is lower than in newer models. Newer models, apart from just being better, use convolutions more extensively, which have fewer parameters than linear/affine layers. The authors describe this choice in Section 4, even showing how they have to retrofit a 224x224 model to the small 28x28 Fashion-MNIST dataset they use.

Therefore, when you embed a big payload into AlexNet, you should not be surprised that you lose little accuracy. It's a low-accuracy model to begin with, and its parameter count is inflated by its choice of layers.
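
The parameter arithmetic behind that point, using AlexNet's well-known layer sizes:

    # conv params = k*k*c_in*c_out + c_out; linear params = n_in*n_out + n_out
    def conv_params(k, c_in, c_out):
        return k * k * c_in * c_out + c_out

    def linear_params(n_in, n_out):
        return n_in * n_out + n_out

    print(conv_params(11, 3, 96))            # conv1: ~35K parameters
    print(linear_params(256 * 6 * 6, 4096))  # fc6 alone: ~37.8M parameters

Nearly all of AlexNet's ~61M parameters sit in its three fully-connected layers, which is exactly the kind of slack a payload can hide in.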

[+] blunte|4 years ago|reply
Not what the paper discusses, but the title gives me a thought: can you train a neural net with a specific, predictable behavior that would generally go undetected but which you could use to your advantage? I suspect you could.

Imagine building an apparently useful trained model for finance which you have a special way to predict the behavior of and profit from. (This could also be likened to formal human finance education, which certain elite groups take advantage of, but I don't mean to go into conspiracy theory territory...)

[+] xyzzy21|4 years ago|reply
You mean like a "mole" or "sleeper cell"? :-)

Just give it the right input and it turns on its masters.

[+] 123pie123|4 years ago|reply
Neuralnet/ ML version of "Order 66"
[+] high_byte|4 years ago|reply
useless. they suggested no means to execute the malware - once it is extracted, the antivirus WILL detect the unmodified malware. you'd be better off zipping it with a password, or even a simple XOR over the data would "evade" detection... until you run it raw.
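
For concreteness, the XOR version (single-byte key, purely illustrative):

    def xor_bytes(data: bytes, key: int = 0x5A) -> bytes:
        # XOR is its own inverse: applying it twice restores the original.
        return bytes(b ^ key for b in data)

    payload = b"example payload bytes"
    hidden = xor_bytes(payload)          # static signatures won't match this
    assert xor_bytes(hidden) == payload  # decoding restores it exactly
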
[+] rjmunro|4 years ago|reply
I can imagine that in some circumstances, an evil insider downloading some large random-looking data files might trigger alarms. If they download a large neural network model, that would be expected for whatever work they do - they could legitimately ask for it to be whitelisted by their firewall. They'd then have to write a small Python script from memory to extract the malware, but that's a pretty unlikely scenario.
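
The extraction side really would be just a few lines, assuming the toy low-bits scheme sketched earlier in the thread (payload length handling glossed over):

    import numpy as np

    def extract_bytes(weights, n):
        # Recover n payload bytes from the low 8 bits of n float32 weights.
        bits = weights.astype(np.float32).view(np.uint32)
        return (bits[:n] & 0xFF).astype(np.uint8).tobytes()
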
[+] CloselyChunky|4 years ago|reply
AFAIK antivirus systems do (did?) not scan RAM, only persistent storage. So decrypting/decoding the malware in memory and jumping into the code should avoid detection. That's how "runtime crypters" work, or used to work a few years ago.
[+] bencollier49|4 years ago|reply
I'm rather looking forward to / dreading the spectacle of people poisoning the Copilot training set with obfuscated malware...
[+] theshadowknows|4 years ago|reply
I was thinking the same thing. Would it be feasible to hide dozens or even hundreds of small pieces of code that, when deployed enough times, operate together as one large piece of malware? Say a few lines of code here that cause some memory to leak in a server context, a few over there that cause the server to occasionally reach out to some public destination, and a few others that, should they happen to be reached out to, know the origin is poisoned and set some sort of attack in motion… it would be the opposite of a highly targeted attack. But if we have tools that offer up code snippets, then surely we're going to have people blindly using them just through inexperience. So it seems plausible at least that it could act as an attack vector.