Module 11 Lesson 2: Hacking ML Libraries

Vulnerabilities in the engine. Learn about common CVEs and security flaws in core machine learning libraries like PyTorch, TensorFlow, and NumPy.

Module 11 Lesson 2: Vulnerabilities in ML libraries (PyTorch, TensorFlow)

Underneath the "intelligence" of AI is a massive amount of C++ and Python code, and that code is subject to traditional software bugs like buffer overflows and memory corruption.

1. Why ML Libraries are Targets

Libraries like PyTorch and TensorFlow drive specialized hardware (GPUs/TPUs). This requires kernel-level drivers and very complex native math libraries (CUDA and companions like cuBLAS and cuDNN).

  • Complexity = Vulnerabilities.
  • The attack: a malicious model file contains a "malformed operator." When the framework tries to execute it, the crafted input triggers a crash or memory corruption deep inside native code or the GPU driver stack (a defensive check is sketched below).
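One practical defense is to statically validate an untrusted model before any native kernel runs. Below is a minimal sketch using the onnx package's built-in checker; it assumes the model arrives in the ONNX interchange format, and suspect_model.onnx is a hypothetical placeholder path.

```python
import onnx

# Load the untrusted graph WITHOUT executing any operators.
# "suspect_model.onnx" is a hypothetical placeholder path.
model = onnx.load("suspect_model.onnx")

# check_model validates the graph structure and operator schemas.
# A malformed operator raises onnx.checker.ValidationError here,
# in Python, instead of crashing native GPU code at inference time.
onnx.checker.check_model(model)

# Inspect which operator types the graph actually uses.
print(sorted({node.op_type for node in model.graph.node}))
```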

2. Integer Overflows in NumPy

NumPy is the "math engine" for almost all Python AI.

  • The attack: an attacker supplies a huge image or tensor whose dimensions multiply to a number larger than the allocator's size type can hold.
  • The flaw: when NumPy calculates how many bytes to allocate, the multiplication "wraps around" toward zero. The library allocates a tiny buffer, then writes data into memory it doesn't own (the arithmetic is shown below).
  • The result: a crash or, in the worst case, remote code execution.
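A minimal sketch of the wrap-around arithmetic behind this class of bug. Modern NumPy uses 64-bit sizes and overflow checks, so this illustrates the historical failure mode rather than a live exploit; the bitmask emulates a 32-bit size field.

```python
# Simulate the 32-bit size computation behind historical overflow CVEs.
rows, cols, itemsize = 65_536, 65_536, 1    # a 65536x65536 uint8 "image"
true_size = rows * cols * itemsize          # 4_294_967_296 bytes needed

MASK32 = 0xFFFFFFFF                         # emulate a 32-bit size_t
wrapped_size = true_size & MASK32           # what a 32-bit field stores

print(f"bytes actually needed:   {true_size:,}")     # 4,294,967,296
print(f"bytes the allocator sees: {wrapped_size:,}") # 0
# A zero-byte (or tiny) buffer is allocated, and then ~4 GiB of pixel
# data is written into it: a classic heap overflow.
```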

3. Deserialization Vulnerabilities

This is the most common class of bug in AI frameworks. To save a model, the library must serialize ("pickle") the Python objects that make it up.

  • The flaw: Python's pickle format allows code execution on load, by design.
  • If you run torch.load('malicious_model.pt') on an untrusted file, the model file can carry a command to wipe your disk or start a reverse shell (a harmless demonstration follows below).
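Here is a minimal, harmless sketch of the mechanism using plain pickle, which legacy .pt files rely on under the hood; the payload just echoes a message instead of doing damage.

```python
import os
import pickle

class Malicious:
    # __reduce__ tells pickle how to "reconstruct" this object.
    # An attacker returns any callable plus its arguments, and
    # pickle.loads invokes it during deserialization.
    def __reduce__(self):
        return (os.system, ("echo CODE RAN DURING UNPICKLING",))

payload = pickle.dumps(Malicious())

# The victim only "loads a model", yet the command executes:
pickle.loads(payload)
```

Newer PyTorch releases mitigate this with torch.load(..., weights_only=True), which refuses to unpickle anything beyond tensor data.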

4. Mitigations for the Tech Stack

  1. Software Composition Analysis (SCA): use tools like Snyk or GitHub Dependabot to scan your AI repo for dependencies with known CVEs.
  2. Safetensors: use the safetensors format instead of .pt or .h5. Safetensors is a format designed by Hugging Face specifically to be zero-execute on load: it contains only tensor data, no code (see the sketch after this list).
  3. Namespace isolation: run your model-loading logic in a dedicated, sandboxed container with no network access.
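A minimal sketch of the safetensors round trip with PyTorch tensors; the file name and the tensor key are illustrative placeholders.

```python
import torch
from safetensors.torch import save_file, load_file

# Save: the file stores raw tensor bytes plus a small JSON header.
weights = {"linear.weight": torch.randn(4, 4)}
save_file(weights, "model.safetensors")

# Load: a pure data read. There is no pickle machinery, so nothing
# attacker-controlled can execute during loading.
restored = load_file("model.safetensors")
print(restored["linear.weight"].shape)  # torch.Size([4, 4])
```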

Exercise: The Library Auditor

  1. What is the difference between a "Dependency" and a "Transitive Dependency" in an AI project?
  2. Why is the .safetensors format considered more secure than the .pth (PyTorch) format?
  3. If you find a CVE in PyTorch, do you need to rewrite your model, or just update the library?
  4. Research: What is "CVE-2022-42902" and how did it affect the security of deep learning models?

Summary

The foundation of AI is built on software. If you don't keep your libraries patched and use safe serialization formats for storage, you are leaving your front door unlocked.

Next Lesson: Stealing the brain: Model weights exfiltration and protection.
