Module 14 Lesson 2: Hardware Selection for Production
Buying for the future. A guide to RAM, VRAM, and processing power for high-uptime AI applications.
Hardware Selection: The Pro Buying Guide
If you are graduating from hobbyist to professional, it is time to stop using gaming laptops and start building or buying dedicated AI hardware.
Here is the hierarchy of hardware performance in 2025.
1. The VRAM Requirement (The Hard Filter)
You can't "optimize" your way out of a VRAM shortage: the model's weights plus its KV cache either fit in GPU memory or they don't. (A rough sizing sketch follows the tiers below.)
- 8GB VRAM: Only for small (7B/8B) models at Q4 quantization.
- 16GB VRAM: The "prosumer" sweet spot. Runs any 8B model at high-quality quantization with headroom left over for context.
- 24GB VRAM (RTX 3090/4090): The "Gold Standard" for local development. Can run 30B models comfortably.
- 48GB+ (Dual 3090/A6000): Necessary for 70B models.
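As a rule of thumb, a model fits a VRAM tier when its quantized weights (parameters × bits-per-weight ÷ 8) plus a KV-cache allowance for your context length come in under the card's memory. Below is a minimal sizing sketch of that estimate; the bit-widths, layer counts, and 10% overhead factor are illustrative assumptions, not exact figures for any specific model.

```python
# Rough VRAM estimate: weights + KV cache + runtime overhead.
# All figures are rule-of-thumb approximations, not exact per-model numbers.

GB = 1024**3

def weight_bytes(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return params_billion * 1e9 * bits_per_weight / 8

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache size: K and V stored per layer, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * context_tokens

def fits(vram_gb: float, params_billion: float, bits: float,
         n_layers: int, n_kv_heads: int, head_dim: int, context: int) -> bool:
    needed = weight_bytes(params_billion, bits) + \
             kv_cache_bytes(n_layers, n_kv_heads, head_dim, context)
    needed *= 1.1  # ~10% headroom for activations and runtime buffers (assumed)
    print(f"{params_billion}B @ {bits}-bit, {context} ctx: "
          f"{needed / GB:.1f} GB needed vs {vram_gb} GB available")
    return needed <= vram_gb * GB

# Example: an 8B Llama-style model (32 layers, 8 KV heads, 128 head dim)
# at ~4.5 effective bits (Q4-class) with an 8k context on a 16 GB card.
fits(16, 8, 4.5, n_layers=32, n_kv_heads=8, head_dim=128, context=8192)
```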
2. PC (NVIDIA) vs. Mac (Apple Silicon)
This is the biggest debate in local AI.
NVIDIA (The King of Speed)
- Best for: Real-time apps, high tokens-per-second, fine-tuning.
- Why: CUDA is the de facto standard for AI tooling, and NVIDIA GPUs deliver significantly higher tokens-per-second than anything else for models that fit in VRAM.
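If you want to verify what your CUDA stack actually sees before buying more hardware, PyTorch can report the detected device and its VRAM. This assumes a CUDA-enabled PyTorch build is installed; it is a quick check, not a benchmark.

```python
import torch

# Report the CUDA device PyTorch can see, if any (requires a CUDA build of PyTorch).
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}")
    print(f"VRAM: {props.total_memory / 1024**3:.1f} GB")
    print(f"Compute capability: {props.major}.{props.minor}")
else:
    print("No CUDA device detected; inference will fall back to CPU.")
```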
Apple Silicon (The King of Size)
- Best for: Large models (70B), high context windows, developers who want ease of use.
- Why: unified memory. Buy a Mac Studio with 128GB of RAM and roughly 90GB of it can be allocated to the GPU, which lets a single Mac run giant models that would otherwise require roughly $15,000 worth of NVIDIA GPUs.
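The speed-versus-size split largely comes down to memory bandwidth: during token generation the hardware streams the active weights once per token, so tokens-per-second is bounded by bandwidth divided by model size. The sketch below uses approximate peak-bandwidth and capacity figures (real-world throughput is noticeably lower) to show why the 4090 wins on anything that fits in 24GB, while the Mac keeps running models the 4090 cannot hold at all.

```python
# Back-of-the-envelope decode speed: tokens/s upper bound ≈ bandwidth / model size.
# Bandwidth and capacity figures are approximate peak specs, not measured results.

hardware = {
    # name: (approx. peak memory bandwidth in GB/s, usable memory in GB)
    "RTX 4090": (1008, 24),
    "Mac Studio M2 Ultra": (800, 96),          # roughly 3/4 of 128 GB usable by the GPU
    "CPU + dual-channel DDR5-6000": (96, 64),  # system RAM for spilled layers
}

for name, (bandwidth, capacity) in hardware.items():
    for model_gb in (5, 20, 40):  # ~8B Q4, ~30B Q4, ~70B Q4 (approximate sizes)
        if model_gb > capacity:
            print(f"{name}: a {model_gb} GB model does not fit")
        else:
            print(f"{name}: ~{bandwidth / model_gb:.0f} tok/s upper bound "
                  f"on a {model_gb} GB model")
```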
3. CPU and RAM: Don't Forget the Foundation
If the model doesn't fit in VRAM, the overflow layers run on the CPU out of system RAM.
- CPU: Favor strong per-core performance and plenty of memory channels: AMD Threadripper or a high-end Intel Core i9.
- RAM Speed: This is the hidden bottleneck. Token generation on the CPU is memory-bandwidth bound, so buy the fastest DDR5 your motherboard supports. 64GB of DDR5-6000 will outrun 128GB of slow DDR4.
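The RAM-speed claim is easy to quantify: peak bandwidth is roughly transfer rate × 8 bytes per transfer × number of channels, and CPU-side token generation scales with that bandwidth rather than with capacity. A quick comparison, assuming dual-channel configurations:

```python
# Peak RAM bandwidth ≈ transfers/s × 8 bytes per transfer × channels.
# Capacity determines what fits; bandwidth determines how fast it runs.

def peak_bandwidth_gb_s(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9

configs = {
    "64 GB DDR5-6000 (dual channel)": 6000,
    "128 GB DDR4-3200 (dual channel)": 3200,
}

for name, speed in configs.items():
    print(f"{name}: ~{peak_bandwidth_gb_s(speed):.0f} GB/s peak")

# DDR5-6000 ≈ 96 GB/s vs DDR4-3200 ≈ 51 GB/s: nearly double the
# throughput for any layers that spill out of VRAM.
```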
4. Storage
Models are big.
- Format: NVMe M.2 SSDs (PCIe Gen 4 or Gen 5).
- Size: 2TB minimum.
- Why: You want models to load into RAM and VRAM in seconds, not minutes. A slow SATA SSD makes every cold start and model swap feel painful.
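A rough way to see how much the storage tier matters is to divide model size by sequential read speed. The drive speeds below are typical vendor figures, so treat the results as order-of-magnitude estimates for cold-start load time.

```python
# Cold-start load time ≈ model size / sequential read speed.
# Read speeds are typical vendor figures, approximate.

model_sizes_gb = {"8B Q4": 5, "70B Q4": 40}
drives_gb_s = {
    "SATA SSD": 0.55,
    "NVMe PCIe Gen 4": 7.0,
    "NVMe PCIe Gen 5": 12.0,
}

for model, size in model_sizes_gb.items():
    for drive, speed in drives_gb_s.items():
        print(f"{model} ({size} GB) from {drive}: ~{size / speed:.0f} s")
```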
5. Summary Recommendations
- Individual Dev: M3 Max MacBook Pro (36GB+ RAM) or PC with RTX 4070 Ti Super (16GB VRAM).
- Small Office Server: Mac Studio M2 Ultra (128GB RAM).
- Fine-tuning Rig: PC with dual RTX 3090s (Used market) or a single RTX 6000 Ada.
Key Takeaways
- VRAM (or unified memory on a Mac) is the spec that truly determines which models you can run.
- NVIDIA is faster; Apple is bigger (per dollar of memory).
- DDR5 RAM speed is critical when models spill over from the GPU.
- Always over-provision your SSD storage for future model growth.