Skip to main content
  1. Posts/

Hardware for local coding models is still affordable. For how long?

·449 words·3 mins

The recent RAM price hikes have pushed GPU prices up as well. The only systems that have not yet been affected to the same extent are Macs and high-end GPUs (RTX 6000 Pro and above). However, I would already classify GPUs as out of reach: running coding models with sufficiently large context windows would require one or two RTX 6000 Pro cards, or three to six RTX 5090s.

Let’s take a look at Macs instead.

PriceMem BandwidthCPURAMNVMeCPUGPUModelPrice/BandwidthPrice/Memory
2.500 EUR273 GB/sM4 Pro64 GB1 TB12C16CMac Mini9,15 EUR/GB/s39,06 EUR/GB
4.027 EUR546 GB/sM4 Max128 GB1 TB16C40CMac Studio7,14 EUR/GB/s31,43 EUR/GB
4.200 EUR800 GB/sM3 Ultra96 GB1 TB28C60CMac Studio5,52 EUR/GB/s43,75 EUR/GB
6.720 EUR800 GB/sM3 Ultra256 GB2 TB28C60CMac Studio8,40 EUR/GB/s26,25 EUR/GB
11.900 EUR800 GB/sM3 Ultra512 GB4 TB32C80CMac Studio14,87 EUR/GB/s23,24 EUR/GB

Mac Studios with the M3 Ultra currently offer the best overall value. The 96 GB version is fast, and its RAM capacity is sufficient to run many general-purpose models using quantization. The 256 GB version can handle coding models at reasonable quantization levels with 64k context, which is just about sufficient for running coding agents 1. The 512 GB model, however, exceeds the €10k, which I consider a hard upper limit for what can reasonably be called an “affordable” system.

The price difference between the 96 GB and 256 GB Mac Studio models is surprisingly close to current RAM market prices. For comparison, 256 GB of DDR5-6000 RAM currently costs around €3,430, or €13.39 per GB. The price delta between the 96 GB and 256 GB Mac Studio, and between the 256 GB and 512 GB versions, is approximately €12.88 per GB and €18.50 per GB, respectively. If Apple were to return to its usual memory pricing premiums, the 256 GB configuration would likely end up much closer to €10,000.

Hosted inference is both cheaper and more capable, but it cannot be used in all scenarios. In particular, freelancers and small companies are often required to rely on local systems when working for larger clients. The M3 Ultra, launched in March 2025, together with improved models, quantization techniques, and REAP, has effectively kicked off an era of affordable local coding models. That era may now enter a pause: RAM prices are expected to rise through 2026 and are likely to remain elevated until 2027 or 2028 2.


  1. Based on benchmark results shared on r/localllama ↩︎

  2. Based on an analysis from December 2025 on wccftech ↩︎