✨ NEW · A100 + A40 inference GPUs now available in Amsterdam — read more →

GPU SERVERS

GPU power for the AI workloads you actually run

A100, A40, A10 sized for inference, fine-tuning, and small-team training. Predictable monthly pricing, deploy in minutes, in the jurisdiction you choose.

A100 80GB · A40 48GB · A10 24GB · zero hourly billing

Use cases

Match the GPU to the workload

Don’t pay for H100 capacity you won’t use. Most production AI runs comfortably on A100, A40, or A10.

  • LLM inference: A100 80GB runs Llama 3.1 70B-Q4 at ~30 tok/s. A40 48GB handles 13B–34B comfortably. A10 24GB is the price-performance choice for 7B–13B.
  • Fine-tuning + LoRA: 2× A100 NVLink for 70B fine-tunes; single A100 80GB for LoRA on 13B–34B. Persistent storage, no per-GB egress fees.
  • CV + embeddings: A40 48GB for batch CV pipelines (Detectron2, SAM, Stable Diffusion XL). A10 for embedding services and ranking models in production.
  • Render + simulation: Multiple A40s in parallel for Blender / Houdini / Octane. RT cores, NVENC, real-time ray tracing — same hardware, hours not minutes.
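The sizing guidance above is mostly VRAM arithmetic. A rough sketch of that arithmetic (the 20% runtime-overhead factor is an illustrative assumption, not a Netrouting figure):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM footprint for serving a quantized model.

    Weights take params * bits/8 bytes; `overhead` (assumed ~20%)
    covers KV cache, activations, and runtime buffers.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# 70B at 4-bit: ~35 GB of weights, ~42 GB with overhead,
# comfortably inside a single A100 80GB.
print(round(estimate_vram_gb(70, 4), 1))   # 42.0
# 13B at FP16: ~31 GB, i.e. A40 48GB territory.
print(round(estimate_vram_gb(13, 16), 1))  # 31.2
```

The same back-of-envelope math explains why 7B–13B models quantized to 4–8 bits sit well within an A10's 24 GB.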

Configurations

Pick a configuration, deploy in minutes

All plans include unmetered 10G network, free DDoS protection, and zero hourly billing — just predictable monthly rates.

Custom quote

ACCELERATOR

ACCELERATOR-A10-EU

Accelerator A10

AMD EPYC 7702p

  • CPU: 1× 64 cores / 128 threads
  • RAM: 256 GB, up to 1024 GB
  • Storage: 2× 4 TB NVMe
  • Network: 2× 10GbE
  • Bandwidth: 100 TB
  • DDoS Shield
  • NR Guarantee
  • Amsterdam
Custom quote

ACCELERATOR

ACCELERATOR-A100-EU

Accelerator A100

AMD EPYC 7702p

  • CPU: 1× 64 cores / 128 threads
  • RAM: 512 GB, up to 1024 GB
  • Storage: 2× 4 TB NVMe
  • Network: 2× 10GbE
  • Bandwidth: 100 TB
  • DDoS Shield
  • NR Guarantee
  • Amsterdam

Side-by-side comparison

The lineup

A100, A40, A10 — purpose-built for production AI

We don’t stock H100s. They’re supply-constrained, expensive, and overkill for what most teams actually run. Our bet instead: A-series GPUs at scale.

  • A100 80GB SXM4: Training + 70B inference. 80 GB HBM2e, 6,912 CUDA cores, 432 Tensor cores, 600 GB/s NVLink. Single or 2-way NVLink configs.
  • A40 48GB: Inference workhorse + render. 48 GB GDDR6 ECC, 10,752 CUDA cores, 336 Tensor cores, 84 RT cores. Best price/perf for 13B–34B inference and CV pipelines.
  • A10 24GB: Cost-effective inference. 24 GB GDDR6, 9,216 CUDA cores. Right-sized for 7B–13B serving, embedding services, and rerankers — without paying A100 prices.

Included with every plan

No surprises in the bill or the install

Every GPU plan ships with the things teams usually find out are missing after they’ve deployed.

  • Free always-on DDoS: Edge scrubbing on every plan, no extra cost, no opt-in. We’ve absorbed 600+ Gbps attacks without an inference endpoint dropping a single request.
  • Unmetered 10G network: No egress fees, no per-GB surprises. Pull a 70B model from Hugging Face once, serve a million inference requests — same monthly bill.
  • 24/7 monitoring + engineers: Real network engineers respond, not a chatbot. First response inside 5 minutes for incidents, whether it’s a workday afternoon or 03:00 in any timezone.
  • CUDA-ready images: Ubuntu 24.04 with current NVIDIA drivers + CUDA toolkit pre-installed. Automated reinstall puts a fresh image down in minutes if you wedge a kernel.
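The “no egress fees” point is easy to put numbers on. A sketch of what metered egress costs elsewhere (the $0.09/GB rate is an illustrative hyperscaler-style assumption, not a quote from any specific provider):

```python
def egress_cost_usd(gb_out: float, usd_per_gb: float = 0.09) -> float:
    """Bill for `gb_out` gigabytes of metered egress.

    The $0.09/GB default is an illustrative assumption for a
    metered cloud, not a published rate.
    """
    return gb_out * usd_per_gb

# A million 2 KB inference responses is only ~2 GB out, but shipping
# 10 TB of batch results off a metered cloud costs real money:
print(round(egress_cost_usd(2), 2))       # 0.18
print(round(egress_cost_usd(10_000), 2))  # 900.0
```

On an unmetered port both scenarios land on the same flat monthly bill.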

Why Netrouting

Predictable, fast, jurisdiction-flexible

  • Predictable monthly pricing: No hourly traps. No egress fees. Run a model 24/7 and the bill is the same number every month — no surprises when training overruns or traffic spikes.
  • Fast deploy, real engineers: Most GPU plans deploy same day. Stuck on CUDA, drivers, or a kernel panic? Real engineers respond — not a support bot.
  • Pick your jurisdiction: Host in Amsterdam, Frankfurt, or Stockholm under our Netherlands B.V. — or in Miami or New York under our Florida Inc. Your model weights and customer data stay under the law you choose, not the one we default to.

Common questions

DDoS & network FAQ

  • Which types of attacks does Netrouting DDoS Protection block automatically?

    We mitigate the full range of network-layer attacks: volumetric floods (UDP, ICMP, and TCP floods, plus reflection/amplification attacks against DNS, NTP, memcached, and similar), protocol-layer attacks (SYN floods, ACK floods, fragmentation and malformed-packet attacks), and multi-vector attacks that combine several techniques at once.

  • What happens if a DDoS attack exceeds our mitigation capacity?

    If attack volume exceeds what we can absorb upstream, we blackhole the targeted IP — dropping all traffic to it, malicious and legitimate, until the attack subsides. We notify the account owner when this happens. For uptime-critical workloads, we recommend pairing this with redundant IPs or Anycast where applicable.

  • Do I need to enable Netrouting DDoS Protection?

    No. It’s enabled by default on all applicable resources — Cloud Compute, Colocation, Bare Metal, and GPU — and active the moment your resource goes live. There’s nothing to configure.

  • Does Netrouting DDoS Protection cover application-layer (Layer 7) attacks?

    No. This service mitigates network-layer attacks only (Layers 3 and 4). Application-layer threats — HTTP floods, slowloris, API abuse, credential stuffing, malicious bots — require a Web Application Firewall (WAF) or an edge service such as Cloudflare or Akamai in front of your application.

  • What is Netrouting DDoS Protection?

    A free, always-on service that automatically protects your Netrouting infrastructure—Cloud Compute, Colocation, Bare Metal, and GPU—from network-layer (L3/L4) DDoS attacks. No configuration required.

  • How do I test the network speed from my location?

    Each city page lists its public test IP and 100 MB / 1 GB / 10 GB speed test files. Use them with curl, wget, or a browser-based speed-test tool to measure latency and throughput from your location. Our looking glass also exposes BGP and traceroute data on demand.
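To turn a timed pull of one of those test files into a throughput figure, a minimal sketch (the URL below is a placeholder; substitute the test IP and file size listed on your city page):

```python
import time
import urllib.request

def to_mbps(nbytes: int, seconds: float) -> float:
    """Convert a byte count and wall-clock time into megabits per second."""
    return nbytes * 8 / seconds / 1e6

def download_mbps(url: str) -> float:
    """Time a full download of `url` and report observed throughput in Mbps."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as resp:
        nbytes = len(resp.read())
    return to_mbps(nbytes, time.monotonic() - start)

# e.g. download_mbps("http://<city-test-ip>/100MB.bin")
# Sanity check of the math: 100 MB pulled in 8 s is 100 Mbps.
print(round(to_mbps(100_000_000, 8.0), 1))  # 100.0
```

The same measurement works from the shell with `curl -o /dev/null` against the test file; run it a few times to average out transient latency.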

  • Is the bandwidth dedicated or shared?

    Netrouting provides dedicated uplink ports for consistent performance, avoiding the contention and throughput variability common in shared-uplink setups.