H100-SXM GPU Instances

Accelerate your AI roadmap with infrastructure built for faster multi-GPU workloads.

Cut training time

Reduce the communication bottlenecks that slow down large-scale AI. NVIDIA NVLink interconnect technology speeds up GPU-to-GPU data transfers, so your models spend less time waiting and more time training.

Serve models up to 30x faster

Reduce latency for large-scale generative AI. The built-in Transformer Engine adjusts numerical precision (FP8) on the fly to speed up inference at scale.

Maximize hardware utilization

Use Multi-Instance GPU (MIG) to partition a single GPU into isolated instances. Right-size your compute for smaller workloads and get the most out of every GPU's energy budget.

Exceptional performance for every AI workload

H100-SXM GPU Instances are designed for raw performance and are ideal for tasks where speed matters most. They handle the most intensive workloads, such as training foundation models and scientific computing.

Automatic speech recognition: Convert spoken language into written text, turning verbal communication into machine-readable data.

Generative AI: Generate new content such as images, text, audio, and code.

Enterprise-grade inference: Serve billion-parameter models in real time. The H100's Transformer Engine automatically optimizes precision (FP8) to deliver up to 30x faster inference for generative AI, keeping latency low for your users.
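To see why FP8 precision matters when serving billion-parameter models, here is a minimal back-of-the-envelope sketch. It estimates the memory taken by model weights alone at different precisions and compares it with the 80 GB of VRAM per GPU listed in the specifications. The 70-billion-parameter model size is a hypothetical example, and the estimate ignores the KV cache, activations, and framework overhead, so real deployments need extra headroom.

```python
# Rough weight-memory estimate. Illustrative only: it ignores the KV cache,
# activations, and runtime overhead, which all need additional memory.

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

H100_VRAM_GB = 80  # per-GPU memory, taken from the specifications


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits_on_one_gpu(num_params: float, precision: str,
                    vram_gb: float = H100_VRAM_GB) -> bool:
    """True if the weights alone fit in a single GPU's memory."""
    return weight_memory_gb(num_params, precision) <= vram_gb


if __name__ == "__main__":
    hypothetical_params = 70e9  # assumed model size, for illustration
    for precision in ("fp16", "fp8"):
        gb = weight_memory_gb(hypothetical_params, precision)
        verdict = "fits" if fits_on_one_gpu(hypothetical_params, precision) else "does not fit"
        print(f"{precision}: {gb:.0f} GB of weights, {verdict} in {H100_VRAM_GB} GB")
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which can be the difference between needing one GPU and needing several.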

Specifications

GPU: NVIDIA H100-SXM Tensor Core
Architecture: NVIDIA Hopper (2022)
VRAM: 80 GB HBM3 per GPU (3.35 TB/s)
CPU: 32-128 vCPUs, Xeon Platinum 8452Y
Processor frequency: 2.65 GHz
RAM: 240-960 GB
RAM type: DDR5
Network bandwidth: Up to 20 Gbps
Storage: Block Storage and Scratch local NVMe
GPU performance: 4th-generation Tensor Cores, 3rd-generation RT Cores
SLA: 99.5%

Get started with H100-SXM GPUs today

100% renewable energy, up to 30% less power

DC5 (PAR2) is one of Europe's greenest data centers. It is powered entirely by renewable wind and hydro energy (GO-certified) and cooled with highly efficient free and adiabatic cooling. With a PUE of 1.16 (vs. the 1.55 industry average), it cuts energy use by up to 30% compared to traditional data centers.
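As a rough illustration of what the PUE figures mean, the sketch below uses the standard definition of Power Usage Effectiveness (total facility energy divided by IT equipment energy) to compare two data centers running the same IT load. The 1,000 kWh workload is a hypothetical value chosen for the example. Under this simple model, the PUE gap alone accounts for roughly a quarter less total energy; the "up to 30%" figure presumably reflects factors this sketch does not model.

```python
# PUE = total facility energy / IT equipment energy.
# For a fixed IT load, total facility energy scales linearly with PUE.

DC5_PUE = 1.16               # figure quoted for DC5 (PAR2)
INDUSTRY_AVERAGE_PUE = 1.55  # industry average quoted alongside it


def facility_energy_kwh(it_energy_kwh: float, pue: float) -> float:
    """Total energy drawn by the facility to deliver a given IT load."""
    return it_energy_kwh * pue


def relative_saving(pue: float, baseline_pue: float) -> float:
    """Fraction of total energy saved versus a baseline PUE, same IT load."""
    return 1 - pue / baseline_pue


if __name__ == "__main__":
    it_load_kwh = 1_000  # hypothetical IT energy used by a training job
    print(f"DC5:      {facility_energy_kwh(it_load_kwh, DC5_PUE):.0f} kWh")
    print(f"Baseline: {facility_energy_kwh(it_load_kwh, INDUSTRY_AVERAGE_PUE):.0f} kWh")
    print(f"Saving:   {relative_saving(DC5_PUE, INDUSTRY_AVERAGE_PUE):.1%}")
```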

Looking for more power? Discover our full range.

B300-SXM: Push the boundaries of performance with NVIDIA's Blackwell architecture.

Managed Inference: Deploy AI models in a dedicated inference infrastructure, with tailored security and predictable throughput.

Generative API: Consume AI models instantly via a simple API call, all hosted in Europe.

Choose the cloud built for what's next

Customer data sovereignty

Dependency is the enemy of resilience. Customers want their data hosted by a regional provider. Gain sovereignty with our multi-cloud tools & infrastructure.

Sustainable data centers

We recycle our hardware, only use renewable energy and pay close attention to our water usage. Also, our Power Usage Effectiveness (PUE) is displayed online 24/7 for you to see for yourself.

Low latency

Every complete cloud ecosystem needs 100% reliability, which is why we provide ten Availability Zones in four different regions.

Frequently asked questions

How quickly can I access H100-SXM cloud GPU Instances?

Getting started with our cloud GPU as a service is simple: create an account on the Scaleway console, ensure you have Owner status or the correct IAM permissions, and launch your H100-SXM cluster in just a few clicks.

What is NVLink?

NVLink is a high-speed interconnect technology developed by NVIDIA that allows for faster data transfer between GPUs and between GPUs and CPUs.
It's designed to significantly increase the bandwidth and reduce the latency of data transfers compared to traditional PCIe (Peripheral Component Interconnect Express) connections. This is particularly beneficial in high-performance computing (HPC) and data center environments where multiple GPUs are used in parallel to accelerate computations.
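To give a sense of scale for the bandwidth difference, here is a minimal sketch that estimates the ideal time to move a buffer between GPUs over each link. The bandwidth constants are NVIDIA's published peak figures for the H100 SXM (900 GB/s aggregate NVLink bandwidth per GPU) and for a PCIe Gen5 x16 link (128 GB/s bidirectional); they do not come from this page, and real-world throughput is lower than peak. The 10 GB buffer is a hypothetical example.

```python
# Ideal transfer time = data size / link bandwidth.
# Peak figures (assumptions, taken from NVIDIA's published specifications):
NVLINK_GB_PER_S = 900.0         # aggregate NVLink bandwidth per H100 SXM GPU
PCIE_GEN5_X16_GB_PER_S = 128.0  # bidirectional PCIe Gen5 x16 bandwidth


def transfer_time_ms(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Best-case time in milliseconds to move size_gb over the link."""
    return size_gb / bandwidth_gb_per_s * 1000


def speedup(fast_gb_per_s: float, slow_gb_per_s: float) -> float:
    """How many times faster the first link is at peak bandwidth."""
    return fast_gb_per_s / slow_gb_per_s


if __name__ == "__main__":
    buffer_gb = 10.0  # hypothetical gradient buffer exchanged between GPUs
    nvlink_ms = transfer_time_ms(buffer_gb, NVLINK_GB_PER_S)
    pcie_ms = transfer_time_ms(buffer_gb, PCIE_GEN5_X16_GB_PER_S)
    ratio = speedup(NVLINK_GB_PER_S, PCIE_GEN5_X16_GB_PER_S)
    print(f"NVLink: {nvlink_ms:.1f} ms")
    print(f"PCIe:   {pcie_ms:.1f} ms")
    print(f"NVLink is about {ratio:.1f}x faster at peak")
```

In multi-GPU training this exchange happens at every step, so the per-transfer gap compounds over the length of a job.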

How am I billed for my GPU cloud computing consumption?

We offer a fully transparent pay-as-you-go model, billed by the minute. This GPU rental model gives you the flexibility to provision large amounts of compute for training your models and to delete the resources as soon as your job is done.
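As an illustration of per-minute billing, the sketch below estimates the cost of a job from its runtime. The hourly price is a placeholder, not a real Scaleway price, and rounding partial minutes up to the next whole minute is an assumption; check the pricing page for actual rates and rounding rules.

```python
import math


def billed_cost_eur(runtime_seconds: float, hourly_price_eur: float) -> float:
    """Estimate the cost of a job under per-minute billing.

    Assumes partial minutes are rounded up to the next whole minute.
    """
    billed_minutes = math.ceil(runtime_seconds / 60)
    return round(billed_minutes * hourly_price_eur / 60, 2)


if __name__ == "__main__":
    hypothetical_hourly_price = 3.00  # placeholder value, not a real price
    runtime = 90 * 60                 # a 90-minute fine-tuning job, in seconds
    cost = billed_cost_eur(runtime, hypothetical_hourly_price)
    print(f"Estimated cost: {cost:.2f} EUR")
```

Because billing stops when the resource is deleted, a job that finishes early costs proportionally less than a full hour would.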

Automatic speech recognition

Generative AI

Enterprise-grade inference