LIVE H100, A100 & L40S capacity available in IN, US & EU

GPU cloud pricing

Transparent NVIDIA GPU pricing.

Per-hour billing. No platform fees. No egress fees. Pick the category your workload actually fits, and pay only when it's running.

NVIDIA L40S Instances

Instance Name | Compute unit | Model | AI Compute memory (GB) | Performance FP32 | Performance FP16 | vCPU | Instance memory (GB) | Peer to Peer Bandwidth (GB/s) | Network Bandwidth (GB/s) | Peak/Benchmark Memory Bandwidth (GB/s) | On Demand Price/hr | 1 Month Reserved Price/hr | 6 Month Reserved Price/hr | 12 Month Reserved Price/hr
1L40S.16v.256m | NVIDIA | 1xL40S (1X) | 48 | 91.6 | 733 | 16 | 256 | - | 200 | 864 | ₹ 124 | ₹ 74 (40% Discount) | ₹ 67.5 (45% Discount) | ₹ 61 (50% Discount)
2L40S.32v.512m | NVIDIA | 2xL40S (2X) | 96 | 183.2 | 1466 | 32 | 512 | 64 | 400 | 864 | ₹ 245 | ₹ 145 (40.98% Discount) | ₹ 130.95 (46.55% Discount) | ₹ 118 (52% Discount)
4L40S.64v.1024m | NVIDIA | 4xL40S (4X) | 192 | 366.4 | 2932 | 64 | 768 | 128 | 800 | 864 | ₹ 485 | ₹ 286 (41.01% Discount) | ₹ 259.2 (46.58% Discount) | ₹ 233 (52.02% Discount)
8L40S.64v.2048m | NVIDIA | 8xL40S (8X) | 1536 | 1304 | 10456 | 64 | 1536 | 3600 | 3200 | 580 | ₹ 960 | ₹ 566 | ₹ 513 | ₹ 461

AMD MI300X Instances

Instance Name | Compute unit | Model | AI Compute memory (GB) | Performance FP32 | Performance FP16 | vCPU | Instance memory (GB) | Peer to Peer Bandwidth (GB/s) | Network Bandwidth (GB/s) | Peak/Benchmark Memory Bandwidth (GB/s) | On Demand Price/hr | 1 Month Reserved Price/hr | 6 Month Reserved Price/hr | 12 Month Reserved Price/hr
1MI300.16v.256m | AMD | 1xMI300X (1X) | 192 | 163 | 1307 | 16 | 256 | - | 400 | 580 | ₹ 274 | ₹ 219 (20.08% Discount) | ₹ 197 (28.11% Discount) | ₹ 164 (40.16% Discount)
2MI300.32v.512m | AMD | 2xMI300X (2X) | 384 | 326 | 2614 | 32 | 512 | 900 | 800 | 580 | ₹ 542 | ₹ 429 (20.89% Discount) | ₹ 382 (29.56% Discount) | ₹ 315 (41.98% Discount)
4MI300.64v.1024m | AMD | 4xMI300X (4X) | 768 | 652 | 5228 | 64 | 768 | 1800 | 1600 | 580 | ₹ 1074 | ₹ 849 (20.90% Discount) | ₹ 756 (29.57% Discount) | ₹ 623 (41.99% Discount)
8MI300.128v.2048m | AMD | 8xMI300X (8X) | 1536 | 1304 | 10456 | 128 | 1536 | 3600 | 3200 | 580 | ₹ 2125 | ₹ 1681 (20.91% Discount) | ₹ 1496 (29.59% Discount) | ₹ 1233 (42.02% Discount)

NVIDIA H100 SXM Instances

Instance Name | Compute unit | Model | AI Compute memory (GB) | Performance FP32 | Performance FP16 | vCPU | Instance memory (GB) | Peer to Peer Bandwidth (GB/s) | Network Bandwidth (GB/s) | Peak/Benchmark Memory Bandwidth (GB/s) | On Demand Price/hr | 1 Month Reserved Price/hr | 6 Month Reserved Price/hr | 12 Month Reserved Price/hr
1H100.16v.256m SXM | NVIDIA | 1xH100 SXM (1X) | 80 | 67 | 1979 | 16 | 256 | - | 200 | 2039 | ₹ 329 | ₹ 296 (10.03% Discount) | ₹ 263 (20.07% Discount) | ₹ 219 (33.44% Discount)
2H100.32v.512m SXM | NVIDIA | 2xH100 SXM (2X) | 160 | 134 | 3958 | 32 | 512 | 900 | 400 | 2039 | ₹ 651 | ₹ 580 (10.95% Discount) | ₹ 510 (21.68% Discount) | ₹ 420 (35.47% Discount)
4H100.64v.1024m SXM | NVIDIA | 4xH100 SXM (4X) | 320 | 268 | 7916 | 64 | 768 | 1800 | 800 | 2039 | ₹ 1289 | ₹ 1148 (10.95% Discount) | ₹ 1010 (21.69% Discount) | ₹ 832 (35.47% Discount)
8H100.128v.2048m SXM | NVIDIA | 8xH100 SXM (8X) | 640 | 536 | 15832 | 128 | 1536 | 3600 | 1600 | 2039 | ₹ 2552 | ₹ 2273 (10.96% Discount) | ₹ 1998 (21.71% Discount) | ₹ 1646 (35.49% Discount)

NVIDIA V100 Instances

Instance Name | Compute unit | Model | AI Compute memory (GB) | Performance FP32 | Performance FP16 | vCPU | Instance memory (GB) | Peer to Peer Bandwidth (GB/s) | Network Bandwidth (GB/s) | Peak/Benchmark Memory Bandwidth (GB/s) | On Demand Price/hr | 1 Month Reserved Price/hr | 6 Month Reserved Price/hr | 12 Month Reserved Price/hr
1V100.16v.256m | NVIDIA | 1xV100 (1X) | 32 | 15.7 | 125 | 16 | 256 | - | 100 | 900 | ₹ 54 | ₹ 48 (10.20% Discount) | ₹ 43 (20.41% Discount) | ₹ 39 (28.57% Discount)
2V100.32v.512m | NVIDIA | 2xV100 (2X) | 64 | 31.4 | 250 | 32 | 512 | 300 | 200 | 900 | ₹ 107 | ₹ 95 (11.11% Discount) | ₹ 83 (22.01% Discount) | ₹ 74 (30.71% Discount)
4V100.64v.1024m | NVIDIA | 4xV100 (4X) | 128 | 62.8 | 500 | 64 | 1024 | 600 | 400 | 900 | ₹ 211 | ₹ 188 (11.12% Discount) | ₹ 165 (22.03% Discount) | ₹ 146 (30.74% Discount)
8V100.128v.2048m | NVIDIA | 8xV100 (8X) | 256 | 125.6 | 1000 | 128 | 2048 | 1200 | 800 | 900 | ₹ 418 | ₹ 372 (11.13% Discount) | ₹ 326 (22.05% Discount) | ₹ 290 (30.78% Discount)
1xV100.32v.32m | NVIDIA | 1xV100 (1X) | 74 | 145 | 286 | 32 | 74 | 566 | 429 | 219 | ₹ 46 | ₹ 41 | ₹ 37 | ₹ 32
1V100.8v.64m | NVIDIA | 2xV100 (1X) | 1536 | 1304 | 10456 | 128 | 1536 | 3600 | 3200 | 580 | ₹ 45 | ₹ 41 | ₹ 33 | ₹ 23
16V100.64v.128m | NVIDIA | 4xV100 (4X) | 1536 | 1304 | 10456 | 128 | 1536 | 3600 | 3200 | 580 | ₹ 93 | ₹ 83 | ₹ 74 | ₹ 65
8V100.128v.2048m | NVIDIA | 8xV100 (8X) | 1536 | 1304 | 10456 | 128 | 1536 | 3600 | 3200 | 580 | ₹ 357 | ₹ 318 | ₹ 280 | ₹ 242

NVIDIA A100 Instances

Instance Name | Compute unit | Model | AI Compute memory (GB) | Performance FP32 | Performance FP16 | vCPU | Instance memory (GB) | Peer to Peer Bandwidth (GB/s) | Network Bandwidth (GB/s) | Peak/Benchmark Memory Bandwidth (GB/s) | On Demand Price/hr | 1 Month Reserved Price/hr | 6 Month Reserved Price/hr | 12 Month Reserved Price/hr
1xA100.16v.256m | NVIDIA | 1xA100 (1X) | 80 | 156 | 312 | 8 | 64 | - | 200 | 1555 | ₹ 198 | ₹ 196 (1.11% Discount) | ₹ 194 (2.22% Discount) | ₹ 187 (5.56% Discount)
2xA100.32v.512m | NVIDIA | 2xA100 (2X) | 160 | 312 | 624 | 16 | 128 | 600 | 400 | 1555 | ₹ 392 | ₹ 384 (1.11% Discount) | ₹ 376 (2.22% Discount) | ₹ 359 (5.56% Discount)
4xA100.64v.1024m | NVIDIA | 4xA100 (4X) | 320 | 624 | 1248 | 32 | 256 | 1200 | 800 | 1555 | ₹ 776 | ₹ 760 (2.11% Discount) | ₹ 743 (4.23% Discount) | ₹ 711 (8.44% Discount)
8xA100.128v.2048m | NVIDIA | 8xA100 (8X) | 640 | 1248 | 2496 | 64 | 512 | 2400 | 1600 | 1555 | ₹ 1536 | ₹ 1504 (2.14% Discount) | ₹ 1471 (4.23% Discount) | ₹ 1406 (8.49% Discount)

Intel Gaudi2 Instances

Instance Name | Compute unit | Model | AI Compute memory (GB) | Performance FP32 | Performance FP16 | vCPU | Instance memory (GB) | Peer to Peer Bandwidth (GB/s) | Network Bandwidth (GB/s) | Peak/Benchmark Memory Bandwidth (GB/s) | On Demand Price/hr | 1 Month Reserved Price/hr | 6 Month Reserved Price/hr | 12 Month Reserved Price/hr
1xGaudi2.16v.256m | Intel | 1XGaudi 2 (1X) | 96 | 60 | 180 | 19 | 288 | - | 200 | 2150 | ₹ 101 | ₹ 81 (19.57% Discount) | ₹ 69 (31.52% Discount) | ₹ 59 (41.30% Discount)
2xGaudi2.32v.512m | Intel | 2XGaudi 2 (2X) | 192 | 120 | 360 | 38 | 576 | 200 | 400 | 2150 | ₹ 200 | ₹ 160 (20.37% Discount) | ₹ 134 (32.91% Discount) | ₹ 114 (43.08% Discount)
4xGaudi2.64v.1024m | Intel | 4XGaudi 2 (4X) | 384 | 240 | 720 | 76 | 1152 | 400 | 800 | 2150 | ₹ 397 | ₹ 316 (20.42% Discount) | ₹ 266 (32.95% Discount) | ₹ 226 (43.12% Discount)
8xGaudi2.128v.2048m | Intel | 8XGaudi 2 (8X) | 768 | 480 | 1440 | 152 | 2304 | 800 | 1600 | 2150 | ₹ 785 | ₹ 625 (20.43% Discount) | ₹ 527 (32.96% Discount) | ₹ 447 (43.13% Discount)

AMD MI325X Instances

Instance Name | Compute unit | Model | AI Compute memory (GB) | Performance FP32 | Performance FP16 | vCPU | Instance memory (GB) | Peer to Peer Bandwidth (GB/s) | Network Bandwidth (GB/s) | Peak/Benchmark Memory Bandwidth (GB/s) | On Demand Price/hr | 1 Month Reserved Price/hr | 6 Month Reserved Price/hr | 12 Month Reserved Price/hr
1xMI325.16v.256m | AMD | 1xMI325X (1X) | 192 | 163 | 1307 | 16 | 256 | - | 400 | 580 | ₹ 298 | ₹ 217 (27.11% Discount) | ₹ 181 (39.38% Discount) | ₹ 150 (49.47% Discount)
2xMI325.32v.512m | AMD | 2xMI325X (2X) | 384 | 326 | 2614 | 32 | 512 | 900 | 800 | 580 | ₹ 590 | ₹ 425 (27.86% Discount) | ₹ 350 (40.60% Discount) | ₹ 289 (51.00% Discount)
4xMI325.64v.1024m | AMD | 4xMI325X (4X) | 768 | 652 | 5228 | 64 | 768 | 1800 | 1600 | 580 | ₹ 1167 | ₹ 842 (27.87% Discount) | ₹ 693 (40.62% Discount) | ₹ 572 (51.02% Discount)
8xMI325.128v.2048m | AMD | 8xMI325X (8X) | 1536 | 1304 | 10456 | 128 | 1536 | 3600 | 3200 | 580 | ₹ 2311 | ₹ 1667 (27.88% Discount) | ₹ 1372 (40.63% Discount) | ₹ 1132 (51.03% Discount)

Serverless Text Models

Model or size tier | Notes | Price per 1M tokens (input and output)
Up to 4B | Base Model | $0.085
4.1B - 8B | Base Model | $0.17
8.1B - 21B | Base Model | $0.255
21.1B - 41B | e.g. Mixtral 8x7B | $0.68
41.1B - 80B | Base Model | $0.765
80.1B - 110B | Base Model | $1.44
MoE 1B - 56B | e.g. Mixtral 8x7B | $0.425
MoE 56.1B - 176B | e.g. DBRX, Mixtral 8x22B | $0.96
Deepseek-v3 | Base Model | $0.72
Deepseek-r1 | Base Model | $6.40
DeepSeek LLM Chat 67B | Base Model | $0.765
Yi Large | Base Model | $2.55
LLAMA 3 70B | Base Model | $0.88
Meta Llama 3.1 405B | Base Model | $2.55
Mistral 7B | Base Model | $0.25

Note: Prices are per 1 million tokens, counting both input and output tokens, and apply across chat, multimodal, language, and code models. Use them to estimate cost from your expected token usage.
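As a rough illustration of the per-token pricing above, the sketch below estimates a bill from token counts. The function name `token_cost_usd` and the example token volumes are hypothetical; the only figure taken from this page is the $0.17 per 1M tokens rate for the 4.1B - 8B tier.

```python
def token_cost_usd(input_tokens: int, output_tokens: int, price_per_million: float) -> float:
    """Estimate cost in USD when input and output tokens share one per-1M-token rate."""
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1_000_000 * price_per_million


# Hypothetical workload: 3M input tokens and 1M output tokens
# on the 4.1B - 8B tier ($0.17 per 1M tokens, as listed on this page).
cost = token_cost_usd(input_tokens=3_000_000, output_tokens=1_000_000, price_per_million=0.17)
print(f"Estimated cost: ${cost:.2f}")
```

Because input and output tokens are billed at the same rate, only the total token count matters for the estimate.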

Image Models

Model | Price | Billing unit
All Non-Flux Models (SDXL, Playground, etc.) | $0.000104 | per step, per image
FLUX.1 [dev] | $0.000425 | per step, per image
FLUX.1 [schnell] | $0.0002975 | per step, per image
FLUX.1 Canny [dev] | $0.025 | per step, per image
FLUX.1 Depth [dev] | $0.025 | per step, per image
FLUX.1 Redux [dev] | $0.025 | per step, per image
Pixtral 12B | $0.12 | per 1M tokens

Note: For image generation models such as SDXL, pricing is based on the number of inference steps, that is, the denoising iterations used to create each image. All FLUX models share the same pricing structure, which is based on a standard number of processing steps. More steps can improve the quality and detail of generated images, so balance cost against the output quality you need.
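To make the step-based pricing concrete, here is a minimal sketch that assumes the listed price applies to each inference step of each image. The function name `image_generation_cost_usd`, the batch size, and the 30-step setting are illustrative assumptions; only the $0.000104 rate comes from this page.

```python
def image_generation_cost_usd(num_images: int, steps_per_image: int, price_per_step: float) -> float:
    """Estimate cost in USD, assuming the price is charged per step for every image."""
    return num_images * steps_per_image * price_per_step


# Hypothetical batch: 1,000 SDXL images at 30 denoising steps each,
# using the non-Flux rate of $0.000104 listed on this page.
cost = image_generation_cost_usd(num_images=1_000, steps_per_image=30, price_per_step=0.000104)
print(f"Estimated cost: ${cost:.2f}")
```

Under this assumption, halving the step count halves the bill, which is the cost and quality trade-off the note describes.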

Template Name | Master Node Count | Master Node Plan | Worker Node Count | Worker Node Plan | 1 Month Reserved Price | 12 Month Reserved Price
K8s-1 Master(4 vCPU, 16 GB), 1 Worker(4 vCPU, 16 GB) | 1 | 4v-16m | 1 | 4v-16m | ₹ 10700 | ₹ 115560 (10% Discount)
K8s-1 Master(4 vCPU, 16 GB), 3 Worker(4 vCPU, 16 GB) | 1 | 4v-16m | 3 | 4v-16m | ₹ 16900 | ₹ 182520 (10% Discount)
K8s-3 Master(4 vCPU, 16 GB), 2 Worker(4 vCPU, 16 GB) | 3 | 4v-16m | 2 | 4v-16m | ₹ 20000 | ₹ 216000 (10% Discount)
K8s-3 Master(4 vCPU, 16 GB), 3 Worker(4 vCPU, 16 GB) | 3 | 4v-16m | 3 | 4v-16m | ₹ 22800 | ₹ 246240 (10% Discount)
K8s-3 Master(4 vCPU, 16 GB), 5 Worker(4 vCPU, 16 GB) | 3 | 4v-16m | 5 | 4v-16m | ₹ 29300 | ₹ 316440 (10% Discount)

Speech-to-text Models

Model | Price | Billing unit
Whisper-v3-large | $0.001275 | per audio minute (billed per second)
Whisper-v3-large-turbo | $0.000765 | per audio minute (billed per second)
Streaming transcription service | $0.00256 | per audio minute (billed per second)

Note: For speech-to-text models, we bill on the duration of the audio input, metered per second at the per-minute rates listed above. This lets you manage costs based on the length of the audio you transcribe.
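The per-second metering can be sketched as follows. This is an illustration under stated assumptions: the function name `transcription_cost_usd` is hypothetical, and rounding partial seconds up to the next whole second is a guess, since the page does not say how fractions of a second are handled. The $0.001275 per audio minute rate is the Whisper-v3-large price listed on this page.

```python
import math


def transcription_cost_usd(audio_seconds: float, price_per_minute: float) -> float:
    """Estimate cost in USD for audio billed per second at a per-minute rate."""
    # Assumption: a partial second is billed as a full second.
    billable_seconds = math.ceil(audio_seconds)
    return billable_seconds / 60 * price_per_minute


# Hypothetical clip: 94.2 seconds of audio on Whisper-v3-large.
cost = transcription_cost_usd(audio_seconds=94.2, price_per_minute=0.001275)
print(f"Estimated cost: ${cost:.6f}")
```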

Embedding Models

Model size | Price per 1M input tokens
Up to 150M | $0.0064
150M - 350M | $0.0128

Note: Embedding model pricing is determined by the number of input tokens the model processes, so cost grows with the length of the text being embedded.

What's included

No hidden fees. No surprises on the invoice.

The price you see is the price you pay. Every plan includes the things most GPU clouds charge extra for.

Deploy in under 60 seconds

Pre-built CUDA, PyTorch, TensorFlow, and vLLM images. Pull your model and start training — no driver hunts, no kernel module rebuilds.

Hourly billing, no minimums

Spin up, train for 47 minutes, shut down — you're only billed for the time the GPU was actually running. No daily minimums, no monthly commitments unless you choose reserved.

Zero egress & ingress fees

Move datasets in, ship checkpoints out, mirror models to your registry. Outbound bandwidth doesn't appear on your invoice.

NVLink & InfiniBand by default

8×H100 SXM nodes ship with full NVLink and 3.2 Tb/s InfiniBand fabric. Multi-node training scales without rewriting your collectives.

SOC 2 & ISO-grade isolation

Single-tenant bare-metal options, per-tenant VPCs, encrypted-at-rest volumes, BYOK support, and DPDP-compliant data residency in India.

Real engineers on Slack

Enterprise plans get a shared Slack channel with the platform team. Reply times measured in minutes, not ticket queue positions.

Pick by workload

Which GPU fits what you're actually building?

A short, practical guide for the workloads we see most often from teams on Cyfuture AI. If your workload doesn't fit cleanly, message us and we'll help size it.

LLM fine-tuning & pre-training

Full-parameter fine-tuning of 70B-class models, continued pre-training, RLHF, and large-scale distributed runs. FP8 on H100 cuts wall-clock time roughly in half versus BF16 on A100, and NVLink keeps gradient sync from becoming the bottleneck.

Recommended: 8×H100 SXM · A100 80GB for cost-sensitive runs

AI inference serving

vLLM, TGI, TensorRT-LLM. Throughput-heavy serving for 7B–70B models.

Best fit: L40S · H100 PCIe

Stable Diffusion & image gen

SDXL, Flux, ControlNet pipelines. Batch generation and LoRA training.

Best fit: L40S · A100 40GB

RAG pipelines

Embedding generation, vector DB hosting, low-latency retrieval-augmented inference.

Best fit: L40S · L4

Speech & audio AI

Whisper, real-time ASR, TTS, and voice agents. Latency < 300ms.

Best fit: L40S · V100

Video generation & CV

Sora-class video, real-time computer vision, object detection at scale.

Best fit: H100 · L40S

Kubernetes GPU workloads

Managed K8s with NVIDIA device plugin, MIG slicing, autoscaling node pools, and per-pod GPU billing.

Any GPU · MIG on A100 / H100

AI research & experimentation

Jupyter notebooks, ad-hoc experiments, paper reproductions. Spin up, shut down, no waste.

Best fit: V100 · A100 40GB
Ready when you are

Self-serve a GPU now, or scope a cluster with us.

Most teams are training within 5 minutes on the self-serve console. If you need 32+ H100s, custom networking, single-tenant bare metal, or DPDP-compliant India residency, our infrastructure team can scope and provision in 24–72 hours.

No credit card to start · ₹100 free trial credit · Dedicated Slack channel on enterprise
Customer Stories

What AI teams say about Cyfuture GPU pricing

Real results from researchers, startups, and enterprise ML teams in India.

★★★★★

"We switched our H100 cluster from a US-based provider to Cyfuture. 58% cost reduction in INR, India data residency for our regulated healthcare data, and the deployment experience was genuinely better."

Arjun Nair
Head of ML, HealthTech Startup, Bangalore
★★★★★

"The GPU pricing transparency is refreshing — no hidden egress bills, no surprise charges. We burn through 1,000+ A100 hours per month and the per-second billing actually matters at that scale."

Priya Mehta
AI Research Lead, EdTech Unicorn, Pune
★★★★★

"As a university lab, our budget is tight. Cyfuture's V100 PCIe pricing at ₹28/hr let us run our entire semester's NLP research for what we'd spend on 3 days of AWS. The V100's 16GB HBM2 and FP16 throughput is perfect for our transformer workloads."

Dr. Rohit Sharma
Associate Professor, IIT Research Lab

FAQs: GPU Cloud Pricing

The power of AI, backed by human support

At Cyfuture AI, we combine advanced technology with genuine care. Our expert team is always ready to guide you through setup, resolve your queries, and ensure your experience with Cyfuture AI remains seamless. Reach out through our live chat or drop us an email at [email protected] - help is only a click away.

On-demand H100 SXM pricing starts at $3.66 per GPU per hour, with hourly billing and no minimum commitment. With a 12-month reserved commitment the rate drops to $2.38 per GPU per hour — about 35% off. H100 PCIe (single-GPU) starts at $2.99/hr on-demand. Multi-GPU 8×H100 SXM nodes with NVLink and InfiniBand are available from $28.50/hr.

On-demand is pay-as-you-go: spin up a GPU, get billed by the hour, shut down when done. No commitment, but you pay the standard hourly rate. Reserved means you commit to 1, 6, or 12 months in exchange for a discount. On H100 SXM that is roughly 11% off for 1-month, 22% off for 6-month, and 35% off for 12-month; discounts on other GPUs differ and are listed in the pricing tables above. Reserved is right for steady workloads (production inference, ongoing training). On-demand fits bursty work, experiments, and prototyping.
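One way to reason about the on-demand versus reserved choice is a break-even utilization check. The sketch below assumes a reserved instance is billed for every hour of the term whether or not it is used, which this page does not state explicitly. The function names and the 730-hour month are illustrative; the rates are the 1×H100 SXM on-demand (₹329/hr) and 12-month reserved (₹219/hr) prices from the tables above.

```python
def break_even_utilization(on_demand_rate: float, reserved_rate: float) -> float:
    """Fraction of the term a GPU must be running for reserved to cost the same as on-demand."""
    return reserved_rate / on_demand_rate


def cheaper_option(on_demand_rate: float, reserved_rate: float,
                   hours_used: float, hours_in_term: float) -> str:
    """Compare total spend; assumes reserved capacity is billed for the whole term."""
    on_demand_total = on_demand_rate * hours_used
    reserved_total = reserved_rate * hours_in_term
    return "reserved" if reserved_total < on_demand_total else "on-demand"


# 1xH100 SXM rates from the pricing table (INR per hour).
ON_DEMAND, RESERVED_12M = 329, 219
HOURS_PER_MONTH = 730  # assumed average month length

print(f"Break-even utilization: {break_even_utilization(ON_DEMAND, RESERVED_12M):.1%}")
print(cheaper_option(ON_DEMAND, RESERVED_12M, hours_used=400, hours_in_term=HOURS_PER_MONTH))
print(cheaper_option(ON_DEMAND, RESERVED_12M, hours_used=600, hours_in_term=HOURS_PER_MONTH))
```

Under this assumption, a GPU that runs less than about two thirds of the time is cheaper on-demand, which matches the guidance that reserved suits steady workloads and on-demand suits bursty ones.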

Self-serve single-GPU and multi-GPU instances come up in under 60 seconds for available SKUs. Pre-built CUDA/PyTorch/TensorRT/vLLM images mean no driver setup. For reserved clusters of 32+ H100s with custom networking or bare-metal isolation, expect 24–72 hours from contract to first login.

For throughput-heavy LLM serving (vLLM, TGI, TensorRT-LLM) the L40S usually wins on price-per-token at the 7B–34B range. For large-context or 70B+ models the H100 PCIe is the sweet spot. For edge or always-on small models, the L4 gives the lowest hourly cost. A100 80GB still fits well if you're already on that stack and need predictable behaviour.

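A rough rule: parameters × bytes per parameter × 1.25 (for activations and KV cache headroom). A 7B model in BF16 needs roughly 17 GB; in FP8 around 9 GB. A 70B model in BF16 or FP16 (2 bytes per parameter) needs roughly 175 GB by this rule, so it does not fit on a single 80GB H100: the 140 GB of weights alone fill most of 2× A100 80GB, and a third GPU gives comfortable headroom. In FP8 the weights drop to about 70 GB, which squeezes onto one 80GB GPU only with very little KV cache headroom. Use quantization (INT8, INT4, FP8) to shrink the memory footprint; most production inference runs quantized.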
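The rule of thumb above (parameters × bytes per parameter × 1.25) can be written as a small sizing helper. This is a sketch, not a capacity guarantee: the function names, the `BYTES_PER_PARAM` table, and the 80 GB default are illustrative, and real usage depends on batch size, context length, and the serving framework.

```python
import math

# Bytes per parameter for common precisions (BF16 and FP16 are both 2 bytes).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "fp8": 1.0, "int8": 1.0, "int4": 0.5}


def estimate_vram_gb(params_billion: float, dtype: str, overhead: float = 1.25) -> float:
    """Rough memory estimate in GB: parameters x bytes per parameter x overhead factor."""
    return params_billion * BYTES_PER_PARAM[dtype] * overhead


def gpus_needed(params_billion: float, dtype: str,
                gpu_memory_gb: float = 80.0, overhead: float = 1.25) -> int:
    """Smallest number of GPUs whose combined memory covers the estimate."""
    return math.ceil(estimate_vram_gb(params_billion, dtype, overhead) / gpu_memory_gb)


for size, dtype in [(7, "bf16"), (7, "fp8"), (70, "bf16"), (70, "fp8"), (70, "int4")]:
    print(f"{size}B {dtype}: ~{estimate_vram_gb(size, dtype):.1f} GB, "
          f"{gpus_needed(size, dtype)} x 80GB GPU(s)")
```

The 1.25 factor is deliberately conservative, so a model that lands just over one GPU's capacity may still fit with a short context or a small batch.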

No egress fees, no ingress fees. Move datasets and checkpoints freely. Persistent block storage is billed separately at $0.08/GB/month with snapshot support. S3-compatible object storage starts at $0.018/GB/month. There's no platform fee and no minimum monthly spend.

Yes — managed Kubernetes ships with the NVIDIA device plugin, MIG slicing, GPU autoscaler, and Prometheus add-ons. Control plane is $72/cluster/month on Standard, and GPU node pools bill at the same per-hour rate as GPU as a Service — there's no Kubernetes upcharge on the compute itself.

H100 is the right call if your workload benefits from FP8 (modern training, frontier inference) or if you need NVLink 4.0 and 3.35 TB/s memory bandwidth — typically 70B+ training, KV-cache-heavy serving, or latency-sensitive inference. A100 is the cost-efficient default for mature workloads on a stable CUDA stack — 7B–34B fine-tuning, classical deep learning, and production inference that doesn't need FP8.

Standard instances run in per-tenant VPCs with encrypted-at-rest volumes. For regulated workloads we offer single-tenant bare-metal options where the entire host is yours — no shared CPU, no shared memory, no neighbouring tenants. We are SOC 2 Type II and ISO 27001 certified, with DPDP-compliant India data residency available on request.

Train Models Faster, Smarter, Cheaper

Cut training time by up to 80% with powerful GPU rentals designed for AI & ML workloads.

⚡ No credit card required · ₹100 free credits · Deploy in 60 seconds · SOC-2 certified