Choose the plan that's right for you

Developer

Powerful speed and reliability to start your project

600 requests/min rate limit

Up to 100 deployed models

Custom PEFT add-ons

Pay per usage

Business

A plan that scales with your production usage

Everything from the Developer plan

Custom rate limits

Team collaboration features

API telemetry and metrics

Dedicated email support

Enterprise

Personalized configurations for serving at scale

Everything from the Business plan

Custom pricing

Unlimited rate limits

Unlimited deployed models

Custom base models

Dedicated and self-hosted deployments

Specialized enterprise support

Text Models

Base model parameter count	$/1M tokens
up to 16B	$0.20
16.1B - 80B	$0.90
Mixtral 8x7B	$0.50

Per-token pricing applies only to non-enterprise deployments. Contact us for dedicated deployment pricing options.
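As a rough illustration of how the per-token rates above translate into a bill, here is a minimal sketch. The function name, the tier keys, and the assumption that you already know your billable token count are hypothetical and not part of any Fireworks API; only the rates come from the table.

```python
# Rates in USD per 1M tokens, copied from the Text Models table.
# The dictionary keys are illustrative labels, not official identifiers.
TEXT_RATES_PER_M = {
    "up_to_16b": 0.20,
    "16.1b_to_80b": 0.90,
    "mixtral_8x7b": 0.50,
}


def text_cost_usd(tier: str, tokens: int) -> float:
    """Estimate the charge for `tokens` billable tokens at the given tier."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return TEXT_RATES_PER_M[tier] * tokens / 1_000_000


if __name__ == "__main__":
    # 2M tokens on Mixtral 8x7B at $0.50 per 1M tokens.
    print(f"${text_cost_usd('mixtral_8x7b', 2_000_000):.2f}")
```

The estimate scales linearly with token count, so halving usage halves the charge.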

Image Models

SDXL, $/step	SDXL w/ ControlNet, $/step
$0.0002	$0.0003

For image generation models like SDXL, we charge based on the number of inference steps (denoising iterations).
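To make step-based billing concrete, the sketch below multiplies the per-step rate by the number of denoising steps. The function and its parameters are hypothetical; it also assumes that each generated image runs its own full set of steps, which the pricing text does not state explicitly.

```python
# Rates in USD per inference step, copied from the Image Models table.
SDXL_RATE_PER_STEP = 0.0002
SDXL_CONTROLNET_RATE_PER_STEP = 0.0003


def image_cost_usd(steps: int, images: int = 1, controlnet: bool = False) -> float:
    """Estimate the charge for generating `images` images at `steps` steps each.

    Assumption: every image is billed for its own `steps` denoising iterations.
    """
    if steps < 0 or images < 0:
        raise ValueError("steps and images must be non-negative")
    rate = SDXL_CONTROLNET_RATE_PER_STEP if controlnet else SDXL_RATE_PER_STEP
    return rate * steps * images


if __name__ == "__main__":
    # One SDXL image at 30 steps.
    print(f"${image_cost_usd(30):.4f}")
```

Under these assumptions, lowering the step count is the most direct way to reduce the per-image charge.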

Multi-Modal

For multi-modal models like LLaVA, each image is billed as 576 prompt tokens.
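The image-to-token rule can be sketched as a small helper that converts a request into billed prompt tokens. The helper name is hypothetical, and the per-token rate for a given multi-modal model is not stated in this section, so the sketch stops at the token count.

```python
# Each image counts as this many prompt tokens, per the rule above.
TOKENS_PER_IMAGE = 576


def billed_prompt_tokens(text_tokens: int, num_images: int) -> int:
    """Return the prompt tokens billed for a request with text and images."""
    if text_tokens < 0 or num_images < 0:
        raise ValueError("inputs must be non-negative")
    return text_tokens + TOKENS_PER_IMAGE * num_images


if __name__ == "__main__":
    # A 100-token text prompt with two attached images.
    print(billed_prompt_tokens(100, 2))
```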

Embedding Models

Base model parameter count	$/1M input tokens
up to 150M	$0.008
150M - 350M	$0.016

Embedding model pricing is based on the number of input tokens processed by the model.
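For embedding workloads, a bill can be estimated from the input token counts of the documents you plan to embed. This is a minimal sketch: the function name and tier keys are hypothetical, and the tier is passed explicitly because the table does not say which tier a model of exactly 150M parameters falls into.

```python
from typing import Iterable

# Rates in USD per 1M input tokens, copied from the Embedding Models table.
# Keys are illustrative labels, not official identifiers.
EMBEDDING_RATES_PER_M = {
    "up_to_150m": 0.008,
    "150m_to_350m": 0.016,
}


def embedding_cost_usd(tier: str, doc_token_counts: Iterable[int]) -> float:
    """Estimate the charge for embedding documents with the given token counts."""
    total_tokens = sum(doc_token_counts)
    return EMBEDDING_RATES_PER_M[tier] * total_tokens / 1_000_000


if __name__ == "__main__":
    # 10,000 documents of 500 tokens each on the smaller tier.
    print(f"${embedding_cost_usd('up_to_150m', [500] * 10_000):.2f}")
```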

Fine-tuning

Model	$ / 1M tokens in training
Models up to 16B parameters	$0.50
Models 16.1B - 80B	$3.00
Mixtral 8x7B	$2.00

Fireworks charges based on the total number of training tokens, calculated as the number of tokens in your fine-tuning dataset multiplied by the number of epochs. Each fine-tuning job has a minimum charge of $3: a job that would cost less than $3 is billed at $3.
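The fine-tuning rule (dataset tokens times epochs, with a $3 floor) can be sketched as follows. The function name and tier keys are hypothetical; the rates and the minimum come from this section.

```python
# Rates in USD per 1M training tokens, copied from the Fine-tuning table.
# Keys are illustrative labels, not official identifiers.
FINE_TUNE_RATES_PER_M = {
    "up_to_16b": 0.50,
    "16.1b_to_80b": 3.00,
    "mixtral_8x7b": 2.00,
}
MINIMUM_CHARGE_USD = 3.00


def fine_tune_cost_usd(tier: str, dataset_tokens: int, epochs: int) -> float:
    """Estimate the charge for one fine-tuning job, applying the $3 minimum."""
    if dataset_tokens < 0 or epochs < 0:
        raise ValueError("dataset_tokens and epochs must be non-negative")
    training_tokens = dataset_tokens * epochs
    raw_cost = FINE_TUNE_RATES_PER_M[tier] * training_tokens / 1_000_000
    return max(raw_cost, MINIMUM_CHARGE_USD)


if __name__ == "__main__":
    # 1M-token dataset, 2 epochs, small model: $1.00 raw, billed at the $3 floor.
    print(f"${fine_tune_cost_usd('up_to_16b', 1_000_000, 2):.2f}")
    # 10M-token dataset, 1 epoch, 16.1B - 80B model: $30.00.
    print(f"${fine_tune_cost_usd('16.1b_to_80b', 10_000_000, 1):.2f}")
```

Because of the floor, small jobs on models up to 16B parameters cost the same $3 until the job reaches 6M training tokens.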

Frequently asked questions