NexGate

Pricing Explained

How NexGate credits, token pricing, credit packs, reservations, billing errors, and usage exports work.

How credits work

NexGate uses prepaid credits. You buy credits first, then each API call deducts credits based on token usage and the configured model price.

Note

Credits are denominated in USD: one credit equals $1 of API usage on any supported model.

Buy credits

Purchase credit packs starting at $25.

Use credits

Chat requests deduct credits based on actual token usage.

Keep control

Requests stop when credits or spend safety limits are reached.

Credit packs

Pack    | Price | Credits | Best for
Builder | $25   | $30.00  | Side projects and prototypes
Pro     | $50   | $60.00  | Production apps
Team    | $100  | $120.00 | High-volume applications
Scale   | $200  | $250.00 | Growing products

Tip

The price you pay is the amount charged; the credits column is what lands in your account. You pay $25 and get $30.00 in credits. Credits never expire.
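
The gap between the price paid and the credits received can be worked out directly from the pack table. The snippet below is a small illustrative sketch: the pack figures are copied from the table, while the function name `bonus_percent` and the word "extra" are labels chosen for this example, not NexGate terms.

```python
from decimal import Decimal

# Pack price and credited amount in USD, copied from the credit pack table.
PACKS = {
    "Builder": (Decimal("25"), Decimal("30.00")),
    "Pro": (Decimal("50"), Decimal("60.00")),
    "Team": (Decimal("100"), Decimal("120.00")),
    "Scale": (Decimal("200"), Decimal("250.00")),
}


def bonus_percent(price, credits):
    """Extra credits received, as a percentage of the price paid."""
    return (credits - price) / price * 100


for name, (price, credits) in PACKS.items():
    extra = bonus_percent(price, credits)
    print(f"{name}: pay ${price}, receive ${credits} ({extra:.0f}% extra)")
```

Decimal arithmetic is used so that dollar amounts are not subject to binary floating-point rounding.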

Note

Eligible new users receive $5 in free credits after sign-up, subject to promo availability. That is enough for hundreds of small test requests before buying a pack.

Buy credits

Open the live pricing page.

How token pricing works

Every API call is priced from:

  1. The resolved model.
  2. Input tokens.
  3. Output tokens.
  4. The NexGate price configured at the time of use.

Example calculation

A gpt-5.5 request with 500 input tokens and 200 output tokens costs:

Input:  500 / 1,000,000 x $5.00 = $0.002500
Output: 200 / 1,000,000 x $30.00 = $0.006000
Total:  $0.008500
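
The same arithmetic can be reproduced in a few lines of code, which is handy for estimating costs before sending traffic. This is a minimal sketch, not NexGate's billing code: the function name `request_cost` and the price dictionary are assumptions for illustration, and the prices are copied from the table on this page, so they may not match the price configured at the time of use.

```python
from decimal import Decimal

# USD per 1 million tokens as (input, output), copied from the pricing table.
# Only the model used in the example above is listed here.
PRICES_PER_MILLION = {
    "gpt-5.5": (Decimal("5.00"), Decimal("30.00")),
}

MILLION = Decimal(1_000_000)


def request_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


cost = request_cost("gpt-5.5", 500, 200)
print(f"${cost:.6f}")  # $0.008500
```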

Your dashboard and CSV export show the exact logged cost for each request.

Credit reservations

Before NexGate calls a provider, it reserves credits using estimated input tokens and the maximum possible output. After the response completes, NexGate charges actual usage and releases unused reserved credits.

reserve estimated maximum cost -> provider response -> charge actual usage -> release remainder
reserve estimated maximum cost -> provider error -> release reservation -> log error
reserve estimated maximum cost -> stream disconnects before usage -> release reservation -> log client_disconnected

Tip

Set max_tokens or max_completion_tokens to reduce the maximum reservation for a request.
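
The reserve, charge, and release flow can be pictured with a toy in-memory account. Everything below is a hypothetical sketch under stated assumptions: the class `Account`, its methods, and `max_reservation` are invented for illustration, and how NexGate estimates input tokens or stores reservations is not described on this page.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def max_reservation(est_input_tokens, max_output_tokens, input_price, output_price):
    """Worst-case cost: estimated input plus the maximum possible output."""
    return (est_input_tokens * Decimal(input_price)
            + max_output_tokens * Decimal(output_price)) / MILLION


class Account:
    """Toy balance with reservations (illustrative only)."""

    def __init__(self, balance):
        self.balance = Decimal(balance)  # credits free to spend or reserve
        self.reserved = Decimal("0")     # credits held for in-flight requests

    def reserve(self, amount):
        """Hold credits before the provider call; fail if the balance is short."""
        if amount > self.balance:
            raise ValueError("insufficient_credits")
        self.balance -= amount
        self.reserved += amount
        return amount

    def settle(self, reservation, actual_cost):
        """Charge actual usage and release the unused part of the hold."""
        self.reserved -= reservation
        self.balance += reservation - Decimal(actual_cost)

    def release(self, reservation):
        """Provider error or early disconnect: return the whole hold."""
        self.settle(reservation, Decimal("0"))


account = Account("5.00")
hold = account.reserve(max_reservation(500, 1000, "5.00", "30.00"))  # holds $0.0325
account.settle(hold, "0.0085")  # actual usage was 200 output tokens
print(account.balance)  # 4.9915
```

In this sketch, lowering the output cap from 1,000 to 300 tokens would shrink the hold from $0.0325 to $0.0115, which is the effect the tip above describes.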

Token pricing by model

All prices are per 1 million tokens.

Model                   | Input / 1M | Output / 1M | Use case
GPT-5.5                 | $5.00      | $30.00      | Latest deployed GPT-5.5 flagship
GPT-5                   | $1.25      | $10.00      | GPT-5 reasoning and coding
GPT-5.4                 | $2.50      | $15.00      | Updated GPT-5.4 model
GPT-4o                  | $2.50      | $10.00      | Multimodal flagship
GPT-4.1                 | $2.00      | $8.00       | Long-context production reasoning
o4 Mini                 | $1.10      | $4.40       | Fast reasoning planner
GPT-5.4 Mini            | $0.40      | $1.60       | Fast GPT-5.4 workhorse
GPT-4.1 Mini            | $0.40      | $1.60       | Reliable workhorse
GPT-5.4 Nano            | $0.10      | $0.40       | Ultra-low-cost GPT-5.4 tasks
GPT-4.1 Nano            | $0.10      | $0.40       | Classification and routing
DeepSeek V4 Pro         | $1.93      | $3.83       | Frontier DeepSeek V4 with enhanced reasoning
DeepSeek V3.2 Speciale  | $0.40      | $0.80       | Higher-capability DeepSeek chat
DeepSeek V3.2           | $0.28      | $0.42       | Efficient coding and chat
DeepSeek V4 Flash       | $0.14      | $0.28       | Fast low-cost chat and coding
Kimi K2.6               | $0.95      | $4.00       | Long-context agent chat
Kimi K2.5               | $0.60      | $2.50       | Long-context multilingual chat
Grok 4.3                | $1.25      | $2.50       | Latest Grok 4.3 frontier model
Grok 4 20 Reasoning     | $1.25      | $2.50       | Frontier reasoning and analysis
Grok 4.1 Fast Reasoning | $0.20      | $0.50       | Fast reasoning workloads
Llama 3.3 70B Instruct  | $0.13      | $0.40       | Meta Llama 3.3 chat
Llama 4 Maverick        | $0.18      | $0.70       | Open MoE model
Llama 4 Scout           | $0.10      | $0.35       | Low-cost open chat

Warning

Model prices can change when upstream provider costs change. Credits stay USD-denominated and requests are charged at the NexGate price configured at the time of use.

No automatic billing

NexGate does not automatically charge your card when credits run low. When credits are insufficient:

  1. Chat completion requests return 402 insufficient_credits.
  2. Low-balance alerts can notify you if you set a threshold.
  3. You manually buy more credits when ready.

Spend safety limits

NexGate enforces an hourly spend safety limit on your account. If the estimated request would exceed that limit, chat completions return 429 rate_limit_error.

Note

You can set a finite hourly limit or use -1 for unlimited through the dashboard account settings flow.

You can also set optional per-key hourly limits on each API key: a dollars-per-hour spend ceiling and a requests-per-hour ceiling. A request is rejected with 429 rate_limit_error if it would exceed the account limit or any limit on the key it authenticated with.
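
The combined account and per-key check can be sketched as follows. This is an illustration under assumptions, not the actual enforcement logic: the dictionary field names are hypothetical, a limit of None stands for "not set" on a key, and the sketch treats a request that lands exactly on the limit as allowed, which the page does not specify.

```python
from decimal import Decimal

UNLIMITED = Decimal("-1")  # the account setting uses -1 for "unlimited"


def would_exceed(limit, used, increment):
    """True if adding `increment` to `used` would go past `limit`."""
    if limit is None or limit == UNLIMITED:
        return False
    return used + increment > limit


def allow_request(estimated_cost, account, key):
    """True if the request passes the account limit and both per-key limits."""
    checks = [
        would_exceed(account["spend_limit"], account["spent_this_hour"], estimated_cost),
        would_exceed(key.get("spend_limit"), key["spent_this_hour"], estimated_cost),
        would_exceed(key.get("request_limit"), key["requests_this_hour"], 1),
    ]
    return not any(checks)


account = {"spend_limit": Decimal("10.00"), "spent_this_hour": Decimal("9.99")}
key = {
    "spend_limit": None,  # no per-key spend ceiling configured
    "spent_this_hour": Decimal("1.00"),
    "request_limit": 100,
    "requests_this_hour": 42,
}

# 9.99 + 0.0325 is above the 10.00 account limit, so this would be a 429.
print(allow_request(Decimal("0.0325"), account, key))  # False
```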

Purchase capacity

NexGate sells credits against a configurable platform-wide capacity. When capacity is temporarily sold out, the pricing and top-up pages show a "capacity full" state and disable new purchases. Existing credits remain fully usable, and new purchases reopen once capacity is available again.

Refunds and expiration

Credits do not expire. Purchased credits are non-refundable, except where support determines a technical billing issue caused incorrect charging.

Warning

Do not buy credits expecting automatic refunds for unused balance.

Promo credits

Eligible new users may receive launch promo credits after sign-up verification. Promo availability and amount are controlled by runtime settings and anti-abuse checks.

Usage tracking

Your dashboard shows:

  • Real-time balance
  • Per-request cost
  • Prompt and completion token counts
  • Resolved model and requested alias when applicable
  • Latency
  • Status
  • CSV export for the latest 1,000 usage rows

CSV export columns:

date,model,requested_model,prompt_tokens,completion_tokens,cost_usd,latency_ms,status
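
Because the export is plain CSV, per-model spend can be totalled with the standard library. The sketch below assumes the column names listed above; the two data rows are made-up sample values (including the date format and the status value) and do not come from a real export.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Header copied from this page; the rows are invented sample data.
SAMPLE_EXPORT = """\
date,model,requested_model,prompt_tokens,completion_tokens,cost_usd,latency_ms,status
2025-01-01T00:00:00Z,gpt-5.5,gpt-5.5,500,200,0.008500,900,ok
2025-01-01T00:01:00Z,gpt-5.5,gpt-5.5,1000,400,0.017000,1200,ok
"""


def cost_by_model(csv_text):
    """Sum the cost_usd column for each resolved model."""
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["model"]] += Decimal(row["cost_usd"])
    return dict(totals)


for model, total in cost_by_model(SAMPLE_EXPORT).items():
    print(f"{model}: ${total}")
```

To process a downloaded file instead of a string, pass the result of `open(path, newline="").read()` to `cost_by_model`.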

Cost optimization

Billing error states

Status | Error | Meaning
402    | insufficient_credits | Available balance cannot cover the estimated request
429    | rate_limit_error | Hourly spend safety limit has been reached
503    | This pack is not yet configured. Please contact support. | Selected credit pack is temporarily unavailable
500    | Server misconfigured | A configuration error prevents checkout from completing
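
A client can branch on these states instead of retrying blindly. The function below is a hypothetical sketch: the action names are invented for this example, the suggested reactions are assumptions rather than NexGate guidance, and the caller is expected to have already extracted the status code and error string from the response, whose exact body shape is not documented on this page.

```python
def next_step(status, error=""):
    """Map a billing-related response to a suggested client action."""
    if status == 402 and error == "insufficient_credits":
        # Retrying will not help until more credits are purchased.
        return "buy_credits"
    if status == 429 and error == "rate_limit_error":
        # Assumed reaction: wait for hourly spend to drop, or raise the limit.
        return "wait_or_raise_limit"
    if status in (500, 503):
        # Checkout-side problems listed in the table above.
        return "contact_support"
    return "unhandled"


print(next_step(402, "insufficient_credits"))  # buy_credits
print(next_step(429, "rate_limit_error"))      # wait_or_raise_limit
```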

