Neural Inverse is Open Source →
← Back to changelog
May 26, 2026

New LLM Providers: GitHub Models, Fireworks AI, Cerebras

Picture Sanjay SenthilkumarSanjay Senthilkumar

Three new LLM providers added — GitHub Models (multi-model via PAT), Fireworks AI (fastest open-model inference), and Cerebras (2000+ tok/s wafer-scale hardware).

Neural Inverse now supports 20 LLM providers. Three new providers ship today — all OpenAI-compatible, all zero-config beyond an API key.

GitHub Models

Access 40+ models from a single GitHub Personal Access Token. GPT-4.1, DeepSeek-R1, Llama 4, Grok-3, Mistral — all from one credential you already have.

  • Endpoint: https://models.github.ai/inference
  • Auth: GitHub PAT with models:read scope
  • Free tier: Available (rate-limited)
  • Default models: openai/gpt-4.1, openai/gpt-4.1-mini, openai/o4-mini, deepseek/deepseek-r1, meta/llama-4-scout-17b-16e-instruct, xai/grok-3-mini

Fireworks AI

Fastest open-model inference available. Native function calling for Power Mode agents. Sub-second latency on 70B+ models.

  • Endpoint: https://api.fireworks.ai/inference/v1
  • Auth: API key
  • Default models: llama-v3p3-70b-instruct, deepseek-r1, qwen3-235b-a22b, gemma-4-31b-it, gpt-oss-120b

Cerebras

Wafer-scale inference hardware generating 2000+ tokens per second. Makes autocomplete and inline edit feel instant.

  • Endpoint: https://api.cerebras.ai/v1
  • Auth: API key
  • Free tier: Available
  • Default models: llama3.1-8b, gpt-oss-120b, qwen-3-235b-a22b-instruct-2507

Setup

  1. Open Settings > Neural Inverse > LLM Providers
  2. Select the provider
  3. Enter your API key
  4. Select a model for each feature (Chat, Autocomplete, Ctrl+K, Power Mode)

All keys stay local. No proxy.

Copyright 2026 Neural Inverse Inc.


Was this page helpful?