Contributing: Model Management

This page covers the internals of the Neural Inverse model management layer for contributors. It assumes familiarity with the VS Code extension architecture and the Neural Inverse DI system.

Source root: src/vs/workbench/contrib/neuralInverse/browser/modelManagement/

Module Map

modelManagement/
├── cloudCredentialService.ts       # Encrypted credential storage + validation
├── cloudDeploymentService.ts       # AWS/Azure provisioning via terminal
├── localProvidersAutoSetup.ts      # Runs once on IDE start — seeds provider settings
├── marketplaceService.ts           # Model catalog (HuggingFace + curated list)
└── deployment/
    ├── deploymentTypes.ts          # Shared types + type guards
    ├── deploymentRegistryService.ts # Unified local+cloud registry, health polling
    ├── autoConfigService.ts        # Auto-applies provider settings on deployment ready
    └── index.ts                    # DI registration + barrel re-exports

Registration entry point: neuralInverse.contribution.ts imports ./modelManagement/deployment/index.js as a side-effect.

DI Services

Decorator key	Interface	Class	Type
`modelMarketplaceService`	`IModelMarketplaceService`	`ModelMarketplaceService`	`Delayed`
`cloudCredentialService`	`ICloudCredentialService`	`CloudCredentialService`	`Delayed`
`cloudDeploymentService`	`ICloudDeploymentService`	`CloudDeploymentService`	`Delayed`
`deploymentRegistryService`	`IDeploymentRegistryService`	`DeploymentRegistryService`	`Delayed`
`deploymentAutoConfigService`	`IDeploymentAutoConfigService`	`DeploymentAutoConfigService`	`Delayed`

All services are registered with InstantiationType.Delayed, so each one is instantiated when a consumer first uses it rather than at startup.

Deployment Types

deploymentTypes.ts defines the unified type that represents any running model endpoint:

type IUnifiedDeployment = ILocalDeployment | ICloudDeploymentEntry;

interface ILocalDeployment {
  kind: 'local';
  id: string;                    // e.g. "local-ollama"
  provider: ProviderName;
  displayName: string;
  endpoint: string;
  status: 'running' | 'unreachable' | 'stopped';
  models: string[];
  lastChecked: number;
}

interface ICloudDeploymentEntry {
  kind: 'cloud';
  id: string;
  cloudProvider: 'aws' | 'azure';
  voidProvider: ProviderName;    // always 'vLLM' today
  modelId: string;
  modelName: string;
  endpoint: string;
  status: CloudDeploymentStatus;
  config: IDeploymentConfig;
  createdAt: number;
  costPerHour: number;
}

Type guards: isLocalDeployment(d), isCloudDeployment(d), isDeploymentActive(d)

Helper: getDeploymentEndpoint(d): IDeploymentEndpoint | undefined — extracts { provider, url, apiKey?, modelName? } regardless of deployment kind.
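
As a rough illustration, the guards could be written as ordinary discriminated-union type predicates over the `kind` field. This is a minimal sketch, not the module's actual code: the interfaces are trimmed copies, the `CloudDeploymentStatus` values are assumed from statuses mentioned elsewhere on this page, and the body of `isDeploymentActive` (treating only `running` as active) is an assumption.

```typescript
// Trimmed copies of the types above; fields not needed for narrowing are omitted.
// Assumption: the real CloudDeploymentStatus union may contain other values.
type CloudDeploymentStatus = 'provisioning' | 'running' | 'stopped' | 'error';

interface ILocalDeployment {
  kind: 'local';
  id: string;
  endpoint: string;
  status: 'running' | 'unreachable' | 'stopped';
}

interface ICloudDeploymentEntry {
  kind: 'cloud';
  id: string;
  endpoint: string;
  status: CloudDeploymentStatus;
}

type IUnifiedDeployment = ILocalDeployment | ICloudDeploymentEntry;

// Narrows on the `kind` discriminant.
function isLocalDeployment(d: IUnifiedDeployment): d is ILocalDeployment {
  return d.kind === 'local';
}

function isCloudDeployment(d: IUnifiedDeployment): d is ICloudDeploymentEntry {
  return d.kind === 'cloud';
}

// Assumption: "active" means the endpoint is currently serving requests.
function isDeploymentActive(d: IUnifiedDeployment): boolean {
  return d.status === 'running';
}
```

Callers can then branch on the guard and access kind-specific fields without casts.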

DeploymentRegistryService

IDeploymentRegistryService is the single source of truth for what is running.

Local detection

On construction and every 30 seconds, the service runs _refreshLocalDeployments():

for each LOCAL_PROVIDER (ollama, vLLM, lmStudio):
  fetch(endpoint + healthPath, timeout: 5s)
  if ok → fetch models list → upsert ILocalDeployment{status: 'running'}
  if fail → upsert ILocalDeployment{status: 'unreachable'|'stopped'}

No external service is injected for local detection — it uses fetch() directly against well-known localhost ports. This avoids circular DI and keeps the registry lightweight.
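
The pseudocode above could be fleshed out roughly as follows. This is a sketch under stated assumptions: `probeLocalProvider`, `FetchLike`, and `extractModelIds` are hypothetical names, the fetch function is passed in so the probe can be exercised without a network, the models response is assumed to be an OpenAI-style `{ data: [{ id }] }` list, and every failure maps to `unreachable` because this page does not say when `stopped` applies.

```typescript
type LocalStatus = 'running' | 'unreachable' | 'stopped';

interface LocalProviderConfig {
  provider: string;
  defaultEndpoint: string;
  healthPath: string;
  modelsPath: string;
}

interface ProbeResult {
  status: LocalStatus;
  models: string[];
  lastChecked: number;
}

// Narrow response shape so a fake can be injected in place of the global fetch.
interface ResponseLike {
  ok: boolean;
  json(): Promise<unknown>;
}
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<ResponseLike>;

// Assumes an OpenAI-style list: { data: [{ id: string }, ...] }.
function extractModelIds(body: unknown): string[] {
  if (typeof body !== 'object' || body === null) return [];
  const data = (body as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  const ids: string[] = [];
  for (const m of data) {
    if (typeof m === 'object' && m !== null && typeof (m as { id?: unknown }).id === 'string') {
      ids.push((m as { id: string }).id);
    }
  }
  return ids;
}

// Hypothetical helper: probes one provider and never throws.
async function probeLocalProvider(
  config: LocalProviderConfig,
  fetchFn: FetchLike,
  timeoutMs = 5000,
): Promise<ProbeResult> {
  const controller = new AbortController();
  // One timer covers both requests in this sketch.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const health = await fetchFn(config.defaultEndpoint + config.healthPath, { signal: controller.signal });
    if (!health.ok) {
      return { status: 'unreachable', models: [], lastChecked: Date.now() };
    }
    const res = await fetchFn(config.defaultEndpoint + config.modelsPath, { signal: controller.signal });
    const body = res.ok ? await res.json() : undefined;
    return { status: 'running', models: extractModelIds(body), lastChecked: Date.now() };
  } catch {
    // Connection refused, DNS failure, and timeout aborts all land here.
    return { status: 'unreachable', models: [], lastChecked: Date.now() };
  } finally {
    clearTimeout(timer);
  }
}
```

In the service itself the global `fetch` would be passed as `fetchFn`; injecting it keeps the probe testable with a stub.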

_syncCloudDeployments() reads from ICloudDeploymentService.listDeployments() and upserts ICloudDeploymentEntry records. It is called on init and whenever cloudDeploymentService.onDeploymentStatusChanged fires.

State transitions + events

The registry tracks _previousStates: Map<string, boolean> to diff state across polls:

Transition	Event fired
`false → true` (deployment came up)	`onDeploymentBecameReady`
`true → false` (deployment went down)	`onDeploymentWentDown`
Any change	`onDidChange`
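
The diffing in the table can be sketched as a small pure function. `diffDeploymentState` and `RegistryEvent` are hypothetical names, and treating a never-seen deployment as previously down is an assumption. This sketch only reacts to the active flag flipping; the real `onDidChange` may also fire for other changes, such as an updated model list.

```typescript
type RegistryEvent = 'onDeploymentBecameReady' | 'onDeploymentWentDown' | 'onDidChange';

// Compares one deployment's new active flag against the previous poll,
// records the new value, and returns the names of the events to fire.
function diffDeploymentState(
  previous: Map<string, boolean>,
  id: string,
  isActive: boolean,
): RegistryEvent[] {
  // Assumption: a deployment seen for the first time counts as previously down.
  const wasActive = previous.get(id) ?? false;
  previous.set(id, isActive);
  if (wasActive === isActive) {
    return [];
  }
  return [isActive ? 'onDeploymentBecameReady' : 'onDeploymentWentDown', 'onDidChange'];
}
```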

Persistence

State is serialized to IStorageService (profile scope, key neuralInverse.deploymentRegistry) so the Deployments tab can render immediately on IDE start before the first health check completes.

DeploymentAutoConfigService

The service listens on registryService.onDeploymentBecameReady. When a deployment becomes ready, it runs the following steps in order:

Guard: already dismissed? — if user selected "Don't auto-configure" for this provider, skip.
Guard: already configured? — checks settingsOfProvider[provider]:
- endpoint differs from default → skip
- apiKey is non-empty → skip
- _didFillInProviderSettings is true → skip
Apply:
- Local deployment: call voidSettingsService.setAutodetectedModels(provider, models, { enableProviderOnSuccess: true })
- Cloud deployment: set endpoint, apiKey, and models via setSettingOfProvider + setAutodetectedModels
Notify: show a notification with Undo + "Don't auto-configure" actions.

Applied rules are stored per profile (neuralInverse.deployment.autoConfigRules) and can be iterated via getAppliedRules().
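
The two guards can be illustrated as pure functions. This is a sketch, not the service's code: `ProviderSettings` is a trimmed, assumed shape of `settingsOfProvider[provider]`, and `shouldAutoConfigure` is a hypothetical name for the combined check.

```typescript
// Assumed, trimmed shape of one provider's settings entry.
interface ProviderSettings {
  endpoint: string;
  apiKey: string;
  _didFillInProviderSettings?: boolean;
}

// "Already configured?" guard: any sign of user configuration counts.
function isProviderConfigured(settings: ProviderSettings, defaultEndpoint: string): boolean {
  if (settings.endpoint !== defaultEndpoint) return true; // endpoint differs from default
  if (settings.apiKey !== '') return true;                // apiKey is non-empty
  if (settings._didFillInProviderSettings === true) return true;
  return false;
}

// Combines the "already dismissed?" and "already configured?" guards.
function shouldAutoConfigure(
  provider: string,
  dismissedProviders: Set<string>,
  settings: ProviderSettings,
  defaultEndpoint: string,
): boolean {
  if (dismissedProviders.has(provider)) return false;
  return !isProviderConfigured(settings, defaultEndpoint);
}
```

Keeping the guards free of side effects makes the "no notification, no change" case easy to verify in isolation.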

CloudDeploymentService

cloudDeploymentService.ts handles AWS EC2 and Azure VM provisioning.

Security design

Concern	Implementation
Command injection	`_shellEscape(s)` wraps all user-supplied strings in single quotes and escapes embedded single quotes
Open endpoints	vLLM starts with `--api-key <32-byte-hex>` generated at deploy time
SSRF via IMDS	EC2 instances launched with `HttpTokens=required` (IMDSv2 only)
Overly permissive firewall	Security group created with inbound rule scoped to caller's IP only
Stale deployments	Any deployment stuck in `provisioning` > 20 min is transitioned to `error` on next load
Runaway deploys	15-minute hard timeout per deployment; `AbortController` cancels the polling loop
Credential leakage	Credentials read from `ISecretStorageService` at deploy time, never stored in plain state
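
For the command-injection row, single-quote escaping for POSIX shells typically looks like the sketch below. The real `_shellEscape` may differ in detail, and `buildCommand` is a hypothetical helper added only to show the escaping in use.

```typescript
// POSIX single-quote escaping: wrap the value in single quotes and replace
// each embedded ' with '\'' (close quote, escaped quote, reopen quote).
function shellEscape(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: `fixed` tokens are trusted literals written by the
// developer; every user-supplied value is escaped before being appended.
function buildCommand(fixed: string[], userArgs: Array<[string, string]>): string {
  const parts = fixed.slice();
  for (const [flag, value] of userArgs) {
    parts.push(flag, shellEscape(value));
  }
  return parts.join(' ');
}
```

Inside single quotes a POSIX shell performs no expansion, so `;`, `$()`, and backticks in user input stay literal. This scheme does not apply to cmd.exe or PowerShell quoting.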

Terminal execution

The service uses ITerminalService to run AWS CLI / Azure CLI commands in a named terminal (Neural Inverse — Cloud Deploy). This means users can see exactly what commands are being run, and the terminal persists after deployment for debugging.

Terminal creation:

const terminal = await this.terminalService.createTerminal({ config: { name } });

Health check loop

After provisioning, the service polls GET <endpoint>/health with:

30-second interval
MAX_HEALTH_RETRIES = 20 (10 minutes total)
Each retry increments a counter shown in the wizard UI
On success: transitions to running, fires onDeploymentStatusChanged
On exhaustion: transitions to error

Abort

abort(deploymentId) calls abort() on the deployment's AbortController, which cancels the in-flight health-check fetch and transitions the deployment to stopped.
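
The polling and abort behavior can be sketched together. This is an illustrative outline, not the service's implementation: `pollUntilHealthy`, `HealthPollOptions`, and `sleep` are hypothetical names, and the actual health request is injected as `checkHealth` so the loop can run without a network.

```typescript
type HealthOutcome = 'running' | 'error' | 'stopped';

interface HealthPollOptions {
  checkHealth: (signal: AbortSignal) => Promise<boolean>; // e.g. GET <endpoint>/health
  signal: AbortSignal;
  intervalMs?: number;                 // 30 seconds in the description above
  maxRetries?: number;                 // MAX_HEALTH_RETRIES = 20
  onRetry?: (attempt: number) => void; // could drive the counter in the wizard UI
}

// Resolves after `ms`, or early if the signal aborts.
function sleep(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise<void>((resolve) => {
    if (signal.aborted) {
      resolve();
      return;
    }
    const timer = setTimeout(() => {
      signal.removeEventListener('abort', onAbort);
      resolve();
    }, ms);
    function onAbort(): void {
      clearTimeout(timer);
      resolve();
    }
    signal.addEventListener('abort', onAbort, { once: true });
  });
}

async function pollUntilHealthy(opts: HealthPollOptions): Promise<HealthOutcome> {
  const { checkHealth, signal, intervalMs = 30000, maxRetries = 20, onRetry } = opts;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    if (signal.aborted) return 'stopped';
    onRetry?.(attempt);
    try {
      if (await checkHealth(signal)) return 'running';
    } catch {
      // A network error (or an aborted fetch) counts as a failed attempt.
    }
    if (signal.aborted) return 'stopped';
    if (attempt < maxRetries) await sleep(intervalMs, signal);
  }
  return signal.aborted ? 'stopped' : 'error';
}
```

Passing the same signal to both the fetch and the sleep means an abort takes effect immediately instead of waiting out the current interval.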

CloudCredentialService

Credentials pass through three checks before they are stored:

Whitespace trimming — leading/trailing whitespace stripped from all fields
Format validation:
- AWS: access key must match /^AKIA[0-9A-Z]{16}$/, secret key must be 40 chars
- Azure: subscription ID and tenant ID must match /^[0-9a-f-]{36}$/i (36 hex digits and hyphens, the length of a canonical GUID; the pattern does not check hyphen positions)
- Region: validated against an allowlist regex to prevent SSRF
Connectivity test: a lightweight CLI call (aws sts get-caller-identity for AWS, az account show for Azure) confirms the credentials work before they are stored

On validation failure, a descriptive error is returned and nothing is stored.
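
The first two checks for AWS might look like the sketch below. Field names, the result type, and the region pattern are assumptions made for illustration (this page does not show the actual allowlist regex). The connectivity test is omitted because it needs the CLI.

```typescript
// Hypothetical input and result shapes.
interface AwsCredentialInput {
  accessKeyId: string;
  secretAccessKey: string;
  region: string;
}

type ValidationResult =
  | { ok: true; value: AwsCredentialInput }
  | { ok: false; error: string };

const AWS_ACCESS_KEY = /^AKIA[0-9A-Z]{16}$/;
// Assumed region pattern, e.g. "us-east-1" or "us-gov-west-1".
const AWS_REGION = /^[a-z]{2}(-[a-z]+)+-\d$/;

function validateAwsFormat(input: AwsCredentialInput): ValidationResult {
  // Step 1: whitespace trimming on all fields.
  const value: AwsCredentialInput = {
    accessKeyId: input.accessKeyId.trim(),
    secretAccessKey: input.secretAccessKey.trim(),
    region: input.region.trim(),
  };
  // Step 2: format validation; the first failure wins and nothing is stored.
  if (!AWS_ACCESS_KEY.test(value.accessKeyId)) {
    return { ok: false, error: 'Access key must start with AKIA followed by 16 uppercase letters or digits.' };
  }
  if (value.secretAccessKey.length !== 40) {
    return { ok: false, error: 'Secret key must be exactly 40 characters.' };
  }
  if (!AWS_REGION.test(value.region)) {
    return { ok: false, error: 'Region is not in the expected format.' };
  }
  return { ok: true, value };
}
```

Returning the trimmed value alongside the success flag lets the caller store exactly what was validated.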

Agent Manager UI — Models Tab

agentManagerPart.ts owns the Models tab UI. It has two modes toggled by _modelsViewMode: 'simple' | 'advanced'.

Simple mode (`_renderSimpleModelsView`)

Hero section with gradient + Ollama status pill
Categorized model grid rendered by _createModelSection()
Each card built by _createCuratedModelCard(): org badge, params badge, star count, progress bar overlay during install
Install flow calls IModelManagementService pull API; progress updates via onPullProgress event

Advanced mode (`renderModelsMarketplace`)

Left sidebar: search input (debounced 300ms), provider chips, category chips
Right pane: model list with detail panel
Cloud deploy wizard (full-screen overlay) triggered from model detail

Cloud deploy wizard

The wizard is a self-contained overlay that tracks its own state machine:

idle → configuring → deploying → (running | error)

During deploying, a scrollable timeline log renders each provisioning step as it comes in, with a live elapsed-time counter and an Abort button.
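
One way to make the wizard's state machine explicit is a transition table, sketched below. The names `WizardState`, `WIZARD_TRANSITIONS`, and `canTransition` are hypothetical, and treating running and error as terminal states is an assumption based on the diagram above.

```typescript
type WizardState = 'idle' | 'configuring' | 'deploying' | 'running' | 'error';

// Allowed next states for each state, mirroring the diagram above.
const WIZARD_TRANSITIONS: Record<WizardState, readonly WizardState[]> = {
  idle: ['configuring'],
  configuring: ['deploying'],
  deploying: ['running', 'error'],
  running: [], // assumed terminal
  error: [],   // assumed terminal
};

function canTransition(from: WizardState, to: WizardState): boolean {
  return WIZARD_TRANSITIONS[from].indexOf(to) !== -1;
}
```

A table like this keeps invalid jumps, such as idle straight to deploying, out of the rendering code.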

Agent Manager UI — Deployments Tab

_renderDeploymentsView() in agentManagerPart.ts:

Subscribes to deploymentRegistryService.onDidChange on mount; unsubscribes on tab switch
Renders two sections: Local Providers and Cloud Deployments
Each row built by _createDeploymentRow() with status badge, endpoint, model list, and action buttons
Action buttons use _createSmallAction() helper (styled inline <button>)
Auto-config status section at the bottom shows applied rules from deploymentAutoConfigService.getAppliedRules()

Adding a New Local Provider

Add an entry to LOCAL_PROVIDERS in deploymentRegistryService.ts:

{
  provider: 'myProvider',
  displayName: 'My Provider',
  defaultEndpoint: 'http://localhost:9000',
  healthPath: '/health',
  modelsPath: '/v1/models',
}

If the models endpoint returns a non-standard shape, extend _fetchLocalModels() with a new branch keyed on config.provider.
Add 'myProvider' to ProviderName in voidSettingsTypes.ts if it isn't already there.
Add the default endpoint to the defaults map in autoConfigService._isProviderConfigured() and _revertCloudConfig().
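
For the non-standard models response mentioned in the steps above, a provider-keyed branch might look like this. It is a hypothetical sketch: `parseModelList` stands in for the parsing part of `_fetchLocalModels()`, the `{ available: [...] }` shape for `myProvider` is invented for illustration, and the default branch assumes an OpenAI-style `{ data: [{ id }] }` list.

```typescript
// Hypothetical stand-in for the response parsing inside _fetchLocalModels().
function parseModelList(provider: string, body: unknown): string[] {
  const obj = (typeof body === 'object' && body !== null ? body : {}) as Record<string, unknown>;
  switch (provider) {
    case 'myProvider': {
      // Invented non-standard shape: { available: ["model-a", "model-b"] }
      const list = obj['available'];
      return Array.isArray(list)
        ? list.filter((m): m is string => typeof m === 'string')
        : [];
    }
    default: {
      // Assumed OpenAI-style shape: { data: [{ id: "model-a" }, ...] }
      const data = obj['data'];
      if (!Array.isArray(data)) return [];
      const ids: string[] = [];
      for (const m of data) {
        if (typeof m === 'object' && m !== null && typeof (m as { id?: unknown }).id === 'string') {
          ids.push((m as { id: string }).id);
        }
      }
      return ids;
    }
  }
}
```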

Adding a New Cloud Provider

Add the new provider to the ICloudDeployment.provider union in cloudTypes.ts.
Implement a deploy method in cloudDeploymentService.ts modeled on _deployAzure() / _deployAWS(), following the existing shell-escape and IMDSv2 patterns exactly.
Add credential validation to cloudCredentialService.ts following the _validateAWS / _validateAzure pattern.
Add a new credential form to the wizard UI in agentManagerPart.ts.

Testing

There are no unit tests for this module yet. Manual testing checklist:

Ollama running → Deployments tab shows green pill, models listed
Ollama stopped → status transitions to unreachable within 30s
Install a model from Simple mode → progress bar renders, model appears in Ollama after download
Cloud deploy → wizard shows timeline, health check count increments, transitions to running
Cloud deploy → Abort mid-provisioning → status transitions to stopped
Auto-config with unconfigured provider → notification shown, endpoint set
Auto-config with already-configured provider → no notification, no change
"Don't auto-configure" → dismissed state persists across IDE restart
Invalid AWS key format → validation error shown before storage
