
Contributing: Model Management

This page covers the internals of the Neural Inverse model management layer for contributors. It assumes familiarity with the VS Code extension architecture and the Neural Inverse DI system.

Source root: src/vs/workbench/contrib/neuralInverse/browser/modelManagement/


Module Map

modelManagement/
├── cloudCredentialService.ts       # Encrypted credential storage + validation
├── cloudDeploymentService.ts       # AWS/Azure provisioning via terminal
├── localProvidersAutoSetup.ts      # Runs once on IDE start — seeds provider settings
├── marketplaceService.ts           # Model catalog (HuggingFace + curated list)
└── deployment/
    ├── deploymentTypes.ts          # Shared types + type guards
    ├── deploymentRegistryService.ts # Unified local+cloud registry, health polling
    ├── autoConfigService.ts        # Auto-applies provider settings on deployment ready
    └── index.ts                    # DI registration + barrel re-exports

Registration entry point: neuralInverse.contribution.ts imports ./modelManagement/deployment/index.js as a side-effect.


DI Services

| Decorator key | Interface | Class | Type |
| --- | --- | --- | --- |
| modelMarketplaceService | IModelMarketplaceService | ModelMarketplaceService | Delayed |
| cloudCredentialService | ICloudCredentialService | CloudCredentialService | Delayed |
| cloudDeploymentService | ICloudDeploymentService | CloudDeploymentService | Delayed |
| deploymentRegistryService | IDeploymentRegistryService | DeploymentRegistryService | Delayed |
| deploymentAutoConfigService | IDeploymentAutoConfigService | DeploymentAutoConfigService | Delayed |

All services are registered with InstantiationType.Delayed: each one is instantiated lazily the first time it is used, not at startup.


Deployment Types

deploymentTypes.ts defines the unified type that represents any running model endpoint:

type IUnifiedDeployment = ILocalDeployment | ICloudDeploymentEntry;

interface ILocalDeployment {
  kind: 'local';
  id: string;                    // e.g. "local-ollama"
  provider: ProviderName;
  displayName: string;
  endpoint: string;
  status: 'running' | 'unreachable' | 'stopped';
  models: string[];
  lastChecked: number;
}

interface ICloudDeploymentEntry {
  kind: 'cloud';
  id: string;
  cloudProvider: 'aws' | 'azure';
  voidProvider: ProviderName;    // always 'vLLM' today
  modelId: string;
  modelName: string;
  endpoint: string;
  status: CloudDeploymentStatus;
  config: IDeploymentConfig;
  createdAt: number;
  costPerHour: number;
}

Type guards: isLocalDeployment(d), isCloudDeployment(d), isDeploymentActive(d)

Helper: getDeploymentEndpoint(d): IDeploymentEndpoint | undefined — extracts { provider, url, apiKey?, modelName? } regardless of deployment kind.
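
The guards and helper can be pictured as ordinary discriminated-union narrowing on the kind field. The sketch below is illustrative only, not the code in deploymentTypes.ts: the types are trimmed to the fields the guards touch, the cloud status values are inferred from statuses mentioned elsewhere on this page, and both the meaning of "active" (status is running) and the choice to return undefined for inactive deployments are assumptions.

```typescript
// Trimmed copies of the types above; only the fields the guards need.
type ProviderName = string; // assumption: the real type is a string-literal union

interface ILocalDeployment {
  kind: 'local';
  id: string;
  provider: ProviderName;
  endpoint: string;
  status: 'running' | 'unreachable' | 'stopped';
}

interface ICloudDeploymentEntry {
  kind: 'cloud';
  id: string;
  voidProvider: ProviderName;
  modelName: string;
  endpoint: string;
  // Assumption: inferred from statuses named on this page, not from cloudTypes.ts.
  status: 'provisioning' | 'running' | 'error' | 'stopped';
}

type IUnifiedDeployment = ILocalDeployment | ICloudDeploymentEntry;

// Narrowing on the `kind` discriminant lets the compiler pick the right shape.
function isLocalDeployment(d: IUnifiedDeployment): d is ILocalDeployment {
  return d.kind === 'local';
}

function isCloudDeployment(d: IUnifiedDeployment): d is ICloudDeploymentEntry {
  return d.kind === 'cloud';
}

// Assumption: "active" means the endpoint is currently serving requests.
function isDeploymentActive(d: IUnifiedDeployment): boolean {
  return d.status === 'running';
}

// Assumption: inactive deployments yield undefined; the real helper may differ.
function getDeploymentEndpoint(
  d: IUnifiedDeployment,
): { provider: ProviderName; url: string; modelName?: string } | undefined {
  if (!isDeploymentActive(d)) {
    return undefined;
  }
  return isLocalDeployment(d)
    ? { provider: d.provider, url: d.endpoint }
    : { provider: d.voidProvider, url: d.endpoint, modelName: d.modelName };
}
```

Callers that only need a URL can go through getDeploymentEndpoint and never branch on kind themselves.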


DeploymentRegistryService

IDeploymentRegistryService is the single source of truth for what is running.

Local detection

On construction and every 30 seconds, the service runs _refreshLocalDeployments():

for each LOCAL_PROVIDER (ollama, vLLM, lmStudio):
  fetch(endpoint + healthPath, timeout: 5s)
  if ok → fetch models list → upsert ILocalDeployment{status: 'running'}
  if fail → upsert ILocalDeployment{status: 'unreachable'|'stopped'}

No external service is injected for local detection — it uses fetch() directly against well-known localhost ports. This avoids circular DI and keeps the registry lightweight.
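
The upsert half of that loop can be sketched as a small synchronous function that takes the outcome of one probe. Everything here is a hypothetical illustration: ProbeResult and upsertFromProbe are not names from the source, the local-<provider> id format is inferred from the "local-ollama" example, and keeping the last known model list on failure is an assumption. The source does not say when stopped is chosen over unreachable, so the sketch only uses unreachable.

```typescript
type LocalStatus = 'running' | 'unreachable' | 'stopped';

interface LocalProviderConfig {
  provider: string;
  displayName: string;
  defaultEndpoint: string;
}

interface ILocalDeployment {
  kind: 'local';
  id: string;
  provider: string;
  displayName: string;
  endpoint: string;
  status: LocalStatus;
  models: string[];
  lastChecked: number;
}

// Outcome of one health probe; in the service this would come from fetch().
type ProbeResult = { ok: true; models: string[] } | { ok: false };

function upsertFromProbe(
  registry: Map<string, ILocalDeployment>,
  config: LocalProviderConfig,
  probe: ProbeResult,
  now: number,
): ILocalDeployment {
  // Assumption: ids follow the "local-ollama" pattern shown in the type comment.
  const id = `local-${config.provider}`;
  const previous = registry.get(id);
  const next: ILocalDeployment = {
    kind: 'local',
    id,
    provider: config.provider,
    displayName: config.displayName,
    endpoint: config.defaultEndpoint,
    status: probe.ok ? 'running' : 'unreachable',
    // Assumption: keep the last known models while the provider is down.
    models: probe.ok ? probe.models : (previous?.models ?? []),
    lastChecked: now,
  };
  registry.set(id, next); // one record per provider, replaced on every poll
  return next;
}
```

Because the record is keyed by provider, repeated polls overwrite the same entry instead of growing the registry.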

Cloud sync

_syncCloudDeployments() reads from ICloudDeploymentService.listDeployments() and upserts ICloudDeploymentEntry records. It is called on init and whenever cloudDeploymentService.onDeploymentStatusChanged fires.

State transitions + events

The registry tracks _previousStates: Map<string, boolean> to diff state across polls:

| Transition | Event fired |
| --- | --- |
| false → true (deployment came up) | onDeploymentBecameReady |
| true → false (deployment went down) | onDeploymentWentDown |
| Any change | onDidChange |
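
A minimal sketch of that diffing step, assuming plain callbacks in place of the registry's real event emitters. StateDiffer and applyPoll are hypothetical names, and treating an id seen for the first time as previously inactive is an assumption rather than documented behavior.

```typescript
type DeploymentListener = (deploymentId: string) => void;

class StateDiffer {
  // Last observed "is active" flag per deployment id, like _previousStates.
  private readonly previousStates = new Map<string, boolean>();
  private readonly onBecameReady: DeploymentListener;
  private readonly onWentDown: DeploymentListener;
  private readonly onDidChange: () => void;

  constructor(
    onBecameReady: DeploymentListener,
    onWentDown: DeploymentListener,
    onDidChange: () => void,
  ) {
    this.onBecameReady = onBecameReady;
    this.onWentDown = onWentDown;
    this.onDidChange = onDidChange;
  }

  // `current` maps deployment id -> whether it is active in this poll.
  applyPoll(current: ReadonlyMap<string, boolean>): void {
    let changed = false;
    for (const [id, isActive] of current) {
      // Assumption: an unknown id counts as previously inactive.
      const wasActive = this.previousStates.get(id) ?? false;
      if (!this.previousStates.has(id) || wasActive !== isActive) {
        changed = true;
      }
      if (!wasActive && isActive) {
        this.onBecameReady(id); // false -> true
      } else if (wasActive && !isActive) {
        this.onWentDown(id); // true -> false
      }
      this.previousStates.set(id, isActive);
    }
    if (changed) {
      this.onDidChange(); // fired once per poll, after the specific events
    }
  }
}
```

A poll that reports the same flags as the previous one fires nothing, which is what keeps the auto-config listener from re-running every 30 seconds.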

Persistence

State is serialized to IStorageService (profile scope, key neuralInverse.deploymentRegistry) so the Deployments tab can render immediately on IDE start before the first health check completes.
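
As a rough illustration of the round trip, the sketch below serializes a snapshot to JSON under the storage key and tolerates missing or corrupt data on load. KeyValueStore is a hypothetical stand-in: the real IStorageService get and store calls also take scope (and, for store, target) arguments, which are omitted here. PersistedDeployment is likewise an assumed shape, not the registry's actual snapshot format.

```typescript
// Hypothetical stand-in for the slice of IStorageService this sketch needs.
interface KeyValueStore {
  get(key: string): string | undefined;
  store(key: string, value: string): void;
}

const STORAGE_KEY = 'neuralInverse.deploymentRegistry';

// Assumed snapshot shape; the real registry may persist more fields.
interface PersistedDeployment {
  id: string;
  kind: 'local' | 'cloud';
  endpoint: string;
  status: string;
}

function saveRegistry(store: KeyValueStore, deployments: PersistedDeployment[]): void {
  store.store(STORAGE_KEY, JSON.stringify(deployments));
}

// Returns the last snapshot, or an empty list if nothing usable is stored.
function loadRegistry(store: KeyValueStore): PersistedDeployment[] {
  const raw = store.get(STORAGE_KEY);
  if (!raw) {
    return [];
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as PersistedDeployment[]) : [];
  } catch {
    // Corrupt JSON should not block startup; the next health check repopulates state.
    return [];
  }
}
```

The loaded snapshot is only a placeholder for first paint; statuses in it may be stale until the first poll completes.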


DeploymentAutoConfigService

Listens on registryService.onDeploymentBecameReady. When a deployment becomes ready:

  1. Guard: already dismissed? If the user selected "Don't auto-configure" for this provider, skip.
  2. Guard: already configured? Checks settingsOfProvider[provider]:
    • endpoint differs from default → skip
    • apiKey is non-empty → skip
    • _didFillInProviderSettings is true → skip
  3. Apply:
    • Local deployment: call voidSettingsService.setAutodetectedModels(provider, models, { enableProviderOnSuccess: true })
    • Cloud deployment: set endpoint, apiKey, and models via setSettingOfProvider + setAutodetectedModels
  4. Notify: show a notification with Undo + "Don't auto-configure" actions.

Applied rules are stored per profile (neuralInverse.deployment.autoConfigRules) and can be iterated via getAppliedRules().
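
The two guards described above reduce to a pure decision function. This is a sketch under assumptions: decideAutoConfig and the Decision type are hypothetical names, the dismissed set stands in for however the service persists "Don't auto-configure" choices, and ProviderSettings carries only the three fields the guards inspect.

```typescript
interface ProviderSettings {
  endpoint: string;
  apiKey: string;
  _didFillInProviderSettings: boolean | undefined;
}

type Decision = 'skip-dismissed' | 'skip-configured' | 'apply';

function decideAutoConfig(
  provider: string,
  settings: ProviderSettings,
  defaultEndpoint: string,
  dismissed: ReadonlySet<string>, // providers the user opted out of
): Decision {
  // Guard 1: the user chose "Don't auto-configure" for this provider.
  if (dismissed.has(provider)) {
    return 'skip-dismissed';
  }
  // Guard 2: any sign of manual configuration means hands off.
  const alreadyConfigured =
    settings.endpoint !== defaultEndpoint ||
    settings.apiKey !== '' ||
    settings._didFillInProviderSettings === true;
  return alreadyConfigured ? 'skip-configured' : 'apply';
}
```

Keeping the decision separate from the apply and notify steps makes the guard order easy to check in isolation.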


CloudDeploymentService

cloudDeploymentService.ts handles AWS EC2 and Azure VM provisioning.

Security design

| Concern | Implementation |
| --- | --- |
| Command injection | _shellEscape(s) wraps all user-supplied strings in single quotes and escapes embedded single quotes |
| Open endpoints | vLLM starts with --api-key <32-byte-hex> generated at deploy time |
| SSRF via IMDS | EC2 instances launched with HttpTokens=required (IMDSv2 only) |
| Overly permissive firewall | Security group created with inbound rule scoped to caller's IP only |
| Stale deployments | Any deployment stuck in provisioning > 20 min is transitioned to error on next load |
| Runaway deploys | 15-minute hard timeout per deployment; AbortController cancels the polling loop |
| Credential leakage | Credentials read from ISecretStorageService at deploy time, never stored in plain state |
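
The single-quote escaping technique named in the first row can be sketched as follows. This is the standard POSIX-shell idiom, shown as an illustration rather than the project's _shellEscape itself, and it assumes the terminal runs a POSIX shell (it does not apply to PowerShell or cmd.exe).

```typescript
// Wraps a value in single quotes for a POSIX shell. Inside single quotes
// nothing is interpreted, so the only character that needs handling is the
// single quote itself: close the string, emit an escaped quote, reopen it.
function shellEscape(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}
```

For example, shellEscape("it's") yields 'it'\''s', which the shell reads back as the literal text it's, and an input like x; rm -rf / stays inside one quoted argument.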

Terminal execution

The service uses ITerminalService to run AWS CLI / Azure CLI commands in a named terminal (Neural Inverse — Cloud Deploy). This means users can see exactly what commands are being run, and the terminal persists after deployment for debugging.

Terminal creation:

const terminal = await this.terminalService.createTerminal({ config: { name } });

Health check loop

After provisioning, the service polls GET <endpoint>/health with:

  • 30-second interval
  • MAX_HEALTH_RETRIES = 20 (10 minutes total)
  • Each retry increments a counter shown in the wizard UI
  • On success: transitions to running, fires onDeploymentStatusChanged
  • On exhaustion: transitions to error

Abort

abort(deploymentId) triggers the deployment's AbortController, which cancels the in-flight health-check fetch and transitions the deployment to stopped.
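
The retry loop and its interaction with abort can be sketched with the probe and the delay injected, so the logic runs without a network or real timers. HealthCheckDeps, probe, sleep, and onAttempt are hypothetical names; only the interval and MAX_HEALTH_RETRIES come from this page. Unlike the service, which cancels the in-flight fetch, this sketch only checks the abort flag between steps.

```typescript
type HealthOutcome = 'running' | 'error' | 'stopped';

interface HealthCheckDeps {
  probe: () => Promise<boolean>; // true when GET <endpoint>/health succeeds
  sleep: (ms: number) => Promise<void>;
  signal: { readonly aborted: boolean }; // structurally compatible with AbortSignal
  onAttempt?: (attempt: number) => void; // e.g. update the wizard counter
}

const HEALTH_INTERVAL_MS = 30000;
const MAX_HEALTH_RETRIES = 20; // 20 x 30 s = 10 minutes

async function waitUntilHealthy(deps: HealthCheckDeps): Promise<HealthOutcome> {
  for (let attempt = 1; attempt <= MAX_HEALTH_RETRIES; attempt++) {
    if (deps.signal.aborted) {
      return 'stopped';
    }
    deps.onAttempt?.(attempt);
    // A probe that throws (refused connection, timeout) counts as a failed attempt.
    const healthy = await deps.probe().catch(() => false);
    if (deps.signal.aborted) {
      return 'stopped';
    }
    if (healthy) {
      return 'running';
    }
    await deps.sleep(HEALTH_INTERVAL_MS);
  }
  // Retries exhausted without a healthy response.
  return deps.signal.aborted ? 'stopped' : 'error';
}
```

In production the signal would be the AbortController's signal and sleep a timer-backed delay; passing stubs keeps the loop unit-testable, which matters given the module has no tests yet.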


CloudCredentialService

Credentials pass through three checks before they are stored:

  1. Whitespace trimming: leading/trailing whitespace is stripped from all fields
  2. Format validation:
    • AWS: access key must match /^AKIA[0-9A-Z]{16}$/, secret key must be 40 chars
    • Azure: subscription ID and tenant ID must be valid GUIDs (/^[0-9a-f-]{36}$/i)
    • Region: validated against an allowlist regex to prevent SSRF
  3. Connectivity test: a lightweight CLI call (aws sts get-caller-identity for AWS, az account show for Azure) confirms the credentials work before they are stored

On validation failure, a descriptive error is returned and nothing is stored.
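
The first two layers (trimming and format checks) can be sketched with the regexes quoted above. The function and type names are hypothetical, the error strings are made up for illustration, and the region allowlist and the connectivity test are left out because this page does not give their details.

```typescript
interface AwsCredentials {
  accessKeyId: string;
  secretAccessKey: string;
}

type ValidationResult =
  | { ok: true; value: AwsCredentials }
  | { ok: false; error: string };

const AWS_ACCESS_KEY_RE = /^AKIA[0-9A-Z]{16}$/;
// Note: this pattern checks length and character set only, not hyphen positions.
const AZURE_GUID_RE = /^[0-9a-f-]{36}$/i;

function validateAwsFormat(input: AwsCredentials): ValidationResult {
  // Layer 1: trim before any comparison so pasted values with stray spaces pass.
  const value: AwsCredentials = {
    accessKeyId: input.accessKeyId.trim(),
    secretAccessKey: input.secretAccessKey.trim(),
  };
  // Layer 2: format checks; nothing is stored when either fails.
  if (!AWS_ACCESS_KEY_RE.test(value.accessKeyId)) {
    return { ok: false, error: 'Access key must be AKIA followed by 16 uppercase letters or digits.' };
  }
  if (value.secretAccessKey.length !== 40) {
    return { ok: false, error: 'Secret key must be exactly 40 characters.' };
  }
  return { ok: true, value };
}

function isAzureGuid(id: string): boolean {
  return AZURE_GUID_RE.test(id.trim());
}
```

Returning a result object instead of throwing lets the wizard show the descriptive error inline next to the offending field.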


Agent Manager UI — Models Tab

agentManagerPart.ts owns the Models tab UI. It has two modes toggled by _modelsViewMode: 'simple' | 'advanced'.

Simple mode (_renderSimpleModelsView)

  • Hero section with gradient + Ollama status pill
  • Categorized model grid rendered by _createModelSection()
  • Each card built by _createCuratedModelCard(): org badge, params badge, star count, progress bar overlay during install
  • Install flow calls the IModelManagementService pull API; progress updates arrive via the onPullProgress event

Advanced mode (renderModelsMarketplace)

  • Left sidebar: search input (debounced 300ms), provider chips, category chips
  • Right pane: model list with detail panel
  • Cloud deploy wizard (full-screen overlay) triggered from model detail

Cloud deploy wizard

The wizard is a self-contained overlay that tracks its own state machine:

idle → configuring → deploying → (running | error)

During deploying, a scrollable timeline log renders each provisioning step as it comes in, with a live elapsed-time counter and an Abort button.
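
One way to keep such a state machine honest is an explicit transition table that rejects anything not drawn in the diagram. The sketch below is an assumption about structure, not the wizard's actual code: WizardState, TRANSITIONS, and transition are hypothetical names, and treating running and error as terminal is a simplification (the real wizard may allow closing or retrying from them).

```typescript
type WizardState = 'idle' | 'configuring' | 'deploying' | 'running' | 'error';

// Allowed next states, mirroring: idle -> configuring -> deploying -> (running | error)
const TRANSITIONS: Record<WizardState, readonly WizardState[]> = {
  idle: ['configuring'],
  configuring: ['deploying'],
  deploying: ['running', 'error'],
  running: [], // assumption: terminal in this sketch
  error: [], // assumption: terminal in this sketch
};

// Returns the new state, or throws if the diagram has no such edge.
function transition(from: WizardState, to: WizardState): WizardState {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`Illegal wizard transition: ${from} -> ${to}`);
  }
  return to;
}
```

Adding a new state then forces a compile error until its row is added to TRANSITIONS, because Record<WizardState, ...> requires every key.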


Agent Manager UI — Deployments Tab

_renderDeploymentsView() in agentManagerPart.ts:

  • Subscribes to deploymentRegistryService.onDidChange on mount; unsubscribes on tab switch
  • Renders two sections: Local Providers and Cloud Deployments
  • Each row built by _createDeploymentRow() with status badge, endpoint, model list, and action buttons
  • Action buttons use _createSmallAction() helper (styled inline <button>)
  • Auto-config status section at the bottom shows applied rules from deploymentAutoConfigService.getAppliedRules()

Adding a New Local Provider

  1. Add an entry to LOCAL_PROVIDERS in deploymentRegistryService.ts:
    {
      provider: 'myProvider',
      displayName: 'My Provider',
      defaultEndpoint: 'http://localhost:9000',
      healthPath: '/health',
      modelsPath: '/v1/models',
    }
  2. If the models endpoint returns a non-standard shape, extend _fetchLocalModels() with a new branch keyed on config.provider.
  3. Add 'myProvider' to ProviderName in voidSettingsTypes.ts if it isn't already there.
  4. Add the default endpoint to the defaults map in autoConfigService._isProviderConfigured() and _revertCloudConfig().
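
For step 2, a provider-keyed branch for parsing the models response might look like the sketch below. It is not the body of _fetchLocalModels(): parseModelList and namesFrom are hypothetical helpers, and which path the registry actually queries for each provider is an assumption. The two shapes shown are the OpenAI-compatible /v1/models response ({ data: [{ id }] }) and Ollama's /api/tags response ({ models: [{ name }] }).

```typescript
// Extracts model names from a parsed JSON body, branching on provider.
function parseModelList(provider: string, body: unknown): string[] {
  if (typeof body !== 'object' || body === null) {
    return [];
  }
  const record = body as Record<string, unknown>;
  if (provider === 'ollama') {
    // Ollama /api/tags: { models: [{ name: "llama3:8b", ... }] }
    return namesFrom(record.models, 'name');
  }
  // Default: OpenAI-compatible /v1/models: { data: [{ id: "..." }] }
  return namesFrom(record.data, 'id');
}

// Collects string values of `key` from an array of objects, skipping bad entries.
function namesFrom(list: unknown, key: string): string[] {
  if (!Array.isArray(list)) {
    return [];
  }
  const names: string[] = [];
  for (const item of list) {
    const value =
      typeof item === 'object' && item !== null
        ? (item as Record<string, unknown>)[key]
        : undefined;
    if (typeof value === 'string') {
      names.push(value);
    }
  }
  return names;
}
```

A new provider with its own response shape would add one more branch before the default, leaving existing providers untouched.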

Adding a New Cloud Provider

  1. Add the provider type to ICloudDeployment.provider union in cloudTypes.ts.
  2. Implement a deploy method in cloudDeploymentService.ts modeled on _deployAWS() / _deployAzure(); follow the existing shell-escape and IMDSv2 patterns exactly.
  3. Add credential validation to cloudCredentialService.ts following the _validateAWS / _validateAzure pattern.
  4. Add a new credential form to the wizard UI in agentManagerPart.ts.

Testing

There are no unit tests for this module yet. Manual testing checklist:

  • Ollama running → Deployments tab shows green pill, models listed
  • Ollama stopped → status transitions to unreachable within 30s
  • Install a model from Simple mode → progress bar renders, model appears in Ollama after download
  • Cloud deploy → wizard shows timeline, health check count increments, transitions to running
  • Cloud deploy → Abort mid-provisioning → status transitions to stopped
  • Auto-config with unconfigured provider → notification shown, endpoint set
  • Auto-config with already-configured provider → no notification, no change
  • "Don't auto-configure" → dismissed state persists across IDE restart
  • Invalid AWS key format → validation error shown before storage
