Contributing: Model Management
This page covers the internals of the Neural Inverse model management layer for contributors. It assumes familiarity with the VS Code extension architecture and the Neural Inverse DI system.
Source root: src/vs/workbench/contrib/neuralInverse/browser/modelManagement/
Module Map
modelManagement/
├── cloudCredentialService.ts        # Encrypted credential storage + validation
├── cloudDeploymentService.ts        # AWS/Azure provisioning via terminal
├── localProvidersAutoSetup.ts       # Runs once on IDE start, seeds provider settings
├── marketplaceService.ts            # Model catalog (HuggingFace + curated list)
└── deployment/
    ├── deploymentTypes.ts           # Shared types + type guards
    ├── deploymentRegistryService.ts # Unified local+cloud registry, health polling
    ├── autoConfigService.ts         # Auto-applies provider settings on deployment ready
    └── index.ts                     # DI registration + barrel re-exports

Registration entry point: neuralInverse.contribution.ts imports ./modelManagement/deployment/index.js as a side effect.
DI Services
| Decorator key | Interface | Class | Type |
|---|---|---|---|
| modelMarketplaceService | IModelMarketplaceService | ModelMarketplaceService | Delayed |
| cloudCredentialService | ICloudCredentialService | CloudCredentialService | Delayed |
| cloudDeploymentService | ICloudDeploymentService | CloudDeploymentService | Delayed |
| deploymentRegistryService | IDeploymentRegistryService | DeploymentRegistryService | Delayed |
| deploymentAutoConfigService | IDeploymentAutoConfigService | DeploymentAutoConfigService | Delayed |
All services are registered with InstantiationType.Delayed: they are instantiated lazily the first time a consumer actually uses them, not at startup.
Deployment Types
deploymentTypes.ts defines the unified type that represents any running model endpoint:
type IUnifiedDeployment = ILocalDeployment | ICloudDeploymentEntry;

interface ILocalDeployment {
  kind: 'local';
  id: string; // e.g. "local-ollama"
  provider: ProviderName;
  displayName: string;
  endpoint: string;
  status: 'running' | 'unreachable' | 'stopped';
  models: string[];
  lastChecked: number;
}

interface ICloudDeploymentEntry {
  kind: 'cloud';
  id: string;
  cloudProvider: 'aws' | 'azure';
  voidProvider: ProviderName; // always 'vLLM' today
  modelId: string;
  modelName: string;
  endpoint: string;
  status: CloudDeploymentStatus;
  config: IDeploymentConfig;
  createdAt: number;
  costPerHour: number;
}

Type guards: isLocalDeployment(d), isCloudDeployment(d), isDeploymentActive(d)
Helper: getDeploymentEndpoint(d): IDeploymentEndpoint | undefined — extracts { provider, url, apiKey?, modelName? } regardless of deployment kind.
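The guards and helper above could look roughly like the following. This is a minimal sketch on trimmed-down copies of the two interfaces, not the code in deploymentTypes.ts: the CloudDeploymentStatus members, the reading of "active" as status === 'running', and the rule that an empty endpoint yields undefined are all assumptions made for illustration.

```typescript
// Trimmed-down stand-ins for the real types (assumptions, see lead-in).
type ProviderName = string;
type CloudDeploymentStatus = 'provisioning' | 'running' | 'error' | 'stopped';

interface ILocalDeployment {
  kind: 'local';
  provider: ProviderName;
  endpoint: string;
  status: 'running' | 'unreachable' | 'stopped';
  models: string[];
}

interface ICloudDeploymentEntry {
  kind: 'cloud';
  voidProvider: ProviderName;
  endpoint: string;
  status: CloudDeploymentStatus;
  modelName: string;
}

type IUnifiedDeployment = ILocalDeployment | ICloudDeploymentEntry;

interface IDeploymentEndpoint {
  provider: ProviderName;
  url: string;
  apiKey?: string;
  modelName?: string;
}

// Discriminate on the `kind` tag so callers get narrowed types.
function isLocalDeployment(d: IUnifiedDeployment): d is ILocalDeployment {
  return d.kind === 'local';
}

function isCloudDeployment(d: IUnifiedDeployment): d is ICloudDeploymentEntry {
  return d.kind === 'cloud';
}

// Assumption: "active" means the endpoint is currently serving requests.
function isDeploymentActive(d: IUnifiedDeployment): boolean {
  return d.status === 'running';
}

// Assumption: a deployment without an endpoint yet has nothing to extract.
function getDeploymentEndpoint(d: IUnifiedDeployment): IDeploymentEndpoint | undefined {
  if (!d.endpoint) {
    return undefined;
  }
  if (isLocalDeployment(d)) {
    return { provider: d.provider, url: d.endpoint };
  }
  return { provider: d.voidProvider, url: d.endpoint, modelName: d.modelName };
}
```

Because both guards are type predicates, code after an isLocalDeployment(d) check can read d.models without a cast.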
DeploymentRegistryService
IDeploymentRegistryService is the single source of truth for what is running.
Local detection
On construction and every 30 seconds, the service runs _refreshLocalDeployments():
for each LOCAL_PROVIDER (ollama, vLLM, lmStudio):
  fetch(endpoint + healthPath, timeout: 5s)
  if ok   → fetch models list → upsert ILocalDeployment{status: 'running'}
  if fail → upsert ILocalDeployment{status: 'unreachable'|'stopped'}

No external service is injected for local detection: it uses fetch() directly against well-known localhost ports. This avoids circular DI and keeps the registry lightweight.
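One way to sketch a single probe from the pseudocode above is shown below. It is illustrative only: probeLocalProvider, fetchWithTimeout, and the FetchLike type are hypothetical names, the fetch function is passed in so it can be faked, the models response is assumed to be OpenAI-style ({ data: [{ id }] }), and every failure is collapsed to 'unreachable' rather than distinguishing 'stopped'.

```typescript
interface LocalProviderConfig {
  provider: string;
  defaultEndpoint: string;
  healthPath: string;
  modelsPath: string;
}

interface ProbeResult {
  status: 'running' | 'unreachable';
  models: string[];
}

// Minimal shapes so the global fetch or a test double can be injected.
interface ResponseLike {
  ok: boolean;
  json(): Promise<unknown>;
}
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<ResponseLike>;

// Abort the request if it has not settled within timeoutMs.
async function fetchWithTimeout(fetchFn: FetchLike, url: string, timeoutMs: number): Promise<ResponseLike> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchFn(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}

async function probeLocalProvider(
  config: LocalProviderConfig,
  fetchFn: FetchLike,
  timeoutMs: number = 5000,
): Promise<ProbeResult> {
  try {
    const health = await fetchWithTimeout(fetchFn, config.defaultEndpoint + config.healthPath, timeoutMs);
    if (!health.ok) {
      return { status: 'unreachable', models: [] };
    }
    const res = await fetchWithTimeout(fetchFn, config.defaultEndpoint + config.modelsPath, timeoutMs);
    const body = res.ok ? await res.json() : undefined;
    // Assumption: OpenAI-style listing, { data: [{ id: string }] }.
    const data = (body as { data?: unknown } | undefined)?.data;
    const models: string[] = [];
    if (Array.isArray(data)) {
      for (const item of data) {
        if (typeof item === 'object' && item !== null && typeof item.id === 'string') {
          models.push(item.id);
        }
      }
    }
    return { status: 'running', models };
  } catch {
    // Connection refused, timeout (abort), or malformed JSON.
    return { status: 'unreachable', models: [] };
  }
}
```

In the IDE the global fetch would be passed as fetchFn; injecting it keeps the probe testable without a running provider.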
Cloud sync
_syncCloudDeployments() reads from ICloudDeploymentService.listDeployments() and upserts ICloudDeploymentEntry records. It is called on init and whenever cloudDeploymentService.onDeploymentStatusChanged fires.
State transitions + events
The registry tracks _previousStates: Map<string, boolean> to diff state across polls:
| Transition | Event fired |
|---|---|
| false → true (deployment came up) | onDeploymentBecameReady |
| true → false (deployment went down) | onDeploymentWentDown |
| Any change | onDidChange |
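The diffing in the table can be illustrated with a small pure function. This is a sketch under assumptions, not the registry's code: diffActiveStates and StateDiff are hypothetical names, an id never seen before is treated as previously inactive, an id that disappears is treated as inactive, and only flips of the active flag count as a change here (the real onDidChange may also fire for other updates, such as a changed model list).

```typescript
interface StateDiff {
  becameReady: string[]; // ids that would trigger onDeploymentBecameReady
  wentDown: string[];    // ids that would trigger onDeploymentWentDown
  didChange: boolean;    // whether onDidChange would fire in this sketch
}

function diffActiveStates(previous: Map<string, boolean>, current: Map<string, boolean>): StateDiff {
  const becameReady: string[] = [];
  const wentDown: string[] = [];

  current.forEach((isActive, id) => {
    // Assumption: an unseen deployment starts out as "not active".
    const wasActive = previous.get(id) ?? false;
    if (!wasActive && isActive) {
      becameReady.push(id);
    } else if (wasActive && !isActive) {
      wentDown.push(id);
    }
  });

  // Assumption: a deployment that vanished from the registry counts as down.
  previous.forEach((wasActive, id) => {
    if (wasActive && !current.has(id)) {
      wentDown.push(id);
    }
  });

  return { becameReady, wentDown, didChange: becameReady.length + wentDown.length > 0 };
}
```

After each poll the caller would fire the events and then replace the previous map with the current one, so the next poll diffs against fresh state.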
Persistence
State is serialized to IStorageService (profile scope, key neuralInverse.deploymentRegistry) so the Deployments tab can render immediately on IDE start before the first health check completes.
DeploymentAutoConfigService
Listens on registryService.onDeploymentBecameReady. When a deployment becomes ready:
- Guard: already dismissed? If the user selected "Don't auto-configure" for this provider, skip.
- Guard: already configured? Checks settingsOfProvider[provider]:
  - endpoint differs from default → skip
  - apiKey is non-empty → skip
  - _didFillInProviderSettings is true → skip
- Apply:
  - Local deployment: call voidSettingsService.setAutodetectedModels(provider, models, { enableProviderOnSuccess: true })
  - Cloud deployment: set endpoint, apiKey, and models via setSettingOfProvider + setAutodetectedModels
- Notify: show a notification with Undo and "Don't auto-configure" actions.
Applied rules are stored per profile (neuralInverse.deployment.autoConfigRules) and can be iterated via getAppliedRules().
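The two guards can be read as a single predicate. The sketch below assumes a simplified ProviderSettings shape and hypothetical function names (isProviderConfigured, shouldAutoConfigure); the real checks live in autoConfigService.ts and may differ in detail.

```typescript
// Simplified view of one entry in settingsOfProvider (assumption).
interface ProviderSettings {
  endpoint: string;
  apiKey: string;
  _didFillInProviderSettings?: boolean;
}

// True if the user appears to have touched this provider's settings.
function isProviderConfigured(settings: ProviderSettings, defaultEndpoint: string): boolean {
  return (
    settings.endpoint !== defaultEndpoint ||
    settings.apiKey !== '' ||
    settings._didFillInProviderSettings === true
  );
}

// Combines both guards: dismissed providers and already-configured providers are skipped.
function shouldAutoConfigure(
  provider: string,
  settings: ProviderSettings,
  defaultEndpoint: string,
  dismissedProviders: Set<string>,
): boolean {
  if (dismissedProviders.has(provider)) {
    return false;
  }
  return !isProviderConfigured(settings, defaultEndpoint);
}
```

Keeping the decision in a pure function like this makes the "already configured" cases from the manual checklist straightforward to cover in a future unit test.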
CloudDeploymentService
cloudDeploymentService.ts handles AWS EC2 and Azure VM provisioning.
Security design
| Concern | Implementation |
|---|---|
| Command injection | _shellEscape(s) wraps all user-supplied strings in single quotes and escapes embedded single quotes |
| Open endpoints | vLLM starts with --api-key <32-byte-hex> generated at deploy time |
| SSRF via IMDS | EC2 instances launched with HttpTokens=required (IMDSv2 only) |
| Overly permissive firewall | Security group created with inbound rule scoped to caller's IP only |
| Stale deployments | Any deployment stuck in provisioning > 20 min is transitioned to error on next load |
| Runaway deploys | 15-minute hard timeout per deployment; AbortController cancels the polling loop |
| Credential leakage | Credentials read from ISecretStorageService at deploy time, never stored in plain state |
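The single-quote escaping described in the first row is the standard POSIX technique: close the quoted string, emit an escaped quote, and reopen it. A minimal sketch follows (the standalone name shellEscape is illustrative; the service's _shellEscape may differ, and this form is only valid for POSIX shells, not cmd.exe or PowerShell).

```typescript
// Wrap a string in single quotes for a POSIX shell.
// Inside single quotes nothing is special except the single quote itself,
// so each embedded ' becomes '\'' (close quote, escaped quote, reopen quote).
function shellEscape(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}
```

For example, a model name such as `x; rm -rf /` becomes the single argument `'x; rm -rf /'`, so the shell never interprets the semicolon.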
Terminal execution
The service uses ITerminalService to run AWS CLI / Azure CLI commands in a named terminal (Neural Inverse — Cloud Deploy). This means users can see exactly what commands are being run, and the terminal persists after deployment for debugging.
Terminal creation:
const terminal = await this.terminalService.createTerminal({ config: { name } });
Health check loop
After provisioning, the service polls GET <endpoint>/health with:
- 30-second interval
- MAX_HEALTH_RETRIES = 20 (10 minutes total)
- Each retry increments a counter shown in the wizard UI
- On success: transitions to running, fires onDeploymentStatusChanged
- On exhaustion: transitions to error
Abort
abort(deploymentId) sets the AbortController signal, which cancels the health-check fetch and transitions the deployment to stopped.
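The polling and abort behavior described in this section can be sketched as one cancellable loop. This is an illustration under assumptions, not the service's implementation: waitForHealthy, its options object, and the injected probe function are hypothetical, and the defaults mirror the 30-second interval and MAX_HEALTH_RETRIES = 20 stated above.

```typescript
type HealthOutcome = 'running' | 'error' | 'stopped';

interface HealthCheckOptions {
  probe: (signal: AbortSignal) => Promise<boolean>; // e.g. GET <endpoint>/health
  signal: AbortSignal;                              // aborted by abort(deploymentId)
  maxRetries?: number;
  intervalMs?: number;
  onRetry?: (attempt: number) => void;              // e.g. update the wizard counter
}

// Sleep that resolves early when the signal is aborted.
function sleep(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise<void>(resolve => {
    if (signal.aborted) {
      resolve();
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      resolve();
    };
    const timer = setTimeout(() => {
      signal.removeEventListener('abort', onAbort);
      resolve();
    }, ms);
    signal.addEventListener('abort', onAbort, { once: true });
  });
}

async function waitForHealthy(opts: HealthCheckOptions): Promise<HealthOutcome> {
  const maxRetries = opts.maxRetries ?? 20;
  const intervalMs = opts.intervalMs ?? 30000;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    if (opts.signal.aborted) {
      return 'stopped';
    }
    if (opts.onRetry) {
      opts.onRetry(attempt);
    }
    try {
      if (await opts.probe(opts.signal)) {
        return 'running';
      }
    } catch {
      // A network error or aborted fetch counts as a failed attempt.
    }
    if (opts.signal.aborted) {
      return 'stopped';
    }
    if (attempt < maxRetries) {
      await sleep(intervalMs, opts.signal);
    }
  }
  return 'error'; // retries exhausted
}
```

Passing the same signal to both the probe and the sleep means an abort takes effect immediately instead of waiting out the current interval.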
CloudCredentialService
Stored credentials go through three layers before being accepted:
- Whitespace trimming: leading/trailing whitespace stripped from all fields
- Format validation:
  - AWS: access key must match /^AKIA[0-9A-Z]{16}$/, secret key must be 40 chars
  - Azure: subscription ID and tenant ID must be valid GUIDs (/^[0-9a-f-]{36}$/i)
  - Region: validated against an allowlist regex to prevent SSRF
- Connectivity test: a lightweight API call (sts.get-caller-identity for AWS, az account show for Azure) confirms the credentials work before storing
On validation failure, a descriptive error is returned and nothing is stored.
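The first two layers (trimming and format validation) are pure string checks and could be sketched as below, using the regexes quoted above. The field names, function names, and error messages are hypothetical; region validation and the connectivity test are left out because the source does not give their details.

```typescript
// Hypothetical credential shapes for illustration.
interface AwsCredentials {
  accessKeyId: string;
  secretAccessKey: string;
}
interface AzureCredentials {
  subscriptionId: string;
  tenantId: string;
}

const AWS_ACCESS_KEY_RE = /^AKIA[0-9A-Z]{16}$/;
const GUID_RE = /^[0-9a-f-]{36}$/i;

// Returns an error message, or undefined when the format is acceptable.
function validateAwsFormat(raw: AwsCredentials): string | undefined {
  const accessKeyId = raw.accessKeyId.trim();
  const secretAccessKey = raw.secretAccessKey.trim();
  if (!AWS_ACCESS_KEY_RE.test(accessKeyId)) {
    return 'AWS access key must be "AKIA" followed by 16 uppercase letters or digits.';
  }
  if (secretAccessKey.length !== 40) {
    return 'AWS secret key must be exactly 40 characters.';
  }
  return undefined;
}

function validateAzureFormat(raw: AzureCredentials): string | undefined {
  if (!GUID_RE.test(raw.subscriptionId.trim())) {
    return 'Azure subscription ID must be a GUID.';
  }
  if (!GUID_RE.test(raw.tenantId.trim())) {
    return 'Azure tenant ID must be a GUID.';
  }
  return undefined;
}
```

Returning a message instead of throwing matches the behavior described next: the caller can surface the error and skip storage.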
Agent Manager UI — Models Tab
agentManagerPart.ts owns the Models tab UI. It has two modes toggled by _modelsViewMode: 'simple' | 'advanced'.
Simple mode (_renderSimpleModelsView)
- Hero section with gradient + Ollama status pill
- Categorized model grid rendered by _createModelSection()
- Each card built by _createCuratedModelCard(): org badge, params badge, star count, progress bar overlay during install
- Install flow calls IModelManagementService pull API; progress updates via onPullProgress event
Advanced mode (renderModelsMarketplace)
- Left sidebar: search input (debounced 300ms), provider chips, category chips
- Right pane: model list with detail panel
- Cloud deploy wizard (full-screen overlay) triggered from model detail
Cloud deploy wizard
The wizard is a self-contained overlay that tracks its own state machine:
idle → configuring → deploying → (running | error)
During deploying, a scrollable timeline log renders each provisioning step as it comes in, with a live elapsed-time counter and an Abort button.
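The wizard's state machine is small enough to express as a transition table. The sketch below is one possible encoding, assuming only the forward transitions shown in the diagram; WIZARD_TRANSITIONS, canTransition, and transition are hypothetical names, and any reset or retry path the real wizard has is not modeled.

```typescript
type WizardState = 'idle' | 'configuring' | 'deploying' | 'running' | 'error';

// Allowed next states for each state, taken from the diagram above.
const WIZARD_TRANSITIONS: Record<WizardState, readonly WizardState[]> = {
  idle: ['configuring'],
  configuring: ['deploying'],
  deploying: ['running', 'error'],
  running: [],
  error: [],
};

function canTransition(from: WizardState, to: WizardState): boolean {
  return WIZARD_TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the new state, or throws on a transition the diagram does not allow.
function transition(from: WizardState, to: WizardState): WizardState {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid wizard transition: ${from} -> ${to}`);
  }
  return to;
}
```

Centralizing the table keeps UI handlers from jumping, say, from idle straight to running.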
Agent Manager UI — Deployments Tab
_renderDeploymentsView() in agentManagerPart.ts:
- Subscribes to deploymentRegistryService.onDidChange on mount; unsubscribes on tab switch
- Renders two sections: Local Providers and Cloud Deployments
- Each row built by _createDeploymentRow() with status badge, endpoint, model list, and action buttons
- Action buttons use _createSmallAction() helper (styled inline <button>)
- Auto-config status section at the bottom shows applied rules from deploymentAutoConfigService.getAppliedRules()
Adding a New Local Provider
- Add an entry to LOCAL_PROVIDERS in deploymentRegistryService.ts:
  {
    provider: 'myProvider',
    displayName: 'My Provider',
    defaultEndpoint: 'http://localhost:9000',
    healthPath: '/health',
    modelsPath: '/v1/models',
  }
- If the models endpoint returns a non-standard shape, extend _fetchLocalModels() with a new branch keyed on config.provider.
- Add 'myProvider' to ProviderName in voidSettingsTypes.ts if it isn't already there.
- Add the default endpoint to the defaults map in autoConfigService._isProviderConfigured() and _revertCloudConfig().
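For the non-standard models shape mentioned in the steps above, the extra branch might look like the following. This is a hypothetical sketch: parseModelNames is not a function in the codebase, the { available: [...] } shape for 'myProvider' is invented for illustration, and the default branch assumes an OpenAI-style { data: [{ id }] } listing.

```typescript
// Hypothetical helper: turn a models-endpoint response body into model names.
function parseModelNames(provider: string, body: unknown): string[] {
  if (typeof body !== 'object' || body === null) {
    return [];
  }
  const record = body as Record<string, unknown>;

  // Provider-specific branch. Assumed shape: { available: ['name', ...] }.
  if (provider === 'myProvider') {
    const available = record['available'];
    return Array.isArray(available)
      ? available.filter((m): m is string => typeof m === 'string')
      : [];
  }

  // Default branch. Assumed shape: { data: [{ id: 'name' }, ...] }.
  const data = record['data'];
  if (!Array.isArray(data)) {
    return [];
  }
  const names: string[] = [];
  for (const item of data) {
    if (typeof item === 'object' && item !== null && typeof item.id === 'string') {
      names.push(item.id);
    }
  }
  return names;
}
```

Malformed or unexpected bodies yield an empty list rather than an exception, so a misbehaving provider still shows up as running with no models.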
Adding a New Cloud Provider
- Add the provider type to the ICloudDeployment.provider union in cloudTypes.ts.
- Implement a deploy method in cloudDeploymentService.ts following the _deployAzure() / _deployAWS() pattern; follow the existing shell-escape and IMDSv2 patterns exactly.
- Add credential validation to cloudCredentialService.ts following the _validateAWS / _validateAzure pattern.
- Add a new credential form to the wizard UI in agentManagerPart.ts.
Testing
There are no unit tests for this module yet. Manual testing checklist:
- Ollama running → Deployments tab shows green pill, models listed
- Ollama stopped → status transitions to unreachable within 30s
- Install a model from Simple mode → progress bar renders, model appears in Ollama after download
- Cloud deploy → wizard shows timeline, health check count increments, transitions to running
- Cloud deploy → Abort mid-provisioning → status transitions to stopped
- Auto-config with unconfigured provider → notification shown, endpoint set
- Auto-config with already-configured provider → no notification, no change
- "Don't auto-configure" → dismissed state persists across IDE restart
- Invalid AWS key format → validation error shown before storage