roadmap: Jan has revamped Remote Engines (e.g. OpenAI, Anthropic etc) #3786

dan-menlo · 2024-10-13T14:02:33Z

Goal

Note: This Epic has changed multiple times, as our architecture has also changed
A lot of the early comments are referring to a different context
e.g. "Provider Abstraction" in Jan

• Cortex is now an API Platform and needs to route route `/chat/completion` requests to Remote APIs
	- This is intended to allow us to support Groq, Martian, OpenRouter etc
• Remote API Extensions will need to support
	- Getting Remote API's model list
	- Enabling certain default models (e.g. we may not want to show every nightly model in Remote API's Model List)
	- Remote APIs may have specific model.yaml templates (e.g. context length)
	- Routing of `/chat/completion`
	- Extension should cover both UI layer, as well as "Backend" (we may need to modify Cortex to accept a Remote param)
	- Handling API Key Management
• We may need an incremental path to Remote API Extensions
	- Cortex.cpp does not support Extensions for now
	- We may need to have Remote API Extensions define a specific payload, that Cortex `/chat/completions` then routes conditionally

Tasklist

Jan

UI implementation feat: Implement UI for Remote Engine Management #4251
- UI update for Adding Remote Engines - pending @imtuyethan
- planning: Jan's path to cortex.cpp? #3690
- Planning: Migration path for Jan's current Remote Extensions, and API Keys @louis-jan
- Planning: Remote Engines registry @louis-jan @urmauur
  - Is there a github that we can have, with transformReq, transformResp
  - Similar to model catalog, but for Engines: https://github.com/janhq/engines
    • [x] Support for Prompt Caching:
  - idea: Add Claude Prompt Caching Support to Jan #3715
  - Settings UI

Backend

Remote APIs to Support

Popular

OpenAI: feat: Provider Extension - OpenAI #3370
OpenRouter:
• feat: Use OpenRouter Default Model instead of OpenRouter/auto #3110
• feat: Provider Extension - OpenRouter #3452
• feat: Add specific model selection for OpenRouter provider #3582
• bug: cannot seem to add more Openrouter remote models #2752
Anthropic (Claude):
- idea: Add Claude Prompt Caching Support to Jan #3715
- bug: cannot seem to add more Openrouter remote models #2752
Mistral

Deprioritized

The text was updated successfully, but these errors were encountered:

dan-menlo · 2024-10-14T06:34:37Z

Goal: Clear Eng Spec for Providers

Scope

"Provider" Scope
- Remote (Groq, NIM, etc)
- Local = Hardware + APIs + Model Management (Ollama, Cortex)
- It is possible that we don't need a differentiation between Remote and Local
- Choosing a better name, vs. "OAIEngine")
Provider Interface + Abstraction
- Providers registers certain things (e.g. UI, Models), can be called by other extensions
- Registers Settings Page
- Registers Models Category + List
Each Provider Extension should be a separate repo?
- I would like this -> add others to help maintain

Jan Providers

Local Provider

Currently, the local extension still has to manage processes itself, which involves utilizing third-party frameworks such as Node.js (child_process) for building functions.

What if we build Jan on mobile we have to cover extensions as well. It would be better to move these parts to Core module and frontend will just need to use it’s API.

Local Provider will need to execute a command to run its program. Therefore, the command and arguments will be defined, while the rest will be delegated to the super class.

Lifecycle:

A Local Provider is intended to run engines as an API Server (potentially using HTTP, socket, or gRPC).
Local Provider executes a command through CoreAPI (reducing the main process implementation from extensions, easy to port to other platforms such as mobile)
Main Process core module will run a watchdog and maintain the process
Since then, the app has been able to make requests and proxy through the Local Provider extension.
App terminates -> watchdog terminates the process.

Examples

class CortexProvider extends LocalProvider {
  async onLoad() {
     // The Run is implemented from the core module
     // then the spawn process will be maintained by the watchdog
     this.run("cortex", [ "start", "--port", "39291" ], { cwd: "./", env: { } })
  } 

  async loadModel() {
    // Can be a http request, socket or grpc
    this.post("/v1/model/start", { mode: "llama3.2" })
  }
}

https://drive.google.com/file/d/1lITgfqviqA5b0-etSGtU5wI8BS7_TXza/view?usp=sharing

Remove Provider

The same as discussions: Remote API Extension #3505
Remote extensions should work with autopopulating models, e.g. /models list.
We could not build hundreds model.json files manually.
The current extension framework is actually designed to handle this, it's just an implementation issue from extensions, which can be improved.
There was a hacky UI implementation where we pre-populated models, then disabled all of them until the API key was set. That should be a part of the extension, not the Jan app.

Extension builder still ships default available models.

 // Before
 override async onLoad(): Promise<void> {
   super.onLoad()
   // Register Settings (API Key, Endpoints)
   this.registerSettings(SETTINGS)
 	
   // Pre-populate models - persist model.json files
   // MODELS are model.json files that come with the extension.
   this.registerModels(MODELS)
 }

 // After
 override async onLoad(): Promise<void> {
   super.onLoad()
   // Register Settings (API Key, Endpoints)
   this.registerSettings(SETTINGS)
 	
   // Fetch models from provider models endpoint - just a simple fetch
   // Default to `/models`
   get('/models')
     .then((models) => {
         // Model builder will construct model template (aka preset)
 	// This operation builds Model DTOs that works with the app.
 	this.registerModels(this.modelBuilder.build(models))
     })
 }

Remote Provider Extension

Draw.io

https://drive.google.com/file/d/1pl9WjCzKl519keva85aHqUhx2u0onVf4/view?usp=sharing

Supported parameters?

Each provider works with different parameters, but they all share the same basic function with the current ones defined.
We've already supported transformPayload and transformResponse to adapt to these cases.

So users still see parameters consistent from model to model, but the magic happens behind the scenes, where the transformations are simplified under the hood.

/**
* transformPayload Example
* Tranform the payload before sending it to the inference endpoint.
* The new preview models such as o1-mini and o1-preview replaced max_tokens by max_completion_tokens parameter.
* Others do not.
*/
transformPayload = (payload: OpenAIPayloadType): OpenAIPayloadType => {
  // Transform the payload for preview models
  if (this.previewModels.includes(payload.model)) {
    const { max_tokens, ...params } = payload
    return { ...params, max_completion_tokens: max_tokens }
  }
  // Pass through for officialw models
  return payload
}

Decoration?
- We've currently hard-coded many provider metadata from Jan, which could cause issues with future installed extensions.
- The decoration should be done from the Extension Manifest (package.json).
- https://code.visualstudio.com/api/references/extension-manifest
```
{
  "name": "openai-extension",
  "displayName": "OpenAI Extension Provider",
  "icon": "https://openai.com/logo.png"
}
```
Just remove the hacky parts

Model Dropdown: It checks if the engine is nitro or others, filtering for local versus cloud sections. New local engines will be treated as remote engines (e.g. cortex.cpp). -> Filter by Extension type (class name or type, e.g. LocalOAIEngine vs RemoteOAIEngine).
All models from the cloud provider are disabled by default if no API key is set. What if I use a self-hosted endpoint without API key restrictions? Models available or not should be determined from the extensions, when there are no credentials to meet the requirements, it will result in an empty section, indicating no available models. When users input the API-Key from extension settings page, it will fetch model list automatically and cache. Users can also refresh the models list from there (should not fetch so many times, we are building a local-first application)
Application settings can be a bit confusing, with Model Providers and Core Extensions listed separately. Where do other extensions fit in?

Extension settings do not have a community or "others" section

Provider Interface and abstraction

Providers are scoped at engine operations, such as running engines, loading models...
- registerModels(models)
- run(commands, arguments, options)
- loadModel(model)
- unloadModel(model)
Core functions can be extended through extensions, not confined by providers, such as Hardware and UI.
- systemStatus()
- registerSettings()
- registerRibbon()
- registerView()

Registered models will be stored in an in-memory store, accessible from other extensions(ModelManager.instance().models). The same as settings. App and extensions can perform chat/completions requests with just model name, which means the registered model should be unique across extensions.

The core module also exposes extensive APIs, such as systemStatus so other extensions can access, there should be just one implementation of the logic supplied by extensions. Otherwise, it will merely be utilized within the extension, first come, first served.

The UI of the model should be aligned with the model object, minimize decorations (e.g. model icon), and avoid introducing various types of model DTOs.

Each Provider Extension should be a separate repo?

Extensions installation is a straightforward process that requires minimal effort.

There is no official way to install extensions from a GitHub repository URL. Users typically don't know how to package and install software from sources.
There should be a shortcut from the settings page that allows users to input the URL, pop up the extension repository details, and then install from there.
It would be helpful to provide a list of community extensions, allowing users to easily find the right extension for their specific use case without having to search.

dan-menlo · 2024-10-15T16:09:57Z

@louis-jan We can start working on this refactor, and make adjustments on the edges. Thank you for the clear spec!

dan-menlo added this to Menlo Oct 13, 2024

dan-menlo converted this from a draft issue Oct 13, 2024

freelerobot mentioned this issue Oct 14, 2024

planning: Jan and Cortex's Extension Framework #3773

Open

dan-menlo changed the title ~~architecture: Local Provider Extension~~ architecture: Provider Abstraction Oct 14, 2024

dan-menlo assigned louis-menlo Oct 14, 2024

freelerobot added category: local engines category: providers Local & remote inference providers labels Oct 14, 2024

freelerobot pinned this issue Oct 14, 2024

freelerobot moved this from Investigating to Planning in Menlo Oct 15, 2024

louis-menlo mentioned this issue Oct 17, 2024

feat: Jan Integrates Cortex.cpp as Provider #3821

Merged

14 tasks

dan-menlo changed the title ~~architecture: Provider Abstraction~~ discussion: Provider Abstraction Oct 17, 2024

freelerobot added the type: epic A major feature or initiative label Oct 17, 2024

freelerobot mentioned this issue Oct 17, 2024

epic: Provider Refactor + Extensions #3824

Closed

14 tasks

freelerobot removed the category: local providers label Oct 17, 2024

freelerobot changed the title ~~discussion: Provider Abstraction~~ planning: Provider Abstraction Oct 17, 2024

freelerobot added type: planning Discussions, specs and decisions stage and removed type: epic A major feature or initiative labels Oct 17, 2024

imtuyethan mentioned this issue Oct 18, 2024

chore: Structure Icebox in Github Projects #3840

Closed

dan-menlo mentioned this issue Oct 22, 2024

idea: Request for more cloud providers (Google AI Studio, Together API, AI Horde) #3859

Closed

dan-menlo changed the title ~~planning: Provider Abstraction~~ planning: Remote API Extensions for Jan & Cortex Oct 29, 2024

imtuyethan modified the milestones: v0.5.14, v0.5.15 Jan 21, 2025

This was referenced Jan 23, 2025

roadmap: Allow Users To Add Remote Models #4515

Open

idea: Add Ollama to model provider settings #4514

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

roadmap: Jan has revamped Remote Engines (e.g. OpenAI, Anthropic etc) #3786

roadmap: Jan has revamped Remote Engines (e.g. OpenAI, Anthropic etc) #3786

dan-menlo commented Oct 13, 2024 •

edited by louis-menlo

Loading

dan-menlo commented Oct 14, 2024 •

edited

Loading

louis-menlo commented Oct 14, 2024 •

edited

Loading

dan-menlo commented Oct 15, 2024

roadmap: Jan has revamped Remote Engines (e.g. OpenAI, Anthropic etc) #3786

roadmap: Jan has revamped Remote Engines (e.g. OpenAI, Anthropic etc) #3786

Comments

dan-menlo commented Oct 13, 2024 • edited by louis-menlo Loading

dan-menlo commented Oct 14, 2024 • edited Loading

Scope

Related

louis-menlo commented Oct 14, 2024 • edited Loading

Jan Providers

Local Provider

Remove Provider

Provider Interface and abstraction

Each Provider Extension should be a separate repo?

dan-menlo commented Oct 15, 2024

dan-menlo commented Oct 13, 2024 •

edited by louis-menlo

Loading

dan-menlo commented Oct 14, 2024 •

edited

Loading

louis-menlo commented Oct 14, 2024 •

edited

Loading