Epic: Jan 0.6.0 that uses Cortex Platform #3365

Van-QA · 2024-08-14T03:42:54Z

louis-jan · 2024-08-28T03:13:43Z

@marknguyen1302 to attach the specs here

dan-homebrew · 2024-08-30T07:28:09Z

@louis-jan @marknguyen1302 I'm adding #3325 to this Tasklist

we can discuss later if we don't need a migration wizard (if we will continue supporting legacy filesystem)

louis-jan · 2024-08-30T08:20:58Z

@dan-homebrew @marknguyen1302 I'm listing some points to consider / concerns here so we can address them together

- Model prepopulation: one important point to consider when using HF/Provider models is fetching (#3373, #3374); see feat: Remote APIs can fetch Model List #3374 (comment)
- Cortex process handling: for now, Cortex serves the chat function and also serves the server API. There are possible side effects, such as:
  - Load model fails -> Cortex process crashes -> server stops working
  - Run Cortex on app load so the API server can always be on (previously it started on message send)
  - Change thread settings (context length / ngl) / Stop / Delete a running model -> Unload instead of just killing the process as before
  - New UX on state update? E.g. Cortex stopped working -> Restart by process watchdog -> Running
- Need a better UX for running multiple models simultaneously
  - Loading too many models could break the user's machine
  - Unloading a model could affect a running server (or not, since sending a message could load the model again)
- Need a better UX for engine setup (the UI update should also work with our extension architecture, without directing users through many screens)
  - Llama.cpp is bundled by default, but other engines are not; CUDA dependencies should also somehow be shared among engines
- Models can be run through the API Server
  - Legacy models are routed to Nitro (even if they are eventually routed to cortex-cpp), new models are routed to Cortex. Is it good UX to force the Cortex API server to serve legacy Nitro models?
- Pre-populate HF cortex models: there would be duplicated models where the 0.5.3 models list is updated. New branch / model repo to serve Jan hub only? That way we can filter which models get pre-populated instead of pre-populating everything.
- Preserve thread settings: a confusing feature? Since we will work on Presets soon, let's take it out?
- Sync settings (& DTOs?) across projects (https://discord.com/channels/1107178041848909847/1239846009258119178/1279001259613093910)
- Cortex release process, e.g. we are updating the cortex repository's CI/CD for interim CLI development
- Cortex provides a backward-compatible FS module, so does this mean all requests from Jan go through the Cortex API? Going back to the previous integration, there are many edge cases to handle, such as:
  - The Cortex server should be ON before showing threads/models. Data caching can improve the experience, as users cannot wait for loading every time they reopen the app.
  - Cortex process status, as described above ^
  - Model import should also go through the Cortex server
  - Deprecate core FS from Jan
  - Advanced Settings from Jan -> configure Cortex?
  - HTTP Proxy for model downloading?
  - HuggingFace API Token input from Jan -> Cortex
  - Log settings from Jan -> Cortex (on/off, cleaning interval, ...); how to log from the Jan app -> Cortex?
  - Tools support (actually just PDF retrieval for now): depends on Jan core FS; if we don't deprecate it, that is fine
- IMPORTANT: Multiple instruction set support - multiple engine binaries distribution
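To make the "Cortex stopped working -> Restart by process watchdog -> Running" flow from the process-handling point concrete, here is a rough sketch of the state transitions the UI could render. Everything in it is an assumption for discussion: the state names, the event names, and the `MAX_RESTARTS` limit are hypothetical and not part of Jan or Cortex.

```typescript
// Hypothetical process states the Jan UI could display for Cortex.
type CortexState = "stopped" | "starting" | "running" | "restarting" | "failed";

// Hypothetical events reported by a process watchdog.
type CortexEvent = "start" | "healthy" | "crashed";

interface Watchdog {
  state: CortexState;
  restarts: number; // consecutive restarts since the last healthy report
}

// Assumed limit before the watchdog gives up and surfaces an error.
const MAX_RESTARTS = 3;

// Pure transition function: the UI only has to render the returned state.
function step(w: Watchdog, event: CortexEvent): Watchdog {
  switch (event) {
    case "start":
      // Only a stopped or failed process can be started again.
      return w.state === "stopped" || w.state === "failed"
        ? { state: "starting", restarts: 0 }
        : w;
    case "healthy":
      // A successful health report ends a start or restart attempt.
      return w.state === "starting" || w.state === "restarting"
        ? { state: "running", restarts: 0 }
        : w;
    case "crashed":
      // Crashes of an already stopped or failed process change nothing.
      if (w.state === "stopped" || w.state === "failed") return w;
      // Otherwise restart until the limit is reached, then fail.
      return w.restarts < MAX_RESTARTS
        ? { state: "restarting", restarts: w.restarts + 1 }
        : { state: "failed", restarts: w.restarts };
  }
}
```

With a pure function like this, both the thread view and the server toggle can derive their status from the same value, and a crash loop ends in a visible `failed` state instead of restarting forever.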

marknguyen1302 · 2024-08-30T09:32:50Z

Thanks @louis-jan for this. About "Cortex provides a backward-compatible FS module": I think it's for the Jan API server only, so we keep the current app the same. Instead of spawning a new port like the current Jan does, we will use the Cortex Platform API for that server. What do you think?

marknguyen1302 · 2024-08-30T09:37:16Z

About Cortex process handling: I think that when we load a model, run inference, or start the Jan API server, we should check the health of the Cortex server and attempt to spawn it if the server is not on.
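A minimal sketch of that "check health, spawn if not on" step, for discussion only. The `CortexControl` interface and the `ensureCortexRunning` name are hypothetical; how the health probe and the spawn are actually implemented (HTTP call, child process, etc.) is left to the caller, and the retry count and delay are assumed defaults.

```typescript
// Hypothetical abstraction over the Cortex process; not an existing Jan or Cortex API.
interface CortexControl {
  isHealthy(): Promise<boolean>; // e.g. a health probe against the Cortex server
  spawn(): Promise<void>;        // e.g. starting the Cortex binary
}

// Returns true once Cortex reports healthy, spawning it first if needed.
// Returns false if it is still unhealthy after `maxChecks` probes.
async function ensureCortexRunning(
  control: CortexControl,
  maxChecks = 5,   // assumed default
  delayMs = 200,   // assumed default
): Promise<boolean> {
  // Fast path: the server is already up, so nothing is spawned.
  if (await control.isHealthy()) return true;

  await control.spawn();

  // Poll until the freshly spawned process answers the health probe.
  for (let i = 0; i < maxChecks; i++) {
    if (await control.isHealthy()) return true;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

Load, inference, and "start API server" could all call this first and surface an error to the user when it returns `false`.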

louis-jan · 2024-08-30T09:59:34Z

Yes @marknguyen1302, scoping down the other parts would help.
For now:

I think just syncing the state between threads and the API server could be a good option: when the Cortex process is terminated, update the state in both places, and the user can either send a new message to try to load the model again or turn the server back on.
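Roughly what I mean by syncing the state in both places, as a sketch under assumptions: a single store that both the thread view and the API server panel subscribe to. The `RuntimeStateStore` class and its method names are made up for illustration and do not exist in Jan today.

```typescript
// Hypothetical shared state; both views read from the same object.
interface RuntimeState {
  loadedModel: string | null; // model shown as active in the thread view
  serverRunning: boolean;     // local API server status
}

type Listener = (state: RuntimeState) => void;

class RuntimeStateStore {
  private state: RuntimeState = { loadedModel: null, serverRunning: false };
  private listeners = new Set<Listener>();

  // Registers a view and immediately pushes the current state to it.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    listener(this.state);
    return () => { this.listeners.delete(listener); };
  }

  private update(patch: Partial<RuntimeState>): void {
    this.state = { ...this.state, ...patch };
    this.listeners.forEach((listener) => listener(this.state));
  }

  modelLoaded(modelId: string): void {
    this.update({ loadedModel: modelId });
  }

  serverStarted(): void {
    this.update({ serverRunning: true });
  }

  // Called when the Cortex process exits: both views are reset in one update.
  cortexTerminated(): void {
    this.update({ loadedModel: null, serverRunning: false });
  }
}
```

After `cortexTerminated()`, sending a new message would go through `modelLoaded` again and turning the server on would go through `serverStarted`, so neither view keeps showing stale status.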

But when we migrate all requests to Cortex endpoints, there will be more to address.

dan-homebrew · 2024-09-04T04:28:16Z

@louis-jan Thanks for your points here - I'll likely need to break them up into separate discussions - it covers a lot of issues
#3365 (comment)

0xSage · 2024-09-09T13:41:43Z

to be rescoped into #3599

dan-homebrew · 2024-09-10T07:50:22Z

Discontinuing this due to Cortex Platform deprecation

Van-QA added the "type: epic" label Aug 14, 2024

Van-QA modified the milestones: v0.6.1, v.0.6.0 Aug 14, 2024

Van-QA changed the title ~~epic: Jan 0.5.2 with Cortex platform~~ epic: Jan 0.5.2 with Cortex platform extensions Aug 14, 2024

imtuyethan assigned louis-jan and Van-QA Aug 14, 2024

imtuyethan added the "P1: important" label Aug 14, 2024

imtuyethan assigned imtuyethan and marknguyen1302 and unassigned imtuyethan Aug 14, 2024

Van-QA mentioned this issue Aug 23, 2024

epic: Jan - Cortex API #3142

Closed

3 tasks

Van-QA pinned this issue Aug 23, 2024

Van-QA modified the milestones: v.0.6.0, v.0.6.1 Aug 26, 2024

louis-jan changed the title ~~epic: Jan 0.5.2 with Cortex platform extensions~~ epic: Jan 0.6.0 with Cortex platform extensions Aug 27, 2024

imtuyethan mentioned this issue Aug 30, 2024

epic: V0.6.0 Migration Wizard #3325

Closed

dan-homebrew changed the title ~~epic: Jan 0.6.0 with Cortex platform extensions~~ epic: Jan 0.6.0 that uses Cortex Platform Sep 3, 2024

louis-jan changed the title ~~epic: Jan 0.6.0 that uses Cortex Platform~~ Epic: Jan 0.6.0 that uses Cortex Platform Sep 5, 2024

imtuyethan unassigned marknguyen1302 Sep 9, 2024

dan-homebrew unassigned Van-QA Sep 9, 2024

dan-homebrew closed this as completed Sep 10, 2024
