
V2 update plan #2

Open
PicoCreator opened this issue Jun 15, 2023 · 9 comments

Comments

@PicoCreator
Collaborator

PicoCreator commented Jun 15, 2023

The latest version of https://github.com/saharNooby/rwkv.cpp has a new quantization format (breaking change?) and GPU offload (!!!)
Since these are potentially breaking changes, it's going to be a v2 update.

  • update to the newer version, which has a breaking change in the model format (might be backwards compatible)
  • (to confirm, if not backwards compatible) create a new set of ".bin" files for the new version
  • make changes to the API to add support for GPU offload (the new version takes a parameter for how many layers you want to offload to the GPU)
  • for input inference, update to the new batch-mode API (10x faster)
  • (stretch goal) change to an async API
  • support for the world model / world tokenizer (we can detect this using the token count)
@PicoCreator
Collaborator Author

<= 67.2 ms on 7B with partial GPU offload (3060 Ti 8G) is a huge win

@cahya-wirawan

Hi,
I converted a pth file to bin format, but unfortunately it crashed with the following error message:

% rwkv-cpp-node --modelPath ./rwkv-7b-369-Q5_1.bin 
--------------------------------------
Starting RWKV chat mode
--------------------------------------
Loading model from ./rwkv-7b-369-Q5_1.bin ...
Unsupported file version 101
/Users/eugene/Desktop/RWKV/rwkv.cpp/rwkv.cpp:211: version == RWKV_FILE_VERSION
zsh: segmentation fault  rwkv-cpp-node --modelPath ./rwkv-7b-369-Q5_1.bin

Is it the compatibility issue mentioned here?
Thanks (Looking forward to the world version :-) )
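The `Unsupported file version 101` assert comes from rwkv.cpp's header check: the converter wrote the newer file version while the bundled rwkv.cpp still expected the older one. A quick way to inspect a `.bin` before loading, assuming rwkv.cpp's layout of a 4-byte magic followed by a 4-byte little-endian version at the start of the file (an assumption worth verifying against the rwkv.cpp source):

```javascript
// Parse the rwkv.cpp header fields from the first 8 bytes of a model file.
// Layout assumed here: uint32 magic, then uint32 version, both little-endian.
function parseRwkvHeader(buf) {
  if (buf.length < 8) throw new Error("file too short to contain a header");
  return {
    magic: buf.readUInt32LE(0),
    version: buf.readUInt32LE(4),
  };
}

// Usage against a real file:
//   const fs = require("fs");
//   const fd = fs.openSync("./rwkv-7b-369-Q5_1.bin", "r");
//   const head = Buffer.alloc(8);
//   fs.readSync(fd, head, 0, 8, 0);
//   console.log(parseRwkvHeader(head));
```

If the reported version does not match what the installed binding was built against, the model needs to be reconverted (or the binding updated) rather than debugged further.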

@PicoCreator
Collaborator Author

Yup, the new rwkv.cpp is now merged into v2 (publishing now)

@PicoCreator
Collaborator Author

PicoCreator commented Jun 28, 2023

Merged in 74655de

This resolves all issues, EXCEPT support for the world model tokenizer (it needs a new JS tokenizer)
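The original plan mentions detecting world models by token count. A hedged sketch of that check — the vocabulary sizes used here (50277 for the Pile/GPT-NeoX tokenizer used by Raven models, 65536 for the RWKV world tokenizer) are the commonly cited values and should be verified against the actual tokenizer files:

```javascript
// Guess which tokenizer a model needs from its embedding vocabulary size.
// Assumed vocab sizes: 50277 = GPT-NeoX/Pile tokenizer (Raven models),
// 65536 = RWKV world tokenizer. Anything else is flagged as unknown.
function detectTokenizer(vocabSize) {
  if (vocabSize === 50277) return "pile";
  if (vocabSize === 65536) return "world";
  return "unknown";
}
```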

@cgisky1980

> Yup, the new rwkv.cpp is now merged into v2 (publishing now)

update the docs please

@cgisky1980

> Merged in 74655de
>
> This resolves all issues, EXCEPT support for world model tokenizer (needs a new JS tokenizer)

waiting for it ~~

@cahya-wirawan

Hi, I reinstalled the package, but when I run it, I get only the following result:

% rwkv-cpp-node --modelPath ./rwkv-7b-369-Q5_1.bin
--------------------------------------
Starting RWKV chat mode
--------------------------------------
Loading model with {"path":"./rwkv-7b-369-Q5_1.bin","threads":4,"gpuOffload":0} ...
The following is a conversation between the User and the Bot ...
--------------------------------------
? User:  Hi how are you
Bot: <|endoftext|><|endoftext|><|endoftext|><|endoftext|>… (repeated)

Did I convert the model wrong?

@cgisky1980

> Hi, I reinstalled the package, but when I run it, I get only the following result: […]
>
> Did I convert the model wrong?

World models are not supported yet.

@cahya-wirawan

> World models are not supported yet.

It is just a normal fine-tuned Raven model.
