Sweep(slow): count tokens on the server side #17
Comments
Here's the PR! #20. (tracking ID: 1bfb7247a1)
import { Index, Match, Show, Switch, batch, createEffect, createSignal, onMount } from 'solid-js'
import { Toaster, toast } from 'solid-toast'
import { useThrottleFn } from 'solidjs-use'
import { generateSignature } from '@/utils/auth'
import { fetchModeration, fetchTitle } from '@/utils/misc'
import { audioChunks, getAudioBlob, startRecording, stopRecording } from '@/utils/record'
import { countTokens } from '@/utils/tiktoken'
import { MessagesEvent } from '@/utils/events'
import IconClear from './icons/Clear'
import MessageItem from './MessageItem'
import SystemRoleSettings from './SystemRoleSettings'
import ErrorMessageItem from './ErrorMessageItem'
import TokenCounter, { encoder } from './TokenCounter'
import type { ChatMessage, ErrorMessage } from '@/types'
import type { Setter } from 'solid-js'
export const minMessages = Number(import.meta.env.PUBLIC_MIN_MESSAGES ?? 3)
export const maxTokens = Number(import.meta.env.PUBLIC_MAX_TOKENS ?? 3000)
free-chat/src/pages/api/generate.ts
Lines 1 to 15 in 117c9ef
// #vercel-disable-blocks
import { ProxyAgent, fetch } from 'undici'
// #vercel-end
import { generatePayload, parseOpenAIStream } from '@/utils/openAI'
import { verifySignature } from '@/utils/auth'
import type { APIRoute } from 'astro'
const apiKey = import.meta.env.OPENAI_API_KEY
const httpsProxy = import.meta.env.HTTPS_PROXY
const baseUrl = ((import.meta.env.OPENAI_API_BASE_URL) || 'https://api.openai.com').trim().replace(/\/$/, '')
const sitePassword = import.meta.env.SITE_PASSWORD
const ua = import.meta.env.UNDICI_UA
const FORWARD_HEADERS = ['origin', 'referer', 'cookie', 'user-agent', 'via']
free-chat/src/components/Generator.tsx
Lines 217 to 260 in 117c9ef
const storagePassword = localStorage.getItem('pass')
try {
  const controller = new AbortController()
  setController(controller)
  const requestMessageList = [...messageList()]
  let limit = maxTokens
  const systemMsg = currentSystemRoleSettings()
    ? {
        role: 'system',
        content: currentSystemRoleSettings(),
      } as ChatMessage
    : null
  systemMsg && (limit -= countTokens(encoder()!, [systemMsg])!.total)
  while (requestMessageList.length > minMessages && countTokens(encoder()!, requestMessageList)!.total > limit)
    requestMessageList.shift()
  systemMsg && requestMessageList.unshift(systemMsg)
  const timestamp = Date.now()
  const response = await fetch('/api/generate', {
    method: 'POST',
    body: JSON.stringify({
      model: localStorage.getItem('model') || 'gpt-3.5-turbo-1106',
      messages: requestMessageList,
      time: timestamp,
      pass: storagePassword,
      sign: await generateSignature({
        t: timestamp,
        m: requestMessageList?.[requestMessageList.length - 1]?.content || '',
      }),
    }),
    signal: controller.signal,
    headers: localStorage.getItem('apiKey') ? { authorization: `Bearer ${localStorage.getItem('apiKey')}` } : {},
  })
  if (!response.ok) {
    const error = await response.json()
    console.error(error.error)
    setCurrentError(error.error)
    throw new Error('Request failed')
  }
free-chat/src/utils/tiktoken.ts
Lines 1 to 38 in 117c9ef
import type { ChatMessage } from '@/types'
import type { Tiktoken } from 'tiktoken'
const countTokensSingleMessage = (enc: Tiktoken, message: ChatMessage) => {
  return 4 + enc.encode(message.content).length // im_start, im_end, role/name, "\n"
}
export const countTokens = (enc: Tiktoken | null, messages: ChatMessage[]) => {
  if (messages.length === 0) return
  if (!enc) return { total: Infinity }
  const lastMsg = messages.at(-1)
  const context = messages.slice(0, -1)
  const countTokens: (message: ChatMessage) => number = countTokensSingleMessage.bind(null, enc)
  const countLastMsg = countTokens(lastMsg!)
  const countContext = context.map(countTokens).reduce((a, b) => a + b, 3) // im_start, "assistant", "\n"
  return { countContext, countLastMsg, total: countContext + countLastMsg }
}
const cl100k_base_json = import.meta.env.PUBLIC_CL100K_BASE_JSON_URL || '/cl100k_base.json'
const tiktoken_bg_wasm = import.meta.env.PUBLIC_TIKTOKEN_BG_WASM_URL || '/tiktoken_bg.wasm'
async function getBPE() {
  return fetch(cl100k_base_json).then(r => r.json())
}
export const initTikToken = async() => {
  const { init } = await import('tiktoken/lite/init')
  const [{ bpe_ranks, special_tokens, pat_str }, { Tiktoken }] = await Promise.all([
    getBPE().catch(console.error),
    import('tiktoken/lite/init'),
    fetch(tiktoken_bg_wasm).then(r => r.arrayBuffer()).then(wasm => init(imports => WebAssembly.instantiate(wasm, imports))),
  ])
  return new Tiktoken(bpe_ranks, special_tokens, pat_str)
Step 2: ⌨️ Coding
- Create src/utils/tiktoken-server.ts ✓ 09d7244
Create src/utils/tiktoken-server.ts with contents:
• Create a new utility file named `tiktoken-server.ts` in the `src/utils` directory for the server-side token counting logic.
• Use `tiktoken-js` instead of `tiktoken` as the server-side library.
• Define and export a function `countTokensServer` that implements the same logic as `countTokens` from `src/utils/tiktoken.ts`.
• Ensure the function's interface matches that of the client-side `countTokens`: it takes an encoder and a list of messages as arguments and returns an object with the total token count.
• Wrap any initialization that is not available on the server, such as fetching the base configuration or initializing the WebAssembly module, in a server-compatible form.
- Running GitHub Actions for src/utils/tiktoken-server.ts ✓
Check src/utils/tiktoken-server.ts with contents:
Ran GitHub Actions for 09d72442ebe25ea72693afd406fe601d703d1b27:
• Vercel Preview Comments: ✓
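The plan for `countTokensServer` can be sketched roughly as follows. This is a minimal illustration, not the contents of commit 09d7244: the `TokenEncoder` and `TokenCount` interfaces are hypothetical stand-ins, and the sketch only assumes that whichever server-side tokenizer is used exposes an `encode` method, as the client-side `Tiktoken` encoder does. In the real module the function would be exported and `ChatMessage` would come from `@/types`.

```typescript
// Hypothetical local copy of the message shape; the project defines ChatMessage in '@/types'.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Assumption: the server-side tokenizer offers an `encode` method returning token ids.
interface TokenEncoder {
  encode(text: string): ArrayLike<number>
}

// Hypothetical result shape mirroring what the client-side countTokens returns.
interface TokenCount {
  total: number
  countContext?: number
  countLastMsg?: number
}

const PER_MESSAGE_OVERHEAD = 4 // im_start, im_end, role/name, "\n"
const REPLY_PRIMING = 3 // im_start, "assistant", "\n"

const countTokensServer = (enc: TokenEncoder | null, messages: ChatMessage[]): TokenCount | undefined => {
  if (messages.length === 0) return undefined
  // Without an encoder the count is unknown, so report it as unbounded (same as the client).
  if (!enc) return { total: Infinity }
  const countOne = (m: ChatMessage) => PER_MESSAGE_OVERHEAD + enc.encode(m.content).length
  const countLastMsg = countOne(messages[messages.length - 1]!)
  const countContext = messages.slice(0, -1).reduce((sum, m) => sum + countOne(m), REPLY_PRIMING)
  return { countContext, countLastMsg, total: countContext + countLastMsg }
}
```

Keeping the encoder behind a small interface means the counting logic does not depend on how the tokenizer is loaded, which is the part that differs between browser and server.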
- Modify src/pages/api/generate.ts ✓ 30a5ea4
Modify src/pages/api/generate.ts with contents:
• Import the `countTokensServer` function from `src/utils/tiktoken-server.ts` and use it in the `post` handler of the API route.
• After reading the request body, apply the token-counting logic to trim the `messages` array so that it stays under a defined token limit.
• Use the `minMessages` and `maxTokens` constants defined in `src/components/Generator.tsx` as the minimum message count and the token limit. They may need to move to a shared constants file if they are not already in one.
• Pass the trimmed `messages` on to the rest of the handler, where the generation payload is created.
- Running GitHub Actions for src/pages/api/generate.ts ✓
Check src/pages/api/generate.ts with contents:
Ran GitHub Actions for 30a5ea4d0bdc06e092563c96327c3e11eeb3cff2:
• Vercel Preview Comments: ✓
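The trimming step described for `generate.ts` might look something like the sketch below. It is not the code from commit 30a5ea4: `trimMessages` and `CountTotal` are hypothetical names, the defaults of 3 and 3000 are the fallbacks used for `minMessages` and `maxTokens` in `Generator.tsx`, and it assumes the client sends the system message (if any) as the first element of `messages`, since the client prepends it after trimming.

```typescript
// Hypothetical local copy of the message shape; the project defines ChatMessage in '@/types'.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Hypothetical counter signature: returns the total token count for a list of messages.
type CountTotal = (messages: ChatMessage[]) => number

function trimMessages(
  messages: ChatMessage[],
  countTotal: CountTotal,
  minMessages = 3,
  maxTokens = 3000,
): ChatMessage[] {
  // Assumption: a system message, if present, arrives first and must survive trimming.
  const first = messages[0]
  const systemMsg = first && first.role === 'system' ? first : undefined
  const rest = systemMsg ? messages.slice(1) : [...messages]
  // Reserve budget for the system message, as the client does.
  const limit = systemMsg ? maxTokens - countTotal([systemMsg]) : maxTokens
  // Drop the oldest messages until the rest fits, but never go below minMessages.
  while (rest.length > minMessages && countTotal(rest) > limit) rest.shift()
  return systemMsg ? [systemMsg, ...rest] : rest
}
```

In the handler, the result would replace the incoming `messages` before the generation payload is built. The function copies the array, so the parsed request body is left untouched.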
Step 3: 🔁 Code Review
I have finished reviewing the code for completeness. I did not find errors for sweep/server-side-token-counting_1.
Details
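In src/pages/api/generate.ts, add the same messages-trimming logic as in src/components/Generator.tsx.
Note, however, that the server side cannot use the tiktoken library and can only use the tiktoken-js library; the two should have similar interfaces.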
Checklist
- src/utils/tiktoken-server.ts ✓ 09d7244
- src/utils/tiktoken-server.ts ✓
- src/pages/api/generate.ts ✓ 30a5ea4
- src/pages/api/generate.ts ✓