[feat] support ios benchmark.
wangzhaode committed Jul 12, 2024
1 parent d70533c commit c424d86
Showing 3 changed files with 68 additions and 7 deletions.
26 changes: 20 additions & 6 deletions ios/README.md
@@ -4,13 +4,23 @@

 ## Speed

-Model: Qwen-1.8b-int4
-- iPhone 11 : prefill 52.00 tok/s, decode 16.23 tok/s
-- iPhone 14 Pro: prefill 102.63 tok/s, decode 33.53 tok/s
+[Old test prompt](../resource/prompt.txt)
+- Qwen-1.8b-chat 4bit
+    - iPhone 11 : prefill 52.00 tok/s, decode 16.23 tok/s
+    - iPhone 14 Pro: prefill 102.63 tok/s, decode 33.53 tok/s
+- Qwen-1.8b-chat 8bit
+    - iPhone 11 : prefill 61.90 tok/s, decode 14.75 tok/s
+    - iPhone 14 Pro: prefill 105.41 tok/s, decode 25.45 tok/s

-Model: Qwen-1.8b-int8
-- iPhone 11 : prefill 61.90 tok/s, decode 14.75 tok/s
-- iPhone 14 Pro: prefill 105.41 tok/s, decode 25.45 tok/s
+---
+
+[New test prompt](../resource/bench.txt)
+- Qwen1.5-0.5b-chat 4bit
+    - iPhone 15 Pro: prefill 282.73 tok/s, decode 51.68 tok/s
+- Qwen2-0.5b-instruct 4bit
+    - iPhone 15 Pro: prefill 234.51 tok/s, decode 51.36 tok/s
+- Qwen2-1.5b-instruct 4bit
+    - iPhone 15 Pro: prefill 107.64 tok/s, decode 25.57 tok/s

 ## Build
 1. First, download the model files: [Qwen1.5-0.5B-Chat-MNN](https://modelscope.cn/models/zhaode/Qwen1.5-0.5B-Chat-MNN/files)
@@ -20,6 +30,10 @@

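 Note: to test another model, replace `ios/mnn-llm/model/qwen1.5-0.5b-chat` with that model's folder, and update the model path at `LLMInferenceEngineWrapper.m +38`.

+## Performance
+After the model finishes loading, send `benchmark` to run the benchmark.
+
+## Test
 After the model finishes loading, you can send messages, as shown below:

 ![ios-app](./ios_app.jpg)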
45 changes: 44 additions & 1 deletion ios/mnn-llm/mnn-llm/LLMInferenceEngineWrapper.mm
@@ -51,7 +51,50 @@ - (void)processInput:(NSString *)input withStreamHandler:(StreamOutputHandler)ha
     };
     LlmStreamBuffer streambuf(callback);
     std::ostream os(&streambuf);
-    llm->response([input UTF8String], &os, "<eop>");
+    if (std::string([input UTF8String]) == "benchmark") {
+        // run the benchmark over the prompts in bench.txt
+        std::string model_dir = GetMainBundleDirectory();
+        std::string prompt_file = model_dir + "/bench.txt";
+        std::ifstream prompt_fs(prompt_file);
+        std::vector<std::string> prompts;
+        std::string prompt;
+        while (std::getline(prompt_fs, prompt)) {
+            // lines starting with '#' are treated as comments and skipped
+            if (prompt.substr(0, 1) == "#") {
+                continue;
+            }
+            std::string::size_type pos = 0;
+            while ((pos = prompt.find("\\n", pos)) != std::string::npos) {
+                prompt.replace(pos, 2, "\n");
+                pos += 1;
+            }
+            prompts.push_back(prompt);
+        }
+        int prompt_len = 0;
+        int decode_len = 0;
+        int64_t prefill_time = 0;
+        int64_t decode_time = 0;
+        for (int i = 0; i < prompts.size(); i++) {
+            llm->response(prompts[i], &os, "\n");
+            prompt_len += llm->prompt_len_;
+            decode_len += llm->gen_seq_len_;
+            prefill_time += llm->prefill_us_;
+            decode_time += llm->decode_us_;
+        }
+        float prefill_s = prefill_time / 1e6;
+        float decode_s = decode_time / 1e6;
+        os << "\n#################################\n"
+           << "prompt tokens num = " << prompt_len << "\n"
+           << "decode tokens num = " << decode_len << "\n"
+           << "prefill time = " << std::fixed << std::setprecision(2) << prefill_s << " s\n"
+           << " decode time = " << std::fixed << std::setprecision(2) << decode_s << " s\n"
+           << "prefill speed = " << std::fixed << std::setprecision(2) << prompt_len / prefill_s << " tok/s\n"
+           << " decode speed = " << std::fixed << std::setprecision(2) << decode_len / decode_s << " tok/s\n"
+           << "##################################\n";
+        os << "<eop>";
+    } else {
+        llm->response([input UTF8String], &os, "<eop>");
+    }
 }

 - (void)dealloc {
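
The speed figures reported by the hunk above are total token counts divided by total elapsed seconds, with the timers accumulated in microseconds. A minimal standalone sketch of that arithmetic follows. `BenchTotals`, `tokens_per_second`, and `format_report` are hypothetical names that do not appear in the commit, and the zero-time guard is an addition of this sketch, not something the diff does.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical container for the four counters the diff sums across prompts.
struct BenchTotals {
    int prompt_tokens = 0;        // sum of prompt lengths (prefill tokens)
    int decode_tokens = 0;        // sum of generated tokens
    std::int64_t prefill_us = 0;  // total prefill time in microseconds
    std::int64_t decode_us = 0;   // total decode time in microseconds
};

// Tokens per second. Returns 0 when no time was recorded, an assumption
// made here to avoid dividing by zero if the prompt file is empty.
double tokens_per_second(int tokens, std::int64_t elapsed_us) {
    if (elapsed_us <= 0) {
        return 0.0;
    }
    return tokens / (elapsed_us / 1e6);
}

// Formats the two speed lines with two decimals, as in the diff's report.
std::string format_report(const BenchTotals& t) {
    std::ostringstream os;
    os << std::fixed << std::setprecision(2)
       << "prefill speed = " << tokens_per_second(t.prompt_tokens, t.prefill_us) << " tok/s\n"
       << " decode speed = " << tokens_per_second(t.decode_tokens, t.decode_us) << " tok/s\n";
    return os.str();
}
```

Because prefill and decode are timed separately, a long prompt raises the prefill token count without affecting the decode speed, which is why the README lists the two numbers side by side.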
4 changes: 4 additions & 0 deletions resource/bench.txt
@@ -0,0 +1,4 @@
计算8乘以12
将下面的句子翻译成中文:It's a beautiful day to learn something new.
描述优秀的领导者应具备的五个特质,并解释每个特质为什么重要
近年来,随着技术的快速发展和全球化的深入推进,数字经济已成为推动世界经济增长的新引擎。数字经济不仅改变了人们的生活方式,促进了信息和资源的快速流通,还重塑了传统行业的业务模式和竞争格局。尽管数字经济的发展为全球经济增长提供了新的动能,但同时也带来了数据安全、隐私保护、数字鸿沟和市场垄断等一系列挑战。考虑到这些背景,请详细分析数字经济在促进世界经济增长方面的作用,包括但不限于数字经济对提高生产效率、创造就业机会和促进可持续发展的贡献。同时,探讨如何应对数字经济发展过程中出现的挑战,具体包括如何保护个人数据安全和隐私、缩小数字鸿沟以确保数字经济的包容性和公平性,以及如何制定有效政策以避免市场垄断情况的出现,最终实现数字经济的健康和可持续发展。
