Questions about the Docker deployment documentation for SecretFlow-Serving #123

Open
zhuiguang49 opened this issue Nov 14, 2024 · 20 comments

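@zhuiguang49

zhuiguang49 commented Nov 14, 2024

[Image: 微信图片_20241114123334]

  1. In "Step 1: Deploy SecretFlow-Serving", the documentation has me create a serving workspace myself and then says "here we use the model package under the 'examples' directory as the sample file and place it in the 'serving' directory". Which model package does this refer to? The one in the serving repository?

[Image: 微信图片_20241114123732]

  2. When I create the docker-compose.yaml file following section 1.5, the paths of the three config files in the volumes section do not match the serving workspace path created in Step 1.

I don't fully understand the Docker deployment flow in the "Deploy SecretFlow-Serving" part of the serving documentation. The description feels vague. Could you give me some guidance?

Here is the link to that part of the SecretFlow-Serving documentation:
https://www.secretflow.org.cn/zh-CN/docs/serving/0.7.0b0/topics/deployment/deployment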

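@wangzul

wangzul commented Nov 14, 2024

  1. The "examples" mentioned in the documentation refers to the model package in the serving repository. This file already exists in the image by default, so you can skip this step.
    If you skip it, you need to modify the following fields, using https://github.com/secretflow/serving/blob/main/examples/alice/serving.config as a reference:
    "sourcePath"
    "sourceSha256"

  2. There are three steps that create config files before docker-compose.yaml is created. Assuming the first step, cp serving/examples/alice/glm-test.tar.gz ., was not executed,
    the directory structure at this point should be:
    └── serving
        ├── serving.config
        ├── logging.config
        ├── trace.config
        └── docker-compose.yaml

As for the file paths you mentioned:

  1. The current directory is serving.
  2. docker-compose.yaml is at the same level as the files below, so ./xxx.config can be used as the mount path:
    • ./serving.config:/root/sf_serving/conf/serving.config
    • ./logging.config:/root/sf_serving/conf/logging.config
    • ./trace.config:/root/sf_serving/conf/trace.config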
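
The mount entries above use Compose's short volume syntax, host_path:container_path, where a relative host path such as ./serving.config is resolved against the directory that contains docker-compose.yaml. A minimal Python sketch of that resolution follows. The helper name resolve_bind_mount and the workspace path /home/zhangsan/serving are illustrative assumptions, not part of Docker or SecretFlow-Serving.

```python
from pathlib import PurePosixPath


def resolve_bind_mount(entry: str, compose_dir: str) -> tuple[str, str]:
    """Split a short-syntax volume entry into (absolute host path, container path).

    Hypothetical helper for illustration; an optional third ":mode" field
    such as ":ro" is ignored.
    """
    host, container = entry.split(":")[:2]
    host_path = PurePosixPath(host)
    if not host_path.is_absolute():
        # Relative host paths are taken relative to the compose file's directory.
        host_path = PurePosixPath(compose_dir) / host_path
    return str(host_path), container


mounts = [
    "./serving.config:/root/sf_serving/conf/serving.config",
    "./logging.config:/root/sf_serving/conf/logging.config",
    "./trace.config:/root/sf_serving/conf/trace.config",
]
for entry in mounts:
    # The workspace directory here is an assumed example location.
    print(resolve_bind_mount(entry, "/home/zhangsan/serving"))
```

Only paths listed this way are visible inside the container; a file that merely sits next to docker-compose.yaml on the host is not.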

@zhuiguang49
Author

zhuiguang49 commented Nov 14, 2024

@wangzul
After replacing ALICE PORT with an actual available port, I started the serving service, but the container keeps restarting. The log shows the following error:

2024-11-14 16:52:18.546 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-14 16:52:18.547 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-14 16:52:18.547 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-14 16:52:18.547 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-14 16:52:18.547 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-14 16:52:18.547 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-14 16:52:18.547 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-14 16:52:18.553 [error] [main.cc:main:144] server startup failed, code: 5, msg: [Enforce fail at secretflow_serving/source/filesystem_source.cc:31] std::filesystem::exists(config_.source_path()). source_path ./glm-test.tar.gz in model_conf  does not exist, stack: #0 secretflow::serving::Source::PullModel[abi:cxx11]()+0x55b0874339ba
#1 secretflow::serving::Server::Start()+0x55b085d39990
#2 main+0x55b085d31c75
#3 __libc_start_main+0x7fc0c863acf3

[The same startup log and error repeat for six further restarts, from 16:52:19 to 16:52:28; only the timestamps and stack addresses differ.]

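It seems the ./glm-test.tar.gz model file is missing.

However, after I copied glm-test.tar.gz into the current directory, i.e. /home/zhangsan/serving/glm-test.tar.gz, and started the serving service again, the log is as follows: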

2024-11-14 16:57:21.559 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-14 16:57:21.559 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-14 16:57:21.559 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-14 16:57:21.559 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-14 16:57:21.560 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-14 16:57:21.560 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-14 16:57:21.560 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-14 16:57:21.566 [error] [main.cc:main:144] server startup failed, code: 5, msg: [Enforce fail at secretflow_serving/source/filesystem_source.cc:31] std::filesystem::exists(config_.source_path()). source_path ./glm-test.tar.gz in model_conf  does not exist, stack: #0 secretflow::serving::Source::PullModel[abi:cxx11]()+0x55ad681389ba
#1 secretflow::serving::Server::Start()+0x55ad66a3e990
#2 main+0x55ad66a36c75
#3 __libc_start_main+0x7ffb92c3acf3

[The same startup log and error repeat for six further restarts, from 16:57:22 to 16:57:31; only the timestamps and stack addresses differ.]

The file still appears to be missing.

My serving folder structure is now as follows:
└── serving
    ├── serving.config
    ├── logging.config
    ├── trace.config
    ├── glm-test.tar.gz
    └── docker-compose.yaml

@zhuiguang49
Author

I understand the file path issue now. I hadn't realized that the host files are being mounted into the container.

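@wangzul

wangzul commented Nov 15, 2024

I understand the file path issue now. I hadn't realized that the host files are being mounted into the container.

Now that the path issue is resolved, what problem remains?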

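@wangzul

wangzul commented Nov 15, 2024

I understand the file path issue now. I hadn't realized that the host files are being mounted into the container.

Please share your docker-compose.yaml file so I can take a look.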

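@zhuiguang49
Author

I understand the file path issue now. I hadn't realized that the host files are being mounted into the container.

Please share your docker-compose.yaml file so I can take a look.

The content of my docker-compose.yaml file is as follows: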

version: "3.8"
services:
  serving:
    cap_add:
      - NET_ADMIN
    command:
      - /root/sf_serving/secretflow_serving
      - --serving_config_file=/root/sf_serving/conf/serving.config
      - --logging_config_file=/root/sf_serving/conf/logging.config
      - --trace_config_file=/root/sf_serving/conf/trace.config
    restart: always
    image: secretflow/serving-anolis8:latest
    ports:
      - 4590:9010
    volumes:
      - ./serving.config:/root/sf_serving/conf/serving.config
      - ./logging.config:/root/sf_serving/conf/logging.config
      - ./trace.config:/root/sf_serving/conf/trace.config

I posted the error from the earlier failed serving startup above. It seems the ./glm-test.tar.gz model file is missing.

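@wangzul

wangzul commented Nov 15, 2024

Try this:

  1. First, modify the config file.
    The container already contains this file by default, so you can change the path to point to it, using this as a reference. Alternatively, mount glm-test.tar.gz to some path inside the container and then change the path accordingly, for example:
    volumes:
    • ./glm-test.tar.gz:/root/glm-test.tar.gz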
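
To make the second option concrete: once the archive is mounted at /root/glm-test.tar.gz, the model path in serving.config has to name that container path rather than ./glm-test.tar.gz. The Python sketch below shows the edit on a trimmed config. It assumes that serving.config is JSON and that "sourcePath" sits under a "modelConf" object, as in the repository's example file; the helper name point_model_at is hypothetical.

```python
import json


def point_model_at(config: dict, container_path: str) -> dict:
    """Return a copy of the config whose model source path is container_path.

    Assumption: the model settings live under a "modelConf" key.
    """
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    updated["modelConf"]["sourcePath"] = container_path
    return updated


# Trimmed, illustrative config; a real serving.config has more fields.
example = {"modelConf": {"sourcePath": "./glm-test.tar.gz"}}
print(json.dumps(point_model_at(example, "/root/glm-test.tar.gz"), indent=2))
```

The value of "sourceSha256" must still match the archive that ends up at that path.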

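@zhuiguang49
Author

Try this:

  1. First, modify the config file.
    The container already contains this file by default, so you can change the path to point to it, using this as a reference. Alternatively, mount glm-test.tar.gz to some path inside the container and then change the path accordingly, for example:
    volumes:
    • ./glm-test.tar.gz:/root/glm-test.tar.gz

I have tried both approaches and the container still keeps restarting. It looks like a hash problem: the SHA256 check fails.

The error is as follows: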

2024-11-15 14:17:40.915 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 14:17:40.916 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 14:17:40.916 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 14:17:40.916 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 14:17:40.916 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 14:17:40.916 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 14:17:40.916 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 14:17:40.919 [info] [filesystem_source.cc:OnPullModel:37] copy model file from /root/glm-test.tar.gz to ./data/test_service_id/glm-test/model_bundle.tar.gz
2024-11-15 14:17:40.919 [warning] [sys_util.cc:CheckSHA256:124] file(./data/test_service_id/glm-test/model_bundle.tar.gz) sha256 check failed, expect:3b6a3b76a8d5bbf0e45b83f2d44772a0a6aa9a15bf382cee22cbdc8f59d55522, get:c6308af488bcd6c54a48a145af17aa209dec463b5cb44d83c6b58195818c10a0
2024-11-15 14:17:40.922 [error] [main.cc:main:144] server startup failed, code: 10, msg: [Enforce fail at secretflow_serving/source/source.cc:58] SysUtil::CheckSHA256(dst_file_path.string(), source_sha256). model(/root/glm-test.tar.gz) sha256 check failed, stack: #0 secretflow::serving::Server::Start()+0x5633661d6990
#1 main+0x5633661cec75
#2 __libc_start_main+0x7f891ba3acf3

2024-11-15 14:17:41.840 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 14:17:41.840 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 14:17:41.840 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 14:17:41.840 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 14:17:41.840 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 14:17:41.840 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 14:17:41.840 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 14:17:41.843 [warning] [sys_util.cc:CheckSHA256:124] file(./data/test_service_id/glm-test/model_bundle.tar.gz) sha256 check failed, expect:3b6a3b76a8d5bbf0e45b83f2d44772a0a6aa9a15bf382cee22cbdc8f59d55522, get:c6308af488bcd6c54a48a145af17aa209dec463b5cb44d83c6b58195818c10a0
2024-11-15 14:17:41.843 [info] [source.cc:PullModel:52] remove tmp model file:./data/test_service_id/glm-test/model_bundle.tar.gz
2024-11-15 14:17:41.843 [info] [filesystem_source.cc:OnPullModel:37] copy model file from /root/glm-test.tar.gz to ./data/test_service_id/glm-test/model_bundle.tar.gz
2024-11-15 14:17:41.843 [warning] [sys_util.cc:CheckSHA256:124] file(./data/test_service_id/glm-test/model_bundle.tar.gz) sha256 check failed, expect:3b6a3b76a8d5bbf0e45b83f2d44772a0a6aa9a15bf382cee22cbdc8f59d55522, get:c6308af488bcd6c54a48a145af17aa209dec463b5cb44d83c6b58195818c10a0
2024-11-15 14:17:41.846 [error] [main.cc:main:144] server startup failed, code: 10, msg: [Enforce fail at secretflow_serving/source/source.cc:58] SysUtil::CheckSHA256(dst_file_path.string(), source_sha256). model(/root/glm-test.tar.gz) sha256 check failed, stack: #0 secretflow::serving::Server::Start()+0x559a95735990
#1 main+0x559a9572dc75
#2 __libc_start_main+0x7fbaa523acf3

[The same sequence repeats for four further restarts, from 14:17:42 to 14:17:46; only the timestamps and stack addresses differ.]

@wangzul

wangzul commented Nov 15, 2024

Try the "sourceSha256" value from the 0.7.x source.

@wangzul

wangzul commented Nov 15, 2024

Does setting sourceSha256 resolve the verification failure reported in the log?
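
For anyone hitting the same "sha256 check failed" warning: the log compares an expected digest against the digest of the copied model package. A minimal sketch for checking this before starting the container, assuming serving.config is plain JSON with a modelConf.sourceSha256 field as in the examples in this thread. The helper names file_sha256 and check_model_digest are made up for illustration and are not part of SecretFlow-Serving.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_model_digest(config_path, model_path):
    """Compare the model file's digest with modelConf.sourceSha256 (hypothetical helper)."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    expected = config["modelConf"]["sourceSha256"].lower()
    actual = file_sha256(model_path)
    if actual != expected:
        # Same wording as the serving log, so the two are easy to compare.
        print(f"sha256 check failed, expect:{expected}, get:{actual}")
    return actual == expected
```

Calling check_model_digest("serving.config", "glm-test.tar.gz") on the host should return True once the digest in the config matches the package that is mounted into the container.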

@zhuiguang49
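Author

"sourceSha256" 使用源码0.7.x 的尝试一下。 (Try the "sourceSha256" value from the 0.7.x source.)

Do you mean using "sourceSha256": "c6308af488bcd6c54a48a145af17aa209dec463b5cb44d83c6b58195818c10a0" from serving/examples/alice/serving.config in the 0.7.x release?

After making that change, startup fails with the following error: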

2024-11-15 17:43:48.175 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 17:43:48.175 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 17:43:48.175 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 17:43:48.175 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 17:43:48.176 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:48.176 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:48.176 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 17:43:48.178 [info] [filesystem_source.cc:OnPullModel:37] copy model file from /root/glm-test.tar.gz to ./data/test_service_id/glm-test/model_bundle.tar.gz
2024-11-15 17:43:48.180 [error] [main.cc:main:144] server startup failed, code: 3, msg: [Enforce fail at secretflow_serving/server/server.cc:85] !host.empty(). get empty host., stack: #0 main+0x55b9c424ec75
#1 __libc_start_main+0x7f774043acf3

2024-11-15 17:43:49.275 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 17:43:49.275 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 17:43:49.275 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 17:43:49.275 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 17:43:49.276 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:49.276 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:49.276 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 17:43:49.280 [error] [main.cc:main:144] server startup failed, code: 3, msg: [Enforce fail at secretflow_serving/server/server.cc:85] !host.empty(). get empty host., stack: #0 main+0x559c8680fc75
#1 __libc_start_main+0x7f148243acf3

2024-11-15 17:43:50.448 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 17:43:50.448 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 17:43:50.448 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 17:43:50.448 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 17:43:50.448 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:50.448 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:50.448 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 17:43:50.453 [error] [main.cc:main:144] server startup failed, code: 3, msg: [Enforce fail at secretflow_serving/server/server.cc:85] !host.empty(). get empty host., stack: #0 main+0x561f7823dc75
#1 __libc_start_main+0x7ffba5c3acf3

2024-11-15 17:43:51.554 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 17:43:51.554 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 17:43:51.554 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 17:43:51.554 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 17:43:51.554 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:51.554 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:51.554 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 17:43:51.559 [error] [main.cc:main:144] server startup failed, code: 3, msg: [Enforce fail at secretflow_serving/server/server.cc:85] !host.empty(). get empty host., stack: #0 main+0x560d8f0edc75
#1 __libc_start_main+0x7f3af343acf3

2024-11-15 17:43:52.686 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 17:43:52.686 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 17:43:52.687 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 17:43:52.687 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 17:43:52.687 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:52.687 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:52.687 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 17:43:52.691 [error] [main.cc:main:144] server startup failed, code: 3, msg: [Enforce fail at secretflow_serving/server/server.cc:85] !host.empty(). get empty host., stack: #0 main+0x55e806e6cc75
#1 __libc_start_main+0x7fbb8fa3acf3

2024-11-15 17:43:54.702 [info] [trace.cc:InitTracer:156] trace log span processor configured
2024-11-15 17:43:54.702 [info] [spdlog_span_exporter.cc:SpdLogSpanExporter:61] trace log init success, trace_log_path=./trace.log
2024-11-15 17:43:54.702 [info] [main.cc:main:96] version: 0.7.0b0
2024-11-15 17:43:54.702 [info] [main.cc:main:107] op list: PHE_2P_REDUCE, PHE_2P_MERGE_Y, PHE_2P_DECRYPT_PEER_Y, MERGE_Y, DOT_PRODUCT, PHE_2P_DOT_PRODUCT, ARROW_PROCESSING, TREE_SELECT, TREE_MERGE, TREE_ENSEMBLE_PREDICT
2024-11-15 17:43:54.703 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:54.703 [info] [retry_policy.cc:RetryPolicy:48] Create RetryPolicy:backoff_time:10ms
2024-11-15 17:43:54.703 [info] [retry_policy.cc:SetConfig:171] Regist retry policy: name=bob
2024-11-15 17:43:54.707 [error] [main.cc:main:144] server startup failed, code: 3, msg: [Enforce fail at secretflow_serving/server/server.cc:85] !host.empty(). get empty host., stack: #0 main+0x55d487b63c75
#1 __libc_start_main+0x7f7ebea3acf3

The failure now appears to be "get empty host". The sha256 problem may be resolved, since I can no longer find that message in the log output.
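
The "!host.empty()" enforce in server.cc suggests the server reads a host value from the config and rejects an empty one. A rough pre-flight check, assuming the value lives at serverConf.host as in the example configs in this thread. The function name missing_server_fields is hypothetical, and only the host requirement is confirmed by the log above.

```python
import json


def missing_server_fields(config_text, required=("host",)):
    """Return the names of required serverConf fields that are absent or empty.

    `required` defaults to ("host",) because that is the only field the
    error log confirms; extend it at your own discretion.
    """
    config = json.loads(config_text)
    server_conf = config.get("serverConf", {})
    return [
        name
        for name in required
        if not str(server_conf.get(name, "")).strip()
    ]
```

An empty list means every listed field is present and non-empty.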

@wangzul

wangzul commented Nov 15, 2024

Please share your serving.config so we can take a look.

@oeqqwq
Member

oeqqwq commented Nov 15, 2024

The example template in the current documentation is outdated. Please reconfigure using the following template:

  {
    "id": "test_service_id",
    "serverConf": {
      "featureMapping": {
        "v24": "x24",
        "v22": "x22",
        "v21": "x21",
        "v25": "x25",
        "v23": "x23"
      },
      "host": "0.0.0.0",
      "servicePort": "9010",
      "communicationPort": "9110",
      "metricsExposerPort": 10306,
      "brpcBuiltinServicePort": 10307
    },
    "modelConf": {
      "modelId": "glm-test",
      "basePath": "./data",
      "sourcePath": "./glm-test.tar.gz",
      "sourceSha256": "c6308af488bcd6c54a48a145af17aa209dec463b5cb44d83c6b58195818c10a0",
      "sourceType": "ST_FILE"
    },
    "clusterConf": {
      "selfId": "alice",
      "parties": [
        {
          "id": "alice",
          "address": "0.0.0.0:9110"
        },
        {
          "id": "bob",
          "address": "0.0.0.0:9111"
        }
      ],
      "channelDesc": {
        "protocol": "http"
      }
    },
    "featureSourceConf": {
      "mockOpts": {}
    }
  }

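For an accurate and detailed description of the configuration fields, see: https://www.secretflow.org.cn/zh-CN/docs/serving/0.7.0b0/reference/config#servingconfig

We will fix the affected documentation as soon as possible.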

@zhuiguang49
Author

alice's serving.config

  {
    "id": "test_service_id",
    "serverConf": {
      "featureMapping": {
        "v24": "x24",
        "v22": "x22",
        "v21": "x21",
        "v25": "x25",
        "v23": "x23"
      },
      "host": "0.0.0.0",
      "servicePort": "9010",
      "communicationPort": "9110",
      "metricsExposerPort": 10306,
      "brpcBuiltinServicePort": 10307
    },
    "modelConf": {
      "modelId": "glm-test",
      "basePath": "./data",
      "sourcePath": "examples/alice/glm-test.tar.gz",
      "sourceSha256": "c6308af488bcd6c54a48a145af17aa209dec463b5cb44d83c6b58195818c10a0",
      "sourceType": "ST_FILE"
    },
    "clusterConf": {
      "selfId": "alice",
      "parties": [
        {
          "id": "alice",
          "address": "0.0.0.0:9110"
        },
        {
          "id": "bob",
          "address": "0.0.0.0:9111"
        }
      ],
      "channelDesc": {
        "protocol": "http"
      }
    },
    "featureSourceConf": {
      "mockOpts": {}
    }
  }

bob's serving.config

{
  "id": "test_service_id",
  "serverConf": {
    "featureMapping": {
      "v6": "x6",
      "v7": "x7",
      "v8": "x8",
      "v9": "x9",
      "v10": "x10"
    },
    "host": "0.0.0.0",
    "servicePort": "9011",
    "communicationPort": "9111",
    "metricsExposerPort": 10308,
    "brpcBuiltinServicePort": 10309
  },
  "modelConf": {
    "modelId": "glm-test-1",
    "basePath": "/tmp/bob",
    "sourcePath": "examples/bob/glm-test.tar.gz",
    "sourceSha256": "d45d34feae42d663875569c206ac20263480300a40e34deda4037e01aeb9b2f8",
    "sourceType": "ST_FILE"
  },
  "clusterConf": {
    "selfId": "bob",
    "parties": [
      {
        "id": "alice",
        "address": "0.0.0.0:9110"
      },
      {
        "id": "bob",
        "address": "0.0.0.0:9111"
      }
    ],
    "channel_desc": {
      "protocol": "http"
    }
  },
  "featureSourceConf": {
    "mockOpts": {}
  }
}

Both containers now run normally, but the two nodes do not seem to be able to connect to each other.

@pchyuan

pchyuan commented Nov 16, 2024

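What do you mean by "cannot connect"? Are you using the ports inside the container or the host ports? Are the container ports mapped to the host?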

@wangzul

wangzul commented Nov 16, 2024

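I understand the file paths now. I had not realized that the host files are mounted into the container.

Please share your docker-compose.yaml so I can take a look.

The contents of docker-compose.yaml are as follows: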

version: "3.8"
services:
  serving:
    cap_add:
      - NET_ADMIN
    command:
      - /root/sf_serving/secretflow_serving
      - --serving_config_file=/root/sf_serving/conf/serving.config
      - --logging_config_file=/root/sf_serving/conf/logging.config
      - --trace_config_file=/root/sf_serving/conf/trace.config
    restart: always
    image: secretflow/serving-anolis8:latest
    ports:
      - 4590:9010
    volumes:
      - ./serving.config:/root/sf_serving/conf/serving.config
      - ./logging.config:/root/sf_serving/conf/logging.config
      - ./trace.config:/root/sf_serving/conf/trace.config

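I posted the earlier serving startup error above. It seems the ./glm-test.tar.gz model file is missing.

  1. In your configuration the host port is 4590 and the container port is 9010.
  2. Every party address is set to 0.0.0.0, but each serving process runs inside its own container. The two parties are in separate network environments, so they cannot reach each other through that address.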
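
To illustrate point 2: inside a container, 0.0.0.0 and loopback addresses refer to the container itself, so a peer listed that way in clusterConf.parties cannot be reached. A small sketch that flags such entries, assuming the serving.config layout shown in this thread. The function name unreachable_party_ids is hypothetical.

```python
import ipaddress
import json


def unreachable_party_ids(config_text):
    """Return ids of parties whose address is unspecified (0.0.0.0) or loopback."""
    config = json.loads(config_text)
    flagged = []
    for party in config["clusterConf"]["parties"]:
        # Addresses in this thread have the form "host:port".
        host = party["address"].rsplit(":", 1)[0]
        if host == "localhost":
            flagged.append(party["id"])
            continue
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            # Not an IP literal, e.g. a container name resolved by Docker DNS.
            continue
        if ip.is_unspecified or ip.is_loopback:
            flagged.append(party["id"])
    return flagged
```

With both addresses set to 0.0.0.0 this returns both party ids. Addresses that use names resolvable on a shared Docker network are not flagged.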

@zhuiguang49
Author

zhuiguang49 commented Nov 16, 2024

I created a network with the command docker network create sf-network

and changed the docker-compose.yaml files for alice and bob as follows:
alice

version: "3.8"
services:
  serving:
    cap_add:
      - NET_ADMIN
    command:
      - /root/sf_serving/secretflow_serving
      - --serving_config_file=/root/sf_serving/conf/serving.config
      - --logging_config_file=/root/sf_serving/conf/logging.config
      - --trace_config_file=/root/sf_serving/conf/trace.config
    restart: always
    image: secretflow/serving-anolis8:latest
    container_name: alice-serving
    networks:
      - sf-network
    ports:
      - 4590:9010  # service port
      - 9110:9110  # communication port
      - 10306:10306  # metrics port
      - 10307:10307  # brpc port
    volumes:
      - ./serving.config:/root/sf_serving/conf/serving.config
      - ./logging.config:/root/sf_serving/conf/logging.config
      - ./trace.config:/root/sf_serving/conf/trace.config
      - ./glm-test.tar.gz:/root/glm-test.tar.gz

networks:
  sf-network:
    external: true

bob

version: "3.8"
services:
  serving:
    cap_add:
      - NET_ADMIN
    command:
      - /root/sf_serving/secretflow_serving
      - --serving_config_file=/root/sf_serving/conf/serving.config
      - --logging_config_file=/root/sf_serving/conf/logging.config
      - --trace_config_file=/root/sf_serving/conf/trace.config
    restart: always
    image: secretflow/serving-anolis8:latest
    container_name: bob-serving
    networks:
      - sf-network
    ports:
      - 4591:9011  # service port
      - 9111:9111  # communication port
      - 10308:10308  # metrics port
      - 10309:10309  # brpc port
    volumes:
      - ./serving.config:/root/sf_serving/conf/serving.config
      - ./logging.config:/root/sf_serving/conf/logging.config
      - ./trace.config:/root/sf_serving/conf/trace.config
      - ./glm-test.tar.gz:/root/glm-test.tar.gz

networks:
  sf-network:
    external: true

The serving.config files for alice and bob are as follows:
alice

version: "3.8"
services:
  serving:
    cap_add:
      - NET_ADMIN
    command:
      - /root/sf_serving/secretflow_serving
      - --serving_config_file=/root/sf_serving/conf/serving.config
      - --logging_config_file=/root/sf_serving/conf/logging.config
      - --trace_config_file=/root/sf_serving/conf/trace.config
    restart: always
    image: secretflow/serving-anolis8:latest
    container_name: alice-serving
    networks:
      - sf-network
    ports:
      - 4590:9010  # service port
      - 9110:9110  # communication port
      - 10306:10306  # metrics port
      - 10307:10307  # brpc port
    volumes:
      - ./serving.config:/root/sf_serving/conf/serving.config
      - ./logging.config:/root/sf_serving/conf/logging.config
      - ./trace.config:/root/sf_serving/conf/trace.config
      - ./glm-test.tar.gz:/root/glm-test.tar.gz

networks:
  sf-network:
    external: true

bob

{
  "id": "test_service_id",
  "serverConf": {
    "featureMapping": {
      "v6": "x6",
      "v7": "x7",
      "v8": "x8",
      "v9": "x9",
      "v10": "x10"
    },
    "host": "0.0.0.0",
    "servicePort": "9011",
    "communicationPort": "9111",
    "metricsExposerPort": 10308,
    "brpcBuiltinServicePort": 10309
  },
  "modelConf": {
    "modelId": "glm-test-1",
    "basePath": "/tmp/bob",
    "sourcePath": "examples/bob/glm-test.tar.gz",
    "sourceSha256": "d45d34feae42d663875569c206ac20263480300a40e34deda4037e01aeb9b2f8",
    "sourceType": "ST_FILE"
  },
  "clusterConf": {
    "selfId": "bob",
    "parties": [
      {
        "id": "alice",
        "address": "alice-serving:9110"
      },
      {
        "id": "bob",
        "address": "bob-serving:9111"
      }
    ],
    "channel_desc": {
      "protocol": "http"
    }
  },
  "featureSourceConf": {
    "mockOpts": {}
  }
}

Once both are running, the two containers can ping each other with the commands below. Does this mean the deployment succeeded?

docker exec -it bob-serving ping alice-serving
docker exec -it alice-serving ping bob-serving
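
A successful ping only shows that the container names resolve and ICMP gets through. It does not show that the serving process is listening on its communication port. A minimal TCP check, assuming Python is available wherever it runs (for example on the host, against the published ports 9110 and 9111). The function name can_connect is hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_connect("127.0.0.1", 9110) on the host probes alice's published communication port.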

@wangzul

wangzul commented Nov 16, 2024

Once you have confirmed that both parties are configured correctly, try running it and see.

@zhuiguang49
Author

Both services start normally, the /health endpoint is reachable, and the step 3 prediction test from the documentation also completes.
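
For reference, the /health check can also be scripted. This is a sketch under two assumptions not confirmed in this thread: that the endpoint is served on the service port (published as host port 4590 in alice's compose file) and that it answers HTTP 200 when healthy. The function name is_healthy is hypothetical.

```python
import urllib.request


def is_healthy(base_url, timeout=3.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    # Ignore proxy environment variables, since the target is a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses; non-2xx codes land here too.
        return False
```

For example, is_healthy("http://127.0.0.1:4590") would probe alice from the host under the assumptions above.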
