Field type question when calling serving #129

Open
neaos opened this issue Dec 9, 2024 · 19 comments

@neaos

neaos commented Dec 9, 2024

{
    "fs_params": {
        "com2023011620072311738": {
            "query_datas": [
                "5"
            ],
            "query_context": "text"
        },
        "com2023011620060497797": {
            "query_datas": [
                "7"
            ],
            "query_context": "text"
        }
    }
}

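The above is the inference (prediction) request payload.
The request that created the serving job does not specify field types either.
image

So why does the log show that the field type is constrained to double?
image

In other words, where is the field type in feature_schema_ determined in the code? I cannot see it being passed in through the request parameters. Is it a default?
Where can I explicitly set the field types in feature_schema_?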

@neaos
Author

neaos commented Dec 9, 2024

image

Tracing through the code, I got as far as source_schema_. Where are the field types in it determined?

@neaos
Copy link
Author

neaos commented Dec 9, 2024

image

Where do the EntryNodes here come from?

@huocun-ant
Collaborator

From the model package that is loaded when the Server starts.

@neaos
Author

neaos commented Dec 9, 2024

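1. Train first, which produces model file 0.
2. Call a kuscia API to merge the feature calculation with model file 0.
3. Launch the serving job, which produces the inference endpoint.
My model is generated in step 2 through the kuscia API, for example model2024120808521160582-model-export-output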

{"job_id":"model2024120808521160582","initiator":"com2023011620060497797","max_parallelism":1,"tasks":[{"task_id":"model2024120808521160582-model-export","app_image":"secretflow-image","alias":"model2024120808521160582-model-export","task_input_config":"{"sf_cluster_desc":{"devices":[{"config":"{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"party_address\":{\"com2023011620072311738\":\"172.16.16.116\",\"com2023011620060497797\":\"172.16.16.111\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}","name":"spu","parties":["com2023011620060497797","com2023011620072311738"],"type":"spu"},{"config":"{\"mode\":\"PHEU\",\"schema\":\"paillier\",\"key_size\":2048}","name":"heu","parties":["com2023011620060497797","com2023011620072311738"],"type":"heu"}],"parties":["com2023011620060497797","com2023011620072311738"],"ray_fed_config":{"cross_silo_comm_backend":"brpc_link"}},"sf_datasource_config":{"com2023011620060497797":{"id":"default-data-source"},"com2023011620072311738":{"id":"default-data-source"}},"sf_input_ids":[],"sf_node_eval_param":{"attr_paths":["model_name","model_desc","input_datasets","output_datasets","component_eval_params"],"attrs":[{"s":"modelExport-model2024120808521160582"},{"s":""},{"ss":["reefr2024120515141372809-qagbmwrn-node-36-output-0","reefr2024120515141372809-qagbmwrn-node-37-output-0"]},{"ss":["reefr2024120515141372809-qagbmwrn-node-37-output-0","reefr2024120515141372809-qagbmwrn-node-37-output-1","reefr2024120515141372809-qagbmwrn-node-39-output-0","reefr2024120515141372809-qagbmwrn-node-39-output-1"]},{"ss":["eyJhdHRyX3BhdGhzIjpbImlucHV0L2luX2RzL2ZlYXR1cmVzIiwicnVsZXMiXSwiYXR0cnMiOlt7ImlzX25hIjpmYWxzZSwic3MiOlsiYWdlIiwiZWR1Y2F0aW9uIiwiZGVmYXVsdCIsImJhbGFuY2UiLCJob3VzaW5nIiwibG9hbiIsImRheSIsImR1cmF0aW9uIiwiY2FtcGFpZ24iLCJwZGF5cyIsInByZXZpb3VzIiwiam9iX2JsdWUtY29sbGFy
Iiwiam9iX2VudHJlcHJlbmV1ciIsImpvYl9ob3VzZW1haWQiLCJqb2JfbWFuYWdlbWVudCIsImpvYl9yZXRpcmVkIiwiam9iX3NlbGYtZW1wbG95ZWQiLCJqb2Jfc2VydmljZXMiLCJqb2Jfc3R1ZGVudCIsImpvYl90ZWNobmljaWFuIiwiam9iX3VuZW1wbG95ZWQiLCJtYXJpdGFsX2Rpdm9yY2VkIiwibWFyaXRhbF9tYXJyaWVkIiwibWFyaXRhbF9zaW5nbGUiLCJjb250YWN0X2NlbGx1bGFyIiwiY29udGFjdF90ZWxlcGhvbmUiLCJjb250YWN0X3Vua25vd24iLCJtb250aF9hcHIiLCJtb250aF9hdWciLCJtb250aF9kZWMiLCJtb250aF9mZWIiLCJtb250aF9qYW4iLCJtb250aF9qdWwiLCJtb250aF9qdW4iLCJtb250aF9tYXIiLCJtb250aF9tYXkiLCJtb250aF9ub3YiLCJtb250aF9vY3QiLCJtb250aF9zZXAiLCJwb3V0Y29tZV9mYWlsdXJlIiwicG91dGNvbWVfb3RoZXIiLCJwb3V0Y29tZV9zdWNjZXNzIiwicG91dGNvbWVfdW5rbm93biJdfSx7InMiOiJ7XCJvcFwiOlwiU1RBTkRBUkRJWkVcIn0ifV0sImNoZWNrcG9pbnRfdXJpIjoiY2tiZW9iLXFhZ2Jtd3JuLW5vZGUtMzctb3V0cHV0LTAiLCJkb21haW4iOiJwcmVwcm9jZXNzaW5nIiwibmFtZSI6ImZlYXR1cmVfY2FsY3VsYXRlIiwidmVyc2lvbiI6IjAuMC4xIn0=","eyJhdHRyX3BhdGhzIjpbImlucHV0L3RyYWluX2RhdGFzZXQvZmVhdHVyZV9zZWxlY3RzIiwiaW5wdXQvdHJhaW5fZGF0YXNldC9sYWJlbCIsImVwb2NocyIsImxlYXJuaW5nX3JhdGUiLCJiYXRjaF9zaXplIiwic2lnX3R5cGUiLCJyZWdfdHlwZSIsInBlbmFsdHkiLCJsMl9ub3JtIiwiZXBzIiwicmVwb3J0X3dlaWdodHMiXSwiYXR0cnMiOlt7ImlzX25hIjpmYWxzZSwic3MiOlsiYWdlIiwiZWR1Y2F0aW9uIiwiZGVmYXVsdCIsImJhbGFuY2UiLCJob3VzaW5nIiwibG9hbiIsImRheSIsImR1cmF0aW9uIiwiY2FtcGFpZ24iLCJwZGF5cyIsInByZXZpb3VzIiwiam9iX2JsdWUtY29sbGFyIiwiam9iX2VudHJlcHJlbmV1ciIsImpvYl9ob3VzZW1haWQiLCJqb2JfbWFuYWdlbWVudCIsImpvYl9yZXRpcmVkIiwiam9iX3NlbGYtZW1wbG95ZWQiLCJqb2Jfc2VydmljZXMiLCJqb2Jfc3R1ZGVudCIsImpvYl90ZWNobmljaWFuIiwiam9iX3VuZW1wbG95ZWQiLCJtYXJpdGFsX2Rpdm9yY2VkIiwibWFyaXRhbF9tYXJyaWVkIiwibWFyaXRhbF9zaW5nbGUiLCJjb250YWN0X2NlbGx1bGFyIiwiY29udGFjdF90ZWxlcGhvbmUiLCJjb250YWN0X3Vua25vd24iLCJtb250aF9hcHIiLCJtb250aF9hdWciLCJtb250aF9kZWMiLCJtb250aF9mZWIiLCJtb250aF9qYW4iLCJtb250aF9qdWwiLCJtb250aF9qdW4iLCJtb250aF9tYXIiLCJtb250aF9tYXkiLCJtb250aF9ub3YiLCJtb250aF9vY3QiLCJtb250aF9zZXAiLCJwb3V0Y29tZV9mYWlsdXJlIiwicG91dGNvbWVfb3RoZXIiLCJwb3V0Y29tZV9zdWNjZXNzIiwicG91dGNvbWVfdW5rbm93biJdfSx7ImlzX25hIjpmYWxzZSwic3MiOlsieSJdfSx7Imk2NCI6MywiaXNfbmEiO
mZhbHNlfSx7ImYiOjAuMDEsImlzX25hIjpmYWxzZX0seyJpNjQiOjE2LCJpc19uYSI6ZmFsc2V9LHsiaXNfbmEiOmZhbHNlLCJzIjoidDEifSx7ImlzX25hIjpmYWxzZSwicyI6ImxvZ2lzdGljIn0seyJpc19uYSI6ZmFsc2UsInMiOiJOb25lIn0seyJmIjowLjUsImlzX25hIjpmYWxzZX0seyJmIjowLjAwMSwiaXNfbmEiOmZhbHNlfSx7ImlzX25hIjp0cnVlfV0sImNoZWNrcG9pbnRfdXJpIjoiY2tiZW9iLXFhZ2Jtd3JuLW5vZGUtMzktb3V0cHV0LTAiLCJkb21haW4iOiJtbC50cmFpbiIsIm5hbWUiOiJzc19zZ2RfdHJhaW4iLCJ2ZXJzaW9uIjoiMC4wLjEifQ=="]}],"domain":"model","name":"model_export","version":"0.0.1"},"sf_output_ids":["model2024120808521160582-model-export-output","model2024120808521160582-model-export-output-report"],"sf_output_uris":["model2024120808521160582-model-export-output","model2024120808521160582-model-export-output-report"]}","priority":100,"parties":[{"domain_id":"com2023011620060497797","role":"partner"},{"domain_id":"com2023011620072311738","role":"partner"}],"dependencies":[]}]}

But I did not specify any field types in this request, so why are the field types defined in the model package double?

@neaos
Author

neaos commented Dec 9, 2024

Or, how should I specify the correct field types in the model-merge request above?

@huocun-ant
Collaborator

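The field types match those of the table you fed into feature calculation. If you have no feature calculation step, they match the types of your training input table. When the model is exported, the system traces back through the whole pipeline until it reaches the type information of the original table. In other words, the model package contains the type information of every step's input and output tables, starting from the original table.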

@neaos
Author

neaos commented Dec 9, 2024

I did not export a model.

@neaos
Author

neaos commented Dec 9, 2024

f4ee56d642816fe9c710c11c14eab2a

/serving-2024120823344327406/model2024120808521160582-model-export-output/model_bundle.tar.gz
Is this the model package?

@neaos
Author

neaos commented Dec 9, 2024

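For the model-export API I reuse the feature calculation from training. The field type specified for feature calculation during training is float, yet the serving log shows double.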

@neaos
Author

neaos commented Dec 9, 2024

2e60b4cd9a34aed56a54952c5910348

@neaos
Author

neaos commented Dec 9, 2024

Can the field types be specified through parameters passed to the model-export API?

@neaos
Author

neaos commented Dec 9, 2024

Is it in ss, inside that very long base64 string?

@neaos
Author

neaos commented Dec 9, 2024

image

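In serving, the data I query is of type float, but feature_schema_ specifies double. So I tried to convert float to double, but it failed; type conversion does not seem to work.
// Fragment from the adapter; needs <arrow/compute/api.h>.
if (array->type()->id() == arrow::Type::FLOAT &&
    expected_type->id() == arrow::Type::DOUBLE) {
  SPDLOG_INFO("Column {} type mismatch, casting FLOAT to DOUBLE...", i);
  // Cast returns a Result; Value() stores the casted array into `array` on success.
  status = arrow::compute::Cast(*array, arrow::float64()).Value(&array);
  // Check `status` before using `array`.
}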

@neaos
Author

neaos commented Dec 9, 2024

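So my only option is to find where feature_schema_ is assigned and change the type it specifies to float, so that the two match.
In other words: can the field types in the model package be determined by the input parameters of the model-export API?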

@neaos
Author

neaos commented Dec 9, 2024

0.log

This is the model export log.

@neaos
Author

neaos commented Dec 9, 2024

eca540ccf07cb0d7e67c7feffd13b7d

image

Earlier it is defined as float,
but later it becomes float64.

@oeqqwq
Member

oeqqwq commented Dec 11, 2024

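A few clarifications on the related questions:

feature_schema_ comes from the schema information required by the entry operators recorded in the serving model. If your model comes from SF's model_export operator, then when it exports the serving model it traces the corresponding schema from the SF operators you chose to submit.

In your adapter you can convert types however you like. You only need to make sure that the schema of the recordbatch in the response (https://github.com/secretflow/serving/blob/main/secretflow_serving/feature_adapter/feature_adapter.h#L40) fully contains the contents of feature_schema_.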
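
The containment condition described above can be sketched with plain standard-library types. This is only an illustration under stated assumptions: `Schema`, `CastFloatFieldsToDouble` and `SchemaContains` are hypothetical stand-ins and are not part of the SecretFlow Serving or Arrow APIs. Type names are plain strings, and the sketch assumes that "contains" means every field of feature_schema_ appears in the response with the same name and the same type.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a schema: field name -> type name.
// The real adapter works with arrow::Schema; strings keep the sketch self-contained.
using Schema = std::map<std::string, std::string>;

// Returns a copy of `actual` in which every "float" field that `expected`
// declares as "double" is relabelled "double", mimicking a widening cast
// performed inside a custom adapter. All other fields are left untouched.
Schema CastFloatFieldsToDouble(const Schema& actual, const Schema& expected) {
  Schema result = actual;
  for (auto& kv : result) {
    auto it = expected.find(kv.first);
    if (it != expected.end() && kv.second == "float" && it->second == "double") {
      kv.second = "double";
    }
  }
  // Casting changes types only; it never adds or drops fields.
  assert(result.size() == actual.size());
  return result;
}

// True when every field required by `expected` is present in `actual`
// with the same type. Extra fields in `actual` are allowed.
bool SchemaContains(const Schema& actual, const Schema& expected) {
  for (const auto& kv : expected) {
    auto it = actual.find(kv.first);
    if (it == actual.end() || it->second != kv.second) {
      return false;
    }
  }
  return true;
}
```

Under these assumptions, a queried column of type float fails `SchemaContains` against an expected double field, and passes once `CastFloatFieldsToDouble` has been applied to the response schema.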

@oeqqwq
Member

oeqqwq commented Dec 11, 2024

> eca540ccf07cb0d7e67c7feffd13b7d
> image
>
> Earlier it is defined as float, but later it becomes float64.

Could you clarify what exactly "earlier" and "later" refer to here?

Generally speaking, though, if your input data is originally defined as float, its type may change in the output of operators such as feature calculation. At model_export time you can choose to package the feature calculation together with the model training operator. The packaged serving model then contains your feature calculation logic, and its entry schema is your original input schema. If you export the model from the training operator alone, the serving model's schema is the output schema of the feature calculation operator.
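
The rule described above can be illustrated with a toy model of the pipeline. This is a sketch of the stated behavior only, not SecretFlow's export implementation: `Step` and `EntryType` are hypothetical names, each component is reduced to a single input type and a single output type, and the type strings are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one pipeline component, reduced to the column type
// it consumes and the column type it produces.
struct Step {
  std::string name;
  std::string input_type;
  std::string output_type;
};

// Returns the type the exported serving model expects at its entry point,
// given the full pipeline and the index of the first component that is
// packaged into the model. The entry type is that component's input type.
std::string EntryType(const std::vector<Step>& pipeline,
                      std::size_t first_exported) {
  assert(first_exported < pipeline.size());
  return pipeline[first_exported].input_type;
}
```

For a pipeline in which a feature calculation step turns float into float64 and is followed by a training step, packaging from index 0 gives an entry type of float, while packaging only the training step (index 1) gives float64.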

@huocun-ant
Collaborator

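> 0.log
>
> This is the model export log.

Looking at the model export log, your export includes components such as feature_calculate and ss_sgd_train. Comparing log lines 1606 and 1628 shows that float became float64 when the data passed through feature_calculate.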
