Inquiry about Character Encoding Method in CK2dll #146

Open
thalesfu opened this issue Dec 11, 2023 · 2 comments
@thalesfu

Dear Matanki Saito,

I am a CKII player and have been utilizing your software, CK2dll. Firstly, I would like to express my gratitude for your diligent work on this project, which has significantly aided my endeavors.

Recently, I encountered a challenge while handling the save files in CK2dll. I observed that some names are stored in a peculiar encoding format, like "bn="�¯e�eg"". I suspect these represent Chinese characters, but I am unable to decode them accurately.

As I am not very proficient in C++, I have had difficulty understanding and replicating the specific encoding method used in the save files. I would like to implement this encoding method in Go, to translate these seemingly garbled characters into readable Chinese names.

Therefore, I would like to seek your guidance on the following:

1. How is character encoding implemented in the CK2dll save files? Is there a specific algorithm or method used?
2. Could you guide me to the part of the code where this encoding is implemented, or provide some detailed instructions on the encoding process?

I appreciate any professional knowledge you can share and the valuable time you spend in responding. Your assistance or guidance in this matter would be greatly appreciated.

Looking forward to your reply, and once again, thank you for your contribution to CK2dll.

@LianJ333

LianJ333 commented Dec 11, 2023 via email

@matanki-saito matanki-saito self-assigned this May 3, 2024
@matanki-saito
Owner

matanki-saito commented May 11, 2024

@thalesfu
Sorry for the late reply.

This special encoding always represents a 2-byte character as 3 bytes. The first byte is a marker in the range 0x10 to 0x14. If the marker is 0x10, the following 2 bytes are taken as the Unicode code point. If the marker is 0x11 to 0x14, the code point is obtained by shifting the value of the following 2 bytes.

This shift allows the two bytes to be embedded safely in the original string, which is a CP1252 byte string.
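Since the question asks about Go, here is a minimal decoding sketch of the scheme described above. It is not taken from CK2dll: `Decoder`, `Shift`, and `LowByteFirst` are hypothetical names, the payload byte order is an assumption, and the actual shift amounts for 0x11 to 0x14 are not stated in this thread, so they are left for the caller to fill in from the reference code. Only the 0x10 case (payload used as the code point unchanged) is configured here.

```go
package main

import (
	"fmt"
	"strings"
)

// Shift holds the adjustment applied to the two payload bytes for one marker.
// The real values are not given in this thread; take them from the
// reference implementation before relying on this sketch.
type Shift struct {
	LowDelta, HighDelta int
}

// Decoder is a hypothetical helper, not part of CK2dll.
type Decoder struct {
	// Shifts maps a marker byte (0x10 to 0x14) to its adjustment.
	Shifts map[byte]Shift
	// LowByteFirst is an assumption about the payload byte order.
	LowByteFirst bool
}

// Decode converts an escaped byte string into a Go (UTF-8) string.
func (d Decoder) Decode(in []byte) (string, error) {
	var sb strings.Builder
	for i := 0; i < len(in); i++ {
		b := in[i]
		if b < 0x10 || b > 0x14 {
			// Ordinary byte. rune(b) is the Latin-1 mapping; CP1252 differs
			// in 0x80-0x9F and would need a lookup table.
			sb.WriteRune(rune(b))
			continue
		}
		s, ok := d.Shifts[b]
		if !ok {
			return "", fmt.Errorf("no shift configured for marker 0x%02X at offset %d", b, i)
		}
		if i+2 >= len(in) {
			return "", fmt.Errorf("truncated escape at offset %d", i)
		}
		first, second := int(in[i+1]), int(in[i+2])
		low, high := second, first
		if d.LowByteFirst {
			low, high = first, second
		}
		// Undo the shift on each byte, then combine into a 16-bit code point.
		cp := (((high + s.HighDelta) & 0xFF) << 8) | ((low + s.LowDelta) & 0xFF)
		sb.WriteRune(rune(cp))
		i += 2 // skip the two payload bytes
	}
	return sb.String(), nil
}

func main() {
	d := Decoder{
		Shifts:       map[byte]Shift{0x10: {}}, // 0x10: payload is the code point as-is
		LowByteFirst: true,                     // assumed byte order
	}
	out, err := d.Decode([]byte{'b', 'n', '=', 0x10, 0x2D, 0x4E})
	fmt.Println(out, err) // prints "bn=中 <nil>" under the assumptions above
}
```

Markers without a configured shift return an error instead of guessing, so entries for 0x11 to 0x14 can be added once their values are confirmed. For a full CP1252 mapping of the ordinary bytes, `charmap.Windows1252` from `golang.org/x/text/encoding/charmap` could replace the `rune(b)` shortcut.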


The JS code written by the maintainer of ParaTranz is available in this gist and might be helpful:

https://gist.github.com/bruceCzK/91abe395c72c5b08f186d5ae8add03e6
