Test vectors #35
depends on #25 |
I’d like to help create test vectors for ChillDKG’s public APIs. Here’s my current thinking:
We need test vectors for all the public functions:
Some outputs, like
How many JSON files?
My initial plan is one JSON file per public API. If that feels too fragmented later, we can merge some APIs into a single file. Starting with one file per API is more straightforward and gives us the flexibility to combine them later if needed.
Design Options for Representing Objects
We have two main designs for our custom types in the JSON:
Trade-offs:
An Example (Design 2)
If we choose Design 2, where we represent objects as nested JSON structures, here’s how it might look. Consider
flowchart TB
A[ParticipantState1]
A --> B[params]
B --o C1["t<br/><small>(int)</small>"]
C1:::highlight
B --o C2["hostpubkeys<br/><small>(bytes[n])</small>"]
C2:::highlight
A --o D["idx<br/><small>(int)</small>"]
D:::highlight
A --> E[enc_state]
E --> F[simpl_state]
F --o G1["t<br/><small>(int)</small>"]
G1:::highlight
F --o G2["n<br/><small>(int)</small>"]
G2:::highlight
F --o G3["idx<br/><small>(int)</small>"]
G3:::highlight
F --o G4["com_to_secrets<br/><small>(GE)</small>"]
G4:::highlight
E --o G["pubnonce<br/><small>(bytes)</small>"]
G:::highlight
E --o H["enckeys<br/><small>(bytes[n])</small>"]
H:::highlight
E --o I["idx<br/><small>(int)</small>"]
I:::highlight
classDef highlight stroke:#333,stroke-width:2px,stroke-dasharray: 5 5;
A naive way of representing this in JSON would look like:
{
"ParticipantState1": {
"params": {
"t": 2,
"hostpubkeys": [
"02aabbccddeeff...",
"02ffaabbccddee...",
"02ccddeeaabbff..."
]
},
"idx": 1,
"enc_state": {
"simpl_state": {
"t": 2,
"n": 3,
"idx": 1,
"com_to_secrets": "03abcdef1234..."
},
"pubnonce": "02ddeeff112233...",
"enckeys": [
"02aabbccddeeff...",
"02ffaabbccddee...",
"02ccddeeaabbff..."
],
"idx": 1
}
}
}
However, this could be optimized by [1] using a global array of host keys and referencing them by indices, and [2] eliminating repeated fields like
{
  "global_hostpubkeys": [
    "02aabbccddeeff...",
    "02ffaabbccddee...",
    "02ccddeeaabbff..."
  ],
  "ParticipantState1": {
    "params": {
      "t": 2,
      "hostpubkey_indices": [0, 1, 2]
    },
    "idx": 1,
    "enc_state": {
      "simpl_state": {
        // t, n, idx are redundant (parser can infer these vals)
        "com_to_secrets": "03abcdef1234..."
      },
      "pubnonce": "02ddeeff112233..."
      // enckeys_indices = hostpubkey_indices (parser infers this list)
      // idx is redundant
    }
  }
}
Next Steps:
I’d like to hear your thoughts on my current plan—what seems solid and what I might be getting wrong. |
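To make the trade-off of the compact form concrete, here is a minimal Python sketch of the extra work a parser would take on: rebuilding the verbose layout from the compact one. The function name `expand_state` is hypothetical, the hex strings are the same truncated placeholders used above (treated as opaque strings), and the rule that `enckeys` mirror the host public keys is taken from the comment in the compact example rather than from the ChillDKG code.

```python
def expand_state(compact):
    """Rebuild the verbose ParticipantState1 layout from the compact one."""
    keys = compact["global_hostpubkeys"]
    state = compact["ParticipantState1"]
    # Resolve index references into the actual host public keys.
    hostpubkeys = [keys[i] for i in state["params"]["hostpubkey_indices"]]
    t, idx = state["params"]["t"], state["idx"]
    enc = state["enc_state"]
    return {
        "ParticipantState1": {
            "params": {"t": t, "hostpubkeys": hostpubkeys},
            "idx": idx,
            "enc_state": {
                "simpl_state": {
                    # t, n and idx are inferred instead of being stored.
                    "t": t,
                    "n": len(hostpubkeys),
                    "idx": idx,
                    "com_to_secrets": enc["simpl_state"]["com_to_secrets"],
                },
                "pubnonce": enc["pubnonce"],
                # Assumption from the compact example: same list as hostpubkeys.
                "enckeys": list(hostpubkeys),
                "idx": idx,
            },
        }
    }


compact = {
    "global_hostpubkeys": ["02aabbccddeeff...", "02ffaabbccddee...", "02ccddeeaabbff..."],
    "ParticipantState1": {
        "params": {"t": 2, "hostpubkey_indices": [0, 1, 2]},
        "idx": 1,
        "enc_state": {
            "simpl_state": {"com_to_secrets": "03abcdef1234..."},
            "pubnonce": "02ddeeff112233...",
        },
    },
}
expanded = expand_state(compact)
```

Every implementation consuming the compact files would need an equivalent of this expansion step, which is the cost to weigh against the smaller files.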
Hey @siv2r,
thanks, that would be great!
Yes, we only need to define serializations for objects that are between coordinator and participants.
I think design 2 is better than design 1. However, there's another alternative. Right now it seems slightly better than Design 2, but I'm not sure.
Design 3
The main difference from designs 1 and 2 is that the test vectors do not contain state inputs or outputs. For example, a test vector of a successful DKG run from the participant's perspective would be a tuple
We would only need to define JSON serializations for A test vector of a failing
Functions without state, like
I think the advantage is that such test vectors are easier to use because users don't need to concern themselves with the JSON serialization of state. However, it's possible to reduce the size of the test vectors by specifying common prefixes only once, for example like this:
P.S. I think experience shows that it's probably best to start by writing a script that automatically generates test vectors instead of trying to generate them ad hoc. |
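As a rough illustration of the generator-script idea, the following Python sketch runs a function on chosen inputs and records either the output or the name of the raised error. Everything here is an assumption for illustration: `toy_step` is a hypothetical stand-in and not a ChillDKG API, and the JSON field names (`comment`, `inputs`, `expected_output`, `expected_error`) are not an agreed format.

```python
import hashlib
import json


def toy_step(secret: bytes, msg: bytes) -> bytes:
    """Hypothetical stand-in for a stateless public API under test."""
    if len(msg) == 0:
        raise ValueError("empty message")
    return hashlib.sha256(secret + msg).digest()


def make_vector(comment, **inputs):
    """Run the function once and record the inputs plus the observed outcome."""
    vec = {"comment": comment, "inputs": {k: v.hex() for k, v in inputs.items()}}
    try:
        vec["expected_output"] = toy_step(**inputs).hex()
    except ValueError as e:
        # Record only the error type, so the vector does not depend on message text.
        vec["expected_error"] = type(e).__name__
    return vec


vectors = [
    make_vector("valid message", secret=b"\x01" * 32, msg=b"hello"),
    make_vector("empty message is rejected", secret=b"\x01" * 32, msg=b""),
]
document = json.dumps({"function": "toy_step", "test_cases": vectors}, indent=2)
```

Because the expected values come from running the reference code, regenerating the file after a spec change is a single script run instead of manual editing.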
@siv2r Great to hear that you'd like to ! @jonasnick I'm not sure if I understand the idea of Design 3 entirely. How would
Which components in this tuple are inputs, and which are expected outputs? (My thinking: If
I don't see how this would work, or how we could create test cases in which one of the participants "becomes" malicious in the middle of the execution.
I agree (independently of Design 3.) Parsing JSON is annoying for implementations, but enforcing serializations seems even more annoying. @siv2r Perhaps check out https://github.com/C2SP/wycheproof/tree/master as an inspiration. It's a collection of test vectors in JSON format. It's closer to Design 2, and it comes with JSON schemas. All test cases have a human-readable note of what is tested.
I think we should prefer simplicity to smaller JSON files. Disk space on dev machines should not be a concern, and things are complex enough.
I fully agree. |
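For a sense of what a layout with human-readable notes buys, here is a small Python sketch of a sanity check over a test file loosely in the style of the Wycheproof collection. The field names used here (`testGroups`, `tests`, `tcId`, `comment`, `result`) and the allowed result values are assumptions for this illustration, not a statement of the exact Wycheproof schema, and `check_test_file` is a hypothetical helper.

```python
ALLOWED_RESULTS = ("valid", "invalid")


def check_test_file(doc):
    """Return a list of human-readable problems found in a test-vector document."""
    problems = []
    seen_ids = set()
    for group in doc.get("testGroups", []):
        for tc in group.get("tests", []):
            tc_id = tc.get("tcId")
            if tc_id in seen_ids:
                problems.append(f"duplicate tcId {tc_id}")
            seen_ids.add(tc_id)
            # Every case should say in plain words what it is testing.
            if not tc.get("comment"):
                problems.append(f"tcId {tc_id}: missing comment")
            if tc.get("result") not in ALLOWED_RESULTS:
                problems.append(f"tcId {tc_id}: unknown result")
    return problems


good = {"testGroups": [{"tests": [
    {"tcId": 1, "comment": "honest run", "result": "valid"},
    {"tcId": 2, "comment": "tampered coordinator message", "result": "invalid"},
]}]}
bad = {"testGroups": [{"tests": [
    {"tcId": 1, "comment": "", "result": "valid"},
    {"tcId": 1, "comment": "duplicate id", "result": "maybe"},
]}]}
```

A published JSON schema would cover the structural part of this check; the sketch only shows the kind of invariants worth enforcing in the generator script.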
I'd imagine that the test vector in design 3 for failing
def test_vec_participant_step2(hostseckey, session_params, random, ParticipantMsg1, CoordinatorMsg1, expected_exception):
    state1, msg1 = participant_step1(hostseckey, session_params, random)
    # Since the output of participant_step1 is tested somewhere else, we could
    # omit ParticipantMsg1 from the test vector tuple (and I think we should, this was just an
    # oversight on my part).
    assert msg1 == ParticipantMsg1
    try:
        participant_step2(hostseckey, state1, CoordinatorMsg1)
    except Exception as e:
        # The expected exception is the only output in this test vector.
        assert e == expected_exception
        return
    assert False, "participant_step2 was expected to raise"
So everything but the exception is an input. We should probably mark inputs and outputs separately in the test vector. I was just trying to discuss the high level idea.
Since we don't need coordinator state to invoke
Does the above example make it clearer? Design 3 only runs the honest functions of an honest participant. We model faulty participants through messages received by the honest participant and those are part of the test vector.
In the context of design 3, using prefixes would help understandability because users could see more easily which parts of different test vectors are the same and which parts differ. |
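The two points above (inputs and outputs marked separately, shared prefixes stated once) can be sketched together in Python. This is only an illustration under stated assumptions: `toy_step1`, `toy_step2` and `ToyError` are hypothetical stand-ins with made-up integer arithmetic, not the ChillDKG functions, and the `common` / `inputs` / `expected` layout is one possible shape rather than a proposed format.

```python
class ToyError(Exception):
    """Hypothetical error type standing in for a protocol failure."""


def toy_step1(hostseckey, random):
    # The state stays in memory only; it never appears in the test vector.
    state = {"secret": hostseckey, "random": random}
    msg = (hostseckey ^ random) & 0xFF
    return state, msg


def toy_step2(state, cmsg):
    if cmsg != state["random"] + 1:
        raise ToyError("bad coordinator message")
    return state["secret"] + cmsg


suite = {
    # Shared prefix: inputs that are identical across all cases, stated once.
    "common": {"hostseckey": 7, "random": 3},
    "test_cases": [
        {"comment": "honest coordinator",
         "inputs": {"cmsg": 4}, "expected": {"output": 11}},
        {"comment": "tampered coordinator message",
         "inputs": {"cmsg": 5}, "expected": {"error": "ToyError"}},
    ],
}


def run_case(common, case):
    """Replay one case: rebuild state via step 1, then check step 2's outcome."""
    inputs = {**common, **case["inputs"]}
    state, _msg = toy_step1(inputs["hostseckey"], inputs["random"])
    try:
        output = toy_step2(state, inputs["cmsg"])
    except ToyError as e:
        return {"error": type(e).__name__}
    return {"output": output}


results = [run_case(suite["common"], c) == c["expected"] for c in suite["test_cases"]]
```

The faulty party shows up only as the differing `cmsg` value, which makes it easy to see where the failing case departs from the honest one.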
I see, that's a reasonable design. And yes, the advantage is that we don't need serializations of the internal data structures, and this makes things easier for implementers. A disadvantage is that the test procedures that exercise the test vectors get slightly more complicated, e.g., the test for
So yes, I believe that Design 3 is a good design, and I also tend to think it's slightly better than Design 2, though it probably won't matter too much in the end.
I see your point that not testing it is closer to the single-responsibility principle, but I'd argue that testing |
I really like Design 3. Excluding state serializations in the JSON format will simplify the testing code and the vectors.
Thanks for all the suggestions! I now have some clarity on the high-level structure for the test vectors. I’ll implement a basic proof of concept (using a script) with one test vector for each API, covering various cases (valid inputs, error scenarios, etc.). Once the PoC is ready, I’ll share it for feedback before expanding to more vectors.
Ah yes, I briefly referred to Wycheproof when brainstorming Design 1 and Design 2. I’ll keep it as a reference while working on the PoC. |
I'm not entirely sure what a combined workflow would look like (I think it would end up looking pretty similar to separate API tests), but testing those APIs separately would be my first attempt.
Yes. But @real-or-random suggested to also add the ParticipantMsg1 output of participant_step1 to the test vector, which allows asserting Either way of doing it is fine for me.
Yes, in design 3 we wouldn't include the ParticipantState or CoordinatorState objects in the test vectors. |
Add test vectors for