Test vectors #35

Open · 2 tasks
jonasnick opened this issue Jul 8, 2024 · 8 comments

@jonasnick
Collaborator

Add test vectors for

  • correct DKG protocol runs
  • incorrect contributions covering all edge cases
@real-or-random
Collaborator

depends on #25

real-or-random added this to the final milestone Jul 8, 2024
@siv2r
Contributor

siv2r commented Dec 19, 2024

I’d like to help create test vectors for ChillDKG’s public APIs. Here’s my current thinking:

We need test vectors for all the public functions: hostpubkey_gen, params_id, participant_step1, participant_step2, participant_finalize, participant_blame, coordinator_step1, coordinator_finalize, coordinator_blame, and recover.

Some outputs, like ParticipantMsg1, ParticipantMsg2, CoordinatorMsg1, CoordinatorMsg2, and CoordinatorBlameMsg, must be serialized into byte arrays since they’re transmitted between coordinator and participant. Other outputs that represent states can remain as they are?

How many JSON files?

My initial plan is one JSON file per public API. If that feels too fragmented later, we can merge some APIs into a single file. Starting with one file per API is more straightforward and gives us the flexibility to combine them later if needed.
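
As a rough sketch of what one file per API could look like on the consuming side (file names, directory layout, and JSON keys here are hypothetical placeholders, not a proposal for the final format):

import json
import os

VECTOR_DIR = "vectors"  # hypothetical location, e.g. vectors/hostpubkey_gen.json

def load_vectors(api_name: str) -> dict:
    # One file per public API, named after the function it covers.
    with open(os.path.join(VECTOR_DIR, f"{api_name}.json")) as f:
        return json.load(f)

# A verifier would then loop over the cases of each file, roughly like:
#   for case in load_vectors("hostpubkey_gen")["test_cases"]:
#       ...  # call the API under test with the case's inputs and compare the outputs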

Design Options for Representing Objects

We have two main designs for our custom types in the JSON:

  1. Design 1: Represent each object as a single large hex string (a fully serialized byte array).
  2. Design 2: Use a nested JSON structure that shows all internal fields.

Trade-offs:

  • Readability:

    Design 2 is more readable and transparent about internal structures. But will implementers actually care about this readability? Won’t they just copy and paste the test vectors into their implementation to check if it works?

  • Complexity:

    With Design 2, any code using these test vectors must serialize the internal fields to match the function’s actual input/output formats. This might not be an issue, though, since developers implementing our custom type in any language will likely have serialization and deserialization functions for it (see the sketch after this list).

  • Duplication:

    Some data is duplicated across internal fields. For instance, in ParticipantState1

    • t is included in both params and simpl_state.
    • The pubkeys list in params is identical to the enckeys list in enc_state.
    • idx is repeated in params, enc_state, and simpl_state.

    Design 2 could eliminate this duplication by leaving it to the parsing code to infer these elements when creating a ParticipantState1 object. This might also be possible in Design 1 by handling it within its serialization/deserialization function.
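
To make the complexity and duplication points concrete, here is a rough sketch of how a verifier might consume the same ParticipantState1 entry under both designs; the file name, JSON keys, and the from_bytes helper are hypothetical and only for illustration:

import json

with open("vectors/participant_state1.json") as f:  # hypothetical file
    case = json.load(f)

# Design 1: the whole state is one opaque hex blob. The JSON stays flat, but the
# implementation must ship a (de)serialization routine for the state type, e.g. a
# hypothetical ParticipantState1.from_bytes():
#     state1 = ParticipantState1.from_bytes(bytes.fromhex(case["participant_state1_hex"]))

# Design 2: the JSON mirrors the internal fields, so the verifier can rebuild the
# object (or compare field by field) using its own native types, without agreeing
# on an extra serialization format:
fields = case["ParticipantState1"]
t = fields["params"]["t"]
hostpubkeys = [bytes.fromhex(pk) for pk in fields["params"]["hostpubkeys"]]
idx = fields["idx"]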


An Example (Design 2)

If we choose Design 2, where we represent objects as nested JSON structures, here’s how it might look. Consider ParticipantState1 as an example; its structure looks like this:

flowchart TB
    A[ParticipantState1]
    A --> B[params]
    B --o C1["t<br/><small>(int)</small>"]
    C1:::highlight
    B --o C2["hostpubkeys<br/><small>(bytes[n])</small>"]
    C2:::highlight
    A --o D["idx<br/><small>(int)</small>"]
    D:::highlight
    A --> E[enc_state]
    E --> F[simpl_state]
    F --o G1["t<br/><small>(int)</small>"]
    G1:::highlight
    F --o G2["n<br/><small>(int)</small>"]
    G2:::highlight
    F --o G3["idx<br/><small>(int)</small>"]
    G3:::highlight
    F --o G4["com_to_secrets<br/><small>(GE)</small>"]
    G4:::highlight
    E --o G["pubnonce<br/><small>(bytes)</small>"]
    G:::highlight
    E --o H["enckeys<br/><small>(bytes[n])</small>"]
    H:::highlight
    E --o I["idx<br/><small>(int)</small>"]
    I:::highlight

    classDef highlight stroke:#333,stroke-width:2px,stroke-dasharray: 5 5;


A naive way of representing this in JSON would look like:

{
  "ParticipantState1": {
    "params": {
      "t": 2,
      "hostpubkeys": [
        "02aabbccddeeff...",
        "02ffaabbccddee...",
        "02ccddeeaabbff..."
      ]
    },
    "idx": 1,
    "enc_state": {
      "simpl_state": {
        "t": 2,
        "n": 3,
        "idx": 1,
        "com_to_secrets": "03abcdef1234..."
      },
      "pubnonce": "02ddeeff112233...",
      "enckeys": [
        "02aabbccddeeff...",
        "02ffaabbccddee...",
        "02ccddeeaabbff..."
      ],
      "idx": 1
    }
  }
}

However, this could be optimized by [1] using a global array of host keys and referencing them by indices, and [2] eliminating repeated fields like t and idx across multiple nested objects.

"global_hostpubkeys": [
    "02aabbccddeeff...",
    "02ffaabbccddee...",
    "02ccddeeaabbff..."
]
"ParticipantState1": {
  "params": {
    "t": 2,
    "hostpubkey_indices": [0, 1, 2]
  },
  "idx": 1,
  "enc_state": {
    "simpl_state": {
      // t, n, idx are redundant (parser can infer these vals)
      "com_to_secrets": "03abcdef1234..."
    },
    "pubnonce": "02ddeeff112233...",
    // enckeys_indices = hostpubkey_indices (parser infers this list)
    // idx is redundant
  }
}
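
A parser for this de-duplicated form could then restore the omitted fields along these lines (a sketch only; it builds a plain dict because the actual ParticipantState1 constructor and field names may differ from the diagram above):

def parse_participant_state1(vec: dict, global_hostpubkeys: list) -> dict:
    # Resolve the host pubkey index references back into actual keys.
    hostpubkeys = [bytes.fromhex(global_hostpubkeys[i])
                   for i in vec["params"]["hostpubkey_indices"]]
    t = vec["params"]["t"]
    n = len(hostpubkeys)  # n is inferred rather than stored
    idx = vec["idx"]      # idx is stored once and reused below

    # t, n, idx, and enckeys are filled in by the parser instead of being repeated
    # in the JSON, which removes the duplication discussed above.
    return {
        "params": {"hostpubkeys": hostpubkeys, "t": t},
        "idx": idx,
        "enc_state": {
            "simpl_state": {
                "t": t,
                "n": n,
                "idx": idx,
                "com_to_secrets": bytes.fromhex(vec["enc_state"]["simpl_state"]["com_to_secrets"]),
            },
            "pubnonce": bytes.fromhex(vec["enc_state"]["pubnonce"]),
            "enckeys": hostpubkeys,  # identical to params.hostpubkeys
            "idx": idx,
        },
    }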

Next Steps:

  • I plan to implement (de)serialization methods for the message types; I’m not sure whether the state types need them.
  • I’d like your input on Design 1 vs. Design 2.
  • I’ll start by creating one JSON file per API and then decide if merging makes sense.
  • After that, I’ll work on code that verifies the test vectors against the specification.

I’d like to hear your thoughts on my current plan—what seems solid and what I might be getting wrong.

@jonasnick
Collaborator Author

Hey @siv2r,

I’d like to help create test vectors for ChillDKG’s public APIs.

thanks, that would be great!

Other outputs that represent states can remain as they are?

Yes, we only need to define serializations for objects that are sent between the coordinator and participants.

I’d like your input on Design 1 vs. Design 2

I think Design 2 is better than Design 1. However, there's another alternative. Right now it seems slightly better than Design 2 to me, but I'm not sure.

Design 3

The main difference from Designs 1 and 2 is that the test vectors do not contain state inputs or outputs.
Thus, exercising a single test vector can involve invoking multiple ChillDKG functions.

For example, a test vector of a successful DKG run from the participant's perspective would be a tuple

(hostseckey, session_params, random, ParticipantMsg1, CoordinatorMsg1, ParticipantMsg2, CoordinatorMsg2, DKGOutput, RecoveryData)

We would only need to define JSON serializations for session_params and DKGOutput.
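
For illustration, exercising such a success-run vector could look roughly like this (a sketch only; the participant function signatures follow the reference implementation's API as used later in this thread, and the exact signature of participant_finalize is assumed):

def test_vec_participant_dkg_success(hostseckey, session_params, random,
                                     ParticipantMsg1, CoordinatorMsg1,
                                     ParticipantMsg2, CoordinatorMsg2,
                                     DKGOutput, RecoveryData):
    # Run the honest participant through the whole session, checking each
    # transmitted message and the final outputs against the vector.
    state1, pmsg1 = participant_step1(hostseckey, session_params, random)
    assert pmsg1 == ParticipantMsg1
    state2, pmsg2 = participant_step2(hostseckey, state1, CoordinatorMsg1)
    assert pmsg2 == ParticipantMsg2
    dkg_output, recovery_data = participant_finalize(state2, CoordinatorMsg2)
    assert dkg_output == DKGOutput
    assert recovery_data == RecoveryData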

A test vector for a failing participant_step2 would be a tuple

(hostseckey, session_params, random, ParticipantMsg1, CoordinatorMsg1, Exception)

Functions without state, like hostpubkey_gen, session_params, and recover, can still be tested individually.
And there could be one test vector file per tested function as in designs 1 and 2.

I think the advantage is that such test vectors are easier to use because users don't need to concern themselves with the JSON serialization of state.
The disadvantage is that the test vectors are bigger.

However, it's possible to reduce the size of the test vectors by specifying common prefixes only once, for example like this:

{
  "prefix": [hostseckey, session_params, random, ParticipantMsg1],
  "vectors" {
    "v1": [CoordinatorMsg1, Exception],
    "v2": [CoordinatorMsg1', Exception']
  }
}
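
Exercising such a file would then just mean re-attaching the shared prefix to each per-vector suffix, for example (a sketch, assuming both parts are JSON arrays as above):

def expand_vectors(group: dict) -> list:
    # Each full test vector is the common prefix followed by the per-vector suffix.
    return [group["prefix"] + suffix for suffix in group["vectors"].values()]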

P.S.

I think experience shows that it's probably best to start by writing a script that automatically generates test vectors instead of trying to generate them ad-hoc.
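
A minimal skeleton of such a generation script might look like this (file layout and naming are placeholders; the real script would call the reference implementation to produce valid runs and then derive faulty variants from them):

import json
import os

def gen_participant_step2_vectors() -> dict:
    # Placeholder skeleton: the real script would call the reference implementation
    # with fresh host secret keys to obtain a valid CoordinatorMsg1, then record one
    # valid case plus faulty variants (e.g. with a flipped byte in CoordinatorMsg1)
    # together with the expected exception.
    cases: list = []
    return {"test_cases": cases}

if __name__ == "__main__":
    os.makedirs("vectors", exist_ok=True)
    with open(os.path.join("vectors", "participant_step2.json"), "w") as f:
        json.dump(gen_participant_step2_vectors(), f, indent=2)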

@real-or-random
Collaborator

@siv2r Great to hear that you'd like to help!

@jonasnick I'm not sure if I understand the idea of Design 3 entirely. You wrote:

A test vector for a failing participant_step2 would be a tuple

(hostseckey, session_params, random, ParticipantMsg1, CoordinatorMsg1, Exception)

Which components in this tuple are inputs, and which are expected outputs?

(My thinking: If participant_step2 is supposed to fail, then, for example, CoordinatorMsg1 must not be consistent with everything to its left in the tuple. So the test case needs to override CoordinatorMsg1 here?)

I don't see how this would work, or how we could create test cases in which one of the participants "becomes" malicious in the middle of the execution.

I think Design 2 is better than Design 1.

I agree (independently of Design 3). Parsing JSON is annoying for implementations, but enforcing serializations seems even more annoying.

@siv2r Perhaps check out https://github.com/C2SP/wycheproof/tree/master as an inspiration. It's a collection of test vectors in JSON format. It's closer to Design 2, and it comes with JSON schemas. All test cases have a human-readable note of what is tested.

However, it's possible to reduce the size of the test vectors by specifying common prefixes only once, for example like this:

I think we should prefer simplicity to smaller JSON files. Disk space on dev machines should not be a concern, and things are complex enough.

I think experience shows that it's probably best to start by writing a script that automatically generates test vectors instead of trying to generate them ad-hoc.

I fully agree.

@jonasnick
Collaborator Author

jonasnick commented Jan 7, 2025

Which components in this tuple are inputs, and which are expected outputs?

I'd imagine that the Design 3 test vector for a failing participant_step2 would be executed like this:

def test_vec_participant_step2(hostseckey, session_params, random, ParticipantMsg1, CoordinatorMsg1, expected_exception):
    # expected_exception is the "Exception" component of the test vector tuple,
    # renamed here to avoid shadowing Python's builtin Exception.
    state1, msg1 = participant_step1(hostseckey, session_params, random)
    # Since the output of participant_step1 is tested somewhere else, we could
    # omit ParticipantMsg1 from the test vector tuple (and I think we should, this was just an
    # oversight on my part).
    assert msg1 == ParticipantMsg1
    try:
        participant_step2(hostseckey, state1, CoordinatorMsg1)
    except Exception as e:
        # How exceptions are compared (e.g. by type and message) is still to be defined.
        assert e == expected_exception
        return
    assert False  # participant_step2 was expected to fail but did not raise

So everything but the exception is an input. We should probably mark inputs and outputs separately in the test vector. I was just trying to discuss the high-level idea.

So the test case needs to override CoordinatorMsg1 here?

Since we don't need coordinator state to invoke participant_step2, we don't need to run the coordinator functions.

I don't see how this would work, or how we could create test cases in which one of the participants "becomes" malicious in the middle of the executions.

Does the above example make it clearer? Design 3 only runs the honest functions of an honest participant. We model faulty participants through the messages received by the honest participant, and those messages are part of the test vector.

However, it's possible to reduce the size of the test vectors by specifying common prefixes only once, for example like this:

I think we should prefer simplicity to smaller JSON files. Disk space on dev machines should not be a concern, and things are complex enough.

In the context of Design 3, using prefixes would help understandability because users could see more easily which parts of different test vectors are the same and which parts differ.

@real-or-random
Collaborator

I see, that's a reasonable design.

And yes, the advantage is that we don't need serializations of the internal data structures, and this makes things easier for implementers. A disadvantage is that the test procedures that exercise the test vectors get slightly more complicated, e.g., the test for participant_step2 needs to run participant_step1. But I think that's a reasonable tradeoff.

So yes, I believe that Design 3 is a good design, and I also tend to think it's slightly better than Design 2, though it probably won't matter too much in the end.

# Since the output of participant_step1 is tested somewhere else, we could
# omit ParticipantMsg1 from the test vector tuple (and I think we should, this was just an
# oversight on my part).
assert msg1 == ParticipantMsg1

I see your point that not testing it is closer to the single-responsibility principle, but I'd argue that testing msg1 is better for debugging: Then, if you have a failing vector, you'll see immediately where to look for the cause (namely, in participant_step1 or in participant_step2). Moreover, I think that this assert adds only negligible complexity and overhead, so it's basically another test for free.

@siv2r
Contributor

siv2r commented Jan 9, 2025

I really like Design 3. Excluding state serializations in the JSON format will simplify the testing code and the vectors.

  • Just to confirm, we are testing participant_step1 and participant_step2 as separate APIs, not as part of a combined workflow, correct? For example (a concrete sketch of one such entry is shown after this list):
    • participant_step1 test vector:
      • Input: hostseckey, session_params, random
      • Output: ParticipantMsg1
    • participant_step2 test vector:
      • Input: hostseckey, session_params, random, CoordinatorMsg1
      • Output: ParticipantMsg2
  • And we’ll ignore the state variables these functions output (i.e., they won’t be included in the expected test vector outputs)?
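
For concreteness, a single participant_step2 entry along these lines might be shaped like the following (all field names and hex values are placeholders, with inputs and outputs marked separately as suggested above):

# Hypothetical shape of one participant_step2 test case under Design 3.
participant_step2_case = {
    "inputs": {
        "hostseckey": "7d5b...",
        "session_params": {"hostpubkeys": ["02aabbccddeeff...", "02ffaabbccddee...", "02ccddeeaabbff..."], "t": 2},
        "random": "8f13...",
        "coordinator_msg1": "a1b2c3...",
    },
    "outputs": {
        "participant_msg2": "d4e5f6...",
    },
    "comment": "valid 2-of-3 run",
}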

Thanks for all the suggestions! I now have some clarity on the high-level structure for the test vectors. I’ll implement a basic proof of concept (using a script) with one test vector for each API, covering various cases (valid inputs, error scenarios, etc.). Once the PoC is ready, I’ll share it for feedback before expanding to more vectors.

@siv2r Perhaps check out https://github.com/C2SP/wycheproof/tree/master as an inspiration.

Ah yes, I briefly referred to Wycheproof when brainstorming Design 1 and Design 2. I’ll keep it as a reference while working on the PoC.

@jonasnick
Collaborator Author

Just to confirm, we are testing participant_step1 and participant_step2 as separate APIs, not as part of a combined workflow, correct?

I'm not entirely sure what a combined workflow would look like (I think it would end up looking pretty similar to separate API tests), but testing those APIs separately would be my first attempt.

participant_step2 test vector:
Input: hostseckey, session_params, random, CoordinatorMsg1
Output: ParticipantMsg2

Yes. But @real-or-random suggested also adding the ParticipantMsg1 output of participant_step1 to the test vector, which allows asserting msg1 == ParticipantMsg1 as in my example above. The argument against that is that the tests are simplified if, in the participant_stepN test vectors, we assume the participant_stepM functions for M < N are correct and rely on separate test vectors for participant_stepM to establish their correctness. The argument for adding ParticipantMsg1 to the participant_step2 test, I think, is that it helps debugging when the test fails, because a failure could also occur when participant_step1 is not actually correct and its own test vectors do not reveal that. However, that would mean the test vectors of participant_step1 are bad.

Either way of doing it is fine for me.

And we’ll ignore the state variables these functions output (i.e., they won’t be included in the expected test vector outputs)?

Yes, in Design 3 we wouldn't include the ParticipantState or CoordinatorState objects in the test vectors.
