
Translator traits take up a lot of space #717

Open
afilini opened this issue Jul 31, 2024 · 9 comments

Comments

@afilini (Contributor) commented Jul 31, 2024

Related to #585, I have been investigating what caused the recent code-size increase, and it looks like it is due to a few different changes, one of them being the switch to use from_ast in the translator. For example, bisecting narrowed it down to this specific commit as one of the offenders: fa2e8b4

For context: when building my firmware, commenting out the body of Descriptor::at_derivation_index() and replacing it with a todo!() reduces the overall size by 160 KiB! Doing the same on an older version of miniscript (9.0.2) decreases the size by 60 KiB.

at_derivation_index() mainly just uses the translator to map keys, and it happens to be the only use of the translators in my code. That's why commenting it out lets LTO prune all that code and save a lot of space.
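To make the pattern under discussion concrete, here is a minimal sketch of the translator shape being described: a trait mapping source keys to target keys, plus a newtype that fixes a derivation index. The names (KeyTranslator, Derivator, translate_all) and the string-based "keys" are illustrative assumptions, not miniscript's real API.

```rust
// Hypothetical sketch of the translator pattern: a trait maps source keys to
// target keys, and a newtype fixes the derivation index.
trait KeyTranslator<S, T> {
    type Error;
    fn pk(&mut self, key: &S) -> Result<T, Self::Error>;
}

// Treat a key as an xpub-like path string; "deriving" appends the index.
struct Derivator(u32);

impl KeyTranslator<String, String> for Derivator {
    type Error = ();
    fn pk(&mut self, key: &String) -> Result<String, Self::Error> {
        Ok(format!("{}/{}", key, self.0))
    }
}

// Generic over the translator: every distinct concrete Tr instantiates its
// own monomorphized copy of this function.
fn translate_all<S, T, Tr: KeyTranslator<S, T>>(
    keys: &[S],
    tr: &mut Tr,
) -> Result<Vec<T>, Tr::Error> {
    keys.iter().map(|k| tr.pk(k)).collect()
}
```

Because the caller-supplied translator is a generic parameter, each use site drags in its own copy of the generic machinery, which is the code-size behavior being bisected here.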

I tried reverting that from_ast() change specifically but unfortunately it didn't make much difference. I'll keep digging but at least we now have a rough idea of where to look...

@afilini (Contributor, Author) commented Aug 1, 2024

I changed my bisect script slightly; now it also points to be07a9c (maybe the extra generic for the error type in translate_pk?).
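For intuition on why an extra error generic can matter for size: each distinct error type instantiates a fresh monomorphized copy of the whole function body. A toy sketch (the function below is illustrative, not the actual translate_pk signature):

```rust
// Each distinct E used at a call site produces a separate compiled copy of
// this entire function body. If the body is large (like translate_pk's), the
// duplication adds up.
fn translate<E>(key: u32, f: impl Fn(u32) -> Result<u32, E>) -> Result<u32, E> {
    // Imagine a large body here: generic over E, it is duplicated per error
    // type instead of compiled once against one concrete error.
    f(key)
}
```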

@apoelstra (Member)

Great find! I'll investigate this API specifically.

As I mentioned to you offline, this trait was originally designed as a "Swiss army knife" trait which I hoped to use to do every possible transformation of a Miniscript, from counting keys to lifting to semantic policies to annotating keys. But the result of this is that basically every single call that uses this trait instantiates a whole new copy of a bunch of code.

I think it would make more sense to split this into multiple traits or even to just use a collection of closures for translation.
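The closure-based alternative mentioned above could look something like the following sketch. This is purely illustrative, not a proposed miniscript API; the point is that a single `FnMut` parameter means a call site only monomorphizes the one operation it needs, rather than a whole multi-method trait impl.

```rust
// Closure-based translation: instead of implementing a multi-method trait,
// the caller passes one FnMut for the key mapping. Only the code actually
// used gets instantiated at each call site.
fn translate_keys<S, T, E>(
    keys: &[S],
    mut f: impl FnMut(&S) -> Result<T, E>,
) -> Result<Vec<T>, E> {
    keys.iter().map(|k| f(k)).collect()
}
```

A caller that only needs key substitution then writes `translate_keys(&keys, |k| derive(k))` without implementing hash- or structure-related methods it never uses.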

@apoelstra (Member)

Looking more closely, some observations:

  • The Translator trait itself is pretty tight. It has only two generics: a source keytype and a target keytype.
  • Its use in Descriptor::at_derivation_index is also pretty tight: it defines a newtype around a u32 and implements the Translator trait on it with concrete error types.
  • The at_derivation_index is also not defined on a generic Descriptor but on a specific Descriptor<DescriptorPublicKey>.

So I'm pretty surprised to see that this particular method is blowing up. Could it be that rustc is making multiple copies of the Derivator newtype and its Translator impl? What if you just move that code outside of the at_derivation_index function so that at_derivation_index itself is a one-liner?
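The refactor suggested above, sketched with hypothetical names (AtIndex, string keys): hoist the newtype and its impl to module scope so rustc emits a single copy, leaving the method itself a one-liner. This is not the actual miniscript code, just the shape of the change.

```rust
// Hoisted to module scope: rustc generates exactly one copy of this type and
// its methods, regardless of how many functions use it.
struct AtIndex(u32);

impl AtIndex {
    fn pk(&self, key: &str) -> String {
        format!("{}/{}", key, self.0)
    }
}

// The method body is now trivial; all the real work lives in one place.
fn at_derivation_index(keys: &[&str], index: u32) -> Vec<String> {
    let tr = AtIndex(index);
    keys.iter().map(|k| tr.pk(k)).collect()
}
```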

@apoelstra (Member)

Alternately, could the real blowup be that by using Translator you are pulling in crate::Error and all its bloat, whereas you otherwise wouldn't be?

@apoelstra (Member)

Ok, looking even more closely, I think the code blowup comes from the Miniscript::translate_pk function which does a giant match and pulls in all the type-checking code and all its error paths and everything, so even though it's only instantiated once inside at_derivation_index this one time is really big.

I think this would be resolved by using a context object during construction, which would let us have a slab allocator, which would let us have "direct" translation where we just replaced the keys without touching the hashes or any of the structural properties of the script.
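The "direct" translation idea can be sketched as follows: walk an already-validated AST and swap only the keys, leaving the structure untouched, so none of the type-checking or error paths need to be re-run (or even instantiated). The `Node` type below is a toy stand-in for a Miniscript AST, not the real representation.

```rust
// Toy AST: a key leaf and one combinator, enough to show the shape-preserving
// key swap.
enum Node<K> {
    Key(K),
    And(Box<Node<K>>, Box<Node<K>>),
}

impl<K> Node<K> {
    // The tree shape is preserved exactly, so any structural property already
    // proved about `self` still holds for the result -- no re-validation, no
    // error paths pulled in.
    fn map_keys<T>(self, f: &mut impl FnMut(K) -> T) -> Node<T> {
        match self {
            Node::Key(k) => Node::Key(f(k)),
            Node::And(a, b) => {
                Node::And(Box::new(a.map_keys(f)), Box::new(b.map_keys(f)))
            }
        }
    }
}
```

Note that `map_keys` is infallible and returns no `Result`, which is exactly why it avoids dragging in the error machinery that the checked constructors require.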

@apoelstra (Member)

#718

@afilini (Contributor, Author) commented Aug 1, 2024

Thank you so much for taking a look!

> Alternately, could the real blowup be that by using Translator you are pulling in crate::Error and all its bloat, whereas you otherwise wouldn't be?

Yes! In my "synthetic" test that I used for bisecting (see attached scripts below), it looks like that's where a lot of the increase comes from. In my real firmware, unfortunately, making that change doesn't have a huge impact. I'll have to investigate further; since I'm doing other miniscript operations, it may be that I already had all the Error code in my binary.

This is the setup I use to bisect:

check.sh:

```sh
#!/usr/bin/env sh

sed -i '/unused_imports/d' ./src/lib.rs

V=$(RUSTFLAGS="-C opt-level=z" cargo bloat --example profiler --filter miniscript --full-fn | grep filtered | awk '{ print $3 }' | tr -d 'B' | numfmt --from auto)

git reset --hard

echo "Total miniscript size = $V"
if [ "$V" -gt "40000" ]; then
	exit 1
fi
```

examples/profiler.rs:

```rust
fn main() {
    let desc: &miniscript::Descriptor<miniscript::descriptor::DescriptorPublicKey> =
        unsafe { std::mem::transmute(&0) };

    // use std::str::FromStr;
    // let desc: miniscript::Descriptor<miniscript::descriptor::DescriptorPublicKey> = FromStr::from_str("sh(wsh(or_d(\
    // c:pk_k(020e0338c96a8870479f2396c373cc7696ba124e8635d41b0ea581112b67817261),\
    // c:pk_k(0250863ad64a87ae8a2fe83c1af1a8403cb53f53e486d8511dad8a04887e5b2352)\
    // )))").unwrap();

    let d = desc.at_derivation_index(0);
    unsafe {
        let _ = std::ptr::read_volatile(&d as *const _);
    }
}
```

@afilini (Contributor, Author) commented Aug 1, 2024

Ok, I was able to pin down the issue in my firmware: almost all of the new translate_pk methods now go through the various constructors (for example, Pkh::new()), which internally run all the sanity checks.

Constructing those structs directly in translate_pk shaves off 40 KiB and brings the size back in line with the older version.
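The tradeoff being described can be sketched as follows, with illustrative types and names (this is not miniscript's Pkh): a checked constructor validates on every call and pulls in error machinery, while direct construction skips the check entirely, and is only sound when the input is already known-good.

```rust
struct Pkh {
    key: String,
}

impl Pkh {
    // Checked path: every caller instantiates the validation and its error
    // paths, which is where the extra code size comes from.
    fn new(key: String) -> Result<Pkh, String> {
        if key.is_empty() {
            return Err("empty key".to_string());
        }
        Ok(Pkh { key })
    }

    // Direct path: no check code instantiated. Only sound when the key is
    // already known-good, e.g. produced by translating a validated Pkh.
    fn from_checked_key(key: String) -> Pkh {
        Pkh { key }
    }
}
```

As the next comment notes, the direct path trades away the guarantee that a translation can't turn a good script into a bad one, which is why it can't simply be adopted as the fix.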

@apoelstra (Member)

Yep, makes sense. But the existing constructor mechanism is there to prevent translations from converting good scripts to bad ones, so we can't fix the codegen issue this way. We need to do something smarter.
