Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Simplify opaque node pointer casting to concrete
**Description** - Remove dependency on `sptr` by adding shim functions in the nightly module as replacement - Remove some `#[inline(always)]` complete and change all other instances to `#[inline]` - Modify `count_words` example to use nul-terminated string internally - Remove the `TaggedPointer::<T>::cast` cast function in favour of casting directly on the resulting `NonNull<T>` **Motivation** - The `sptr` crate hasn't been updated to include strict provenance methods for `NonNull`, see Gankra/sptr#17 - `#[inline(always)]` is generally too strong of a requirement, I'd rather leave it to the compiler heuristic without strong evidence to the contrary. - The `count_words` example was failing on some inputs because of words that were prefixes of other words - The `cast` function was showing up in a hot part of the call stack and it seemed inefficient to carry the data tag over for the cast for my use-case **Testing Done** `./scripts/full-test.sh nightly` callgrind benchmark results: ``` iai_callgrind::bench_clone_group::bench_clone with_prefixes:... Instructions: 19915877|20358816 (-2.17566%) [-1.02224x] L1 Hits: 26426358|26975784 (-2.03674%) [-1.02079x] L2 Hits: 53259|53262 (-0.00563%) [-1.00006x] RAM Hits: 53909|53909 (No change) Total read+write: 26533526|27082955 (-2.02869%) [-1.02071x] Estimated Cycles: 28579468|29128909 (-1.88624%) [-1.01923x] iai_callgrind::bench_clone_group::bench_clone dictionary:... Instructions: 23554976|24201112 (-2.66986%) [-1.02743x] L1 Hits: 31367457|32219796 (-2.64539%) [-1.02717x] L2 Hits: 65574|65584 (-0.01525%) [-1.00015x] RAM Hits: 64468|64467 (+0.00155%) [+1.00002x] Total read+write: 31497499|32349847 (-2.63478%) [-1.02706x] Estimated Cycles: 33951707|34804061 (-2.44901%) [-1.02510x] iai_callgrind::bench_lookup_group::bench_lookup_single first_key:... Instructions: 237|302 (-21.5232%) [-1.27426x] L1 Hits: 289|371 (-22.1024%) [-1.28374x] L2 Hits: 7|7 (No change) RAM Hits: 15|14 (+7.14286%) [+1.07143x] Total read+write: 311|392 (-20.6633%) [-1.26045x] Estimated Cycles: 849|896 (-5.24554%) [-1.05536x] iai_callgrind::bench_lookup_group::bench_lookup_single last_key:... Instructions: 311|375 (-17.0667%) [-1.20579x] L1 Hits: 373|453 (-17.6600%) [-1.21448x] L2 Hits: 0|0 (No change) RAM Hits: 20|20 (No change) Total read+write: 393|473 (-16.9133%) [-1.20356x] Estimated Cycles: 1073|1153 (-6.93842%) [-1.07456x] iai_callgrind::bench_lookup_group::bench_lookup_multiple dictionary:... Instructions: 641978|818546 (-21.5709%) [-1.27504x] L1 Hits: 822170|1042104 (-21.1048%) [-1.26750x] L2 Hits: 498|499 (-0.20040%) [-1.00201x] RAM Hits: 32|32 (No change) Total read+write: 822700|1042635 (-21.0942%) [-1.26733x] Estimated Cycles: 825780|1045719 (-21.0323%) [-1.26634x] iai_callgrind::bench_remove_group::bench_remove_single first_key:... Instructions: 11218789|12018513 (-6.65410%) [-1.07128x] L1 Hits: 15711416|16725644 (-6.06391%) [-1.06455x] L2 Hits: 64609|64600 (+0.01393%) [+1.00014x] RAM Hits: 85|93 (-8.60215%) [-1.09412x] Total read+write: 15776110|16790337 (-6.04054%) [-1.06429x] Estimated Cycles: 16037436|17051899 (-5.94927%) [-1.06326x] iai_callgrind::bench_remove_group::bench_remove_single last_key:... Instructions: 11217859|12017573 (-6.65454%) [-1.07129x] L1 Hits: 15710131|16724349 (-6.06432%) [-1.06456x] L2 Hits: 64594|64589 (+0.00774%) [+1.00008x] RAM Hits: 76|80 (-5.00000%) [-1.05263x] Total read+write: 15774801|16789018 (-6.04095%) [-1.06429x] Estimated Cycles: 16035761|17050094 (-5.94913%) [-1.06325x] iai_callgrind::bench_remove_group::bench_remove_multiple dictionary:... Instructions: 11760514|12651771 (-7.04452%) [-1.07578x] L1 Hits: 16424055|17548190 (-6.40599%) [-1.06844x] L2 Hits: 63672|63675 (-0.00471%) [-1.00005x] RAM Hits: 2012|2007 (+0.24913%) [+1.00249x] Total read+write: 16489739|17613872 (-6.38209%) [-1.06817x] Estimated Cycles: 16812835|17936810 (-6.26630%) [-1.06685x] iai_callgrind::bench_insert_group::bench_insert_single first_key:... Instructions: 11218977|12018753 (-6.65440%) [-1.07129x] L1 Hits: 15711756|16726095 (-6.06441%) [-1.06456x] L2 Hits: 64604|64598 (+0.00929%) [+1.00009x] RAM Hits: 56|66 (-15.1515%) [-1.17857x] Total read+write: 15776416|16790759 (-6.04108%) [-1.06429x] Estimated Cycles: 16036736|17051395 (-5.95059%) [-1.06327x] iai_callgrind::bench_insert_group::bench_insert_single last_key:... Instructions: 11219463|12019297 (-6.65458%) [-1.07129x] L1 Hits: 15712374|16726794 (-6.06464%) [-1.06456x] L2 Hits: 64603|64595 (+0.01238%) [+1.00012x] RAM Hits: 58|68 (-14.7059%) [-1.17241x] Total read+write: 15777035|16791457 (-6.04130%) [-1.06430x] Estimated Cycles: 16037419|17052149 (-5.95075%) [-1.06327x] iai_callgrind::bench_insert_group::bench_insert_multiple dictionary:... Instructions: 12395731|13406130 (-7.53684%) [-1.08151x] L1 Hits: 17323108|18695159 (-7.33907%) [-1.07920x] L2 Hits: 67155|67142 (+0.01936%) [+1.00019x] RAM Hits: 226|241 (-6.22407%) [-1.06637x] Total read+write: 17390489|18762542 (-7.31272%) [-1.07890x] Estimated Cycles: 17666793|19039304 (-7.20883%) [-1.07769x] iai_callgrind::bench_iterator_group::bench_full_iterator dictionary:... Instructions: 196616|196616 (No change) L1 Hits: 196621|196620 (+0.00051%) [+1.00001x] L2 Hits: 32768|32768 (No change) RAM Hits: 1|2 (-50.0000%) [-2.00000x] Total read+write: 229390|229390 (No change) Estimated Cycles: 360496|360530 (-0.00943%) [-1.00009x] iai_callgrind::bench_iterator_group::bench_prefix_iterator empty:... Instructions: 198051|198198 (-0.07417%) [-1.00074x] L1 Hits: 198408|198600 (-0.09668%) [-1.00097x] L2 Hits: 32775|32775 (No change) RAM Hits: 27|24 (+12.5000%) [+1.12500x] Total read+write: 231210|231399 (-0.08168%) [-1.00082x] Estimated Cycles: 363228|363315 (-0.02395%) [-1.00024x] iai_callgrind::bench_iterator_group::bench_prefix_iterator specific_key:... Instructions: 347|425 (-18.3529%) [-1.22478x] L1 Hits: 400|502 (-20.3187%) [-1.25500x] L2 Hits: 1|1 (No change) RAM Hits: 26|26 (No change) Total read+write: 427|529 (-19.2817%) [-1.23888x] Estimated Cycles: 1315|1417 (-7.19831%) [-1.07757x] iai_callgrind::bench_iterator_group::bench_prefix_iterator random_partial:... Instructions: 505|665 (-24.0602%) [-1.31683x] L1 Hits: 568|772 (-26.4249%) [-1.35915x] L2 Hits: 16|18 (-11.1111%) [-1.12500x] RAM Hits: 29|28 (+3.57143%) [+1.03571x] Total read+write: 613|818 (-25.0611%) [-1.33442x] Estimated Cycles: 1663|1842 (-9.71770%) [-1.10764x] iai_callgrind::bench_iterator_group::bench_fuzzy_iterator zero:... Instructions: 5113|5070 (+0.84813%) [+1.00848x] L1 Hits: 6559|6465 (+1.45398%) [+1.01454x] L2 Hits: 5|4 (+25.0000%) [+1.25000x] RAM Hits: 40|42 (-4.76190%) [-1.05000x] Total read+write: 6604|6511 (+1.42835%) [+1.01428x] Estimated Cycles: 7984|7955 (+0.36455%) [+1.00365x] iai_callgrind::bench_iterator_group::bench_fuzzy_iterator specific_key:... Instructions: 29816850|30093788 (-0.92025%) [-1.00929x] L1 Hits: 38992410|39350019 (-0.90879%) [-1.00917x] L2 Hits: 64554|64565 (-0.01704%) [-1.00017x] RAM Hits: 256|257 (-0.38911%) [-1.00391x] Total read+write: 39057220|39414841 (-0.90733%) [-1.00916x] Estimated Cycles: 39324140|39681839 (-0.90142%) [-1.00910x] iai_callgrind::bench_iterator_group::bench_range_iterator full:... Instructions: 196752|196758 (-0.00305%) [-1.00003x] L1 Hits: 196801|196811 (-0.00508%) [-1.00005x] L2 Hits: 32772|32772 (No change) RAM Hits: 12|12 (No change) Total read+write: 229585|229595 (-0.00436%) [-1.00004x] Estimated Cycles: 361081|361091 (-0.00277%) [-1.00003x] iai_callgrind::bench_iterator_group::bench_range_iterator specific_key:... Instructions: 1007|1237 (-18.5934%) [-1.22840x] L1 Hits: 1180|1484 (-20.4852%) [-1.25763x] L2 Hits: 19|20 (-5.00000%) [-1.05263x] RAM Hits: 36|33 (+9.09091%) [+1.09091x] Total read+write: 1235|1537 (-19.6487%) [-1.24453x] Estimated Cycles: 2535|2739 (-7.44797%) [-1.08047x] iai_callgrind::bench_iterator_group::bench_range_iterator middle_third:... Instructions: 66635|66884 (-0.37229%) [-1.00374x] L1 Hits: 66817|67143 (-0.48553%) [-1.00488x] L2 Hits: 10953|10954 (-0.00913%) [-1.00009x] RAM Hits: 35|34 (+2.94118%) [+1.02941x] Total read+write: 77805|78131 (-0.41725%) [-1.00419x] Estimated Cycles: 122807|123103 (-0.24045%) [-1.00241x] ``` select criterion benchmark result: ``` dict/words/full/insert/asc time: [4.2254 ms 4.2395 ms 4.2590 ms] thrpt: [76.687 MiB/s 77.040 MiB/s 77.297 MiB/s] change: time: [-11.908% -11.229% -10.646%] (p = 0.00 < 0.05) thrpt: [+11.915% +12.650% +13.518%] Performance has improved. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) high mild 4 (4.00%) high severe dict/words/full/insert/desc time: [3.2245 ms 3.2281 ms 3.2324 ms] thrpt: [101.04 MiB/s 101.18 MiB/s 101.29 MiB/s] change: time: [-16.318% -15.838% -15.435%] (p = 0.00 < 0.05) thrpt: [+18.252% +18.818% +19.500%] Performance has improved. Found 7 outliers among 100 measurements (7.00%) 5 (5.00%) high mild 2 (2.00%) high severe dict/words/full/insert/rand time: [5.1313 ms 5.1356 ms 5.1414 ms] thrpt: [63.526 MiB/s 63.598 MiB/s 63.652 MiB/s] change: time: [-11.073% -10.895% -10.724%] (p = 0.00 < 0.05) thrpt: [+12.012% +12.227% +12.452%] Performance has improved. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe ```
- Loading branch information