Yet another string type for Rust 🦀
- no copy borrow via `borrowed` (a `const` constructor) or `from_static`
- no alloc small strings (23 bytes on 64-bit platform)
- no copy owned slices
- a niche: `Option<HipStr>` and `HipStr` have the same size
- zero dependency and compatible `no_std` with `alloc`

Also byte strings, OS strings, and paths!
use hipstr::HipStr;
let simple_greetings = HipStr::from_static("Hello world");
let _clone = simple_greetings.clone(); // no copy
let user = "John";
let greetings = HipStr::from(format!("Hello {}", user));
let user = greetings.slice(6..); // no copy
drop(greetings); // the slice is owned, it exists even if greetings disappears
let chars = user.chars().count(); // "inherits" `&str` methods
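The niche listed among the features (`Option<HipStr>` and `HipStr` having the same size) can be illustrated with standard-library types alone. The following is a minimal sketch, not hipstr's own code: the helper `option_is_free` is a hypothetical name introduced here, and the types checked are std types that are known to have (or lack) a niche.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical helper: returns true when `Option<T>` takes no more space
/// than `T`, meaning the compiler found a spare bit pattern to encode `None`.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // A non-null pointer never holds the null address, so `None` can use it.
    assert!(option_is_free::<NonNull<u8>>());
    // A string slice reference also carries a never-null data pointer.
    assert!(option_is_free::<&str>());
    // Every bit pattern of `usize` is a valid value: `Option` must grow.
    assert!(!option_is_free::<usize>());
    println!("niche checks passed");
}
```

Assuming the claim in the feature list holds, the same helper applied to `HipStr` would return `true`.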
- `std` (default): uses `std` rather than `core` and `alloc`, and also provides more trait implementations (for comparison and conversions)
- `serde`: provides serialization/deserialization support with the `serde` crate
- `unstable`: exposes the internal `Backend` trait that may change at any moment
This crate makes extensive use of `unsafe` code blocks. 🤷
It leverages the 2-bit alignment niche present in pointers across most platforms (all platforms currently supported by the Rust compiler?) to discriminate between the three possible representations.
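To make the idea of a 2-bit alignment niche concrete, here is a minimal sketch of tagging the low bits of an aligned pointer. It is an illustration under stated assumptions, not hipstr's actual layout: the `Tag` variants, their bit values, and the `pack`/`unpack` helpers are all hypothetical names chosen for this example.

```rust
use std::mem::align_of;

/// Hypothetical tags for three representations (illustrative values only).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Tag {
    Inline = 0b01,
    Borrowed = 0b10,
    Allocated = 0b11,
}

/// Mask selecting the two low bits that a 4-byte-aligned address leaves free.
const TAG_MASK: usize = 0b11;

/// Packs a tag into the two low bits of a 4-byte-aligned address.
fn pack(ptr: *const u32, tag: Tag) -> usize {
    let addr = ptr as usize;
    // An alignment of at least 4 guarantees that the two low bits are zero.
    assert_eq!(addr & TAG_MASK, 0, "pointer must be 4-byte aligned");
    addr | tag as usize
}

/// Splits a packed word back into the original address and its tag.
/// Panics on the pattern 0b00, which this sketch leaves unused.
fn unpack(word: usize) -> (*const u32, Tag) {
    let tag = match word & TAG_MASK {
        0b01 => Tag::Inline,
        0b10 => Tag::Borrowed,
        0b11 => Tag::Allocated,
        _ => unreachable!("0b00 is unused in this sketch"),
    };
    ((word & !TAG_MASK) as *const u32, tag)
}

fn main() {
    assert!(align_of::<u32>() >= 4);
    let value: u32 = 42;
    let word = pack(&value, Tag::Borrowed);
    let (ptr, tag) = unpack(word);
    assert_eq!(tag, Tag::Borrowed);
    assert_eq!(ptr, &value as *const u32);
    // Reading is sound here: `ptr` is the original address of `value`.
    assert_eq!(unsafe { *ptr }, 42);
}
```

The sketch goes through integer casts in both directions, which is the kind of pointer/integer round trip that relies on exposed provenance.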
To make things safer, the crate is tested thoroughly on multiple platforms, both natively and with Miri (the MIR interpreter).
To ensure safety and reliability, this crate undergoes thorough testing:
- Near 100% test coverage
- Cross-platform validation:
  - 32-bit and 64-bit architectures
  - little and big endian
In addition, this crate is checked with advanced dynamic verification methods:
- Concurrency testing using Tokio's `loom` crate
- Undefined behavior detection using Miri (the MIR interpreter)
This crate has near full line coverage:
cargo llvm-cov --all-features --html
# or
cargo tarpaulin --all-features --out html --engine llvm
Check out the current coverage on Codecov:
In the GitHub-provided CI, `hipstr` is tested under:
- Linux
- Windows
- macOS (ARM 64-bit LE)
You can easily run the tests on various platforms with `cross`:
cross test --target s390x-unknown-linux-gnu # 64-bit BE (s390x is a 64-bit target)
cross test --target powerpc64-unknown-linux-gnu # 64-bit BE
cross test --target i686-unknown-linux-gnu # 32-bit LE
cross test --target x86_64-unknown-linux-gnu # 64-bit LE
NB: previously I used MIPS targets for big endian, but due to some LLVM-related issue they are not working anymore… see Rust issue #113065
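One plausible reason why byte order matters for a type that keeps a tag in the low bits of a pointer-sized word: the byte holding those bits sits at a different memory offset on little-endian and big-endian targets. The sketch below only demonstrates that general fact with std; it is an assumption about the motivation, not a description of hipstr's layout, and `low_byte_index` is a hypothetical helper.

```rust
use std::mem::size_of;

/// Hypothetical helper: index, within the in-memory bytes of a `usize`,
/// of the byte that holds the least significant bits.
fn low_byte_index() -> usize {
    // The value 1 has exactly one non-zero byte: the least significant one.
    let bytes = 1usize.to_ne_bytes();
    bytes
        .iter()
        .position(|&b| b == 1)
        .expect("exactly one byte of 1usize is non-zero")
}

fn main() {
    let idx = low_byte_index();
    if cfg!(target_endian = "little") {
        // Little endian: least significant byte comes first in memory.
        assert_eq!(idx, 0);
    } else {
        // Big endian: least significant byte comes last in memory.
        assert_eq!(idx, size_of::<usize>() - 1);
    }
    println!("least significant byte is at index {idx}");
}
```

Running this under `cross` with the little-endian and big-endian targets listed above should exercise both branches.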
🧵 Loom
This crate uses the `loom` crate to check the custom "Arc" implementation. To run the tests:
RUSTFLAGS='--cfg loom' cargo test --release loom
🔍 Miri
This crate runs successfully with Miri:
MIRIFLAGS=-Zmiri-symbolic-alignment-check cargo +nightly miri test
for SEED in $(seq 0 10); do
echo "Trying seed: $SEED"
MIRIFLAGS="-Zmiri-seed=$SEED" cargo +nightly miri test || { echo "Failing seed: $SEED"; break; };
done
To check with different word size and endianness:
# Big endian, 64-bit
cargo +nightly miri test --target mips64-unknown-linux-gnuabi64
# Little endian, 32-bit
cargo +nightly miri test --target i686-unknown-linux-gnu
Note: this crate leverages the "exposed provenance" semantics.
#[non_exhaustive]
Name | Thread-safe cheap-clone | Local cheap-clone | Inline | Cheap slice | Bytes | Borrow `'static` | Borrow any `'a` | Comment |
---|---|---|---|---|---|---|---|---|
hipstr | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | obviously! |
arcstr | 🟢* | ❌ | ❌ | ❌** | ❌ | 🟢 | ❌ | *use a custom thin `Arc`, **heavy slice (with dedicated substring type) |
flexstr | 🟢* | 🟢 | 🟢 | ❌ | ❌ | 🟢 | ❌ | *use an `Arc<str>` instead of an `Arc<String>` (remove one level of indirection but use fat pointers) |
imstr | 🟢 | 🟢 | ❌ | 🟢 | ❌ | ❌ | ❌ | |
faststr | 🟢 | ❌ | 🟢 | 🟢 | ❌ | 🟢 | ❌ | zero-doc with complex API |
fast-str | 🟢 | ❌ | 🟢 | 🟢 | ❌ | 🟢 | ❌ | inline repr is opt-in |
ecow | 🟢* | ❌ | 🟢 | ❌ | 🟢** | 🟢 | ❌ | *on two words only 🤤, **even any `T` |
cowstr | 🟢 | ❌ | ❌ | ❌* | ❌ | 🟢 | ❌** | *heavy slice, **contrary to its name |
compact_str | ❌ | ❌ | 🟢 | ❌ | 🟢* | ❌ | ❌ | *opt-in via `smallvec` |
inline_string | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | ❌ | |
kstring | 🟢 | ❌ | 🟢 | ❌ | ❌ | 🟢 | ❌ | |
smartstring | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | ❌ | |
smallstr | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | ❌ | |
smol_str | ❌ | ❌ | 🟢* | ❌ | ❌ | 🟢 | ❌ | *but only inline string, here for reference |
skipping specialized string types like `tinystr` (ASCII-only, bounded), or `bstr`, or `bytestring`, or...
In short, `HipStr`, one string type to rule them all 😉
While speed is not the main motivator for `hipstr`, it seems to be doing OK on that front.
See some actual benchmarks on Rust's String Rosetta.
For now, just me PoLazarus 👻
Help welcome! 🚨
MIT + Apache