Add unique and improve backends #35

Merged
merged 4 commits into from
Nov 12, 2024
5 changes: 5 additions & 0 deletions .github/workflows/basic.yml
@@ -46,6 +46,11 @@ jobs:
      - name: Run tests
        run: cargo test --verbose --profile ${{matrix.profile}} ${{matrix.features}} -- --nocapture

      - name: Run loom tests
        env:
          RUSTFLAGS: "--cfg loom"
        run: cargo test --verbose --profile ${{matrix.profile}} ${{matrix.features}} loom -- --nocapture

  fmt:
    runs-on: ubuntu-latest
    steps:
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,16 @@

Notable changes only.

## Unreleased

### Added

- add new unique (non-shared) strings and byte vectors

### Changed

- refactor the backend to support unique

## [0.6.0] - 2024-10-08

### Changed
18 changes: 10 additions & 8 deletions Cargo.toml
@@ -25,13 +25,8 @@ bstr = ["dep:bstr"]
[dev-dependencies]
fastrand = "2.0.0"
serde_test = "1.0.176"
serde = { version = "1.0.100", default-features = false, features = [
    "derive",
    "alloc",
] }
serde_json = { version = "1.0.45", default-features = false, features = [
    "alloc",
] }
serde = { version = "1.0.100", default-features = false, features = ["derive", "alloc"] }
serde_json = { version = "1.0.45", default-features = false, features = ["alloc"] }

[dependencies]
sptr = "0.3.2"
@@ -54,5 +49,12 @@ optional = true
default-features = false
features = ["alloc"]

[target.'cfg(loom)'.dependencies]
loom = "0.7"

[lints.rust]
unexpected_cfgs = { level = "warn", check-cfg = ['cfg(coverage_nightly)', 'cfg(docsrs)'] }
unexpected_cfgs = { level = "warn", check-cfg = [
    'cfg(coverage_nightly)',
    'cfg(docsrs)',
    'cfg(loom)',
] }
77 changes: 54 additions & 23 deletions README.md
@@ -34,19 +34,31 @@ let chars = user.chars().count(); // "inherits" `&str` methods

## ✏️ Features

- `std` (default): uses `std` rather than `core` and `alloc`, and also provides more trait implementations (for comparison, conversions, and errors)
- `std` (default): uses `std` rather than `core` and `alloc`, and also provides more trait implementations (for comparison and conversions)
- `serde`: provides serialization/deserialization support with `serde` crate
- `unstable`: exposes internal `Backend` trait that may change at any moment

## ☣️ Safety of `hipstr`

This crate uses `unsafe` extensively. 🤷
This crate makes extensive use of `unsafe` code blocks. 🤷

It exploits the 2-bit alignment niche in pointers existing on most platforms (I think all Rustc supported platforms) to distinguish the inline representation from the other representations.
It leverages the 2-bit alignment niche present in pointers across most platforms (all platforms currently supported by the Rust compiler?) to discriminate between the three possible representations.

To make things safer, Rust is tested thoroughly on multiple platforms, normally and with [Miri] (the MIR interpreter).
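To illustrate what a 2-bit alignment niche means in practice, here is a minimal sketch, not `hipstr`'s actual layout. The names `TAG_MASK`, `tag_ptr`, `untag`, and `round_trip` are hypothetical and exist only for this example. It assumes a pointee with 4-byte alignment (`u32`), so the two low bits of any valid address are zero and can carry a small discriminant.

```rust
/// Mask for the two low bits left free by 4-byte alignment (hypothetical encoding).
const TAG_MASK: usize = 0b11;

/// Packs a 2-bit tag into the low bits of an aligned pointer's address.
fn tag_ptr(ptr: *mut u32, tag: usize) -> usize {
    let addr = ptr as usize;
    // A `u32` is 4-byte aligned, so both low bits must be zero.
    debug_assert_eq!(addr & TAG_MASK, 0);
    debug_assert!(tag <= TAG_MASK);
    addr | tag
}

/// Splits a tagged word back into the original pointer and its tag.
fn untag(word: usize) -> (*mut u32, usize) {
    ((word & !TAG_MASK) as *mut u32, word & TAG_MASK)
}

/// Allocates a value, tags its pointer, then recovers both value and tag.
fn round_trip(value: u32, tag: usize) -> (u32, usize) {
    let raw = Box::into_raw(Box::new(value));
    let word = tag_ptr(raw, tag);
    let (ptr, recovered_tag) = untag(word);
    // SAFETY: `ptr` has the same address as `raw`, which came from
    // `Box::into_raw` and has not been freed yet.
    let boxed = unsafe { Box::from_raw(ptr) };
    (*boxed, recovered_tag)
}
```

With two bits, up to four states can be told apart, which is enough for a small number of representations sharing one word. The integer casts here are a simplification chosen for brevity.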

## 🧪 Testing
## 🧪 Testing and Verification Strategy

To ensure safety and reliability, this crate undergoes thorough testing:

- Near 100% test coverage
- Cross-platform validation:
  - 32-bit and 64-bit architectures
  - little and big endian

In addition, this crate is checked with advanced dynamic verification methods:

- Concurrency testing using [Tokio's `loom` crate][loom]
- Undefined behavior detection using [Miri] (the MIR interpreter)

### ☔ Coverage

@@ -64,6 +76,12 @@ Check out the current coverage on [Codecov]:

### 🖥️ Cross-platform testing

In the GitHub-provided CI, `hipstr` is tested under:

- Linux
- Windows
- macOS (ARM 64-bit LE)

You can easily run the test on various platforms with [`cross`]:

```bash
@@ -75,6 +93,14 @@ cross test --target x86_64-unknown-linux-gnu # 64-bit LE

NB: previously I used MIPS targets for big endian, but due to some LLVM-related issue they are not working anymore… see [Rust issue #113065](https://github.com/rust-lang/rust/issues/113065)

### 🧵 [Loom]

This crate uses the `loom` crate to check the custom "Arc" implementation. To run the tests:

```bash
RUSTFLAGS='--cfg loom' cargo test --release loom
```
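As a rough idea of the kind of invariant such a test checks, here is a sketch using only `std`. `Count` and `exactly_one_last_release` are hypothetical names and are not taken from the crate. Under loom, one would presumably swap the `std::sync` and `std::thread` imports for their `loom::sync` and `loom::thread` counterparts and run the body inside `loom::model(|| { ... })`, so that thread interleavings are explored exhaustively rather than sampled once.

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a hand-rolled atomic reference counter.
struct Count(AtomicUsize);

impl Count {
    /// Starts with a single owner.
    fn new() -> Self {
        Count(AtomicUsize::new(1))
    }

    /// Registers one more owner.
    fn incr(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    /// Releases one owner; returns `true` only for the last release.
    fn decr(&self) -> bool {
        if self.0.fetch_sub(1, Ordering::Release) == 1 {
            // Synchronize with every earlier release before cleanup.
            fence(Ordering::Acquire);
            true
        } else {
            false
        }
    }
}

/// Two owners release concurrently; returns how many of them
/// observed the final release (the invariant is: exactly one).
fn exactly_one_last_release() -> usize {
    let count = Arc::new(Count::new());
    count.incr(); // second owner
    let last = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let count = Arc::clone(&count);
            let last = Arc::clone(&last);
            thread::spawn(move || {
                if count.decr() {
                    last.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    last.load(Ordering::Relaxed)
}
```

A plain `std` run exercises only one scheduling per execution, which is why a model checker is the better fit for this kind of property.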

### 🔍 [Miri]

This crate runs successfully with Miri:
@@ -97,29 +123,28 @@ cargo +nightly miri test --target mips64-unknown-linux-gnuabi64
cargo +nightly miri test --target i686-unknown-linux-gnu
```

[Codecov]: https://app.codecov.io/gh/polazarus/hipstr
[`cross`]: https://github.com/cross-rs/cross
[Miri]: https://github.com/rust-lang/miri
Note: this crate leverages the "exposed provenance" semantics.

## 📦 Similar crates

`#[non_exhaustive]`

| Name | Thread-safe cheap-clone | Local cheap-clone | Inline | Cheap slice | Bytes | Cow<'a> | Comment |
| -------------------------------------------------------------- | ----------------------- | ----------------- | ------ | ----------- | ------ | ------- | :----------------------------------------------------------------------------------------------------- |
| `hipstr` | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | obviously! |
| [`arcstr`](https://github.com/thomcc/arcstr) | 🟢\* | ❌ | ❌ | ❌\*\* | ❌ | ❌ | \*use a custom thin `Arc`, \*\*heavy slice (with dedicated substring type) |
| [`flexstr`](https://github.com/nu11ptr/flexstr) | 🟢\* | 🟢 | 🟢 | ❌ | ❌ | ❌ | \*use an `Arc<str>` instead of an `Arc<String>` (remove one level of indirection but use fat pointers) |
| [`imstr`](https://github.com/xfbs/imstr) | 🟢 | 🟢 | ❌ | 🟢 | ❌ | ❌ | |
| [`faststr`](https://github.com/volo-rs/faststr) | 🟢 | ❌ | 🟢 | 🟢 | ❌ | ❌ | zero-doc with complex API |
| [`fast-str`](https://github.com/xxXyh1908/rust-fast-str) | 🟢 | ❌ | 🟢 | 🟢 | ❌ | ❌ | inline repr is opt-in |
| [`ecow`](https://github.com/typst/ecow) | 🟢\* | ❌ | 🟢 | ❌ | 🟢\*\* | ❌ | \*on two words only 🤤, \*\*even any `T` |
| [`cowstr`](https://git.pipapo.org/cehteh/cowstr.git) | 🟢 | ❌ | ❌ | ❌\* | ❌ | ❌\*\* | \*heavy slice, \*\*contrary to its name |
| [`compact_str`](https://github.com/parkmycar/compact_str) | ❌ | ❌ | 🟢 | ❌ | 🟢\* | ❌ | \*opt-in via `smallvec` |
| [`inline_string`](https://github.com/fitzgen/inlinable_string) | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | |
| [`smartstring`](https://github.com/bodil/smartstring) | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | |
| [`smallstr`](https://github.com/murarth/smallstr) | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | |
| [`smol_str`](https://github.com/rust-analyzer/smol_str) | ❌ | ❌ | 🟢\* | ❌ | ❌ | ❌ | \*but only inline string, here for reference |
| Name | Thread-safe cheap-clone | Local cheap-clone | Inline | Cheap slice | Bytes | Borrow `'static` | Borrow any `'a` | Comment |
| -------------------------------------------------------------- | ----------------------- | ----------------- | ------ | ----------- | ------ | ---------------- | :-------------- | ------------------------------------------------------------------------------------------------------ |
| `hipstr` | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | obviously! |
| [`arcstr`](https://github.com/thomcc/arcstr) | 🟢\* | ❌ | ❌ | ❌\*\* | ❌ | 🟢 | ❌ | \*use a custom thin `Arc`, \*\*heavy slice (with dedicated substring type) |
| [`flexstr`](https://github.com/nu11ptr/flexstr) | 🟢\* | 🟢 | 🟢 | ❌ | ❌ | 🟢 | ❌ | \*use an `Arc<str>` instead of an `Arc<String>` (remove one level of indirection but use fat pointers) |
| [`imstr`](https://github.com/xfbs/imstr) | 🟢 | 🟢 | ❌ | 🟢 | ❌ | ❌ | ❌ | |
| [`faststr`](https://github.com/volo-rs/faststr) | 🟢 | ❌ | 🟢 | 🟢 | ❌ | 🟢 | ❌ | zero-doc with complex API |
| [`fast-str`](https://github.com/xxXyh1908/rust-fast-str) | 🟢 | ❌ | 🟢 | 🟢 | ❌ | 🟢 | ❌ | inline repr is opt-in |
| [`ecow`](https://github.com/typst/ecow) | 🟢\* | ❌ | 🟢 | ❌ | 🟢\*\* | 🟢 | ❌ | \*on two words only 🤤, \*\*even any `T` |
| [`cowstr`](https://git.pipapo.org/cehteh/cowstr.git) | 🟢 | ❌ | ❌ | ❌\* | ❌ | 🟢 | ❌\*\* | \*heavy slice, \*\*contrary to its name |
| [`compact_str`](https://github.com/parkmycar/compact_str) | ❌ | ❌ | 🟢 | ❌ | 🟢\* | ❌ | ❌ | \*opt-in via `smallvec` |
| [`inline_string`](https://github.com/fitzgen/inlinable_string) | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | ❌ | |
| [`kstring`](https://docs.rs/kstring/latest/kstring/) | 🟢 | ❌ | 🟢 | ❌ | ❌ | 🟢 | ❌ | |
| [`smartstring`](https://github.com/bodil/smartstring) | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | ❌ | |
| [`smallstr`](https://github.com/murarth/smallstr) | ❌ | ❌ | 🟢 | ❌ | ❌ | ❌ | ❌ | |
| [`smol_str`](https://github.com/rust-analyzer/smol_str) | ❌ | ❌ | 🟢\* | ❌ | ❌ | 🟢 | ❌ | \*but only inline string, here for reference |

skipping specialized string types like [`tinystr`](https://github.com/unicode-org/icu4x) (ASCII-only, bounded), or `bstr`, or `bytestring`, or...

@@ -131,11 +156,17 @@ In short, `HipStr`, one string type to rule them all 😉

While speed is not the main motivator for `hipstr`, it seems to be doing OK on that front.

See some actual benchmarks on [Rust's String Rosetta](https://github.com/rosetta-rs/string-rosetta-rs).
See some actual benchmarks on [Rust's String Rosetta].

## 📖 Author and licenses

For now, just me PoLazarus 👻 \
Help welcome! 🚨

MIT + Apache

[codecov]: https://app.codecov.io/gh/polazarus/hipstr
[miri]: https://github.com/rust-lang/miri
[`cross`]: https://github.com/cross-rs/cross
[loom]: https://github.com/tokio-rs/loom
[Rust's String Rosetta]: https://github.com/rosetta-rs/string-rosetta-rs
24 changes: 17 additions & 7 deletions src/backend.rs
@@ -1,15 +1,25 @@
//! Sealed backend trait and the built-in implementations.

pub mod rc;

pub use rc::Local;
/// Shared (thread-safe) reference counted backend.
#[cfg(target_has_atomic = "ptr")]
pub use rc::ThreadSafe;
pub use crate::smart::Arc;
/// Use a local reference counted backend.
pub use crate::smart::Rc;
/// Use a unique reference.
pub use crate::smart::Unique;

#[deprecated(note = "renamed to Rc")]
pub type Local = crate::smart::Rc;

#[deprecated(note = "renamed to Arc")]
pub type ThreadSafe = crate::smart::Arc;

/// Sealed marker trait for allocated backend.
pub trait Backend: rc::Count + 'static {}
pub trait Backend: crate::smart::Kind + 'static {}

impl Backend for Local {}
impl Backend for Rc {}

#[cfg(target_has_atomic = "ptr")]
impl Backend for ThreadSafe {}
impl Backend for Arc {}

impl Backend for Unique {}
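For readers unfamiliar with the pattern, the following self-contained sketch shows how a marker trait like `Backend` can be sealed through a supertrait living in a private module. The module body, the `is_shareable` method, and the `can_share` helper are assumptions made for illustration. Only the names `Kind`, `Rc`, `Arc`, `Unique`, `Local`, and `Backend` come from the diff above, and the real `crate::smart` module surely differs.

```rust
// Private module: downstream crates cannot name `smart::Kind`,
// so they cannot implement `Backend` for their own types.
mod smart {
    /// Hypothetical stand-in for the crate-internal `Kind` trait.
    pub trait Kind {
        /// Hypothetical: whether clones may share one allocation.
        fn is_shareable() -> bool;
    }

    pub struct Rc;
    pub struct Arc;
    pub struct Unique;

    impl Kind for Rc {
        fn is_shareable() -> bool {
            true
        }
    }
    impl Kind for Arc {
        fn is_shareable() -> bool {
            true
        }
    }
    impl Kind for Unique {
        fn is_shareable() -> bool {
            false
        }
    }
}

pub use smart::{Arc, Rc, Unique};

/// Old name kept as a deprecated alias, mirroring the diff.
#[deprecated(note = "renamed to Rc")]
pub type Local = Rc;

/// Sealed marker trait: the supertrait bound does the sealing.
pub trait Backend: smart::Kind + 'static {}

impl Backend for Rc {}
impl Backend for Arc {}
impl Backend for Unique {}

/// Hypothetical helper: generic code is bounded on `Backend` only.
pub fn can_share<B: Backend>() -> bool {
    B::is_shareable()
}
```

Keeping the old names as deprecated type aliases lets existing code compile with a warning instead of breaking outright.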