Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestions for API additions #6

Open
truchi opened this issue Apr 22, 2023 · 11 comments
Open

Suggestions for API additions #6

truchi opened this issue Apr 22, 2023 · 11 comments

Comments

@truchi
Copy link

truchi commented Apr 22, 2023

Hello sir,

Good job on crop!

I was considering using it in my toy editor, however I found out looking around in the API that some things I like in ropey are missing here:

Things not in ropey I'd love to use:

  • Iterators (esp. Chunks) with offsets (see this)
  • byte <-> (line, col) (see this)

What are your thoughts about those API additions? Would you be open to PRs?

Thanks

@noib3
Copy link
Collaborator

noib3 commented Apr 22, 2023

Thanks!

Rope::is_instance: I'm open to adding this. What's your use case?

Rope::from_reader: No, this is usually highly custom to the editor (e.g. Helix doesn't use this) and Ropey's implementation is really just a placeholder;

Rope::write_to: No because a) I'd be weird to have this without having its read_from counterpart, and b) it's 4 lines long;

RopeSlice::as_str: what's your use case? In general you should just assume that a Rope{Slice} has multiple chunks, I don't see the benefit in special casing that logic. Also, unlike Ropey, crop uses gap buffers for the leaves of the tree instead of contiguous strings so your Rope could literally have 2 characters and it could still return a None;

design.md: this is something that I'd like to write but probably won't have the time to any time soon (never?). The memory layout of crop's B-tree is very similar to Ropey's, except the internal nodes use a Vec to store their children instead of a fixed size array (turns out using an array is not any faster and just introduces some more unsafe).

The B-tree's invariants are also identical, except in crop CRLFs can be split across chunks.

complex iterators: I could be open to add something like a ChunksInfos iterator that returns (bytes, lines, chunk): (usize, usize, &str) but I'm not sure it's worth the increased surface area in the API.

byte <-> (line, col) (where by col I'm assuming the byte offset in that line): No, you can easily implement an extension trait on crop::Rope that does this using Rope::line_of_byte and Rope::byte_of_line.

@truchi
Copy link
Author

truchi commented May 9, 2023

Hello

  • Rope::is_instance: it is part of my checks to skip rendering.
  • Rope::from_reader, Rope::write_to: I'll look into Helix then.
  • RopeSlice::as_str: my st*pid shaping code requires contiguous strs (for ligatures), segmentation algorithms may as well. Maybe one can implement this with the public API, without much overhead.
  • ChunksInfos: maybe I'm the only one to like this idea...
  • byte <-> (line, col) (yes): an internal implementation wouldn't be ~twice as pretty fast?

Thank you!

@noib3
Copy link
Collaborator

noib3 commented May 20, 2023

  • is_instance: I'm accepting a PR for this if you want to have a go at it, it should be simple to implement;

  • as_str: the segmentation algorithms in unicode-segmentation don't require contiguous strings. If you still want that functionality you can use the Chunks iterator: if it returns 1 chunk it's contiguous, if it returns more it isn't.

    In the second case, if you need to get a contiguous string around a chunk boundary (and you know the byte offsets) you
    can do rope.byte_slice(start..end).to_string();

  • byte <-> (line, col): yes but those methods are fast enough that it doesn't matter, it's not worth the increased API surface.

@stevefan1999-personal
Copy link
Contributor

CharIndicies would be nice, given this is what we are left to support Nom: https://doc.rust-lang.org/beta/core/str/struct.CharIndices.html

@noib3
Copy link
Collaborator

noib3 commented Oct 14, 2023

@stevefan1999-personal what is the use case for CharIndices? It could easily be implemented by 3rd parties by wrapping Chars.

@stevefan1999-personal
Copy link
Contributor

@stevefan1999-personal what is the use case for CharIndices? It could easily be implemented by 3rd parties by wrapping Chars.

I'm trying to implement nom traits for crop but I guess I will try around it

@cessen
Copy link

cessen commented Jan 1, 2024

Author of Ropey here. I just want to chime in about a couple of things.

@noib3 wrote:

Rope::from_reader: No, this is usually highly custom to the editor (e.g. Helix doesn't use this) and Ropey's implementation is really just a placeholder;

Rope::write_to: No because a) I'd be weird to have this without having its read_from counterpart, and b) it's 4 lines long;

Even in Ropey, Rope::from_reader and Rope::write_to are only intended as convenience methods for people who are knocking something out quick in less serious or throw-away projects. Neither of them are intended for serious usage, and it's expected that any serious editor project will instead roll their own solutions based on RopeBuilder and the Chunks iterator, respectively.

In particular, both of those methods are opaque, which means (among other things):

  • You can't query how much data has been read/written so far, so you can't provide any kind of progress bar for the end user with large files.
  • You can't abort them (e.g. if you're in the middle of a read/save, and the user cancels).
  • You can't process the text data as its read/written, for e.g. converting text encodings on the fly. This is critical for reading/writing non-utf8 text files.

So I think it's perfectly sensible for crop to not support these methods.

RopeSlice::as_str what's your use case?

This one actually is useful. There are various situations where if it is a contiguous string you can do something more efficient, and otherwise you fall back to less efficient code that doesn't rely on it being contiguous. This especially comes up in rendering/shaping code.

From<RopeSlice> for Cow<str> is similarly useful, but ensures that the string data is contiguous for the client code by copying it into a contiguous buffer if it's not. So it's more ergonomic, but may not be ideal if you have a fallback for split text that doesn't require copying.

Having said that, both of these can essentially be implemented by client code (though perhaps slightly less efficiently). So it's certainly not a critical API. But unlike certain other APIs in Ropey, I actually do think these two have really pulled their weight.

@wmedrano
Copy link

Can we get non-panicking variants of the functions?

@stevefan1999-personal
Copy link
Contributor

What about replace with other ropes without having to turn it into a string, i.e. rope merge?

@noib3
Copy link
Collaborator

noib3 commented Jan 29, 2024

Can we get non-panicking variants of the functions?

Yes, there's a PR open for it that I've yet to review, but I'm open to the idea.

What about replace with other ropes [..] i.e. rope merge?

Probably not, like Rope::append(&Self) this is something people very rarely (never?) need, but I'd be interested in hearing your use case.

without having to turn it into a string

You don't have to allocate to do this, you can call Rope::chunks() and call Rope::replace() multiple times.

@stevefan1999-personal
Copy link
Contributor

@noib3 but what is a chunk? I need to know this because I'm implementing a preprocessor and I have some #ifdef blocks which might overwrite contents.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants