-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatic Rustup #4034
Merged
Merged
Automatic Rustup #4034
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
coverage: Restrict empty-span expansion to only cover `{` and `}` Coverage instrumentation has some tricky code for converting a coverage-relevant `Span` into a set of start/end line/byte-column coordinates that will be embedded in the CGU's coverage metadata. A big part of this complexity is special code for handling empty spans, which are expanded into non-empty spans (if possible) because LLVM's coverage reporter does not handle empty spans well. This PR simplifies that code by restricting it to only apply in two specific situations: when the character after the empty span is `{`, or the character before the empty span is `}`. (As an added benefit, this means that the expanded spans no longer extend awkwardly beyond the end of a physical line, which was common under the previous implementation.) Along the way, this PR also removes some unhelpful code for dealing with function source code spread across multiple files. Functions currently can't have coverage spans in multiple files, and if that ever changes (e.g. to properly support expansion regions) then this code will need to be completely overhauled anyway.
Miri subtree update r? `@ghost`
…Void Update minifer version to `0.3.2` This version fixes a few lints but the main change is that it makes `clap` dependency optional since it's only used for the binary. r? `@notriddle`
Rollup of 3 pull requests Successful merges: - #132675 (coverage: Restrict empty-span expansion to only cover `{` and `}`) - #132849 (Miri subtree update) - #132858 (Update minifer version to `0.3.2`) r? `@ghost` `@rustbot` modify labels: rollup
Add Unicode block-drawing compiler output support Add nightly-only theming support to rustc output using Unicode box drawing characters instead of ASCII-art to draw the terminal UI. In order to enable, the flags `-Zunstable-options=yes --error-format=human-unicode` must be passed in. After: ``` error: foo ╭▸ test.rs:3:3 │ 3 │ X0 Y0 Z0 │ ┌───╿──│──┘ │ ┌│───│──┘ │ ┏││━━━┙ │ ┃││ 4 │ ┃││ X1 Y1 Z1 5 │ ┃││ X2 Y2 Z2 │ ┃│└────╿──│──┘ `Z` label │ ┃└─────│──┤ │ ┗━━━━━━┥ `Y` is a good letter too │ `X` is a good letter ╰╴ note: bar ╭▸ test.rs:4:3 │ 4 │ ┏ X1 Y1 Z1 5 │ ┃ X2 Y2 Z2 6 │ ┃ X3 Y3 Z3 │ ┗━━━━━━━━━━┛ ├ note: bar ╰ note: baz note: qux ╭▸ test.rs:4:3 │ 4 │ X1 Y1 Z1 ╰╴ ━━━━━━━━ ``` Before: ``` error: foo --> test.rs:3:3 | 3 | X0 Y0 Z0 | ___^__-__- | |___|__| | ||___| | ||| 4 | ||| X1 Y1 Z1 5 | ||| X2 Y2 Z2 | |||____^__-__- `Z` label | ||_____|__| | |______| `Y` is a good letter too | `X` is a good letter | note: bar --> test.rs:4:3 | 4 | / X1 Y1 Z1 5 | | X2 Y2 Z2 6 | | X3 Y3 Z3 | |__________^ = note: bar = note: baz note: qux --> test.rs:4:3 | 4 | X1 Y1 Z1 | ^^^^^^^^ ``` After: ![rustc output with unicode box drawing characters](https://github.com/rust-lang/rust/assets/1606434/d210b79a-6579-4407-9706-ba8edc6e9f25) Before: ![current rustc output with ASCII art](https://github.com/rust-lang/rust/assets/1606434/5aecccf8-a6ee-4469-8b39-72fb0d979a9f)
query/plumbing: adjust comment to reality The limit for the query key size got changed recently in rust-lang/rust@f51ec11 but the comment was not updated. Though maybe it is time to intern `CanonicalTypeOpAscribeUserTypeGoal` rather than copying it everywhere? r? `@lcnr`
target_features: explain what exacty 'implied' means here
…Gomez rustdoc-search: simplify rules for generics and type params **Heads up!**: This PR is a follow-up that depends on #124544. It adds 12dc24f46007f82b93ed85614347a42d47580afa, a change to the filtering behavior, and 9900ea48b566656fb12b5fcbd0a1b20aaa96e5ca, a minor ranking tweak. Part of rust-lang/rust-project-goals#112 This PR overturns rust-lang/rust#109802 ## Preview * no results: [`Box<[A]> -> Vec<B>`](http://notriddle.com/rustdoc-html-demo-12/search-sem-3/std/index.html?search=Box%3C%5BA%5D%3E%20-%3E%20Vec%3CB%3E) * results: [`Box<[A]> -> Vec<A>`](http://notriddle.com/rustdoc-html-demo-12/search-sem-3/std/index.html?search=Box%3C%5BA%5D%3E%20-%3E%20Vec%3CA%3E) * [`T -> U`](http://notriddle.com/rustdoc-html-demo-12/search-sem-3/std/index.html?search=T%20-%3E%20U) * [`Cx -> TyCtxt`](http://notriddle.com/rustdoc-html-demo-12/search-sem-3-compiler/rustdoc/index.html?search=Cx%20-%3E%20TyCtxt) ![image](https://github.com/user-attachments/assets/015ae28c-7469-4f7f-be03-157d28d7ec97) ## Description This commit is a response to feedback on the displayed type signatures results, by making generics act stricter. - Order within generics is significant. This means `Vec<Allocator>` now matches only with a true vector of allocators, instead of matching the second type param. It also makes unboxing within generics stricter, so `Result<A, B>` only matches if `B` is in the error type and `A` is in the success type. The top level of the function search is unaffected. - Generics are only "unboxed" if a type is explicitly opted into it. References and tuples are hardcoded to allow unboxing, and Box, Rc, Arc, Option, Result, and Future are opted in with an unstable attribute. Search result unboxing is the process that allows you to search for `i32 -> str` and get back a function with the type signature `&Future<i32> -> Box<str>`. - Instead of ranking by set overlap, it ranks by the number of items in the type signature. This makes it easier to find single type signatures like transmute. ## Find the discussion on * <https://rust-lang.zulipchat.com/#narrow/stream/393423-t-rustdoc.2Fmeetings/topic/meeting.202024-07-08/near/449965149> * <rust-lang/rust#124544 (comment)> * <https://rust-lang.zulipchat.com/#narrow/channel/266220-t-rustdoc/topic/deciding.20on.20semantics.20of.20generics.20in.20rustdoc.20search>
Only copy, rename and link `llvm-objcopy` if llvm tools are enabled Fixes #132719. cc `@bjorn3` who reported the bootstrapping problem for cg_clif. cc `@davidtwco` in case this might be problematic for linux -> macOS cross-compile builds, but seems very unlikely. cc `@albertlarsan68` (co-reviewed #131405) r? bootstrap
move all mono-time checks into their own folder, and their own query The mono item collector currently also drives two mono-time checks: the lint for "large moves", and the check whether function calls are done with all the required target features. Instead of doing this "inside" the collector, this PR refactors things so that we have a new `rustc_monomorphize::mono_checks` module providing a per-instance query that does these checks. We already have a per-instance query for the ABI checks, so this should be "free" for incremental builds. Non-incremental builds might do a bit more work now since we now have two separate MIR visits (in the collector and the mono-time checks) -- but one of them is cached in case the MIR doesn't change, which is nice. This slightly changes behavior of the large-move check since the "move_size_spans" deduplication logic now only works per-instance, not globally across the entire collector. Cc `@saethlin` since you're also doing some work related to queries and caching and monomorphization, though I don't know if there's any interaction here.
Delete the `cfg(not(parallel))` serial compiler Since it's inception a long time ago, the parallel compiler and its cfgs have been a maintenance burden. This was a necessary evil the allow iteration while not degrading performance because of synchronization overhead. But this time is over. Thanks to the amazing work by the parallel working group (and the dyn sync crimes), the parallel compiler has now been fast enough to be shipped by default in nightly for quite a while now. Stable and beta have still been on the serial compiler, because they can't use `-Zthreads` anyways. But this is quite suboptimal: - the maintenance burden still sucks - we're not testing the serial compiler in nightly Because of these reasons, it's time to end it. The serial compiler has served us well in the years since it was split from the parallel one, but it's over now. Let the knight slay one head of the two-headed dragon! #113349 Note that the default is still 1 thread, as more than 1 thread is still fairly broken. cc `@onur-ozkan` to see if i did the bootstrap field removal correctly, `@SparrowLii` on the sync parts
`#[inline]` integer parsing functions This improves the performance of `str::parse` into integers. Before: ``` compare fastest │ slowest │ median │ mean │ samples │ iters ╰─ std │ │ │ │ │ ├─ 328920585 10.23 ns │ 24.8 ns │ 10.34 ns │ 10.48 ns │ 100 │ 25600 ├─ 3255 8.551 ns │ 8.59 ns │ 8.551 ns │ 8.56 ns │ 100 │ 25600 ╰─ 5 7.847 ns │ 7.887 ns │ 7.847 ns │ 7.853 ns │ 100 │ 25600 ``` After: ``` compare fastest │ slowest │ median │ mean │ samples │ iters ╰─ std │ │ │ │ │ ├─ 328920585 8.316 ns │ 23.7 ns │ 8.355 ns │ 8.491 ns │ 100 │ 25600 ├─ 3255 4.566 ns │ 4.588 ns │ 4.586 ns │ 4.576 ns │ 100 │ 51200 ╰─ 5 2.877 ns │ 3.697 ns │ 2.896 ns │ 2.945 ns │ 100 │ 102400 ``` Benchmark: ```rust fn std(input: &str) -> Result<u64, ParseIntError> { input.parse() } ```
vectorize slice::is_sorted Benchmarks using u32 slices: | len | New | Old | |--------|----------------------|----------------------| | 2 | 1.1997 ns | 889.23 ps | | 4 | 1.6479 ns | 1.5396 ns | | 8 | 2.5764 ns | 2.5633 ns | | 16 | 5.4750 ns | 4.7421 ns | | 32 | 11.344 ns | 8.4634 ns | | 64 | 12.105 ns | 18.104 ns | | 128 | 17.263 ns | 33.185 ns | | 256 | 29.465 ns | 60.928 ns | | 512 | 48.926 ns | 116.19 ns | | 1024 | 85.274 ns | 237.91 ns | | 2048 | 160.94 ns | 469.53 ns | | 4096 | 311.60 ns | 911.43 ns | | 8192 | 615.89 ns | 2.2316 µs | | 16384 | 1.2619 µs | 3.4871 µs | | 32768 | 2.5245 µs | 6.9947 µs | | 65536 | 5.2254 µs | 15.212 µs | Seems to be a bit slower on small N but much faster on large N. Godbolt: https://rust.godbolt.org/z/Txn5MdfKn
tweak attributes for const panic macro Let's do some random mutations of this macro to see if that can re-gain the perf lost in rust-lang/rust#132542.
…rkingjubilee improve codegen of fmt_num to delete unreachable panic it seems LLVM doesn't realize that `curr` is always decremented at least once in either loop formatting characters of the input string by their appropriate radix, and so the later `&buf[curr..]` generates a check for out-of-bounds access and panic. this is unreachable in reality as even for `x == T::zero()` we'll produce at least the character `Self::digit(T::zero())`, yielding at least one character output, and `curr` will always be at least one below `buf.len()`. adjust `fmt_int` to make this fact more obvious to the compiler, which fortunately (or unfortunately) results in a measurable performance improvement for workloads heavy on formatting integers. in the program i'd noticed this in, you can see the `cmp $0x80,%rdi; ja 7c` here, which branches to a slice index fail helper: <img width="660" alt="before" src="https://github.com/rust-lang/rust/assets/4615790/ac482d54-21f8-494b-9c83-4beadc3ca0ef"> where after this change the function is broadly similar, but smaller, with one fewer registers updated in each pass through the loop in addition the never-taken `cmp/ja` being gone: <img width="646" alt="after" src="https://github.com/rust-lang/rust/assets/4615790/1bee1d76-b674-43ec-9b21-4587364563aa"> this represents a ~2-3% difference in runtime in my [admittedly comically i32-formatting-bound](https://github.com/athre0z/disas-bench/blob/master/bench/yaxpeax/src/main.rs#L58-L67) use case (printing x86 instructions, including i32 displacements and immediates) as measured on a ryzen 9 3950x. the impact on `<impl LowerHex for i8>::fmt` is both more dramatic and less impactful: it continues to have a loop that is evaluated at most twice, though the compiler doesn't know that to unroll it. the generated code there is identical to the impl for `i32`. there, the smaller loop body has less effect on runtime, and removing the never-taken slice bounds check is offset by whatever address recalculation is happening with the `lea/add/neg` at the end of the loop. it behaves about the same before and after. --- i initially measured slightly better outcomes using `unreachable_unchecked()` here instead, but that was hacking on std and rebuilding with `-Z build-std` on an older rustc (nightly 5b377cece, 2023-06-30). it does not yield better outcomes now, so i see no reason to proceed with that approach at all. <details> <summary>initial notes about that, seemingly irrelevant on modern rustc</summary> i went through a few tries at getting llvm to understand the bounds check isn't necessary, but i should mention the _best_ i'd seen here was actually from the existing `fmt_int` with a diff like ```diff if x == zero { // No more digits left to accumulate. break; }; } } + + if curr >= buf.len() { + unsafe { core::hint::unreachable_unchecked(); } + } let buf = &buf[curr..]; ``` posting a random PR to `rust-lang/rust` to do that without a really really compelling reason seemed a bit absurd, so i tried to work that into something that seems more palatable at a glance. but if you're interested, that certainly produced better (x86_64) code through LLVM. in that case with `buf.iter_mut().rev()` as the iterator, `<impl LowerHex for i8>::fmt` actually unrolls into something like ``` put_char(x & 0xf); let mut len = 1; if x > 0xf { put_char((x >> 4) & 0xf); len = 2; } pad_integral(buf[buf.len() - len..]); ``` it's pretty cool! `<impl LowerHex for i32>::fmt` also was slightly better. that all resulted in closer to an 6% difference in my use case. </details> --- i have not looked at formatters other than LowerHex/UpperHex with this change, though i'd be a bit shocked if any were _worse_. (i have absolutely _no_ idea how you'd regression test this, but that might be just my not knowing what the right tool for that would be in rust-lang/rust. i'm of half a mind that this is small and fiddly enough to not be worth landing lest it quietly regress in the future anyway. but i didn't want to discard the idea without at least offering it upstream here)
…riplett optimize char::to_digit and assert radix is at least 2 approved by t-libs: rust-lang/libs-team#475 (comment) let me know if this needs an assembly test or similar.
Rustup without diff is always suspicious... |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Please close and re-open this PR to trigger CI, then enable auto-merge.