The State of Path Mapping #22658
-
FWIW, rules_pkl is a Starlark ruleset that supports path mapping. See https://github.com/apple/rules_pkl/blob/bf3de08ea8c7f05329a2d59d9790ec8653d2a28c/pkl/private/pkl.bzl#L232 for an example.
-
Do you think C++ support will be available in the next version (7.3.0)?
-
Is there any plan to support Android?
-
Edit: Path mapping in 7.3.0 is currently broken due to race conditions in the new action deduplication functionality. See #23288.
-
I just started prototyping this and the results so far are amazing. Subtle configuration differences no longer cause duplication of super expensive actions. For me, these are generated C files and architectureless data files. Thanks for doing this! I feel like we're just beginning to recognize the impact and will find many more savings. Deduplicating executions is important for us since local executions of these duplicates often get scheduled around the same time. One more note: the ability to cap the disk cache size might be required before we roll this out to workstations.
-
@fmeum As I try to fix my rules to support path mapping, I want to find out which rules could be fixed to use it. To figure out which rules need fixing, I wanted to see all the command-line arguments of a particular target. I tried
-
We have a few rules that take a directory as an argument. Most places used this:
args.add(file.dirname)
# Updated now to
args.add_all([file], map_each = _dirname)
The above suffices in a lot of places, but becomes very quirky when you have complex rules. For example, in the protobuf rules, there are a lot of directory references passed to
-
I was testing this out with #23630 applied. I got pretty far, but I hit this issue:
Where the header here is from a rules_foreign_cc target:
It might be relevant that I access it through an alias (although removing that doesn't help):
Are you tracking this one?
-
I've been implementing path mapping in a few rulesets and I think I'm either implementing something incorrectly or I've found an issue with path mapping + multiplex worker sandboxing. I'm seeing errors where the worker doesn't get run because the executable isn't present in the sandbox. Both path mapping and multiplex sandboxing work on their own:
Here's a minimal repro case as well as the full error (in the README): https://github.com/lucidsoftware/path-mapping-bug-repro
You can repro with
-
I'm attempting to implement path mapping in a ruleset, but I'm not able to get cache hits between what should be identical actions. I'm not sure whether I'm misunderstanding how path mapping works or whether I've discovered a bug. Here's a repository that minimally reproduces the issue: The Do you know why this is happening and how I can prevent
-
What is path mapping?
With path mapping enabled, Bazel automatically rewrites paths in action command lines with the aim of making them more likely to be disk or remote cache hits.
Specifically, configuration prefixes such as bazel-out/darwin-amd64-fastbuild/bin are replaced with a fixed string such as bazel-out/cfg/bin, so that the result of an action that doesn't depend on all aspects of the configuration encoded in the path (e.g. the OS, architecture and build mode) can be shared between different target platforms and configurations.
Path mapping by itself won't result in cache hits between different execution platforms because toolchains typically differ between platforms (in content, not just paths). One way to address this is to bundle toolchains for multiple execution platforms, see "Universal toolchains". Another is to use cache key scrubbing.
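The prefix rewriting described above can be illustrated with a small sketch. This is not Bazel's implementation: the function names map_path and map_command_line are made up for illustration, and the sketch assumes that every output path has the shape bazel-out/<configuration>/<root>/... and that each argument is a bare path. It is written in the subset of Python that is also valid Starlark.

```python
# Illustrative only: a simplified model of path mapping, not Bazel's code.
# The fixed segment "cfg" is taken from the bazel-out/cfg/bin example above.
FIXED_CONFIG_SEGMENT = "cfg"

def map_path(exec_path):
    """Replaces the configuration segment of an output path with a fixed string."""
    segments = exec_path.split("/")

    # Assumed shape of output paths: bazel-out/<configuration>/<root>/...
    # Anything else (source files, flags) is returned unchanged.
    if len(segments) >= 3 and segments[0] == "bazel-out":
        segments[1] = FIXED_CONFIG_SEGMENT
    return "/".join(segments)

def map_command_line(argv):
    """Applies map_path to every argument of a command line."""
    return [map_path(arg) for arg in argv]

# The same action built in two different configurations.
mac = map_command_line(["tool", "bazel-out/darwin-amd64-fastbuild/bin/pkg/in.txt"])
linux = map_command_line(["tool", "bazel-out/k8-opt/bin/pkg/in.txt"])

# Both command lines are now identical, so they can share a cache entry.
print(mac == linux)  # prints True
```

In this simplified model, the two command lines only differ in the configuration segment, which is why they collapse to the same value after mapping.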
For more information on path mapping, watch the "Towards faster cross-platform builds" talk.
Using path mapping
Bazel has experimental support for path mapping for both native and Starlark rules as detailed below.
General requirements
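Path mapping is enabled with --experimental_output_paths=strip.
Actions need to run with one of the following execution strategies:
remote
dynamic
sandboxed
multiplex-worker with --experimental_worker_multiplex_sandboxing
worker with --worker_sandboxing and --noworker_multiplex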
Java
Java compilation actions as well as header compilation actions support path mapping.
Requirements
Since the default Java toolchain doesn't enable support for multiplex sandboxing yet, non-remote execution with path mapping requires using a custom Java toolchain:
rules_java 7.5.0 and higher have support for multiplex sandboxing.
C/C++
C++ compilation actions have experimental support for path mapping.
Requirements
Bazel 7.4.0 or higher is required, and path mapping needs to be explicitly enabled for C/C++ actions via the following flag:
--modify_execution_info=CppCompile=+supports-path-mapping,CppModuleMap=+supports-path-mapping
Go
rules_go optionally uses path mapping for its non-cgo compilation actions, including the stdlib.
Requirements
rules_go from HEAD including this commit.
Custom Starlark rules
Custom Starlark rules can opt into path mapping by adding an entry with the key supports-path-mapping to the execution_requirements dict passed to ctx.actions.run or ctx.actions.run_shell. They can also be opted in from the command line via --modify_execution_info.
Additionally, command lines need to be constructed using ctx.actions.args, avoiding any of the methods on File that return a path string. Assuming that src is a File representing a file and dir is a File representing a directory ("tree artifact"):
instead of args.add(src.path), use args.add(src);
instead of args.add(dir.path), use args.add_all([dir], expand_directories = False);
instead of args.add(src.dirname), use args.add_all with a map_each callback that returns the directory name.
Generally speaking, all functions that return a path string that may contain a configuration prefix such as bazel-out/darwin-amd64-fastbuild/bin must only be called in map_each callbacks, where they are automatically path mapped by Bazel.