volk_get_index broken, stuck in infinite loop #516
Thanks for the report and investigation! If we could come up with a system that is smarter than "iterate over a list of strings and compare", that'd be great. A couple of issues:
|
Well, ideally, both the list of implementations and the substring would already be sanitized, so
agreed.
But it needs to come with some error handling.
Agreed. I especially think the way the impls of a kernel are stored is inelegant and doesn't reflect our needs: there are separate lists for _u and _a kernels instead of one sorted-by-speed list, where the byte alignment is just an integer property of the implementation description that can be checked upon construction. In essence, a sorted list of the following struct would make more sense:

    struct impl_descriptor {
        unsigned int rank;       // from benchmarking, or 0 if not benchmarked yet
        machine_enum machine;    // e.g. GENERIC, or MMX. Don't lug around strings for comparison.
        short alignment;
        function_ptr impl;       // the actual function pointer we'll call
        char* full_machine_name; // or string or whatever, something like generic_presorting or sse4_alternative, or zeroptr if "pure" machine
        // leaves us room for future extension, e.g. a field for in-placeness.
    };

We could then have a

    struct kernel_descriptor {
        char* name; // 0-terminated
        // sorted list of impls, sorted by volk_profile rank, followed by machine, followed by alignment
        struct impl_descriptor **implementations; // 0-terminated list of pointers to impl descriptors
    };

    struct impl_descriptor* find_best_match(struct kernel_descriptor* kernel,
                                            machine_enum mach,
                                            short alignment) {
        // walk the 0-terminated list of pointers, not the descriptor memory itself
        struct impl_descriptor **current_impl = kernel->implementations;
        while (*current_impl != 0) {
            if ((*current_impl)->machine == mach && (*current_impl)->alignment <= alignment)
                return *current_impl;
            current_impl++;
        }
        return 0; // no matching implementation
    }

honestly, the whole magic would be in sorting |
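Since the proposal puts "the whole magic" in the sorting, here is a minimal sketch of what a `qsort` comparator for such a list could look like. It is not VOLK code: the enum values, the `function_ptr` typedef, the names `compare_impls` and `sort_impls`, and the reading of `rank` (1 is fastest, 0 means "not benchmarked" and sorts last) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the types named in the proposal. */
typedef enum { MACH_GENERIC = 0, MACH_SSE4, MACH_AVX2 } machine_enum;
typedef void (*function_ptr)(void);

struct impl_descriptor {
    unsigned int rank; /* assumed: 1 = fastest, 0 = not benchmarked yet */
    machine_enum machine;
    short alignment;
    function_ptr impl;
    const char* full_machine_name;
};

/* Order: benchmarked entries first (by ascending rank), then by machine
 * (higher enum value first, assumed to mean "more specialised"),
 * then by alignment (larger first). */
int compare_impls(const void* lhs, const void* rhs)
{
    const struct impl_descriptor* a = *(struct impl_descriptor* const*)lhs;
    const struct impl_descriptor* b = *(struct impl_descriptor* const*)rhs;

    if ((a->rank == 0) != (b->rank == 0))
        return (a->rank == 0) ? 1 : -1; /* unranked entries go last */
    if (a->rank != b->rank)
        return (a->rank < b->rank) ? -1 : 1;
    if (a->machine != b->machine)
        return (a->machine > b->machine) ? -1 : 1;
    if (a->alignment != b->alignment)
        return (a->alignment > b->alignment) ? -1 : 1;
    return 0;
}

/* Sorts an array of n descriptor pointers in place. */
void sort_impls(struct impl_descriptor** impls, size_t n)
{
    assert(impls != NULL || n == 0);
    if (n > 1)
        qsort(impls, n, sizeof(*impls), compare_impls);
}
```

With a list sorted this way, a lookup like `find_best_match` can return the first entry that fits, because faster implementations always come earlier.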
Regarding C vs. C++: If we want to change this, we need to discuss it and make a conscious decision. An immediate fix here might be:

    // add this if statement: if we were already looking for "generic"
    // and did not find it, give up instead of recursing again
    if (!strncmp(impl_name, "generic", 20)) {
        return -1;
    }
    // proceed
    return volk_get_index(impl_names, n_impls, "generic");

Though, it may cause issues wherever the return goes. We might just up the character limit

    if (!strncmp(impl_names[i], impl_name, 42)) // instead of 20

Or we could figure out how many characters are available in
Besides, I'd like to sketch an idea
Most users just use the function pointer, or e.g. |
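As an illustration of the "return -1 and let the caller deal with it" idea discussed in this thread, the following is a minimal, non-recursive sketch. It is not the VOLK implementation: `find_impl` and `get_index_or_generic` are hypothetical names, and the sketch swaps `strncmp` with a fixed limit for a full `strcmp`, which assumes all names are 0-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: plain linear search, -1 if the name is not listed. */
int find_impl(const char* const impl_names[], size_t n_impls, const char* impl_name)
{
    assert(impl_names != NULL && impl_name != NULL);
    for (size_t i = 0; i < n_impls; i++) {
        if (strcmp(impl_names[i], impl_name) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* Looks up impl_name and falls back to "generic" exactly once.
 * Because there is no recursion, a missing "generic" entry
 * ends in -1 instead of an endless loop. */
int get_index_or_generic(const char* const impl_names[],
                         size_t n_impls,
                         const char* impl_name)
{
    int idx = find_impl(impl_names, n_impls, impl_name);
    if (idx < 0) {
        idx = find_impl(impl_names, n_impls, "generic");
    }
    return idx; /* still -1 if nothing matched: the caller must handle it */
}
```

A caller would then check for a negative result and report the kernel name before aborting, which should make a misconfigured build much easier to diagnose than a hang.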
This function results in an infinite loop on Debian 11 for some impls. This is a first step to fix it. Fix gnuradio#516 Signed-off-by: Johannes Demel <[email protected]>
@marcusmueller Thanks for:
I'll try it as soon as I can. |
I'd like to add a few more specifics. I work in a Docker Container:
I build and run the container with:

    docker build -t debian11-bin .
    docker run -it --rm debian11-bin

In that container, I run

    # volk_profile -n -R polar
    RUN_VOLK_TESTS: volk_8u_x3_encodepolarpuppet_8u(131071,1987)
    generic completed in 1374.23 ms
    u_ssse3 completed in 732.772 ms
    u_avx2 completed in 718.598 ms
    a_ssse3 completed in 741.293 ms
    a_avx2 completed in 620.256 ms
    Best aligned arch: a_avx2
    Best unaligned arch: u_avx2
    RUN_VOLK_TESTS: volk_32f_8u_polarbutterflypuppet_32f(131071,1987)
    ^C

I have to kill the last process because nothing ever happens, except that one thread causes 100% CPU load and the fans on my machine yell at me. On the same machine, source build, but outside that container:

    volk_profile -n -R polarbutter
    RUN_VOLK_TESTS: volk_32f_8u_polarbutterflypuppet_32f(131071,1987)
    generic completed in 6013.26 ms
    u_avx completed in 894.084 ms
    u_avx2 completed in 861.248 ms
    Best aligned arch: u_avx2
    Best unaligned arch: u_avx2

So it works on an Ubuntu 20.04 host. @maitbot reports that
Run the container and build VOLK inside:

    docker build -t volk-debian11 .
    docker run -it -v "$(pwd)":/opt --rm volk-debian11
    # cd /opt
    # mkdir docker-build
    # cd docker-build
    # cmake ..
    # make -j8
    # ./apps/volk_profile -n -R polarbutter
    Warning: this IS a dry-run. Config will not be written!
    RUN_VOLK_TESTS: volk_32f_8u_polarbutterflypuppet_32f(131071,1987)
    generic completed in 5502.05 ms
    u_avx completed in 872.164 ms
    u_avx2 completed in 877.575 ms
    Best aligned arch: u_avx
    Best unaligned arch: u_avx
    Warning: this was a dry-run. Config not generated

I toggle 2 build options in my VOLK build
Specifically, I add the build flag |
Note that this is a bug, but might be a different one.
`__init_volk_8u_x2_encodeframepolar_8u` was the infinite-loop-inducing call.
Could you run into the infinite loop, attach GDB and do a backtrace? (you'll need to give
your container `SYS_PTRACE` capabilities for that)
|
Summary: I couldn't reproduce any of those issues. See below. Unfortunately, I was too quick. Toggling
without
The The only clear indicator for the infinite loop would be if your terminal is flooded with
Even with the GNU Radio CI container: I can't reproduce the issue here or gnuradio/gnuradio#5013 . |
Hm, could you attach a debugger inside the container or at least perf top on the outside? We shouldn't be guessing here, when we have the tools to get backtraces ;) |
I did a bit of digging, and this seems to be due to a problematic Debian patch: gnuradio/gnuradio#5013 (comment) |
Would it be possible to have the build fail if any kernel is missing a generic implementation? |
Actually, this would be a good requirement. It may require a bit of digging where to implement this check. Besides, thanks for your investigation. This sheds quite a bit of light on the issue. This seems to be the culprit: Debian specific patch to VOLK |
I suggest closing this issue because we could trace the source to a patch outside this repo. I hope this bug won't reproduce with later releases. |
I agree that this issue can be closed off, since it was not a bug in VOLK itself. It might still be worth improving the build process so that it fails if any kernel is missing a generic implementation, but that could be tracked in a separate issue. |
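One way the "fail if any kernel is missing a generic implementation" requirement could be checked is sketched below. This is only an assumption about how such a check might look; `struct kernel_entry` and `count_kernels_without_generic` are hypothetical names and not part of VOLK, and where the kernel table would come from is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one kernel and its implementation names. */
struct kernel_entry {
    const char* name;
    const char* const* impl_names;
    size_t n_impls;
};

/* Returns how many kernels lack an implementation named "generic". */
size_t count_kernels_without_generic(const struct kernel_entry* kernels,
                                     size_t n_kernels)
{
    size_t missing = 0;
    assert(kernels != NULL || n_kernels == 0);
    for (size_t k = 0; k < n_kernels; k++) {
        int found = 0;
        for (size_t i = 0; i < kernels[k].n_impls; i++) {
            if (strcmp(kernels[k].impl_names[i], "generic") == 0) {
                found = 1;
                break;
            }
        }
        if (!found) {
            missing++;
        }
    }
    return missing;
}
```

A small test executable run as part of the build could call this over the generated kernel table and exit with a non-zero status when the count is not zero, so a patched-out generic implementation is caught before anything is installed.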
We found the root cause of this bug outside VOLK. I'm closing this issue now. |
Unfortunately this is not entirely fixed. There are some kernels which do not have an implementation named
|
Initially I thought that asserting the presence of a Lines 29 to 32 in a26a1b8
Lines 52 to 57 in a26a1b8
|
Lines 716 to 723 in a26a1b8
|
While debugging gnuradio/gnuradio#5013, I stumbled across `volk_get_index` calling itself indefinitely when using `volk_8u_x2_encodeframepolar_8u` (at least on Deb11). No matter the reason for finding the right implementation of that kernel failing, this function mustn't go into infinite recursion.

Maybe, we should just actually `return -1;` and deal with that (by `abort`ing) in the calling functions `volk_rank_archs` and `volk.tmpl.c`:`${kern.name}_manual`. That would at least make debugging easier.

This is a tail recursion in `volk_get_index`. Sadly, it will never terminate (until it causes memory exhaustion in ctest, I guess).

`volk_get_index` is a bag of mixed emotions for me:
`strncmp(a,b,20)`: We'll happily compare equal impls that are only different after the first 20 characters.
`_generic` one. If that is the case, it just recurses to looking up the generic one, and we end up where we are.

What is confusing is why neither `volk_get_info` nor the `volk_profile` tool know of `volk_8u_x2_encodeframepolar_8u_generic`. In fact, the latter doesn't seem to want to know any implementations of that at all.

Originally posted by @marcusmueller in gnuradio/gnuradio#5013 (comment)
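To make the `strncmp(a,b,20)` concern concrete, here is a small sketch that shows how a length-limited comparison treats two names as equal when they only differ after the limit. The helper `names_match` and the two implementation names used below are invented for illustration and do not exist in VOLK.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if a and b are equal within the first `limit` characters. */
int names_match(const char* a, const char* b, size_t limit)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, limit) == 0;
}
```

For example, the hypothetical names `sse4_alternative_tuned_a` and `sse4_alternative_tuned_b` share their first 23 characters, so `names_match` reports a match with a limit of 20 but not with a limit of 42. Raising the limit only moves the threshold; a full `strcmp` on 0-terminated names avoids the issue altogether.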