Skip to content

Commit

Permalink
bytes_terminate_multi(): rewrite to a less error-prone algorithm
Browse files Browse the repository at this point in the history
After discovering that the existing implementation might index `src` out
of bounds if `src` is shorter than `term`, I realized that the whole
approach is prone to such errors and it would be wise to change the
algorithm. It's not the first time I've discovered out-of-bounds
indexing, the first time was in
kaitai-io/kaitai_struct_java_runtime@deb426e.

The main change is that the outer loop (now actually the only loop)
ensures that we never index `src` out of bounds. Ensuring that we don't
index `term` out of bounds is now a bit harder, but doable. We have to
treat the empty `term` as a special case (since the loop assumes that
`i_term = 0` is a valid index). We increment `i_term` only once each
time and immediately check if it is equal to `unit_size`
(`term.length()`). If so, we return from the function, so
`term.length()` acts as as an exclusive upper bound for `i_term` and
thus we don't get any out-of-bounds reads.
  • Loading branch information
generalmimon committed Jul 30, 2024
1 parent 57ab3fa commit 1ac7819
Showing 1 changed file with 14 additions and 11 deletions.
25 changes: 14 additions & 11 deletions kaitai/kaitaistream.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -595,19 +595,22 @@ std::string kaitai::kstream::bytes_terminate(std::string src, char term, bool in
}

std::string kaitai::kstream::bytes_terminate_multi(std::string src, std::string term, bool include) {
std::size_t len = src.length();
std::size_t unit_size = term.length();
std::size_t last_unit_start = len > unit_size ? len - unit_size : 0;
for (std::size_t i = 0; i <= last_unit_start; i += unit_size) {
bool match = true;
for (std::size_t j = 0; j < unit_size; j++) {
if (src[i + j] != term[j]) {
match = false;
break;
}
if (unit_size == 0) {
return std::string();
}
std::size_t len = src.length();
std::size_t i_term = 0;
for (std::size_t i_src = 0; i_src < len;) {
if (src[i_src] != term[i_term]) {
i_src += unit_size - i_term;
i_term = 0;
continue;
}
if (match) {
return src.substr(0, i + (include ? unit_size : 0));
i_src++;
i_term++;
if (i_term == unit_size) {
return src.substr(0, i_src - (include ? 0 : unit_size));
}
}
return src;
Expand Down

0 comments on commit 1ac7819

Please sign in to comment.