C++: Don't generate dataflow nodes for functions with summaries #18592
+52
−25
Not generating dataflow nodes for instructions/operands inside summarized callables has a big impact when using the STL, since the bodies of the instantiations are always in the database. By excluding these, we avoid generating dataflow nodes in every instantiation, which may be pretty significant if there are many instantiations.
Additionally, when we have flow through a function both through MaD (models-as-data) and through the source code, we may unnecessarily hit the field flow branch limit by duplicating flow. Staying below the field flow branch limit means we allow field flow in more functions.
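To make the duplication concrete, here's a rough sketch of the kind of code where it can show up. The names (Request, add_path, first_path) are hypothetical and not taken from any analyzed project. The assumption is that std::vector::push_back is covered by a MaD summary while the body of its instantiation is also present in the database, so a store into the vector's contents could previously be reached through two routes.

```cpp
#include <string>
#include <vector>

// Hypothetical holder type: the untrusted value ends up stored behind a
// field, so tracking it requires field flow.
struct Request {
    std::vector<std::string> paths;
};

// `user_input` stands in for an untrusted source.
// Assumption: push_back is modeled by a summary, and the body of the
// std::vector<std::string> instantiation is also in the database.
void add_path(Request &req, const std::string &user_input) {
    req.paths.push_back(user_input);
}

// Reads the stored value back out. In a real program this would be the
// value that reaches a file-opening sink.
const std::string &first_path(const Request &req) {
    return req.paths.front();
}
```

With nodes no longer generated inside the summarized body, only the summary contributes flow out of the push_back call in a function like add_path.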
DCA results look good. Here's my breakdown of them:
cpp/path-injection
We gain 1 new result on cpp/path-injection. I've confirmed that this is because we now stay below the default field flow branch limit (i.e., 2) for flow out of this call to push_back. Before, we had out flow from both the MaD summary and the source code, which put us over the limit. Now we only have the MaD summary-provided out flow, which keeps us below the threshold. So, unlike on main, field flow is now permitted in the enclosing function.
cpp/non-constant-format
We lose 60 results on SAMATE for this query. They all appear to be false positives that happen because of the generous isSource in the query, which makes us start flow at some random output parameter of a call to delete deep inside the destructor of an iterator inside libstdc++. Obviously, that's not what the query is supposed to be finding, and I doubt that any of our queries will benefit from starting flow deep inside the implementation of a MaD-summarized function.