Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create rule S7118 String methods should be used to query content instead of C apis CPP-5790 #4381

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions rules/S7118/cfamily/metadata.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
{
"title": "String methods should be used to query content instead of C apis",
"type": "CODE_SMELL",
"status": "ready",
"remediation": {
"func": "Constant\/Issue",
"constantCost": "5min"
},
"tags": [
],
"defaultSeverity": "Major",
"ruleSpecification": "RSPEC-7118",
"sqKey": "S7118",
"scope": "All",
"defaultQualityProfiles": ["Sonar way"],
"quickfix": "unknown",
"code": {
"impacts": {
"MAINTAINABILITY": "MEDIUM"
},
"attribute": "CLEAR"
}
}
212 changes: 212 additions & 0 deletions rules/S7118/cfamily/rule.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,212 @@
`std::string` methods should be preferred over operations on the result of `c_str()` or `data()` when manipulating a standard string.

== Why is this an issue?

When manipulating standard strings, it is recommended to use their methods, rather than accessing the underlying char array representation with low-level C methods. The string methods are clearer in their purpose, less error-prone, and will sometimes be more performant.

[source,cpp]
----
void func(std::string const& param1, std::string const& param2) {
int size = strlen(param1.data()); // Noncompliant
bool same = (strcmp(param1.c_str(), param2.c_str()) == 0); // Noncompliant
}
----

== How to fix it

=== size calculation

Using `strlen` to calculate the size of a string will necessitate finding its null terminator. The string already stores its own size and can return it for free.

==== Noncompliant code example

[source,cpp,diff-id=1,diff-type=noncompliant]
----
int size = strlen(string.c_str()); // Noncompliant
----

==== Compliant solution

[source,cpp,diff-id=1,diff-type=compliant]
----
int size = string.size(); // Compliant
----

=== Embedded null-terminator detection

Note that as mentioned in the "Pitfalls" part, `size()` will return the full size,
including embedded null characters. If the goal of the call to `c_str()` is to detect
the presence of embedded null-terminators, more explicit methods like `std::string::find()` or `std::string::contains()` should be used.

==== Noncompliant code example

[source,cpp,diff-id=2,diff-type=noncompliant]
----
if (string.size() != strlen(string.c_str())) // Noncompliant
{
// ...
}
----

==== Compliant solution

In C++23 `std::string::contains()` is the best solution.

[source,cpp,diff-id=2,diff-type=compliant]
----
if (string.contains('\0')) // Compliant
{
// ...
}
----

Before that, `std::string::find()` is an alternative.

[source,cpp,diff-id=2,diff-type=compliant]
----
if (string.find('\0') != std::string::npos) // Compliant
{
// ...
}
----

=== string comparison

Testing the equality between two full strings can be directly performed using the equality operator. For other comparisons, `std::string::compare` is more readable.

==== Noncompliant code example

[source,cpp,diff-id=3,diff-type=noncompliant]
----
void comparisons(std::string const& s1, std::string const& s2) {
bool equal = strcmp(s1.c_str(), s2.c_str()) == 0; // Noncompliant
int comparePart = strncmp(s1.c_str() + 2, s2.c_str() + 3, 10); // Noncompliant
}
----

==== Compliant solution

[source,cpp,diff-id=3,diff-type=compliant]
tomasz-kaminski-sonarsource marked this conversation as resolved.
Show resolved Hide resolved
----
void comparisons(std::string const& s1, std::string const& s2) {
bool equal = s1 == s2; // Compliant
int comparePart = s1.compare(2, 10, s2, 3, 10); // Compliant
}
----

With C++17 and higher, substring comparison can also be done through `std::string_view`.

[source,cpp]
----
auto comparePart = std::string_view(s1).substr(2, 10) <=> std::string_view(s2).substr(3, 10);
----

=== Searching a substring

Either searching a substring or a single character inside a string can be done with `std::string::find()`.

==== Noncompliant code example

[source,cpp,diff-id=4,diff-type=noncompliant]
----
void search(std::string const& string) {
char const* sub = strstr(string.c_str(), "substring"); // Noncompliant
char const* dot = strchr(string.c_str(), '.'); // Noncompliant
// ...
}
----

==== Compliant solution

[source,cpp,diff-id=4,diff-type=compliant]
----
void search(std::string const& string) {
size_t sub = string.find("substring"); // Compliant
size_t dot = string.find('.'); // Compliant
// ...
}
----

=== Searching one of many characters

Four methods allow searching for any character from a list inside a string: `find_first_of`, `find_first_not_of`, `find_last_of`, and `find_last_not_of`. They are better alternatives than `strspn`, `strcspn`, and `strpbrk` when manipulating strings.

==== Noncompliant code example

[source,cpp,diff-id=5,diff-type=noncompliant]
----
void search(std::string const& string) {
size_t firstNonNumeric = strspn(string.c_str(), "0123456789"); // Noncompliant
size_t firstSeparator = strcspn(string.c_str(), ",. ;"); // Noncompliant
// ...
}
----

==== Compliant solution

[source,cpp,diff-id=5,diff-type=compliant]
----
void search(std::string const& string) {
size_t firstNonNumeric = string.find_first_not_of("0123456789"); // Compliant
size_t firstSeparator = string.find_first_of(",. ;"); // Compliant
// ...
}
----
tomasz-kaminski-sonarsource marked this conversation as resolved.
Show resolved Hide resolved

=== Pitfalls

Using string methods instead of the result of `c_str()` (or `data()`) will have different results when the string contains embedded null terminators in its content. Operations on `c_str()` will stop at the first null-terminator while string methods will apply to the whole content.

Most of the time, operating on the whole content is the intended behavior, but in cases where it isn't, using methods might be less straightforward. It is still recommended to fix the code, to make this pitfall clear to other maintainers.

If the goal is to only keep the first part, the string can be cropped in place using `std::string::resize()` or `std::string::erase()`.
Calling `strlen(string.c_str())` inside a call to resize or erase is compliant by exception because it is the simplest way to crop a string like that.
[source,cpp]
----
string.resize(strlen(string.c_str())); // Compliant by exception
----

Otherwise, if working with {cpp}17 or higher and if the goal is to just query the content, a good approach is to create a `std::string_view` and use its methods, which mirror those of `std::string`.

[source,cpp]
----
std::string_view beginning{string.c_str()};
// Use string_view methods afterward
// ...
----

Otherwise, if working with older {cpp} versions or if the resulting string will be modified, doing a full copy also works.

[source,cpp]
----
std::string beginning{string.c_str()};
// Use beginning and its methods afterward
// ...
----

=== Going the extra mile

Direct replacement of one C function with the corresponding string method may not lead to the best code overall. When replacing a C function call, you should consider what overall functionality this call is part of and whether your {cpp} version allows to implement that functionality more easily and effectively.

For example using ranges (`std::ranges::views::split`, `std::ranges::views::chunk`), using algorithms, using `std::istringstream` or `std::ispanstream`, using `std::string_view`.

An example of a function splitting words in a moddern way could be as follow.

[source,cpp]
----
void treatWord(std::string_view word);

void treatAllWords(std::string_view text) {
constexpr std::string_view delim{" "};
for (const auto word : std::views::split(text, delim))
treatWord(std::string_view{word});
}
----

tomasz-kaminski-sonarsource marked this conversation as resolved.
Show resolved Hide resolved
== Resources

=== Documentation

* {cpp} reference - https://en.cppreference.com/w/cpp/string/basic_string[Native functions for std::string]

* {cpp} reference - https://en.cppreference.com/w/cpp/ranges#Range_adaptors[ranges documentation]
2 changes: 2 additions & 0 deletions rules/S7118/metadata.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
{
}
Loading