Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go: add tests for dataflow relating to type aliasing #17400

Merged

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Sep 6, 2024

No description provided.

@smowton smowton requested a review from a team as a code owner September 6, 2024 16:07
@github-actions github-actions bot added the Go label Sep 6, 2024
mbg
mbg previously approved these changes Sep 6, 2024
Copy link
Member

@mbg mbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable. I have a couple of suggestions and one question, but nothing that blocks us from merging this.

Comment on lines 11 to 17
// Note trickery with packages is necessary so that Go will assign the fields the same name as well as the same type.

type EmbedsPkg1IntStruct = struct{ pkg1.IntStruct }
type EmbedsPkg2IntStruct = struct{ pkg2.IntStruct }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this part probably warrants a bit more explanation. What would happen without the package trickery? The comment implies that they will have different names and different types, but it's not immediately clear why that is or how it affects the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a very artificial test, since I'm not sure why a package would define an alias without changing the name (just to avoid mentioning the original package?). More likely might be two packages with aliases with the same name for the same type. Would that test the same thing? Or be a separate test worth adding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice this happens because an impl package of some sort defines a type, then that type gets exposed, so you get something like

package impl
...
type IntStruct struct { x int }

type EmbedsIntStruct = struct { IntStruct }

...

package public
...
type IntStruct = impl.IntStruct

type EmbedsIntStruct = struct { IntStruct }

The promotion of the impl type to a public one makes sense. In the examples I saw I didn't see a particular reason to redefine the same type alias involving embedding twice in two different packages; probably just an accident that nobody had happened to rationalise. But the result was that there were two identical types that used different surface types for an embedded field, and we need to make sure our dataflow lib knows they are identical, so writes through one should connect with reads through the other. The specific change needed was in hasOwnField, which synthesises the implicit field's name.

@owen-mc
Copy link
Contributor

owen-mc commented Sep 10, 2024

This should be retargeted to main, I think.

owen-mc
owen-mc previously approved these changes Sep 10, 2024
@smowton smowton dismissed stale reviews from owen-mc and mbg via 2d5cbfd September 17, 2024 16:19
@smowton smowton force-pushed the smowton/admin/further-golang-aliasing-tests branch from 706f09f to 2d5cbfd Compare September 17, 2024 16:19
@smowton smowton requested review from a team as code owners September 17, 2024 16:19
@smowton smowton changed the base branch from rc/3.15 to main September 17, 2024 16:19
@smowton smowton removed request for a team September 17, 2024 16:20
Copy link
Contributor

QHelp previews:

cpp/ql/src/experimental/Security/CWE/CWE-409/DecompressionBombs.qhelp

User-controlled file decompression

Extracting Compressed files with any compression algorithm like gzip can cause denial of service attacks.

Attackers can compress a huge file consisting of repeated similiar bytes into a small compressed file.

Recommendation

When you want to decompress a user-provided compressed file you must be careful about the decompression ratio or read these files within a loop byte by byte to be able to manage the decompressed size in each cycle of the loop.

Example

Reading an uncompressed Gzip file within a loop and check for a threshold size in each cycle.

#include "zlib.h"

void SafeGzread(gzFile inFileZ) {
    const int MAX_READ = 1024 * 1024 * 4;
    const int BUFFER_SIZE = 8192;
    unsigned char unzipBuffer[BUFFER_SIZE];
    unsigned int unzippedBytes;
    unsigned int totalRead = 0;
    while (true) {
        unzippedBytes = gzread(inFileZ, unzipBuffer, BUFFER_SIZE);
        totalRead += unzippedBytes;
        if (unzippedBytes <= 0) {
            break;
        }

        if (totalRead > MAX_READ) {
            // Possible decompression bomb, stop processing.
            break;
        } else {
            // process buffer
        }
    }
}

The following example is unsafe, as we do not check the uncompressed size.

#include "zlib.h"

void UnsafeGzread(gzFile inFileZ) {
    const int BUFFER_SIZE = 8192;
    unsigned char unzipBuffer[BUFFER_SIZE];
    unsigned int unzippedBytes;
    while (true) {
        unzippedBytes = gzread(inFileZ, unzipBuffer, BUFFER_SIZE);
        if (unzippedBytes <= 0) {
            break;
        }

        // process buffer
    }
}

References

javascript/ql/src/Security/CWE-312/ActionsArtifactLeak.qhelp

Storage of sensitive information in GitHub Actions artifact

Sensitive information included in a GitHub Actions artifact can allow an attacker to access the sensitive information if the artifact is published.

Recommendation

Only store information that is meant to be publicly available in a GitHub Actions artifact.

Example

The following example uses actions/checkout to checkout code which stores the GITHUB_TOKEN in the `.git/config` file and then stores the contents of the `.git` repository into the artifact:

name: secrets-in-artifacts
on:
  pull_request:
jobs:
  a-job: # VULNERABLE
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: "Upload artifact"
        uses: actions/upload-artifact@1746f4ab65b179e0ea60a494b83293b640dd5bba # v4.3.2
        with:
          name: file
          path: .

The issue has been fixed below, where the actions/upload-artifact uses a version (v4+) which does not include hidden files or directories into the artifact.

name: secrets-in-artifacts
on:
  pull_request:
jobs:
  a-job: # NOT VULNERABLE
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: "Upload artifact"
        uses: actions/upload-artifact@v4
        with:
          name: file
          path: .

python/ql/src/experimental/Security/CWE-346/CorsBypass.qhelp

Cross Origin Resource Sharing(CORS) Policy Bypass

Cross-origin resource sharing policy may be bypassed due to incorrect checks like the string.startswith call.

Recommendation

Use a more stronger check to test for CORS policy bypass.

Example

Most Python frameworks provide a mechanism for testing origins and performing CORS checks. For example, consider the code snippet below, origin is compared using a startswith call against a list of whitelisted origins. This check can be bypassed easily by origin like domain.com.baddomain.com

import cherrypy

def bad():
    request = cherrypy.request
    validCors = "domain.com"
    if request.method in ['POST', 'PUT', 'PATCH', 'DELETE']:
        origin = request.headers.get('Origin', None)
        if origin.startswith(validCors):
            print("Origin Valid")

This can be prevented by comparing the origin in a manner shown below.

import cherrypy

def good():
    request = cherrypy.request
    validOrigin = "domain.com"
    if request.method in ['POST', 'PUT', 'PATCH', 'DELETE']:
        origin = request.headers.get('Origin', None)
        if origin == validOrigin:
            print("Origin Valid")

References

  • PortsSwigger : Cross-origin resource sharing (CORS)
  • Related CVE: CVE-2022-3457.
swift/ql/src/queries/Security/CWE-020/IncompleteHostnameRegex.qhelp

Incomplete regular expression for hostnames

Sanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts.

If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the . meta-characters appropriately. Even if the check is not used in a security-critical context, the incomplete check may still cause undesirable behaviors when it accidentally succeeds.

Recommendation

Escape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the . meta-character.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains.


func handleUrl(_ urlString: String) {
    // get the 'url=' parameter from the URL
    let components = URLComponents(string: urlString)
    let redirectParam = components?.queryItems?.first(where: { $0.name == "url" })

    // check we trust the host
    let regex = #/^(www|beta).example.com//#  // BAD
    if let match = redirectParam?.value?.firstMatch(of: regex) {
        // ... trust the URL ...
    }
}

The check is, however, easy to bypass because the unescaped . allows for any character before example.com, effectively allowing the redirect to go to an attacker-controlled domain such as wwwXexample.com.

Address this vulnerability by escaping . to \.:


func handleUrl(_ urlString: String) {
    // get the 'url=' parameter from the URL
    let components = URLComponents(string: urlString)
    let redirectParam = components?.queryItems?.first(where: { $0.name == "url" })

    // check we trust the host
    let regex = #/^(www|beta)\.example\.com//#  // GOOD
    if let match = redirectParam?.value?.firstMatch(of: regex) {
        // ... trust the URL ...
    }
}

References

swift/ql/src/queries/Security/CWE-116/BadTagFilter.qhelp

Bad HTML filtering regexp

It is possible to match some single HTML tags using regular expressions (parsing general HTML using regular expressions is impossible). However, if the regular expression is not written well, it might be possible to circumvent it. This can lead to cross-site scripting or other security issues.

Some of these mistakes are caused by browsers having very forgiving HTML parsers, and will often render invalid HTML containing syntax errors. Regular expressions that attempt to match HTML should also recognize tags containing such syntax errors.

Recommendation

Use a well-tested sanitization or parser library if at all possible. These libraries are much more likely to handle corner cases correctly than a custom implementation.

Example

The following example attempts to filters out all <script> tags.

let script_tag_regex = /<script[^>]*>.*<\/script>/

var old_html = ""
while (html != old_html) {
  old_html = html
  html.replace(script_tag_regex, with: "")
}

...

The above sanitizer does not filter out all <script> tags. Browsers will not only accept </script> as script end tags, but also tags such as </script foo="bar"> even though it is a parser error. This means that an attack string such as <script>alert(1)</script foo="bar"> will not be filtered by the function, and alert(1) will be executed by a browser if the string is rendered as HTML.

Other corner cases include HTML comments ending with --!>, and HTML tag names containing uppercase characters.

References

swift/ql/src/queries/Security/CWE-1333/ReDoS.qhelp

Inefficient regular expression

Some regular expressions take a long time to match certain input strings to the point where the time it takes to match a string of length n is proportional to nk or even 2n. Such regular expressions can negatively affect performance, and potentially allow a malicious user to perform a Denial of Service ("DoS") attack by crafting an expensive input string for the regular expression to match.

The regular expression engine used by Swift uses backtracking non-deterministic finite automata to implement regular expression matching. While this approach is space-efficient and allows supporting advanced features like capture groups, it is not time-efficient in general. The worst-case time complexity of such an automaton can be polynomial or exponential, meaning that for strings of a certain shape, increasing the input length by ten characters may make the automaton about 1000 times slower.

Typically, a regular expression is affected by this problem if it contains a repetition of the form r* or r+ where the sub-expression r is ambiguous in the sense that it can match some string in multiple ways. More information about the precise circumstances can be found in the references.

Recommendation

Modify the regular expression to remove the ambiguity, or ensure that the strings matched with the regular expression are short enough that the time complexity does not matter.

Example

Consider the following regular expression:


/^_(__|.)+_$/

Its sub-expression "(__|.)+" can match the string "__" either by the first alternative "__" to the left of the "|" operator, or by two repetitions of the second alternative "." to the right. Therefore, a string consisting of an odd number of underscores followed by some other character will cause the regular expression engine to run for an exponential amount of time before rejecting the input.

This problem can be avoided by rewriting the regular expression to remove the ambiguity between the two branches of the alternative inside the repetition:


/^_(__|[^_])+_$/

References

Copy link
Contributor

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged.

Click to show differences in coverage

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    Java Standard Library,``java.*``,10,4264,259,99,,9,,,26
+    Java Standard Library,``java.*``,10,4284,259,99,,9,,,26
-    Java extensions,"``javax.*``, ``jakarta.*``",69,3257,90,10,4,2,1,1,4
+    Java extensions,"``javax.*``, ``jakarta.*``",69,3260,90,10,4,2,1,1,4
-    Totals,,312,25147,2635,404,16,128,33,1,409
+    Totals,,312,25170,2635,404,16,128,33,1,409
  • Changes to framework-coverage-java.csv:
- java.net,23,3,278,,,,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,21,,,,,,,,,,,,,,3,274,4
+ java.net,23,3,279,,,,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,21,,,,,,,,,,,,,,3,275,4
- java.nio,47,,361,,,,,,,,,5,,,,,,,,,,,,,,,41,,,,,,,,,1,,,,,,,,,,,,,,,259,102
+ java.nio,47,,373,,,,,,,,,5,,,,,,,,,,,,,,,41,,,,,,,,,1,,,,,,,,,,,,,,,267,106
- java.security,21,,543,,,11,10,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,539,4
+ java.security,21,,547,,,11,10,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,543,4
- java.util,48,2,1218,,,,,,,,,1,,,,,,,,,,,34,,,,3,,,,5,2,,1,2,,,,,,,,,,,,,2,,,704,514
+ java.util,48,2,1221,,,,,,,,,1,,,,,,,,,,,34,,,,3,,,,5,2,,1,2,,,,,,,,,,,,,2,,,705,516
- javax.management,2,,799,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,798,1
+ javax.management,2,,802,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,801,1

go

Generated file changes for go

  • Changes to framework-coverage-go.csv:
- crypto,,,1,,,,,,,,,,,,,,,,,1,
+ crypto,,,10,,,,,,,,,,,,,,,,,10,
- crypto/cipher,,,3,,,,,,,,,,,,,,,,,3,
- crypto/rsa,,,2,,,,,,,,,,,,,,,,,2,
- crypto/tls,,,3,,,,,,,,,,,,,,,,,3,
- crypto/x509,,,1,,,,,,,,,,,,,,,,,1,
- database/sql,,,7,,,,,,,,,,,,,,,,,7,
+ database/sql,,,11,,,,,,,,,,,,,,,,,11,
- database/sql/driver,,,4,,,,,,,,,,,,,,,,,4,
- encoding,,,4,,,,,,,,,,,,,,,,,4,
- encoding/ascii85,,,2,,,,,,,,,,,,,,,,,2,
- encoding/asn1,,,8,,,,,,,,,,,,,,,,,8,
- encoding/base32,,,3,,,,,,,,,,,,,,,,,3,
- encoding/base64,,,3,,,,,,,,,,,,,,,,,3,
- encoding/binary,,,2,,,,,,,,,,,,,,,,,2,
- encoding/csv,,,5,,,,,,,,,,,,,,,,,5,
- encoding/gob,,,7,,,,,,,,,,,,,,,,,7,
+ encoding,,,77,,,,,,,,,,,,,,,,,77,
- encoding/hex,,,3,,,,,,,,,,,,,,,,,3,
- encoding/json,,,14,,,,,,,,,,,,,,,,,14,
- encoding/pem,,,3,,,,,,,,,,,,,,,,,3,
- encoding/xml,,,23,,,,,,,,,,,,,,,,,23,
- github.com/astaxie/beego,5,6,7,,,,4,,,,,,1,,,,,,6,7,
+ github.com/astaxie/beego,7,21,21,,,,5,,,,,,2,,,,,,21,21,
- github.com/astaxie/beego/context,2,15,1,,,,1,,,,,,1,,,,,,15,1,
- github.com/astaxie/beego/utils,,,13,,,,,,,,,,,,,,,,,13,
- github.com/beego/beego,5,6,7,,,,4,,,,,,1,,,,,,6,7,
- github.com/beego/beego/context,2,15,1,,,,1,,,,,,1,,,,,,15,1,
+ github.com/beego/beego,14,42,42,,,,10,,,,,,4,,,,,,42,42,
- github.com/beego/beego/core/utils,,,13,,,,,,,,,,,,,,,,,13,
- github.com/beego/beego/server/web,5,6,7,,,,4,,,,,,1,,,,,,6,7,
- github.com/beego/beego/server/web/context,2,15,1,,,,1,,,,,,1,,,,,,15,1,
- github.com/beego/beego/utils,,,13,,,,,,,,,,,,,,,,,13,
- github.com/go-jose/go-jose,2,,,,2,,,,,,,,,,,,,,,,
- github.com/go-jose/go-jose/jwt,1,,4,,,1,,,,,,,,,,,,,,4,
+ github.com/go-jose/go-jose,3,,4,,2,1,,,,,,,,,,,,,,4,
- github.com/lestrrat-go/jwx,1,,,,1,,,,,,,,,,,,,,,,
+ github.com/lestrrat-go/jwx,2,,,,2,,,,,,,,,,,,,,,,
- github.com/lestrrat-go/jwx/jwk,1,,,,1,,,,,,,,,,,,,,,,
- github.com/square/go-jose,2,,,,2,,,,,,,,,,,,,,,,
- github.com/square/go-jose/jwt,1,,4,,,1,,,,,,,,,,,,,,4,
+ github.com/square/go-jose,3,,4,,2,1,,,,,,,,,,,,,,4,
- gopkg.in/go-jose/go-jose,2,,,,2,,,,,,,,,,,,,,,,
- gopkg.in/go-jose/go-jose/jwt,1,,4,,,1,,,,,,,,,,,,,,4,
+ gopkg.in/go-jose/go-jose,3,,4,,2,1,,,,,,,,,,,,,,4,
- gopkg.in/square/go-jose,2,,,,2,,,,,,,,,,,,,,,,
- gopkg.in/square/go-jose/jwt,1,,4,,,1,,,,,,,,,,,,,,4,
+ gopkg.in/square/go-jose,3,,4,,2,1,,,,,,,,,,,,,,4,
- html,,,2,,,,,,,,,,,,,,,,,2,
+ html,,,8,,,,,,,,,,,,,,,,,8,
- html/template,,,6,,,,,,,,,,,,,,,,,6,
- io,,,19,,,,,,,,,,,,,,,,,19,
- io/fs,,3,13,,,,,,,,,,,,,,,3,,13,
- io/ioutil,5,1,2,,,,5,,,,,,,,,,,1,,2,
+ io,5,4,34,,,,5,,,,,,,,,,,4,,34,
- mime,,,5,,,,,,,,,,,,,,,,,5,
+ mime,,,14,,,,,,,,,,,,,,,,,14,
- mime/multipart,,,8,,,,,,,,,,,,,,,,,8,
- mime/quotedprintable,,,1,,,,,,,,,,,,,,,,,1,
- net,,,20,,,,,,,,,,,,,,,,,20,
- net/http,2,16,22,,,,1,,,,,,,1,,,,,16,22,
+ net,2,16,100,,,,1,,,,,,,1,,,,,16,100,
- net/http/httputil,,,10,,,,,,,,,,,,,,,,,10,
- net/mail,,,6,,,,,,,,,,,,,,,,,6,
- net/textproto,,,19,,,,,,,,,,,,,,,,,19,
- net/url,,,23,,,,,,,,,,,,,,,,,23,
- os,27,10,6,1,,,26,,,,,,,,,,7,3,,6,
+ os,29,10,6,3,,,26,,,,,,,,,,7,3,,6,
- os/exec,2,,,2,,,,,,,,,,,,,,,,,
- path,,,5,,,,,,,,,,,,,,,,,5,
+ path,,,18,,,,,,,,,,,,,,,,,18,
- path/filepath,,,13,,,,,,,,,,,,,,,,,13,
- sync,,,10,,,,,,,,,,,,,,,,,10,
+ sync,,,34,,,,,,,,,,,,,,,,,34,
- sync/atomic,,,24,,,,,,,,,,,,,,,,,24,

Copy link
Member

@mbg mbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding more documentation to the tests! That should make it easier for us to maintain going forward and understand what's going wrong when troubleshooting failures.

@smowton smowton merged commit cfd281b into github:main Sep 18, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants