Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(scanner): Add flag to scanner to detect unlicensed files #9487

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions model/src/main/kotlin/config/ScannerConfiguration.kt
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,11 @@ data class ScannerConfiguration(
*/
val skipConcluded: Boolean = false,

/**
* A flag to indicate whether the scanner should add files without license to the scanner results.
*/
val includeUnlicensed: Boolean = false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my taste, this sounds too much like https://spdx.org/licenses/Unlicense.html. To avoid any confusion, I propose to rename this to includeFilesWithoutFindings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, property-ordering-wise I don't like this to go between the two "skip..." properties. Between skipExcluded and archive makes more sense to me.


/**
* A flag to control whether excluded scopes and paths should be skipped during the scan.
*/
Expand Down
38 changes: 36 additions & 2 deletions scanner/src/main/kotlin/Scanner.kt
Original file line number Diff line number Diff line change
Expand Up @@ -36,13 +36,15 @@ import org.ossreviewtoolkit.model.FileList
import org.ossreviewtoolkit.model.Identifier
import org.ossreviewtoolkit.model.Issue
import org.ossreviewtoolkit.model.KnownProvenance
import org.ossreviewtoolkit.model.LicenseFinding
import org.ossreviewtoolkit.model.OrtResult
import org.ossreviewtoolkit.model.Package
import org.ossreviewtoolkit.model.PackageType
import org.ossreviewtoolkit.model.ProvenanceResolutionResult
import org.ossreviewtoolkit.model.ScanResult
import org.ossreviewtoolkit.model.ScanSummary
import org.ossreviewtoolkit.model.ScannerRun
import org.ossreviewtoolkit.model.TextLocation
import org.ossreviewtoolkit.model.VcsInfo
import org.ossreviewtoolkit.model.config.DownloaderConfiguration
import org.ossreviewtoolkit.model.config.ScannerConfiguration
Expand Down Expand Up @@ -192,8 +194,6 @@ class Scanner(

val vcsPathsForProvenances = getVcsPathsForProvenances(provenances)

val filteredScanResults = filterScanResultsByVcsPaths(controller.getAllScanResults(), vcsPathsForProvenances)

val files = controller.getAllFileLists().mapTo(mutableSetOf()) { (provenance, fileList) ->
FileList(
provenance = provenance.alignRevisions() as KnownProvenance,
Expand All @@ -207,6 +207,40 @@ class Scanner(
}
}

val filteredScanResults = filterScanResultsByVcsPaths(controller.getAllScanResults(), vcsPathsForProvenances)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, a few random starting thoughts / comments:

  • This code gets complex and deserves a few more code comments that explain the algorithm / intention.

  • For the default case ideally no unnecessary copy should be done (like is done now in line 217). To archive that, I'd leave val filteredScanResults as-is in original line 195, and instead introduce a new variable like

      val scanResults = if (scannerConfig.includeUnlicensed) {
          // Do the work.
      } else {
          filteredScanResults
      }
    
      // ....
    
      return ScannerRun.EMPTY.copy(
          config = scannerConfig,
          provenances = provenances,
          scanResults = scanResults,
          files = files,
          scanners = scanners
      )
    

.mapTo(mutableSetOf()) { scanResult ->
val licenseFiles = scanResult.summary.licenseFindings.mapTo(mutableSetOf()) { licenseFinding ->
licenseFinding.location.path
}

if (!scannerConfig.includeUnlicensed) {
scanResult.copy(provenance = scanResult.provenance.alignRevisions())
} else {
// Adds files without license to the scanned results
val scanSummary =
controller.getAllFileLists()[scanResult.provenance]?.files
.orEmpty().asSequence().mapNotNull { fileEntry ->
if (fileEntry.path in licenseFiles) {
null
} else {
fileEntry.path
}
}.toSet().let { fileEntryLicenses ->
(fileEntryLicenses subtract licenseFiles).mapTo(mutableSetOf()) { newFinding ->
LicenseFinding(license = "NONE", location = TextLocation(newFinding, 1))
}.let {
val allFindings = scanResult.summary.licenseFindings union it
Fixed Show fixed Hide fixed
scanResult.summary.copy(licenseFindings = allFindings)
}
Fixed Show fixed Hide fixed
}

scanResult.copy(
provenance = scanResult.provenance.alignRevisions(),
summary = scanSummary
)
Comment on lines +219 to +240
Copy link
Member

@sschuberth sschuberth Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce nesting and make the code more readable, I propose something like this (completely untested; this goes to // Do the work. from my comment above):

        filteredScanResults.mapTo(mutableSetOf()) { scanResult ->
            val allPaths = controller.getAllFileLists()[scanResult.provenance]?.files?.map { it.path }.orEmpty()
            val pathsWithFindings = scanResult.summary.licenseFindings.map { it.location.path }
            val pathsWithoutFindings = allPaths - pathsWithFindings

            val findingsThatAreNone = pathsWithoutFindings.map {
                LicenseFinding(SpdxConstants.NONE, TextLocation(it, UNKNOWN_LINE))
            }

            scanResult.copy(
                summary = scanResult.summary.copy(
                    licenseFindings = scanResult.summary.licenseFindings + findingsThatAreNone
                )
            )
        }

Due to the introduced variables, this maybe is even clear enough to not require any code comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I'm using the SpdxConstants.NONE constant instead of "NONE" here, and deliberately an unknown / invalid line range. I have not checked whether this might cause any problems with some later sanity checks or so.

}
}

val scannerNames = scannerWrappers.mapTo(mutableSetOf()) { it.name }
val scanners = packages.associateBy({ it.id }) { scannerNames }

Expand Down