Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QortexRtdProvider: Supports new Qortex bid enrichment process #12173

Merged
merged 31 commits into from
Oct 25, 2024

Conversation

shilohannese
Copy link
Contributor

Type of change

  • Updated RTD adapter

Description of change

This change refactors the Qortex RTD module to use the new version of our context and bid enrichment API

  • enables bid enrichment for accounts linked to unique internal group ids that have agreement to use this feature
  • sends basic page information to our analysis API to match with already analyzed urls and create new analysis processes for bid enrichment
  • adds helper functions for reporting and bid enrichment options available within our technology
  • refactors testing and creates new units tests for added logic

Copy link

Tread carefully! This PR adds 1 linter error (possibly disabled through directives):

  • modules/qortexRtdProvider.js (+1 error)

Copy link

Tread carefully! This PR adds 1 linter error (possibly disabled through directives):

  • modules/qortexRtdProvider.js (+1 error)

Copy link

Tread carefully! This PR adds 1 linter error (possibly disabled through directives):

  • modules/qortexRtdProvider.js (+1 error)

@ChrisHuie ChrisHuie self-requested a review August 27, 2024 14:52
@ChrisHuie ChrisHuie self-assigned this Aug 27, 2024
return {
pageUrl: document.location.href,
title: document.title,
text: document.body.textContent.replaceAll(/\r?\n/gi, ' '),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems bizarre to send so much content over the wire, why not send a robot to get copies of pages you don't know the contents of? Also, this prevents anyone from using you from behind a login, as you get the login info

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patmmccann Our goal here is to send the information specific to the page that the user is accessing in order to allow a publisher to enrich bids with that page data through our API. This is currently directly feeding into our process and is expected data to begin analysis. Changing to a robot would fundamentally alter our product and its ability to quickly provide page-specific contextual data.

We don't collect or use any login info, and our services require existing business relationships and contracts to create a connection that uses this data

* rearrange logic, needs a few more tests

* updated and unit tests written

* remove logs
@rrochwick
Copy link
Contributor

Thank you for your response, @patmmccann. We have reviewed your feedback and implemented several new security features to ensure that both Qortex and the publisher are fully informed whenever content data is transferred.

Let me first address your suggestion regarding the use of a robot or page crawler to capture this data. While we currently have the capability for page crawling, it is not always an accurate representation of the content that an end user will see on a publisher's site. This is particularly true for publishers with dynamic content or content that is targeted differently based on factors such as geographic location, device type, operating system, or screen resolution. As a result, data collected directly from the user session will provide our services with the most accurate information, allowing us to give publishers precise contextual data to enhance bid auctions.

The additional security measures we've added in this pull request are as follows:

  1. Introduction of the 'disableBidEnrichment' configuration parameter: This will give the publisher full control over stopping all content/contextual data transfers to and from Qortex services.
  2. Content transfer prevention until publisher authorization: No content will be transferred to Qortex services until the publisher is confirmed as an authorized user within the Qortex system (via groupConfig.active). The group ID used in the configuration must be found and set as an active account in the Qortex platform. If the group ID is not found or is inactive, no content will be passed to Qortex.
  3. Content transfer prevention until Qortex bid enrichment enablement: Even with an active group ID, there is an additional toggle on the Qortex side (groupConfig.prebidBidEnrichment) that must explicitly be set to allow bid enrichment contextuality. If prebidBidEnrichment is not set to "true" on the Qortex platform, we will neither collect publisher content nor send contextual information back to the module.

With these measures in place, there should be no accidental transfer of publisher content without explicit acknowledgment from both the publisher and Qortex.

We appreciate your feedback and continued collaboration.

@JoelPM
Copy link
Contributor

JoelPM commented Sep 11, 2024

Weighing in after this was discussed at the Prebid.js PMC. I am concerned about Prebid enabling the entire contents of a page to be sent to a third party. Even if the publisher is opted in, this feels like a potentially large user privacy violation. I think it's possible that browser makers would consider it similarly, though that's just a guess. I would hate for Prebid to be flagged as allowing this type of behavior, even with pub consent (and assuming the publisher's privacy policy makes explicit that a user's entire page is being sent to a third party).

Given the publisher opt-in requirement, it feels like the same goal could be enabled by having the pub include a separate script that the RTD plugin interacts with. As a user, though, I would actively avoid any site that does this.

In general, I think Prebid should consider adding this behavior to the list of not allowed activities.

Copy link

Tread carefully! This PR adds 3 linter errors (possibly disabled through directives):

  • modules/qortexRtdProvider.js (+3 errors)

@rrochwick
Copy link
Contributor

@JoelPM @patmmccann :: Thank you for your valuable feedback. In response, we have refined the data collection process to focus exclusively on heading and paragraph tags. This adjustment ensures that no data is inadvertently collected from forms, user inputs, image tags, or similar elements. Additionally, the text collected will be limited to a maximum of 5,000 characters.

We feel it is important to clarify that the bid enrichment feature does not collect data on every page load for every user. Data is only gathered for new pages that have not been previously encountered by our system. This occurs during the first user interaction with a new page, and subsequent users will not trigger further data collection. For instance, if 100,000 users visit the same page, only the first visitor will contribute to the data capture. This ensures we provide contextual analysis for publisher content without incorporating any user-specific information.

We are focused on gathering page-level content, not user data, to provide a detailed understanding of a site's subject matter. This approach is aimed at enhancing the value of our supply partners' content and facilitating the transparent and efficient sharing of that value through our open-source Prebid.js solution while remaining within your policy guidelines.

@rrochwick
Copy link
Contributor

@patmmccann @ChrisHuie The changes requested by the Prebid PMC have been implemented. Bid enrichment is now explicitly opt-in for publishers and will only be enabled once the Qortex system confirms an agreement is in place and verifies the publisher's authorization to use the service. Additionally, we have updated the documentation to clarify the types of content transmitted, and included a notice advising publishers to review any pages for sensitive content before enabling the feature.

@ChrisHuie ChrisHuie removed their request for review September 27, 2024 13:55
@ChrisHuie ChrisHuie assigned patmmccann and unassigned ChrisHuie Sep 27, 2024
@patmmccann
Copy link
Collaborator

I'm okay with this state; but the docs changed in the js project md file arent enough, could you link the docs pull as well. Also, I think we decided to ask the pub committee to weigh in cc @GLStephen

@shilohannese
Copy link
Contributor Author

shilohannese commented Oct 8, 2024

@patmmccann Thank you for your response, i'm glad to see that we are getting closer to a ready-state with this release. What changes specifically are you looking for in the documentation? Do you mean the changes made in this PR or in another place? Happy to adjust if needed.

@rrochwick
Copy link
Contributor

Previous commit is to address jsdoc issues ticket

@rrochwick
Copy link
Contributor

@patmmccann Would it be possible to elaborate on what you would like linked with the documentation? I think we are a bit confused on what actions we should take next. Thanks!

@rrochwick
Copy link
Contributor

@patmmccann To ensure continued progress and updates to the Qortex RTD module, we have made the decision to remove the page content indexing functionality from the module entirely. There do not appear to be other concerns regarding the proposed updates; however, if any additional revisions are required, please do not hesitate to reach out.

return analyticsPercentage > randomInt;
}

function shouldAllowBidEnrichment() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is thi9s controlling if you've removed the text going over the wire

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is only used for determining if we enrich the auction with contextual data about the current page (and videos known to be on the page) from our servers. This still requires a check on the prebid config and within our zone configurations to even attempt to retrieve information known about the current page.

@rrochwick
Copy link
Contributor

@patmmccann patmmccann merged commit 80fbc98 into prebid:master Oct 25, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants