Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(new sink): initial websocket_server sink #22213

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

esensar
Copy link
Contributor

@esensar esensar commented Jan 15, 2025

Summary

This adds a new sink, similar to websocket which acts like a client, while this one acts as a server. It currently broadcasts messages to all currently connected clients.

It might make sense to add some kind of buffer and allow clients to catch up to missed messages, or something like that, but I wanted to leave such complications for further PRs.

This also adds customizable authentication using VRL, which would allow us to use custom VRL scripts to check auth, meaning we no longer use hardcoded basic auth and can use enrichment tables to look for credentials.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Started up vector with the following configuration:

sources:
  demo_logs_test:
    type: "demo_logs"
    format: "json"

sinks:
  websocket_sink:
    inputs: ["demo_logs_test"]
    type: "websocket_listener"
    address: "0.0.0.0:1234"
    encoding:
      codec: "json"

Connected to the server with websocat and observed that events are sent to the client. I have also tested auth in this way, by defining auth header using websocat.

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Please read our Vector contributor resources.
    • make check-all is a good command to run locally. This check is
      defined here. Some of these
      checks might not be relevant to your PR. For Rust changes, at the very least you should run:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run dd-rust-license-tool write to regenerate the license inventory and commit the changes (if any). More details here.

This adds a new sink, similar to `websocket` which acts like a client, while this one acts as a
server. It currently broadcasts messages to all currently connected clients.
@github-actions github-actions bot added domain: topology Anything related to Vector's topology code domain: sinks Anything related to the Vector's sinks labels Jan 15, 2025
@esensar
Copy link
Contributor Author

esensar commented Jan 15, 2025

I have added customizable auth handler in this PR, but I can take it out into a separate PR if needed.

This is also missing some tests and some metrics. I will probably also work on an optional event buffer to enable replays for some clients. That one might be more complex, so I wanted to put it off for now, to keep the initial PR simple.

@github-actions github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Jan 15, 2025
@esensar esensar changed the title feat(new_sink): initial websocket_listener sink feat(new sink): initial websocket_listener sink Jan 15, 2025
@pront
Copy link
Member

pront commented Jan 15, 2025

I have added customizable auth handler in this PR, but I can take it out into a separate PR if needed.
Are you referring to src/common/server_auth.rs? I also wonder if that code adds duplication with other http components.

If it's not too much effort, please do. It should make reviewing faster. It's just me reviewing PRs for this month 😅

@esensar
Copy link
Contributor Author

esensar commented Jan 15, 2025

I have added customizable auth handler in this PR, but I can take it out into a separate PR if needed.
Are you referring to src/common/server_auth.rs? I also wonder if that code adds duplication with other http components.

If it's not too much effort, please do. It should make reviewing faster. It's just me reviewing PRs for this month 😅

Oh 😄
Alright, I will do it, it should be pretty simple to move out.

@esensar
Copy link
Contributor Author

esensar commented Jan 15, 2025

Oh and sorry, I just saw the other part of the comment. Yeah, it causes duplication, my idea was to reuse it across these components, but didn't want to make too many changes in this PR. I guess if I move it to a separate PR, then changing these out should be okay.

esensar added a commit to esensar/vector that referenced this pull request Jan 17, 2025
This adds `custom` auth strategy for components with HTTP server (`http_server`, `datadog_agent`,
`opentelemetry`, `prometheus`) besides the default basic auth. This is a breaking change because
`strategy` is now required for auth - for existing configurations `strategy: "basic"` needs to be
added.

Related: vectordotdev#22213
@github-actions github-actions bot removed the domain: topology Anything related to Vector's topology code label Jan 17, 2025
@esensar
Copy link
Contributor Author

esensar commented Jan 21, 2025

Would it make more sense to name this websocket_server? Not sure why I named it listener to be honest. I think server would be more consistent with other components.

@jszwedko
Copy link
Member

Would it make more sense to name this websocket_server? Not sure why I named it listener to be honest. I think server would be more consistent with other components.

👍 websocket_server would be consistent with http_server. It leaves the door open for a websocket_client source that, like the http_client source, would initiate requests rather than listen for them.

@esensar esensar changed the title feat(new sink): initial websocket_listener sink feat(new sink): initial websocket_server sink Jan 23, 2025
@esensar esensar marked this pull request as ready for review January 24, 2025 11:32
@esensar esensar requested review from a team as code owners January 24, 2025 11:32
@pront pront self-assigned this Jan 24, 2025
Comment on lines +17 to +19
When enabled for a sink, any source connected to that sink, where the source supports
end-to-end acknowledgements as well, waits for events to be acknowledged by **all
connected** sinks before acknowledging them at the source.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When enabled for a sink, any source connected to that sink, where the source supports
end-to-end acknowledgements as well, waits for events to be acknowledged by **all
connected** sinks before acknowledging them at the source.
When enabled for a sink, any source that supports end-to-end
acknowledgements that is connected to that sink waits for events
to be acknowledged by **all connected sinks** before acknowledging
them at the source.

}
device_version: {
description: """
Identifies the version of the problem. In combination with device product and vendor, it composes the unique id of the device that sends messages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Identifies the version of the problem. In combination with device product and vendor, it composes the unique id of the device that sends messages.
Identifies the version of the problem. The combination of the device produc, vendor, and this value make up the unique id of the device that sends messages.

name: {
description: """
This is a path that points to the human-readable description of a log event.
The value length must be less than or equal to 512.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Juts checking - 512 and not 511?

Reflects importance of the event.

It must point to a number from 0 to 10.
0 = Lowest, 10 = Highest.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
0 = Lowest, 10 = Highest.
0 = lowest importance, 10 = highest importance.

non_numeric: """
Puts quotes around all fields that are non-numeric.
Namely, when writing a field that does not parse as a valid float or integer,
then quotes are used even if they aren't strictly necessary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
then quotes are used even if they aren't strictly necessary.
quotes are used even if they aren't strictly necessary.

description: """
Controls how metric tag values are encoded.

When set to `single`, only the last non-bare value of tags are displayed with the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When set to `single`, only the last non-bare value of tags are displayed with the
When set to `single`, only the last non-bare value of tags is displayed with the

description: """
Sets the list of supported ALPN protocols.

Declare the supported ALPN protocols, which are used during negotiation with peer. They are prioritized in the order
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Declare the supported ALPN protocols, which are used during negotiation with peer. They are prioritized in the order
Declare the supported ALPN protocols, which are used during negotiation with a peer. They are prioritized in the order

The certificate must be in DER, PEM (X.509), or PKCS#12 format. Additionally, the certificate can be provided as
an inline string in PEM format.

If this is set, and is not a PKCS#12 archive, `key_file` must also be set.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If this is set, and is not a PKCS#12 archive, `key_file` must also be set.
If this is set _and_ is not a PKCS#12 archive, `key_file` must also be set.

If enabled, certificates must not be expired and must be issued by a trusted
issuer. This verification operates in a hierarchical manner, checking that the leaf certificate (the
certificate presented by the client/server) is not only valid, but that the issuer of that certificate is also valid, and
so on until the verification process reaches a root certificate.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
so on until the verification process reaches a root certificate.
so on, until the verification process reaches a root certificate.

@esensar
Copy link
Contributor Author

esensar commented Jan 28, 2025

@neko-dd sorry for the late reply. I have applied one of the suggestions, which was directly related to things added in this PR. The rest of it has been generated automatically from other things that I have used (which are standard for all sinks) and you can see that documentation already live on almost all of the sinks. Maybe these should be fixed in a separate PR? Because if I fixed the here, I would generate too many changes that are not really related to this (it would change docs for all sinks).

@pront
Copy link
Member

pront commented Jan 28, 2025

Maybe these should be fixed in a separate PR? Because if I fixed the here, I would generate too many changes that are not really related to this (it would change docs for all sinks).

Good idea 👍


Also, please take a look at this 'guide', we will need some more doc files:
#22072 (comment)

@esensar
Copy link
Contributor Author

esensar commented Jan 29, 2025

Also, please take a look at this 'guide', we will need some more doc files: #22072 (comment)

Does this look better now? I have based this on websocket docs. I have also added the service file, but I am not sure how that is used exactly.

esensar added a commit to esensar/vector that referenced this pull request Jan 29, 2025
Updates wording in common sinks component docs (as suggested by @neko-dd)

Related: vectordotdev#22213 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: external docs Anything related to Vector's external, public documentation domain: sinks Anything related to the Vector's sinks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants