Adds consumer metadata checking to next requests #5141

jnmoyne · 2024-02-28T03:22:29Z

Adds a metadata field to consumer next requests
When processing the request the server will compare the request's metadata (if any) against the consumer's metadata (in the config) and reject the request with a status 409 if the two do not match (either a value is different or a key in the request is missing in the consumer).

When processing the request the server will compare the request's metadata (if any) against the consumer's metadata (in the config) and reject the request with a status 409 if the two do not match (either a value is different or a key in the request is missing in the consumer). Signed-off-by: Jean-Noël Moyne <[email protected]>

Jarema

It would be nice to explain why you would like this PR in. What is the use case?

Sending metadata to the server in Pull Request as a validation if we sent it to proper consumer is really counter intuitive. Why would sending a metadata in Pull Request check if it's the same on the server?

My guess is it's (a bit hacky :)) way to make sure if the consumer is the same, meaning - it was not recreated under the same name?
This goes back to the fact that we did not separate ID of the consumer from the Name. It would be useful to have such a feature, as it would allow more flexibility with renaming streams and consumers.

Let's discuss how to best solve the use case you have :).

Jarema · 2024-02-28T06:40:39Z

server/consumer.go

@@ -3332,6 +3332,26 @@ func (o *consumer) processNextMsgRequest(reply string, msg []byte) {
 		return
 	}

+	if len(consumerMetaData) > 0 && len(o.cfg.Metadata) > 0 {
+		var matching = true
+		for k, v := range consumerMetaData {


This is a one way check.
If consumer metadata on the server has key-pairs that are not present on the metadata passed in request, the check will still return true, a false positive.

Not a false positive, that's the behavior I want:
I pass you a collection of metadata key/values and I want to make sure those are present in the consumer's metadata and they match the value. It's not a 'deep equal' but a 'check that those keys and values are there and the same'.

Jarema · 2024-02-28T06:41:49Z

server/consumer.go

+			}
+		}
+		if !matching {
+			sendErr(409, "Request's medata does not match the consumer's")


This should to be the same error code.

I would really like us to have separate code's at least for errors that have a very distinct way to react to them in the client.

Also this should be a const, not a literal.

I just did the same as what the existing other 409 errors do (e.g. line 3308) 🤷‍♂️

Typos and incorrect 's here

This should not be 409.
Client libraries should not compare strings to determine what is happening and what to do.

EDIT: I know we did it all over the place already, but it would be really nice to improve the situation, at least for the new errors...

We use http status codes here, and 4xx are client errors, with 409 being conflict which seems correct.

jnmoyne · 2024-02-28T17:08:19Z

My particular use case is checking that some part of the consumer's metadata (that I expect to be there and with the values that I expect) did not get changed between two pull requests. In my case I use this to get an 'exclusive puller from a consumer' behavior. Meaning that each client instance picks a unique id and sets a key in the consumer config's metadata with that id and uses consumer create to make sure it is the only one creating the consumer (which has a durable name shared by all the instances). If you can create the consumer with your id in the metadata you want to make sure that the same id is there each time you do a consumer next because in some failure scenarios (like a network failure) the consumer could be deleted because of timeout and another instance (therefore different id in the metadata) may have been able to create the consumer for itself.

I would suspect that there may be other use cases where you want to check that something in the configuration (i.e. part of the metadata) did not get changed between 2 pull requests.

I otherwise would have to do a get consumer info each time before I do a fetch (and wouldn't be able to use 'consume()' or 'messages()', I'd have to call fetch() myself.

That said, ultimately my desired functionality is exclusive puller from a durable consumer, and there may be reasons to implement it another way which I would gladly use, this just seemed like a simple change that could maybe have other use cases of wanting to know that 'something has changed in the consumer metadata'

derekcollison · 2024-02-29T03:59:55Z

I am ok with this actually now that I understand what is being accomplished and that it will be used in only very specific use cases.

Interested in what @ripienaar thinks.

ripienaar

Generally ok with it too, I am just thinking how it can be abused and impact on payloads etc if this becomes heavily used.

And this solution will work only for pull consumers? Do we have a matching plan for push?

ripienaar · 2024-02-29T10:26:33Z

server/consumer.go

+			}
+		}
+		if !matching {
+			sendErr(409, "Request's medata does not match the consumer's")


Typos and incorrect 's here

Jarema · 2024-02-29T21:10:40Z

I have a different idea @derekcollison @ripienaar

We add a field "unique: bool" to Pull Request, false by default.
When server see it, it will check if there are Pull Requests for other inboxes. If they are, it will act exactly as here - return an error.

I think that way would be cleaner, and it would actually provide a check for exclusivity, rather than a convention that different payload means something changed.

This could be also done in reverse, so exclusiveness set on consumer, and flag in pull request to potentially take over the consumer, draining others, which could simplify the orchestration.

ripienaar · 2024-03-01T10:00:02Z

@Jarema Hmm, we do not really have guarantees of inboxes being unique and it changes per pull doesnt it? Well it can it does not need to but maybe it does, so perhaps that not always legit?

I agree we need a less hacky approach though so lets think of alternatives of do a look around at how we use inboxes for pulls

Jarema · 2024-03-01T10:07:27Z

@ripienaar I was wondering about it too.

We do reuse the same inbox while processing the messages in the same Consumer instance in the clients, with optional one wildcard token at the and, but the subject still would match.
And yes, it is not guaranteed to have unique inboxes except by convention.
Though to be fair, if we reuse inbox across clients, that would mess up messages delivery anyway (for pull consumers), so we assume it is unique (and we use random inbox() for all pull consumers).

ripienaar · 2024-03-01T10:15:43Z

For sure we can’t have 2 clients with same inbox that would be bad (though I can tell you why I do that in places).

I am more concerned that using an inbox as a signal of uniqueness within one client might not be valid. Like is it that the last token changes between pulls and all the rest is same (new inbox style) or old style where it’s always difference etc. how’s the consumer to know.

Jarema · 2024-03-01T10:43:37Z

We do not use muxing and request/reply for this, as it is multiple resonses for single request.
All clients rely on subscribing to inbox, but some add (for easier counting of messages per pull request) a token at the end so the sub looks like:

_INBOX.NUID.*

So we would have to do subject match, not equal match.

The biggest problem with this approach is that it would not work with old Fetch, that creates new inbox per Pull Request. Or at least might now work as expected. Would need to revisit the code.

Maybe a variation would be to add to the Pull Request a UID instead of bool flag @ripienaar ?
That way we would not rely on subject uniquness.
Would work almost exactly the same as @jnmoyne PR, but without abusing metadata for that purpose.

ripienaar · 2024-03-01T12:19:27Z

Yes I like the UID approach, you can calculate it once and just reuse it in that client, good idea.

Jarema · 2024-03-01T13:37:18Z

Created the draft PR with the owner ID approach.

derekcollison · 2024-03-01T14:35:32Z

I think this is a different problem set vs what you are suggesting which is an appId that could allow for exclusive consumers.

Jarema · 2024-03-01T18:20:05Z

Hm, how is this different?

I think this is exactly what @jnmoyne tries to achieve here. quoting him:

In my case I use this to get an 'exclusive puller from a consumer' behavior. Meaning that each client instance picks a unique id and sets a key in the consumer config's metadata with that id and uses consumer create to make sure it is the only one creating the consumer

use case
I actually see disadvage in using metadata in so opinionated way - it means we cannot use it for anything else in pull consumers.

It also ensures exclusiveness purely by convention: different clients can set different metadata and each will "think" it's exclusive.

performance
This solution allocates (and compares) a whole map for every pull request sent.

5157 PR formalizes that unique ID into clean and explicit way, such that can be also easily incorporated into clients to allow exclusive consumers for any other needs without to align metadata across clients.

aricart · 2024-03-01T18:57:25Z

shouldn't this be somehow tied to the connection? - if we have the server ID and the CID couldn't that be the identifier? - What I saying is the server knows the client making the request (server ID and CID) - I think this is available outside of operator mode, so if the consumer was of type exclusive - the server could aggregate the name provided for the consumer with this other info and effectively get an unique guy

jnmoyne · 2024-03-01T20:47:55Z

Basically boils down to whether there are other use cases that could benefit from this "check some part of the metadata with each pull request" functionality besides the exclusive puller use case (I will admit, I can't think of a specific example one so far). Because otherwise @Jarema's other PR is basically the same thing except using a new string in consumer config and a new optional string field in the pull request, is more 'locked down' in that you can only use it for exclusivity and there's no option for people to 'abuse' it or getting it wrong if they don't all agree on the metadata key names.

In that more locked down specific to just exclusivity version, then you could indeed use the server name + cid as your unique id which means then it would not need to be added as a field to the pull request message (if the server can tell which server name + cid that request came from).

jnmoyne · 2024-03-01T23:34:39Z

Forgot to answer: it not meant to be for push consumers, but I don't need push and as I'm using the new API anyways that's a non issue.

jnmoyne · 2024-03-01T23:59:39Z

Another comment upon more thinking about it for my use case and I think I would rather be able to store an id that I generate myself as part of the consumer config (either in metadata or a new field) rather than have the exclusivity just be a boolean flag and using server+cid to generate the id automatically. For example because then anyone can tell using consumer info which id is the currently active one and each can tell whether it's its own id (though you could make the current id be part of the consumer state and expose some convenience function to compose the server+cid correctly for oneself. And in my use-case I'd rather identify at the process rather than the current connection level (i.e. same id even after reconnecting).

ripienaar · 2024-03-02T10:31:47Z

I like the flexibility that the metadata approach bring: its very undefined, open ended and one can maybe solve other problems with the it in the future, a nice little hacky tool.

But is that what we really need at this stage? We can do it but should we? I am not sure we are at a place where we should be addeding open ended, undefined and unrestricted things like this - especially not that might be used on every pull request.

It begs so many questions, how many fields are allowed? How big are they? What limits are there? What schema do they follow?

I think its dangerous assigning core behaviours to entirely open ended undefined things like metadata.

So my vote is a ID based approach and if we feel visibility of the ID is needed then we add it to consumer state. But we need to be quite careful to not open the can of worms of Raft state again.

derekcollison · 2024-03-02T23:13:18Z

@Jarema What I meant by different is that @jnmoyne is using this so that he can make sure the consumer has not been changed out from underneath of the application.

What you and I are suggesting is more for exclusive consumers, in my version tied to an appID which allows it to be horizontally scalable. The consumer would be configured to do exclusive consumers, meaning once it bound to an AppID in a pull request, it would only deliver messages to that AppID until they were all gone then it would "switch" to a new AppID.

jnmoyne · 2024-03-03T21:32:31Z

I'm totally fine with going with @Jarema 's version of just having a new exclusive id in the config instead of using metadata. I think we all agree that using the metadata for that is interesting and maybe a bit hacky and could be useful in some other use cases but could also open the door for "abuse" and since for my needs adding a new exclusive id to the config works just as well and is less open to abuse than this version I propose that we close this PR and move to @Jarema 's PR instead.

In any case the two are actually not incompatible: one if we do come up with or get a request for a use case were this way of monitoring changes to the config would be good to have then we can always resuscitate this.

jnmoyne · 2024-03-25T20:58:48Z

Closing this as replaced by #5157

jnmoyne requested a review from a team as a code owner February 28, 2024 03:22

Jarema requested changes Feb 28, 2024

View reviewed changes

ripienaar requested changes Feb 29, 2024

View reviewed changes

Jarema mentioned this pull request Mar 1, 2024

Add consumer Owner ID #5157

Draft

jnmoyne closed this Mar 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adds consumer metadata checking to next requests #5141

Adds consumer metadata checking to next requests #5141

jnmoyne commented Feb 28, 2024

Jarema left a comment •

edited

Loading

Jarema Feb 28, 2024

jnmoyne Feb 28, 2024

Jarema Feb 28, 2024

jnmoyne Feb 28, 2024

ripienaar Feb 29, 2024

Jarema Mar 1, 2024 •

edited

Loading

derekcollison Mar 2, 2024

jnmoyne commented Feb 28, 2024 •

edited

Loading

derekcollison commented Feb 29, 2024

ripienaar left a comment

ripienaar Feb 29, 2024

Jarema commented Feb 29, 2024

ripienaar commented Mar 1, 2024

Jarema commented Mar 1, 2024

ripienaar commented Mar 1, 2024

Jarema commented Mar 1, 2024 •

edited

Loading

ripienaar commented Mar 1, 2024

Jarema commented Mar 1, 2024

derekcollison commented Mar 1, 2024 •

edited

Loading

Jarema commented Mar 1, 2024

aricart commented Mar 1, 2024 •

edited

Loading

jnmoyne commented Mar 1, 2024 •

edited

Loading

jnmoyne commented Mar 1, 2024

jnmoyne commented Mar 1, 2024 •

edited

Loading

ripienaar commented Mar 2, 2024

derekcollison commented Mar 2, 2024

jnmoyne commented Mar 3, 2024 •

edited

Loading

jnmoyne commented Mar 25, 2024

Adds consumer metadata checking to next requests #5141

Adds consumer metadata checking to next requests #5141

Conversation

jnmoyne commented Feb 28, 2024

Jarema left a comment • edited Loading

Choose a reason for hiding this comment

Jarema Feb 28, 2024

Choose a reason for hiding this comment

jnmoyne Feb 28, 2024

Choose a reason for hiding this comment

Jarema Feb 28, 2024

Choose a reason for hiding this comment

jnmoyne Feb 28, 2024

Choose a reason for hiding this comment

ripienaar Feb 29, 2024

Choose a reason for hiding this comment

Jarema Mar 1, 2024 • edited Loading

Choose a reason for hiding this comment

derekcollison Mar 2, 2024

Choose a reason for hiding this comment

jnmoyne commented Feb 28, 2024 • edited Loading

derekcollison commented Feb 29, 2024

ripienaar left a comment

Choose a reason for hiding this comment

ripienaar Feb 29, 2024

Choose a reason for hiding this comment

Jarema commented Feb 29, 2024

ripienaar commented Mar 1, 2024

Jarema commented Mar 1, 2024

ripienaar commented Mar 1, 2024

Jarema commented Mar 1, 2024 • edited Loading

ripienaar commented Mar 1, 2024

Jarema commented Mar 1, 2024

derekcollison commented Mar 1, 2024 • edited Loading

Jarema commented Mar 1, 2024

aricart commented Mar 1, 2024 • edited Loading

jnmoyne commented Mar 1, 2024 • edited Loading

jnmoyne commented Mar 1, 2024

jnmoyne commented Mar 1, 2024 • edited Loading

ripienaar commented Mar 2, 2024

derekcollison commented Mar 2, 2024

jnmoyne commented Mar 3, 2024 • edited Loading

jnmoyne commented Mar 25, 2024

Jarema left a comment •

edited

Loading

Jarema Mar 1, 2024 •

edited

Loading

jnmoyne commented Feb 28, 2024 •

edited

Loading

Jarema commented Mar 1, 2024 •

edited

Loading

derekcollison commented Mar 1, 2024 •

edited

Loading

aricart commented Mar 1, 2024 •

edited

Loading

jnmoyne commented Mar 1, 2024 •

edited

Loading

jnmoyne commented Mar 1, 2024 •

edited

Loading

jnmoyne commented Mar 3, 2024 •

edited

Loading