
Large messages causes channel to shut down #49

Closed · 2joocy opened this issue Sep 27, 2022 · 8 comments · Fixed by #123
Labels: bug (Something isn't working)

Comments

2joocy commented Sep 27, 2022

We've got a production setup running where we've encountered an issue: the responses cause the server to shut the connection down with the following error:

connection closed by server 505 UNEXPECTED_FRAME - expected content body, got non content body frame instead 60 40
amqp-client connection closed connection closed: UNEXPECTED_FRAME - expected content body, got non content body frame instead (505)
ERROR AMQPError: connection closed: UNEXPECTED_FRAME - expected content body, got non content body frame instead (505)

I've made a repo that replicates this issue consistently, under my profile: https://github.com/2joocy/cloudamqp-frame-issue
The repo contains a very minimal example of sending a message from the same channel a couple of times, which reliably triggers the error above. You can edit the RabbitMQ connection string in app.ts. The only requirement is to run npm i and then npm start with the correct connection string.

We've used an older library before and didn't have these issues; I've set up the same test with https://www.npmjs.com/package/amqplib and the data passes through.

Is this an issue on the library side, or should we handle messages differently? Or are there parameters to work around this?
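
For reference, a condensed sketch of the kind of code the linked repro runs (the connection string, queue name, and payload size here are placeholders, not the actual values from that repo):

```ts
import { AMQPClient } from "@cloudamqp/amqp-client";

async function main() {
  const client = new AMQPClient("amqp://guest:guest@localhost"); // placeholder connection string
  const connection = await client.connect();
  const channel = await connection.channel();
  const queue = await channel.queue("frame-issue-test"); // placeholder queue name
  const data = "x".repeat(2 * 1024 * 1024); // a payload large enough to span several body frames

  // Publishing repeatedly on the same channel without awaiting each publish
  // is enough to make the broker close the connection with 505 UNEXPECTED_FRAME.
  for (let index = 0; index < 3; index++) {
    queue.publish(data).catch(console.error);
  }
}

main().catch(console.error);
```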

2joocy changed the title from "Large messages causes channels to shut down" to "Large messages causes channel to shut down" on Sep 27, 2022
@chadknutson

Try adding the await operator to the publish statement in your code. That will prevent JavaScript from trying to send the next message before the first message's body is fully written to the socket. E.g.:

for (let index = 0; index < 3; index++) {
  await q.publish(data).catch(console.error);
}


2joocy commented Sep 27, 2022

Try adding the await operator for the publish statement in your code. That will prevent javascript from trying to send the next message before the first message's body is fully written to the socket. E.g., for (let index = 0; index < 3; index++) { await q.publish(data).catch(console.error); }

That's done on purpose to provoke the error. The error only happens when we load test the whole system, and the code I presented is just a simplification that provokes the same thing. With the await it works; however, to keep the example project small, I left out the asynchronous sending from multiple clients that simulates the load testing we do. The data is also just mock data, sized roughly like the production data that is logged during the crash. Just as a note, in the production code we are awaiting the sends:

const encrypted = encrypt(data);
const buffer = Buffer.from(encrypted);
await channel.basicPublish("", message?.properties.replyTo ?? "", buffer, publishOptions);


dentarg commented Jul 24, 2023

I'll just note that I can still reproduce this with amqp-client.js v3.0.0 (on both RabbitMQ 3.9.16, which is used in the example here, and RabbitMQ 3.12.1).

dentarg added the bug label on Jul 24, 2023
@vinayakakv

We observed the same issue in prod, where we were writing around 140 messages/second. The ingress rate at the exchange was around 70 messages/second before we observed the crashes.

Currently we are planning to use multiple channels (say, n) for writes, and one channel for each consumer.

While writing a message to the exchange, we plan to use a semaphore, round-robin over the channels, and do the basicPublish (see the sketch below).

We observed that with n channels, n+1 messages published without await are sufficient to bring this error up. In production we use await on a single channel, so I'm not sure why this behaviour is happening there. Maybe it is due to the concurrency the promises introduce at that scale?
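
A minimal sketch of that plan, assuming a hypothetical channel-pool wrapper with a per-channel lock (only basicPublish is the library call; the helper names and types are illustrative):

```ts
// Structural type for the part of the channel we use; in practice this would be
// the AMQPChannel from @cloudamqp/amqp-client.
type PublishChannel = {
  basicPublish(exchange: string, routingKey: string, body: Buffer): Promise<unknown>;
};

// Hypothetical per-channel lock: at most one publish in flight per channel,
// so the frames of one publish are never interleaved with another publish's frames.
class Mutex {
  private tail: Promise<void> = Promise.resolve();
  run<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// Hypothetical round-robin publisher over a small pool of n channels.
class ChannelPool {
  private index = 0;
  private locks: Mutex[];
  constructor(private channels: PublishChannel[]) {
    this.locks = channels.map(() => new Mutex());
  }
  publish(exchange: string, routingKey: string, body: Buffer): Promise<void> {
    const i = this.index++ % this.channels.length;
    // Serialize publishes on the chosen channel; the other channels stay free.
    return this.locks[i].run(async () => {
      await this.channels[i].basicPublish(exchange, routingKey, body);
    });
  }
}
```

Callers would then go through pool.publish(exchange, routingKey, body) everywhere instead of holding a channel directly.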


carlhoerberg commented Sep 16, 2024

A publish in the AMQP protocol involves sending multiple frames: the BasicPublish method frame, the content header frame, and then zero or more body frames. These frames all have to be sent together, or else RabbitMQ will freak out (which kind of defeats the purpose of multiplexing, but anyway). We await sending each frame in Channel#basicPublish, being good citizens, but if the user isn't using await everywhere, they can inject frames from other channels between the basicPublish frames.
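
To make the interleaving concrete, here is a small self-contained model (a toy, not the library's internals) of two channels writing their method/header/body frame sequences to a shared socket; because nothing serializes the writes per publish, the two publishes end up interleaved on the wire:

```ts
// Toy model of frame interleaving on a shared connection (not amqp-client.js code).
type Frame = { channel: number; kind: "method" | "header" | "body" };

const wire: Frame[] = []; // stands in for the single TCP socket shared by all channels

// Simulate an async socket write that yields to the event loop, as a real write can.
async function writeFrame(frame: Frame): Promise<void> {
  await new Promise((resolve) => setImmediate(resolve));
  wire.push(frame);
}

// One publish = method frame + content header frame + body frame(s), all on one channel.
async function publish(channel: number): Promise<void> {
  await writeFrame({ channel, kind: "method" });
  await writeFrame({ channel, kind: "header" });
  await writeFrame({ channel, kind: "body" });
}

async function main() {
  // No await between the two publishes, so their frames interleave on the wire,
  // which is exactly what RabbitMQ rejects with 505 UNEXPECTED_FRAME.
  await Promise.all([publish(1), publish(2)]);
  console.log(wire.map((f) => `ch${f.channel}:${f.kind}`).join(" "));
  // Typical output: ch1:method ch2:method ch1:header ch2:header ch1:body ch2:body
}

main();
```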

One way, I guess, if users won't or can't await, is that this library won't either, or will just await the last sent frame. Node.js will continue to buffer data internally until the socket is drained (increasing memory usage, potentially until the application crashes): https://nodejs.org/api/stream.html#writablewritechunk-encoding-callback
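
A rough sketch of that second idea, writing every frame of one publish back-to-back and awaiting only the last write (an illustration of the approach, not the code from this library or from PR #123):

```ts
import type { Socket } from "node:net";

// socket.write() buffers data internally even when the socket is busy, so writing
// all frames of one publish consecutively keeps them contiguous on the wire, while
// awaiting only the final write still gives the caller some backpressure.
function sendPublishFrames(socket: Socket, frames: Buffer[]): Promise<void> {
  return new Promise((resolve, reject) => {
    if (frames.length === 0) return resolve();
    for (let i = 0; i < frames.length; i++) {
      if (i === frames.length - 1) {
        socket.write(frames[i], (err) => (err ? reject(err) : resolve()));
      } else {
        socket.write(frames[i]); // queued by Node even if write() returns false
      }
    }
  });
}
```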

@cressie176

Hi @carlhoerberg

If I understand the clarification of the original issue, the problem occurs during a load test with multiple clients. I imagine the poster means multiple HTTP clients were hitting their application during the test, which in turn published messages to RabbitMQ. The application under test correctly used await but still encountered the problem. The demo application deliberately omitted await in order to demonstrate the issue.

carlhoerberg added a commit that referenced this issue Sep 16, 2024
Even if the user doesn't await basicPublish, all frames belonging to one
publish should be published together. So instead of waiting for a
potentially blocked socket to drain, enqueue all data and only await
the last sent frame.

Fixes #49
@carlhoerberg

Right, I think PR #123 should solve that.


vinayakakv commented Sep 16, 2024

Thank you @carlhoerberg for the quick turnaround! Will test the PR and let you know!
