This repository has been archived by the owner on Oct 15, 2024. It is now read-only.

Use HTTP when adding orders to mesh #111

Merged
merged 7 commits into from
Feb 28, 2020

Conversation

Contributor

@xianny xianny commented Feb 19, 2020

This should fix the "message too big" error that we are getting from Mesh. It chunks orders whenever we add orders to Mesh. Chunking already existed in meshUtils, so it's a tiny code change.

Context

We've been getting this error recently, especially on start-up when syncing the orderbook with Mesh:

{
    "err_string": "websocket: close 1009 (message too big): Frame size of 1984660 bytes exceeds maximum accepted frame size",
    "level": "error",
    "msg": "rpcSub returned an error",
    "myPeerID": "16Uiu2HAm1Z7Z2WTRtgPWkhvfLgbLELKd8YsMyE8iaKXYZRoJdHd7",
    "time": "2020-02-06T21:39:59Z"
}

This error is caused by sending too many orders at once to Mesh. We need to always use chunking when calling meshClient.addOrdersAsync. The underlying cause is a bug in the websocket client.

Discussed with @fabioberger who suggested the best way to deal with it is to add the chunking on the 0x-api side.
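
For illustration, a minimal sketch of the chunking step described above. The `chunk` helper here is a hypothetical stand-in, not the one in meshUtils, and the chunk size is whatever the caller passes in.

```typescript
// Hypothetical helper: split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
    if (!Number.isInteger(size) || size <= 0) {
        throw new Error('chunk size must be a positive integer');
    }
    const chunks: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        chunks.push(items.slice(i, i + size));
    }
    return chunks;
}
```

Each resulting chunk would then be passed to meshClient.addOrdersAsync on its own, so that no single websocket message carries the full order list.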

Tests

Deployed on staging, smoke tests are passing

@dekz
Member

dekz commented Feb 19, 2020

We are already chunking on a sync, see here: https://github.com/0xProject/0x-api/blob/master/src/services/order_watcher_service.ts#L26 so I thought we'd already be covered by this. I have a gut feeling that something more peculiar is happening on a disconnect.

When I last tested this I was dropping the Mesh container and restarting it; this forced a reconnect, which then showed errors. The underlying connection can report some debug info if DEBUG="*" is enabled, which could help diagnose the problem.

@fragosti
Contributor

What @dekz said. Have you looked into using the HTTP endpoint?

@xianny
Contributor Author

xianny commented Feb 19, 2020

I'd like to try deploying this fix and monitoring for a couple of days. Our writeup in Quip (link) suggests that right now this happens occasionally when POST sra/v3/orders is used. This PR adds chunking for the handler for that endpoint specifically.

I think it's also important to figure out whether chunking everywhere fixes it. If it doesn't, that indicates an unknown bug in the client.

@dekz I'm with you on funky disconnects. I did some testing with the unit tests in https://github.com/0xProject/0x-mesh and noticed weird behaviour on disconnects with the client there too, but it might have just been my local env.

@fragosti
Contributor

@xianny that /orders endpoint has only been out for like a week.

@xianny
Contributor Author

xianny commented Feb 21, 2020

Went down a bit of a rabbit hole examining the websocket implementation in Mesh and web3.js and trying to rule out bugs there.

The most likely cause of this problem (despite currently chunking by 100 orders at a time) is inconsistent order size, e.g. ERC1155 orders. There's no chunking built into the Mesh client, and it also doesn't make use of the fragmentation in the 'websocket' implementation (theturtle32/WebSocket-Node#359), so the frame is just dropped if it's over 1MB. For performance reasons we don't want to increase the max frame size.

Long-term, we should try to figure out how to use fragmentation in the Mesh client, or add automatic chunking by size in bytes.

For now we can switch to using the HTTP endpoint, which was implemented in Mesh v9 (0xProject/0x-mesh#658). Just waiting until #113 is merged.
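
As a rough sketch of what "chunking by size" could look like (not something this PR implements): the hypothetical helper below groups items so that the estimated JSON length of each chunk stays under a caller-supplied limit. It uses string length as the size estimate, which matches the byte count only when the serialized payload is ASCII-only; that is an assumption here.

```typescript
// Hypothetical helper: group items so each chunk's estimated JSON length stays under `maxChars`.
// String length equals byte length only for ASCII-only JSON (assumed in this sketch).
function chunkBySerializedSize<T>(items: T[], maxChars: number): T[][] {
    const chunks: T[][] = [];
    let current: T[] = [];
    let currentSize = 2; // the enclosing "[" and "]"
    for (const item of items) {
        const itemSize = JSON.stringify(item).length + 1; // +1 for a separating comma
        // Start a new chunk when adding this item would exceed the limit.
        if (current.length > 0 && currentSize + itemSize > maxChars) {
            chunks.push(current);
            current = [];
            currentSize = 2;
        }
        current.push(item);
        currentSize += itemSize;
    }
    if (current.length > 0) {
        chunks.push(current);
    }
    return chunks;
}
```

Note that a single item larger than the limit still ends up in a chunk of its own, so an oversized order would need separate handling.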

@xianny xianny force-pushed the fix/msg-too-big branch 3 times, most recently from 5b94b5b to 0b11122 on February 25, 2020 22:08
@xianny
Contributor Author

xianny commented Feb 25, 2020

deploy staging

@xianny
Contributor Author

xianny commented Feb 25, 2020

deploy kovan

@xianny
Contributor Author

xianny commented Feb 26, 2020

Ready for review again!

Changes

Adding orders to mesh should now only be done through meshUtils.addOrdersToMeshAsync. This method routes through websocket OR http depending on how many orders are being sent. The batch size threshold has a default value set in constants.ts; please review.
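
A minimal sketch of the routing idea, under stated assumptions: the threshold value, the `Order` type, and all function names below are hypothetical and only illustrate the "websocket for small batches, HTTP for large ones" rule; the actual default is the one defined in constants.ts.

```typescript
// Hypothetical stand-in for the project's order type.
type Order = { salt: string };
type Transport = 'websocket' | 'http';
type Sender = (orders: Order[]) => Promise<void>;

// Illustrative value only; the real default is defined in constants.ts.
const ASSUMED_BATCH_THRESHOLD = 50;

// Small batches stay on the websocket; larger ones go over HTTP.
function chooseTransport(orderCount: number, threshold: number = ASSUMED_BATCH_THRESHOLD): Transport {
    return orderCount > threshold ? 'http' : 'websocket';
}

// Dispatch to whichever sender matches the chosen transport.
async function addOrdersRoutedAsync(
    orders: Order[],
    senders: Record<Transport, Sender>,
    threshold: number = ASSUMED_BATCH_THRESHOLD,
): Promise<Transport> {
    const transport = chooseTransport(orders.length, threshold);
    await senders[transport](orders);
    return transport;
}
```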

The implementation of meshUtils.addOrdersViaHttp mirrors the implementation in WSClient (Mesh) and WebsocketProvider (web3.js). It's all JSON-RPC transformations.
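
To make the "JSON-RPC transformations" concrete, here is a hedged sketch of building the request body and unwrapping the response. The method name `mesh_addOrders` and the `[orders, { pinned }]` params shape come from the diff in this PR; the function and interface names are hypothetical, and the response handling follows the generic JSON-RPC 2.0 result/error convention rather than anything Mesh-specific.

```typescript
interface JsonRpcRequest {
    jsonrpc: '2.0';
    id: number;
    method: string;
    params: unknown[];
}

interface JsonRpcResponse<T> {
    result?: T;
    error?: { code: number; message: string };
}

// Hypothetical helper: wrap a batch of orders in a JSON-RPC 2.0 request for mesh_addOrders.
function buildAddOrdersRequest(orders: unknown[], pinned: boolean, id: number = Date.now()): JsonRpcRequest {
    return { jsonrpc: '2.0', id, method: 'mesh_addOrders', params: [orders, { pinned }] };
}

// Hypothetical helper: return the result, or throw if the response carries an error.
function unwrapJsonRpcResponse<T>(response: JsonRpcResponse<T>): T {
    if (response.error !== undefined) {
        throw new Error(`JSON-RPC error ${response.error.code}: ${response.error.message}`);
    }
    if (response.result === undefined) {
        throw new Error('JSON-RPC response has neither result nor error');
    }
    return response.result;
}
```

The serialized request would be sent as the body of an HTTP POST to the Mesh node's JSON-RPC endpoint.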

Testing

I deployed this on Kovan and successfully posted 10 orders (websocket) and 60 orders (http) via POST /sra/v3/orders. The code path for orderbook syncing is the same as for posting orders, so this should resolve the issue of dropping frames when syncing the orderbook.

Potential issues

I don't know what the effects of chunking and sending multiple requests are. Sometimes orderbook syncing has > 1000 orders, which means more than 10 HTTP requests of up to 100 orders each. I think this should be OK, but am flagging it as an unknown.

Will need https://github.com/0xProject/0x-main-infra/pull/176 to deploy

@xianny xianny requested review from fragosti and dekz February 26, 2020 00:30
@xianny xianny changed the title from "Use chunking when adding orders to mesh" to "Use HTTP when adding orders to mesh" Feb 26, 2020
@xianny
Contributor Author

xianny commented Feb 26, 2020

deploy kovan

README.md Outdated
| `CHAIN_ID` | Required. No default. | The chain id you'd like your API to run on (e.g: `1` -> mainnet, `42` -> Kovan, `3` -> Ropsten, `1337` -> Ganache). Defaults to `42` in the API, but required for `docker-compose up`. |
| `ETHEREUM_RPC_URL` | Required. No default. | The URL used to issue JSON RPC requests. Use `http://ganache:8545` to use the local ganache instance. |
| `MESH_WEBSOCKET_URI` | Required. Default for dev: `ws://localhost:60557` | The URL pointing to the 0x Mesh node. A default node is spun up in `docker-compose up` |
| `MESH_HTTP_URI` | Optional. Used when syncing the orderbook; defaults to websocket connection if not provided. | The URL pointing to the Mesh node's HTTP JSON-RPC endpoint. A default node is spun up in `docker-compose up` |
Contributor

Nit: I would just say "Optional. No default." and then put the description part in the description column.

src/config.ts Outdated
@@ -40,6 +40,7 @@ export const ETHEREUM_RPC_URL = assertEnvVarType('ETHEREUM_RPC_URL', process.env
export const MESH_WEBSOCKET_URI = _.isEmpty(process.env.MESH_WEBSOCKET_URI)
    ? 'ws://localhost:60557'
    : assertEnvVarType('MESH_WEBSOCKET_URI', process.env.MESH_WEBSOCKET_URI, EnvVarType.Url);
export const MESH_HTTP_URI = process.env.MESH_HTTP_URI;
Contributor

You should still use the assertEnvVarType function if the value isn't falsy, right?
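
One way to read this suggestion, sketched with a hypothetical helper rather than the project's own assertEnvVarType (whose implementation is not shown here): treat an empty value as "not configured" and validate anything else. The scheme-only regular expression is a deliberately simple assumption, not a full URL check.

```typescript
// Hypothetical helper: return undefined when the variable is unset or blank,
// otherwise require an http(s) URL and return it unchanged.
function parseOptionalHttpUri(name: string, raw: string | undefined): string | undefined {
    if (raw === undefined || raw.trim() === '') {
        return undefined;
    }
    // Simplified check (assumption): only the scheme and absence of whitespace are verified.
    if (!/^https?:\/\/\S+$/.test(raw)) {
        throw new Error(`${name} must be an http(s) URL, found "${raw}"`);
    }
    return raw;
}
```

In a config module this would be called with the variable name and its value from the environment, so a misconfigured URI fails at start-up instead of at the first request.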

Comment on lines 11 to 13
if (_.isEmpty(this.httpURI)) {
    return super.addOrdersAsync(orders, pinned);
} else {
Contributor

It should just fail, right? The function name here is misleading if there is a case where it would post over websockets. You could also rename the function.

Contributor Author

renaming function

Comment on lines 16 to 22
chunks.forEach(async chunk => {
    // format request payload
    const data = {
        jsonrpc: '2.0',
        id: +new Date(),
        method: 'mesh_addOrders',
        params: [chunk, { pinned }],
Contributor

Do we need batching logic for HTTP?
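
If batching over HTTP is kept, one detail in the snippet under review is worth noting: `Array.prototype.forEach` ignores the promises returned by an `async` callback, so the caller cannot wait for the requests to finish and a rejection inside the callback does not propagate to it. A minimal alternative sketch, assuming a hypothetical `postChunk` function that sends one chunk:

```typescript
// Hypothetical signature for whatever sends a single chunk over HTTP.
type PostChunk<T> = (chunk: T[]) => Promise<void>;

// Send chunks one after another; the returned promise settles only after every
// chunk has been posted, and rejects as soon as one of them fails.
async function postChunksSequentially<T>(chunks: T[][], postChunk: PostChunk<T>): Promise<void> {
    for (const chunk of chunks) {
        await postChunk(chunk);
    }
}
```

If the requests may run concurrently, `await Promise.all(chunks.map(postChunk))` is the usual alternative, at the cost of sending every request at once.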

@xianny xianny merged commit bfd3527 into master Feb 28, 2020
@xianny xianny deleted the fix/msg-too-big branch February 28, 2020 18:48
@github-actions

🎉 This PR is included in version 1.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

4 participants