Socket hang up while uploading sourcemaps #1129

Open
mychaelgo opened this issue Dec 4, 2023 · 14 comments · Fixed by #1180
Labels
bug (Something isn't working), source-code-integration (Related to [git-metadata])

Comments

mychaelgo commented Dec 4, 2023

Bug description

Source map upload fails with an error, even for files smaller than 500K
Screenshot 2023-12-04 at 14 46 25

# List the source maps in the build directory
du -h build/static/js/*.map

56.0K	build/static/js/1.70c9dc61.chunk.js.map
16.0K	build/static/js/1026.d629a7dd.chunk.js.map
4.0K	build/static/js/1039.2ff430f0.chunk.js.map
20.0K	build/static/js/1062.5aea4764.chunk.js.map
376.0K	build/static/js/1235.02d92438.chunk.js.map
4.0K	build/static/js/1376.b953db64.chunk.js.map
76.0K	build/static/js/1406.a7cf9fed.chunk.js.map
100.0K	build/static/js/1410.6dd2b10b.chunk.js.map
4.0K	build/static/js/1411.49d77a9e.chunk.js.map
804.0K	build/static/js/1436.5b6a2334.chunk.js.map
108.0K	build/static/js/1510.978b7d6a.chunk.js.map
12.0K	build/static/js/1535.b4728b50.chunk.js.map
40.0K	build/static/js/1631.de4a17e5.chunk.js.map
88.0K	build/static/js/1690.05ce4a2e.chunk.js.map
56.0K	build/static/js/1745.b9831f5b.chunk.js.map
8.0K	build/static/js/1773.2079c7ce.chunk.js.map
32.0K	build/static/js/1802.c7d66a91.chunk.js.map
4.0K	build/static/js/1814.ac02fc4f.chunk.js.map
4.0K	build/static/js/1817.76e878cd.chunk.js.map
284.0K	build/static/js/1820.74ed522c.chunk.js.map
4.0K	build/static/js/1851.24fd3196.chunk.js.map
76.0K	build/static/js/1871.684c27d2.chunk.js.map
396.0K	build/static/js/1908.cea29a08.chunk.js.map
108.0K	build/static/js/1915.b5e8f2d3.chunk.js.map
40.0K	build/static/js/193.d6ded0ee.chunk.js.map
64.0K	build/static/js/1959.94f0d86b.chunk.js.map
28.0K	build/static/js/2014.db213200.chunk.js.map
1.4M	build/static/js/2076.332b43c7.chunk.js.map
4.0K	build/static/js/212.c2f81b7f.chunk.js.map
160.0K	build/static/js/2245.d3c6fb67.chunk.js.map
48.0K	build/static/js/2258.6e69318b.chunk.js.map
8.0K	build/static/js/2285.bb48db85.chunk.js.map
4.0K	build/static/js/2403.d998609e.chunk.js.map
152.0K	build/static/js/2463.baaa35db.chunk.js.map
8.0K	build/static/js/2487.a2220685.chunk.js.map
40.0K	build/static/js/2537.209c5a7d.chunk.js.map
12.0K	build/static/js/2544.660baf40.chunk.js.map
124.0K	build/static/js/2592.a4d6bff6.chunk.js.map
8.0K	build/static/js/2593.1624ef93.chunk.js.map
24.0K	build/static/js/2602.668b6959.chunk.js.map
4.0K	build/static/js/2717.8441d59e.chunk.js.map
4.0K	build/static/js/274.10993e31.chunk.js.map
4.0K	build/static/js/2847.f891f160.chunk.js.map
60.0K	build/static/js/2874.3b934815.chunk.js.map
32.0K	build/static/js/2890.5c46f2ae.chunk.js.map
16.0K	build/static/js/3135.7d0e2ae5.chunk.js.map
4.0K	build/static/js/3145.04636504.chunk.js.map
4.0K	build/static/js/3157.8301a3aa.chunk.js.map
16.0K	build/static/js/3161.84581ad8.chunk.js.map
36.0K	build/static/js/3202.e66f94ab.chunk.js.map
20.0K	build/static/js/3207.2536ef96.chunk.js.map
44.0K	build/static/js/3316.b3bfbb39.chunk.js.map
372.0K	build/static/js/3442.96e9f1bf.chunk.js.map
12.0K	build/static/js/3445.e25a1da9.chunk.js.map
4.0K	build/static/js/3498.1ef6fa0a.chunk.js.map
32.0K	build/static/js/3592.341a18bd.chunk.js.map
64.0K	build/static/js/3594.f24f0c1c.chunk.js.map
4.0K	build/static/js/3625.260409ab.chunk.js.map
124.0K	build/static/js/3643.e526bf4a.chunk.js.map
52.0K	build/static/js/3646.bb874758.chunk.js.map
4.0K	build/static/js/3706.c381f4ed.chunk.js.map
16.0K	build/static/js/3726.83a14433.chunk.js.map
4.0K	build/static/js/3823.a140d01e.chunk.js.map
296.0K	build/static/js/3935.fe843014.chunk.js.map
6.9M	build/static/js/3943.5645ddf0.chunk.js.map
8.0K	build/static/js/3974.d599deaa.chunk.js.map
208.0K	build/static/js/400.6a080db6.chunk.js.map
28.0K	build/static/js/4074.55b3a1d0.chunk.js.map
84.0K	build/static/js/4088.421141e1.chunk.js.map
4.0K	build/static/js/4183.fd0d401e.chunk.js.map
80.0K	build/static/js/4190.c14909eb.chunk.js.map
12.0K	build/static/js/4195.179f747c.chunk.js.map
12.0K	build/static/js/4349.fe97b90b.chunk.js.map
108.0K	build/static/js/4364.91479dba.chunk.js.map
8.0K	build/static/js/4365.2968eeb4.chunk.js.map
20.0K	build/static/js/4389.00801080.chunk.js.map
24.0K	build/static/js/4569.51c4339e.chunk.js.map
8.0K	build/static/js/4584.6a0b19e6.chunk.js.map
28.0K	build/static/js/4610.89b0def6.chunk.js.map
36.0K	build/static/js/4637.017314c8.chunk.js.map
88.0K	build/static/js/464.53463562.chunk.js.map
4.0K	build/static/js/4657.e82a1907.chunk.js.map
4.0K	build/static/js/4733.0049af43.chunk.js.map
4.0K	build/static/js/4748.ee91ff22.chunk.js.map
4.0K	build/static/js/4756.05ae94ad.chunk.js.map
24.0K	build/static/js/4768.b26b6ede.chunk.js.map
12.0K	build/static/js/4807.8b596151.chunk.js.map
8.0K	build/static/js/4828.605893b4.chunk.js.map
128.0K	build/static/js/4846.13c82be7.chunk.js.map
8.0K	build/static/js/4879.ce137c3b.chunk.js.map
60.0K	build/static/js/4886.1deeb569.chunk.js.map
36.0K	build/static/js/4890.acdbf454.chunk.js.map
60.0K	build/static/js/4893.ab54c02f.chunk.js.map
124.0K	build/static/js/4944.66aa7825.chunk.js.map
32.0K	build/static/js/4960.2ce28a41.chunk.js.map
128.0K	build/static/js/4969.eb879c81.chunk.js.map
180.0K	build/static/js/4982.38528b8d.chunk.js.map
72.0K	build/static/js/4987.b5b6cbff.chunk.js.map
8.0K	build/static/js/5041.1e883c68.chunk.js.map
124.0K	build/static/js/5049.30dc262a.chunk.js.map
20.0K	build/static/js/5080.29bf85f5.chunk.js.map
88.0K	build/static/js/5132.5ee55c8b.chunk.js.map
40.0K	build/static/js/5168.f9435084.chunk.js.map
28.0K	build/static/js/5323.3f863dfb.chunk.js.map
88.0K	build/static/js/5332.83fe56e3.chunk.js.map
68.0K	build/static/js/5339.446b0e09.chunk.js.map
80.0K	build/static/js/5409.8d1ddee6.chunk.js.map
32.0K	build/static/js/5462.2da902bf.chunk.js.map
4.0K	build/static/js/5517.9418d24f.chunk.js.map
56.0K	build/static/js/5554.a8d0ab12.chunk.js.map
48.0K	build/static/js/5562.4ab399bf.chunk.js.map
220.0K	build/static/js/5580.4d9ec847.chunk.js.map
48.0K	build/static/js/5598.0b5d3e60.chunk.js.map
76.0K	build/static/js/5651.fce74122.chunk.js.map
24.0K	build/static/js/5695.450c4bce.chunk.js.map
160.0K	build/static/js/5697.5f683745.chunk.js.map
4.0K	build/static/js/573.19568373.chunk.js.map
28.0K	build/static/js/5730.0514b984.chunk.js.map
4.0K	build/static/js/5794.7acc9a64.chunk.js.map
40.0K	build/static/js/5831.42f2cb1f.chunk.js.map
4.0K	build/static/js/5849.8bc8c482.chunk.js.map
84.0K	build/static/js/5887.f2ecf2e0.chunk.js.map
124.0K	build/static/js/5935.a9cf4493.chunk.js.map
4.0K	build/static/js/601.27a3723e.chunk.js.map
120.0K	build/static/js/6024.f1372195.chunk.js.map
4.0K	build/static/js/6046.47b5bcbb.chunk.js.map
224.0K	build/static/js/6066.4d38f99c.chunk.js.map
4.0K	build/static/js/6110.0aedd425.chunk.js.map
76.0K	build/static/js/6145.00df0082.chunk.js.map
4.0K	build/static/js/6147.b53c4f70.chunk.js.map
4.0K	build/static/js/6156.c0539c86.chunk.js.map
52.0K	build/static/js/616.027fc503.chunk.js.map
8.0K	build/static/js/6182.8a8ad87c.chunk.js.map
36.0K	build/static/js/629.a655c359.chunk.js.map
20.0K	build/static/js/6299.813e4e3f.chunk.js.map
32.0K	build/static/js/6309.ef4269f7.chunk.js.map
16.0K	build/static/js/6324.17753013.chunk.js.map
84.0K	build/static/js/6326.4855247a.chunk.js.map
4.0K	build/static/js/6397.5de0e694.chunk.js.map
4.0K	build/static/js/6413.d694ed6f.chunk.js.map
68.0K	build/static/js/6443.563fcfc0.chunk.js.map
8.0K	build/static/js/6557.af3d71b6.chunk.js.map
4.0K	build/static/js/6581.7db204dd.chunk.js.map
4.0K	build/static/js/6666.cda6ddca.chunk.js.map
268.0K	build/static/js/6671.091abe0d.chunk.js.map
40.0K	build/static/js/6674.358d40f3.chunk.js.map
84.0K	build/static/js/6686.f3dab0b8.chunk.js.map
4.0K	build/static/js/6742.5cf877a7.chunk.js.map
20.0K	build/static/js/6769.8655980e.chunk.js.map
36.0K	build/static/js/6810.add16b63.chunk.js.map
64.0K	build/static/js/6816.c0328bb7.chunk.js.map
652.0K	build/static/js/6831.09fd4e92.chunk.js.map
4.0K	build/static/js/6847.2d563d94.chunk.js.map
8.0K	build/static/js/6852.57099c68.chunk.js.map
152.0K	build/static/js/6872.742e0a1d.chunk.js.map
60.0K	build/static/js/6874.ff4a3923.chunk.js.map
8.0K	build/static/js/6892.7e6bc03b.chunk.js.map
8.0K	build/static/js/6972.3db02b1c.chunk.js.map
32.0K	build/static/js/7042.b762dff5.chunk.js.map
20.0K	build/static/js/7056.1aa38b30.chunk.js.map
28.0K	build/static/js/7095.ac1f9df2.chunk.js.map
1.5M	build/static/js/7221.607edef2.chunk.js.map
24.0K	build/static/js/7294.33701be9.chunk.js.map
28.0K	build/static/js/7389.f7821441.chunk.js.map
4.0K	build/static/js/7399.0f69c786.chunk.js.map
28.0K	build/static/js/7484.9e6b6274.chunk.js.map
68.0K	build/static/js/7573.c1a2c6db.chunk.js.map
76.0K	build/static/js/7576.8a59cdd2.chunk.js.map
296.0K	build/static/js/7587.9f9a615f.chunk.js.map
88.0K	build/static/js/7605.8d7bf747.chunk.js.map
40.0K	build/static/js/7650.59d9dd58.chunk.js.map
32.0K	build/static/js/766.f52089c7.chunk.js.map
28.0K	build/static/js/7686.32190382.chunk.js.map
4.0K	build/static/js/7695.90062550.chunk.js.map
8.0K	build/static/js/77.97e35e8c.chunk.js.map
68.0K	build/static/js/7703.23b296c1.chunk.js.map
44.0K	build/static/js/7739.be985c1c.chunk.js.map
4.0K	build/static/js/7760.da1f5dc3.chunk.js.map
36.0K	build/static/js/778.3aa82d4a.chunk.js.map
24.0K	build/static/js/7805.627d2f3c.chunk.js.map
144.0K	build/static/js/7814.ab2bb93f.chunk.js.map
124.0K	build/static/js/7825.23952a24.chunk.js.map
284.0K	build/static/js/7854.c6382dc3.chunk.js.map
216.0K	build/static/js/7972.535d04ee.chunk.js.map
128.0K	build/static/js/8001.88a24752.chunk.js.map
52.0K	build/static/js/8002.27f9b490.chunk.js.map
684.0K	build/static/js/8020.8e09cdc9.chunk.js.map
4.0K	build/static/js/8036.84a9c01b.chunk.js.map
4.0K	build/static/js/8096.a4ef43a6.chunk.js.map
12.0K	build/static/js/8110.e87da554.chunk.js.map
16.0K	build/static/js/8122.ab499de3.chunk.js.map
4.0K	build/static/js/8123.97497c17.chunk.js.map
12.0K	build/static/js/8187.069fcddb.chunk.js.map
196.0K	build/static/js/8188.a43323d3.chunk.js.map
32.0K	build/static/js/8263.f917a27c.chunk.js.map
36.0K	build/static/js/8278.547e26a8.chunk.js.map
4.0K	build/static/js/837.bf4c6bce.chunk.js.map
48.0K	build/static/js/8376.998ebc27.chunk.js.map
4.0K	build/static/js/8411.eaf6e065.chunk.js.map
40.0K	build/static/js/8436.33cd090f.chunk.js.map
32.0K	build/static/js/8464.ddd63d19.chunk.js.map
260.0K	build/static/js/8467.561b23bc.chunk.js.map
32.0K	build/static/js/8480.4f2d7d1b.chunk.js.map
40.0K	build/static/js/8524.8fe07903.chunk.js.map
72.0K	build/static/js/8591.bd9af022.chunk.js.map
24.0K	build/static/js/8595.23002a78.chunk.js.map
44.0K	build/static/js/8607.ea49a0db.chunk.js.map
4.0K	build/static/js/8729.f40d2329.chunk.js.map
4.0K	build/static/js/8732.fd772b8e.chunk.js.map
4.0K	build/static/js/8758.b5d6cd9a.chunk.js.map
180.0K	build/static/js/8776.f742f32c.chunk.js.map
4.0K	build/static/js/878.6ae8f5ae.chunk.js.map
20.0K	build/static/js/8805.bcf4ab5f.chunk.js.map
16.0K	build/static/js/8851.ac33ac5c.chunk.js.map
36.0K	build/static/js/8854.2c621baa.chunk.js.map
44.0K	build/static/js/8866.e00200c9.chunk.js.map
88.0K	build/static/js/8937.1cfe8ae4.chunk.js.map
52.0K	build/static/js/8976.8c601738.chunk.js.map
200.0K	build/static/js/9016.35feab01.chunk.js.map
12.0K	build/static/js/9020.e0d754b8.chunk.js.map
132.0K	build/static/js/9057.fd0737c8.chunk.js.map
68.0K	build/static/js/9079.7310dd7c.chunk.js.map
4.0K	build/static/js/9155.7ad744fc.chunk.js.map
12.0K	build/static/js/9179.c6b394f4.chunk.js.map
332.0K	build/static/js/9214.e8d008c6.chunk.js.map
32.0K	build/static/js/923.83ca250e.chunk.js.map
4.0K	build/static/js/93.f2b31674.chunk.js.map
52.0K	build/static/js/9317.23dcbdff.chunk.js.map
16.0K	build/static/js/9365.c6a3962e.chunk.js.map
108.0K	build/static/js/9466.c55e29d8.chunk.js.map
88.0K	build/static/js/9482.c9df5356.chunk.js.map
28.0K	build/static/js/9484.13a55282.chunk.js.map
20.0K	build/static/js/9517.f6fa6612.chunk.js.map
148.0K	build/static/js/9533.cd0e09a7.chunk.js.map
60.0K	build/static/js/9585.48ff0fb2.chunk.js.map
24.0K	build/static/js/9623.861fabe6.chunk.js.map
8.0K	build/static/js/9671.cf382f45.chunk.js.map
4.9M	build/static/js/9736.436e5c30.chunk.js.map
112.0K	build/static/js/9745.c9706529.chunk.js.map
60.0K	build/static/js/9801.32db900d.chunk.js.map
4.0K	build/static/js/9832.a5fde5ec.chunk.js.map
72.0K	build/static/js/9857.622d1b30.chunk.js.map
32.0K	build/static/js/9880.3a615e39.chunk.js.map
104.0K	build/static/js/9971.2e967695.chunk.js.map
256.0K	build/static/js/9994.a4b0f00c.chunk.js.map
24.0K	build/static/js/9999.b2d248fe.chunk.js.map
92.0K	build/static/js/index.69e25fda.js.map
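
The listing above is long, and a few entries (for example the 6.9M and 4.9M chunks) sit well above the 500K mark. A minimal sketch for pulling out the outliers, assuming a POSIX-like shell with `find`, `du`, and `sort` available; `list_large_maps` is a hypothetical helper name, not part of datadog-ci:

```shell
# Hypothetical helper: list source maps larger than a threshold, biggest first.
# $1 = directory to scan, $2 = threshold in KiB (defaults to 500).
list_large_maps() {
  local dir="$1" threshold_kb="${2:-500}"
  # -size +Nk matches files larger than N KiB; du -k prints "<KiB><TAB><path>".
  find "$dir" -type f -name '*.map' -size +"${threshold_kb}"k -exec du -k {} + | sort -rn
}

# Example (not executed here):
#   list_large_maps build/static/js 500
```

This only reports disk usage; it says nothing about why an upload fails, but it makes it easy to see whether the failing files are the large ones.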

Describe what you expected

All sourcemaps uploaded without any errors

Steps to reproduce the issue

./node_modules/.bin/datadog-ci sourcemaps upload ./build --service my-service --release-version $version --minified-path-prefix / --max-concurrency 15

Additional context

> node -v
v20.10.0

"@datadog/datadog-ci": "2.24.1"

Command

sourcemaps

@mychaelgo mychaelgo added the bug Something isn't working label Dec 4, 2023
@github-actions github-actions bot added the source-code-integration Related to [git-metadata] label Dec 4, 2023
@BenoitZugmeyer
Member

Hi @mychaelgo. How often are you experiencing this issue? It looks like this happens because the upload takes too much time. It might be a temporary connectivity issue on your side, for example on a network with very low bandwidth.

@mattlewis92

@BenoitZugmeyer we're experiencing this issue as well. We can reproduce it consistently when uploading thousands of sourcemaps: it's fast for the first minute, then everything starts failing and retrying. I suspect the Datadog backend is either rate limiting our CI or failing under the volume of requests made to upload all the sourcemaps for our application. In the logs we see a combination of "socket hang up" and "Request failed with status code 408".

@BenoitZugmeyer
Member

Thank you for your feedback. I investigated a bit and found something that might help improve the situation. Stay tuned

@BenoitZugmeyer
Member

@mattlewis92 is this an issue you started experiencing recently, or have you always experienced it? Does your CI run on Azure/GCP/AWS?

@mattlewis92

It has only been happening to us since January 5th, but that build also increased the number of sourcemaps we upload, so it's possible the issue has always been there and we've just hit an upper limit where it starts to trigger. This would make sense, as the problem only starts to occur towards the last batch of sourcemaps uploaded.

Our CI runs on GitHub Actions, so requests come from an Azure data center under the hood, as we are not using self-hosted runners.

@BenoitZugmeyer
Member

Thank you for the information.

I could not reproduce the issue, even after uploading thousands of sourcemaps at once.

I released #1158 as part of v2.28.0, which might improve the situation. Could you give it a try?

@mattlewis92

Sure thing! Our next prod release isn't for another week, so will check in then and let you know if the issue is gone!

@mattlewis92

I took a look at our logs today after upgrading @datadog/datadog-ci to 2.28.0, but am still seeing the socket hang up message. It definitely only seems to happen right at the end of the upload, after we uploaded over 3000 sourcemaps.

@BenoitZugmeyer
Member

Ok, thank you for trying it out.

Maybe with lower concurrency, requests will finish more quickly, preventing timeouts? Could you try to run the upload command with the --max-concurrency 4 flag? (default is 20)

datadog-ci sourcemaps upload --max-concurrency 4 ...

@mattlewis92

Sure thing, will try that and let you know how it goes!

@mattlewis92

Setting max-concurrency=4 had no effect: the upload step took about 9 minutes and there were still a number of socket hang up errors:
CleanShot 2024-01-26 at 11 31 50

Then I tried deleting a few hundred sourcemaps to bring the total count under 3000, and everything got fast, with all of them uploaded in under a minute:
CleanShot 2024-01-26 at 11 32 27

The problem only started after we crossed the 3000 count for sourcemaps, so I'm reasonably confident there is a hard limit somewhere on the Datadog side. Either the client is not closing connections after upload, or the backend API that accepts the uploads has a firewall rule that starts blocking the requests because it thinks GitHub Actions is trying to DDoS the endpoint.
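
To check where a build sits relative to that observed threshold, a rough sketch could count the source maps before uploading. The 3000 figure is only what was observed in this thread, not a documented limit, and `count_maps` and `warn_if_many_maps` are hypothetical helper names:

```shell
# Hypothetical helper: count source maps under a directory.
count_maps() {
  find "$1" -type f -name '*.map' | wc -l | tr -d '[:space:]'
}

# Hypothetical helper: warn (and return 1) when the count exceeds a limit.
# The default of 3000 is the count observed in this thread, not an official limit.
warn_if_many_maps() {
  local dir="$1" limit="${2:-3000}" count
  count=$(count_maps "$dir")
  if [ "$count" -gt "$limit" ]; then
    echo "warning: $count source maps, more than $limit" >&2
    return 1
  fi
  echo "$count source maps (limit $limit)"
}

# Example (not executed here):
#   warn_if_many_maps ./build 3000
```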

@BenoitZugmeyer
Member

Let's keep this issue open until we have confirmation that it is indeed fixed.

@BenoitZugmeyer BenoitZugmeyer reopened this Feb 8, 2024
@BenoitZugmeyer
Member

Each request has a 1 minute timeout on our side, which is why we are seeing some HTTP requests fail with status 408. When such a request was retried, an implementation bug caused the retried request to never end, which produced the socket hang up errors.

We released a fix for retrying uploads in v2.30.0. We won't increase the 1 minute timeout for now as it has security implications. We might revisit later.

As your source map files are relatively small, retrying the upload after a timeout should work now that it is fixed. If it's still unreliable and you are still seeing failing HTTP requests, using a lower concurrency might help upload individual source maps faster. Could you try it again and let me know if it improves the situation?

Sorry for the inconvenience and the back and forth!
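
For CI setups that still see occasional failures, one possible approach is to retry the whole upload step from the calling script. This is a minimal sketch, separate from the retry logic inside datadog-ci itself; `retry` is a hypothetical helper, and it assumes the wrapped command exits non-zero when an upload fails, which this thread does not confirm:

```shell
# Hypothetical helper: run a command up to N times, doubling the delay each time.
# $1 = max attempts, $2 = initial delay in seconds, rest = command to run.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Example (not executed here), reusing the flags from the report above:
#   retry 3 10 ./node_modules/.bin/datadog-ci sourcemaps upload ./build \
#     --service my-service --release-version "$version" \
#     --minified-path-prefix / --max-concurrency 4
```

Retrying the whole command uploads every file again, so it trades extra upload time for reliability.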

@mattlewis92

mattlewis92 commented Feb 12, 2024

Thanks for digging into this!

After upgrading to 2.30.0 the situation is a bit better: all the socket hang up messages are gone and the upload time has dropped from ~9m to ~5m, although a handful of files still fail to upload:
CleanShot 2024-02-12 at 12 14 13@2x

Compared to the run where I deleted a few hundred sourcemaps, it's still considerably slower:
image

Will try lowering max-concurrency and see if that makes any difference.

Edit: setting max-concurrency=4 resulted in no errors, but the upload took close to 9 minutes this time; max-concurrency=15 resulted in some upload errors, with the upload taking about 5 minutes.
