Unsock'ed NGINX reverse proxy server issue #2
Comments
Hi @Bert-Proesmans, thanks for reporting! It looks like nginx clears all environment variables before launching its worker processes; see https://nginx.org/en/docs/ngx_core_module.html. Try setting "env UNSOCK_DIR=/tmp/unsockets;" or similar in nginx.conf. I haven't tried this, though. The nginx documentation mentions that setting variables this way may take effect too late (our constructor may be called before that). Right now, we configure everything upon process initialization. Deferring that until UNSOCK_DIR etc. become available at some arbitrary later time would be possible in theory, but would come with a significant increase in complexity, timing problems, etc. Please let me know if this gets you further. Also consider filing a bug against nginx.
Thanks for the swift reply! I'll test with the additional configuration, but I fail to see how whitelisting the environment variables would change the outcome, because there is proof that the socket type is replaced correctly: at least, data goes out through VSOCK and is received by the upstream service.
I was also thinking about asking the nginx devs about this case and have started figuring out how to contact them!
Me again! Whitelisting/setting environment variables for the nginx workers did not have any effect. I traced nginx, and after staring at the trace for a long time it's actually obvious what is happening:
The missing piece is that the new AF_VSOCK FD is never registered with epoll: nginx registered the old socket right after obtaining its file descriptor, and the replacement socket does not inherit that registration. That explains why the upstream response is never sent to the client and nginx times out waiting.
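The effect described above can be reproduced outside nginx. The following is a minimal sketch under stated assumptions, not code from nginx or unsock: plain Unix socket pairs stand in for the original socket and the AF_VSOCK replacement, `os.dup2` stands in for the fd replacement, and the helper name `events_after_fd_swap` is made up for this illustration. It requires Linux, since it uses epoll.

```python
import os
import select
import socket


def events_after_fd_swap(reregister: bool) -> list:
    """Swap the socket behind a watched fd and return what epoll reports."""
    old_a, old_b = socket.socketpair()  # stands in for the original socket
    new_a, new_b = socket.socketpair()  # stands in for the replacement socket
    ep = select.epoll()
    try:
        fd = old_a.fileno()
        # Register the fd before the swap, mirroring the ordering in the thread.
        ep.register(fd, select.EPOLLIN)

        # Replace the socket behind the same fd number. The old open file
        # description is closed, which silently drops it from the epoll set.
        os.dup2(new_a.fileno(), fd)

        if reregister:
            # The new file description has to be added explicitly.
            ep.register(fd, select.EPOLLIN)

        new_b.send(b"upstream response")  # data arrives on the new socket
        return ep.poll(0.2)
    finally:
        ep.close()
        for s in (old_a, old_b, new_a, new_b):
            s.close()
```

Calling `events_after_fd_swap(False)` yields no events even though data is waiting on the fd, which matches the timeout seen in nginx; with `reregister=True` the readable event is delivered.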
As for a quick fix: I'm currently looking at http://nginx.org/en/docs/events.html and checking against the code whether any of the other polling methods order their calls differently, so that my approach works without code changes. EDIT: Yes, the quick solution is to enable the select or poll event methods. See http://nginx.org/en/docs/events.html for more information, because these are not included in the build if epoll is available.
Nice find! In hindsight, my suggestion was wrong anyway: a missing UNSOCK_DIR environment variable would have caused the child process to exit with an error...
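A plausible explanation for why select and poll are unaffected (based on how these interfaces work in general, not verified against the nginx source here): both take the set of fd numbers on every call, so the kernel looks up whatever socket currently sits behind each number. The sketch below uses the same stand-ins as before (Unix socket pairs instead of AF_VSOCK, `os.dup2` for the replacement); the helper name `readable_after_fd_swap` is hypothetical.

```python
import os
import select
import socket


def readable_after_fd_swap() -> bool:
    """Return True if select() sees data on a socket swapped in via dup2."""
    old_a, old_b = socket.socketpair()  # stands in for the original socket
    new_a, new_b = socket.socketpair()  # stands in for the replacement socket
    try:
        fd = old_a.fileno()

        # Replace the socket behind the fd number, with no extra bookkeeping.
        os.dup2(new_a.fileno(), fd)

        new_b.send(b"upstream response")

        # select() receives the fd number on each call, so it checks the
        # socket that is behind that number right now.
        readable, _, _ = select.select([fd], [], [], 0.2)
        return readable == [fd]
    finally:
        for s in (old_a, old_b, new_a, new_b):
            s.close()
```

No re-registration step is needed here, in contrast to epoll, where the interest list lives in the kernel and is tied to the open file description that was originally added.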
To support epoll, unsock would have to add a wrapper for it and keep track of these changes. That's not trivial, but doable. Tragically, the kernel lacks support for querying the state of a file descriptor added to an epoll fd ("EPOLL_CTL_GET"), although this was proposed a long time ago. [1][2] (And even then, it would be great to get all epoll fds associated with that fd, which is yet another problem.) So right now, if you wanted to support this in unsock, you would not only have to keep track of the open epoll fds, but also of every registered/modified/deregistered file descriptor, and correctly re-register the new fd with all of these epoll fds after dup3.
[1] https://lists.linuxfoundation.org/pipermail/bugme-new/2004-August/010979.html
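The bookkeeping described above can be modelled in a few lines. This is only a sketch of the idea in Python, not unsock code: a real wrapper would have to live inside the unsock library and intercept the corresponding calls there. The class `EpollTracker` and its method names are hypothetical, and `os.dup2` stands in for dup3.

```python
import os
import select
import socket


class EpollTracker:
    """Hypothetical bookkeeping: which fd is registered with which epoll."""

    def __init__(self):
        # epoll fd number -> (epoll object, {watched fd: event mask})
        self._sets = {}

    def register(self, ep, fd, mask):
        ep.register(fd, mask)
        self._sets.setdefault(ep.fileno(), (ep, {}))[1][fd] = mask

    def modify(self, ep, fd, mask):
        ep.modify(fd, mask)
        self._sets[ep.fileno()][1][fd] = mask

    def unregister(self, ep, fd):
        ep.unregister(fd)
        self._sets[ep.fileno()][1].pop(fd, None)

    def replace_fd(self, new_fd, target_fd):
        """dup2 new_fd over target_fd, then restore its epoll registrations."""
        os.dup2(new_fd, target_fd)
        restored = 0
        for ep, watched in self._sets.values():
            if target_fd in watched:
                # Re-add the fd with the last mask we recorded for it.
                ep.register(target_fd, watched[target_fd])
                restored += 1
        return restored
```

A short usage example, again with Unix socket pairs as stand-ins:

```python
tracker = EpollTracker()
ep = select.epoll()
old_a, old_b = socket.socketpair()
new_a, new_b = socket.socketpair()
fd = old_a.fileno()

tracker.register(ep, fd, select.EPOLLIN)
tracker.replace_fd(new_a.fileno(), fd)  # registration is restored on the new socket
new_b.send(b"upstream response")
events = ep.poll(0.2)                   # the readable event is now delivered
```

The sketch assumes that every registration goes through the tracker and that nothing else holds the old socket open; both assumptions are part of what makes the real problem harder than this model suggests.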
You could also try adding a wrapper around
Also see https://nginx.org/en/docs/events.html You can try setting the following nginx config directive:
Hi, inspiring (great) tool!
I've been trying the library on nginx to redirect into VSOCK (starting from both AF_INET and AF_UNIX). It seems to work, but not 100% without issue.
Setup;
To clarify, I validated that my proxy and upstream VMs communicate properly with each other: socat VSOCK-LISTEN and VSOCK-CONNECT allow me to communicate bidirectionally.
My nginx configuration also validates without any issues (nginx -t).
Observed behaviour;
Nginx actually properly proxies towards the upstream. The upstream receives the request and responds with data. Then nginx never forwards that data to the client.
Nginx terminates the client connection due to a proxy timeout after the configured number of seconds (60s in my case).
The weird thing is that nginx sometimes logs a special client disconnect message (I can't reproduce it reliably). This message includes how many bytes were sent to and received from the client and the upstream, and it shows a fairly large number of bytes received from the upstream (corresponding to the upstream's actual response data).
I tested; 127.0.0.1:443 -> NGINX Server -> 127.175.0.0:8000 - [UNSOCK] -> VSOCK:10:8000
I tested; 127.0.0.1:443 -> NGINX Server -> unix:/run/nginx-vsock/upstream.vsock - [UNSOCK] -> VSOCK:10:8000
I tested; 127.0.0.1:443 -> NGINX Stream -> unix:/run/nginx/frontend.sock -> NGINX Server -> unix:/run/nginx-vsock/upstream.vsock - [UNSOCK] -> VSOCK:10:8000
Every scenario has the same symptoms.
Excerpt of logs when symptoms occur
Server 1 - Proxy
EDIT: I patched unsock with this patch to set the VMADDR_FLAG_TO_HOST flag in the flags field of the VSOCK socket address struct. This instructs the driver to forward vsock data to the host (CID 2) even though the destination CID is another value.
https://github.com/Bert-Proesmans/nix/blob/cb77bf615af27c8413c419e57b46bbb802851834/packages/unsock/001-flag-to-host.patch
Server 1 - Local curl test
The above log records a 504 Gateway Time-out on the client side. The returned value should be an empty HTTP directory listing.
Server 2 - Upstream
EDIT: You can see at the end here that the HTTP upstream server replied instantly to the proxied request.
Nginx config
Expected behaviour;
Nginx proxies upstream data correctly to the client. Upstream connections are transparently proxied through the VSOCK driver.
It seems like some signaling is not happening correctly. I've scoured the internet looking for similar symptoms, but after a few days I'm tired of seeing causes such as file-size limits or timeout values that are too low...
I'll have to pull out strace to debug this issue, but at this point I'm in a bit over my head. I'm hoping you have some ideas about which direction I should investigate.
As a kind of null hypothesis: not unsocking nginx and using plain unix sockets works completely as expected, so the cause of the symptoms is somehow the unsock library.