evbuffer_get_length of autossl in environment where sender speed is slower than receiver (Buffer watermarking not working in autossl) #303
I use the autossl proxyspecs for POP3 and SMTP servers with STARTTLS support. I am not aware of any HTTP server with STARTTLS support (perhaps that's my ignorance), but why do you use autossl for HTTP? In other words, I think the issue may not be about the sslsplit code for buffer watermarking. |
Dear Soner Tari, the HTTPS setup I mentioned is just an example environment. I wanted to try using sslsplit for well-known protocols including POP3 and STARTTLS in SMTP, as well as for unknown pseudo-SSL communications that start in plaintext and send an SSL ClientHello in the middle. In the end, sslsplit was a solution that basically worked well when using autossl for HTTPS, POP3, SMTP, etc. However, it seems that evbuffer_get_length(outbuf) always returns 0 only for autossl, so I was wondering why. It is questionable whether buffer watermarking works at all in autossl. |
I cannot create your test environment, and it is hard to guess looking at the code, so we need further info. If you enable debug logging with -D, what does sslsplit say until oom? Does it say Also, if you enable content logging do you see any content until oom? Btw, how do you know "evbuffer_get_length(outbuf) always returns 0"? |
Just before OOM occurs, the following message is generated. In some cases this message appears right before OOM, but in other cases it does not. This message doesn't seem to be strongly related to OOM, though, since it has been mixed in with multiple connection tests.
The log confirms that ClientHello is found as follows:
When content logging is enabled, content is displayed up to OOM. Until OOM occurs, everything seems to work just fine except for sslsplit's growing memory use. I found out that evbuffer_get_length(outbuf) returns 0 by inserting the following debugging code in that part.
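Roughly, the debugging code was of this form (a sketch; the variable names follow the surrounding pxy_bev_readcb() code, and the exact print used may have differed):

```c
/* Sketch of a debug print inside pxy_bev_readcb() in pxy.c.
 * Names (bev, other) follow the surrounding sslsplit code. */
struct evbuffer *inbuf = bufferevent_get_input(bev);
struct evbuffer *outbuf = bufferevent_get_output(other->bev);
log_dbg_printf("DEBUG: inbuf len=%zu, outbuf len=%zu\n",
               evbuffer_get_length(inbuf),
               evbuffer_get_length(outbuf));
/* With autossl, the outbuf length was always printed as 0 here,
 * so the OUTBUF_LIMIT watermark branch was never taken. */
```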
P.s. |
I look at the debug logs you have provided, and at the code in pxy_bev_readcb(). So far this is what we know:
Yes, I agree, this is as strange as you say. And it may be related to libevent, as you have suspected, since you have already tried different versions of libevent. As you mention, you have multiple test connections, so in the sslsplit logs it is hard to know which log line is related to which connection. So, I want to suggest that we use sslproxy instead of sslsplit now. The design of sslproxy is different from sslsplit, but it supports autossl, buffer watermarking, and split style proxyspecs. Also, if you enable the DEBUG_PROXY switch in Mk/main.mk and start sslproxy with -D4, it is going to print very verbose debug logs with connection ids and file descriptor numbers, so we can easily relate debug lines with connections. We can even follow the execution of its source code by looking at those debug logs. (Btw, sslsplit supports the DEBUG_PROXY switch too, but it is much less verbose.) If we use sslproxy, we can perhaps get more detailed info about the issue, and more importantly perhaps we can rule out libevent as the culprit. Also, since the design and implementation of sslproxy is different from sslsplit, it may be better to try sslproxy now instead of continuing to try sslsplit (for example, if sslproxy succeeds, then the culprit is in the sslsplit code). It is possible that sslproxy will not work in your test environment at all. So, it's up to you. |
Also, just to make sure that callbacks for the overridden bevs are disabled, can you apply the following patch and report back please?:
It may look unnecessary, but this is what we do while terminating connections and freeing bevs, which we realized was necessary some time ago, so perhaps it is necessary while upgrading too. Just in case. |
I applied the patch you mentioned to sslsplit, but the result is the same.
sslproxy may take some time to test. |
It seems to work well when I set bufferevent_setwatermark on the bev as follows, right before the call to bufferevent_openssl_filter_new. It is probably related to bev_ssl->outbuf_cb of the bufferevent_openssl_new_impl function called inside bufferevent_openssl_filter_new. That callback fires on the copy performed when evbuffer_add_buffer is called, via evbuffer_invoke_callbacks_() (leading to consider_writing). That is, it works if the high watermark is properly set before calling bufferevent_openssl_filter_new.
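A minimal sketch of the trial fix (the exact location in the upgrade code, and the variable names, are assumptions; OUTBUF_LIMIT is the existing sslsplit constant):

```c
/* Trial-fix sketch: set the high write watermark on the underlying
 * tcp bev *before* wrapping it, so that libevent's internal BIO
 * honors it when writing into the underlying output buffer. */
bufferevent_setwatermark(ubev, EV_WRITE, OUTBUF_LIMIT / 2, OUTBUF_LIMIT);
bev = bufferevent_openssl_filter_new(ctx->evbase, ubev, ssl,
                                     BUFFEREVENT_SSL_ACCEPTING,
                                     BUFFEREVENT_OPT_DEFER_CALLBACKS);
```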
|
Hm, that's interesting. I've also looked into the source code of libevent 2.1.12, and seen that the only place the bufferevent_setwatermark() function is called in the entire libevent code is in bufferevent_openssl_new_impl(), as I guess you have also seen. And I see that it reenables the read (and write) callback which we have disabled in our code for the underlying bev (this is relevant to autossl, since its underlying bev is tcp, and it switches to another ssl bev), but it resets the watermark for the read cb, not for the write cb. If that's really the case, then it may explain the strange issue you have seen. Do I understand this correctly? But what I don't understand in your trial-fix is that if the theory above is right, your fix was not supposed to work, because you call bufferevent_setwatermark() for the write cb before the call to bufferevent_openssl_filter_new(), which resets the watermark for the read cb and reenables it as I said above. Because in your fix, you set the watermark for the write cb for the underlying bev. So, I currently don't understand why your fix works. In fact, I would expect a fix which would (again) disable the read cb after the call to bufferevent_openssl_filter_new(), of course after checking buffer sizes and watermark limits. Because it seems to me like the actual fix should be about disabling the read cb again, not setting the write watermarks again. Or perhaps they both lead to the same result in our implementation, I don't know yet. Am I missing something? I may be really missing something, because I am still looking into the source code of libevent and checking it against ours. I have also noted that in the libevent code both bufferevent_openssl_filter_new() and bufferevent_openssl_socket_new() call bufferevent_openssl_new_impl(). So, I should continue looking into it. But I think you have found a clue.
Btw, if your trial-fix really leads to a solution or a workaround, please do not bother trying sslproxy, unless you want to try it out of interest. |
Also, we enable read callbacks for the src and dst bevs after we find the ClientHello and while we complete the upgrade, not for the underlying bevs, but for the new bevs. I don't know how this affects this issue. |
Here's my understanding of libevent (of course, since this is the first time I've looked at the libevent source, it's not perfect). Difference between ssl and autossl: in the autossl flow, ctx->src|dst.bev starts as a TCP bev and is replaced via bufferevent_openssl_filter_new with an SSL bev that has the TCP bev as its 'underlying bev'. At that point, the copy from inbuf to outbuf through evbuffer_add_buffer triggers 'evbuffer_invoke_callbacks_(outbuf);', which calls be_openssl_outbuf_cb. If you look at the be_openssl_outbuf_cb function, you will find the following part.
Now looking at the consider_writing function, the following 'target' seems to mean the output of the underlying bev.
After all, I think that BIO_new_bufferevent(underlying) / BIO_s_bufferevent(), called by bufferevent_openssl_filter_new, controls the underlying output through bio_bufferevent_write. (If no high write watermark is set on the underlying bev, memory accumulates, since bio_bufferevent_write unconditionally adds everything by calling evbuffer_add.)
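The effect can be modeled in isolation. The standalone function below mirrors the clamping arithmetic in libevent's bio_bufferevent_write() (my paraphrase, not the actual libevent code): with no high watermark set, every chunk is appended in full, which is exactly the unbounded growth observed.

```c
#include <assert.h>
#include <stddef.h>

/* Model of bio_bufferevent_write() watermark clamping: given the
 * current outbuf length, the high write watermark (0 = unset), and
 * an incoming chunk, return how many bytes get appended, or -1 if
 * the buffer is already at or over the mark. */
static int
clamp_to_high_watermark(size_t outlen, size_t high, int inlen)
{
    if (high && high <= outlen + (size_t)inlen) {
        if (high <= outlen)
            return -1;               /* buffer already full */
        return (int)(high - outlen); /* partial write only */
    }
    return inlen;                    /* no watermark, or room for all */
}
```

With `high == 0` the function always returns `inlen`, i.e. everything is appended unconditionally, matching the memory accumulation seen in autossl.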
The point in autossl is that you need to set the high write watermark of the underlying bev to control the output of the underlying bev; the low write watermark doesn't seem to have any special meaning. Beyond this point my understanding gets hazy. |
The purpose of setting a watermark is to call the write callback when the water level drops to some acceptable level, i.e. outbuf has some space now (otherwise, the write events are suspended until the level drops), so that in the write callback we can reenable the read events, i.e. restart accepting data into inbuf. So, IMO, while we are switching from tcp to ssl bevs in autossl, we should check if any write watermarks are set and also if the read events are disabled for the underlying bevs, and if so, we should set the watermarks and disable the read events for the new bevs accordingly. We should do these checks and updates in the upgrade code. Therefore, the following patch is my proposal to fix this issue:
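In outline (a sketch of the approach described above, not the exact patch; the helper name and exact call site are illustrative):

```c
/* Upgrade-time sketch: carry watermark and read-event state from
 * the underlying tcp bev (ubev) over to the new ssl bev, since
 * bufferevent_openssl_new_impl() resets the read watermark and
 * reenables reading on the underlying bev. */
static void
autossl_carry_state(struct bufferevent *ubev, struct bufferevent *bev)
{
    size_t lowmark, highmark;

    bufferevent_getwatermark(ubev, EV_WRITE, &lowmark, &highmark);
    if (highmark)
        /* a write watermark was set before the upgrade: keep it */
        bufferevent_setwatermark(bev, EV_WRITE, lowmark, highmark);

    if (!(bufferevent_get_enabled(ubev) & EV_READ))
        /* reads were disabled before the upgrade: keep them disabled */
        bufferevent_disable(bev, EV_READ);
}
```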
@minzkn, can you apply this patch (and this patch only) and try please? So that we can convert this into a pull request if it works. |
@sonertari The patch you suggested still shows the same memory increase. That is, the patch doesn't fix it. The bev passed as the underlying bev to bufferevent_openssl_filter_new has read and write enabled and its READ watermark set to low 0, high 0, and a new ssl bev is created and returned. Therefore, I think reading and writing should be enabled on the ssl bev returned by bufferevent_openssl_filter_new, as in the existing implementation, and the WRITE watermark should be set on the underlying bev before calling bufferevent_openssl_filter_new so that the buffer limit is not exceeded. The WRITE watermark set on the underlying bev before calling bufferevent_openssl_filter_new becomes a control condition inside libevent, and it seems that sslsplit has no way to control the underlying bev directly after bufferevent_openssl_filter_new. So, as in the trial-fix patch I suggested, I think it is right to set the WRITE watermark right before bufferevent_openssl_filter_new.
Pseudo code :
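(A sketch of the order I mean; names are illustrative:)

```c
/* 1) Watermark the underlying bev for WRITE first, */
bufferevent_setwatermark(ubev, EV_WRITE, OUTBUF_LIMIT / 2, OUTBUF_LIMIT);

/* 2) then wrap it with the ssl filter bev, */
bev = bufferevent_openssl_filter_new(evbase, ubev, ssl,
                                     BUFFEREVENT_SSL_ACCEPTING,
                                     BUFFEREVENT_OPT_DEFER_CALLBACKS);

/* 3) and enable read/write on the returned ssl bev as before. */
bufferevent_setcb(bev, pxy_bev_readcb, pxy_bev_writecb,
                  pxy_bev_eventcb, ctx);
bufferevent_enable(bev, EV_READ | EV_WRITE);
```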
|
I am not surprised about the memory increase you report now, that's why in my earlier post I said we should disable the read events for the underlying bev if we have disabled them, because the bufferevent_openssl_new_impl() function in the libevent code enables them regardless. Back then I wasn't sure how libevent was working if a bev was wrapping an underlying bev, rather than directly connecting to a socket. But I think it is clear now that the main cause for the issue you have reported is the read events for the underlying bev. For example, disabling or watermarking the wrapper bev did not stop filling up the buffer for the underlying bev, as we have observed. So, contrary to your comments, based on your reports (because I cannot create your test environment), now my theory is that our watermark code should always deal with underlying bevs, never wrapper bevs. To put it perhaps more correctly, we should not disable or watermark wrapper bevs, instead we should always deal with bevs directly connected to sockets. (And the only proxyspec type with underlying bevs is autossl.) Also, I hope you understand that we cannot use your trial-fix as it is, because it sets watermarks for underlying bevs unconditionally. We should set the watermarks only if the underlying bevs have them set already. Plus, if we deal with wrapper bevs only, as you suggest, who is going to reset the watermarks for the underlying bevs? Note that our read and write callbacks currently deal with wrapper bevs, not underlying bevs. Therefore, the following patch is my updated proposal to fix this issue:
This patch is supposed to work, because it conditionally includes your trial-fix too. But since I have modified our watermarking code to deal with underlying bevs only, it should be tested properly, with or without autossl. @minzkn, can you apply this patch (and this patch only) and try please? |
@sonertari: The same thing happens with the watermark control patch for the underlying bev. It seems the underlying_wm_set condition in the following part should be reversed, so I tried that, but it didn't work.
Also, to the check of whether outbuf exceeds OUTBUF_LIMIT, I tried adding an additional condition as follows, but that did not resolve it either.
I still think it's right to keep the watermark control on the SSL bev as in the existing implementation, and only cap the load on the outbuf of the underlying bev. The reason is that when the watermark is set for the underlying bev in pxy_bev_readcb, it is immediately reset by the SSL bev's callback, the pxy_bev_writecb call, so something there needs to be considered. |
UPDATE: Please note that I have modified the patch after the first post. Your report that my last patch does not work is strange, because it includes your trial-fix, which you say works. But it also reveals something else, as I was concerned since the beginning: We don't know the exact moment when the need for watermarking starts (that's why I had proposed the very verbose debug logs of sslproxy earlier). Apparently, and perhaps as expected, it does not coincide with the moment we upgrade from tcp to ssl. Because, the conditional inclusion of your trial-fix does not work at upgrade time. Obviously, it is not needed during upgrade, because the buffer load is not high at that moment, and that is why I say perhaps as expected. Since we should not unconditionally set the watermarks for underlying bevs, as in your trial-fix, now I think that our watermarking code should deal with both the underlying and wrapper bevs. This seems to be the only way to disable and enable read events, and also set and reset watermarks at the right times that we need them (in other words, this seems to be the only way to include your trial-fix conditionally). Therefore, the following patch is my proposal to fix this issue. I don't care if some parts of it may be redundant or unnecessary or could be improved at the moment, what I care is that it works hopefully. This may be my last trial, because if this does not work again, first I will feel confused, and second it will mean that I need to create your test environment myself.
@minzkn, can you apply this patch (and this patch only) and try please, hopefully one last time? |
@sonertari: This patch also increases memory. When I add further debugging as follows, the ubev_other outbuf continues to increase.
debug output: (ubev_other outbuf continues to increase)
In addition, it seems that the following condition is necessary for the ubev of the bev in pxy_bev_writecb to drop to its low watermark with the patch applied. This only stops the memory from increasing; the operation itself still has a problem. I just feel that some kind of control like this is needed. It may be necessary to additionally register a writecb on the ubev, so that the two callbacks complement each other.
If I remove the following parts from the patch, it works without increasing memory.
|
To save @sonertari's precious time, I simply rebuilt the test environment outside our secure network, and I am providing access to it. Please refer to the connection method and test method below. (When the test is complete, the VM environment will be destroyed.) !!! @sonertari After you log in, please change the password of your account ('sslsplit') for security. ((Please change it immediately after reading this post))
The test VM was rebuilt using the following standard distribution environment. (Please refer to show-system-settings.sh which simply checks the sslsplit system configuration environment.)
|
Thanks @minzkn for giving me access to your test environment. I have started using it. But I didn't know wrapper bevs were so strange. It will take some time until I can convince myself whether I can solve this issue or not.
I think I have found a working solution to this issue. I have tested and convinced myself that the following patch works in the test environment you have provided. It includes your modified debug print, so that you can directly test yourself too. As I have explained in the comments in the code, what I have realized is that the write events we get may be due to the wrapper bev or the underlying bev. And it seems like the only way of telling which is to check if the buf len of the ubev is below the low mark, which triggers the write event for the ubev. If that is not the case, then the write event is for the wrapper bev, so we should not reset the watermark for the ubev. I think this is the correct way of including your trial-fix conditionally. If this works for you too, we can convert it into a pull request, at long last. Note that I did not test this code all the way to the end of your test download, so I hope it works until the download completes. And I hope it works for proxyspecs other than autossl too.
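In outline, the key check in pxy_bev_writecb() is of this shape (a sketch, not the exact patch; `other` stands for the opposite connection end, names assumed):

```c
/* A write event may come from the wrapper bev or from the underlying
 * bev (ubev). Only when the ubev outbuf has drained below its low
 * write mark is the event due to the ubev; only then reset its
 * watermark and resume reading on the other side. */
struct bufferevent *ubev = bufferevent_get_underlying(bev);
if (ubev) {
    size_t lowmark, highmark;
    bufferevent_getwatermark(ubev, EV_WRITE, &lowmark, &highmark);
    if (highmark &&
        evbuffer_get_length(bufferevent_get_output(ubev)) < lowmark) {
        /* ubev drained below its low mark: reset and resume reads */
        bufferevent_setwatermark(ubev, EV_WRITE, 0, 0);
        bufferevent_enable(other->bev, EV_READ);
    }
    /* otherwise the event was for the wrapper bev: leave the
     * ubev watermark in place */
}
```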
|
I think the following simplified patch is enough. I have realized that disabling or enabling the read events for ubev has no effect.
|
Even more simplified:
|
The simplified patch seems to work fine for autossl, ssl, and https. However, if the other side disconnects, I think the condition needs to be reviewed until the buffer is empty. In other words, I wonder whether the cases where ctx->closed or other->closed is set need to be considered. I do not know. Wouldn't there be a problem if ubev is not considered in the condition covering the other side of the proxy terminating and the resulting buffer-drain state? In the pxy_bev_writecb function
In the pxy_bev_eventcb function
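The kind of check I mean could be sketched like this (a hypothetical helper, not actual sslsplit code):

```c
/* Before closing a conn end whose peer has closed, also require the
 * *underlying* output buffer to be drained, not just the wrapper's. */
static int
conn_end_drained(struct bufferevent *bev)
{
    struct bufferevent *ubev = bufferevent_get_underlying(bev);

    if (evbuffer_get_length(bufferevent_get_output(bev)) > 0)
        return 0;     /* wrapper outbuf still has data */
    if (ubev && evbuffer_get_length(bufferevent_get_output(ubev)) > 0)
        return 0;     /* underlying outbuf still has data */
    return 1;         /* safe to close this end */
}
```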
(Usually, the HTTP(S) server side is implemented to close the connection later than the client, in order to avoid accumulating connections in the TIME_WAIT/CLOSE_WAIT state...) |
It's good to hear that it works. I guess you are referring to the case where libevent reports that the len of outbuf is 0, but the len of ubev outbuf is not. We don't know if that's still the case while closing. But probably it will happen again, because the client is slow, but sslsplit is fast in processing and moving data from the bev outbuf to the ubev outbuf, making the bev outbuf 0 but not the ubev outbuf. But I think we can test it with the current setup, if you wait until the end of download, and see if curl reports success or fail. Have you waited until the download completes? I would use a smaller download file though. |
I've been trying to figure out how to test it. In the patched part, I set a large high watermark for the ubev in pxy_bev_readcb.
And in iterative testing, quite often in tests like this one, the downloaded file ends up in an incomplete state like this. In sslsplit, the connection status is as follows.
curl (https client) will look like this:
In my test environment, I used the following command on sslsplit-client.minzkn.com (see also the test-downstream-34MBytes-file.sh script) with a smaller file (about 34 MBytes / 34998089 bytes) for testing. 7fbad59d74f9a89fa392457163a525c5 sample.mp4 <<< original md5 hash. However, in this test the downloaded file is smaller than the original and its hash value differs. Notice that curl reports a problem as well. In other words, since the high watermark for the ubev's outbuf is large, my opinion is that the buffer remaining at the time of connection termination needs to be handled ((if my test method is correct)).
Thanks @minzkn for the analysis. Based on your reports I have modified the source code to keep the conn open if the underlying buffers have data. Please see the I think this branch works, because I have used your test env with a 2.1MB file sample.ogg with the test modification you have (with OUTBUF_LIMIT * 64), and curl can download the file successfully with the same checksum each time. This can be observed in the debug output of sslsplit:
So even though one end of the conn closes, the other remains open until its underlying buffer is empty, during which time curl continues to download the file, and then the other conn end closes too. The debug output of sslsplit looks incomplete now, but anyway.
Debug out underlying bev sizes at EOF:
|
Just FYI, now I have fixed a couple of issues with sslproxy, and it seems to work fine in your test env too. |
HTTPS server => (1 Gbps - faster network) sslsplit autossl tproxy (10 Mbps - slower network) => HTTPS client
sslsplit autossl 0.0.0.0 tproxy
When the client downloads a big file from the HTTPS server
In the pxy_bev_readcb function, evbuffer_get_length(outbuf) in the following implementation always returns 0.
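The check in question looks roughly like this (quoted from memory of the sslsplit v0.5.x source, so treat the details as approximate):

```c
/* pxy_bev_readcb() buffer watermarking in sslsplit: if the peer's
 * output buffer is over the limit, throttle by watermarking the
 * peer's write side and disabling our read side. For autossl,
 * evbuffer_get_length(outbuf) here always returns 0, so this
 * branch never fires. */
if (evbuffer_get_length(outbuf) >= OUTBUF_LIMIT) {
    /* temporarily disable the data source; set a write watermark */
    bufferevent_setwatermark(other->bev, EV_WRITE,
                             OUTBUF_LIMIT / 2, OUTBUF_LIMIT);
    bufferevent_disable(bev, EV_READ);
}
```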
In an environment where the client side is slower than the server side with the autossl tproxy setting, the inbuf side keeps accumulating and memory increases. Eventually OOM is triggered.
If you use ssl instead of autossl in the same environment, it works fine.
using sslsplit v0.5.5, libevent v2.1.8, openssl v1.1.1k
Thanks for making a great program.