-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clientContextImpl: Cap the number and age of beacons #191
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to see a PR addressing this issue. I think some further work is required though.
/* Before creating a new beacon, cleanup any old ones */ | ||
for (AddressBeaconHandlerMap::iterator it = m_beaconHandlers.begin(); it != m_beaconHandlers.end();) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a linear iteration of the list of tracked servers each time a beacon is received. If the number of servers grows large, or one gets stuck in a restart loop, this could potentially take up a lot of CPU time.
Timer InternalClientContextImpl::m_timer
is available in getBeaconHandler()
. cf. class ChannelSearchManager
for an example of using Timer
.
@@ -3957,7 +3957,7 @@ class InternalClientContextImpl : | |||
static size_t num_instances; | |||
|
|||
InternalClientContextImpl(const Configuration::shared_pointer& conf) : | |||
m_addressList(""), m_autoAddressList(true), m_connectionTimeout(30.0f), m_beaconPeriod(15.0f), | |||
m_addressList(""), m_autoAddressList(true), m_connectionTimeout(30.0f), m_beaconPeriod(15.0f), m_maxBeacons(10), m_maxBeaconLifetime(15), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A limit of 10 servers, and a 15 second lifetime seem quite small. Have you checked one a network with more than 10 PVA servers to see how often ChannelSearchManager::newServerDetected()
is called?
@@ -4115,6 +4117,8 @@ class InternalClientContextImpl : | |||
m_beaconPeriod = m_configuration->getPropertyAsFloat("EPICS_PVA_BEACON_PERIOD", m_beaconPeriod); | |||
m_broadcastPort = m_configuration->getPropertyAsInteger("EPICS_PVA_BROADCAST_PORT", m_broadcastPort); | |||
m_receiveBufferSize = m_configuration->getPropertyAsInteger("EPICS_PVA_MAX_ARRAY_BYTES", m_receiveBufferSize); | |||
m_maxBeacons = m_configuration->getPropertyAsInteger("EPICS_PVA_BEACON_MAX", m_maxBeacons); | |||
m_maxBeaconLifetime = m_configuration->getPropertyAsFloat("EPICS_PVA_BEACON_LIFETIME", m_maxBeaconLifetime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see making the number of servers configurable. However, I don't think the lifetime (aka. 2x the max beacon period of 180 seconds) needs to be separately configurable.
Since it is not super obvious. The Beacon TX timing of pvAccessCPP differs from what experience with CA servers might lead you to expect. pvAccessCPP/src/server/beaconEmitter.cpp Lines 31 to 33 in 581d100
A server will send out the first 10 beacons (not configurable) with a 15 second interval (by default), then switch to a 180 second period (not configurable). While So I think the beacon tracking lifetime must be >= 360 seconds. (fyi. with PVXS I try to follow the same model and timings, with a non-configurable limit of 20k servers. Of course, there I only allocate ~64 bytes per server) |
The windows CI failures are due to epics-base/ci-scripts#84. When you update, please rebase to pick up ed7eae5. |
✅ Build pvAccessCPP 1.0.70 completed (commit cdf3720715 by @JJL772) |
2b51b97
to
777d68d
Compare
@mdavidsaver Thanks for the feedback! I finally got around to applying the requested changes. |
❌ Build pvAccessCPP 1.0.74 failed (commit 73c3932b45 by @JJL772) |
777d68d
to
afca135
Compare
❌ Build pvAccessCPP 1.0.75 failed (commit 4e054f2e07 by @JJL772) |
afca135
to
d44adc9
Compare
✅ Build pvAccessCPP 1.0.79 completed (commit 2ae88b70f1 by @JJL772) |
@mdavidsaver Just wanted to follow up, are there any other changes required for this? |
if (m_beaconHandlers.size() >= maxTrackedBeacons) | ||
{ | ||
char ipa[64]; | ||
sockAddrToDottedIP(&responseFrom->sa, ipa, sizeof(ipa)); | ||
LOG(logLevelDebug, "Tracked beacon limit reached (%d), ignoring %s\n", maxTrackedBeacons, ipa); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To minimize log spam it would be friendlier to only log when size()==max. So once each time the limit is reached, but not again until falling below the limit. eg. consider if some PVA server gets stuck in a reset loop.
// stores weak_ptr | ||
handler.reset(new BeaconHandler(internal_from_this(), responseFrom)); | ||
m_timer->scheduleAfterDelay(BeaconCleanupCallback::shared_pointer(new BeaconCleanupCallback(*this, *responseFrom)), maxBeaconLifetime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class Timer
can be a little strange to use. As written, this timer will never be cancel()
ed. If fact, since the callback pointer isn't being stored, it can't be canceled. So each time a new server appears, this timer will drop it 360 seconds later. Then, since that server will probably still be active, it will be "new" again 180 after than.
I think you are on the right track though. What you would need to do is to store a BeaconCleanupCallback
in (or as part of) BeaconHandler
. Then the logic would be to start the timer on creation (first beacon). Then cancel and restart it on each subsequent beacon. This way the timer will only expire when the corresponding server stops sending beacons.
This PR now depends on some changes made to pvData: epics-base/pvDataCPP#94 I'm going to mark this as a draft for now because I'm not exactly happy with these changes yet. |
❌ Build pvAccessCPP 1.0.108 failed (commit c4e4658381 by @JJL772) |
Each beacon has an associated mutex. If we allocate too many beacons on resource constrained systems, i.e. RTEMS, we may run out of resources and crash.
✅ Build pvAccessCPP 1.0.109 completed (commit 9651462441 by @JJL772) |
Each beacon has an associated mutex. If we don't cap the beacon count, IOCs running on resource limited platforms like RTEMS may eventually run out of resources and crash.
Closes #184
The configuration options are probably unnecessary, let me know if I should remove them.
Requires some changes to pvData: epics-base/pvDataCPP#94