Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove use of thread_local from ModuleAllocMonitor #46606

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
108 changes: 106 additions & 2 deletions PerfTools/AllocMonitor/plugins/ModuleAllocMonitor.cc
Original file line number Diff line number Diff line change
Expand Up @@ -23,10 +23,114 @@
#include "FWCore/ServiceRegistry/interface/ModuleCallingContext.h"
#include "DataFormats/Provenance/interface/ModuleDescription.h"

#if defined(ALLOC_USE_PTHREADS)
#include <pthread.h>
#else
#include <unistd.h>
#include <sys/syscall.h>
#endif

#include "moduleAlloc_setupFile.h"
#include "ThreadAllocInfo.h"

namespace {
inline auto thread_id() {
#if defined(ALLOC_USE_PTHREADS)
/*NOTE: if use pthread_self, the values returned by linux had
lots of hash collisions when using a simple % hash. Worked
better if first divided value by 0x700 and then did %.
[test done on el8]
*/
return pthread_self();
#else
return syscall(SYS_gettid);
#endif
}

struct ThreadTracker {
static constexpr unsigned int kHashedEntries = 128;
static constexpr unsigned int kExtraEntries = 128;
static constexpr unsigned int kTotalEntries = kHashedEntries + kExtraEntries;
using entry_type = decltype(thread_id());
static constexpr entry_type kUnusedEntry = ~entry_type(0);
std::array<std::atomic<entry_type>, kHashedEntries> hashed_threads_;
std::array<std::atomic<entry_type>, kExtraEntries> extra_threads_;

ThreadTracker() {
//put a value which will not match the % used when looking up the entry
entry_type entry = 0;
for (auto& v : extra_threads_) {
v = kUnusedEntry;
}
for (auto& v : hashed_threads_) {
v = ++entry;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these free "hashed threads" are marked differently from the "extra threads"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this special value for a free "hashed thread" mean there is an assumption the thread ID can not be smaller than kHashedEntries? (probably it is a good assumption, but nevertheless)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the hashed thread all the thread ids have been modulo-ed by kHashedEntries. The value initially assigned to the hashed_threads_ entry is actually a value that is impossible to be allowed in that spot with such a modulo (i.e. it is always 1 larger than the smallest allowed value in that spot).

Given the extra thread is linearly search, there is notvalue that can be safely inserted as the initial value for an entry. I assumed that no thread would be assigned ~entry_type(0).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, got it. Could you add these comments to code itself (for the next reader)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the comments.

}
}

std::size_t thread_index() {
auto id = thread_id();
auto index = thread_index_guess(id);
auto used_id = hashed_threads_[index].load();

if (id == used_id) {
return index;
}
//try to be first thread to grab the index
auto expected = entry_type(index + 1);
if (used_id == expected) {
if (hashed_threads_[index].compare_exchange_strong(expected, id)) {
return index;
} else {
//another thread just beat us so have to go to non-hash storage
return find_new_index(id);
}
}
//search in non-hash storage
return find_index(id);
}

private:
std::size_t thread_index_guess(entry_type id) const {
#if defined(ALLOC_USE_PTHREADS)
return (id / 0x700) % kHashedEntries;
#else
return id % kHashedEntries;
#endif
}

std::size_t find_new_index(entry_type id) {
std::size_t index = 0;
for (auto& v : extra_threads_) {
entry_type expected = kUnusedEntry;
if (v == expected) {
if (v.compare_exchange_strong(expected, id)) {
return index + kHashedEntries;
}
}
++index;
}
//failed to find an open entry
abort();
return 0;
}

std::size_t find_index(entry_type id) {
std::size_t index = 0;
for (auto const& v : extra_threads_) {
if (v == id) {
return index + kHashedEntries;
}
++index;
}
return find_new_index(id);
}
};

static ThreadTracker& getTracker() {
static ThreadTracker s_tracker;
return s_tracker;
}

using namespace edm::service::moduleAlloc;
class MonitorAdaptor : public cms::perftools::AllocMonitorBase {
public:
Expand All @@ -43,8 +147,8 @@ namespace {

private:
static ThreadAllocInfo& threadAllocInfo() {
thread_local ThreadAllocInfo s_info;
return s_info;
static ThreadAllocInfo s_info[ThreadTracker::kTotalEntries];
return s_info[getTracker().thread_index()];
}
void allocCalled(size_t iRequested, size_t iActual, void const*) final {
auto& allocInfo = threadAllocInfo();
Expand Down