eCapture consuming lot of memory #433

Closed
h0x0er opened this issue Nov 30, 2023 · 9 comments · Fixed by #434 or #435
Labels
enhancement New feature or request improve question Further information is requested

Comments

@h0x0er
Contributor

h0x0er commented Nov 30, 2023

Describe the bug
eCapture is consuming a huge amount of memory.

To Reproduce

Note: Output of command free -h may vary.

  1. Check memory before starting eCapture
free -h
               total        used        free      shared  buff/cache   available
Mem:            11Gi       3.8Gi       2.8Gi       1.1Gi       4.9Gi       5.7Gi
Swap:           15Gi       672Mi        14Gi
  2. Start eCapture
sudo ecapture tls
  3. In another terminal, re-check the memory
free -h
               total        used        free      shared  buff/cache   available
Mem:            11Gi       6.3Gi       366Mi       1.1Gi       4.8Gi       3.2Gi
Swap:           15Gi       679Mi        14Gi
  4. You will notice the large increase in memory consumption

Expected behaviour

eCapture shouldn't consume that much memory.

@cfc4n
Member

cfc4n commented Dec 1, 2023

This calculation is inaccurate. It's best to only look at the resource usage of eCapture.

For example, top -p $ECAPTURE_PID.
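
For a scriptable version of the same per-process check, the resident set size can be read from /proc/$ECAPTURE_PID/status on Linux. The following is only a minimal sketch: the helper name parseVmRSS and the sample text are made up for illustration and are not taken from eCapture.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseVmRSS extracts the VmRSS value (in KiB) from the text of a
// /proc/<pid>/status file. It returns false if the field is missing
// or its value is not a number.
func parseVmRSS(status string) (int, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected shape of the line: "VmRSS:   204800 kB"
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			kb, err := strconv.Atoi(fields[1])
			if err != nil {
				return 0, false
			}
			return kb, true
		}
	}
	return 0, false
}

func main() {
	// Illustrative sample only (not a real measurement); on a live system,
	// read the contents of /proc/$ECAPTURE_PID/status instead.
	sample := "Name:\tecapture\nVmPeak:\t  409600 kB\nVmRSS:\t  204800 kB\n"
	if kb, ok := parseVmRSS(sample); ok {
		fmt.Printf("resident memory: %d MiB\n", kb/1024)
	}
}
```

Comparing this value before and after a change isolates eCapture's own usage from whatever else the system-wide free output includes.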

@h0x0er
Contributor Author

h0x0er commented Dec 1, 2023

@cfc4n, I am investigating the issue. I will keep you updated.

@cfc4n cfc4n added the question Further information is requested label Dec 1, 2023
@h0x0er
Contributor Author

h0x0er commented Dec 1, 2023

Here are some details:

  1. When the perf buffer is created, note the size of perCpuBuffer:

    perCpuBuffer := os.Getpagesize() * BufferSizeOfEbpfMap

    func (m *Module) perfEventReader(errChan chan error, em *ebpf.Map) {
        rd, err := perf.NewReader(em, os.Getpagesize()*BufferSizeOfEbpfMap)
        if err != nil {

  2. BufferSizeOfEbpfMap is declared as:

    // buffer size times of ebpf perf map
    // buffer size = BufferSizeOfEbpfMap * os.pagesize
    const BufferSizeOfEbpfMap = 1024 * 10

  3. Inside perf.NewReader(), a buffer of perCPUBuffer size is allocated for each CPU by calling newPerfEventRing():

https://github.com/cilium/ebpf/blob/f0d238d1934f15fe8c5ef8755337be11bbc114e9/perf/reader.go#L225-L245

  4. The per-CPU memory allocation happens inside newPerfEventRing(), at lines 45-49:

https://github.com/cilium/ebpf/blob/f0d238d1934f15fe8c5ef8755337be11bbc114e9/perf/ring.go#L25-L49

Calculations

For my machine

  1. os.Getpagesize() = 4096 (bytes)
  2. BufferSizeOfEbpfMap = 10240 (a multiplier counted in pages, not bytes)
  3. perCpuBuffer = os.Getpagesize() * BufferSizeOfEbpfMap = 41943040 (bytes) = 40 MB
  4. Total CPUs = 8
  5. Memory allocated for 1 module = 40 * 8 = 320 MB
  6. In the case of ecapture tls, 3 modules are initialised,
    therefore memory allocated for ecapture tls = 3 * 320 = 960 MB

Almost 1 GB of RAM.
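
The arithmetic above can be reproduced with a few lines of Go. This is only a sketch of the estimate as stated in this comment: the helper name perfBufferBytes is made up, the page size, CPU count, and reader count are the values quoted for this machine, and any extra rounding or metadata pages the perf reader may add are not modeled.

```go
package main

import "fmt"

// bufferSizeOfEbpfMap mirrors the constant quoted above: a multiplier
// counted in pages, not a size in bytes.
const bufferSizeOfEbpfMap = 1024 * 10

// perfBufferBytes estimates the memory taken by per-CPU perf buffers:
// one buffer of pageSize*pages bytes per CPU, for each reader.
// (Hypothetical helper, for illustration only.)
func perfBufferBytes(pageSize, pages, cpus, readers int) int {
	return pageSize * pages * cpus * readers
}

func main() {
	pageSize := 4096 // value reported above; os.Getpagesize() on a live system
	cpus := 8        // value reported above; runtime.NumCPU() on a live system

	perCPU := pageSize * bufferSizeOfEbpfMap
	perModule := perfBufferBytes(pageSize, bufferSizeOfEbpfMap, cpus, 1)
	total := perfBufferBytes(pageSize, bufferSizeOfEbpfMap, cpus, 3)

	fmt.Printf("per CPU:    %d MiB\n", perCPU>>20)    // 40
	fmt.Printf("per module: %d MiB\n", perModule>>20) // 320
	fmt.Printf("total:      %d MiB\n", total>>20)     // 960
}
```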

fyi @cfc4n

@cfc4n cfc4n added enhancement New feature or request improve labels Dec 1, 2023
@cfc4n
Member

cfc4n commented Dec 1, 2023

Indeed, as you said, eCapture occupies a relatively large amount of memory.

  • BufferSizeOfEbpfMap = 40M: this is to prevent tls events from being lost. When network traffic is particularly high, the eBPF map fills up easily.
  • per CPU: per-CPU maps offer better concurrency safety and avoid errors caused by data write ordering.
  • 3 modules: this is indeed an area that can be optimized.

Currently, eCapture supports three libraries: openssl, nss, and nspr. openssl is by far the most widely used and the most mature of the three; the other two are more niche.

I plan to disable those two modules by default or move them to a new, separate subcommand. Do you have any better ideas?

@h0x0er
Contributor Author

h0x0er commented Dec 1, 2023

Regarding BufferSizeOfEbpfMap

this is to prevent tls events from being lost

I agree with this, but I don't think setting it to 40M by default is a good idea.

I checked the tetragon implementation and noticed the following:

  • BufferSize is user-configurable.
  • By default it is set to 65535.

So I think we should do something similar:

  • Reduce the default size.
  • Make it configurable using a flag, so that the end user can adjust it as needed.

What are your thoughts on this?
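
To make the second suggestion concrete, here is a rough sketch of a size flag using only the Go standard library. Everything in it is an assumption for illustration: the flag name map-size, the KiB unit, the default value, and the helper perCPUBufferBytes are hypothetical, and eCapture's real CLI wiring may look quite different.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// defaultMapSizeKB is a hypothetical default, chosen only for this sketch.
const defaultMapSizeKB = 1024

// perCPUBufferBytes parses a hypothetical -map-size flag (in KiB) from args
// and returns the per-CPU buffer size in bytes, rounded up to whole pages.
func perCPUBufferBytes(args []string, pageSize int) (int, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	sizeKB := fs.Int("map-size", defaultMapSizeKB, "per-CPU perf buffer size in KiB")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	if *sizeKB <= 0 {
		return 0, fmt.Errorf("map-size must be positive, got %d", *sizeKB)
	}
	sizeBytes := *sizeKB * 1024
	// Round up so the buffer always covers a whole number of pages.
	pages := (sizeBytes + pageSize - 1) / pageSize
	return pages * pageSize, nil
}

func main() {
	n, err := perCPUBufferBytes(os.Args[1:], os.Getpagesize())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("per-CPU buffer: %d bytes\n", n)
}
```

The resulting byte count would then be passed where os.Getpagesize()*BufferSizeOfEbpfMap is used today, so users with heavy traffic can raise it and everyone else gets a smaller footprint.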


Regarding per-CPU map performance

I have some doubts about the performance of per-CPU buffers.
Have a read of: https://nakryiko.com/posts/bpf-ringbuf


I plan to disable those two modules by default or move them to a new, separate subcommand

Disabling unnecessary modules by default seems like a good idea.

fyi @cfc4n

@cfc4n
Member

cfc4n commented Dec 1, 2023

Thank you for your suggestion.

@cfc4n
Member

cfc4n commented Dec 1, 2023

I will submit another PR for the custom mapSize flag tomorrow. @h0x0er

Good night.

@h0x0er
Contributor Author

h0x0er commented Dec 1, 2023

Thanks @cfc4n . Good Night 🌃

cfc4n added a commit that referenced this issue Dec 2, 2023
used `--map-size` to set mapSize perCPU in cli. default:10240KB,

Signed-off-by: cfc4n <[email protected]>
@cfc4n cfc4n linked a pull request Dec 2, 2023 that will close this issue
@cfc4n
Member

cfc4n commented Dec 2, 2023

Fixed in #435.

Terminal 1

sudo free -m
[sudo] password for cfc4n:
               total        used        free      shared  buff/cache   available
Mem:            3876         477         277           1        3121        3106
Swap:           3893           0        3893

#### exec ecapture at other terminal.
sudo free -m
               total        used        free      shared  buff/cache   available
Mem:            3876         513         240           1        3121        3069
Swap:           3893           0        3893

Terminal 2

sudo bin/ecapture tls

The openssl module creates 3 eBPF maps:

{
    Name: "tls_events",
},
{
    Name: "connect_events",
},
{
    Name: "mastersecret_events",
},
  • mapSizePerCPU = 5M
  • 2 CPUs
  • 3 eBPF maps

Memory used by all eBPF maps = 2 * 5 * 3 = 30 MB.
Memory now used by eCapture (including the eBPF maps) = 513 - 477 ≈ 277 - 240 ≈ 36 MB.

As expected.

@cfc4n cfc4n closed this as completed Dec 2, 2023
cfc4n added a commit that referenced this issue Dec 2, 2023
used `--map-size` to set mapSize perCPU in cli. default:10240KB,

Signed-off-by: cfc4n <[email protected]>