Change-Id: I233d02a669b6a0504cd54590c6c8e4fefadc4713
Signed-off-by: Florin Coras <[email protected]>
1 parent 03f942a, commit e3e2f07
Showing 3 changed files with 144 additions and 20 deletions.
Hello, when I use pool_put, a core dump occurs:
#0 0x00007fed358fc37f in raise () from /lib64/libc.so.6
#1 0x00007fed358e6db5 in abort () from /lib64/libc.so.6
#2 0x000055cde7d8031a in os_exit (code=code@entry=1) at /usr/src/debug/vpp-21.06/src/vpp/vnet/main.c:431
#3 0x00007fed37810d67 in unix_signal_handler (signum=6, si=, uc=) at /usr/src/debug/vpp-21.06/src/vlib/unix/main.c:187
#4
#5 0x00007fed358fc37f in raise () from /lib64/libc.so.6
#6 0x00007fed358e6db5 in abort () from /lib64/libc.so.6
#7 0x000055cde7d802c3 in os_panic () at /usr/src/debug/vpp-21.06/src/vpp/vnet/main.c:407
#8 0x00007fed3601d4c5 in vec_resize_allocate_memory () from /lib64/libvppinfra.so.21.06
#9 0x00007fec354a6bf1 in _vec_resize_inline (numa_id=255, data_align=8, header_bytes=0, data_bytes=, length_increment=, v=) at /usr/src/debug/vpp-21.06/src/vppinfra/vec.h:172
#10 clib_bitmap_ori_notrim (i=3219795329229292237, ai=0x7fec386d9740) at /usr/src/debug/vpp-21.06/src/vppinfra/bitmap.h:648
The following is my debug info:
(gdb) p 8*sizeof((uword*)0x7fec386d9740)
$23 = 64
(gdb) p ((vec_header_t *) (v) - 1)
No symbol "v" in current context.
(gdb) p ((vec_header_t *) (0x7fec386d9740) - 1)
$24 = (vec_header_t *) 0x7fec386d9738
(gdb) p *0x7fec386d9738
$25 = 734048780
(gdb) p *((vec_header_t *) (0x7fec386d9740) - 1)
$26 = {len = 734048780, numa_id = 255 '\377', vpad = "\000\000", vector_data = 0x7fec386d9740 ""}
(gdb) x/20xg 0x7fec386d9738
0x7fec386d9738: 0x000000ff2bc0b20c 0x0000000000000000
0x7fec386d9748: 0x0000000000000000 0x0000000000000000
0x7fec386d9758: 0x0000000000000033 0x0000000464612074
0x7fec386d9768: 0x000000ff00000000 0x0000000100000000
0x7fec386d9778: 0x0000000000000000 0x0000000000000000
0x7fec386d9788: 0x0000000000000041 0x00007fec3b2f1660
0x7fec386d9798: 0x00007fec3a3d8470 0x203a747065636341
0x7fec386d97a8: 0x00000000002a2f2a 0x0000000000000000
0x7fec386d97b8: 0x0000000000000000 0x0000000000000040
0x7fec386d97c8: 0x0000000000000082 0x00000004357c4188
I found that the free_bitmap member of pool_header_t is a vector, and its header reports a len of 734048780. I cannot figure out how that happened: my application calls pool_get_zero only a few times, which cannot make the vector that long. Has anyone run into the same problem?
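The header word in the dump above can be decoded by hand to cross-check what gdb printed. The sketch below is plain C, not VPP code: the helper names are hypothetical, and the shifts assume the little-endian layout visible in the dump (len in the low 32 bits, numa_id in the next byte).

```c
#include <stdint.h>

/* Hypothetical helper type: the two header fields gdb printed in $26. */
typedef struct {
  uint32_t len;
  uint8_t numa_id;
} decoded_header_t;

/* Split one 64-bit word from "x/20xg" into len and numa_id.
   Assumption: little-endian host, len stored first, numa_id right after. */
static decoded_header_t
decode_header_word (uint64_t word)
{
  decoded_header_t h;
  h.len = (uint32_t) (word & 0xffffffffu);
  h.numa_id = (uint8_t) ((word >> 32) & 0xff);
  return h;
}

/* Render the 8 bytes of a dumped word as text, lowest byte first,
   replacing non-printable bytes with '.'. */
static void
word_to_ascii (uint64_t word, char out[9])
{
  for (int i = 0; i < 8; i++)
    {
      unsigned char c = (unsigned char) (word >> (8 * i));
      out[i] = (c >= 0x20 && c < 0x7f) ? (char) c : '.';
    }
  out[8] = '\0';
}
```

Decoding 0x000000ff2bc0b20c this way gives len = 734048780 and numa_id = 255, matching $26. Applying word_to_ascii to 0x203a747065636341 (at 0x7fec386d97a0) yields "Accept: ", so memory close to the bitmap holds text. That may point to the region having been freed and reused, but on its own it does not prove it.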
Hi,
This is a really old patch; many things, including vppinfra, have changed since 2018. Having said that, it looks like the pool might've been corrupted. First thing that comes to mind is thread safety, i.e., make sure only one thread allocates and frees elements from the pool.
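If two threads really must share one pool, one way to get the same effect as "only one thread at a time" is to serialize every get and put behind a lock. The sketch below is a toy pool in plain C with pthreads, not VPP's pool API; locked_pool_t and its functions are hypothetical names used only for illustration.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int payload; } event_t;

/* Hypothetical toy pool: every field is only touched with the lock held. */
typedef struct {
  event_t *elts;        /* element storage; may move when the pool grows */
  unsigned *free_idx;   /* stack of indices that were put back */
  unsigned n_free, n_elts;
  pthread_mutex_t lock;
} locked_pool_t;

static void
locked_pool_init (locked_pool_t *p)
{
  memset (p, 0, sizeof (*p));
  pthread_mutex_init (&p->lock, NULL);
}

/* Returns the index of a zeroed element, or -1 if allocation failed. */
static int
locked_pool_get_zero (locked_pool_t *p)
{
  int idx = -1;
  pthread_mutex_lock (&p->lock);
  if (p->n_free > 0)
    idx = (int) p->free_idx[--p->n_free];
  else
    {
      event_t *grown = realloc (p->elts, (p->n_elts + 1) * sizeof (*grown));
      if (grown)
        {
          p->elts = grown;      /* base address may change here */
          idx = (int) p->n_elts++;
        }
    }
  if (idx >= 0)
    memset (&p->elts[idx], 0, sizeof (event_t));
  pthread_mutex_unlock (&p->lock);
  return idx;
}

/* Returns 0 on success, -1 for an out-of-range index or failed allocation.
   A double put of the same index is NOT detected in this sketch. */
static int
locked_pool_put (locked_pool_t *p, unsigned idx)
{
  int rv = -1;
  pthread_mutex_lock (&p->lock);
  if (idx < p->n_elts)
    {
      unsigned *grown = realloc (p->free_idx, (p->n_free + 1) * sizeof (*grown));
      if (grown)
        {
          p->free_idx = grown;
          p->free_idx[p->n_free++] = idx;
          rv = 0;
        }
    }
  pthread_mutex_unlock (&p->lock);
  return rv;
}
```

Because the storage can move on growth, callers in this sketch should pass indices between threads rather than element pointers, and should also hold the lock while reading or writing an element.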
Thank you,
You are absolutely right. Our business logic involves a worker thread and a main thread. The worker thread calls pool_get_zero to obtain a piece of memory from the global variable event_pool, and then sends an event with the index of this memory block through the function vlib_process_signal_event_mt. On the other side, the main thread creates an event listener using the function vlib_process_create. When it receives an event from the worker thread, it retrieves the corresponding memory based on the index, uses it, and then returns it to the pool using pool_put. What are the potential issues with this usage method?
In the debug info, I noticed i=3219795329229292237. The value of i is the element pointer minus the memory pool base address, i.e. the element's index. Normally this value should be 1, but the stack trace shows 3219795329229292237, which suggests that by the time pool_put was called to return the memory, the pool base address had changed, so the calculation uword pool_var (l) = pool_var (e) - pool_var (p_); in pool_put produced a wrong result. As a result, the free_bitmap may grow out of control, causing problems.
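The effect described above can be reproduced outside VPP. The following is a minimal sketch in plain C with hypothetical names: it takes the address of element 1 relative to an old base, then computes the index once against the old base and once against a new base, standing in for a pool that moved while growing. It uses integer address arithmetic so the demo itself stays well defined; the exact garbage value will differ from the one in the backtrace.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { uint64_t data[4]; } elt_t;

/* Element index as "element address minus base address", in elements. */
static uint64_t
index_from_addresses (uintptr_t elt_addr, uintptr_t base_addr)
{
  return (uint64_t) (elt_addr - base_addr) / sizeof (elt_t);
}

/* Returns 0 on success and fills in both indices, -1 if allocation failed. */
static int
demo_stale_base (uint64_t *fresh_index, uint64_t *stale_index)
{
  elt_t *old_base = calloc (2, sizeof (elt_t));
  elt_t *new_base = calloc (4, sizeof (elt_t));
  if (!old_base || !new_base)
    {
      free (old_base);
      free (new_base);
      return -1;
    }

  /* One thread locates element 1 using the base it currently sees. */
  uintptr_t elt_addr = (uintptr_t) &old_base[1];

  /* Meanwhile the pool "grows": contents are copied to a new location. */
  memcpy (new_base, old_base, 2 * sizeof (elt_t));

  *fresh_index = index_from_addresses (elt_addr, (uintptr_t) old_base);
  *stale_index = index_from_addresses (elt_addr, (uintptr_t) new_base);

  free (old_base);
  free (new_base);
  return 0;
}
```

With the old base the index is 1, as expected. With the new base the subtraction mixes two unrelated addresses, so the result is some other number, and setting that bit in a bitmap would force the bitmap to resize to an absurd length.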
I would like to know if it is possible to use the pool_init_fixed function to pre-allocate a memory pool without using locks. This way, the memory pool base address will not change when pool_get re-applies for memory.
Thanks, you are definitely right. So I will try to use vec_validate to allocate the memory instead. vec_validate is thread safe, right?
I plan to use vec_validate in a worker thread to allocate memory, then use vlib_process_signal_event_mt to send the memory address to the main thread. After the main thread uses this memory, it will call vec_free to release the memory. This logic should work fine.
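The ownership handoff in this plan can be sketched in plain C. This is only an illustration under stated assumptions: malloc and free stand in for vec_validate and vec_free, a one-slot mailbox with a mutex and condition variable stands in for vlib_process_signal_event_mt, and all names are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int value; } msg_t;

/* Hypothetical one-slot mailbox, a stand-in for the event mechanism. */
typedef struct {
  pthread_mutex_t lock;
  pthread_cond_t ready;
  msg_t *slot;   /* message handed from worker to main, or NULL */
  int done;      /* set once the worker has finished, even on failure */
} mailbox_t;

static mailbox_t mailbox = {
  PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL, 0
};

/* Worker: allocates the message, fills it, then gives it away. */
static void *
worker_fn (void *arg)
{
  msg_t *m = malloc (sizeof (*m));
  if (m)
    m->value = (int) (intptr_t) arg;

  pthread_mutex_lock (&mailbox.lock);
  mailbox.slot = m;      /* from here on the worker never touches m */
  mailbox.done = 1;
  pthread_cond_signal (&mailbox.ready);
  pthread_mutex_unlock (&mailbox.lock);
  return NULL;
}

/* Main side: waits for the message, uses it, and frees it.
   Returns the received value, or -1 on any failure. */
static int
run_handoff (int value)
{
  pthread_t worker;

  pthread_mutex_lock (&mailbox.lock);
  mailbox.slot = NULL;
  mailbox.done = 0;
  pthread_mutex_unlock (&mailbox.lock);

  if (pthread_create (&worker, NULL, worker_fn, (void *) (intptr_t) value))
    return -1;

  pthread_mutex_lock (&mailbox.lock);
  while (!mailbox.done)
    pthread_cond_wait (&mailbox.ready, &mailbox.lock);
  msg_t *m = mailbox.slot;
  mailbox.slot = NULL;
  pthread_mutex_unlock (&mailbox.lock);

  pthread_join (worker, NULL);
  if (!m)
    return -1;

  int seen = m->value;
  free (m);              /* the receiver is the only thread that frees */
  return seen;
}
```

The point of the design is that exactly one thread owns each block at any time, so no shared container is modified concurrently. It still depends on the underlying allocator being safe to call from both threads.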
Indeed, it relies heavily on the locks in the underlying dlmalloc allocator and could increase memory fragmentation if worker threads call vec_validate frequently.