Skip to content

Commit

Permalink
ftrace: update after introduction of ftrace_format.py
Browse files Browse the repository at this point in the history
The ftrace implementation has changed [1]. Update the documentation
accordingly.

Link: OP-TEE/optee_os@5c2c0fb31efb [1]
Signed-off-by: Jerome Forissier <[email protected]>
Acked-by: Jens Wiklander <[email protected]>
  • Loading branch information
jforissier authored and jbech-linaro committed Jun 29, 2023
1 parent 17bb9b7 commit be6f978
Showing 1 changed file with 70 additions and 39 deletions.
109 changes: 70 additions & 39 deletions debug/ftrace.rst
Original file line number Diff line number Diff line change
@@ -1,73 +1,99 @@
.. _ftrace:

Ftrace
######
This section describes how to generate function call graph for user Trusted
Applications using ``ftrace``.
Ftrace (function tracing)
#########################
This section describes how to generate a function call graph for user Trusted
Applications using ``ftrace``. The name comes from the Linux framework which
has a similar purpose, but the OP-TEE ``ftrace`` is very much specific.

A call graph logs all the calls to instrumented functions and contains timing
information. It is therefore a valuable tool to troubleshoot performance
problems or to optimize the code in general.

The configuration option ``CFG_FTRACE_SUPPORT=y`` enables OP-TEE to collect
function graph information from Trusted Applications running in user mode and
compiled with ``-pg``. Once collected, the function graph data is formatted
in the ``ftrace.out`` format and sent to ``tee-supplicant`` via RPC, so they
can be saved to disk, later processed and displayed using helper script called
``symbolize.py`` present as part of ``optee_os`` repo.
compiled with ``-pg``. Once collected, the function graph data is sent to
``tee-supplicant`` via RPC, so they can be saved to disk, later processed
and displayed using helper scripts (``ftrace_format.py`` and ``symbolize.py``
which can be found in ``optee_os/scripts``).

Another configuration option ``CFG_SYSCALL_FTRACE=y`` in addition to
``CFG_FTRACE_SUPPORT=y`` enables OP-TEE to collect function graph information
for syscalls as well while running in kernel mode on behalf of Trusted
Applications.
Applications. Note that a small number of kernel functions cannot be traced;
they have the ``__noprof`` attribute in the source code.

A third configuration option ``CFG_ULIBS_MCOUNT=y`` enables tracing of user
space libraries contained in ``optee_os`` and used by TAs (such as ``libutee``
and ``libutils``).

Usage
*****

- Build OP-TEE OS and OP-TEE Client with ``CFG_FTRACE_SUPPORT=y``. You
may also set ``CFG_ULIBS_MCOUNT=y`` in OP-TEE OS to instrument the
user TA libraries contained in ``optee_os`` (such as ``libutee`` and
``libutils``).

- Optionally build OP-TEE OS with ``CFG_SYSCALL_FTRACE=y`` to dump
additional syscall function graph information.
- Build OP-TEE OS with ``CFG_FTRACE_SUPPORT=y`` and optionally
``CFG_ULIBS_MCOUNT=y`` and ``CFG_SYSCALL_FTRACE=y``.

- Build user TAs with ``-pg``, for instance enable ``CFG_TA_MCOUNT=y`` to
instrument whole TA. Also, in case user wants to set ``-pg`` for a
instrument the whole TA. Also, in case user wants to set ``-pg`` for a
particular file, following should go in corresponding sub.mk:
``cflags-<file-name>-y+=-pg``. Note that instrumented TAs have a larger
``.bss`` section. The memory overhead depends on ``CFG_FTRACE_BUF_SIZE``
macro which can be configured specific to user TAs using config:
``CFG_FTRACE_BUF_SIZE=4096`` (default value: 2048, refer to the TA linker
script for details: ``ta/arch/arm/ta.ld.S``).
script for details: ``ta/arch/$(ARCH)/ta.ld.S``).

- Run the application normally. When the current session exits or there is
any abort during TA execution, ``tee-supplicant`` will write function
graph data to ``/tmp/ftrace-<ta_uuid>.out``. If the file already exists,
a number is appended, such as: ``ftrace-<ta_uuid>.1.out``.

- Run helper script called ``symbolize.py`` to translate the function graph
addresses into function names: ``cat ftrace-<ta_uuid>.out |
./optee_os/scripts/symbolize.py -d <ta_uuid>.elf -d tee.elf``
- Run helper scripts called ``ftrace_format.py`` to translate the function
graph binary data into human readable text and ``symbolize.py`` to
convert function addresses into function names:
``cat ftrace-<ta_uuid>.out | optee_os/scripts/ftrace_format.py |
optee_os/scripts/symbolize.py -d <ta_uuid>.elf -d tee.elf``

- Refer to `commit 5c2c0fb31efb`_ for a full usage example on QEMU.

Typical output
**************

.. code-block:: bash
| __ta_entry() {
| __utee_entry() {
1.664 us | ta_header_get_session();
11.264 us | from_utee_params();
.896 us | memcpy();
| TA_InvokeCommandEntryPoint() {
| TEE_GenerateRandom() {
163.360 us | utee_cryp_random_number_generate();
186.848 us | }
214.288 us | }
19.088 us | to_utee_params();
| ta_header_save_params() {
.736 us | memset();
2.832 us | }
304.880 us | }
307.168 us | }
.. code-block:: none
TEE load address @ 0x5ab04000
Function graph for TA: cb3e5ba0-adf1-11e0-998b-0002a5d5c51b @ 80085000
| 1 | __ta_entry() {
| 2 | __utee_entry() {
43.840 us | 3 | ta_header_get_session()
7.216 us | 3 | tahead_get_trace_level()
14.480 us | 3 | trace_set_level()
| 3 | malloc_add_pool() {
| 4 | raw_malloc_add_pool() {
46.032 us | 5 | bpool()
| 5 | raw_realloc() {
166.256 us | 6 | bget()
23.056 us | 6 | raw_malloc_return_hook()
267.952 us | 5 | }
398.720 us | 4 | }
426.992 us | 3 | }
| 3 | TEE_GetPropertyAsU32() {
23.600 us | 4 | is_propset_pseudo_handle()
| 4 | __utee_check_instring_annotation() {
26.416 us | 5 | strlen()
| 5 | check_access() {
| 6 | TEE_CheckMemoryAccessRights() {
| 7 | _utee_check_access_rights() {
| 8 | syscall_check_access_rights() {
| 9 | ts_get_current_session() {
4.304 us | 10 | ts_get_current_session_may_fail()
10.976 us | 9 | }
| 9 | to_user_ta_ctx() {
2.496 us | 10 | is_user_ta_ctx()
8.096 us | 9 | }
| 9 | vm_check_access_rights() {
| 10 | vm_buf_is_inside_um_private() {
| 11 | core_is_buffer_inside() {
...
The duration (function's time of execution) is displayed on the closing bracket
line of a function or on the same line in case the function is the leaf one.
Expand All @@ -76,3 +102,8 @@ It comprises the time spent executing the function and any of its callees. The
Counter-timer Physical Count register (CNTPCT) and the Counter-timer Frequency
register (CNTFRQ) are used to compute durations. Time spent servicing foreign
interrupts is subtracted.

The second column is the stack depth for the current function. It helps
visually match function entries and exit.

.. _commit 5c2c0fb31efb: https://github.com/OP-TEE/optee_os/commit/5c2c0fb31efb

0 comments on commit be6f978

Please sign in to comment.