Kata Metrics in Rust Runtime (runtime-rs)

The Rust runtime (runtime-rs) is responsible for:

  • Gathering metrics about the shim.
  • Gathering metrics from the hypervisor (through a channel).
  • Getting metrics from the agent (through ttrpc).

All metrics gathered by runtime-rs are listed below.

  • The current status of each entry is marked as:
    • ✅: DONE
    • 🚧: TODO

Kata Shim

STATUS Metric name Type Units Labels
🚧 kata_shim_agent_rpc_durations_histogram_milliseconds:
RPC latency distributions.
HISTOGRAM milliseconds
  • action (RPC actions of Kata agent)
    • grpc.CheckRequest
    • grpc.CloseStdinRequest
    • grpc.CopyFileRequest
    • grpc.CreateContainerRequest
    • grpc.CreateSandboxRequest
    • grpc.DestroySandboxRequest
    • grpc.ExecProcessRequest
    • grpc.GetMetricsRequest
    • grpc.GuestDetailsRequest
    • grpc.ListInterfacesRequest
    • grpc.ListProcessesRequest
    • grpc.ListRoutesRequest
    • grpc.MemHotplugByProbeRequest
    • grpc.OnlineCPUMemRequest
    • grpc.PauseContainerRequest
    • grpc.RemoveContainerRequest
    • grpc.ReseedRandomDevRequest
    • grpc.ResumeContainerRequest
    • grpc.SetGuestDateTimeRequest
    • grpc.SignalProcessRequest
    • grpc.StartContainerRequest
    • grpc.StatsContainerRequest
    • grpc.TtyWinResizeRequest
    • grpc.UpdateContainerRequest
    • grpc.UpdateInterfaceRequest
    • grpc.UpdateRoutesRequest
    • grpc.WaitProcessRequest
    • grpc.WriteStreamRequest
  • sandbox_id
kata_shim_fds:
Kata containerd shim v2 open FDs.
GAUGE
  • sandbox_id
kata_shim_io_stat:
Kata containerd shim v2 process IO statistics.
GAUGE
  • item (see /proc/<pid>/io)
    • cancelledwritebytes
    • rchar
    • readbytes
    • syscr
    • syscw
    • wchar
    • writebytes
  • sandbox_id
kata_shim_netdev:
Kata containerd shim v2 network devices statistics.
GAUGE
  • interface (network device name)
  • item (see /proc/net/dev)
    • recv_bytes
    • recv_compressed
    • recv_drop
    • recv_errs
    • recv_fifo
    • recv_frame
    • recv_multicast
    • recv_packets
    • sent_bytes
    • sent_carrier
    • sent_colls
    • sent_compressed
    • sent_drop
    • sent_errs
    • sent_fifo
    • sent_packets
  • sandbox_id
🚧 kata_shim_pod_overhead_cpu:
Kata Pod overhead for CPU resources (percent).
GAUGE percent
  • sandbox_id
🚧 kata_shim_pod_overhead_memory_in_bytes:
Kata Pod overhead for memory resources (bytes).
GAUGE bytes
  • sandbox_id
kata_shim_proc_stat:
Kata containerd shim v2 process statistics.
GAUGE
  • item (see /proc/<pid>/stat)
    • cstime
    • cutime
    • stime
    • utime
  • sandbox_id
kata_shim_proc_status:
Kata containerd shim v2 process status.
GAUGE
  • item (see /proc/<pid>/status)
    • hugetlbpages
    • nonvoluntary_ctxt_switches
    • rssanon
    • rssfile
    • rssshmem
    • vmdata
    • vmexe
    • vmhwm
    • vmlck
    • vmlib
    • vmpeak
    • vmpin
    • vmpmd
    • vmpte
    • vmrss
    • vmsize
    • vmstk
    • vmswap
    • voluntary_ctxt_switches
  • sandbox_id
🚧 kata_shim_process_cpu_seconds_total:
Total user and system CPU time spent in seconds.
COUNTER seconds
  • sandbox_id
🚧 kata_shim_process_max_fds:
Maximum number of open file descriptors.
GAUGE
  • sandbox_id
🚧 kata_shim_process_open_fds:
Number of open file descriptors.
GAUGE
  • sandbox_id
🚧 kata_shim_process_resident_memory_bytes:
Resident memory size in bytes.
GAUGE bytes
  • sandbox_id
🚧 kata_shim_process_start_time_seconds:
Start time of the process since unix epoch in seconds.
GAUGE seconds
  • sandbox_id
🚧 kata_shim_process_virtual_memory_bytes:
Virtual memory size in bytes.
GAUGE bytes
  • sandbox_id
🚧 kata_shim_process_virtual_memory_max_bytes:
Maximum amount of virtual memory available in bytes.
GAUGE bytes
  • sandbox_id
🚧 kata_shim_rpc_durations_histogram_milliseconds:
RPC latency distributions.
HISTOGRAM milliseconds
  • action (Kata shim v2 actions)
    • checkpoint
    • close_io
    • connect
    • create
    • delete
    • exec
    • kill
    • pause
    • pids
    • resize_pty
    • resume
    • shutdown
    • start
    • state
    • stats
    • update
    • wait
  • sandbox_id
kata_shim_threads:
Kata containerd shim v2 process threads.
GAUGE
  • sandbox_id
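As an illustration of how one of the shim gauges above could be produced, here is a minimal sketch for kata_shim_io_stat. It parses text in the format of /proc/<pid>/io and renders one sample line per item in the Prometheus text format. This is not the runtime-rs implementation: the function names parse_proc_io and render_io_stat are hypothetical, and the key normalisation (lower-casing and dropping underscores, so that read_bytes becomes readbytes) is an assumption inferred from the item labels listed in the table.

```rust
use std::collections::BTreeMap;

/// Parse text in the format of `/proc/<pid>/io` into (item, value) pairs.
/// Assumption: keys are normalised to match the `item` labels in the table,
/// i.e. lower-cased with underscores removed (`read_bytes` -> `readbytes`).
fn parse_proc_io(text: &str) -> BTreeMap<String, u64> {
    let mut items = BTreeMap::new();
    for line in text.lines() {
        // Each line looks like "rchar: 3412"; skip anything that does not.
        let Some((key, value)) = line.split_once(':') else { continue };
        let Ok(value) = value.trim().parse::<u64>() else { continue };
        let item = key.trim().to_lowercase().replace('_', "");
        items.insert(item, value);
    }
    items
}

/// Render the parsed items as Prometheus text-format sample lines,
/// one per item, each labelled with the sandbox ID.
/// (Hypothetical helper; a real exporter would use a metrics library.)
fn render_io_stat(sandbox_id: &str, items: &BTreeMap<String, u64>) -> String {
    let mut out = String::new();
    for (item, value) in items {
        out.push_str(&format!(
            "kata_shim_io_stat{{item=\"{item}\",sandbox_id=\"{sandbox_id}\"}} {value}\n"
        ));
    }
    out
}
```

To try it against a live process, the input could come from std::fs::read_to_string("/proc/self/io") on Linux. The same parse-then-label pattern would apply to the other /proc-backed gauges (kata_shim_proc_stat, kata_shim_proc_status, kata_shim_netdev), although each of those files has its own layout.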

Kata Hypervisor

Unlike the Go runtime, runtime-rs runs the hypervisor and the shim in the same process, so the metrics previously collected separately for the hypervisor and the shim only need to be gathered once. We therefore currently collect them only in the Kata shim.

We also added an interface (VmmAction::GetHypervisorMetrics) for gathering hypervisor metrics, in case tailor-made hypervisor metrics are designed in the future. The following metrics are exposed from src/dragonball/src/metric.rs.

Metric name Type Units Labels
kata_hypervisor_scrape_count:
Metrics scrape count
COUNTER
  • sandbox_id
kata_hypervisor_vcpu:
Hypervisor metrics specific to VCPUs' mode of functioning.
IntGauge
  • item
    • exit_io_in
    • exit_io_out
    • exit_mmio_read
    • exit_mmio_write
    • failures
    • filter_cpuid
  • sandbox_id
kata_hypervisor_seccomp:
Hypervisor metrics for the seccomp filtering.
IntGauge
  • item
    • num_faults
  • sandbox_id
kata_hypervisor_seccomp:
Hypervisor metrics for the seccomp filtering.
IntGauge
  • item
    • sigbus
    • sigsegv
  • sandbox_id
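The channel-based request for hypervisor metrics described in this section can be sketched with the standard library alone. Only the name VmmAction::GetHypervisorMetrics comes from the text above. The reply type VmmData, the helpers spawn_vmm and get_hypervisor_metrics, and the use of std::sync::mpsc are assumptions made for illustration and do not reflect the actual dragonball code. The sandbox_id label is omitted for brevity.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Request type. Only the variant name is taken from the document;
/// the shape of the enum is hypothetical.
enum VmmAction {
    GetHypervisorMetrics,
}

/// Hypothetical reply type: metrics already rendered as text.
enum VmmData {
    HypervisorMetrics(String),
}

/// A request carries the action plus a sender for the reply.
type VmmRequest = (VmmAction, Sender<VmmData>);

/// Spawn a stand-in "hypervisor" thread that answers metric requests.
fn spawn_vmm() -> Sender<VmmRequest> {
    let (tx, rx) = channel::<VmmRequest>();
    thread::spawn(move || {
        let mut scrape_count: u64 = 0;
        // The loop ends once every request sender has been dropped.
        for (action, reply) in rx {
            match action {
                VmmAction::GetHypervisorMetrics => {
                    scrape_count += 1;
                    let body = format!("kata_hypervisor_scrape_count {scrape_count}\n");
                    // Ignore send errors: the requester may have gone away.
                    let _ = reply.send(VmmData::HypervisorMetrics(body));
                }
            }
        }
    });
    tx
}

/// Shim side: send the action and block until the reply arrives.
/// Returns None if the hypervisor thread is no longer running.
fn get_hypervisor_metrics(vmm: &Sender<VmmRequest>) -> Option<String> {
    let (reply_tx, reply_rx) = channel();
    vmm.send((VmmAction::GetHypervisorMetrics, reply_tx)).ok()?;
    match reply_rx.recv().ok()? {
        VmmData::HypervisorMetrics(text) => Some(text),
    }
}
```

Calling get_hypervisor_metrics(&spawn_vmm()) returns the rendered text for one scrape. Because the shim and the hypervisor share a process in runtime-rs, an in-process channel of this kind is enough and no socket or RPC layer is needed for this path.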