RSS memory increase on tetragon #2892
Comments
Hello, thanks for the detailed report. Seeing a memory increase when you load tracing policies is normal behavior. If you are looking at That said, I've been working on tracking Tetragon's memory consumption and trying to avoid unnecessary memory waste. In your heap dumps, the biggest consumer is:
That is the process cache. I'm currently working on fixing a potential issue we have there: the cache can grow too large compared to the number of processes actually running on the host. Fixing that might let Tetragon consume less memory overall.
Hi @mtardy, that's exciting! We're seeing something similar to @Jianlin-lv, where our memory seems to grow unbounded. Our heap dump also shows the process cache as the largest consumer, and our workloads are ephemeral by nature (lots of pod churn), so I'm quite curious about your point on the process cache growing too much. I have some questions regarding your work:
Let me answer both questions here: theoretically, the process cache size should be in line with the number of processes currently running on the host. So a lot of different situations can happen here depending on what you do with your host, but in a general case, it should eventually be stable and pretty low in size. The issue we see is that in some situations, the process cache fills pretty quickly while the number of processes on the host is under a few hundred. We are currently merging work that will allow us to diagnose on a running agent what's happening in the cache #2246. Eventually, if people have very different needs (running tetragon at scale on a host with hundreds of thousands of processes), we are open to implementing tuning on the sizing of cache and BPF maps.
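To check the expectation above on a given host, one rough approach is to count the processes visible under `/proc` and compare that number with whatever cache size the agent reports. The sketch below is a standalone Go program, not part of Tetragon; the function names (`isPID`, `countPIDs`, `countHostProcesses`) are hypothetical, and it assumes it runs in the host PID namespace with the host's `/proc` mounted at `/proc`.

```go
package main

import (
	"fmt"
	"os"
)

// isPID reports whether a /proc entry name consists only of decimal digits,
// which is how Linux names per-process directories.
func isPID(name string) bool {
	if name == "" {
		return false
	}
	for _, r := range name {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// countPIDs counts the entry names that look like PIDs.
func countPIDs(names []string) int {
	n := 0
	for _, name := range names {
		if isPID(name) {
			n++
		}
	}
	return n
}

// countHostProcesses lists procRoot (normally "/proc") and counts the
// per-process directories. Threads are not counted separately.
func countHostProcesses(procRoot string) (int, error) {
	entries, err := os.ReadDir(procRoot)
	if err != nil {
		return 0, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		if e.IsDir() {
			names = append(names, e.Name())
		}
	}
	return countPIDs(names), nil
}

func main() {
	n, err := countHostProcesses("/proc")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc:", err)
		os.Exit(1)
	}
	fmt.Println("processes on host:", n)
}
```

If the cache holds far more entries than this count for a long time, that would match the situation described above, where the cache fills while only a few hundred processes are running.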
Here it depends on what exactly we are talking about. Generally, Tetragon has two big sources of memory consumption if you look at
We merged this patch which should help on base heap (thus RSS) use: We also merged this, which should help to understand memory use by process cache issues (the thing you are seeing as
What happened?
In my test environment, I applied two TracingPolicy resources and observed an increasing trend in Tetragon's RSS memory consumption.
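For anyone who wants to reproduce this kind of observation, a minimal sketch of sampling RSS is to read the `VmRSS` field from `/proc/<pid>/status`. This is a generic Go example rather than anything Tetragon ships; `parseVmRSS` is a hypothetical helper, and the PID is assumed to be passed as the first command-line argument.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseVmRSS extracts the VmRSS value, in kB, from the text of
// /proc/<pid>/status.
func parseVmRSS(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "VmRSS:") {
			continue
		}
		fields := strings.Fields(line) // e.g. ["VmRSS:", "123456", "kB"]
		if len(fields) < 2 {
			return 0, fmt.Errorf("malformed VmRSS line: %q", line)
		}
		return strconv.ParseUint(fields[1], 10, 64)
	}
	return 0, fmt.Errorf("VmRSS not found")
}

func main() {
	// Assumption: the PID to inspect is the first argument; default to this process.
	pid := "self"
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	data, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read status:", err)
		os.Exit(1)
	}
	kb, err := parseVmRSS(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("VmRSS: %d kB\n", kb)
}
```

Running it periodically against the agent's PID gives a simple time series of RSS to set next to the heap profiles.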
I enabled pprof to try to figure out which part consumes the most memory.
Comparing the two samples, taken before and after, memory consumption has increased in process.initProcessInternalExec, tracing.HandleMsgGenericKprobe, namespace.GetMsgNamespaces, and Caps.GetCapabilitiesTypes.
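As an illustration of the before/after comparison, here is a minimal sketch of capturing two heap snapshots from a Go program with the standard `runtime/pprof` package. It is a generic example and not how Tetragon itself exposes pprof; the helper names and the output file names are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// heapProfileBytes forces a GC so the profile reflects up-to-date live-heap
// statistics, then returns the heap profile in pprof's gzip-compressed format.
func heapProfileBytes() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// saveHeapProfile writes one heap snapshot to path.
func saveHeapProfile(path string) error {
	data, err := heapProfileBytes()
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

func main() {
	if err := saveHeapProfile("heap-before.pb.gz"); err != nil {
		fmt.Fprintln(os.Stderr, "before snapshot failed:", err)
		os.Exit(1)
	}

	// ... the workload under investigation would run here ...

	if err := saveHeapProfile("heap-after.pb.gz"); err != nil {
		fmt.Fprintln(os.Stderr, "after snapshot failed:", err)
		os.Exit(1)
	}
	fmt.Println("wrote heap-before.pb.gz and heap-after.pb.gz")
}
```

The two files can then be diffed with `go tool pprof -base heap-before.pb.gz heap-after.pb.gz`, where `top` lists the functions whose in-use memory grew between the snapshots.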
I'm not sure if this is the desired behavior or if there is a memory leak.
TracingPolicy
Tetragon Version
v1.1.2
Kernel Version
ubuntu 22.04 , kernel 5.15.0-26
Kubernetes Version
No response
Bugtool
No response
Relevant log output
No response
Anything else?
No response