Sidecar fails after reaching 4GB RSS Memory #1485

Open
Imod7 opened this issue Aug 30, 2024 · 7 comments
Labels
I7 - Optimization 🚴 Make Sidecar drive faster

Comments

@Imod7
Contributor

Imod7 commented Aug 30, 2024

Description
A team using Sidecar has reported that the Sidecar process gradually consumes up to 4 GB of RSS memory and then becomes incapable of handling any requests, so the process has to be restarted. It runs in its own container alongside a Polkadot node. Even after more resources were allocated to the container, it consistently fails once it hits the 4 GB mark.

Linked to #1361

Possible Issues
Memory leaks?

Possible Good Resources

@Imod7 Imod7 added the I7 - Optimization 🚴 Make Sidecar drive faster label Aug 30, 2024
@filvecchiato
Contributor

Even when the container has more resources available for Sidecar, the process will still crash once it reaches its heap limit unless it is assigned a larger heap. The start command could be modified to allow more heap memory (to match a larger container).

Some of the work done between PJS and Sidecar should allow it to run more efficiently, decreasing the likelihood of this specific case happening. However, more work is required to reduce memory consumption, and the PJS cache system will still be a big contributor to this issue.
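
To check which heap limit a running Node.js process actually has (as opposed to what the container allows), something like the following could be used. This is a minimal sketch using Node's built-in `v8` module; the helper name `heapLimitMiB` is only illustrative and is not part of Sidecar.

```javascript
// Report the V8 heap size limit of the current Node.js process.
const v8 = require('node:v8');

function heapLimitMiB() {
  // heap_size_limit is reported in bytes by v8.getHeapStatistics().
  const { heap_size_limit } = v8.getHeapStatistics();
  return Math.round(heap_size_limit / (1024 * 1024));
}

console.log(`V8 heap limit: ${heapLimitMiB()} MiB`);
```

If the reported limit stays near 4 GB regardless of the container size, that would be consistent with the behavior described above, and raising the limit via the start command should change the number printed here.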

@gituser

gituser commented Nov 30, 2024

After updating from v19.0.2 to v19.3.1, sidecar failed within 24 hours with this error:

Nov 30 03:20:04 polkadot sidecar[577113]: <--- Last few GCs --->
Nov 30 03:20:04 polkadot sidecar[577113]: [577113:0x62e3480] 115832777 ms: Mark-sweep 3784.2 (4142.6) -> 3767.8 (4142.6) MB, 107.0 / 0.0 ms  (average mu = 0.337, current mu = 0.328) allocation failure; scavenge might not succeed
Nov 30 03:20:04 polkadot sidecar[577113]: [577113:0x62e3480] 115832931 ms: Mark-sweep 3784.0 (4142.9) -> 3767.7 (4142.9) MB, 107.4 / 0.0 ms  (average mu = 0.318, current mu = 0.298) task; scavenge might not succeed
Nov 30 03:20:04 polkadot sidecar[577113]: <--- JS stacktrace --->
Nov 30 03:20:04 polkadot sidecar[577113]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Nov 30 03:20:04 polkadot sidecar[577113]:  1: 0xb83f50 node::Abort() [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  2: 0xa94834  [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  3: 0xd647c0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  4: 0xd64b67 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  5: 0xf42265  [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  6: 0xf43168 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  7: 0xf53673  [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  8: 0xf544e8 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]:  9: 0xf2ee4e v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]: 10: 0xf30217 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]: 11: 0xf113ea v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]: 12: 0x12d674f v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/bin/node]
Nov 30 03:20:04 polkadot sidecar[577113]: 13: 0x17035b9  [/usr/bin/node]

The container running sidecar has plenty of memory, and v19.0.2 previously ran without issues for at least 6 months.

Here is a workaround: run sidecar with the max-old-space-size option:

/usr/bin/node --max-old-space-size=8192 /home/polkadot/sidecar-current/src/main.js
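
Rather than hard-coding a value, the flag could be derived from the container's memory limit, leaving some headroom because RSS also includes memory outside the old space. The sketch below is only an illustration: the helper name `oldSpaceFlag` and the 75% fraction are assumptions, not something Sidecar or Node.js prescribes.

```javascript
// Build a --max-old-space-size flag from a container memory limit (in MiB).
// The default fraction of 0.75 is an assumed headroom heuristic.
function oldSpaceFlag(containerLimitMiB, fraction = 0.75) {
  if (!Number.isFinite(containerLimitMiB) || containerLimitMiB <= 0) {
    throw new RangeError('containerLimitMiB must be a positive number');
  }
  if (!(fraction > 0 && fraction <= 1)) {
    throw new RangeError('fraction must be in the range (0, 1]');
  }
  return `--max-old-space-size=${Math.floor(containerLimitMiB * fraction)}`;
}

// Example: a container limited to 12288 MiB.
console.log(oldSpaceFlag(12288)); // --max-old-space-size=9216
```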

@Imod7
Contributor Author

Imod7 commented Dec 2, 2024

Thank you @gituser so much for your feedback! 💯
Yes, --max-old-space-size is one of the workarounds we also suggested in issue #1555. Sharing exactly which memory setting works for you is super useful. We will keep it in mind for other users and might also update the README to give it more visibility. Thanks again!

@gituser

gituser commented Dec 4, 2024

> Thank you @gituser so much for your feedback! 💯 Yes, the --max-old-space-size is one of the workarounds we also suggested in the issue #1555. Sharing which exact memory settings/amount works for you is super useful. We will keep it in mind for other users and might also update the README to give more visibility. Thanks again!

@Imod7

Unfortunately, 8192 MB is not enough for sidecar. I'll keep increasing it, but this is not normal and the bug should be fixed.
Also, an interesting note: after the OOM, sidecar starts working again, and fetching this specific block works again.
Most likely there is a memory leak somewhere in sidecar; as I noted earlier, v19.0.2 worked just fine without any issues.
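
One way to tell a leak from a one-off spike would be to log memory usage at a fixed interval and look for steady growth in `heapUsed`. The following is a rough sketch using `process.memoryUsage()`; the function names and the one-minute default interval are assumptions, and the code would have to be loaded into the sidecar process itself (for example via node's `--require` option) to be useful.

```javascript
// Take a snapshot of the current process memory usage, in MiB.
function memorySnapshotMiB() {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const toMiB = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    rssMiB: toMiB(rss),
    heapUsedMiB: toMiB(heapUsed),
    heapTotalMiB: toMiB(heapTotal),
  };
}

// Log one JSON line per interval; returns the timer so it can be stopped.
function startMemoryLog(intervalMs = 60_000, log = console.log) {
  const timer = setInterval(() => {
    log(JSON.stringify({ time: new Date().toISOString(), ...memorySnapshotMiB() }));
  }, intervalMs);
  // Do not keep the process alive just for this timer.
  timer.unref();
  return timer;
}

// Usage (hypothetical): const timer = startMemoryLog();
```

If `heapUsedMiB` keeps climbing over many hours under a steady request load and never drops back after garbage collection, that would support the leak hypothesis.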

@gituser

gituser commented Dec 18, 2024

Any update on this? @Imod7 @filvecchiato

v19.0.2 works just fine, and v19.3.1 fails after running for about 20 hours. It's clearly a regression somewhere in the code.

@TarikGul
Member

> Any update on this? @Imod7 @filvecchiato
>
> v19.0.2 works just fine and v19.3.1 fails after running about 20 hours. It's clearly a regression in the code somewhere.

Just to clarify, you haven't tried any other versions between 19.0.2 and 19.3.1, right? I haven't dived into this much, but I am happy to profile some of the releases between 19.0.2 and 19.3.1 when I am back on Monday. I would be glad to lend a helping hand on this and see it get resolved.

I would be curious to see if this is related to polkadot-js in any way as well.
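
For comparing releases, heap snapshots taken a few hours apart can show which objects accumulate. The sketch below writes a snapshot when the process receives a signal, using Node's built-in `v8.writeHeapSnapshot()`. The helper names, the file naming scheme, and the choice of `SIGUSR2` are assumptions for illustration, and it assumes a Linux host. Writing a snapshot blocks the process and needs substantial extra memory, so it is best done well before the heap is close to its limit.

```javascript
const v8 = require('node:v8');
const path = require('node:path');
const os = require('node:os');

// Build a timestamped snapshot file name (hypothetical naming scheme).
function snapshotPath(dir = os.tmpdir(), now = new Date()) {
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return path.join(dir, `sidecar-${process.pid}-${stamp}.heapsnapshot`);
}

// Write a heap snapshot each time the given signal is received.
function installSnapshotSignal(signal = 'SIGUSR2') {
  process.on(signal, () => {
    const file = v8.writeHeapSnapshot(snapshotPath());
    console.error(`heap snapshot written to ${file}`);
  });
}

// Usage (hypothetical): call installSnapshotSignal() at startup,
// then send the signal with: kill -USR2 <pid>
```

The resulting `.heapsnapshot` files can be loaded into the Memory tab of Chrome DevTools and compared to see which object types grow between two points in time.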

@gituser

gituser commented Dec 18, 2024

> > Any update on this? @Imod7 @filvecchiato
> > v19.0.2 works just fine and v19.3.1 fails after running about 20 hours. It's clearly a regression in the code somewhere.
>
> Just to clarify, you haven't tried any other versions between 19.0.2 and 19.3.1 right? I haven't dived into this much, but I am happy to profile some of the releases in between 19.0.2, and 19.3.1 when I am back on Monday. I would be glad to give a helping hand on this and see it get resolved.

@TarikGul No, I didn't try versions in between. But I think v19.3.0 is affected as well.

> I would be curious to see if this is related to polkadot-js in any way as well.

Will be waiting for your results! Thanks.
