Resource management with cgroups and friends #1459

Open
zimbatm opened this issue Sep 18, 2024 · 8 comments

@zimbatm
Member

zimbatm commented Sep 18, 2024

A tracking issue.

Now that NixOS/nix#11412 has been merged, it might be worth deploying Nix with cgroups enabled in the next release.

Arian also built this Prometheus exporter, which is nicely complementary and could be used to collect metrics and better understand resource utilisation: https://github.com/arianvp/cgroup-exporter
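
To give a rough idea of the kind of per-cgroup numbers such metrics are built from, here is a minimal sketch that reads CPU time and current memory from a cgroup v2 directory. It is not taken from cgroup-exporter and makes no claim about how that project works; the function names and the example path are hypothetical, and it assumes cgroup v2 with the memory controller enabled for the cgroup being read.

```python
from pathlib import Path


def parse_flat_keyed(text: str) -> dict[str, int]:
    """Parse a cgroup v2 "flat keyed" file (e.g. cpu.stat) into a dict of ints."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # each line is "<key> <value>"
            stats[parts[0]] = int(parts[1])
    return stats


def read_cgroup_usage(cgroup_dir: str) -> dict[str, int]:
    """Return CPU time (microseconds) and current memory (bytes) for one cgroup.

    cgroup_dir is a hypothetical path such as /sys/fs/cgroup/<some-cgroup>.
    """
    base = Path(cgroup_dir)
    cpu = parse_flat_keyed((base / "cpu.stat").read_text())
    memory = int((base / "memory.current").read_text().strip())
    return {
        "cpu_usage_usec": cpu.get("usage_usec", 0),
        "memory_current_bytes": memory,
    }
```

Sampling these values periodically and taking differences of `cpu_usage_usec` would give CPU utilisation per build cgroup, which is the sort of signal that could inform resource limits later.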

@Mic92
Member

Mic92 commented Sep 18, 2024

Part of nix-experimental.

@zowoq
Contributor

zowoq commented Oct 29, 2024

Maybe I should just apply the cgroups patches to our current nix version so we can try to make some progress on this?

#1466

@zimbatm
Member Author

zimbatm commented Oct 30, 2024

@Mic92 do you know if a new release will be cut soon?

@Mic92
Member

Mic92 commented Oct 31, 2024

We are actually behind schedule for a release. I will ask if we can do it next week.

@zowoq
Contributor

zowoq commented Nov 10, 2024

Remote builds support passing max-jobs. How feasible would it be to add cores as well, so we could set it only for hydra jobs? It could be useful as a complement to cgroups and would also work for darwin jobs. If upstream isn't very keen on it, we could even carry the patch ourselves (if it isn't too complicated)?
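
As a rough illustration of why limiting cores per job matters alongside max-jobs: each of the max-jobs concurrent builds may use up to cores threads, so the worst case is their product. The sketch below only does that arithmetic; the function names and the machine size in the usage note are hypothetical, and it assumes every build actually honours the cores value it is given.

```python
def effective_cores(cores: int, machine_cpus: int) -> int:
    """In Nix, cores = 0 means "use all available CPU cores"."""
    return machine_cpus if cores == 0 else cores


def worst_case_threads(max_jobs: int, cores: int, machine_cpus: int) -> int:
    """Upper bound on build threads if every job uses all of its cores."""
    return max_jobs * effective_cores(cores, machine_cpus)


def oversubscription(max_jobs: int, cores: int, machine_cpus: int) -> float:
    """Ratio of worst-case build threads to CPUs on the machine."""
    return worst_case_threads(max_jobs, cores, machine_cpus) / machine_cpus
```

For example, on a hypothetical 32-CPU builder, `oversubscription(4, 16, 32)` is 2.0, while `oversubscription(4, 8, 32)` is 1.0.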

@zowoq
Contributor

zowoq commented Nov 29, 2024

Interesting error, has happened in a few builds now:

It seems the build is hitting the 20-minute max-silent-time, but instead of being marked Timed Out it is marked Aborted and added back to the queue for another build attempt.

[Screenshots: cgroup1, cgroup2]

@Mic92
Member

Mic92 commented Nov 29, 2024

@zowoq can you also open a nix issue for this?

@zowoq
Contributor

zowoq commented Nov 30, 2024

Trying to find a trivial way to reproduce the error but so far haven't been able to.
