Describe the Bug
At scale, some AppWrappers (AWs) never reach the completed state because the informer and etcd disagree.
Codeflare Stack Component Versions
Please specify the component versions in which you have encountered this bug.
Codeflare SDK:
MCAD:
Steps to Reproduce the Bug
Submit 1K AWs, each wrapping a very short job (10 seconds), and wait for all 1K AWs to complete.
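
To check which AWs are stuck after the run, something like the following could help. It is only a sketch: it assumes the AW list is available as parsed JSON in the usual Kubernetes list shape (an `items` array), that each AW reports its state under `status.state`, and that the terminal value is `"Completed"`. The function names are hypothetical and not part of MCAD or the SDK.

```python
from collections import Counter


def summarize_states(aw_list: dict) -> Counter:
    """Count AWs by reported state (assumed to live at status.state)."""
    return Counter(
        item.get("status", {}).get("state", "<none>")
        for item in aw_list.get("items", [])
    )


def not_completed(aw_list: dict, done_state: str = "Completed") -> list:
    """Return the sorted names of AWs whose state is not done_state.

    done_state is an assumption; adjust it if the cluster reports a
    different terminal value.
    """
    return sorted(
        item["metadata"]["name"]
        for item in aw_list.get("items", [])
        if item.get("status", {}).get("state") != done_state
    )


# Small made-up sample in the assumed list shape.
sample = {
    "items": [
        {"metadata": {"name": "aw-0001"}, "status": {"state": "Completed"}},
        {"metadata": {"name": "aw-0002"}, "status": {"state": "Running"}},
        {"metadata": {"name": "aw-0003"}},  # no status reported yet
    ]
}
```

The input could come from dumping the AW list to JSON and loading it with `json.load`. Running the same check a few minutes apart would show whether the stuck AWs are slow or never transition at all.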
What Have You Already Tried to Debug the Issue?
I have run scale tests to reproduce the issue.
Expected Behavior
All AWs should be completed.
Screenshots, Console Output, Logs, etc.
NA
Affected Releases
Current 1.35.0 release and main branch
Additional Context
NA