Fix creation of AWs/Jobs with same name in different namespaces #652
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has not yet been approved by any approvers; the full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing the approval command in a comment.
if len(newName + namespace) > 63 {
	newName = newName[:len(newName) - (len(newName) + len(namespace) - 63)]
At a quick glance, I'm not 100% sure about the correctness of these lines.
But actually, even the original code:
if len(newName) > 63 {
	newName = newName[:63]
makes me doubtful. Do we really want to truncate the name at 63 chars?
Wouldn't it lead to unexpected clashes when using similar-but-ending-differently long names?
Raising an error could be saner (thus making the AppWrapper impossible to process); it's a common K8s thing to abort on long names.
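For illustration only, a minimal sketch of this fail-fast alternative — the helper name is hypothetical and not part of MCAD; the validation helper is the standard apimachinery one:

```go
package naming

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/validation"
)

// validateGeneratedName rejects names that would exceed the 63-character
// DNS label limit (or use invalid characters) instead of silently
// truncating them, so an AppWrapper with an over-long generated name fails
// with a clear error rather than clashing with a similarly named one.
func validateGeneratedName(newName string) error {
	if errs := validation.IsDNS1123Label(newName); len(errs) > 0 {
		return fmt.Errorf("generated name %q is not a valid DNS-1123 label: %v", newName, errs)
	}
	return nil
}
```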
I have tried in the past to concatenate the namespace and name of a job and observed that this would indeed run long for many of our jobs. The use of the hyphen is also problematic because it is ambiguous. We should have separate labels for the name and namespace.
Thank you @kpouget, you raised a very good point. From what I could read, this is done to meet the DNS standards RFC 1035/1123: since Kubernetes namespaces and names are often used to form a DNS subdomain, K8s enforces the limit here.
After looking at this code again, I can confirm that we can keep the original code here, as the namespace is used in the creation of the object anyway.
Now, to deal with very long names, we could raise an error and stop the operation as suggested. But what if we handled it another way? Here are some ideas:
- Hashing the name for uniqueness?
- Or truncating to 60 characters and appending, e.g., 3 random characters?
What do you think?
Thanks @tardieu, having separate labels for the name and namespace does look like the best option instead of using a hyphen.
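For illustration, a minimal sketch of what separate labels could look like; the label keys and helper names below are assumptions, not necessarily what this PR ends up using:

```go
package labeling

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

const (
	appWrapperNameLabel      = "workload.codeflare.dev/appwrapper-name"      // assumed key
	appWrapperNamespaceLabel = "workload.codeflare.dev/appwrapper-namespace" // assumed key
)

// labelWrappedResource records the owning AppWrapper's name and namespace
// as two independent labels instead of joining them with a hyphen.
func labelWrappedResource(obj *unstructured.Unstructured, awName, awNamespace string) {
	labels := obj.GetLabels()
	if labels == nil {
		labels = map[string]string{}
	}
	labels[appWrapperNameLabel] = awName
	labels[appWrapperNamespaceLabel] = awNamespace
	obj.SetLabels(labels)
}

// selectorFor matches on name and namespace independently, so two
// AppWrappers with the same name in different namespaces never collide.
func selectorFor(awName, awNamespace string) string {
	return fmt.Sprintf("%s=%s,%s=%s", appWrapperNameLabel, awName, appWrapperNamespaceLabel, awNamespace)
}
```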
> Now, to deal with very long names, we could raise an error and stop the operation as suggested. But what if we handled it another way?

Do you know how this DNS name is used? Is it something that the AppWrapper owner will ever want to use?
If it's purely internal, then using a hash, or truncating the name and appending random chars, seems like a good idea.
@@ -163,7 +163,7 @@ func (gr *GenericResources) Cleanup(aw *arbv1.AppWrapper, awr *arbv1.AppWrapperG
 	}

 	// Get the resource to see if it exists
-	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobName, aw.Name, resourceName, unstruct.GetName())
+	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace() + "-" + unstruct.GetName())
-	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace() + "-" + unstruct.GetName())
+	labelSelector := fmt.Sprintf("%s=%s, %s=%s-%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace(), unstruct.GetName())
I don't know why I didn't think of that, thanks @astefanutti :)
@@ -163,7 +163,7 @@ func (gr *GenericResources) Cleanup(aw *arbv1.AppWrapper, awr *arbv1.AppWrapperG
 	}

 	// Get the resource to see if it exists
-	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobName, aw.Name, resourceName, unstruct.GetName())
+	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace() + "-" + unstruct.GetName())
Can that label be removed altogether? It would avoid those cluster-wide list queries and be fully deterministic. Because even with that fix, you could imagine cases where labels clash.
For normal cleanup of, e.g., an appwrapper containing a PyTorchJob, we should only delete the PyTorchJob and let the training operator do the rest. But we are also exploring adding "forceful" cleanup (kill -9) as a last resort, where we delete (forcefully) any resource that can be traced back to an appwrapper. For the latter at least, it makes sense to select on labels.
That would assume the underlying operators propagate the label, right? It seems owner references and cascading deletion would be a more robust and generally supported mechanism. Granted it would restrict the generic items to be within the same namespace as that of the AppWrapper (cross namespace ownership is not allowed), but even that may possibly be seen as a good restriction.
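For illustration, a minimal sketch of the owner-reference approach; the API group/version string and helper name are assumptions:

```go
package ownership

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/types"
)

// setAppWrapperOwner marks obj as owned by its AppWrapper, so that deleting
// the AppWrapper cascades to obj via Kubernetes garbage collection. This
// only works when obj is in the same namespace as the AppWrapper, since
// cross-namespace owner references are not allowed.
func setAppWrapperOwner(obj *unstructured.Unstructured, awName string, awUID types.UID) {
	controller := true
	ref := metav1.OwnerReference{
		APIVersion: "workload.codeflare.dev/v1beta1", // assumed group/version
		Kind:       "AppWrapper",
		Name:       awName,
		UID:        awUID,
		Controller: &controller,
	}
	obj.SetOwnerReferences(append(obj.GetOwnerReferences(), ref))
}
```

With an owner reference in place, cascading deletion removes the owned object without any label-based lookup.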
This indeed assumes that wrapped resources in the appwrapper are properly labelled and that labels are propagated by operators for resources indirectly created. Again, this is only a last resort, but an important last resort. This definitely does not preclude trying to do the right thing first. Ownership references are not an option in general due to the cross namespace restriction. Transitive ownership and cascading deletion is good practice but there is no guarantee operators follow these practices.
For more context, we do experience node-failure scenarios where the kube scheduler, and hence typically the operators, fail to terminate pods. The pods remain in a terminating state until forcefully deleted (irrespective of grace periods), hence the need to orchestrate forceful deletion in MCAD.
To be clear, even for forceful deletion, I am not advocating for relying solely on labels, but labels are one of the tools we want to use to scope the forceful deletion.
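As a rough sketch of what label-scoped forceful deletion might look like — the label key is a hypothetical placeholder and the function is illustrative, not MCAD's actual cleanup path:

```go
package cleanup

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// forceDeletePods deletes, with a zero grace period, every pod in the given
// namespace carrying the (assumed) AppWrapper label, so pods stuck in
// Terminating after a node failure are removed immediately.
func forceDeletePods(ctx context.Context, client kubernetes.Interface, namespace, awName string) error {
	grace := int64(0)
	return client.CoreV1().Pods(namespace).DeleteCollection(ctx,
		metav1.DeleteOptions{GracePeriodSeconds: &grace},
		metav1.ListOptions{LabelSelector: "workload.codeflare.dev/appwrapper-name=" + awName}, // assumed label key
	)
}
```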
Do we have any plans to change the label name?
Yes, I agree it'll have to be changed.
@astefanutti Yes, sounds good to me. It's a good idea to include the change here.
The label is used to select resources directly created by MCAD. The same label is used to select pods directly or indirectly caused by an appwrapper. The change needs to be consistent across both. The introduction of a separate label for the namespace should also be done consistently across both. Otherwise, we will end up confusing pods resulting from appwrappers with the same name in different namespaces. This does mean that all tests and all appwrapper yamls in circulation have to be adjusted. I would rather sync this change with the apigroup change than have users go back and forth between 3 CRD revisions (old group + old label, new group + old label, new group + new label).
Force-pushed from 8d5acc1 to 719dea1.
If I read this correctly, the latest push adds a label with the namespace of the wrapped resource itself rather than the appwrapper namespace. I am not sure I understand the logic.
We should probably avoid unqualified label names.
I think I understand your point. Would it make more sense to include labels for both the wrapped resources and the appwrapper? Or just use the appwrapper name and namespace? I think the issue with this is that, for example, if two jobs are created with the same name in different namespaces, the 2nd job's pods are not created, so it's not originally the appwrapper where the issue lies. Let me know if I misunderstand, thank you Olivier.
Force-pushed from 441402a to cedd7ad.
Force-pushed from 8a92543 to f3f7858.
Force-pushed from 64d82ba to a24f2f9.
Force-pushed from a24f2f9 to c7fa546.
Force-pushed from 0693e5d to 85d8617.
@astefanutti Hi Antonin, I'm wondering if you think updating the …
I attempted to not use the resourceName label at all, but that caused the e2e tests to fail. I believe it has to do with certain scenarios where the resourceName differs from the AppWrapper name, such as this example: multi-cluster-app-dispatcher/test/e2e-kuttl-deployment-01/steps/03-assert.yaml, lines 27 to 30 in 41833f2.
Hello @ChristianZaccaria, a lot of these changes seem to be unrelated, or only indirectly related, to the original issue: shouldn't they be merged in another PR? 🤔
Hi @kpouget, it was suggested to include these changes in the scope of this PR: #652 (comment)
Force-pushed from 993f665 to 10b2c49.
Force-pushed from 10b2c49 to 008bdd9.
-		newName = newName[:63]
+	if len(newName) > 60 {
+		newName = newName[:60]
+		newName += GetRandomString(3)
+	}

 	err = deleteObject(namespaced, namespace, newName, rsrc, dclient)
I think this will almost always fail for lengths > 60, because the string you create is not deterministic. You should use a hash to derive the last 3 characters.
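A minimal sketch of this deterministic-suffix idea; the helper name and the 58+4 split are illustrative assumptions, not MCAD's code:

```go
package naming

import (
	"crypto/sha256"
	"fmt"
)

// shortenName keeps names within the 63-character limit by truncating to 58
// characters and appending a dash plus 4 hex characters derived from a hash
// of the full name. Unlike a random suffix, the result is the same on every
// call, so later lookups and deletes can reconstruct the shortened name.
func shortenName(name string) string {
	if len(name) <= 63 {
		return name
	}
	sum := sha256.Sum256([]byte(name))
	return fmt.Sprintf("%s-%x", name[:58], sum[:2])
}
```

Because the suffix is derived from the full name, the shortened name can be recomputed at deletion time, which a random suffix cannot guarantee.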
-		newName = newName[:63]
+	if len(newName) > 60 {
+		newName = newName[:60]
+		newName += GetRandomString(3)
Here too
Issue link
Resolves #433
And resolves #383
And closes #671
In the genericresource.go file there is a SyncQueueJob function which is used to create resources inside etcd. In this function, the identification of AppWrappers/Jobs was primarily based on a label (resourceName) whose value was the AW name, without taking the namespace into account. As a result, when 2 AWs/Jobs with the same name were created in different namespaces, the 2nd AW/Job was identified as a duplicate.
What changes have been made
- Changed the label names from appwrapper.mcad.ibm.com to workload.codeflare.dev/appwrapper, and from resourceName to workload.codeflare.dev/resourceName.
- Added GetRandomString() and truncated the names of objects in etcd at 60 characters instead of 63, allowing 3 random alphanumeric (lowercased) characters to be appended to avoid unexpected clashes when using similar-but-ending-differently long names.
- Changed ibm.com to quota.codeflare.dev.
Verification steps
Checks