Add the missing view operations from sequence parallel(async). #6750
base: master
@loadams The CI encountered a "no space left" issue, which doesn't seem to be caused by this patch. Could you please retrigger it? Thanks!
Hi @inkcherry - yes, sorry, that does seem to be an intermittent failure. I'll re-trigger the checks, but it looks like there are merge conflicts now. Could you take a look?
@loadams Thank you for the reminder. I have resolved the merge conflicts and verified the result (both ds+megads) : )
FYI @loadams
Compared to the original version, a view operation went missing in some later updates:
DeepSpeed/deepspeed/sequence/layer.py
Line 56 in 17ed7c7
Add the missing view operation.
The shape required for the view cannot be easily obtained in the current function, so the layout params code is refactored.
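
To illustrate why the view needs its target shape computed ahead of time, here is a minimal sketch of a shape helper for a sequence-parallel all-to-all. It is not the code in `deepspeed/sequence/layer.py`: the function name `all_to_all_output_shape`, its parameters, and the example tensor layout are hypothetical and only show the general idea that the scattered dimension shrinks by the world size while the gathered dimension grows by it.

```python
from math import prod


def all_to_all_output_shape(shape, scatter_idx, gather_idx, world_size):
    """Hypothetical helper: shape a tensor should be viewed as after an all-to-all.

    The dimension at scatter_idx is split across ranks, and the dimension at
    gather_idx is concatenated across ranks.
    """
    if scatter_idx == gather_idx:
        raise ValueError("scatter_idx and gather_idx must differ")
    if shape[scatter_idx] % world_size != 0:
        raise ValueError("scatter dimension must be divisible by world_size")
    out = list(shape)
    out[scatter_idx] //= world_size  # each rank keeps 1/world_size of this dim
    out[gather_idx] *= world_size    # each rank receives this dim from all ranks
    return tuple(out)


# Assumed layout for illustration: (local_seq, batch, heads, head_dim).
local_shape = (4, 2, 8, 16)
out_shape = all_to_all_output_shape(local_shape, scatter_idx=2, gather_idx=0, world_size=4)

# The element count is unchanged, so the result can be used as a view target.
assert prod(out_shape) == prod(local_shape)
```

Computing this shape once, before the communication is launched, means the later view does not have to recover it from inside a function that no longer sees the original layout.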