Fix vits low-precision dtype #35418

Open: jiqing-feng wants to merge 1 commit into main

jiqing-feng (Contributor) commented Dec 26, 2024

This PR fixes the dtype handling of the VITS model when it is loaded with torch_dtype=torch.float16.

To reproduce the error:

import torch
from transformers import pipeline

pipe = pipeline("text-to-speech", model="facebook/mms-tts-eng", torch_dtype=torch.float16)
output = pipe("Hello, my dog is cooler than you!")
print(output)

Traceback:

......
  File "/home/jiqing/transformers/src/transformers/models/vits/modeling_vits.py", line 916, in forward
    query_states = self.q_proj(hidden_states) * self.scaling
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py", line 125, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half

Signed-off-by: jiqing-feng <[email protected]>