model.config.to_diff_dict() delivers a different result than model.save_pretrained()
#35426
System Info
transformers version: 4.48.0.dev0

Who can help?
@ArthurZuc
Reproduction
I have a use case that requires that model weights always be encrypted in local storage and only ever decrypted in memory. As a result, using model.from_pretrained(dir) is not an option. Instead, my workaround has been the following:
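Roughly, the workaround looks like this (decrypt_to_bytes() and the paths below stand in for my own decryption helper and storage layout):

```python
import io

import torch
from transformers import AutoConfig, AutoModelForCausalLM


def decrypt_to_bytes(path: str) -> bytes:
    # Stand-in for my real decryption helper: the actual version decrypts
    # the file and returns plaintext bytes without ever writing them to disk.
    with open(path, "rb") as f:
        return f.read()


# Build the model skeleton from the separately stored config, then load
# the decrypted weights entirely in memory.
config = AutoConfig.from_pretrained("path/to/config_dir")  # placeholder path
model = AutoModelForCausalLM.from_config(config)

state_dict = torch.load(io.BytesIO(decrypt_to_bytes("weights.bin.enc")), map_location="cpu")
model.load_state_dict(state_dict)
```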
The problem I've noticed, however, is that when I serialize my config like so:
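Roughly like this (the output path is a placeholder):

```python
import json

# model.config here belongs to the in-memory model built above.
config_dict = model.config.to_diff_dict()
with open("config.json", "w") as f:
    json.dump(config_dict, f, indent=2)
```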
The resulting config includes the key _attn_implementation_autoset set to True, whereas the actual config of the model does not include that key. As a result, when I load the config with AutoConfig.from_pretrained(), it ends up not using the default attention implementation for my model (SDPA), effectively delivering a different model with different logits.

My current hotfix is to just delete the key _attn_implementation_autoset from all of my configs, as sketched below. But is it really necessary to add that key to to_diff_dict() when it is not added when you do save_pretrained()?
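For reference, the hotfix is just something like:

```python
config_dict = model.config.to_diff_dict()
# Drop the offending key so that AutoConfig.from_pretrained() later picks
# the default attention implementation (SDPA) again.
config_dict.pop("_attn_implementation_autoset", None)
```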
Expected behavior
I get the same model, in a reproducible way, whether I save the config with to_diff_dict() or with save_pretrained().
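A quick way to see the mismatch (a sketch; assumes model is already instantiated and uses a temporary directory purely for the comparison):

```python
import json
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    with open(f"{tmp}/config.json") as f:
        saved = json.load(f)

diffed = model.config.to_diff_dict()
# Keys that to_diff_dict() emits but save_pretrained() does not.
print(set(diffed) - set(saved))  # e.g. {'_attn_implementation_autoset'}
```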