Custom Model support #113

Open
huangtingwei9988 opened this issue Nov 7, 2024 · 3 comments
Labels
new models (require new models), question (Further information is requested)

Comments

@huangtingwei9988

Hi @YangWang92, for a custom model, how do I generate the Hessian and inverse Hessian (inv_hessian) weights?

@YangWang92
Contributor

YangWang92 commented Nov 7, 2024

Hi @huangtingwei9988, thank you for your interest in our method.

The Hessian matrix can be collected with quip-sharp (here is my fork, which includes some minor fixes): https://github.com/YangWang92/quip-sharp/blob/wy/hessian/note.md. See also #79.

The inverse Hessian can be obtained from https://gist.github.com/YangWang92/ec98a86c3a33c573b601cf4348d0a0e7.

The current tutorial is fairly minimal, but I will release a complete tutorial on obtaining the Hessian soon. Please stay tuned.
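
For readers who want a picture of what is being collected, here is a minimal sketch. It assumes the "Hessian" is the layer-wise proxy Hessian H = E[x xᵀ] over calibration inputs to a linear layer (the usual meaning in GPTQ/QuIP-style methods), and that the inverse is taken after adding a small diagonal damping term. It is not the code from the linked fork or gist; `accumulate_hessian`, `damped_inverse`, and the `percdamp` value are hypothetical names and settings chosen for illustration, and random data stands in for real activations.

```python
import numpy as np


def accumulate_hessian(batches):
    """Average x^T x over all calibration rows (assumed proxy Hessian)."""
    H = None
    n_rows = 0
    for x in batches:
        x = np.asarray(x, dtype=np.float64)  # shape: (rows, in_features)
        if H is None:
            H = np.zeros((x.shape[1], x.shape[1]))
        H += x.T @ x
        n_rows += x.shape[0]
    if H is None:
        raise ValueError("no calibration batches were provided")
    return H / n_rows


def damped_inverse(H, percdamp=0.01):
    """Invert H after adding percdamp * mean(diag(H)) to the diagonal.

    The damping fraction is an assumption; it keeps the matrix well
    conditioned when some input features are nearly constant.
    """
    damp = percdamp * np.mean(np.diag(H))
    return np.linalg.inv(H + damp * np.eye(H.shape[0]))


# Stand-in for activations captured at the input of one linear layer.
rng = np.random.default_rng(0)
batches = [rng.standard_normal((32, 8)) for _ in range(4)]

H = accumulate_hessian(batches)
H_inv = damped_inverse(H)
```

In a real run the batches would come from a forward pass over calibration data, with one Hessian per quantized layer; the linked note and gist remain the reference for the actual file formats and scripts.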

@YangWang92 YangWang92 added question Further information is requested new models require new models labels Nov 7, 2024
@huangtingwei9988
Author

Thank you very much. I would also like to know whether the Qwen2-MoE model will be supported in the future.

@YangWang92
Contributor

I previously gave Qwen2.5 A52 a quick try and ran into some minor issues, which I haven't had time to fix yet. Please give me 1-2 weeks, and I will address the issues with MoE models.
