
Does ONNX support this model for inference with ONNX Runtime? #3

Open
dragen1860 opened this issue Jun 7, 2024 · 2 comments

@dragen1860

Hi, dear author:
The memory reduction is very attractive and will benefit practical applications. I wonder whether ONNX currently supports the techniques you proposed, and whether inference can be run with the ONNX Runtime framework?

@A-suozhang
Member

Thank you for your interest in our work! We haven't tried ONNX Runtime yet, but we believe it is applicable: MixDQ adopts a standard, deployment-friendly quantization scheme, and we have already tested MixDQ with the pytorch_quantization deployment tool.
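
For reference, a minimal sketch of what the ONNX Runtime path could look like (this is not the MixDQ pipeline itself; the tiny `torch.nn.Linear` module and the `"input"`/`"output"` tensor names are placeholders for illustration):

```python
# Sketch: export a PyTorch module to ONNX and run it with ONNX Runtime.
# The model and tensor names below are placeholders, not MixDQ components.
import numpy as np
import torch
import onnxruntime as ort

model = torch.nn.Linear(16, 4).eval()   # placeholder for the (quantized) network
dummy = torch.randn(1, 16)               # placeholder input

# Export to ONNX
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=17,
)

# Run inference with ONNX Runtime
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
out = sess.run(None, {"input": dummy.numpy().astype(np.float32)})[0]
print(out.shape)
```

How well the quantized operators map onto ONNX opsets would still need to be verified case by case.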

@A-suozhang
Member

If you are interested in deploying MixDQ with ONNX Runtime or other tools, we are also open to discussion and support. PRs are welcome!
