
is MixDQ a PTQ or QAT method? #11

Open
LiMa-cas opened this issue Jul 20, 2024 · 6 comments

Comments

@LiMa-cas

In base_quantizer.py there is this docstring: "PyTorch Function that can be used for asymmetric quantization (also called uniform affine quantization). Quantizes its argument in the forward pass, passes the gradient 'straight through' on the backward pass, ignoring the quantization that occurred. Based on https://arxiv.org/abs/1806.08342."
So is MixDQ a PTQ or QAT method? Is a backward pass needed during quantization?

@A-suozhang
Member

Thank you for your interest in our work. MixDQ is a PTQ method that does not require tuning; the code in base_quantizer.py is simply for compatibility.
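
For reference, the docstring quoted above describes a standard straight-through-estimator (STE) fake-quantization function. A minimal sketch of that pattern (an illustration of the idea, not MixDQ's actual base_quantizer.py) might look like:

```python
import torch

class FakeQuantizeSTE(torch.autograd.Function):
    """Asymmetric (uniform affine) fake quantization with a
    straight-through estimator, in the spirit of
    https://arxiv.org/abs/1806.08342. A sketch, not MixDQ's code."""

    @staticmethod
    def forward(ctx, x, scale, zero_point, n_bits):
        qmin, qmax = 0, 2 ** n_bits - 1
        # Quantize: scale, shift by the zero point, round, clamp to range.
        q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
        # Dequantize back to floating point ("fake" quantization).
        return (q - zero_point) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: pass the gradient unchanged, ignoring the
        # non-differentiable round/clamp that happened in the forward pass.
        return grad_output, None, None, None

# Example: y = FakeQuantizeSTE.apply(x, 0.1, 128, 8)
```

The backward pass only matters if the quantizer sits inside a training loop (QAT); in pure PTQ calibration, only the forward quantize-dequantize path is exercised, which is consistent with the answer above.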

@LiMa-cas
Author

LiMa-cas commented Jul 20, 2024 via email

@LiMa-cas
Author

path: "/share/public/diffusion_quant/calib_dataset/bs32_t30_sdxl.pt" HI, where can I download this file?i need all the file download

@A-suozhang
Member

You can generate this file yourself by following the instructions in README.md, step 1.1 "Generate Calibration Data":

CUDA_VISIBLE_DEVICES=$1 python scripts/gen_calib_data.py --config ./configs/stable-diffusion/$config_name --save_image_path ./debug_imgs
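
Once generated, the .pt file is a standard PyTorch-serialized object and can be inspected with torch.load. Its internal structure depends on gen_calib_data.py; the dict handling below is an assumption, check the script for the actual layout:

```python
import torch

# Inspect the generated calibration data (structure is an assumption;
# see scripts/gen_calib_data.py for the authoritative format).
calib = torch.load("/share/public/diffusion_quant/calib_dataset/bs32_t30_sdxl.pt")
print(type(calib))
if isinstance(calib, dict):
    for key, value in calib.items():
        print(key, getattr(value, "shape", value))
```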

@LiMa-cas
Author

Thanks a lot. Another question: at inference time, is it much slower, since an if/else is needed to decide which precision to dequantize at?

@A-suozhang
Member

I'm not quite sure I fully understand your question, but yes: the code in this repository is "algorithm-level" quantization simulation code, and it runs slower than FP16. For actual speedup, a customized CUDA kernel that utilizes INT computation is needed (see our Hugging Face demo: https://huggingface.co/nics-efc/MixDQ).
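
To illustrate the distinction (a minimal sketch, not the repository's code): simulated quantization inserts round/clamp/rescale operations around an ordinary floating-point matmul, so it adds overhead on top of the FP compute rather than removing it; only a true integer kernel yields a speedup.

```python
import torch

def simulated_int8_linear(x, w, s_x, s_w):
    """Fake-quantized linear layer: quantize-dequantize both operands,
    then run the matmul in floating point. This simulates INT8 accuracy
    but is strictly slower than a plain FP16 matmul, because the
    round/clamp/rescale ops are added on top of the same FP compute."""
    x_q = torch.clamp(torch.round(x / s_x), -128, 127) * s_x  # fake-quant activations
    w_q = torch.clamp(torch.round(w / s_w), -128, 127) * s_w  # fake-quant weights
    return x_q @ w_q.t()  # the compute still happens in FP16/FP32
```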
