🤝 Open to Collab

Shihan Qu

zenmagnets

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

brandonmusic/GLM-5.2-NVFP4-REAP-Recall-N172

liked a model 5 days ago

madeby561/GLM-5.2-NVFP4-REAP-504B

liked a model 10 days ago

brandonmusic/MiniMax-M3-NVFP4

Organizations

None yet

New activity in Qwen/Qwen3.5-397B-A17B 29 days ago

Qwen3.6 397b

#75 opened 2 months ago by

New activity in mmangkad/Qwen3.6-27B-NVFP4 2 months ago

31gb NVFP4 Model?

#1 opened 2 months ago by

New activity in MiniMaxAI/MiniMax-M2.7 2 months ago

license

#5 opened 2 months ago by

New activity in varjosoft/GLM-5.1-Open-TQ3 2 months ago

Pending GPU & vLLM validation

#1 opened 3 months ago by

New activity in MiniMaxAI/MiniMax-M2.7 2 months ago

No commercial use allowed in License?

#6 opened 2 months ago by

New activity in lukealonso/Qwen3.5-397B-A17B-NVFP4 4 months ago

How to run on vLLM for 4xSM120

#1 opened 4 months ago by

New activity in lukealonso/MiniMax-M2.5-NVFP4 4 months ago

Here's the vLLM recipe I'm using with 2x RTX Pro 6000

#1 opened 4 months ago by

New activity in Ex0bit/Kimi-K2.5-PRISM-REAP-530B-A32B 4 months ago

Anyone get this working on 4x RTX 6000 Pro?

#1 opened 4 months ago by

New activity in GadflyII/MiniMax-M2.1-NVFP4 4 months ago

Throughput NVFP4 on Dual 6000 Blackwells

#2 opened 4 months ago by

New activity in vincentzed-hf/Qwen3.5-397B-A17B-NVFP4 4 months ago

Anyone try this on 4x RTX 6000 Pro yet?

#1 opened 4 months ago by

New activity in mlx-community/Qwen3.5-397B-A17B-nvfp4 4 months ago

I wish it would fit in 2x6000 PRO!

#2 opened 4 months ago by

New activity in lukealonso/MiniMax-M2.5-NVFP4 4 months ago

"w1_weight_scale_2 must match w3_weight_scale_2. Accuracy may be affected."

#2 opened 4 months ago by

New activity in GadflyII/GLM-4.7-Flash-NVFP4 5 months ago

Wasn't able to recreate MMLU-Pro benchmarks

#5 opened 5 months ago by

New activity in zai-org/GLM-4.7-Flash 5 months ago

Enormous KV-cache size?

#3 opened 5 months ago by

New activity in GadflyII/GLM-4.7-Flash-NVFP4 5 months ago

Really appreciate that you ran performance comparison tests with BF16!

#2 opened 5 months ago by

New activity in marksverdhei/GLM-4.7-Flash-FP8 5 months ago

Performance comps with BF16?

#3 opened 5 months ago by

New activity in cyankiwi/GLM-4.7-Flash-AWQ-4bit 5 months ago

Any plans for a 6bit or 8bit version?

#3 opened 5 months ago by

New activity in marksverdhei/GLM-4.7-Flash-FP8 5 months ago

If 8bit, why shaped like 16 bit

#2 opened 5 months ago by

New activity in nvidia/Qwen3-30B-A3B-NVFP4 7 months ago

6 months since intro of NVFP4, and it's basically still a myth

#4 opened 7 months ago by

New activity in RESMP-DEV/Qwen3-Next-80B-A3B-Instruct-NVFP4 7 months ago

Works with vllm? Any recommendations or howtos?

#1 opened 8 months ago by