Djuunaa

djuna

AI & ML interests

None yet

Recent Activity

reacted to stas's post with 🔥 about 16 hours ago

After many months of intense work the Snowflake AI Research team is happy to present to you the new open source project: Arctic RL https://snowflake.com/en/blog/engineering/arctic-rl-open-source-backend/

- Arctic RL integrates with VeRL and SkyRL today; enable ZoRRo with one config flag, no code changes required
- ZoRRo delivers up to 6x actor-update acceleration and a 3.5x end-to-end training speedup, reducing Arctic-Text2SQL-R2 training from ~5 days to ~36 hours on 32 H200 GPUs
- Arctic-Text2SQL-R2 achieved higher accuracy scores (48.7) than Gemini 3.1 Pro (47.9) and Claude 4.7 (47.3) on Snowflake's evaluated enterprise SQL benchmark under the tested conditions
- Two open source recipes ship with this release: a text-to-SQL recipe that improved BIRD dev accuracy from 59.92% to 70.35%, and a multi-hop QA recipe that improved average accuracy from 69.6% to 72.3%

new activity 1 day ago

Etherll/Qwen3.6-27B-Layerdose: Whats layerdose?

liked a model 3 days ago

arbazsiddiqui/Ozan-v1-12B

New activity in Etherll/Qwen3.6-27B-Layerdose 1 day ago

Whats layerdose?

#1 opened 1 day ago by

New activity in prithivMLmods/PiD-Image-Upscaler about 1 month ago

Download button for png output

#1 opened about 1 month ago by

New activity in zerofata/Q3.5-BlueStar-v2-27B 2 months ago

q3.6?

#4 opened 2 months ago by

New activity in SL-AI/GRaPE-2-Pro 2 months ago

WOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO

#1 opened 2 months ago by

New activity in Lambent/IsoC-Gemma-3-12B 6 months ago

Weight = 1.0?

#1 opened 6 months ago by

New activity in ChenkinNoob/ChenkinNoob-XL-V0.1 7 months ago

Diffusers library format?

#4 opened 7 months ago by

New activity in MiniMaxAI/MiniMax-M2 8 months ago

No lightning attention?

#8 opened 8 months ago by

New activity in darsadilab/zagros-1.0-quick 9 months ago

Why is the first discussion deleted?

#2 opened 9 months ago by

New activity in huggingface/InferenceSupport 9 months ago

darsadilab/zagros-1.0-quick

#4987 opened 9 months ago by

New activity in inclusionAI/GroveMoE-Inst 9 months ago

llama.cpp support

#1 opened 11 months ago by

New activity in Qwen/Qwen3-235B-A22B-Thinking-2507 9 months ago

How to obtain value for sparse_attention_config

#8 opened 9 months ago by

New activity in Qwen/Qwen3-Next-80B-A3B-Thinking 10 months ago

Support of 1M context doubt

#2 opened 10 months ago by

New activity in ggml-org/gguf-my-lora 10 months ago

Need factory restart?

#5 opened 10 months ago by

New activity in djuna-test-lab/Q3-IIJAN-4B 11 months ago

It should be 4B 😑

#1 opened 11 months ago by

New activity in MetaStoneTec/XBai-o4 11 months ago

Please clarify base model (Qwen-32B) and include appropriate Apache 2.0 license

#3 opened 11 months ago by

New activity in jjzha/Qwen2.5-14B-Instruct-SEFL 11 months ago

Usage example?

#2 opened 11 months ago by

New activity in tngtech/DeepSeek-R1T-Chimera about 1 year ago

Any plans to release an updated version based on DeepSeek-V3-0526 + R1, or how to create the merge myself?

#4 opened about 1 year ago by

New activity in rednote-hilab/dots.llm1.inst about 1 year ago

Game-changer for 4x24GB setups! AWQ request

#1 opened about 1 year ago by

New activity in deepseek-ai/DeepSeek-R1-0528-Qwen3-8B about 1 year ago

DeepSeek-R1-Lite

#6 opened about 1 year ago by

New activity in arcee-ai/mergekit-gui about 1 year ago

The space keeps sleeping due to inactivity

#54 opened about 1 year ago by