Luisa Crawford
Jul 01, 2026 07:35
Together AI presents 9 groundbreaking papers at ICML 2026, covering AI agents, model efficiency, and GPU optimization. Booth B714, July 6-11.
Together AI is making a significant mark on the International Conference on Machine Learning (ICML) 2026, with 9 research papers accepted for presentation. Held from July 6–11 in Seoul, South Korea, ICML is among the top global machine learning conferences, and this year received a record 24,371 submissions, accepting only 26.6%. Together AI’s contributions span the entire AI stack, from agent systems to GPU kernel optimization, reflecting the company’s focus on vertically integrated research and production-scale AI solutions.
Key highlights from Together AI’s research include:
1. Cutting-Edge AI Agents
At the forefront of AI agent research, Together AI introduced three notable developments:
- DSGym: A framework for evaluating and training data science agents, featuring over 1,000 tasks across 10+ domains. Notably, DSGym eliminates common benchmarking loopholes by requiring agents to work directly with datasets rather than recalling pre-learned answers.
- ThunderAgent: A novel inference system delivering up to 3.6x higher throughput for multi-turn agent workflows by treating workflows as first-class objects.
- TTT-Discover: A method for state-of-the-art discoveries in fields like mathematics, biology, and GPU kernel design using open models, achieving exceptional results at a fraction of the cost of proprietary systems.
2. Model Shaping and Reasoning
The company’s contributions to model training emphasize improving reasoning capabilities without scaling compute:
- RARO: A framework enabling AI models to achieve RL-grade reasoning on tasks without clear answer keys, such as poetry or financial analysis.
- V1: A training technique that improves answer accuracy by 10% without requiring additional compute, leveraging pairwise comparison for better output selection.
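To make the idea of pairwise comparison for output selection concrete, here is a minimal Python sketch. It is not V1's actual method, which the summary above does not describe in detail. The function `select_by_pairwise_wins` and the toy judge are hypothetical names introduced purely for illustration: each candidate answer is compared head-to-head against every other, and the one with the most wins is kept.

```python
from itertools import combinations
from typing import Callable, Sequence


def select_by_pairwise_wins(
    candidates: Sequence[str],
    prefer: Callable[[str, str], bool],
) -> str:
    """Return the candidate that wins the most head-to-head comparisons.

    `prefer(a, b)` is a hypothetical judge: True means `a` beats `b`.
    In practice the judge could be a model; here it is any callable.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    wins = [0] * len(candidates)
    # Compare every unordered pair exactly once.
    for i, j in combinations(range(len(candidates)), 2):
        if prefer(candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    # max() keeps the first index with the top score, so ties go to the earlier candidate.
    best = max(range(len(candidates)), key=wins.__getitem__)
    return candidates[best]


# Toy judge for demonstration only: prefers the numeric answer closer to 42.
def toy_judge(a: str, b: str) -> bool:
    return abs(int(a) - 42) < abs(int(b) - 42)


answers = ["40", "42", "45", "39"]
best_answer = select_by_pairwise_wins(answers, toy_judge)
```

A round-robin like this needs a number of comparisons that grows quadratically with the number of candidates, so it is only practical for small candidate sets.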
3. Efficiency and Optimization
Together AI is also tackling the challenge of optimizing inference and hardware utilization:
- Aurora: An adaptive speculative decoding system that improves model inference speeds by 1.25x as traffic patterns evolve, demonstrating its utility in real-time production environments.
- Untied Ulysses: A memory-efficient method enabling 5 million-token context training on a single GPU node, reducing memory usage by up to 87.5%.
- Opportunistic Expert Activation (OEA): A routing strategy that cuts Mixture-of-Experts (MoE) decode latency by up to 39%, reclaiming efficiency lost during batch inference.
4. Low-Level Hardware Innovations
At the GPU kernel level, Together AI’s ParallelKernelBench benchmark evaluates multi-GPU kernel generation, highlighting the challenges of scaling AI models across distributed systems. The benchmark offers 87 real workloads, emphasizing the need for efficient multi-GPU communication and computation.
Why This Matters
Together AI’s presence at ICML 2026 underscores its ambition to lead in AI research and production-scale efficiency. By addressing challenges across the entire stack, from high-level agent design to hardware-level optimizations, the company is positioning itself as a key player in scalable AI solutions. For context, ICML 2026 has already garnered attention for its use of agentic AI peer reviewers, reflecting the growing integration of automation in the research process.
Attendees can learn more about Together AI’s work at booth B714 throughout the conference. The team is also actively recruiting researchers and engineers to further its mission of developing AI-native cloud infrastructures and next-generation machine learning systems.
For more details on the research and to explore career opportunities, visit Together AI.

