AI Research Radar

Curated research papers from top AI labs and conferences with accessible summaries.

Papers Tracked: 10
Total Citations: 0
New This Week: 10

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software

Are AI agents tools, co-authors, or researchers? We present a quantified case study (N=1): a physicist supervising an AI coding agent (Claude Code, Sonnet and Opus models) over 12 work days and 57 sessions to build CLAX-PT, a differentiable one-loop perturbation theory module in JAX. We documented and classified 15 supervision events by intervention level. The agent resolved ten autonomously by

Nhat-Minh Nguyen
May 28, 2026
arXiv:2605.30353v1

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Long-rollout causal video diffusion has converged on a fixed-size sliding-window KV cache, with recent progress innovating within this layout by changing which tokens occupy the window or how their positions are encoded. The per-head KV layout itself, a dominant contributor to streaming memory and latency, has been mostly left unchanged. In this paper, we present the first study of Multi-Head Late

Hidir Yesiltepe, Jiazhen Hu et al.
May 28, 2026
arXiv:2605.30351v1

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment, leaving motion understanding to downstream policies. We introduce DynaFLIP, a dynamics-aware multimodal pre-training framework that pushes motion understanding ups

Jusuk Lee, Seungjae Lee et al.
May 28, 2026
arXiv:2605.30350v1

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

The pretraining data mixture of Large Language Models (LLMs) constitutes their "digital DNA", shaping model behaviors, capabilities, and failure modes. Yet this composition is rarely disclosed, making post-hoc auditing of data combination or provenance difficult. In this work, we formalize Data Mixture Surgery (DMS): given only generated text from a target LLM, estimate the domain-level

Yaxin Luo, Jiacheng Cui et al.
May 28, 2026
arXiv:2605.30348v1

AdaState: Self-Evolving Anchors for Streaming Video Generation

Autoregressive video diffusion models generate streaming video by producing frames sequentially, conditioning each chunk on previously generated content. These models are structurally anchored to the first frame: its key-value representation occupies a privileged position in the attention cache and serves as the primary scene reference throughout generation. As the cleanest and most error-free pos

Yusuf Dalva, Pinar Yanardag
May 28, 2026
arXiv:2605.30349v1

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

As video diffusion models (VDMs) advance toward world models, a key question arises: do they truly understand causality, or merely overfit to statistical temporal patterns? Existing benchmarks mostly rely on synthetic data, limiting real-world generalization due to the sim-to-real gap. We present YoCausal, a two-level benchmark inspired by the Violation of Expectation (VoE) paradigm from cognitive

You-Zhe Xie, Yu-Hsuan Li et al.
May 28, 2026
arXiv:2605.30346v1

SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations

Printed circuit board (PCB) schematic design defines nearly all electronic hardware, but it remains manual and expertise-intensive. While generative AI has advanced digital and analog IC design, PCB schematic generation from natural-language intent is largely unexplored. This paper presents SchGen, the first large language model that generates editable PCB schematics from natural-language requests

Qinpei Luo, Ruichun Ma et al.
May 28, 2026
arXiv:2605.30345v1

REST3D: Reconstructing Physically Stable 3D Scenes from a Single Image

Reconstructing physically stable 3D scenes from a single RGB image enables casual images to be converted into simulation-ready digital assets for applications such as immersive interaction and content creation. However, existing single-image reconstruction methods fall short in capturing the physical structure of a scene. As a result, they often produce geometrically plausible but physically incon

Xiaoxuan Ma, Jiashun Wang et al.
May 28, 2026
arXiv:2605.30338v1

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

Multi-component LLM agents assemble probabilistic claims from components that each see only part of a joint problem; the composition can violate basic probability axioms even when every component is locally coherent. We formalise this locally coherent, globally incoherent failure via the compositional residual eps*, the L2 distance from the composed quote to the joint coherent polytope, computable

Anany Kotawala
May 28, 2026
arXiv:2605.30335v1

SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?

Autonomous AI research agents aim to accelerate scientific discovery by automating the research pipeline, from hypothesis generation to peer review. However, existing benchmarks rarely test a fundamental bottleneck: whether Large Language Models can judge the methodological viability of a research idea before expending time and computational resources. We introduce SoundnessBench, a curated benchm

Sy-Tuyen Ho, Minghui Liu et al.
May 28, 2026
arXiv:2605.30329v1