Lan

50 indexed papers

Recent (6 mo)

With code

Influential cites

Benchmarked

Publications per year

Top categories

AI ×34 · NLP ×19 · ML ×14 · Vision ×10 · Crypto ×6 · Robotics ×4 · Society ×4 · HCI ×4

Frequent co-authors

Ruslan Salakhutdinov (2×)

Yufang Hou (2×)

Research Timeline

2026

Investigating and Alleviating Harm Amplification in LLM Interactions

This paper introduces HarmAmp, a new benchmark for multi-turn harm amplification, and proposes TrajSafe, a proactive monitoring system that significantly reduces harmfulness in LLM interactions while maintaining usability.

Entropy Minimization without Model Collapse: Mitigating Prediction Bias in Medical Imaging

This paper identifies prediction bias, a failure mode of entropy minimization in test-time adaptation, and proposes Distribution Shift Bias Reduction (DSBR) to stabilize adaptation and prevent model collapse.

HumanNOVA: Photorealistic, Universal and Rapid 3D Human Avatar Modeling from a Single Image

HumanNOVA introduces a photorealistic, universal, and rapid model capable of generating high-quality 3D human avatars from a single input RGB image.

Not All Points Are Equal: Uncertainty-Aware 4D LiDAR Scene Synthesis

The paper introduces U4D, an uncertainty-aware framework that synthesizes 4D LiDAR scenes by reconstructing geometrically difficult and uncertain regions first, leading to state-of-the-art fidelity and temporal consistency.

InsightVQA: High-Dimensional Emotion-Cognitive Visual Question Answering Benchmark

The paper introduces InsightVQA, a large-scale benchmark dataset designed for hierarchical visual question answering that assesses complex emotion understanding and cognitive reasoning beyond simple emotion recognition.

AutoForest: Automatically Generating Forest Plots from Biomedical Studies with End-to-End Evidence Extraction and Synthesis

AutoForest is an end-to-end system that automatically generates publication-ready forest plots directly from biomedical papers, streamlining the labor-intensive process of meta-analysis.

Coordination Graphs for Constrained Multi-Agent Reinforcement Learning

The paper introduces Coordination Graphs for Constrained Multi-Agent Reinforcement Learning (CG-CMARL), a scalable framework that decomposes complex joint action spaces into pairwise regions to handle coordination and constraints efficiently.

Who Annotates in NLP? A Large-scale Assessment of Human Annotation Reporting between 2018 and 2025

This paper conducts a large-scale audit of human annotation reporting in NLP, finding that while reporting has improved, critical details needed to assess annotation validity, such as training and agreement values, are frequently omitted.

Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages

The paper introduces MIDI, a novel multilingual dataset that embeds idioms in realistic sentence and conversational contexts across diverse resource levels, revealing that idiom comprehension is significantly harder in low-resource languages and that literal interpretations pose a greater challenge than figurative ones.

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

The paper introduces TELBench and the DRIFT framework to enable fine-grained, span-level error localization in deep-research agents, significantly improving the ability to pinpoint exactly where an agent's reasoning fails.

The Image Reconstruction Game: Drawing Common Ground Through Iterative Multimodal Dialogue

The paper introduces the Image Reconstruction Game, a benchmark showing that the quality of the descriptive model is the primary determinant of image reconstruction success, while the generator's role is secondary.

Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning

The paper proposes a novel RL framework that naturally induces diverse agent behavior by reformulating the objective to treat the reward as a distribution over functions, making diversity a rational response to reward uncertainty.

Large Language Models Hack Rewards, and Society

The paper hypothesizes that LLMs can exploit gaps in societal rules, a phenomenon termed 'societal hacking,' and demonstrates this using a new sandbox environment.

AI Agents Enable Adaptive Computer Worms

The paper demonstrates a novel, self-sustaining computer worm powered by AI agents that generates tailored attack strategies in real time, representing a significant shift from traditional, vulnerability-exploiting malware.

Selective Token-Level Cryptographic Redaction for Privacy-Preserving Clinical Deployment of Large Language Models

The paper introduces HERALD, a token-level cryptographic redaction framework that encrypts only sensitive tokens in clinical text, enabling privacy-preserving LLM deployment without significant loss of utility.

RREDCoT: Segment-Level Reward Redistribution for Reasoning Models

This paper introduces RREDCoT, a method for approximating optimal reward redistribution in Chain-of-Thought reasoning language models without additional generation.

RiskFlow: Fast and Faithful Safety-Critical Traffic Scenario Generation

RiskFlow is a novel framework that generates realistic and safety-critical multi-agent traffic scenarios by reformulating trajectory generation as a single-pass transport problem in the action space.

ZERO-APT: A Closed-Loop Adversarial Framework for LLM-Driven Automated Penetration Testing under Intelligent Defense

ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack success rate and verifiable decision consistency.

FACTR 2: Learning External Force Sensing for Commodity Robot Arms Improves Policy Learning

This paper presents a data-driven method to estimate external joint torques without dedicated force sensors, enabling force-feedback teleoperation on low-cost arms.

DIRECT: When and Where Should You Allocate Test-Time Compute in Embodied Planners?

This paper introduces DIRECT, a routing framework that allocates test-time compute per prompt to improve the success-cost Pareto frontier for embodied agents.


Papers

cs.RO · cs.AI · cs.LG · Empirical · Recent · Jun 10, 2026

FACTR 2: Learning External Force Sensing for Commodity Robot Arms Improves Policy Learning

Steven Oh, Jason Jingzhou Liu, Tony Tao, Philip Han +4 more

This paper presents a data-driven method to estimate external joint torques without dedicated force sensors, enabling force-feedback teleoperation on low-cost arms.


cs.RO · cs.AI · cs.CV · Empirical