Lin Yan

9 indexed papers

Top categories

NLP ×5, AI ×5, Crypto ×5, ML ×3, Software Eng. ×2, Instrumentation and Methods for Astrophysics ×1, Vision ×1, Stats ML ×1

Research Timeline

2026

TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories

The paper introduces TraceSafe-Bench, a benchmark for LLM guardrails on multi-step tool-calling trajectories, and finds that securing LLM agents requires jointly optimizing for structural reasoning and safety alignment to mitigate risks during multi-step tool use.

SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking

The paper proposes SRTJ, a Self-Evolving Rule-Driven Training-Free Jailbreak framework that systematically discovers and refines attack strategies using rule composition and feedback to achieve robust and generalizable jailbreaking against modern LLMs.

AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use

AgentTrust is a novel runtime safety layer that intercepts and evaluates AI agent tool calls before execution, achieving high accuracy in detecting unsafe actions across complex and obfuscated scenarios.

Inverting the Shield: Systematically Generating Safety Tests from Policy Specifications

The paper introduces POLARIS, a novel framework that systematically generates comprehensive and verifiable safety tests for LLMs by formalizing natural language policies into First-Order Logic and exploring the resulting Semantic Policy Graph.

Turning Bias into Bugs: Bandit-Guided Style Manipulation Attacks on LLM Judges

The paper introduces BITE, a black-box adversarial framework that exploits stylistic biases in LLM judges by adaptively generating semantically equivalent edits to artificially inflate assigned scores.

MOOSE-Copilot: A Web-Based Interactive Assistant for Unified Exploratory and Fine-Grained Scientific Hypothesis Discovery

MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.

Preference-Aware Rubric Learning for Personalized Evaluation

The paper introduces PARL, a framework that learns personalized evaluation rubrics directly from raw user interaction histories to accurately assess how well LLM outputs align with subjective, user-specific preferences.

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

This paper introduces personalized empathy, the ability of an LLM to adapt its empathetic strategies to an individual user's history, and proposes PereGRM, a reward modeling framework that significantly improves that ability.

Identifying Gems from Roman RAPIDly

This paper introduces a machine learning model, RuBR, and a methodology to reliably distinguish genuine astronomical transients from spurious detections for the upcoming Roman Space Telescope's data pipeline.

Papers

cs.LG, astro-ph.IM, cs.CV · Recent · Jun 3, 2026

Identifying Gems from Roman RAPIDly

Karan Gandhi, Ashish A. Mahabal, Jacob E. Jencson, Russ R. Laher +5 more

cs.CL · Recent · May 30, 2026