Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Lin Yan

Lin Yan

9 indexed papers

Recent (6 mo)
9
With code
0
Influential cites
0
Benchmarked
0

Publications per year

9
26

Top categories

NLP×5AI×5Crypto×5ML×3Software Eng.×2Instrumentation and Methods for Astrophysics×1Vision×1Stats ML×1

Frequent co-authors

Xianglin Yang2×
Jin Song Dong2×
Karan Gandhi1×
Ashish A. Mahabal1×
Jacob E. Jencson1×
Russ R. Laher1×

Research Timeline

2026
TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories

The paper introduces TraceSafe-Bench, a comprehensive benchmark, and finds that securing LLM agents requires jointly optimizing for structural reasoning and safety alignment to mitigate risks during multi-step tool-use.

SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking

The paper proposes SRTJ, a Self-Evolving Rule-Driven Training-Free Jailbreak framework that systematically discovers and refines attack strategies using rule composition and feedback to achieve robust and generalizable jailbreaking against modern LLMs.

AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use

AgentTrust is a novel runtime safety layer that intercepts and evaluates AI agent tool calls before execution, achieving high accuracy in detecting unsafe actions across complex and obfuscated scenarios.

Inverting the Shield: Systematically Generating Safety Tests from Policy Specifications

The paper introduces POLARIS, a novel framework that systematically generates comprehensive and verifiable safety tests for LLMs by formalizing natural language policies into First-Order Logic and exploring the resulting Semantic Policy Graph.

Turning Bias into Bugs: Bandit-Guided Style Manipulation Attacks on LLM Judges

The paper introduces BITE, a black-box adversarial framework that exploits stylistic biases in LLM judges by adaptively generating semantically equivalent edits to artificially inflate assigned scores.

MOOSE-Copilot: A Web-Based Interactive Assistant for Unified Exploratory and Fine-Grained Scientific Hypothesis Discovery

MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.

Preference-Aware Rubric Learning for Personalized Evaluation

The paper introduces PARL, a framework that learns personalized evaluation rubrics directly from raw user interaction histories to accurately assess how well LLM outputs align with subjective, user-specific preferences.

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

This paper introduces personalized empathy, a capability for LLMs to adapt empathetic strategies based on individual user history, and proposes PereGRM, a reward modeling framework that significantly enhances this personalized empathy.

Identifying Gems from Roman RAPIDly

This paper introduces a machine learning model, RuBR, and a methodology to reliably distinguish genuine astronomical transients from spurious detections for the upcoming Roman Space Telescope's data pipeline.

Highlighted terms show continued research focus across papers

Papers

cs.LGastro-ph.IMcs.CVRecentJun 3, 2026

Identifying Gems from Roman RAPIDly

Karan Gandhi, Ashish A. Mahabal, Jacob E. Jencson, Russ R. Laher +5 more

This paper introduces a machine learning model, RuBR, and a methodology to reliably distinguish genuine astronomical transients from spurious detections for the upcoming Roman Space Telescope's data p…

View →
cs.CLRecentMay 30, 2026

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

Wuqiang Zheng, Chengbing Wang, Yilin Yang, Junyi Cheng +5 more

This paper introduces personalized empathy, a capability for LLMs to adapt empathetic strategies based on individual user history, and proposes PereGRM, a reward modeling framework that significantly…

View →
cs.CLRecentMay 29, 2026

Preference-Aware Rubric Learning for Personalized Evaluation

Yilun Qiu, Xiaoyan Zhao, Yang Zhang, Yuxin Chen +6 more

The paper introduces PARL, a framework that learns personalized evaluation rubrics directly from raw user interaction histories to accurately assess how well LLM outputs align with subjective, user-sp…

View →
cs.CLcs.AIcs.CERecentMay 28, 2026

MOOSE-Copilot: A Web-Based Interactive Assistant for Unified Exploratory and Fine-Grained Scientific Hypothesis Discovery

Hongran An, Zonglin Yang

MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.

View →
cs.AIcs.CRcs.SERecentMay 24, 2026

Inverting the Shield: Systematically Generating Safety Tests from Policy Specifications

Xiaoyue Lu, Xianglin Yang, Haijun Liu, Jiahao Liu +3 more

The paper introduces POLARIS, a novel framework that systematically generates comprehensive and verifiable safety tests for LLMs by formalizing natural language policies into First-Order Logic and exp…

View →
cs.CRcs.AIcs.LGRecentMay 24, 2026

Turning Bias into Bugs: Bandit-Guided Style Manipulation Attacks on LLM Judges

Xianglin Yang, Bryan Hooi, Gelei Deng, Tianwei Zhang +1 more

The paper introduces BITE, a black-box adversarial framework that exploits stylistic biases in LLM judges by adaptively generating semantically equivalent edits to artificially inflate assigned scores…

View →
cs.AIcs.CRRecentMay 6, 2026

AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use

Chenglin Yang

AgentTrust is a novel runtime safety layer that intercepts and evaluates AI agent tool calls before execution, achieving high accuracy in detecting unsafe actions across complex and obfuscated scenari…

View →
cs.CRcs.CLRecentMay 1, 2026

SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking

Jindong Li, Ying Liu, Yali Fu, Jinjing Zhu +3 more

The paper proposes SRTJ, a Self-Evolving Rule-Driven Training-Free Jailbreak framework that systematically discovers and refines attack strategies using rule composition and feedback to achieve robust…

View →
cs.CRcs.AIcs.CLRecentApr 8, 2026

TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories

Yen-Shan Chen, Sian-Yao Huang, Cheng-Lin Yang, Yun-Nung Chen

The paper introduces TraceSafe-Bench, a comprehensive benchmark, and finds that securing LLM agents requires jointly optimizing for structural reasoning and safety alignment to mitigate risks during m…

View →