Jie He
4 indexed papers
Research Timeline
The paper introduces MalSkills, a neuro-symbolic framework that detects malicious skills in the expanding agentic supply chain by analyzing security-sensitive operations across heterogeneous artifacts.
The paper introduces FinBoardBench, a novel evaluation suite using financial board games to demonstrate that current LLMs, despite strong static reasoning, fail at complex, dynamic wealth management and strategic decision-making.
The paper introduces NICE, a novel, theory-grounded diagnostic benchmark for assessing the social intelligence of LLMs, which reveals that current frontier models consistently struggle with specific facets of communication.
Persona prompting does not universally improve LLM performance; instead, it systematically trades increased expertise depth for reduced clarity, making multi-metric evaluation essential.
Papers
NICE: A Theory-Grounded Diagnostic Benchmark for Social Intelligence of LLMs
Yunjin Qi, Zhaojun Jiang, Xuan Wu, Hanxi Pan +9 more