~ similar to 2606.01632 · 20 results
ShaplEIG introduces a Bayesian experimental design framework to efficiently and adaptively estimate Shapley values by minimizing the number of required costly function evaluations.
The paper introduces the quotient semivalue mechanism to provide fair data attribution that is resistant to contributors manipulating their reported identities by splitting or duplicating data.
Zhaoyu Wang, Pingchuan Ma, Zhantong Xue, Yuguang Zhou +3 more
ZK-Value introduces a practical, scalable zero-knowledge system for calculating data valuations (Shapley values) in data marketplaces, significantly reducing proving time while maintaining high accura…
Yubo Gao, Haotian Wu, Hong Chen, Junquan Huang +7 more
The paper introduces Hierarchical Adaptive Budgeter (HAB), a framework that improves LLM reasoning efficiency by adaptively allocating computational resources to match the intrinsic complexity of both…
FundaPod is a multi-persona agent platform designed for fundamental investment research, enabling AI agents with distinct viewpoints to independently gather evidence and surface disagreements for huma…
Alex Leung, Rex Zhang, Ervin Ling, Kentaroh Toyoda +1 more
This paper maps the emerging insurability frontier of AI risk by coding 55 AI threat classes against 26 insurance products, identifying four tiers of coverage: affirmative, silent, excluded, and outsi…
The paper analyzes the nascent DeFi investment agent market, finding that while token valuations are high, current deployments are heterogeneous, lack clear autonomous execution, and exhibit poor risk…
The paper empirically analyzes the nascent DeFi investment agent market, finding that while token valuations are high, current deployments lack robust autonomous execution and exhibit poor risk-adjust…
CHRONOS is a novel three-layer architecture designed to address coupled failures in temporal data marketplaces by integrating temporal decay, changepoint-aware pricing, and differential privacy for ro…
This empirical study of Pearl's cuPOW protocol demonstrates that the network's Proof-of-Useful-Work mechanism generates zero useful AI computation, instead causing economic harm and displacing legitim…
Taojie Zhu, Wentao Zhao, Rui Sun, Beidi Luan +6 more
The paper introduces KTD-Fin, a novel benchmark that evaluates LLM trading agents by masking historical market data and decomposing returns, finding that LLM agents' profits are largely due to passive…
Qiuyu Tian, Zequn Liu, Yingce Xia, Haojie Yin +1 more
The paper introduces ForeSci, a novel benchmark that evaluates LLM agents' ability to make forward-looking research judgments using only historical evidence, finding that explicit evidence organizatio…
The paper proposes replacing individual agent autonomy with a structured 'social contract' and institutional Separation of Power (SoP) to mitigate systemic failures and deceptive behavior in multi-age…
Ailiya Borjigin, Igor Stadnyk, Ben Bilski, Maksym Chikita +3 more
The paper proposes the Interaction-Native Knowledge Harness (InKH), an architecture that absorbs complex context into financial LLM agents, significantly improving performance, reducing latency, and e…
Aaron Chan, Tengfei Li, Tianyi Xiao, Angela Chen +2 more
The paper introduces LATTICE, a novel benchmark for evaluating how well crypto agents assist user decision-making, finding that different agents excel in different specific areas rather than having a…
The paper proposes using an LLM aggregator that analyzes complete reasoning traces, demonstrating that trace-level synthesis is superior to traditional consensus methods like majority voting for solvi…
Haoxiang Cheng, Yunfei Wang, Chao Chen, Kewei Cheng +4 more
The paper proposes GRiD, a novel framework that uses a two-phase training strategy (supervised pre-training and RL fine-tuning) to discover complex, graph-like rules for knowledge graph reasoning, ove…
Shuning Zhang, Eve He, Xiao Zhan, Shijing He +3 more
This paper investigates how Generative AI enables scalable, hyper-realistic fraud in Chinese e-commerce by fabricating product defect evidence, proposing new defense mechanisms like verifiable materia…
ResearchLoop introduces an evidence-gated control plane to manage and audit the state of AI-assisted computational research, mitigating the risk of unverified claims.
Junyu Lu, Qi Wei, Peishuo Zheng, Jie Zhang +5 more
The paper introduces Prosecution Decision Prediction (PDP), a new legal AI task that assesses prosecutorial review decisions, showing that current state-of-the-art LLMs perform significantly worse on…