20 results for “Basic knowledge of human-computer interaction and document highlighting behavior.”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechani…
SkillPager is a novel two-stage framework that efficiently selects minimal, execution-sufficient context from large procedural skill documents by leveraging typed semantic nodes, significantly reducin…
This paper investigates whether a group of people highlighting the same document forms a single consensus or is internally structured into reader sub-groups.
This paper predicts the aggregate crowd salience of a document from its text before its marks accumulate.
Minglai Yang, Xinyan Velocity Yu, Pengyuan Li, Xinyu Guo +21 more
The paper introduces Dr. DocBench, a difficulty-aware, comprehensive benchmark designed to rigorously test expert-level and challenging document parsing capabilities for VLMs, demonstrating that curre…
Lauren Sismeiro, Remy Plastre, Binbin Xu, Frederic Puyjarinet +1 more
This paper demonstrates a proof-of-concept method using top-view video to detect 'Pen-Up' states in handwriting, showing it can reliably complement traditional digitizing tablets for developmental dis…
Haoyue Yang, Zhangxiao Shen, Fan Ding, Hangting Lou +7 more
The paper introduces Cookie-Bench, a novel, autonomous, and reference-free evaluation framework that significantly improves the assessment of interactive web generation capabilities for frontier LLMs.
The paper systematically compares multimodal transformer and LLM approaches for document type classification, finding that specialized multimodal Transformers outperform LLM-based models, especially w…
Sina Mavali, David Pape, Jonathan Evertz, Samira Abedini +4 more
The paper introduces the Task Alignment Benchmark (TAB) to evaluate terminal agents' ability to selectively follow relevant environmental instructions while ignoring misleading distractors, revealing…
Jiaman He, Riccardo Xia, Dana McKay, Damiano Spina +1 more
The paper presents SearchLog, a web browser extension for collecting natural search logs during lab-based studies.
Fan Wu, Lishuai Dong, Cuiyun Gao, Yujia Chen +3 more
The paper introduces WebIGBench, a novel benchmark designed to rigorously evaluate multimodal LLMs' ability to generate code for complex, interactive webpages, addressing the limitations of existing s…
The paper introduces UniKE, a benchmark showing that successful knowledge edits in text-only multimodal models do not reliably transfer to image generation, revealing a significant modality gap.
Shihao Rao, Liang Li, Jiapeng Liu, Tong Lin +5 more
The paper introduces DocFormBench, a new benchmark for content-aware document formatting, and proposes DocFormFlow, a workflow that improves formatting accuracy and efficiency by decoupling target loc…
HuiMing Fan, Xiao Wang, Zheng Chu, Qianyu Wang +4 more
The paper argues that current search agents often verify existing knowledge rather than genuinely searching, and introduces LiveBrowseComp, a new benchmark to measure true evidence-driven discovery.
The paper introduces OpAI-Bench, a novel benchmark designed to study how AI authorship signals evolve and accumulate during the progressive co-editing process between humans and AI.
MEMENTO proposes a novel framework that treats the open web as a continuous learning signal, enabling agents to acquire task-specific expertise and reusable research strategies in low-data domains wit…
This study analyzed eye-tracking data in a simulated military environment, finding that multimodal adaptive decision support tools significantly improve mission performance compared to visual-only too…
The paper introduces a framework to quantitatively measure evolving agent behaviors (traits) by analyzing changes in their configuration text files, achieving high accuracy in classifying behavioral s…
This paper introduces a new benchmark dataset and evaluation framework for 'data snapshot extraction,' focusing on identifying and localizing semantically meaningful analytical artifacts within operat…
The paper introduces KnowledgeGain, a novel metric that measures the actual knowledge gained by readers from science news, and demonstrates its use in optimizing news generation to improve reader lear…