cs.IR, cs.CL, cs.HC, cs.SI, Empirical

Factions Within, Uncertain Across: Within-Document Reader Sub-Groups in Social Highlighting

Jun 10, 2026

AI Summary (llama-3.1-8b-instruct)

This paper investigates whether a group of people highlighting the same document forms a single consensus or is internally structured into reader sub-groups.

It finds strong sub-groups within a document (nearest-neighbour agreement z=+6.3, significant in 88% of documents), about 40% of which is explained by shared engagement with the same coarse regions. Whether those sub-groups follow the same readers across documents is left unresolved, because the cross-document test is underpowered.

Keywords

reader sub-groups, document highlighting, co-readership platform, statistical analysis

Before reading this…

Basic knowledge of human-computer interaction and document highlighting behavior.
Understanding of statistical analysis and research methods.

Applications

→ Understanding reader sub-groups can inform the design of collaborative document editing tools and platforms.
→ The findings of this paper can also be applied to other areas of human-computer interaction, such as collaborative writing and knowledge management.

Abstract

When many people highlight the same document, is the crowd a single consensus, or is it internally structured into reader sub-groups that mark different things -- and is that structure a stable property of a reader or of the document? Building on prior work showing an individual's within-document highlighting signal is a whisper while individuality lives in selection, we ask the group-level question on a co-readership platform using a margin-preserving curveball null. Experiment 1: within a document, readers form strong sub-groups -- pairs agree far beyond what shared salience, mark density, and sentence popularity predict (nearest-neighbour agreement z=+6.3, significant in 88% of documents). Under an eight-block region-preserving null, shared engagement with the same coarse regions of the document accounts for about 40% of this excess; the majority survives as finer reader-specific agreement (z=+3.6, 77% significant). So the within-document crowd is, in a descriptive sense, factional. Experiment 2: is that grouping a stable reader trait? Here we are honest about power. The cross-document split-half reproducibility of a pair's agreement is near zero pooled (+0.078 and 0.000 in two separately drawn samples), and a power calibration shows the test is informative only for pairs that co-read many documents. In the only informative high-overlap subset (k>=4), point estimates are positive but small-sample, imprecise across the separately drawn samples, never significant, and attenuate under the region-preserving null. We therefore leave cross-document stability unresolved: the data is consistent with anything from situational grouping to a weak-to-moderate stable reader trait. The crowd is factional within a document; whether its factions follow the reader across documents is, honestly, beyond our reach.
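
To make the Experiment 1 test concrete, here is a minimal sketch of a margin-preserving curveball null and a nearest-neighbour agreement z-score for a single document. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which agreement measure is used, so Jaccard overlap stands in for it, and the names `curveball`, `nn_agreement`, and `nn_z` are hypothetical. Each reader is represented as a set of highlighted sentence indices. A curveball trade between two readers reshuffles only the sentences that exactly one of them marked, so each reader's mark count and each sentence's popularity stay fixed. The eight-block region-preserving null is not implemented here.

```python
import random
from statistics import mean, pstdev


def curveball(rows, n_trades, rng):
    """Randomise reader-by-sentence marks while preserving both margins.

    rows: list of sets, one per reader, holding highlighted sentence indices.
    Every trade keeps each reader's mark count and each sentence's popularity.
    """
    rows = [set(r) for r in rows]  # work on a copy
    for _ in range(n_trades):
        i, j = rng.sample(range(len(rows)), 2)
        shared = rows[i] & rows[j]
        only_i = rows[i] - shared
        only_j = rows[j] - shared
        # Pool the sentences marked by exactly one of the two readers,
        # shuffle them, and deal them back in the original proportions.
        pool = sorted(only_i | only_j)
        rng.shuffle(pool)
        k = len(only_i)
        rows[i] = shared | set(pool[:k])
        rows[j] = shared | set(pool[k:])
    return rows


def jaccard(a, b):
    """Overlap between two readers' highlight sets (assumed agreement measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nn_agreement(rows):
    """Mean, over readers, of each reader's best agreement with any other reader."""
    return mean(
        max(jaccard(r, s) for j, s in enumerate(rows) if j != i)
        for i, r in enumerate(rows)
    )


def nn_z(rows, n_null=200, n_trades=None, seed=0):
    """z-score of observed nearest-neighbour agreement against the curveball null."""
    rng = random.Random(seed)
    n_trades = n_trades or 5 * len(rows)  # mixing length is an arbitrary choice
    observed = nn_agreement(rows)
    null = [nn_agreement(curveball(rows, n_trades, rng)) for _ in range(n_null)]
    sd = pstdev(null)
    return (observed - mean(null)) / sd if sd > 0 else float("nan")


# Toy document with two factions that highlight disjoint sentences.
toy = [{0, 1, 2}] * 4 + [{5, 6, 7}] * 4
z = nn_z(toy)
```

On the toy input, every reader has an identical nearest neighbour, so the observed statistic sits at its maximum and the z-score is positive. A real analysis would choose the number of trades by checking that the null has mixed, and would report the fraction of documents with a significant z, as the abstract does.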