Shreyas Fadnavis
2 indexed papers
Research Timeline
The paper introduces the 'readout-mediator angle' to show that simple linear probes, although able to decode information, often recover directions orthogonal to the ones the model's causal computation actually uses. Decodability alone is therefore not evidence of causal use, which the paper presents as a fundamental failure mode in interpretability.
The paper proposes an LLM aggregator that analyzes complete reasoning traces, and demonstrates that this trace-level synthesis outperforms traditional consensus methods such as majority voting on complex problems.
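For context on the baseline named in the second summary, here is a minimal sketch of majority voting over final answers. It covers only the consensus baseline. The LLM aggregator is not sketched, because the summary does not describe how it works. The function name `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent final answer among sampled solutions.

    Only final answers are compared; the reasoning traces that produced
    them are ignored, which is the limitation the summary contrasts with
    trace-level aggregation.
    """
    if not answers:
        raise ValueError("need at least one answer to vote on")
    counts = Counter(answers)
    # most_common(1) returns a one-element list of (answer, count) pairs.
    winner, _count = counts.most_common(1)[0]
    return winner


# Hypothetical final answers from five sampled reasoning traces.
sampled_answers = ["42", "41", "42", "40", "42"]
print(majority_vote(sampled_answers))
```

Because the vote sees only the final strings, a correct answer held by a minority of traces is always outvoted, however sound its reasoning.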
Papers
When and How Long? The Readout-Mediator Angle in Temporal Reasoning
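To make the geometric idea concrete, here is a minimal sketch of the angle between two direction vectors, assuming the readout direction is a linear probe's weight vector and the mediator direction is some estimate of the causally relevant direction. The paper's exact definition is not given in this listing, so the function `angle_degrees` and both example vectors are hypothetical.

```python
import math


def angle_degrees(u, v):
    """Angle between two equal-length vectors, in degrees."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


# Hypothetical directions in a 3-dimensional activation space.
probe_direction = [1.0, 0.0, 0.0]     # assumed: weights of a linear probe
mediator_direction = [0.0, 1.0, 0.0]  # assumed: causally relevant direction
print(angle_degrees(probe_direction, mediator_direction))
```

An angle near 90 degrees corresponds to the orthogonal case the summary describes, where a probe decodes the information along a direction the computation does not use.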