AdaptR1: Reinforcement Learning Based Adaptive Interleaved Thinking in Multi-hop Question Answering | ArxivCSExplorer