RREDCoT: Segment-Level Reward Redistribution for Reasoning Models