CIVIC: End-to-End Sequence Compactness for Efficient Vision-Language Models