Seung-Bin Kim
1 indexed paper
Recent (6 mo): 1
With code: 0
Influential cites: 0
Benchmarked: 0
Publications per year
2026: 1
Top categories
Audio and Speech Processing ×1, AI ×1, NLP ×1
Research Timeline
2026
ImmersiveTTS: Environment-Aware Text-to-Speech with Multimodal Diffusion Transformer and Domain-Specific Representation Alignment
ImmersiveTTS is an environment-aware text-to-speech model that generates natural speech blended into its environmental context by explicitly modeling cross-modal interactions. The authors report improved naturalness and fidelity.