VLESA: Vision-Language Embodied Safety Agent for Human Activity Monitoring