
Apple: LLMs Accurately Recognize Activities from Captioned Audio and Motion Data
LLMs can accurately recognize daily activities by fusing captioned audio and motion data, improving performance without raw audio or specialized multimodal training.
Activity recognition uses sensors, machine learning, or AI to automatically identify and classify human physical activities such as walking, cooking, or exercising.
