Search results for: 'Internvid: A large-scale video-text dataset for multimodal understanding and generation'