Student Researcher - Doubao - Seed - Foundation Model - Vision and Language - 2025 Start - PhD

Currently pursuing a PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline.
Research experience in multi-modal understanding or vision and language, such as video captioning, VQA, text-to-video retrieval, audio/music understanding and generation, or related topics.
Publications in top-tier venues such as CVPR, ECCV, ICCV, NeurIPS, ICLR, ICML, EMNLP, ACL, and COLING.
Highly competent in algorithms and programming, with strong coding skills in Python and popular deep learning frameworks.
Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment.

Conduct cutting-edge research and development in computer vision and natural language processing, especially in multi-modality and vision and language.
Publish our latest research results and help build our brand in the research community.
Transfer our research results to product applications, and explore new product ideas with CV/NLP at their core.