OpenGVLab/InternVideo

InternVideo: Video Foundation Models for Multimodal Understanding


[Figure: InternVideo2 performance]

This repo contains the InternVideo series and related work on video foundation models.

  • InternVideo: general video foundation models via generative and discriminative learning
  • InternVideo2: scaling video foundation models for multimodal video understanding
  • InternVideo2.5: empowering video MLLMs with long and rich context modeling
  • InternVideo3: multimodal contextual reasoning via efficient long-horizon agents
  • InternVideo-Next: general video foundation models for genuine world understanding
  • InternVid: a large-scale video-text dataset for multimodal understanding and generation

Updates

Contact

  • If you have questions about trying out, running, or deploying the models, or ideas and suggestions for the project, feel free to join our WeChat group discussion!
[Image: WeChat group]
  • We are hiring researchers, engineers, and interns in the General Vision Group at Shanghai AI Lab. If you are interested in working with us on video foundation models and related topics, please contact Yi Wang (wangyi@pjlab.org.cn).