Code

  • SeedVR2 and SeedVR (CVPR 2025) - One-step video super-resolution with high fidelity.
  • TAR - Unified multimodal LLM for both visual understanding and generation using discrete tokens.
  • MAGVIT (CVPR 2023) - Multi-task video generation with a masked transformer, also used in MAGVIT-V2 (ICLR 2024).
  • Generative Transfer Learning (CVPR 2023) - Benchmark results on generative transfer learning.
  • MaskGIT (CVPR 2022) - Non-autoregressive transformer for image synthesis.
  • LeCAM-GAN (CVPR 2021) - Regularization approach for training robust GANs on limited training data.
  • MentorMix (ICML 2020) - Robust deep learning method for realistic noisy labels.
  • Contrastive Adaptation Network (TPAMI 2022) - Unsupervised domain adaptation.
  • Future Prediction in Video (CVPR 2019) - Joint path and activity prediction.
  • Text Image Residual Gating (CVPR 2019) - Image retrieval using an image query combined with a text modification.
  • Eidetic-3D LSTM (ICLR 2019) - Self-supervised video learning model.
  • MentorNet (ICML 2018) - Weakly-supervised deep learning method.
  • Self-paced Learning (NeurIPS 2014) - Implementation of self-paced learning used in our paper.

Data Set

  • Controlled Noisy Web Labels (ICML 2020) - First dataset and benchmark for realistic label noise sourced from the web.
  • MemexQA (TPAMI 2019) - Multimodal dataset consisting of real personal photos and crowd-sourced questions/answers.
  • YouTube-8M - Large-scale labeled video dataset consisting of millions of YouTube videos.
  • CMU Viral Video Dataset (ICMR 2014) - Public dataset for viral video study.