**Leaderboard: video captioning (MSR-VTT testing set)**

| # | Team Name | BLEU@4 | METEOR | CIDEr-D | SPICE |
|---|-----------|--------|--------|---------|-------|
| 1 | CASIA_IVA | 26.13 | 20.86 | 35.09 | 7.85 |
| 2 | Gene | 23.67 | 19.63 | 31.19 | 7.52 |
| 3 | aimc_21 | 20.66 | 20.13 | 30.18 | 7.40 |
| 4 | Nameless | 22.80 | 18.87 | 27.95 | 6.40 |
| 5 | Micro Genius | 20.93 | 17.34 | 24.42 | 5.60 |
| 6 | MSVLPT | 21.26 | 17.10 | 23.35 | 5.50 |
| 7 | tsinghua_hhh | 7.98 | 13.90 | 17.28 | 5.16 |
**Leaderboard: video categorization**

| # | Team Name | Top-1 Accuracy (%) |
|---|-----------|--------------------|
| 1 | Silver_Bullet | 62.28 |
| 2 | MSVLPT | 56.77 |
| 3 | sunny_flower | 54.33 |
| 4 | ethan | 53.66 |
| 5 | ghost_rider | 50.83 |
For evaluation on the downstream task of video captioning, we compute the automatic metrics BLEU@4, METEOR, CIDEr-D, and SPICE on the testing set of the MSR-VTT dataset and publish the results in the leaderboard above.
For evaluation on the downstream task of video categorization, we report the top-1 accuracy on the testing set of the downstream dataset.
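For reference, here is a minimal sketch of how such captioning scores could be computed with the third-party `pycocoevalcap` package. The video ids and captions below are hypothetical placeholders, not challenge data; in practice the captions are first normalized with the package's PTBTokenizer, and the METEOR and SPICE scorers additionally require a Java runtime.

```python
# Minimal sketch, assuming the pycocoevalcap package (pip install pycocoevalcap).
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.cider.cider import Cider
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.spice.spice import Spice

# Both structures map a video id to a list of tokenized, lower-cased captions;
# references (gts) may hold several captions per video, predictions (res) one.
gts = {"video0": ["a man is playing a guitar", "a person plays the guitar"]}
res = {"video0": ["a man plays a guitar"]}

# Note: the package's Cider scorer implements the CIDEr-D variant reported here.
for name, scorer in [("BLEU@4", Bleu(4)), ("METEOR", Meteor()),
                     ("CIDEr-D", Cider()), ("SPICE", Spice())]:
    score, _ = scorer.compute_score(gts, res)
    # Bleu(4) returns a list of BLEU@1..BLEU@4; the leaderboard uses BLEU@4.
    print(name, score[3] if name == "BLEU@4" else score)
```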
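Top-1 accuracy is simply the fraction of test videos whose highest-scoring predicted category matches the ground-truth label. A minimal sketch, using hypothetical logits and labels:

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose highest-scoring class matches the label."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))

# Hypothetical example: 3 videos, 4 categories.
logits = np.array([[0.1, 0.7, 0.1, 0.1],
                   [0.5, 0.2, 0.2, 0.1],
                   [0.2, 0.2, 0.5, 0.1]])
labels = np.array([1, 0, 3])
print(top1_accuracy(logits, labels))  # 2 of 3 correct -> 0.6667
```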
```bibtex
@article{autogif2020,
  title   = {Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training},
  author  = {Yingwei Pan and Yehao Li and Jianjie Luo and Jun Xu and Ting Yao and Tao Mei},
  journal = {arXiv preprint arXiv:2007.02375},
  year    = {2020}
}

@inproceedings{msrvtt,
  title     = {MSR-VTT: A Large Video Description Dataset for Bridging Video and Language},
  author    = {Jun Xu and Tao Mei and Ting Yao and Yong Rui},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2016}
}
```