Can someone explain in detail what my final-year project actually requires?

As the title says, here is the description of my final-year project:
Video comprises the major communications and entertainment media asset, occupying more than 60% of today’s Internet traffic. Yet, video remains the least-manageable element of the big data ecosystem. This is because all state-of-the-art methods for high-level semantic description in video require either manual annotation, or compute-intensive video decoding and processing. This project aims to create a robust and performant ecosystem of machine learning algorithms to uniquely identify and describe semantic video attributes within networks and file systems (e.g., automatic semantic labeling of video segments). The tools to be used are: TensorFlow, Python, potentially a bit of C/C++ programming, Docker containers and the Linux operating system.
It is understood that the student will not be familiar with these tools, so some summer study will be required. Team members of Dr Andreopoulos's group will be available to tutor the student on some of the practical aspects. This project is a demanding piece of work, but it is ideal for a student who is genuinely interested to understand what deep neural networks can (and cannot) do, and the application area in video analysis is of very strong relevance to a number of industries in the UK and worldwide.

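Is this video classification, or is it assigning labels to each video? I previously built a video classification model with a CNN plus an LSTM, but rereading the brief now, that does not seem quite right. Also, my professor wants me to use a transformer model. Which transformer model would be a good fit for this project?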

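Look at the Kinetics-400 dataset and video transformers. As I understand it, your task is the same as the task for that dataset: multi-class classification. Your professor probably also wants you to read the video transformer paper and improve on it.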
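To make the "classification vs. labeling" distinction concrete, here is a minimal sketch of how a clip-level multi-class classifier can be turned into per-segment labels, which is what the brief calls "automatic semantic labeling of video segments". Everything in it is an assumption for illustration: the function names, the segment length, and especially `toy_classifier`, which is a hypothetical stand-in for a trained model (for example a video transformer trained on Kinetics-400) and works on single numbers rather than real frames.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_segments(num_frames: int, clip_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive, covering the video."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [(start, min(start + clip_len, num_frames))
            for start in range(0, num_frames, clip_len)]


def label_segments(
    frames: Sequence[float],
    clip_len: int,
    classify_clip: Callable[[Sequence[float]], str],
) -> List[Tuple[int, int, str]]:
    """Run a clip-level classifier on every segment and collect one label per segment."""
    return [(start, end, classify_clip(frames[start:end]))
            for start, end in split_into_segments(len(frames), clip_len)]


def toy_classifier(clip: Sequence[float]) -> str:
    """Hypothetical stand-in for a trained model: labels a clip by mean brightness."""
    return "bright" if sum(clip) / len(clip) > 0.5 else "dark"


# Each number stands in for one frame (here: its average brightness).
frames = [0.1, 0.2, 0.1, 0.9, 0.8, 0.7, 0.2]
labels = label_segments(frames, clip_len=3, classify_clip=toy_classifier)
print(labels)
```

Seen this way, the two readings of the brief are not in conflict: the model itself is still a multi-class classifier, and segment labeling is that classifier applied clip by clip along the video.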

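What the brief asks for: labeling (PS: you need labels before you can classify, right?)
Transformer models were originally developed for text tasks such as text classification.
Summary: use a transformer model for classification.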
