Turkish Video Captioning with MSVD-Turkish Dataset
(Fen Bilimleri Enstitüsü, 2020)
The problem of video captioning can be defined as describing video content in natural language, by extracting information from the given video, in such a way that a person could identify the video from the description. Video captioning ...
Towards Understanding Intuitive Physics with Language and Vision
(Fen Bilimleri Enstitüsü, 2021)
Visual question answering (VQA) is one of the difficult tasks in multimodal machine reasoning. VQA requires machines to provide correct answers to questions about an image or a video. Here, the machine should perceive the ...