Published In
International Journal of Advance Computational Engineering and Networking (IJACEN)
Issue
Volume 7, Issue 4 (Apr, 2019)
Paper Title
Visual Question Answering: An Analysis of Various AI Models and Datasets
Author Name
Neeraj Sanish S Joseph
Affiliation
Computer Science, School of Research and Innovation, CMR University, Bangalore, 560043
Pages
26-30
Abstract
Visual Question Answering (VQA) is considered one of the latest advances in the field of Artificial Intelligence (AI). It is a distinctive task that combines three of the most important areas of AI, namely Computer Vision (CV), Natural Language Processing (NLP), and Knowledge Representation and Reasoning (KR), each of which is being researched extensively. Given an image and an open-ended natural language question about the image, a VQA model must provide an open-ended natural language answer. To achieve this, the model needs to develop an understanding of the different entities in the image and in the language, and of their dependencies. For this reason, VQA is regarded as a true AI task. In this review, we detail the various algorithms proposed for building a VQA model, classifying them by the mechanisms used to extract the input visual and natural language features and map them to a common feature vector space. Finally, we analyze the correctness of these models and propose some alternatives using Capsule Networks (CapsNet) as future directions.