Published In
International Journal of Advance Computational Engineering and Networking (IJACEN)
Volume Issue
Volume 7, Issue 4 (Apr 2019)
Paper Title
Visual Question Answering: An Analysis of Various AI Models and Datasets
Author Name
Neeraj Sanish S Joseph
Computer Science, School of Research and Innovation, CMR University, Bangalore, 560043
Visual Question Answering (VQA) is considered to be one of the latest advances in the field of Artificial Intelligence (AI). It is a unique task that combines three of the most important realms of AI, namely Computer Vision (CV), Natural Language Processing (NLP), and Knowledge Representation and Reasoning (KR), each of which is being researched extensively. Given an image and an open-ended natural language question about the image, a VQA model needs to provide an open-ended natural language answer. To achieve this, the model must develop an understanding of the different entities in the image and in the language, and of the dependencies between them. For this reason, VQA is regarded as a true AI task. In this review we detail the various algorithms proposed for building a VQA model, classifying them by the mechanisms they use to extract the input visual and natural language features and map them to a common feature vector space. Finally, we analyze the correctness of these models and propose alternatives based on Capsule Networks (CapsNet) as directions for future work.
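As a rough illustration of what mapping visual and language features to a common feature vector space can look like, the following minimal sketch projects an image feature vector and a question feature vector into a shared space, fuses them, and scores a small set of candidate answers. Everything in it is an assumption made for illustration and is not taken from the paper: the dimensions, the random untrained weights standing in for learned projections, the element-wise product used for fusion, and the treatment of answering as classification over a fixed answer set. In a real model the two input vectors would come from an image encoder and a question encoder.

```python
import math
import random


def linear(weights, vec):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer_distribution(image_feat, question_feat, w_img, w_txt, w_out):
    # Project each modality into a common space of the same dimension.
    img_common = [math.tanh(v) for v in linear(w_img, image_feat)]
    txt_common = [math.tanh(v) for v in linear(w_txt, question_feat)]
    # Fuse by element-wise product (an assumed choice, one of several options).
    joint = [a * b for a, b in zip(img_common, txt_common)]
    # Score each candidate answer from the fused representation.
    return softmax(linear(w_out, joint))


# Hypothetical sizes chosen only to keep the example small.
IMG_DIM, TXT_DIM, COMMON_DIM, NUM_ANSWERS = 8, 6, 4, 3

rng = random.Random(0)
w_img = random_matrix(COMMON_DIM, IMG_DIM, rng)
w_txt = random_matrix(COMMON_DIM, TXT_DIM, rng)
w_out = random_matrix(NUM_ANSWERS, COMMON_DIM, rng)

# Stand-ins for encoder outputs (image features and question features).
image_feat = [rng.random() for _ in range(IMG_DIM)]
question_feat = [rng.random() for _ in range(TXT_DIM)]

probs = answer_distribution(image_feat, question_feat, w_img, w_txt, w_out)
print(probs)
```

The two projections are kept separate so that inputs of different sizes land in the same space before fusion; replacing the element-wise product with concatenation or an attention mechanism would change only the fusion step.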