### STAIR Captions: A Large-Scale Japanese Image Caption Dataset

Akikazu Takeuchi, Yuya Yoshikawa, Yutaro Shigeto

In recent years, image captioning, the automatic generation of image descriptions (captions), has attracted a great deal of attention. Most studies on image captioning target the English language, and few image caption datasets exist in Japanese. To address this problem, we construct a large-scale Japanese image caption dataset based on images from MS-COCO. Our dataset consists of 820,310 Japanese captions for 164,062 images, an average of five captions per image.