Image to text
Image-to-text models generate text from images. Depending on the model, the generated text is either a caption for the image or an answer to a question about the image. This demo provides both types: image captioning models and visual question answering models. I recommend using the quantized versions of the models, as they are much smaller but provide almost the same quality as the full-precision models.
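To give a rough sense of why quantized models are smaller, here is a back-of-the-envelope sketch. It assumes the full-precision weights are 32-bit floats and the quantized weights are 8-bit integers; the demo does not state which formats its models use, and the parameter count below is a made-up number, not one of the demo's models.

```typescript
// Rough estimate of the weight size for a given numeric precision.
// Ignores metadata, tokenizer files, and any layers left unquantized.
function estimateWeightBytes(parameterCount: number, bitsPerWeight: number): number {
  return (parameterCount * bitsPerWeight) / 8;
}

// Hypothetical parameter count, chosen only for illustration.
const parameterCount = 100_000_000;

const fullPrecisionBytes = estimateWeightBytes(parameterCount, 32); // float32 (assumed)
const quantizedBytes = estimateWeightBytes(parameterCount, 8); // int8 (assumed)

console.log(`full precision: ${fullPrecisionBytes / 1e6} MB`);
console.log(`quantized: ${quantizedBytes / 1e6} MB`);
console.log(`size ratio: ${fullPrecisionBytes / quantizedBytes}x`);
```

Under these assumptions the quantized weights take a quarter of the space. The actual savings depend on the quantization scheme each model uses.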
How to use the demo:
- Select the model.
- Load the image.
- Enter the prefix. For an image captioning model, the prefix is optional and serves as the starting point of the caption. For a visual question answering model, the prefix is the question about the image.
- Click the "Process" button.
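The prefix rule in the steps above can be sketched as a small validation function. This is only an illustration: the names `ModelKind`, `DemoRequest`, and `resolvePrefix` are hypothetical and not part of Web AI, and treating an empty prefix as an error for visual question answering is an assumption inferred from the prefix being described as optional only for captioning.

```typescript
// Hypothetical types for illustration; these are not Web AI's API.
type ModelKind = "captioning" | "vqa";

interface DemoRequest {
  kind: ModelKind;
  imageUrl: string;
  prefix?: string;
}

// Returns the text to pass to the model alongside the image.
// Captioning: the prefix is optional and starts the caption.
// VQA: the prefix is the question, so it is assumed to be required.
function resolvePrefix(request: DemoRequest): string {
  const prefix = (request.prefix ?? "").trim();
  if (request.kind === "vqa" && prefix.length === 0) {
    throw new Error("A visual question answering model needs a question as the prefix.");
  }
  return prefix;
}

// Captioning with no prefix: the model writes the whole caption.
console.log(JSON.stringify(resolvePrefix({ kind: "captioning", imageUrl: "cat.jpg" })));

// VQA: the prefix carries the question about the image.
console.log(resolvePrefix({ kind: "vqa", imageUrl: "cat.jpg", prefix: "What color is the cat?" }));
```

Checking the prefix before running the model keeps the error close to the user's input instead of surfacing it as an odd or empty answer.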
Powered by Web AI.