This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

Analyze image with chat completion request #464

Open
hynek5 opened this issue Feb 11, 2024 · 1 comment

Comments


hynek5 commented Feb 11, 2024

I'm trying to analyze an image following the guide at https://platform.openai.com/docs/guides/vision?lang=curl

I cannot get my solution to work. I get odd responses such as: "I cannot accurately identify the contents of the image as it is encoded in base64 format. Please provide a direct image link or describe the image."
This is strange, because the Python example from the OpenAI docs works like a charm.

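import com.theokanning.openai.completion.chat.ChatCompletionRequest;
import com.theokanning.openai.completion.chat.ChatMessage;
import com.theokanning.openai.service.OpenAiService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Base64;
import java.util.List;
import java.util.stream.Collectors;

@Component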
public class ImageAnalyser {

    private final OpenAiService openAiService;

    @Autowired
    public ImageAnalyser(OpenAiService openAiService) {
        this.openAiService = openAiService;
    }

    public List<String> analyze(String pathToFile) {
        ChatCompletionRequest completionRequest = ChatCompletionRequest.builder()
                .model("gpt-4-vision-preview")
                .messages(List.of(getChatMessage(pathToFile)))
                .maxTokens(500)
                .build();
        // Log the request for debugging
        System.out.println(completionRequest);
        return openAiService.createChatCompletion(completionRequest)
                .getChoices().stream()
                .map(chatCompletionChoice -> chatCompletionChoice.getMessage().getContent())
                .collect(Collectors.toList());
    }

    private ChatMessage getChatMessage(String pathToImage) {
        return new ChatMessage("user", getContent(pathToImage));
    }

    // Builds the multi-part content (text + image) by hand as a JSON array string
    private String getContent(String filePath) {
        return "[" +
                "{" +
                "\"type\": \"text\"," +
                "\"text\": \"What’s in this image?\"" +
                "}," +
                "{" +
                "\"type\": \"image_url\"," +
                "\"image_url\": {" +
                "\"url\": \"data:image/jpeg;base64," + imageB64(filePath) + "\"" +
                "}" +
                "}" +
                "]";
    }

    public String imageB64(String imagePath) {
        File file = new File(imagePath);
        try (FileInputStream imageInFile = new FileInputStream(file)) {
            // Read the whole file; a single read(byte[]) call is not guaranteed to fill the buffer
            byte[] imageData = imageInFile.readAllBytes();

            // Convert the image bytes into a Base64 string
            return Base64.getEncoder().encodeToString(imageData);
        } catch (IOException e) {
            // Fail loudly instead of returning null, which would end up as "base64,null" in the request
            throw new IllegalStateException("Could not read image: " + imagePath, e);
        }
    }
}

Does anyone have a working example, or an idea of what the issue could be? Or a pointer to where I should look when debugging the serialization? I suspect that is the problem, but I was unable to find the right class.

Thanks a lot!
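
The snippet above hardcodes the "data:image/jpeg;base64," prefix inside the JSON string. As a minimal sketch (not part of the original code), the data URL could be built in one place with the MIME type passed in explicitly. The class name DataUrlSketch and the method toDataUrl are hypothetical names chosen for illustration; only JDK classes are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper, not part of any library mentioned in this issue.
final class DataUrlSketch {

    // Builds a data URL from raw image bytes and a caller-supplied MIME type.
    static String toDataUrl(byte[] imageData, String mimeType) {
        return "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(imageData);
    }

    // Convenience overload that reads the whole file first.
    static String toDataUrl(Path imagePath, String mimeType) {
        try {
            return toDataUrl(Files.readAllBytes(imagePath), mimeType);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read image: " + imagePath, e);
        }
    }
}
```

For example, DataUrlSketch.toDataUrl(Path.of(pathToFile), "image/jpeg") would produce the same kind of value that getContent embeds, assuming the file really is a JPEG.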


hynek5 commented Feb 12, 2024

Okay, it seems that the serialization of the content is the issue: the content is sent as an escaped JSON string rather than as a JSON array. This is the payload:

2024-02-12T11:17:50.118+01:00  INFO 13616 --- [           main] okhttp3.OkHttpClient                     : --> POST https://api.openai.com/v1/chat/completions
2024-02-12T11:17:50.119+01:00  INFO 13616 --- [           main] okhttp3.OkHttpClient                     : Content-Type: application/json; charset=UTF-8
2024-02-12T11:17:50.119+01:00  INFO 13616 --- [           main] okhttp3.OkHttpClient                     : Content-Length: 26315
2024-02-12T11:17:50.119+01:00  INFO 13616 --- [           main] okhttp3.OkHttpClient                     : 
{
  "model": "gpt-4-vision-preview",
  "messages": [
    {
      "role": "user",
      "content": "[{\"type\": \"text\",\"text\": \"What’s in this image?\"},\"{\"type\": \"image_url\",\"image_url\": {\"url\": \".....kZJRgABAQBQoUKhAUKFCoQ//Z\"}\"}\"]\""
    }
  ],
  "max_tokens": 500
}

It would be nice if com.theokanning.openai.completion.chat.ChatMessage.content could accept a JSON object or a JSON array.
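
Until something like that exists, one possible workaround is to bypass ChatMessage for this call and send the request body directly. The following is a rough sketch under stated assumptions, not a confirmed fix: it uses only the JDK's java.net.http client (Java 11+), the endpoint is the one from the log above, and the class and method names (VisionRequestSketch, jsonString, buildBody, send) are hypothetical. The point it illustrates is that "content" is written as a real JSON array, while a Java String holding that same array gets quoted and escaped when serialized as a string field.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical sketch; none of these names come from the openai-java library.
final class VisionRequestSketch {

    // Encodes a Java string as a JSON string literal, including the surrounding quotes.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Builds the request body with "content" as a JSON array of parts, not as a string.
    static String buildBody(String model, String prompt, String dataUrl, int maxTokens) {
        return "{"
                + "\"model\":" + jsonString(model) + ","
                + "\"messages\":[{\"role\":\"user\",\"content\":["
                + "{\"type\":\"text\",\"text\":" + jsonString(prompt) + "},"
                + "{\"type\":\"image_url\",\"image_url\":{\"url\":" + jsonString(dataUrl) + "}}"
                + "]}],"
                + "\"max_tokens\":" + maxTokens
                + "}";
    }

    // Posts the body to the chat completions endpoint and returns the raw JSON response.
    static String send(String apiKey, String body) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }
}
```

The response comes back as raw JSON text, so it would still need to be parsed (for example with the JSON mapper already on the classpath) to extract the message content.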
