Creates a new Gemini OCR capability adapter.
Owning provider instance used for initialization checks and merged config access.
Initialized Google GenAI SDK client.
Executes OCR through Gemini multimodal content generation.
Responsibilities:
NormalizedOCRDocument
Unified OCR request envelope.
Optional execution context. Unused directly in this adapter.
Optional signal: AbortSignal. Optional cancellation signal.
Provider-normalized OCR artifacts.
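The optional cancellation signal described above could be honored with a small guard around the provider call. This is a minimal sketch, not the adapter's actual code: `assertNotAborted` and `runWithSignal` are hypothetical names introduced for illustration, and only the standard `AbortSignal` type is assumed.

```typescript
// Hypothetical guard (not taken from the adapter's source):
// throws if the caller has already cancelled the request.
function assertNotAborted(signal?: AbortSignal): void {
  if (signal?.aborted) {
    throw new Error("OCR request aborted");
  }
}

// Hypothetical wrapper: checks the signal before starting the work
// and again after it resolves, so a cancelled request never returns a result.
async function runWithSignal<T>(
  work: () => Promise<T>,
  signal?: AbortSignal,
): Promise<T> {
  assertNotAborted(signal);
  const result = await work();
  assertNotAborted(signal);
  return result;
}
```

A caller would create an `AbortController`, pass `controller.signal` in the execution context, and call `controller.abort()` to cancel. Note that this sketch only checks the signal at the boundaries; it does not interrupt a request that is already in flight.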
Adapts Gemini OCR responses into ProviderPlaneAI's normalized OCR artifact surface.
Gemini does not expose a dedicated OCR endpoint here, so OCR is implemented as prompt-driven multimodal extraction over images and documents via generateContent.
Practical support is narrower than Mistral OCR in this repo. The most reliable document/image path is PDF plus Gemini's supported image formats; some text-like and office-document inputs can fail slowly at the provider layer.
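To illustrate what prompt-driven extraction might look like, the sketch below builds a generateContent-style request from a base64-encoded file and rejects input types outside the reliable path before anything is sent. Everything here is an assumption for illustration: `buildOcrRequest`, the MIME allowlist, the prompt text, and the default model name are hypothetical and not taken from the adapter. Only the general payload shape (an inline data part followed by a text part) follows the public Gemini request format.

```typescript
// Assumed allowlist: PDF plus the image formats Gemini documents as supported.
const SUPPORTED_OCR_MIME_TYPES = new Set<string>([
  "application/pdf",
  "image/png",
  "image/jpeg",
  "image/webp",
  "image/heic",
  "image/heif",
]);

interface InlineDataPart {
  inlineData: { mimeType: string; data: string };
}
interface TextPart {
  text: string;
}
interface OcrGenerateContentRequest {
  model: string;
  contents: Array<{ role: "user"; parts: Array<InlineDataPart | TextPart> }>;
}

// Hypothetical helper: pairs the document bytes with an extraction prompt.
// The default model name is a placeholder, not the adapter's configured value.
function buildOcrRequest(
  base64Data: string,
  mimeType: string,
  model: string = "gemini-2.5-flash",
): OcrGenerateContentRequest {
  // Fail fast locally instead of waiting for a slow provider-side failure.
  if (!SUPPORTED_OCR_MIME_TYPES.has(mimeType)) {
    throw new Error(`Unsupported OCR input type: ${mimeType}`);
  }
  return {
    model,
    contents: [
      {
        role: "user",
        parts: [
          { inlineData: { mimeType, data: base64Data } },
          { text: "Extract all text from this document. Return plain text only." },
        ],
      },
    ],
  };
}
```

With the Google GenAI SDK, an object of this shape could then be passed to `client.models.generateContent(...)`, and the extracted text read from the response. Rejecting unsupported types up front is one possible design choice for avoiding the slow provider-layer failures mentioned above.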