Turn a finished raster image into a PDF while preserving the image exactly and adding a transparent text layer for selection, copying, and search.
references/ocr-alignment.md when you need guidance for extracting, correcting, or prompting for box-level text.references/layout-json.md for the exact JSON schema.python scripts/compose_image_text_pdf.py \
--image /path/to/image.png \
--layout /path/to/layout.json \
--output /path/to/image-text.pdf \
--debug-output /path/to/image-text-check.pdf
For CJK or other non-Latin text, pass a Unicode font:
python scripts/compose_image_text_pdf.py \
--image /path/to/image.png \
--layout /path/to/layout.json \
--output /path/to/image-text.pdf \
--debug-output /path/to/image-text-check.pdf \
--font-file /path/to/NotoSansCJK-Regular.ttc
text field in layout JSON, not the OCR image.When an OCR tool returns word-level boxes, convert them to line-level layout items:
python scripts/ocr_words_to_layout.py \
--ocr /path/to/ocr.json \
--output /path/to/layout.json \
--image-width 1536 \
--image-height 2048 \
--source-text /path/to/source.txt
The converter accepts common JSON shapes containing words, items, textAnnotations, or nested page/line/word objects. It groups nearby words into lines and can replace OCR text with the closest line from the source text when the match is strong.
共 1 个版本