Font Embedding: Template Text is broken #1362

Open
1 task
MrSerth opened this issue Sep 18, 2024 · 5 comments

Comments

@MrSerth

MrSerth commented Sep 18, 2024

Since updating to prawn 2.5.0, we have been facing issues with font embedding: text that is already present on a template gets changed.

Reproduction example:

  1. This reproduction example assumes that the file Arial Unicode MS.TTF is present in the local working directory. While the font is shipped with Microsoft Office by default and can be licensed from Microsoft, it is also available on GitHub (for example here).
  2. This reproduction example further assumes that a file called template.pdf, downloaded from this issue (template.pdf), is present in the local working directory. The template text was created in Word and then exported to PDF. It contains two lines of text in two different fonts, namely "Source Sans Pro" and "Calibri". Both fonts are shipped with the template as embedded subsets:
[Screenshot 2024-09-18 at 20.57.22: fonts embedded in the template]

With both files available, an irb session can be started and the following script executed:

require 'prawn'
require 'prawn/templates'

@pdf = Prawn::Document.new(
  template: 'template.pdf',
  page_size: 'A4',
  page_layout: :landscape,
  left_margin: 0,
  right_margin: 0,
  top_margin: 0,
  bottom_margin: 0
)

# Subsetting is disabled, so that https://github.com/prawnpdf/prawn/issues/1361 doesn't apply.
@pdf.font_families.update("ArialUnicodeMS" => {normal: {file: "Arial Unicode MS.TTF", subset: false}})

@pdf.font "ArialUnicodeMS", style: :normal
@pdf.text_rendering_mode :fill_stroke do
  @pdf.draw_text "Demo Text @ Draw Text", size: 22.0, style: :normal, text_anchor: "middle", at: [500, 500], offset: [0, 0]
end

@pdf.render_file('prawn.pdf')
puts "Done"

Issue noticed:

  • The "Template Text in Source Sans Pro" is no longer encoded correctly after being processed by Prawn. Instead, the text is now displayed as "2=EHD9L=2=PLAF1GMJ;=19FK.JG".

We assume that this issue is not directly related to #1361, since subsetting is explicitly disabled for the font. However, since we are not fully aware of the inner workings of Prawn, the two issues could well be related. In that case, we apologize for the duplicate report. Any help with fixing this issue is much appreciated!

--

prawn: 2.5.0
Ruby: 3.3.5
Adobe Acrobat Pro: 2024.003.20121
Microsoft Word: Version 2408 Build 16.0.17928.20156

@gettalong
Member

Thanks for the nicely reproducible steps! I can confirm the issue (screenshot attached).

The reason why the initial text now looks mangled is that Prawn overwrites the font object for the text with the one for the Arial font. If you add another TrueType font and write some text in it, the second font object will also be overwritten.

So this indicates a problem with choosing a PDF-internal name for the fonts when the template PDF contains font references in the style Prawn itself generates.

The code for generating these internal font names seems to be at https://github.com/prawnpdf/prawn/blob/master/lib/prawn/font.rb#L543-L553 if you want to take a stab.
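
The overwrite described in this comment can be pictured with a plain Ruby hash standing in for a page's font resources. This is only an illustration, not Prawn's code: the hash contents, the assignment of template fonts to keys, and the `register_font` helper are all hypothetical.

```ruby
# Hypothetical stand-in for a page's font resources, keyed by PDF-internal name.
# Which template font sits under which key is an assumption for illustration.
TEMPLATE_FONTS = {
  F1: 'SourceSansPro (template)',
  F2: 'Calibri (template)'
}.freeze

# Hypothetical helper: registering a font under a key that is already taken
# replaces the existing entry without any warning.
def register_font(fonts, key, font)
  fonts.merge(key => font)
end

fonts = register_font(TEMPLATE_FONTS, :F1, 'ArialUnicodeMS (Prawn)')
puts fonts[:F1] # template text referencing F1 now resolves to the Arial entry
puts fonts.size # still two entries: nothing was added, one was replaced
```

Text in the template keeps referring to its original key, so once that key points at a different font object, the same character codes are rendered with the wrong font.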

@MrSerth
Author

MrSerth commented Sep 18, 2024

Thanks for confirming the issue and pointing me in the right direction! Indeed, I was able to verify that a non-unique ID is selected. In the PDF template shared, the existing fonts are embedded as F1, F2, F3, and F4. These collide with the keys generated by Prawn, which also start at F1.

When checking for a unique key, the method key_is_unique? is used. There, none? iterates over all existing keys (F1, F2, F3, and F4) and checks for start_with?("#{test_key}."). Since none of the existing keys in our PDF contains a dot ., the start_with? check is false for every key, so none? returns true and the candidate key is always considered unique. As a result, Prawn uses the same font key (F1) that is already present in the PDF (F1).
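
The behaviour described above can be reproduced without Prawn. The following is a minimal sketch, assuming the check works as described in this comment; unlike the method in Prawn, this version takes the list of existing keys as an argument so that it runs on its own.

```ruby
# Keys found in the template's font resources, per the analysis above.
EXISTING_KEYS = %i[F1 F2 F3 F4].freeze

# Adapted from the check described in this thread: a candidate only counts
# as taken if some existing key starts with "<candidate>.".
def key_is_unique?(existing_keys, test_key)
  existing_keys.none? { |key| key.to_s.start_with?("#{test_key}.") }
end

puts key_is_unique?(EXISTING_KEYS, :F1)  # true, although F1 is already taken
puts key_is_unique?(%i[F1.0 F2], :F1)    # false, the dotted key is detected
```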

It seems like the dot . was added to the match in 906739d as part of #898. In the first draft of that PR, the code checked the given key with start_with? only (without looking for the dot after #{key_id}). I didn't find a specific reason for that choice, but it may be because F1 (as generated by Prawn) would otherwise also match F1-prefixed IDs (such as F10), and because subsets are keyed as #{font_key}.#{subset}.
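
To make the trade-off between the two variants concrete, here is a small sketch comparing a bare prefix check with the dotted one. Both method names are hypothetical and only illustrate the reasoning above; they are not taken from Prawn.

```ruby
# Bare prefix check, as the first PR draft is described above (hypothetical name).
def taken_by_prefix?(existing_keys, test_key)
  existing_keys.any? { |key| key.to_s.start_with?(test_key.to_s) }
end

# Dotted prefix check, matching subset-style keys such as "F1.0" (hypothetical name).
def taken_by_dotted_prefix?(existing_keys, test_key)
  existing_keys.any? { |key| key.to_s.start_with?("#{test_key}.") }
end

puts taken_by_prefix?(%i[F10], :F1)          # true: F10 is mistaken for F1
puts taken_by_dotted_prefix?(%i[F10], :F1)   # false: no false positive
puts taken_by_dotted_prefix?(%i[F1.0], :F1)  # true: the subset key is detected
puts taken_by_dotted_prefix?(%i[F1], :F1)    # false: the exact key is missed
```

The last line is the case hit by the template in this issue: an exact, undotted match slips through the dotted check.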


I don't have a good proposal for a permanent solution yet. However, I was able to confirm that using a key other than :"F#{font_count}" fixes the problem 👍
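
One possible direction, sketched here purely as an assumption and not as a vetted fix, is to treat a key as taken when it matches exactly or when a subset-style key with a dot exists. The names `key_taken?` and `next_free_font_key` are hypothetical and do not exist in Prawn.

```ruby
# Hypothetical stricter check: a key is taken if it matches exactly
# or if a subset-style key "<key>.<subset>" exists.
def key_taken?(existing_keys, test_key)
  existing_keys.any? do |key|
    name = key.to_s
    name == test_key.to_s || name.start_with?("#{test_key}.")
  end
end

# Hypothetical generator: count upwards until a free key is found.
def next_free_font_key(existing_keys, start = 1)
  count = start
  count += 1 while key_taken?(existing_keys, :"F#{count}")
  :"F#{count}"
end

puts next_free_font_key(%i[F1 F2 F3 F4]) # F5
puts next_free_font_key(%i[F1.0 F10])    # F2
```

Whether this interacts correctly with fonts Prawn itself has already registered on the page would need to be checked against the actual code.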

@swistak

swistak commented Oct 17, 2024

start_with?("#{test_key}.")

This looks like a straight-up bug, possibly resulting from a conversion from a regexp, which would be /^#{test_key}./ (. would match any character).

@pointlessone
Member

@swistak The string would be quoted before being turned into a regexp for matching. It's fine.
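
The difference between an unescaped and a quoted pattern can be checked in plain Ruby. This sketch only illustrates the point made in the two comments above; whether Prawn's code ever used a regexp here is not established in this thread.

```ruby
TEST_KEY = 'F1'

# Unescaped: the trailing "." is a wildcard and matches any character.
UNESCAPED = /^#{TEST_KEY}./
# Quoted: Regexp.escape turns "." into a literal dot.
ESCAPED = /^#{Regexp.escape("#{TEST_KEY}.")}/

puts UNESCAPED.match?('F10')        # true: "0" satisfies the wildcard
puts ESCAPED.match?('F10')          # false: a literal dot is required
puts ESCAPED.match?('F1.0')         # true
puts 'F1.0'.start_with?('F1.')      # true: same result as the quoted regexp
```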
