Edge displays “123456” in PDF but prints “114447”

danso|8 years ago

Just in case you don't click through, the bug submitter refers to a color PDF (not a scan, so image compression artifacts are not an issue) that is similar in appearance to a periodic table. That is, it is not the sequence of `123456` that is mistranslated into `114447`, but a sequence of 6 table cells, each containing a single digit.

It's not just the numbers that are misprinted, but the text inside those cells too, which suggests that Edge's PDF engine is re-rendering the original PDF, rather than printing the original PDF as is, which I thought was the entire point of using PDF in the first place.

But maybe this is an edge case? In the sense that Microsoft assumes that given a PDF file, if a user wants to "Print to PDF", the user should just save the PDF file. "Print to PDF" is ostensibly used to convert HTML/DOC into PDF format.

saurik|8 years ago

As someone who works in PDFs constantly (due to work in local government), I would say the point of a PDF is to be able to reproduce the result given a file, not to assume the file cannot be changed... you can easily edit and save PDF files using Acrobat, for example.

It is common that when you "print to PDF" you take the output of the printer and serialize that to PDF. I use this feature often on my Mac (which I think many would claim has excellent support for dealing with PDF files) to build a PDF that is stripped of any interactive forms: so as to get an output which is only the PDF "as printed".

jahewson|8 years ago

> which suggests that Edge's PDF engine is re-rendering the original PDF, rather than printing the original PDF as is

That's to be expected. The bitmap which Edge has rendered to the screen is not what will be sent to the printer driver. Instead, rich vector graphics will be sent. On Windows, the native print format is XPS, so this is most likely a bug in how Edge converts PDF to XPS for printing.

For simpler use cases, Windows' graphics APIs can be used to render both to bitmap and to XPS, but when printing something as rich and sophisticated as PDF, better results are achieved by directly targeting the native print format, such as PCL, PostScript, or XPS. I suspect that's what the Edge devs have done, and why it's producing different results on screen and in print.

cesarb|8 years ago

To make things even more interesting, the "original" PDF seems to have been generated by Ghostscript 8.15 and PScript5.dll 5.2, that is, it was also "printed to PDF" (from Microsoft Word, I presume).

stephanheijl|8 years ago

I have also encountered something similar whilst attempting to print a ticket to a major amusement park in Europe using Edge. The page the tickets were on was secured by a login mechanism, and attempting to print the tickets resulted in a page with an error. I had to save the PDF to the computer and print from there to get the proper output. It definitely seems like Edge re-renders or even re-requests the PDF before printing.

dluc|8 years ago

Nice pun, it's an edge case...

givinguflac|8 years ago

> Maybe this is an edge case?

Slow clap.

type0|8 years ago

> But maybe this is an edge case?

It is, Edge caused case

unknown|8 years ago

[deleted]

kiliancs|8 years ago

Definitely an Edge case.

nom|8 years ago

The PDF "format" never fails to amuse me. Check out the talk "OMG WTF PDF" [0] from the 27th Chaos Communication Congress, it's eye-opening.

0: https://media.ccc.de/v/27c3-4221-en-omg_wtf_pdf

tjalfi|8 years ago

You would probably like the James Mickens video "Life As A Developer: My Code Does Not Work Because I Am A Victim Of Complex Societal Factors That Are Beyond My Control"[0]. He starts talking about the Adobe PDF reader at 19:30.

[0] https://vimeo.com/180568023

H4CK3RM4N|8 years ago

I still love that the SHA-1 collision was able to change the color of an image in a PDF, due to the junk data present.

peapicker|8 years ago

Thanks for that link.

faragon|8 years ago

This reminds me of JBIG2 compression errors... [1]

[1] https://abbyy.technology/en:kb:tip:jbig2_compression_and_ocr

bloaf|8 years ago

Hence the joke in the bug report.

dualogy|8 years ago

Well.. 1+2+3+4+5+6 == 1+1+4+4+4+7 --- a bug with a sense for 'numerology'!
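
For anyone who wants to check the arithmetic, here is a quick sketch. The variable names are made up for illustration; the only inputs are the two digit sequences from the bug title.

```python
# Digits as displayed on screen vs. as printed, taken from the bug title.
displayed = [1, 2, 3, 4, 5, 6]
printed = [1, 1, 4, 4, 4, 7]

# Both sequences add up to the same total.
print(sum(displayed), sum(printed))  # 21 21

# Per-position error (printed minus displayed); the errors cancel out.
deltas = [p - d for p, d in zip(printed, displayed)]
print(deltas)  # [0, -1, 1, 0, -1, 1]
```

The equal sums are most likely a coincidence, but the per-digit differences do cancel exactly.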

EliRivers|8 years ago

This reminds me of that photocopier that changed the numbers it was copying sometimes, through a dodgy image compression algorithm.
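
As far as I understand it, that copier problem was attributed to JBIG2-style pattern matching, where glyphs that look "close enough" are stored once and reused. Below is a toy sketch of that idea, not the actual JBIG2 algorithm and not Xerox's code: the 3x5 bitmaps, the `threshold` parameter, and all function names are assumptions made up for illustration.

```python
# Toy 3x5 glyph bitmaps, flattened row by row (hypothetical, for illustration).
SIX = "111100111101111"
EIGHT = "111101111101111"
ONE = "010110010010111"

def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(x != y for x, y in zip(a, b))

def compress(glyphs, threshold):
    """Store each glyph as an index into a symbol dictionary.

    A glyph within `threshold` pixels of an existing symbol reuses that
    symbol instead of being stored, which is where digits can get swapped.
    """
    dictionary, indices = [], []
    for glyph in glyphs:
        for i, symbol in enumerate(dictionary):
            if hamming(glyph, symbol) <= threshold:
                indices.append(i)
                break
        else:
            dictionary.append(glyph)
            indices.append(len(dictionary) - 1)
    return dictionary, indices

def decompress(dictionary, indices):
    """Rebuild the page from the symbol dictionary."""
    return [dictionary[i] for i in indices]

page = [SIX, EIGHT, ONE]
dictionary, indices = compress(page, threshold=1)
restored = decompress(dictionary, indices)
print(restored == [SIX, SIX, ONE])  # True: the 8 came back as a 6
```

With `threshold=0` the same page round-trips unchanged, so the substitution only appears once the matching is allowed to be lossy.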

teilo|8 years ago

Yes, note the tongue-in-cheek reference to that problem in the article:

> (Possible workaround: Copy the document after printing using a Xerox copier.)

ungzd|8 years ago

Link: http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...

unknown|8 years ago

[deleted]

cmurf|8 years ago

The PDF goes through different rendering paths for display vs print. It's GDI+ for display, and WPF with an XPS spool file for print. So my guess is that whatever does the PDF-to-XPS filtering/conversion is getting something wrong; but it could be complicated by an additional bug in the print driver, which is why the report says the bug depends on which printer is used for printing.

reacweb|8 years ago

Printing to PDF should be independent of the printer.

askvictor|8 years ago

I've had similar issues with Chrome's PDF viewer, where it displays one number, but if I copy and paste it, it shows a different number.

wdr1|8 years ago

I wonder if it's related to when Xerox copiers changed numbers?

http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...?

hoodoof|8 years ago

x

unknown|8 years ago

[deleted]

rileytg|8 years ago

you're missing the joke

steipete|8 years ago

Maybe we really should bring https://pdfviewer.io to Windows. Looks like the default app is somewhat crap :P

Sunset|8 years ago

Why doesn't everyone use FoxitPDF in $current_year ?

caxtonmon|8 years ago

[deleted]

65 comments