PDFsharp & MigraDoc Foundation
http://forum.pdfsharp.de/

Document fields like Keywords, Title - Unicode?
http://forum.pdfsharp.de/viewtopic.php?f=2&t=1098

Author:  Minimalist360 [ Sat Mar 13, 2010 1:54 pm ]
Post subject:  Document fields like Keywords, Title - Unicode?

I am bulk creating PDF documents, and need to set the metadata.

I notice that when I set Title or Keywords to something like Włocławek, which has Unicode characters in it, I get a giant dialog box that says "Raw string contains invalid character with a value > 255" and asks me to Abort, Retry or Ignore.

Does anyone know if it's ok to put Unicode in these fields? Is it a problem with the PDF "standard" or is it something else?

Thanks in advance.

Author:  Minimalist360 [ Mon Mar 15, 2010 4:21 pm ]
Post subject:  Re: Document fields like Keywords, Title - Unicode?

/// <summary>
/// An encoder for raw strings. The raw encoding is simply the identity relation between
/// characters and bytes. PDFsharp internally works with raw encoded strings instead of
/// byte arrays because strings are much more handy than byte arrays.
/// </summary>


Really? Maybe it's handy, but it means I can't set the PDF title or keywords to anything containing Unicode.
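
For anyone wondering why "ł" trips that check: below is a minimal Python sketch of what an identity ("raw") encoding implies. It is not PDFsharp's code; `raw_encode` and `chars_above_255` are hypothetical names made up for illustration, and the error text is simply borrowed from the dialog quoted above.

```python
def raw_encode(text):
    """Identity encoding: each character becomes the byte with the same value.

    Hypothetical illustration only, not PDFsharp's implementation.
    """
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code > 255:
            # A single byte cannot hold this value, so the identity mapping breaks.
            raise ValueError(
                "Raw string contains invalid character with a value > 255: %r" % ch
            )
        out.append(code)
    return bytes(out)


def chars_above_255(text):
    """Return the characters that cannot survive an identity encoding."""
    return [ch for ch in text if ord(ch) > 255]


if __name__ == "__main__":
    # "Włocławek" written with escapes so the sample does not depend on file encoding.
    title = "W\u0142oc\u0142awek"
    print([hex(ord(ch)) for ch in chars_above_255(title)])
```

"ł" is U+0142 (decimal 322), so it does not fit in one byte, while every other letter in the word is plain ASCII.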

Author:  jeffhare [ Wed Mar 02, 2011 9:56 pm ]
Post subject:  Re: Document fields like Keywords, Title - Unicode?

Having this problem myself.

Yes, it's possible for document property fields like /Title, /Keywords, etc. to contain Unicode characters.

I haven't figured out how to make that happen through PDFsharp, however. If we want to use these object constructs, what we have to do is write the values out as hex bytes.

We've found that this:

/Title <FEFF65876863>

renders Unicode characters in the title property when the file is examined in a PDF reader. Note that it appears we have to use UTF-16 (big-endian) byte ordering, and I believe the value must start with the byte order mark "FEFF" and the entire hex string must be surrounded by angle brackets.
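
As a rough sketch of how such a value can be built by hand (plain Python, independent of PDFsharp; `pdf_hex_text_string` is a hypothetical helper name), the text is encoded as UTF-16 big-endian, prefixed with the byte order mark, and written as hex digits between angle brackets:

```python
import codecs


def pdf_hex_text_string(text):
    """Return text as a PDF hex string: '<' + hex of (BOM + UTF-16BE bytes) + '>'.

    Hypothetical helper for illustration, not part of any PDF library.
    """
    payload = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return "<" + payload.hex().upper() + ">"


if __name__ == "__main__":
    # U+6587 U+6863 are the two characters behind the example value above.
    print("/Title " + pdf_hex_text_string("\u6587\u6863"))
    # prints: /Title <FEFF65876863>
```

The same helper applied to "Włocławek" gives <FEFF00570142006F00630142006100770065006B>.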

Unfortunately, just setting the Title value to a string containing these hex bytes doesn't do the trick. It is written to the file like this (note the parentheses, which I'm guessing make this a literal string):

/Title (<FEFF65876863>)
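
To show why the parentheses matter, here is a simplified Python sketch of how a reader might interpret the two forms. `decode_pdf_string_token` is a hypothetical name, and the sketch makes assumptions: it ignores backslash escapes in literal strings, expects an even number of hex digits, and uses Latin-1 as a stand-in for PDFDocEncoding.

```python
def decode_pdf_string_token(token):
    """Decode a PDF string token into text (simplified, for illustration only)."""
    if token.startswith("<") and token.endswith(">"):
        # Hex string: the digits between the angle brackets are the bytes.
        data = bytes.fromhex(token[1:-1])
    elif token.startswith("(") and token.endswith(")"):
        # Literal string: the characters between the parentheses are the bytes.
        # Assumption: no backslash escapes, and every character is <= 255.
        data = token[1:-1].encode("latin-1")
    else:
        raise ValueError("not a PDF string token")

    if data.startswith(b"\xfe\xff"):
        # A leading byte order mark signals UTF-16 big-endian text.
        return data[2:].decode("utf-16-be")
    # Assumption: Latin-1 approximates PDFDocEncoding for this demo.
    return data.decode("latin-1")


if __name__ == "__main__":
    print(decode_pdf_string_token("<FEFF65876863>"))    # the two Unicode characters
    print(decode_pdf_string_token("(<FEFF65876863>)"))  # the 14 ASCII characters as typed
```

In the parenthesized form the angle brackets and hex digits are just ordinary characters, so the reader shows them verbatim instead of decoding them.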

Is there some other way to set this Title property that would allow it to be written as just the hex byte string I built?

-Jeff

All times are UTC