DonationCoder.com Software > N.A.N.Y. 2014

N.A.N.Y. 2014 Submission: sumatra_earmarks

ewemoa:
Regarding page numbering in PDFs, I'd been noticing that the page number shown on the pages of some PDFs didn't match what applications were displaying.  At some point, that led to some digging in the specifications -- perhaps you are already aware, but FWIW there's a bit of somewhat introductory material about this at:

  http://www.w3.org/WAI/GL/WCAG20-TECHS/PDF17.html

(Relevant sections: Description and Example 2)

More details are in "12.4.2 Page Labels" in the PDF 1.7 spec it seems.


Haven't tested the new version yet but hope to -- must find appropriate PDF first :)

Nod5:
Yes, there can be two mismatches I think:
1. the page numbers in the pdf page margins (like on a physical book page) differ from the number in the Sumatra toolbar page number edit box.
2. the number in the Sumatra toolbar page number edit box differs from the "count number" of the current page within the pdf file (first number in the parenthesis after the edit box in my screenshot below).

s_e uses whatever is in the toolbar edit box. The latest update handles earmarks for pages like this.

The spec you linked also shows a special appendix page number format (A-1 ...). But I haven't seen that in any pdf file yet.

ewemoa:
I tried earmarking the first three entries in the bookmarks for Free as in Freedom 2.0.

The three things listed in the popup are (in order):

1.1
00v
vii

I expected to see:

v
vii
1.1

My bad -- tested with an older version.

What I see with 131108 is:

i5
i7
 i (preceded by space?)

For each case, the grey area of the popup shows:

v i5
vii i7
1.1 i


On a side note, just learned that there's a book on PDF from O'Reilly - Developing with PDF - Dive Into the Portable Document Format - with the following blurb:

PDF is becoming the standard for digital documents worldwide, but it’s not easy to learn on your own. With capabilities that let you use a variety of images and text, embed audio and video, and provide links and navigation, there’s a lot to explore. This practical guide helps you understand how to work with PDF to construct your own documents, troubleshoot problems, and even build your own tools.

--- End quote ---

Don't know much about it, but FWIW.

Nod5:
What I see with 131108 is:
i5
i7
 i (preceded by space?)

For each case, the grey area of the popup shows:
v i5
vii i7
1.1 i
-ewemoa (November 12, 2013, 05:16 AM)
--- End quote ---

This is as planned, except for the page numbered "1.1". s_e doesn't recognize that format. That pdf has page numbering in the following order: 1 2 i ii iii ... xiv 1.1 2.1 3 4 5 6 7 ... 229
I may tweak s_e to handle the 1.1 pages better. But the 1 and 2 at the start are a challenge.

Coding background: I've indexed page numbers with roman numerals as the numeral value minus 100. E.g. vii --> 7 - 100 = -93. That gets the ordering right for the next/prev jumps. Earmarks at i, vii, 2, 6, 12 would make up the index -99, -93, 2, 6, 12. So if I'm at page i (-99) and jump to the next earmark, s_e would correctly jump to vii (-93), which is the next item to the right in the index. I think I can index 1.1 as 0.1 and get the right ordering. But fitting the first 1 in that pdf into this way of doing things is trickier. A more general worry is that people may in practice use various different page naming schemes. A general fix would have to, for each pdf, first read *all* its page numbers into an index/array in order and, when the user earmarks a page, mark that page in the index. For that I'll first need to find a way to read all page numbers of any pdf using autohotkey.
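
For readers who want to see the "value minus 100" idea in code, here is a minimal sketch. It is written in Python for illustration only (s_e itself is an AutoHotkey script), and the function names `roman_to_int`, `sort_key` and `next_earmark` are made up for this example. It assumes labels are either plain arabic numbers or roman numerals; anything else, such as "1.1", is rejected rather than guessed at.

```python
import bisect
import re

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(label):
    """Convert a roman numeral (any case) to its integer value."""
    total = 0
    prev = 0
    # Walk right to left; a smaller value before a larger one is subtracted.
    for ch in reversed(label.lower()):
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return total


def sort_key(label):
    """Map a page label to a number that sorts roman pages before arabic ones."""
    if re.fullmatch(r"[ivxlcdm]+", label.lower()):
        return roman_to_int(label) - 100  # e.g. vii -> 7 - 100 = -93
    if re.fullmatch(r"\d+", label):
        return int(label)
    # Assumption: other formats (like "1.1") are simply not handled here.
    raise ValueError("unrecognized page label: " + repr(label))


def next_earmark(earmarks, current):
    """Return the earmark that follows `current` in index order, or None."""
    ordered = sorted(earmarks, key=sort_key)
    keys = [sort_key(e) for e in ordered]
    pos = bisect.bisect_right(keys, sort_key(current))
    return ordered[pos] if pos < len(ordered) else None


marks = ["12", "vii", "2", "i", "6"]
print(sorted(marks, key=sort_key))  # ['i', 'vii', '2', '6', '12']
print(next_earmark(marks, "i"))     # vii
```

The offset of 100 only works while the front matter has fewer than 100 roman-numbered pages, which is probably fine for most books.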

edit: the pdftk command "dump_data" gives useful output. The relevant bit for faif-2.0.pdf :

PageLabelNewIndex: 1
PageLabelStart: 1
PageLabelNumStyle: DecimalArabicNumerals
PageLabelNewIndex: 3
PageLabelStart: 1
PageLabelNumStyle: LowercaseRomanNumerals
PageLabelNewIndex: 17
PageLabelStart: 1
PageLabelNumStyle: DecimalArabicNumerals
But to work with this I'd have to recode much of s_e and add pdftk.exe as a dependency. And I'm not sure if pdftk dump_data would work on all pdf files. And pdftk cannot handle djvu files. So I hesitate to go there.
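
As a rough illustration of the "read all page numbers into an array" idea, the sketch below expands dump_data output like the lines above into one label per physical page. It is Python for illustration, not AutoHotkey, and it only parses text that has already been captured; it does not run pdftk. The names `parse_page_label_ranges` and `expand_labels` are hypothetical, the page count is passed in by hand, and only the two number styles shown above are handled (anything else falls back to plain decimal).

```python
SAMPLE_DUMP = """\
PageLabelNewIndex: 1
PageLabelStart: 1
PageLabelNumStyle: DecimalArabicNumerals
PageLabelNewIndex: 3
PageLabelStart: 1
PageLabelNumStyle: LowercaseRomanNumerals
PageLabelNewIndex: 17
PageLabelStart: 1
PageLabelNumStyle: DecimalArabicNumerals
"""

ROMAN_PAIRS = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
               (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
               (5, "v"), (4, "iv"), (1, "i")]


def int_to_roman(n):
    """Convert a positive integer to a lowercase roman numeral."""
    parts = []
    for value, symbol in ROMAN_PAIRS:
        while n >= value:
            parts.append(symbol)
            n -= value
    return "".join(parts)


def parse_page_label_ranges(dump_text):
    """Collect the PageLabel* lines into a list of range dicts."""
    ranges = []
    for line in dump_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "PageLabelNewIndex":
            # A new range starts at this physical page (1-based).
            ranges.append({"index": int(value), "start": 1,
                           "style": "DecimalArabicNumerals"})
        elif key == "PageLabelStart" and ranges:
            ranges[-1]["start"] = int(value)
        elif key == "PageLabelNumStyle" and ranges:
            ranges[-1]["style"] = value
    return sorted(ranges, key=lambda r: r["index"])


def expand_labels(ranges, num_pages):
    """Return a list where item p-1 is the label of physical page p."""
    labels = [str(p) for p in range(1, num_pages + 1)]  # default: page count
    for pos, rng in enumerate(ranges):
        last = ranges[pos + 1]["index"] - 1 if pos + 1 < len(ranges) else num_pages
        for page in range(rng["index"], min(last, num_pages) + 1):
            n = rng["start"] + (page - rng["index"])
            if rng["style"] == "LowercaseRomanNumerals":
                labels[page - 1] = int_to_roman(n)
            else:
                # Assumption: every other style is treated as decimal here.
                labels[page - 1] = str(n)
    return labels


labels = expand_labels(parse_page_label_ranges(SAMPLE_DUMP), 20)
print(labels[:4])     # ['1', '2', 'i', 'ii']
print(labels[15:18])  # ['xiv', '1', '2']
```

With such a list, an earmark could be stored as a position in the list instead of as a label, so the repeated "1" and "2" labels in this pdf would no longer need special ordering rules.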

I chose to translate the roman i ii iii ... vii into a made-up format i1 i2 i3 ... i7 to avoid a lot of column spacing in the grid view if someone earmarks a long numeral such as xxviii.
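
A small sketch of that display translation, again in Python rather than AutoHotkey and with a made-up function name (`compact_label`). It assumes any label made up only of roman numeral letters is a roman page number and leaves everything else untouched.

```python
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(label):
    """Convert a roman numeral (any case) to its integer value."""
    total = 0
    prev = 0
    for ch in reversed(label.lower()):
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return total


def compact_label(label):
    """Shorten roman labels to 'i' + value; return other labels unchanged."""
    if label and all(ch in ROMAN_VALUES for ch in label.lower()):
        return "i" + str(roman_to_int(label))
    return label


for shown in ["v", "vii", "xxviii", "12"]:
    print(shown, "->", compact_label(shown))
# v -> i5, vii -> i7, xxviii -> i28, 12 -> 12
```

The compact form never needs more than a few characters, whereas the roman original can grow to six or more letters.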

kyrathaba:
Nod5, I'm impressed with the sustained effort you're putting into your entry. Kudos!