

Accessing articles from many pdfs


tsaint:
I download and keep a newspaper each day, as a text-searchable PDF.
I have two pieces of software for doing a keyword or phrase search across the year's collection of PDFs.
My problem, though, is that this isn't sufficient for keeping track of just some articles on a particular topic. E.g. I can locate all articles on renewable energy, but I'm only really interested in 5% of them.
So I want to end up with a list of links, preferably with comments, that lets me jump to the relevant PDF articles.
I don't want to do heaps of copy/pasting into a notes database.

Three approaches spring to mind:
1. A sticky notes program that would allow attaching a note to a specific page in a given PDF, and then searching its database of notes.
2. Bookmarking articles or pages in the PDF, then somehow extracting all bookmarks from all pages and using some other (unknown to me) s/ware to manage those. ChatGPT suggested a Python script to extract bookmarks, but it's over my head.
3. Some info management s/ware that allows linking to specific locations in local PDFs. I saw that Excel allows this if you use Acrobat and, within that, copy the page number. I use PDF-XChange and can't seem to do that.
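
For approach 2, the bookmark extraction could look roughly like the sketch below. It is not the script ChatGPT suggested (that one isn't shown in this thread). It assumes the third-party pypdf library is installed and that the bookmarks are saved as a standard PDF outline. The function names and CSV columns are made up for illustration.

```python
import csv
import sys
from pathlib import Path


def flatten_outline(outline, page_of, depth=0):
    """Yield (depth, title, page) for a nested pypdf-style outline.

    A nested list holds the child bookmarks of the entry just before it.
    page_of is a callable that maps a bookmark to its page number.
    """
    for item in outline:
        if isinstance(item, list):
            yield from flatten_outline(item, page_of, depth + 1)
        else:
            yield depth, item.title, page_of(item)


def list_bookmarks(pdf_path):
    """Return [(depth, title, 1-based page or None)] for one PDF."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(str(pdf_path))

    def page_of(dest):
        # pypdf reports 0-based page indexes; None if the target is missing
        index = reader.get_destination_page_number(dest)
        return None if index is None else index + 1

    return list(flatten_outline(reader.outline, page_of))


def write_bookmark_index(folder, out_csv):
    """Collect the bookmarks of every PDF in folder into one CSV file."""
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "page", "depth", "bookmark"])
        for pdf in sorted(Path(folder).glob("*.pdf")):
            for depth, title, page in list_bookmarks(pdf):
                writer.writerow([pdf.name, page, depth, title])


if __name__ == "__main__" and len(sys.argv) == 3:
    write_bookmark_index(sys.argv[1], sys.argv[2])
```

Saved under a hypothetical name such as bookmarks.py, it would be run as `python bookmarks.py <folder of pdfs> bookmarks.csv`, and the resulting CSV opens in Excel or any spreadsheet for sorting and filtering.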

Probably there's a better way, as I'm sure this is not a unique want. Any ideas would be appreciated.
Thanks
Tony

rjbull:
I'm not sure this is a helpful bit of lateral thinking, but had you considered using XPDF to convert your PDFs to text?  Text would be a much more tractable format, easier to search, edit and bookmark.
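
A rough sketch of that idea in Python, assuming the `pdftotext` command-line tool that ships with Xpdf is on the PATH. It relies on pdftotext's default behaviour of putting a form feed character at the end of each page, which lets a phrase hit be mapped back to a page number. The function names are invented for the example, and the `#page=N` suffix on the link is an open parameter that Acrobat and most browser PDF viewers understand; other viewers may ignore it.

```python
import subprocess
from pathlib import Path


def pdf_to_text(pdf_path):
    """Run pdftotext on one file and return its text ('-' means stdout)."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", str(pdf_path), "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def pages_matching(text, phrase):
    """Return the 1-based numbers of the pages that contain phrase.

    Pages are separated by form feeds. Whitespace is collapsed so a
    phrase broken across two lines still matches; case is ignored.
    """
    needle = " ".join(phrase.lower().split())
    hits = []
    for number, page in enumerate(text.split("\f"), start=1):
        if needle in " ".join(page.lower().split()):
            hits.append(number)
    return hits


def search_folder(folder, phrase):
    """Yield a link to every page in every PDF that mentions phrase."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        for page in pages_matching(pdf_to_text(pdf), phrase):
            yield f"{pdf.resolve().as_uri()}#page={page}"
```

This still returns every hit, so it doesn't solve the "only 5% are interesting" problem by itself, but the page-level links give a short list to prune by hand.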

tsaint:
Thanks for taking the time to reply, RJ ... it's a possibility worth pursuing, and I agree text is more tractable.
I'm still hopeful of a PDF-only solution though.
I know I can insert text into a PDF referring to the article I want to keep track of, and atm that seems the easiest option.
I was hoping my favoured PDF editor's bookmarks, comments or sticky notes would be searchable globally (e.g. by DocFetcher Pro or AnyTXT Searcher), but it seems I'm out of luck there.

rjbull:
RightNote has features to deal with PDFs.  From the Help file:

Indexing settings

The professional version of RightNote allows you to index attachments and links of the following file types:
.txt, .rtf, .htm/html, .doc/docx, .xls, .csv, .pdf

In the options dialog under the Indexing settings section, you can select which file types you want to be indexed by default. Further, every note has its own setting which will override the default settings.

For example, you may want to set pdf files to not be indexed by default, since these can often be large files, and you generally do not need the contents of these files to be indexed. If you then need a specific pdf file to be indexed, you can adjust the setting in the attachment viewer.

---------

Attachment note type

The attachment note type allows you to store any type of file in a RightNote database. For example you can store MS Word documents, Excel files and PDF Documents. If supported, the contents of the file will be indexed and made searchable.

Currently the following file types will be indexed:

txt, rtf, htm, html, doc, docx, xls, pdf.

You can open/view the file by clicking on the Open File link in the viewer. This will open the file with the default associated application for the file type. For example, an xls file will be opened by MS Excel (if it is installed); a doc file will be opened by MS Word.

[...]
Note:

When you open an attachment, you are viewing a copy of the original document. If you make changes to the attachment, this will not affect the original source document.

---------

Link note type

The link note type allows you to create a link to any type of file on your computer. For example you can create links to MS Word documents, Excel files and PDF Documents. If supported, the contents of the file will be indexed and made searchable.

Currently the following file types will be indexed:

txt, rtf, htm, html, doc, docx, xls, pdf.

Note:

A link note does not store the actual contents of the file in the RightNote database. It simply points to a file on the file system (or internet url). If you open the link, you will be opening the source file pointed to by the link url.

---------
--- End quote ---

Warnings:
   A) I haven't tried it
   B) Features only in the Professional version, i.e. the payware one

tsaint:
Thanks RJ. I'll investigate, but probably wouldn't spend the money.
I found out that PDF-XChange does indeed allow global searching of its bookmarks and comments (sticky notes), and saving the search.
While I'd like to have a notes-type program that allows linking to those without importing the PDFs themselves, I'm happy enough to go with what I can do.
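
One low-tech way to get that kind of linked notes list without importing the PDFs is a plain CSV file of (pdf path, page, comment) rows turned into an HTML page of links. The sketch below is only an illustration under assumptions: the three-column CSV layout and the function names are invented here, and the `#page=N` suffix is honoured by Acrobat and most browser PDF viewers but is not confirmed for PDF-XChange.

```python
import csv
import html
from pathlib import Path


def link_for(pdf_path, page):
    """Build a file:// link that asks the viewer to open at a given page."""
    return f"{Path(pdf_path).resolve().as_uri()}#page={int(page)}"


def render_index(rows):
    """Turn (pdf_path, page, comment) rows into an HTML list of links."""
    items = []
    for pdf_path, page, comment in rows:
        url = html.escape(link_for(pdf_path, page), quote=True)
        label = html.escape(f"{Path(pdf_path).name}, p. {page}")
        note = html.escape(comment)
        items.append(f'<li><a href="{url}">{label}</a>: {note}</li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"


def build_index(notes_csv, out_html):
    """Read the notes CSV (no header row) and write the HTML index."""
    with open(notes_csv, newline="", encoding="utf-8") as fh:
        rows = [tuple(row) for row in csv.reader(fh) if len(row) == 3]
    Path(out_html).write_text(render_index(rows), encoding="utf-8")
```

Adding an article then means typing one line into the CSV (path, page number, comment) and re-running `build_index`; the PDFs themselves stay where they are.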
