Author Topic: Searching inside of Open Document files?  (Read 4733 times)

bastik

  • Supporting Member
  • Joined in 2012
  • **
  • Posts: 14
Searching inside of Open Document files?
« on: February 27, 2012, 12:08 PM »
Problem: Searching inside of Open Document files.

Goal: Search through user-specified .odt files for user-specified string(s) and list the files that match.

Does anybody know a tool that's capable of doing that? (o3find worked, but only with the old binary files.)
In case no such tool exists (at least I was not able to find anything useful), I have included some information so that someone might be able to code something.

Don't treat it as a request!

Requirements:
- Ability to search through Open Document Text files (.odt)
- Handle multiple files at once

Optional:
- Support all Open Document Files
- Support MS Office documents and .txt files
- Portable
- Support RegEx
- Be a plugin for FARR
- Tell the user which files are encrypted and therefore could not be searched

Background:
Open Document files are zipped and are therefore ignored by ordinary content searches. The tool therefore needs to unpack the files and read the contents of content.xml.

I don't think it has to parse the XML.
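The idea described above (open each .odt as a zip archive, read content.xml, and search it without a full XML parse) could be sketched roughly as follows in Python. This is only a minimal sketch, not an existing tool: the function names `search_odt` and `search_many` are made up for illustration, and the encryption check rests on the assumption that encrypted Open Document files list `encryption-data` entries in META-INF/manifest.xml. Because the raw XML is searched, a pattern could also match tag or attribute names.

```python
import re
import zipfile


def search_odt(path, pattern):
    """Return 'match', 'no match', 'encrypted' or 'unreadable' for one file.

    `pattern` is a regular expression applied to the raw content.xml text.
    """
    try:
        with zipfile.ZipFile(path) as archive:
            names = archive.namelist()
            # Assumption: encrypted ODF files declare <manifest:encryption-data>
            # elements in the manifest; content.xml is then not plain text.
            if "META-INF/manifest.xml" in names:
                manifest = archive.read("META-INF/manifest.xml")
                if b"encryption-data" in manifest:
                    return "encrypted"
            # archive.read raises KeyError if content.xml is missing.
            text = archive.read("content.xml").decode("utf-8", errors="replace")
    except (zipfile.BadZipFile, KeyError, OSError):
        return "unreadable"
    return "match" if re.search(pattern, text) else "no match"


def search_many(paths, pattern):
    """Search several files at once and map each path to its result."""
    return {str(p): search_odt(p, pattern) for p in paths}
```

For example, `search_many(Path("docs").glob("*.odt"), r"invoice")` (after `from pathlib import Path`, with "docs" being a hypothetical folder) would return a dictionary that shows which files matched and which were skipped as encrypted or unreadable.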

dantheman

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 742
  • Be good if you can!
Re: Searching inside of Open Document files?
« Reply #1 on: February 27, 2012, 06:56 PM »
FileLocator Lite (freeware) or the Pro version should help with some of your needs.
http://www.mythicsof...orlite&page=home

fenixproductions

  • Honorary Member
  • Joined in 2006
  • **
  • Posts: 1,186
Re: Searching inside of Open Document files?
« Reply #2 on: February 27, 2012, 07:18 PM »
@bastik
I am using Total Commander with plugins for such tasks :)
But... since I once wrote a plugin for Office files, I may be able to think of something.

BTW Parsing is not needed but nice tags stripping is.

iphigenie

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 1,170
Re: Searching inside of Open Document files?
« Reply #3 on: February 28, 2012, 03:22 AM »
Not a plugin for FARR yet, but that could probably be arranged.

I use the open-source tool DocFetcher: http://docfetcher.so...ge.net/en/index.html
* has a portable version
* can index as a one-off or keep updating constantly (maybe not in the portable version)
* supports all the formats you mention, and Unicode
* supports regular expressions for searching and also for excluding files from indexing

It doesn't do archives yet, and I'm not sure whether it can warn about encryption. Worth a test, though :)

bastik

  • Supporting Member
  • Joined in 2012
  • **
  • Posts: 14
Re: Searching inside of Open Document files?
« Reply #4 on: February 28, 2012, 01:08 PM »
Thanks to all of you. Strange that I didn't find them myself.

All I need(ed) is/was something to search inside the files. I don't need that often, just once in a while. I will look into the different solutions/tools.

BTW Parsing is not needed but nice tags stripping is.
-fenixproductions (February 27, 2012, 07:18 PM)
I assumed that something like that would be useful. I did not check the tags, but assumed that they would contain information that should not be found by the search.

it doesnt do archives yet, not sure if it can warn of encryption
My description isn't good. Basically, an .odt is a .zip containing the data. Re-reading what I wrote, it sounds as if I had zipped them myself. The error message for encrypted files was something I thought would be an interesting feature, but it is not required.

fenixproductions

  • Honorary Member
  • Joined in 2006
  • **
  • Posts: 1,186
Re: Searching inside of Open Document files?
« Reply #5 on: February 29, 2012, 04:32 PM »
BTW Parsing is not needed but nice tags stripping is.
-fenixproductions (February 27, 2012, 07:18 PM)
I assumed that something like that would be useful. I did not check the tags, but assumed that they would contain information that should not be found by the search.
It is rather a case of stripping too much.

E.g.
<tag1>this is</tag1><tag2>test</tag2> may be shown in Word as "this is test", but when stripped wrongly it becomes "this istest".
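To illustrate the difference with the example above, here is a small Python sketch. It is only one possible approach and not how any of the tools mentioned in this thread work: the helper names `strip_naive` and `strip_spaced` are invented for this post. The naive version deletes the tags outright, while the second replaces each tag with a space and then collapses runs of whitespace.

```python
import re

# Matches anything that looks like an XML tag, e.g. <tag1> or </tag1>.
TAG = re.compile(r"<[^>]+>")


def strip_naive(xml):
    """Delete tags outright; text from adjacent elements runs together."""
    return TAG.sub("", xml)


def strip_spaced(xml):
    """Replace each tag with a space, then collapse repeated whitespace."""
    text = TAG.sub(" ", xml)
    return re.sub(r"\s+", " ", text).strip()


sample = "<tag1>this is</tag1><tag2>test</tag2>"
print(strip_naive(sample))   # this istest
print(strip_spaced(sample))  # this is test
```

The trade-off is that a word with formatting applied to only part of it (say, one bold letter) would be split by the inserted space, so a more careful stripper would probably add separators only at paragraph-level tags.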