I downloaded InfoRapid, so I have two now to try out. As for regex, I really think that for common users the function in an editor needs to be as simple as the easy button in the Staples commercial: check a box, like the one for (exact match), except this box would say (closest match) or (nearest match), or perhaps (exact match and near match together). That takes all the tedium out of regex, which otherwise leaves common users completely in the dark. I have to edit thousands of pages of results from 30 years of bicycle racing, with some riders' names misspelled many times over the years. If I miss a misspelled name, then I might miss the results for that race, which is not acceptable. So instead of Luperini, the safest search might be just (Lup), which will probably bring back much more than I want in the search results. These archives are on my hard drive, so searching them is extremely fast, if I have the right utility.
Some riders' names, like Pucinskaite, I have heard pronounced Pushing-Sky-Ah, and some pronounce it Pushing-Scooty. So when Euro journalists, writers and editors write names, they sometimes guess and spell them how they sound to them, or how they think they would sound in the native language, if they do not have the correct spelling in hand. For instance, the Belarusian champion Zinaida Stahurskaia, who was recently busted for selling steroids, has always had her name spelled two different ways, and no one really seems to know how it's spelled, as it's always Stahurskaia or Stahurskaya on the Internet. The name is literally spelled hundreds of times both ways on Google, so it's impossible to know which is right.
Now the professional rider Jeannie Longo would be a good example. Of course I could just use her last name, Longo. But for the sake of shining some light on this, if I use the first name, sometimes it's spelled Jeannie, Jeanie, Jeanne, and so on.
Add to that, many Euro names use their own accent marks on certain letters, like these French characters, which wreaks havoc on search functions. Magali Le Floc'h might just be spelled Magali Le Floch, or LeFloch. Letters like ù, é, ê, Â are used in French names, and other accents are used in other languages. Often when editors write the names in English they just use regular letters, but sometimes they don't. What you are left with is sometimes not only misspelled, but also contains these little devils, which really throw a wrench in the works. So what is needed is for the editor to bring back close matches that include these little devils, as well as matches without them. Otherwise I will miss their names! The accented spellings should not be left out of the close matches!
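To make the accent problem concrete, here is a rough sketch in Python of how a utility might treat accented and plain spellings as the same name. The helper name `fold` is made up for illustration, and this is only one possible approach, not how InfoRapid or any particular editor actually works. It uses the standard `unicodedata` module to split each accented letter into the plain letter plus its accent mark, then throws the marks away, along with apostrophes and spaces.

```python
import unicodedata

def fold(name):
    """Hypothetical helper: reduce a name to plain lowercase letters and digits."""
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks (the accents themselves).
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Drop apostrophes, spaces and other punctuation; ignore case.
    return "".join(ch for ch in no_accents.lower() if ch.isalnum())

# The three spellings from the paragraph above.
variants = ["Magali Le Floc'h", "Magali Le Floch", "Magali LeFloch"]
folded = {fold(v) for v in variants}
print(folded)
```

All three spellings fold to the same string, so a search on the folded text would find the name with or without the accents and apostrophe.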
Some of these names are misspelled many ways, but usually the misspellings are close, because the names are complex, like Polikeviciute, Zabelinskaya, Polkhanova, Vzesniauskaite. Now add some accent marks on top of the misspellings and you have alphabet soup!
Now the final wrench in the works! When I scan with OCR in Omnipage Pro, it does a good job, but it also misspells names, which adds to the fog of war here. So if an editor could be clever enough to bring back close matches, or even gross matches as a last resort, that would still be better than the nonsense I am going through now!
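For anyone wondering what a "close match" box might do under the hood, here is a minimal sketch in Python using the standard `difflib` module. The function name `close_matches`, the 0.8 cutoff and the sample line of results are all assumptions made up for illustration; a real utility would likely need tuning and might use a different similarity measure altogether.

```python
import difflib
import re

def close_matches(target, text, cutoff=0.8):
    """Hypothetical helper: return words in text that are spelled close to target."""
    # Collect every distinct run of letters (accented letters included).
    words = set(re.findall(r"[^\W\d_]+", text))
    wanted = target.lower()
    # Keep words whose similarity ratio (0.0 to 1.0) reaches the cutoff.
    return sorted(
        w for w in words
        if difflib.SequenceMatcher(None, wanted, w.lower()).ratio() >= cutoff
    )

# Made-up sample line with two spellings of each name.
sample = "1. Stahurskaia 2. Stahurskaya 3. Luperini 4. Lupernii 5. Longo"
print(close_matches("Stahurskaia", sample))
print(close_matches("Luperini", sample))
```

Lowering the cutoff brings back grosser matches, at the price of more unrelated words in the results.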
So you can see the need for an easy button like Staples! The easy button is, besides the box for exact match, a box that says close match! (misspellings, accents and all!)
thanks,
Bruce