Thanks for another great writeup! I just want to add my 0.02, um, Euro.
I've been wondering if there is any underlying regularity to the division between the excellent and not-so-excellent implementations of live search.
From your overview it seems that live search implementations tend to be better in programs that are not designed to deal with huge amounts of data. It makes sense. You can add instant search to *any* application, but not all applications can realistically be expected to perform it in a truly instant manner. It's one thing to scan an address book or a list of bookmarks - a typical user will not have more than a few dozen addresses or a few hundred bookmarks. And it's an entirely different thing to search through a multi-megabyte database, as will be the case with TheBat or MyBase.
I've never used MyBase, but I now have 240 MB of data in TheBat mailboxes. I don't think any search algorithm, no matter how good, will run through that much text as fast as you can type. So I think it's not a case of some programmers being lazier than others, or falling behind the curve. It's often the nature of the database that dictates what's feasible. In TheBat, if you didn't have to press Enter to initiate searching, you'd experience a brief "freeze" after typing each character - and that would be a much worse usability experience; the program would feel clunky.
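That said, there is one partial mitigation: each keystroke only *extends* the query, so a search-as-you-type box can narrow the previous hit list instead of rescanning everything from scratch. Here's a rough sketch in Python - the class and names are mine, and of course this is not how TheBat actually works:

```python
class IncrementalSearch:
    """Search-as-you-type over a list of strings, narrowing from prior results."""

    def __init__(self, items):
        self.items = items
        self.cache = {}  # query -> list of matching indices

    def search(self, query):
        if query in self.cache:
            return self.cache[query]
        # Start from the hits of the longest cached prefix, not the whole list.
        candidates = range(len(self.items))
        for i in range(len(query) - 1, 0, -1):
            if query[:i] in self.cache:
                candidates = self.cache[query[:i]]
                break
        hits = [i for i in candidates if query in self.items[i]]
        self.cache[query] = hits
        return hits
```

Once the user has typed a few characters, the candidate set usually shrinks fast, so later keystrokes get cheaper rather than staying proportional to the full 240 MB.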
You can experience that clunky feeling in some blogs that use live search (there are live search add-ons for TextPattern and, I think, WordPress). That doesn't work too well. Each keypress requires a trip to the server, running a query on the DB and returning the results, so both the network latency and DB performance come into play. If the blog has only a couple of entries, the result may be acceptable, but if you're searching through years of archives, you press a key, then wait until the results refresh, then press another key - and you actually lose time if you make a typo and have to start over.
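One standard way to soften that - I don't know whether any of those add-ons do it - is to debounce: only fire the query once the user pauses typing. A minimal sketch, with a made-up Debouncer class:

```python
import threading

class Debouncer:
    """Delay a call until `delay` seconds pass without a newer call arriving."""

    def __init__(self, delay, fn):
        self.delay = delay
        self.fn = fn
        self._timer = None

    def call(self, *args):
        if self._timer is not None:
            self._timer.cancel()  # a newer keystroke supersedes the pending one
        self._timer = threading.Timer(self.delay, self.fn, args)
        self._timer.start()
```

With a 200-300 ms delay, a fast typist triggers roughly one server round trip per pause instead of one per character, which takes both the latency and the DB load out of the typing loop.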
Evernote seems to be an exception to this rule, perhaps because, as you observe, it must have been built pretty much around the live search feature. Perhaps it also uses a particularly efficient indexing system. But it would be interesting to know the size of the data each of the applications you tested had to deal with.
Data size is not the only factor, of course. There's also the format in which data is stored. Let me use KeyNote as an example. KeyNote stores notes as RTF; there really is no other way. You cannot search through RTF directly, because of all the embedded formatting codes and character encodings. The only thing I could do at the time was to sequentially load each note into an (invisible) richedit control and use the control's built-in search function. This is, of course, slow. On today's fast machines you won't notice any bothersome delay, but still the search isn't instant. The only way around it would be to store each note twice: the original RTF, and a "cleaned-up" plain text version. That way I could use the plain text version for searching. This is indeed what I'm aiming to do at the moment, but it requires a different storage mechanism, and of course it bloats the file size.
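For the curious, the "cleaned-up copy" idea amounts to something like the crude sketch below. A real converter has to handle the full RTF grammar - nested groups, font tables, Unicode escapes and so on - which this regex hack most definitely does not; it just shows why the plain-text shadow copy is so much easier to search:

```python
import re

def rtf_to_plain(rtf: str) -> str:
    """Very crude RTF-to-text: strip control words and group braces."""
    text = re.sub(r"\\'[0-9a-fA-F]{2}", "?", rtf)   # hex-escaped characters
    text = re.sub(r"\\[a-zA-Z]+-?\d* ?", "", text)  # control words like \b, \par
    text = text.replace("{", "").replace("}", "")
    return re.sub(r"\s+", " ", text).strip()

# Store the pair once at save time; all searching then hits only the text side.
def store_note(rtf: str) -> dict:
    return {"rtf": rtf, "plain": rtf_to_plain(rtf)}
```

The search routine never touches the RTF again, which is what buys the speed - at the cost of roughly doubling the stored text, as mentioned above.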
Another factor is whether the application uses Unicode. There are search routines written in assembly that perform amazingly fast on ANSI strings, but I've yet to see a library that handles Unicode equally fast, at least in Delphi land. (TheBat supports Unicode.) Just the fact that you may have to deal with a variable number of bytes per character slows you down enormously, and of course with encodings like UTF-16 you have to read through twice as much raw data.
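The raw-data point is easy to see in a few lines (Python here, but the arithmetic is the same in Delphi or anywhere else):

```python
def byte_lengths(s: str):
    """Character count vs. encoded byte counts in UTF-8 and UTF-16."""
    return len(s), len(s.encode("utf-8")), len(s.encode("utf-16-le"))

byte_lengths("naïve")  # → (5, 6, 10): 5 chars, 6 bytes UTF-8, 10 bytes UTF-16
```

In UTF-8 the byte count per character varies (1 to 4), so a fast byte-scanning routine can't assume fixed strides; in UTF-16 the stride is mostly fixed but, for largely-ASCII text, you're pushing twice the bytes through the scanner.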
Finally, the feasibility of live search (and other neat things) will depend on the storage model. We can pretty safely assume that a lightweight program like PowerMarks loads all the data into RAM and keeps it there. KeyNote does the same, as I'm sure do most address books. (I'd like to know about Evernote.) It's a reasonable behavior for a bookmark manager, but, as it happens, it turns out not to have been reasonable for KeyNote. I found that out when questions started to pour in from people who had created 30- or 40-MB files in KeyNote and were wondering why loading those files took so long. But if your storage model is disk-bound, and each piece of data is read from disk only when needed, this too will have a huge impact on how you can search, and how fast.
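To illustrate the disk-bound alternative: with something like SQLite underneath, only the rows a query touches get read, and startup time no longer grows with file size. A toy sketch - SQLite is my stand-in here, I have no idea what Evernote or TheBat actually use:

```python
import sqlite3

# ":memory:" stands in for a database file on disk; with a real file, the
# whole store is never loaded at once, so a 40 MB file opens instantly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("shopping list",), ("live search notes",), ("meeting minutes",)],
)

def search(term: str) -> list:
    """Return ids of notes containing `term`; reads only what the query needs."""
    cur = conn.execute("SELECT id FROM notes WHERE body LIKE ?", (f"%{term}%",))
    return [row[0] for row in cur]
```

The trade-off is exactly the one described above: each search now costs disk I/O instead of a RAM scan, so without an index it can be slower per query even though opening the file is instant.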
Sorry for the long-windedness... I'm still avoiding posting to the famous Notetakers thread, because I've been thinking about this for the last five years or so and I could write volumes, but it would mostly be about what would be great to have versus what I think is realistically possible.