SuperboyAC's DC blog #2 (Live Search feature in software)

WhiteLion:
Awesome find Babis!

Ava Find's interface looks to be right up my alley. :Thmbsup:
I find DT Search's gui to be a bit clunky.

I can't wait to use it. It has been indexing for an hour & still is going. I have 3 HDs so it's expected. It is a pricy one but it looks like it may be just worth overlooking the expense with this puppy.

tranglos:
Thanks for another great writeup! I just want to add my 0.02, um, Euro :) I've been wondering if there is any underlying regularity to the division between the excellent and not-so-excellent implementations of live search.

From your overview it seems that live search implementations tend to be better in programs that are not designed to deal with huge amounts of data. It makes sense. You can add instant search to *any* application, but not all applications can be realistically expected to perform it in a truly instant manner. It's one thing to scan an addressbook or a list of bookmarks - a typical user will not have more than a few dozen addresses or a few hundred bookmarks. And it's an entirely different thing to search through a multi-megabyte database, as will be the case with TheBat or MyBase.

I've never used MyBase, but I now have 240 MB of data in TheBat mailboxes. I don't think any search algorithm, no matter how good, will run through that much text as fast as you can type. So I think it's not the case of some programmers being lazier than others, or falling behind the curve. It's often the nature of the database that dictates what's feasible. In TB, if you didn't have to press Enter to initiate searching, you'd be experiencing a brief "freeze" after typing each character - and that would produce a much worse usability experience, the program would feel clunky.

You can experience that clunky feeling in some blogs that use live search (there are live search add-ons for TextPattern and, I think, WordPress). That doesn't work too well. Each keypress requires a trip to the server, running a query on the DB and returning the results, so both the network latency and DB performance come into play. If the blog has only a couple of entries, the result may be acceptable, but if you're searching through years of archives, you press a key, then wait until the results refresh, then press another key - and you actually lose time if you make a typo and have to start over.
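
One common way to soften the per-keypress cost described above is to wait for a short pause in typing before running the query at all. The sketch below is only an illustration, not how any of the programs or blog add-ons mentioned here work; the class name `DebouncedSearch`, the 0.3 second delay and the explicit `now` timestamps are all assumptions made to keep the example small and easy to follow.

```python
class DebouncedSearch:
    """Run the (expensive) search only after typing has paused for `delay` seconds."""

    def __init__(self, search_fn, delay=0.3):
        self.search_fn = search_fn    # hypothetical callback: query -> list of results
        self.delay = delay            # assumed pause length, in seconds
        self.pending_query = None     # latest text typed but not yet searched
        self.last_keypress = None     # time of the most recent keypress
        self.results = []

    def on_keypress(self, query, now):
        # Remember only the newest query; older ones are never searched.
        self.pending_query = query
        self.last_keypress = now

    def tick(self, now):
        """Call periodically, e.g. from a UI timer. Returns True if a search ran."""
        if self.pending_query is None:
            return False
        if now - self.last_keypress < self.delay:
            return False              # user is still typing
        self.results = self.search_fn(self.pending_query)
        self.pending_query = None
        return True


entries = ["live search", "lazy loading", "unicode"]
box = DebouncedSearch(lambda q: [e for e in entries if q in e])
box.on_keypress("l", now=0.00)
box.on_keypress("li", now=0.10)
box.on_keypress("liv", now=0.20)
box.tick(now=0.25)   # too soon, nothing runs
box.tick(now=0.55)   # pause detected, one search for "liv"
print(box.results)
```

Three keypresses cost one query instead of three. The trade-off is that results lag behind the typing by the length of the delay, so it is a compromise between "search on Enter" and a true instant search.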

Evernote seems to be an exception to this rule, perhaps because, as you observe, it must have been built pretty much around the live search feature. Perhaps also it uses a particularly efficient indexing system. But it would be interesting to know the size of the data each of the applications you tested had to deal with.
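
For readers wondering what "an efficient indexing system" could look like in general terms: nothing is known here about Evernote's internals, so the following is purely a generic sketch of a word index with prefix lookup. The class `PrefixIndex` and its methods are hypothetical names. The idea is that the expensive work is done once when a note is added, so each keystroke only needs a lookup in a sorted word list rather than a scan of all the text.

```python
import bisect
import re
from collections import defaultdict


class PrefixIndex:
    """Maps each word to the notes containing it; supports search-as-you-type by prefix."""

    def __init__(self):
        self.postings = defaultdict(set)   # word -> set of note ids
        self.sorted_words = []             # all distinct words, kept sorted

    def add(self, note_id, text):
        for word in set(re.findall(r"\w+", text.lower())):
            if word not in self.postings:
                bisect.insort(self.sorted_words, word)
            self.postings[word].add(note_id)

    def search(self, prefix):
        prefix = prefix.lower()
        hits = set()
        # Binary search to the first word >= prefix, then walk forward
        # while the words still start with the prefix.
        i = bisect.bisect_left(self.sorted_words, prefix)
        while i < len(self.sorted_words) and self.sorted_words[i].startswith(prefix):
            hits |= self.postings[self.sorted_words[i]]
            i += 1
        return hits


index = PrefixIndex()
index.add(1, "Live search in KeyNote")
index.add(2, "Searching large mailboxes")
index.add(3, "Bookmarks list")
print(sorted(index.search("sear")))
```

The cost is the one mentioned elsewhere in this thread: the index takes extra memory or disk space and has to be kept in step with the data whenever a note changes.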

Data size is not the only factor, of course. There's also the format in which data is stored. Let me use KeyNote as an example. KeyNote stores notes as RTF, there really is no other way. You cannot search through RTF directly, because of all the embedded formatting codes and character encodings. The only thing I could do at the time was to sequentially load each note into an (invisible) richedit control and use the control's built-in search function. This is, of course, slow. On today's fast machines you won't notice any bothersome delay, but still the search isn't instant. The only way around it would be to store each note twice: the original RTF, and a "cleaned-up" plain text version. That way I could use the plain text version for searching. This is indeed what I'm aiming to do at the moment, but it requires a different storage mechanism, and of course it bloats the file size.
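
The "store each note twice" idea can be sketched roughly as follows. This is not KeyNote's code (KeyNote is a Delphi program); `StoredNote` and `search_notes` are hypothetical names, and the plain-text copy is assumed to be produced by whatever component already renders the RTF, for example the rich edit control at save time.

```python
from dataclasses import dataclass


@dataclass
class StoredNote:
    title: str
    rtf: str          # original markup, used for display
    plain_text: str   # cleaned-up copy, used only for searching


def search_notes(notes, query):
    """Return titles of notes whose plain-text copy contains the query (case-insensitive)."""
    needle = query.casefold()
    return [n.title for n in notes if needle in n.plain_text.casefold()]


note = StoredNote(
    title="Storage",
    # Bold markup splits the word "KeyNote", and the accented letter is escaped.
    rtf=r"{\rtf1\ansi {\b Key}Note stores caf\'e9 notes}",
    plain_text="KeyNote stores caf\u00e9 notes",
)

print("KeyNote" in note.rtf)               # searching the raw RTF misses the word
print(search_notes([note], "keynote"))     # the plain-text copy finds it
```

The example also shows why the raw RTF cannot simply be scanned: formatting codes can fall in the middle of a word, and non-ASCII characters are stored as escape sequences.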

Another factor is whether the application uses Unicode. There are search routines written in assembly that perform amazingly fast on ANSI strings, but I've yet to see a library that handles Unicode equally fast, at least in Delphi land. (TheBat supports Unicode.) Just the fact that you may have to deal with a variable number of bytes per character slows you down enormously, and of course with encodings like UTF-16 you have to read through twice as much raw data.
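
The size difference between encodings is easy to check. The snippet below is just an illustration in Python rather than Delphi, with code page 1252 standing in for "ANSI" as an assumption; the sample strings are made up.

```python
text = "live search"   # plain ASCII letters

ansi_bytes = text.encode("cp1252")       # one byte per character
utf16_bytes = text.encode("utf-16-le")   # two bytes per character for this text

accented = "caf\u00e9"                   # "café"
utf8_bytes = accented.encode("utf-8")    # the accented letter needs two bytes, the others one

print(len(text), len(ansi_bytes), len(utf16_bytes))
print(len(accented), len(utf8_bytes))
```

With UTF-16 the search routine has twice as many bytes to read for ordinary Western text, and with UTF-8 the byte count no longer matches the character count, which is the variable-width problem mentioned above.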

Finally, the feasibility of live search (and other neat things) will depend on the storage model. We can pretty safely assume that a lightweight program like PowerMarks loads all the data into RAM and keeps it there. KeyNote does the same, as I'm sure do most addressbooks. (I'd like to know about Evernote). It's a reasonable behavior for a bookmark manager, but, as it happens, turns out not to have been reasonable for KeyNote. I found that out when questions started to pour in from people who created 30- or 40-MB files in KeyNote and were wondering why loading those files took so long. But if your storage model is disk-bound, and each piece of data is read from disk only when needed, this too will have a huge impact on how you can search, and how fast.

Sorry for the longwindedness... I'm still avoiding posting to the famous Notetakers thread, because I've been thinking about this for the last five years or so and I could write volumes, but it would mostly be about what would be great to have versus what I think is realistically possible :)

Babis:
@brotherS, my first minireview posted ;)

@WhiteLion, in DTSearch you have to go to options>indexing options>filtering options>binary files and set it to "do not index". The indexing should then finish in less than half the time.

urlwolf:
Gosh! I almost forgot my absolute favorite of all, which also supports live search: ZTreeWin (http://www.ztree.com/).
-yksyks (February 28, 2007, 03:25 PM)
--- End quote ---
I'm curious about ZTreeWin, but as a Dopus user, I'm not sure I need to invest time in this. Can you post an overview/review? thanks!

superboyac:
Sorry for the longwindedness... I'm still avoiding posting to the famous Notetakers thread, because I've been thinking about this for the last five years or so and I could write volumes, but it would mostly be about what would be great to have versus what I think is realistically possible
--- End quote ---
Tranglos, your thoughts are always welcome, and the longer the better! First off, it's an honor to have you post here since you're one of the pioneers of modern notetaking software.

I've never used MyBase, but I now have 240 MB of data in TheBat mailboxes. I don't think any search algorithm, no matter how good, will run through that much text as fast as you can type. So I think it's not the case of some programmers being lazier than others, or falling behind the curve. It's often the nature of the database that dictates what's feasible. In TB, if you didn't have to press Enter to initiate searching, you'd be experiencing a brief "freeze" after typing each character - and that would produce a much worse usability experience, the program would feel clunky.
--- End quote ---
You bring up some really good points about the speed of live searching vs. the size of the database. Now that I think about it, I don't think it was fair of me to include the Bat's filter box in this little review, because it's not really meant to be a live search. It's more of a filtering box. But, yes, I can definitely see how these little programs have an advantage over the ones that have large databases. I never thought about that, so it's a good perspective to have when I compare the features.

Evernote seems to be an exception to this rule, perhaps because, as you observe, it must have been built pretty much around the live search feature. Perhaps also it uses a particularly efficient indexing system. But it would be interesting to know the size of the data each of the applications you tested had to deal with.
--- End quote ---
Yes, that is interesting! I'm also curious as to how Evernote can do it. I wonder whether the searching slows down at all once the EN database gets really big. My database is only a few MB, but I wonder if there's anyone who has 2 GB or more and whether that slows down the search. It's true that while EN had the live search from the beginning, Mybase added it in after several versions, so maybe it's harder to make an existing program adjust to it than to design it in from the beginning. However, my complaint about Mybase's live search had less to do with its speed and more to do with its clunky implementation.

This is indeed what I'm aiming to do at the moment, but it requires a different storage mechanism, and of course it bloats the file size.
--- End quote ---
So, am I to understand that you are currently working on another notetaking program?! If that's true, that would be great news for a lot of people :up: ! Last I heard, you had shelved the Keynote project.

I'm still avoiding posting to the famous Notetakers thread, because I've been thinking about this for the last five years or so and I could write volumes, but it would mostly be about what would be great to have versus what I think is realistically possible
--- End quote ---
Whenever you're ready, we'd love to hear from you over there!
