Author Topic: [request] "Fuzzy search" only after first character  (Read 6698 times)

TucknDar

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 1,133
[request] "Fuzzy search" only after first character
« on: August 05, 2008, 08:48 AM »
Don't know if that topic subject made much sense, but let me explain. Non-contiguous pattern matching (i.e. fuzzy search) is very useful, but I find myself only using it like this: to search for Movie Collector, I type 'mocoll' or similar, not 'vicoll' (both match). Ok, that didn't make much sense either  :wallbash: The difference is that in the way I search, I always use the first character of what I'm searching for, in this example that is of course 'm'. So my request is that, as an option, the non-contiguous pattern matching should only match results that have the same first character as what I typed!

Did I finally make sense?

Typing 'mocoll' would match 'Movie Collector', 'more collages.doc', 'more cool lettuce.xls' (fictional...), etc., but not 'emoticon collection', 'grandmother's cold lobster soup.doc' or similar (great examples, huh).
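
For readers who want to see the request concretely, here is a minimal sketch of non-contiguous matching with an optional first-character anchor. The function names and the `anchor_first` flag are hypothetical, made up for illustration; this is not FARR's actual code, just one plausible way to express the behaviour described above.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """True if the characters of pattern occur in text in order,
    not necessarily next to each other (case-insensitive)."""
    chars = iter(text.lower())
    # "ch in chars" advances the iterator, so each pattern character
    # must be found after the previous one.
    return all(ch in chars for ch in pattern.lower())


def fuzzy_match(pattern: str, text: str, anchor_first: bool = False) -> bool:
    """Non-contiguous match; with anchor_first (hypothetical option) the
    candidate must also start with the first typed character."""
    if not pattern:
        return True
    if anchor_first and not text.lower().startswith(pattern[0].lower()):
        return False
    return is_subsequence(pattern, text)
```

With `anchor_first=True`, 'mocoll' still matches 'Movie Collector' and 'more cool lettuce.xls', while 'emoticon collection' and 'grandmother's cold lobster soup.doc' are rejected because they do not start with 'm'.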

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,900
Re: [request] "Fuzzy search" only after first character
« Reply #1 on: August 05, 2008, 09:50 AM »
i can make an option for this.. in fact farr knows about this basic pattern and uses it to bias scores efficiently (it's always looking for FIRSTLETTER*REMAINDER and giving highest scores to those).

however there are other things that make it difficult to make your request a default (i can make it an option though), because there are other common ways to do a non-contiguous search:
  • First letter from each syllable/word: MSO (for microsoft office)
  • Some shortcuts have an extra word at front (like Mozilla Firefox), so searching for ffox wouldn't find it in your case.

but like i said i'm happy to add more options to reduce the noise of such searches.
the other thing i could do is really bias cases like you describe, where the first letter of the name matches the first letter of the search, and give those a much higher score.
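
A rough sketch of the "bias instead of filter" idea, for illustration only. The function name `score_match` and all the numeric bonuses are assumptions invented for this example; FARR's real scoring is not shown in this thread. The point is just that a first-letter match can earn a large bonus while something like 'ffox' for 'Mozilla Firefox' still gets a lower, non-zero score instead of being dropped.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """True if pattern's characters appear in text in order."""
    chars = iter(text)
    return all(ch in chars for ch in pattern)


def score_match(pattern: str, name: str, first_letter_bonus: float = 50.0) -> float:
    """Hypothetical scoring: 0 means no match, higher is better."""
    p, n = pattern.lower(), name.lower()
    if not p or not is_subsequence(p, n):
        return 0.0
    total = 10.0  # assumed base score for any non-contiguous match
    if n.startswith(p[0]):
        # FIRSTLETTER*REMAINDER case: strongly preferred
        total += first_letter_bonus
    # Smaller assumed bonus when the whole pattern fits inside a later
    # word that starts with the first typed letter (e.g. "ffox").
    if any(w[0] == p[0] and is_subsequence(p, w) for w in n.split()[1:]):
        total += 20.0
    return total


def rank(pattern: str, names: list[str]) -> list[str]:
    """Return matching names, best score first."""
    hits = [name for name in names if score_match(pattern, name) > 0]
    return sorted(hits, key=lambda name: score_match(pattern, name), reverse=True)
```

Raising `first_letter_bonus` pushes first-letter matches further up the list without hiding the other results, which is the trade-off between the strict option and the softer bias.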

TucknDar

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 1,133
Re: [request] "Fuzzy search" only after first character
« Reply #2 on: August 05, 2008, 10:27 AM »
Yes, i can see what you mean. I'd really like the option, although being able to bias first letter matches much more could be worth looking at as well. I guess you're right about such cases as Mozilla Firefox, as most Microsoft products would fall in that "trap" as well. I still think I'd turn that option on, as typing 'fire' would still put Firefox at the top, as would 'word', 'excel' etc.

d4ni

  • Supporting Member
  • Joined in 2008
  • **
  • Posts: 129
Re: [request] "Fuzzy search" only after first character
« Reply #3 on: August 06, 2008, 05:26 PM »
lol I never realized farr actually has fuzzy search :D but good to know. making it an option is fine, just don't make it default as I am sure some people wouldn't like it, like myself :)

herojoker

  • Participant
  • Joined in 2008
  • *
  • Posts: 124
Re: [request] "Fuzzy search" only after first character
« Reply #4 on: August 07, 2008, 05:55 AM »
@d4ni: It is an option, called "Score non-contiguous matches" on the "Search Behavior" page.

d4ni

  • Supporting Member
  • Joined in 2008
  • **
  • Posts: 129
Re: [request] "Fuzzy search" only after first character
« Reply #5 on: August 08, 2008, 05:00 AM »
I am sorry I did not express myself very clearly there ;) I meant making TucknDar's suggestion an option.

etiman

  • Participant
  • Joined in 2008
  • *
  • Posts: 1
Re: [request] "Fuzzy search" only after first character
« Reply #6 on: September 07, 2008, 11:24 AM »
hi guys - I only just registered - then logged on - never done a blog in my life - but my job is Charity/ helping out et al.
I was looking for a quality fuzzy search for .net/c# SOA architecture. (used j-bean in previous life). Then I saw you guys blogging - someone talking about setting the right weights. I'm doing ETI now (SOA is old hat). Being open, like you guys, I'll say my thoughts but forgive my impertinence if you're way ahead of me.
You have to build components that evolve (improve over time). That way, you don't have to get it right first time (the reason why SOA Atemporal/Top down doesn't breed well, i.e. become the dominant methodology, in the real world). My architecture is far smarter (I think) and mimics the real world: temporal components (temporal on a human species time scale, not on a Mayfly's) like living organisms, species, companies, countries, et al.
You guys (“guys/girls” meant in a good way, I assure you) try to design components and species and architectures (all the same) as though they were atemporal (without time, like quantum mechanics). Yet you are trying to mimic things that have to live (objective longevity) in a temporal world.
So- I'll get to the point - my run-time thinking on fuzzy searches:
1. In my application people often use capital letters (Airline AF not af -- LHR not lhr)
but they could make a mistake and put lhr - come to that later.
2. They might also use Keywords like -- via or VIA or from or September or USD or US$ or London or even LONDRES (French - just in case you didn't know :)
3. Then maybe strings of numbers
etc. etc.

OK - we have 3 factors - how are we going to “weight” them?

Let's make them even, you've got to start somewhere.
Now think of building a temporal algorithm that evolves such that the weightings are adjusted (over time - with repeated use) to become closer to getting the right answer (which is going to be different for a Frenchman than for a marsupial-like Australian – different nature and nurture).
So build your fuzzy component – implant it into your application and watch it do its weight watching on its own (whilst you enjoy a nice glass of low-calorie wine).
It learns like neural networks (see UCLA http://www.bio.net/m...97-March/001265.html). The middle cells alter the connections between the input and output cells with the help of simple feedback (could give you maths, but getting late for me).
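
To make the "weights that adjust with use" idea a bit more concrete, here is a small sketch. It is not the maths etiman alludes to (which is not given in the post); it swaps in a simple perceptron-style update as a stand-in. The factor names, the class `AdaptiveRanker` and the learning rate are all assumptions for illustration. The three factors start with even weights, and whenever the user picks something other than the top-ranked result, the weights shift toward the factors that favoured the user's choice.

```python
from typing import Dict, List

# Hypothetical names for the three factors listed in the post.
FACTORS = ["capital_code", "keyword", "number"]


class AdaptiveRanker:
    def __init__(self, learning_rate: float = 0.1) -> None:
        # "Let's make them even": every factor starts with the same weight.
        self.weights: Dict[str, float] = {f: 1.0 / len(FACTORS) for f in FACTORS}
        self.learning_rate = learning_rate

    def score(self, features: Dict[str, float]) -> float:
        """Weighted sum of the factor scores for one candidate."""
        return sum(self.weights[f] * features.get(f, 0.0) for f in FACTORS)

    def rank(self, candidates: Dict[str, Dict[str, float]]) -> List[str]:
        """Candidate names, best score first."""
        return sorted(candidates, key=lambda name: self.score(candidates[name]), reverse=True)

    def feedback(self, candidates: Dict[str, Dict[str, float]], chosen: str) -> None:
        """Perceptron-style update: if the user's choice was not ranked first,
        move each weight toward the chosen candidate's factor scores."""
        top = self.rank(candidates)[0]
        if top == chosen:
            return  # ranking was already right, nothing to learn
        for f in FACTORS:
            diff = candidates[chosen].get(f, 0.0) - candidates[top].get(f, 0.0)
            self.weights[f] += self.learning_rate * diff
```

Each call to `feedback` is one round of "repeated use"; keeping one ranker per user or per region would give the different weightings for different people that the post describes.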

So - the user (say) puts a street address into Google maps (say). You detect his ipaddr as marsupial. He enters his town as Nucastle (honest mistake - blame the English teacher). You suspect Newcastle, NSW, not Tyneside, UK. But you make sure by asking the user politely ("I think you mean Newcastle, NSW, Australia?"). He says Yes, then you make a cross reference file between Nucastle and Newcastle, NSW, under the Logon-user's profile (or even against all user profiles in Australia). In reality, the street or zip-code would be a giveaway anyway - but then again, a simple X-ref lookup (or cookie) might avoid central disk access & latency delays.
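
The cross-reference idea above could look something like the sketch below. Everything here is hypothetical (the class name `CorrectionStore`, the user and region keys, the in-memory dictionaries standing in for the "cross reference file"); it only illustrates remembering a confirmed correction per user, with a per-region fallback.

```python
from typing import Dict, Optional


class CorrectionStore:
    """Remembers confirmed spelling corrections per user and per region."""

    def __init__(self) -> None:
        self._by_user: Dict[str, Dict[str, str]] = {}
        self._by_region: Dict[str, Dict[str, str]] = {}

    @staticmethod
    def _key(typed: str) -> str:
        # Normalise so "Nucastle" and " nucastle " hit the same entry.
        return typed.strip().lower()

    def confirm(self, user: str, region: str, typed: str, resolved: str) -> None:
        """Record that the user said Yes to 'I think you mean <resolved>?'."""
        key = self._key(typed)
        self._by_user.setdefault(user, {})[key] = resolved
        self._by_region.setdefault(region, {})[key] = resolved

    def lookup(self, user: str, region: str, typed: str) -> Optional[str]:
        """Prefer the user's own confirmed correction, then the region's."""
        key = self._key(typed)
        user_hit = self._by_user.get(user, {}).get(key)
        if user_hit is not None:
            return user_hit
        return self._by_region.get(region, {}).get(key)
```

After one user in Australia confirms that Nucastle means Newcastle, NSW, a later lookup for another Australian user can return the same answer from the region table without asking again, while a user in another region gets no automatic correction.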
hope this helps you. (this is my ETI logic rather than SOA)
- please let me have contact with someone with enough mipp-power (aka honesty) so we could help with some big problems (like African children, climate change)
kind regards, etiman