Forum search gets another update, which deals exclusively with relevance ranking (as I'm confident that previous updates already fixed missing results and excessive noise).
Currently the following factors are used to work out how relevant a topic/post is to your search query:
Original SMF Factors:
- Number of matching messages within a topic (the more replies in a topic that contain the same keywords, the more likely the topic is relevant)
- Age of last matching message (we love latest news, how recent is this message?)
- Topic length (number of replies in a topic. How interested are people in this discussion?)
- Matching subject (bull's eye!)
- First message match (sets the theme of the topic)
- Sticky topic (moderator's "sponsored link") (for now we are ignoring this factor)
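For the curious, here is a rough sketch of how factors like these could be folded into a single score. It is only an illustration: the weights, the field names and the scaling constants (30 days, 50 messages) are made up for this example and are not the values the forum actually uses.

```python
from dataclasses import dataclass


@dataclass
class TopicMatch:
    matching_messages: int        # messages in the topic that contain the keywords
    total_messages: int           # topic length (first post plus replies)
    days_since_last_match: float  # age of the last matching message
    subject_matches: bool
    first_message_matches: bool
    is_sticky: bool = False


# Hypothetical weights, chosen only for this example.
WEIGHTS = {
    "frequency": 30,
    "age": 25,
    "length": 20,
    "subject": 15,
    "first_message": 10,
    "sticky": 0,  # ignored for now, as noted above
}


def relevance(t: TopicMatch) -> float:
    """Weighted sum of the factors, each scaled to the 0..1 range."""
    factors = {
        # Assumption: measured as the share of messages that match.
        "frequency": t.matching_messages / max(t.total_messages, 1),
        # Newer matches score closer to 1, older ones fade towards 0.
        "age": 1.0 / (1.0 + t.days_since_last_match / 30.0),
        # Longer discussions score higher, capped at 50 messages.
        "length": min(t.total_messages / 50.0, 1.0),
        "subject": 1.0 if t.subject_matches else 0.0,
        "first_message": 1.0 if t.first_message_matches else 0.0,
        "sticky": 1.0 if t.is_sticky else 0.0,
    }
    return sum(WEIGHTS[name] * value for name, value in factors.items())
```

With these made-up weights, two otherwise identical topics are separated by how recent the last matching message is, and a subject match adds a fixed bonus on top.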
Per-message Keyword Analysis:
Rarer keywords score more points, cliched ones fewer. Keyword concentration counts too: suppose we just posted a super-lengthy newsletter on the forum, with truckloads of keywords -- there's a good chance it'll match our query, but it shouldn't rank well because the keywords are diluted.
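A minimal sketch of the idea, assuming rarity is measured by how many messages on the forum contain a keyword and concentration by the keyword's share of the words in the message. The function name, the `doc_freq` table and the log formula are assumptions for illustration, not the actual routine.

```python
import math
import re


def keyword_score(message: str, keywords: list[str],
                  doc_freq: dict[str, int], total_messages: int) -> float:
    """Score one message: rare keywords count more, diluted ones count less."""
    words = re.findall(r"\w+", message.lower())
    if not words:
        return 0.0
    score = 0.0
    for kw in keywords:
        kw = kw.lower()
        hits = words.count(kw)
        if hits == 0:
            continue
        # Rarity: the fewer messages contain the keyword, the more it is worth.
        rarity = math.log(1 + total_messages / (1 + doc_freq.get(kw, 0)))
        # Concentration: share of the message's words that are this keyword.
        concentration = hits / len(words)
        score += rarity * concentration
    return score
```

Under this sketch a short, focused post that mentions "ranking" once beats a long newsletter that also mentions it once, because the newsletter's single hit is spread over far more words.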
Penalty for messages written with "Spartan brevity":
They get rewarded for high keyword concentration but there won't be much to read. We don't like them, so they lose points.
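One simple way to express such a penalty is a multiplier that shrinks the score of very short messages. The sketch below is an assumption about the general shape only; the 20-word threshold and the linear scaling are invented for the example.

```python
def brevity_penalty(word_count: int, min_words: int = 20) -> float:
    """Return a multiplier between 0 and 1: 1.0 for messages with at least
    min_words words, proportionally less for shorter ones."""
    if word_count >= min_words:
        return 1.0
    # Treat empty messages as one word so the multiplier never reaches zero.
    return max(word_count, 1) / min_words
```

A one-word post that consists of nothing but the keyword has a concentration of 1.0, but after the penalty it ends up below a 40-word post where the keyword makes up a tenth of the text.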
Active board ranking heuristic:
Most likely you are more interested in relevant topics within the board you are currently browsing, so we give "local" matching topics an edge over others to catch your eye.
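As a rough illustration, the "edge" could be a small multiplier applied to results from the current board before sorting. The tuple layout, the function name and the 1.2 boost are hypothetical.

```python
def apply_board_boost(results, current_board_id, boost=1.2):
    """results is a list of (topic_id, board_id, score) tuples.
    Returns a new list, best score first, with topics from the
    current board nudged upwards by the boost factor."""
    boosted = [
        (topic_id, board_id,
         score * boost if board_id == current_board_id else score)
        for topic_id, board_id, score in results
    ]
    return sorted(boosted, key=lambda r: r[2], reverse=True)
```

With a boost this small, a local topic only overtakes topics from other boards when their scores were already close, so a clearly better match elsewhere still comes first.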
First suggestion:
Ignore the "MOVED: " threads, since those are a repetition of the real thread.
-jgpaiva
Hmmm I haven't added any routine to check for MOVED posts at this time because:
1. There aren't many "MOVED: " threads out there, so adding this check for every matching topic is unlikely to be worth the cost
2. They get penalized for "not saying much"
I hope this update will help you get to your targets in the shortest time possible.
I have no doubt there's still much to be done when it comes to relevance ranking, so...