Hi,

I would take a different approach.  Track users' queries and the
results they click.  Aggregate the queries per clicked document and
treat them as tags/labels, then use the top N to tag your docs.
Alternatively or additionally, extract significant terms and phrases
from the clicked-to docs and use those to tag your docs.
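
To make the first idea concrete, here is a rough Python sketch. It
assumes you can export a click log as (query, clicked doc id) pairs;
the function name, the doc ids and the sample data are made up for
illustration and are not from any real system.

```python
from collections import Counter, defaultdict


def top_query_tags(click_log, n=3):
    """Map each doc id to the n queries that most often led to a click on it."""
    counts = defaultdict(Counter)
    for query, doc_id in click_log:
        # Light normalization (lowercase, collapse whitespace) so that
        # "SSD Hard Drive" and "ssd hard drive" are counted together.
        normalized = " ".join(query.lower().split())
        counts[doc_id][normalized] += 1
    return {
        doc_id: [q for q, _ in counter.most_common(n)]
        for doc_id, counter in counts.items()
    }


# Hypothetical click log: (query, clicked document id)
click_log = [
    ("SSD hard drive", "doc-tech-1"),
    ("ssd hard drive", "doc-tech-1"),
    ("fast ssd", "doc-tech-1"),
    ("ssd hip hop", "doc-music-7"),
]

tags = top_query_tags(click_log, n=2)
```

The resulting tags could then be indexed into a separate field on each
doc and boosted at query time, much like the extra field you described.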

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html




On Tue, May 14, 2013 at 7:04 AM, David Parks <davidpark...@yahoo.com> wrote:
> We have a number of queries that produce good results based on the textual
> data, but are contextually wrong (for example, an "SSD hard drive" search
> matches the music album "SSD hip hop drives us crazy").
>
>
>
> Textually a fair match, but SSD is a term that strongly relates to technical
> documents.
>
>
>
> We'd like to be able to direct this query more strictly in the direction of
> the technical documents based on the term "SSD".  I am considering whether
> it would be worth trying to cluster all documents, thus tending to group the
> music with the music and tech items with the tech items. Then pulling out
> the term vectors that define each group; do a human review of that data; and
> plug it back into the documents of each cluster as a separate search field
> that gets boosted.
>
>
>
> In my head it seems like a plausible way to weigh terms like SSD to the
> cluster of items that it most closely associates.
>
>
>
> Should I spend the effort to find out?
>
> Yea or nay?
>
