Re: PHP Solr API

2010-09-30 Thread Neil Lunn
On Fri, 2010-10-01 at 12:00 +1000, Scott Yeadon wrote:
> Hi,
> 

> The problem is that the article text is HTML and Solr appears to strip 
> the HTML by default.

I think what you need to look at is how the fields are defined by
default in your schema. If data sent as HTML is being added to the
standard html-text type and stored, then the HTML is stripped and the
words are indexed by default. If you want to keep the raw HTML, then
maybe you should store it as-is and only index the stripped version,
rather than storing the stripped version.
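
As a rough sketch of that split, independent of Solr itself: keep the raw
HTML for storage and derive a tag-free copy for indexing. The field names
"article_raw" and "article_text" are hypothetical, and the stripping here
uses Python's standard html.parser rather than anything Solr does
internally.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(raw_html):
    """Return the text of raw_html with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def build_document(doc_id, raw_html):
    # Hypothetical field names: "article_raw" would be stored but not indexed,
    # "article_text" would be indexed but not stored.
    return {
        "id": doc_id,
        "article_raw": raw_html,
        "article_text": strip_html(raw_html),
    }


doc = build_document("42", "<p>Hello <b>Solr</b> world</p>")
```

The raw field is what you would hand back to the page for display, and the
stripped field is what searches would run against.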

-- 


Regards,

Neil Lunn




Sorting on Multiple fields

2009-10-14 Thread Neil Lunn
We have come up against a situation we are trying to resolve in our Solr
implementation project. It mostly concerns how to sort results when the
data is stored in multiple fields in the index, but at query time we want
to sort on whichever of those fields is most relevant. A brief example:

We have product catalog information in the index which will have multiple
prices depending on the logged-in user and other scenarios. Simplified,
this will look something like this:

price_id101 = 100.00
price_id102 = 105.00
price_id103 = 110.00
price_id104 = 95.00
(etc)

At runtime we want to know which one of several selected prices is the
minimum (or maximum): not across all prices, just a selected set of, say,
two or three ids. The purpose is to determine the sort order of the
results. With a SQL repository we would feed it some query logic to say
"find me the least amount among this set of ids", so the search approach
here raises some questions.

- Do we attempt some sort of function query to find the least amount of
the requested price ids? This would seem to imply some work in the query
handler to allow a function of this sort.

- Do we look at this rather as some internal method to handle the query
and sort actions, as a matter of relevancy on a calculated field? If so,
the methods of determining which fields are included in the calculated
field are eluding me at the moment, so pointers are welcome.

- Does this ultimately involve the implementation of some sort of custom
type and handler to do this sort of task?
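
To make the requirement concrete, here is a minimal sketch in plain Python
(not Solr) of "the least of the requested price ids" for a single product.
The function name extreme_price and the dict layout are assumptions for
illustration only; the prices are the sample values from above.

```python
def extreme_price(product, price_ids, pick=min):
    """Return pick() over the prices the product has for the requested ids.

    Returns None when the product has none of the requested prices.
    """
    present = [product[f"price_id{pid}"] for pid in price_ids
               if f"price_id{pid}" in product]
    return pick(present) if present else None


product = {
    "price_id101": 100.00,
    "price_id102": 105.00,
    "price_id103": 110.00,
    "price_id104": 95.00,
}

# Only the requested ids are considered, not every price on the product.
lowest = extreme_price(product, [101, 102, 103])
highest = extreme_price(product, [102, 104], pick=max)
```

Note that price_id104 is the overall minimum but is ignored in the first
call because it was not one of the requested ids.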

I am open to any response. If no one has come across a similar problem
before but someone can suggest an approach, we are willing to open up a
patch branch or similar to do some work on the issue. If there are no
suggestions, though, this will likely move out of our current stream and
into future development.

Neil


Re: Sorting on Multiple fields

2009-10-15 Thread Neil Lunn
On Thu, Oct 15, 2009 at 12:55 AM, Avlesh Singh wrote:

> >
> > Do we attempt to raise some sort of functional query to find the least
> > amount of the requested price id's? This would seem to imply some playing
> > around in the query handler to allow a function of this sort.
> >
> Unless I am missing something, this information can always be obtained by
> post-processing the data obtained from search results. Isn't it?
>

What I was looking for is processing the sort/relevancy at the server and
obtaining the page of results.
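
A small sketch in plain Python of why post-processing is not enough once
results are paged: the sort has to run over the whole result set before
the page is cut. The document ids, the "price" field and the helper name
page_of_results are made up for illustration.

```python
def page_of_results(docs, sort_key, page, page_size):
    """Sort the full result set, then slice out one page (page is 0-based)."""
    ordered = sorted(docs, key=sort_key)
    start = page * page_size
    return ordered[start:start + page_size]


docs = [
    {"id": "a", "price": 110.0},
    {"id": "b", "price": 95.0},
    {"id": "c", "price": 105.0},
    {"id": "d", "price": 100.0},
]

# Server-side: sort everything, then take the first page of two.
server_side = page_of_results(docs, lambda d: d["price"], page=0, page_size=2)

# Post-processing: only re-orders whatever page happened to come back.
client_side = sorted(docs[:2], key=lambda d: d["price"])
```

The server-side page holds the two cheapest documents overall, while the
post-processed page only holds the first two documents in a nicer order.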


>
> > Do we look at this rather than some internal method to handle the query
> > and sort actions as a matter of relevancy on a calculated field? If so
> > the methods of determining the fields included in the calculated field
> > are eluding me at the moment. So pointers are welcome.
> >
> I really did not understand the question. Is it related to "sorting" of
> results?
>

The point is that, from the sample data, one document might have fields
like this:

price_id102 = 105.00
price_id103 = 110.00

and the other fields like this:

price_id103 = 110.00


What I (think I) need to do here is return the results ordered by the
lesser of the two fields price_id102 and price_id103. As one of the
documents does not have the price_id102 field, sorting by one field and
then the other would return them in the incorrect order (the second
document before the first).

So the correct minimums and order are:

Document 1 : 105.00
Document 2 : 110.00

My issue here is that I cannot (I don't think) just add a field called
"lesser_of_102_and_103", because I may want different fields in
combination. What I am looking for is a parameter more like:

minimum(,...)

This would be used to calculate the order of the returned results, based
on what it evaluates to when issued with a query.
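
The behaviour I am after can be sketched in plain Python as follows. This
is not a Solr feature, just the intended semantics: the minimum() helper
is hypothetical, and the "naive" ordering assumes that a missing value
sorts before any present value.

```python
def minimum(doc, *fields):
    """Smallest value among the named fields that the document actually has."""
    values = [doc[name] for name in fields if name in doc]
    # Documents with none of the fields sort last.
    return min(values) if values else float("inf")


documents = [
    {"id": "Document 2", "price_id103": 110.00},
    {"id": "Document 1", "price_id102": 105.00, "price_id103": 110.00},
]

# Desired: order by the lesser of whichever requested fields are present.
ordered = sorted(
    documents, key=lambda d: minimum(d, "price_id102", "price_id103")
)

# Naive: sort by price_id102, then price_id103 (assumption: missing first).
missing = float("-inf")
naive = sorted(
    documents,
    key=lambda d: (d.get("price_id102", missing), d.get("price_id103", missing)),
)
```

With the sample data, the first ordering puts Document 1 (105.00) ahead of
Document 2 (110.00), while the field-by-field sort puts Document 2 first
because it has no price_id102.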


>
> Does this ultimately involve the implementation of some sort of custom
> > type and handler to do this sort of task.
> >
> If the answer to my previous question is affirmative, then yes, you would
> need to implement custom sorting behavior. It can be achieved in multiple
> ways depending upon your requirement. From something as simple as
> function-queries to using the power of dynamic fields to writing a custom
> field-type to writing a custom implementation of Lucene's Similarity .. any
> of these can be a potential answer to custom sorting.
>
> Cheers
> Avlesh
>


Re: How use implement Lucene for perl.

2009-12-28 Thread Neil Lunn
On Tue, Dec 29, 2009 at 3:42 PM, Maheshwar wrote:

>
> I am new for Lucene.
> I haven't any idea about Lucene.
> I want to implement Lucene in my search script.
> Please guide me what I needs to be do for Lucene implementation.
>

This may or may not be the right list, but for both Lucene and Solr the
best place to look for Perl support is probably CPAN:

http://search.cpan.org/~bricas/WebService-Lucene-0.10/lib/WebService/Lucene.pm

and

http://search.cpan.org/~bricas/WebService-Solr-0.09/lib/WebService/Solr.pm

provide appropriate interfaces and documentation, among others.



>
> Actually, I want to integrate lucene search with message board system where
> people come to post new topic, edit that topic and delete that on needs. I
> want, to update search index at every action.
> So I need some valuable help.
>
>
>
>
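
For the post/edit/delete workflow described above, a rough sketch of the
XML update messages Solr accepts, built here with Python's standard
library rather than Perl. The field names "title" and "body" are
hypothetical, and "id" is assumed to be the unique key in your schema.
Each message would be POSTed to the update handler and followed by a
commit; that part is left out of the sketch.

```python
import xml.etree.ElementTree as ET


def add_message(fields):
    """Build an <add> message for one document.

    Re-sending a document with the same unique key replaces the old one,
    so the same message serves for both new and edited topics.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def delete_message(doc_id):
    """Build a <delete> message that removes one document by unique key."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


new_topic = add_message({"id": "topic-1", "title": "Hello", "body": "First post"})
removed_topic = delete_message("topic-1")
```

Hooking these into the board's "post", "edit" and "delete" actions would
keep the index in step with the topics.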