Re: Preparing the ground for a real multilang index

2009-07-03 Thread Paul Libbrecht


On 3 Jul 2009, at 07:43, Michael Lackhoff wrote:


On 03.07.2009 00:49 Paul Libbrecht wrote:

[I'll try to address the other responses as well]


I believe the proper way is for the server to compute a list of
accepted languages in order of preferences.
The web-platform language (e.g. the user-setting), and the values in
the Accept-Language http header (which are from the browser or
platform).


All this is not going to help much because the main application is a
scientific search portal for books and articles with many users
searching cross-language. The most typical use case is a German user
searching multilingual. So we might even get the search multilingual,
e.g. TITLE:cancer OR TITLE:krebs. No way here to watch out for
Accept-headers or a language select field (would be left on "any" in
most cases). Other popular use cases are citations (in whatever
language) cut and pasted into the search field.


The algorithm I described does take all this into account: the ambiguity  
of the query's language.
There is no other way to offer any form of stemming in each language  
(e.g. removing -ing in English and -ung in German) than to actually do this.
Is it because you use Solr directly that languages can't be passed  
around?

You need a server part to get the headers, indeed.
Oh, and yes, you have to double everything I described if you want to  
prefer matches in the title, by the way.
We've implemented something that may be close to what you're searching  
for: the i2geo search, which comes much closer to the cross-lingual  
problem by designating the entities of the request:

It's under APL.

 Try to search for, say, Viereck in the search box. See a little  
description at:

  http://i2geo.net/xwiki/bin/view/About/GeoSkills


I think the best would be to process the data according to its language
but not make any assumptions about the query language, and I am totally
lost as to how to get a clever schema.xml out of all this.


Just OR them together properly.
Storing different languages in different fields (title-de, title-en)  
is the right way to get schema.xml properly configured with an  
analyzer per language, I think.
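A sketch of what that can look like in schema.xml (field and type names are illustrative, and I've used underscores rather than the hyphens above, since hyphens in field names clash with query syntax):

```xml
<!-- One field per language, each with its own analyzer chain -->
<fieldType name="text_de" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>
<fieldType name="text_en" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

<field name="title_de" type="text_de" indexed="true" stored="true"/>
<field name="title_en" type="text_en" indexed="true" stored="true"/>
```

An ambiguous query is then sent against all the language fields and OR-ed together, e.g. title_de:krebs OR title_en:cancer, so each field's stemmer gets a chance to match.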


paul



custom queryparser issues

2009-07-03 Thread Julian Davchev
Hi,
I've got several issues with a custom query parser. All I am doing is
extending

org.apache.solr.search.LuceneQParserPlugin, overriding

public QParser createParser(String qstr, SolrParams localParams,
    SolrParams params, SolrQueryRequest req) {

so that I can adjust params before I pass them with super.createParser()
to the original LuceneQParserPlugin.
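For reference, the kind of subclass being described might look like this (a sketch against the 1.4-era API; it needs solr-core on the classpath, so it isn't runnable standalone, and the wt adjustment is purely illustrative):

```java
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.LuceneQParserPlugin;
import org.apache.solr.search.QParser;

public class MyQParserPlugin extends LuceneQParserPlugin {
    @Override
    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
        // Copy the params instead of mutating the objects passed in; mutating
        // shared state is one plausible cause of values "sticking" across requests.
        ModifiableSolrParams adjusted = new ModifiableSolrParams(params);
        adjusted.set("wt", "json"); // illustrative adjustment only
        return super.createParser(qstr, localParams, adjusted, req);
    }
}
```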

1. When making a JSON request, how can I set the application/json header
so the response is returned appropriately? wt=json doesn't seem to do this.
2. Is there something special I need to do in createParser, or some
cleanup of req and params? From what I see, if I set some params they are
used in the next request as well, as if the same req object were reused. I
couldn't really trace what calls createParser to figure out whether the
SolrQueryRequest object is always new.  Also, I am setting my values on
all three arguments (localParams, params, and req), and this might not be
quite right.
3. I see that apart from setting qstr to a new value I can also adjust
some Solr reserved params like facet etc., but I can't reset rows and
start; these seem to be stored somewhere else and can't be affected from
what I see.

Sadly I didn't find an answer to any of these in the Solr documentation. Any
help is very welcome.
Thanks.


Re: Implementing PhraseQuery and MoreLikeThis Query in one app

2009-07-03 Thread SergeyG

Otis,

Thanks a lot. I'll certainly follow your advice and check the logs. Although
I must say that I've already tried all possible variations of the string for
the "fl" parameter (spaces, commas, plus signs). More than that, the query
still doesn't want to fetch any docs (other than the one with the id
specified in the query) even when the line solrQuery.setParam("fl", "title
author score"); is commented out. So I suspect that the problem is that the
request with the URL
"http://localhost:8080/solr/select?q=id:1&mlt=true&mlt.fl=content&..." for
some reason doesn't work properly. And when I use the GetMethod(url)
approach and send the URL directly in the form
"http://localhost:8080/solr/mlt?q=id:1&mlt.fl=content&...", Solr picks up
the mlt component. (At least I'll have this backup solution if the main one
keeps committing sabotage. :) I'll just need to add a parser for the
incoming XML response.)
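One thing worth checking on the SolrJ side: server.query() goes to the /select handler by default. If a MoreLikeThis request handler is registered at /mlt in solrconfig.xml, setting the query type to a path starting with "/" should route the request there. A hedged sketch (whether this applies depends on the SolrJ version in use):

```java
SolrQuery solrQuery = new SolrQuery("id:1");
solrQuery.setQueryType("/mlt");            // route to the /mlt handler instead of /select
solrQuery.setParam("mlt.fl", "content");
solrQuery.setParam("fl", "title,author,score");
QueryResponse queryResponse = server.query(solrQuery);
```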

I'll continue my "research" of this issue and, if you're interested in
results, I'll definitely let you know.

Cheers,
Sergey


Otis Gospodnetic wrote:
> 
> 
> Sergey,
> 
> Glad to hear the suggestion worked!
> 
> I can't spot the problem (though I think you want to use a comma to
> separate the list of fields in the fl parameter value).
> I suggest you look at the servlet container logs and Solr logs and compare
> requests that these two calls make.  Once you see how the second one
> is different from the first one, you will probably be able to figure out
> how to adjust the second one to produce the same results as the first one.
> 
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: SergeyG 
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, July 2, 2009 6:17:59 PM
>> Subject: Re: Implementing PhraseQuery and MoreLikeThis Query in one app
>> 
>> 
>> Otis,
>> 
>> Your recipe does work: after copying an indexing field and excluding stop
>> words the MoreLikeThis query started fetching meaningful results. :)
>> 
>> Just one issue remained. 
>> 
>> When I execute query in this way:
>> 
>> String query = "q=id:1&mlt.fl=content&...&fl=title+author+score";
>> HttpClient client = new HttpClient();
>> GetMethod get = new GetMethod("http://localhost:8080/solr/mlt");
>> get.setQueryString(query);
>> client.executeMethod(get);
>> ...
>> 
>> it works fine bringing results as an XML string. 
>> 
>> But when I use "Solr-like" approach:
>> 
>> String query = "id:1";
>> solrQuery.setQuery(query);
>> solrQuery.setParam("mlt", "true");
>> solrQuery.setParam("mlt.fl", "content");
>> solrQuery.setParam("fl", "title author score");
>> QueryResponse queryResponse = server.query( solrQuery );
>> 
>> the result contains only one doc with id=1 and no other "more like" docs. 
>> 
>> In my solrconfig.xml, I have these settings: 
>> ...
>> 
>> ...
>> 
>> I guess it all is a matter of syntax but I can't figure out what's wrong.
>> 
>> Thank you very much (and again, thanks to Michael and Walter).
>> 
>> Cheers,
>> Sergey
>> 
>> 
>> 
>> Michael Ludwig-4 wrote:
>> > 
>> > SergeyG schrieb:
>> > 
>> >> Can both queries - PhraseQuery and MoreLikeThis Query - be implemented
>> >> in the same app taking into account the fact that for the former to
>> >> work the stop words list needs to be included and this results in the
>> >> latter putting stop words among the most important words?
>> > 
>> > Why would the inclusion of a stopword list result in stopwords being of
>> > top importance in the MoreLikeThis query?
>> > 
>> > Michael Ludwig
>> > 
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Implementing-PhraseQuery-and-MoreLikeThis-Query-in-one-app-tp24303817p24314840.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Implementing-PhraseQuery-and-MoreLikeThis-Query-in-one-app-tp24303817p24319269.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: DocSlice andNotSize

2009-07-03 Thread Ben
DocSet isn't an object, it's an interface. The DocSlice class 
*implements* DocSet.
What you're saying about set operations not working for DocSlice but 
working for DocSet then doesn't make any sense... can you clarify?


The failure of these set operations to work as expected is confusing the 
hell out of me too!


Thanks
Ben


Yonik Seeley wrote:

On Thu, Jul 2, 2009 at 4:24 PM, Candide Kemmler wrote:
  

I have a simple question rel the DocSlice class. I'm trying to use the (very
handy) set operations on DocSlices and I'm rather confused by the way it
behaves.

I have 2 DocSlices, atlDocs which, by looking at the debugger, holds a
"docs" array of ints of size 1; the second DocSlice is btlDocs, with a
"docs" array of ints of size 67. I know that atlDocs is a subset of btlDocs,
so doing btlDocs.andNotSize(atlDocs) should really return 66.

But it's returning 10.



The short answer is that all of the set operations were only designed
for DocSets (as opposed to DocLists).
Yes, perhaps DocList should not have extended DocSet...

-Yonik
http://www.lucidimagination.com
  




Stopwords when facetting

2009-07-03 Thread Pierre-Yves LANDRON

Hello,

When indexing or querying text, I'm using solr.StopFilterFactory; it seems 
to work just fine...

But I want to use the text field as a facet, and get all the commonly used 
words in a set of results, without the stopwords. As far as I have tried, I 
always get stopwords and numerical terms that pollute my facet results. How 
can I do this?

Thanks,
Pierre


Re: Stopwords when facetting

2009-07-03 Thread Erik Hatcher
Pierre - either the field you're faceting on doesn't have the StopFilter  
applied at indexing time, or the words you want removed aren't in the  
stopword list file.
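In other words, the faceting field needs an analyzer that strips stopwords at index time. A sketch (field and type names are illustrative), using a copyField so the original field keeps its current behaviour:

```xml
<fieldType name="text_facet" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>

<field name="text_for_facets" type="text_facet" indexed="true" stored="false"/>
<copyField source="text" dest="text_for_facets"/>
```

After reindexing, faceting runs on the new field (facet.field=text_for_facets). The numerical terms could likewise be dropped by adding another filter to this chain.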


Erik

On Jul 3, 2009, at 5:21 AM, Pierre-Yves LANDRON wrote:



Hello,

When indexing or querying text, i'm using the  
solr.StopFilterFactory ; it seems to works just fine...


But I want to use the text field as a facet, and get all the  
commonly used words in a set of results, without the stopwords. As  
far as I tried, I always get stopwords, and numerical terms, that  
pollute my facets results. How can I perform this ?


Thanks,
Pierre





Merging SOLR Documents

2009-07-03 Thread Amandeep Singh09
Hi list,
I am new to this list and just starting with Solr. My question is: how can we 
merge the results of two different searches? I mean, if we have a function 
with two threads, it has to go to two different Solr servers to get the 
results. Is there any way to merge the results using Solr and SolrJ, or do we 
have to do it in Java only?

Thanks

Amandeep Singh



 CAUTION - Disclaimer *
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely 
for the use of the addressee(s). If you are not the intended recipient, please 
notify the sender by e-mail and delete the original message. Further, you are 
not 
to copy, disclose, or distribute this e-mail or its contents to any other 
person and 
any such actions are unlawful. This e-mail may contain viruses. Infosys has 
taken 
every reasonable precaution to minimize this risk, but is not liable for any 
damage 
you may sustain as a result of any virus in this e-mail. You should carry out 
your 
own virus checks before opening the e-mail or attachment. Infosys reserves the 
right to monitor and review the content of all messages sent to or from this 
e-mail 
address. Messages sent to or from this e-mail address may be stored on the 
Infosys e-mail system.
***INFOSYS End of Disclaimer INFOSYS***


NYC Apache Lucene/Solr/Nutch/etc. Meetup

2009-07-03 Thread Grant Ingersoll

Hi All, (sorry for the cross-post)

For those in NYC, there will be a Lucene ecosystem (Lucene/Solr/Mahout/ 
Nutch/Tika/Droids/Lucene ports) Meetup on July 22, hosted by MTV  
Networks and co-sponsored with Lucid Imagination.


For more info and to RSVP, see http://www.meetup.com/NYC-Apache-Lucene-Solr-Meetup/ 
.  There is limited seating, so get your spot early.   Note, you must  
register with your first and last name so that security badges can be  
printed ahead of time for access.


Cheers,
Grant


Re: Merging SOLR Documents

2009-07-03 Thread Eric Pugh
What you are talking about is federated search, which is beyond the  
scope of Solr.  However, maybe you can merge the two indexes into one  
index, and then distribute it over multiple servers to get the  
performance you are looking for?


http://wiki.apache.org/solr/DistributedSearch
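If the two indexes really must stay separate, the "in Java" fallback from the question can be a small client-side merge. A sketch with a stand-in Hit class (not a SolrJ type); note that scores from two different indexes aren't strictly comparable unless the collections are statistically similar:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeResults {

    /** One search hit: document id plus relevance score (a stand-in, not a SolrJ type). */
    static class Hit {
        final String id;
        final float score;
        Hit(String id, float score) { this.id = id; this.score = score; }
    }

    /**
     * Merge two result lists: de-duplicate by id (keeping the higher score)
     * and re-sort by score, descending.
     */
    static List<Hit> merge(List<Hit> a, List<Hit> b) {
        Map<String, Hit> best = new HashMap<String, Hit>();
        for (Hit h : a) keepBest(best, h);
        for (Hit h : b) keepBest(best, h);
        List<Hit> out = new ArrayList<Hit>(best.values());
        Collections.sort(out, new Comparator<Hit>() {
            public int compare(Hit x, Hit y) { return Float.compare(y.score, x.score); }
        });
        return out;
    }

    private static void keepBest(Map<String, Hit> best, Hit h) {
        Hit prev = best.get(h.id);
        if (prev == null || h.score > prev.score) best.put(h.id, h);
    }

    public static void main(String[] args) {
        List<Hit> server1 = Arrays.asList(new Hit("1", 0.9f), new Hit("2", 0.5f));
        List<Hit> server2 = Arrays.asList(new Hit("2", 0.7f), new Hit("3", 0.6f));
        for (Hit h : merge(server1, server2))
            System.out.println(h.id + " " + h.score); // 1 0.9, then 2 0.7, then 3 0.6
    }
}
```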

Eric

On Jul 3, 2009, at 7:24 AM, Amandeep Singh09 wrote:


Hi list,
I am new to this list and just starting solr. My question is how can  
we merge the results of two different searches. I mean if we have a  
function that has two threads so it has to go to two differen solr  
servers to get the result. Is there any way to merge the result  
using solr and solrj or dow we have to do it in java only?


Thanks

Amandeep Singh





-
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com
Free/Busy: http://tinyurl.com/eric-cal






Re: DocSlice andNotSize

2009-07-03 Thread Yonik Seeley
I've opened a JIRA issue for this:
https://issues.apache.org/jira/browse/SOLR-1260

And by "set operations were only designed for DocSets (as opposed to
DocLists)" I meant that Solr has never done set operations on DocList
objects AFAIK, and so there aren't any tests for it.
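For what andNotSize is *meant* to compute on true document sets, plain Java sets illustrate the contract (this shows the semantics only, not Solr's bitset implementation):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class AndNotDemo {

    /** Size of (b AND NOT a): what andNotSize is meant to compute for true
        sets of document ids. (Illustrative semantics, not Solr's bitset code.) */
    static int andNotSize(Set<Integer> b, Set<Integer> a) {
        Set<Integer> diff = new HashSet<Integer>(b);
        diff.removeAll(a);
        return diff.size();
    }

    public static void main(String[] args) {
        Set<Integer> btlDocs = new HashSet<Integer>();
        for (int i = 0; i < 67; i++) btlDocs.add(i);      // 67 doc ids
        Set<Integer> atlDocs = Collections.singleton(0);  // a subset of btlDocs
        System.out.println(andNotSize(btlDocs, atlDocs)); // prints 66
    }
}
```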

-Yonik
http://www.lucidimagination.com



On Thu, Jul 2, 2009 at 4:24 PM, Candide Kemmler wrote:
> Hi,
>
> I have a simple question rel the DocSlice class. I'm trying to use the (very
> handy) set operations on DocSlices and I'm rather confused by the way it
> behaves.
>
> I have 2 DocSlices, atlDocs which, by looking at the debugger, holds a
> "docs" array of ints of size 1; the second DocSlice is btlDocs, with a
> "docs" array of ints of size 67. I know that atlDocs is a subset of btlDocs,
> so doing btlDocs.andNotSize(atlDocs) should really return 66.
>
> But it's returning 10.
>
> Any idea what I'm understanding wrong here?
>
> Thanks in advance.
>
> Candide
>


Popular keywords statistics .

2009-07-03 Thread Alexander Wallace

Hi all!

I'd like to know if there is anything built into Solr that keeps track 
of the keywords being searched and has statistics on them.


If not, and in any case, I'd like to hear what approaches are being 
used to find out what people are searching for in their apps.


Thanks in advance!


Re: reindexed data on master not replicated to slave

2009-07-03 Thread solr jay
I tried it with the latest nightly build and got the same result.

Actually, that was the symptom that made me look at the index directory.
The same log messages repeated again and again, never ending.



2009/7/2 Noble Paul നോബിള്‍ नोब्ळ् 

> Jay, I see "updating index properties..." twice.
>
>
>
> This should happen rarely. In your case it should have happened only
> once, because you cleaned up the master only once.
>
>
> On Fri, Jul 3, 2009 at 6:09 AM, Otis
> Gospodnetic wrote:
> >
> > Jay,
> >
> > You didn't mention which version of Solr you are using.  It looks like
> some trunk or nightly version.  Maybe you can try the latest nightly?
> >
> >  Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> >
> > - Original Message 
> >> From: solr jay 
> >> To: solr-user@lucene.apache.org
> >> Sent: Thursday, July 2, 2009 9:14:48 PM
> >> Subject: reindexed data on master not replicated to slave
> >>
> >> Hi,
> >>
> >> When index data were corrupted on master instance, I wanted to wipe out
> all
> >> the index data and re-index everything. I was hoping the newly created
> index
> >> data would be replicated to slaves, but it wasn't.
> >>
> >> Here are the steps I performed:
> >>
> >> 1. stop master
> >> 2. delete the directory 'index'
> >> 3. start master
> >> 4. disable replication on master
> >> 5. index all data from scratch
> >> 6. enable replication on master
> >>
> >> It seemed from log file that the slave instances discovered that new
> index
> >> are available and claimed that new index installed, and then trying to
> >> update index properties, but looking into the index directory on slaves,
> you
> >> will find that no index data files were updated or added, plus slaves
> keep
> >> trying to get new index. Here are some from slave's log file:
> >>
> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Starting replication process
> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Number of files in latest snapshot in master: 69
> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Total time taken for download : 0 secs
> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Conf files are not downloaded or are in sync
> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
> modifyIndexProps
> >> INFO: New index installed. Updating index properties...
> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Master's version: 1246488421310, generation: 9
> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Slave's version: 1246385166228, generation: 56
> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Starting replication process
> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Number of files in latest snapshot in master: 69
> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Total time taken for download : 0 secs
> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
> fetchLatestIndex
> >> INFO: Conf files are not downloaded or are in sync
> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
> modifyIndexProps
> >> INFO: New index installed. Updating index properties...
> >>
> >>
> >> Is this process incorrect, or it is a bug? If the process is incorrect,
> what
> >> is the right process?
> >>
> >> Thanks,
> >>
> >> J
> >
> >
>
>
>
> --
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
>


Suggestions needed: Lots of updates for tiny changes

2009-07-03 Thread Development Team
Hi everybody,
 Let's say I had an index with 10M large-ish documents, and as people
logged into a website and viewed them the "last viewed date" was updated to
the current time. We index a document's last-viewed-date because we allow
users to a) search on this last-viewed-date alongside all other searchable
criteria, and b) we can order results of any search by the last-viewed-date.
 The problem is that in a given 5-minute period, we may have many
thousands of updated documents (due to this simple last-viewed-date). We
have a task that looks for changed documents, loads the full documents, and
then feeds them into Solr to update the index, but unfortunately reading
these changed documents and continually feeding them to Solr is generating *
far* more load on our system (both Solr and the database) than any of the
searches. In a given day, *we may have more updates to documents than we
have total documents indexed*. (Databases don't handle this well either, the
contention on rows for updates slows the database down significantly.)
 How should we approach this problem? It seems like such a waste of
resources to be doing so much work in applications/database/solr only for
last-viewed-dates.

 Solutions we've looked at include:
 1) Update only partial document. --Apparently this isn't supported in
Solr yet (we're using nightly Solr 1.4 builds currently).
 2) Use "near-real-time updates". --Not supported yet. Also, the
"freshness" of the data isn't as much of a concern as the sheer volume of
changes that we have to make here. For example, we could update Solr
less frequently, but then we'd just have many more documents to update. The
data only has to be, say, fresh to within 30 minutes.
 3) Use a separate index for the last-viewed-date. --This won't work
because we need to search on the last-viewed-date alongside other criteria,
and we use it as scoring criteria for all our searches.

 Any suggestions?

Sincerely,

 Daryl.


Re: Suggestions needed: Lots of updates for tiny changes

2009-07-03 Thread Otis Gospodnetic

I don't have a very specific suggestion, but I wonder if you could have a data 
structure that lives outside of the main index and keeps only these dates.  
Presumably this smaller data structure would be simpler/faster to update, and 
you'd just have to remain in sync with the main index (document-to-document 
mapping).  I think ParallelReader in Lucene is a similar approach, as is Solr's 
ExternalFileField.
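The ExternalFileField route might look roughly like this in schema.xml (a sketch; names are illustrative and details may differ by version):

```xml
<!-- Keep last-viewed values outside the main index so they can change
     without reindexing documents -->
<fieldType name="extLastViewed" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="pfloat"/>
<field name="last_viewed" type="extLastViewed" indexed="false" stored="false"/>
```

If I remember the mechanics right, the values come from a file such as external_last_viewed in the data directory, one id=value line per document, re-read when a new searcher opens. The catch: such a field is usable in function queries (sorting/boosting) but not as a regular search field.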

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Development Team 
> To: solr-user@lucene.apache.org
> Sent: Friday, July 3, 2009 4:46:37 PM
> Subject: Suggestions needed: Lots of updates for tiny changes
> 
> Hi everybody,
>  Let's say I had an index with 10M large-ish documents, and as people
> logged into a website and viewed them the "last viewed date" was updated to
> the current time. We index a document's last-viewed-date because we allow
> users to a) search on this last-viewed-date alongside all other searchable
> criteria, and b) we can order results of any search by the last-viewed-date.
>  The problem is that in a given 5-minute period, we may have many
> thousands of updated documents (due to this simple last-viewed-date). We
> have a task that looks for changed documents, loads the full documents, and
> then feeds them into Solr to update the index, but unfortunately reading
> these changed documents and continually feeding them to Solr is generating *
> far* more load on our system (both Solr and the database) than any of the
> searches. In a given day, *we may have more updates to documents than we
> have total documents indexed*. (Databases don't handle this well either, the
> contention on rows for updates slows the database down significantly.)
>  How should we approach this problem? It seems like such a waste of
> resources to be doing so much work in applications/database/solr only for
> last-viewed-dates.
> 
>  Solutions we've looked at include:
>  1) Update only partial document. --Apparently this isn't supported in
> Solr yet (we're using nightly Solr 1.4 builds currently).
>  2) Use "near-real-time updates". --Not supported yet. Also, the
> "freshness" of the data isn't as much as concern as the sheer volume of
> changes that we have to make here. For example, we could update Solr
> less-fequently, but then we'd just have many more documents to update. The
> data only has to be, say, fresh to within 30 minutes.
>  3) Use a separate index for the last-viewed-date. --This won't work
> because we need to search on the last-viewed-date alongside other criteria,
> and we use it as scoring criteria for all our searches.
> 
>  Any suggestions?
> 
> Sincerely,
> 
>  Daryl.



Re: Implementing PhraseQuery and MoreLikeThis Query in one app

2009-07-03 Thread Otis Gospodnetic

Sergey,

I think I confused you.  The comment about the fields listed in the "fl" 
parameter has nothing to do with the SolrJ calls not working.

For the SolrJ calls not working, my suggestion is to look at the logs and compare 
the GetMethod call with the SolrJ call.  Paste them if you want more people to 
look at them.


Otis 
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: SergeyG 
> To: solr-user@lucene.apache.org
> Sent: Friday, July 3, 2009 4:08:37 AM
> Subject: Re: Implementing PhraseQuery and MoreLikeThis Query in one app
> 
> 
> Otis,
> 
> Thanks a lot. I'd certainly follow your advice and check the logs. Although,
> I must say that I've already tried all possible variations of the string for
> the "fl" parameter (spaces, commas, plus signs). More than that - the query
> still doesn't want to fetch any docs (other than the one with the id
> specified in the query) even when the line solrQuery.setParam("fl", "title
> author score"); is commented out. So I suspect that the problem is that the
> request with the url
> "http://localhost:8080/solr/select?q=id:1&mlt=true&mlt.fl=content&..." due
> to some reason doesn't work properly. And when I use the GetMethod(url)
> approach and send url directly in the form
> "http://localhost:8080/solr/mlt?q=id:1&mlt.fl=content&...", Solr picks up
> the mlt component. (At least, I'll have this backup solution if the main one
> keeps committing sabotage. :) I'll just need to add a parser for an incoming
> xml-response.)
> 
> I'll continue my "research" of this issue and, if you're interested in
> results, I'll definitely let you know.
> 
> Cheers,
> Sergey
> 
> 
> Otis Gospodnetic wrote:
> > 
> > 
> > Sergey,
> > 
> > Glad to hear the suggestion worked!
> > 
> > I can't spot the problem (though I think you want to use a comma to
> > separate the list of fields in the fl parameter value).
> > I suggest you look at the servlet container logs and Solr logs and compare
> > requests that these two calls make.  Once you see what how the second one
> > is different from the first one, you will probably be able to figure out
> > how to adjust the second one to produce the same results as the first one.
> > 
> >  Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > 
> > 
> > 
> > - Original Message 
> >> From: SergeyG 
> >> To: solr-user@lucene.apache.org
> >> Sent: Thursday, July 2, 2009 6:17:59 PM
> >> Subject: Re: Implementing PhraseQuery and MoreLikeThis Query in one app
> >> 
> >> 
> >> Otis,
> >> 
> >> Your recipe does work: after copying an indexing field and excluding stop
> >> words the MoreLikeThis query started fetching meaningful results. :)
> >> 
> >> Just one issue remained. 
> >> 
> >> When I execute query in this way:
> >> 
> >> String query = "q=id:1&mlt.fl=content&...&fl=title+author+score";
> >> HttpClient client = new HttpClient();
> >> GetMethod get = new GetMethod("http://localhost:8080/solr/mlt");
> >> get.setQueryString(query);
> >> client.executeMethod(get);
> >> ...
> >> 
> >> it works fine bringing results as an XML string. 
> >> 
> >> But when I use "Solr-like" approach:
> >> 
> >> String query = "id:1";
> >> solrQuery.setQuery(query);
> >> solrQuery.setParam("mlt", "true");
> >> solrQuery.setParam("mlt.fl", "content");
> >> solrQuery.setParam("fl", "title author score");
> >> QueryResponse queryResponse = server.query( solrQuery );
> >> 
> >> the result contains only one doc with id=1 and no other "more like" docs. 
> >> 
> >> In my solrconfig.xml, I have these settings: 
> >> ...
> >> 
> >> ...
> >> 
> >> I guess it all is a matter of syntax but I can't figure out what's wrong.
> >> 
> >> Thank you very much (and again, thanks to Michael and Walter).
> >> 
> >> Cheers,
> >> Sergey
> >> 
> >> 
> >> 
> >> Michael Ludwig-4 wrote:
> >> > 
> >> > SergeyG schrieb:
> >> > 
> >> >> Can both queries - PhraseQuery and MoreLikeThis Query - be implemented
> >> >> in the same app taking into account the fact that for the former to
> >> >> work the stop words list needs to be included and this results in the
> >> >> latter putting stop words among the most important words?
> >> > 
> >> > Why would the inclusion of a stopword list result in stopwords being of
> >> > top importance in the MoreLikeThis query?
> >> > 
> >> > Michael Ludwig
> >> > 
> >> > 
> >> 
> >> -- 
> >> View this message in context: 
> >> 
> http://www.nabble.com/Implementing-PhraseQuery-and-MoreLikeThis-Query-in-one-app-tp24303817p24314840.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> > 
> > 
> > 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Implementing-PhraseQuery-and-MoreLikeThis-Query-in-one-app-tp24303817p24319269.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Is it problem? I use solr to search and index is made by lucene. (not EmbeddedSolrServer(wiki is old))

2009-07-03 Thread Otis Gospodnetic

Yes, it could be a problem.  For some reason, over the last few days several 
people have mentioned trying to build indices with Lucene directly.  Not sure 
why that's so attractive...

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: James liu 
> To: solr-user@lucene.apache.org
> Sent: Friday, July 3, 2009 2:07:33 AM
> Subject: Re: Is it problem? I use solr to search and index is made by lucene. 
>  (not EmbeddedSolrServer(wiki is old))
> 
> Solr has many field types, like integer, long, double, sint, sfloat,
> tint, tfloat, and more.
> 
> But Lucene has no field types, just a name and a value, and the value is
> only a string.
> 
> So I'm not sure whether it's a problem when I use Solr to search (with an
> index made by Lucene).
> 
> 
> 
> -- 
> regards
> j.L ( I live in Shanghai, China)