Re: dismax and long phrases

2008-10-09 Thread Norberto Meijome
On Tue, 07 Oct 2008 09:27:30 -0700
Jon Drukman <[EMAIL PROTECTED]> wrote:

> > Yep, you can "fake" it by only using fieldsets (qf) that have a 
> > consistent set of stopwords.  
> 
> does that mean changing the query or changing the schema?

Jon,
- You change schema.xml to define the type of each field. The fieldType determines 
whether stopwords are applied or not.
- You change solrconfig.xml to define which fields dismax will query.

I don't think you should have to change your query.

b

_
{Beto|Norberto|Numard} Meijome

"Mix a little foolishness with your serious plans;
it's lovely to be silly at the right moment."
   Horace

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Solr 1.3 try to fire delta-import.

2008-10-09 Thread sunnyfr

Hi Erik,

That's exactly what I did:

/data/solr
-rwxr-xr-x 1 tomcat55 nogroup    1955 Oct  8 16:28 gc.log
drwxr-xr-x 4 tomcat55 root       4096 Oct  1 13:43 group
drwxr-xr-x 2 tomcat55 root       4096 Oct  8 16:27 lib
-rwxr-xr-x 1 tomcat55 root        191 Oct  1 15:44 solr-jndi.xml
-rwxr-xr-x 1 tomcat55 root    3738412 Oct  8 16:28 solr.war
-rwxr-xr-x 1 root     root        182 Oct  8 15:43 solr.xml
drwxr-xr-x 4 tomcat55 root       4096 Oct  1 13:43 user
drwxr-xr-x 8 tomcat55 root       4096 Oct  8 18:13 book

/data/solr/lib
-rwxr-xr-x 1 root root  89144 Oct  8 16:26 apache-solr-common-1.4-dev.jar
-rwxr-xr-x 1 root root 921040 Oct  8 16:26 apache-solr-core-1.4-dev.jar
-rwxr-xr-x 1 root root 146642 Oct  8 16:26 apache-solr-dataimporthandler-1.4-dev.jar
-rwxr-xr-x 1 root root  97780 Oct  8 16:26 apache-solr-solrj-1.4-dev.jar
-rwxr-xr-x 1 root root 655620 Oct  8 16:27 mysql-connector-java-5.1.5.jar

The "Welcome to Solr" page works, and when I select my core (book) that works too.
So http://solr-test.adm.bookclub.com:8180/solr/book/admin/ works fine,
but http://solr-test.adm.bookclub.com:8180/solr/book/dataimport does not.

It's as if a link were missing, or something else I don't know about.
Thanks a lot for your help,



Erik Hatcher wrote:
> 
> Maybe, and I haven't followed your details sorry, an issue is the lib  
> directory for plugins.  Under the solr home directory you can put a  
> lib/ subdirectory with the data import handler JAR and its  
> dependencies (that aren't already in the WAR's WEB-INF/lib).
> 
> To deploy Solr into Tomcat, you just need a binary solr.war, you don't  
> need to be repackaging it yourself - unless you're doing something  
> highly custom.  And if you do need to do something highly custom, I still  
> strongly suggest you back up a few steps and get things working first.
> 
>   Erik
> 
> On Oct 8, 2008, at 5:55 PM, sunnyfr wrote:
> 
>>
>> Sorry, but that is not my choice to make; very sorry.
>> I will try to make it work with tomcat55. It used to work with Solr 1.2,
>> so I must be missing something for the import: a parameter, a path,
>> or something else?
>>
>> thanks,
>>
>>
>> Erik Hatcher wrote:
>>>
>>> Wouldn't life be simpler if you simply used Solr's Jetty container
>>> configuration, at least to start with?
>>>
>>> Is Tomcat a requirement for some reason?   You're struggling with
>>> things that "just work" out of the box with Solr, it seems, and I'm
>>> wondering why change around what works.
>>>
>>> Erik
>>>
>>>
>>> On Oct 8, 2008, at 11:06 AM, sunnyfr wrote:
>>>

 More information :

 [EMAIL PROTECTED]:/usr/share/tomcat5.5/webapps/solr/WEB-INF/lib# ls
 README.committers.txt
 apache-solr-common-1.4-dev.jar
 apache-solr-core-1.4-dev.jar
 apache-solr-dataimporthandler-1.4-dev.jar
 apache-solr-solrj-1.4-dev.jar
 commons-codec-1.3.jar
 commons-csv-1.0-SNAPSHOT-r609327.jar
 commons-fileupload-1.2.jar
 commons-httpclient-3.1.jar
 commons-io-1.3.1.jar
 commons-logging-1.0.4.jar
 geronimo-stax-api_1.0_spec-1.0.1.jar
 junit-4.3.jar
 lucene-analyzers-2.4-dev.jar
 lucene-core-2.4-dev.jar
 lucene-highlighter-2.4-dev.jar
 lucene-memory-2.4-dev.jar
 lucene-queries-2.4-dev.jar
 lucene-snowball-2.4-dev.jar
 lucene-spellchecker-2.4-dev.jar
 mysql-connector-java-5.1.5.jar
 slf4j-api-1.5.3.jar
 slf4j-jdk14-1.5.3.jar
 solr-commons-csv-pom.xml.template
 solr-lucene-analyzers-pom.xml.template
 solr-lucene-contrib-pom.xml.template
 solr-lucene-core-pom.xml.template
 solr-lucene-highlighter-pom.xml.template
 solr-lucene-queries-pom.xml.template
 solr-lucene-snowball-pom.xml.template
 solr-lucene-spellchecker-pom.xml.template
 stax-utils.jar
 wstx-asl-3.2.7.jar
 [EMAIL PROTECTED]:/usr/share/tomcat5.5/webapps/solr/WEB-INF/lib# q




 Shalin Shekhar Mangar wrote:
>
> Is the DataImportHandler defined in the solrconfig.xml for the
> "video"
> core?
>
> On Wed, Oct 8, 2008 at 7:26 PM, sunnyfr <[EMAIL PROTECTED]>  
> wrote:
>
>>
>> Hi,
>>
>> I have a weird problem: my Solr seems to be running.
>> When I go to:
>> http://solr-test.adm.bookclub.com:8180/solr/books/admin/
>>
>> I've a proper page :
>> Solr Admin (videos)
>> solr-test.adm.bookclub.com:8180
>> cwd=/data/solr SolrHome=/data/solr/books/
>>
>> Even
>> http://solr-test.adm.dailymotion.com:8180/solr/video/admin/stats.jsp
>> Bring me back a good page with proper information inside.
>>
>> BUT :) when i go through :
>>
>> http://solr-test.adm.dailymotion.com:8180/solr/video/dataimport?command=delta-import&entity=b
>> or
>> http://solr-test.adm.dailymotion.com:8180/solr/video/dataimport
>> I've HTTP Status 404 - /solr/video/dataimport
>>
>> My config is :
>> Java version : 1.6.0
>> M

Re: Problem in using Unique key

2008-10-09 Thread Norberto Meijome
On Wed, 8 Oct 2008 03:45:20 -0700 (PDT)
con <[EMAIL PROTECTED]> wrote:

> But in that case, while doing a full-import I am getting the following
> error:
> 
> org.apache.solr.common.SolrException: QueryElevationComponent requires the
> schema to have a uniqueKeyField 

Con, if you don't use the Query Elevation component, you can disable it in 
solrconfig.xml. Not sure why a uniqueKey field is needed for it, though.

b

_
{Beto|Norberto|Numard} Meijome

"First they ignore you, then they laugh at you, then they fight you, then you 
win."
  Mahatma Gandhi.



Re: Solr 1.3 try to fire delta-import.

2008-10-09 Thread sunnyfr

Here are my logs as well:
http://www.nabble.com/file/p19895639/syslog

I really don't know. I also looked for the dataimport.jsp file, and I can
find it:
/var/lib/tomcat5.5/webapps/solr/admin/dataimport.jsp

Maybe my link is wrong. I also tried:
http://solr-test.adm.bookclub.com:8180/solr/book/admin/dataimport
and again:
http://solr-test.adm.bookclub.com:8180/solr/book/dataimport

If I try admin/ping it works... any idea?
-- 
View this message in context: 
http://www.nabble.com/Solr-1.3-try-to-fire-delta-import.-tp19879259p19895639.html
Sent from the Solr - User mailing list archive at Nabble.com.



sub skus with colour and size

2008-10-09 Thread simon123

Hi

Please forgive my ignorance; I'm a complete newbie with Solr and struggling
to find any concrete information online.

I'm wanting to build a search for shoes, which from searching the archives I
can see others have been trying to do, but without a clear indication of
how.

I'm attempting to use solrsharp so we can use this with our inhouse .net
applications.

Every product we have comes in colour and size combinations, I need to do a
faceted search on these that allows for colour and size and various other
fields. A single product may have multiple colours and multiple sizes. 

For example a style might be available in black size 12, but also have other
sizes in red. If someone searches for red and size 12, it should not bring
the product as that combination is not possible.

I've seen from other threads that people have tried creating a separate document
for each variation and grouping them together, but I have found no examples of
actually making this work.

I don't even know where to start with this - I've gone through the online
documentation and it doesn't really cover this scenario.

If anyone has any suggestions, I'd be most grateful.





Re: Solr 1.3 try to fire delta-import.

2008-10-09 Thread Shalin Shekhar Mangar
As I said earlier too, you must register the DataImportHandler in your
solrconfig.xml

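If the solrconfig.xml does not have the following lines, you should add it:

<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/home/username/data-config.xml</str>
  </lst>
</requestHandler>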

Here data-config.xml should be the one you are using for your core.

On Thu, Oct 9, 2008 at 3:19 PM, sunnyfr <[EMAIL PROTECTED]> wrote:

>
> This is as well my logs :
> http://www.nabble.com/file/p19895639/syslog syslog
>
> I don't know really, I looked for as well dataimport.jsp file ... and I can
> find it :
> /var/lib/tomcat5.5/webapps/solr/admin/dataimport.jsp
>
> Maybe it's my link which is bad, I tried as well with :
> http://solr-test.adm.bookclub.com:8180/solr/book/admin/dataimport
> and again
> http://solr-test.adm.bookclub.com:8180/solr/book/dataimport
>
> if I try admin/ping it works .. no idea?
> --
> View this message in context:
> http://www.nabble.com/Solr-1.3-try-to-fire-delta-import.-tp19879259p19895639.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Regards,
Shalin Shekhar Mangar.


Re: feeding documents tru API

2008-10-09 Thread Shalin Shekhar Mangar
Take a look at http://wiki.apache.org/solr/Solrj

On Thu, Oct 9, 2008 at 5:08 PM, Cam Bazz <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I have been looking at the API documentation but I dont know where to
> look in order to feed documents tru API without using xml files.
>
> any ideas?
>
> Best.
> -C.B.
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: Need help with more than just one index

2008-10-09 Thread Kraus, Ralf | pixelhouse GmbH

Hannes Carl Meyer wrote:

Hi Ralf,

since Solr 1.3 it is possible to run multiple cores (indexes) inside a
single deployment, please check:

http://wiki.apache.org/solr/MultipleIndexes

It is not only about separating indexes but also about having different
configurations, index and query analyzers, etc.
  

Thx a lot Hannes !

Greets -Ralf-


Need help with more than just one index

2008-10-09 Thread Kraus, Ralf | pixelhouse GmbH

Hello,

I am wondering whether it is possible to use Solr with more than one 
index. Could I switch to another index when I want to search another 
context?

for example :

searching for books : use index1 (schema1.xml)
searching for magazines : use index 2 (schema2.xml)

please help me...

--
Greets -Ralf-



Re: spellcheck: issues

2008-10-09 Thread Grant Ingersoll


On Oct 8, 2008, at 6:20 PM, Jason Rennie wrote:

On Wed, Oct 8, 2008 at 3:31 PM, Jason Rennie <[EMAIL PROTECTED]>  
wrote:


I just tried J-W and *yes* it seems to do a much better job!  I'd certainly
vote for that becoming the default :)

Ack!  I did some more testing and J-W results started to get weird
(including suggesting "courses" for "coursets" even though "corsets" is 4x
as frequent as "courses", and "nylo" for "nylom" even though "nylon" is 200x
more frequent than "nylo").  The default measure got these right.  Does J-W
use frequency information at all?



Sorting in the SpellChecker is handled by the SuggestWord.compareTo()  
method in Lucene.  It looks like:

  public final int compareTo(SuggestWord a) {
    // first criteria: the edit distance
    if (score > a.score) {
      return 1;
    }
    if (score < a.score) {
      return -1;
    }

    // second criteria (if first criteria is equal): the popularity
    if (freq > a.freq) {
      return 1;
    }

    if (freq < a.freq) {
      return -1;
    }
    return 0;
  }

I could see you opening a JIRA issue in Lucene against the SC to make  
it so that the sorting could be overridden/pluggable.  A patch to do  
so would be even better ;-)
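
For illustration only, a pluggable ordering could look roughly like the sketch
below. It is not Lucene code: Suggestion is a hypothetical stand-in for
SuggestWord, the scores and frequencies are made-up numbers, and the blended
ordering (score plus a log-scaled frequency bonus) is just one assumed
alternative to the score-then-frequency order quoted above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for Lucene's SuggestWord (not the real class).
final class Suggestion {
    final String word;
    final float score; // similarity to the input; higher means closer
    final int freq;    // how often the suggested term occurs in the index

    Suggestion(String word, float score, int freq) {
        this.word = word;
        this.score = score;
        this.freq = freq;
    }
}

final class SuggestionOrdering {
    // Same order as the compareTo() quoted above: score first, frequency breaks ties.
    static final Comparator<Suggestion> SCORE_THEN_FREQ =
            Comparator.<Suggestion>comparingDouble(s -> s.score)
                      .thenComparingInt(s -> s.freq);

    // Assumed alternative: add a log-scaled frequency bonus to the score.
    static Comparator<Suggestion> blended(double freqWeight) {
        return Comparator.comparingDouble(
                (Suggestion s) -> s.score + freqWeight * Math.log10(1 + s.freq));
    }

    // Returns the word ranked highest by the given ordering, or null if there is none.
    static String best(List<Suggestion> candidates, Comparator<Suggestion> order) {
        return candidates.stream().max(order).map(s -> s.word).orElse(null);
    }

    public static void main(String[] args) {
        // Made-up numbers, loosely modelled on the "coursets" case above.
        List<Suggestion> candidates = Arrays.asList(
                new Suggestion("courses", 0.95f, 100),
                new Suggestion("corsets", 0.93f, 400));
        System.out.println(best(candidates, SCORE_THEN_FREQ));
        System.out.println(best(candidates, blended(0.05)));
    }
}
```

With a strict score-first order, frequency only matters on exact ties, which
would explain a slightly closer but much rarer word winning. How much weight
frequency should get is a tuning question this sketch does not answer.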


Cheers,
Grant


populating a spellcheck dictionary

2008-10-09 Thread Matt Mitchell
I'm starting to implement the new SpellCheckComponent. The solr 1.3 dist
example is using a file based dictionary, but I'd like to figure out the
best way to populate the dictionary from our index. Should the spellcheck
field be multivalued?

Thanks,
Matt


sint in schema.xml

2008-10-09 Thread sanraj25

Hi,
  I create own field name using integer field type and sint field
type(solr.SortableIntField) in schema.xml.
i can't differentiate between these two field type. When this sint exactly
use? If we use sint how it is sortable? I test by {sort =field name} in
query window .but it's not work properly.please tell me with clear example
thanks in advance

-sanraj




Re: feeding documents tru API

2008-10-09 Thread Kraus, Ralf | pixelhouse GmbH

Cam Bazz wrote:

Hello,

I have been looking at the API documentation but I dont know where to
look in order to feed documents tru API without using xml files.

any ideas?

  

Look for the "SolrIndexWriter" class...

http://lucene.apache.org/solr/api/org/apache/solr/update/SolrIndexWriter.html

--
Greets -Ralf-



Re: Solr 1.3 try to fire delta-import.

2008-10-09 Thread sunnyfr

Brilliant :)
My bad, I thought it would have been there by default.
Sorry and thanks a lot,


Shalin Shekhar Mangar wrote:
> 
> As I said earlier too, you must register the DataImportHandler in your
> solrconfig.xml
> 
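> If the solrconfig.xml does not have the following lines, you should add
> it:
> 
> <requestHandler name="/dataimport"
>     class="org.apache.solr.handler.dataimport.DataImportHandler">
>   <lst name="defaults">
>     <str name="config">/home/username/data-config.xml</str>
>   </lst>
> </requestHandler>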
> 
> Here data-config.xml should be the one you are using for your core.
> 
> On Thu, Oct 9, 2008 at 3:19 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
> 
>>
>> This is as well my logs :
>> http://www.nabble.com/file/p19895639/syslog syslog
>>
>> I don't know really, I looked for as well dataimport.jsp file ... and I
>> can
>> find it :
>> /var/lib/tomcat5.5/webapps/solr/admin/dataimport.jsp
>>
>> Maybe it's my link which is bad, I tried as well with :
>> http://solr-test.adm.bookclub.com:8180/solr/book/admin/dataimport
>> and again
>> http://solr-test.adm.bookclub.com:8180/solr/book/dataimport
>>
>> if I try admin/ping it works .. no idea?
>> --
>> View this message in context:
>> http://www.nabble.com/Solr-1.3-try-to-fire-delta-import.-tp19879259p19895639.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 




Re: populating a spellcheck dictionary

2008-10-09 Thread Grant Ingersoll
The example in example/solr/conf/solrconfig.xml should show a couple  
of different options:




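<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">textSpell</str>

  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker1</str>
  </lst>

  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">spell</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">./spellchecker2</str>
  </lst>

  <lst name="spellchecker">
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="name">file</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>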

The first two are index based.

The spell field for the example is:
<field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>


HTH,
Grant

On Oct 9, 2008, at 9:38 AM, Matt Mitchell wrote:

I'm starting to implement the new SpellCheckComponent. The solr 1.3 dist
example is using a file based dictionary, but I'd like to figure out the
best way to populate the dictionary from our index. Should the spellcheck
field be multivalued?

Thanks,
Matt


--
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ










Re: Need help with more than just one index

2008-10-09 Thread Hannes Carl Meyer
Hi Ralf,

since Solr 1.3 it is possible to run multiple cores (indexes) inside a
single deployment, please check:

http://wiki.apache.org/solr/MultipleIndexes

It is not only about separating indexes but also about having different
configurations, index and query analyzers, etc.

Regards

Hannes

2008/10/9 Kraus, Ralf | pixelhouse GmbH <[EMAIL PROTECTED]>

> Hello,
>
> I am wondering if there is a chance to use solr with more than just one
> index ? Is there a chance a could switch to another index if
> I want to search another context ?
>
> for example :
>
> searching for books : use index1 (schema1.xml)
> searching for magazines : use index 2 (schema2.xml)
>
> please help me...
>
> --
> Greets -Ralf-
>
>


Re: feeding data

2008-10-09 Thread Cam Bazz
Hello Erik,

I am especially interested in how to integrate it into a glassfish/ejb3
environment.

In the past, I have used something like a proxy servlet to forward the
request and return the response. It is rather bothersome.

Also, for indexing I need some sort of API access.

Has anyone integrated Solr into a servlet/ejb3 based system?

Best Regards,
-C.B.


On Thu, Sep 4, 2008 at 3:32 PM, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> On Sep 4, 2008, at 8:27 AM, Cam Bazz wrote:
>>
>> hello,
>> is there no other way then making xml files and feeding those to solr?
>>
>> I just want to feed solr programmatically. - without xml
>
> There are several options.  You can feed Solr XML, or CSV, or use any of the
> Solr client APIs (though those use XML under the covers for indexing
> documents, but transparently).  A more advanced option is to use Solr in
> embedded mode where you use its Java API directly with no intermediate
> representation needed.
>
>Erik
>
>


Re: populating a spellcheck dictionary

2008-10-09 Thread Matt Mitchell
Woops, I was looking at the wrong example solrconfig.xml

Thanks Grant!

Matt

On Thu, Oct 9, 2008 at 10:01 AM, Grant Ingersoll <[EMAIL PROTECTED]>wrote:

> The example in example/solr/conf/solrconfig.xml should show a couple of
> different options:
>
> 
>
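> <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
>
>   <str name="queryAnalyzerFieldType">textSpell</str>
>
>   <lst name="spellchecker">
>     <str name="name">default</str>
>     <str name="field">spell</str>
>     <str name="spellcheckIndexDir">./spellchecker1</str>
>   </lst>
>
>   <lst name="spellchecker">
>     <str name="name">jarowinkler</str>
>     <str name="field">spell</str>
>     <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
>     <str name="spellcheckIndexDir">./spellchecker2</str>
>   </lst>
>
>   <lst name="spellchecker">
>     <str name="classname">solr.FileBasedSpellChecker</str>
>     <str name="name">file</str>
>     <str name="sourceLocation">spellings.txt</str>
>     <str name="characterEncoding">UTF-8</str>
>     <str name="spellcheckIndexDir">./spellcheckerFile</str>
>   </lst>
> </searchComponent>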
>
> The first two are index based.
>
> The spell field for the example is:
> <field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>
>
> HTH,
> Grant
>
>
> On Oct 9, 2008, at 9:38 AM, Matt Mitchell wrote:
>
>  I'm starting to implement the new SpellCheckComponent. The solr 1.3 dist
>> example is using a file based dictionary, but I'd like to figure out the
>> best way to populate the dictionary from our index. Should the spellcheck
>> field be multivalued?
>>
>> Thanks,
>> Matt
>>
>
> --
> Grant Ingersoll
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
>
>
>
>
>
>
>
>


feeding documents tru API

2008-10-09 Thread Cam Bazz
Hello,

I have been looking at the API documentation but I dont know where to
look in order to feed documents tru API without using xml files.

any ideas?

Best.
-C.B.


solr 1.3 list of language managed org.apache.lucene.analysis

2008-10-09 Thread sunnyfr

Hi,

I'm using Solr 1.3 and I would like to know where I can find the list of
languages supported by Solr, like Greek in the example:
org.apache.lucene.analysis.el.GreekAnalyzer.

Thanks a lot,




solr.SynonymFilterFactory

2008-10-09 Thread sunnyfr

Hi guys,

Just to check: the schema.xml comments say synonyms will be used
at query time.
So once the files are indexed, is it too late? How does this really work for
synonyms, stopwords, protwords, spellings ...
Should I fill them in before indexing the data, or is it too late once that is done?

Thanks, 




dismax and stopwords (was Re: dismax and long phrases)

2008-10-09 Thread Jon Drukman

Norberto Meijome wrote:

On Tue, 07 Oct 2008 09:27:30 -0700
Jon Drukman <[EMAIL PROTECTED]> wrote:

Yep, you can "fake" it by only using fieldsets (qf) that have a 
consistent set of stopwords.  

does that mean changing the query or changing the schema?


Jon,
- you change schema.xml to define which type each field is. The fieldType says 
whether you have stopwords or not.
- you change solrconfig.xml to define which fields will dismax query on.

i dont think you should have to change your query.


i got it to work.  the solution is:

add a new field to the schema without stopwords, i use the following type:

  positionIncrementGap="100">

  


  



then use copyField to copy the stopworded version to a second, 
non-stopworded field.  add the non-stopword field to the dismax qf and 
pf fields.  in this example, the stopword field is name and the 
non-stopword field is name_text:


 
<str name="qf">
name^1.5 name_text^1.8 description^1.0 tags^0.5 location^0.6 user_name^0.4 misc^0.3 group_name^1.5
</str>
<str name="pf">
name^1.5 name_text^1.8 description^1.0 group_name^1.5
</str>
 


restart solr and reindex everything.  it now works.

thanks for all the help!

-jsd-



RE: sub skus with colour and size

2008-10-09 Thread Ensdorf Ken


> Every product we have comes in colour and size combinations,
> I need to do a
> faceted search on these that allows for colour and size and
> various other
> fields. A single product may have multiple colours and multiple sizes.
>
> For example a style might be available in black size 12, but
> also have other
> sizes in red. If someone searches for red and size 12, it
> should not bring
> the product as that combination is not possible.

I'm no expert, but one way to do this would be to have a multi-valued field 
with all the possible combinations, eg if you have the following in your data:


red
10,12


black
8,10


you could create a Solr doc with a multivalued "color" field:

color_red size_10 size_12
color_black size_8 size_10

Then if you set the "positionIncrementGap" in your schema to a sufficiently 
high value (say 1000), you can use the following query to search for a color 
size combination:

color:"color_red size_10"~1000

which executes a phrase search with a slop factor of 1000, ensuring it won't 
cross the field boundary
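
To see why the gap matters, here is a rough, self-contained sketch. It is a
simplified model, not Lucene code: it assumes tokens within a value advance one
position at a time, that each additional value first skips positionIncrementGap
positions, and that a two-term sloppy phrase matches when the terms, after
subtracting their offsets in the query, lie within the slop of each other. The
class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class VariantPhraseModel {
    // Assigns a position to every token of a multi-valued field.
    // Assumption: each value after the first starts `gap` positions further on.
    static Map<String, List<Integer>> positions(List<String> values, int gap) {
        Map<String, List<Integer>> index = new HashMap<>();
        int position = 0;
        boolean firstValue = true;
        for (String value : values) {
            if (!firstValue) {
                position += gap;
            }
            firstValue = false;
            for (String token : value.trim().split("\\s+")) {
                index.computeIfAbsent(token, t -> new ArrayList<>()).add(position);
                position++;
            }
        }
        return index;
    }

    // Simplified sloppy match for the two-term phrase "first second"~slop.
    static boolean matches(Map<String, List<Integer>> index,
                           String first, String second, int slop) {
        for (int p1 : index.getOrDefault(first, Collections.<Integer>emptyList())) {
            for (int p2 : index.getOrDefault(second, Collections.<Integer>emptyList())) {
                int shiftedFirst = p1;      // query offset 0
                int shiftedSecond = p2 - 1; // query offset 1
                if (Math.abs(shiftedFirst - shiftedSecond) <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> color = Arrays.asList(
                "color_red size_10 size_12",
                "color_black size_8 size_10");
        Map<String, List<Integer>> wide = positions(color, 1000);
        System.out.println(matches(wide, "color_red", "size_10", 1000));
        System.out.println(matches(wide, "color_red", "size_8", 1000));
        // With a small gap the same slop reaches into the neighbouring value.
        Map<String, List<Integer>> narrow = positions(color, 100);
        System.out.println(matches(narrow, "color_red", "size_8", 1000));
    }
}
```

In this model the gap has to be at least as large as the slop, otherwise a
colour from one value can pair with a size from the next one.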

hope this helps!
-Ken


Re: Using the more like this feature in solrj.

2008-10-09 Thread Erik Holstad
Thanks Bruce!
That worked very well.

Erik

On Wed, Oct 8, 2008 at 9:14 PM, Bruce Ritchie <[EMAIL PROTECTED]>wrote:

> Erik,
>
> I just got this to work myself and the documentation was only partially
> helpful in figuring it out. Two main points on making this work via solrj:
>
> #1 - Define the mlt handler in solrconfig.xml (it's not defined in the
> example solrconfig.xml I was using):
>
> <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
>
> #2 - with Solrj, access the mlt handler via something similar to the
> following:
>
> query.setQueryType("/" + MoreLikeThisParams.MLT);
> query.set(MoreLikeThisParams.MATCH_INCLUDE, false);
> query.set(MoreLikeThisParams.MIN_DOC_FREQ, 1);
> query.set(MoreLikeThisParams.MIN_TERM_FREQ, 1);
> query.set(MoreLikeThisParams.SIMILARITY_FIELDS, "subject,body");
> query.setQuery("Your query here or in my case the unique key field:value");
>
> Note that the two lines:
>
> query.set(MoreLikeThisParams.MIN_DOC_FREQ, 1);
> query.set(MoreLikeThisParams.MIN_TERM_FREQ, 1);
>
> seem to be required for mlt to work - not sure why. Also, the fields that
> you use to determine similarity should be stored with termVectors=true and
> stored=true.
>
>
> All the best,
>
> Bruce Ritchie
>
>
> -Original Message-
> From: Erik Holstad [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 08, 2008 9:46 PM
> To: solr-user@lucene.apache.org
> Subject: Using the more like this feature in solrj.
>
> Hi!
> Have been going though the documentation for the more like this/these
> feature but haven't found anything about how to use it in Solrj.
>
> Regards Erik
>
>


Re: Quick RSS feed questions

2008-10-09 Thread Chris Hostetter

: 1) Where can I find docs on how to get Solr to feed RSS directly?

if you mean "consume rss" you should take a look at the DataImportHandler 
-- it lets you configure XPath expressions for extracting 
documents/fields from xml files.  If you mean "produce rss" the 
XSLTResponseWriter can apply a stylesheet server side to generate RSS 
(that's why the example rss XSLT exists)

:  However, how do you handle RSS feeds for indexes where data can be both
: added and removed? For example, if I want to have an RSS feed of users on my
: site, I want new users to show up as new items in the RSS feed as they come
: along. However, users don't stick around forever, they can also disappear
: from the database. Similarly, users can change their information and thus
: they may not match a particular query anymore (and would thus disappear from
: the RSS feed, right?).
:  Wouldn't this cause havoc for RSS readers if results changed often?

This question is a little out of the scope of a Solr discussion.  The 
potential issues you describe with trying to serve data using RSS are 
going to be something you'll have to consider regardless of whether the RSS 
is generated from a Solr result, or a database result, or even a flat file 
on disk that you edit by hand.  It's all very dependent on what 
your clients are expecting to see in that feed in each of the situations 
(delete/edit) that you describe.

If you can decide how you *want* the resulting feed to look in each of 
those use cases, you can probably model the Solr Documents in a way to 
represent it.



-Hoss



Re: dismax and long phrases

2008-10-09 Thread Mike Klaas


On 7-Oct-08, at 9:27 AM, Jon Drukman wrote:


Mike Klaas wrote:

On 6-Oct-08, at 11:20 AM, Jon Drukman wrote:


is there any way i could 'fake' it by adding a second field  
without stopwords, or something like that?
Yep, you can "fake" it by only using fieldsets (qf) that have a  
consistent set of stopwords.


does that mean changing the query or changing the schema?


As you already found out, changing the schema (and consequently the  
fields you query).  You can also have fields that have two versions,  
stopped and non-stopped.  Just make sure that the fields you  
query at a given time are all stopped or all non-stopped.


cheers,
-Mike


Re: How stop properly solr to modify solrconfig or ... files

2008-10-09 Thread Chris Hostetter

you need to use whatever mechanism your servlet container has for shutting 
down cleanly -- depending on how you started tomcat, that might be hitting 
Ctrl-C in a terminal, or it might be running a "stop" command.

: I did a full import and put adaptive parameter for mySql to avoid OOM error.

If you did something to avoid an OOM error that suggests maybe you 
recently had an OOM error -- in that case your index may have been left in 
an unusable state.

i thought the fsync additions to the version of Lucene in Solr 1.3 
prevented situations like this (ensuring that your index was still usable, 
even if some documents were lost) but i also seem to recall you mentioning 
recently you were using an older nightly build of Solr -- and/or maybe 
that feature doesn't work on certain filesystems you're using.  I'm 
not sure but i thought i'd mention it.


-Hoss



Re: solr 1.3 list of language managed org.apache.lucene.analysis

2008-10-09 Thread Chris Hostetter

: I'm using solr1.3 and I would like to know where can I find a place where
: you have the list of the language managed by solr :
: like for greek in the example : org.apache.lucene.analysis.el.GreekAnalyze.

There isn't an explicit list of languages supported -- but if you look 
at the javadocs, both for Solr and Lucene, you can get a very good sense 
of what Tokenizers, TokenFilters, and Analyzers are included with Solr.

There *may* be a few Analyzers in Lucene contribs which are not in Solr 
OOTB, but they should be fairly easy to add as plugins...

http://lucene.apache.org/solr/api/org/apache/solr/analysis/package-tree.html

Keep in mind some Analysis classes (like SnowballPorterFilterFactory) 
actually support many different languages based on runtime configuration.



-Hoss



Re: Discarding undefined fields in query

2008-10-09 Thread Chris Hostetter

: I'll catch that and deal with it then (Or is it bad programming ?) .

that's a pseudo-religious question -- i will only say that many people 
recommend against using Exception catching to drive control flow; it's 
called an "Exception" because it's supposed to be the "Exception" to the 
norm ... if a large percentage of the time these field names aren't going 
to exist in your schema, you should probably be checking for the 
non-existent field first.

Note that another way to approach this is to add a dynamicField that 
matches "*" (there's an example commented out in the example schema) and 
give it a fieldtype using a query Analyzer that does whatever you want 
your special analyzer to do in the event of a non-existent field (probably 
just produce no tokens) ... the only downside to this approach is that you 
won't get any errors if you inadvertently index a document with an 
unexpected field (unless you give this new fieldtype a custom *indexing* 
analyzer that always throws an exception)
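
As a rough illustration of "checking for the non-existent field first", the
sketch below filters the requested field:value clauses against a set of known
field names before the query string is built. Everything here is hypothetical:
the class name, the idea that the application already holds the schema's field
names in a Set, and the naive clause joining (no escaping of special
characters) are assumptions made for the example, not Solr APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class KnownFieldFilter {
    // Keeps only the clauses whose field name is known to the schema.
    static Map<String, String> keepKnown(Map<String, String> requested,
                                         Set<String> schemaFields) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> clause : requested.entrySet()) {
            if (schemaFields.contains(clause.getKey())) {
                kept.put(clause.getKey(), clause.getValue());
            }
        }
        return kept;
    }

    // Joins the surviving clauses; values are assumed to need no escaping.
    static String toQuery(Map<String, String> clauses) {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> clause : clauses.entrySet()) {
            if (query.length() > 0) {
                query.append(" AND ");
            }
            query.append(clause.getKey()).append(':').append(clause.getValue());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Hypothetical schema fields and request parameters.
        Set<String> schemaFields = new HashSet<>(Arrays.asList("title", "author"));
        Map<String, String> requested = new LinkedHashMap<>();
        requested.put("title", "solr");
        requested.put("colour", "red"); // not in the schema, silently dropped
        requested.put("author", "hatcher");
        System.out.println(toQuery(keepKnown(requested, schemaFields)));
    }
}
```

Whether unknown fields should be dropped silently or reported back to the
caller is a design decision the thread leaves open.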



-Hoss



Re: Need help with Solr Performance

2008-10-09 Thread Chris Hostetter

Maybe i missed it, but skimming this thread i haven't seen any indication 
of how you configured the various caches in solrconfig.xml ... or any 
indication of what kinds of cache hit/miss/expulsion stats you see from 
stats.jsp after running any tests.

considering you're doing faceting on quite a few fields, the filterCache 
is somewhat important.



-Hoss



Re: Need help with Solr Performance

2008-10-09 Thread Chris Hostetter

: considering you're doing faceting on quite a few fields, the filterCache 
: is somewhat important.

Sorry ... i overlooked the bit where QueryComponent was taking 6.x seconds 
... in general knowing what the cache hit rates are looking like is 
crucial to understanding the performance, but as Ryan mentioned figuring 
out what parts of your query are slow is clearly the first step.


-Hoss



Re: scoring individual values in a multivalued field

2008-10-09 Thread Chris Hostetter

: and my query string is "Hennessy", the length normalization factor considers
: all 4 tokens as in "John", "Hennessy",  "David",  "Patterson". This is
: similar to the score if my field was like:
: 
:   John Hennessy David Patterson 
: 
: I want the score to consider only that field value with any matches (here,
: "John Hennessy"). 
: Thanks in advance

unfortunately not possible.  lengthNorm is part of fieldNorm and for each 
doc there is one fieldNorm per field name...

: > as the OP mentioned, the index time boost values for a field are per field 
: > *name* not per value ... they all get folded in together into the 
: > fieldNorm for that field name in that document.
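
A small sketch of the arithmetic, for illustration. It assumes the
1/sqrt(numTerms) length normalization of Lucene's DefaultSimilarity and plain
whitespace tokenization; the class name is hypothetical, and the real
fieldNorm also folds in boosts and is stored with reduced precision.

```java
import java.util.Arrays;
import java.util.List;

final class LengthNormSketch {
    // Counts whitespace-separated tokens across ALL values of one field,
    // because the norm is kept per field name, not per value.
    static int countTokens(List<String> values) {
        int total = 0;
        for (String value : values) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                total += trimmed.split("\\s+").length;
            }
        }
        return total;
    }

    // Length normalization as in DefaultSimilarity: 1 / sqrt(numTerms).
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        List<String> authors = Arrays.asList("John Hennessy", "David Patterson");
        int numTerms = countTokens(authors);
        System.out.println("tokens counted for the field: " + numTerms);
        System.out.println("norm actually applied:        " + lengthNorm(numTerms));
        System.out.println("norm for one value alone:     " + lengthNorm(2));
    }
}
```

The two-value field is normalized as if it were a single four-token value,
which is why a match on "Hennessy" scores the same as it would against
"John Hennessy David Patterson".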



-Hoss



Re: How stop properly solr to modify solrconfig or ... files

2008-10-09 Thread Mark Miller

Chris Hostetter wrote:
i thought the fsync additions to the version of Lucene in Solr 1.3 
prevented situations like this (ensuring that your index was still usable, 
even if some documents were lost) 
  
It doesn't prevent it if the IO system has write caching enabled or if 
you have a hard drive that lies to fsync to improve benchmark scores 
(most consumer hard drives, I believe).


- Mark


Re: sint in schema.xml

2008-10-09 Thread sanraj25

Hi,
  I created my own fields using the integer field type and the sint field
type (solr.SortableIntField) in schema.xml.
I can't tell the difference between these two field types. When exactly
should sint be used? If we use sint, how is it sortable?



sanraj25 wrote:
> 
> Hi,
>   I create own field name using integer field type and sint field
> type(solr.SortableIntField) in schema.xml.
> i can't differentiate between these two field type. When this sint exactly
> use? If we use sint how it is sortable? I test by {sort =field name} in
> query window .but it's not work properly.please tell me with clear example
> thanks in advance
> 
> -sanraj
> 
> 




sint and omitnorms

2008-10-09 Thread sanraj25

Hi,
  I created my own fields using the integer field type and the sint field
type (solr.SortableIntField) in schema.xml.
I can't tell the difference between these two field types. When exactly
should sint be used? If we use sint, how is it sortable? I tested with
{sort=field name} in the query window, but it does not work properly.
I have one more question: what is the purpose of the omitNorms attribute?
What happens if we use omitNorms?
Please explain with a clear example.
Thanks in advance

-sanraj 