Re: how can I use dll to analyze

2007-03-13 Thread James liu

I use JSP only for testing.

> you've gotten your analyzer/jar/jni/dll to work in a command line app, the
> next step before trying to use it in Solr is probably to try and use it in
> a simple JSP

Do you mean that if it works well in cmd, it can be used with Solr?





2007/3/13, Ryan McKinley <[EMAIL PROTECTED]>:


does your "use bean" jsp example work if you dump it into the exploded
solr.war directory?

(solr does not rely on jsp - it is only used for the admin interface.)


On 3/12/07, James liu <[EMAIL PROTECTED]> wrote:
> I can use it with JSP, just using a bean.
>
> How can I use it with Solr?
>
> I think Solr uses JSP to index, search...
>
>
>
> 2007/3/13, Chris Hostetter <[EMAIL PROTECTED]>:
> >
> >
> > : I have some.dll, jni_some.dll, and some_analyzer_func.jar, which are used
> > : to analyze. I tried it in cmd and it is OK.
> > :
> > : Now I want to use it with Solr, so I have to have Java call these DLL files.
> > :
> > : I use Tomcat 6, Java 1.5_11, on WinXP. The jar files are all in Tomcat's lib
> > : directory. The DLLs are all in the system32 directory.
> >
> > DLLs are a weird windows voodoo i don't pretend to understand ... if
> > you've gotten your analyzer/jar/jni/dll to work in a command line app, the
> > next step before trying to use it in Solr is probably to try and use it in
> > a simple JSP .. make a little webapp containing the lucene jar, your
> > analyzer jar, and a simple JSP that runs some text through your analyzer
> > ... the tomcat user community should be able to help you figure out how
> > to get the JNI stuff working (and by using a simple little webapp, they
> > won't have to understand all of Solr)
> >
> > once you get that working, Solr should be cake.
> >
> >
> > -Hoss
> >
> >
>
>
> --
> regards
> jl
>





--
regards
jl


Re: how can I use dll to analyze

2007-03-13 Thread James liu

But now I can't use it with Solr. (I compiled Solr with ant.)




--
regards
jl


faceting for further MLT/grouping

2007-03-13 Thread Brian Whitman
We have a site with users that post things (say it's a review of a
book; each review gets a Solr doc).


On the user info page we show a 'top 10' faceting result on the query:

q=username:bwhitman&facet=true&facet.field=review&facet.limit=10

which comes back with

pynchon     9
barthelme   4
..

This is great as a simple "auto-tagger." But we would love to take
these results and compare them to other users' results to make a
"related people" query. I could take the top 10 terms, perform a query like


q=pynchon^9%20barthelme^4&facet=true&facet.field=username&facet.limit=10

and get the 2nd-order facets back, but I imagine this is a common  
task that could be optimized in-solr. Sort of a:


q=username:bwhitman&facet=true&facet.field=review&facet.limit=10&facet.mlt=true&facet.mlt.field=username


Does anyone else use this sort of functionality? Am I going too far  
out of my way?
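
For reference, the two-request version described above can be sketched on the client side. This is only an illustration under stated assumptions: the class and method names are made up, the facet counts are assumed to have already been parsed out of the first response, and the terms are assumed to contain no query-syntax characters that would need escaping.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: names are illustrative, not part of Solr.
final class RelatedPeopleQuery {

    /** Joins facet counts into a boosted query such as "pynchon^9 barthelme^4". */
    static String boostedQuery(Map<String, Integer> facetCounts) {
        StringBuilder q = new StringBuilder();
        for (Map.Entry<String, Integer> e : facetCounts.entrySet()) {
            if (q.length() > 0) {
                q.append(' ');
            }
            q.append(e.getKey()).append('^').append(e.getValue());
        }
        return q.toString();
    }

    /** Builds the parameters of the second request, which facets on username. */
    static String secondOrderParams(Map<String, Integer> facetCounts, int limit) {
        try {
            // URLEncoder emits "+" for spaces and "%5E" for "^".
            String q = URLEncoder.encode(boostedQuery(facetCounts), "UTF-8");
            return "q=" + q + "&facet=true&facet.field=username&facet.limit=" + limit;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Counts copied by hand from the first facet response (insertion order kept).
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        counts.put("pynchon", 9);
        counts.put("barthelme", 4);
        System.out.println(secondOrderParams(counts, 10));
    }
}
```

The cost of this approach is a second round trip per user page, which is the part an in-Solr option would save.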


-Brian




Limiting results based on score

2007-03-13 Thread Corey Tisdale
Is there a way to specify that you only want results with a score
greater than x? I've noticed that when two generic words mean
something specific if seen together, I get good results with a high
score but thousands of not-so-relevant results below score x. This is
only a problem when sorting by something other than score, but if I
could figure out a way to cut out the irrelevant stuff I think it
would help a lot.


Corey


Re: how can I use dll to analyze

2007-03-13 Thread Ryan McKinley

I'm not totally following your problem, but i'll give it a shot.

You need to make sure the classpath for the solr web-app has
everything you need.  I suggested putting your test jsp file into the
solr web-app.

If your jsp test works, you should be able to use the JNI bindings from solr.

If your jsp test does not work, keep futzing with the classpath till
it does work.  The tomcat docs/mailing list is probably a good place
to get help with this
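
One way to narrow this down is a small check that can be run from the command line and then called from the test JSP. The sketch below is a guess at what is useful here, not a known recipe: the class name `NativeLoadCheck` and the analyzer class name `com.example.SomeAnalyzer` are placeholders, and it assumes the native library is loaded under the name `jni_some` (taken from the `jni_some.dll` mentioned earlier in the thread).

```java
// Hypothetical diagnostic: reports whether a class and a native library are loadable.
final class NativeLoadCheck {

    /** Returns null if the class is visible to this class loader, else the error text. */
    static String checkClass(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class.forName(className, false, NativeLoadCheck.class.getClassLoader());
            return null;
        } catch (ClassNotFoundException | LinkageError e) {
            return e.toString();
        }
    }

    /** Returns null if the native library loads, else the error text. */
    static String checkLibrary(String libName) {
        try {
            System.loadLibrary(libName);
            return null;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return e.toString();
        }
    }

    static void report(String what, String error) {
        System.out.println(what + ": " + (error == null ? "OK" : "FAILED - " + error));
    }

    public static void main(String[] args) {
        // Directories the JVM searches for native libraries.
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        // Platform-specific file name the JVM will look for (jni_some.dll on Windows).
        System.out.println("expected file     = " + System.mapLibraryName("jni_some"));
        report("analyzer class", checkClass("com.example.SomeAnalyzer")); // placeholder name
        report("native library", checkLibrary("jni_some"));               // assumed name
    }
}
```

A JVM lets a native library be loaded by only one class loader at a time, so treat this as a throwaway check and restart Tomcat before testing the real analyzer again.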

ryan





Re: Limiting results based on score

2007-03-13 Thread Chris Hostetter

short answer: No

long answer: You could with a custom request handler but it would be
meaningless.

There are more details on this in the Lucene FAQ which has a few pointers to
past discussions on the lucene java mailing lists (there are many, MANY
more discussions about this in the lucene archives as well) ...

http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-912c1f237bb00259185353182948e5935f0c2f03



-Hoss



Re: how can I use dll to analyze

2007-03-13 Thread Chris Hostetter

: > you've gotten your analyzer/jar/jni/dll to work in a command line app, the
: > next step before trying to use it in Solr is probably to try and use it in
: > a simple JSP

: Do you mean that if it works well in cmd, it can be used with Solr?

i mean that if you wrote a simple little command line app, that includes
the lucene jar (with Analyzer.class) and includes your custom analyzer,
and uses your analyzer -- and that runs successfully, then it should be
possible to get it working in Tomcat/Solr ... i just don't know how off
the top of my head.

: > > I can use it with JSP, just using a bean.

I don't know what that means ... an Analyzer isn't a Java Bean.

if you've got a JSP that can successfully use your JNI based Analyzer
(cast to a regular lucene Analyzer) but you can't use your Analyzer in
solr, please paste your JSP into your email so we can understand how you
are using it and try to figure out why it doesn't work in Solr.




-Hoss



How to decide to use Analyzer, AnalyzerFactory, or TokenizerFactory

2007-03-13 Thread Teruhiko Kurosaka
Hello,
I am new to Solr and trying to understand how things work.
If I want to use my tokenizer, there seem to be three choices:
1. Write a TokenizerFactory whose create() returns my Tokenizer, and specify
   the factory in schema.xml.
2. Write an Analyzer that uses my Tokenizer, and specify that Analyzer in
   schema.xml.
3. Write an Analyzer that uses my Tokenizer and an AnalyzerFactory whose
   create() returns that Analyzer, and specify that factory in schema.xml.

Is there a document that describes the differences between these approaches
and gives guidance on when to use which?

-Kuro


Re: Limiting results based on score

2007-03-13 Thread Corey Tisdale

Thanks!

Corey

On Mar 13, 2007, at 4:24 PM, Chris Hostetter wrote:



> short answer: No
>
> long answer: You could with a custom request handler but it would be
> meaningless.
>
> There are more details on this in the Lucene FAQ which has a few pointers to
> past discussions on the lucene java mailing lists (there are many, MANY
> more discussions about this in the lucene archives as well) ...
>
> http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-912c1f237bb00259185353182948e5935f0c2f03
>
> -Hoss





Re: How to decide to use Analyzer, AnalyzerFactory, or TokenizerFactory

2007-03-13 Thread Chris Hostetter

: Is there document that describes differences of these approaches, guides
: when to use which?

there is some guidance in these wiki pages...

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
http://wiki.apache.org/solr/SolrPlugins

In general, do whatever is easiest for you ... if you already have an
Analyzer, just use it.  If you have a Tokenizer (or you are writing one)
then it's just as easy to write a TokenizerFactory as it is to write an
Analyzer - and the TokenizerFactory allows you to mix and match with all
sorts of TokenFilter configurations.

: 3. Write an Analyzer that uses my Tokenizer and an AnalyzerFactory whose
: create() returns that Analyzer, and specify that factory in schema.xml.

there's no such thing as an AnalyzerFactory (as far as i remember...) ..
did you see that mentioned in the docs somewhere?



-Hoss



Re: Commit after how many updates?

2007-03-13 Thread Maximilian Hütter
Mike Klaas wrote:
> On 3/12/07, Maximilian Hütter <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I have a question regarding Solr's behaviour, in the standard
>> installation. When I use the start.jar with a rather complex schema,
>> do about 1000 updates, and then try to commit, I get this:
>>
>> java.lang.OutOfMemoryError: Java heap space
>> 
>>
>> I know I can fix it by giving the VM a larger heap size, but still I
>> wonder what a good number of updates would be?
>>
>> What are your experiences?
> 
> That seems awfully few docs to cause OOM--I'm using autocommit @
> 100,000 docs without issues (then again, I give my instances at least a
> gig of heap).
> 
> What is your current heap size?
> 
> -Mike
> 
It is the default heap size for the Sun JVM, so I guess 64MB max. The
documents are rather large, but if you manage to index 100,000 docs,
there seems to be some problem with Solr.

What would be the recommended heap size for Solr?
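
Rather than guessing at the default, the limit can be read from inside the JVM. A minimal sketch, assuming it is run with the same `java` binary and options that are used to launch start.jar; the class name `HeapReport` is made up for illustration.

```java
// Hypothetical helper: prints the maximum heap the current JVM is allowed to use.
final class HeapReport {

    /** Maximum heap the JVM will attempt to use, in whole megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

The limit itself is raised with the standard `-Xmx` option, for example `java -Xmx512m -jar start.jar`; the 512m figure is only an illustrative value, not a recommendation for this index.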

-- 
Maximilian Hütter
blue elephant systems GmbH
Wollgrasweg 49
D-70599 Stuttgart

Tel:  (+49) 0711 - 45 10 17 578
Fax:  (+49) 0711 - 45 10 17 573
e-mail :  [EMAIL PROTECTED]
Registered office :  Stuttgart, Amtsgericht Stuttgart, HRB 24106
Managing directors:  Joachim Hörnle, Thomas Gentsch, Holger Dietrich


Index arbitrary xml-elments in only one field without copying

2007-03-13 Thread thomas arni
Hello

I'm currently evaluating Solr for our needs. As a first step I used your
example and adapted the “schema.xml”.

In contrast to the example docs provided, my documents are not
homogeneous, which means I only want to index into two fields. These fields
are the uniqueKey (docno) and a text field (text).






Instead of using copyField to copy (and duplicate) other XML elements
into my “text” field, I want to specify which elements should be indexed
directly in the “text” field, without copying or duplicating. I have no
need for additional index fields in my heterogeneous environment. These
extra fields only take up additional space in my index, which is a
disadvantage for me.


How can I specify arbitrary XML elements that should be indexed in my
one and only field, “text”? I have no need for additional fields in my
index.


Any help is appreciated.


Thomas