Re: Solr and OpenPipe

2008-04-04 Thread Rogerio Pereira
Hi Espen!

I tried to follow the getting started guide on the OpenPipe site, but the Maven
build for the intranet example doesn't generate the jar with dependencies. What
are the current dependencies of OpenPipe?

2008/4/4, Espen Amble Kolstad <[EMAIL PROTECTED]>:
>
> Hi,
>
> I'm one of the developers of the initial version of OpenPipe.
>
> We are currently using OpenPipe with Solr to index the Norwegian and
> English wikipedia.
>
> Anything in particular you want to know?
>
> - Espen
>
> > From: "Rogerio Pereira" <[EMAIL PROTECTED]>
> > Date: 2. april 2008 23.00.32 GMT+02:00
> > To: solr-user@lucene.apache.org
> > Subject: Solr and OpenPipe
> > Reply-To: solr-user@lucene.apache.org
> > Reply-To: [EMAIL PROTECTED]
>
> >
> >  Hi!
> >
> > Has anybody been working with Solr and OpenPipe?
> >
> > --
> > Yours truly (Atenciosamente),
> >
> > Rogério (_rogerio_)
> > http://faces.eti.br
> >
> > "Make a difference! Help your country grow: don't hold on to knowledge,
> > share it and learn more." (http://faces.eti.br/?p=45)
> >
>



-- 
Yours truly (Atenciosamente),

Rogério (_rogerio_)
http://faces.eti.br

"Make a difference! Help your country grow: don't hold on to knowledge,
share it and learn more." (http://faces.eti.br/?p=45)


why don't all stored fields show up?

2008-04-04 Thread Hung Huynh
 

I have about 20 stored fields in string, text, and int, but only about 10
fields show up when I query for them, whether I do fl=*,score or list them
out. What's my problem? How do I retrieve all of the fields? Thanks.



Re: why don't all stored fields show up?

2008-04-04 Thread Yonik Seeley
On Fri, Apr 4, 2008 at 9:25 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
>  I have about 20 stored fields in string, text, and int, but only about 10
>  fields show up when I query for them, whether I do fl=*,score or list them
>  out. What's my problem? How do I retrieve all of the fields? Thanks.

You should be getting back all stored fields for every document.
Documents will only show fields they have (fields are sparse, it's not
like a DB table).
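
To illustrate the sparse-field point, here is a minimal Python sketch. It is not Solr code: the documents and field names are made up, and plain dicts stand in for stored documents. Each document carries only the fields it was indexed with, so two results from the same index can show different sets of fields.

```python
# Hypothetical stored documents: plain dicts standing in for Solr docs.
docs = [
    {"id": "1", "sku": "ABC001", "price": 10},
    {"id": "2", "sku": "ABC002"},  # no "price" value was ever indexed
]

def returned_fields(doc):
    """Return the field names a result would show for this document."""
    return sorted(doc.keys())

# Union of every field seen anywhere in the index.
all_fields = sorted({name for doc in docs for name in doc})

for doc in docs:
    absent = [name for name in all_fields if name not in doc]
    print(doc["id"], returned_fields(doc), "absent:", absent)
```

A field that is "missing" from one result may simply have had no value for that document, which is different from the field not being stored at all.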

-Yonik


RE: why don't all stored fields show up?

2008-04-04 Thread Hung Huynh
Do you think it might be a problem with my schema and data loading? I loaded
CSV with 39 fields and didn't get any error message. I have a total of 39
stored fields, but not all of them are reported back when I query for them.
Should I reload the Index? Is there a way for me to check if the Index has
all 39 fields?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Friday, April 04, 2008 10:48 AM
To: solr-user@lucene.apache.org
Subject: Re: why don't all stored fields show up?

On Fri, Apr 4, 2008 at 9:25 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
>  I have about 20 stored fields in string, text, and int, but only about 10
>  fields show up when I query for them, whether I do fl=*,score or list them
>  out. What's my problem? How do I retrieve all of the fields? Thanks.

You should be getting back all stored fields for every document.
Documents will only show fields they have (fields are sparse, it's not
like a DB table).

-Yonik



Re: why don't all stored fields show up?

2008-04-04 Thread Yonik Seeley
On Fri, Apr 4, 2008 at 11:57 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
> Do you think it might be a problem with my schema and data loading?

Maybe.

> I loaded
>  CSV with 39 fields and didn't get any error message. I have a total of 39
>  stored fields, but not all of them are reported back when I query for them.

Try to tackle it by getting more specific.
Look at a single row in the CSV, and query for the id of that document
in the index and see what's missing.  Check the schema for those
missing fields.  Try to replicate the problem with another CSV file
with just that single record.
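
The comparison step above can be sketched in Python as follows. This is only an illustration under assumptions: the CSV text, the field names, and the `returned_doc` dict are hypothetical stand-ins for the single-record file and for the fields the query actually returned.

```python
import csv
import io

def missing_fields(csv_text, returned_doc, separator=","):
    """Return CSV columns with a non-empty value that are absent from returned_doc."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator)
    row = next(reader)  # the single record under test
    return [name for name, value in row.items()
            if value and name not in returned_doc]

# Hypothetical single-record file and the fields the query returned for it.
csv_text = "id,name,price\n1,widget,9.99\n"
returned_doc = {"id": "1", "name": "widget"}

print(missing_fields(csv_text, returned_doc))
```

Whatever names this prints are the ones to look up in the schema first.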

If you still can't figure it out, give us the following info:
- the URL used to load the CSV data
- the single record CSV file
- the result of querying for that single record
- your schema

-Yonik


Single Core Can't Find the solrconfig.xml file

2008-04-04 Thread kirk beers
Hi,

I tried setting up a single-core application and I get the following error,
which claims it can't find solrconfig.xml even though the file is located at
solr/conf/solrconfig.xml in my application:

Thanks

*type* Status report

*message* *Severe errors in solr configuration. Check your log files for
more detailed information on what may be wrong. If you want solr to continue
after configuration errors, change:
<abortOnConfigurationError>false</abortOnConfigurationError> in
solrconfig.xml -
java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in
classpath or 'solr/conf/', cwd=/home/kirber/Desktop/tomcat-solr
 at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:168)
 at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:136)
 at org.apache.solr.core.Config.<init>(Config.java:97)
 at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:108)
 at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:65)
 at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:88)
 at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
 at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
 at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
 at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
 at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
 at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
 at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
 at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
 at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:825)
 at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:714)
 at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
 at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1138)
 at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
 at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
 at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
 at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
 at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
 at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
 at org.apache.catalina.core.StandardService.start(StandardService.java:448)
 at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
 at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
 at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) *


Re: numDocs and maxDoc

2008-04-04 Thread Chris Hostetter
: Thanks hossman, this is exactly what I want to do.
: Final question: so I need to merge the field by myself first? (Actually my
: original plan is to do 2 consecutive postings so merging is possible)

you need to send Solr whole documents with all the fields in them.  if you 
send another "doc" with the same value for the uniqueKey field, it will 
replace the previous doc.
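
A toy Python model of that replace-by-key behavior, purely as a sketch: the `index` dict and `add` function are hypothetical and only mimic the semantics described above, they are not Solr's API.

```python
index = {}  # maps the uniqueKey value to the whole stored document

def add(doc, unique_key="id"):
    """Add a document; an existing document with the same key is replaced whole."""
    index[doc[unique_key]] = dict(doc)

add({"id": "42", "title": "first post", "body": "hello"})
add({"id": "42", "title": "second post"})  # same key: replaces, does not merge

print(index["42"])
```

After the second add the "body" field is gone, which is why any merging of fields has to happen on the client before the document is sent.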




-Hoss



Re: Multiple unique field?

2008-04-04 Thread Chris Hostetter

: When I set 2 unique key fields, it looks like Solr only accepts the first
: definition in schema.xml...question: so once the unique key is defined, it
: can't be overridden?

there is one and only one uniqueKey field ... trying to declare two should 
probably be an error (anyone want to submit a patch?), but you definitely 
can't "override" the uniqueKey field ... declare it once, and that's what 
it is for your whole index.


-Hoss



Re: Date range performance

2008-04-04 Thread Mike Klaas

On 3-Apr-08, at 4:24 PM, Jonathan Ariel wrote:
> Does this depend on the number of documents that match the query or the
> number of documents in the index?

This aspect is more dependent on the number of terms that the date
query translates into.

> If in a 3 million documents index my query matches 4, having date with a
> precision of seconds could slow down the query?

Yes.  Solr does range queries by taking the disjunction of a bunch of
term queries, so it is the total number of terms checked that is the
limiting factor.
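
To make the term-count argument concrete, here is a small Python sketch. The data is hypothetical (one document per minute for 30 days) and the string formats are only illustrative, not Solr's internal date encoding. It counts how many distinct terms a range over the whole period would have to visit at two precisions.

```python
from datetime import datetime, timedelta

def unique_terms(timestamps, fmt):
    """Count distinct index terms if each timestamp is stored using fmt."""
    return len({ts.strftime(fmt) for ts in timestamps})

# Hypothetical data: one document every minute for 30 days.
start = datetime(2008, 3, 1)
stamps = [start + timedelta(minutes=i) for i in range(30 * 24 * 60)]

per_second = unique_terms(stamps, "%Y-%m-%dT%H:%M:%S")
per_day = unique_terms(stamps, "%Y-%m-%d")
print(per_second, per_day)
```

With second precision every document contributes its own term, while day precision collapses the same documents into one term per day, so the same range covers far fewer terms.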


It would be better to implement this using an ordered index that could  
be binary-searched, but Solr isn't currently designed for that (though  
I think range optimization algorithms would be a cool addition).


-Mike



> On Thu, Apr 3, 2008 at 7:45 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
>
> > On 3-Apr-08, at 2:14 PM, Jonathan Ariel wrote:
> >
> > > Hi,
> > > I'm experiencing really poor performance when using date ranges in a
> > > Solr query. Is it a known issue? Is there any special consideration
> > > when using date ranges? It seems weird because I always thought dates
> > > are translated to strings, so internally Lucene resolves everything
> > > the same way. So maybe the problem is with parsing the dates and
> > > translating them to the internal value?
> > > Any suggestion?
> >
> > Range query is highly dependent on the total number of unique terms
> > covered by the range.  If you are indexing dates with very high
> > precision (e.g., milliseconds), this can consist of ridiculous numbers
> > of terms.
> >
> > Try rounding the dates to something more granular when indexing.
> >
> > -Mike





Re: solr commit command questions

2008-04-04 Thread Mike Klaas

On 3-Apr-08, at 10:04 AM, oleg_gnatovskiy wrote:
> Hello. I was wondering what happens when an add command is done without a
> commit command. Is there any way to roll back?

No, there isn't (unless you've taken a snapshot of the index using
snapshooter).

The main problem is that there is no way to "undelete" a document in
Lucene, so this might be impossible until Lucene has more transaction
support.


-Mike



Re: Date range performance

2008-04-04 Thread Jonathan Ariel
Thanks! I'll try reducing the precision and let you know about the result.

Looking into the code it seems like a Lucene problem, more than Solr. It is
in the RangeQuery and RangeFilter classes. The problem with changing this to
have a sorted index and then binary search is that you have to sort it,
which is slow. Unless we can store the ordered index somewhere and reuse it,
it will be even slower than now. And if we store it, we will have to face
the problem with updating ordered index with new terms.


On Fri, Apr 4, 2008 at 3:30 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:

> On 3-Apr-08, at 4:24 PM, Jonathan Ariel wrote:
>
> > Does this depend on the number of documents that match the query or the
> > number of documents in the index?
>
> This aspect is more dependent on the number of terms that the date query
> translates into.
>
> > If in a 3 million documents index my query matches 4, having date with a
> > precision of seconds could slow down the query?
>
> Yes.  Solr does range queries by taking the disjunction of a bunch of term
> queries, so it is the total number of terms checked that is the limiting
> factor.
>
> It would be better to implement this using an ordered index that could be
> binary-searched, but Solr isn't currently designed for that (though I think
> range optimization algorithms would be a cool addition).
>
> -Mike
>
> > On Thu, Apr 3, 2008 at 7:45 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
> >
> > > On 3-Apr-08, at 2:14 PM, Jonathan Ariel wrote:
> > >
> > > > Hi,
> > > > I'm experiencing really poor performance when using date ranges in a
> > > > Solr query. Is it a known issue? Is there any special consideration
> > > > when using date ranges? It seems weird because I always thought dates
> > > > are translated to strings, so internally Lucene resolves everything
> > > > the same way. So maybe the problem is with parsing the dates and
> > > > translating them to the internal value?
> > > > Any suggestion?
> > >
> > > Range query is highly dependent on the total number of unique terms
> > > covered by the range.  If you are indexing dates with very high
> > > precision (e.g., milliseconds), this can consist of ridiculous numbers
> > > of terms.
> > >
> > > Try rounding the dates to something more granular when indexing.
> > >
> > > -Mike
>


RE: why don't all stored fields show up?

2008-04-04 Thread Hung Huynh
Thanks for spending time on this issue.

I removed most of the fields, and it's still not working:

http://localhost:8983/solr/update/csv?commit=true&separator=|&escape=\&stream.file=exampledocs/test1.txt

test1.txt content
guid|sku
1|ABC001

Query:
http://localhost:8983/solr/select/?q=guid%3A1&version=2.2&start=0&rows=10&indent=on&fl=*,score

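output (values only; the XML markup did not survive in the archive):

 status: 0
 QTime: 0
 params:
  fl: *,score
  indent: on
  start: 0
  q: guid:1
  version: 2.2
  rows: 10

 doc:
  score: 0.71231794
  guid: 1
  (date field): 2008-04-04T19:35:44.427Z
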

Schema:


   

Guid is the unique numeric field.

Thanks,

Hung

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Friday, April 04, 2008 12:02 PM
To: solr-user@lucene.apache.org
Subject: Re: why don't all stored fields show up?

On Fri, Apr 4, 2008 at 11:57 AM, Hung Huynh <[EMAIL PROTECTED]> wrote:
> Do you think it might be a problem with my schema and data loading?

Maybe.

> I loaded
>  CSV with 39 fields and didn't get any error message. I have a total of 39
>  stored fields, but not all of them are reported back when I query for them.

Try to tackle it by getting more specific.
Look at a single row in the CSV, and query for the id of that document
in the index and see what's missing.  Check the schema for those
missing fields.  Try to replicate the problem with another CSV file
with just that single record.

If you still can't figure it out, give us the following info:
- the URL used to load the CSV data
- the single record CSV file
- the result of querying for that single record
- your schema

-Yonik



Re: Date range performance

2008-04-04 Thread Chris Hostetter

: Looking into the code it seems like a Lucene problem, more than Solr. It is
: in the RangeQuery and RangeFilter classes. The problem with changing this to
: have a sorted index and then binary search is that you have to sort it,
: which is slow. Unless we can store the ordered index somewhere and reuse it,
: it will be even slower than now. And if we store it, we will have to face
: the problem with updating ordered index with new terms.

FWIW: Lucene Term enumeration is already indexed, it's just not a binary 
search tree (the details escape me at the moment, but there is an 
interval value of N somewhere in the code, and every Nth Term is loaded 
into memory so a TermEnum.seek can skip ahead N terms at a time).
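
The general idea of keeping every Nth term in memory can be sketched in Python as below. This is an illustration of the technique only, not Lucene's actual implementation: the function names, the interval of 8, and the term strings are all made up.

```python
import bisect

def build_sampled_index(sorted_terms, interval):
    """Keep every interval-th term in memory, paired with its position."""
    positions = list(range(0, len(sorted_terms), interval))
    return [sorted_terms[p] for p in positions], positions

def seek(sorted_terms, sampled_terms, positions, target):
    """Return the position of the first term >= target (len(sorted_terms) if none)."""
    # Binary-search the small in-memory sample to find where to start scanning.
    slot = bisect.bisect_right(sampled_terms, target) - 1
    pos = positions[slot] if slot >= 0 else 0
    # Scan forward from the sampled position until the target is reached.
    while pos < len(sorted_terms) and sorted_terms[pos] < target:
        pos += 1
    return pos

terms = [f"2008-04-{day:02d}" for day in range(1, 31)]  # hypothetical sorted terms
sampled, positions = build_sampled_index(terms, interval=8)
print(seek(terms, sampled, positions, "2008-04-19"))
```

Finding the start of a range is cheap this way; a range query still has to walk every term between the two endpoints.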

But the number of unique terms can be a bottleneck ... rounding to the 
level of precision you absolutely need can save you in these cases by 
reducing the number of unique terms.




-Hoss



Merging Solr index

2008-04-04 Thread Norskog, Lance
Hi-
 
http://wiki.apache.org/solr/MergingSolrIndexes recommends using the
Lucene contributed app IndexMergeTool to merge two Solr indexes. What
happens if both indexes have records with the same unique key? Will they
both go into the new index?
 
Is the implementation of unique IDs in the Solr Java code or in Lucene? If it
is in Solr, how would I hack up a Solr IndexMergeTool?
 
Cheers,
 
Lance Norskog
 


Re: solr commit command questions

2008-04-04 Thread oleg_gnatovskiy

So, what is the point of the commit?

oleg_gnatovskiy wrote:
> 
> Hello. I was wondering what happens when an add command is done without a
> commit command. Is there any way to roll back?
> 

-- 
View this message in context: 
http://www.nabble.com/solr-commit-command-questions-tp16467824p16504441.html
Sent from the Solr - User mailing list archive at Nabble.com.



admin.jsp java.lang.NoSuchFieldError

2008-04-04 Thread Mendes, Richard
I have been testing our solr homes and applications with Solr 1.3 using
builds I do from the SVN trunk. All of our code runs fine with Solr 1.2.
I am running Solr under Tomcat 5.5.26 using JNDI.

 

When running with Solr 1.3, Tomcat comes up clean. However, if you hit
the admin index page, you get the following exception.

 

Apr 4, 2008 12:03:10 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NoSuchFieldError: config
        at org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:88)

 

I have the IBM developerWorks sample app running under the same Tomcat
instance with the war file I am building and the admin page for that
instance does not throw an exception. I have been able to reproduce the
behavior with nightly builds.

 

Any ideas on what I might check to resolve this?

 

Thanks.

-Rick



Re: solr commit command questions

2008-04-04 Thread Leonardo Santagada


On 04/04/2008, at 20:24, oleg_gnatovskiy wrote:
> So, what is the point of the commit?

I've always thought about that... it should have been named flush, as it is
in Xapian... it has nothing to do with database commits, and the data
will end up in the index one way or the other.


--
Leonardo Santagada






Re: solr commit command questions

2008-04-04 Thread Mike Klaas

On 4-Apr-08, at 4:24 PM, oleg_gnatovskiy wrote:
> So, what is the point of the commit?

It makes the data you have updated since last commit visible to the
searchers.
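
A toy Python model of that visibility rule, as a sketch only: the `TinyIndex` class is hypothetical and says nothing about how Solr or Lucene store data. Adds are buffered, and searches only see what the last commit published.

```python
class TinyIndex:
    """Toy model: adds are buffered until commit() publishes them to searches."""

    def __init__(self):
        self._pending = []  # added but not yet committed
        self._visible = []  # what searches can see

    def add(self, doc):
        self._pending.append(doc)

    def commit(self):
        # Publish a new snapshot containing everything added so far.
        self._visible = self._visible + self._pending
        self._pending = []

    def search(self, word):
        return [doc for doc in self._visible if word in doc]

idx = TinyIndex()
idx.add("solr commit question")
before = idx.search("commit")  # searched before the commit
idx.commit()
after = idx.search("commit")   # searched after the commit
print(before, after)
```

The search before the commit finds nothing and the search after it finds the document, which is the only distinction this sketch is meant to show.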


-Mike