MultiValue dynamicField and copyField

2010-07-14 Thread Jan Simon Winkelmann
Hi everyone,

I was wondering if the following is possible somehow:





As in: using copyField to copy a multiValued field into another multiValued 
field.
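
A sketch of the kind of setup being asked about (the original schema snippet did not survive the archive, so the field names and types here are illustrative):

```xml
<!-- Illustrative sketch; the original snippet was stripped by the archive.
     A multiValued source field copied into a multiValued catch-all field. -->
<field name="tag"      type="string" indexed="true" stored="true"  multiValued="true"/>
<field name="all_tags" type="text"   indexed="true" stored="false" multiValued="true"/>

<copyField source="tag" dest="all_tags"/>
```

As the follow-up below confirms, Solr supports this: each value of the multiValued source is appended to the multiValued destination.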

Cheers,
Jan


AW: MultiValue dynamicField and copyField

2010-07-14 Thread Jan Simon Winkelmann
I figured out where the problem was. The destination wildcard was actually 
matching the wrong field. I changed the field names around a bit and now 
everything works fine. Thanks!

> -----Original Message-----
> From: kenf_nc [mailto:ken.fos...@realestate.com]
> Sent: Wednesday, 14 July 2010 15:56
> To: solr-user@lucene.apache.org
> Subject: Re: MultiValue dynamicField and copyField
> 
> 
> Yep, my schema does this all day long.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/MultiValue-dynamicField-and-
> copyField-tp965941p966536.html
> Sent from the Solr - User mailing list archive at Nabble.com.


MultiCore config less stable than SingleCore?

2010-12-07 Thread Jan Simon Winkelmann
Hi,

I have recently moved Solr at one of our customers to a MultiCore environment 
running two indexes. Since then, we seem to be having problems with locks not 
being removed properly: .lock files keep sticking around in the index 
directory.
Hence, any updates to the index keep returning 500 errors with the following 
stack trace:

Error 500 Lock obtain timed out: 
NativeFSLock@/data/jetty/solr/index1/data/index/lucene-96165c19c16f26b93de3954f6891-write.lock

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: 
NativeFSLock@/data/jetty/solr/index1/data/index/lucene-96165c19c16f26b93de3954f6891-write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:85)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1545)
at org.apache.lucene.index.IndexWriter.&lt;init&gt;(IndexWriter.java:1402)
at org.apache.solr.update.SolrIndexWriter.&lt;init&gt;(SolrIndexWriter.java:190)
at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:425)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:457)
at org.eclipse.jetty.server.session.SessionHandler.handle(SessionHandler.java:182)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:933)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:362)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:867)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:245)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
at org.eclipse.jetty.server.Server.handle(Server.java:334)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:559)
at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:1007)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:747)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:209)
at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:406)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:462)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
at java.lang.Thread.run(Thread.java:662)

All our other installations with a similar SingleCore config are running very 
smoothly.
Does anyone have an idea what the problem is? Could I have missed something 
when configuring the MultiCore environment?
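
For reference, the write-lock behaviour is configured per core in solrconfig.xml; a typical Solr 1.4-era section looks roughly like this (sketch with default values, not the poster's actual configuration):

```xml
<!-- Sketch of the relevant solrconfig.xml settings (Solr 1.4-era);
     values shown are the usual defaults, not the poster's config. -->
<mainIndex>
  <!-- 'native' delegates to OS file locks; 'simple' uses a plain lock file -->
  <lockType>native</lockType>
  <!-- Removes a leftover lock on startup; only safe if no other writer
       can have the index open at the same time. -->
  <unlockOnStartup>false</unlockOnStartup>
</mainIndex>
```

One common cause of this symptom in multi-core setups is two cores (or two deployed webapps) accidentally sharing the same dataDir, so both writers compete for the same write lock.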

Regards,
Jan


Dynamically changing the stored-state of a dynamicField

2009-12-17 Thread Jan-Simon Winkelmann
Hi,

I am currently building a Solr configuration for a rather large search
index. To allow for indexing of differently named fields for each dataset, I
have included the following dynamicField:



What I don't like about this is the fact that all dynamic fields are now
being stored. Actually, I only need a single field for each dataset to be
stored. Is there a way to parameterize the stored parameter on a per-field
basis, or do I have to add a second dynamicField like below to allow that?



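Since the archive stripped the declarations, here is a sketch of the usual pattern the question describes: two wildcard declarations that differ only in the stored attribute (suffixes are illustrative, not from the original mail):

```xml
<!-- Sketch; suffix conventions are illustrative, not from the original mail -->
<dynamicField name="*_t"  type="text" indexed="true" stored="true"/>
<dynamicField name="*_tu" type="text" indexed="true" stored="false"/>
```

stored is fixed per field (or dynamicField) declaration, so the per-dataset choice is made by which suffix a document's field name uses.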
Thanks in advance!

----
Jan-Simon Winkelmann



Searching for empty fields possible?

2010-01-25 Thread Jan-Simon Winkelmann
Hi there,

I have a field defined in my schema as follows:



Valid values for the fields are (in theory) any unsigned integer value. Also
the field can be empty.

My problem is that if I have (puid:[0 TO *] OR -puid:[* TO *]) as the filter
query I get 0 results; without the "-puid:[* TO *]" part I get about 6500
results. What am I doing wrong? I was under the impression I could find
empty fields with a "[* TO *]" range?

Thanks very much in advance!

Best,
Jan



AW: Searching for empty fields possible?

2010-01-25 Thread Jan-Simon Winkelmann
> Are you indexing an empty value?  Or not indexing a field at all?
> -field:[* TO *] will match documents that do not have the field at all.

I'm not sure; theoretically, fields with a null value (PHP-side) should end
up not having the field. But then again, I don't think it's relevant just
yet. What bugs me is that if I add the -puid:[* TO *], all results for
puid:[0 TO *] disappear, even though I am using "OR".

Best,
Jan


> On Jan 25, 2010, at 7:02 AM, Jan-Simon Winkelmann wrote:

> > Hi there,
> >
> > i have a field defined in my schema as follows:
> >
> >  > required="false" />
> >
> > Valid values for the fields are (in theory) any unsigned integer  
> > value. Also
> > the field can be empty.
> >
> > My problem is, that if I have (puid:[0 TO *] OR -puid:[* TO *]) as  
> > filter
> > query i get 0 results, without the "-puid:[* TO *])" part i get  
> > about 6500
> > results. What am I doing wrong? I was under the impression i could  
> > find
> > empty fields with a "* TO *"
> > range?
> >
> > Thanks very much in advance!
> >
> > Best,
> > Jan
> >




AW: AW: Searching for empty fields possible?

2010-01-26 Thread Jan-Simon Winkelmann
>> I'm not sure, theoretically fields with a null value
>> (php-side) should end
>> up not having the field. But then again i don't think it's
>> relevant just
>> yet. What bugs me is that if I add the -puid:[* TO *], all
>> results for
>> puid:[0 TO *] disappear, even though I am using "OR".
>
> The - operator does not work with the OR operator the way you think.
> Your query can be re-written as (puid:[0 TO *] OR (*:* -puid:[* TO *]))
>
> Does this new query satisfy your needs? And more importantly, does
> type="integer" support correct numeric range queries? In Solr 1.4.0, range
> queries work correctly with type="tint".


Strangely enough, when I rewrote my query to ((puid:[0 TO *]) OR (-puid:[* TO
*])), I did actually get results. Whether they were correct I currently
cannot verify properly, since my index does not actually contain null values
for the column. I will, however, check whether your query gets me any different
results :)
Speaking of your query, I don't quite understand what the *:* does there and
how it gets parsed.
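
For reference: *:* is the Lucene/Solr match-all query. A purely negative clause has no document set to subtract from, so inside a boolean clause it matches nothing; anchoring it to *:* makes the subtraction explicit:

```text
# Within a boolean clause, a bare negative query matches nothing:
puid:[0 TO *] OR -puid:[* TO *]

# Anchored to the match-all query *:*, the negation has a set to
# subtract from ("all documents without a puid value"):
puid:[0 TO *] OR (*:* -puid:[* TO *])
```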

Best
Jan



Solr not starting JMX

2010-02-04 Thread Jan-Simon Winkelmann
Hi everyone,

I am currently trying to set up JMX support for Solr, but somehow the
listening socket is not even created on my specified port.
My parameters look like this (running the Solr example):

java -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=6060
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false -jar start.jar


After running, netstat does not show anything listening on port 6060.

In my solrconfig.xml I have <jmx /> activated in the <config> section.
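
For reference, the relevant solrconfig.xml switch is small; a sketch of the Solr 1.4 setup being described:

```xml
<!-- Inside the <config> element of solrconfig.xml: with a bare <jmx />,
     Solr registers its MBeans with an MBeanServer already started by the
     JVM (e.g. via the com.sun.management.jmxremote flags shown above). -->
<jmx />
```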


I would very much appreciate some help here, I am out of ideas.


Thanks very much in advance
Jan-Simon Winkelmann





AW: Solr not starting JMX

2010-02-05 Thread Jan Simon Winkelmann
: My parameters look like this (running the Solr example):
: 
: java -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=6060
: -Dcom.sun.management.jmxremote.authenticate=false
: -Dcom.sun.management.jmxremote.ssl=false -jar start.jar

What implementation/version of java are you running?

On my Mac running Sun Java 1.6, that exact command works fine for Solr 1.4 
(using JConsole to connect to "localhost:6060" as a remote process)

Perhaps other JVM providers require different options to start up JMX?



~# /usr/lib/jvm/java-1.5.0-sun-1.5.0.14/jre/bin/java -version
java version "1.5.0_14"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_14-b03)
Java HotSpot(TM) Client VM (build 1.5.0_14-b03, mixed mode, sharing)


Sun Java 1.5, should work fine afaik.


Solr-JMX/Jetty agentId

2010-02-10 Thread Jan Simon Winkelmann
Hi,

I am (still) trying to get JMX to work. I have finally managed to get a Jetty 
installation running with the right parameters to enable JMX. Now the next 
problem has appeared: I need to get Solr to register its MBeans with the Jetty 
MBeanServer. Using <jmx serviceUrl="service:jmx:rmi:///jndi/rmi:///jettymbeanserver" />, 
Solr doesn't complain on loading, but the MBeans simply don't show up in 
JConsole, so I would like to use <jmx agentId="..." /> instead. But where do I 
get the agentId? And what exactly does this Id represent? Does it change every 
time I restart Jetty?

Thanks in advance!
Jan-Simon Winkelmann


AW: Solr-JMX/Jetty agentId

2010-02-10 Thread Jan Simon Winkelmann
2010/2/10 Jan Simon Winkelmann :
> I am (still) trying to get JMX to work. I have finally managed to get a Jetty
> installation running with the right parameters to enable JMX. Now the next
> problem appeared. I need to get Solr to register its MBeans with the Jetty
> MBeanServer. Using <jmx serviceUrl="service:jmx:rmi:///jndi/rmi:///jettymbeanserver" />,
> Solr doesn't complain on loading, but the MBeans simply don't show up in
> JConsole, so I would like to use <jmx agentId="..." />. But where do I get the
> agentId? And what exactly does this Id represent? Does it change every time I
> restart Jetty?

I just have <jmx /> in solrconfig.xml. On the command line I start Solr with this:
$ java -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false -jar start.jar

In jconsole I can browse the solr beans just fine.




Thanks for that, it appears my thinking was just too complicated here. Works 
fine now :)

Best
Jan


xml error when indexing

2010-02-17 Thread Jan Simon Winkelmann
Hi,

I'm having a strange problem when indexing data through our application. 
Whenever I post something to the update resource, I get

Unexpected character 'a' (code 97) in prolog; expected '<'
 at [row,col {unknown-source}]: [1,1]

Error 400 Unexpected character 'a' (code 97) in prolog; expected '<'
 at [row,col {unknown-source}]: [1,1]

HTTP ERROR 400
Problem accessing /solr/update. Reason:
Unexpected character 'a' (code 97) in prolog; expected '<'
 at [row,col {unknown-source}]: [1,1]

Powered by Jetty://


However, when I post the same data from an xml file using curl it works.

The add command looks like this:

145405329411702010-02-16T15:30:02Z02010-02-16T15:30:02Z2019-12-31T00:00:00Z0145-4053294«Positives Gespräch» zwischen Bielefeld und DFL«Positives Gespräch» zwischen Bielefeld und 
DFLBielefeld (dpa) - Der finanziell 
angeschlagene Zweitligist Arminia Bielefeld hat der Deutschen Fußball Liga in 
Frankfurt/Main einen Maßnahmen-Katalog präsentiert. 

Bielefeld (dpa) - The financially troubled second-division club Arminia Bielefeld has presented a catalogue of measures to the Deutsche Fußball Liga in Frankfurt/Main.

«We are currently working on this at full speed,» said Arminia managing director Heinz Anders. According to Anders, the Arminia delegation, which also included manager Detlev Dammeier, supervisory board chairman Norbert Leopoldseder and finance officer Henrik Wiehl, analysed the situation «openly and transparently» before the DFL representatives. It had been a «very positive conversation». The measures, which were not explained in detail, now have to be implemented and demonstrated to the DFL accordingly.

The DFL did not comment on the meeting at its Frankfurt headquarters. «We do not comment on such matters,» a spokesman said when asked by the German press agency dpa.

Former first-division club Bielefeld has liabilities and debts of around 15.5 million euros. In this season's operating business there is a financing gap of 2.5 million euros. The club overextended itself above all with the expansion and modernisation of the SchücoArena. In addition, the development of attendance figures and sponsorship income after relegation from the Bundesliga has been disappointing. For the stadium alone, 13 million euros still have to be repaid. The club is even considering selling the SchücoArena.

The system we run on is Solr 1.4 with Jetty Hightide 7.0.1. Am I missing something here? I would be glad for any help.

Best
Jan

Strange search behavior

2010-02-24 Thread Jan Simon Winkelmann
Hi,

I'm having some problems understanding why certain search queries don't return 
any results.
I have a field of type "text", which is defined like this:



 

 

 





 




 





I have a total of about 3.2 million documents indexed, of which a few hundred 
are in the format of "Tagesergebnisse der Oddset-Spiele vom 18.02.2010".

My problem is that if I search for "oddset-spiele", I get no results, but when 
I search for "oddsetspiele" or "oddset*spiele" I get lots of results. As far as 
I understand, the WordDelimiterFilter converts each phrase into "name:oddset 
(spiel oddsetspiel)", at least that's what the analyzer says. What I don't get 
is that when I search for "oddset-spiele" I get no results at all.
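
The analyzer definition above was stripped by the archive; a typical Solr 1.4 `text` type producing tokens like the ones described would look roughly like this (sketch, all parameter values are assumptions):

```xml
<!-- Sketch of a typical analyzer for this behaviour; the original
     definition was lost, so all parameter values are assumptions. -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Splits "Oddset-Spiele" into "oddset", "spiele" and, with
         catenateWords="1", also "oddsetspiele" -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A frequent cause of exactly this symptom is an index-time/query-time mismatch, e.g. catenateWords enabled only on one side, so the indexed and queried token streams no longer line up.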

I would appreciate any help or insight anyone could provide.

Best
Jan


copyField for dynamicFields

2010-04-27 Thread Jan Simon Winkelmann
Hi,

I have the following configured in my schema.xml:





What I can't quite figure out is when exactly the data from the _i fields gets 
copied to the _i_f fields. Does it get processed first (Tokenizer, Filters, 
etc.) or copied first?
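
Reconstructed from the suffixes mentioned in the question (the schema lines themselves were stripped by the archive), the configuration presumably resembled:

```xml
<!-- Sketch; the types are assumptions, only the *_i / *_i_f suffixes
     come from the original mail -->
<dynamicField name="*_i"   type="integer" indexed="true" stored="true"/>
<dynamicField name="*_i_f" type="sfloat"  indexed="true" stored="false"/>

<copyField source="*_i" dest="*_i_f"/>
```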

I would appreciate any insight. Thanks in advance!

Best,
Jan-Simon Winkelmann


AW: copyField for dynamicFields

2010-04-27 Thread Jan Simon Winkelmann
Hi,

thanks very much for that. I was actually worried I would have to restructure 
the index and the interface in our application.

regards,
Jan-Simon

> -----Original Message-----
> From: Naga Darbha [mailto:ndar...@opentext.com]
> Sent: Tuesday, 27 April 2010 12:16
> To: solr-user@lucene.apache.org
> Subject: RE: copyField for dynamicFields
> 
> Hi,
> 
> I think copyField copies the un-processed content (that will be
> processed by source field) onto the target field and processes it based
> on target field's type.  It is *copied first*.
> 
> regards,
> Naga
> 
> -----Original Message-----
> From: Jan Simon Winkelmann [mailto:winkelm...@newsfactory.de]
> Sent: Tuesday, April 27, 2010 2:41 PM
> To: solr-user@lucene.apache.org
> Subject: copyField for dynamicFields
> 
> Hi,
> 
> i have the following configured in my schema.xml:
> 
>  required="false" />
>  required="false" />
> 
> 
> What I can't quite figure out, is when exactly the data from the _i
> fields gets copied to the _i_f fields. Does it get processed first
> (Tokenizer, Filters, etc.) or copied first?
> 
> I would appreciate any insight. Thanks in advance!
> 
> Best,
> Jan-Simon Winkelmann


Slow Date-Range Queries

2010-04-29 Thread Jan Simon Winkelmann
Hi,

I am currently having serious performance problems with date range queries. 
What I am doing is validating a dataset's published status via a valid_from and 
a valid_till date field.

I did get a performance boost of ~100% by switching from a normal 
solr.DateField to a solr.TrieDateField with precisionStep="8"; however, my query 
still takes about 1.3 seconds.

My field definition looks like this:






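The definitions themselves were stripped by the archive; given the precisionStep mentioned above and the field names in the query below, they presumably resembled:

```xml
<!-- Sketch; only precisionStep="8" and the field names come from the mail -->
<fieldType name="tdate" class="solr.TrieDateField" precisionStep="8"
           omitNorms="true" positionIncrementGap="0"/>

<field name="valid_from" type="tdate" indexed="true" stored="false"/>
<field name="valid_till" type="tdate" indexed="true" stored="false"/>
```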

And the query looks like this:
(((valid_from:[* TO 2010-04-29T10:34:12Z]) AND (valid_till:[2010-04-29T10:34:12Z 
TO *])) OR ((*:* -valid_from:[* TO *]) AND (*:* -valid_till:[* TO *])))

I use the empty checks for datasets which do not have a valid from/till range.


Is there any way to get this any faster? Would it be faster using 
unix-timestamps with int fields?

I would appreciate any insight and help on this.

regards,
Jan-Simon



AW: Slow Date-Range Queries

2010-04-29 Thread Jan Simon Winkelmann
> > ((valid_from:[* TO 2010-04-29T10:34:12Z]) AND
> > (valid_till:[2010-04-29T10:34:12Z TO *])) OR ((*:*
> > -valid_from:[* TO *]) AND (*:* -valid_till:[* TO *])))
> >
> > I use the empty checks for datasets which do not have a
> > valid from/till range.
> >
> >
> > Is there any way to get this any faster?
> 
> I can suggest you two things.
> 
> 1-) valid_till:[* TO *] and valid_from:[* TO *] type queries can be
> performance killer. You can create a new boolean field ( populated via
> conditional copy or populated client side) that holds the information
> whether valid_from exists or not. So that valid_till:[* TO *] can be
> rewritten as valid_till_bool:true.

That may be an idea; however, I checked what happens when I simply leave them 
out. It does affect the performance, but the query still takes somewhere around 
1 second.
 
> 2-) If you are embedding these queries into q parameter, you can write
> your clauses into (filter query) fq parameters so that they are cached.

The problem here is that the timestamp itself changes quite a bit and hence 
cannot be properly cached. It could be cached for a few seconds, but occasional 
response times of more than a second are still unacceptable for us. We need a 
solution that responds quickly ALL the time, not just most of the time.
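
One standard workaround, not raised in the thread, is to round the timestamp with Solr date math so that every request within the same interval produces an identical, cacheable filter query:

```text
# Sketch: NOW/HOUR is constant for a whole hour, so the fq string is
# identical across requests and can be served from the filter cache.
fq=valid_from:[* TO NOW/HOUR+1HOUR] AND valid_till:[NOW/HOUR TO *]
```

This trades up to an hour of staleness in the published-state check for consistent filter-cache hits; a smaller rounding unit (NOW/MINUTE) shifts that trade-off.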

Thanks for your ideas though :)

regards,
Jan-Simon



AW: Slow Date-Range Queries

2010-04-30 Thread Jan Simon Winkelmann
For now I need them; however, I will most likely (as suggested by Ahmet Arslan) 
create another boolean field to get rid of them, simply because I am switching 
to Solr 1.4 frange queries.

On the topic of frange queries, is there a way to simulate the date-range 
wildcards here? They don't seem to work with frange.
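
For reference, frange in Solr 1.4 operates on function values rather than on range syntax, so an open bound is expressed by simply omitting `l` or `u`; for dates this is usually combined with the ms() function (sketch, field name taken from the earlier mails):

```text
# Sketch: "valid_from <= NOW" as a function range query. ms(a,b) returns
# the millisecond difference a-b; u=0 caps it at zero, and the missing
# l= leaves the lower bound open (the [* TO ...] equivalent).
fq={!frange u=0}ms(valid_from,NOW)
```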

> Do you really need the *:* stuff in the date range subqueries? That
> may add to the execution time.