Aliases for fields

2009-08-18 Thread Licinio Fernández Maurelo
Hello everybody,

Can I set an alias for a field? Something like:

<field ... stored="true" multiValued="false" termVectors="false"
alias="source.date"/>

Is there any related JIRA issue?

Thx

-- 
Lici


Re: Aliases for fields

2009-08-18 Thread Avlesh Singh
What could possibly be a use case for such a need?

Cheers
Avlesh

2009/8/18 Licinio Fernández Maurelo 

> Hello everybody,
>
> can i set an alias for a field? Something like :
>
> <field ... stored="true" multiValued="false" termVectors="false"
> alias="source.date"/>
>
> is there any jira issue related?
>
> Thx
>
> --
> Lici
>


Re: Aliases for fields

2009-08-18 Thread Licinio Fernández Maurelo
Currently we are trying to unmarshal objects from the index (the Solr
bean annotations didn't fully accomplish this in our project due to model
complexity).
It would be nice to set an alias for some fields to match the pojo.property name.
I don't know if there is an alternative (maybe copyField?) to implement
this behaviour.

thanks

2009/8/18 Avlesh Singh :
> What could possibly be a use case for such a need?
>
> Cheers
> Avlesh
>
>



-- 
Lici


Re: Aliases for fields

2009-08-18 Thread Avlesh Singh
>
> solr bean tags didn't fully acomplish this issue in our project due to
> model complexity
>
Did you try annotating your POJO in this manner?
@Field("index_field_name")
private String pojoPropertyName;

> It will be nice to set an alias for some fields to match the pojo.property
> name. Don't know if there is an alternative (maybe copyField?) to implement
> this behaviour
>
Though I am not sure what you want to achieve, a copyField is very
similar to what you are asking for.
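For reference, this is roughly what the copyField alternative looks like in schema.xml (the field names and types below are hypothetical):

```xml
<!-- Hypothetical schema.xml fragment: every value indexed into
     "source_date" is also copied into "date" at index time. -->
<field name="source_date" type="date" indexed="true" stored="true"/>
<field name="date"        type="date" indexed="true" stored="true"/>
<copyField source="source_date" dest="date"/>
```

Note that copyField duplicates the data at index time; it is not a query-time alias.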

Cheers
Avlesh

2009/8/18 Licinio Fernández Maurelo 

> Currently we are trying to unmarshall objets from the index (solr bean
> tags didn't fully acomplish this issue in our project due to model
> complexity).
> It will be nice to set an alias for some fields to match the pojo.property
> name.
> Don't know if there is an alternative (maybe copyfield?)  to implement
> this beahaviour
>
> thanks
>
> 2009/8/18 Avlesh Singh :
> > What could possibly be a use case for such a need?
> >
> > Cheers
> > Avlesh
> >
> > 2009/8/18 Licinio Fernández Maurelo 
> >
> >> Hello everybody,
> >>
> >> can i set an alias for a field? Something like :
> >>
> >>  >> stored="true" multiValued="false" termVectors="false"
> >> alias="source.date"/>
> >>
> >> is there any jira issue related?
> >>
> >> Thx
> >>
> >> --
> >> Lici
> >>
> >
>
>
>
> --
> Lici
>


Re: Questions about MLT

2009-08-18 Thread Avlesh Singh
Invalid question?

Cheers
Avlesh

On Mon, Aug 17, 2009 at 10:05 PM, Avlesh Singh  wrote:

> I have an index of documents which contain these two fields:
> <field name="city_id" ... termVectors="true" termPositions="true" termOffsets="true"/>
> <field name="categories" ... termVectors="true" termPositions="true" termOffsets="true"/>
>
> Using the MLT handler with similarity field as city_id works fine and as
> expected, however with categories it does not work at all. I tried looking
> at "interestingTerms" in the latter case but the handler does not return
> anything. Something to do with multiValued fields?
> I am using Solr 1.3.
>
> Any help would be appreciated.
>
> Cheers
> Avlesh
>
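For context, the interestingTerms check described above corresponds to an MLT request along these lines (host and document id are hypothetical):

```
http://localhost:8983/solr/mlt?q=id:123&mlt.fl=categories&mlt.interestingTerms=list&mlt.mintf=1&mlt.mindf=1
```

Note that mlt.mintf defaults to 2, so terms that occur only once in the source document (as ids in a multiValued field typically do) contribute no interesting terms unless it is lowered; that is one plausible explanation for the empty result here.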


Re: How can i get lucene index format version information?

2009-08-18 Thread Licinio Fernández Maurelo
Does nobody know how I can get exactly this info: index format : -9 (UNKNOWN)?

Although knowing the implementation version (2.9-dev 794238 -
2009-07-15 18:05:08) helps, I assume that it doesn't imply an
index format change.

Am I wrong?

El 11 de agosto de 2009 11:53, Licinio Fernández
Maurelo escribió:
> Thanks all for your responses,
>
> what I expect to get is the index format version as it appears in
> Luke's overview tab (index format : -9 (UNKNOWN))
>
> 2009/7/31 Jay Hill :
>> Check the system request handler: http://localhost:8983/solr/admin/system
>>
>> Should look something like this:
>> <lst name="lucene">
>>   <str name="solr-spec-version">1.3.0.2009.07.28.10.39.42</str>
>>   <str name="solr-impl-version">1.4-dev 797693M - jayhill - 2009-07-28 10:39:42</str>
>>   <str name="lucene-spec-version">2.9-dev</str>
>>   <str name="lucene-impl-version">2.9-dev 794238 - 2009-07-15 18:05:08</str>
>> </lst>
>>
>> -Jay
>>
>>
>> On Thu, Jul 30, 2009 at 10:32 AM, Walter Underwood 
>> wrote:
>>
>>> I think the properties page in the admin UI lists the Lucene version, but I
>>> don't have a live server to check that on at this instant.
>>>
>>> wunder
>>>
>>>
>>> On Jul 30, 2009, at 10:26 AM, Chris Hostetter wrote:
>>>
>>>
 : > i want to get the lucene index format version from solr web app (as

 : the Luke request handler writes it out:
 :
 :    indexInfo.add("version", reader.getVersion());

 that's the index version (as in "i have added docs to the index, so the
 version number has changed") the question is about the format version (as
 in: "i have upgraded Lucene from 2.1 to 2.3, so the index format has
 changed")

 I'm not sure how Luke gets that ... it's not exposed via a public API on
 an IndexReader.

 Hmm...  SegmentInfos.readCurrentVersion(Directory) seems like it would do
 the trick; but I'm not sure how that would interact with customized
 IndexReader implementations.  I suppose we could always make it non-fatal
 if it throws an exception (just print the exception message in place of the
 number)

 anybody want to submit a patch to add this to the LukeRequestHandler?


 -Hoss

>>>
>>>
>>
>
>
>
> --
> Lici
>



-- 
Lici


Re: Aliases for fields

2009-08-18 Thread Licinio Fernández Maurelo
Our purpose is to reuse the data stored in our indexes, serving it to
clients in multiple formats (xml, php, json) directly (no mapper tier
wanted).

As the client model entity names don't match the index field names, we
want to use an alias in some way to adapt the response for the client.

Taking a look at the Solr wiki, I found this:

"This (copyField) is provided as a convenient way to ensure that data
is put into several fields, without needing to include the data in the
update command multiple times"

I want this behaviour in read-only mode (I don't want to
duplicate data).
Thx


2009/8/18 Avlesh Singh :
>>
>> solr bean tags didn't fully acomplish this issue in our project due to
>> model complexity
>>
> Did you try annotating your pojo in this manner?
> @Field("index_field_name")
> pojoPropertyName;
>
> It will be nice to set an alias for some fields to match the pojo.property
>> name. Don't know if there is an alternative (maybe copyfield?) to implement
>> this beahaviour
>>
> Though I am not sure what you want to achieve, yet a copyField is very
> similar to what you are asking for.
>
> Cheers
> Avlesh
>
> 2009/8/18 Licinio Fernández Maurelo 
>
>> Currently we are trying to unmarshall objets from the index (solr bean
>> tags didn't fully acomplish this issue in our project due to model
>> complexity).
>> It will be nice to set an alias for some fields to match the pojo.property
>> name.
>> Don't know if there is an alternative (maybe copyfield?)  to implement
>> this beahaviour
>>
>> thanks
>>
>>
>>
>>
>> --
>> Lici
>>
>



-- 
Lici


Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Constantijn Visinescu
Ok ... sounds like something is screwed up somewhere(tm). Hard part is
figuring out where :)


My suggestion:

Throw everything that's remotely related to the webapp you're trying to
build off that server, and make sure you get all of it (no stray bits of
Solr configuration files left over anywhere).

Next get apache-solr-1.3.0.zip from
http://apache.mirrors.webazilla.nl/lucene/solr/1.3.0/ (or whatever mirror
you prefer, see solr main page)

Make a directory for solr on your server somewhere that's completely
separate from anything Tomcat or webserver related (i.e. not a subdirectory
of /usr/share/tomcat5/ ... maybe try making /usr/share/solr/? Put it
somewhere it's guaranteed not to interfere with anything :))

Copy the contents of \apache-solr-1.3.0\example\solr\ to the directory you
just made using ssh

Throw away the data folder if it exists. (that should leave you with the bin
and conf folders + readme.txt, no lib folder or anything else)

In the conf folder, comment out the
<dataDir>${solr.data.dir:./solr/data}</dataDir> line in solrconfig.xml near
the top.

make sure tomcat has all the rights it needs on the /usr/share/solr/ folder
(i'm assuming read/write, someone correct me if I'm wrong)

Grab apache-solr-1.3.0.war from apache-solr-1.3.0/dist/ and edit the web.xml
to include

   <env-entry>
      <env-entry-name>solr/home</env-entry-name>
      <env-entry-value>/usr/share/solr</env-entry-value>
      <env-entry-type>java.lang.String</env-entry-type>
   </env-entry>

near the end.

Upload the war with the new web.xml using plesk.

Surf to http://myserver:myport/apache-solr-1.3.0/index.jsp to see your solr
installation up and running.

After that, solr should have created a new blank index in the
/usr/share/solr/data folder, and solr should be up and running.

If this works you should be able to start adding all the bits and pieces of
your own app and go from there :) This might be a bit overkill but if you
can get this up and running you should be able to get it to do what you need
to do in the end. If this doesn't work double check you got the path in the
web.xml correct and that tomcat can access that folder with all rights. If
it still doesn't work after that I'm not sure what to do ;)


On Tue, Aug 18, 2009 at 5:34 AM, Funtick  wrote:

>
> It is NOT a sample war, it is the SOLR application: solr.war - as it should
> be!!! I usually build from source and use dist/apache-solr-1.3.war instead,
> so I am not sure about solr.war.
>
> solr.xml contains configuration for multicore; most probably something is
> wrong with it.
> It would be better if you install Tomcat on a local box and play with it
> before going to production...
>
>
>
> Aaron Aberg wrote:
> >
> > On Mon, Aug 17, 2009 at 10:58 AM, Fuad Efendi wrote:
> >> Looks like you are using SOLR multicore, with solr.xml... I never tried
> >> it...
> >> The rest looks fine, except suspicious solr.xml
> >
> > What's suspicious about it? Is it in the wrong place? Is it not supposed
> > to be there?
> >
> > technically my war file is not in my webapps directory. I'm using
> > plesk and it installed my war file here:
> > tomcat5/psa-webapps/mywebk9.com/solr.war
> >
> > I have installed a sample war file and its in the same location. It works
> > fine.
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Cannot-get-solr-1.3.0-to-run-properly-with-plesk-9.2.1-on-CentOS-tp24980824p25017895.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Re: Maximum number of values in a multi-valued field.

2009-08-18 Thread Constantijn Visinescu
Hi,

This would also depend on the amount of documents you got in your solr
index.

10k ? 100k? 1m? 10m? 100m?

I'm by no means an expert on Solr, but I recently had a similar question; to
get my answer I grabbed a new blank Solr index, modified my schema.xml, and
reindexed all my data assuming the worst case.

Make up something like 100k different catalogue names (pick a number that
seems realistic and multiply by 10) and assign 2000 to each document randomly.
Let your computer crunch bits for a few hours to rebuild an index (close to
10m documents in my case), copy the index to a server similar to your
production server, and see what happens :)

For me there was no noticeable performance difference.

Constantijn Visinescu

On Tue, Aug 18, 2009 at 1:15 AM, Aravind Naidu  wrote:

>
> Hi,
> The possibility is that all items in this field could be unique. Let me
> clarify.
> The main Solr index is a for a list of products. Some products belong to
> catalogues.  So, the consideration is to add a multi-valued field to put
> the
> id of the catalogue in each product as a multi-valued field to be used as a
> filter.
>
> -- Aravind
>
>
> Jason Rutherglen-2 wrote:
> >
> > Your term dictionary will grow somewhat, which means the term
> > index could consume more memory. Because the term dictionary has
> > grown there could be less performance in looking up terms but
> > that is unlikely to affect your application. How many unique
> > terms will there be?
> >
> > On Mon, Aug 17, 2009 at 3:50 PM, Arv wrote:
> >>
> >> All,
> >> We are considering some new changes to our Solr schema to better support
> >> some new functionality for our application. To that extent, we want to
> >> add
> >> an additional field that is multi-valued, but will contain a large
> number
> >> of
> >> values per document. Potentially up to 2000 values on this field per
> >> document.
> >>
> >> Questions:
> >> - Is this wise?
> >> - Though we will not be faceting on this field, are there any
> >> implications
> >> for performance?
> >> - I understand that the XML in/out will be large, and we may need to
> stop
> >> this field being sent back on every query, as this field is essentially
> >> used
> >> as a filter only.
> >>
> >> The reason I am asking is that our instance of Solr currently works
> >> wonderfully and is very fast, and I am wary of doing anything that will
> >> affect this.  So, any pointer on design here will help.
> >>
> >> -- Aravind
> >>
> >> --
> >> View this message in context:
> >>
> http://www.nabble.com/Maximum-number-of-values-in-a-multi-valued-field.-tp25015685p25015685.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Maximum-number-of-values-in-a-multi-valued-field.-tp25015685p25015945.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Re: Aliases for fields

2009-08-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
2009/8/18 Licinio Fernández Maurelo :
> Hello everybody,
>
> can i set an alias for a field? Something like :
>
>  stored="true" multiValued="false" termVectors="false"
> alias="source.date"/>
>
> is there any jira issue related?
yes https://issues.apache.org/jira/browse/SOLR-1205
>
> Thx
>
> --
> Lici
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Aliases for fields

2009-08-18 Thread Fergus McMenemie
>What could possibly be a use case for such a need?
>

I would love to see such a feature.

I have a multi-core Solr setup with each core having utterly
different content. Each core has its own "custom search app"
that exploits nuances specific to a particular data set. The
field names are chosen as best fits a particular data set.

However I would also like to have one or two general search
features that span all cores. This is a crude, one-size-fits-all
type of search:-

One core has fields:-
author
title
text
Another has
sender
subject
message
Another has
placename
description

I either need to rename all fields within some of the
"custom search apps" to account for the needs of the global
search, or perform lots of copyFields, or construct really
nasty queries.

I currently use the copyFields approach. I think aliases 
would allow for far more efficient indexes and clear code.

Regards Fergus.

>Cheers
>Avlesh
>
>2009/8/18 Licinio Fernández Maurelo 
>
>> Hello everybody,
>>
>> can i set an alias for a field? Something like :
>>
>> > stored="true" multiValued="false" termVectors="false"
>> alias="source.date"/>
>>
>> is there any jira issue related?
>>
>> Thx
>>
>> --
>> Lici
>>

-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Distributed index

2009-08-18 Thread ToJira

Hi,

I am very new to Solr and overall a newbie in software development. I have a
problem with a cross-platform implementation. Basically I have a local index
running on a Windows Server 2003 machine, fronted by a web service (asp.net) for
the user queries. However, I need to add another index on a remote Linux
computer. As I understand it, it is necessary to run a Solr
server on the Linux machine because it is not possible to use .net there. So I
would like to know: how do I search the remote index from the asp.net web
service user queries? Any tutorials related to this?
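Since Solr answers plain HTTP requests, any client stack (including asp.net) can search it by fetching a URL; a sketch, with hypothetical host names:

```
# query the remote Solr index over HTTP from any client
http://linux-host:8983/solr/select?q=some+query&wt=xml

# Solr 1.3+ can also merge results from several Solr indexes in one
# request via the shards parameter
http://linux-host:8983/solr/select?q=some+query&shards=host1:8983/solr,host2:8983/solr
```

Note the shards parameter requires every index involved to be served by a Solr instance, which may not apply to a non-Solr index on the Windows side.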

thanks!  
-- 
View this message in context: 
http://www.nabble.com/Distributed-index-tp25022053p25022053.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Issue with Collection & Distribution

2009-08-18 Thread william pink
Hi,

Sorry for the delayed response; I didn't even realise I had got a reply. Those
logs are from the slave, and both versions of Solr are the same:

Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02 17:35:12

Would it be worth upgrading them?

Thank you for the assistance,
Will

On Thu, Aug 13, 2009 at 6:28 PM, Bill Au  wrote:

> Have you check the solr log on the slave to see if there was any commit
> done?  It looks to me you are still using an older version of the commit
> script that is not compatible with the newer Solr response format.  If
> thats' the case, the commit was actually performed.  It is just that the
> script failed to handle the Solr response.  See
>
> https://issues.apache.org/jira/browse/SOLR-463
> https://issues.apache.org/jira/browse/SOLR-426
>
> Bill
>
> On Thu, Aug 13, 2009 at 12:28 PM, william pink 
> wrote:
>
> > Hello,
> >
> > I am having a few problems with the snapinstaller/commit on the slave, I
> > have a pull_from_master script which is the following
> >
> > #!/bin/bash
> > cd /opt/solr/solr/bin -v
> > ./snappuller -v -P 18983
> > ./snapinstaller -v
> >
> >
> > I have been executing snapshooter manually on the master then running the
> > above script to test but I am getting the following
> >
> > 2009/08/13 17:18:16 notifing Solr to open a new Searcher
> > 2009/08/13 17:18:16 failed to connect to Solr server
> > 2009/08/13 17:18:17 snapshot installed but Solr server has not open a new
> > Searcher
> >
> > Commit logs
> >
> > 2009/08/13 17:18:16 started by user
> > 2009/08/13 17:18:16 command: /opt/solr/solr/bin/commit
> > 2009/08/13 17:18:16 commit request to Solr at
> > http://slave-server:8983/solr/update failed:
> > 2009/08/13 17:18:16 <?xml version="1.0" encoding="UTF-8"?>
> > <response><lst name="responseHeader"><int name="status">0</int><int
> > name="QTime">28</int></lst></response>
> > 2009/08/13 17:18:16 failed (elapsed time: 0 sec)
> >
> > Snappinstaller logs
> >
> > 2009/08/13 17:18:16 started by user
> > 2009/08/13 17:18:16 command: ./snapinstaller -v
> > 2009/08/13 17:18:16 installing snapshot
> > /opt/solr/solr/data/snapshot.20090813171835
> > 2009/08/13 17:18:16 notifing Solr to open a new Searcher
> > 2009/08/13 17:18:16 failed to connect to Solr server
> > 2009/08/13 17:18:17 snapshot installed but Solr server has not open a new
> > Searcher
> > 2009/08/13 17:18:17 failed (elapsed time: 1 sec)
> >
> >
> > Is there a way of telling why it is failing?
> >
> > Many Thanks,
> > Will
> >
>


Index health checking

2009-08-18 Thread Licinio Fernández Maurelo
As you might suppose, I'm asking whether Solr currently implements this
functionality, or whether there is any related JIRA issue.

A few days ago, our Solr server suffered an unsafe power shutdown.
After restoring, we found wrong behaviour (we got a NullPointerException
when applying sort criteria in some queries) due to index corruption.

I think it would be nice if Solr could check whether the index is corrupted
(and fix corrupted indexes too).

Found this functionality at:

 http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/CheckIndex.html
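For reference, CheckIndex can also be run from the command line against a (stopped) index; a sketch with hypothetical paths, against Lucene 2.4:

```
java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex /opt/solr/solr/data/index

# -fix drops any unreadable segments (losing the documents they contain)
java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex /opt/solr/solr/data/index -fix
```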

Thx
-- 
Lici


Writing and using your own Query class in solr 1.4 (trunk)

2009-08-18 Thread Jérôme Etévé
Hi all,

 I have a custom search component which uses a query I wrote.
Basically, this Query (called DocSetQuery) is a Query decorator that
skips any document which is not in a given document set. My code used
to work perfectly in Solr 1.3, but in Solr 1.4 it seems that my
DocSetQuery has lost all its power.

I noticed that to be compliant with solr 1.4 trunk and the lucene it
contains, I should implement two new methods:

createQueryWeight
and
queryWeight

So I did. It was very easy, because basically it's only about re-using
the deprecated Weight createWeight and wrapping the result with a
QueryWeightWrapper.

So now I believe my DocSetQuery complies with the new
solr1.4/lucene2.9-dev api. And I've got those methods:

 public QueryWeight queryWeight(Searcher searcher) throws IOException {
return createQueryWeight(searcher);
  }
 public QueryWeight createQueryWeight(Searcher searcher) throws IOException {
log.info("[sponsoring] creating QueryWeight calling createQueryWeight ");
return new QueryWeightWrapper(createWeight(searcher));
  }
 public Weight weight(Searcher searcher) throws IOException {
return createWeight(searcher);
  }

//and of course

protected Weight createWeight(final Searcher searcher) throws IOException {
log.info("[sponsoring] creating weight with DoCset " + docset.size());
...
}

I'm then using my DocSetQuery in my custom SearchComponent like that:

Query limitedQuery = new DocSetQuery(decoratedQuery , ... );

Then I simply perform a search by doing

rb.req.getSearcher().search(limitedQuery, myCollector);

My problem is that neither createQueryWeight nor createWeight is called
by the Solr searcher, and I'm wondering what I did wrong.
Should I build the Weight myself and call the search method which
accepts a Weight object?

This is quite confusing because:
- it used to work perfectly in solr 1.3
- in the nightly-build version of the Lucene API, those new methods
createQueryWeight and queryWeight have disappeared, but in the Lucene
that the Solr 1.4 trunk uses they exist, and the old ones (createWeight
and weight) are deprecated.


Thanks for your help.

Jerome Eteve.

-- 
Jerome Eteve.

Chat with me live at http://www.eteve.net

jer...@eteve.net


MultiCore Queries? are they possible

2009-08-18 Thread Ninad Raut
Hi,
Can we create a Join query between two indexes on two cores? Is this
possible in Solr?
I have an index which stores author profiles and another index which stores
content with an author id as a reference. Can I query as:
select Content,AuthorName
from Core0,Core1
where core0.authorid = core1.authorid and authorid=A123
Regards,
Ninad


Re: Writing and using your own Query class in solr 1.4 (trunk)

2009-08-18 Thread Mark Miller

You have run into some stuff that has been somewhat rolled back in Lucene.

QueryWeight, and the methods it brought, have been reverted.

Shortly (when Solr trunk updates Lucene), Solr will go back to just
createWeight and weight.

The main change that will be left is that Weight will be an abstract
class rather than an interface.



--
- Mark

http://www.lucidimagination.com




Re: Index health checking

2009-08-18 Thread Grant Ingersoll

See http://issues.apache.org/jira/browse/SOLR-566.

Patches welcome.

On Aug 18, 2009, at 7:46 AM, Licinio Fernández Maurelo wrote:




--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Proximity Search

2009-08-18 Thread Ninad Raut
Hi,
I want to count the words between two significant words like "shell" and
"petroleum", or to write a query to find all the documents where the
content has "shell" and "petroleum" in close proximity, with fewer than 10
words between them.
Can such queries be created in Solr?
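The second half of this is covered by Lucene's proximity (slop) syntax, which Solr's standard query parser supports (the field name here is hypothetical):

```
content:"shell petroleum"~10
```

Counting the exact number of words between the two terms is not expressible as a plain query, though; that would mean inspecting term positions (e.g. via term vectors) on the client side.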
Regards,
Ninad Raut.


Re: Issue with Collection & Distribution

2009-08-18 Thread Bill Au
I'd say it is worth upgrading, since 1.2 is old.  1.4 is almost ready to be
released, so you may want to wait a little while longer.  There are many
nice new features in 1.4, and performance improvements too.  In the
mean time, you can just get the latest version of the scripts from SVN;
those should work as-is.

Bill

On Tue, Aug 18, 2009 at 6:54 AM, william pink  wrote:

> Hi,
>
> Sorry for the delayed response didn't even realise I had got a reply, those
> logs are from the slave and the both version of Solr are the same
>
> Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02 17:35:12
>
> It maybe worth upgrading them?
>
> Thank you for the assistance,
> Will
>
>


Re: Writing and using your own Query class in solr 1.4 (trunk)

2009-08-18 Thread Jérôme Etévé
Hi Mark,


Thanks for clarifying this. Should I keep both sets of methods
implemented? I guess it won't hurt once the Solr trunk uses the
updated version of Lucene without those methods.

What I don't get is that neither my createWeight nor my createQueryWeight
method seems to be called when I call
rb.req.getSearcher().search(limitedQuery, myCollector);

I'll look at the code to find out.

Thanks!

Jerome

2009/8/18 Mark Miller :
> You have run into some stuff that has been somewhat rolled back in Lucene.
>
> QueryWieght, and the methods it brought have been reverted.
>
> Shortly (when Solr trunk updates Lucene), Solr will go back to just
> createWeight and weight.
>
> The main change that will be left is that Weight will be an abstract class
> rather than an interface.
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
>



-- 
Jerome Eteve.

Chat with me live at http://www.eteve.net

jer...@eteve.net


Release Date Solr 1.4

2009-08-18 Thread Daniel Knapp

Hello Mailinglist,


does anyone know the release date of Solr 1.4?

Thanks for your reply.


Regards,
Daniel


Re: Writing and using your own Query class in solr 1.4 (trunk)

2009-08-18 Thread Mark Miller

I'm pretty sure one of them is called. In the version you have:

 public void search(Query query, HitCollector results)
   throws IOException {
   search(createQueryWeight(query), null, new 
HitCollectorWrapper(results));

 }

 protected QueryWeight createQueryWeight(Query query) throws IOException {
   return query.queryWeight(this);
 }


Query.queryWeight will in turn call Query.createQueryWeight (either for 
your Query, or for the primitive Query it rewrites itself to).



--
- Mark

http://www.lucidimagination.com



Jérôme Etévé wrote:

Hi Mark,


Thanks for clarifying this. So should I keep both sets of methods
implemented? I guess it won't hurt once solr trunk uses the
updated version of lucene without those methods.

What I don't get is that neither my createWeight or createQueryWeight
methods seem to be called when I call
rb.req.getSearcher().search(limitedQuery, myCollector);

I'll look at the code to find out.

Thanks!

Jerome

2009/8/18 Mark Miller :
  

You have run into some stuff that has been somewhat rolled back in Lucene.

QueryWieght, and the methods it brought have been reverted.

Shortly (when Solr trunk updates Lucene), Solr will go back to just
createWeight and weight.

The main change that will be left is that Weight will be an abstract class
rather than an interface.


--
- Mark

http://www.lucidimagination.com

Jérôme Etévé wrote:


Hi all,

I have a custom search component which uses a query I wrote.
Basically, this Query (called DocSetQuery) is a Query decorator that
skips any document which is not in a given document set. My code used
to work perfectly in solr 1.3 but in solr 1.4, it seems that my
DocSetQuery has lost all its power.

I noticed that to be compliant with solr 1.4 trunk and the lucene it
contains, I should implement two new methods:

createQueryWeight
and
queryWeight

So I did. It was very easy, because basically it's only about re-using
the deprecated Weight createWeight and wrapping the result with a
QueryWeightWrapper.

So now I believe my DocSetQuery complies with the new
solr1.4/lucene2.9-dev api. And I've got those methods:

public QueryWeight queryWeight(Searcher searcher) throws IOException {
return createQueryWeight(searcher);
}
public QueryWeight createQueryWeight(Searcher searcher) throws IOException
{
log.info("[sponsoring] creating QueryWeight calling createQueryWeight ");
return new QueryWeightWrapper(createWeight(searcher));
}
public Weight weight(Searcher searcher) throws IOException {
return createWeight(searcher);
}

//and of course

protected Weight createWeight(final Searcher searcher) throws IOException
{
log.info("[sponsoring] creating weight with DoCset " + docset.size());
...
}

I'm then using my DocSetQuery in my custom SearchComponent like that:

Query limitedQuery = new DocSetQuery(decoratedQuery , ... );

Then I simply perform a search by doing

rb.req.getSearcher().search(limitedQuery, myCollector);

My problem is neither of createQueryWeight or createWeight is called
by the solr Searcher, and I'm wondering what I did wrong.
Should I build the Weight myself and call the search method which
accepts a Weight object?

This is quite confusing because:
- it used to work perfectly in solr 1.3
- in the nightly build version of lucene API, those new methods
createQueryWeight and queryWeight have disappeared but with the lucene
solr1.4trunk uses, they exists plus the old ones ( createWeight and
weight) are deprecated.


Thanks for your help.

Jerome Eteve.

  









  







Re: Release Date Solr 1.4

2009-08-18 Thread Mark Miller

Daniel Knapp wrote:

Hello Mailinglist,


does anyone know the release date from Solr 1.4?

Thanks for your reply.


Regards,
Daniel
The last note I saw said we hope to release 1.4 a week or so after 
Lucene 2.9 (though of course a week may not end up being enough).


It will follow Lucene 2.9, which is yet to be released. It's looking 
like we will hit code freeze with Lucene 2.9 this week, though. That 
should put it out within a couple of weeks, I'd hope, with Solr 
following a week or two behind if all cylinders fire correctly.


--
- Mark

http://www.lucidimagination.com





Re: Query not working as expected

2009-08-18 Thread Matt Schraeder
Awesome, that works great. Thanks a lot!

>>> markrmil...@gmail.com 8/17/2009 5:32:46 PM >>>
Matt Schraeder wrote:
> I'm attempting to write a query as follows:
>  
> ($query^10) OR (NOT ($query)) which effectively would return everything, but 
> if it matches the first query it will get a higher score and thus be sorted 
> first in the result set.  Unfortunately the results are not coming back as 
> expected. 
>  
> ($query) works by itself and gets X rows
> (NOT ($query)) works by itself and gets Y rows
>  
> You would expect ($query) OR (NOT ($query)) to return X+Y rows but it is only 
> returning X rows.
>  
> What am I doing wrong?
>
>
>   
I believe that Solr will only allow a pure 'not' at the top level of a 
Boolean query - also, I'm not sure whether it supports NOT or just !.

In any case, to use it deeper than the top level you must use the 
MatchAllDocsQuery syntax (obviously doesn't apply to DisMax):

($query^10) OR (*:* NOT ($query))
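
Concretely, if the placeholder $query were a single term query on a
hypothetical field, say title:foo, that pattern would read:

```text
(title:foo^10) OR (*:* NOT (title:foo))
```

Every document then matches at least one of the two clauses, so all X+Y
rows come back, with the title:foo matches scored (and sorted) first.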



-- 
- Mark

http://www.lucidimagination.com







Re: Release Date Solr 1.4

2009-08-18 Thread Constantijn Visinescu
Last I heard the ETA was approximately a month, but they won't release it
until it's ready.

Check JIRA here for the list of open issues that need fixing before 1.4
https://issues.apache.org/jira/secure/IssueNavigator.jspa?sorter/field=updated&sorter/order=DESC

Constantijn Visinescu

On Tue, Aug 18, 2009 at 2:57 PM, Daniel Knapp <
daniel.kn...@mni.fh-giessen.de> wrote:

> Hello Mailinglist,
>
>
> does anyone know the release date from Solr 1.4?
>
> Thanks for your reply.
>
>
> Regards,
> Daniel
>


Re: Writing and using your own Query class in solr 1.4 (trunk)

2009-08-18 Thread Jérôme Etévé
That's right. I just had another decorator which was not adapted for
the new API. My fault ..
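
For anyone hitting the same thing: during this interim API, every Query
decorator has to forward both the deprecated pair (createWeight/weight)
and the transitional pair (createQueryWeight/queryWeight), or the searcher
silently bypasses the decorator. A minimal sketch, in Java-flavoured
pseudocode against the interim solr1.4/lucene2.9-dev API quoted earlier in
this thread (QueryWeightWrapper and the signatures are taken from the
snippet above, not from any released Lucene):

```java
// Sketch only -- mirrors the interim API discussed in this thread.
public class PassThroughQuery extends Query {
    private final Query inner;
    public PassThroughQuery(Query inner) { this.inner = inner; }

    // Deprecated pair, still called by older code paths:
    protected Weight createWeight(Searcher searcher) throws IOException {
        return inner.weight(searcher); // decorate here as needed
    }
    public Weight weight(Searcher searcher) throws IOException {
        return createWeight(searcher);
    }

    // Transitional pair, called by Searcher.search() in solr 1.4 trunk:
    public QueryWeight createQueryWeight(Searcher searcher) throws IOException {
        return new QueryWeightWrapper(createWeight(searcher));
    }
    public QueryWeight queryWeight(Searcher searcher) throws IOException {
        return createQueryWeight(searcher);
    }

    public String toString(String field) { return inner.toString(field); }
}
```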

Thanks,

Jerome.

2009/8/18 Mark Miller :
> I'm pretty sure one of them is called. In the version you have:
>
>  public void search(Query query, HitCollector results)
>   throws IOException {
>   search(createQueryWeight(query), null, new HitCollectorWrapper(results));
>  }
>
>  protected QueryWeight createQueryWeight(Query query) throws IOException {
>   return query.queryWeight(this);
>  }
>
>
> Query.queryWeight will in turn call Query.createQueryWight (either for your
> Query, or for the primitive Query
> it rewrites itself too).
>
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
> Jérôme Etévé wrote:
>>
>> Hi Mark,
>>
>>
>> Thanks for clarifying this. So should I keep both sets of method
>> implemented? I guess it won't hurt when solr trunk will use the
>> updated version of lucene without those methods.
>>
>> What I don't get is that neither my createWeight or createQueryWeight
>> methods seem to be called when I call
>> rb.req.getSearcher().search(limitedQuery, myCollector);
>>
>> I'll look at the code to find out.
>>
>> Thanks!
>>
>> Jerome
>>
>> 2009/8/18 Mark Miller :
>>
>>>
>>> You have run into some stuff that has been somewhat rolled back in
>>> Lucene.
>>>
>>> QueryWieght, and the methods it brought have been reverted.
>>>
>>> Shortly (when Solr trunk updates Lucene), Solr will go back to just
>>> createWeight and weight.
>>>
>>> The main change that will be left is that Weight will be an abstract
>>> class
>>> rather than an interface.
>>>
>>>
>>> --
>>> - Mark
>>>
>>> http://www.lucidimagination.com
>>>
>>> Jérôme Etévé wrote:
>>>

 Hi all,

 I have a custom search component which uses a query I wrote.
 Basically, this Query (called DocSetQuery) is a Query decorator that
 skips any document which is not in a given document set. My code used
 to work perfectly in solr 1.3 but in solr 1.4, it seems that my
 DocSetQuery has lost all its power.

 I noticed that to be compliant with solr 1.4 trunk and the lucene it
 contains, I should implement two new methods:

 createQueryWeight
 and
 queryWeight

 So I did. It was very easy, because basically it's only about re-using
 the deprecated Weight createWeight and wrapping the result with a
 QueryWeightWrapper.

 So now I believe my DocSetQuery complies with the new
 solr1.4/lucene2.9-dev api. And I've got those methods:

 public QueryWeight queryWeight(Searcher searcher) throws IOException {
 return createQueryWeight(searcher);
 }
 public QueryWeight createQueryWeight(Searcher searcher) throws
 IOException
 {
 log.info("[sponsoring] creating QueryWeight calling createQueryWeight
 ");
 return new QueryWeightWrapper(createWeight(searcher));
 }
 public Weight weight(Searcher searcher) throws IOException {
 return createWeight(searcher);
 }

 //and of course

 protected Weight createWeight(final Searcher searcher) throws
 IOException
 {
 log.info("[sponsoring] creating weight with DoCset " + docset.size());
 ...
 }

 I'm then using my DocSetQuery in my custom SearchComponent like that:

 Query limitedQuery = new DocSetQuery(decoratedQuery , ... );

 Then I simply perform a search by doing

 rb.req.getSearcher().search(limitedQuery, myCollector);

 My problem is neither of createQueryWeight or createWeight is called
 by the solr Searcher, and I'm wondering what I did wrong.
 Should I build the Weight myself and call the search method which
 accepts a Weight object?

 This is quite confusing because:
 - it used to work perfectly in solr 1.3
 - in the nightly build version of lucene API, those new methods
 createQueryWeight and queryWeight have disappeared but with the lucene
 solr1.4trunk uses, they exists plus the old ones ( createWeight and
 weight) are deprecated.


 Thanks for your help.

 Jerome Eteve.


>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>
>



-- 
Jerome Eteve.

Chat with me live at http://www.eteve.net

jer...@eteve.net


Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Aaron Aberg
Constantijn,

First of all, I want you to know how much I appreciate you not giving
up on me. Second of all, your instructions were really great. I think
that I am getting closer to solving this issue. I am STILL getting that
error, but after a full tomcat reboot it picked up my solr.home
environment variable from my web.xml and it's pointing to the new
location. (Good idea)

Here is the FULL log from startup of Tomcat. It might be excessive,
but I want to give you all of the information that I can:

Aug 17, 2009 11:16:08 PM org.apache.catalina.core.AprLifecycleListener
lifecycleEvent
INFO: The Apache Tomcat Native library which allows optimal
performance in production environments was not found on the
java.library.path:
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
Aug 17, 2009 11:16:09 PM org.apache.coyote.http11.Http11BaseProtocol init
INFO: Initializing Coyote HTTP/1.1 on http-8080
Aug 17, 2009 11:16:09 PM org.apache.coyote.http11.Http11BaseProtocol init
INFO: Initializing Coyote HTTP/1.1 on http-9080
Aug 17, 2009 11:16:09 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 3382 ms
Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardService start
INFO: Starting service Catalina
Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.5.23
Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: ContextListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: SessionListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: ContextListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: SessionListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: org.apache.webapp.balancer.BalancerFilter: init(): ruleChain:
[org.apache.webapp.balancer.RuleChain:
[org.apache.webapp.balancer.rules.URLStringMatchRule: Target string:
News / Redirect URL: http://www.cnn.com],
[org.apache.webapp.balancer.rules.RequestParameterRule: Target param
name: paramName / Target param value: paramValue / Redirect URL:
http://www.yahoo.com],
[org.apache.webapp.balancer.rules.AcceptEverythingRule: Redirect URL:
http://jakarta.apache.org]]
Aug 17, 2009 11:16:13 PM org.apache.coyote.http11.Http11BaseProtocol start
INFO: Starting Coyote HTTP/1.1 on http-8080
Aug 17, 2009 11:16:13 PM org.apache.jk.common.ChannelSocket init
INFO: JK: ajp13 listening on /0.0.0.0:8009
Aug 17, 2009 11:16:13 PM org.apache.jk.server.JkMain start
INFO: Jk running ID=0 time=0/57  config=null
Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardService start
INFO: Starting service PSA
Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.5.23
Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
Aug 17, 2009 11:16:15 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader
locateInstanceDir
INFO: Using JNDI solr.home: /usr/share/solr
Aug 17, 2009 11:16:15 PM
org.apache.solr.core.CoreContainer$Initializer initialize
INFO: looking for solr.xml: /usr/share/solr/solr.xml
Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to '/usr/share/solr/'
Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader
createClassLoader
INFO: Reusing parent classloader
Aug 17, 2009 11:16:15 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
java.lang.ExceptionInInitializerError
at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
at 
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
at 
org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
at 
org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
at 
org.apache.catalina.core.StandardEngine

Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Constantijn Visinescu
Am pretty sure solr.xml is only needed if you want to define multiple solr
cores for your application. So it makes sense that solr checks for that
first; however, if it doesn't find one it continues to start up with a
single core.

I KNOW it runs just fine for me without a solr.xml.

The exception seems to be complaining about XPath. Either this means that
you have a weird XML library on your classpath somewhere (unlikely, but check
it; Xalan/Xerces seem to be the most common) or tomcat/solr doesn't have
the rights it needs to access the solr folder.

What rights did you give the solr folder and all the files in there ?
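
A quick way to check is something like the following (the path and the
"tomcat" user name below are assumptions taken from this thread; adjust
to your setup, and run it as the same user Tomcat runs under):

```shell
# Sketch: check whether the Solr home is readable by the current user.
# SOLR_HOME and the "tomcat" user below are assumptions from this thread.
SOLR_HOME=/usr/share/solr
if [ -r "$SOLR_HOME/conf/solrconfig.xml" ]; then
  echo "solrconfig.xml is readable"
else
  echo "solrconfig.xml is NOT readable -- check ownership, e.g.:"
  echo "  chown -R tomcat:tomcat $SOLR_HOME && chmod -R u+rX $SOLR_HOME"
fi
```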

On Tue, Aug 18, 2009 at 3:27 PM, Aaron Aberg  wrote:

> Constantijn,
>
> First of all, I want you to know how much I appreciate you not giving
> up on me. Second of all, your instructions were really great. I think
> that I am getting closer to solving this issue. I am STILL get that
> error but after a full tomcat reboot it picked up my solr.home
> environment variable from my web.xml and its pointing to the new
> location. (Good idea)
>
> Here is the FULL log from start up of Tomcat. It might be excessive,
> but I want to give you all of the information that I can:
>
> Aug 17, 2009 11:16:08 PM org.apache.catalina.core.AprLifecycleListener
> lifecycleEvent
> INFO: The Apache Tomcat Native library which allows optimal
> performance in production environments was not found on the
> java.library.path:
>
> /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
> Aug 17, 2009 11:16:09 PM org.apache.coyote.http11.Http11BaseProtocol init
> INFO: Initializing Coyote HTTP/1.1 on http-8080
> Aug 17, 2009 11:16:09 PM org.apache.coyote.http11.Http11BaseProtocol init
> INFO: Initializing Coyote HTTP/1.1 on http-9080
> Aug 17, 2009 11:16:09 PM org.apache.catalina.startup.Catalina load
> INFO: Initialization processed in 3382 ms
> Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardService start
> INFO: Starting service Catalina
> Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardEngine start
> INFO: Starting Servlet Engine: Apache Tomcat/5.5.23
> Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardHost start
> INFO: XML validation disabled
> Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
> INFO: ContextListener: contextInitialized()
> Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
> INFO: SessionListener: contextInitialized()
> Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
> INFO: ContextListener: contextInitialized()
> Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
> INFO: SessionListener: contextInitialized()
> Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
> INFO: org.apache.webapp.balancer.BalancerFilter: init(): ruleChain:
> [org.apache.webapp.balancer.RuleChain:
> [org.apache.webapp.balancer.rules.URLStringMatchRule: Target string:
> News / Redirect URL: http://www.cnn.com],
> [org.apache.webapp.balancer.rules.RequestParameterRule: Target param
> name: paramName / Target param value: paramValue / Redirect URL:
> http://www.yahoo.com],
> [org.apache.webapp.balancer.rules.AcceptEverythingRule: Redirect URL:
> http://jakarta.apache.org]]
> Aug 17, 2009 11:16:13 PM org.apache.coyote.http11.Http11BaseProtocol start
> INFO: Starting Coyote HTTP/1.1 on http-8080
> Aug 17, 2009 11:16:13 PM org.apache.jk.common.ChannelSocket init
> INFO: JK: ajp13 listening on /0.0.0.0:8009
> Aug 17, 2009 11:16:13 PM org.apache.jk.server.JkMain start
> INFO: Jk running ID=0 time=0/57  config=null
> Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardService start
> INFO: Starting service PSA
> Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardEngine start
> INFO: Starting Servlet Engine: Apache Tomcat/5.5.23
> Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardHost start
> INFO: XML validation disabled
> Aug 17, 2009 11:16:15 PM org.apache.solr.servlet.SolrDispatchFilter init
> INFO: SolrDispatchFilter.init()
> Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader
> locateInstanceDir
> INFO: Using JNDI solr.home: /usr/share/solr
> Aug 17, 2009 11:16:15 PM
> org.apache.solr.core.CoreContainer$Initializer initialize
> INFO: looking for solr.xml: /usr/share/solr/solr.xml
> Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader <init>
> INFO: Solr home set to '/usr/share/solr/'
> Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader
> createClassLoader
> INFO: Reusing parent classloader
> Aug 17, 2009 11:16:15 PM org.apache.solr.servlet.SolrDispatchFilter init
> SEVERE: Could not start SOLR. Check solr/home property
> java.lang.ExceptionInInitializerError
> at
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
> at
> org.apache.solr.servlet.So

Re: Issue with Collection & Distribution

2009-08-18 Thread william pink
Many thanks, Bill.

Yeah, I did take a look at the features of 1.4 and they do look fantastic;
very much looking forward to them.

On Tue, Aug 18, 2009 at 1:42 PM, Bill Au  wrote:

> I say it is worth upgrading since 1.2 is old.  1.4 is almost ready to be
> released.  So you may want to wait a little while longer.  There are many
> nice new features in 1.4.  There are performance improvement too.  In the
> mean time, you can just get the latest version of the scripts from SVN.
> Those should work as is.
>
> Bill
>
> On Tue, Aug 18, 2009 at 6:54 AM, william pink  wrote:
>
> > Hi,
> >
> > Sorry for the delayed response didn't even realise I had got a reply,
> those
> > logs are from the slave and the both version of Solr are the same
> >
> > Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02 17:35:12
> >
> > It maybe worth upgrading them?
> >
> > Thank you for the assistance,
> > Will
> >
> > On Thu, Aug 13, 2009 at 6:28 PM, Bill Au  wrote:
> >
> > > Have you check the solr log on the slave to see if there was any commit
> > > done?  It looks to me you are still using an older version of the
> commit
> > > script that is not compatible with the newer Solr response format.  If
> > > thats' the case, the commit was actually performed.  It is just that
> the
> > > script failed to handle the Solr response.  See
> > >
> > > https://issues.apache.org/jira/browse/SOLR-463
> > > https://issues.apache.org/jira/browse/SOLR-426
> > >
> > > Bill
> > >
> > > On Thu, Aug 13, 2009 at 12:28 PM, william pink 
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > I am having a few problems with the snapinstaller/commit on the
> slave,
> > I
> > > > have a pull_from_master script which is the following
> > > >
> > > > #!/bin/bash
> > > > cd /opt/solr/solr/bin -v
> > > > ./snappuller -v -P 18983
> > > > ./snapinstaller -v
> > > >
> > > >
> > > > I have been executing snapshooter manually on the master then running
> > the
> > > > above script to test but I am getting the following
> > > >
> > > > 2009/08/13 17:18:16 notifing Solr to open a new Searcher
> > > > 2009/08/13 17:18:16 failed to connect to Solr server
> > > > 2009/08/13 17:18:17 snapshot installed but Solr server has not open a
> > new
> > > > Searcher
> > > >
> > > > Commit logs
> > > >
> > > > 2009/08/13 17:18:16 started by user
> > > > 2009/08/13 17:18:16 command: /opt/solr/solr/bin/commit
> > > > 2009/08/13 17:18:16 commit request to Solr at
> > > > http://slave-server:8983/solr/update failed:
> > > > 2009/08/13 17:18:16 <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">28</int></lst></response>
> > > > 2009/08/13 17:18:16 failed (elapsed time: 0 sec)
> > > >
> > > > Snappinstaller logs
> > > >
> > > > 2009/08/13 17:18:16 started by user
> > > > 2009/08/13 17:18:16 command: ./snapinstaller -v
> > > > 2009/08/13 17:18:16 installing snapshot
> > > > /opt/solr/solr/data/snapshot.20090813171835
> > > > 2009/08/13 17:18:16 notifing Solr to open a new Searcher
> > > > 2009/08/13 17:18:16 failed to connect to Solr server
> > > > 2009/08/13 17:18:17 snapshot installed but Solr server has not open a
> > new
> > > > Searcher
> > > > 2009/08/13 17:18:17 failed (elapsed time: 1 sec)
> > > >
> > > >
> > > > Is there a way of telling why it is failing?
> > > >
> > > > Many Thanks,
> > > > Will
> > > >
> > >
> >
>


Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Marco Westermann
Is it normal for it to be looking there?

Here is my file structure:

/usr/share/solr/
/usr/share/solr/bin
/usr/share/solr/bin/rsyncd-stop
/usr/share/solr/bin/abo
/usr/share/solr/bin/scripts-util
/usr/share/solr/bin/snappuller-disable
/usr/share/solr/bin/backupcleaner
/usr/share/solr/bin/snapcleaner
/usr/share/solr/bin/rsyncd-disable
/usr/share/solr/bin/snapinstaller
/usr/share/solr/bin/commit
/usr/share/solr/bin/snappuller-enable
/usr/share/solr/bin/snappuller
/usr/share/solr/bin/backup
/usr/share/solr/bin/rsyncd-start
/usr/share/solr/bin/abc
/usr/share/solr/bin/rsyncd-enable
/usr/share/solr/bin/optimize
/usr/share/solr/bin/snapshooter
/usr/share/solr/bin/readercycle
/usr/share/solr/conf
/usr/share/solr/conf/schema.xml
/usr/share/solr/conf/solrconfig.xml
/usr/share/solr/conf/synonyms.txt
/usr/share/solr/conf/xslt
/usr/share/solr/conf/xslt/example_atom.xsl
/usr/share/solr/conf/xslt/luke.xsl
/usr/share/solr/conf/xslt/example_rss.xsl
/usr/share/solr/conf/xslt/example.xsl
/usr/share/solr/conf/elevate.xml
/usr/share/solr/conf/scripts.conf
/usr/share/solr/conf/protwords.txt
/usr/share/solr/conf/spellings.txt
/usr/share/solr/conf/admin-extra.html
/usr/share/solr/conf/stopwords.txt
/usr/share/solr/README.txt

I'm pretty sure I should have a solr.xml somewhere for tomcat. What do
you think?

Thanks again for all the help,
Aaron

__ Notice from ESET NOD32 Antivirus, signature database version 4345 
(20090818) __

This e-mail was checked by ESET NOD32 Antivirus.

http://www.eset.com




  



--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander 



Re: spellcheck component in 1.4 distributed

2009-08-18 Thread Ian Connor
Once it goes through Ruby the stack is not as neat but here is what the
error comes through as:

String_index_out_of_range_1
__javalangStringIndexOutOfBoundsException_String_index_out_of_range_1
__at_javalangAbstractStringBuilderreplaceAbstractStringBuilderjava797
__at_javalangStringBuilderreplaceStringBuilderjava271
__at_orgapachesolrhandlercomponentSpellCheckComponentfinishStageSpellCheckComponentjava236
__at_orgapachesolrhandlercomponentSearchHandlerhandleRequestBodySearchHandlerjava272
__at_orgapachesolrhandlerRequestHandlerBasehandleRequestRequestHandlerBasejava131
__at_orgapachesolrcoreSolrCoreexecuteSolrCorejava1333
__at_orgapachesolrservletSolrDispatchFilterexecuteSolrDispatchFilterjava303
__at_orgapachesolrservletSolrDispatchFilterdoFilterSolrDispatchFilterjava232
__at_orgmortbayjettyservletServletHandler$CachedChaindoFilterServletHandlerjava1089
__at_orgmortbayjettyservletServletHandlerhandleServletHandlerjava365
__at_orgmortbayjettysecuritySecurityHandlerhandleSecurityHandlerjava216
__at_orgmortbayjettyservletSessionHandlerhandleSessionHandlerjava181
__at_orgmortbay! jettyhandlerContextHandlerhandleContextHandlerjava712
__at_orgmortbayjettywebappWebAppContexthandleWebAppContextjava405
__at_orgmortbayjettyhandlerContextHandlerCollectionhandleContextHandlerCollectionjava211
__at_orgmortbayjettyhandlerHandlerCollectionhandleHandlerCollectionjava114
__at_orgmortbayjettyhandlerHandlerWrapperhandleHandlerWrapperjava139
__at_orgmortbayjettyServerhandleServerjava285
__at_orgmortbayjettyHttpConnectionhandleRequestHttpConnectionjava502
__at_orgmortbayjettyHttpConnection$RequestHandlercontentHttpConnectionjava835
__at_orgmortbayjettyHttpParserparseNextHttpParserjava641
__at_orgmortbayjettyHttpParserparseAvailableHttpParserjava208
__at_orgmortbayjettyHttpConnectionhandleHttpConnectionjava378
__at_orgmortbayjettybioSocketConnector$ConnectionrunSocketConnectorjava226
__at_orgmortbaythreadBoundedThreadPool$PoolThreadrunBoun"
/usr/lib/ruby/1.8/net/http.rb:2097:in `error!'

On Mon, Aug 17, 2009 at 12:08 PM, Mark Miller  wrote:

> Ian Connor wrote:
>
>> Hi,
>>
>> Just a quick update to the list. Mike and I were able to apply it to 1.4
>> and
>> it works. We have it loaded on a few production servers and there is an
>> odd
>> "StringIndexOutOfBoundsException" error but most of the time it seems to
>> work just fine.
>>
>>
> Do you happen to have the stack trace?
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>


-- 
Regards,

Ian Connor
1 Leighton St #723
Cambridge, MA 02141
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Fax: +1(770) 818 5697
Skype: ian.connor


Re: Proximity Search

2009-08-18 Thread Toby Cole

See the Lucene query parser syntax documentation:

http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Proximity%20Searches

basically... "shell petroleum"~10 should do the trick (if you're using  
a standard request handler, can't remember if dismax supports  
proximity).
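
For example, assuming a field called content holding the text (the field
name is just an illustration; use whatever your schema defines):

```text
content:"shell petroleum"       -- exact phrase only
content:"shell petroleum"~10    -- both terms within 10 positions of each other
```

The slop value counts term positions, so ~10 is roughly "within 10 words".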


On 18 Aug 2009, at 13:28, Ninad Raut wrote:


Hi,
I want to count the words between two significant words like "shell" and
"petroleum". Or I want to write a query to find all the documents where the
content has "shell" and "petroleum" in close proximity of less than 10 words
between them.
Can such queries be created in Solr?
Regards,
Ninad Raut.


--

Toby Cole
Software Engineer, Semantico Limited
 
Registered in England and Wales no. 03841410, VAT no. GB-744614334.
Registered office Lees House, 21-23 Dyke Road, Brighton BN1 3FE, UK.

Check out all our latest news and thinking on the Discovery blog
http://blogs.semantico.com/discovery-blog/



Is negative boost possible?

2009-08-18 Thread Larry He
Hi all,

I am looking for a way to assign negative boost to a term in Solr query.
Our use scenario is that we want to boost matching documents that are
updated recently and penalize those that have not been updated for a long
time.  There are other terms in the query that would affect the scores as
well.  For example we construct a query similar to this:

*:* field1:value1^2  field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3

I notice it's not possible to simply use a negative boosting factor in the
query.  Is there any way to achieve such a result?

Regards,
Shi Quan He


Re: Proximity Search

2009-08-18 Thread Erik Hatcher


On Aug 18, 2009, at 8:28 AM, Ninad Raut wrote:


Hi,
I want to count the words between two significant words like "shell"  
and
"petroleum". Or want to write a query to find all the documents  
where the
content has "shell" and "petroleum" in close proximity of less than  
10 words

between them.
Can such quries be created in Solr?


Yes, this is known as a "sloppy" phrase query.  Using Solr's standard  
query parser, a query of "some phrase" within quotes makes an exact  
phrase query.  Using "some words"~10 matches when the field has those  
words within that distance of term positions.


Erik



[ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search Server

2009-08-18 Thread Smiley, David W.
Fellow Solr users,

I've finally finished the book "Solr 1.4 Enterprise Search Server" with my 
co-author Eric.  We are proud to present the first book on Solr and hope you 
find it a valuable resource.   You can find full details about the book and 
purchase it here:
http://www.packtpub.com/solr-1-4-enterprise-search-server/book
It can be pre-ordered at a discount now and should be shipping within a week or 
two.  The book is also available through Amazon.  You can feel good about the 
purchase knowing that 5% of each sale goes to support the Apache Software 
Foundation.  For a free sample, there is a portion of chapter 5 covering 
faceting available as an article online here:
http://www.packtpub.com/article/faceting-in-solr-1.4-enterprise-search-server

By the way, we realize Solr 1.4 isn't out [quite] yet.  It is feature-frozen 
however, and there's little in the forthcoming release that isn't covered in 
our book.  About the only notable thing that comes to mind is the contrib 
module on search result clustering.  However Eric plans to write a free online 
article available from Packt Publishing on that very subject.

"Solr 1.4 Enterprise Search Server" In Detail:

If you are a developer building a high-traffic web site, you need to have a 
terrific search engine. Sites like Netflix.com and Zappos.com employ Solr, an 
open source enterprise search server, which uses and extends the Lucene search 
library. This is the first book in the market on Solr and it will show you how 
to optimize your web site for high volume web traffic with full-text search 
capabilities along with loads of customization options. So, let your users gain 
a terrific search experience.

This book is a comprehensive reference guide for every feature Solr has to 
offer. It serves the reader right from initiation to development to deployment. 
It also comes with complete running examples to demonstrate its use and show 
how to integrate it with other languages and frameworks.

This book first gives you a quick overview of Solr, and then gradually takes 
you from basic to advanced features that enhance your search. It starts off by 
discussing Solr and helping you understand how it fits into your 
architecture—where all databases and document/web crawlers fall short, and Solr 
shines. The main part of the book is a thorough exploration of nearly every 
feature that Solr offers. To keep this interesting and realistic, we use a 
large open source set of metadata about artists, releases, and tracks courtesy 
of the MusicBrainz.org project. Using this data as a testing ground for Solr, 
you will learn how to import this data in various ways from CSV to XML to 
database access. You will then learn how to search this data in a myriad of 
ways, including Solr's rich query syntax, "boosting" match scores based on 
record data and other means, about searching across multiple fields with 
different boosts, getting facets on the results, auto-complete user queries, 
spell-correcting searches, highlighting queried text in search results, and so 
on.

After this thorough tour, we'll demonstrate working examples of integrating a 
variety of technologies with Solr such as Java, JavaScript, Drupal, Ruby, XSLT, 
PHP, and Python.

Finally, we'll cover various deployment considerations to include indexing 
strategies and performance-oriented configuration that will enable you to scale 
Solr to meet the needs of a high-volume site.


Sincerely,

David Smiley (primary-author)
dsmi...@mitre.org
Eric Pugh (co-author)
ep...@opensourceconnections.com


Solr 1.3 JNDI Datasource

2009-08-18 Thread brianeno

Hello,
  We have deployed Solr in our application within Weblogic and all is
working well. The last piece I am struggling with is configuring the
datasource for our data import handler to work with our Weblogic configured
JNDI datasource.   Can anyone lead me in the right direction how to
configure this in my dataconfig.xml? We are using Solr 1.3.

Thanks,
Brian
-- 
View this message in context: 
http://www.nabble.com/Solr-1.3-JNDI-Datasource-tp25025740p25025740.html
Sent from the Solr - User mailing list archive at Nabble.com.



Can I search for a term in any field or a list of fields?

2009-08-18 Thread Paul Tomblin
I've got my default search field set to "text", and so if I
do an unqualified search it only finds matches in the field "text".  If I want
to search title, I can do "title:foo", but what if I want to find if
the search term is in any field, or if it's in "text" or "title" or
"concept" or "keywords"?  I already tried "*:foo", but that throws an
exception:

Caused by: org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.common.SolrException: undefined field *
 [java] at
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:161)


-- 
http://www.linkedin.com/in/paultomblin


Re: Solr 1.3 JNDI Datasource

2009-08-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
DIH in Solr 1.3 does not support a JNDI datasource. Only 1.4 supports it.
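For reference once 1.4 is available, the JdbcDataSource there can look the connection up via JNDI. A sketch only — the JNDI name below is a placeholder for whatever Weblogic actually binds:

```xml
<!-- data-config.xml, Solr 1.4 sketch: look the connection up via JNDI
     instead of specifying driver/url/user/password inline.
     "jdbc/myDS" is a placeholder for the Weblogic-bound name. -->
<dataConfig>
  <dataSource type="JdbcDataSource" jndiName="java:comp/env/jdbc/myDS"/>
  <document>
    <!-- entities as usual -->
  </document>
</dataConfig>
```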


On Tue, Aug 18, 2009 at 7:41 PM, brianeno wrote:
>
> Hello,
>  We have deployed Solr in our application within Weblogic and all is
> working well. The last piece I am struggling with is configuring the
> datasource for our data import handler to work with our Weblogic configured
> JNDI datasource.   Can anyone lead me in the right direction how to
> configure this in my dataconfig.xml? We are using Solr 1.3.
>
> Thanks,
> Brian
> --
> View this message in context: 
> http://www.nabble.com/Solr-1.3-JNDI-Datasource-tp25025740p25025740.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


MoreLikeThis (MLT) in 1.4 distributed

2009-08-18 Thread mike anderson
I'm trying to get MLT working in 1.4 distributed mode. I was hoping the
patch SOLR-788 would do the trick, but after
applying the patch by hand to revision 737810 (it kept choking on
component/MoreLikeThisComponent.java) I still get nothing. The URL I am
using is this:
http://localhost:8983/solr/select?q=graph%20theory&mlt=true&mlt.fl=abstract&mlt.mindf=1&mlt.mintf=1&shards=localhost:8983/solr

and without the shards param it works fine:

http://localhost:8983/solr/select?q=graph%20theory&mlt=true&mlt.fl=abstract&mlt.mindf=1&mlt.mintf=1

debugQuery=true shows that the MLT component is being called, is there
elsewhere I can check for more debug information? Any advice on getting this
to work?



Thanks in advance,
Mike


Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Marco Westermann

Hi Paul,

I would say you should use the copyField tag in the schema.

The text-field has to be defined as multiValued="true". When you now do an
unqualified search, it will search every field which is copied to the
text-field.
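As a sketch, the schema.xml pieces would look something like this (the source field names are borrowed from Paul's question and purely illustrative):

```xml
<!-- schema.xml sketch: a catch-all field that several source fields
     are copied into; it must be multiValued since it receives
     values from more than one source. -->
<field name="text" type="text" indexed="true" stored="false"
       multiValued="true"/>

<copyField source="title" dest="text"/>
<copyField source="keywords" dest="text"/>
<copyField source="concept" dest="text"/>
```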


with best regards,

Marco Westermann

Paul Tomblin schrieb:

I've got my default search field set to "text", and so if I
do an unqualified search it only finds matches in the field "text".  If I want
to search title, I can do "title:foo", but what if I want to find if
the search term is in any field, or if it's in "text" or "title" or
"concept" or "keywords"?  I already tried "*:foo", but that throws an
exception:

Caused by: org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.common.SolrException: undefined field *
 [java] at
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:161)


  



--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander 



Re: Is negative boost possible?

2009-08-18 Thread Koji Sekiguchi

Hi,

Use a decimal figure less than 1, e.g. 0.5, to express less importance.

Koji

Larry He wrote:

Hi all,

I am looking for a way to assign negative boost to a term in Solr query.
Our use scenario is that we want to boost matching documents that are
updated recently and penalize those that have not been updated for a long
time.  There are other terms in the query that would affect the scores as
well.  For example we construct a query similar to this:

*:* field1:value1^2  field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3

I notice it's not possible to simply use a negative boosting factor in the
query.  Is there any way to achieve such result?

Regards,
Shi Quan He

  




Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Paul Tomblin
So if I want to make it so that the default search always searches
three specific fields, I can make another field multi-valued that they
are all copied into?

On Tue, Aug 18, 2009 at 10:46 AM, Marco Westermann wrote:
> I would say, you should use the copyField tag in the schema. eg:
>
> 
>
> the text-field has to be defined as multiValued="true". When you now do an
> unqualified search, it will search every field, which is copied to the
> text-field.



-- 
http://www.linkedin.com/in/paultomblin


Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Toby Cole
I would consider using the dismax query handler. This allows you to  
send a list of keywords or phrases along with the fields to search over.

e.g., you could use ?qt=dismax&q=foo&qf=title+text+keywords+concept

More details here: http://wiki.apache.org/solr/DisMaxRequestHandler
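A sketch of the same request built programmatically (Python purely for illustration; the qt/qf parameter names come from the DisMax docs, and the boost weights are made up):

```python
from urllib.parse import urlencode

# DisMax lets one keyword query search several fields at once,
# each with an optional boost (the weights below are illustrative).
params = {
    "qt": "dismax",
    "q": "foo",
    "qf": "title^2 text keywords concept",
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```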


On 18 Aug 2009, at 15:56, Paul Tomblin wrote:


So if I want to make it so that the default search always searches
three specific fields, I can make another field multi-valued that they
are all copied into?

On Tue, Aug 18, 2009 at 10:46 AM, Marco Westermann  
wrote:

I would say, you should use the copyField tag in the schema. eg:



the text-field has to be defined as multiValued="true". When you now
do an
unqualified search, it will search every field, which is copied to  
the

text-field.




--
http://www.linkedin.com/in/paultomblin


--

Toby Cole
Software Engineer, Semantico Limited
 
Registered in England and Wales no. 03841410, VAT no. GB-744614334.
Registered office Lees House, 21-23 Dyke Road, Brighton BN1 3FE, UK.

Check out all our latest news and thinking on the Discovery blog
http://blogs.semantico.com/discovery-blog/



solrconfig.xml and ExtractingRequestHandler

2009-08-18 Thread Kevin Miller
I am using the 8/11/09 nightly build of Solr and have a couple of
questions about the ExtractingRequestHandler in the solrconfig.xml file.

1. What is the purpose of 'startup="lazy"' in the requestHandler?

2. Is there a way to change the information in the requestHandler so
that the text within a file that Tika is reading can be stored?


Kevin Miller
Oklahoma Tax Commission
Web Services


Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Marco Westermann
Exactly! For example, you could create a field called "all" and copy into it
the fields that should be searched when all fields are searched.


Then you have two possibilities: either you make this field the
defaultSearchField so that unqualified searches use it, or you qualify the
field in the query (all:foo) and all fields which have been copied to the
all-field are searched.


best
Marco

Paul Tomblin schrieb:

So if I want to make it so that the default search always searches
three specific fields, I can make another field multi-valued that they
are all copied into?

On Tue, Aug 18, 2009 at 10:46 AM, Marco Westermann wrote:
  

I would say, you should use the copyField tag in the schema. eg:



the text-field has to be defined as multiValued="true". When you now do an
unqualified search, it will search every field, which is copied to the
text-field.





  






dynamic changes to schema

2009-08-18 Thread Marco Westermann

Hi there,

Is there a possibility to change the Solr schema dynamically from PHP?
The web application I want to index at the moment has a feature to add
fields to entities, and you can mark these fields as searchable.
To realize this with Solr, the schema has to change whenever a searchable
field is added or removed.


Any suggestions,

Thanks a lot,

Marco Westermann




Re: solrconfig.xml and ExtractingRequestHandler

2009-08-18 Thread Mark Miller

Kevin Miller wrote:

I am using the 8/11/09 nightly build of Solr and have a couple of
questions about the ExtractingRequestHandler in the solrconfig.xml file.

1. What is the purpose of 'startup="lazy"' in the requestHandler?
  

Makes it so the RequestHandler won't be initialized until it's actually accessed.


2. Is there a way to change the information in the requestHandler so
that the text within a file that Tika is reading can be stored?
  
Huh? Just use stored fields for the fields you are pulling the Tika
data/metadata into?
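A solrconfig.xml sketch combining both points — lazy loading plus mapping Tika's extracted body into a field the schema declares as stored. The fmap parameter name follows the 1.4 example config; treat it as an assumption against your particular nightly:

```xml
<!-- solrconfig.xml sketch: load the handler lazily and map Tika's
     extracted content into a "text" field; that field must be
     declared stored="true" in schema.xml for the text to be stored. -->
<requestHandler name="/update/extract" startup="lazy"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.content">text</str>
  </lst>
</requestHandler>
```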




Kevin Miller
Oklahoma Tax Commission
Web Services
  



--
- Mark

http://www.lucidimagination.com





Re: MoreLikeThis (MLT) in 1.4 distributed

2009-08-18 Thread Grant Ingersoll

Are there errors in the logs?

-Grant

On Aug 18, 2009, at 10:42 AM, mike anderson wrote:

I'm trying to get MLT working in 1.4 distributed mode. I was hoping  
the

patch SOLR-788 would do the trick, but after
applying the patch by hand to revision 737810 (it kept choking on
component/MoreLikeThisComponent.java) I still get nothing. The URL I  
am

using is this:
http://localhost:8983/solr/select?q=graph%20theory&mlt=true&mlt.fl=abstract&mlt.mindf=1&mlt.mintf=1&shards=localhost:8983/solr

and without the shards param it works fine:

http://localhost:8983/solr/select?q=graph%20theory&mlt=true&mlt.fl=abstract&mlt.mindf=1&mlt.mintf=1

debugQuery=true shows that the MLT component is being called, is there
elsewhere I can check for more debug information? Any advice on  
getting this

to work?



Thanks in advance,
Mike


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Strange error with shards

2009-08-18 Thread ahammad

Hello,

I have been using multicore/shards for the past 5 months or so with no
problems at all. I just added another core to my Solr server, but for some
reason I can never get the shards working when that specific core is
anywhere in the URL (either in the shards list or the base URL).

HTTP Status 500 - null java.lang.NullPointerException at
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:437)
at
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:281)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859)
at
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:574)
at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1527)
at java.lang.Thread.run(Thread.java:619) 

The way I created this shard was to copy an existing one, erasing all the
data files/folders, and modifying my schema/data-config files. So the core
settings are pretty much the same.

If I try the shard parameter with any of the other 7 cores that I have, it
works fine. It's only when this specific one is in the URL...

Cheers
-- 
View this message in context: 
http://www.nabble.com/Strange-error-with-shards-tp25027486p25027486.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Paul Tomblin
On Tue, Aug 18, 2009 at 11:04 AM, Marco Westermann wrote:
> Exactly! For example, you could create a field called "all" and copy into it
> the fields that should be searched when all fields are searched.
>

Awesome, that worked great.  I made my "all" field 'stored="false"
indexed="true"' and I can search for a term that I know is in any of
the key fields and it finds it.

Thanks.


-- 
http://www.linkedin.com/in/paultomblin


Re: Release Date Solr 1.4

2009-08-18 Thread Michael
I think this link gets you the exact bug count: it's Constantijn's link,
filtered to Unresolved Solr issues marked for fixing in 1.4:
https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&&pid=12310230&fixfor=12313351&resolution=-1&sorter/field=issuekey&sorter/order=DESC


Michael

On Tue, Aug 18, 2009 at 9:05 AM, Constantijn Visinescu
wrote:

> Last I heard the ETA was approx. a month, but they won't release it until
> it's ready.
>
> Check JIRA here for the list of open issues that need fixing before 1.4
>
> https://issues.apache.org/jira/secure/IssueNavigator.jspa?sorter/field=updated&sorter/order=DESC
>
> Constantijn Visinescu
>
> On Tue, Aug 18, 2009 at 2:57 PM, Daniel Knapp <
> daniel.kn...@mni.fh-giessen.de> wrote:
>
> > Hello Mailinglist,
> >
> >
> > does anyone know the release date from Solr 1.4?
> >
> > Thanks for your reply.
> >
> >
> > Regards,
> > Daniel
> >
>


Re: MoreLikeThis (MLT) in 1.4 distributed

2009-08-18 Thread mike anderson
There don't appear to be any related errors in the log. I've included it
below anyhow (there is a java.lang.NumberFormatException, I'm not sure what
that is).
thanks,
mike

for the query:
http://localhost:8983/solr/select?q=%22theory%20of%20colorful%20graphs%22&mlt=true&mlt.fl=abstract&mlt.mindf=1&mlt.mintf=1&shards=localhost:8983/solr

Aug 18, 2009 12:11:56 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select
params={mlt.mindf=1&mlt.fl=abstract&shards=localhost:8983/solr&q="theory+of+colorful+graphs"&mlt.mintf=1&mlt=true}
status=0 QTime=68
Aug 18, 2009 12:12:08 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select
params={spellcheck=true&mlt.fl=abstract&spellcheck.extendedResults=false&mlt.mintf=1&mlt=true&spellcheck.collate=true&wt=javabin&spellcheck.onlyMorePopular=false&rows=10&version=2.2&mlt.mindf=1&fl=id,score&start=0&q="theory+of+colorful+graphs"&spellcheck.dictionary=titleCheck&spellcheck.count=1&isShard=true&fsv=true}
hits=1 status=0 QTime=5
Aug 18, 2009 12:12:08 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select
params={spellcheck=true&mlt.fl=abstract&spellcheck.extendedResults=false&ids=2b0b321193c61dfbebe58d35f7d42bcf&mlt.mintf=1&mlt=true&spellcheck.collate=true&wt=javabin&spellcheck.onlyMorePopular=false&version=2.2&mlt.mindf=1&q="theory+of+colorful+graphs"&spellcheck.dictionary=titleCheck&spellcheck.count=1&isShard=true}
status=0 QTime=5
Aug 18, 2009 12:12:08 PM
org.apache.solr.request.BinaryResponseWriter$Resolver getDoc
WARNING: Error reading a field from document : SolrDocument[{abstract=  The
theory of colorful graphs can be developed by working in Galois field
modulo (p), p > 2 and a prime number. The paper proposes a program of
possible
conversion of graph theory into a pleasant colorful appearance. We propose
to
paint the usual black (indicating presence of an edge) and white (indicating
absence of an edge) edges of graphs using multitude of colors and study
their
properties. All colorful graphs considered here are simple, i.e. not having
any
multiple edges or self-loops. This paper is an invitation to the program of
generalizing usual graph theory in this direction.
, affiliations=, all_authors=Dhananjay P Mehendale, article_date=Sat Apr 28
19:59:59 EDT 2007, authors=[Mehendale, Dhananjay P Mehendale, Mehendale
Dhananjay P, D P Mehendale, Mehendale D P, D Mehendale, Mehendale D,
Dhananjay Mehendale, Mehendale Dhananjay, DP Mehendale, Mehendale DP],
created_at=Sat Apr 28 19:59:59 EDT 2007, description=10 pages, doi=, eissn=,
first_author=[Mehendale, Dhananjay P Mehendale, Mehendale Dhananjay P, D P
Mehendale, Mehendale D P, D Mehendale, Mehendale D, Dhananjay Mehendale,
Mehendale Dhananjay, DP Mehendale, Mehendale DP], first_page=,
id=2b0b321193c61dfbebe58d35f7d42bcf}]
java.lang.NumberFormatException: For input string: ""
at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:493)
at java.lang.Integer.valueOf(Integer.java:570)
at org.apache.solr.schema.IntField.toObject(IntField.java:71)
at org.apache.solr.schema.IntField.toObject(IntField.java:32)
at
org.apache.solr.request.BinaryResponseWriter$Resolver.getDoc(BinaryResponseWriter.java:147)
at
org.apache.solr.request.BinaryResponseWriter$Resolver.writeDocList(BinaryResponseWriter.java:123)
at
org.apache.solr.request.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:88)
at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:142)
at
org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:132)
at
org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:220)
at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:137)
at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:86)
at
org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:48)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpPa

Passing a Cookie in SolrJ

2009-08-18 Thread Ramirez, Paul M (388J)
Hi All,

The project I am working on is using Solr and OpenSSO (Sun's single sign on 
service). I need to write some sample code for our users that shows them how to 
query Solr and I would just like to point them to the SolrJ documentation but I 
can't see an easy way to be able to pass a cookie with the request. The cookie 
is needed to be able to get through the SSO layer but will just be ignored by 
Solr. I see that you are using Apache Commons Http Client and with that I would 
be able to write the cookie if I had access to the HttpMethod being used 
(GetMethod or PostMethod). However, I can not find an easy way to get access to 
this with SolrJ and thought I would ask before rewriting a simple example using 
only an ApacheHttpClient without the SolJ library. Thanks in advance for any 
pointers you may have.

Thanks,
Paul Ramirez


Re: dynamic changes to schema

2009-08-18 Thread Constantijn Visinescu
Use a dynamic field?
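For example, an illustrative schema.xml sketch — the "_s" suffix convention is a common one, not something the thread specifies:

```xml
<!-- schema.xml sketch: one dynamicField declaration matches any
     field name ending in "_s", so the application can add new
     searchable fields (price_s, color_s, ...) without schema edits. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
```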

On Tue, Aug 18, 2009 at 5:09 PM, Marco Westermann  wrote:

> Hi there,
>
> is there a possibility to change the solr-schema over php dynamically. The
> web-application I want to index at the moment has the feature to add fields
> to entitys and you can tell this fields that they are searchable. To realize
> this with solr the schema has to change when a searchable field is added or
> removed.
>
> Any suggestions,
>
> Thanks a lot,
>
> Marco Westermann
>
>


Re: Release Date Solr 1.4

2009-08-18 Thread Chris Hostetter

: In-Reply-To: 
: Subject: Release Date Solr 1.4
: References: 
: <4a8a9c42.20...@gmail.com>
: 

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/Thread_hijacking




-Hoss



Re: Release Date Solr 1.4

2009-08-18 Thread Yonik Seeley
On Tue, Aug 18, 2009 at 9:02 AM, Mark Miller wrote:
> The last note I saw said we hope to release 1.4 a week or so after Lucene
> 2.9 (though of course a week may not end up being enough).

Yep, I think this is still doable.

-Yonik
http://www.lucidimagination.com


RE: Passing a Cookie in SolrJ

2009-08-18 Thread Fuad Efendi
> some sample code for our users that shows them how to query Solr

- I believe you don't have to use SolrJ to query Solr; SolrJ just queries and
parses the XML response from the server. If your clients can use a raw URL as
the query and raw XML (JSON, etc.) as the response, you don't need SolrJ.

To pass a cookie with SolrJ you would need to modify the source code... or
maybe you can get access to the core HttpClient object via some configuration
(singleton) and pass a default per-client cookie without altering Solr
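Along those lines, a raw-HTTP sketch (Python purely for illustration; the cookie name and value are placeholders for whatever OpenSSO actually issues):

```python
import urllib.request

# Query Solr over plain HTTP and attach the SSO cookie by hand,
# sidestepping SolrJ entirely. Cookie name/value are placeholders.
url = "http://localhost:8983/solr/select?q=*%3A*&wt=json"
req = urllib.request.Request(
    url,
    headers={"Cookie": "iPlanetDirectoryPro=EXAMPLE_SSO_TOKEN"},
)
# urllib.request.urlopen(req) would send it; the SSO layer sees the
# cookie, and Solr itself ignores the header.
```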


-Original Message-
From: Ramirez, Paul M (388J) [mailto:paul.m.rami...@jpl.nasa.gov] 
Sent: August-18-09 12:48 PM
To: solr-user@lucene.apache.org
Subject: Passing a Cookie in SolrJ

Hi All,

The project I am working on is using Solr and OpenSSO (Sun's single sign on
service). I need to write some sample code for our users that shows them how
to query Solr and I would just like to point them to the SolrJ documentation
but I can't see an easy way to be able to pass a cookie with the request.
The cookie is needed to be able to get through the SSO layer but will just
be ignored by Solr. I see that you are using Apache Commons Http Client and
with that I would be able to write the cookie if I had access to the
HttpMethod being used (GetMethod or PostMethod). However, I can not find an
easy way to get access to this with SolrJ and thought I would ask before
rewriting a simple example using only an ApacheHttpClient without the SolJ
library. Thanks in advance for any pointers you may have.

Thanks,
Paul Ramirez




Can synonyms be defined in a multi-valued field or a database?

2009-08-18 Thread Kelly Taylor

I need the ability to remotely administer synonyms for each of my Solr
standalone instances. It seems that my only option is that of uploading a
file per instance, restarting the respective Solr instance(s), and then
rebuilding my indexes.

Can synonyms be defined in a multi-valued field or a database in place of a
file? Or can synonyms be administered in any way other than editing a file?

-Kelly
-- 
View this message in context: 
http://www.nabble.com/Can-synonyms-be-defined-in-a-multi-valued-field-or-a-database--tp25030700p25030700.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: SOLR <uniqueKey> - extremely strange behavior! Documents disappeared...

2009-08-18 Thread Fuad Efendi
UPDATE:

Crazy stuff with the SLES10 SP2 default installation/partitioning: LVM (Logical
Volume Manager) shows 400GB available, but... I lost 90% of the index without
even noticing it!

Aug 16, 2009 8:04:32 PM org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: No space left on device
at java.io.RandomAccessFile.writeBytes(RandomAccessFile.java)

- then somehow no exceptions for a few hours and no corrupted index
after several commits, then again "not enough space", etc.; finally a
corrupted index (still, SATA)


Thanks


-Original Message-
From: Funtick [mailto:f...@efendi.ca] 
Sent: August-18-09 12:25 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLR <uniqueKey> - extremely strange behavior! Documents
disappeared...


sorry for typo in prev msg,

Increase = 2,297,231 - 1,786,552  = 500,000 (average)

RATE (non-unique-id:unique-id) = 7,000,000 : 500,000 = 14:1

but 125:1 (initial 30 hours) was very strange...



Funtick wrote:
> 
> UPDATE:
> 
> After few more minutes (after previous commit):
> docsPending: about 7,000,000
> 
> After commit:
> numDocs: 2,297,231
> 
> Increase = 2,297,231 - 1,281,851 = 1,000,000 (average)
> 
> So I have 7 docs with the same ID on average.
> 
> Having 100,000,000 and then dropping below 1,000,000 is strange; it is a
> bug somewhere... need to investigate ramBufferSize and MergePolicy,
> including SOLR uniqueId implementation...
> 
> 
> 
> Funtick wrote:
>> 
>> After running an application which heavily uses MD5 HEX-representation as
>> the uniqueKey for SOLR v.1.4-dev-trunk:
>> 
>> 1. After 30 hours: 
>> 101,000,000 documents added
>> 
>> 2. Commit: 
>> numDocs = 783,714 
>> maxDoc = 3,975,393
>> 
>> 3. Upload new docs to SOLR during 1 hour(!!!), then commit, then
>> optimize:
>> numDocs=1,281,851
>> maxDocs=1,281,851
>> 
>> It looks _extremely_ strange that within an hour I have such a huge
>> increase with same 'average' document set...
>> 
>> I am suspecting something goes wrong with Lucene buffer flush / index
>> merge OR SOLR - Unique ID handling...
>> 
>> According to my own estimates, I should have about 10,000,000 new
>> documents now... I had 0.5 millions within an hour, and 0.8 mlns within a
>> day; same 'random' documents.
>> 
>> This morning index size was about 4Gb, then suddenly dropped below 0.5
>> Gb. Why? I haven't issued any "commit"...
>> 
>> I am using ramBufferMB=8192
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/SOLR-%3CuniqueKey%3E---extremely-strange-behavior%21-D
ocuments-disappeared...-tp25017728p25018263.html
Sent from the Solr - User mailing list archive at Nabble.com.
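The arithmetic in this thread is consistent with ordinary uniqueKey overwrite semantics: re-adding a document whose id already exists replaces it, so numDocs tracks distinct ids while the raw add count (and maxDoc, until merges clean up deletions) keeps growing. A toy sketch of that behaviour, assuming the poster's scheme of MD5-hex ids derived from document content:

```python
import hashlib

def md5_id(content: str) -> str:
    """Derive a uniqueKey the way the thread describes: MD5 hex of content."""
    return hashlib.md5(content.encode("utf-8")).hexdigest()

index = {}   # live documents keyed by uniqueKey (analogous to numDocs)
adds = 0     # every add, including overwrites (what keeps docsPending high)

# 7,000 adds drawn from only 500 distinct documents -> the 14:1 ratio above
for i in range(7000):
    doc = "document body %d" % (i % 500)
    index[md5_id(doc)] = doc   # duplicate id overwrites the earlier doc
    adds += 1
```

With numbers like these, 7,000 adds leave only 500 live documents, which is the same shape as 7,000,000 pending adds turning into a ~500,000 numDocs increase after commit.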





Re: Passing a Cookie in SolrJ

2009-08-18 Thread Chris Hostetter

: Subject: Passing a Cookie in SolrJ
: In-Reply-To: <8efd35820908180833u4140682bjcfbf2816b1710...@mail.gmail.com>

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/Thread_hijacking




-Hoss



How to boost fields with many terms against single-term?

2009-08-18 Thread Fuad Efendi
I don't want single-term docs such as "home" to appear at the top for a simple
search for "home"; I need "home improvement made easy" at the top... How to
implement this at query time?

Thanks!

 



Re: Is negative boost possible?

2009-08-18 Thread Chris Hostetter

: Use decimal figure less than 1, e.g. 0.5, to express less importance.

but that's still a positive boost ... it still increases the scores of 
documents that match.

the only way to "negative boost" is to "positively boost" the inverse...

(*:* -field1:value_to_penalize)^10

: > I am looking for a way to assign negative boost to a term in Solr query.
: > Our use scenario is that we want to boost matching documents that are
: > updated recently and penalize those that have not been updated for a long
: > time.  There are other terms in the query that would affect the scores as
: > well.  For example we construct a query similar to this:
: > 
: > *:* field1:value1^2  field2:value2^2 lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
: > lastUpdateTime:[* TO NOW/DAY-365DAYS]^-3
: > 
: > I notice it's not possible to simply use a negative boosting factor in the
: > query.  Is there any way to achieve such result?
: > 
: > Regards,
: > Shi Quan He
: > 
: >   



-Hoss
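Applied to the query earlier in this thread, Hoss's suggestion turns the negatively boosted clause into a positively boosted inverse, roughly like this (an untested sketch of the rewrite, not verified against this schema):

```text
*:* field1:value1^2 field2:value2^2
    lastUpdateTime:[NOW/DAY-90DAYS TO *]^5
    (*:* -lastUpdateTime:[* TO NOW/DAY-365DAYS])^3
```

The last clause matches every document *except* the stale ones, so boosting it effectively penalizes documents not updated in the past year without using a negative boost factor.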



Using Solr Cell to index a Word Document

2009-08-18 Thread Kevin Miller
I am using the Solr nightly build from 8/11/09.  I have set the text field in the 
solrconfig.xml file to be stored.  I index an MS Word document, and when I 
search for a word in the text of the document it comes back in XML format.  
The text field shows the text of the document, but there are areas in the 
document that are FORMDROPDOWNs.  What I want to know is whether there is some way 
that the information that was entered into the FORMDROPDOWNs can be retrieved.  
The text field contains the following information (I have entered in 
parentheses the actual data from the MS Word document for the FORMDROPDOWNs):


−

 OKLAHOMA TAX COMMISSION  
  
FISCAL IMPACT STATEMENT AND/OR ADMINISTRATIVE IMPACT STATEMENT
FIRST REGULAR SESSION, FIFTY-SECOND OKLAHOMA LEGISLATURE
  
  
  DATE OF IMPACT STATEMENT:  May 21, 2009
   
  BILL NUMBER:  HB 1097 
  
  STATUS AND DATE OF BILL:FORMDROPDOWN(Enrolled Bill)05/20/2009 
  
  AUTHORS:  House FORMTEXT   Dank   Senate  Brogdon
  
  TAX TYPE (S):All   SUBJECT:  FORMDROPDOWN(Credit)   
  
  PROPOSAL:   FORMDROPDOWN(New Law) 
  
  This measure creates a nine (9) member task force to study tax credits.  The 
measure also includes provisions of procedures and duties for the task force 
and directs the task force to produce a final written report for the Speaker, 
the Governor and the Pro Tempore.
  
  
  EFFECTIVE DATE:   August 21, 2009 (Assuming sine die is May 22, 2009)
  
  REVENUE IMPACT: 
  
  Insert dollar amount (plus or minus) of the expected change in state revenues 
due to this proposed legislation.
  
  FY 09:None
  FY 10:None  FORMTEXT
  
  ADMINISTRATIVE IMPACT:
  
  Insert the estimated cost or savings to the Tax Commission due to this 
proposed legislation.
  
  FY 10:None
  
  
lrh 
  DATE  DIVISION DIRECTOR
  

 
  DATE  X XX, ECONOMIST
  
 
  DATE  FOR THE COMMISSION
   



Kevin Miller
Web Services


Re: How to boost fields with many terms against single-term?

2009-08-18 Thread Bill Au
Lucene's default scoring formula gives shorter fields a higher score:

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html

Sounds like you want the opposite.  You can write your own Similarity class
overriding the lengthNorm() method:

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html#lengthNorm%28java.lang.String,%20int%29

Bill


On Tue, Aug 18, 2009 at 3:02 PM, Fuad Efendi  wrote:

> I don't want single-term docs such as "home" to appear at the top for a simple
> search for "home"; I need "home improvement made easy" at the top... How to
> implement this at query time?
>
> Thanks!
>
>
>
>
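
To see why the default favours short fields: DefaultSimilarity's lengthNorm is 1/sqrt(numTerms), so a one-term field gets norm 1.0 while a four-term field gets 0.5. A quick sketch in plain Python (ignoring Lucene's lossy byte encoding of norms; the "inverted" variant is just one illustrative override, not a recommendation):

```python
import math

def default_length_norm(num_terms: int) -> float:
    """Lucene DefaultSimilarity: shorter fields get a larger norm."""
    return 1.0 / math.sqrt(num_terms)

def inverted_length_norm(num_terms: int) -> float:
    """A hypothetical override that favours longer fields instead."""
    return math.sqrt(num_terms) / 10.0

# With the default, the one-term doc "home" outscores the four-term title;
# with the inverted norm, "home improvement made easy" wins.
assert default_length_norm(1) > default_length_norm(4)
assert inverted_length_norm(4) > inverted_length_norm(1)
```

A custom Similarity overriding lengthNorm() along these lines, registered in schema.xml, would shift scoring toward longer fields at index time (norms are baked in, so a reindex is needed after changing it).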


Re: How can i get lucene index format version information?

2009-08-18 Thread Shalin Shekhar Mangar
2009/8/18 Licinio Fernández Maurelo 

> Nobody knows how I can get exactly this info : index format : -9 (UNKNOWN)
>

I think Luke may be using an older version of Lucene which is not able to
read the index created by Solr.


>
> Despite knowing that 2.9-dev 794238 -
> 2009-07-15 18:05:08 helps, I assume that it doesn't imply an
> index format change
>

The best way to confirm this is to checkout Lucene revision 794238 and look
at the CHANGES.txt file.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Maximum number of values in a multi-valued field.

2009-08-18 Thread Shalin Shekhar Mangar
On Tue, Aug 18, 2009 at 4:20 AM, Arv  wrote:

>
> All,
> We are considering some new changes to our Solr schema to better support
> some new functionality for our application. To that extent, we want to add
> an additional field that is multi-valued, but will contain a large number
> of
> values per document. Potentially up to 2000 values on this field per
> document.
>
> Questions:
> - Is this wise?


Depends :)


>
> - Though we will not be faceting on this field, are there any implications
> for performance?


Should be ok but benchmark it to be sure.


>
> - I understand that the XML in/out will be large, and we may need to stop
> this field being sent back on every query, as this field is essentially
> used
> as a filter only.
>

You can do that with the "fl" request parameter. If you don't need it back
ever, don't "store" it.

You mentioned in a later mail that these are catalogue ids stored for each
product. I guess the number of unique ids across all documents should not be
that much? If yes, you may breathe easier.

-- 
Regards,
Shalin Shekhar Mangar.


Replication over multi-core solr

2009-08-18 Thread vivek sar
Hi,

  We use multi-core setup for Solr, where new cores are added
dynamically to solr.xml. Only one core is active at a time. My
question is how can the replication be done for multi-core - so every
core is replicated on the slave?

I went over the wiki, http://wiki.apache.org/solr/SolrReplication,
and few questions related to that,

1) How do we replicate solr.xml where we have list of cores? Wiki
says, "Only files in the 'conf' dir of solr instance is replicated. "
- since, solr.xml is in the home directory how do we replicate that?

2) Solrconfig.xml in slave takes a static core url,

http://localhost:port/solr/corename/replication

As in our case cores are created dynamically (a new core is created after
the active one reaches some capacity), how can we define the master core
dynamically for replication? The only way I see is using the "fetchIndex"
command and passing the new core info there - is that right? If so, the
slave application would have to write code to poll the Master periodically and
fire the "fetchIndex" command, but how would the Slave know the Master corename,
as they are created dynamically on the Master?

Thanks,
-vivek
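For reference, pulling an index on demand through 1.4's replication handler is a single HTTP call, something along these lines (host, port, and core names are placeholders):

```text
http://slave:8983/solr/corename/replication?command=fetchindex&masterUrl=http://master:8983/solr/corename/replication
```

Because masterUrl is supplied per request, the slave-side poller can point each pull at whatever core the master reports as active, which is the part the question is really about.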


Re: Distributed index

2009-08-18 Thread Shalin Shekhar Mangar
On Tue, Aug 18, 2009 at 3:49 PM, ToJira  wrote:

>
> Hi,
>
> I am very new to Solr and overall a newbie in software development. I have a
> problem with cross-platform implementation. Basically I have a local index
> running on a windows server 2003 aided with a web service (asp.net) for
> the
> user queries. However, I need to add another index on a remote Linux
> computer. As I somehow understood, it is necessary to implement a Solr
> server on the Linux because it is not possible to use .net there.


Solr can run on Windows as well as on Linux. If you want, you can run it on
your windows server box.


> So I would
> like to know how do I search the remote index using the asp.net web
> service
> user inqueries?


Search and update is done through HTTP with the default data format being
XML. All you need to know is to make the correct http call and create/parse
XML. There are a couple of C# client libraries for Solr.
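
For example, a search is just an HTTP GET returning XML that .NET can parse (field names here are illustrative):

```text
http://linuxbox:8983/solr/select?q=title:solr&start=0&rows=10&wt=xml
```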


> Any tutorials related to this?
>

The Solr wiki is a good place to start looking.

http://wiki.apache.org/solr/

Good Luck and welcome to Solr!

-- 
Regards,
Shalin Shekhar Mangar.


Re: spellcheck component in 1.4 distributed

2009-08-18 Thread Shalin Shekhar Mangar
On Mon, Aug 17, 2009 at 8:32 PM, Ian Connor  wrote:

> Hi,
>
> Just a quick update to the list. Mike and I were able to apply it to 1.4
> and
> it works. We have it loaded on a few production servers and there is an odd
> "StringIndexOutOfBoundsException" error but most of the time it seems to
> work just fine.
>
> On Fri, Aug 7, 2009 at 7:30 PM, mike anderson  >wrote:
>
> > I am e-mailing to inquire about the status of the spellchecking component
> > in
> > 1.4 (distributed). I saw SOLR-785, but it is unreleased and for 1.5. Any
> > help would be much appreciated.
> > Thanks in advance,
> > Mike
> >
>

Sorry guys, this has been sitting on my plate. The patch didn't have unit
tests and I didn't have enough time to look into it, so I marked it for 1.5.

On a related note, it is best to have such discussion on the issue itself so
that problems regarding the patch (or lack thereof) are recorded in the
right place.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Using Solr Cell to index a Word Document

2009-08-18 Thread Mark Miller
Solr defers to Tika for this. Tika uses getParagraphText() from the POI 
WordExtractor class:


http://poi.apache.org/apidocs/org/apache/poi/hwpf/extractor/WordExtractor.html

POI appears to be in limbo and I'm not seeing anything in WordExtractor 
that looks like it might help you.


I'd inquire at the Tika project though.


--
- Mark

http://www.lucidimagination.com



Kevin Miller wrote:

I am using the Solr nightly build 8/11/09.  I have set the text field in the 
solrconfig.xml file to be stored.  I index an MS Word document and when I 
search for a word in the text of the document and it pulls up the xml format.  
The text field is showing the text of the document but there are areas in the 
document that are FORMDROPDOWNs.  What I want to know is if there is some way 
that the information that was entered into the FORMDROPDOWNs can be retrieved.  
The text field contains the following information (I have entered in 
parenthesis the actual data from the MS Word document for the FORMDROPDOWNs:


−





Kevin Miller
Web Services
  






Re: [ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search Server

2009-08-18 Thread Shalin Shekhar Mangar
Fantastic! This is great news for Solr! Congratulations!

You might want to post this to the general-lucene mailing list and the
linkedin group too.

On Tue, Aug 18, 2009 at 7:39 PM, Smiley, David W.  wrote:

> Fellow Solr users,
>
> I've finally finished the book "Solr 1.4 Enterprise Search Server" with my
> co-author Eric.  We are proud to present the first book on Solr and hope you
> find it a valuable resource.   You can find full details about the book and
> purchase it here:
> http://www.packtpub.com/solr-1-4-enterprise-search-server/book
> It can be pre-ordered at a discount now and should be shipping within a
> week or two.  The book is also available through Amazon.  You can feel good
> about the purchase knowing that 5% of each sale goes to support the Apache
> Software Foundation.  For a free sample, there is a portion of chapter 5
> covering faceting available as an article online here:
>
> http://www.packtpub.com/article/faceting-in-solr-1.4-enterprise-search-server
>
> By the way, we realize Solr 1.4 isn't out [quite] yet.  It is
> feature-frozen however, and there's little in the forthcoming release that
> isn't covered in our book.  About the only notable thing that comes to mind
> is the contrib module on search result clustering.  However Eric plans to
> write a free online article available from Packt Publishing on that very
> subject.
>
> "Solr 1.4 Enterprise Search Server" In Detail:
>
> If you are a developer building a high-traffic web site, you need to have a
> terrific search engine. Sites like Netflix.com and Zappos.com employ Solr,
> an open source enterprise search server, which uses and extends the Lucene
> search library. This is the first book in the market on Solr and it will
> show you how to optimize your web site for high volume web traffic with
> full-text search capabilities along with loads of customization options. So,
> let your users gain a terrific search experience
>
> This book is a comprehensive reference guide for every feature Solr has to
> offer. It serves the reader right from initiation to development to
> deployment. It also comes with complete running examples to demonstrate its
> use and show how to integrate it with other languages and frameworks
>
> This book first gives you a quick overview of Solr, and then gradually
> takes you from basic to advanced features that enhance your search. It
> starts off by discussing Solr and helping you understand how it fits into
> your architecture—where all databases and document/web crawlers fall short,
> and Solr shines. The main part of the book is a thorough exploration of
> nearly every feature that Solr offers. To keep this interesting and
> realistic, we use a large open source set of metadata about artists,
> releases, and tracks courtesy of the MusicBrainz.org project. Using this
> data as a testing ground for Solr, you will learn how to import this data in
> various ways from CSV to XML to database access. You will then learn how to
> search this data in a myriad of ways, including Solr's rich query syntax,
> "boosting" match scores based on record data and other means, about
> searching across multiple fields with different boosts, getting facets on
> the results, auto-complete user queries, spell-correcting searches,
> highlighting queried text in search results, and so on.
>
> After this thorough tour, we'll demonstrate working examples of integrating
> a variety of technologies with Solr such as Java, JavaScript, Drupal, Ruby,
> XSLT, PHP, and Python.
>
> Finally, we'll cover various deployment considerations to include indexing
> strategies and performance-oriented configuration that will enable you to
> scale Solr to meet the needs of a high-volume site
>
>
> Sincerely,
>
> David Smiley (primary-author)
>dsmi...@mitre.org
> Eric Pugh (co-author)
>ep...@opensourceconnections.com
>



-- 
Regards,
Shalin Shekhar Mangar.


Faceting Performance Factors

2009-08-18 Thread CameronL

Our current search is faceting on a single integer field. The field is
multi-valued.

facet=true
facet.mincount=1
facet.limit=-1
facet.field=fieldA

The number of unique values in our index for fieldA is around 8000, and a
typical query can return about 500 counts. A typical single document can
have anywhere from 5 to 20 values for fieldA. The performance we are getting
for this implementation is pretty acceptable (under 2 seconds).

Now, we are trying to add in a 2nd facet, also an integer and also
multi-valued.

facet=true
facet.mincount=1
facet.limit=-1
facet.field=fieldA
facet.field=fieldB

The number of unique values in our index for fieldB is around 100k, but a
typical query still only returns about 400 counts. However, a single
document will only have 5 or 6 values for fieldB. The performance of our
queries dropped significantly (about 15-20 seconds per query!).

I'm unable to figure out why there is such a significant drop in performance
here. Is it the fact that there are more than 10x more possible unique
values for fieldB? Hopefully I have provided enough info above, but do any
of these strike you as a big contributing factor to the drop in performance?

We are currently using Solr 1.3 and upgrading to 1.4 will not be an option
until it is finalized.

Thanks for the help.
-- 
View this message in context: 
http://www.nabble.com/Faceting-Performance-Factors-tp25033622p25033622.html
Sent from the Solr - User mailing list archive at Nabble.com.
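One plausible factor: faceting a multi-valued field in Solr 1.3 enumerates the field's unique terms, so work (and filterCache pressure) grows with the ~100k unique values in fieldB rather than with the ~400 counts actually returned. This is an assumption about the cause, not a diagnosis; a toy model of that scaling:

```python
def facet_counts(docs, matching, field):
    """Naive term-enumeration faceting, as a rough model of Solr 1.3's
    multi-valued facet algorithm: one intersection per unique term."""
    unique_terms = {v for d in docs for v in d[field]}
    counts = {}
    for term in sorted(unique_terms):        # work grows with unique terms,
        n = sum(1 for i in matching if term in docs[i][field])
        if n:                                # not with the counts returned
            counts[term] = n
    return counts

docs = [{"fieldA": [1, 2], "fieldB": [10, 20, 30]},
        {"fieldA": [2, 3], "fieldB": [30, 40]}]

counts_a = facet_counts(docs, [0, 1], "fieldA")   # enumerates 3 unique terms
counts_b = facet_counts(docs, [0, 1], "fieldB")   # enumerates 4 unique terms
```

Scaled to the real index, an 8k-term field stays cheap while a 100k-term field does more than 10x the per-query work, which fits the observed 2s vs 15-20s gap.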



Re: dynamic changes to schema

2009-08-18 Thread Marco Westermann

hi,

thanks for the advice, but the problem with dynamic fields is that I 
cannot restrict what the user calls the field in the application, so 
there isn't a pattern I can use. But I thought about using multivalued 
fields for the dynamically added fields. Good idea?


thanks,
Marco

Constantijn Visinescu wrote:

use a dynamic field ?

On Tue, Aug 18, 2009 at 5:09 PM, Marco Westermann  wrote:

  

Hi there,

is there a possibility to change the Solr schema dynamically via PHP? The
web application I want to index at the moment has a feature to add fields
to entities, and you can mark these fields as searchable. To realize
this with Solr, the schema has to change when a searchable field is added or
removed.

Any suggestions,

Thanks a lot,

Marco Westermann

--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander








  






RE: DataImportHandler - very slow delta import

2009-08-18 Thread Matthew Painter
Thanks; that confirms my observed behaviour.

However, why would the delta query have to make a single db call per changed 
row? For simple delta queries like mine below, batching a chunk of rows at a 
time from the database seems quite doable. Or are there less trivial situations 
where batching wouldn't work?

Does the deletedPkQuery suffer from the same performance issues? The problem in 
our specific instance is that often we're removing and modifying thousands of 
rows in one hit so I may have to adopt a different approach. I'm not 
comfortable using Solr 1.4 in a production environment yet, so unfortunately 
the nice new features in the DataImportHandler aren't an option.

I'll try your suggested solution soon.

M

 

-Original Message-
From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble 
Paul ??? ??
Sent: Tuesday, 18 August 2009 5:11 p.m.
To: solr-user@lucene.apache.org
Subject: Re: DataImportHandler - very slow delta import

Delta imports are likely to be far slower than full imports
because they make one db call per changed row. If you can write the
"query" in such a way that it returns only the changed rows, then write
a separate entity (directly under ) and just run a
full-import with that entity only.

On Tue, Aug 18, 2009 at 6:32 AM, Matthew
Painter wrote:
> Hi,
>
> We are using Solr's DataImportHandler to populate the Solr index from a
> SQL Server database of nearly 4,000,000 rows. Whereas the population
> itself is very fast (around 1000 rows per second), the delta import is
> only processing around one row a second.
>
> Is this a known performance issue? We are using Solr 1.3.
>
> For reference, the abridged entity configuration (cuts indicated by
> '...') is below:
>
>              query="select archwaypublic.getSolrIdentifier(oid, 'agency')
> as oid, oid as realoid, archwaypublic.getSolrIdentifier(oid, 'agency')
> as id, code, name, ..."
>   deltaQuery="select oid from publicagency with (nolock) where
> modifiedtime > '${dataimporter.last_index_time}'"
>   deletedPkQuery="select archwaypublic.getSolrIdentifier(entityoid,
> 'agency') as oid from pendingsolrdeletions with (nolock) where
> entitytype='agency'">
>
> ...
> 
>
> Thanks,
> Matt
>
> This e-mail message and any attachments are CONFIDENTIAL to the addressee(s) 
> and may also be LEGALLY PRIVILEGED.  If you are not the intended addressee, 
> please do not use, disclose, copy or distribute the message or the 
> information it contains.  Instead, please notify me as soon as possible and 
> delete the e-mail, including any attachments.  Thank you.
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


DataImportHandler ignoring most rows

2009-08-18 Thread Erik Earle
Using:
- apache-solr-1.3.0
- java 1.6 
- tomcat 6
- sql server 2005 w/ JSQLConnect 4.0 driver

I have a group table with 3007 rows.  I have confirmed the key is
unique with "select distinct id from group"  and it returns 3007.  When I 
re-index using http://host:port/solr/dataimport?command=full-import  I only get 
7 records indexed.  Any insight into what is going on would be really great.  

A partial response:

1
7
0


I have other entities that index all the rows without issue.

There are no errors in the logs.

I am not using any Transformers (and most of my config is not changed from 
install)

My schema.xml contains:

 key

and field defs (not a full list of fields):

   

   
   
   

data-config.xml






   







   




  


Re: MoreLikeThis (MLT) in 1.4 distributed

2009-08-18 Thread mike anderson
Perhaps it was something about the way I applied the patch by hand, but
after trying it again (on a later revision, maybe that was the trick), I got
solr to acknowledge I was using MLT when also passing the shards parameter.
However, unlike a query without shards, I get numFound=0 for all results:


Any advice to this end?

My query is still the same:

http://localhost:8983/solr/select?q=graph%20theory&mlt=true&mlt.fl=abstract&mlt.mindf=1&mlt.mintf=1&shards=localhost:8983/solr

thanks in advance,
Mike


On Tue, Aug 18, 2009 at 12:18 PM, mike anderson wrote:

> There doesn't appear to be any related errors in the log. I've included it
> below anyhow (there is a java.lang.NumberFormatException, I'm not sure what
> that is).
> thanks,
> mike
>
> for the query:
>
> http://localhost:8983/solr/select?q=%22theory%20of%20colorful%20graphs%22&mlt=true&mlt.fl=abstract&mlt.mindf=1&mlt.mintf=1&shards=localhost:8983/solr
>
> Aug 18, 2009 12:11:56 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/select
> params={mlt.mindf=1&mlt.fl=abstract&shards=localhost:8983/solr&q="theory+of+colorful+graphs"&mlt.mintf=1&mlt=true}
> status=0 QTime=68
> Aug 18, 2009 12:12:08 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/select
> params={spellcheck=true&mlt.fl=abstract&spellcheck.extendedResults=false&mlt.mintf=1&mlt=true&spellcheck.collate=true&wt=javabin&spellcheck.onlyMorePopular=false&rows=10&version=2.2&mlt.mindf=1&fl=id,score&start=0&q="theory+of+colorful+graphs"&spellcheck.dictionary=titleCheck&spellcheck.count=1&isShard=true&fsv=true}
> hits=1 status=0 QTime=5
> Aug 18, 2009 12:12:08 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/select
> params={spellcheck=true&mlt.fl=abstract&spellcheck.extendedResults=false&ids=2b0b321193c61dfbebe58d35f7d42bcf&mlt.mintf=1&mlt=true&spellcheck.collate=true&wt=javabin&spellcheck.onlyMorePopular=false&version=2.2&mlt.mindf=1&q="theory+of+colorful+graphs"&spellcheck.dictionary=titleCheck&spellcheck.count=1&isShard=true}
> status=0 QTime=5
> Aug 18, 2009 12:12:08 PM
> org.apache.solr.request.BinaryResponseWriter$Resolver getDoc
> WARNING: Error reading a field from document : SolrDocument[{abstract=  The
> theory of colorful graphs can be developed by working in Galois field
> modulo (p), p > 2 and a prime number. The paper proposes a program of
> possible
> conversion of graph theory into a pleasant colorful appearance. We propose
> to
> paint the usual black (indicating presence of an edge) and white
> (indicating
> absence of an edge) edges of graphs using multitude of colors and study
> their
> properties. All colorful graphs considered here are simple, i.e. not having
> any
> multiple edges or self-loops. This paper is an invitation to the program of
> generalizing usual graph theory in this direction.
> , affiliations=, all_authors=Dhananjay P Mehendale, article_date=Sat Apr 28
> 19:59:59 EDT 2007, authors=[Mehendale, Dhananjay P Mehendale, Mehendale
> Dhananjay P, D P Mehendale, Mehendale D P, D Mehendale, Mehendale D,
> Dhananjay Mehendale, Mehendale Dhananjay, DP Mehendale, Mehendale DP],
> created_at=Sat Apr 28 19:59:59 EDT 2007, description=10 pages, doi=, eissn=,
> first_author=[Mehendale, Dhananjay P Mehendale, Mehendale Dhananjay P, D P
> Mehendale, Mehendale D P, D Mehendale, Mehendale D, Dhananjay Mehendale,
> Mehendale Dhananjay, DP Mehendale, Mehendale DP], first_page=,
> id=2b0b321193c61dfbebe58d35f7d42bcf}]
> java.lang.NumberFormatException: For input string: ""
> at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>  at java.lang.Integer.parseInt(Integer.java:493)
> at java.lang.Integer.valueOf(Integer.java:570)
>  at org.apache.solr.schema.IntField.toObject(IntField.java:71)
> at org.apache.solr.schema.IntField.toObject(IntField.java:32)
>  at
> org.apache.solr.request.BinaryResponseWriter$Resolver.getDoc(BinaryResponseWriter.java:147)
> at
> org.apache.solr.request.BinaryResponseWriter$Resolver.writeDocList(BinaryResponseWriter.java:123)
>  at
> org.apache.solr.request.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:88)
> at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:142)
>  at
> org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:132)
> at
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:220)
>  at
> org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:137)
> at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:86)
>  at
> org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:48)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
>  at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>  at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at org.mor

RE: DataImportHandler - very slow delta import

2009-08-18 Thread Matthew Painter
I was wary of the potential maintenance issues and clutter involved with 
copying each entity block as suggested below (they're all large and there are 
around ten of them), so I just modified the main full-import query to be of 
the syntax:

query="select x,y,z from table where modifiedtime > 
'${dataimporter.last_index_time}'"

It appears to work fine. I suspect this isn't the way it's *supposed* to 
be used; however, it may be worth mentioning in the wiki as an alternative way to 
use the DataImportHandler for situations like mine, where the dataset is large, 
the data is reasonably volatile, and the current delta query code isn't 
appropriate for performance reasons.
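
For anyone who prefers keeping the main query untouched, Noble's separate-entity approach would look roughly like the sketch below (entity and column names are illustrative, and it would be run as a targeted full-import, e.g. command=full-import&entity=agencyDelta&clean=false):

```xml
<!-- Illustrative sketch: a second entity that selects only the changed
     rows in one batched query, run via a targeted full-import instead
     of a per-row delta-import. -->
<entity name="agencyDelta"
        query="select oid, code, name from publicagency with (nolock)
               where modifiedtime &gt; '${dataimporter.last_index_time}'">
</entity>
```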

M

PS. 22 hours later, and I killed the original delta import query ;)



-Original Message-
From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble 
Paul ??? ??
Sent: Tuesday, 18 August 2009 5:11 p.m.
To: solr-user@lucene.apache.org
Subject: Re: DataImportHandler - very slow delta import

Delta imports are likely to be far slower than full imports
because they make one db call per changed row. If you can write the
"query" in such a way that it returns only the changed rows, then write
a separate entity (directly under ) and just run a
full-import with that entity only.

On Tue, Aug 18, 2009 at 6:32 AM, Matthew
Painter wrote:
> Hi,
>
> We are using Solr's DataImportHandler to populate the Solr index from a
> SQL Server database of nearly 4,000,000 rows. Whereas the population
> itself is very fast (around 1000 rows per second), the delta import is
> only processing around one row a second.
>
> Is this a known performance issue? We are using Solr 1.3.
>
> For reference, the abridged entity configuration (cuts indicated by
> '...') is below:
>
>              query="select archwaypublic.getSolrIdentifier(oid, 'agency')
> as oid, oid as realoid, archwaypublic.getSolrIdentifier(oid, 'agency')
> as id, code, name, ..."
>   deltaQuery="select oid from publicagency with (nolock) where
> modifiedtime > '${dataimporter.last_index_time}'"
>   deletedPkQuery="select archwaypublic.getSolrIdentifier(entityoid,
> 'agency') as oid from pendingsolrdeletions with (nolock) where
> entitytype='agency'">
>
> ...
> 
>
> Thanks,
> Matt
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: CorruptIndexException: Unknown format version

2009-08-18 Thread Chris Hostetter

: how can that happen, it is a new index, and it is already corrupt?
: 
: Did anybody else something like this?

"Unknown format version" doesn't mean your index is corrupt .. it means 
the version of Lucene parsing the index doesn't recognize the index format 
version ... typically it means you are trying to open an index generated 
by a newer version of Lucene than the one you are using.




-Hoss



Re: Relevant results with DisMaxRequestHandler

2009-08-18 Thread Chris Hostetter

: The 'qf' parameter used in the dismax seems to work with an 'AND' separator.
: I have much more results without dixmax. Is there any way to keep the same
: amount of document and process the 'qf' ?

did you read any of the docs on dismax?

http://wiki.apache.org/solr/DisMaxRequestHandler

did you look at the "mm" param?

http://wiki.apache.org/solr/DisMaxRequestHandler#mm


-Hoss



Re: DataImportHandler ignoring most rows

2009-08-18 Thread Erik Earle
Upgraded to the tip from svn and still no love.



- Original Message 
From: Erik Earle 
To: solr-user@lucene.apache.org
Sent: Tuesday, August 18, 2009 3:16:47 PM
Subject: DataImportHandler ignoring most rows

Using:
- apache-solr-1.3.0
- java 1.6 
- tomcat 6
- sql server 2005 w/ JSQLConnect 4.0 driver

I have a group table with 3007 rows.  I have confirmed the key is
unique with "select distinct id from group" and it returns 3007.  When I 
re-index using http://host:port/solr/dataimport?command=full-import  I only get 
7 records indexed.  Any insight into what is going on would be really great.  

A partial response:

1
7
0


I have other entities that index all the rows without issue.

There are no errors in the logs.

I am not using any Transformers (and most of my config is not changed from 
install)

My schema.xml contains:

 key

and field defs (not a full list of fields):

   

   
   
   

data-config.xml






  







  




  


Re: schema configuration with different kind of score report

2009-08-18 Thread Chris Hostetter

: Hence, some sort of different query will be applied, which I am unable to
: ascertain.

well that would be step one.  before anyone can help you generate a 
"different kind of score report" you have to be able to describe the 
general algorithm you want for determining when there is a match and what the 
score should be if there is a match ... listing a single input/output 
example isn't very helpful, and doesn't tell anyone what kind of behavior 
you want for the millions of other possible search inputs you could 
receive.

please *describe* the behavior you want -- not some magical absolute 
numbers you expect to get back from a single example.


-Hoss



Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Aaron Aberg
Marco might be right about the JRE thing.
Here is my classpath entry when Tomcat starts up
java.library.path:
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib

Constantijn,

Here is my solr home file list with permissions:

-bash-3.2$ ll /usr/share/solr/*
-rw-r--r-- 1 tomcat tomcat 2150 Aug 17 22:51 /usr/share/solr/README.txt

/usr/share/solr/bin:
total 160
-rwxr-xr-x 1 tomcat tomcat 4896 Aug 17 22:51 abc
-rwxr-xr-x 1 tomcat tomcat 4919 Aug 17 22:51 abo
-rwxr-xr-x 1 tomcat tomcat 2915 Aug 17 22:51 backup
-rwxr-xr-x 1 tomcat tomcat 3435 Aug 17 22:51 backupcleaner
-rwxr-xr-x 1 tomcat tomcat 3312 Aug 17 22:51 commit
-rwxr-xr-x 1 tomcat tomcat 3306 Aug 17 22:51 optimize
-rwxr-xr-x 1 tomcat tomcat 3163 Aug 17 22:51 readercycle
-rwxr-xr-x 1 tomcat tomcat 1752 Aug 17 22:51 rsyncd-disable
-rwxr-xr-x 1 tomcat tomcat 1740 Aug 17 22:51 rsyncd-enable
-rwxr-xr-x 1 tomcat tomcat 3508 Aug 17 22:51 rsyncd-start
-rwxr-xr-x 1 tomcat tomcat 2295 Aug 17 22:51 rsyncd-stop
-rwxr-xr-x 1 tomcat tomcat 2132 Aug 17 22:51 scripts-util
-rwxr-xr-x 1 tomcat tomcat 3775 Aug 17 22:51 snapcleaner
-rwxr-xr-x 1 tomcat tomcat 4994 Aug 17 22:51 snapinstaller
-rwxr-xr-x 1 tomcat tomcat 7980 Aug 17 22:51 snappuller
-rwxr-xr-x 1 tomcat tomcat 1768 Aug 17 22:51 snappuller-disable
-rwxr-xr-x 1 tomcat tomcat 1770 Aug 17 22:51 snappuller-enable
-rwxr-xr-x 1 tomcat tomcat 3269 Aug 17 22:51 snapshooter

/usr/share/solr/conf:
total 124
-rw-r--r-- 1 tomcat tomcat  1125 Aug 17 22:51 admin-extra.html
-rw-r--r-- 1 tomcat tomcat  1310 Aug 17 22:51 elevate.xml
-rw-r--r-- 1 tomcat tomcat   894 Aug 17 22:51 protwords.txt
-rw-r--r-- 1 tomcat tomcat 20083 Aug 17 22:51 schema.xml
-rw-r--r-- 1 tomcat tomcat   921 Aug 17 22:51 scripts.conf
-rw-r--r-- 1 tomcat tomcat 30281 Aug 17 22:53 solrconfig.xml
-rw-r--r-- 1 tomcat tomcat16 Aug 17 22:51 spellings.txt
-rw-r--r-- 1 tomcat tomcat  1226 Aug 17 22:51 stopwords.txt
-rw-r--r-- 1 tomcat tomcat  1163 Aug 17 22:51 synonyms.txt
drwxr-xr-x 2 tomcat tomcat  4096 Aug 17 22:51 xslt


Re: Faceting Performance Factors

2009-08-18 Thread Jason Rutherglen
Hi Cameron,

You'll need to upgrade to Solr 1.4 as the 1.3 method of faceting
is quite slow (i.e. intersecting bitsets). 1.4 uses
UnInvertedField which caches the terms per doc and
iterates/counts them. The 1.3 method is slow because for every
term (i.e. unique field value) there needs to be a bitset.
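If/when the upgrade does happen, the 1.4 faceting behavior can also be requested explicitly per field. As a sketch (the handler name is an example; the field names are taken from the thread below), the parameters could be baked into solrconfig.xml defaults:

```xml
<!-- example search-handler defaults; facet.method=fc selects the
     UnInvertedField-based counting introduced in Solr 1.4 (it is also
     the default there for multi-valued fields) -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <str name="facet.limit">-1</str>
    <str name="facet.field">fieldA</str>
    <str name="facet.field">fieldB</str>
    <str name="facet.method">fc</str>
  </lst>
</requestHandler>
```

Note that the facet.method parameter only exists from 1.4 onward; 1.3 will ignore it.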

-J

On Tue, Aug 18, 2009 at 2:17 PM, CameronL wrote:
>
> Our current search is faceting on a single integer field. The field is
> multi-valued.
>
> facet=true
> facet.mincount=1
> facet.limit=-1
> facet.field=fieldA
>
> The number of unique values in our index for fieldA is around 8000, and a
> typical query can return about 500 counts. A typical single document can
> have anywhere from 5 to 20 values for fieldA. The performance we are getting
> for this implementation is pretty acceptable (under 2 seconds).
>
> Now, we are trying to add in a 2nd facet, also an integer and also
> multi-valued.
>
> facet=true
> facet.mincount=1
> facet.limit=-1
> facet.field=fieldA
> facet.field=fieldB
>
> The number of unique values in our index for fieldB is around 100k, but a
> typical query still only returns about 400 counts. However, a single
> document will only have 5 or 6 values for fieldB. The performance of our
> queries dropped significantly (about 15-20 seconds per query!).
>
> I'm unable to figure out why there is such a significant drop in performance
> here. Is it the fact that there are more than 10x more possible unique
> values for fieldB? Hopefully I have provided enough info above, but do any
> of these strike you as a big contributing factor to the drop in performance?
>
> We are currently using Solr 1.3 and upgrading to 1.4 will not be an option
> until it is finalized.
>
> Thanks for the help.
> --
> View this message in context: 
> http://www.nabble.com/Faceting-Performance-Factors-tp25033622p25033622.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


RE: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Fuad Efendi
The only difference between JRE and JDK (from TOMCAT viewpoint) is absence
of javac compiler for JSPs. But it will complain only if you try to use JSPs
(via admin console).

Have you tried to install SOLR on your localbox and play with samples
described at many WIKI pages?



-Original Message-
From: Aaron Aberg [mailto:aaronab...@gmail.com] 
Sent: August-18-09 9:04 PM
To: solr-user@lucene.apache.org
Subject: Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on
CentOS

Marco might be right about the JRE thing.
Here is my classpath entry when Tomcat starts up
java.library.path:
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/jav
a-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0
/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib

Constantijn,

Here is my solr home file list with permissions:

-bash-3.2$ ll /usr/share/solr/*
-rw-r--r-- 1 tomcat tomcat 2150 Aug 17 22:51 /usr/share/solr/README.txt

/usr/share/solr/bin:
total 160
-rwxr-xr-x 1 tomcat tomcat 4896 Aug 17 22:51 abc
-rwxr-xr-x 1 tomcat tomcat 4919 Aug 17 22:51 abo
-rwxr-xr-x 1 tomcat tomcat 2915 Aug 17 22:51 backup
-rwxr-xr-x 1 tomcat tomcat 3435 Aug 17 22:51 backupcleaner
-rwxr-xr-x 1 tomcat tomcat 3312 Aug 17 22:51 commit
-rwxr-xr-x 1 tomcat tomcat 3306 Aug 17 22:51 optimize
-rwxr-xr-x 1 tomcat tomcat 3163 Aug 17 22:51 readercycle
-rwxr-xr-x 1 tomcat tomcat 1752 Aug 17 22:51 rsyncd-disable
-rwxr-xr-x 1 tomcat tomcat 1740 Aug 17 22:51 rsyncd-enable
-rwxr-xr-x 1 tomcat tomcat 3508 Aug 17 22:51 rsyncd-start
-rwxr-xr-x 1 tomcat tomcat 2295 Aug 17 22:51 rsyncd-stop
-rwxr-xr-x 1 tomcat tomcat 2132 Aug 17 22:51 scripts-util
-rwxr-xr-x 1 tomcat tomcat 3775 Aug 17 22:51 snapcleaner
-rwxr-xr-x 1 tomcat tomcat 4994 Aug 17 22:51 snapinstaller
-rwxr-xr-x 1 tomcat tomcat 7980 Aug 17 22:51 snappuller
-rwxr-xr-x 1 tomcat tomcat 1768 Aug 17 22:51 snappuller-disable
-rwxr-xr-x 1 tomcat tomcat 1770 Aug 17 22:51 snappuller-enable
-rwxr-xr-x 1 tomcat tomcat 3269 Aug 17 22:51 snapshooter

/usr/share/solr/conf:
total 124
-rw-r--r-- 1 tomcat tomcat  1125 Aug 17 22:51 admin-extra.html
-rw-r--r-- 1 tomcat tomcat  1310 Aug 17 22:51 elevate.xml
-rw-r--r-- 1 tomcat tomcat   894 Aug 17 22:51 protwords.txt
-rw-r--r-- 1 tomcat tomcat 20083 Aug 17 22:51 schema.xml
-rw-r--r-- 1 tomcat tomcat   921 Aug 17 22:51 scripts.conf
-rw-r--r-- 1 tomcat tomcat 30281 Aug 17 22:53 solrconfig.xml
-rw-r--r-- 1 tomcat tomcat16 Aug 17 22:51 spellings.txt
-rw-r--r-- 1 tomcat tomcat  1226 Aug 17 22:51 stopwords.txt
-rw-r--r-- 1 tomcat tomcat  1163 Aug 17 22:51 synonyms.txt
drwxr-xr-x 2 tomcat tomcat  4096 Aug 17 22:51 xslt




RE: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Fuad Efendi
I forgot to add: compiler is inside "tools.jar" in some cases if I am
correct... doesn't matter really... try to access Tomcat default homepage
before trying to use SOLR!





The only difference between JRE and JDK (from TOMCAT viewpoint) is absence
of javac compiler for JSPs. But it will complain only if you try to use JSPs
(via admin console).

Have you tried to install SOLR on your localbox and play with samples
described at many WIKI pages?



-Original Message-
From: Aaron Aberg [mailto:aaronab...@gmail.com] 
Sent: August-18-09 9:04 PM
To: solr-user@lucene.apache.org
Subject: Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on
CentOS

Marco might be right about the JRE thing.
Here is my classpath entry when Tomcat starts up
java.library.path:
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/jav
a-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0
/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib

Constantijn,

Here is my solr home file list with permissions:

-bash-3.2$ ll /usr/share/solr/*
-rw-r--r-- 1 tomcat tomcat 2150 Aug 17 22:51 /usr/share/solr/README.txt

/usr/share/solr/bin:
total 160
-rwxr-xr-x 1 tomcat tomcat 4896 Aug 17 22:51 abc
-rwxr-xr-x 1 tomcat tomcat 4919 Aug 17 22:51 abo
-rwxr-xr-x 1 tomcat tomcat 2915 Aug 17 22:51 backup
-rwxr-xr-x 1 tomcat tomcat 3435 Aug 17 22:51 backupcleaner
-rwxr-xr-x 1 tomcat tomcat 3312 Aug 17 22:51 commit
-rwxr-xr-x 1 tomcat tomcat 3306 Aug 17 22:51 optimize
-rwxr-xr-x 1 tomcat tomcat 3163 Aug 17 22:51 readercycle
-rwxr-xr-x 1 tomcat tomcat 1752 Aug 17 22:51 rsyncd-disable
-rwxr-xr-x 1 tomcat tomcat 1740 Aug 17 22:51 rsyncd-enable
-rwxr-xr-x 1 tomcat tomcat 3508 Aug 17 22:51 rsyncd-start
-rwxr-xr-x 1 tomcat tomcat 2295 Aug 17 22:51 rsyncd-stop
-rwxr-xr-x 1 tomcat tomcat 2132 Aug 17 22:51 scripts-util
-rwxr-xr-x 1 tomcat tomcat 3775 Aug 17 22:51 snapcleaner
-rwxr-xr-x 1 tomcat tomcat 4994 Aug 17 22:51 snapinstaller
-rwxr-xr-x 1 tomcat tomcat 7980 Aug 17 22:51 snappuller
-rwxr-xr-x 1 tomcat tomcat 1768 Aug 17 22:51 snappuller-disable
-rwxr-xr-x 1 tomcat tomcat 1770 Aug 17 22:51 snappuller-enable
-rwxr-xr-x 1 tomcat tomcat 3269 Aug 17 22:51 snapshooter

/usr/share/solr/conf:
total 124
-rw-r--r-- 1 tomcat tomcat  1125 Aug 17 22:51 admin-extra.html
-rw-r--r-- 1 tomcat tomcat  1310 Aug 17 22:51 elevate.xml
-rw-r--r-- 1 tomcat tomcat   894 Aug 17 22:51 protwords.txt
-rw-r--r-- 1 tomcat tomcat 20083 Aug 17 22:51 schema.xml
-rw-r--r-- 1 tomcat tomcat   921 Aug 17 22:51 scripts.conf
-rw-r--r-- 1 tomcat tomcat 30281 Aug 17 22:53 solrconfig.xml
-rw-r--r-- 1 tomcat tomcat16 Aug 17 22:51 spellings.txt
-rw-r--r-- 1 tomcat tomcat  1226 Aug 17 22:51 stopwords.txt
-rw-r--r-- 1 tomcat tomcat  1163 Aug 17 22:51 synonyms.txt
drwxr-xr-x 2 tomcat tomcat  4096 Aug 17 22:51 xslt






RE: [ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search Server

2009-08-18 Thread Fuad Efendi
Some very smart guys at Hadoop even posted some discount codes on the WIKI, and
it's even possible to buy not-yet-published chapters in advance :) -
everything changes extremely quickly...


Why did you keep it a secret? Waiting for SOLR-4.1 :))) - do you still use
the outdated pre-1.4 "faceting" term in your book?

Congratulations!



-Original Message-
From: Smiley, David W. [mailto:dsmi...@mitre.org] 
Sent: August-18-09 10:10 AM
To: solr
Subject: [ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search
Server

Fellow Solr users,

I've finally finished the book "Solr 1.4 Enterprise Search Server" with my
co-author Eric.  We are proud to present the first book on Solr and hope you
find it a valuable resource.   You can find full details about the book and
purchase it here:
http://www.packtpub.com/solr-1-4-enterprise-search-server/book
It can be pre-ordered at a discount now and should be shipping within a week
or two.  The book is also available through Amazon.  You can feel good about
the purchase knowing that 5% of each sale goes to support the Apache
Software Foundation.  For a free sample, there is a portion of chapter 5
covering faceting available as an article online here:
http://www.packtpub.com/article/faceting-in-solr-1.4-enterprise-search-serve
r

By the way, we realize Solr 1.4 isn't out [quite] yet.  It is feature-frozen
however, and there's little in the forthcoming release that isn't covered in
our book.  About the only notable thing that comes to mind is the contrib
module on search result clustering.  However Eric plans to write a free
online article available from Packt Publishing on that very subject.

"Solr 1.4 Enterprise Search Server" In Detail:

If you are a developer building a high-traffic web site, you need to have a
terrific search engine. Sites like Netflix.com and Zappos.com employ Solr,
an open source enterprise search server, which uses and extends the Lucene
search library. This is the first book in the market on Solr and it will
show you how to optimize your web site for high volume web traffic with
full-text search capabilities along with loads of customization options. So,
let your users gain a terrific search experience

This book is a comprehensive reference guide for every feature Solr has to
offer. It serves the reader right from initiation to development to
deployment. It also comes with complete running examples to demonstrate its
use and show how to integrate it with other languages and frameworks.

This book first gives you a quick overview of Solr, and then gradually takes
you from basic to advanced features that enhance your search. It starts off
by discussing Solr and helping you understand how it fits into your
architecture-where all databases and document/web crawlers fall short, and
Solr shines. The main part of the book is a thorough exploration of nearly
every feature that Solr offers. To keep this interesting and realistic, we
use a large open source set of metadata about artists, releases, and tracks
courtesy of the MusicBrainz.org project. Using this data as a testing ground
for Solr, you will learn how to import this data in various ways from CSV to
XML to database access. You will then learn how to search this data in a
myriad of ways, including Solr's rich query syntax, "boosting" match scores
based on record data and other means, about searching across multiple fields
with different boosts, getting facets on the results, auto-complete user
queries, spell-correcting searches, highlighting queried text in search
results, and so on.

After this thorough tour, we'll demonstrate working examples of integrating
a variety of technologies with Solr such as Java, JavaScript, Drupal, Ruby,
XSLT, PHP, and Python.

Finally, we'll cover various deployment considerations to include indexing
strategies and performance-oriented configuration that will enable you to
scale Solr to meet the needs of a high-volume site.


Sincerely,

David Smiley (primary-author)
dsmi...@mitre.org
Eric Pugh (co-author)
ep...@opensourceconnections.com




Spanish Stemmer

2009-08-18 Thread Darien Rosa
Hello,

I am trying to configure Solr to index Spanish documents and I've found some 
problems with the Spanish stemmer. I have a basic install using Tomcat.

I suspect that the Spanish stemmer isn't working very well. The site 
http://snowball.tartarus.org/algorithms/spanish/stemmer.html shows a sample of 
Spanish vocabulary with the stemmed forms that will be generated with the 
algorithm. I tried with several of them and I didn't get the same result.

For example: the site says that the term "chicas" is stemmed as "chic". 
However, in my project, the term "chicas" is stemmed as "chica" (I can see it 
using Luke - Lucene Index Toolbox). I can't figure out where the problem is.

Here is a fragment of my schema.xml file:

   
   
   
   
   
   



Please, if someone can provide me any information related to this I would be 
very grateful.



Thanks in advance,



Darien



Universidad Central "Marta Abreu" de Las Villas. http://www.uclv.edu.cu



Re: Spanish Stemmer

2009-08-18 Thread Robert Muir
hi, it looks like you might just have a simple typo:

 

if you change it to language="Spanish" it should work.
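For reference, the corrected filter in context might look like this (the fieldType name, tokenizer, and other filters are assumptions based on a stock schema.xml, since the original fragment was lost):

```xml
<fieldType name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- the language value is case-sensitive: "Spanish", not "spanish" -->
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
  </analyzer>
</fieldType>
```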


-- 
Robert Muir
rcm...@gmail.com


Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Aaron Aberg
Tomcat is running fine. It's solr that is having the issue. I keep
seeing people talk about this:

-Dsolr.solr.home='/some/path'

Should I be putting that somewhere? Or is that already taken care of
when I edited the web.xml file in my solr.war file?

On Tue, Aug 18, 2009 at 7:29 PM, Fuad Efendi wrote:
> I forgot to add: compiler is inside "tools.jar" in some cases if I am
> correct... doesn't matter really... try to access Tomcat default homepage
> before trying to use SOLR!
>
>
>
> 
>
> The only difference between JRE and JDK (from TOMCAT viewpoint) is absence
> of javac compiler for JSPs. But it will complain only if you try to use JSPs
> (via admin console).
>
> Have you tried to install SOLR on your localbox and play with samples
> described at many WIKI pages?
>
>
>
> -Original Message-
> From: Aaron Aberg [mailto:aaronab...@gmail.com]
> Sent: August-18-09 9:04 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on
> CentOS
>
> Marco might be right about the JRE thing.
> Here is my classpath entry when Tomcat starts up
> java.library.path:
> /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/jav
> a-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0
> /jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
>
> Constantijn,
>
> Here is my solr home file list with permissions:
>
> -bash-3.2$ ll /usr/share/solr/*
> -rw-r--r-- 1 tomcat tomcat 2150 Aug 17 22:51 /usr/share/solr/README.txt
>
> /usr/share/solr/bin:
> total 160
> -rwxr-xr-x 1 tomcat tomcat 4896 Aug 17 22:51 abc
> -rwxr-xr-x 1 tomcat tomcat 4919 Aug 17 22:51 abo
> -rwxr-xr-x 1 tomcat tomcat 2915 Aug 17 22:51 backup
> -rwxr-xr-x 1 tomcat tomcat 3435 Aug 17 22:51 backupcleaner
> -rwxr-xr-x 1 tomcat tomcat 3312 Aug 17 22:51 commit
> -rwxr-xr-x 1 tomcat tomcat 3306 Aug 17 22:51 optimize
> -rwxr-xr-x 1 tomcat tomcat 3163 Aug 17 22:51 readercycle
> -rwxr-xr-x 1 tomcat tomcat 1752 Aug 17 22:51 rsyncd-disable
> -rwxr-xr-x 1 tomcat tomcat 1740 Aug 17 22:51 rsyncd-enable
> -rwxr-xr-x 1 tomcat tomcat 3508 Aug 17 22:51 rsyncd-start
> -rwxr-xr-x 1 tomcat tomcat 2295 Aug 17 22:51 rsyncd-stop
> -rwxr-xr-x 1 tomcat tomcat 2132 Aug 17 22:51 scripts-util
> -rwxr-xr-x 1 tomcat tomcat 3775 Aug 17 22:51 snapcleaner
> -rwxr-xr-x 1 tomcat tomcat 4994 Aug 17 22:51 snapinstaller
> -rwxr-xr-x 1 tomcat tomcat 7980 Aug 17 22:51 snappuller
> -rwxr-xr-x 1 tomcat tomcat 1768 Aug 17 22:51 snappuller-disable
> -rwxr-xr-x 1 tomcat tomcat 1770 Aug 17 22:51 snappuller-enable
> -rwxr-xr-x 1 tomcat tomcat 3269 Aug 17 22:51 snapshooter
>
> /usr/share/solr/conf:
> total 124
> -rw-r--r-- 1 tomcat tomcat  1125 Aug 17 22:51 admin-extra.html
> -rw-r--r-- 1 tomcat tomcat  1310 Aug 17 22:51 elevate.xml
> -rw-r--r-- 1 tomcat tomcat   894 Aug 17 22:51 protwords.txt
> -rw-r--r-- 1 tomcat tomcat 20083 Aug 17 22:51 schema.xml
> -rw-r--r-- 1 tomcat tomcat   921 Aug 17 22:51 scripts.conf
> -rw-r--r-- 1 tomcat tomcat 30281 Aug 17 22:53 solrconfig.xml
> -rw-r--r-- 1 tomcat tomcat    16 Aug 17 22:51 spellings.txt
> -rw-r--r-- 1 tomcat tomcat  1226 Aug 17 22:51 stopwords.txt
> -rw-r--r-- 1 tomcat tomcat  1163 Aug 17 22:51 synonyms.txt
> drwxr-xr-x 2 tomcat tomcat  4096 Aug 17 22:51 xslt
>
>
>
>
>


RE: [ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search Server

2009-08-18 Thread Smiley, David W.
Hi Fuad.

It's true I didn't publicize its release beforehand; I have no idea if it is 
normal to do so or not.  I guess I'm a bit shy.

I honestly have no clue what you're referring to as the successor to the 
"faceting" term.

~ David Smiley

From: Fuad Efendi [f...@efendi.ca]
Sent: Tuesday, August 18, 2009 10:39 PM
To: solr-user@lucene.apache.org
Subject: RE: [ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search 
Server

Some very smart guys at Hadoop even posted some discount codes at WIKI, and
it's even possible to buy in-advance not published yet chapters :) -
everything changes extremely quick...


Why did you keep it a secret? Waiting for SOLR-4.1 :))) - do you still use
outdated pre-1.4 "faceting" term in your book?

Congratulations!



-Original Message-
From: Smiley, David W. [mailto:dsmi...@mitre.org]
Sent: August-18-09 10:10 AM
To: solr
Subject: [ANNOUNCEMENT] Newly released book: Solr 1.4 Enterprise Search
Server

Fellow Solr users,

I've finally finished the book "Solr 1.4 Enterprise Search Server" with my
co-author Eric.  We are proud to present the first book on Solr and hope you
find it a valuable resource.   You can find full details about the book and
purchase it here:
http://www.packtpub.com/solr-1-4-enterprise-search-server/book
It can be pre-ordered at a discount now and should be shipping within a week
or two.  The book is also available through Amazon.  You can feel good about
the purchase knowing that 5% of each sale goes to support the Apache
Software Foundation.  For a free sample, there is a portion of chapter 5
covering faceting available as an article online here:
http://www.packtpub.com/article/faceting-in-solr-1.4-enterprise-search-serve
r

By the way, we realize Solr 1.4 isn't out [quite] yet.  It is feature-frozen
however, and there's little in the forthcoming release that isn't covered in
our book.  About the only notable thing that comes to mind is the contrib
module on search result clustering.  However Eric plans to write a free
online article available from Packt Publishing on that very subject.

"Solr 1.4 Enterprise Search Server" In Detail:

If you are a developer building a high-traffic web site, you need to have a
terrific search engine. Sites like Netflix.com and Zappos.com employ Solr,
an open source enterprise search server, which uses and extends the Lucene
search library. This is the first book in the market on Solr and it will
show you how to optimize your web site for high volume web traffic with
full-text search capabilities along with loads of customization options. So,
let your users gain a terrific search experience

This book is a comprehensive reference guide for every feature Solr has to
offer. It serves the reader right from initiation to development to
deployment. It also comes with complete running examples to demonstrate its
use and show how to integrate it with other languages and frameworks

This book first gives you a quick overview of Solr, and then gradually takes
you from basic to advanced features that enhance your search. It starts off
by discussing Solr and helping you understand how it fits into your
architecture-where all databases and document/web crawlers fall short, and
Solr shines. The main part of the book is a thorough exploration of nearly
every feature that Solr offers. To keep this interesting and realistic, we
use a large open source set of metadata about artists, releases, and tracks
courtesy of the MusicBrainz.org project. Using this data as a testing ground
for Solr, you will learn how to import this data in various ways from CSV to
XML to database access. You will then learn how to search this data in a
myriad of ways, including Solr's rich query syntax, "boosting" match scores
based on record data and other means, about searching across multiple fields
with different boosts, getting facets on the results, auto-complete user
queries, spell-correcting searches, highlighting queried text in search
results, and so on.

After this thorough tour, we'll demonstrate working examples of integrating
a variety of technologies with Solr such as Java, JavaScript, Drupal, Ruby,
XSLT, PHP, and Python.

Finally, we'll cover various deployment considerations to include indexing
strategies and performance-oriented configuration that will enable you to
scale Solr to meet the needs of a high-volume site


Sincerely,

David Smiley (primary-author)
dsmi...@mitre.org
Eric Pugh (co-author)
ep...@opensourceconnections.com

Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Chris Hostetter

: -Dsolr.solr.home='/some/path'
: 
: Should I be putting that somewhere? Or is that already taken care of
: when I edited the web.xml file in my solr.war file?

No ... you do not need to set that system property if you already have it 
working because of modifications to the web.xml ... according to the log 
you posted earlier, Solr is seeing your solr home dir set correctly...

Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader 
locateInstanceDir
INFO: Using JNDI solr.home: /usr/share/solr
Aug 17, 2009 11:16:15 PM org.apache.solr.core.CoreContainer$Initializer 
initialize
INFO: looking for solr.xml: /usr/share/solr/solr.xml
Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/usr/share/solr/'

...that's were you want it to point, correct?

(don't be confused by the later message of "Check solr/home property" ... 
that's just a hint because 9 times out of 10 an error initializing solr 
comes from solr needing to *guess* about the solr home dir)

The crux of your error is the failure to load an XPathFactory; the fact 
that it can't load an XPath factory prevents your 
classloader from even being able to load the SolrConfig class -- note this 
also in the log you posted earlier...

java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.solr.core.SolrConfig 

...the root of the problem is here...

Caused by: java.lang.RuntimeException: XPathFactory#newInstance()
failed to create an XPathFactory for the default object model:
http://java.sun.com/jaxp/xpath/dom with the
XPathFactoryConfigurationException:
javax.xml.xpath.XPathFactoryConfigurationException: No XPathFctory
implementation found for the object model:
http://java.sun.com/jaxp/xpath/dom
at javax.xml.xpath.XPathFactory.newInstance(Unknown Source)
at org.apache.solr.core.Config.<init>(Config.java:41) 

XPathFactory.newInstance() is used to construct an instance of an 
XPathFactory where the concrete type is unknown by the caller (in this 
case: solr).  There is an alternate form (XPathFactory.newInstance(String 
uri)) which allows callers to specify *which* model they want, and it can 
throw an exception if the model isn't available in the current JVM using 
reflection, but if you read the javadocs for the method being called...

http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/xpath/XPathFactory.html#newInstance()
   Get a new XPathFactory instance using the default object model, 
   DEFAULT_OBJECT_MODEL_URI, the W3C DOM.

   This method is functionally equivalent to:

  newInstance(DEFAULT_OBJECT_MODEL_URI)
 
   Since the implementation for the W3C DOM is always available, this 
   method will never fail.

...except that in your case, it is in fact clearly failing, which 
suggests that your hosting provider has given you a crappy JVM.  I have no 
good suggestions for debugging this, other than this google link...

http://www.google.com/search?q=+No+XPathFctory+implementation+found+for+the+object+model%3A+http%3A%2F%2Fjava.sun.com%2Fjaxp%2Fxpath%2Fdom

The good news is, there isn't anything solr specific about this problem.  
Any servlet container giving you that error when you load solr, should 
cause the exact same error with a servlet as simple as this...

  public class TestServlet extends javax.servlet.http.HttpServlet {
    public static Object X = javax.xml.xpath.XPathFactory.newInstance();
    public void doGet (javax.servlet.http.HttpServletRequest req,
                       javax.servlet.http.HttpServletResponse res) {
      // NOOP
    }
  }

...which should provide you with a nice short bug report for your hosting 
provider.
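If it helps, that servlet could be wired up with a web.xml fragment along these lines (the servlet and mapping names here are arbitrary):

```xml
<servlet>
  <servlet-name>xpath-test</servlet-name>
  <servlet-class>TestServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>xpath-test</servlet-name>
  <url-pattern>/xpath-test</url-pattern>
</servlet-mapping>
```

Deploying it and hitting /xpath-test should reproduce the XPathFactory failure at class-load time if the JVM is the culprit.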

One last important note (because it may burn you once you get the XPath 
problem resolved).  you mentioned this before...


: > Here is my solr home file list with permissions:
: >
: > -bash-3.2$ ll /usr/share/solr/*
: > -rw-r--r-- 1 tomcat tomcat 2150 Aug 17 22:51 /usr/share/solr/README.txt
: >
: > /usr/share/solr/bin:
...

that all looks fine, but the information is incomplete.  you didn't 
include the permissions for the directories themselves (/usr/share/solr, 
/usr/share/solr/conf, and /usr/share/solr/bin).  they need to be readable 
by tomcat (they probably are) but most importantly /usr/share/solr needs 
to be writable by tomcat so that solr can create the data directory for 
you.


-Hoss



RE: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Fuad Efendi
>>-Dsolr.solr.home='/some/path'


CORRECT:
-Dsolr.data.dir=..

It should be in java startup parameters; for instance, JAVA_OPTS="-server
-Xms32768M -Xmx32768M -Dsolr.data.dir=/some/path" inside catalina.sh as a
first statement...


According to the logs you posted, the mistake is probably in solr.xml, which is
the multicore definition; you should post its content here.

Java 1.4/5/6 supports "nested exceptions".

The root cause of your problem:
java.lang.NoClassDefFoundError: org.apache.solr.core.Config

This exception causes another one:
javax.xml.xpath.XPathFactoryConfigurationException: No XPathFactory
implementation found for the object model:
http://java.sun.com/jaxp/xpath/dom
at javax.xml.xpath.XPathFactory.newInstance(Unknown Source)
at org.apache.solr.core.Config.<init>(Config.java:41)

etc. etc. etc.


NoClassDefFoundError means: the classloader didn't have any problem finding
the class definition, but it couldn't "define" it, due for instance to a
dependency on another library and/or on a configuration file such as solr.xml,
etc. 
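That distinction can be reproduced in isolation: a class whose static initialization fails is "found" but cannot be defined, and every later use of it throws NoClassDefFoundError. A self-contained sketch (class and message names are made up for illustration):

```java
public class NoClassDefDemo {
    static class Broken {
        // Simulates a class whose initialization fails (e.g. a missing config).
        // The if(true) defeats the "must complete normally" compile-time check.
        static { if (true) throw new RuntimeException("config missing"); }
    }

    // Tries to use Broken and reports which Throwable the JVM raised.
    static String tryUse() {
        try {
            new Broken();
            return "ok";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryUse()); // first use: ExceptionInInitializerError
        System.out.println(tryUse()); // later uses: NoClassDefFoundError
    }
}
```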

XPath should be called on DOM (after Config is properly initialized)


It is difficult to explain what is wrong with the mess of files in your config
(you are obviously using two cores) - you should do the following:
1. Install Tomcat
2. Copy SOLR war file to webapps folder
3. Start Tomcat and verify logs; ensure that you have some clear messages in
it (SOLR should use default home? Verify!)
4. Configure SOLR-home with sample solrconfig.xml and schema.xml, restart,
verify 
... ... ...


Don't go to multicore until you have played enough with the simplest SOLR installation


-Original Message-
From: Aaron Aberg [mailto:aaronab...@gmail.com] 
Sent: August-19-09 12:28 AM
To: solr-user@lucene.apache.org
Subject: Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on
CentOS

Tomcat is running fine. It's solr that is having the issue. I keep
seeing people talk about this:

-Dsolr.solr.home='/some/path'

Should I be putting that somewhere? Or is that already taken care of
when I edited the web.xml file in my solr.war file?

On Tue, Aug 18, 2009 at 7:29 PM, Fuad Efendi wrote:
> I forgot to add: compiler is inside "tools.jar" in some cases if I am
> correct... doesn't matter really... try to access Tomcat default homepage
> before trying to use SOLR!
>
>
>
> 
>
> The only difference between JRE and JDK (from TOMCAT viewpoint) is absence
> of javac compiler for JSPs. But it will complain only if you try to use
JSPs
> (via admin console).
>
> Have you tried to install SOLR on your localbox and play with samples
> described at many WIKI pages?
>
>
>
> -Original Message-
> From: Aaron Aberg [mailto:aaronab...@gmail.com]
> Sent: August-18-09 9:04 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on
> CentOS
>
> Marco might be right about the JRE thing.
> Here is my classpath entry when Tomcat starts up
> java.library.path:
>
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/jav
>
a-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0
> /jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
>
> Constantijn,
>
> Here is my solr home file list with permissions:
>
> -bash-3.2$ ll /usr/share/solr/*
> -rw-r--r-- 1 tomcat tomcat 2150 Aug 17 22:51 /usr/share/solr/README.txt
>
> /usr/share/solr/bin:
> total 160
> -rwxr-xr-x 1 tomcat tomcat 4896 Aug 17 22:51 abc
> -rwxr-xr-x 1 tomcat tomcat 4919 Aug 17 22:51 abo
> -rwxr-xr-x 1 tomcat tomcat 2915 Aug 17 22:51 backup
> -rwxr-xr-x 1 tomcat tomcat 3435 Aug 17 22:51 backupcleaner
> -rwxr-xr-x 1 tomcat tomcat 3312 Aug 17 22:51 commit
> -rwxr-xr-x 1 tomcat tomcat 3306 Aug 17 22:51 optimize
> -rwxr-xr-x 1 tomcat tomcat 3163 Aug 17 22:51 readercycle
> -rwxr-xr-x 1 tomcat tomcat 1752 Aug 17 22:51 rsyncd-disable
> -rwxr-xr-x 1 tomcat tomcat 1740 Aug 17 22:51 rsyncd-enable
> -rwxr-xr-x 1 tomcat tomcat 3508 Aug 17 22:51 rsyncd-start
> -rwxr-xr-x 1 tomcat tomcat 2295 Aug 17 22:51 rsyncd-stop
> -rwxr-xr-x 1 tomcat tomcat 2132 Aug 17 22:51 scripts-util
> -rwxr-xr-x 1 tomcat tomcat 3775 Aug 17 22:51 snapcleaner
> -rwxr-xr-x 1 tomcat tomcat 4994 Aug 17 22:51 snapinstaller
> -rwxr-xr-x 1 tomcat tomcat 7980 Aug 17 22:51 snappuller
> -rwxr-xr-x 1 tomcat tomcat 1768 Aug 17 22:51 snappuller-disable
> -rwxr-xr-x 1 tomcat tomcat 1770 Aug 17 22:51 snappuller-enable
> -rwxr-xr-x 1 tomcat tomcat 3269 Aug 17 22:51 snapshooter
>
> /usr/share/solr/conf:
> total 124
> -rw-r--r-- 1 tomcat tomcat  1125 Aug 17 22:51 admin-extra.html
> -rw-r--r-- 1 tomcat tomcat  1310 Aug 17 22:51 elevate.xml
> -rw-r--r-- 1 tomcat tomcat   894 Aug 17 22:51 protwords.txt
> -rw-r--r-- 1 tomcat tomcat 20083 Aug 17 22:51 schema.xml
> -rw-r--r-- 1 tomcat tomcat   921 Aug 17 22:51 scripts.conf
> -rw-r--r-- 1 tomcat tomcat 30281 Aug 17 22:53 solrconfig.xml
> -rw-r--r-- 1 tomcat tomcat    16 Aug 17 22:51 spellings.txt
> -rw-r--r-- 1 tomcat tomcat  1226 Aug 17 22:51 stopwords.txt
> -rw-r-

Re: dynamic changes to schema

2009-08-18 Thread Constantijn Visinescu
huh? I think I lost you :)
You want to use a multivalued field to list which dynamic fields you have in
your document?

Also, if you program your application correctly, you should be able to
restrict your users from doing anything you don't want them to do.
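For reference, a dynamicField declaration in schema.xml matches field names by pattern, which is why some naming convention on the user-supplied names is needed — a sketch (the *_s suffix is just an example, not from this thread):

```xml
<!-- Any incoming field whose name ends in _s is accepted and indexed as a string. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
```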


On Tue, Aug 18, 2009 at 11:38 PM, Marco Westermann  wrote:

> hi,
>
> thanks for the advice, but the problem with dynamic fields is that I cannot
> restrict what the user names the field in the application. So there isn't a
> pattern I can use. But I thought about using multivalued fields for the
> dynamically added fields. Good idea?
>
> thanks,
> Marco
>
> Constantijn Visinescu schrieb:
>
>> use a dynamic field ?
>>
>> On Tue, Aug 18, 2009 at 5:09 PM, Marco Westermann 
>> wrote:
>>
>>
>>
>>> Hi there,
>>>
>>> is there a possibility to change the Solr schema dynamically via PHP?
>>> The
>>> web application I want to index at the moment has the feature to add
>>> fields
>>> to entities, and you can mark these fields as searchable. To
>>> realize
>>> this with Solr, the schema has to change when a searchable field is added
>>> or
>>> removed.
>>>
>>> Any suggestions,
>>>
>>> Thanks a lot,
>>>
>>> Marco Westermann
>>>
>>> --
>>> ++ Business-Software aus einer Hand ++
>>> ++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
>>> http://www.intersales.de
>>> http://www.eisxen.org
>>> http://www.tarantella-partner.de
>>> http://www.medisales.de
>>> http://www.eisfair.net
>>>
>>> interSales AG Internet Commerce
>>> Subbelrather Str. 247
>>> 50825 Köln
>>>
>>> Tel  02 21 - 27 90 50
>>> Fax  02 21 - 27 90 517
>>> Mail i...@intersales.de
>>> Mail m...@intersales.de
>>> Web  www.intersales.de
>>>
>>> Handelsregister Köln HR B 30904
>>> Ust.-Id.: DE199672015
>>> Finanzamt Köln-Nord. UstID: nicht vergeben
>>> Aufsichtsratsvorsitzender: Michael Morgenstern
>>> Vorstand: Andrej Radonic, Peter Zander
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>
>
>


RE: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Fuad Efendi
DO NOT RELY on your hosting provider. They use automated tools that make a
complete mess of the versions of Lucene, the Servlet API, the java.util.*
package, etc. that are approved for production on CentOS; look at this:

> Here is my classpath entry when Tomcat starts up
> java.library.path:
> /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/j
> vm/jav 
> a-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1
> .6.0.0 /jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib


Who is the vendor of this "openjdk-1.6.0.0"? Who is the vendor of the JVM this
JDK runs on? Are you using "client" when you really need a "server" environment?
Is it HotSpot? Is your platform really i386?

I mentioned in a previous post that such Java installs are a total mess; you
may have an incompatible Servlet API loaded by the bootstrap classloader before
the Tomcat classes, etc.


Install everything from scratch.



=???
INFO: Adding 'file:/usr/share/tomcat5/solr/lib/jetty-util-6.1.3.jar' to Solr
classloader
=???








Re: DataImportHandler ignoring most rows

2009-08-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
this comment says that
   7

the query fetched only 7 rows. If possible, open a database tool, run the
same query, and see how many rows are returned.

On Wed, Aug 19, 2009 at 3:46 AM, Erik Earle wrote:
> Using:
> - apache-solr-1.3.0
> - java 1.6
> - tomcat 6
> - sql server 2005 w/ JSQLConnect 4.0 driver
>
> I have a group table with 3007 rows.  I have confirmed the key is
> unique with "select distinct id from group"  and it returns 3007.  When i 
> re-index using http://host:port/solr/dataimport?command=full-import  I only 
> get 7 records indexed.  Any insight into what is going on would be really 
> great.
>
> A partial response:
>    
>    1
>    7
>    0
>
>
> I have other entities that index all the rows without issue.
>
> There are no errors in the logs.
>
> I am not using any Transformers (and most of my config is not changed from 
> install)
>
> My schema.xml contains:
>
>     key
>
> and field defs (not a full list of fields):
>    required="true" />
>    required="true" />
>   
>   
>   
>   
>
> data-config.xml
> 
> 
>    
>            driver="com.jnetdirect.jsql.JSQLDriver"
>        
> url="jdbc:JSQLConnect://se-eriearle-lt1/database=SocialSite2/user=SocialSite2"
>        user="SocialSite2"
>        password="SocialSite2" />
>
>    
>
>    
>
>                    query="select 'group.'+id as 'key', 'group' as 'class', name, 
> handle, description, created, updated from group order by created asc">
>        
>
>                    query="<...redacted...>">
>        
>
>    
> 
>
>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Replication over multi-core solr

2009-08-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Aug 19, 2009 at 2:27 AM, vivek sar wrote:
> Hi,
>
>  We use multi-core setup for Solr, where new cores are added
> dynamically to solr.xml. Only one core is active at a time. My
> question is how can the replication be done for multi-core - so every
> core is replicated on the slave?

replication does not handle new core creation. You will have to issue
the core creation command to each slave separately.
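That core creation command is an HTTP call to the CoreAdmin handler on each slave — roughly like this (host, port, and names are placeholders):

```
http://slave-host:8983/solr/admin/cores?action=CREATE&name=newcore&instanceDir=newcore
```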
>
> I went over the wiki, http://wiki.apache.org/solr/SolrReplication,
> and few questions related to that,
>
> 1) How do we replicate solr.xml where we have list of cores? Wiki
> says, "Only files in the 'conf' dir of solr instance is replicated. "
> - since, solr.xml is in the home directory how do we replicate that?
solr.xml cannot be replicated; even if it were, it would not be reloaded.
>
> 2) Solrconfig.xml in slave takes a static core url,
>
>    http://localhost:port/solr/corename/replication

put a placeholder like
http://localhost:port/solr/${solr.core.name}/replication
so the corename is automatically replaced
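In solrconfig.xml that placeholder sits in the slave section of the replication handler — a sketch with placeholder host, port, and poll interval:

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- ${solr.core.name} is substituted per core when the config is loaded -->
    <str name="masterUrl">http://master-host:8983/solr/${solr.core.name}/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```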

>
> As in our case cores are created dynamically (new core created after
> the active one reaches some capacity), how can we define master core
> dynamically for replication? The only way I see is using the "fetchIndex"
> command and passing the new core info there - is that right? If so, the
> slave application would have to write code to poll the Master periodically
> and fire the "fetchIndex" command, but how would the Slave know the Master
> corename, as the cores are created dynamically on the Master?
>
> Thanks,
> -vivek
>
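For the fetchIndex route raised in the question above, the replication handler accepts the master URL as a request parameter, so a slave can pull from a dynamically created core — roughly (hosts and core names are placeholders):

```
http://slave-host:8983/solr/corename/replication?command=fetchindex&masterUrl=http://master-host:8983/solr/corename/replication
```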



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com