Re: dataimporthandler and mysql connector jar

2008-08-26 Thread Walter Ferrara
Shalin Shekhar Mangar wrote:
> Can you please open a JIRA issue for this? However, we may only be able to
> fix this after 1.3 because a code freeze has been decided upon, to release
> 1.3 asap.
>   
I've opened https://issues.apache.org/jira/browse/SOLR-726

Walter



Re: dataimporthandler and mysql connector jar

2008-08-26 Thread Shalin Shekhar Mangar
Thanks Walter. I shall work up a patch soon.

On Tue, Aug 26, 2008 at 1:22 PM, Walter Ferrara <[EMAIL PROTECTED]>wrote:

> Shalin Shekhar Mangar wrote:
> > Can you please open a JIRA issue for this? However, we may only be able
> to
> > fix this after 1.3 because a code freeze has been decided upon, to
> release
> > 1.3 asap.
> >
> I've open https://issues.apache.org/jira/browse/SOLR-726
>
> Walter
>
>


-- 
Regards,
Shalin Shekhar Mangar.


dataimporthandler and multiple delta-import

2008-08-26 Thread Walter Ferrara
I'm using DIH and its wonderful delta-import.
I have a question: is delta-import synchronized? Shouldn't multiple calls to
delta-import result in all but one being refused because the status is not
idle?
I've noticed, however, that calling dataimport/?command=delta-import several
times within a second results in a strange exception:

GRAVE: Delta Import Failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select entity from testtable where last_modified > '2008-08-26 13:05:09' Processing Document # 1
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:171)
        at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:128)
        at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:41)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextModifiedRowKey(SqlEntityProcessor.java:92)
        at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:479)
        at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:192)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:131)
        at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:357)
        at org.apache.solr.handler.dataimport.DataImporter.rumCmd(DataImporter.java:386)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:375)
Caused by: com.mysql.jdbc.exceptions.MySQLNonTransientConnectionException: No operations allowed after connection closed.
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:888)
        at com.mysql.jdbc.Connection.checkClosed(Connection.java:1930)
        at com.mysql.jdbc.Connection.createStatement(Connection.java:3094)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:159)
        ... 10 more

Calling delta-import, waiting a bit, and then calling it again
works fine.

thanks,
Walter



Re: dataimporthandler and multiple delta-import

2008-08-26 Thread Shalin Shekhar Mangar
Hi Walter,

Indeed, there's a race condition there because we didn't expect people to
hit it concurrently. We expected that imports would be run sequentially.

Thanks for noticing this. We shall add synchronization to the next release.
Do you mind (again) opening an issue for this? We'll attach a patch soon.



-- 
Regards,
Shalin Shekhar Mangar.


Question about search suggestion

2008-08-26 Thread Aleksey Gogolev

Hello.

I'm new to Solr and I need to build search suggestions (like Google
suggestions).

As a first approach I decided to do it this way:

1. A script queries Solr using the words typed by the user.
2. The script then analyses the docs Solr returns and builds the suggestions.

But I ran into difficulties.

For example, the user is typing the word "samsung". So far he has
typed only part of the word: "samsu".

So according to my plan, I make a query with the word "samsu" and I expect
to get docs which contain the substring "samsu". But Solr doesn't find any
docs for this request (these docs exist, I checked). I guess I need
to tweak some settings, but I cannot find the corresponding options.

Thanks for your attention.

-- 
Aleksey Gogolev
developer, 
dev.co.ua
Aleksey



Re: dataimporthandler and multiple delta-import

2008-08-26 Thread Walter Ferrara
Shalin Shekhar Mangar wrote:
> Hi Walter,
>
> Indeed, there's a race condition there because we didn't expect people to
> hit it concurrently. We expected that imports would be run sequentially.
>
> Thanks for noticing this. We shall add synchronization to the next release.
> Do you mind (again) opening an issue for this? We'll attach a patch soon.
>   
no problem! I've opened https://issues.apache.org/jira/browse/SOLR-728
I do understand that imports should be run sequentially. The
main issue I can foresee is a delta-import triggered via curl from crontab:
curl has no way to know whether the previous delta-import has actually
finished. In my opinion, if a (delta|full)-import is already running, the
handler should state that it cannot go ahead because another import is
in progress.

thank you for your fast reply and all your work in solr,
Walter



SolrJ - SolrServer#commit() doesn't return

2008-08-26 Thread Machisuji

Hey.

I've been working with Solr for a few days now, and as long as I wasn't
working with too much data everything was alright.

However, now that I want to index all of the data, I've got problems with
SolrJ not returning from a call to CommonsHttpSolrServer's commit().
I am trying to upload data from online shops: to be more precise, the name,
category, price and description of tens of millions of items.
After a few million items the call to commit() doesn't return anymore and
simply does nothing.
At least the CPU usage on the computer running the Solr server falls to 0%.

I always add 10,000 items at a time by calling
SolrServer#add(Collection) followed by
SolrServer#commit().
Does anyone have an idea what the problem could be here?
-- 
View this message in context: 
http://www.nabble.com/SolrJ---SolrServer-commit%28%29-doesn%27t-return-tp19161293p19161293.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: dataimporthandler and multiple delta-import

2008-08-26 Thread Shalin Shekhar Mangar
On Tue, Aug 26, 2008 at 5:52 PM, Walter Ferrara <[EMAIL PROTECTED]>wrote:

> no problem! I've opened https://issues.apache.org/jira/browse/SOLR-728


Thanks!

>
> I do understand the fact that import should be run sequentially, the
> main issue I can foresee is a delta-import via curl in crontab, that
> curl have no way to know if previous delta import was effectively over
> -- in my opinion, if there is a (delta|full)-import already running it
> should state that it cannot go ahead because another import process is
> running already.
>

It already does that. The "importResponse" element in the output says "A
command is still running..." and the "status" element says "busy" if an
import is already in progress. It is just not synchronized properly
against multiple import requests arriving concurrently. A 5-10 ms gap between
import requests would avoid the exception that you encountered.
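
The race described here is a check-then-act gap: two requests can both observe
"idle" before either one marks the handler as busy. A minimal sketch of closing
that gap with an atomic compare-and-set follows. It is not the patch that was
attached to SOLR-728; the class and method names (ImportGuard, tryStart, finish)
are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard: at most one import may hold it at a time. */
final class ImportGuard {
    private final AtomicBoolean busy = new AtomicBoolean(false);

    /** Atomically flips idle -> busy; returns false if an import is already running. */
    boolean tryStart() {
        return busy.compareAndSet(false, true);
    }

    /** Marks the import as finished so that the next request can start. */
    void finish() {
        busy.set(false);
    }

    /** Mirrors the "busy" / "idle" values mentioned for the status element. */
    String status() {
        return busy.get() ? "busy" : "idle";
    }
}
```

A caller would check tryStart() first, report "busy" when it returns false, and
otherwise run the import inside try/finally so that finish() is always called.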

-- 
Regards,
Shalin Shekhar Mangar.


Re: Question about search suggestion

2008-08-26 Thread Shalin Shekhar Mangar
Hi Aleksey,

Welcome to Solr!

You should append a wildcard at the end, e.g. search for "samsu*", which
should match "samsung".
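
A small sketch of how a suggestion script might turn the partially typed input
into such a wildcard query. The helper name and the field name "name" are
assumptions, not anything from Aleksey's setup. The input is lowercased because
wildcard terms may bypass the field's analyzer, in which case "Samsu*" would not
match a lowercased index.

```java
import java.util.Locale;

/** Hypothetical helper for building prefix (wildcard) queries. */
final class SuggestQuery {
    /**
     * Builds field:prefix* from the last word the user has typed so far.
     * Returns null when there is nothing to search for yet.
     */
    static String prefixQuery(String field, String typed) {
        String[] words = typed.trim().split("\\s+");
        String last = words[words.length - 1].toLowerCase(Locale.ROOT);
        if (last.isEmpty()) {
            return null;
        }
        return field + ":" + last + "*";
    }
}
```

For the example above, SuggestQuery.prefixQuery("name", "samsu") yields the
query name:samsu*.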



-- 
Regards,
Shalin Shekhar Mangar.


Re: Question about search suggestion

2008-08-26 Thread Norberto Meijome
On Tue, 26 Aug 2008 15:15:21 +0300
Aleksey Gogolev <[EMAIL PROTECTED]> wrote:

> 
> Hello.
> 
> I'm new to solr and I need to make a search suggest (like google
> suggestions).
> 

Hi Aleksey,
please search the archives of this list for subjects containing 'autocomplete'
or 'auto-suggest'. that should give you a few ideas and starting points.

best,
B

_
{Beto|Norberto|Numard} Meijome

"The more I see the less I know for sure." 
  John Lennon

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


in a RequestHandler's init, how to get solr data dir?

2008-08-26 Thread Brian Whitman
I want to be able to store non-solr data in solr's data directory
(like solr/solr/data/stored alongside solr/solr/data/index).
The java class that sets up this data is instantiated from a
RequestHandlerBase class like:


public class StoreDataHandler extends RequestHandlerBase {
  StoredData storedData;

  @Override
  public void init(NamedList args)
  {
    super.init(args);
    String dataDirectory = 
    storedData = new StoredData(dataDirectory);
  }

  @Override
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception

...

req.getCore() etc will eventually get me solr's data directory
location, but how do I get it in the init method? I want to init the
data store once on solr launch, not on every call.


What do I replace those  above with?





RE: Less aggressive stemmer?

2008-08-26 Thread Wagner,Harry
OK. I put it here
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem and
linked it from the stemming paragraph found here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters  

Cheers!
harry

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Monday, August 25, 2008 3:20 PM
To: solr-user@lucene.apache.org
Subject: Re: Less aggressive stemmer?

I'd create a new page and link it from, perhaps, the page about stemming
if there is one or from the page about analyzers/tokens/filters.

 
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: "Wagner,Harry" <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, August 25, 2008 1:57:38 PM
> Subject: RE: Less aggressive stemmer?
> 
> Otis,
> I'd be happy to. Where do you think the best place to put this is -
> under 'hacking Solr' or with the other stemming text?
> 
> -Original Message-
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
> Sent: Friday, August 22, 2008 12:24 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Less aggressive stemmer?
> 
> It won't be integrated in Solr 1.3, I believe, because of KStem's
> license.  But we should document what the Factory for it can look like,
> perhaps by posting it on the Wiki.  Harry, if you have the code handy,
> feel free to post it on the Solr Wiki somewhere.
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
> > From: "Wagner,Harry" 
> > To: solr-user@lucene.apache.org
> > Sent: Friday, August 22, 2008 8:40:18 AM
> > Subject: RE: Less aggressive stemmer?
> > 
> > We use KStem also and are very happy with it.  I think it has been
> > integrated into Solr and will be included in 1.3 (someone please correct
> > me if this is not the case). You should be able to get it from the
> > nightly builds now.
> > 
> > Cheers!
> > harry
> > 
> > -Original Message-
> > From: Kevin Osborn [mailto:[EMAIL PROTECTED] 
> > Sent: Thursday, August 21, 2008 5:30 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Less aggressive stemmer?
> > 
> > We had similar problems and then switched to KStem and have been pretty
> > happy with the results.
> > 
> > http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi
> > 
> > 
> > 
> > - Original Message 
> > From: Jason Rennie 
> > To: solr-user@lucene.apache.org
> > Sent: Thursday, August 21, 2008 2:23:36 PM
> > Subject: Less aggressive stemmer?
> > 
> > Is there an option to perform less aggressive stemming in solr?  We're using
> > the Porter stemmer.  I see that there is an option for Snowball, but my
> > understanding is that Snowball is a refinement of Porter rather than
> > something radically different.  I think we'd be best off with something very
> > basic, possibly as simple as removing plural endings.  Our index is over
> > product descriptions, so it's important that we stem normal variations in
> > nouns, but adverbs, verbs and possibly adjective variations are not so
> > important and sometimes cause problems for us.
> > 
> > Jason





Re: SolrJ - SolrServer#commit() doesn't return

2008-08-26 Thread Alexander Ramos Jardim
Are you using any postCommit or postOptimize event listeners?

I had some problems using them: I ran into a scenario where the
commit/optimize thread never ended.



-- 
Alexander Ramos Jardim


Re: in a RequestHandler's init, how to get solr data dir?

2008-08-26 Thread Shalin Shekhar Mangar
Hi Brian,

You can implement the SolrCoreAware interface which will give you access to
the SolrCore object through the SolrCoreAware#inform method you will need to
implement. It is called after the init method.
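
A sketch of the ordering this relies on: init() runs first, without a core, and
inform() runs afterwards with the core available, so the data store is created
there. To keep the sketch compilable without Solr on the classpath, Core and
CoreAware below are stand-ins; in a real plugin they would be
org.apache.solr.core.SolrCore and org.apache.solr.util.plugin.SolrCoreAware, and
the handler would still extend RequestHandlerBase. The "stored" subdirectory
comes from Brian's question; everything else is illustrative.

```java
import java.io.File;

// Stand-in for SolrCore: only the data directory accessor is modelled.
interface Core {
    String getDataDir();
}

// Stand-in for SolrCoreAware: a single callback that receives the core.
interface CoreAware {
    void inform(Core core);
}

/** Hypothetical handler that defers store creation until the core is known. */
final class StoreDataHandlerSketch implements CoreAware {
    // Stands in for the StoredData object from the question.
    private String storedDataPath;

    /** Called first; only handler arguments are available here, no core. */
    void init() {
        // Nothing that needs the data directory can happen yet.
    }

    /** Called once after init(), when the core and its data dir are known. */
    @Override
    public void inform(Core core) {
        storedDataPath = new File(core.getDataDir(), "stored").getPath();
    }

    String storedDataPath() {
        return storedDataPath;
    }
}
```

Because inform() is called once per handler instance, the store is set up at
launch rather than on every request, which is what the question asked for.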



-- 
Regards,
Shalin Shekhar Mangar.


Re: SolrJ - SolrServer#commit() doesn't return

2008-08-26 Thread Machisuji

No, I just add the documents, call commit() and wait for the response.

I've now built a workaround where I run each commit() in a separate thread,
wait for it with Thread#join(long), and recommit the documents if
the thread didn't finish within a given amount of time.
This way I am able to commit all documents (uploading 15,000,000
documents needed 5 or 6 retries), but the threads probably
all go on for eternity, which isn't that good I reckon,
even though they don't seem to do anything at all.

Not a very satisfactory solution. :(
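
One possible variant of this workaround, sketched with an ExecutorService so
that a timed-out attempt is cancelled instead of being left running. The commit
call is passed in as a Callable; TimedCommit and runWithRetries are invented
names, and the sketch assumes that re-running the task after a timeout is safe.
Note that cancel(true) only interrupts the worker thread, so it helps only if
the blocked call reacts to interruption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: run a task with a per-attempt timeout and retries. */
final class TimedCommit {
    /**
     * Returns the number of attempts used, or -1 if every attempt timed out
     * (or the calling thread was interrupted while waiting).
     */
    static int runWithRetries(Callable<Void> task, long timeoutMillis, int maxAttempts) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<Void> pending = pool.submit(task);
                try {
                    pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return attempt;
                } catch (TimeoutException e) {
                    // Give up on this attempt and interrupt its thread.
                    pending.cancel(true);
                } catch (ExecutionException e) {
                    // The task itself failed; surface the original cause.
                    throw new IllegalStateException("task failed", e.getCause());
                } catch (InterruptedException e) {
                    pending.cancel(true);
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
            return -1;
        } finally {
            // Interrupt any attempt that is still running.
            pool.shutdownNow();
        }
    }
}
```

With SolrJ the task would wrap the commit() call. A socket read timeout on the
HTTP client, if the client in use offers one, may be a cleaner way to get the
same effect.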





Re: SolrJ - SolrServer#commit() doesn't return

2008-08-26 Thread Alexander Ramos Jardim
2008/8/26 Machisuji <[EMAIL PROTECTED]>

>
> No, I just add the documents, call commit() and wait for the response.
>
> Now I made a workaround where I put each commit() in a separate thread,
> which I wait for (Thread#join(long)) and recommit the documents if
> the Thread didn't end within a given amount of time.
> This way I am able to commit all documents - during uploading 15.000.000
> documents 5 or 6 retries were necessary - but probably the
> Threads all go on for eternity, which isn't that good I reckon,
> even though they don't seem to do anything at all.
>
> Not a very satisfactory solution. :(
>
>
What do your disk and memory usage look like during these thread hang-ups?




-- 
Alexander Ramos Jardim


Weighting the Lucene score

2008-08-26 Thread s d
I want to compute a weighted average of the Lucene score and an additional score I have,
i.e. (W1 * Lucene score + W2 * Other score) / (W1 + W2).
What is the easiest way to do this?
Also, is the Lucene score normalized?
Thanks,
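
For reference, the formula above written out as code. This only shows the
arithmetic on the client side; it says nothing about where inside Solr or Lucene
the combination would be hooked in, and the names are invented. With equal
weights, scores of 2.0 and 4.0 blend to 3.0; with weights 3 and 1, scores of 0.8
and 0.4 blend to roughly 0.7.

```java
/** Hypothetical helper for the weighted average from the question. */
final class ScoreBlend {
    /** Computes (w1 * luceneScore + w2 * otherScore) / (w1 + w2). */
    static double blend(double w1, double luceneScore, double w2, double otherScore) {
        double total = w1 + w2;
        if (total == 0.0) {
            throw new IllegalArgumentException("weights must not sum to zero");
        }
        return (w1 * luceneScore + w2 * otherScore) / total;
    }
}
```

The result is only meaningful if both scores are on comparable scales, which is
why the normalization question matters.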


Querying Greater Than and Less Than

2008-08-26 Thread Jake Conk
Hello,

I was trying to figure out how to query ranges greater than and less
than a value. The closest solution I could find was using the range format:

field:[x TO z]

While this works for "greater than" queries, how would I query all items
less than 10, given that some items have a negative number and should be
selected as well? The closest I've come is this:

field:[0 TO 10]

I don't know what the smallest negative number is, but I want to
be able to get all such items. Is there a way to do that?

Thanks,

- Jake


Re: Querying Greater Than and Less Than

2008-08-26 Thread mike topper

You can also use open-ended queries like field:[* TO Z] or field:[Z TO *].
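
A minimal sketch of building such open-ended ranges programmatically, with a
null bound standing for "*". The helper and the field name used in the example
are made up. Square brackets make both bounds inclusive, so "less than 10"
strictly speaking needs an upper bound of 9 for integers. For negative numbers
to be ordered correctly, the field probably needs to be one of Solr's sortable
numeric types rather than a plain integer field.

```java
/** Hypothetical helper for inclusive range queries with optional bounds. */
final class RangeQuery {
    /** Builds field:[lo TO hi]; a null bound becomes the open-ended "*". */
    static String range(String field, Integer lo, Integer hi) {
        String from = (lo == null) ? "*" : lo.toString();
        String to = (hi == null) ? "*" : hi.toString();
        return field + ":[" + from + " TO " + to + "]";
    }
}
```

For example, RangeQuery.range("price", null, 10) produces price:[* TO 10], which
also covers negative values.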

-Mike





Re: in a RequestHandler's init, how to get solr data dir?

2008-08-26 Thread Brian Whitman


On Aug 26, 2008, at 12:24 PM, Shalin Shekhar Mangar wrote:

> Hi Brian,
>
> You can implement the SolrCoreAware interface which will give you access to
> the SolrCore object through the SolrCoreAware#inform method you will need to
> implement. It is called after the init method.


Shalin, that worked. Thanks a ton!




Adding a field?

2008-08-26 Thread Jon Drukman
Is there a way to add a field to an existing index without stopping the 
server, deleting the index, and reloading every document from scratch?


-jsd-



Re: Adding a field?

2008-08-26 Thread Brian Whitman


On Aug 26, 2008, at 3:09 PM, Jon Drukman wrote:

> Is there a way to add a field to an existing index without stopping
> the server, deleting the index, and reloading every document from
> scratch?




You can add a field to the schema at any time without adversely
affecting the rest of the index. You have to restart the server, but
you don't have to re-index existing documents. Of course, only
documents indexed with that field will match queries against it.


You can also define dynamic fields like x_* which would let you add
any field name matching that pattern without restarting the server.





Re: Adding a field?

2008-08-26 Thread Smiley, David W. (DSMILEY)
You can modify the schema file, but you'll need to reload Solr for it to
be aware of the change.  You can use multiCore to eliminate downtime.  But at
this point you obviously don't have any data in the new field, so I can only assume
you'll want to reload everything -- what else would you expect?  This
doesn't necessarily need to involve deleting all the data first, since as you
add documents that have the same primary key, the older document is replaced
by the new one once you commit.

~ David


On 8/26/08 3:09 PM, "Jon Drukman" <[EMAIL PROTECTED]> wrote:

> Is there a way to add a field to an existing index without stopping the
> server, deleting the index, and reloading every document from scratch?
> 
> -jsd-
> 



How does Solr search when a field is not specified?

2008-08-26 Thread Jake Conk
Hello,

I was wondering how Solr searches when only a query is given, with no
field specified. Say, for example, I have the following:

?q="Jake" AND "Test"

I have a mixture of integer, string, and text columns. Some indexed,
some stored, and some string fields copied to text fields.

Say I have a string field with the value "Jake is Testing" which is
also copied to a text field. If I did not copyField that string field
to a text field, would the above query return no results (assuming
the words "Jake" and "Test" are not found anywhere else), since we cannot
do fulltext searches on string fields?

Lastly, is there a limit on how many characters can be in a string or text field?

Thanks,
- Jake


Sorting and also looking at stored fields

2008-08-26 Thread jennyv
I'm having trouble with sorting -- things don't return in the order
that I think they should. To look into this problem further, I thought
I'd query solr and try to display the data field in question. I tried
forming my own URL but solr wasn't returning my field. Then I tried
the admin form in case I was formatting my URL incorrectly ... still
no dice. The default on the admin form looks to return all fields plus
score. However, I only get a few fields back (pk_i, id, and score). So
... I was wondering, where are my other fields and is there a way I
can see them? I'm sure they're being saved since I can do searches
against them. Here's the XML for creating one of them (I'm using
acts_as_solr for Ruby on Rails for this):


  
  1
  BaseAsset:1
  1
  This is a photo of a field
at sunset.
  simple photo.jpg
  1
  Images


The pk_i and id fields are coming through okay -- but the others are
not. Could it be related to the boost that's on those fields?

Anyway, this is just to help me with troubleshooting ... the main
issue is that sorting of results isn't working. For example, I've
created a field called name_for_sort (which is just the name but all
lower-case) that I want to sort on. When I try either of these URLs, I
get the names in all sorts of orders, but not the alphabetical order that
I expect (shown non-URL-encoded to make it easier to read, but I've also
tried submitting it URL-encoded):

http://localhost:8982/solr/select?qt=standard&q=(photo) AND
type_t:BaseAsset&fl=*,score&wt=ruby&sort=name_for_solr_t asc

The very strange thing is that when I make up a field name for sort it
doesn't complain, but if I omit the _t at the end, it complains that
the field doesn't exist.

Anyway, any light anyone can shine on this would be greatly appreciated! Thanks!


Re: Weighting the Lucene score

2008-08-26 Thread Otis Gospodnetic
I think the easiest approach might be making use of Lucene's function query.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: s d <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, August 26, 2008 1:55:38 PM
> Subject: Weighting the Lucene score
> 
> I want to weighted average the Lucene score with an additional score i have,
> i.e. (W1 * Lucene score + W2 * Other score) / (W1 + W2) .
> What is the easiest way to do this?
> Also, is the Lucene score normalized.
> Thanks,



Re: Sorting and also looking at stored fields

2008-08-26 Thread jennyv
Ack. Reading the FAQ would help. :-) I've fixed the sorting issue by
using a different analyzer. However, I'd love to get an answer on
whether there's a way to query solr to include more fields in the
results. Thanks!


Re: Sorting and also looking at stored fields

2008-08-26 Thread Stephen Weiss

I do this by specifying them in the query string:

fl=field_name1,field_name2,etc
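
A small sketch of assembling a select URL with an explicit fl list. The base URL
uses the port from the earlier message in this thread; the helper and the field
names in the example are assumptions. Each parameter value is URL-encoded, so
the commas in the field list come out as %2C.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for building a select URL with a field list. */
final class SelectUrl {
    /** Builds baseUrl/select?q=...&fl=... with URL-encoded parameter values. */
    static String build(String baseUrl, String query, String... fields) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/select?q=").append(encode(query));
        if (fields.length > 0) {
            url.append("&fl=").append(encode(String.join(",", fields)));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, SelectUrl.build("http://localhost:8982/solr", "photo", "id",
"name_t", "score") asks for just those three fields. Only fields that are stored
can come back, whatever fl says.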

See here: http://www.ibm.com/developerworks/java/library/j-solr1/#searching (table 2)


--
Steve

On Aug 26, 2008, at 6:36 PM, jennyv wrote:

> Ack. Reading the FAQ would help. :-) I've fixed the sorting issue by
> using a different analyzer. However, I'd love to get an answer on
> whether there's a way to query solr to include more fields in the
> results. Thanks!




Re: Sorting and also looking at stored fields

2008-08-26 Thread jennyv
On Tue, Aug 26, 2008 at 3:46 PM, Stephen Weiss <[EMAIL PROTECTED]> wrote:
> I do this by specifying them in the query string:
>
> fl=field_name1,field_name2,etc
>
> See here:
> http://www.ibm.com/developerworks/java/library/j-solr1/#searching (table 2)

Thanks! I've tried that but it won't return any of the fields aside
from pki, id, and score. I've tried the wildcard and I've also tried
providing the names of the fields (e.g., text_for_solr_t, which is the
name of the main search field). It seems the wildcard should return
all fields, no? One thing I should mention -- not all documents will
have all fields. Will that make a difference?

Thanks!


Re: Weighting the Lucene score

2008-08-26 Thread s d
But a function query doesn't give access to the Solr score, only to fields in
the index, no?
Thanks

On Tue, Aug 26, 2008 at 2:02 PM, Otis Gospodnetic <
[EMAIL PROTECTED]> wrote:

> I think the easiest approach might be making use of Lucene's function
> query.
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: s d <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, August 26, 2008 1:55:38 PM
> > Subject: Weighting the Lucene score
> >
> > I want to weighted average the Lucene score with an additional score i
> have,
> > i.e. (W1 * Lucene score + W2 * Other score) / (W1 + W2) .
> > What is the easiest way to do this?
> > Also, is the Lucene score normalized.
> > Thanks,
>
>


Re: Sorting and also looking at stored fields

2008-08-26 Thread Stephen Weiss
Are you sure you have the fields set to stored in your schema?  You
said before you know they're being saved because you can search
against them... While any field set to indexed="true" can be searched
against, only fields set to stored="true" can actually be
retrieved for display.


Here's a link on that...

http://wiki.apache.org/solr/SchemaXml#head-af67aefdc51d18cd8556de164606030446f56554

If that's not it maybe seeing the schema would help.
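
For reference, here is a minimal sketch of how a request with an fl
parameter might be assembled. The base URL and the query text are
assumptions made up for the example; the field names are the ones mentioned
in this thread. Fields listed in fl should only come back if they are
stored.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields):
    """Build a select URL asking for specific fields via fl.

    base_url is a hypothetical Solr select endpoint; adjust for your setup.
    """
    params = {
        "q": query,
        # fl takes a comma-separated list of field names to return.
        "fl": ",".join(fields),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_select_url(
        "http://localhost:8983/solr/select",  # assumed host and path
        "text_for_solr_t:shoes",              # assumed query
        ["id", "pki", "score", "text_for_solr_t"],
    )
    print(url)
```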
--
Steve


On Aug 26, 2008, at 7:46 PM, jennyv wrote:

> On Tue, Aug 26, 2008 at 3:46 PM, Stephen Weiss <[EMAIL PROTECTED]> wrote:
> > I do this by specifying them in the query string:
> >
> > fl=field_name1,field_name2,etc
> >
> > See here:
> > http://www.ibm.com/developerworks/java/library/j-solr1/#searching (table 2)
>
> Thanks! I've tried that but it won't return any of the fields aside
> from pki, id, and score. I've tried the wildcard and I've also tried
> providing the names of the fields (e.g., text_for_solr_t, which is the
> name of the main search field). It seems the wildcard should return
> all fields, no? One thing I should mention -- not all documents will
> have all fields. Will that make a difference?
>
> Thanks!




Re: Sorting and also looking at stored fields

2008-08-26 Thread Shalin Shekhar Mangar
On Wed, Aug 27, 2008 at 5:16 AM, jennyv <[EMAIL PROTECTED]> wrote:

> Thanks! I've tried that but it won't return any of the fields aside
> from pki, id, and score. I've tried the wildcard and I've also tried
> providing the names of the fields (e.g., text_for_solr_t, which is the
> name of the main search field). It seems the wildcard should return
> all fields, no? One thing I should mention -- not all documents will
> have all fields. Will that make a difference?
>

Are the fields that are not showing up defined as stored in your
schema.xml?

-- 
Regards,
Shalin Shekhar Mangar.


Re: How does Solr search when a field is not specified?

2008-08-26 Thread Otis Gospodnetic
Jake,

Yes, that field would have to be some kind of analyzed field (e.g. text),
not string, if you wanted that query to match the "Jake is Testing" input.
There are no built-in Lucene or Solr-specific limits on field lengths. There
is one parameter called maxFieldLength in Solr's solrconfig.xml, I think,
which tells Lucene how many tokens to consider for indexing. If you don't
want that limit, increase that parameter's value to the max.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Jake Conk <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, August 26, 2008 4:38:09 PM
> Subject: How does Solr search when a field is not specified?
> 
> Hello,
> 
> I was wondering how Solr searches when a field is not specified,
> just a query. Say for example I have the following:
> 
> ?q="Jake" AND "Test"
> 
> I have a mixture of integer, string, and text columns. Some indexed,
> some stored, and some string fields copied to text fields.
> 
> Say I have a string field with the value "Jake is Testing" which is
> also copied to a text field. If I did not copyField that string field
> to a text field then would the above query not return any results if
> the word "Jake" and "Test" are not found anywhere else since we cannot
> do fulltext searches on string fields?
> 
> Lastly, is there a limit on how many characters can be in a string or
> text field?
> 
> Thanks,
> - Jake



Re: SolrJ - SolrServer#commit() doesn't return

2008-08-26 Thread Otis Gospodnetic
Hm, it's hard to tell from here. Have you checked various logs for "Exception", 
"ERROR", and such?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Machisuji <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, August 26, 2008 8:46:24 AM
> Subject: SolrJ - SolrServer#commit() doesn't return
> 
> 
> Hey.
> 
> I've been working with Solr for a few days now, and as long as I wasn't
> working with too much data everything was all right.
> 
> However, now that I want to index all of the data, I'm having problems with
> SolrJ not returning from a call to CommonsHttpSolrServer's commit().
> I'm trying to upload data from online shops; to be more precise, the name,
> category, price and description of tens of millions of items.
> After a few million items the call to commit() no longer returns and
> simply does nothing.
> At least, the CPU usage on the computer running the Solr server falls to 0%.
> 
> I always add 10,000 items at a time by calling
> SolrServer#add(Collection) followed by
> SolrServer.commit().
> Does anyone have an idea what the problem could be here?
> -- 
> View this message in context: 
> http://www.nabble.com/SolrJ---SolrServer-commit%28%29-doesn%27t-return-tp19161293p19161293.html
> Sent from the Solr - User mailing list archive at Nabble.com.
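
The indexing pattern described in the quoted message (add 10,000 items, then
commit, and repeat) can be sketched roughly as follows. This only
illustrates the batching loop and does not diagnose the hang. The add_batch
and commit callables are hypothetical stand-ins for the SolrJ add and commit
calls, and the names batches and index_all are invented for the example.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def index_all(items, add_batch, commit, batch_size=10_000):
    """Send items in batches, committing after each batch.

    add_batch and commit are caller-supplied callables (assumed stand-ins
    for the client's add and commit operations). Returns the batch count.
    """
    sent = 0
    for chunk in batches(items, batch_size):
        add_batch(chunk)
        commit()
        sent += 1
    return sent


if __name__ == "__main__":
    log = []
    count = index_all(
        range(25),
        add_batch=lambda chunk: log.append(("add", len(chunk))),
        commit=lambda: log.append(("commit",)),
        batch_size=10,
    )
    print(count, log)
```

Logging the size of each batch and the time each commit takes, as the stand-in
callables make easy to do, is one way to see where the process stalls.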