Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Fergus McMenemie
>On Apr 1, 2009, at 9:39 AM, Fergus McMenemie wrote:
>
>> Grant,
>>
>> Redoing the work with your patch applied does not seem to
>> make a difference! Is this the expected result?
>
>No, I didn't expect SOLR-1095 to fix the problem. Overwrite=false plus
>SOLR-1095 does, however, AFAICT from your last line, right?
>
>>
>>
>> I did run it again using the full file, this time using my iMac:-
>>  643465  took  22min 14sec  2008-04-01
>>  734796        73min 58sec  2009-01-15
>>  758795        70min 55sec  2009-03-26
>> Again using only the first 1M records, with commit=false&overwrite=true:-
>>  643465  took  2m51.516s    2008-04-01
>>  734796        7m29.326s    2009-01-15
>>  758795        8m18.403s    2009-03-26
>>  SOLR-1095     7m41.699s
>> This time with commit=true&overwrite=true:-
>>  643465  took  2m49.200s    2008-04-01
>>  734796        8m27.414s    2009-01-15
>>  758795        9m32.459s    2009-03-26
>>  SOLR-1095     7m58.825s
>> This time with commit=false&overwrite=false:-
>>  643465  took  2m46.149s    2008-04-01
>>  734796        3m29.909s    2009-01-15
>>  758795        3m26.248s    2009-03-26
>>  SOLR-1095     2m49.997s
>>
Grant,

Hmmm, the big difference is made by &overwrite=false. But can you
explain why &overwrite=false makes such a difference? I am starting
off with an empty index, and I have checked the content: there are
no duplicates in the uniqueKey field.

I guess if &overwrite=false then a few checks can be removed
from the indexing process, and if I am confident that my content
contains no duplicates then this is a good speed-up.

http://wiki.apache.org/solr/UpdateCSV says that if overwrite
is true (the default) then documents are overwritten based on the
uniqueKey. However, what will Solr/Lucene do if the uniqueKey
is not unique and overwrite=false?

fergus: perl -nlaF"\t" -e 'print "$F[2]";' geonames.txt | wc -l
 100
fergus: perl -nlaF"\t" -e 'print "$F[2]";' geonames.txt | sort -u | wc -l
 100
fergus: /usr/bin/head geonames.txt
RC	UFI	UNI	LAT	LONG	DMS_LAT	DMS_LONG	MGRS	JOG	FC	DSG	PC	CC1	ADM1	ADM2	POP	ELEV	CC2	NT	LC	SHORT_FORM	GENERIC	SORT_NAME	FULL_NAME	FULL_NAME_ND	MODIFY_DATE
1	-1307828	60524	12.47	-69.9	122800	-695400	19PDP0219578323	ND19-14	T	MT		AA	00	PALUMARGA	Palu Marga	Palu Marga	1995-03-23
1	-1307756	-1891720	12.5	-70.016667	123000	-700100	19PCP8952982056	ND19-14	P	PPLX

PS: do you want me to do some kind of chop through the
different versions to see where the slowdown happened,
or are you happy you have nailed it?
-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Problem using ExtractingRequestHandler with tomcat

2009-04-02 Thread Fergus McMenemie
Hello all,

I can't get ExtractingRequestHandler to work with Tomcat. Using the
latest version from svn, doing a "make clean dist", and copying the
war file to a clean Tomcat does not work.

Adding the following to solrconfig.xml and restarting Tomcat, I get:

> <requestHandler name="/update/extract"
>     class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
>   <lst name="defaults">
>     <str name="ext.map.Last-Modified">last_modified</str>
>     <bool name="ext.ignore.und.fl">true</bool>
>   </lst>
> </requestHandler>


>Apr 2, 2009 9:20:02 AM org.apache.solr.util.plugin.AbstractPluginLoader load
>INFO: created /update/javabin: org.apache.solr.handler.BinaryUpdateRequestHandler
>Apr 2, 2009 9:20:02 AM org.apache.solr.common.SolrException log
>SEVERE: org.apache.solr.common.SolrException: Error loading class 
>'org.apache.solr.handler.extraction.ExtractingRequestHandler'
>   at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:310)
>   at 
> org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:325)
>   at 
> org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84)
>   at 
> org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:154)
>   at 
> org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:163)

Any ideas?
-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: Problem using ExtractingRequestHandler with tomcat

2009-04-02 Thread Erik Hatcher


On Apr 2, 2009, at 4:26 AM, Fergus McMenemie wrote:

I can't get ExtractingRequestHandler to work with Tomcat. Using the
latest version from svn, doing a "make clean dist", and copying the
war file to a clean Tomcat does not work.


make?!  :)

try "ant example" to see if that gets it working - it copies the  
ExtractingRequestHandler JAR and dependencies to /lib


Erik






Adding the following to solrconfig.xml and restarting Tomcat, I get:

class="org.apache.solr.handler.extraction.ExtractingRequestHandler">

  
last_modified
true

  



Apr 2, 2009 9:20:02 AM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /update/javabin: org.apache.solr.handler.BinaryUpdateRequestHandler

Apr 2, 2009 9:20:02 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler'
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:310)
	at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:325)
	at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84)
	at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:154)
	at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:163)


Any ideas?
--

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===




Re: autowarm static queries

2009-04-02 Thread sunnyfr

Hi Hoss,

Do I need autowarming > 0 to have newSearcher and firstSearcher fired?

Thanks a lot,


hossman wrote:
> 
> 
> : Subject: autowarm static queries
> 
> A minor followup about terminology:
> 
> "auto-warming" describes what Solr does when it opens a new cache, and 
> seeds it with key/val pairs based on the "top" keys from the old instance 
> of the cache.
> 
> "static warming" describes what you can do using newSearcher and 
> firstSearcher event listeners to force explicit warming actions to be 
> taken when one of these events happens -- frequently it involves seeding 
> one or more caches with values from "static" queries hard coded in the 
> solrconfig.xml
> 
> i'm not sure what it would mean to autowarm a static query.
> 
> 
> -Hoss
> 
> 
> 




Re: Problem using ExtractingRequestHandler with tomcat

2009-04-02 Thread Fergus McMenemie
>On Apr 2, 2009, at 4:26 AM, Fergus McMenemie wrote:
>> I can't get ExtractingRequestHandler to work with Tomcat. Using the
>> latest version from svn, doing a "make clean dist", and copying the
>> war file to a clean Tomcat does not work.
>
>make?!  :)
Oops!

>
>try "ant example" to see if that gets it working - it copies the  
>ExtractingRequestHandler JAR and dependencies to /lib
>
>   Erik
>
Thanks. Copying all those jar files to my solr/lib directory was
the trick. But why do I have to do this; is it by design or 
because ExtractingRequestHandler is yet to be fully incorporated 
into Solr?

Regards Fergus.
-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: Problem using ExtractingRequestHandler with tomcat

2009-04-02 Thread Erik Hatcher


On Apr 2, 2009, at 4:55 AM, Fergus McMenemie wrote:

Thanks. Copying all those jar files to my solr/lib directory was
the trick. But why do I have to do this; is it by design or
because ExtractingRequestHandler is yet to be fully incorporated
into Solr?


It's fully integrated into the example as a "plugin" and runs out of  
the box there.  It's not built into the .war file because it is pretty  
bulky and may not be desirable for everyone.
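If you'd rather wire it up by hand, the gist is just copying the
extraction contrib jar and its dependencies into your Solr home's lib
directory; a rough sketch (paths and jar names here are illustrative
and vary by build):

# copy the ExtractingRequestHandler jar and its Tika dependencies
# into <solr-home>/lib -- adjust paths to your checkout
cp <solr-src>/dist/apache-solr-*extract*.jar <solr-home>/lib/
cp <solr-src>/contrib/extraction/lib/*.jar <solr-home>/lib/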


Erik



Re: autowarm static queries

2009-04-02 Thread Shalin Shekhar Mangar
On Thu, Apr 2, 2009 at 2:13 PM, sunnyfr  wrote:
>
> Hi Hoss,
>
> Do I need autowarming > 0 to have newSearcher and firstSearcher fired?
>
> Thanks a lot,
>

Did you mean autowarmCount > 0?

No, firstSearcher and newSearcher are always executed if specified.
The autowarmCount can be anything, it does not matter.

-- 
Regards,
Shalin Shekhar Mangar.


Re: autowarm static queries

2009-04-02 Thread sunnyfr

OK, so it doesn't seem to work.
After a replication, the first request on my slave is always very slow and
the next one very quick. Do I have to set something else?


<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst> <str name="q">solr</str> <str name="start">0</str> <str name="rows">100</str> </lst>
    <lst> <str name="q">anything</str> <str name="sort">id desc</str> </lst>
  </arr>
</listener>

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str name="rows">100</str> </lst>
    <lst> <str name="q">anything</str> <str name="sort">id desc</str> </lst>
    <lst>
      <str name="q">anything</str>
      <!-- the archive stripped the tags here; parameter names are inferred from the values -->
      <str name="bf">recip(rord(created),1,10,10)^3+pow(stat_views,0.1)^15+pow(stat_comments,0.1)^15</str>
      <str name="fq">status_published:1+AND+status_moderated:0+AND+status_personal:0+AND+status_private:0+AND+status_deleted:0+AND+status_error:0+AND+status_ready_web:1</str>
      <str name="bq">status_official:1^1.5+OR+status_creative:1^1+OR+language:en^0.5</str>
      <str name="qf">title^0.2+description^0.2+tags^1+owner_login^0.5</str>
    </lst>
  </arr>
</listener>





Shalin Shekhar Mangar wrote:
> 
> On Thu, Apr 2, 2009 at 2:13 PM, sunnyfr  wrote:
>>
>> Hi Hoss,
>>
>> Do I need autowarming > 0 to have newSearcher and firstSearcher fired?
>>
>> Thanks a lot,
>>
> 
> Did you mean autowarmCount > 0?
> 
> No, firstSearcher and newSearcher are always executed if specified.
> The autowarmCount can be anything, it does not matter.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/autowarm-partial-queries-in-solrconfig.xml-tp13167933p22844377.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: FW: multicore

2009-04-02 Thread Neha Bhardwaj
Hi,
Thanks a lot!
But I am still not very clear on creating multiple cores.
I read the wiki document but was not able to understand it properly.

Also, how do I index data into a particular core?
Say we have core0 and core1 in multicore.
How can I specify which core I want to index data into?

Kindly help!

-Original Message-
From: Marc Sturlese [mailto:marc.sturl...@gmail.com] 
Sent: Wednesday, April 01, 2009 8:06 PM
To: solr-user@lucene.apache.org
Subject: Re: FW: multicore



>>how to have multiple cores ? 
You need a solr.xml file in the root of your Solr home. In this solr.xml you
initialize the cores. In the same folder you will have a folder per core,
each with its own /conf and /data. Every core has its own solrconfig.xml and
schema.xml.
If you grab a nightly build you will see a config example in there.
Everything is properly explained in the Solr core wiki:
http://wiki.apache.org/solr/CoreAdmin
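
For example, a minimal multicore solr.xml looks something like this (core
names and instanceDir values are whatever you choose):

<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>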

>>can we start all cores from single startup file or we need to start all
>>independently?

>>I need  a way by which I can start all of them in one go.
Once you have your cores configured in your webapp, all of them will be
loaded automatically when you start your server.




Neha Bhardwaj wrote:
> 
>  
> 
>  
> 
> From: Neha Bhardwaj [mailto:neha_bhard...@persistent.co.in] 
> Sent: Wednesday, April 01, 2009 6:52 PM
> To: 'solr-user@lucene.apache.org'
> Subject: multicore
> 
>  
> 
> Hi,
> 
> I need to create multiple cores for my project.
> 
> I need to know:
> 
>  how to have multiple cores ?
> 
> can we start all cores from single startup file or we need to start all
> independently?
> 
> I need  a way by which I can start all of them in one go.
> 
>  
> 
>  
> 
> Thanks
> 
> Neha
> 
> 





Using ExtractingRequestHandler to index a large PDF

2009-04-02 Thread Fergus McMenemie
Hello,

Sorry if this is a FAQ; I suspect it could be. But how do I work around the 
following:-

INFO: [] webapp=/apache-solr-1.4-dev path=/update/extract 
params={ext.def.fl=text&ext.literal.id=factbook/reference_maps/pdf/oceania.pdf} 
status=0 QTime=318 
Apr 2, 2009 11:17:46 AM org.apache.solr.common.SolrException log
SEVERE: 
org.apache.commons.fileupload.FileUploadBase$SizeLimitExceededException: the 
request was rejected because its size (4585774) exceeds the configured maximum 
(2097152)
at 
org.apache.commons.fileupload.FileUploadBase$FileItemIteratorImpl.<init>(FileUploadBase.java:914)
at 
org.apache.commons.fileupload.FileUploadBase.getItemIterator(FileUploadBase.java:331)
at 
org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:349)
at 
org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:126)
at 
org.apache.solr.servlet.MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:343)
at 
org.apache.solr.servlet.StandardRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:396)
at 
org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:114)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)

Although the PDF is big, it contains very little text; it is a map. 

   "java -jar solr/lib/tika-0.3.jar -g" appears to have no bother with it.

Fergus...
-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Slow startup of solr-nightly

2009-04-02 Thread Jarek Zgoda
I'm testing solr-nightly (2009-04-02) and I noticed a very long instance
startup time compared to 1.3 (4 minutes vs. 20 seconds). The machine is
a pretty common 2-core AMD64 with 4GB RAM.


Is this normal or should I be worried?

--
We read Knuth so you don't have to. - Tim Peters

Jarek Zgoda, R&D, Redefine
jarek.zg...@redefine.pl



Re: Composite POJO support

2009-04-02 Thread Praveen Kumar Jayaram

Could someone give suggestions for this issue?


Praveen Kumar Jayaram wrote:
> 
> Hi
> 
> I am trying to have a complex POJO type in Solr 1.3,
> i.e. an object inside an object.
> 
> Below is a sample Field created,
> 
> public class TestType extends FieldType{
> @Field
> private String requestorID_s_i_s_nm;
> 
> @Field
> private String partNumber;
> 
> @Field
> private String requestorName_s_i_s_nm;
> 
> @Field
> private InnerType innerType;
> }
> 
> Where InnerType is another custom Java type.
> 
> public class InnerType extends FieldType{
>   private String name_s_i_s_nm;
> }
> 
> 
> The schema configuration is as shown below,
> 
> 
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
> 
> 
> 
> When I try to add a TestType POJO using the code below, I am getting an
> unknown field "innerType" error,
> 
> String url = "http://localhost:8983/solr";
> SolrServer server = new CommonsHttpSolrServer( url );
> 
> InnerType inner = new InnerType();
> inner.setName_s_i_s_nm("Test");
>   
> TestType praveen = new TestType();
> praveen.setPartNumber("01-0001");
> praveen.setRequestorID_s_i_s_nm("");
> praveen.setRequestorName_s_i_s_nm("Praveen Kumar Jayaram");
> praveen.setInnerType(inner);
> 
> server.addBean(praveen);
> UpdateRequest req = new UpdateRequest();
> req.setAction( UpdateRequest.ACTION.COMMIT, false, false );
> UpdateResponse res = req.process(server);
> 
> Initially the POJO was getting added when it was not a composite POJO.
> After making it a composite POJO, things are not working.
> What is it that I am doing wrong?
> 
> Any help will be appreciated.
> 
> 
> 


-
Regards,
Praveen



Re: How do I combine WhitespaceTokenizerFactory and EdgeNGramTokenizerFactory?

2009-04-02 Thread Praveen Kumar Jayaram


Is there a way to achieve this requirement?
Please give your valuable comments.


Praveen Kumar Jayaram wrote:
> 
> Hi folks,
> 
> I am trying to use features of WhitespaceTokenizerFactory and
> EdgeNGramTokenizerFactory.
> My requirement is to search a String using substring and inword concepts.
> 
> For example:
> Let us assume the String as "Praveen  Kumar Jayaram".
> The query for "Kumar" will be successful if we use
> WhitespaceTokenizerFactory.
> And the query for "Pra" will be successful if we use
> EdgeNGramTokenizerFactory.
> 
> Now I need to search with the query string "Kum". I tried using these two
> tokenizer factories in parallel but got an error saying we can use only one
> tokenizer per field.
> Could you please help in achieving this requirement??
> 
> Thanks in advance.
> 
> Regards,
> Praveen
> 


-
Regards,
Praveen



Re: Composite POJO support

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
Why is the POJO extending FieldType? It does not have to.

Composite types are not supported, because Solr cannot support that.
But the field can be a List or an array.
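
For example, a sketch of a flattened bean (field names here are
illustrative, and the schema needs a matching multiValued "innerNames"
field):

import java.util.List;
import org.apache.solr.client.solrj.beans.Field;

public class TestType {
    @Field
    private String partNumber;

    // instead of nesting an InnerType object, flatten its values
    // into a multiValued field
    @Field("innerNames")
    private List<String> innerNames;

    public void setPartNumber(String partNumber) { this.partNumber = partNumber; }
    public void setInnerNames(List<String> innerNames) { this.innerNames = innerNames; }
}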

On Thu, Apr 2, 2009 at 5:00 PM, Praveen Kumar Jayaram
 wrote:
>
> Could someone give suggestions for this issue?
>
>
> Praveen Kumar Jayaram wrote:
>>
>> Hi
>>
>> I am trying to have a complex POJO type in Solr 1.3
>> i.e Object inside object.
>>
>> Below is a sample Field created,
>>
>> public class TestType extends FieldType{
>>     @Field
>>     private String requestorID_s_i_s_nm;
>>
>>     @Field
>>     private String partNumber;
>>
>>     @Field
>>     private String requestorName_s_i_s_nm;
>>
>>     @Field
>>     private InnerType innerType;
>> }
>>
>> Where InnerType is another custom Java type.
>>
>> public class InnerType extends FieldType{
>>       private String name_s_i_s_nm;
>> }
>>
>>
>> The schema configuration is as shown below,
>>
>> 
>> > sortMissingLast="true" omitNorms="true"/>
>> > sortMissingLast="true" omitNorms="true"/>
>> 
>> 
>>
>> When I try to add a TestType POJO using the code below, I am getting an
>> unknown field "innerType" error,
>>
>> String url = "http://localhost:8983/solr";
>> SolrServer server = new CommonsHttpSolrServer( url );
>>
>> InnerType inner = new InnerType();
>> inner.setName_s_i_s_nm("Test");
>>
>> TestType praveen = new TestType();
>> praveen.setPartNumber("01-0001");
>> praveen.setRequestorID_s_i_s_nm("");
>> praveen.setRequestorName_s_i_s_nm("Praveen Kumar Jayaram");
>> praveen.setInnerType(inner);
>>
>> server.addBean(praveen);
>> UpdateRequest req = new UpdateRequest();
>> req.setAction( UpdateRequest.ACTION.COMMIT, false, false );
>> UpdateResponse res = req.process(server);
>>
>> Initially the POJO was getting added when it was not a composite POJO.
>> After making it a composite POJO, things are not working.
>> What is it that I am doing wrong?
>>
>> Any help will be appreciated.
>>
>>
>>
>
>
> -
> Regards,
> Praveen
> --
> View this message in context: 
> http://www.nabble.com/Composite-POJO-support-tp22841854p22845799.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul


JVM best tune? help ... solr1.4

2009-04-02 Thread sunnyfr

Hi,

Just applied replication by requestHandler.
And since then the QTime went mad and can reach long times like <int name="QTime">9068</int>.
Without this replication the QTime is around 1sec.

I've 14M docs stored, for 11G, so not a lot of data stored.
I've servers with 8G, and Tomcat uses 7G.
I'm updating every 30mn, which is about 50,000 docs.
Have a look as well at my CPUs, which are also quite full.

Have you an idea? Am I missing a patch?
Thanks a lot,

Solr Specification Version: 1.3.0.2009.01.22.13.51.22
Solr Implementation Version: 1.4-dev exported - root - 2009-01-22 13:51:22

http://www.nabble.com/file/p22846336/CPU.jpg CPU.jpg 



Re: multicore

2009-04-02 Thread Erik Hatcher


On Apr 2, 2009, at 6:32 AM, Neha Bhardwaj wrote:

Also, how do I index data into a particular core?
Say we have core0 and core1 in multicore.
How can I specify which core I want to index data into?


You index into http://localhost:8983/solr/core0/update or
http://localhost:8983/solr/core1/update

Erik



Re: JVM best tune? help ... solr1.4

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you are looking at the QTime on the master it is likely to be
skewed by ReplicationHandler because the files are downloaded using a
request. On a slave it should not be a problem.

I guess we must not add the qtimes of ReplicationHandler
--Noble

On Thu, Apr 2, 2009 at 5:34 PM, sunnyfr  wrote:
>
> Hi,
>
> Just applied replication by requestHandler.
> And since then the QTime went mad and can reach long times like <int name="QTime">9068</int>.
> Without this replication the QTime is around 1sec.
>
> I've 14Mdocs stores for 11G. so not a lot of data stores.
> I've servers with 8G and tomcat use 7G.
> I'm updating every 30mn which is about 50 000docs.
> Have a look as well at my cpu which are aswell quite full ?
>
> Have you an idea? Do I miss a patch ?
> Thanks a lot,
>
> Solr Specification Version: 1.3.0.2009.01.22.13.51.22
> Solr Implementation Version: 1.4-dev exported - root - 2009-01-22 13:51:22
>
> http://www.nabble.com/file/p22846336/CPU.jpg CPU.jpg
> --
> View this message in context: 
> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846336.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul


Re: JVM best tune? help ... solr1.4

2009-04-02 Thread sunnyfr

I'm thinking about the slave.
When I start 20 requests per second in multiple threads, my CPU is very bad.
I'm sure I don't manage my GC properly. I've 8G per slave; it should be fine.

I wonder whether I shouldn't put 7G as the JVM xmx, I don't know,
but the slave also has a little problem during replication from the master.



Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> If you are looking at the QTime on the master it is likely to be
> skewed by ReplicationHandler because the files are downloaded using a
> request. On a slave it should not be a problem.
> 
> I guess we must not add the qtimes of ReplicationHandler
> --Noble
> 
> On Thu, Apr 2, 2009 at 5:34 PM, sunnyfr  wrote:
>>
>> Hi,
>>
>> Just applied replication by requestHandler.
>> And since then the QTime went mad and can reach long times like <int name="QTime">9068</int>.
>> Without this replication the QTime is around 1sec.
>>
>> I've 14Mdocs stores for 11G. so not a lot of data stores.
>> I've servers with 8G and tomcat use 7G.
>> I'm updating every 30mn which is about 50 000docs.
>> Have a look as well at my cpu which are aswell quite full ?
>>
>> Have you an idea? Do I miss a patch ?
>> Thanks a lot,
>>
>> Solr Specification Version: 1.3.0.2009.01.22.13.51.22
>> Solr Implementation Version: 1.4-dev exported - root - 2009-01-22
>> 13:51:22
>>
>> http://www.nabble.com/file/p22846336/CPU.jpg CPU.jpg
>> --
>> View this message in context:
>> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846336.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> --Noble Paul
> 
> 




Re: JVM best tune? help ... solr1.4

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
http://people.apache.org/~hossman/#threadhijack

On Thu, Apr 2, 2009 at 5:47 PM, sunnyfr  wrote:
>
> I'm thinking about the slave.
> When I start 20 requests per second in multiple threads, my CPU is very bad.
> I'm sure I don't manage my GC properly. I've 8G per slave; it should be fine.
>
> I wonder whether I shouldn't put 7G as the JVM xmx, I don't know,
> but the slave also has a little problem during replication from the master.
>
>
>
> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>
>> If you are looking at the QTime on the master it is likely to be
>> skewed by ReplicationHandler because the files are downloaded using a
>> request. On a slave it should not be a problem.
>>
>> I guess we must not add the qtimes of ReplicationHandler
>> --Noble
>>
>> On Thu, Apr 2, 2009 at 5:34 PM, sunnyfr  wrote:
>>>
>>> Hi,
>>>
>>> Just applied replication by requestHandler.
>>> And since this the Qtime went mad and can reach long time >> name="QTime">9068
>>> Without this replication Qtime can be around 1sec.
>>>
>>> I've 14Mdocs stores for 11G. so not a lot of data stores.
>>> I've servers with 8G and tomcat use 7G.
>>> I'm updating every 30mn which is about 50 000docs.
>>> Have a look as well at my cpu which are aswell quite full ?
>>>
>>> Have you an idea? Do I miss a patch ?
>>> Thanks a lot,
>>>
>>> Solr Specification Version: 1.3.0.2009.01.22.13.51.22
>>> Solr Implementation Version: 1.4-dev exported - root - 2009-01-22
>>> 13:51:22
>>>
>>> http://www.nabble.com/file/p22846336/CPU.jpg CPU.jpg
>>> --
>>> View this message in context:
>>> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846336.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846546.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul


RE: multicore

2009-04-02 Thread Neha Bhardwaj
Hi,
I want to index through the command line.
How do I do that?



-Original Message-
From: Erik Hatcher [mailto:e...@ehatchersolutions.com] 
Sent: Thursday, April 02, 2009 5:42 PM
To: solr-user@lucene.apache.org
Subject: Re: multicore


On Apr 2, 2009, at 6:32 AM, Neha Bhardwaj wrote:
> Also, how do I index data into a particular core?
> Say we have core0 and core1 in multicore.
> How can I specify which core I want to index data into?

You index into http://localhost:8983/solr/core0/update or
http://localhost:8983/solr/core1/update

Erik




Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Grant Ingersoll


On Apr 2, 2009, at 4:02 AM, Fergus McMenemie wrote:

Grant,

Hmmm, the big difference is made by &overwrite=false. But can you
explain why &overwrite=false makes such a difference? I am starting
off with an empty index, and I have checked the content: there are
no duplicates in the uniqueKey field.

I guess if &overwrite=false then a few checks can be removed
from the indexing process, and if I am confident that my content
contains no duplicates then this is a good speed-up.

http://wiki.apache.org/solr/UpdateCSV says that if overwrite
is true (the default) then documents are overwritten based on the
uniqueKey. However, what will Solr/Lucene do if the uniqueKey
is not unique and overwrite=false?


overwrite=false means Solr does not issue deletes first, meaning if
you have a doc w/ that id already, you will now have two docs with
that id. Unique id is enforced by Solr, not by Lucene.


Even if you can't guarantee uniqueness, you can still do overwrite =
false as a workaround, using the suggestion I gave you in a prior email:
1. Add a new field that identifies your data source and is the
same for all records in that data source, e.g. type = geonames.txt
2. Before updating, issue a delete-by-query for the value of that
type, which will delete all records with that term
3. Do your indexing with overwrite = false
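
Concretely, that workflow might look roughly like this (URL, field name
and file name are illustrative):

# step 2: delete everything previously loaded from this source
curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-Type: text/xml' \
  --data-binary '<delete><query>type:geonames.txt</query></delete>'
# step 3: re-index with overwrite=false (separator=%09 means tab-separated)
curl 'http://localhost:8983/solr/update/csv?overwrite=false&commit=true&separator=%09' \
  --data-binary @geonames.txt -H 'Content-Type: text/plain; charset=utf-8'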

I should note, however, that the speed difference you are seeing may
not be as pronounced as it appears.  If I recall, during ApacheCon I
commented on how long it takes to shut down your Solr instance when
exiting it.  That time is in fact Solr doing the work that was put off
by not committing earlier and having all those deletes pile up.

Thus, while it is likely that your older version is still faster due
to the new fsync stuff in Lucene, it may not be that much faster.  I
think you could see this by actually doing commit = true, but I'm not
100% sure.






fergus: perl -nlaF"\t" -e 'print "$F[2]";' geonames.txt | wc -l
100
fergus: perl -nlaF"\t" -e 'print "$F[2]";' geonames.txt | sort -u | wc -l
100
fergus: /usr/bin/head geonames.txt
RC	UFI	UNI	LAT	LONG	DMS_LAT	DMS_LONG	MGRS	JOG	FC	DSG	PC	CC1	ADM1	ADM2	POP	ELEV	CC2	NT	LC	SHORT_FORM	GENERIC	SORT_NAME	FULL_NAME	FULL_NAME_ND	MODIFY_DATE
1	-1307828	60524	12.47	-69.9	122800	-695400	19PDP0219578323	ND19-14	T	MT		AA	00	PALUMARGA	Palu Marga	Palu Marga	1995-03-23
1	-1307756	-1891720	12.5	-70.016667	123000	-700100	19PCP8952982056	ND19-14	P	PPLX


PS: do you want me to do some kind of chop through the
different versions to see where the slowdown happened,
or are you happy you have nailed it?
--

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: multicore

2009-04-02 Thread Akshay
On Thu, Apr 2, 2009 at 5:58 PM, Neha Bhardwaj <
neha_bhard...@persistent.co.in> wrote:

> Hi,
> I want to index through the command line.
> How do I do that?


You can use curl,
http://wiki.apache.org/solr/UpdateXmlMessages?highlight=%28curl%29#head-c614ba822059ae20dde5698290caeb851534c9de
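
For example, something like this posts one document to core0 and then
commits (field values here are illustrative):

curl 'http://localhost:8983/solr/core0/update' -H 'Content-Type: text/xml' \
  --data-binary '<add><doc><field name="id">doc1</field></doc></add>'
curl 'http://localhost:8983/solr/core0/update' -H 'Content-Type: text/xml' \
  --data-binary '<commit/>'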


>
>
>
> -Original Message-
> From: Erik Hatcher [mailto:e...@ehatchersolutions.com]
> Sent: Thursday, April 02, 2009 5:42 PM
> To: solr-user@lucene.apache.org
> Subject: Re: multicore
>
>
> On Apr 2, 2009, at 6:32 AM, Neha Bhardwaj wrote:
> > Also, how do I index data into a particular core?
> > Say we have core0 and core1 in multicore.
> > How can I specify which core I want to index data into?
>
> You index into http://localhost:8983/solr/core0/update or
> http://localhost:8983/solr/core1/update
>
>Erik
>
>
>



-- 
Regards,
Akshay K. Ukey.


Re: JVM best tune? help ... solr1.4

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
it is not a good idea to optimize the index every time you commit. That
is why your downloads are taking so long.

2009/4/2 Johanna Mortemousque :
> I have so many updates, almost 2,000 every 20mn, that Lucene merges my index
> folder, so every time my slave replicates it is a newly merged index folder,
> and every time it brings back 10G of data.
>
> And during this time the response times of my requests are very slow.
> What can I check?
>
> Thanks Paul
>
> 2009/4/2 Noble Paul നോബിള്‍ नोब्ळ् 
>>
>> slave would not show increased request times because of replication.
>> If it does should be some bug
>>
>> On Thu, Apr 2, 2009 at 6:00 PM,   wrote:
>> > I think its the same problem, tune jvm for multi thread ... 20request
>> > seconde.
>> > no??
>> >
>> >
>> >
>> > Noble Paul നോബിള്‍  नोब्ळ् wrote:
>> >>
>> >> http://people.apache.org/~hossman/#threadhijack
>> >>
>> >> On Thu, Apr 2, 2009 at 5:47 PM, sunnyfr  wrote:
>> >>>
>> >>> I think about the slave.
>> >>> When I start in multi thread 20 request second my cpu is very bad.
>> >>> I'm sure I don't manage properly my gc. I've 8G per slave it should be
>> >>> fine.
>> >>>
>> >>> I wonder, I shouldn't put 7G to xmx jvm, I don't know,
>> >>> but slave is as well a little problem during replication from the
>> >>> master.
>> >>>
>> >>>
>> >>>
>> >>> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>> 
>>  If you are looking at the QTime on the master it is likely to be
>>  skewed by ReplicationHandler because the files are downloaded using a
>>  request. On a slave it should not be a problem.
>> 
>>  I guess we must not add the qtimes of ReplicationHandler
>>  --Noble
>> 
>>  On Thu, Apr 2, 2009 at 5:34 PM, sunnyfr  wrote:
>> >
>> > Hi,
>> >
>> > Just applied replication by requestHandler.
>> > And since this the Qtime went mad and can reach long time > > name="QTime">9068
>> > Without this replication Qtime can be around 1sec.
>> >
>> > I've 14Mdocs stores for 11G. so not a lot of data stores.
>> > I've servers with 8G and tomcat use 7G.
>> > I'm updating every 30mn which is about 50 000docs.
>> > Have a look as well at my cpu which are aswell quite full ?
>> >
>> > Have you an idea? Do I miss a patch ?
>> > Thanks a lot,
>> >
>> > Solr Specification Version: 1.3.0.2009.01.22.13.51.22
>> > Solr Implementation Version: 1.4-dev exported - root - 2009-01-22
>> > 13:51:22
>> >
>> > http://www.nabble.com/file/p22846336/CPU.jpg CPU.jpg
>> > --
>> > View this message in context:
>> >
>> > http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846336.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
>> >
>> >
>> 
>> 
>> 
>>  --
>>  --Noble Paul
>> 
>> 
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>>
>> >>> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846546.html
>> >>> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> --Noble Paul
>> >>
>> >>
>> > Quoted from:
>> >
>> > http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846716.html
>> >
>> >
>>
>>
>>
>> --
>> --Noble Paul
>
>



-- 
--Noble Paul


Re: JVM best tune? help ... solr1.4

2009-04-02 Thread sunnyfr

I don't optimize at all;
my delta-import has &optimize=false.

I didn't turn on optimize. I think it merges segments on its own, because the
size increases too quickly?


Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> it is not a good idea to optimize the index every time you commit. That
> is why your downloads are taking so long.
> 
> 2009/4/2 Johanna Mortemousque :
>> I have so many updates, almost 2,000 every 20mn, that Lucene merges my index
>> folder, so every time my slave replicates it is a newly merged index folder,
>> and every time it brings back 10G of data.
>>
>> And during this time my repond time of my request are very slow.
>> What can I check?
>>
>> Thanks Paul
>>
>> 2009/4/2 Noble Paul നോബിള്‍ नोब्ळ् 
>>>
>>> slave would not show increased request times because of replication.
>>> If it does should be some bug
>>>
>>> On Thu, Apr 2, 2009 at 6:00 PM,   wrote:
>>> > I think its the same problem, tune jvm for multi thread ... 20request
>>> > seconde.
>>> > no??
>>> >
>>> >
>>> >
>>> > Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>> >>
>>> >> http://people.apache.org/~hossman/#threadhijack
>>> >>
>>> >> On Thu, Apr 2, 2009 at 5:47 PM, sunnyfr  wrote:
>>> >>>
>>> >>> I think about the slave.
>>> >>> When I start in multi thread 20 request second my cpu is very bad.
>>> >>> I'm sure I don't manage properly my gc. I've 8G per slave it should
>>> be
>>> >>> fine.
>>> >>>
>>> >>> I wonder, I shouldn't put 7G to xmx jvm, I don't know,
>>> >>> but slave is as well a little problem during replication from the
>>> >>> master.
>>> >>>
>>> >>>
>>> >>>
>>> >>> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>> 
>>>  If you are looking at the QTime on the master it is likely to be
>>>  skewed by ReplicationHandler because the files are downloaded using
>>> a
>>>  request. On a slave it should not be a problem.
>>> 
>>>  I guess we must not add the qtimes of ReplicationHandler
>>>  --Noble
>>> 
>>>  On Thu, Apr 2, 2009 at 5:34 PM, sunnyfr 
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> > Just applied replication by requestHandler.
>>> > And since this the Qtime went mad and can reach long time >> > name="QTime">9068
>>> > Without this replication Qtime can be around 1sec.
>>> >
>>> > I've 14Mdocs stores for 11G. so not a lot of data stores.
>>> > I've servers with 8G and tomcat use 7G.
>>> > I'm updating every 30mn which is about 50 000docs.
>>> > Have a look as well at my cpu which are aswell quite full ?
>>> >
>>> > Have you an idea? Do I miss a patch ?
>>> > Thanks a lot,
>>> >
>>> > Solr Specification Version: 1.3.0.2009.01.22.13.51.22
>>> > Solr Implementation Version: 1.4-dev exported - root - 2009-01-22
>>> > 13:51:22
>>> >
>>> > http://www.nabble.com/file/p22846336/CPU.jpg CPU.jpg
>>> > --
>>> > View this message in context:
>>> >
>>> >
>>> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846336.html
>>> > Sent from the Solr - User mailing list archive at Nabble.com.
>>> >
>>> >
>>> 
>>> 
>>> 
>>>  --
>>>  --Noble Paul
>>> 
>>> 
>>> >>>
>>> >>> --
>>> >>> View this message in context:
>>> >>>
>>> >>>
>>> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846546.html
>>> >>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> --Noble Paul
>>> >>
>>> >>
>>> > Quoted from:
>>> >
>>> >
>>> http://www.nabble.com/JVM-best-tune--help-...-solr1.4-tp22846336p22846716.html
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> --Noble Paul
>>
>>
> 
> 
> 
> -- 
> --Noble Paul
> 
> 




Re: JVM best tune? help ... solr1.4

2009-04-02 Thread sunnyfr

This is my conf :

http://www.nabble.com/file/p22847570/solrconfig.xml solrconfig.xml 

And this is my delta import:

*/20 * * * * /usr/bin/wget -q --output-document=/home/video_import.txt
"http://master.com:8180/solr/video/dataimport?command=delta-import&optimize=false"






Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Fergus McMenemie
Grant,



>I should note, however, that the speed difference you are seeing may  
>not be as pronounced as it appears.  If I recall during ApacheCon, I  
>commented on how long it takes to shutdown your Solr instance when  
>exiting it.  That time it takes is in fact Solr doing the work that  
>was put off by not committing earlier and having all those deletes  
>pile up.
>
I am confused about "work that was put off" vs committing. My script
was doing a commit right after the CSV import, and you are right
about the massive times required to shut Tomcat down. But in my tests
the time taken to do the commit was under a second, yet I had to allow
300secs for Tomcat shutdown. Also, I don't have any duplicates. So
what sort of work was being done at shutdown that was not being done
by a commit? Optimise!

Thanks for the all the help.

Fergus.
-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: multicore

2009-04-02 Thread Erik Hatcher


On Apr 2, 2009, at 8:47 AM, Akshay wrote:


On Thu, Apr 2, 2009 at 5:58 PM, Neha Bhardwaj <
neha_bhard...@persistent.co.in> wrote:


Hi,
I want to index through the command line.
How do I do that?



You can use curl,
http://wiki.apache.org/solr/UpdateXmlMessages?highlight=%28curl%29#head-c614ba822059ae20dde5698290caeb851534c9de


Or Solr's post.jar, setting the url parameter appropriately:

$ java -jar post.jar -help

SimplePostTool: version 1.2
This is a simple command line tool for POSTing raw XML to a Solr
port.  XML data can be read from files specified as commandline
args; as raw commandline arg strings; or via STDIN.
Examples:
  java -Ddata=files -jar post.jar *.xml
  java -Ddata=args  -jar post.jar '42'
  java -Ddata=stdin -jar post.jar < hd.xml
Other options controlled by System Properties include the Solr
URL to POST to, and whether a commit should be executed.  These
are the defaults for all System Properties...
  -Ddata=files
  -Durl=http://localhost:8983/solr/update
  -Dcommit=yes



Error in Importing from Oracle

2009-04-02 Thread Radha C.
Hello List,
 
I am trying to do a full import from a remote Oracle server. I am getting the
error below.

Can anyone please tell me what configuration I am missing? Thanks in
advance.
 
Apr 2, 2009 6:46:39 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.UnsatisfiedLinkError: no ocijdbc10 in java.library.path
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:402)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:225)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:323)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
Caused by: java.lang.UnsatisfiedLinkError: no ocijdbc10 in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1682)
        at java.lang.Runtime.loadLibrary0(Runtime.java:822)
        at java.lang.System.loadLibrary(System.java:992)
        at oracle.jdbc.driver.T2CConnection$1.run(T2CConnection.java:3135)
        at java.security.AccessController.doPrivileged(Native Method)
        at oracle.jdbc.driver.T2CConnection.loadNativeLibrary(T2CConnection.java:3131)
        at oracle.jdbc.driver.T2CConnection.logon(T2CConnection.java:221)
        at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:414)
        at oracle.jdbc.driver.T2CConnection.<init>(T2CConnection.java:132)
        at oracle.jdbc.driver.T2CDriverExtension.getConnection(T2CDriverExtension.java:78)
        at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:801)
        at java.sql.DriverManager.getConnection(DriverManager.java:525)
        at java.sql.DriverManager.getConnection(DriverManager.java:140)
        at org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:126)
        at org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:119)
        at org.apache.solr.handler.dataimport.JdbcDataSource.getConnection(JdbcDataSource.java:325)
        at org.apache.solr.handler.dataimport.JdbcDataSource.access$200(JdbcDataSource.java:37)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:201)
 



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Grant Ingersoll


On Apr 2, 2009, at 9:23 AM, Fergus McMenemie wrote:


Grant,




I should note, however, that the speed difference you are seeing may
not be as pronounced as it appears.  If I recall during ApacheCon, I
commented on how long it takes to shutdown your Solr instance when
exiting it.  That time it takes is in fact Solr doing the work that
was put off by not committing earlier and having all those deletes
pile up.


I am confused about "work that was put off" vs committing. My script
was doing a commit right after the CSV import, and you are right
about the massive times required to shut Tomcat down. But in my tests
the time taken to do the commit was under a second, yet I had to allow
300secs for Tomcat shutdown. Also, I don't have any duplicates. So
what sort of work was being done at shutdown that was not being done
by a commit? Optimise!



The work being done is addressing the deletes, AIUI, but of course  
there are other things happening during shutdown, too.


How long is the shutdown if you do a commit first and then a shutdown?

At any rate, I don't know that there is a satisfying answer to the
larger issue, due to things like the fsync stuff, which is an
overall win for Lucene/Solr despite being slower.  Have you
tried running the tests on other machines (non-Mac)?


Re: spectrum of Lucene queries in solr?

2009-04-02 Thread Paul Libbrecht

Sorry, I just realized I can use SolrIndexSearcher.search(Query, Hit)...

That was my question, basically.

paul


Le 02-avr.-09 à 03:31, Erik Hatcher a écrit :


Paul,

I'm not sure I understand what you're looking for exactly.  Solr  
supports Lucene's QueryParser by default for /select?q=... so you  
get the breadth of what it supports including boolean, prefix,  
fuzzy, and more.  QueryParser has never supported span queries  
though.  There is also a dismax parser available (&defType=dismax to  
enable it), and numerous other parser plugins.  Queries with Solr  
aren't created from the client as a Query object, but rather some  
string parameters come from the client that are then used to build a  
Query on the server side.


You can also add your own QParserPlugin to build custom Lucene Query  
objects however you like.


Erik


On Apr 1, 2009, at 6:34 PM, Paul Libbrecht wrote:
I am surprised not to find any equivalent to the classical Lucene  
queries in Solr... I must have badly looked...
E.g. where can I get a BooleanQuery, a PrefixQuery, a FuzzyQuery,  
or even a few spanqueries?


thanks in advance

paul








Re: Error in Importing from Oracle

2009-04-02 Thread Shalin Shekhar Mangar
On Thu, Apr 2, 2009 at 6:57 PM, Radha C.  wrote:
> Hello List,
>
> I am trying to do full import from remote oracle server. I am getting the
> below error,
>
> Can anyone please help me what configuration I am missing?  Thanks in
> advance.
>
> Apr 2, 2009 6:46:39 PM org.apache.solr.handler.dataimport.DataImporter
> doFullImport
> SEVERE: Full Import failed
> org.apache.solr.handler.dataimport.DataImportHandlerException:
> java.lang.UnsatisfiedLinkError: no ocijdbc10 in java.library.path
>

It looks like your Oracle JDBC driver depends on some native code
libraries which are missing in your environment. I suggest that you
look in the documentation of the JDBC driver. However, I'm quite
certain that there exists a pure Java JDBC driver for Oracle too.
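
The "no ocijdbc10" error means the OCI (native) driver is being used; the
thin driver is pure Java. A DIH dataSource entry using it would look
something like this (host, port, SID and credentials are illustrative):

<dataSource type="JdbcDataSource"
            driver="oracle.jdbc.driver.OracleDriver"
            url="jdbc:oracle:thin:@dbhost:1521:ORCL"
            user="scott" password="tiger"/>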

-- 
Regards,
Shalin Shekhar Mangar.


Re: How do I combine WhitespaceTokenizerFactory and EdgeNGramTokenizerFactory?

2009-04-02 Thread Otis Gospodnetic

Hi,

Try defining 2 separate fields and using copyField.
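
A rough sketch, with made-up field and type names (gram sizes to taste);
note that to hit mid-word fragments like "Kum" you need NGram rather than
EdgeNGram. Query the whitespace field for whole words and the n-gram field
for fragments:

<fieldType name="text_ws" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<field name="name" type="text_ws" indexed="true" stored="true"/>
<field name="name_ngram" type="text_ngram" indexed="true" stored="false"/>
<copyField source="name" dest="name_ngram"/>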


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Praveen Kumar Jayaram 
> To: solr-user@lucene.apache.org
> Sent: Thursday, April 2, 2009 2:02:55 AM
> Subject: How do I combine WhitespaceTokenizerFactory and 
> EdgeNGramTokenizerFactory?
> 
> 
> Hi folks,
> 
> I am trying to use features of WhitespaceTokenizerFactory and
> EdgeNGramTokenizerFactory.
> My requirement is to search a String using substring and inword concepts.
> 
> For example:
> Let us assume the String as "Praveen  Kumar Jayaram".
> The query for "Kumar" will be successful if we use
> WhitespaceTokenizerFactory.
> And the query for "Pra" will be successful if we use
> EdgeNGramTokenizerFactory.
> 
> Now I need to search with the query string "Kum". I tried using these two
> tokenizer factories in parallel but got an error saying we can use only one
> tokenizer per field.
> Could you please help in achieving this requirement??
> 
> Thanks in advance.
> 
> Regards,
> Praveen
> 
> -
> Regards,
> Praveen
> -- 
> View this message in context: 
> http://www.nabble.com/How-do-I-combine-WhitespaceTokenizerFactory-and-EdgeNGramTokenizerFactory--tp22841455p22841455.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Search on all fields and know in which field was the match

2009-04-02 Thread Rui Carneiro
Hi all,

I am brand new to Solr and I have a pretty big problem to solve.

I need to index some e-mails (from Dovecot) and their attachments. The
index structure I was thinking of is:

Note: Only the last 2 fields are relevant to this problem.

[the field definitions were stripped by the list archive; the relevant
ones are the dynamic attachBody_* fields]

With this structure I think (correct me if I am wrong) I can't search across
all attachBody_* fields and know where the match was (attachBody_1, _2, _3, etc.).

I really don't know if this is the best approach so any help would be
appreciated.

Regards,
Rui Carneiro


Unexpected search results

2009-04-02 Thread muness

When searching through Solr, a user noticed some unintuitive results for a query.
I am having trouble figuring out whether this is the expected result or not.

Here's an example:
 * Using the query "otto" returns 10 results
 * Using the query "otto added" returns 10 results

The weird part is that the results returned for "otto added" are not the
same set of results returned for "otto".  Instead they are a completely
different set of results.

For reference, the URLs being used for the queries are:
 * http://localhost:8982/solr/select/?q=otto
 * http://localhost:8982/solr/select/?q=otto+added

Shouldn't all the results being returned for "otto added" also be returned
for the more general query "otto"?  Or is this the expected behavior?  If
so, why?

Thanks!
Muness



RE: Unexpected search results

2009-04-02 Thread Vauthrin, Laurent
Are the queries only returning 10 results? (check numFound in the result
element -> <result name="response" numFound="..." ...>)

By default, I believe Solr will only return the first ten results it
finds, which may explain the results.

-Original Message-
From:
solr-user-return-20409-laurent.vauthrin=disney@lucene.apache.org
[mailto:solr-user-return-20409-laurent.vauthrin=disney@lucene.apache
.org] On Behalf Of muness
Sent: Thursday, April 02, 2009 11:11 AM
To: solr-user@lucene.apache.org
Subject: Unexpected search results


When searching through Solr, a user noticed some unintuitive results for a
query.
I am having trouble figuring out whether this is the expected result or not.

Here's an example:
 * Using the query "otto" returns 10 results
 * Using the query "otto added" returns 10 results

The weird part is that the results returned for "otto added" are not the
same set of results returned for "otto".  Instead they are a completely
different set of results.

For reference, the URLs being used for the queries are:
 * http://localhost:8982/solr/select/?q=otto
 * http://localhost:8982/solr/select/?q=otto+added

Shouldn't all the results being returned for "otto added" also be returned
for the more general query "otto"?  Or is this the expected behavior? If
so, why?

Thanks!
Muness
-- 
View this message in context:
http://www.nabble.com/Unexpected-search-results-tp22853940p22853940.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Slow startup of solr-nightly

2009-04-02 Thread Yonik Seeley
On Thu, Apr 2, 2009 at 6:56 AM, Jarek Zgoda  wrote:
> I'm testing solr-nightly (2009-04-02) and I noticed very long instance
> startup time compared to 1.3 (4 minutes to 20 seconds). The machine is
> pretty common 2 core AMD64 with 4GB RAM.
>
> Is this normal or should I be worried?

Definitely not normal unless you have static warming queries
configured or something.
Could you post the part of the logs that spans the long startup time?


-Yonik
http://www.lucidimagination.com


Facets drill down

2009-04-02 Thread revas
Hi,

I typically issue a facet drill-down query thus:

q=somequery AND Facetfield:facetval

Are there any issues with the above approach, as opposed to
&fq=facetfield:value, in terms of memory consumption and the use of the caches?
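
In URL form, the two alternatives I mean are something like:

/select?q=somequery+AND+Facetfield:facetval
/select?q=somequery&fq=Facetfield:facetval

(my understanding is that with the second form the filter is cached
separately in the filterCache and can be reused across queries)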

Regards
Suajatha


RE: Unexpected search results

2009-04-02 Thread muness

Laurent,

My bad, I should have used the exact query results I was getting, rather
than a made up example.  Here are those results:

$ curl http://localhost:8982/solr/select/?q=otto --silent | xmlstarlet fo | grep numFound
  <result name="response" numFound="..." start="0">
$ curl http://localhost:8982/solr/select/?q=otto%20added --silent | xmlstarlet fo | grep numFound
  <result name="response" numFound="..." start="0">

As you can see, the result set for "otto added" is actually bigger than that
for "otto".  Is that correct?

Muness

-


Vauthrin, Laurent wrote:
> 
> Are the queries only returning 10 results? (in the result element ->
> )
> 
> By default, I believe Solr will only return the first ten results it
> finds which may explain the results.
> 
> -Original Message-
> From:
> solr-user-return-20409-laurent.vauthrin=disney@lucene.apache.org
> [mailto:solr-user-return-20409-laurent.vauthrin=disney@lucene.apache
> .org] On Behalf Of muness
> Sent: Thursday, April 02, 2009 11:11 AM
> To: solr-user@lucene.apache.org
> Subject: Unexpected search results
> 
> 
> When search through solr, a user noticed some unintuitive results to a
> query. 
> I am having trouble figuring out if this is the expected result or not.
> 
> Here's an example:
>  * Using the query "otto" returns 10 results
>  * Using the query "otto added" returns 10 results
> 
> The weird part is that the results returned for "otto added" are not a
> the
> same set of results returned for "otto".  Instead they are a completely
> different set of results.
> 
> For reference, the URLs being used for the queries is:
>  * http://localhost:8982/solr/select/?q=otto
>  * http://localhost:8982/solr/select/?q=otto+added
> 
> Shouldn't all the results being returned for "otto added" also be
> returned
> for the more general query "otto"?  Or is this the expected behavior?
> If
> so, why?
> 
> Thanks!
> Muness
> -- 
> View this message in context:
> http://www.nabble.com/Unexpected-search-results-tp22853940p22853940.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 




Re: Unexpected search results

2009-04-02 Thread Erick Erickson
Sure, given that the default operator is OR, you'd expect it to widen as you
add more terms.
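
If you want the narrowing behaviour instead, require both terms, e.g.:

http://localhost:8982/solr/select/?q=otto+AND+added

or change the default operator in schema.xml (the solrQueryParser
defaultOperator attribute).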

On Thu, Apr 2, 2009 at 3:57 PM, muness  wrote:

>
> Laurent,
>
> My bad, I should have used the exact query results I was getting, rather
> than a made up example.  Here are those results:
>
> $ curl http://localhost:8982/solr/select/?q=otto --silent | xmlstarlet fo
> |
> grep numFound
>  
> $ curl http://localhost:8982/solr/select/?q=otto%20added --silent |
> xmlstarlet fo | grep numFound
>  
>
> As you can see, the result set for "otto added" is actually bigger than
> that
> for "otto".  Is that correct?
>
> Muness
>
> -
>
>
> Vauthrin, Laurent wrote:
> >
> > Are the queries only returning 10 results? (in the result element ->
> > )
> >
> > By default, I believe Solr will only return the first ten results it
> > finds which may explain the results.
> >
> > -Original Message-
> > From:
> > solr-user-return-20409-laurent.vauthrin=disney@lucene.apache.org
> > [mailto:solr-user-return-20409-laurent.vauthrin=disney@lucene.apache
> > .org] On Behalf Of muness
> > Sent: Thursday, April 02, 2009 11:11 AM
> > To: solr-user@lucene.apache.org
> > Subject: Unexpected search results
> >
> >
> > When search through solr, a user noticed some unintuitive results to a
> > query.
> > I am having trouble figuring out if this is the expected result or not.
> >
> > Here's an example:
> >  * Using the query "otto" returns 10 results
> >  * Using the query "otto added" returns 10 results
> >
> > The weird part is that the results returned for "otto added" are not the
> > same set of results returned for "otto".  Instead they are a completely
> > different set of results.
> >
> > For reference, the URLs being used for the queries is:
> >  * http://localhost:8982/solr/select/?q=otto
> >  * http://localhost:8982/solr/select/?q=otto+added
> >
> > Shouldn't all the results being returned for "otto added" also be
> > returned
> > for the more general query "otto"?  Or is this the expected behavior?
> > If
> > so, why?
> >
> > Thanks!
> > Muness
> > --
> > View this message in context:
> > http://www.nabble.com/Unexpected-search-results-tp22853940p22853940.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Unexpected-search-results-tp22853940p22855832.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


newSearcher doesn't fire

2009-04-02 Thread Kevin Osborn
I am trying to figure this out. I have a firstSearcher and a newSearcher event. 
They are almost identical. Upon startup, I see all the firstSearcher events in 
my log. I also see log events for Added SolrEventListener for both 
firstSearcher and newSearcher.

Next, I push out a new index. I see that my caches get regenerated and the new 
searcher is registered. However, I don't see my newSearcher events get fired. 
This is verified by the Solr cache hits as well. And the logs show no errors or 
even any messages at all regarding newSearcher.



  

Remote Access To Schema Data

2009-04-02 Thread Fink, Clayton R.
Hi:

I want to get a list of the fields and field types for an index deployed on a 
Solr server (over HTTP or embedded). I can't see any obvious way to do this as 
a client.

This is part of the use case for an app we are working on where all field 
information for an index is available and we can programmatically format 
updates and queries based on the available fields.

Thanks,

Clay Fink



Re: newSearcher doesn't fire

2009-04-02 Thread Yonik Seeley
I just tried with the standard example in solr trunk... seems like
it's still working fine when I issue a commit:

Apr 2, 2009 5:30:19 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to searc...@1702c48 main
Apr 2, 2009 5:30:19 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={rows=10&start=0&q=solr} hits=2 status=0 QTime=1
Apr 2, 2009 5:30:19 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={rows=10&start=0&q=rocks} hits=0 status=0 QTime=1
Apr 2, 2009 5:30:19 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={q=static+newSearcher+warming+query+from+solrconfig.xml} hits=0 status=0 QTime=1
Apr 2, 2009 5:30:19 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.


-Yonik
http://www.lucidimagination.com



On Thu, Apr 2, 2009 at 5:25 PM, Kevin Osborn  wrote:
> I am trying to figure this out. I have a firstSearcher and a newSearcher 
> event. They are almost identical. Upon startup, I see all the firstSearcher 
> events in my log. I also see log events for Added SolrEventListener for both 
> firstSearcher and newSearcher.
>
> Next, I push out a new index. I see that my caches get regenerated and the 
> new searcher is registered. However, I don't see my newSearcher events get 
> fired. This is verified by the Solr cache hits as well. And the logs show no 
> errors or even any messages at all regarding newSearcher.


Re: newSearcher doesn't fire

2009-04-02 Thread Kevin Osborn
Found the issue. This was old code that I finally got around to enabling. 
Somebody put a slash at the end of my XML element.
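
(The XML itself got stripped by the archive; the mistake was presumably along
these lines - the warming query shown is hypothetical:)

  <!-- broken: the trailing slash self-closes the listener, so the
       nested warming queries are never read -->
  <listener event="newSearcher" class="solr.QuerySenderListener"/>

  <!-- fixed -->
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">solr rocks</str></lst>
    </arr>
  </listener>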





From: Yonik Seeley 
To: solr-user@lucene.apache.org
Sent: Thursday, April 2, 2009 2:32:37 PM
Subject: Re: newSearcher doesn't fire

I just tried with the standard example in solr trunk... seems like
it's still working fine when I issue a commit:

Apr 2, 2009 5:30:19 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to searc...@1702c48 main
Apr 2, 2009 5:30:19 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={rows=10&start=0&q=solr} hits=2 status=0 QTime=1
Apr 2, 2009 5:30:19 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={rows=10&start=0&q=rocks} hits=0 status=0 QTime=1
Apr 2, 2009 5:30:19 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={q=static+newSearcher+warming+query+from+solrconfig.xml} hits=0 status=0 QTime=1
Apr 2, 2009 5:30:19 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.


-Yonik
http://www.lucidimagination.com



On Thu, Apr 2, 2009 at 5:25 PM, Kevin Osborn  wrote:
> I am trying to figure this out. I have a firstSearcher and a newSearcher 
> event. They are almost identical. Upon startup, I see all the firstSearcher 
> events in my log. I also see log events for Added SolrEventListener for both 
> firstSearcher and newSearcher.
>
> Next, I push out a new index. I see that my caches get regenerated and the 
> new searcher is registered. However, I don't see my newSearcher events get 
> fired. This is verified by the Solr cache hits as well. And the logs show no 
> errors or even any messages at all regarding newSearcher.



  

Re: Runtime exception when adding documents using solrj

2009-04-02 Thread vivek sar
Hello Shalin,

  Looks like I was using an old version of solrconfig.xml (from Solr
1.2). After I updated to the latest solrconfig.xml (from 1.4) it seems
to be working fine.

Another question: how would I search across multiple cores?

 1) If I want to search for a word in two different cores?
 2) If I want to search for a word in all the cores?
 3) How would I search on multiple cores on multiple machines?

On a single core I'm able to search like:

http://localhost:8080/solr/20090402/select?q=*:*

Thanks,
-vivek

--
Just in case this might be helpful to others who might be trying to
use Solr multicore.

  Here is what I tried.

1) Created this directory structure -

   multicore/core0 (put the conf directory - with schema.xml
and solrconfig.xml - under core0)
   multicore/core1

 Make multicore the solr.home and put solr.xml under there

2) Added a couple of cores in solr.xml,

 <solr persistent="true">
  <cores adminPath="/admin/cores">
   <core name="core0" instanceDir="core0"/>
   <core name="core1" instanceDir="core0"/>
  </cores>
 </solr>

Here core1 is using the instanceDir of core0 (i.e., the same schema.xml and
solrconfig.xml).

3) Started Solr

4) A data/index directory is created under both cores

5) Tried the following URLs,

a) http://localhost:8080/solr/admin/cores
 - admin interface for both cores
b) http://localhost:8080/solr/core0/admin/
 - I see the single core admin page
c) http://localhost:8080/solr/admin/cores?action=STATUS
- same as a
d) http://localhost:8080/solr/admin/cores?action=STATUS&core=core0
   - same as b
e) http://localhost:8080/solr/core0/select?q=*:*
   - shows result xml

6) I then created a core dynamically using the CREATE service (this
requires Solr 1.4),

   
http://localhost:8080/solr/admin/cores?action=CREATE&name=20090402&instanceDir=/Users/opal/temp/chat/solr/multicore/core0&dataDir=/Users/opal/temp/chat/solr/multicore/20090402/data

  - this dynamically updated the solr.xml and created a directory
structure (20090402/data) on the file system.

7) Then used solrj to add beans to the newly created core







On Wed, Apr 1, 2009 at 8:26 PM, Shalin Shekhar Mangar
 wrote:
> On Thu, Apr 2, 2009 at 2:34 AM, vivek sar  wrote:
>> Thanks Shalin.
>>
>> I added that in the solrconfig.xml, but now I get this exception,
>>
>> org.apache.solr.common.SolrException: Not Found
>> Not Found
>> request: http://localhost:8080/solr/core0/update?wt=javabin&version=2.2
>>
>> I do have the "core0" under the solr.home. The core0 directory also
>> contains the conf and data directories. The solr.xml has the following in
>> it,
>>
>> <cores adminPath="/admin/cores">
>>      <core name="core0" instanceDir="core0" />
>> </cores>
>>
>>
>
> Are you able to see the Solr admin dashboard at
> http://localhost:8080/solr/core0/admin/ ? Are there any exceptions in
> Solr log?
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: java.lang.ClassCastException: java.lang.Long using Solrj

2009-04-02 Thread vivek sar
Thanks Noble. That helped - turned out there was a field name mismatch in my bean.
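
(For reference, switching solrj to the XML parser looks roughly like
this - a sketch, with the server URL assumed:)

  CommonsHttpSolrServer server =
      new CommonsHttpSolrServer("http://localhost:8080/solr/core0");
  // use the XML response parser instead of the default binary one, so an
  // error response comes back readable rather than as a ClassCastException
  server.setParser(new XMLResponseParser());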

2009/4/1 Noble Paul നോബിള്‍  नोब्ळ् :
> The ClassCastException is misleading. It happens because the response
> itself was an error response.
>
> debug it by setting the XmlResponseParser
> http://wiki.apache.org/solr/Solrj#head-12c26b2d7806432c88b26cf66e236e9bd6e91849
>
> On Thu, Apr 2, 2009 at 4:21 AM, vivek sar  wrote:
>> Hi,
>>
>>  I'm using solrj (released v 1.3) to add my POJO objects
>> (server.addbeans(...)), but I'm getting this exception,
>>
>> java.lang.ClassCastException: java.lang.Long
>>        at 
>> org.apache.solr.common.util.NamedListCodec.unmarshal(NamedListCodec.java:89)
>>        at 
>> org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
>>        at 
>> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:385)
>>        at 
>> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
>>        at 
>> org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
>>        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
>>        at 
>> org.apache.solr.client.solrj.SolrServer.addBeans(SolrServer.java:57)
>>
>> I don't have any "Long" member variable in my java object - so not
>> sure where is this coming from. I've checked the schema.xml to make
>> sure the data types are ok. I'm adding 15K objects at a time - I'm
>> assuming that should be ok.
>>
>> Any ideas?
>>
>> Thanks,
>> -vivek
>>
>
>
>
> --
> --Noble Paul
>


Re: Remote Access To Schema Data

2009-04-02 Thread Jeff Newburn
The fastest way I know of to get the schema is the Luke request handler:
http://localhost/solr/admin/luke
It returns XML and has tons of info you probably aren't interested in.
However, it does contain information like fields and types.
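
For example, restricting the output to schema information (assuming the
default example port):

  curl "http://localhost:8983/solr/admin/luke?show=schema"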

-- 
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562


> From: "Fink, Clayton R." 
> Reply-To: 
> Date: Thu, 2 Apr 2009 17:29:38 -0400
> To: "solr-user@lucene.apache.org" 
> Subject: Remote Access To Schema Data
> 
> Hi:
> 
> I want to get a list of the fields and field types for an index deployed on a
> Solr server (over HTTP or embedded). I can't see any obvious way to do this as
> a client.
> 
> This is part of the use case for an app we are working on where all field
> information for an index is available and we can programmatically format
> updates and queries based on the available fields.
> 
> Thanks,
> 
> Clay Fink
> 



Oracle Clob column with DIH does not turn to String

2009-04-02 Thread ashokc

Hi,

I have set up to import some oracle clob columns with DIH. I am using the
latest nightly release. My config says,

<entity name="..." transformer="ClobTransformer" query="...">
    <field column="description" clob="true" />
</entity>

But it does not seem to turn this clob into a String. The search results
show:

<doc>
   <float name="score">1.8670129</float>
   <str name="description">oracle.sql.c...@aed3a5</str>
   <int name="...">4486</int>
</doc>

Any pointers on why I do not get the 'string' out of the clob for indexing?
Is the nightly war NOT the right one to use?

Thanks for your help.

- ashok


-- 
View this message in context: 
http://www.nabble.com/Oracle-Clob-column-with-DIH-does-not-turn-to-String-tp22859837p22859837.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Oracle Clob column with DIH does not turn to String

2009-04-02 Thread ashokc

Correcting my earlier post. It lost some lines somehow.

Hi,

I have set up to import some oracle clob columns with DIH. I am using the
latest nightly release. My config says,

<entity name="..." transformer="ClobTransformer" query="...">
    <field column="description" clob="true" />
</entity>

But it does not seem to turn this clob into a String. The search results
show:

<doc>
   <float name="score">1.8670129</float>
   <str name="description">oracle.sql.c...@aed3a5</str>
   <int name="...">4486</int>
</doc>

Any pointers on why I do not get the 'string' out of the clob for indexing?
Is the nightly war NOT the right one to use?

Thanks for your help.

- ashok



ashokc wrote:
> 
> Hi,
> 
> I have set up to import some oracle clob columns with DIH. I am using the
> latest nightly release. My config says,
> 
>  column="description" clob="true" />
> 
> 
> 
> 
> But it does not seem to turn this clob into a String. The search results
> show:
> 
> 
>1.8670129
> oracle.sql.c...@aed3a5
>4486
> 
> 
> Any pointers on why I do not get the 'string' out of the clob for
> indexing? Is the nightly war NOT the right one to use?
> 
> Thanks for your help.
> 
> - ashok
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Oracle-Clob-column-with-DIH-does-not-turn-to-String-tp22859837p22859865.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Error in Importing from Oracle

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
put your native dll/so file in the LD_LIBRARY_PATH and start Solr
with that. Or the best solution is to use a pure Java driver.
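
(For reference, the difference shows up in the JDBC URL - host and
service names here are made up:)

  # OCI driver: needs the native Oracle client libs on LD_LIBRARY_PATH
  jdbc:oracle:oci:@//dbhost:1521/orcl
  # Thin driver: pure Java, no native libraries required
  jdbc:oracle:thin:@//dbhost:1521/orcl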

On Thu, Apr 2, 2009 at 8:13 PM, Shalin Shekhar Mangar
 wrote:
> On Thu, Apr 2, 2009 at 6:57 PM, Radha C.  wrote:
>> Hello List,
>>
>> I am trying to do full import from remote oracle server. I am getting the
>> below error,
>>
>> Can anyone please help me what configuration I am missing?  Thanks in
>> advance.
>>
>> Apr 2, 2009 6:46:39 PM org.apache.solr.handler.dataimport.DataImporter
>> doFullImport
>> SEVERE: Full Import failed
>> org.apache.solr.handler.dataimport.DataImportHandlerException:
>> java.lang.UnsatisfiedLinkError: no ocijdbc10 in java.library.path
>>
>
> It looks like your Oracle jdbc driver depends on some native code
> libraries which are missing in your environment. I suggest that you
> look in the documentation of the jdbc driver. However, I'm quite
> certain that there exists a pure java jdbc driver for Oracle too.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
--Noble Paul


Re: Oracle Clob column with DIH does not turn to String

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
This looks strange. Apparently the Transformer did not get applied. Is
it possible for you to debug ClobTransformer? (Adding a System.out.println
into ClobTransformer may help.)
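
(A sketch of the kind of debug print meant here, inside DIH's
ClobTransformer - signature per the DIH Transformer API:)

  public Object transformRow(Map<String, Object> row, Context context) {
      // temporary debug output: confirms whether the transformer runs at all
      System.out.println("ClobTransformer called for columns: " + row.keySet());
      // ... existing clob-to-string logic ...
  }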

On Fri, Apr 3, 2009 at 6:04 AM, ashokc  wrote:
>
> Correcting my earlier post. It lost some lines some how.
>
> Hi,
>
> I have set up to import some oracle clob columns with DIH. I am using the
> latest nightly release. My config says,
>
>
>  ...
>
>    
>    
>
> 
>
> But it does not seem to turn this clob into a String. The search results
> show:
>
> 
>   1.8670129
>    oracle.sql.c...@aed3a5
>   4486
> 
>
> Any pointers on why I do not get the 'string' out of the clob for indexing?
> Is the nightly war NOT the right one to use?
>
> Thanks for your help.
>
> - ashok
>
>
>
> ashokc wrote:
>>
>> Hi,
>>
>> I have set up to import some oracle clob columns with DIH. I am using the
>> latest nightly release. My config says,
>>
>> > column="description" clob="true" />
>>     
>>
>> 
>>
>> But it does not seem to turn this clob into a String. The search results
>> show:
>>
>> 
>>    1.8670129
>>     oracle.sql.c...@aed3a5
>>    4486
>> 
>>
>> Any pointers on why I do not get the 'string' out of the clob for
>> indexing? Is the nightly war NOT the right one to use?
>>
>> Thanks for your help.
>>
>> - ashok
>>
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/Oracle-Clob-column-with-DIH-does-not-turn-to-String-tp22859837p22859865.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul


Re: Oracle Clob column with DIH does not turn to String

2009-04-02 Thread ashokc

That would require me to recompile (with ant/maven scripts?) the source and
replace the jar for DIH, right? I can try - for the first time.
- ashok

Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> This looks strange. Apparently the Transformer did not get applied. Is
> it possible for you to debug ClobTransformer adding(System.out.println
> into ClobTransformer may help)
> 
> On Fri, Apr 3, 2009 at 6:04 AM, ashokc  wrote:
>>
>> Correcting my earlier post. It lost some lines some how.
>>
>> Hi,
>>
>> I have set up to import some oracle clob columns with DIH. I am using the
>> latest nightly release. My config says,
>>
>>
>> > ...
>>
>>    
>>    
>>
>> 
>>
>> But it does not seem to turn this clob into a String. The search results
>> show:
>>
>> 
>>   1.8670129
>>    oracle.sql.c...@aed3a5
>>   4486
>> 
>>
>> Any pointers on why I do not get the 'string' out of the clob for
>> indexing?
>> Is the nightly war NOT the right one to use?
>>
>> Thanks for your help.
>>
>> - ashok
>>
>>
>>
>> ashokc wrote:
>>>
>>> Hi,
>>>
>>> I have set up to import some oracle clob columns with DIH. I am using
>>> the
>>> latest nightly release. My config says,
>>>
>>> >> column="description" clob="true" />
>>>     
>>>
>>> 
>>>
>>> But it does not seem to turn this clob into a String. The search results
>>> show:
>>>
>>> 
>>>    1.8670129
>>>     oracle.sql.c...@aed3a5
>>>    4486
>>> 
>>>
>>> Any pointers on why I do not get the 'string' out of the clob for
>>> indexing? Is the nightly war NOT the right one to use?
>>>
>>> Thanks for your help.
>>>
>>> - ashok
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Oracle-Clob-column-with-DIH-does-not-turn-to-String-tp22859837p22859865.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> --Noble Paul
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Oracle-Clob-column-with-DIH-does-not-turn-to-String-tp22859837p22861630.html
Sent from the Solr - User mailing list archive at Nabble.com.



crazy parentheses

2009-04-02 Thread Dean Missikowski (Consultant), CLSA
I've got a problem that's driving me crazy with parentheses.

 

I'm using a recent nightly Solr 1.4

 

My index includes these four docs:

 

doc #1 has title: "saints & sinners"

doc #2 has title: "(saints and sinners)"

doc #3 has title: "( saints & sinners )"

doc #4 has title: "(saints & sinners)"

 

when I try any of these searches:

  title:saints & sinners 

  title:"saints & sinners"

  title:saints and sinners

 

Only docs #1-3 are found, but shouldn't doc #4 match too?

 

The analyzer shows that the tokenizer and filters should find a match.  

I'm guessing this might be a bug in WordDelimiterFilterFactory?

 

I've worked around it by using a PatternReplaceFilterFactory to strip off
the parentheses.
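
(The filter config was stripped by the archive; a minimal sketch of that
workaround - the pattern is an assumption:)

  <filter class="solr.PatternReplaceFilterFactory"
          pattern="[()]" replacement="" replace="all"/>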



 

Any ideas?

 

Thanks, Dean

 

 

Index Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory {}

term position     1        2     3
term text         (saints  &     sinners)
term type         word     word  word
source start,end  0,7      8,9   10,18
payload

org.apache.solr.analysis.WordDelimiterFilterFactory {catenateWords=1,
catenateNumbers=1, catenateAll=0, generateNumberParts=1,
generateWordParts=1}

term position     1        3
term text         saints   sinners
term type         word     word
source start,end  1,7      10,17
payload

org.apache.solr.analysis.LowerCaseFilterFactory {}

term position     1        3
term text         saints   sinners
term type         word     word
source start,end  1,7      10,17
payload

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true}

term position     1        3
term text         saints   sinners
term type         word     word
source start,end  1,7      10,17
payload

org.apache.solr.analysis.EnglishPorterFilterFactory
{protected=protwords.txt}

term position     1        3
term text         saint    sinner
term type         word     word
source start,end  1,7      10,17
payload

org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position     1        3
term text         saint    sinner
term type         word     word
source start,end  1,7      10,17
payload

Query Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory {}

term position     1        2     3
term text         saints   &     sinners
term type         word     word  word
source start,end  0,6      7,8   9,16
payload

org.apache.solr.analysis.WordDelimiterFilterFactory {catenateWords=1,
catenateNumbers=1, catenateAll=0, generateNumberParts=1,
generateWordParts=1}

term position     1        2
term text         saints   sinners
term type         word     word
source start,end  0,6      9,16
payload

org.apache.solr.analysis.LowerCaseFilterFactory {}

term position     1        2
term text         saints   sinners
term type         word     word
source start,end  0,6      9,16
payload

org.apache.solr.analysis.SynonymFilterFactory {expand=true,
ignoreCase=true, synonyms=synonyms.txt}

term position     1        2
term text         saints   sinners
term type         word     word
source start,end  0,6      9,16
payload

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true}

term position     1        2
term text         saints   sinners
term type         word     word
source start,end  0,6      9,16
payload

org.apache.solr.analysis.EnglishPorterFilterFactory
{protected=protwords.txt}

term position     1        2
term text         saint    sinner
term type         word     word
source start,end  0,6      9,16
payload

org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position     1        2
term text         saint    sinner
term type         word     word
source start,end  0,6      9,16
payload


CLSA CLEAN & GREEN: Please consider our environment before printing this email.
The content of this communication is subject to CLSA Legal and Regulatory 
Notices. 
These can be viewed at https://www.clsa.com/disclaimer.html or sent to you upon 
request.




Re: Oracle Clob column with DIH does not turn to String

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
Yeah, ant dist will give you the .war file you need. Just drop it
in and you are set to go. Or, if you can hook up a debugger to a
running Solr, that is the easiest.
--Noble

On Fri, Apr 3, 2009 at 9:35 AM, ashokc  wrote:
>
> That would require me to recompile (with ant/maven scripts?) the source and
> replace the jar for DIH, right? I can try - for the first time.
> - ashok
>
> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>
>> This looks strange. Apparently the Transformer did not get applied. Is
>> it possible for you to debug ClobTransformer adding(System.out.println
>> into ClobTransformer may help)
>>
>> On Fri, Apr 3, 2009 at 6:04 AM, ashokc  wrote:
>>>
>>> Correcting my earlier post. It lost some lines some how.
>>>
>>> Hi,
>>>
>>> I have set up to import some oracle clob columns with DIH. I am using the
>>> latest nightly release. My config says,
>>>
>>>
>>> >> ...
>>>
>>>    
>>>    
>>>
>>> 
>>>
>>> But it does not seem to turn this clob into a String. The search results
>>> show:
>>>
>>> 
>>>   1.8670129
>>>    oracle.sql.c...@aed3a5
>>>   4486
>>> 
>>>
>>> Any pointers on why I do not get the 'string' out of the clob for
>>> indexing?
>>> Is the nightly war NOT the right one to use?
>>>
>>> Thanks for your help.
>>>
>>> - ashok
>>>
>>>
>>>
>>> ashokc wrote:

 Hi,

 I have set up to import some oracle clob columns with DIH. I am using
 the
 latest nightly release. My config says,

 >>> column="description" clob="true" />
     

 

 But it does not seem to turn this clob into a String. The search results
 show:

 
    1.8670129
     oracle.sql.c...@aed3a5
    4486
 

 Any pointers on why I do not get the 'string' out of the clob for
 indexing? Is the nightly war NOT the right one to use?

 Thanks for your help.

 - ashok



>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Oracle-Clob-column-with-DIH-does-not-turn-to-String-tp22859837p22859865.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/Oracle-Clob-column-with-DIH-does-not-turn-to-String-tp22859837p22861630.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul


Re: Composite POJO support

2009-04-02 Thread Praveen Kumar Jayaram


Thanks for the reply, Noble Paul.
In my application I will be having multiple types of objects, and the number
of properties in each object will vary.
So I have made them FieldTypes and defined them in schema.xml as well.

How do I store the POJO without declaring it as a FieldType?
Solr needs to recognize the type, right?



Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> why is the POJO extending FieldType?
> it does not have to.
> 
> Composite types are not supported, because Solr cannot support that.
> But the field can be a List or array.
> 
> On Thu, Apr 2, 2009 at 5:00 PM, Praveen Kumar Jayaram
>  wrote:
>>
>> Could someone give suggestions for this issue?
>>
>>
>> Praveen Kumar Jayaram wrote:
>>>
>>> Hi
>>>
>>> I am trying to have a complex POJO type in Solr 1.3
>>> i.e Object inside object.
>>>
>>> Below is a sample Field created,
>>>
>>> public class TestType extends FieldType{
>>>     @Field
>>>     private String requestorID_s_i_s_nm;
>>>
>>>     @Field
>>>     private String partNumber;
>>>
>>>     @Field
>>>     private String requestorName_s_i_s_nm;
>>>
>>>     @Field
>>>     private InnerType innerType;
>>> }
>>>
>>> Where InnerType is another custom Java type.
>>>
>>> public class InnerType extends FieldType{
>>>       private String name_s_i_s_nm;
>>> }
>>>
>>>
>>> The schema configuration is as shown below,
>>>
>>> <types>
>>> <fieldtype name="..." class="..."
>>> sortMissingLast="true" omitNorms="true"/>
>>> <fieldtype name="..." class="..."
>>> sortMissingLast="true" omitNorms="true"/>
>>> </types>
>>>
>>> When I try to add a TestType POJO using the below code, I am getting an unknown
>>> field "innerType" error,
>>>
>>> String url = "http://localhost:8983/solr";;
>>> SolrServer server = new CommonsHttpSolrServer( url );
>>>
>>> InnerType inner = new InnerType();
>>> inner.setName_s_i_s_nm("Test");
>>>
>>> TestType praveen = new TestType();
>>> praveen.setPartNumber("01-0001");
>>> praveen.setRequestorID_s_i_s_nm("");
>>> praveen.setRequestorName_s_i_s_nm("Praveen Kumar Jayaram");
>>> praveen.setInnerType(inner);
>>>
>>> server.addBean(praveen);
>>> UpdateRequest req = new UpdateRequest();
>>> req.setAction( UpdateRequest.ACTION.COMMIT, false, false );
>>> UpdateResponse res = req.process(server);
>>>
>>> Initially POJO was getting added when it was not composite POJO.
>>> After trying to have composite POJO things are not working.
>>> What is that I am doing wrong??
>>>
>>> Any help will be appreciated.
>>>
>>>
>>>
>>
>>
>> -
>> Regards,
>> Praveen
>> --
>> View this message in context:
>> http://www.nabble.com/Composite-POJO-support-tp22841854p22845799.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> --Noble Paul
> 
> 


-
Regards,
Praveen
-- 
View this message in context: 
http://www.nabble.com/Composite-POJO-support-tp22841854p22862433.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How do I combine WhitespaceTokenizerFactory and EdgeNGramTokenizerFactory?

2009-04-02 Thread Praveen Kumar Jayaram


Thanks for the suggestion, Otis.
This works.
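
(For anyone finding this later, a minimal sketch of the two-field
approach - field and type names are made up:)

  <fieldType name="text_edge" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15"/>
    </analyzer>
  </fieldType>

  <field name="name" type="text" indexed="true" stored="true"/>
  <field name="name_edge" type="text_edge" indexed="true" stored="false"/>
  <copyField source="name" dest="name_edge"/>

A query can then hit both fields, e.g. q=name:Kumar OR name_edge:Kum.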


Otis Gospodnetic wrote:
> 
> 
> Hi,
> 
> Try defining 2 separate fields and using copyField.
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: Praveen Kumar Jayaram 
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, April 2, 2009 2:02:55 AM
>> Subject: How do I combine WhitespaceTokenizerFactory and
>> EdgeNGramTokenizerFactory?
>> 
>> 
>> Hi folks,
>> 
>> I am trying to use features of WhitespaceTokenizerFactory and
>> EdgeNGramTokenizerFactory.
>> My requirement is to search a String using substring and inword concepts.
>> 
>> For example:
>> Let us assume the String as "Praveen  Kumar Jayaram".
>> The query for "Kumar" will be successful if we use
>> WhitespaceTokenizerFactory.
>> And the query for "Pra" will be successful if we use
>> EdgeNGramTokenizerFactory.
>> 
>> Now I need to search with query string "Kum". I tried using these two
>> tokenizer factories in parallel but got an error saying we can use only one
>> tokenizer per field.
>> Could you please help in achieving this requirement??
>> 
>> Thanks in advance.
>> 
>> Regards,
>> Praveen
>> 
>> -
>> Regards,
>> Praveen
>> -- 
>> View this message in context: 
>> http://www.nabble.com/How-do-I-combine-WhitespaceTokenizerFactory-and-EdgeNGramTokenizerFactory--tp22841455p22841455.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 


-
Regards,
Praveen
-- 
View this message in context: 
http://www.nabble.com/How-do-I-combine-WhitespaceTokenizerFactory-and-EdgeNGramTokenizerFactory--tp22841455p22862468.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Composite POJO support

2009-04-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Apr 3, 2009 at 11:28 AM, Praveen Kumar Jayaram
 wrote:
>
>
> Thanks for the reply Noble Paul.
> In my application I will be having multiple types of object and the number
> of properties in each object will vary.
> So I have made them as FieldType and defined in schema.xml also
The POJO is a client side Object. It is converted to xml before
POSTING the data to Solr. The client side code is totally agnostic of
the field type in Solr schema.


>
> How do I store  the POJO without declaring it as a FieldType?
> Solr needs to recognize a type right?

What you can do is write your field value as one String, and let the
FieldType in Solr parse it and create the appropriate data structure.
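
(A sketch of that flattening on the client side - class and field names
are hypothetical; @Field is org.apache.solr.client.solrj.beans.Field:)

  public class TestType {
      @Field
      private String partNumber;

      // the inner object flattened into a single String field; a custom
      // FieldType on the Solr side can parse it back into a structure
      @Field
      private String innerType_s;

      public void setInner(InnerType inner) {
          this.innerType_s = inner.getName_s_i_s_nm();
      }
  }
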
>
>
>
> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>
>> why is the POJO extending FieldType?
>> it does not have to.
>>
>> Composite types are not supported, because Solr cannot support that.
>> But the field can be a List or array.
>>
>> On Thu, Apr 2, 2009 at 5:00 PM, Praveen Kumar Jayaram
>>  wrote:
>>>
>>> Could someone give suggestions for this issue?
>>>
>>>
>>> Praveen Kumar Jayaram wrote:

 Hi

 I am trying to have a complex POJO type in Solr 1.3
 i.e Object inside object.

 Below is a sample Field created,

 public class TestType extends FieldType{
     @Field
     private String requestorID_s_i_s_nm;

     @Field
     private String partNumber;

     @Field
     private String requestorName_s_i_s_nm;

     @Field
     private InnerType innerType;
 }

 Where InnerType is another custom Java type.

 public class InnerType extends FieldType{
       private String name_s_i_s_nm;
 }


 The schema configuration is as shown below,

 
 >>> sortMissingLast="true" omitNorms="true"/>
 >>> sortMissingLast="true" omitNorms="true"/>
 
 

 When I try to add a TestType POJO using the below code, I am getting an unknown
 field "innerType" error,

 String url = "http://localhost:8983/solr";;
 SolrServer server = new CommonsHttpSolrServer( url );

 InnerType inner = new InnerType();
 inner.setName_s_i_s_nm("Test");

 TestType praveen = new TestType();
 praveen.setPartNumber("01-0001");
 praveen.setRequestorID_s_i_s_nm("");
 praveen.setRequestorName_s_i_s_nm("Praveen Kumar Jayaram");
 praveen.setInnerType(inner);

 server.addBean(praveen);
 UpdateRequest req = new UpdateRequest();
 req.setAction( UpdateRequest.ACTION.COMMIT, false, false );
 UpdateResponse res = req.process(server);

 Initially POJO was getting added when it was not composite POJO.
 After trying to have composite POJO things are not working.
 What is that I am doing wrong??

 Any help will be appreciated.



>>>
>>>
>>> -
>>> Regards,
>>> Praveen
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Composite-POJO-support-tp22841854p22845799.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>>
>
>
> -
> Regards,
> Praveen
> --
> View this message in context: 
> http://www.nabble.com/Composite-POJO-support-tp22841854p22862433.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul


RE: Error in Importing from Oracle

2009-04-02 Thread Radha C.

Thanks guys,

I used the pure Java driver. That works.

-Original Message-
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.p...@gmail.com] 
Sent: Friday, April 03, 2009 8:59 AM
To: solr-user@lucene.apache.org
Subject: Re: Error in Importing from Oracle

put your native dll/so file in the LD_LIBRARY_PATH and start Solr with that.
Or the best solution is to use a pure Java driver.

On Thu, Apr 2, 2009 at 8:13 PM, Shalin Shekhar Mangar  
wrote:
> On Thu, Apr 2, 2009 at 6:57 PM, Radha C.  wrote:
>> Hello List,
>>
>> I am trying to do full import from remote oracle server. I am getting 
>> the below error,
>>
>> Can anyone please help me what configuration I am missing?  Thanks in 
>> advance.
>>
>> Apr 2, 2009 6:46:39 PM 
>> org.apache.solr.handler.dataimport.DataImporter
>> doFullImport
>> SEVERE: Full Import failed
>> org.apache.solr.handler.dataimport.DataImportHandlerException:
>> java.lang.UnsatisfiedLinkError: no ocijdbc10 in java.library.path
>>
>
> It looks like your Oracle jdbc driver depends on some native code 
> libraries which are missing in your environment. I suggest that you 
> look in the documentation of the jdbc driver. However, I'm quite 
> certain that there exists a pure java jdbc driver for Oracle too.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



--
--Noble Paul