RE: Solr vs. Compass

2010-01-25 Thread Funtick
Minutello, Nick wrote: > > Maybe spend some time playing with Compass rather than speculating ;) > I spent a few weeks studying the Compass source code three years ago, and the Compass docs said the same then as they do now: "Compass::Core provides support for two phase commits transa

Re: SOLR - extremely strange behavior! Documents disappeared...

2009-08-17 Thread Funtick
sorry for typo in prev msg, Increase = 2,297,231 - 1,786,552 = 500,000 (average) RATE (non-unique-id:unique-id) = 7,000,000 : 500,000 = 14:1 but 125:1 (initial 30 hours) was very strange... Funtick wrote: > > UPDATE: > > After few more minutes (after previous commit): &

Re: SOLR - extremely strange behavior! Documents disappeared...

2009-08-17 Thread Funtick
bug somewhere... need to investigate ramBufferSize and MergePolicy, including SOLR uniqueId implementation... Funtick wrote: > > After running an application which heavily uses MD5 HEX-representation as > for SOLR v.1.4-dev-trunk: > > 1. After 30 hours: > 101,000,000 docu

Re: JVM Heap utilization & Memory leaks with Solr

2009-08-17 Thread Funtick
BTW, you should really prefer JRockit, which really rocks!!! "Mission Control" has the necessary tooling, and JRockit produces a _nice_ exception stack trace (explaining almost everything) even in case of OOM, which the Sun JVM still fails to produce. SolrServlet still catches "Throwable": } catch (Th

Re: SOLR - extremely strange behavior! Documents disappeared...

2009-08-17 Thread Funtick
: 1,786,552 Same random docs retrieved from web... Funtick wrote: > > > But how to explain that within an hour (after commit) I have had about > 500,000 new documents, and within 30 hours (after commit) only 783,714? > > Same _random_enough_ documents... > > BTW, SOLR Cons

Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-17 Thread Funtick
It is NOT sample war, it is SOLR application: solr.war - it should be!!! I usually build from source and use dist/apache-solr-1.3.war instead, so I am not sure about solr.war solr.xml contains configuration for multicore; most probably something is wrong with it. Would be better if you tr

Re: SOLR - extremely strange behavior! Documents disappeared...

2009-08-17 Thread Funtick
nds on which > segments have been merged). > > So if you add a ton of documents over time, many with the same ids, you > would likely see this type of maxDoc, numDoc churn. maxDoc will include > deleted docs while numDoc will not. > > > -- > - Mark > > http://www.
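The maxDoc/numDocs churn described in the quoted reply can be illustrated with a toy model. This is only a sketch, not Solr or Lucene code: the `ToyIndex` class and its method names are hypothetical, and it assumes that re-adding an existing unique id marks the old copy as deleted until an optimize (merge) drops it.

```python
class ToyIndex:
    """Toy model of unique-id overwrites (hypothetical, not Solr/Lucene code)."""

    def __init__(self):
        self.live = {}    # unique id -> current document
        self.deleted = 0  # overwritten copies not yet merged away

    def add(self, doc_id, doc):
        # Re-adding an existing id replaces the document; the old copy
        # is only marked deleted and still counts toward max_doc.
        if doc_id in self.live:
            self.deleted += 1
        self.live[doc_id] = doc

    def optimize(self):
        # Merging segments drops the deleted copies.
        self.deleted = 0

    @property
    def num_docs(self):
        return len(self.live)

    @property
    def max_doc(self):
        return len(self.live) + self.deleted


index = ToyIndex()
for i in range(10):
    index.add(f"id-{i % 3}", {"n": i})  # 10 adds, only 3 distinct ids

print(index.num_docs, index.max_doc)
index.optimize()
print(index.num_docs, index.max_doc)
```

In this model, many adds with few distinct ids give a small num_docs and a much larger max_doc until optimize runs, which is the pattern reported in this thread.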

Re: JVM Heap utilization & Memory leaks with Solr

2009-08-17 Thread Funtick
Can you tell me please how many non-tokenized single-valued fields your schema uses, and how many documents? Thanks, Fuad Rahul R wrote: > > My primary issue is not Out of Memory error at run time. It is memory > leaks: > heap space not being released after doing a force GC also. So after > som

SOLR - extremely strange behavior! Documents disappeared...

2009-08-17 Thread Funtick
After running an application which heavily uses MD5 HEX-representation as for SOLR v.1.4-dev-trunk: 1. After 30 hours: 101,000,000 documents added 2. Commit: numDocs = 783,714 maxDoc = 3,975,393 3. Upload new docs to SOLR during 1 hour(!!!), then commit, then optimize: numDocs=1,281,851

Contributions Needed: Faceting Performance, SOLR Caching

2008-10-19 Thread Funtick
Users & Developers & Possible Contributors, Hi, Recently I did some code hacks and I am using frequency calcs for TermVector instead of default out-of-the-box DocSet Intersections. It improves performance hundreds of times at shopping engine http://www.tokenizer.org - please check http://issue

Re: background merge hit exception

2008-08-24 Thread Funtick
I found an answer: not enough space in filesystem. Funtick wrote: > > Is it file-system error? I can commit and I can not optimize: > > Exception in thread "main" org.apache.solr.common.SolrException: > background merge hit exception: _ztu:C14604370 _105b
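A rough pre-flight check for this failure mode could look like the sketch below. The function names are hypothetical, and the `factor=2.0` default is an assumption (a merge of the whole index temporarily needs extra space on the order of the index size); adjust it for your setup.

```python
import os
import shutil


def index_size_bytes(index_dir):
    """Sum the sizes of all files under index_dir."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def has_room_to_optimize(index_dir, factor=2.0):
    """Return True if free space is at least factor * current index size.

    factor is an assumed safety margin, not a documented Solr setting.
    """
    free = shutil.disk_usage(index_dir).free
    return free >= factor * index_size_bytes(index_dir)
```

Calling `has_room_to_optimize("/path/to/solr/data/index")` (path is a placeholder) before an optimize gives an early warning instead of a failed background merge.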

background merge hit exception

2008-08-24 Thread Funtick
Is it file-system error? I can commit and I can not optimize: Exception in thread "main" org.apache.solr.common.SolrException: background merge hit exception: _ztu:C14604370 _105b:C1690769 _105l:C340280 _105w:C336330 _1068:C336025 _106j:C330206 _106u:C338541 _1075:C337713 _1080:C463455 into _1081

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
y Key for instance. 001 Attraction CN Tower 002 Hotel CN Tower 003 Hotel Sheraton 004 Restaurant CN Tower Funtick wrote: > > Simple design with _single_ valued fields: > > Id

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
a specific value from > a multivalued field given a set of criteria. Now, compare that with my > current design in which these criteria pinpoint a specific field / column > to use and the difference should be clear. > > regards, > Britske > > > Funtick wrote: >

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
But answer to initial question... I think your documents are huge... Funtick wrote: > > > > Britske wrote: >> >> I do understand that, at first glance, it seems possible to use >> multivalued fields, but with multivalued fields it's not possible to >

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
I do understand that, at first glance, it seems possible to use multivalued fields, but with multivalued fields it's not possible to pinpoint the exact value within the multivalued field that I need. I used a technique with a single document consisting of a single Category and multiple Products (m

Re: Facet Performance

2008-07-31 Thread Funtick
Hoss, This is still an extremely interesting area for possible improvements; I simply don't want the topic to die http://www.nabble.com/Facet-Performance-td7746964.html http://issues.apache.org/jira/browse/SOLR-665 http://issues.apache.org/jira/browse/SOLR-667 http://issues.apache.org/jira/browse/

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Funtick
Lucene instead of a database? Sorry if I don't understand your design... Britske wrote: > > > > Funtick wrote: >> >> >> Britske wrote: >>> >>> - Rows in solr represent productcategories. I will have up to 100k of >>> them. >

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Funtick
Funtick wrote: > > > Britske wrote: >> >> - Rows in solr represent productcategories. I will have up to 100k of >> them. >> - Each product category can have 10k products each. These are encoded as >> the 10k columns / fields (all 10k fields

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Funtick
Britske wrote: > > - Rows in solr represent productcategories. I will have up to 100k of > them. > - Each product category can have 10k products each. These are encoded as > the 10k columns / fields (all 10k fields are int values) > You are using multivalued fields, you are not using 10k fie

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Funtick
Britske wrote: > > When performing these queries I notice a big difference between qTime > (which is mostly in the 15-30 ms range due to caching) and total time > taken to return the response (measured through SolrJ's elapsedTime), which > takes between 500-1600 ms. > Documents have a lot of st

Re: Uprade lucene to 2.3

2008-04-29 Thread Funtick
Special things: - 2.3.1 fixes bugs with 'autocommit' of version 2.3.0 - I am having OutOfMemoryError constantly, I can't understand where the problem is yet... I didn't have it with default SOLR 1.2 installation. It's not memory-cache related, most probably it is a bug somewhere... Yongjun Rong

Re: Problems querying Russian content

2007-06-28 Thread funtick
I know Russian better than Russians ;) I currently use the default configuration for "dismax" provided by SOLR 1.1; I can add a few URLs to the crawler tonight to see what happens. As far as I know, Lucene/Nutch can even detect a web page's (pdf, txt, html) language by checking the raw byte array (raw HTTP Respon

Re: Problems querying Russian content

2007-06-28 Thread funtick
Hi Danier, Ensure that UTF-8 is everywhere... SOLR, WebServer, AppServer, HTTP Headers, etc. And do not use q=Бамбарбиа Киркуду use this instead (encoded URL): q=%D0%91%D0%B0%D0%BC%D0%B1%D0%B0%D1%80%D0%B1%D0%B8%D0%B0+%D0%9A%D0%B8%D1%80%D0%BA%D1%83%D0%B4%D1%83 http://www.tokenizer.org is
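The encoded form above can be produced programmatically. A minimal sketch using Python's standard library, assuming the client builds the query string itself (the `q=` prefix is just the parameter name from the message):

```python
from urllib.parse import quote_plus, unquote_plus

query = "Бамбарбиа Киркуду"

# Percent-encode the UTF-8 bytes of the query; spaces become "+".
encoded = quote_plus(query, encoding="utf-8")
print("q=" + encoded)

# Decoding with the same charset gives back the original text.
assert unquote_plus(encoded, encoding="utf-8") == query
```

If the server decodes the URL with a different charset than UTF-8, the round trip breaks, which is why the encoding has to match at every layer.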

Re: To make sure XML is UTF-8

2007-06-08 Thread funtick
Though this is not directly related to Solr, I have an XML output from a mysql database, but during indexing the XML output is not working. And the problem is part of the XML output is not in UTF-8 encoding, how can I convert it to UTF-8 and how do I know what kind of coding it uses in the first

Re: To make sure XML is UTF-8

2007-06-08 Thread Funtick
Tiong Jeffrey wrote: > > Though this is not directly related to Solr, I have an XML output from a > mysql database, but during indexing the XML output is not working. And the > problem is part of the XML output is not in UTF-8 encoding, how can I > convert it to UTF-8 and how do I know what ki
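One possible way to approach the conversion asked about here is sketched below. It is not taken from the thread: the `to_utf8` helper is hypothetical, and the `cp1252` fallback is an assumption about the source data (the true source encoding cannot be detected with certainty from the bytes alone).

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw re-encoded as UTF-8.

    Bytes that already decode as UTF-8 pass through unchanged; anything
    else is decoded with the assumed fallback encoding.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is in the fallback encoding.
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")


print(to_utf8("café".encode("cp1252")))
print(to_utf8("café".encode("utf-8")))
```

If the XML carries an `encoding="..."` declaration, it has to be updated to match after conversion, otherwise the parser will still reject the file.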