Minutello, Nick wrote:
>
> Maybe spend some time playing with Compass rather than speculating ;)
>
I spent a few weeks studying the Compass source code three years ago,
and the Compass docs (3 years ago) said the same thing as they do now:
"Compass::Core provides support for two phase commits transa
Sorry for the typo in the previous message.
Increase = 2,297,231 - 1,786,552 = 510,679 (roughly 500,000)
RATE (non-unique-id : unique-id) = 7,000,000 : 500,000 = 14:1
but 125:1 (over the initial 30 hours) was very strange...
Funtick wrote:
>
> UPDATE:
>
> After a few more minutes (after the previous commit):
bug somewhere... need to investigate ramBufferSize and MergePolicy,
including the SOLR uniqueId implementation...
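Not from the original thread; just a minimal sketch of the Lucene-side knobs being referred to, assuming Lucene 2.x's IndexWriter API (the index path is a placeholder):

  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.store.FSDirectory;

  public class WriterTuning {
      public static void main(String[] args) throws Exception {
          IndexWriter writer = new IndexWriter(
                  FSDirectory.getDirectory("/path/to/index"),  // hypothetical path
                  new StandardAnalyzer(),
                  true);                                       // create a new index
          // Buffer more documents in RAM before flushing a new segment
          writer.setRAMBufferSizeMB(64.0);
          // Merge policy knob: how many segments are merged at once
          writer.setMergeFactor(10);
          writer.close();
      }
  }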
Funtick wrote:
>
> After running an application which heavily uses MD5 HEX-representation as
> the unique id, for SOLR v.1.4-dev-trunk:
>
> 1. After 30 hours:
> 101,000,000 documents added
BTW, you should really prefer JRockit, which really rocks!!!
"Mission Control" has the necessary tooling; and JRockit produces a _nice_
exception stacktrace (explaining almost everything) even in case of OOM,
which the Sun JVM still fails to produce.
SolrServlet still catches "Throwable":
} catch (Throwable
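Not the actual SolrServlet code; a minimal sketch of the pattern being criticized, plus a variant that rethrows Errors so an OutOfMemoryError keeps its details (handleRequest is a hypothetical stand-in for the servlet's work):

  public class ThrowableCatchSketch {
      public static void main(String[] args) {
          try {
              handleRequest();              // hypothetical request-handling call
          } catch (Throwable t) {
              // A blanket catch like this swallows OutOfMemoryError together with
              // ordinary exceptions; rethrowing Errors keeps the JVM's OOM details.
              if (t instanceof Error) {
                  throw (Error) t;
              }
              System.err.println("Request failed: " + t);
          }
      }

      private static void handleRequest() {
          // placeholder for the servlet's real work
      }
  }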
: 1,786,552
Same random docs retrieved from the web...
Funtick wrote:
>
>
> But how to explain that within an hour (after commit) I have had about
> 500,000 new documents, and within 30 hours (after commit) only 783,714?
>
> Same _random_enough_ documents...
>
> BTW, SOLR Cons
It is NOT a sample war, it is the SOLR application: solr.war - it should be!!! I
usually build from source and use dist/apache-solr-1.3.war instead, so I am
not sure about solr.war.
solr.xml contains the configuration for multicore; most probably something is
wrong with it.
It would be better if you tr
> ...depends on which
> segments have been merged).
>
> So if you add a ton of documents over time, many with the same ids, you
> would likely see this type of maxDoc, numDocs churn. maxDoc will include
> deleted docs while numDocs will not.
>
>
> --
> - Mark
>
> http://www.
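Not part of Mark's message; a minimal sketch, assuming Lucene 2.x's IndexWriter/IndexReader API, of the maxDoc vs numDocs distinction quoted above: re-adding a document with the same unique term leaves a deleted copy behind until segments are merged or the index is optimized.

  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.store.RAMDirectory;

  public class MaxDocVsNumDocs {
      public static void main(String[] args) throws Exception {
          RAMDirectory dir = new RAMDirectory();
          IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);

          Document doc = new Document();
          doc.add(new Field("id", "42", Field.Store.YES, Field.Index.UN_TOKENIZED));
          writer.addDocument(doc);

          // "Overwrite" the same id: delete the old copy, add a new one
          writer.updateDocument(new Term("id", "42"), doc);
          writer.close();

          IndexReader reader = IndexReader.open(dir);
          // maxDoc counts the deleted copy too; numDocs counts only live docs
          System.out.println("maxDoc  = " + reader.maxDoc());   // 2
          System.out.println("numDocs = " + reader.numDocs());  // 1
          reader.close();
      }
  }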
Can you please tell me how many non-tokenized single-valued fields your
schema uses, and how many documents you have?
Thanks,
Fuad
Rahul R wrote:
>
> My primary issue is not an Out of Memory error at run time. It is memory
> leaks:
> heap space not being released even after forcing a GC. So after
> som
After running an application which heavily uses MD5 HEX-representation as
the unique id, for SOLR v.1.4-dev-trunk:
1. After 30 hours:
101,000,000 documents added
2. Commit:
numDocs = 783,714
maxDoc = 3,975,393
3. Upload new docs to SOLR during 1 hour(!!!), then commit, then
optimize:
numDocs = 1,281,851
Users & Developers & Possible Contributors,
Hi,
Recently I did some code hacks and I am using frequency calculations from
TermVectors instead of the default out-of-the-box DocSet intersections. It
improves performance hundreds of times at the shopping engine
http://www.tokenizer.org - please check http://issue
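Not the actual patch; a minimal sketch, assuming Lucene 2.x term vectors, of counting facet values from each hit's term frequency vector instead of intersecting a DocSet per term (the field name and the collected doc ids are hypothetical; the field must be indexed with termVectors="true"):

  import java.util.HashMap;
  import java.util.Map;

  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.index.TermFreqVector;

  public class TermVectorFacets {
      /** Count facet values for the given matching doc ids via term vectors. */
      public static Map<String, Integer> countFacets(IndexReader reader,
                                                     int[] matchingDocIds,
                                                     String field) throws Exception {
          Map<String, Integer> counts = new HashMap<String, Integer>();
          for (int docId : matchingDocIds) {
              TermFreqVector tfv = reader.getTermFreqVector(docId, field);
              if (tfv == null) {
                  continue;                 // doc has no term vector for this field
              }
              for (String term : tfv.getTerms()) {
                  Integer old = counts.get(term);
                  counts.put(term, old == null ? 1 : old + 1);
              }
          }
          return counts;
      }
  }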
I found the answer: not enough space in the filesystem.
Funtick wrote:
>
> Is it a file-system error? I can commit but I cannot optimize:
>
> Exception in thread "main" org.apache.solr.common.SolrException:
> background merge hit exception: _ztu:C14604370 _105b
Is it a file-system error? I can commit but I cannot optimize:
Exception in thread "main" org.apache.solr.common.SolrException: background
merge hit exception: _ztu:C14604370 _105b:C1690769 _105l:C340280
_105w:C336330 _1068:C336025 _106j:C330206 _106u:C338541 _1075:C337713
_1080:C463455 into _1081
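The answer above (not enough filesystem space) fits: merging everything into one segment during optimize can temporarily need roughly an extra copy of the index on disk, sometimes more. Not from the original thread; a minimal sketch, assuming Java 6's File.getUsableSpace, of checking free space against the index size first (the index path is a placeholder):

  import java.io.File;

  public class OptimizeSpaceCheck {
      public static void main(String[] args) {
          File indexDir = new File("/path/to/solr/data/index");  // hypothetical path
          File[] files = indexDir.listFiles();
          if (files == null) {
              System.out.println("No such directory: " + indexDir);
              return;
          }
          long indexBytes = 0;
          for (File f : files) {
              indexBytes += f.length();
          }
          long freeBytes = indexDir.getUsableSpace();

          System.out.println("index size : " + indexBytes / (1024 * 1024) + " MB");
          System.out.println("free space : " + freeBytes / (1024 * 1024) + " MB");
          if (freeBytes < 2 * indexBytes) {
              System.out.println("Probably not enough room for optimize to complete.");
          }
      }
  }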
y Key
for instance.
001  Attraction  CN Tower
002  Hotel       CN Tower
003  Hotel       Sheraton
004  Restaurant  CN Tower
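Not from the original message; a minimal SolrJ sketch of indexing one of the rows above with single-valued fields (the field names id, type, name and the server URL are my guesses, not the poster's schema):

  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class SingleValuedExample {
      public static void main(String[] args) throws Exception {
          SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", "002");        // single-valued key
          doc.addField("type", "Hotel");    // single-valued category
          doc.addField("name", "CN Tower"); // single-valued name

          server.add(doc);
          server.commit();
      }
  }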
Funtick wrote:
>
> Simple design with _single_ valued fields:
>
> Id
a specific value from
> a multivalued field given a set of criteria. Now, compare that with my
> current design in which these criteria pinpoint a specific field / column
> to use and the difference should be clear.
>
> regards,
> Britske
>
>
> Funtick wrote:
>
But to answer the initial question... I think your documents are huge...
Funtick wrote:
>
>
>
> Britske wrote:
>>
>> I do understand that, at first glance, it seems possible to use
>> multivalued fields, but with multivalued fields it's not possible to
>
I do understand that, at first glance, it seems possible to use multivalued
fields, but with multivalued fields it's not possible to pinpoint the exact
value within the multivalued field that I need.
I used a technique with a single document consisting of a single Category and
multiple Products (m
Hoss,
This is still an extremely interesting area for possible improvements; I simply
don't want the topic to die:
http://www.nabble.com/Facet-Performance-td7746964.html
http://issues.apache.org/jira/browse/SOLR-665
http://issues.apache.org/jira/browse/SOLR-667
http://issues.apache.org/jira/browse/
Lucene instead of a database?
Sorry if I don't understand your design...
Britske wrote:
>
>
>
> Funtick wrote:
>>
>>
>> Britske wrote:
>>>
>>> - Rows in solr represent productcategories. I will have up to 100k of
>>> them.
>
Funtick wrote:
>
>
> Britske wrote:
>>
>> - Rows in solr represent productcategories. I will have up to 100k of
>> them.
>> - Each product category can have 10k products each. These are encoded as
>> the 10k columns / fields (all 10k fields
Britske wrote:
>
> - Rows in solr represent productcategories. I will have up to 100k of
> them.
> - Each product category can have 10k products each. These are encoded as
> the 10k columns / fields (all 10k fields are int values)
>
You are using multivalued fields, you are not using 10k fields
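Not from the thread; a minimal SolrJ sketch of the difference being discussed: a multivalued field flattens all product values together, while one (dynamic) field per product lets the criteria pinpoint an exact value (the price_* field names and ids are hypothetical):

  import org.apache.solr.common.SolrInputDocument;

  public class PinpointSketch {
      public static void main(String[] args) {
          // Multivalued: all prices end up in one bag; you cannot ask for
          // "the price of product 17" without extra bookkeeping.
          SolrInputDocument multi = new SolrInputDocument();
          multi.addField("id", "category-1");
          multi.addField("price", 100);
          multi.addField("price", 250);

          // One int field per product (e.g. a *_i dynamic field): the criteria
          // select a concrete field, so fl=price_17_i returns exactly that value.
          SolrInputDocument perProduct = new SolrInputDocument();
          perProduct.addField("id", "category-1");
          perProduct.addField("price_17_i", 100);
          perProduct.addField("price_42_i", 250);
      }
  }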
Britske wrote:
>
> When performing these queries I notice a big difference between qTime
> (which is mostly in the 15-30 ms range due to caching) and total time
> taken to return the response (measured through SolrJ's elapsedTime), which
> takes between 500-1600 ms.
> Documents have a lot of stored fields
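Not in the original message; a minimal SolrJ sketch, assuming QueryResponse.getQTime() and getElapsedTime(), of comparing Solr's reported query time with the client-side elapsed time, and restricting the field list since large stored fields inflate serialization and transfer time (the URL and field name are placeholders):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class QTimeVsElapsed {
      public static void main(String[] args) throws Exception {
          SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

          SolrQuery query = new SolrQuery("*:*");
          query.setRows(10);
          query.setFields("id");  // fetch only the id to cut response size

          QueryResponse rsp = server.query(query);
          System.out.println("qTime (server)   = " + rsp.getQTime() + " ms");
          System.out.println("elapsed (client) = " + rsp.getElapsedTime() + " ms");
      }
  }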
Special things:
- 2.3.1 fixes bugs with 'autocommit' of version 2.3.0
- I am getting OutOfMemoryError constantly; I can't understand where the
problem is yet... I didn't have it with the default SOLR 1.2 installation. It's
not memory-cache related; most probably it is a bug somewhere...
Yongjun Rong
I know Russian better than Russians ;)
I currently use the default configuration for "dismax" provided by SOLR
1.1; I can add a few URLs to the crawler tonight to see what happens. As far
as I know, Lucene/Nutch can even detect a web page's (pdf, txt, html)
language by checking the raw byte array (raw HTTP Respon
Hi Danier,
Ensure that UTF-8 is everywhere... SOLR, WebServer, AppServer, HTTP
Headers, etc.
And do not use
q=Бамбарбиа Киркуду
use this instead (encoded URL):
q=%D0%91%D0%B0%D0%BC%D0%B1%D0%B0%D1%80%D0%B1%D0%B8%D0%B0+%D0%9A%D0%B8%D1%80%D0%BA%D1%83%D0%B4%D1%83
http://www.tokenizer.org is
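Not in the original message; a minimal sketch, assuming java.net.URLEncoder, of producing the percent-encoded form shown above from a UTF-8 query string:

  import java.net.URLEncoder;

  public class EncodeQuery {
      public static void main(String[] args) throws Exception {
          String q = "Бамбарбиа Киркуду";
          // Percent-encode the UTF-8 bytes; URLEncoder uses '+' for spaces,
          // which Solr accepts in the query string.
          String encoded = URLEncoder.encode(q, "UTF-8");
          System.out.println("q=" + encoded);
      }
  }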
Though this is not directly related to Solr: I have XML output from a
mysql database, but during indexing the XML output is not working. The
problem is that part of the XML output is not in UTF-8 encoding; how can I
convert it to UTF-8, and how do I know what kind of encoding it uses in the
first place?
Tiong Jeffrey wrote:
>
> Though this is not directly related to Solr: I have XML output from a
> mysql database, but during indexing the XML output is not working. The
> problem is that part of the XML output is not in UTF-8 encoding; how can I
> convert it to UTF-8 and how do I know what kind
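Not part of the thread; a minimal sketch of re-writing a file as UTF-8 in Java, assuming the MySQL export was produced in a known single-byte encoding such as latin1 (the file names and the source charset are assumptions; the real first step is to find out what encoding mysql actually produced, e.g. from the XML declaration or the connection's character_set settings):

  import java.io.BufferedReader;
  import java.io.FileInputStream;
  import java.io.FileOutputStream;
  import java.io.InputStreamReader;
  import java.io.OutputStreamWriter;
  import java.io.Writer;

  public class ToUtf8 {
      public static void main(String[] args) throws Exception {
          String sourceCharset = "ISO-8859-1";  // assumption: check the real export encoding
          BufferedReader in = new BufferedReader(
                  new InputStreamReader(new FileInputStream("dump.xml"), sourceCharset));
          Writer out = new OutputStreamWriter(new FileOutputStream("dump-utf8.xml"), "UTF-8");

          char[] buf = new char[8192];
          int n;
          while ((n = in.read(buf)) != -1) {
              out.write(buf, 0, n);
          }
          in.close();
          out.close();
      }
  }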