Re: Can TrieDateField fields be null?

2015-08-23 Thread Upayavira
To be strict about it, I'd say that TrieDateFields CANNOT be null, but they CAN be excluded from the document. You could then check whether or not a value exists for this field. Upayavira On Sun, Aug 23, 2015, at 02:55 AM, Erick Erickson wrote: > TrieDateFields can be null. Actually, just not in
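The "check whether or not a value exists" step can be expressed as an open-ended range query. A minimal sketch, assuming a hypothetical date field named created_dt; it only builds the request parameters and does not contact a Solr server:

```python
from urllib.parse import urlencode


def has_value_filter(field):
    # Matches documents that contain any value for the field.
    return f"{field}:[* TO *]"


def missing_value_filter(field):
    # Matches documents where the field was left out entirely.
    return f"-{field}:[* TO *]"


def select_query(field, missing=False):
    # Query string for a /select request; "created_dt" below is a
    # hypothetical field name used only for illustration.
    fq = missing_value_filter(field) if missing else has_value_filter(field)
    return urlencode({"q": "*:*", "fq": fq})


print(select_query("created_dt", missing=True))
```

The filter is passed as fq so that it can be cached independently of the main query.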

SOLR 5.3

2015-08-23 Thread William Bell
At lucene.apache.org/solr it says SOLR 5.3 is there, but when I click on downloads it shows Solr 5.2.1... ?? "APACHE SOLR™ 5.3.0 Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene™." -- Bill Bell billnb...@gmail.com cell 720-256-8076

Re: solr add document

2015-08-23 Thread CrazyDiamond
thx i just need to call solr.commit -- View this message in context: http://lucene.472066.n3.nabble.com/solr-add-document-tp4224480p4224698.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH delta-import pk

2015-08-23 Thread CrazyDiamond
As far as I understand, I can't use two unique fields. I need both the db id and a uuid because I am moving the data from the database to the Solr index entirely. Temporarily I need it to stay compatible with delta-import, but in the future I will use only the new uuid.

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Toke Eskildsen
Zheng Lin Edwin Yeo wrote: > However, I find that clustering is exceeding slow after I index this 1GB of > data. It took almost 30 seconds to return the cluster results when I set it > to cluster the top 1000 records, and still take more than 3 seconds when I > set it to cluster the top 100 record

Re: DIH delta-import pk

2015-08-23 Thread CrazyDiamond
Now I have set the db id as the unique field, plus a uuid field that is required and should be generated automatically. But when I add a document I get an error that my required uuid field is missing.

Re: SOLR 5.3

2015-08-23 Thread Arcadius Ahouansou
Solr-5.3 has been available for download from http://mirror.catn.com/pub/apache/lucene/solr/5.3.0/ The redirection on the web site will probably be fixed before we get the official announcement. Arcadius. On 23 August 2015 at 09:00, William Bell wrote: > At lucene.apache.org/solr it says SOLR

Re: Remove duplicate suggestions in Solr

2015-08-23 Thread Arcadius Ahouansou
Hi Edwin. What you are doing here is "search" as Solr has separate components for doing suggestions. About dedup, - have a look at the manual https://cwiki.apache.org/confluence/display/solr/De-Duplication - or simply do your dedup upfront before ingesting into Solr by assigning the same "id"

Multiple concurrent queries to Solr

2015-08-23 Thread Ashish Mukherjee
Hello, I want to run a few Solr queries in parallel, which is currently done with a multi-threaded model. I was wondering if there are any client libraries to query Solr through a non-blocking I/O mechanism instead of a threaded model. Has anyone attempted something like this? Regards, Ashish
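The non-blocking pattern asked about here can be sketched in a few lines. This is only an illustration of the idea, not SolrJ and not an existing Solr client: the core URL is hypothetical, and the HTTP call is injected as a coroutine (here a stand-in that sleeps) so that any async HTTP library could be plugged in.

```python
import asyncio
from urllib.parse import urlencode

# Hypothetical core URL, used only for illustration.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def build_url(query, rows=10):
    # Standard /select parameters: q, rows, wt.
    return SOLR_SELECT + "?" + urlencode({"q": query, "rows": rows, "wt": "json"})


async def run_queries(queries, fetch):
    # fetch is any coroutine function that takes a URL and returns the
    # response; all requests are in flight at once on a single thread.
    urls = [build_url(q) for q in queries]
    return await asyncio.gather(*(fetch(url) for url in urls))


async def fake_fetch(url):
    # Stand-in for a real async HTTP call; it only simulates latency.
    await asyncio.sleep(0.01)
    return {"url": url}


if __name__ == "__main__":
    for result in asyncio.run(run_queries(["title:solr", "title:lucene"], fake_fetch)):
        print(result["url"])
```

asyncio.gather returns the results in the same order as the queries, regardless of which response arrives first.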

Re: Can TrieDateField fields be null?

2015-08-23 Thread Henrique O. Santos
Hi Erick and Upayavira, thanks for the reply. I am using Solr 5.2.1 and using SolrJ 5.2.1 API with an annotated POJO to update the index. And you were right, somehow my JODA DateTime field was being filled with the current timestamp prior to the update. Thanks for the clarification again. On

Re: DIH delta-import pk

2015-08-23 Thread William Bell
Send the SQL and schema.xml. Also logs. Does it complain about _id_ or your field in schema? On Sun, Aug 23, 2015 at 4:55 AM, CrazyDiamond wrote: > Now I set db id as unique field and uuid field,which should be generated > automatically as required. but when i add document i have an error tha

Re: Can TrieDateField fields be null?

2015-08-23 Thread Henrique O. Santos
Hello again, I am doing some manual indexing using Solr Admin UI to be exactly sure how TrieDateFields and null values work. When I remove the TrieDateField from the document, I get the following when trying to index it: | "msg": "Invalid Date String:'NULL'", "code": 400| On Solr 5.

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Shawn Heisey
On 8/22/2015 10:28 PM, Zheng Lin Edwin Yeo wrote: > Hi Shawn, > > Yes, I've increased the heap size to 4GB already, and I'm using a machine > with 32GB RAM. > > Is it recommended to further increase the heap size to like 8GB or 16GB? Probably not, but I know nothing about your data. How many So

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Zheng Lin Edwin Yeo
Hi Shawn and Toke, I only have 520 docs in my data, but each of the documents is quite big. In Solr, the index uses 221MB. So when I set it to read the top 1000 rows, it should just be reading all 520 docs that are indexed? Regards, Edwin On 23 August 2015 at 22:52, Shawn Heisey

Re: Too many updates received since start

2015-08-23 Thread Yago Riveiro
Indeed, I don't understand the caveat either, but I can imagine it is related to some algorithm that triggers a full sync if necessary. I will wait for 5.3 to do the upgrade and have this configuration available. -- /Yago Riveiro On Sun, Aug 23, 2015 at 3:37 AM, Shawn Heisey wrote: > On 8/

Re: Can TrieDateField fields be null?

2015-08-23 Thread Shawn Heisey
On 8/23/2015 8:29 AM, Henrique O. Santos wrote: > I am doing some manual indexing using Solr Admin UI to be exactly sure > how TrieDateFields and null values work. When I remove the TrieDateField > from the document, I get the following when trying to index it: > > | "msg": "Invalid Date Strin

Re: Can TrieDateField fields be null?

2015-08-23 Thread Erick Erickson
Following up on Shawn's comment, this can be the result of some sort of serialization or, if you're pulling info from a DB the literal string NULL may be returned from the DB. Solr really has no concept of a distinct value of NULL for a field, in Solr/Lucene terms that's just the total absence of
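Since a "null" value is simply an absent field, one way to avoid the "Invalid Date String:'NULL'" error is to drop such fields before the document is sent. A minimal sketch; treating the case-insensitive string "NULL" as a null marker is an assumption about what the upstream database or serializer emits, and the field names are made up:

```python
def drop_null_fields(doc):
    # Returns a copy of the document without fields whose value is None
    # or the literal string "NULL" (assumed marker, compared case-insensitively).
    cleaned = {}
    for name, value in doc.items():
        if value is None:
            continue
        if isinstance(value, str) and value.strip().lower() == "null":
            continue
        cleaned[name] = value
    return cleaned


# Hypothetical document: the date field is omitted rather than sent as "NULL".
print(drop_null_fields({"id": "1", "created_dt": "NULL", "title": "ok"}))
```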

Re: Multiple concurrent queries to Solr

2015-08-23 Thread Shawn Heisey
On 8/23/2015 7:46 AM, Ashish Mukherjee wrote: > I want to run few Solr queries in parallel, which are being done in a > multi-threaded model now. I was wondering if there are any client libraries > to query Solr through a non-blocking I/O mechanism instead of a threaded > model. Has anyone attempt

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Alexandre Rafalovitch
Are you by any chance using stored="true" on the fields you want to search? If so, you may want to switch to just indexed="true". Of course, they will then not come back in the results, but do you really want to sling huge content fields around? The other option is to do lazyLoading=true and not request

Re: Multiple concurrent queries to Solr

2015-08-23 Thread Walter Underwood
The last time that I used the HTTPClient library, it was non-blocking. It doesn’t try to read from the socket until you ask for data from the response object. That allows parallel requests without threads. Underneath, it has a pool of connections that can be reused. If the pool is exhausted, it

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Erick Erickson
You're confusing clustering with searching. Sure, Solr can index lots of data, but clustering is essentially finding ad-hoc similarities between arbitrary documents. It must take each of the documents in the result size you specify from your result set and try to find commonalities. For perf i

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Upayavira
And be aware that I'm sure the more terms in your documents, the slower clustering will be. So it isn't just the number of docs, the size of them counts in this instance. A simple test would be to build an index with just the first 1000 terms of your clustering fields, and see if that makes a diff
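The test suggested here needs the clustering fields cut down before indexing. A minimal sketch, assuming whitespace-separated terms are a good enough approximation of what the analyzer produces; the function name and the 1000-term default are illustrative only:

```python
def truncate_terms(text, limit=1000):
    # Keeps only the first `limit` whitespace-separated terms of the text.
    terms = text.split()
    return " ".join(terms[:limit])


# Usage: shorten a long body before building the test index.
print(truncate_terms("one two three four five", limit=3))
```

Indexing the truncated text into a separate test core and comparing clustering times against the full index would show whether document size is the bottleneck.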

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Jimmy Lin
unsubscribe On Sat, Aug 22, 2015 at 9:31 PM, Zheng Lin Edwin Yeo wrote: > Hi, > > I'm using Solr 5.2.1, and I've indexed about 1GB of data into Solr. > > However, I find that clustering is exceeding slow after I index this 1GB of > data. It took almost 30 seconds to return the cluster results wh

Re: Multiple concurrent queries to Solr

2015-08-23 Thread Arcadius Ahouansou
Hello Ashish. There is some unfinished work on this at https://issues.apache.org/jira/browse/SOLR-3383 Maybe you want to have a look and contribute? Arcadius. On 23 August 2015 at 17:02, Walter Underwood wrote: > The last time that I used the HTTPClient library, it was non-blocking. It > do

Re: DIH delta-import pk

2015-08-23 Thread CrazyDiamond
I don't use SQL now. I'm adding documents manually. db_id_s

Re: Custom Solr caches in a FunctionQuery that emulates the ExternalFileField

2015-08-23 Thread Mikhail Khludnev
Hello Upayavira, It's been a long month! I just described this approach in http://blog.griddynamics.com/2015/08/scoring-join-party-in-solr-53.html Coming back to our discussion, I think I missed {!func}, which turns a field name into a function query. On Fri, Jul 24, 2015 at 3:41 PM, Upayavira wrote: > Mik

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Bill Bell
We use 8GB to 10GB for indexes of that size all the time. Bill Bell Sent from mobile > On Aug 23, 2015, at 8:52 AM, Shawn Heisey wrote: > >> On 8/22/2015 10:28 PM, Zheng Lin Edwin Yeo wrote: >> Hi Shawn, >> >> Yes, I've increased the heap size to 4GB already, and I'm using a machine >> with 32

Re: Remove duplicate suggestions in Solr

2015-08-23 Thread Zheng Lin Edwin Yeo
Hi Arcadius, Thank you for your reply. So this means that the de-duplication has to be done at indexing time, and not at query time? Yes, currently I'm building on "search" to do my suggestions, as I faced some issues with the suggestion components in Solr 5.1.0. Will t

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Zheng Lin Edwin Yeo
Yes, I'm using stored="true". However, this field needs to be stored, as my program requires it to be returned during normal searching. I tried lazyLoading=true, but it's not working. Would you create a copy field for the content and not set stored="true" on that field? So that field wil

Re: Solr performance is slow with just 1GB of data indexed

2015-08-23 Thread Zheng Lin Edwin Yeo
Hi Alexandre, I've tried using just indexed="true", and the speed is still the same, not any faster. If I set stored="false", no results come back from the clustering. Is this because the fields are not stored, and clustering requires indexed fields that are also stored? I've also increased my

Re: Exception while using {!cardinality=1.0}.

2015-08-23 Thread Modassar Ather
- Did you have the exact same data in both fields? No, the data is not the same. - Did your "real" query actually compute stats on the same field you had done your main term query on? The query field is different, and I failed to state that clearly. I will update the JIRA accordingly. So the query can

Re: Multiple concurrent queries to Solr

2015-08-23 Thread Ashish Mukherjee
Thanks, everyone. Arcadius, that ticket is interesting. I was wondering if an implementation of SolrClient could be based on HttpAsyncClient instead of HttpSolrClient. Just a thought right now, which needs to be explored deeper. - Ashish On Mon, Aug 24, 2015 at 1:46 AM, Arcadius Ahouansou wrote

Re: Solr 4.10.3 cached grouping results but Solr 5.2.1 don't, why?

2015-08-23 Thread Pavel Hladik
Nobody knows or has the same issue?

GC parameters tuning for core of 140M docs on 50G of heap memory

2015-08-23 Thread Pavel Hladik
Hi, we have a Solr 5.2.1 with 9 cores and one of them has 140M docs. Can you please recommend tuning of those GC parameters? Performance is not an issue, but sometimes during peaks we hit OOM. We use 50G of heap memory; the server has 64G of RAM. GC_TUNE="-XX:NewRatio=3 \ -XX:SurvivorRatio=4 \