dataimport.properties files

2014-01-13 Thread Karan jindal
Hi all, When is the last_import_time information written to the dataimport.properties file? Is it at the start, before the actual indexing begins, or at the end? If it is at the start, then in cases where the dataimport fails, will the dataimport.properties file be rolled back to its last state?

Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Ralf Matulat
It sounds quite obvious to upgrade the Java environment to go on with that. We are updating our index almost every second, so over time it accumulates a lot of segment files (up to 32 in our case). So optimizing was always a good idea in the SOLR 3.X world. We recognized that in SOLR 4.4.0 it

Re: Can I define the copy field like title_*

2014-01-13 Thread samsolr
Yes it's legitimate to copy like - Sumit Arora

Re: Cancel Solr query?

2014-01-13 Thread Alexandre Rafalovitch
Sounds like two different things. On demand cancellation vs. timeout. Regards, Alex On 14 Jan 2014 04:20, "Mikhail Khludnev" wrote: > Hello Luis, > > It's not so difficult for Lucene, e.g. there are timeboxed queries > > http://docs.lucidworks.com/display/solr/Common+Query+Parameters#CommonQ

Re: Changing Cache Properties after Indexing

2014-01-13 Thread Shawn Heisey
On 1/13/2014 4:44 PM, Erick Erickson wrote: On the face of it, it's somewhat unusual to have the cache settings affect indexing performance. What are you seeing and how are you indexing? I think this is probably an indirect problem. Cache settings don't directly affect indexing speed, but whe

Re: question about DIH solr-data-config.xml and XML include

2014-01-13 Thread Shawn Heisey
On 1/13/2014 3:31 PM, Bill Au wrote: But when I use XML include, the Entity pull-down in the Dataimport section of the Solr admin UI is empty. I know that happens when there is a syntax error in solr-data-config.xml. Does DIH support XML include? Also I am not seeing any error message in the

Re: Changing Cache Properties after Indexing

2014-01-13 Thread Erick Erickson
On the face of it, it's somewhat unusual to have the cache settings affect indexing performance. What are you seeing and how are you indexing? Best, Erick On Mon, Jan 13, 2014 at 2:40 PM, P Williams wrote: > Hi, > > I've gone through steps for tuning my cache sizes and I'm very happy with > the

Security specific url-patterns

2014-01-13 Thread sureshrk19
Hi, I've been using SOLR (deployed on Jetty) for the last few months. I got into a tricky situation and spent 2 days on it, but no luck. I set up SOLR on Jetty and it is working fine. Now I need to add security to specific sections of SOLR functionality, i.e., dataimport, replication, etc. I want to make all

Re: Correct to use to store urls (unicode)

2014-01-13 Thread Hakim Benoudjit
Because I only need to retrieve the link, not search by URL. I have already stored product id in an id field. Thanks for your answer Gora! 2014/1/13 Gora Mohanty > On 13 January 2014 00:30, Hakim Benoudjit wrote: > > > > Yep sure. But is it good for me to store a link(http://...) in a solr > >

question about DIH solr-data-config.xml and XML include

2014-01-13 Thread Bill Au
I am trying to simplify my Solr DIH configuration by using XML schema include element. Here is an example: ]> &dataSource; &entity1; &entity2; I know my included XML files are good because if I put them all into a single XML file, DIH works as expected. But

RE: Analysis page broken on trunk?

2014-01-13 Thread Chris Hostetter
: Hi - I cannot send a snapshot but here's an URL and raw output, it : happens on all text fields. On screen i only see the headers for each : row, but no text, positions etc. I can't reproduce the problem you are describing using the trunk example schema and Firefox. Can you please open a jir

Question on CollapsingQParserPlugin

2014-01-13 Thread Shamik Bandopadhyay
Hi, I'm looking for some clarification on CollapsingQParserPlugin feature. Here's what I tried. I downloaded 4.6, updated "solr.xml" under exampledocs folder and added the following entry. I've added a new field "adskdedup" on which I'm planning to test field collapsing. As you can see, out of

Re: ANN: Solr Next

2014-01-13 Thread Yonik Seeley
That would be cool, but seems it would only work for simple term queries. I guess having both would be best. http://heliosearch.org -- off-heap filters for solr -Yonik On Mon, Jan 13, 2014 at 2:21 PM, Mikhail Khludnev wrote: > Yonik, > Don't you think that proper codec format can get the compar

Distributed search with Terms Component and Solr Cloud.

2014-01-13 Thread Ryan Fox
Hello, I am running Solr 4.6.0. I am experiencing some difficulties using the terms component across multiple shards. I see according to the documentation, it should work, but I am unable to do so with solr cloud. When I have one shard, queries using the terms component respond as I would expec

Re: Cancel Solr query?

2014-01-13 Thread Mikhail Khludnev
Hello Luis, It's not so difficult for Lucene, e.g. there are timeboxed queries http://docs.lucidworks.com/display/solr/Common+Query+Parameters#CommonQueryParameters-The{{timeAllowed}}Parameter . It's a problem for the Servlet request/response model; it needs to be revised to an AJAX one. On Mon, Jan 13,
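
The time-boxed approach mentioned above can be sketched as a request URL that carries Solr's timeAllowed parameter (a limit in milliseconds, after which Solr may return partial results). This is only an illustration: the class and method names are hypothetical, and the base URL and query are assumed values, not taken from the thread. It bounds how long a search runs; it does not cancel a request on demand.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; the names here are illustrative only.
class TimeAllowedUrlDemo {

    // Builds a /select URL with the query URL-encoded and a timeAllowed limit in milliseconds.
    static String buildSelectUrl(String baseUrl, String query, long timeAllowedMillis) {
        try {
            return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&timeAllowed=" + timeAllowedMillis;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed base URL and query, for illustration only.
        System.out.println(buildSelectUrl("http://localhost:8983/solr/collection1", "title:solr", 500));
    }
}
```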

Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Michael McCandless
I have trouble understanding J9's version strings ... but, is it really from 2008? You could be hitting a JVM bug; can you test upgrading? I don't have much experience with Solr faceting on optimized vs unoptimized indices; maybe someone else can answer your question. Lucene's facet module (not

Changing Cache Properties after Indexing

2014-01-13 Thread P Williams
Hi, I've gone through steps for tuning my cache sizes and I'm very happy with the results of load testing. Unfortunately the cache settings for querying are not optimal for indexing - and in fact slow it down quite a bit. I've made the caches small by default for the indexing stage and then want

Re: ANN: Solr Next

2014-01-13 Thread Mikhail Khludnev
Yonik, Don't you think that proper codec format can get the comparable gain without changes in design? https://issues.apache.org/jira/browse/LUCENE-5052 On Mon, Jan 13, 2014 at 9:15 PM, Yonik Seeley wrote: > Update on the my initial performance findings for off-heap filters: > http://heliosearc

Cancel Solr query?

2014-01-13 Thread Luis Lebolo
Hi All, Is it possible to cancel a Solr query/request currently in progress? Suppose the user starts searching for something (that takes a long time for Solr to process), then decides to modify the query. I can simply ignore the previous request and create a new request, but Solr is still proces

Re: ANN: Solr Next

2014-01-13 Thread Yonik Seeley
Update on my initial performance findings for off-heap filters: http://heliosearch.org/off-heap-filters/ -Yonik http://heliosearch.org -- making solr shine On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley wrote: > Off-Heap Filters: > JVMs have never been good at dealing with large heaps. Large

Re: Searching Numeric Data

2014-01-13 Thread Satyanarayana Kakollu
Anything preventing you from calculating the similarity score at index time and storing it as a field? The field can be used for scoring at query time. Alternately you can make a field for each of the elements in the array and use custom scoring function at query time. There is no value to, s

Re: Indexing spatial fields into SolrCloud (HTTP)

2014-01-13 Thread Smiley, David W.
Hello Jim, By the way, using GeohashPrefixTree.getMaxLevelsPossible() is usually an extreme choice. Instead you probably want to choose only as many levels needed for your distance tolerance. See SpatialPrefixTreeFactory which you can use outright or borrow the code it uses. Looking at your

RE: Simple payloads example not working

2014-01-13 Thread michael.boom
Correction: I observed a pattern: the returned score is the same for all docs and equals the payload of the term in the first doc: http://localhost:8983/solr/collection1/pds-search?q=payloads:testone&wt=json&indent=true&debugQuery=true ---> "explain":{ "1":"\n15.4 = (MATCH) btq(include

Re: Simple payloads example not working

2014-01-13 Thread michael.boom
Thanks Eric, I did create a custom query parser, which seems to work just fine. My only problem now is the one above, with all docs having the same score for some reason. See below the query parser: import org.apache.commons.lang.StringUtils; import org.apache.lucene.index.Term; import org.apach

Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Ralf Matulat
> java -version java version "1.6.0" Java(TM) SE Runtime Environment (build pxa6460sr3ifix-20090218_02(SR3+IZ43791+IZ43798)) IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460-20081105_25433 (JIT enabled, AOT enabled) J9VM - 20081105_025433_LHdSMr JIT - r9_20081031_1330 GC

RE: Analysis page broken on trunk?

2014-01-13 Thread Markus Jelsma
Hi - I cannot send a snapshot but here's an URL and raw output, it happens on all text fields. On screen i only see the headers for each row, but no text, positions etc. HTMLSCF text bla bla bla WT text raw_bytes start end position type WDF text raw_bytes sta

Re: Simple payloads example not working

2014-01-13 Thread Erik Hatcher
There’s also the PayloadTermQueryParser here (which is surely a bit out of date as well, but maybe a bit simpler starting point). Erik On Jan 13, 2014, at 12:30 AM, michael.boom wrote: > Actually, i just checked the debugQuery output:

RE: Simple payloads example not working

2014-01-13 Thread michael.boom
Thanks, that indeed fixed the problem. Now i've created a custom Similarity class and used it in schema.xml. Problem is now that for all docs the calculated payload score is the same: public class CustomSolrSimilarity extends DefaultSimilarity { @Override public float scorePayload(int

RE: Analysis page broken on trunk?

2014-01-13 Thread Markus Jelsma
Hi - Here's an URL, it happens on all text fields: http://localhost:8983/solr/collection1/analysis/field?wt=json&analysis.showmatch=true&analysis.fieldvalue=bla%20bla%20bla&analysis.fieldtype=text_de&_=1389621799252&indent=true { "responseHeader":{ "status":0, "QTime":5}, "analysis":{

Re: Need Features offered and comparison Chart for Solr 3.6 and Solr 4.6

2014-01-13 Thread Aruna Kumar Pamulapati
Hi Mayank, In addition to what Erick suggested, you can also look at https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+3+to+Solr+4 Thanks, On Mon, Jan 13, 2014 at 7:39 AM, Erick Erickson wrote: > The most comprehensive list is CHANGES.txt. I'd look > at the Solr one fir

RE: Simple payloads example not working

2014-01-13 Thread Markus Jelsma
Check the bytes property: http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/util/BytesRef.html#bytes @Override public float scorePayload(int doc, int start, int end, BytesRef payload) { if (payload != null) { return PayloadHelper.decodeFloat(payload.bytes); } return
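
One possible reason for every document getting the same payload score, offered as an assumption rather than a diagnosis: a BytesRef carries offset and length fields alongside bytes, and the backing array may hold more than one value, so decoding from index 0 can keep returning the first payload. Lucene's PayloadHelper also has a decodeFloat(byte[] bytes, int offset) overload that could be called with payload.offset. The sketch below is plain Java with hypothetical names (PayloadOffsetDemo, encodeFloat, decodeFloat) and does not use Lucene classes; it only shows why the offset matters when several floats share one array.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-alone demo; it does not call Lucene.
class PayloadOffsetDemo {

    // Writes a float as 4 big-endian bytes starting at the given offset.
    static void encodeFloat(float value, byte[] dest, int offset) {
        ByteBuffer.wrap(dest, offset, 4).putFloat(value);
    }

    // Reads a float from the 4 big-endian bytes starting at the given offset.
    static float decodeFloat(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 4).getFloat();
    }

    public static void main(String[] args) {
        // Two payload values packed into one shared array.
        byte[] shared = new byte[8];
        encodeFloat(100f, shared, 0);
        encodeFloat(30f, shared, 4);

        // Ignoring the offset always yields the first value.
        System.out.println(decodeFloat(shared, 0)); // prints 100.0
        // Honouring the offset yields the value that belongs to the second slot.
        System.out.println(decodeFloat(shared, 4)); // prints 30.0
    }
}
```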

Re: Simple payloads example not working

2014-01-13 Thread michael.boom
Thanks iorixxx, Actually I've just tried it and I hit a small wall: the tutorial does not appear to be up to date with the codebase. When implementing my custom similarity class I should be using PayloadHelper, but the following happens: in PayloadHelper: public static final float decodeFloat(byte [] bytes

Re: Need Features offered and comparison Chart for Solr 3.6 and Solr 4.6

2014-01-13 Thread Erick Erickson
The most comprehensive list is CHANGES.txt. I'd look at the Solr one first (this is the one at the root level of your installation directory). If you have the source code checked out, there's a CHANGES.txt in the /solr and /lucene. All the changes refer you to the Solr and Lucene JIRAs th

Re: How Solr join query works?

2014-01-13 Thread Erick Erickson
It's called "pseudo join" for a reason. It's not built to do what you want. There's no way to combine fields in the "from" and "to" clauses in the output. The "from" clause can be thought of as a filter. The first choice is usually to denormalize the data if possible. Best, Erick On Sun, Jan 12,

Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Michael McCandless
Which version of Java are you using? That root cause exception is somewhat spooky: it's in the ByteBufferIndexCode that handles an "UnderflowException", ie when a small (maybe a few hundred bytes) read happens to span the 1 GB page boundary, and specifically the exception happens on the final read

Re: Simple payloads example not working

2014-01-13 Thread Ahmet Arslan
Hi Michael, To have payloads considered in score calculation you need two more things: 1) A custom similarity 2) A query parser that produces the Payload*Query family. This blog post can be a good starting point. http://digitalpebble.blogspot.com/2010/08/using-payloads-with-dismaxqparser-in.htm

Copy/backup one core from one cloud 1 to cloud 2

2014-01-13 Thread Borut Bolčina
Hello, In solr 4.6.0, what is the recommended way for transferring one core's index in production cloud to say a staging cloud? I know I can just copy the data folder, but I am sure there is a smarter and safer way. I read https://cwiki.apache.org/confluence/display/solr/Backing+Up I am just maki

Re: How Solr join query works?

2014-01-13 Thread Ahmet Arslan
Hello, Did you consider using CollapsingQueryParser or FieldCollapsing? http://wiki.apache.org/solr/FieldCollapsing https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-CollapsingQueryParser Ahmet On Monday, January 13, 2014 4:56 AM, solr2020 wrote: Hi All, Can anyon

Re: Can I store only the index in Solr and not the actual data

2014-01-13 Thread David Santamauro
On 01/13/2014 06:16 AM, Bijoy Deb wrote: Hi, I have my data in HDFS, which I need to index using Solr. In that case, does Solr always store both the data (the fields that need to be retrieved) as well as the index, or can it be configured to store only the index that points to the original data

Can I store only the index in Solr and not the actual data

2014-01-13 Thread Bijoy Deb
Hi, I have my data in HDFS, which I need to index using Solr. In that case, does Solr always store both the data (the fields that need to be retrieved) as well as the index, or can it be configured to store only the index that points to the original data in HDFS. Personally, I would like the latte

Re: Simple payloads example not working

2014-01-13 Thread michael.boom
Actually, i just checked the debugQuery output: they all have the same score: "explain": { "1": "\n0.24276763 = (MATCH) weight(text:testone in 0) [DefaultSimilarity], result of:\n 0.24276763 = fieldWeight in 0, product of:\n1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n

Simple payloads example not working

2014-01-13 Thread michael.boom
Hi, I'm trying to test payloads in Solr. Using Solr 4.6.0 and the example configuration, I posted 3 docs to Solr: 1 Doc one testone|100 testtwo|30 testthree|5 I testone, you testtwo, they testthree 2 Doc two testone|30 testtwo|200 testthree|5 I testone, yo