Solr round ratings to nearest integer value

2013-07-31 Thread Thyagaraj
I'm using solr 4.0 with DIH jdbc connector and I use Solr Admin web interface for testing. I have a field called *ratings* which varies like 0, 0.3, 0.5, 0.75, 1, 1.5, 1.6... and so on as per user input. I found the link http://lucene.472066.n3.nabble.com/How-to-round-solr-score-td495198.html <

Re: Improper shutdown of Solr in Jetty 9

2013-07-31 Thread Artem Karpenko
I don't use Cygwin, but I guess it's similar in that in Windows there are no signals at all, so the situation with missing shutdown hooks is the same. 31.07.2013 20:54, Chris Hostetter wrote: : it's Windows 7. I'm starting Jetty with java -jar start.jar Not sure if you are using cygwin, or if this

Re: Solr list all records but fq matching records first

2013-07-31 Thread Thyagaraj
Thank you very much!, -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-list-all-records-but-fq-matching-records-first-tp4081572p4081826.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Proposal/request for comments: Solr schema annotation

2013-07-31 Thread Walter Underwood
An annotation field would be much better than the current "anything goes" schema-less schema.xml. Has anyone built an XML Schema for schema.xml? I know it is extensible, but it would be worth a try. wunder On Jul 31, 2013, at 6:21 PM, Steve Rowe wrote: > In thinking about making the entire So

Proposal/request for comments: Solr schema annotation

2013-07-31 Thread Steve Rowe
In thinking about making the entire Solr schema REST-API-addressable (SOLR-4898), I'd like to be able to add arbitrary metadata at both the top level of the schema and at each leaf node, and allow read/write access to that metadata via the REST API. Some uses I've thought of for such a facility

debian package for solr with jetty

2013-07-31 Thread smanad
Hi, I am trying to create a debian package for solr 4.3 (default installation with jetty). Is there anything already available? Also, I need 3 different cores so plan to create corresponding packages for each of them to create solr core using admin/cores or collections api. I also want to use

no servers hosting shard

2013-07-31 Thread smanad
I have setup solr cloud and when I try to access documents I get this error, no servers hosting shard: 503 However if I add shards=shard1 param it works.

Re: FieldCollapsing issues in SolrCloud 4.4

2013-07-31 Thread Paul Masurel
If your issue is that you want to retrieve the number of groups, group.ngroups will return the sum of the number of groups per shard. This is not the number of groups overall if some groups are present on more than one shard. To make sure that this does not happen, one can choose to dist
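The over-count described in this reply can be illustrated with a small sketch. It is not Solr code: the shard contents and function names are made up for illustration, and it only shows why a sum of per-shard group counts is an upper bound unless every group lives on a single shard.

```python
# Illustrative model only: each shard is a list of group keys (hypothetical data).

def summed_ngroups(shards):
    """Sum of the distinct group counts of each shard."""
    return sum(len(set(groups)) for groups in shards)


def exact_ngroups(shards):
    """Number of distinct groups across all shards taken together."""
    return len(set().union(*shards))


# Groups "b" and "c" appear on both shards, so the sum counts them twice.
mixed = [["a", "b", "c"], ["b", "c", "d"]]

# Every group lives on exactly one shard, so the sum is exact.
routed = [["a", "b"], ["c", "d"]]

print(summed_ngroups(mixed), exact_ngroups(mixed))
print(summed_ngroups(routed), exact_ngroups(routed))
```

When documents are distributed so that a group never spans shards, the two numbers agree.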

Re: SolrCloud and Joins

2013-07-31 Thread David Larochelle
Thanks Walter, Existing media sets will rarely change but new media sets will be added relatively frequently. (There is a many to many relationship between media sets and media sources.) Given the size of data, a new Media Set that only includes 1% of the collection would include 6 million rows.

Re: queryResultCache showing all zeros

2013-07-31 Thread Yonik Seeley
On Wed, Jul 31, 2013 at 3:49 PM, Chris Hostetter wrote: > there is however a "group.cache.percent" option that you might look into -- > but i honestly have no idea if that toggles the use of queryResultCache or > something else, i haven't played with it before... That's only a single-request cache

Re: Measuring SOLR performance

2013-07-31 Thread Roman Chyla
No, I haven't had time for that (and likely won't have for the next few weeks), but it is on the list - if it is 25% improvement, it would be really worth the change to G1. Thanks, roman On Wed, Jul 31, 2013 at 1:00 PM, Markus Jelsma wrote: > Did you also test indexing speed? With default

RE: Highlighting externally stored text

2013-07-31 Thread Bryan Loofbourrow
> Hey Bryan, Thanks for the response! To make use of the > FastVectorHighlighter > you need to enable termVectors, termPositions, and termOffsets correct? > Which takes a considerable amount of space, but is good to know and I may > possibly pursue this solution as well. Just starting to look at

Inconsistent facet ranges when using distributed search in Solr 4.3

2013-07-31 Thread Jose Aguilar
Hi all, I am seeing some inconsistent behavior with facets, specifically range facets, on Solr 4.3. Running the same query several times (pressing F5 on the browser) produces different facet ranges when doing distributed searches, as sometimes it doesn't include some of the "buckets". The resu

RE: Highlighting externally stored text

2013-07-31 Thread JohnRodey
Just an update. The change was pretty straightforward (at least for my simple test case); just a few lines in the getBestFragments method seemed to do the trick.

Re: Auto Correction of Solr Query

2013-07-31 Thread Otis Gospodnetic
Hi Siva, I think I mention this several days ago... DYM ReSearcher will do that: http://sematext.com/products/dym-researcher/index.html Otis On Tuesday, July 30, 2013, sivaprasad wrote: > Hi, > > Is there any way to "auto correct" the Solr query and get the results? For > example, user tries t

Re: Performance question on Spatial Search

2013-07-31 Thread Steven Bower
the list of IDs does change relatively frequently, but this doesn't seem to have very much impact on the performance of the query as far as I can tell. attached are the stacks thanks, steve On Wed, Jul 31, 2013 at 6:33 AM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > On Wed, Jul 31

RE: queryResultCache showing all zeros

2013-07-31 Thread Chris Hostetter
: it's quite long but this request uses both faceting and grouping. If I : remove the grouping then the cache is used. Is this a normal behavior or : a bug? I believe that is expected -- i don't think grouping can take advantage of the queryResultCache because of how it collects documents. th

RE: queryResultCache showing all zeros

2013-07-31 Thread Jean-Sebastien Vachon
Also we do not have any useFilterForSortedQuery in our config. So we are relying on the default which I guess is false. From: Jean-Sebastien Vachon Sent: Wednesday, July 31, 2013 3:44 PM To: solr-user@lucene.apache.org Subject: RE: queryResultCache showi

RE: queryResultCache showing all zeros

2013-07-31 Thread Jean-Sebastien Vachon
Ok I might have found a Solr issue after I fixed a problem in our system. This is the kind of query we are making: http://10.0.5.214:8201/solr/Current/select?fq=position_refreshed_date_id:[2747%20TO%203501]&fq=position_soc_2011_8_code:41101100&fq=country_id:1&fq=position_job_type_id:4&fq=position_e

RE: Alternative searches

2013-07-31 Thread Petersen, Robert
Hi Mark Yes, it is something we implemented also. We just try various subsets of the search terms when there are zero results. To increase performance for all these searches we return only the first three results and no facets so we can simply display the result counts for the various subsets
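The approach described in this reply (retrying subsets of the search terms when the full query returns nothing, and asking only for a few rows with faceting off) could be sketched as follows. The function names are hypothetical; `rows` and `facet` are standard Solr request parameters, but how the subsets are chosen here is an assumption, not the poster's actual implementation.

```python
from itertools import combinations


def term_subsets(query):
    """Return the proper, non-empty subsets of the query terms.

    Larger subsets come first and the original word order is kept.
    """
    terms = query.split()
    subsets = []
    for size in range(len(terms) - 1, 0, -1):
        for combo in combinations(terms, size):
            subsets.append(" ".join(combo))
    return subsets


def fallback_params(subquery):
    """Cheap request parameters for a fallback search: three rows, no facets."""
    return {"q": subquery, "rows": 3, "facet": "false"}


for sub in term_subsets("Red Dump Truck"):
    print(fallback_params(sub))
```

Each fallback request is a separate search, so the number of subsets grows quickly with the number of terms; capping the subset size is one way to keep that in check.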

RE: Highlighting externally stored text

2013-07-31 Thread JohnRodey
Hey Bryan, Thanks for the response! To make use of the FastVectorHighlighter you need to enable termVectors, termPositions, and termOffsets correct? Which takes a considerable amount of space, but is good to know and I may possibly pursue this solution as well. Just starting to look at the code

RE: queryResultCache showing all zeros

2013-07-31 Thread Jean-Sebastien Vachon
Looks like the problem might not be related to Solr but to a proprietary system we have on top of it. I made some queries with facets and the cache was updated. We are looking into this... I should not have assumed that the problem was coming from Solr ;) I'll let you know if there is anything

Re: Sending shard requests to all replicas

2013-07-31 Thread Isaac Hebsh
Thanks to Ryan Ernst, my issue is duplicate of SOLR-4449. I think that this proposal might be very useful (some supporting links are attached there. worth reading..) On Tue, Jul 30, 2013 at 11:49 PM, Isaac Hebsh wrote: > Hi, > I submitted a new JIRA for this: > https://issues.apache.org/jira/br

Re: FieldCollapsing issues in SolrCloud 4.4

2013-07-31 Thread Ali, Saqib
Hello Paul, Can you please explain what you mean by: "To get the exact number of groups, you need to shard along your grouping field" Thanks! :) On Wed, Jul 31, 2013 at 3:08 AM, Paul Masurel wrote: > Do you mean you get different results with group=true? > numFound is supposed to return the numb

Re: queryResultCache showing all zeros

2013-07-31 Thread Chris Hostetter
: We just configured a new Solr cloud (5 nodes) running Solr 4.3, ran : about 200 000 queries taken from our production environment and measured : the performance of the cloud over a collection of 14M documents with the : default Solr settings. We are now trying to tune the different caches :

Re: Improper shutdown of Solr in Jetty 9

2013-07-31 Thread Chris Hostetter
: it's Windows 7. I'm starting Jetty with java -jar start.jar Not sure if you are using cygwin, or if this is related but... https://issues.apache.org/jira/browse/SOLR-3884 https://issues.apache.org/jira/browse/SOLR-3884?focusedCommentId=13462996&page=com.atlassian.jira.plugin.system.issuetabpan

Re: TrieField and FieldCache confusion

2013-07-31 Thread Chris Hostetter
: Can I expect the FieldCache of Lucene to return the correct values when : working : with TrieField with the precisionStep higher than 0. If not, what did I get : wrong? Yes -- the code for building FieldCaches from Trie fields is smart enough to ensure that only the "real" original values are

Re: Solr list all records but fq matching records first

2013-07-31 Thread Thyagaraj
Awesome Jack Krupansky-2!!!. It seems to work!. What I didn't understand is *^100*. Could you give some explanation on ^100 please? if it could be any number other than 100?. Thanks a lot!, I was working on this for past 3 days!.

Re: Solr list all records but fq matching records first

2013-07-31 Thread Jack Krupansky
I was going to say 10, but frequently people find that they need a really big boost. Normally, a boost might be 1.5 or 2 or 5, or something like that. A fractional boost, like 0.5, 0.25, 0.1, or even 0.01 can de-emphasize terms. If you add debugQuery=true to your query request and look at th

Alternative searches

2013-07-31 Thread Mark
Can someone explain how one would go about providing alternative searches for a query… similar to Amazon. For example say I search for "Red Dump Truck" - 0 results for "Red Dump Truck" - 500 results for " Red Truck" - 350 results for "Dump Truck" Does this require multiple searches? Thanks

Re: Ingesting geo data into Solr very slow

2013-07-31 Thread Shawn Heisey
On 7/31/2013 11:20 AM, Simonian, Marta M (US SSA) wrote: Does anybody know if Solr performs better on Jetty vs Tomcat? Jetty has less complexity than tomcat. It is likely to use less memory. If you went with default settings for both, jetty is likely to perform better, but the difference wo

RE: Ingesting geo data into Solr very slow

2013-07-31 Thread Simonian, Marta M (US SSA)
Does anybody know if Solr performs better on Jetty vs Tomcat? -Original Message- From: David Smiley (@MITRE.org) [mailto:dsmi...@mitre.org] Sent: Tuesday, July 30, 2013 9:08 PM To: solr-user@lucene.apache.org Subject: Re: Ingesting geo data into Solr very slow Hi Marta, Presumably you a

RE: Ingesting geo data into Solr very slow

2013-07-31 Thread Simonian, Marta M (US SSA)
Hi guys, Here is the reply I got from the solr group. I'll change those settings. It's good to know that it doesn't matter if we use the bean vs solr doc. -Marta -Original Message- From: David Smiley (@MITRE.org) [mailto:dsmi...@mitre.org] Sent: Tuesday, July 30, 2013 9:08 PM To: solr

Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Ok, thanks. I will check it. On Jul 31, 2013, at 5:08 PM, "Jack Krupansky" wrote: > See: > https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html > > I have more examples in my book. > > -- Jack Krupansky > > F

RE: Measuring SOLR performance

2013-07-31 Thread Markus Jelsma
Did you also test indexing speed? With default G1GC settings we're seeing a slightly higher latency for queries than CMS. However, G1GC allows for much higher throughput than CMS when indexing. I haven't got the raw numbers here but it is roughly 45 minutes against 60 in favour of G1GC! Load is

Re: upgrade from 4.3 to 4.4

2013-07-31 Thread Jack Krupansky
A dot release should never require reindexing, unless... there is some change in a field type analyzer or update processor that your data depends on. For example, some changes occurred in the ngram filter, so whether that would impact your data is up to you to decide. See: https://issues.apa

Re: SolrCloud Exception

2013-07-31 Thread Sinduja Rajendran
Thanks shawn for the reply. I would upgrade to solr 4.3 and check that. On Wed, Jul 31, 2013 at 4:13 PM, Shawn Heisey wrote: > On 7/31/2013 4:27 AM, Sinduja Rajendran wrote: > > I am running solr 4.0 in a cloud. We have close to 100Mdocuments. The > data > > is from a single DB table. I use di

Re: Measuring SOLR performance

2013-07-31 Thread Roman Chyla
I'll try to run it with the new parameters and let you know how it goes. I've rechecked details for the G1 (default) garbage collector run and I can confirm that 2 out of 3 runs were showing high max response times, in some cases even 10secs, but the customized G1 never - so definitely the paramete

upgrade from 4.3 to 4.4

2013-07-31 Thread Joshi, Shital
We have SolrCloud (4.3.0) cluster (5 shards and 2 replicas) on 10 boxes. We have about 450 million documents. We're planning to upgrade to Solr 4.4.0. Do We need to re-index already indexed documents? Thanks!

Re: Does solr cloud support rename or swap function for collection?

2013-07-31 Thread thzinc
This is awesome news. I had been looking for the ability to do this with SolrCloud since 4.0.0-ALPHA. We're on 4.1.0 right now, so this is a great reason to plan for an upgrade. Just to be clear, CREATEALIAS both creates and updates an alias, right?

RE: monitor jvm heap size for solrcloud

2013-07-31 Thread Joshi, Shital
Thanks for all answers. We decided to use VisualVM with multiple remote connections. -Original Message- From: Utkarsh Sengar [mailto:utkarsh2...@gmail.com] Sent: Friday, July 26, 2013 6:19 PM To: solr-user@lucene.apache.org Subject: Re: monitor jvm heap size for solrcloud We have been

Re: Measuring SOLR performance

2013-07-31 Thread Roman Chyla
Hi Dmitry, probably a mistake in the readme, try calling it with -q /home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries as for the base_url, i was testing it on solr4.0, where it tries contacting /solr/admin/system - is it different for 4.3? I guess I should make it configurable (it alread

Re: Solr Cloud Setup

2013-07-31 Thread AdityaR
Flavio, There was a problem with the solrconfig and schema files. One of the team members had deleted some entries in the solrconfig.xml and I was picking up the same Solr configuration every time. I got the latest version of Solr, carefully edited the solrconfig and schema files, and it worked.

Re: Solr PolyField

2013-07-31 Thread Jack Krupansky
See: https://builds.apache.org/job/Solr-Artifacts-4.x/javadoc/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html I have more examples in my book. -- Jack Krupansky From: Luís Portela Afonso Sent: Wednesday, July 31, 2013 11:41 AM To: solr-user@lucene.apache.org Su

Re: solr 4.4 multiple datasource connection

2013-07-31 Thread Alexandre Rafalovitch
On Wed, Jul 31, 2013 at 11:49 AM, Carmine Paternoster wrote: > Try datasource => dataSource in: Regards, Alex. P.s. This check will be (eventually) part of SolrLint: https://github.com/arafalov/SolrLint/issues/7 Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedi

solr 4.4 multiple datasource connection

2013-07-31 Thread Carmine Paternoster
in my db-data-config.xml i have configured two datasource, each with his parameter name, for example: but when in solr from dataimport i execute the second entity-name-query i

Re: Autowarming last 15 days data

2013-07-31 Thread Shawn Heisey
On 7/31/2013 9:21 AM, Cool Techi wrote: > Would it make sense if we open a newSearcher with the last 15 days documents? > since these is the documents which are mostly used by the users. Also, how > could i do this if this is possible? When you open a searcher, it's for the entire index. You ma

Re: Sharding with a SolrCloud

2013-07-31 Thread Erick Erickson
Well, assuming you have solved the differences in statistics between the index you maintain and the one in the cloud with respect to the scoring... My comment about indexing is probably irrelevant, you're not indexing anything to the SolrCloud cluster. But still doubt this will work. Here's the p

Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Hum, ok. It's possible to add to a field, static text? Text that i write on the configuration and then append another field? I saw something like CloneFieldProcessor but when i'm starting solr, it says that could not find the class. I was trying to use processors to move one field to another.

Re: SolrCloud - Replica 'down'. How to get it back as 'active'? - Solr 4.3.0

2013-07-31 Thread Jeroen Steggink
Thanks Anshum, autoSoftCommit was already set to 1000ms, but I changed the autoCommit to 3 minutes. I'll wait for it to come back. The index contains about 200.000 documents and the last commit was 14 hours ago. So I wonder how long it will take. I would have thought it would be back up alre

RE: Autowarming last 15 days data

2013-07-31 Thread Cool Techi
Would it make sense if we open a newSearcher with the last 15 days documents? since these are the documents which are mostly used by the users. Also, how could i do this if this is possible? > Date: Wed, 31 Jul 2013 08:54:47 -0600 > From: s...@elyograg.org > To: solr-user@lucene.apache.org > Sub

Re: Solr PolyField

2013-07-31 Thread Michael Della Bitta
OK, Then I would suggest creating multiValued enclosure_type, etc. tags for searching, and then one string-typed field to store the JSON snippet you've been showing. Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Ma

Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
As a single record? Hum, no. So an Rss has /rss/channel/ and then lot of /rss/channel/item, right? Each /rss/channel/item is a new document on Solr. I start with the solr example rss, but i change that to has more fields, other fields and get the feed url from a database. So each /rss/channel/i

Re: Solr PolyField

2013-07-31 Thread Michael Della Bitta
So you're trying to index a RSS feed as a single record, but you want to be able to search for and retrieve individual entries from within the feed? Is that the issue? Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence M

Re: Measuring SOLR performance

2013-07-31 Thread Shawn Heisey
On 7/31/2013 12:24 AM, William Bell wrote: > But that link does not tell me which on you are using? > > You are listing like 4 versions on your site. > > Also, what did it fix? Pause times? > > > Any other words of wisdom ? I'm not sure whether that was directed at me or Roman, but here's my a

Re: Sharding with a SolrCloud

2013-07-31 Thread Oliver Goldschmidt
Thank you very much for that information, Erick. That was what I was fearing... Well, the problem, why I am trying to do this is, that the SolrCloud is managed by someone else. We are indexing some content to a pretty small local index. To this index we have complete access and can do whatever we

Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
These fields can be multiValued. In the RSS standard it is not correct to do that, but some sources do and I'd like to grab it all. Is there any way to make that possible? Once again, Many thanks :) On Jul 31, 2013, at 3:54 PM, Michael Della Bitta wrote: > Luís, > > Is there a reason why spli

Re: Solr PolyField

2013-07-31 Thread Michael Della Bitta
Luís, Is there a reason why splitting this up into enclosure_type, enclosure_url, and enclosure_length would not work? Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10

Re: Autowarming last 15 days data

2013-07-31 Thread Shawn Heisey
On 7/31/2013 7:30 AM, Cool Techi wrote: > We have a solr master slave set up with close to 30 million records. Our > index changes/updates very frequently and replication is set up at 60 seconds > delay. > > Now every time replication completes, the new searches take a time. How can > this be i

Re: Unexpected character '<' (code 60) expected '='

2013-07-31 Thread Shawn Heisey
On 7/31/2013 7:16 AM, Vineet Mishra wrote: > I checked the File. . .nothing is there. I mean the formatting is correct, > its a valid XML file. What version of Solr, and how large is your XML file? If Solr is older than version 4.1, then the POST buffer limit is decided by your container config,

Re: How might one search for dupe IDs other than faceting on the ID field?

2013-07-31 Thread Shawn Heisey
On 7/30/2013 11:22 PM, Dotan Cohen wrote: > I see, thanks. I thought that 'disk cache' was something on disk, such > as swap space. The server is already maxed out on RAM: > $ free -m > total used free sharedbuffers > cached > Mem: 14980 14906 7

Re: Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Hi, I'm trying to index information of RSS Feeds. So in a more detailed explanation: The RSS feed has something like: <enclosure url="http://www.engadget.com/podcasts/Engadget_Podcast_353.mp3" length="32642192" type="audio/mpeg"/> With my current configuration, this is working and i get a result like that:

Re: SolrCloud - Replica 'down'. How to get it back as 'active'? - Solr 4.3.0

2013-07-31 Thread Anshum Gupta
It perhaps is just replaying the transaction logs and coming up. Wait for it is what I'd say. The admin UI as of now doesn't show replaying of transaction log as 'recovering', it does so only during peer sync. Also, you may want to add autoSoftCommit and increase the autoCommit to a few minutes.

Re: Sharding with a SolrCloud

2013-07-31 Thread Erick Erickson
You're in uncharted territory. I can imagine you use a SolrCloud cluster as a separate Solr for a federated search, but using it as a single shard just seems wrong. If nothing else, indexing to the shards will require that the documents be routed correctly. But having one shard in SolrCloud and an

Re: Solr PolyField

2013-07-31 Thread Erick Erickson
Nope. Solr fields are flat. Why do you want to do this? I'm asking because this might be an XY problems and there may be other possibilities. Best Erick On Wed, Jul 31, 2013 at 5:09 AM, Luís Portela Afonso wrote: > Hi, I'm trying to create a field with multiple fields inside, that is: > > origin

Re: Trying to determine the benefit of spellcheck-based suggester vs. using terms component?

2013-07-31 Thread Erick Erickson
Right. The big difference is that the TermsComponent can't give you any suggestions that are NOT in your index. So if someone just misspells a word there's no real way to say "this term is similar" using straight TermsComponent. Also, the TermsComponent takes a prefix. So a nonsense prefix won't s

SolrCloud - Replica 'down'. How to get it back as 'active'? - Solr 4.3.0

2013-07-31 Thread Jeroen Steggink
Hi, After the following error, one of the replicas of the leader went down. "Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later." I increased the autoCommit time to 5000ms and restarted Solr. However, the status is still set to "down". How do I get it back to

Re: SolrCloud Exception

2013-07-31 Thread Shawn Heisey
On 7/31/2013 4:27 AM, Sinduja Rajendran wrote: > I am running solr 4.0 in a cloud. We have close to 100Mdocuments. The data > is from a single DB table. I use dih. > Our solrCloud has 3 zookeepers, one tomcat, 2 solr instances in same > tomcat. We have 8 GB Ram. > > After indexing 14M, my indexing

Re: Measuring SOLR performance

2013-07-31 Thread Dmitry Kan
Ok, got the error fixed by modifying the base solr url in solrjmeter.py (added core name after /solr part). Next error is: WARNING: no test name(s) supplied nor found in: ['/home/dmitry/projects/lab/solrjmeter/demo/queries/demo.queries'] It is a 'slow start with new tool' symptom I guess.. :) O

Re: Improper shutdown of Solr in Jetty 9

2013-07-31 Thread Dmitry Kan
OK. On ubuntu there are shell scripts that come with jetty 9. They seem to do the proper job (disclaimer: not yet extensive testing with solr done, but looks good so far). Not sure, how well jetty supports win environment on the life-cycle automation side. On Wed, Jul 31, 2013 at 1:43 PM, Artem K

Re: Measuring SOLR performance

2013-07-31 Thread Dmitry Kan
Hi Roman, What version and config of SOLR does the tool expect? Tried to run, but got: **ERROR** File "solrjmeter.py", line 1390, in main(sys.argv) File "solrjmeter.py", line 1296, in main check_prerequisities(options) File "solrjmeter.py", line 351, in check_prerequisities e

Autowarming last 15 days data

2013-07-31 Thread Cool Techi
Hi, We have a solr master slave set up with close to 30 million records. Our index changes/updates very frequently and replication is set up at 60 seconds delay. Now every time replication completes, the new searches take a time. How can this be improved? I have come across that warming would h

Re: Trying to determine the benefit of spellcheck-based suggester vs. using terms component?

2013-07-31 Thread Timothy Potter
Thanks for the reply Erick. I'm looking for type-ahead support; using spell checking too via the DirectSolrSpellChecker. Seems like the spell check based suggester is designed for type-ahead or am I not understanding something? Here's my config: explicit json

Re: Unexpected character '<' (code 60) expected '='

2013-07-31 Thread Vineet Mishra
I checked the File. . .nothing is there. I mean the formatting is correct, its a valid XML file. On Wed, Jul 31, 2013 at 6:38 PM, Markus Jelsma wrote: > This file is malformed: > > *SEVERE: org.apache.solr.common.SolrException: Unexpected character '<' > (code 60) expected '='* > * at [row,col {

Re: Solr list all records but fq matching records first

2013-07-31 Thread Jack Krupansky
A filter query is used to exclude all documents that do not match - it is an "AND" operation. In your case, you don't want to exclude documents, so... you can't use a filter query. What people typically do for your use case is to add an "OR" clause or a "bq" boost query that merely boosts docu
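The difference between filtering and boosting described here could be sketched as request parameters. This is a minimal sketch, not the exact query from the thread: the field name `category:books` is a made-up placeholder, and it assumes the edismax query parser, whose `bq` parameter adds a boost clause without excluding anything.

```python
from urllib.parse import urlencode


def boosted_query_params(preferred_clause, boost=100):
    """Match every document, but rank those matching preferred_clause first.

    No fq is sent, because a filter query would remove the non-matching
    documents instead of just ranking them lower.
    """
    return {
        "q": "*:*",
        "defType": "edismax",
        "bq": f"{preferred_clause}^{boost}",
    }


# "category:books" is a hypothetical clause used only for illustration.
params = boosted_query_params("category:books")
query_string = urlencode(params)
print(query_string)
```

The boost value is relative; as noted elsewhere in this thread, adding debugQuery=true to the request shows how much it contributes to each score.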

Re: How might one search for dupe IDs other than faceting on the ID field?

2013-07-31 Thread Jack Krupansky
Good to note! But... any "search" will not detect dupe IDs for uncommitted documents. -- Jack Krupansky -Original Message- From: Mikhail Khludnev Sent: Wednesday, July 31, 2013 6:11 AM To: solr-user Subject: Re: How might one search for dupe IDs other than faceting on the ID field?

RE: new field type - enum field

2013-07-31 Thread Elran Dvir
Hi, I have managed to attach the patch in Jira. Thanks. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday, July 29, 2013 2:15 PM To: solr-user@lucene.apache.org Subject: Re: new field type - enum field OK, if you can attach it to an e-mail, I'll atta

RE: Unexpected character '<' (code 60) expected '='

2013-07-31 Thread Markus Jelsma
This file is malformed: *SEVERE: org.apache.solr.common.SolrException: Unexpected character '<' (code 60) expected '='* * at [row,col {unknown-source}]: [20281,18]* Check row 20281 column 18 -Original message- > From:Vineet Mishra > Sent: Wednesday 31st July 2013 15:05 > To: solr-use
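One way to double-check a file that looks valid by eye is to run it through a parser that reports the position of the first error. The sketch below uses Python's standard library rather than Solr's own parser, so the reported row and column may not match Solr's numbers exactly; the sample document is made up.

```python
import xml.etree.ElementTree as ET


def find_xml_error(text):
    """Return the (line, column) of the first XML parse error, or None if well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position
    return None


# Hypothetical snippet with a '<' inside a tag, similar in spirit to the reported error.
broken = "<doc>\n<field name<x/>\n</doc>"
print(find_xml_error(broken))
print(find_xml_error("<doc><field name='id'>1</field></doc>"))
```

For a file on disk, reading it and passing the contents to this function gives a quick check before posting.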

Unexpected character '<' (code 60) expected '='

2013-07-31 Thread Vineet Mishra
Hi All I am currently stuck in a Solr Issue while Posting some data to Solr Server. I have some record from Hbase which I am posting to Solr, but after posting some 1 Million of data records, it suddenly stopped. Checking the Catalina log trace it showed, *org.apache.solr.common.SolrException: U

Re: SimplePostTool: FATAL: Solr returned an error #400 Bad Request

2013-07-31 Thread Vineet Mishra
I got it resolved; the actual error was further up in the trace. The posted XML was not formed properly for the Solr field *Date*, which usually takes the format *2006-07-15T22:18:48Z*. This is the standard format for the Solr date (datatype), which follows specifically
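A value in the format quoted above can be produced before posting, for example like this. It is a minimal sketch with a hypothetical helper name, and it assumes that timestamps without timezone information are already in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_solr_date(dt):
    """Format a datetime like 2006-07-15T22:18:48Z (UTC, second precision)."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_solr_date(datetime(2006, 7, 15, 22, 18, 48)))

# A timestamp with a +02:00 offset is converted to UTC first.
local = datetime(2006, 7, 16, 0, 18, 48, tzinfo=timezone(timedelta(hours=2)))
print(to_solr_date(local))
```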

Re: Negative Query Behaviour in Solr 3.2

2013-07-31 Thread Jack Krupansky
Since there are no parentheses, the terms and operators are all at the same level and the "OR" is essentially a redundant operator and ignored, so: q=name:memory OR -name:encoded is treated as: q=name:memory -name:encoded When what you probably want is: q=name:memory OR (-name:encoded) BUT.

queryResultCache showing all zeros

2013-07-31 Thread Jean-Sebastien Vachon
Hi, We just configured a new Solr cloud (5 nodes) running Solr 4.3, ran about 200 000 queries taken from our production environment and measured the performance of the cloud over a collection of 14M documents with the default Solr settings. We are now trying to tune the different caches and whe

Working with solr over two different db schemas

2013-07-31 Thread Mysurf Mail
Been working on it for quite some time. this is my config Now, this runs in my test env and the only thing I do is change the configuration to another db( and as a result also the schema name from [dbo] to another ) This results in a totally different behavior. In the first confi

Re: result grouping and paging, solr 4.21

2013-07-31 Thread Erick Erickson
Not that I know of. Grouping pretty much treats all groups the same... Best Erick On Wed, Jul 31, 2013 at 4:14 AM, Gunnar wrote: > Hello, > > I'm trying to page results with grouping /field collapsing. My query is: > > ?q=myKeywords&start=0&rows=100&group=true&group.field=myGroupField&group.form

Re: SimplePostTool: FATAL: Solr returned an error #400 Bad Request

2013-07-31 Thread Erick Erickson
Probably not the root of your problem, but bq: and committing it there after. Does that mean you're calling commit after every document? This is usually poor practice, I'd set the autocommit intervals on solrconfig.xml and NOT call commit explicitly. Does the same document fail every time? What

Re: Trying to determine the benefit of spellcheck-based suggester vs. using terms component?

2013-07-31 Thread Erick Erickson
The biggest thing is that the spellchecker has lots of knobs to tune, all the stuff in http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate TermsComponent, on the other hand, just gives you what's in the index with essentially no knobs to tune. So it depends on your goal. Typeahead

Re: Improper shutdown of Solr in Jetty 9

2013-07-31 Thread Artem Karpenko
Hello Dmitry, it's Windows 7. I'm starting Jetty with java -jar start.jar 31.07.2013 12:36, Dmitry Kan wrote: Artem, What OS are you using? So far jetty 9 with solr 4.3.1 works ok under ubuntu 12.04. On 30 Jul 2013 17:23, "Alexandre Rafalovitch" wrote: Of course, I meant Jetty (not Tomcat)

Re: Performance question on Spatial Search

2013-07-31 Thread Mikhail Khludnev
On Wed, Jul 31, 2013 at 1:10 AM, Steven Bower wrote: > > not sure what you mean by good hit ratio? > I mean such queries are really expensive (even on cache hit), so if the list of ids changes every time, it never hits the cache and hence executes these heavy queries every time. It's well known perf

SolrCloud Exception

2013-07-31 Thread Sinduja Rajendran
I am running Solr 4.0 in a cloud. We have close to 100M documents. The data is from a single DB table. I use DIH. Our SolrCloud has 3 ZooKeepers, one Tomcat, 2 Solr instances in the same Tomcat. We have 8 GB RAM. After indexing 14M, my indexing fails with the below exception. solr org.apache.lucene.i

Re: How might one search for dupe IDs other than faceting on the ID field?

2013-07-31 Thread Mikhail Khludnev
fwiw, this code won't capture uncommitted duplicates. On Wed, Jul 31, 2013 at 9:41 AM, Dotan Cohen wrote: > On Tue, Jul 30, 2013 at 11:14 PM, Jack Krupansky > wrote: > > The Solr SignatureUpdateProcessorFactory is designed to facilitate > dedupe... > > any particular reason you did not use it

Re: FieldCollapsing issues in SolrCloud 4.4

2013-07-31 Thread Paul Masurel
Do you mean you get different results with group=true? numFound is supposed to return the number of ungrouped hits. To get the number of groups, you are expected to set group.ngroups=true. Even then, the result will only give you an upper bound in a distributed environment. To get the exact numbe
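
As a rough illustration of the parameters mentioned in this reply, the following Python sketch builds a grouped query string that also sets group.ngroups=true. The query `myKeywords` and the field `myGroupField` are placeholders borrowed from another thread in this listing, not from this one, and `grouped_query` is a hypothetical helper. The script only assembles the query string; it does not contact a Solr server.

```python
from urllib.parse import urlencode


def grouped_query(q, group_field, start=0, rows=10):
    """Build a query string for a grouped search that also asks for the group count.

    The parameter names (group, group.field, group.ngroups) are standard Solr
    request parameters; the values passed in are placeholders.
    """
    params = [
        ("q", q),
        ("start", start),
        ("rows", rows),
        ("group", "true"),
        ("group.field", group_field),
        ("group.ngroups", "true"),
    ]
    return urlencode(params)


query_string = grouped_query("myKeywords", "myGroupField")
print(query_string)
```

The resulting string would be appended to a select handler URL. As the reply notes, the count reported in a distributed setup should be read as an upper bound.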

TrieField and FieldCache confusion

2013-07-31 Thread Paul Masurel
Hello everyone, I have a question about Solr TrieField and Lucene FieldCache. From my understanding, Solr added the implementation of TrieField to perform faster range queries. For each value it will index multiple terms. The n-th term being a masked version of our value, showing only its first (
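
The idea of indexing several progressively coarser terms per value can be sketched in a few lines of Python. This is a simplified illustration, not Lucene's actual term encoding: it assumes a non-negative 32-bit integer, and the function name `trie_terms` and the step of 8 bits are chosen here only as an example (Solr exposes the step as the `precisionStep` attribute).

```python
def trie_terms(value, precision_step=8, bits=32):
    """Return (shift, prefix) pairs for a non-negative integer.

    Each pair keeps only the high-order bits of `value`: the lowest
    `shift` bits are dropped, so larger shifts give coarser terms.
    This is an illustrative simplification, not Lucene's byte encoding.
    """
    if value < 0 or value >= 1 << bits:
        raise ValueError("value must fit in an unsigned `bits`-bit integer")
    return [(shift, value >> shift) for shift in range(0, bits, precision_step)]


terms = trie_terms(0x12345678, precision_step=8)
for shift, prefix in terms:
    print(f"shift={shift:2d} prefix={prefix:#x}")
```

A range query can then match a few coarse prefixes in the middle of the range and use full-precision terms only near its two ends, which is where the speed-up is expected to come from.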

Re: Negative Query Behaviour in Solr 3.2

2013-07-31 Thread Mikhail Khludnev
Can you try: q=+name:memory -name:encoded or q=name:memory AND -name:encoded On Wed, Jul 31, 2013 at 10:14 AM, karanjindal < karan_jin...@students.iiit.ac.in> wrote: > Hi All, > > I am using solr 3.2 and confused how a particular query is executed. > q=name:memory OR -name:encoded > separately
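
One practical detail when trying the first suggestion over HTTP: a literal `+` in a URL query string is decoded as a space, so it has to be percent-encoded before it reaches Solr. The Python sketch below only encodes the two suggested queries and does not send any request. The variable names are illustrative.

```python
from urllib.parse import urlencode

# The two query forms suggested in the reply above.
candidates = [
    "+name:memory -name:encoded",
    "name:memory AND -name:encoded",
]

# urlencode percent-encodes '+' as %2B and turns spaces into '+'.
encoded = [urlencode({"q": q}) for q in candidates]
for line in encoded:
    print(line)
```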

Re: Improper shutdown of Solr in Jetty 9

2013-07-31 Thread Dmitry Kan
Artem, What OS are you using? So far jetty 9 with solr 4.3.1 works ok under ubuntu 12.04. On 30 Jul 2013 17:23, "Alexandre Rafalovitch" wrote: > Of course, I meant Jetty (not Tomcat). So apologies for spam and confusion > of my own. The rest of the statement stands. > > Personal website: http:/

Solr list all records but fq matching records first

2013-07-31 Thread Thyagaraj
I’m using solr 4.0 with DIH jdbc connector. I have a field named *code* with multiple values separated by semicolons (;). I am trying to list the search results that match the filter query (fq) at the top, with the remaining results listed below. So this is not just a filter query to list o

Solr show total row count in response of full import

2013-07-31 Thread Sandro Zbinden
Hey there, is there a way to show the total row count (documents that will be inserted) when executing a full import through the Data Import Request Handler? Currently after executing a full import and pointing to /dataimport you can get the total rows processed 6354 It would be nice if you

Sharding with a SolrCloud

2013-07-31 Thread Oliver Goldschmidt
Hi list, I have a Solr server, which uses sharding to make distributed search with another Solr server. The other Solr server now migrates to a Solr Cloud system. I've been trying recently to continue searching the Solr Cloud as a shard for my Solr server, but this is failing with mysterious effec

Solr PolyField

2013-07-31 Thread Luís Portela Afonso
Hi, I'm trying to create a field with multiple fields inside, that is: origin: { htmlUrl: "http://www.gazzetta.it/", streamId: "feed/http://www.gazzetta.it/rss/Home.xml", title: "Gazzetta.it" }, Get something like this. Is that possible? I'm using Solr 4.4.0. Thanks

Re: EmbeddedSolrServer Solr 4.4.0 bug?

2013-07-31 Thread Luis Cappa Banda
Thank you very much, Alan. Now it's working! I agree with you: this kind of thing should be documented at least in CHANGELOG.txt, because when upgrading from one version to another everything should be compatible between versions, but this is not the case, so people should be notified of that. Regards,

result grouping and paging, solr 4.21

2013-07-31 Thread Gunnar
Hello, I'm trying to page results with grouping /field collapsing. My query is: ?q=myKeywords&start=0&rows=100&group=true&group.field=myGroupField&group.format=simple&group.limit=1 The result will contain 70 groups, is there a way to get 100 records returned, means 70 from each group first doc
