Re: SolrException: Error opening new searcher

2013-03-13 Thread mark12345
Found the exception logs that match the notifications in http://localhost:8080/solr-app/#/~logging as quoted below: > 14:30:00 SEVERE SolrCore org.apache.solr.common.SolrException: Error > opening new searcher > 14:30:00 SEVERE SolrDispatchFilter > null:org.apache.solr.common.SolrException: Er

Re: Is maxPendingDeletes still used in Solr 4.x?

2013-03-13 Thread Mark Miller
No, totally gone. Super outdated info. - Mark On Mar 13, 2013, at 7:31 PM, roz dev wrote: > Hi All > > In earlier versions of Solr (1.4.x), we could define maxPendingDeletes to > indicate the number of deleted docs which should be batched before applying > them. > > SolrConfig.xml can have fo

Re: Using suggester for smarter phrase autocomplete

2013-03-13 Thread Jorge Luis Betancourt Gonzalez
Currently I'm using a separate core to query suggestions; for this I started from: https://github.com/cominvent/autocomplete. Basically, I'm only using the suggester component for term suggestions based on a tokenized field in my schema (all of this in Solr 3.6), perhaps instead of us

NRT persistant flags?

2013-03-13 Thread Ryan McKinley
I'm looking for a way to quickly flag/unflag documents. This could be one at a time or by query (even *:*) I have hacked together something based on ExternalFileField that is essentially a FST holding all the ids (solr not lucene). Like the FieldCache, it holds a WeakHashMap where the OpenBitSet

Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-13 Thread roz dev
Hi All I am wondering if there is a way to alter term frequency of a certain field as 1, even if there are multiple matches in that document? Use Case is: Let's say that I have a document with 2 fields - Name and - Description And, there is a document with data like this Document_1 Name = Blu

SolrCloud returns http code 100

2013-03-13 Thread mshirman
We have a cloud configuration with 2 shards, 2 nodes each, 2 replicas per shard with 1 collection/core. We are in the process of migrating from running Solr4 as a stand-alone. When we are indexing SolrCloud with roughly 2.5M docs without simultaneous client searches all seems to be ok. We are in

RE: Strange log messages from Jetty when running Solr Cloud example from 4.2.0

2013-03-13 Thread John, Phil (CSS)
I've raised SOLR-4573: https://issues.apache.org/jira/browse/SOLR-4573 Regards, Phil. From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Wed 13/03/2013 17:05 To: solr-user@lucene.apache.org Subject: Re: Strange log messages from Jetty when running Solr Clo

Solr 4.2 - DocValues on id field

2013-03-13 Thread Isaac Hebsh
Hi, The example schema.xml in Solr 4.2 does not define "id" field as docValues=true. Any good reason? (other than backward compat for index for previous version...) If my common case is fl=id (and no other field), DocValues is classic for me. Am I right?

Rejecting document already existing in different shard.

2013-03-13 Thread Marcin Rzewucki
Hi there, Let's say we use a custom hashing algorithm and there is a document already indexed in "shard1". After some time the same document has changed and should be indexed to "shard2" (because of routing rules used in indexing program). It has been indexed without issues and as a result 2 "almost

discovery-based core enumeration with embedded solr

2013-03-13 Thread Michael Sokolov
Has the new core enumeration strategy been implemented in the CoreContainer.Initializer.initialize() code path? It doesn't seem like it has. I get this exception: Caused by: org.apache.solr.common.SolrException: Could not load config for solrconfig.xml at org.apache.solr.core.CoreContai

Re: Zookeeper: Could not get shard_id for core

2013-03-13 Thread Erick Erickson
Yes. If you're running Solr in non-SolrCloud mode, then it's just like 3.x, your indexing process is responsible for sending the documents to the correct shard. Look at the sharding documentation for 3.x etc. Best Erick On Wed, Mar 13, 2013 at 2:37 PM, Raghav Karol wrote: > We can run solr4 in
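
A minimal sketch of what "the indexing process is responsible for sending the documents to the correct shard" can look like on the client side. The hashing scheme (CRC32 of the document id, modulo the shard count) and the core URLs are assumptions for illustration, not something Solr prescribes; any stable function works as long as the same id always maps to the same shard.

```python
import zlib

# Hypothetical shard list; the URLs are placeholders for your own cores.
SHARDS = [
    "http://localhost:8983/solr/core0",
    "http://localhost:8983/solr/core1",
]

def pick_shard(doc_id, shard_urls):
    """Map a document id to one shard URL in a repeatable way."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(doc_id.encode("utf-8")) % len(shard_urls)
    return shard_urls[index]

if __name__ == "__main__":
    for doc_id in ("321", "3", "abc"):
        print(doc_id, "->", pick_shard(doc_id, SHARDS))
```

Because the mapping depends on the number of shards, adding or removing a shard changes where existing ids land, so a reindex is needed in that case.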

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-13 Thread Erick Erickson
Stack traces... First, run jps -l; that will give you the process IDs of your running Java processes. Then: jstack <pid>. Usually I pipe the output from jstack into a text file... Best Erick On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda wrote: > Uhm, how can I do that... 'cleanly'? I know that wi

RE: [SPAM] Re: strange edismax parsing when searching in multiple fields (#TB)

2013-03-13 Thread Ahmet Arslan
Hi Tom, I don't use stop word removal either. I use hl.q parameter fed with "meaningful words". http://wiki.apache.org/solr/HighlightingParameters#hl.q --- On Wed, 3/13/13, Burgmans, Tom wrote: > From: Burgmans, Tom > Subject: RE: [SPAM] Re: strange edismax parsing when searching in multi

Re: solr.DirectUpdateHandler2 failed to instantiate

2013-03-13 Thread Jack Park
I can safely say that it is not DirectUpdateHandler2 failing; By commenting out my own handlers, the system boots without error. This means that my handlers are problematic in some way. The moment I put back just one of my handlers: hello

Re: Zookeeper: Could not get shard_id for core

2013-03-13 Thread Raghav Karol
We can run solr4 in "non-cloud" mode without zookeeper and query data in the different shards as: > http://localhost:8983/solr/select?q=foo:bar&shards=localhost:8983/solr/core0,localhost:8983/solr/core1 How about reindexing? Does the reindexing process (external to solr) need to select and man
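
The distributed query quoted above can be assembled programmatically. This is only a sketch: the helper name is made up for illustration, and the host and core names are the ones from the message.

```python
from urllib.parse import urlencode

def distributed_select_url(base_url, query, shards):
    """Build a /select URL that asks one node to query all listed shards."""
    params = {"q": query, "shards": ",".join(shards)}
    # Keep ':', '/' and ',' unescaped so the URL reads like the one in the post.
    return f"{base_url}/select?{urlencode(params, safe=':/,')}"

url = distributed_select_url(
    "http://localhost:8983/solr",
    "foo:bar",
    ["localhost:8983/solr/core0", "localhost:8983/solr/core1"],
)
print(url)
```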

Re: [SPAM] Re: strange edismax parsing when searching in multiple fields (#TB)

2013-03-13 Thread Walter Underwood
Yeah, the Ultraseek highlighter did not highlight standalone stopwords. It did highlight stopwords in phrases. That is the "vitamin a" test. wunder On Mar 13, 2013, at 8:55 AM, Burgmans, Tom wrote: > The main reason of using stopwords is to speed up query performance, since we > see that a hug

Using String Values in Solr Query Functions

2013-03-13 Thread slund
A simple question based on the same problem discussed in the "Solr Grouping and empty fields" topic. It would seem that using group.func=if(exists(groupField),groupField,uuidField) would allow us to only group documents when we know the group field value, and solve the issue given in the above top

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-13 Thread Luis Cappa Banda
Uhm, how can I do that... 'cleanly'? I know that with JConsole it's possible to output these traces, but with a .war application built on top of Spring I don't know how I can do that. In any case, here is my CloudSolrServer wrapper that is used by other classes. There is no sync method or piece of co

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-13 Thread Mark Miller
Could you capture some thread stack traces in the 'engine' and see if there are any blocking methods? - Mark On Mar 13, 2013, at 1:34 PM, Luis Cappa Banda wrote: > Just one correction: > > When I said: > > - I've checked SolrCloud via Solr Admin interface and it's OK: > everything is gr

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-13 Thread Luis Cappa Banda
Just one correction: When I said: - I've checked SolrCloud via Solr Admin interface and it's OK: everything is green, and I cant execute queries directly into Solr. I mean: - I've checked SolrCloud via Solr Admin interface and it's OK: everything is green, and *I can* execute queri

SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-13 Thread Luis Cappa Banda
Hello, guys! I've been experiencing some annoying behavior with my current production scenario. Here is the snapshot: - SolrCloud: 2 shards - Zookeeper ensemble: 3 nodes on *different machines* (most of the tutorials install 3 Zookeeper nodes on the same machine). - This is the zoo.

Re: Using suggester for smarter phrase autocomplete

2013-03-13 Thread Eric Wilson
I'm not concerned about stopwords, rather the situation where the first and second words are rarely used together, so don't occur together in a phrase in the dictionary. Thanks. On Wed, Mar 13, 2013 at 11:11 AM, Robert Muir wrote: > On Wed, Mar 13, 2013 at 11:07 AM, Eric Wilson > wrote: > > I'm

Re: Strange log messages from Jetty when running Solr Cloud example from 4.2.0

2013-03-13 Thread Mark Miller
Yeah, I was noticing this yesterday. It doesn't seem to affect the admin functionality, but I've seen the logging. I've been meaning to look into it, but no time yet. Could you file a JIRA issue? - Mark On Mar 13, 2013, at 12:05 PM, "John, Phil (CSS)" wrote: > Hi, > > > > I'm playing arou

Re: velocity in /srv/www

2013-03-13 Thread Shawn Heisey
On 3/13/2013 8:39 AM, Guy Dobson wrote: Is there a way to put the Velocity pages in /srv/www alongside of /htdocs and /cgi-bin and tell it to look in /opt/solr-4.1.0/... to find my Solr index so that we don't have to open port 8983 on the firewall When starting a new thread on a mailing list,

Re: velocity in /srv/www

2013-03-13 Thread Erik Hatcher
Not using Solr's VelocityResponseWriter. It literally is a Solr response writer :) And thus to use it you have to make standard search requests to Solr. You can move the Velocity templates to another location, as there is a way to specify the root directory for templates, but they only get re

Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-13 Thread Alexandre Rafalovitch
What about update request processors to drop the field? Regards, Alex On Mar 13, 2013 9:45 AM, "Artem OXSEED" wrote: > Hello, thank you for response! > > Configuration option does not help - it's probably not yet > implemented. I found however the line of code which checks versions: > > Lo

Strange log messages from Jetty when running Solr Cloud example from 4.2.0

2013-03-13 Thread John, Phil (CSS)
Hi, I'm playing around with the solr cloud example in the latest 4.2 release (on Windows) and I'm getting lots of warnings in the window where Jetty is running when accessing pages on the admin HTTP interface, they are all: WARN:oejh.HttpGenerator:Ignoring extra content and then the con

Re: Scaling SolrCloud and DIH

2013-03-13 Thread Mark Miller
There is still some work to be done to make DIH play nicely with SolrCloud in terms of failover. https://issues.apache.org/jira/browse/SOLR-4058 is one of the issues that should be addressed. I think I made another issue or two, but I don't remember them offhand. - Mark On Mar 13, 2013, at 11

RE: [SPAM] Re: strange edismax parsing when searching in multiple fields (#TB)

2013-03-13 Thread Burgmans, Tom
The main reason for using stopwords is to speed up query performance, since we see that a huge part of the time is consumed by highlighting stopwords. Also, when reading the full highlighted document, we think it is more readable when only meaningful words are highlighted. For searching

Re: [ANNOUNCE] Apache Solr 4.2 released

2013-03-13 Thread Erick Erickson
See CHANGES.txt, but 4.2 is pretty much bug fixing, I don't think there's anything special you need to do. Best Erick On Tue, Mar 12, 2013 at 1:31 PM, Marthi, Suneel wrote: > We presently have Indexes generated from Solr 4.1. What is the upgrade > path to Solr 4.2 ? > > > > On 3/11/13 8:37 PM,

Re: strange edismax parsing when searching in multiple fields (#TB)

2013-03-13 Thread Walter Underwood
Or don't use stopwords. I haven't used stopwords for, oh, a dozen years or so. Removing stopwords was a hack developed for 16-bit computers and 40 megabyte disks. We don't need to do that any more. wunder On Mar 13, 2013, at 8:28 AM, Ahmet Arslan wrote: > I would merge stop_en.txt and stop_fr.

Scaling SolrCloud and DIH

2013-03-13 Thread jimtronic
I'm curious how people are using DIH with SolrCloud. I have cron jobs set up to trigger the dataimports which come from both xml files and a sql database. Some are frequent small delta imports while others are larger daily xml imports. Here's what I've tried: 1. Set up a micro box that sends the

Re: strange edismax parsing when searching in multiple fields (#TB)

2013-03-13 Thread Ahmet Arslan
I would merge stop_en.txt and stop_fr.txt. Use same set of stop words for all fields that you search on. You might find this useful : http://bibwild.wordpress.com/2010/04/14/solr-stop-wordsdismax-gotcha/ --- On Wed, 3/13/13, Burgmans, Tom wrote: > From: Burgmans, Tom > Subject: strange edism

strange edismax parsing when searching in multiple fields (#TB)

2013-03-13 Thread Burgmans, Tom
Hi group, Background: I have a collection containing English and French documents. I made sure to index the English content in field "body" (fieldType=text_en) and the French content in field "body_fr" (fieldType=text_fr). The user could be either English or French so the goal is to execute the

How to use suggester for smarter multi-word autocomplete?

2013-03-13 Thread Eric Wilson
I'm trying to use the suggester for auto-completion with Solr 4. I have followed the example configuration for phrase suggestions at the bottom of this wiki page: http://wiki.apache.org/solr/Suggester This shows how to use a text file with the following text for phrase suggestions: # simple au

Re: Figure out what value was matched in multi valued field

2013-03-13 Thread mephisto
I should have been clearer in my original post. I don't need this for debugging; this is for production. In the search results I also have to show which keyword/phrase caused the document to be shown. Furthermore, each phrase/term has more details associated with it, such as pricing in

Re: Cloud with two down replica's

2013-03-13 Thread Arkadi Colson
I just fixed it by unloading one replica core, restarting Tomcat, and then adding the replica core again... On 03/13/2013 04:13 PM, Mark Miller wrote: Anything else in your logs? - Mark On Mar 13, 2013, at 10:57 AM, Arkadi Colson wrote: On one replica the logs are saying: Mar 13, 2013

Re: Reusing same searcher in Solr?

2013-03-13 Thread Mark Miller
I think there is a JIRA issue for Searcher leases somewhere. If not, it's something we have talked a lot about. It's useful for a variety of reasons. I'm sure it's in the pipeline, though I don't know where. - Mark On Mar 13, 2013, at 10:09 AM, Shawn Heisey wrote: > I was reading Mike McCandl

Re: Cloud with two down replica's

2013-03-13 Thread Mark Miller
Anything else in your logs? - Mark On Mar 13, 2013, at 10:57 AM, Arkadi Colson wrote: > On one replica the logs are saying: > > Mar 13, 2013 2:55:23 PM org.apache.solr.common.SolrException log > SEVERE: null:org.apache.solr.common.SolrException: Error handling 'status' > action >at > org

Re: Using suggester for smarter phrase autocomplete

2013-03-13 Thread Robert Muir
On Wed, Mar 13, 2013 at 11:07 AM, Eric Wilson wrote: > I'm trying to use the suggester for auto-completion with Solr 4. I have > followed the example configuration for phrase suggestions at the bottom of > this wiki page: > http://wiki.apache.org/solr/Suggester

Using suggester for smarter phrase autocomplete

2013-03-13 Thread Eric Wilson
I'm trying to use the suggester for auto-completion with Solr 4. I have followed the example configuration for phrase suggestions at the bottom of this wiki page: http://wiki.apache.org/solr/Suggester

Re: Cloud with two down replica's

2013-03-13 Thread Arkadi Colson
On one replica the logs are saying: Mar 13, 2013 2:55:23 PM org.apache.solr.common.SolrException log SEVERE: null:org.apache.solr.common.SolrException: Error handling 'status' action at org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:715) at org.

Cloud with two down replica's

2013-03-13 Thread Arkadi Colson
Due to server failure I have 2 down hosts for 1 shard: lvs_shard2 Is it possible to fix this? The index size is equal so I just need to be able to let one replica become active. { "lvs":{ "shards":{ "shard1":{ "range":"8000-", "replicas":{ "sol

Re: Poll: Largest SolrCloud out there?

2013-03-13 Thread Annette Newton
8 AWS hosts. 35GB memory per host 10Gb allocated to JVM 13 aws compute units per instance 4 Shards, 2 replicas 25M docs in total 22.4GB index per shard High writes, low reads On 13 March 2013 09:12, adm1n wrote: > 4 AWS hosts: > Memory: 30822868k total > CPU: Intel(R) Xeon(R) CPU E5-2670 0 @

Re: velocity in /srv/www

2013-03-13 Thread Paul Libbrecht
Guy, you'd need a proxy to go from one port (80 for the apache) to port 8983. Apache httpd will not run solr alone. Then the question of where you put the velocity page is "just a matter of configuration". A symbolic link probably. paul On 13 March 2013, at 15:39, Guy Dobson wrote: > Fellow S

velocity in /srv/www

2013-03-13 Thread Guy Dobson
Fellow Solrites, Is there a way to put the Velocity pages in /srv/www alongside of /htdocs and /cgi-bin and tell it to look in /opt/solr-4.1.0/... to find my Solr index so that we don't have to open port 8983 on the firewall ? Thanks, Guy Guy Dobson Integrated Systems Librarian Drew Unive

Re: commit

2013-03-13 Thread Timothy Potter
collection -> Plugins / Stats -> CORE -> searcher On Wed, Mar 13, 2013 at 4:53 AM, Arkadi Colson wrote: > Sorry I'm quite new to solr but where exactly in the admin interface can I > find how long it takes to warm the index? > > Arkadi > > > On 03/13/2013 11:19 AM, Upayavira wrote: > >> It dep

Re: Figure out what value was matched in multi valued field

2013-03-13 Thread Paul Libbrecht
Mephisto, Maybe LUCENE-1999 helps you. We've used it with some success. Otherwise, you're left with highlighting. paul On 13 March 2013, at 14:11, Jack Krupansky wrote: > Add &debugQuery=true to your query and examine the "explain" section, which > will show the terms/phrases that scored for e

Reusing same searcher in Solr?

2013-03-13 Thread Shawn Heisey
I was reading Mike McCandless' blog and came across this post: http://blog.mikemccandless.com/2011/11/searcherlifetimemanager-prevents-broken.html Is this kind of functionality available in Solr? I update my index once a minute, so being able to reuse an old searcher for a little while would

Re: Solr replication takes long time

2013-03-13 Thread Victor Ruiz
While looking at the Solr logs, I found a java.lang.OutOfMemoryError: Java heap space that was happening 2 times per hour. So I tried increasing the max heap memory assigned to the JVM (-Xmx), and since then the servers are not crashing, even though the replication still takes a long time to complete. But f

Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-13 Thread Artem OXSEED
Hello, thank you for response! Configuration option does not help - it's probably not yet implemented. I found however the line of code which checks versions: Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId()); long foundVersion = lastVersion == null ? -1 : lastVersion; if ( versionOn

Re: PDF keyword searches not accurate

2013-03-13 Thread JDJ
I don't know if this is going to make a difference, or not, but I just discovered that the version of Solr that ships with CF Server 9 is v1.4.0. - JDJ "There are two kinds of people in the world; those who understand binary, and those who don't. -- View this message in conte

Re: Figure out what value was matched in multi valued field

2013-03-13 Thread Jack Krupansky
Add &debugQuery=true to your query and examine the "explain" section, which will show the terms/phrases that scored for each document. -- Jack Krupansky -Original Message- From: mephisto Sent: Wednesday, March 13, 2013 6:52 AM To: solr-user@lucene.apache.org Subject: Figure out what v
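
As a rough sketch of this suggestion, the snippet below adds debugQuery=true to a select request and pulls the per-document "explain" entries out of a parsed JSON response. The helper names, the base URL and the sample dictionary are assumptions for illustration; the sample is a hand-written stand-in, not real Solr output, and the debug/explain layout should be checked against your own response.

```python
from urllib.parse import urlencode

def debug_select_url(base_url, query):
    """Build a /select URL with debugging enabled and JSON output."""
    params = {"q": query, "debugQuery": "true", "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"

def explain_by_doc(response):
    """Return the {doc id: explain text} mapping, or {} if it is absent."""
    return response.get("debug", {}).get("explain", {})

# Hand-written placeholder standing in for a parsed response.
sample = {"debug": {"explain": {"doc1": "(placeholder explain text)"}}}

url = debug_select_url("http://localhost:8983/solr", "keywords:tooth")
print(url)
print(explain_by_doc(sample))
```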

Re: Search data who does not have "x" field

2013-03-13 Thread anurag.jain
"another solution would be to add a boolean field, hasCategory, and use it for filtering q=&fq=hasCategory:true " I am not getting result. i am trying localhost:8983/search?q=*:*&fq=category:true it is giving zero result. by the way first technique is working fine. -- View this message i

Re: PDF keyword searches not accurate

2013-03-13 Thread JDJ
Unfortunately, I am not in control of the development environment, so installing a stand-alone Solr is not an option. Well, let me correct that.. I do have my own instance of ColdFusion Server on my local machine (sometimes I develop locally, sometimes I develop on the network), but if I installed

Re: Search data who does not have "x" field

2013-03-13 Thread Gora Mohanty
On 13 March 2013 17:57, anurag.jain wrote: > Hi all, > > I am facing a problem. [...] > and some of data have "category" field and some of don't have. > > for example. > > > { > "id":"321", > "name":"anurag", > "category":"x" > }, > { > "id":"3", > "name":"john" > } > > > now i want to search that

Re: Search data who does not have "x" field

2013-03-13 Thread Victor Ruiz
add this to your query, or filter query: q=&fq=-category:[* TO *] another solution would be to add a boolean field, hasCategory, and use it for filtering q=&fq=hasCategory:true Victor anurag.jain wrote > Hi all, > > I am facing a problem. > > Problem is: > > I have updated 250 data t
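
A small sketch of the first suggestion: build a request whose filter query keeps only documents with no value in a given field. The helper name and base URL are assumptions; q=*:* is used as the main query, as in the follow-up message in this thread.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

def missing_field_url(base_url, field):
    """Build a /select URL matching documents that have no value in `field`."""
    # A leading '-' negates the open-ended range, which matches any value.
    params = {"q": "*:*", "fq": f"-{field}:[* TO *]"}
    return f"{base_url}/select?{urlencode(params)}"

url = missing_field_url("http://localhost:8983/solr", "category")
decoded = parse_qs(urlsplit(url).query)
print(decoded["fq"][0])  # the filter as the server sees it after URL decoding
```

urlencode takes care of escaping the brackets, colon and spaces, which is easy to get wrong when the URL is typed by hand.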

Re: Search data who does not have "x" field

2013-03-13 Thread Rafał Kuć
Hello! You can, for example, add a filter that keeps only the documents that don't have a value in a field, like: fq=!category:[*+TO+*] -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > Hi all, > I am facing a problem. > Problem is: > I h

Search data who does not have "x" field

2013-03-13 Thread anurag.jain
Hi all, I am facing a problem. The problem is: I have indexed 250 documents in Solr. Some of them have a "category" field and some don't. For example: { "id":"321", "name":"anurag", "category":"x" }, { "id":"3", "name":"john" } Now I want to search for the documents that do not have that

Figure out what value was matched in multi valued field

2013-03-13 Thread mephisto
Hi, I have an application where users are tagged with multiple keywords/phrases. Depending on the keyword, there is additional information specific to the keyword the document was matched for. Tooth Whitening Tooth Dentures Veneers I need to be able to figure out which particular value "Toot

Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-13 Thread Alexandre Rafalovitch
I believe you are running into the update semantics, new with Solr 4 (4.1?): https://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics I am not sure Wiki is 100% correct (especially on default mode), but it should be good enough. Basically, because you are specifying some real value in _ver

RE: Search term matching on part of a token, not the whole token

2013-03-13 Thread John, Phil (CSS)
Hi Chris, Thank you for taking the time to assist. Here's both the field and fieldtype definition: And here's a simplified set of params passed to solr (which still causes the issue to show up): q=class:510 defType=edismax fl=*,score q.op=AND mm=100% tie=0.01 st

Re: searching exact phrase with stop word returns bad results

2013-03-13 Thread Ahmet Arslan
Hi, You need an analyzer that injects these five tokens in your example: john@gmail.com => john doe @ gmail com If you use autoGeneratePhraseQueries = true, then all of your three needs will be satisfied. Don't use quotes in your query. Just q=@gmail.com not q="@gmail.com " I would go wi

Re: Solr replication takes long time

2013-03-13 Thread Victor Ruiz
After upgrading to 4.2, the problem is still not solved; in this image you can see how slow the transfer speed is. At least, after the update, the master is not blocked during replication. Any idea? -- View this message in co

Re: Is Lucene's DrillSideways something suitable for Solr?

2013-03-13 Thread Michael McCandless
On Tue, Mar 12, 2013 at 11:24 PM, Yonik Seeley wrote: > On Tue, Mar 12, 2013 at 10:27 PM, Alexandre Rafalovitch > wrote: >> Lucene seems to get a new DrillSideways functionality on top of its own >> facet implementation. >> >> I would love to have something like that in Solr > > Solr has had mult

Re: commit

2013-03-13 Thread Arkadi Colson
Sorry I'm quite new to solr but where exactly in the admin interface can I find how long it takes to warm the index? Arkadi On 03/13/2013 11:19 AM, Upayavira wrote: It depends whether you are using soft commits - that changes things a lot. If you aren't, then you should look in the admin inte

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-13 Thread Jack Krupansky
Thanks for the clarification! Although, maybe we need to come up with some simpler, more clear terminology. -- Jack Krupansky -Original Message- From: Steve Rowe Sent: Wednesday, March 13, 2013 12:50 AM To: solr-user@lucene.apache.org Subject: Re: copyField with * stops working with 4

Re: Special characters not indexed

2013-03-13 Thread Jack Krupansky
If your goal is simply to determine if a character occurs within a term, you would have to use a wildcard such as *&*. If your goal is to make special characters be complete terms, you may have to develop a custom tokenizer that emits multiple tokens at the same position. Or, maybe you want to

Version conflict during data import from another Solr instance into clean Solr

2013-03-13 Thread Artem OXSEED
Hi, I've configured data import handler: class="org.apache.solr.handler.dataimport.DataImportHandler"> data-config.xml data-config.xml: http://host:8080/index"; query="*:*" wt="javabin"/> Both Solr instances are of the same version - 4.1. Target Solr instance

Re: xml output question

2013-03-13 Thread Upayavira
As has been said, you can use XSLT with wt=xslt&tr=stylesheet.xsl. You don't need to use Saxon, unless you need specific (e.g. XSLT 2.0) features. You don't say what exts and last actually mean, so it isn't possible to say whether this can be achieved with XSLT. Upayavira On Tue, Mar 12, 2013, a

Re: commit

2013-03-13 Thread Upayavira
It depends whether you are using soft commits - that changes things a lot. If you aren't, then you should look in the admin interface, and see how long it takes to warm your index, and commit at least less frequently than that (commit more often, and you'll have concurrent warming searchers which

Re: Solr Suggester component doesn't return hits for non-English words

2013-03-13 Thread Dejan Caric
Thank you Carlos and sorry for late reply. I've set the threshold to 0 and that did the trick. Kind regards, Dejan On Tue, Feb 26, 2013 at 3:05 AM, Carlos Maroto < cmar...@searchtechnologies.com> wrote: > Hi Dejan, > > I wouldn't say your problem is because the words are non-English words as

Re: Special characters not indexed

2013-03-13 Thread vsl
After changing to white space tokenizer there are still no results for given search term "&". Only when the whole word ("§$ %&/( )=? +*#'-<>") was given as a search term, this document was shown in results. -- View this message in context: http://lucene.472066.n3.nabble.com/Special-characters-n

Re: searching exact phrase with stop word returns bad results

2013-03-13 Thread adfel70
Am I the first needing this behaviour? Have you seen any set of tokenizer-filters for a similar requirement? Upayavira wrote > Exact phrase search isn't exact phrase search as you are thinking of it. > A phrase search for "foo bar" searches for the terms foo and bar, and > then checks whether th

Re: How to Integrate Solr With Hbase

2013-03-13 Thread adfel70
My needs are different. I have documents with large fields, larger than solr can store (at least if there haven't been any changes in this issue). I want to index the fields but don't want to store them in solr. And I also need highlighting on these fields. So I need to integrate solr and hbase in

Re: xml output question

2013-03-13 Thread Miguel
Hi jazz, You can use the "wt" and "tr" parameters for XSLT transformation, for example: &wt=xslt&tr=test.xsl so you can generate whatever XML output you need from Solr. On 12/03/2013 20:22, jazz wrote: Hi Michael, Thanks for the reply. My question is how to make child XML elements such as: These c

Re: commit

2013-03-13 Thread Arkadi Colson
What would be a good value for maxTime or maxDocs knowing that we insert about 10 docs/sec? Will it be a problem that we only use maxDocs = 1 because it's not searchable yet... On 03/13/2013 10:00 AM, Upayavira wrote: Auto commit would seem a good idea, as you don't want your independent w

Re: How to Integrate Solr With Hbase

2013-03-13 Thread Upayavira
If you want to be able to use the data in both places, that's what you will need. You won't be able to have Solr read indexes from within hbase, it needs to manage its own indexes. Upayavira On Wed, Mar 13, 2013, at 09:03 AM, adfel70 wrote: > So you end up having all the data both in hbase and sol

Re: searching exact phrase with stop word returns bad results

2013-03-13 Thread Upayavira
Exact phrase search isn't exact phrase search as you are thinking of it. A phrase search for "foo bar" searches for the terms foo and bar, and then checks whether they are one position apart. If punctuation has been removed during analysis, it *cannot* play a part in a search of any kind. You may

RE: Poll: Largest SolrCloud out there?

2013-03-13 Thread adm1n
4 AWS hosts: Memory: 30822868k total CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz x8 17M docs 5 Gb index. 8 master-slave shards (2 shards /host). 57 msec/query avg. time. (~110K queries/24 hours). -- View this message in context: http://lucene.472066.n3.nabble.com/Poll-Largest-SolrCloud-out-

Re: debugQuery, explain tag - What does the fieldWeight value refer to?,

2013-03-13 Thread Upayavira
In the order they appear in the index. This used to be the order in which they were indexed, but merging strategies can now change that order. You can control that order by sorting on the pseudo field 'score', then on another field or function. Or you can include other functions or fields in the sco

Re: How to Integrate Solr With Hbase

2013-03-13 Thread adfel70
So you end up having all the data both in hbase and solr? Bharat Mallampati wrote > We haven't used Nutch to crawl data into SOLR, we used the standard HBASE > API to read and SOLRJ API to write to solr. > > And our document size is relatively small with 100 to 150 fields. > > > Thanks > Bhara

Re: commit

2013-03-13 Thread Upayavira
Auto commit would seem a good idea, as you don't want your independent worker threads issuing overlapping commits. There's also commitWithin that achieves the same thing. Upayavira On Wed, Mar 13, 2013, at 08:02 AM, Arkadi Colson wrote: > Hi > > I'm filling our solr database with about 5mil docs.
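
To illustrate the commitWithin idea, here is a hedged sketch of how each worker could send its batch with a commitWithin request parameter instead of issuing its own commit. The update URL, the 10-second window and the helper name are assumptions; the request is only built here, not sent, and the exact endpoint and body format should be checked against your Solr version.

```python
import json
import urllib.request

def build_update_request(update_url, docs, commit_within_ms):
    """Prepare a JSON update that asks Solr to commit within the given window."""
    url = f"{update_url}?commitWithin={commit_within_ms}"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example batch from one worker; the field names are placeholders.
req = build_update_request(
    "http://localhost:8983/solr/update",
    [{"id": "1", "name": "first"}, {"id": "2", "name": "second"}],
    10000,
)
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req)
```

With this approach none of the workers needs to coordinate commits; the server folds all pending documents into a commit that happens no later than the requested window.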

Re: searching exact phrase with stop word returns bad results

2013-03-13 Thread adfel70
I want the following behaviour. if "john@gmail.com" is indexed to the field 1. searching 'john' or 'doe' or 'gmail.com' will retrieve the doc. 2. searching '"@gmail.com' will retrieve the doc. 3. searching '"gmail.com@"' will not retrieve the doc. All I can accomplish, but 3. because the word

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-13 Thread Steve Rowe
I committed a fix under SOLR-4567. On Mar 13, 2013, at 12:50 AM, Steve Rowe wrote: > Yes, this is a regression, definitely my fault. Sorry Alex! > > The table on SOLR-3798 is missing this case: a glob matching one or more > explicit fields (as opposed to dynamic fields). > > I've filed a J

commit

2013-03-13 Thread Arkadi Colson
Hi I'm filling our solr database with about 5mil docs. All docs are in some kind of queue which is processed by 5 simultaneous workers. What is the best way to do commits in such a situation? If I say to let every worker do a commit after 100 docs there will be 5 commits in a short period. Or

Re: is there a way we can build spell dictionary from solr index such that it only take words leaving all`special characters

2013-03-13 Thread Upayavira
Use text analysis and copyField to create a new field that has terms as you expect them. Then use that for your spellcheck dictionary. Note, since 4.0, you don't need to create a dictionary. Solr can use your index directly. Upayavira On Wed, Mar 13, 2013, at 06:00 AM, Rohan Thakur wrote: > whil