Hi,
I'm having a problem with the spell check index building. I've configured
the spell checker component to have the index built on optimize.
(spellcheck component configuration snippet; the XML markup was stripped
in posting, leaving only the values "spell", "spellchecker", "spell",
0.7, true, "spellchecker" and "on")
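For context, a build-on-optimize spellchecker definition typically looks
something like the sketch below; the element names and the dictionary/field
name "spell" follow the stock Solr examples and are assumptions, not taken
from the original post:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">spell</str>
    <str name="field">spell</str>
    <float name="accuracy">0.7</float>
    <bool name="buildOnOptimize">true</bool>
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck.dictionary">spell</str>
    <str name="spellcheck">on</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>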
We have had success with starting up Jolokia in the same servlet container
as Solr, and then using its REST/bulk API to access JMX from the application
of choice.
On 4 Feb 2014 17:16, "Walter Underwood" wrote:
> I agree that sorting and filtering stats in Solr is not a good idea. There
> is certainly so
Hi everybody,
I'm a newbie and I'm working on search performance in a project without
any documentation.
I think searching is very slow because of the presence of all the Tika
metadata; what do you think about it? I'm trying to disable searching
in all of these technical fields to test if
Did you reindex?
Also, how are you submitting data? Are you using
ExtractingRequestHandler (defined in your solrconfig.xml)?
If so, there is already a mechanism for that. Just search for ignored
in the documentation:
http://wiki.apache.org/solr/ExtractingRequestHandler .
Regards,
Alex.
Perso
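As a reference, the relevant pieces usually look like the sketch below;
the handler definition and the ignored dynamic field follow the stock Solr
example configs, so treat the exact names as assumptions:

<!-- solrconfig.xml -->
<requestHandler name="/update/extract" startup="lazy"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
  </lst>
</requestHandler>

<!-- schema.xml -->
<fieldType name="ignored" class="solr.StrField"
           indexed="false" stored="false" multiValued="true"/>
<dynamicField name="ignored_*" type="ignored" multiValued="true"/>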
Run some test queries with the debug=true parameter and check the timing
section of the response to see what search components are consuming the
time. Highlighting of large documents can be very slow, for example. Or, if
you return the full text of a document, the raw size can slow the response
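For example (the host, core and query here are illustrative):
http://localhost:8983/solr/collection1/select?q=some+test+query&debug=true
The "timing" entry in the debug section of the response breaks the elapsed
time down per search component (query, facet, highlight, and so on) for both
the prepare and process phases.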
Hi gurus,
after I got Solr DIH working with MySQL, now I need to figure out how to
make it work with MongoDB.
I have been searching all day but haven't had any luck with it. So could
anyone please suggest a site with a solution?
Thank you very much,
Chun.
I'm submitting data via Liferay, which uses Apache Lucene/Solr for its
search feature. Nothing is done directly on Solr.
solrconfig.xml is currently configured this way:
<str name="fmap.content">text</str>
<str name="lowernames">true</str>
<str name="uprefix">ignored_</str>
<str name="captureAttr">true</str>
<str name="fmap.a">links</str>
<str name="fmap.div">ignored_</str>
but I still have traces like
I think this is the first time I have seen a request for import from MongoDB
on this list. Do a Google search for "solr and mongodb" and you will find a
bunch of links right away. Are you not seeing these? There is something
called "Mongo Connector", but it uses the "push" model, as opposed to t
Hi Jack, can you please give me some more details? Are you referring to a
particular tool?
Mauro
On Wed, Feb 5, 2014 at 12:20 PM, Jack Krupansky wrote:
> Run some test queries with the debug=true parameter and check the timing
> section of the response to see what search components are cons
On Wed, 2014-02-05 at 08:17 +0100, Sathya wrote:
> I am running a single-instance Solr and the JVM heap space is minimum 6.3gb
> and maximum 24.31gb. Nothing else is using the 24gb except the tomcat
> server. I have only 2 copyField entries.
Your Xmx is the same size as your RAM. It shoul
I am using geofilt() to filter my results according to my location.
Now I don't want to filter the results by the last parameter, i.e. I
don't need the last
- distance parameter.
That means I want results from all over the world.
How should I get it?
Regards,
*Sohan Kalsariya*
Simply post to this mail list the timing section of the query response for a
test query that you feel is too slow, but be sure to add the debug=true
parameter (or debug=timing.)
-- Jack Krupansky
-Original Message-
From: Mauro Gregorio Binetti
Sent: Wednesday, February 5, 2014 6:44 A
Hi
Why doesn't the Solr join query return the score in the response, even
though I see JoinScorer in the JoinQParserPlugin class?
Also, to evaluate the join performance, I fired a join query against
Solr's join - JoinQParserPlugin - and against Lucene's
JoinUtil.createJoinQu
Thanks Shawn. This is good to know.
Sent from my iPhone
> On Feb 5, 2014, at 12:53 AM, Shawn Heisey wrote:
>
>> On 2/4/2014 8:00 PM, Mike L. wrote:
>> I'm just wondering here if there is any defined limit to how many fields can
>> be created within a schema? I'm sure the configuration maint
Good morning. So, based on your answer, there is no guarantee that the results
will be the same from one replica to the other.
I ran the queries in debug mode and I see...
MASTER
"321240": "\n1.7046129 = (MATCH) weight(prod_doc:tylenol in 20206)
[DefaultSimilarity], result of:\n 1.7046129 = fi
Alejandro,
Assuming you're using Solr 3.x, under:
...
...you can add:
./spellchecker
...then the spell check index will be created on-disk and not in memory.
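A minimal sketch of that 3.x-style on-disk configuration (the surrounding
element and the names shown are illustrative):

<lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">spell</str>
  <str name="spellcheckIndexDir">./spellchecker</str>
  <str name="buildOnOptimize">true</str>
</lst>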
But in Solr 4.0, the default spellcheck implementation changed to
org.apache.solr.spelling.DirectSolrSpellChecker, which does n
Does calling commit with expungeDeletes=true result in a full rewrite of
the index like an optimize does, or does it only merge away the documents
that have been marked as deleted?
Every two weeks or so we run a process to rebuild our index from the
original documents resulting in a large amount of d
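For reference, such a commit can be issued either as an XML update message
or as request parameters on /update (the host and core below are
illustrative):

<commit expungeDeletes="true"/>

http://localhost:8983/solr/collection1/update?commit=true&expungeDeletes=true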
Thanks for the answer, James. My fault for not specifying the Solr version;
we are working with Solr 4.5.
Anyway, thank you very much for pointing out the change to
DirectSolrSpellChecker. I hadn't even realized that change, and I think I
wasn't using it, as the line "solr.DirectSolrSpellChecker" was missi
I’m not interested in the log (although maybe somebody else can spot something
there) – it’s the query response that is returned on your query HTTP request
(XML or JSON.) The specific parameter to add to your HTTP query request is
“&debug=true”.
-- Jack Krupansky
From: Mauro Gregorio Binetti
---
Brief overview of the setup:
---
5 x SolrCloud (Solr 4.6.1) node instances (separate machines)
The setup is intended to store last 48 hours webapp logs (which are pretty
intense... ~ 3MB/sec)
"logs" collec
Hello!
I've got a scenario where I index very frequently on master servers and
replicate to slave servers with one-minute polling. Master indexes are
growing fast and I would like to optimize the indexes to improve search
queries. However...
1. During an optimize operation, can master servers index
Hi Hoss,
Thanks for replying. I have created a jira:
https://issues.apache.org/jira/browse/SOLR-5697
It contains the required configs (actually a shard) and a query parser
maven project. These illustrate the issue.
I had to omit the solr.war from the webapps of the shard as it exceeded the
upload
Hi Erick,
thanks for your reply.
What exactly do you mean by "Do your used entries in your caches
increase in parallel?"?
I update the indices every hour and commit the changes. So a new
searcher with empty or autowarmed caches should be created and the old
one should be removed.
Johann
: Just to make sure I interpret the results correctly:
: - they all have a score of 1.7046129
: - the order they are presented in is therefore not related to the score,
: it is just the order in which the data is internally stored (like an SQL
: SELECT statement without ORDER BY clause)
The ord
: I am using geofilt() to filter my results according to my location.
:
: Now I don't want to filter the results by the last parameter. i.e.=> I
: don't need the last
: - distance parameter.
: that means i want results from all over the world.
: How should i get it?
Your question isn't making mu
Hi,
We have SolrCloud cluster (5 shards and 2 replicas) on 10 dynamic compute boxes
(cloud). We're using local disk (/local/data) to store solr index files. All
hosts have 60GB ram and Solr4 JVM are running with max 30GB heap size. So far
we have 470 million documents. We are using custom shard
: I've got an scenario where I index very frequently on master servers and
: replicate to slave servers with one minute polling. Master indexes are
: growing fast and I would like to optimize indexes to improve search
: queries. However...
For a scenario where your index is changing that rapidly,
Hi All,
It seems that I can't query on a StrField with a large value (say 70k
characters). I have a Solr document with a string type:
and field:
Note that it's stored, in case that matters.
Across my documents, the length of the value in this StrField can be up to
~70k characters or
Ok... I get it. I understood what you meant.
But I'm having some trouble encoding the query logged by the application so
that I can submit it in the Solr Admin:
%2B(%2B(companyId:10153)+%2B((%2B(entryClassName:com.liferay.portlet.bookmarks.model.BookmarksEntry))+(%2B(entryClassName:com.liferay.portlet.blogs.model.BlogsE
Update: It seems I get the bad behavior (no documents returned) when the
length of a value in the StrField is greater than or equal to 32,767
(2^15). Is this some type of bit overflow somewhere?
On Wed, Feb 5, 2014 at 12:32 PM, Luis Lebolo wrote:
> Hi All,
>
> It seems that I can't query on a S
Hi,
we are having problems with an installation of SolrCloud where a leader
node kicks off an indexing and tries to replicate all the updates using the
UpdateHandler.
What we get instead is an error about wrong UTF-8 encoding when the
leader tries to call the /update endpoint on the replica:
On Wed, Feb 5, 2014 at 1:04 PM, Luis Lebolo wrote:
> Update: It seems I get the bad behavior (no documents returned) when the
> length of a value in the StrField is greater than or equal to 32,767
> (2^15). Is this some type of bit overflow somewhere?
I believe that's the maximum size of an index
: Update: It seems I get the bad behavior (no documents returned) when the
: length of a value in the StrField is greater than or equal to 32,767
: (2^15). Is this some type of bit overflow somewhere?
IIRC there is a limit in the lower level lucene code to how many bytes a
single term can be -- b
I had that same error. I cleared it up by commenting out all the
/update/xxx handlers and changing /update class to solr.UpdateRequestHandler
Hope that helps
David
On 02/05/2014 01:37 PM, Ugo Matrangolo wrote:
Hi,
we are having problems with an installation of SolrCloud where a leader
nod
Based on our current use of it and the nature of the issue, I don’t think we
have anything to worry about.
- Mark
http://about.me/markrmiller
On Jan 27, 2014, 9:52:05 PM, Shawn Heisey wrote: The
Internet is buzzing about the change in Java 7u51 that breaks Google
Guava. Guava is used in S
Some more information that may help developers find out the cause.
INFO 2014-02-04 14:47:08,931 DistributedQueue.java (line 211) Watcher fired on
path: /overseer/collection-queue-work state: SyncConnected type
NodeChildrenChanged
...
(there were 393 of these "Watcher fired on path" lines in tota
Thank you Sir for that confirmation!
Nic
On Wed, 2/5/14, Chris Hostetter wrote:
Subject: Re: SolrCloud query results order master vs replica
To: solr-user@lucene.apache.org
Received: Wednesday, February 5, 2014, 11:33 AM
: Just
to make sure I
(Gulp!)
You could also set the debug parameter (temporarily) in the defaults section
of your query request handler. But you still need to dump the text of the
query response.
-- Jack Krupansky
-Original Message-
From: Mauro Gregorio Binetti
Sent: Wednesday, February 5, 2014 12:47 P
Hi,
if you wish to execute the query from the Solr admin, you can do so by
appending it to the select handler:
<host>:<port>/<instance>/<core>/select?<query>
e.g. for localhost at 8080, with Solr instance name solr-4.4 and core name
test, the select query would be
localhost:8080/solr-4.4/test/select?<query>
Hope that helps!
- Mohit Sinha
On Thu, Feb 6
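As a concrete, purely illustrative example using the query fragment from the
earlier trace, the already percent-encoded form can be passed straight to the
select handler:
localhost:8080/solr-4.4/test/select?q=%2B(companyId:10153)&debug=true
The key point is that literal '+' operators must be sent as %2B, since a raw
'+' in a URL query string is decoded as a space.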
Hi Bryan,
From what I've seen it will only get rid of the deletes in the segments that
the commit merged and there will be some residual deleted docs still in the
index. It doesn't do the full rewrite. Even if you play with merge factors
etc, you'll still have lint. In your situation
Hello,
I'm running the example setup for Solr 4.6.1.
In the ../example/solr/ directory, I set up a second core. I wanted to
send updates to that core.
I looked at .../exampledocs/post.sh and expected to see the URL as: URL=
http://localhost:8983/solr/collection1/update
However it does not
Hey guys,
I am troubleshooting an issue on a 4.3.1 SolrCloud: 1 collection and 2
shards over 4 Solr instances, (which results in 1 core per Solr instance).
After some time in Production without issues, we are seeing errors related
to the IndexWriter all over our logs and an infinite loop of faili
(NOTE: cross posted, if you feel the need to reply, please keep it on
general@lucene)
As a reminder, Travel Assistance Applications for ApacheCon NA 2014 are
due on Feb 7th (about 48 hours from now)
Details are below, please note that if you have any questions about this
program or the app
: I then tried to locate some config somewhere that would specify that the
: default core would be collection1, but could not find it.
in the older style solr.xml, you can specify a "defaultCoreName". Moving
forward, relying on the default core name is discouraged (and will
hopefully be remove
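For reference, in the old-style solr.xml that attribute sits on the <cores>
element, roughly like this (paths and names are illustrative):

<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1"/>
  </cores>
</solr>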
I cannot get Solr 4.6.0 to do partial word search on a particular field that
is used for faceting. Most of the information I have found suggests
modifying the fieldType "text" to include either the NGramFilterFactory or
EdgeNGramFilterFactory in the filter. However since I am copying many other
fie
This can be achieved using payloads in the suggester dictionary. The
suggester based on spellcheck component does not support payloads in
dictionary.
You can use the new suggester component (
https://issues.apache.org/jira/browse/SOLR-5378), which allows you to
highlight and return payloads.
The pa
Thanks Hoss,
>>hardcoded default of "collection1" is still used for
backcompat when there is no "defaultCoreName" configured by the user.
Aha, it's hardcoded if there is nothing set in a config. No wonder I
couldn't find it by grepping around the config files.
I'm still trying to sort out the o
1. The ngramming occurs in the index, but does not modify the original,
"stored" value that a query will return. So, "Example" will be returned even
though the index will have all the sub-terms indexed (but not stored.)
2. You need the ngram filters to be asymmetric with regard to indexing and
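A minimal sketch of such an asymmetric field type (the gram sizes and the
rest of the analysis chain are illustrative):

<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>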
Tom, I did make an effort to "sort out" both the old and newer solr.xml
features in my Solr 4.x Deep Dive e-book.
-- Jack Krupansky
-Original Message-
From: Tom Burton-West
Sent: Wednesday, February 5, 2014 5:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Default core for updates
Ahahhaha gulp is really funny :)
Back to us... Do you mean modifying solrconfig.xml?
Mauro
On 05 Feb 2014 20:45, "Jack Krupansky" wrote:
> (Gulp!)
>
> You could also set the debug parameter (temporarily) in the defaults
> section of your query request handler. But you still need to d
Yes Mohit... what you said is what I tried, but I ran into really big
problems appending the query with the correct syntax, starting from the
trace I posted in my last mail... Any suggestions about encoding?
On 05 Feb 2014 20:55, "Mohit Sinha" wrote:
> Hi,
>
> if you wish to execu
Yes. Look at the example solrconfig.xml for a section labeled "defaults" for
the "/select" request handler. You should see "df" as one parameter. Just
copy that and change "df" to "debug" and change the field name to "true".
-- Jack Krupansky
-Original Message-
From: Mauro Gregorio Bi
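In solrconfig.xml that would look roughly like this (the handler defaults
shown are illustrative):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="df">text</str>
    <str name="debug">true</str>
  </lst>
</requestHandler>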
Let's say I was primarily interested in ensuring there is a DR copy of the
search index that is replicated to the remote data center, but I do not want
the Solr instances in the remote data center to be part of the SolrCloud
cluster, and that I am willing to accept some downtime in bringing up a
Hi Jack,
Thank you very much for your reply.
I've read that article already but it doesn't seem to be what I am looking
for. I'm not sure if it's possible to do what I want or not.
Now I can import all data from MySQL to Solr, but my boss just wants me to
find a way to import from MongoDB into Solr.
Bes
It appears that at this moment the best approach would be to write a Java
program that reads from MongoDB and writes to Solr (Solr XML update
requests.) Or, write a program that reads from MongoDB and outputs a CSV
format text file and then import that directly into Solr.
-- Jack Krupansky
---
Hi,
I am running SolrCloud with 10 shards. I do batch indexing once every day,
and once indexing is done I call optimize.
I see that optimize happens on each shard one at a time and not in
parallel. Is it possible for the optimize to happen in parallel? Each shard
is on a separate box.
Thanks
S
I agree with you.
I finally have another solution for this problem: we will import all data
directly from MySQL instead.
Thank you for all comments,
keep sharing keep learning :)
_/|\_
Chun.
Hi Solr team,
I am trying to look for a solution for indexing multiple text files with the
same unique key, in Lucene.
Is there a way to do this?
Saw a posting in the mail archives (below), but I wonder if a solution was
given to the users.
Kindly respond,
Anita Nair
__
Sure you can.
Use the overwrite flag in your update messages, as per
http://wiki.apache.org/solr/UpdateXmlMessages
Or have a different key (e.g. a signature) nominated as your unique key.
The issue is that if you just allow duplicates altogether, do you want to
be able to delete a particular single docume
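For reference, a sketch of an update message that keeps duplicates instead
of overwriting (field names and values are illustrative):

<add overwrite="false">
  <doc>
    <field name="id">report-2014-02</field>
    <field name="text">first version of the file</field>
  </doc>
</add>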
Resending, if somebody can please respond.
Thanks,
Anand
On 2/5/2014 6:26 PM, anand chandak wrote:
Hi,
I have a question on the join score: why doesn't the Solr join query return
the scores? Looking at the code, I see there's a JoinScorer defined in
the JoinQParserPlugin class. If it's not used