Re: search result not correct in solr

2014-04-29 Thread Alexandre Rafalovitch
On Wed, Apr 30, 2014 at 1:29 PM, neha sinha wrote: > maxGramSize="15" side="front"/> > protected="protwords.txt" /> I think combining NGrams with Porter filters, especially in that order will do really weird things. Have you tried using the Admin console? You really wa

Re: search result not correct in solr

2014-04-29 Thread neha sinha
Thanks Alexandre, but that still doesn't help me. I am doing a keyword search for the word "Ribbing" and I am also getting products that have the word "R-B" or "RB" in some other field. But when I search for "Ribbin" I get correct search results. My field type is textfield. Please find

Re: timeAllowed in not honoring

2014-04-29 Thread Salman Akram
I had this issue too. timeAllowed only works for a certain phase of the query. I think that's the 'process' part. However, if the query is taking time in 'prepare' phase (e.g. I think for wildcards to get all the possible combinations before running the query) it won't have any impact on that. You
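
As a rough illustration of the parameter discussed in this thread, the Python sketch below builds a select URL that sets timeAllowed and checks a parsed JSON response for the partial-results flag. The host, port, handler path, the use of edismax for the minimum-match setting, and the helper names are assumptions made for illustration; the `partialResults` key in the response header is how Solr is generally understood to signal a hit time limit, so verify it against your own version.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust to your own Solr host and core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_query_url(query, time_allowed_ms, min_match="50%"):
    """Build a select URL that asks Solr to stop searching after time_allowed_ms."""
    params = {
        "q": query,
        "defType": "edismax",  # assumption: mm is being used with (e)dismax
        "mm": min_match,
        "timeAllowed": time_allowed_ms,
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


def is_partial(response):
    """Return True if a parsed JSON response is flagged as partial."""
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))
```

As the reply above notes, the limit only covers part of query execution, so a slow query may still exceed the budget without being flagged as partial.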

Re: Issue with solr searching : words with "-" not able to search

2014-04-29 Thread neha sinha
same issue with my search result also and i have used solr.Textfield for this -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-with-solr-searching-words-with-not-able-to-search-tp4128549p4133845.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: search result not correct in solr

2014-04-29 Thread Alexandre Rafalovitch
I can't figure out the exact question; I need a more specific example. However, if you look in the Solr 4 Admin panel, there is an Analysis screen that shows you how the text is analyzed during indexing and during search. Putting your words there will show you the effect of the various components in your type defin

Re: timeAllowed in not honoring

2014-04-29 Thread Aman Tandon
Shawn, this is the first time I have raised this problem. My heap size is 14GB and I am not using SolrCloud currently; the 40GB index is replicated from the master to two slaves. I read somewhere that it returns the partial results computed by the query within the specified amount of time, which is defin

search result not correct in solr

2014-04-29 Thread neha sinha
Hi, I am trying to search for the word Ribbing, and I am also getting results that have the letters R-B or RB in their description. But when I search for Ribbin I get correct results. I have no clue what to use in my Solr schema.xml. Any guidance will be helpful. Thanks

Re: timeAllowed in not honoring

2014-04-29 Thread Shawn Heisey
On 4/29/2014 10:05 PM, Aman Tandon wrote: > I am using Solr 4.2 with an index size of 40GB. When querying my index, > some queries take a significant amount of time, > about 22 seconds, *in the case of minmatch of 50%*. So I added a parameter > timeAllowed = 2000 in my q

timeAllowed in not honoring

2014-04-29 Thread Aman Tandon
Hi, I am using Solr 4.2 with an index size of 40GB. When querying my index, some queries take a significant amount of time, about 22 seconds, *in the case of minmatch of 50%*. So I added a parameter timeAllowed = 2000 in my query, but this doesn't seem to work. Pleas

Re: Indexing Big Data With or Without Solr

2014-04-29 Thread rulinma
mark.

Re: Raw query parameters

2014-04-29 Thread Xavier Morera
You saved my life Shawn! Thanks! On Mon, Apr 28, 2014 at 11:54 PM, Shawn Heisey wrote: > On 4/28/2014 7:54 PM, Xavier Morera wrote: > > Would anyone be so kind to explain what are the "Raw query parameters" > > in Solr's admin UI. I can't find an explanation in either the reference > > guide no

Re: Delete fields from document using a wildcard

2014-04-29 Thread Costi Muraru
I've opened an issue: https://issues.apache.org/jira/browse/SOLR-6034 Feedback in Jira is appreciated. On Tue, Apr 29, 2014 at 8:34 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > I think this is useful as well. Can you open an issue? > > > On Tue, Apr 29, 2014 at 7:53 PM, Shawn Hei

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Ok, https://wiki.apache.org/solr/SolrPerformanceFactors states that: "Retrieving the stored fields of a query result can be a significant expense. This cost is affected largely by the number of bytes stored per document--the higher byte count, the sparser the documents will be distributed o

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Something is really strange here: even when configuring fields id + sort_... to docValues="true" -- so there's nothing to get from "stored documents file" -- performance is still terrible with ocr stored=true _even_ with my patch which stores uncompressed like solr4.0.0 (checked with string

Re: Solr data directory contains index backups

2014-04-29 Thread Greg Walters
None that I'm aware of. A bit of googling shows the accepted solution to be an external script via cron or something similar. I think I saw an issue open on Apache's Jira about this but can't find it now. Thanks, Greg On Apr 25, 2014, at 4:37 PM, solr2020 wrote: > Thanks Greg. Is there any So

Re: saving user actions on item in solr for later retrieval

2014-04-29 Thread Ahmet Arslan
Hi Nolim, Actually EFF is searchable. See my comments at the end of the page  https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes Ahmet On Tuesday, April 29, 2014 9:07 PM, nolim wrote: Thank you, it was interesting and I have learned some new things in

Re: saving user actions on item in solr for later retrieval

2014-04-29 Thread nolim
Thank you, it was interesting and I have learned some new things in Solr :) But the "External File Field" isn't a good option because the field is unsearchable, and searchability is very important to us. We are thinking about the first option (updating the document in Solr) but performing a commit only every 10 minutes - If

Re: Delete fields from document using a wildcard

2014-04-29 Thread Shalin Shekhar Mangar
I think this is useful as well. Can you open an issue? On Tue, Apr 29, 2014 at 7:53 PM, Shawn Heisey wrote: > On 4/29/2014 5:25 AM, Costi Muraru wrote: > > The problem is, I don't know the exact names of the fields I want to > > remove. All I know is that they end in *_1600_i. > > > > When remo

Re: Solr does not recognize language

2014-04-29 Thread Ahmet Arslan
Hi, /solr/update should be used, not /solr/select: curl 'http://localhost:8983/solr/update?commit=true&update.chain=langid' By the way, don't you have the following definition in your solrconfig.xml? langid On Tuesday, April 29, 2014 4:50 PM, Victor Pascual
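
The point made in this reply, that an update chain only applies to update requests and not to queries, can be sketched in Python as below. The base URL and the chain name `langid` come from the message; the helper names and the default arguments are hypothetical, and nothing here contacts a server.

```python
from urllib.parse import urlencode, urlsplit


def build_update_url(base="http://localhost:8983/solr", chain="langid", commit=True):
    """Build an update URL that routes documents through the named update chain."""
    params = {
        "commit": "true" if commit else "false",
        "update.chain": chain,
    }
    return base + "/update?" + urlencode(params)


def targets_update_handler(url):
    """Return True if the URL path ends in /update (hypothetical sanity check)."""
    return urlsplit(url).path.rstrip("/").endswith("/update")
```

A check like `targets_update_handler` would reject a URL that appends `update.chain` to a `/select` request, which is the mistake described in this thread.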

Re: Wildcard search not working with search term having special characters and digits

2014-04-29 Thread Geepalem
Can someone help me out with this issue please?

Re: Stemming not working with wildcard search

2014-04-29 Thread Geepalem
Can someone help me out with this issue?

Re: [ANNOUNCE] Apache Solr 4.8.0 released

2014-04-29 Thread Shawn Heisey
On 4/29/2014 8:48 AM, Shawn Heisey wrote: > On 4/29/2014 8:27 AM, Flavio Pompermaier wrote: >> In which sense are fields and types now deprecated in schema.xml? Where can >> I find any pointer about this? > https://issues.apache.org/jira/browse/SOLR-5936 > > Here is the patch for 4.8: > > https://

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Dear Shawn, see the attachment for my first "brute force" no-compression attempt. Kind regards, Jochen Quoting Shawn Heisey: On 4/29/2014 4:20 AM, Jochen Barth wrote: BTW: stored field compression: are all "stored fields" within a document put into one compressed chunk, or on a per-field b

Re: [ANNOUNCE] Apache Solr 4.8.0 released

2014-04-29 Thread Shawn Heisey
On 4/29/2014 8:27 AM, Flavio Pompermaier wrote: > In which sense are fields and types now deprecated in schema.xml? Where can > I find any pointer about this? https://issues.apache.org/jira/browse/SOLR-5936 Here is the patch for 4.8: https://issues.apache.org/jira/secure/attachment/12637716/SOL

Solr Server Infrastructure Config

2014-04-29 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, Can someone share or point me to information on a SOLR server environment for production? We have approximately 40 collections, ranging in size from about 300MB to 8GB each, with a total of approximately 100GB. The total size may grow by an average of 2-5GB per year. To get best performance f

Re: [ANNOUNCE] Apache Solr 4.8.0 released

2014-04-29 Thread Shalin Shekhar Mangar
Earlier, all <fieldType> tags were required to be nested inside a <types> tag. Similarly, all <field> and <dynamicField> tags were required to be nested inside a <fields> tag. Such nesting is no longer required and you can inter-mix <fieldType>, <field> and <dynamicField> tags as you like. Therefore, the <types> and <fields> tags are no longer required and can be removed. Even if you don't

Re: [ANNOUNCE] Apache Solr 4.8.0 released

2014-04-29 Thread Rafał Kuć
Hello! You don't need the <types> and <fields> sections anymore; you can just include type or field definitions anywhere in the schema.xml section. You can find more in https://issues.apache.org/jira/browse/SOLR-5228 -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr & Elastics

Re: [ANNOUNCE] Apache Solr 4.8.0 released

2014-04-29 Thread Steve Rowe
https://issues.apache.org/jira/browse/SOLR-5228 On Apr 29, 2014, at 10:27 AM, Flavio Pompermaier wrote: > In which sense are fields and types now deprecated in schema.xml? Where can > I find any pointer about this? > > On Mon, Apr 28, 2014 at 6:54 PM, Uwe Schindler wrote: > >> 28 April 2014,

Re: [ANNOUNCE] Apache Solr 4.8.0 released

2014-04-29 Thread Flavio Pompermaier
In which sense are fields and types now deprecated in schema.xml? Where can I find any pointer about this? On Mon, Apr 28, 2014 at 6:54 PM, Uwe Schindler wrote: > 28 April 2014, Apache Solr™ 4.8.0 available > > The Lucene PMC is pleased to announce the release of Apache Solr 4.8.0 > > Solr is th

Re: Delete fields from document using a wildcard

2014-04-29 Thread Shawn Heisey
On 4/29/2014 5:25 AM, Costi Muraru wrote: > The problem is, I don't know the exact names of the fields I want to > remove. All I know is that they end in *_1600_i. > > When removing fields from a document, I want to avoid querying SOLR to see > what fields are actually present for the specific doc

Re: Solr does not recognize language

2014-04-29 Thread Victor Pascual
Hi Ahmet, thanks for your reply. Adding &update.chain=langid to my query doesn't work: IP:8080/solr/select/?q=*%3A*&update.chain=langid Regarding defining the chain in an UpdateRequestHandler... sorry for the lame question but shall I paste those three lines to solrconfig.xml, or shall I add them

Re: Solr does not recognize language

2014-04-29 Thread Ahmet Arslan
Hi, Did you attach your chain to an UpdateRequestHandler? You can do it by adding &update.chain=langid to the URL or defining it in a defaults section as follows: langid On Tuesday, April 29, 2014 3:18 PM, Victor Pascual wrote: Dear all, I'm a new user of Solr. I've managed to ind

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Shawn Heisey
On 4/29/2014 4:20 AM, Jochen Barth wrote: > BTW: stored field compression: > are all "stored fields" within a document put into one compressed chunk, > or on a per-field basis? Here's the issue that added the compression to Lucene: https://issues.apache.org/jira/browse/LUCENE-4226 It was made

Solr does not recognize language

2014-04-29 Thread Victor Pascual
Dear all, I'm a new user of Solr. I've managed to index a bunch of documents (in fact, they are tweets) and everything works quite smoothly. Nevertheless it looks like Solr doesn't detect the language of my documents nor remove stopwords accordingly so I can extract the most frequent terms. I've

Apache Solr - Pdf Indexing.

2014-04-29 Thread vignesh
Hi Team, I am indexing PDFs using Apache Solr 3.6. By passing around 3000 keywords using the OR operator (gardens OR flowers OR time OR train OR trees OR etc) I am able to get the files containing these keywords. But not every .PDF file will contain all the keywords; some may contai

Re: Delete fields from document using a wildcard

2014-04-29 Thread Costi Muraru
Thanks, Alex for the input. Let me provide a better example on what I'm trying to achieve. I have documents like this: 100 1 5 7 The schema looks the usual way: The dynamic field pattern I'm using is this: id_day_i. Each day I want to add new fields for the current day and remove the fields

Re: Apache Solr - Pdf Indexing.

2014-04-29 Thread Gora Mohanty
On Apr 29, 2014 2:52 PM, "vignesh" wrote: > > Hi Team, > > > > I am indexing PDF using Apache Solr 3.6 . Passing around 3000 keywords using the OR operator and able to get the files containing the keywords. Kindly guide me to get the keyword list in a .PDF file. What do you mean? Do

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
BTW: stored field compression: are all "stored fields" within a document put into one compressed chunk, or on a per-field basis? Kind regards, J. Barth > > Regards, >Alex. > Personal website: http://www.outerthoughts.com/ > Current project: http://www.solr-start.com/ - Accelerating your

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
On 29.04.2014 11:19, Alexandre Rafalovitch wrote: > Couple of random thoughts: > 1) The latest (4.8) Solr has support for nested documents, as well as > for expand components. Maybe that will let you have more efficient > architecture: http://heliosearch.org/expand-block-join/ Yes, I've seen thi

Re: Apache Solr - Pdf Indexing.

2014-04-29 Thread Alexandre Rafalovitch
Your question is not terribly clear. Are you having troubles indexing PDF in general? Try the tutorial and specifically look for extract handler. Or you already got PDF into the system but your 3000 Keyword query does not match it? In which case it might be just that PDF extraction is limited by d

Re: How to reduce enumerating docs

2014-04-29 Thread ??????
Will the filter query execute before or after my custom search component? What I care about is this: for example, if the following docsEnum would contain 1M docs for the term aterm without the filter query, will it contain fewer than 1M docs when the filter query is present? DocsEnum docsEn

Apache Solr - Pdf Indexing.

2014-04-29 Thread vignesh
Hi Team, I am indexing PDFs using Apache Solr 3.6. By passing around 3000 keywords using the OR operator, I am able to get the files containing the keywords. Kindly guide me on how to get the list of keywords in a .PDF file. Note: in schema.xml I have declared a unique tag "id". Than

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Alexandre Rafalovitch
Couple of random thoughts: 1) The latest (4.8) Solr has support for nested documents, as well as for expand components. Maybe that will let you have more efficient architecture: http://heliosearch.org/expand-block-join/ 2) Do you return OCR text to the client? Or just search it? If just search it,

Re: How to reduce enumerating docs

2014-04-29 Thread Alexandre Rafalovitch
Can't you just specify the length range as a filter query? If your length type is tint/tlong, Solr already has optimized code that uses multiple resolutions depth to efficiently filter through the numbers. Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://ww
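
The suggestion above, restricting by length with a filter query, can be sketched in Python as follows. The field names `length` and `fingerprint` are taken from the original question in this thread; the base URL, the helper name, and the example bounds are hypothetical.

```python
from urllib.parse import urlencode


def build_length_filtered_url(term, min_len, max_len,
                              base="http://localhost:8983/solr/select"):
    """Build a select URL that searches the fingerprint field and limits
    candidates to documents whose length falls in [min_len, max_len]."""
    params = [
        ("q", "fingerprint:" + term),
        # fq narrows the candidate set with an inclusive range query.
        ("fq", "length:[%d TO %d]" % (min_len, max_len)),
    ]
    return base + "?" + urlencode(params)
```

Whether the filter is applied before a custom SearchComponent enumerates documents depends on how that component is written, which is the follow-up question raised in this thread.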

Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Dear reader, I'm trying to use solr for a hierarchical search: metadata from the higher-levelled elements is copied to the lower ones, and each element has the complete ocr text which it belongs to. At volume level, of course, we will have the complete ocr text in one and we need to store it for

How to reduce enumerating docs

2014-04-29 Thread ??????
Hi all, My doc has two fields, namely "length" and "fingerprint", which stand for the length and text of the doc. I have a custom SearchComponent that enumerates all the docs for a term in order to search the fingerprint. That could be very slow because the number of docs is huge and the o

Re: Selectively hiding SOLR facets.

2014-04-29 Thread Alexandre Rafalovitch
How would you know it if you did this manually? Solr does not know that Dutch is not valid for USA. You need to give it some sort of signal. One way could be to have a dynamic field for a facet which includes country name. So, you have language_USA, language_Belgium, etc. Then, when you do country

Re: Selectively hiding SOLR facets.

2014-04-29 Thread Iker Mtnz. Apellaniz
You could use the facet.mincount parameter. The default value is 0; setting it to N requires a facet value to appear at least N times in the result set. Iker 2014-04-29 4:56 GMT+02:00 atuldj.jadhav : > Yes, but with my query *country:"USA" * it is returning me languages > belonging to countries other than USA. > > Is
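
A minimal Python sketch of the facet.mincount suggestion follows. The `country` field and the "USA" value come from the quoted query; the facet field name `language`, the base URL, and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_facet_url(country, facet_field="language", mincount=1,
                    base="http://localhost:8983/solr/select"):
    """Build a faceted select URL that omits facet values with fewer than
    mincount matching documents in the current result set."""
    params = [
        ("q", 'country:"%s"' % country),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.mincount", mincount),
    ]
    return base + "?" + urlencode(params)
```

With `mincount=1`, values that match no document in the result set are dropped from the facet list; values that do appear on matching documents are still returned.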

shard query with duplicated documents cause inaccurate paginating

2014-04-29 Thread Jie Sun
When we have duplicated documents (same uniqueID) among the shards, the query results can be non-deterministic; this is a known issue. The consequence when we display the search results on our UI page with pagination is: if the user clicks the 'last page', it could display an empty page since the to