Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Mark Miller
The other odd thing here is that this should not stop replication at all. When the slave is ahead, it will still have its index replaced. - Mark On Mar 22, 2013, at 1:26 AM, Mark Miller wrote: > I'm working on testing to try and catch what you are seeing here: > https://issues.apache.org/jir

Re: Writing new indexes from index readers slow!

2013-03-21 Thread Jed Glazner
Thanks Otis, I had not considered that approach, however not all of our fields are stored so that's not going to work for me. I'm wondering if it's slow because there is just the one reader getting passed to the index writer... I noticed today that the addIndexes method can take an array of rea

Re: Writing new indexes from index readers slow!

2013-03-21 Thread Otis Gospodnetic
Jed, While this is something completely different, have you considered using SolrEntityProcessor instead? (assuming all your fields are stored) http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Mar 21, 2013 at

Re: Solr cloud and auto shard timeline

2013-03-21 Thread Otis Gospodnetic
Hi, I think there is a mixup here. SolrCloud has the same sharding capabilities as ES at this point, I believe, other than manual moving of shards Mark mentions. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Mar 21, 2013 at 7:08 PM, Jamie Johnson wrote: > I've seen tha

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Mark Miller
I'm working on testing to try and catch what you are seeing here: https://issues.apache.org/jira/browse/SOLR-4629 - Mark On Mar 22, 2013, at 12:23 AM, Mark Miller wrote: > Let me know if there is anything else you can add. > > A test with your setup that index n docs randomly, commits, random

Re: Did something change with Payloads?

2013-03-21 Thread jimtronic
Something has definitely changed at 4.1. I've installed 4.0, 4.1, and 4.2 side by side and conducted the same tests on each one. Only 4.0 is returning the expected results. Apologies for cross posting this here and in the Lucene forum, but I really can't tell if this is a Solr or a Lucene issue.

Re: SOLR - Documents with large number of fields ~ 450

2013-03-21 Thread Mark Miller
You might try using docvalues with the on disk option and try and let the OS manage all the memory needed for all the faceting/sorting. This would require Solr 4.2. - Mark On Mar 21, 2013, at 2:56 AM, kobe.free.wo...@gmail.com wrote: > Hello All, > > Scenario: > > My data model consist of ap

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Mark Miller
Let me know if there is anything else you can add. A test with your setup that index n docs randomly, commits, randomly updates a conf file or not, and then replicates and repeats x times does not seem to fail, even with very high values for n and x. On every replication, the versions are compa

Re: had query regarding the indexing and analysers

2013-03-21 Thread Jack Krupansky
Actually, it's the Porter Stemmer that is turning "ace" into "ac". Try making a copy of text_en_splitting and delete the PorterStemFilterFactory filter from both the query and index analyzers. -- Jack Krupansky -Original Message- From: Rohan Thakur Sent: Wednesday, March 20, 2013 8:

Re: SOLR - Documents with large number of fields ~ 450

2013-03-21 Thread Otis Gospodnetic
Hi, In short, I suspect you'll OOM if you sort and facet on all these fields. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Mar 21, 2013 at 2:56 AM, kobe.free.wo...@gmail.com < kobe.free.wo...@gmail.com> wrote: > Hello All, > > Scenario: > > My data model consist of appr

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Uomesh
Thank you!!, Attached is my master solrconfig.xml. I have few custom handlers which you might need to remove. In custom handler i have not much code just adding some custom data for UI. Thanks, Umesh On Thu, Mar 21, 2013 at 9:59 PM, Mark Miller-3 [via Lucene] > wrote: > Could you attach the mas

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Mark Miller
Could you attach the master as well? - Mark On Mar 21, 2013, at 4:36 PM, Uomesh wrote: > Hi Mark, > > Attached is my solrconfig_slave.xml. My replication interval is 1 > minute(default). > > Please let me know if you need any more config details > > Thanks, > umesh > > On Thu, Mar 21, 2013

Re: Very slow query when boosting involve with EnternalFileField

2013-03-21 Thread Floyd Wu
Anybody can point me a direction? Many thanks. 2013/3/20 Floyd Wu > Hi everyone, > > I have a problem and have no luck to figure out. > > When I issue a query to > Query 1 > > http://localhost:8983/solr/select?q={!boost+b=recip(ms(NOW/HOUR,last_modified_datetime),3.16e-11,1,1)}all
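
As a rough aid to reading the boost in the query above: Solr's recip(x,m,a,b) function query evaluates a/(m*x+b). The following is a minimal Python sketch of that arithmetic only, assuming x is the document age in milliseconds as produced by ms(NOW/HOUR,last_modified_datetime). The helper name recip_boost and the MS_PER_YEAR constant are made up for illustration and are not part of Solr.

```python
# Sketch of Solr's recip(x, m, a, b) = a / (m*x + b), applied to a document age in ms.
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # assumption: one Julian year in milliseconds


def recip_boost(age_ms, m=3.16e-11, a=1.0, b=1.0):
    """Hypothetical helper mirroring recip(ms(...), 3.16e-11, 1, 1)."""
    return a / (m * age_ms + b)


if __name__ == "__main__":
    # Print the multiplier for documents of increasing age.
    for years in (0, 1, 5):
        print(years, round(recip_boost(years * MS_PER_YEAR), 3))
```

With these constants a brand-new document gets a multiplier of 1.0 and a one-year-old document roughly 0.5, because 3.16e-11 is about 1 / (milliseconds per year).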

RE: Slow queries for common terms

2013-03-21 Thread David Parks
I figured I was trying to pull a coup here, but this is a temporary configuration while we only run a few users through an early beta. The performance is perfectly good for most terms, it's just this books term. I'm curious how adding RAM will solve that. I can see how deploying solr cloud and shar

Re: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-21 Thread alxsss
Hello, I am debugging the SpellCheckComponent#finishStage. From the responses I see that not only wordbreak, but also directSpellchecker does not return some results in distributed mode. The request handler I was using had true So, I decided to turn off grouping and I see spellcheck res

Continue to the next record

2013-03-21 Thread randolf.julian
I have an XML file that has several documents in it. For example: 1 MyName1 2 MyName2 3 MyName3 I upload the data using SOLR's post.sh script. For some reason, document 2 failed and it caused the post.sh script to stop. How can I make it continue t

Re: Could not load config for solrconfig.xml

2013-03-21 Thread Steve Rowe
The wiki has some Tomcat info here: , not sure how up-to-date it is. See also LucidWorks' Solr Reference Guide's page Running Solr on Tomcat: IntelliJ has a Tomcat run configuration that can

Re: Could not load config for solrconfig.xml

2013-03-21 Thread Furkan KAMACI
Should I create a collection1 folder as in the example? On the other hand, if I use a .war to deploy, how can I resolve that problem too? 2013/3/22 Furkan KAMACI > I run ant idea command for Solr 4.1.0 and opened source code within > Intellij IDEA 12.0.4 and I use Centos 6.4 at my 64 bit compu

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Mark Miller
Hmm…I guess I'll work on beefing up these tests some more tonight and see if I can stumble across it. - Mark On Mar 21, 2013, at 4:36 PM, Uomesh wrote: > Hi Mark, > > Attached is my solrconfig_slave.xml. My replication interval is 1 > minute(default). > > Please let me know if you need any m

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Furkan KAMACI
Ok I run that and see that there is a .war file at /lucene-solr/solr/dist Do you know that how can I run that ant phase from Intellij without command line (there are many phases under Ant build window) On the other hand within Intellij Idea how can I auto deploy it into Tomcat. All in all I will

Re: Solr cloud and auto shard timeline

2013-03-21 Thread Mark Miller
Can you elaborate on what you are looking for? If you mean automatically moving shards around, it's not anything I intend to work on currently. Not sure about others. I do plan on working on an http command for moving shards though. - Mark On Mar 21, 2013, at 7:08 PM, Jamie Johnson wrote:

Could not load config for solrconfig.xml

2013-03-21 Thread Furkan KAMACI
I run ant idea command for Solr 4.1.0 and opened source code within Intellij IDEA 12.0.4 and I use Centos 6.4 at my 64 bit computer. I debugged JettySolrRunner (I don't know, I think this is the way to run Solr with embedded Jetty within my Intellij IDEA.) However I get that error: SEVERE: Unable

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Steve Rowe
Perhaps you didn't see what I wrote earlier?: Sounds like you want 'ant dist', which will create the .war and put it into the solr/dist/ directory: PROMPT$ ant dist Steve On Mar 21, 2013, at 7:38 PM, Furkan KAMACI wrote: > I mean I need that: There is a .war file shipped with Solr source co

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Furkan KAMACI
I mean I need that: There is a .war file shipped with Solr source code. How can I regenerate (build my code and generate a .war file) as like that? I will deploy it to Tomcat then? 2013/3/22 Furkan KAMACI > Your mentioned suggestion is for only example application? Can I imply it > to just pure

Solr cloud and auto shard timeline

2013-03-21 Thread Jamie Johnson
I've seen that Elastic Search has had auto-sharding capabilities for some time. Is there a timeline for when a similar capability is being targeted for Solr Cloud?

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Furkan KAMACI
Your mentioned suggestion is for only example application? Can I apply it to just pure Solr (I don't want to generate example application because my aim is not just debugging Solr, I want to extend it and I will debug that extended code)? 2013/3/22 Alexandre Rafalovitch > That's nice. Can we put

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Alexandre Rafalovitch
That's nice. Can we put that on a Wiki? Or as a quick screencast? Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be w

Re: Sort-field for ALL docs in FieldCache for sort queries -> OOM on lots of docs

2013-03-21 Thread Shawn Heisey
On 3/21/2013 4:05 AM, Per Steffensen wrote: Can anyone else elaborate? How to "activate" it? How to make sure, for sorting, that sort-field-value for all docs are not read into memory for sorting - leading to OOM when you have a lot of docs? Can this feature be activated on top of an existing 4.0

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Erik Hatcher
Here's my development/debug workflow: - "ant idea" at the top-level to generate the IntelliJ project - cd solr; ant example - to build the full example - cd example; java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005 -jar start.jar - to launch Jetty+Solr in debug mo

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Steve Rowe
As solr/README.txt says in the "Instructions for Building Apache Solr from Source" section: "4.Navigate to the "solr" folder and issue an "ant" command to see the available options for building, testing, and packaging Solr." PROMPT$ ant Buildfile: /Users/sarowe/svn/lucene/dev/branches/branch_4x/

Re: DocValues and field requirements

2013-03-21 Thread Shawn Heisey
On 3/21/2013 3:07 PM, Shawn Heisey wrote: This might be a requirement of the lower-level Lucene API, or it might be a requirement that was instituted at the Solr level because a problem was found when docs did not contain the field. Google seems reluctant to tell me, and I haven't figured out th

Re: Nested queries with proximity/slop

2013-03-21 Thread Yonik Seeley
https://issues.apache.org/jira/browse/SOLR-4625 -Yonik http://lucidworks.com On Tue, Mar 19, 2013 at 11:12 PM, Yonik Seeley wrote: > On Tue, Mar 19, 2013 at 8:52 PM, Michael Ryan wrote: >> I was wondering if anyone is aware of an existing Jira for this bug... >> >> _query_:"\"a b\"~2" >> ...is

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Furkan KAMACI
Using embedded is an option. However I see that there is a .war file inside Solr source code. So that means that I can generate a .war file and deploy it to Tomcat or something like that. My main question arises here. How can I generate a .war file from my customized Solr source code? That's why I

Re: DocValues and field requirements

2013-03-21 Thread Shawn Heisey
On 3/21/2013 5:46 AM, Marcin Rzewucki wrote: Hi, Can somebody explain why there are additional requirements for a field to be able to use DocValues ? For example: Trie*Fields have to be required or have default value. "Schema Parsing Failed: Field tlong{class=org.apache.solr.schema.TrieLongFiel

Re: How can I compile and debug Solr from source code?

2013-03-21 Thread Shawn Heisey
On 3/21/2013 6:56 AM, Furkan KAMACI wrote: I use Intellij Idea 12 and Solr 4.1 on a Centos 6.4 64 bit computer. I have opened Solr source code at Intellij IDEA as explained in the documentation. I want to deploy Solr into Tomcat 7. When I open the project there are configurations set previously (I used

Re: SOLR - Documents with large number of fields ~ 450

2013-03-21 Thread Jack Krupansky
You will definitely be pushing the limits for reasonable performance. Maybe 4-5 years from now you will be able to get decent performance with hundreds of fields and dozens of faceted fields, but I'd be surprised if you could get decent performance with more than about 100 fields and a dozen fac

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Uomesh
Hi Mark, Attached is my solrconfig_slave.xml. My replication interval is 1 minute(default). Please let me know if you need any more config details Thanks, umesh On Thu, Mar 21, 2013 at 3:19 PM, Mark Miller-3 [via Lucene] < ml-node+s472066n4049832...@n3.nabble.com> wrote: > Can you give more de

Re: Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Mark Miller
Can you give more details about your configuration and setup? Our best bet is to try and recreate this with a unit test. - Mark On Mar 21, 2013, at 4:08 PM, Uomesh wrote: > Hi, > > I am seeing an issue after upgrading from solr 3.6.2 to Solr 4.2. My Slave > stop replicating after sometime. An

Solr 4.2 - Slave Index version is higher than Master

2013-03-21 Thread Uomesh
Hi, I am seeing an issue after upgrading from Solr 3.6.2 to Solr 4.2. My slave stops replicating after some time, and it seems the issue is that my slave index version is higher than the master's. How is it possible for the slave index version to be higher than the master's? Please help me. Is there anything i

Re: CommaSplit and query is free text search

2013-03-21 Thread Jack Krupansky
That's fine as far as it goes, but the input is multi-valued, so merely splitting tokens on comma doesn't make the tokens separate values. Given: {"worked_company_name":["Dell","Microsoft,Facebook"] } The regex would produced the equivalent of : {"worked_company_name":["Dell","Microsoft Faceb

Re: Did something change with Payloads?

2013-03-21 Thread jimtronic
Ok, Yes, I have now recompiled against the 4.2.0 libraries. I needed to change a few things, but the problem still exists using the new libraries. I think the problem may actually be on the indexing side of things. Here's why: 1. I had an old index created under 4.0, running 4.0. Works as expecte

RE: CommaSplit and query is free text search

2013-03-21 Thread Keswani, Nitin - BLS CTR
You can use a type defined below to split on comma. Please note I have not used any additional filters. Based on your requirements you might want to add more filters for further processing after tokenisation : Thanks. Regards, Nitin Keswani -Original Me

Writing new indexes from index readers slow!

2013-03-21 Thread Jed Glazner
Hey Hey Everybody!, I'm not sure if I should have posted this to the developers list... if i'm totally barking up the wrong tree here, please let me know! Anywho, I've developed a command line utility based on the MultiPassIndexSplitter class from the lucene lib

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-21 Thread Dyer, James
The shard responses get combined in SpellCheckComponent#finishStage . I highly recommend you file a JIRA bug report for this at https://issues.apache.org/jira/browse/SOLR . If you write a failing unit test, it would make it much more likely that others would help you with a fix. Of course, i

Re: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-21 Thread alxsss
Hello, We need this feature to be fixed ASAP. So, please let me know which class is responsible for combining spellcheck results from all shards. I will try to debug the code. Thanks in advance. Alex. -Original Message- From: alxsss To: solr-user Sent: Tue, Mar 19, 2013 11:34 a

Solr 4.2.0 results links

2013-03-21 Thread zeroeffect
While I am still in the beginning phase of solr I have been able to index a directory of HTML files. I can search keywords and get results. The problem I am having is the links to the HTML document is file based and http based. I get the link but it points to file:\\ and not http:\\. I have been lo

Re: Problem with Solr replication in solr 4.2

2013-03-21 Thread Mark Miller
On Mar 21, 2013, at 12:08 PM, Rohit Harchandani wrote: > Hey, > Currently we are using solr 4.0 with a master slave setup. The data gets > indexed on the master and then we issue a fetchindex command to replicate > it on the slave. The slave has a postCommit listener which gets kicked off > when

Timeout occured while waiting response from server

2013-03-21 Thread Benjamin, Roy
I'm calling: m_server.add(docs, 12); Wondering if the timeout that expires was the one set when the server was created? m_server = new HttpSolrServer(serverUrl); m_server.setRequestWriter(new BinaryRequestWriter()); m_server.setConnectionTimeout(3); m_server.setSoTimeout(1); Als

setting bq in searchcomponent

2013-03-21 Thread Peyman Faratin
Hi If I run a main query "cheeze" jointly with a boost query "bq=spell:"cheeze" (boosting results with spell field "cheeze"), as /select?fl=title&qf=main&bq=spell:"cheeze"&bq=trans:"cheeze"&q=cheeze everything works fine. And defType=dismax What I'd like to do is to programmatically generate t

Re: Spatial Search with document score as distance between two points

2013-03-21 Thread Chris Hostetter
: q={!func}geodist()&sfield=latlng&pt=28.635308,77.22496&sort=score+asc ... : Problem : For those documents which doesn't have latlng field, value is coming exceptionally large '8763.191'. I'm pretty sure you're seeing the function assume a default lat,lon of "0,0" for docs that you hav
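
The "default of 0,0" explanation can be sanity-checked with plain arithmetic. The sketch below is not Solr code: it assumes a haversine great-circle distance and a mean earth radius of 6371.0087714 km (both assumptions, chosen for the estimate), and measures the distance from the pt value in the query to latitude 0, longitude 0. The function name haversine_km is made up for illustration.

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.0087714  # assumption: mean earth radius in kilometres


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_MEAN_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Distance from the query point to (0, 0), the suspected default for missing values.
    print(haversine_km(28.635308, 77.22496, 0.0, 0.0))
```

Under these assumptions the result lands within a kilometre of the 8763.191 reported for documents without a latlng value, which is consistent with the explanation given in the reply.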

Re: custom similary on a field not working

2013-03-21 Thread xavier jmlucjav
Damn...I was obfuscated seeing the 14 there...I had naively thought that term freq would not be stored in the doc, 1 would be stored, but I guess it still stores the real value and then applies custom similarity at query time. That means changing to a custom similarity does not need reindexing rig

Re: custom similary on a field not working

2013-03-21 Thread Chris Hostetter
: > public class NoTfSimilarity extends DefaultSimilarity { : > public float tf(float freq) { : > return freq > 0 ? 1.0f : 0.0f; : > } : > } ... : > But I still see tf=14 in my query?? ... : > 1.0 = tf(freq=14.0), with freq of: : >
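
For readers puzzled by the same explain output, here is a small Python sketch of the two tf functions involved. It assumes the explain line follows the usual "value = tf(freq=...)" layout, where freq is the raw term count and the number on the left is what the similarity returned. The function names default_tf and no_tf are made up; default_tf mirrors the square-root tf of Lucene's DefaultSimilarity, and no_tf mirrors the quoted NoTfSimilarity.

```python
import math


def default_tf(freq):
    """DefaultSimilarity-style tf: square root of the raw term frequency."""
    return math.sqrt(freq)


def no_tf(freq):
    """Mirror of the quoted NoTfSimilarity.tf(): 1.0 for any match, else 0.0."""
    return 1.0 if freq > 0 else 0.0


if __name__ == "__main__":
    freq = 14.0  # raw count shown as freq=14.0 in the explain output
    print("default:", default_tf(freq))
    print("custom: ", no_tf(freq))
```

Read this way, "1.0 = tf(freq=14.0)" means the custom similarity is in effect: the raw frequency is still 14, but the tf factor has been flattened to 1.0 instead of the square root of 14.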

Re: Don't cache filter queries

2013-03-21 Thread Chris Hostetter
: Just add {!cache=false} to the filter in your query : (http://wiki.apache.org/solr/SolrCaching#filterCache). ... : > I need to use the filter query feature to filter my results, but I : > don't want the results cached as documents are added to the index : > several times per second and t

Re: CommaSplit and query is free text search

2013-03-21 Thread Jack Krupansky
You should clean up your data before sending it to Solr. Theoretically, you could develop a custom update processor to do that cleanup within Solr, but it probably wouldn't be worth the extra effort. Once you have decided what the clean input format is, then you can decide what the details of
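
One possible shape for the client-side cleanup suggested above, sketched in Python. This is only an illustration under the assumption that every comma in a value separates two company names; the helper name split_values is made up, and real data may need a different rule.

```python
def split_values(values, sep=","):
    """Split each entry on sep and return a flat list of trimmed, non-empty values."""
    cleaned = []
    for value in values:
        for part in value.split(sep):
            part = part.strip()
            if part:
                cleaned.append(part)
    return cleaned


# Sample document taken from the thread; the field becomes truly multi-valued
# before it is sent to Solr.
doc = {"worked_company_name": ["Dell", "Microsoft,Facebook"]}
doc["worked_company_name"] = split_values(doc["worked_company_name"])

if __name__ == "__main__":
    print(doc)
```

Doing the split before indexing means each company is its own value, so no comma-aware tokenizer is needed in the schema for this purpose.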

Problem with Solr replication in solr 4.2

2013-03-21 Thread Rohit Harchandani
Hey, Currently we are using solr 4.0 with a master slave setup. The data gets indexed on the master and then we issue a fetchindex command to replicate it on the slave. The slave has a postCommit listener which gets kicked off when replication finishes and we depend on this listener to know when rep

java.lang.NullPointerException withs stats component and shards

2013-03-21 Thread Agnieszka Kukałowicz
Hi, I have problem with Stats component in shards environment. Solr throws Java.lang.NullPointerException when there is no results and statistic is computed over date field. price ddate true *:* date:[2013-03-23T00:00:00Z TO *] price

Re: Don't cache filter queries

2013-03-21 Thread Rafał Kuć
Hello! Just add {!cache=false} to the filter in your query (http://wiki.apache.org/solr/SolrCaching#filterCache). -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > I need to use the filter query feature to filter my results, but I > don't want t
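
A minimal Python sketch of what such a request could look like once URL-encoded. The base URL, the query, and the filter type:book are hypothetical placeholders, and build_select_url is a made-up helper; only the {!cache=false} local parameter comes from the reply above.

```python
from urllib.parse import urlencode, parse_qs


def build_select_url(base_url, query, filter_query, cache=False):
    """Build a /select URL; prefix the filter with {!cache=false} unless caching is wanted."""
    fq = filter_query if cache else "{!cache=false}" + filter_query
    return base_url + "/select?" + urlencode({"q": query, "fq": fq})


# Hypothetical core URL and filter, for illustration only.
url = build_select_url("http://localhost:8983/solr", "*:*", "type:book")

if __name__ == "__main__":
    print(url)
```

The braces, exclamation mark, and equals sign are percent-encoded by urlencode, so the local parameter survives the trip through the URL intact.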

Re: Bitwise operation

2013-03-21 Thread Walter Underwood
How often is "frequently"? If it is 1000/second, you have a problem, but you'd have a problem with most solutions. Measure or estimate how many documents are affected, how often. Then set a latency for how long you can wait before the change is visible. With those, you can evaluate solutions. W

Don't cache filter queries

2013-03-21 Thread Dotan Cohen
I need to use the filter query feature to filter my results, but I don't want the results cached as documents are added to the index several times per second and the results will be stale immediately. Is there any way to disable filter query caching? This is on Solr 4.1 running in Jetty on Ubuntu

Re: Facets with 5000 facet fields

2013-03-21 Thread Andy
But if I just add facet.method=fcs, wouldn't I just get fcs? Mark said this new method based on docvalues is better than fcs, so wouldn't I need to do something other than specifying fcs to enable this new method? From: Upayavira To: solr-user@lucene.apache.o

Re: custom similary on a field not working

2013-03-21 Thread xavier jmlucjav
Steve, yes, as I already included (though maybe is not very visible) I have this before element: I can see explain info is indeed different, for example I have [] instead of [DefaultSimilarity] thanks On Thu, Mar 21, 2013 at 3:08 PM, Steve Rowe wrote: > Hi xavier, > > Have you set the glo

Re: Keeping query index until a complete reindex is done

2013-03-21 Thread Silvio Hermann
thanks for that hint. so long, Silvio On 03/20/2013 01:07 PM, Jan Høydahl wrote: Hi, Yes. Two alternatives 1) If you just want to reindex, no big schema changes etc Just reindex everything on top of the old index. Search will still be live all the time 2) Create a new core/collection Reinde

CommaSplit and query is free text search

2013-03-21 Thread anurag.jain
I have field named as worked_company_name. in json input i am giving value like { "worked_company_name":["Dell","Microsoft,Facebook"] } -> data is very bad. means it may have comma etc. so can you please tell me how type should ? comaSplitwithsearch ?? thanks -- View this mes

Re: custom similary on a field not working

2013-03-21 Thread Steve Rowe
Hi xavier, Have you set the global similarity to solr.SchemaSimilarityFactory? See . Steve On Mar 21, 2013, at 9:44 AM, xavier jmlucjav wrote: > Hi Felipe, > > I need to keep positions, that is why I cannot just use > omitTermFreqAndPositions

Re: Did something change with Payloads?

2013-03-21 Thread Ahmet Arslan
Did you recompile your plugins using the Solr 4.2.0 dependency? Maybe method signatures (that you override) have changed? --- On Thu, 3/21/13, jimtronic wrote: > From: jimtronic > Subject: Re: Did something change with Payloads? > To: solr-user@lucene.apache.org > Date: Thursday, March 21, 2013

What to expect when testing Japanese search index

2013-03-21 Thread Van Tassell, Kristian
I’m trying to set up our search index to handle Japanese data, and while some searches yield results, others do not. This is especially true the smaller the search term. For example, searching for this term: 更 Yields no results even though I know it appears in the text. I understand that this

Re: custom similary on a field not working

2013-03-21 Thread xavier jmlucjav
Hi Felipe, I need to keep positions, that is why I cannot just use omitTermFreqAndPositions On Thu, Mar 21, 2013 at 2:36 PM, Felipe Lahti wrote: > Do you really need a custom similarity? > Did you try to put the attribute "omitTermFreqAndPositions" in your field? > > It could be: > > indexed="

Re: Slow queries for common terms

2013-03-21 Thread Ahmet Arslan
Hi David, I happen to know CommonTermsQuery was added recently, but I'm not sure how to use it. http://lucene.apache.org/core/4_1_0/queries/org/apache/lucene/queries/CommonTermsQuery.html --- On Thu, 3/21/13, David Parks wrote: > From: David Parks > Subject: Slow queries for common terms > To: "sol

Re: custom similary on a field not working

2013-03-21 Thread Felipe Lahti
Do you really need a custom similarity? Did you try to put the attribute "omitTermFreqAndPositions" in your field? It could be: http://wiki.apache.org/solr/SchemaXml On Thu, Mar 21, 2013 at 7:35 AM, xavier jmlucjav wrote: > I have the following setup: > > positionIncrementGap="100"

Slow queries for common terms

2013-03-21 Thread David Parks
I've got a query that takes 15 seconds to return whenever I have the term "book" in a query that isn't cached. That's a pretty common term in our search index. We're indexing about 120 GB of text data. We only store terms and IDs, no document data, and the disk is virtually unused, it's all CPU

Re: Facets with 5000 facet fields

2013-03-21 Thread Upayavira
as was said below, add facet.method=fcs to your query URL. Upayavira On Thu, Mar 21, 2013, at 09:41 AM, Andy wrote: > What do I need to do to use this new per segment faceting method? > > > > From: Mark Miller > To: solr-user@lucene.apache.org > Sent: Wednesd

Re: Using Solr from Github or SVN

2013-03-21 Thread Alexandre Rafalovitch
Thanks. I read "Generate Patch" section and saw no Git mention. Did not read "Working.." section as I did not expect to need it any time soon. It is good to know. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is

Re: Slow queries for common terms

2013-03-21 Thread Jan Høydahl
Hi, If you say that you try to index 300M docs in ONE single Solr server, with "a few gigs" of RAM, then that's the reason for some bad performance right there. You should benchmark to find the sweet-spot of how many documents you want to fit per node/shard and still have acceptable indexing/qu

How can I compile and debug Solr from source code?

2013-03-21 Thread Furkan KAMACI
I use Intellij Idea 12 and Solr 4.1 on a Centos 6.4 64 bit computer. I have opened Solr source code at Intellij IDEA as explained in the documentation. I want to deploy Solr into Tomcat 7. When I open the project there are configurations set previously (I used ant idea command before I open the project) .

Re: Using Solr from Github or SVN

2013-03-21 Thread Jan Høydahl
If you read the HowToContribute page you will see it described there. It's supported, you just need to apply the patch with -p1 instead of -p0 http://wiki.apache.org/solr/HowToContribute#Working_With_Patches -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training

Re: Bitwise operation

2013-03-21 Thread Upayavira
You could use the same approach for users as for groups - have a {!join} filter query to select docs that a user is allowed to see, and another to select groups they are allowed to see. I've no idea how performant this would be for you, as it depends on how many documents a single user is allowed

DocValues and field requirements

2013-03-21 Thread Marcin Rzewucki
Hi, Can somebody explain why there are additional requirements for a field to be able to use DocValues ? For example: Trie*Fields have to be required or have default value. "Schema Parsing Failed: Field tlong{class=org.apache.solr.schema.TrieLongField,analyzer=org.apache.solr.analysis.TokenizerCh

RE: Slow queries for common terms

2013-03-21 Thread David Parks
We have 300M documents, each about a paragraph of text on average. The index is 140GB in size. I'm not sure how to find the IDF score, was that in the debug query below? It seems that any query with the word "book" in it triggers a 15 sec response time (unless it's the 2nd time we run the same que

Re: Using Solr from Github or SVN

2013-03-21 Thread Alexandre Rafalovitch
I thought git patch format was different from svn's. Would that cause issues? Regards, Alex On 21 Mar 2013 05:53, "Jan Høydahl" wrote: > See http://wiki.apache.org/solr/HowToContribute > > Whether you choose to work locally with a GIT checkout or SVN is up to > you. At the end of the day, w

Re: Bitwise operation

2013-03-21 Thread Christopher ARZUR
Users also have rights at the individual level (in addition to those inherited from their group). Does your solution imply that I attach potentially 1 million identifiers to the document? I do not know Solr's limitations, but I think I would be approaching them. On 21/03/2013 10:47, Upayavira wrote: You could attach

Re: Using Solr from Github or SVN

2013-03-21 Thread Jan Høydahl
Committers mostly use ant+ivy, so that will most likely be more stable. Both Eclipse and IDEA are frequently in use by committers so both are safe choices. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com 21. mars 2013 kl. 11:10 skre

Re: Slow queries for common terms

2013-03-21 Thread Jan Høydahl
Hi, I think you can start by reading this blog http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-2 and try out the approach using a dictionary of the most common words in your index. You don't say how many documents, avg. doc size, the IDF value of "book", h

custom similary on a field not working

2013-03-21 Thread xavier jmlucjav
I have the following setup: I index my corpus, and I can see tf is as usual, in this doc is 14 times in this field: 4.5094776 = (MATCH) weight(description:galaxy^10.0 in 440) [DefaultSimilarity], result of:

Re: Using Solr from Github or SVN

2013-03-21 Thread Furkan KAMACI
How about deciding on Maven or Ant + Ivy. On the other hand I need another suggestion whether using Eclipse or Intellij IDEA. What developers use in common? 2013/3/21 Jan Høydahl > See http://wiki.apache.org/solr/HowToContribute > > Whether you choose to work locally with a GIT checkout or SVN i

Slow queries for common terms

2013-03-21 Thread David Parks
I've got a query that takes 15 seconds to return whenever I have the term "book" in a query that isn't cached. That's a pretty common term in our search index. We're indexing about 120 GB of text data. We only store terms and IDs, no document data, and the disk is virtually unused, it's all CPU tim

Re: Sort-field for ALL docs in FieldCache for sort queries -> OOM on lots of docs

2013-03-21 Thread Per Steffensen
On 3/21/13 10:52 AM, Toke Eskildsen wrote: On Thu, 2013-03-21 at 09:57 +0100, Per Steffensen wrote: Thanks Toke! Can you please elaborate a little bit? How to use it? What it is supposed to do for you? Sorry, no, I only know about it on the abstract level. The release notes for Solr 4.2 says *

Re: Using Solr from Github or SVN

2013-03-21 Thread Jan Høydahl
See http://wiki.apache.org/solr/HowToContribute Whether you choose to work locally with a GIT checkout or SVN is up to you. At the end of the day, when you want to contribute stuff back, you'd generate a patch and attach it to JIRA. SVN is the main repo, so if you want to be 100% in sync, choos

Re: Sort-field for ALL docs in FieldCache for sort queries -> OOM on lots of docs

2013-03-21 Thread Toke Eskildsen
On Thu, 2013-03-21 at 09:57 +0100, Per Steffensen wrote: > Thanks Toke! Can you please elaborate a little bit? How to use it? What > it is supposed to do for you? Sorry, no, I only know about it on the abstract level. The release notes for Solr 4.2 says * DocValues have been integrated into Solr

Re: Bitwise operation

2013-03-21 Thread Upayavira
You could attach the doc rights to the document itself, and then index the group rights into a separate core, and then use pseudo-joins to filter them. Effectively, you would say, "find me all the groups that my user is allowed to see, then find me all documents that are in those groups, based upon

Re: Facets with 5000 facet fields

2013-03-21 Thread Andy
What do I need to do to use this new per segment faceting method? From: Mark Miller To: solr-user@lucene.apache.org Sent: Wednesday, March 20, 2013 1:09 PM Subject: Re: Facets with 5000 facet fields On Mar 20, 2013, at 11:29 AM, Chris Hostetter wrote: > No

Using Solr from Github or SVN

2013-03-21 Thread Furkan KAMACI
I want to branch Solr (latest version) locally and implement some custom code. After some time (maybe every month) I will merge my code with Solr. However, there is code for Solr in both SVN and GitHub, and I see that they are not exactly in sync. Which one do you suggest, do you think that if the

Re: Facets with 5000 facet fields

2013-03-21 Thread John Nielsen
It looks like docvalues might solve a problem we have. (sorry for the thread jacking) I looked for info on it on the wiki, but could not find any. Is there any documentation done on it yet? On Wed, Mar 20, 2013 at 6:09 PM, Mark Miller wrote: > > On Mar 20, 2013, at 11:29 AM, Chris Hostetter

Re: opinion: Stats over the faceting component

2013-03-21 Thread Upayavira
Have you made a JIRA ticket for this? This is useful generally, isn't it? Thx, Upayavira On Thu, Mar 21, 2013, at 03:18 AM, Tirthankar Chatterjee wrote: > We have done something similar. > Please read > http://lucene.472066.n3.nabble.com/How-to-modify-Solr-StatsComponent-to-support-stats-query-td

Re: SynonymFilterFactory and Punctuation

2013-03-21 Thread Upayavira
Something is stripping your punctuation before it gets to your synonym filter. Presumably this is the StandardTokenizer. Try it with the WhitespaceTokenizerFactory. Also be careful to URL encode plus signs in queries - they represent a space in a URL. Upayavira On Wed, Mar 20, 2013, at 10:52 PM,
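
The advice about plus signs can be illustrated with a minimal sketch. It uses only Python's standard library and is not Solr-specific; the sample query value and the helper name encode_query_value are assumptions made up for illustration. It shows that a literal "+" has to be sent as %2B, because a bare "+" in a URL query string decodes to a space.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_query_value(value: str) -> str:
    """Percent-encode a value for use in a URL query string.

    Hypothetical helper for illustration: literal '+' becomes '%2B'
    and spaces become '+'.
    """
    return quote_plus(value)


# Hypothetical search term containing literal plus signs.
raw = "c++ tutorial"
encoded = encode_query_value(raw)

# What a server-side form decoder would see if the plus signs were sent unencoded.
decoded_unencoded = unquote_plus("c++")

print(encoded)
print(repr(decoded_unencoded))
```

Decoding the encoded value gives back the original string, while the unencoded "c++" arrives as "c" followed by two spaces.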

Re: dataimporthandler: nested query is called multiple times

2013-03-21 Thread patrick
alex, thank you for the link. i enabled the trace for 'org.apache.solr.handler.dataimport' and it seems as if the database is only called once: 2013-03-21T09:40:43 1363855243889 50 org.apache.solr.handler.dataimport.JdbcDataSource FINE org.apache.solr.handler.dataimport.JdbcDataSou

Re: how to search for special charcters

2013-03-21 Thread Upayavira
Use an EdgeNGramFilterFactory in your analysis time query parser. But note this will significantly increase the number of terms you are searching for, which will hit performance. Upayavira On Wed, Mar 20, 2013, at 03:47 PM, Niklas Langvig wrote: > Hi > How can in solr serach for all documents tha

Re: Sort-field for ALL docs in FieldCache for sort queries -> OOM on lots of docs

2013-03-21 Thread Per Steffensen
On 3/21/13 9:48 AM, Toke Eskildsen wrote: On Thu, 2013-03-21 at 09:13 +0100, Per Steffensen wrote: We have a lot of docs in Solr. Each particular Solr-node handles a lot of docs distributed among several replicas. When you issue a sort query, it seems to me that the value of the sort-field of AL

Re: Bitwise operation

2013-03-21 Thread Christopher ARZUR
@Jan Høydahl : do you mean "cutom filter" ? @Walter Underwood : I also agree with you, I'd only use native functions of Solr, but I do not know how to solve my problem ... My ACLs are composed of thousands of groups (inheritance) that have deny / allow user rights and who themselves have deny /

Re: Sort-field for ALL docs in FieldCache for sort queries -> OOM on lots of docs

2013-03-21 Thread Toke Eskildsen
On Thu, 2013-03-21 at 09:13 +0100, Per Steffensen wrote: > We have a lot of docs in Solr. Each particular Solr-node handles a lot > of docs distributed among several replicas. When you issue a sort query, > it seems to me that the value of the sort-field of ALL docs under the > Solr-node is adde

Sort-field for ALL docs in FieldCache for sort queries -> OOM on lots of docs

2013-03-21 Thread Per Steffensen
Hi We have a lot of docs in Solr. Each particular Solr-node handles a lot of docs distributed among several replicas. When you issue a sort query, it seems to me that the value of the sort-field of ALL docs under the Solr-node is added to the FieldCache. This leads to OOM-exceptions at some p
