Hello,
I created a data-config.xml file where I define a datasource and an entity
with 12 fields.
In my use case I have 2 databases with the same schema, so I want to
combine the 2 databases into one index.
I defined a second dataSource tag and duplicated the entity with its
fields (changed the name
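For reference, a minimal data-config.xml along those lines might look like the
sketch below (driver, URLs, credentials, and table/field names are all
hypothetical, not from this thread):

    <dataConfig>
      <!-- two JDBC sources with the same schema; URLs/credentials are placeholders -->
      <dataSource name="ds1" driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://host1/mydb" user="user" password="pass"/>
      <dataSource name="ds2" driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://host2/mydb" user="user" password="pass"/>
      <document>
        <entity name="itemDb1" dataSource="ds1" query="SELECT * FROM item">
          <field column="ID" name="id"/>
          <!-- ... the remaining fields ... -->
        </entity>
        <entity name="itemDb2" dataSource="ds2" query="SELECT * FROM item">
          <field column="ID" name="id"/>
          <!-- ... the remaining fields ... -->
        </entity>
      </document>
    </dataConfig>

As the thread later points out, the ids coming from the two databases must not
collide, or documents from one database will overwrite the other's.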
Hi,
It's funny that if you try "fóruns" it matches:
http://bhakta.casadomato.org:8982/solr/select/?q=f%C3%B3runs&version=2.2&start=0&rows=10&indent=on
But when you try "foruns", it does not.
Check this out...
http://bhakta.casadomato.org:8982/solr/admin/analysis.jsp?nt=type&name=text&verbose
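One common way to make the accented and unaccented forms match (the thread
itself doesn't confirm this is the chosen fix, so treat it as a sketch) is to
add an ASCIIFoldingFilterFactory to the field type's analyzer, so both indexed
and queried terms are folded:

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- folds accented characters: "fóruns" is indexed and queried as "foruns" -->
        <filter class="solr.ASCIIFoldingFilterFactory"/>
      </analyzer>
    </fieldType>

After a change like this the index must be rebuilt, since already-indexed
terms keep their accents.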
Hi all,
I have a problem configuring PDF indexing from a directory in my Solr
with DIH:
with this data-config
I obtain this result:
full-import
idle
0:0:2.44
0
43
0
2012-02-12 19:06:00
Indexing failed. Rolled back all changes.
Hello.
Which RequestHandler do you use to find typing errors, like "goolge" => did
you mean "google"?!
I want to use my "EdgeNGram" autosuggestion together with a clever
autocorrection! What do you use?
--- System
One Server, 12 GB RAM, 2
I kept the old schema files and solrconfig file, but there were some errors due
to which Solr was not loading. I don't know what those things are. We have a
few of our own custom plugins developed with 1.4.1
We have both stored=true and stored=false fields in the schema, so we can't
reindex the way you said; we tried that earlier.
Please find my replies inline.
On Thu, Feb 16, 2012 at 10:30 AM, Alexey Verkhovsky <
alexey.verkhov...@gmail.com> wrote:
> Hi, all,
>
> I'm new here. Used Solr on a couple of projects before, but didn't need to
> dive deep into anything until now. These days, I'm doing a spike for a
> "yellow pages" type sear
1. Do you see any errors / exceptions in the logs?
2. Could you have duplicates?
On Thu, Feb 16, 2012 at 10:15 AM, Radu Toev wrote:
> Hello,
>
> I created a data-config.xml file where I define a datasource and an entity
> with 12 fields.
> In my use case I have 2 databases with the same schema,
Hi William,
Thanks for the feedback.
I will try the group query and see how the performance with 2 queries is.
Best Regards
Ericz
On Thu, Feb 16, 2012 at 4:06 AM, William Bell wrote:
> One way to do it is to group by city and then sort=geodist() asc
>
> select?group=true&group.field=city&sort
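Spelled out in full, such a request might look like the following sketch (the
sfield and pt values are placeholders; geodist() needs a location field and a
reference point):

    http://localhost:8983/solr/select?q=*:*
        &group=true&group.field=city
        &sfield=store_location&pt=39.7,-104.9
        &sort=geodist()+asc
        &fl=id,city,score

With grouping enabled, sort=geodist() asc orders the groups by the distance of
each group's top document, so you get one nearest entry per city.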
I have a helpdesk application developed in PHP/MySQL. I want to implement
real-time full-text search, and I have shortlisted Solr. The MySQL database
will store all the tickets and their updates, and that data will be imported
to build the Solr index. All search requests will be handled by Solr.
What I w
1. Nothing in the logs
2. No.
On Thu, Feb 16, 2012 at 12:44 PM, Dmitry Kan wrote:
> 1. Do you see any errors / exceptions in the logs?
> 2. Could you have duplicates?
>
> On Thu, Feb 16, 2012 at 10:15 AM, Radu Toev wrote:
>
> > Hello,
> >
> > I created a data-config.xml file where I define a da
It sounds a bit as if SOLR stopped processing data once it queried all
from the smaller dataset. That's why you have 2000. If you just have a
handler pointed to the bigger data set (6k), do you manage to get all 6k db
entries into solr?
On Thu, Feb 16, 2012 at 1:46 PM, Radu Toev wrote:
> 1. Not
I tried running with just one datasource(the one that has 6k entries) and
it indexes them ok.
The same if I do the 1k database separately: it indexes OK.
On Thu, Feb 16, 2012 at 2:11 PM, Dmitry Kan wrote:
> It sounds a bit as if SOLR stopped processing data once it queried all
> from the smal
OK, maybe you can show the db-data-config.xml just in case?
Also, in schema.xml, does your uniqueKey correspond to the unique field in
the db?
On Thu, Feb 16, 2012 at 2:13 PM, Radu Toev wrote:
> I tried running with just one datasource(the one that has 6k entries) and
> it indexes them ok.
> The same, if I
I've removed the connection params
The unique key is id.
On Thu, Feb 16, 2012 at 2:27 PM, Dmitry Kan wrote:
> OK, maybe you can show the db-data-config.xml just in case?
> Also, in schema.xml, does your uniqueKey correspond to the unique fi
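For context, the check being discussed is the <uniqueKey> declaration in
schema.xml; it must name a field that carries the database's unique column,
for example:

    <!-- schema.xml: Solr overwrites documents that share this key -->
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <uniqueKey>id</uniqueKey>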
Hi Solr community,
I'm new to Solr and DataImportHandler. I have a requirement to fetch the data
from a database table and pass it to Solr.
Parts of the existing data-config.xml and Solr schema.xml are given below:
data-config.xml
On Feb 14, 2012, at 10:57 PM, Jamie Johnson wrote:
> Not sure if this is
> expected or not.
Nope - should be already resolved or will be today though.
- Mark Miller
lucidimagination.com
Ok, great. Just wanted to make sure someone was aware. Thanks for
looking into this.
On Thu, Feb 16, 2012 at 8:26 AM, Mark Miller wrote:
>
> On Feb 14, 2012, at 10:57 PM, Jamie Johnson wrote:
>
>> Not sure if this is
>> expected or not.
>
> Nope - should be already resolved or will be today th
PatternReplaceFilterFactory has no option to select the group to replace.
Is there a reason for this, or could this be a nice feature?
--
View this message in context:
http://lucene.472066.n3.nabble.com/PatternReplaceFilterFactory-group-tp3750201p3750201.html
Sent from the Solr - User mailing l
Hello all:
We'd like to score the matching documents using a combination of SOLR's IR
score with another application-specific score that we store within the
documents themselves (i.e. a float field containing the app-specific
score). In particular, we'd like to calculate the final score doing some
On 16 February 2012 14:33, alessio crisantemi
wrote:
> Hi all,
> I have a problem to configure a pdf indexing from a directory in my solr
> wit DIH:
>
> with this data-config
>
>
>
>
>
> name="tika-test"
> processor="FileListEntityProcessor"
> baseDir="D:\gioconews_archivio\marzo20
Hi Baranee,
Some time ago I played with
http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer - it was
pretty good stuff.
Regards
On Thu, Feb 16, 2012 at 3:53 PM, K, Baraneetharan wrote:
> To avoid that, we don't want to mention the column names in the field tag,
> but want to writ
The slaves will be able to replicate from the master as before, but not
in NRT, depending on your commit interval. The commit interval can be set
higher, as with NRT it is not needed for searches, only for consolidating
the index changes on the master, and it can be an hour or even more. It may be
easier to
I will test it with my big production indexes first, if it works I
will port to Java and add to contrib I think.
On Wed, Feb 15, 2012 at 10:03 PM, Li Li wrote:
> great. I think you could make it a public tool. maybe others also need such
> functionality.
>
> On Thu, Feb 16, 2012 at 5:31 AM, Rober
Hello,
I already posted this question, but for some reason it was attached to a
thread with a different topic.
Is there the possibility of performing an 'exact search' in a payload field?
I have to index text with auxiliary info for each word. In particular, to
each word is associated the bounding box co
I think the problem here is that initially you are trying to create separate
documents for two different tables, while your config is aiming to create
only one document. Here is one solution (not tried by me):
--
You can have multiple documents generated by the same data-config:
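The XML example that followed did not survive in this archive; the idea (with
hypothetical entity and table names) is two root entities under one
<document>, each producing its own stream of documents:

    <document>
      <entity name="users" dataSource="ds1"
              query="SELECT id, name FROM users"/>
      <entity name="tickets" dataSource="ds2"
              query="SELECT id, subject FROM tickets"/>
    </document>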
I'm not sure I follow.
The idea is to have only one document. Do the multiple documents have the
same structure then(different datasources), and if so how are they actually
indexed?
Thanks.
On Thu, Feb 16, 2012 at 4:40 PM, Dmitry Kan wrote:
> I think the problem here is that initially you tryin
Yes, but if I use TikaEntityProcessor, the result of my full-import is:
0
1
0
Indexing failed. Rolled back all changes.
2012/2/16 alessio crisantemi
> Hi all,
> I have a problem to configure a pdf indexing from a directory in my solr
> wit DIH:
>
> with this data-config
>
>
>
>
>
>
Each document in SOLR will correspond to one db record and since both
databases have the same schema, you can't index two records from two
databases into the same SOLR document.
So after indexing, you should have 7k different documents, each of which
holds data from a db record.
Also one problem
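One way to keep the ids from the two databases distinct (a sketch using DIH's
TemplateTransformer; the entity and column names are hypothetical) is to
prefix each id with a per-source marker:

    <entity name="itemDb1" dataSource="ds1" transformer="TemplateTransformer"
            query="SELECT * FROM item">
      <!-- "db1-42" and "db2-42" remain separate documents in the index -->
      <field column="id" template="db1-${itemDb1.ID}"/>
    </entity>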
Hello carlos,
could you show us how your Solr-call looks like?
Regards,
Em
On 16.02.2012 14:34, Carlos Gonzalez-Cadenas wrote:
> Hello all:
>
> We'd like to score the matching documents using a combination of SOLR's IR
> score with another application-specific score that we store within the
>
Really good point on the ids, I completely overlooked that matter.
I will give it a try.
Thanks again.
On Thu, Feb 16, 2012 at 5:00 PM, Dmitry Kan wrote:
> Each document in SOLR will correspond to one db record and since both
> databases have the same schema, you can't index two records from two
no problem, hope it helps, you're welcome.
On Thu, Feb 16, 2012 at 5:03 PM, Radu Toev wrote:
> Really good point on the ids, I completely overlooked that matter.
> I will give it a try.
> Thanks again.
>
> On Thu, Feb 16, 2012 at 5:00 PM, Dmitry Kan wrote:
>
> > Each document in SOLR will corre
Hey everyone,
we're running into some operational problems with our SOLR production
setup here and were wondering if anyone else is affected or has even
solved these problems before. We're running a vanilla SOLR 3.4.0 in
several Tomcat 6 instances, so nothing out of the ordinary, but after
a day o
Hi O.,
PatternReplaceFilter(Factory) uses Matcher.replaceAll() or replaceFirst(), both
of which take in a string that can include any or all groups using the syntax
"$n", where n is the group number. See the Matcher.appendReplacement()
javadocs for an explanation of the functionality and syntax.
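In other words, rather than selecting a group to keep, you phrase the
replacement in terms of groups. An illustrative filter (the pattern and data
are made up):

    <!-- "SKU-12345" becomes "12345": only group 2 survives the replacement -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([A-Z]+)-(\d+)" replacement="$2" replace="all"/>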
Hello Em:
The URL is quite large (w/ shards, ...), maybe it's best if I paste the
relevant parts.
Our "q" parameter is:
"q":"_val_:\"product(query_score,max(query($q8),max(query($q7),max(query($q4),query($q3)\"",
The subqueries q8, q7, q4 and q3 are regular queries, for example:
"q7
If your script turns out too complex to maintain, and you are developing
in Java, anyway, you could extend EntityProcessor and handle the data in
a custom way. I've done that to transform a datamart like data structure
back into a row based one.
Basically you override the method that gets the data
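A minimal skeleton of that approach against the 3.x DIH API (the class name
and the actual pivoting logic are hypothetical):

    import java.util.Map;
    import org.apache.solr.handler.dataimport.EntityProcessorBase;

    // Pivots datamart-style rows back into one row per logical record.
    public class PivotEntityProcessor extends EntityProcessorBase {
        @Override
        public Map<String, Object> nextRow() {
            Map<String, Object> row = getNext(); // next raw row from the data source
            if (row == null) {
                return null; // no more data for this entity
            }
            // ... merge or reshape columns here before the row is indexed ...
            return row;
        }
    }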
Make sure your Tomcat instances are each started with a max heap size such
that, added up, they stay a lot lower than the complete RAM of your
system.
Frequent garbage collection means that your applications request more
RAM but your Java VM has no more resources, so it requires the Garbage
Collector t
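For example (the sizes are purely illustrative), the cap can be set per
instance in Tomcat's bin/setenv.sh:

    # bin/setenv.sh -- cap the heap of this Tomcat/Solr instance
    export CATALINA_OPTS="$CATALINA_OPTS -Xms512m -Xmx2g"

With several instances on one box, the sum of the -Xmx values should leave
room for the OS page cache.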
steve_rowe wrote
>
> Hi O.,
>
> PatternReplaceFilter(Factory) uses Matcher.replaceAll() or replaceFirst(),
> both of which take in a string that can include any or all groups using
> the syntax "$n", where n is the group number. See the
> Matcher.appendReplacement() javadocs for an explanation
On 2/15/2012 11:26 PM, nagarjuna wrote:
Hi all,
I am new to Solr. Can anybody explain delta-import and deltaQuery to me? I
also have the questions below:
1. Is it possible to run delta-import without a deltaQuery?
2. Is it possible to write a deltaQuery without having a last_modifi
Here is the log:
org.apache.solr.handler.dataimport.DataImporter doFullImport
Grave: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' is
a required attribute Processing Document # 1
at
org.apache.solr.handler.dataimport.FileListEntityProcessor.init(FileLis
On 16 February 2012 21:37, alessio crisantemi
wrote:
> here the log:
>
>
> org.apache.solr.handler.dataimport.DataImporter doFullImport
> Grave: Full Import failed
> org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' is
> a required attribute Processing Document # 1
[...]
Th
On 2/3/2012 1:12 PM, Shawn Heisey wrote:
Is the following a reasonable approach to setting a connection timeout
with SolrJ?
queryCore.getHttpClient().getHttpConnectionManager().getParams()
.setConnectionTimeout(15000);
Right now I have all my solr server objects sharing
Yes, I read it, but I don't know the cause.
What's more: I work on Windows, so I configured Tika and Solr manually,
because I don't have Maven...
2012/2/16 Gora Mohanty
> On 16 February 2012 21:37, alessio crisantemi
> wrote:
> > here the log:
> >
> >
> > org.apache.solr.handler.dataimport.Data
There may be issues with your solrconfig. Kindly post the exception that you
are receiving.
There is a good example of how to do a delta update using
"command=full-import&clean=false" on the wiki, here:
http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta
This can be advantageous if you are updating a ton of data at once and do not
want it executing as many queries to the
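A concrete request following that FAQ entry would look roughly like this
(host, port, and handler path are the usual defaults, not confirmed here):

    http://localhost:8983/solr/dataimport?command=full-import&clean=false&commit=true

clean=false keeps the existing documents, so the run only adds or overwrites
whatever the queries return.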
You can enable the spellcheck component and add it to your default request
handler.
This might be of use:
http://wiki.apache.org/solr/SpellCheckComponent
It could be used both during autosuggest as well as did you mean.
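A minimal sketch of that wiring in solrconfig.xml (the component, dictionary
field, and handler names follow the wiki's examples and are assumptions, not
this user's actual setup):

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">spell</str>
        <str name="buildOnCommit">true</str>
      </lst>
    </searchComponent>

    <requestHandler name="/select" class="solr.SearchHandler">
      <arr name="last-components">
        <str>spellcheck</str>
      </arr>
    </requestHandler>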
Hi All,
I am using the edismax SearchHandler in my search, and I have some issues with
the search results. As I understand it, if the "defaultOperator" is set to OR,
the search query will implicitly be passed as -> The OR quick OR brown OR fox.
However, if I search for The quick brown fox, I get fewer result
Hello,
I want to copy all data from a multivalued field, joined together, into a
single-valued field.
Is there any opportunity to do this using Solr standards?
kind regards
On Thu, Feb 16, 2012 at 11:35 AM, flyingeagle-de
wrote:
> Hello,
>
> I want to copy all data from a multivalued field joined together in a single
> valued field.
>
> Is there any opportunity to do this by using solr-standards?
There is not currently, but it certainly makes sense.
Anyone know of
I am attempting to execute a query with the following parameters
q=*:*
distrib=true
facet=true
facet.limit=10
facet.field=manu
f.manu.facet.mincount=1
f.manu.facet.limit=10
f.manu.facet.sort=index
rows=10
When doing this I get the following exception
null java.lang.ArrayIndexOutOfBoundsExceptio
Hello Carlos,
well, you must take into account that you are executing up to 8 queries
per request instead of one query per request.
I am not totally sure about the details of the implementation of the
max-function-query, but I guess it first iterates over the results of
the first max-query, after
Please ignore this; it has nothing to do with the faceting component.
I was able to disable a custom component that I had, and it worked
perfectly fine.
On Thu, Feb 16, 2012 at 12:42 PM, Jamie Johnson wrote:
> I am attempting to execute a query with the following parameters
>
> q=*:*
> distrib=tr
Hi Jamie,
what version of Solr/SolrJ are you using?
Regards,
Em
On 16.02.2012 18:42, Jamie Johnson wrote:
> I am attempting to execute a query with the following parameters
>
> q=*:*
> distrib=true
> facet=true
> facet.limit=10
> facet.field=manu
> f.manu.facet.mincount=1
> f.manu.facet.limit
Hi Jamie,
nice to hear.
Maybe you can share what kind of bug you ran into, so that other
developers with similarly buggy components can benefit from your
experience. :)
Regards,
Em
On 16.02.2012 19:23, Jamie Johnson wrote:
> please ignore this, it has nothing to do with the faceting component.
>
Chantal,
if you prefer Java, here is http://wiki.apache.org/solr/DIHCustomTransform
On Thu, Feb 16, 2012 at 7:24 PM, Chantal Ackermann <
chantal.ackerm...@btelligent.de> wrote:
> If your script turns out too complex to maintain, and you are developing
> in Java, anyway, you could extend EntityP
Hello Em:
Thanks for your answer.
Yes, we initially also thought that the excessive increase in response time
was caused by the several queries being executed, and we did another test.
We executed one of the subqueries that I've shown to you directly in the
"q" parameter and then we tested this s
Hello Carlos,
> We have some more tests on that matter: now we're moving from issuing this
> large query through the SOLR interface to creating our own
QueryParser. The
> initial tests we've done in our QParser (that internally creates multiple
> queries and inserts them inside a DisjunctionMaxQue
Hello Em:
1) Here's a printout of an example DisMax query (as you can see, mostly MUST
terms, except for some SHOULD terms used for boosting scores for stopwords):
*((+stopword_shortened_phrase:hoteles +stopword_shortened_phrase:barcelona
stopword_shortened_phrase:en) | (+stopword_phrase:hoteles
: > I want to copy all data from a multivalued field joined together in a single
: > valued field.
: >
: > Is there any opportunity to do this by using solr-standards?
:
: There is not currently, but it certainly makes sense.
Part of it has just recently been committed to trunk, actually...
https
https://issues.apache.org/jira/browse/SOLR-3138
On Feb 9, 2012, at 4:16 PM, Jamie Johnson wrote:
> per SOLR-2765 we can add roles to specific cores such that it's
> possible to give custom roles to solr instances, is it possible to
> specify this when adding a core through curl
> 'http://host:por
On Thu, Feb 16, 2012 at 3:37 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Everybody start from daily bounce, but end up with UPDATED_AT column and
> delta updates , just consider urgent content fix usecase. Don't think it's
> worth to rely on daily bounce as a cornerstone of archite
Still digging ;) Once I figure it out I'll be happy to share.
On Thu, Feb 16, 2012 at 1:32 PM, Em wrote:
> Hi Jamie,
>
> nice to hear.
> Maybe you can share in what kind of bug you ran, so that other
> developers with similar bugish components can benefit from your
> experience. :)
>
> Regards,
I have the same problem; it seems that there is a bug in the SolrZkServer
class (parseProperties method) that doesn't work well when you have an external
zookeeper ensemble.
Thanks,
arin
On Thu, Feb 16, 2012 at 3:03 PM, Alexey Verkhovsky
wrote:
>> 5. All Solr caching is switched off.
>
>> But why?
>>
>
> Because (a) I shouldn't need to cache documents, if they are all in memory
> anyway;
You're making many assumptions about how Solr works internally.
One example of many:
Solr
: We'd like to score the matching documents using a combination of SOLR's IR
: score with another application-specific score that we store within the
: documents themselves (i.e. a float field containing the app-specific
: score). In particular, we'd like to calculate the final score doing some
:
The issue appears to be that I put an empty array into the doc scores
instead of null in DocSlice. DocSlice then just checks if scores is
null when hasScore is called which caused a further issue down the
line. I'll follow up with anything else that I find along the way.
On Thu, Feb 16, 2012 at
On Thu, Feb 16, 2012 at 8:34 AM, Carlos Gonzalez-Cadenas
wrote:
> Hello all:
>
> We'd like to score the matching documents using a combination of SOLR's IR
> score with another application-specific score that we store within the
> documents themselves (i.e. a float field containing the app-specifi
On Thu, Feb 16, 2012 at 1:32 PM, Yonik Seeley wrote:
> You're making many assumptions about how Solr works internally.
>
True that. If this spike turns into a project, digging through the source
code will come. Meantime, we have to start somewhere, and the default
configuration may not be the gr
On Thu, Feb 16, 2012 at 4:06 PM, Alexey Verkhovsky
wrote:
> ly need ids, scores and total number of results out of Solr. Presentation of
> selected entities will have to include some write-heavy data (from RDBMS
> and/or memcached), therefore won't be Solr's business anyway.
It depends on if you'
On Feb 16, 2012, at 2:53 PM, arin g wrote:
> i have the same problem, it seems that there is a bug in SolrZkServer class
> (parseProperties method), that doesn't work well when you have an external
> zookeeper ensemble.
>
This issue was around using an embedded ensemble - an external ensemble m
Hello Carlos,
I think we misunderstood each other.
As an example:
BooleanQuery (
  clauses: (
    MustMatch(
      DisjunctionMaxQuery(
        TermQuery("stopword_field", "barcelona"),
        TermQuery("stopword_field", "hoteles")
      )
    ),
    Sh
Hi all,
I was loading a big (60 million docs) CSV into Solr 4 when something odd
happened.
I got a Solr error in the log saying that it could not write the file.
du -s indicated I had used 30GB of the 50GB available, but df -k indicated
that the disk was 100% used.
du and df giving different results c
I just modified some TestCases a little bit to see how the FunctionQuery
behaves.
Given that you have an index containing 14 docs, where 13 of them contain
the term "batman" and two contain the term "superman", a
search for
q=+text:superman _val_:"query($qq)"&qq=text:superman
Leads to two hits
On Thu, Feb 16, 2012 at 5:56 PM, Paulo Magalhaes
wrote:
> I was loading a big (60 million docs) csv in solr 4 when something odd
> happened.
> I got a solr error in the log saying that it could not write the file.
> du -s indicated I had used 30Gb of a 50Gb available but df -k indicated
> that th
I'm not sure that timeout will help you here - I believe it's the timeout on
'creating' the connection.
Try setting the socket timeout (setSoTimeout) - that should let you try
sooner.
It looks like perhaps the server is timing out and closing the connection.
I guess all you can do is timeout reas
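Mirroring the HttpClient 3.x style snippet from earlier in the thread, both
timeouts would be set like this (a sketch; 15 seconds is just the value used
above):

    // connection timeout: limit on establishing the TCP connection
    queryCore.getHttpClient().getHttpConnectionManager().getParams()
            .setConnectionTimeout(15000);
    // socket timeout: limit on waiting for data on an established connection
    queryCore.getHttpClient().getHttpConnectionManager().getParams()
            .setSoTimeout(15000);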
The delta instructions from
https://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command
work for me in Solr 1.4 but crash in 3.5.0 (error: "deltaQuery has no
column to resolve to declared primary key pk='ITEM_ID, CATEGORY_ID'" issue:
https://issues.apache.org/jira/browse/SOLR-2907
On 2/16/2012 6:28 PM, Mark Miller wrote:
I'm not sure that timeout will help you here - I believe it's the timeout on
'creating' the connection.
Try setting the socket timeout (setSoTimeout) - that should let you try
sooner.
It looks like perhaps the server is timing out and closing the connecti
Hi,
I'm looking for a way to sort results by the number of matching terms.
Being able to sort by the coord() value or by the overlap value that gets
passed into the coord() function would do the trick. Is there a way I can
expose those values to the sort function?
I'd appreciate any help that poi
On 2/16/2012 6:31 PM, AdamLane wrote:
The delta instructions from
https://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command
works for me in solr 1.4 but crashes in 3.5.0 (error: "deltaQuery has no
column to resolve to declared primary key pk='ITEM_ID, CATEGORY_ID'" issue:
https:/
You can fool the Lucene scoring function: override each function, such as idf,
queryNorm, and lengthNorm, and let them simply return 1.0f.
I don't know whether Lucene 4 will expose more details, but for 2.x/3.x Lucene
can only score with the vector space model, and the formula can't be replaced
by users.
On Fri, Feb 17, 2012
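A sketch of that idea against the Lucene 3.x API (in 3.2+ the length norm is
produced by computeNorm rather than lengthNorm; registering the class takes a
<similarity class="..."/> element in schema.xml):

    import org.apache.lucene.index.FieldInvertState;
    import org.apache.lucene.search.DefaultSimilarity;

    // Neutralizes idf, queryNorm and the length norm so they contribute
    // nothing, leaving term frequency as the dominant scoring signal.
    public class FlatSimilarity extends DefaultSimilarity {
        @Override
        public float idf(int docFreq, int numDocs) {
            return 1.0f;
        }

        @Override
        public float queryNorm(float sumOfSquaredWeights) {
            return 1.0f;
        }

        @Override
        public float computeNorm(String field, FieldInvertState state) {
            return state.getBoost(); // drop length normalization, keep boosts
        }
    }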
Here’s my use case. I expect to set up a Solr index that is approximately
1.4GB (this is a real number from the proof-of-concept using the real data,
which consists of about 10 million documents, many of significant size, and
making use of the FastVectorHighlighter to do highlighting on the body te
A couple of thoughts:
We wound up doing a bunch of tuning on the Java garbage collection.
However, the pattern we were seeing was periodic very extreme slowdowns,
because we were then using the default garbage collector, which blocks
when it has to do a major collection. This doesn't sound like yo
Yup - deletes are fine.
On Thu, Feb 16, 2012 at 8:56 PM, Jamie Johnson wrote:
> With solr-2358 being committed to trunk do deletes and updates get
> distributed/routed like adds do? Also when a down shard comes back up are
> the deletes/updates forwarded as well? Reading the jira I believe the
I want to leave the score intact so I can sort by matching term frequency
and then by score. I don't think I can do that if I modify all the
similarity functions, but I think your solution would have worked otherwise.
It would be great if there was a way I could expose this information
through a f
So suppose I have a multivalued field for categories. Let's say we have 3
items with these categories:
Item 1: category ids [1,2,5,7,9]
Item 2: category ids [4,8,9]
Item 3: category ids [1,4,9]
I now run a filter query for any of the following category ids [1,4,9]. I
should get all of them back a
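Assuming the multivalued field is named category_ids (a hypothetical name),
that filter query would look like:

    q=*:*&fq=category_ids:(1 OR 4 OR 9)

A multivalued field matches when any one of its values matches a clause, and a
document is returned at most once no matter how many of its values match, so
all three items come back exactly once.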
If I want to write a complex UpdateRequestHandler should I do it on
trunk or the 3.x branch? The criteria are a stable, debugged,
full-featured environment.
--
Lance Norskog
goks...@gmail.com
Thanks Em!
What if we use a threshold value in the suggest configuration, like 0.005?
I assume the dictionary size will then be smaller than the total number of
distinct terms; is there any way to determine what that size is?
Thanks,
Mike
On Wednesday, February 15, 2012 at 4:39 PM, Em
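For reference, that threshold goes into the suggester's spellchecker block in
solrconfig.xml (the component and field names here are assumptions following
the wiki's Suggester examples):

    <searchComponent name="suggest" class="solr.SpellCheckComponent">
      <lst name="spellchecker">
        <str name="name">suggest</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="field">suggest_field</str>
        <!-- a term must occur in at least 0.5% of documents to enter the dictionary -->
        <float name="threshold">0.005</float>
      </lst>
    </searchComponent>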
> One thing that could fit the pattern you describe would be Solr caches
> filling up and getting you too close to your JVM or memory limit
This [uncommitted] issue would solve that problem by allowing the GC
to collect caches that become too large, though in practice, the cache
setting would need