Replication not triggered

2015-04-27 Thread Michael Lackhoff
We have old-fashioned replication configured between one master and one slave. Everything used to work but today I noticed that recent records were not present in the slave (same query gives hits on master but none on slave). The replication communication seems to work. This is what I get in the log

Re: variation on boosting recent documents gives exception

2015-02-13 Thread Michael Lackhoff
On 13.02.2015 at 11:18, Gonzalo Rodriguez wrote: > You can always change the type of your sortyear field to an int, or create an > int version of it and use copyField to populate it. But that would require me to reindex. Would be nice to have some type conversion available within a function que

variation on boosting recent documents gives exception

2015-02-12 Thread Michael Lackhoff
Since my field to measure recency is not a date field but a string field (with only year-numbers in it), I tried a variation on the suggested boost function for recent documents: recip(sub(2015,min(sortyear,2015)),1,10,10) But this gives an exception when used in a boost or bf parameter. I guess
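To make the boost formula above concrete: Solr's recip(x,m,a,b) function evaluates a/(m*x+b). The following is a minimal Python sketch of the arithmetic only, not of Solr itself; it assumes the year is available as an integer (the thread's sortyear is a string field, which is what causes the trouble), and the helper names are hypothetical.

```python
def recip(x, m, a, b):
    """Same arithmetic as Solr's recip(x,m,a,b): a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(sortyear, current_year=2015):
    """Mirror recip(sub(2015,min(sortyear,2015)),1,10,10).

    Assumes sortyear is already an int; years in the future are
    clamped to current_year by min(), so they get the maximum boost.
    """
    age = current_year - min(sortyear, current_year)
    return recip(age, 1, 10, 10)


if __name__ == "__main__":
    for year in (2015, 2010, 2005):
        print(year, round(recency_boost(year), 3))
```

With these constants a document from the current year gets a boost of 1.0, and the boost halves once the document is ten years old.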

Re: pf doesn't work like normal phrase query

2015-01-11 Thread Michael Lackhoff
Thanks everyone for all the advice! To sum up, there seems to be no easy solution. I only have the option to either - make things really complicated - only help some users/query structures - accept the status quo What could help is an analogue to field aliases: If it was possible to say f.title.pf

Re: pf doesn't work like normal phrase query

2015-01-11 Thread Michael Lackhoff
On 11.01.2015 at 18:30, Jack Krupansky wrote: > It's still not quite clear to me what your specific goal is. From your > vague description it seems somewhat different from the blog post that you > originally cited. So, let's try one more time... explain in plain English > what use case you are tr

Re: pf doesn't work like normal phrase query

2015-01-11 Thread Michael Lackhoff
Hi Ahmet, > You might find this useful : > https://lucidworks.com/blog/whats-a-dismax/ I have a basic understanding but will do further reading... > Regarding your example : title:foo AND author:miller AND year:[2010 TO *] > last two clauses better served as a filter query. > > http://wiki.apa

Re: pf doesn't work like normal phrase query

2015-01-11 Thread Michael Lackhoff
On 11.01.2015 at 14:19, Michael Lackhoff wrote: > Or put another way: How can I do this boost in more complex queries like: > title:foo AND author:miller AND year:[2010 TO *] > It would be nice to have a title "foo" before another title "some foo > and bar" (giv

Re: pf doesn't work like normal phrase query

2015-01-11 Thread Michael Lackhoff
On 11.01.2015 at 14:01, Ahmet Arslan wrote: > What happens when you do not use fielded query? > > q=anatomie&qf=title_exact > instead of > > q=title_exact:"anatomie" Then it works (with qf=title): +(title:anatomie) (title_exact:" anatomie "^20.0) Only problem is that my frontend alway

pf doesn't work like normal phrase query

2015-01-11 Thread Michael Lackhoff
My aim is to boost "exactish" matches similar to the recipe described in [1]. The anchoring works in q but not in pf, where I need it. Here is an example that shows the effect: q=title_exact:"anatomie"&pf=title_exact^2000 debugQuery says it is interpreted this way: +title_exact:" anatomie "

Re: Solution for reverse order of year facets?

2014-03-04 Thread Michael Lackhoff
Hi Ahmet, I forgot to include what I did for one customer: 1) Using StatsComponent I get min and max values of the field (year) 2) Calculate "smart gap/range values" according to minimum and maximum. 3) Re-issue the same query (for the second time) that includes a set of facet.query. It's
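Step 2 of the recipe above (turning a minimum and maximum year into a set of range queries) could look roughly like the following Python sketch. The function name, the fixed gap, and the field name "year" are assumptions for illustration; the message does not say how the "smart" gap was actually chosen.

```python
def year_range_queries(min_year, max_year, gap, field="year"):
    """Build facet.query values covering min_year..max_year.

    Ranges are produced newest first, each spanning `gap` years,
    with the oldest range shortened to stop at min_year.
    """
    queries = []
    upper = max_year
    while upper >= min_year:
        lower = max(min_year, upper - gap + 1)
        queries.append(f"{field}:[{lower} TO {upper}]")
        upper = lower - 1
    return queries


if __name__ == "__main__":
    # min and max would come from the StatsComponent response.
    for q in year_range_queries(2005, 2014, 3):
        print("facet.query=" + q)
```

Because the list is generated newest first, the client can display the facet queries in the order they were sent and get the most recent years at the top.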

Re: Solution for reverse order of year facets?

2014-03-03 Thread Michael Lackhoff
On 03.03.2014 19:58 Shawn Heisey wrote: > There's already an issue in Jira. > > https://issues.apache.org/jira/browse/SOLR-1672 Thanks, this is of course the best solution. Only problem is that I use a custom version from a vendor (based on version 4.3) I want to enhance. But perhaps they apply t

Re: Solution for reverse order of year facets?

2014-03-03 Thread Michael Lackhoff
Hi Ahmet, > There is no built-in solution for this. Yes, I know, that's why I would like the TokenFilterFactory > Two workarounds: > > 1) use facet.limit=-1 and invert the list (faceting response) at client side > > 2) use multiple facet.query > a) facet.query=year:[2012 TO 2014]&facet.quer
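The first workaround quoted above (fetch all facet values with facet.limit=-1 and invert the list on the client) can be sketched in Python as follows. It assumes the facet values arrive as a flat list alternating value and count, and that every value parses as an integer year; the function name is hypothetical.

```python
def years_descending(flat_facets, limit=None):
    """Turn [value, count, value, count, ...] into (value, count)
    pairs sorted by year, newest first, optionally truncated."""
    pairs = list(zip(flat_facets[0::2], flat_facets[1::2]))
    # Sort numerically so the order does not depend on how the
    # server sorted the constraints.
    pairs.sort(key=lambda pair: int(pair[0]), reverse=True)
    return pairs if limit is None else pairs[:limit]


if __name__ == "__main__":
    response = ["2012", 5, "2013", 7, "2014", 3]
    print(years_descending(response, limit=2))
```

The cost of this approach is that every facet value has to be transferred before the client can cut the list down, which is the drawback raised in the thread.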

Re: Solution for reverse order of year facets?

2014-03-03 Thread Michael Lackhoff
On 03.03.2014 16:33 Ahmet Arslan wrote: > Currently there are two sorting criteria available. However sort by index - > to return the constraints sorted in their index order (lexicographic by > indexed term) - should return most recent year at top, no? No, it returns them -- as you say -- in le

Solution for reverse order of year facets?

2014-03-03 Thread Michael Lackhoff
If I understand the docs right, it is only possible to sort facets by count or value in ascending order. Both variants are not very helpful for year facets if I want the most recent years at the top (or appear at all if I restrict the number of facet entries). It looks like a requirement that was

Re: SOLR 3.3.0 multivalued field sort problem

2011-08-13 Thread Michael Lackhoff
On 13.08.2011 21:28 Erick Erickson wrote: > Fair enough, but what's "first value in the list"? > There's nothing special about "multiValued" fields, > that is where the schema has "multiValued=true". > under the covers, this is no different than just > concatenating all the values together and put

Re: SOLR 3.3.0 multivalued field sort problem

2011-08-13 Thread Michael Lackhoff
On 13.08.2011 20:31 Martijn v Groningen wrote: > The first solution would make sense to me. Some kind of a strategy > mechanism > for this would allow anyone to define their own rules. Duplicating results > would be confusing to me. That is why I would only activate it on request (setting a speci

Re: SOLR 3.3.0 multivalued field sort problem

2011-08-13 Thread Michael Lackhoff
On 13.08.2011 18:03 Erick Erickson wrote: > The problem I've always had is that I don't quite know what > "sorting on multivalued fields" means. If your field had tokens > a and z, would sorting on that field put the doc > at the beginning or end of the list? Sure, you can define > rules (

Re: problem in setting field attribute in schema.xml

2011-05-26 Thread Michael Lackhoff
On 26.05.2011 at 14:10, Romi wrote: did you mean when i set indexed="false" and stored="true", solr does not index the field's value but store its value as it is??? I don't know if you are asking me since you do not quote anything but yes of course this is exactly the purpose of "indexed" and "sto

Re: problem in setting field attribute in schema.xml

2011-05-26 Thread Michael Lackhoff
On 26.05.2011 at 12:52, Romi wrote: i have done it, i deleted old indexes and created new indexes but still able to search it through *:*, and no result when i search it as field:value. really surprising result. :-O I really don't understand your problem. This is not at all surprising but the

Re: problem in setting field attribute in schema.xml

2011-05-25 Thread Michael Lackhoff
On 25.05.2011 at 15:47, Vignesh Raj wrote: It's very strange. Even I tried the same now and am getting the same result. I have set both indexed=false and stored=false. But still if I search for a keyword using my default search, I get the results in these fields as well. But if I specify field:val

Re: Is semicolon a character that needs escaping?

2010-09-07 Thread Michael Lackhoff
On 08.09.2010 00:05 Chris Hostetter wrote: > > : Subject: Is semicolon a character that needs escaping? > ... > : >From this I conclude that there is a bug either in the docs or in the > : query parser or I missed something. What is wrong here? > > Back in Solr 1.1, the standard query pars

Re: Is semicolon a character that needs escaping?

2010-09-02 Thread Michael Lackhoff
Hi Ken, >>> But in general escaping characters in a query gets tricky - if you >>> can >>> directly build queries versus pre-processing text sent to the query >>> parser, you'll save yourself some pain and suffering. >> >> What do you mean by these two alternatives? That is, what exactly >> co

Re: Is semicolon a character that needs escaping?

2010-09-02 Thread Michael Lackhoff
On 03.09.2010 00:57 Ken Krugler wrote: > The docs need to be updated, I believe. From some code I wrote back in > 2006... > [...] Thanks this explains it very well. > But in general escaping characters in a query gets tricky - if you can > directly build queries versus pre-processing text se

Is semicolon a character that needs escaping?

2010-09-02 Thread Michael Lackhoff
According to http://lucene.apache.org/java/2_9_1/queryparsersyntax.html only these characters need escaping: + - && || ! ( ) { } [ ] ^ " ~ * ? : \ but with this simple query: TI:stroke; AND TI:journal I got the error message: HTTP ERROR: 400 Unknown sort order: TI:journal My first guess was that i
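For illustration, a client could escape the characters listed above before handing a term to the query parser. This Python sketch is an assumption-laden helper, not part of Solr or Lucene: it backslash-escapes the documented characters and additionally the semicolon that triggered the "Unknown sort order" error; whether a backslash protects the semicolon in this particular Solr version is not confirmed by the message.

```python
import re

# Single characters from the documented list, plus ';' (an addition
# motivated by the error above), and the two-character operators.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\;]|&&|\|\|)')


def escape_query_term(term):
    """Prefix every special character or operator with a backslash."""
    return _SPECIAL.sub(r"\\\1", term)


if __name__ == "__main__":
    print("TI:" + escape_query_term("stroke;") + " AND TI:journal")
```

The helper is meant for individual terms; applying it to a whole query would also escape the colon between field name and value.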

Re: Very basic questions: Indexing text

2010-06-28 Thread Michael Lackhoff
On 28.06.2010 23:00 Ahmet Arslan wrote: >> 1) I can get my docs in the index, but when I search, it >> returns the entire document. I'd love to have it only >> return the line (or two) around the search term. > > Solr can generate Google-like snippets as you describe. > http://wiki.apache.org/s

Re: exceptionhandling & error-reporting?

2010-04-06 Thread Michael Lackhoff
On 06.04.2010 17:49 Alexander Rothenberg wrote: > On Monday 05 April 2010 20:14:44 Chris Hostetter wrote: >> define "crashes" ? ... presumably you are talking about the client crashing >> because it can't parse the error response, correct? ... the best suggestion >> given the current state of Solr is

Re: Confused by Solr Ranking

2010-03-09 Thread Michael Lackhoff
On 09.03.2010 16:01 Ahmet Arslan wrote: > >> I kind of suspected stemming to be the reason behind this. >> But I consider stemming to be a good feature. > > This is the side effect of stemming. Stemming increases recall while harming > precision. But most people want the best possible combinat

Re: schema-based Index-time field boosting

2009-11-23 Thread Michael Lackhoff
On 23.11.2009 19:33 Chris Hostetter wrote: > ...if there was a way to boost fields at index time that was configured in > the schema.xml, then every doc would get that boost on its instances of > those fields but the only purpose of index time boosting is to indicate > that one document is more

Re: How to import multiple RSS-feeds with DIH

2009-11-09 Thread Michael Lackhoff
On 09.11.2009 09:46 Noble Paul നോബിള്‍ नोब्ळ् wrote: > When you say , the second example does not work , what does it mean? > some exception?(if yes, please post the stacktrace) Very mysterious. Now it works but I am sure I got an exception before. All I remember is something like "java.io.IOExce

How to import multiple RSS-feeds with DIH

2009-11-08 Thread Michael Lackhoff
[A new thread for this particular problem] On 09.11.2009 08:44 Noble Paul നോബിള്‍ नोब्ळ् wrote: > The tried and tested strategy is to post the question in this mailing > list w/ your data-config.xml. See my data-config.xml below. The first is the usual slashdot example with my 'id' addition, the

Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 09.11.2009 08:20 Noble Paul നോബിള്‍ नोब्ळ् wrote: > It just started of as a single page and the features just got piled up > and the page just bigger. we are thinking of cutting it down to > smaller more manageable pages Oh, I like it the way it is as one page, so that the browser full text s

Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 09.11.2009 06:54 Erik Hatcher wrote: > The brackets probably come from it being transformed as an array. Try > saying multiValued="false" on your specifications. Indeed. Thanks Erik that was it. My first steps with DIH showed me what a powerful tool this is but although the DIH wiki page

Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 08.11.2009 16:56 Michael Lackhoff wrote: > What didn't work but looks like the potentially best solution is to fill > the id in my data-config by using the link twice: > > > This would be a definition just for this single data source but I don't > get any docs

Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 08.11.2009 17:03 Lucas F. A. Teixeira wrote: > You have an example on using mail dih in solr distro Don't know where my eyes were. Thanks! When I was at it I looked at the schema.xml for the rss example and it uses "link" as UniqueKey, which is of course good, if you only have rss items but n

Getting started with DIH

2009-11-08 Thread Michael Lackhoff
I would like to start using DIH to index some RSS-Feeds and mail folders. To get started I tried the RSS example from the wiki but as it is Solr complains about the missing id field. After some experimenting I found out two ways to fill the id: - in schema.xml This works but isn't very flexible.

Re: Preparing the ground for a real multilang index

2009-07-07 Thread Michael Lackhoff
On 08.07.2009 00:50 Jan Høydahl wrote: > itself and do not need to know the query language. You may then want > to do a copyfield from all your text_ -> text for convenient one- > field-to-rule-them-all search. Would that really help? As I understand it, copyfield takes the raw, not yet analyz

Re: Preparing the ground for a real multilang index

2009-07-02 Thread Michael Lackhoff
On 03.07.2009 00:49 Paul Libbrecht wrote: [I'll try to address the other responses as well] > I believe the proper way is for the server to compute a list of > accepted languages in order of preferences. > The web-platform language (e.g. the user-setting), and the values in > the Accept-Langu

Preparing the ground for a real multilang index

2009-07-02 Thread Michael Lackhoff
As pointed out in the recent thread about stemmers and other language specifics I should handle them all in their own right. But how? The first problem is how to know the language. Sometimes I have a language identifier within the record, sometimes I have more than one, sometimes I have none. How

Re: EnglishPorterFilterFactory and PatternReplaceFilterFactory

2009-07-02 Thread Michael Lackhoff
On 02.07.2009 17:28 Erick Erickson wrote: > I'm shooting a bit in the dark here, but I'd guess that these are > actually understandable results. Perhaps not too much in the dark > That is your implicit assumption, it seems to me, is that'wärme' and > 'waerme' should go through the stemmer and >

Re: EnglishPorterFilterFactory and PatternReplaceFilterFactory

2009-07-02 Thread Michael Lackhoff
On 02.07.2009 16:34 Walter Underwood wrote: > First, don't use an English stemmer on German text. It will give some odd > results. I know but at the moment I only have the choice between no stemmer at all and one stemmer and since more than half of the records are English (about 60% English, 30%

EnglishPorterFilterFactory and PatternReplaceFilterFactory

2009-07-02 Thread Michael Lackhoff
In Germany we have a strange habit of seeing some sort of equivalence between Umlaut letters and a two letter representation. Example 'ä' and 'ae' are expected to give the same search results. To achieve this I added this filter to the "text" fieldtype definition: to both index and query
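The equivalence described above can be illustrated outside of Solr with a small Python sketch. It is not the filter definition from the message (which is not preserved in this listing); the mapping for 'ö', 'ü' and the upper-case letters is an assumption extrapolated from the 'ä'/'ae' example, and the function name is hypothetical.

```python
# Map each umlaut letter to its two-letter representation.
UMLAUT_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
})


def fold_umlauts(text):
    """Rewrite umlaut letters so both spellings compare equal."""
    return text.translate(UMLAUT_MAP)


if __name__ == "__main__":
    print(fold_umlauts("wärme") == fold_umlauts("waerme"))
```

Applying the same folding to both the indexed text and the query is what makes 'wärme' and 'waerme' end up as the same term before any stemmer sees them.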

Re: Moving from single core to multicore

2009-02-10 Thread Michael Lackhoff
On 10.02.2009 02:39 Chris Hostetter wrote: > : Now all that is left is a more cosmetic change I would like to make: > : I tried to place the solr.xml in the example dir to get rid of the > : "-Dsolr.solr.home=multicore" for the start and changed the first entry > : from "core0" to "solr" and moved

Re: Moving from single core to multicore

2009-02-09 Thread Michael Lackhoff
On 09.02.2009 17:01 Ryan McKinley wrote: > Check your solrconfig.xml you probably have something like this: > > >${solr.data.dir:./solr/data} > (from the example) > > either remove that or make each one point to the correct location Thanks, that's it! Now all that is left is a more co

Re: Moving from single core to multicore

2009-02-09 Thread Michael Lackhoff
On 09.02.2009 15:40 Ryan McKinley wrote: >> But I have some problems setting this up. As long as I try the >> multicore >> sample everything works but when I copy my schema.xml into the >> multicore/core0/conf dir I only get 404 error messages when I enter >> the >> admin url. > > what is the

Moving from single core to multicore

2009-02-09 Thread Michael Lackhoff
Hello, I am not that experienced but managed to get a Solr index going by copying the "example" dir from the distribution (1.3 released version) and changing the fields in schema.xml to my needs. As I said everything is working very well so far. Now I need a second index on the same machine and th

Re: date range query performance

2008-10-31 Thread Michael Lackhoff
On 01.11.2008 06:10 Erik Hatcher wrote: > Yeah, this should work fine: > > default="NOW/DAY" multiValued="false"/> Wow, that was fast, thanks! -Michael

Re: date range query performance

2008-10-31 Thread Michael Lackhoff
On 31.10.2008 19:16 Chris Hostetter wrote: > for the record, you don't need to index as a "StrField" to get this > benefit, you can still index using DateField you just need to round your > dates to some less granular level .. if you always want to round down, you > don't even need to do the rou

Re: Searching for future or "null" dates

2008-09-25 Thread Michael Lackhoff
On 26.09.2008 06:17 Chris Hostetter wrote: > that's true, regrettably there is no prefix operator to indicate a "SHOULD" > clause in the Lucene query language, so if you set the default op to "AND" > you can't then override it on individual clauses. > > this is one of the reasons i never make th

Re: Searching for future or "null" dates

2008-09-23 Thread Michael Lackhoff
On 23.09.2008 00:30 Chris Hostetter wrote: > : Here is what I was able to get working with your help. > : > : (productId:(102685804)) AND liveDate:[* TO NOW] AND ((endDate:[NOW TO *]) OR > : ((*:* -endDate:[* TO *]))) > : > : the *:* is what I was missing. > > Please, PLEASE ... do yourself a f

Re: wildcard newbie question

2008-01-30 Thread Michael Lackhoff
On 31.01.2008 00:31 Alessandro Senserini wrote: I have a text field type called courseTitle and it contains Struts 2 If I search courseTitle:strut* I get the documents but if I search with courseTitle:struts* I do not get any results. Could you please explain why? Just a guess: It might b

Re: Out of heap space with simple updates

2008-01-23 Thread Michael Lackhoff
On 23.01.2008 20:57 Chris Harris wrote: I'm using java -Xms512M -Xmx1500M -jar start.jar Thanks! I did see the -X... params in recent threads but didn't know where to place them -- not being a java guy at all ;-) -Michael

Out of heap space with simple updates

2008-01-23 Thread Michael Lackhoff
I wanted to try to do the daily update with XML updates (was mentioned recently as the recommended way) but got an "OutOfMemoryError: Java heap space" after 319000 records. I am sending one document at a time through the http update interface, so every request should be short enough to not run o

Re: Some sort of join in SOLR?

2008-01-17 Thread Michael Lackhoff
On 17.01.2008 23:48 Chris Hostetter wrote: assuming these are simple delimited files, something like the unix "join" command can do this for you ... then your indexing code can just process one file linearly. (if they aren't simple delimited files, you can preprocess them to strip out the exc

Re: Some sort of join in SOLR?

2008-01-17 Thread Michael Lackhoff
On 17.01.2008 18:32 Erick Erickson wrote: There's some cost here, and I don't know how this all plays with the sizes of your indexes. It may be totally impractical. Anyway, back to work. I think I will have to play with the different possibilities and see what fits best to my situation. Ther

Re: Some sort of join in SOLR?

2008-01-17 Thread Michael Lackhoff
On 17.01.2008 16:53 Erick Erickson wrote: I would *strongly* encourage you to store them together as one document. There's no real method of doing DB like joins in the underlying Lucene search engine. Thanks, that was also my preference. But that's generic advice. The question I have for you

Some sort of join in SOLR?

2008-01-16 Thread Michael Lackhoff
Hello, I have two sources of data for the same "things" to search. It is book data in a library. First there is the usual bibliographic data (author, title...) and then I have scanned and OCRed table of contents data about the same books. Both are updated independently. Now I don't know how t

Re: Another text I cannot get into SOLR with csv

2008-01-08 Thread Michael Lackhoff
On 08.01.2008 19:09 Yonik Seeley wrote: There is no shorter way, but if you update to the latest solr-dev (changes I checked in today), the default will be no encapsulation for split fields. Many thanks, also for your patience! Do you think the dev-version is ready for production? -Michael

Re: Another text I cannot get into SOLR with csv

2008-01-08 Thread Michael Lackhoff
On 08.01.2008 16:55 Yonik Seeley wrote: - A literal encapsulator should be possible to add by doubling it ' => '' but this gives the same error I think you would have to triple it (the first is the encapsulator). Regardless, don't use encapsulation on the split fields unless you have to.

Re: Another text I cannot get into SOLR with csv

2008-01-08 Thread Michael Lackhoff
On 08.01.2008 16:11 Yonik Seeley wrote: Ahh, wait, it looks like a single quote is the encapsulator for split field values by default. Try adding f.PUBLPLACE.encapsulator=%00 to disable the encapsulation. Hmm. Yes, this works but: - I didn't find anything about it in the docs (wiki). On the contrar

Re: Another text I cannot get into SOLR with csv

2008-01-08 Thread Michael Lackhoff
After a long weekend I could do a deeper look into this one and it looks as if the problem has to do with splitting. This one works for me fine. $ cat t2.csv id,name 12345,"'s-Gravenhage" 12345,'s-Gravenhage 12345,"""s-Gravenhage" $ curl http://localhost:8983/solr/update/csv?commit=true --dat

Re: correct escapes in csv-Update files

2008-01-04 Thread Michael Lackhoff
On 04.01.2008 17:35 Walter Underwood wrote: > I recommend the opencsv library for Java or the csv package for Python. > Either one can write legal CSV files. > > There are lots of corner cases in CSV and some differences between > applications, like whether newlines are allowed inside a quoted fi
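As a small illustration of the recommendation above, Python's standard csv module doubles an embedded double quote and only quotes fields that need it. The helper name and the sample rows are made up for this sketch; it says nothing about how Solr's CSV loader treats split fields, which is the separate issue discussed in the neighbouring threads.

```python
import csv
import io


def to_csv_line(fields):
    """Serialize one row with the csv module's default dialect
    (comma delimiter, double-quote encapsulator, minimal quoting)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv_line(["12345", 'This is text with a "quoted" string']), end="")
    print(to_csv_line(["12345", "'s-Gravenhage"]), end="")
```

A field that merely starts with an apostrophe is written without encapsulation, because the apostrophe is not special in the default dialect.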

Re: Another text I cannot get into SOLR with csv

2008-01-04 Thread Michael Lackhoff
On 04.01.2008 16:55 Yonik Seeley wrote: > On Jan 4, 2008 10:25 AM, Michael Lackhoff <[EMAIL PROTECTED]> wrote: >> If the fields value is: >> 's-Gravenhage >> I cannot get it into SOLR with CSV. > > This one works for me fine. > > $ cat t2.csv

Another text I cannot get into SOLR with csv

2008-01-04 Thread Michael Lackhoff
If the fields value is: 's-Gravenhage I cannot get it into SOLR with CSV. I tried to double the single quote/apostrophe or escape it in several ways but I either get an error or another character (the "escape") in front of the single quote. Is it not possible to have a field that begins with an apo

Re: correct escapes in csv-Update files

2008-01-04 Thread Michael Lackhoff
On 03.01.2008 17:16 Yonik Seeley wrote: > CSV doesn't use backslash escaping. > http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm > > "This is text with a ""quoted"" string" Thanks for the hint but the result is the same, that is, ""quoted"" behaves exactly like \"quoted\": - both leave the s

correct escapes in csv-Update files

2008-01-02 Thread Michael Lackhoff
I use UpdateCSV to feed my data into SOLR and it works very well. The only thing I don't understand is how to properly escape the encapsulator and the backslash. An example with the default encapsulator ("): "This is a text with a \"quote\"" "This gives one \ backslash" "This gives two backslashes