A few query issues with solr
Hi, I'm new to using Solr. I have started an index with it and it works great. I have encountered a few minor issues that I currently solve by modifying the query beforehand; however, I feel like there is a more configuration-oriented and Solr-correct way of achieving this.
Current manual modifications:
* Searching for "car" actually means buying a car, so it should look for "car -rent", whereas searching for "car rent" should still look for "car rent"
* Searching for "macy's" and searching for "macys" give different results - currently I force macy's to macys
* Searching for "at&t" gets converted to "at", "t", which are both stop words - I am forced to convert at&t=>att before indexing and querying
Is there a nice way to handle these, or will I always need to resort to manual fixes? Cheers David
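For reference, the manual fixes listed above can be sketched in a few lines of Python. This only illustrates the pre-processing logic the message describes; the table name REWRITES and both function names are made up for the example, and the same mappings could plausibly live in Solr analysis configuration (for instance a synonyms or character-mapping file) instead of client code.

```python
# Hypothetical rewrite table; the two entries are the examples from the message.
REWRITES = {"at&t": "att", "macy's": "macys"}


def normalize(text):
    """Lowercase the text and rewrite whole tokens before indexing or querying."""
    tokens = text.lower().split()
    return " ".join(REWRITES.get(tok, tok) for tok in tokens)


def add_default_exclusion(query):
    """Append -rent to a query that mentions 'car' but not 'rent'."""
    tokens = query.lower().split()
    if "car" in tokens and "rent" not in tokens:
        return query + " -rent"
    return query


if __name__ == "__main__":
    print(normalize("AT&T store"))
    print(add_default_exclusion("car"))
    print(add_default_exclusion("car rent"))
```

Applying the same normalize function at index time and at query time is what keeps the two sides consistent.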
How to use TermsComponent when I need a filter
Hi, I have a solr index, which for simplicity is just a list of names, and a list of associations. (either a multivalue field e.g. {A1, A2, A3, A6} or a string concatenation list e.g. "A1 A2 A3 A6") I want to be able to provide autocomplete but with a specific association. E.g. Names beginning with Bob in association A5. Is this possible? I would prefer not to have to have one index per association, since the number of associations is pretty large Cheers, David
Delta Import with something other than Date
Hi, I have a table that I want to index, and the table has no datetime stamp. However, the table is append-only, so the primary key can only go up. Is it possible to store the last primary key and use some delta query="select id where id>${last_id_value}"? Cheers, David
RE: Delta Import with something other than Date
Currently DIH delta import uses the SQL query of type "select id from item where last_modified > ${dataimporter.last_index_time}" What I need is some field like ${dataimporter.last_primary_key} wiki.apache.org/solr/DataImportHandler I am thinking of storing the last primary key externally and calling the delta-import with a parameter and using ${dataimporter.request.last_primary_key} but that seems like a very brittle approach Cheers, David -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, September 08, 2010 6:38 PM To: solr-user@lucene.apache.org Subject: Re: Delta Import with something other than Date Of course you can store whatever you want in a solr index. And if you store an integer as a Solr 1.4 "int" type, you can certainly query for all documents that have greater than some specified integer in a field. You can't use SQL to query Solr though. I'm not sure what you're really asking? Jonathan David Yang wrote: > Hi, > > I have a table that I want to index, and the table has no datetime > stamp. However, the table is append only so the primary key can only go > up. Is it possible to store the last primary key, and use some delta > query="select id where id>${last_id_value}" > > Cheers, > > David > > >
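A minimal sketch of the "store the key externally" approach described above, in Python. The state file name, the function names and the base URL are all hypothetical; the only things taken from the thread are the delta-import command and the last_primary_key request parameter, which the DIH configuration would read as ${dataimporter.request.last_primary_key}.

```python
import json
from pathlib import Path
from urllib.parse import urlencode

STATE_FILE = Path("last_primary_key.json")  # hypothetical location for the stored key


def load_last_key(path=STATE_FILE):
    """Return the last imported primary key, or 0 if nothing has been stored yet."""
    if path.exists():
        return int(json.loads(path.read_text())["last_primary_key"])
    return 0


def save_last_key(key, path=STATE_FILE):
    """Persist the highest primary key seen so far."""
    path.write_text(json.dumps({"last_primary_key": key}))


def delta_import_url(base_url, last_key):
    """Build the delta-import request that passes the key as a request parameter."""
    query = urlencode({"command": "delta-import", "last_primary_key": last_key})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # The base URL is an assumption; adjust it to the actual DIH handler path.
    print(delta_import_url("http://localhost:8983/solr/dataimport", load_last_key()))
```

The brittleness mentioned in the message is visible here: the stored key should only be advanced after the import is known to have succeeded, otherwise rows can be skipped.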
Autocomplete with Filter Query
Hi, Is there any way to provide autocomplete while filtering results? Suppose I had a bunch of people and each person has multiple occupations. When I select 'Assistant' in a filter box, it would be nice if autocomplete only provides assistant names, instead of all names. The other issue is that I use DisMax to do my search (name, title, phone number etc) - so it might be more complex to do autocomplete. I could have a copy field to copy all dismax terms into one big field. Cheers, David
RE: Autocomplete with Filter Query
to-complete match in the middle of your terms, things get a lot more complicated. I.e., if you want "walk" to auto-complete to "dog walking" too. This won't do that. Also, if you want some kind of stemming to happen in auto-complete, this won't do that either. And also, if you want to auto-complete not the entire phrase the user has typed in, but each white-space-separated word as they type it, this won't do THAT either. Trying to get all those things to work becomes even more complicated -- especially with the requirement that you want to be able to apply the 'fq's from your current search context to the auto-complete. I haven't entirely thought through a possible way to do all that. But hopefully this gives you some clues to think about it. Jonathan From: David Yang [dy...@nextjump.com] Sent: Friday, September 10, 2010 11:14 AM To: solr-user@lucene.apache.org Subject: Autocomplete with Filter Query Hi, Is there any way to provide autocomplete while filtering results? Suppose I had a bunch of people and each person has multiple occupations. When I select 'Assistant' in a filter box, it would be nice if autocomplete only provides assistant names, instead of all names. The other issue is that I use DisMax to do my search (name, title, phone number etc) - so it might be more complex to do autocomplete. I could have a copy field to copy all dismax terms into one big field. Cheers, David
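The limitations listed above (no mid-term matches, no stemming, whole-value completion only) are typical of prefix-based completion. One way to combine a prefix with filters is faceting with facet.prefix plus fq; whether that is the approach the truncated reply had in mind is an assumption. The sketch below only builds the request parameters. The field names name and occupation, the function names and the base URL are hypothetical, and it assumes the completion field is indexed as an untokenized string.

```python
from urllib.parse import urlencode


def autocomplete_params(prefix, filters, field="name", limit=10):
    """Parameters for a prefix-restricted facet request with filter queries applied."""
    params = [
        ("q", "*:*"),              # match everything; the filters do the narrowing
        ("rows", "0"),             # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),  # keep only values starting with the typed text
        ("facet.limit", str(limit)),
    ]
    params += [("fq", fq) for fq in filters]
    return params


def autocomplete_url(base_url, prefix, filters):
    """Full request URL for the select handler at base_url."""
    return base_url + "?" + urlencode(autocomplete_params(prefix, filters))


if __name__ == "__main__":
    print(autocomplete_url("http://localhost:8983/solr/select", "Bob", ["occupation:Assistant"]))
```

Because fq restricts the document set before facet counts are computed, only names from matching documents come back as suggestions.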
DataImportHandler with multiline SQL
Hi, I am using the DIH to retrieve data, and as part of the process I wanted to create a temporary table and then import data from that. I have played around a little with DIH, and it seems that for a query like "select x; select y;" the second statement (select y) can return no results and do arbitrary work, but the first statement (select x) needs to return results. Does anybody know exactly how DIH handles multiple SQL statements in the query? Cheers, David
Shuffle results a little
Hi, I am interested in using Solr to return search results for products. Is there any feature which will allow the results to be spread/shuffled around a little? The problem is that there are lots of results for one brand, but there are lots of other brands a few pages later. Is it possible to somehow shuffle it so that one brand does not dominate the top results? For example, field_limit=2 would mean that at most two results with the same field value are shown, and the rest are skipped. I could implement this as a post-filter step, but that means that I would have to pull many more results from the index. Cheers, David
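The post-filter step mentioned above could look roughly like this in Python. It is a client-side sketch only: the function name limit_per_field and the brand field are illustrative, and as the message notes it requires fetching more rows than are finally shown. Solr versions that include result grouping (field collapsing) address the same need on the server side.

```python
from collections import Counter


def limit_per_field(results, field, limit):
    """Keep results in ranked order, allowing at most `limit` docs per field value."""
    seen = Counter()
    kept = []
    for doc in results:
        value = doc.get(field)
        if seen[value] < limit:
            kept.append(doc)
            seen[value] += 1
    return kept


if __name__ == "__main__":
    docs = [
        {"id": 1, "brand": "A"},
        {"id": 2, "brand": "A"},
        {"id": 3, "brand": "A"},
        {"id": 4, "brand": "B"},
    ]
    print([d["id"] for d in limit_per_field(docs, "brand", 2)])
```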
Per field facet limit
Hi, The wiki on facet.limit (http://wiki.apache.org/solr/SimpleFacetParameters#facet.limit) says "This parameter can be specified on a per field basis to indicate a separate limit for certain fields." But it is not specified how to specify a specific field. How do you do this? I tried a_id 30 b_id 3 Which didn't work, as well as plain 'facet.mincount' twice which also didn't work. Cheers, David.
RE: Per field facet limit
Thanks! Is there any way to apply this to facet queries as well? (I could just apply a f.field.facet.limit to each and every field, and then apply a global facet.limit for facet queries.) Cheers david -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, November 17, 2010 6:12 PM To: solr-user@lucene.apache.org Subject: Re: Per field facet limit f.name_of_field.facet.limit The f.name_of_field.original_value thing is a common pattern in Solr, but, yeah, sometimes it's hard to find it in the documentation. So same with any of the other facet parameters. f.name_of_field.facet.mincount, whatever. David Yang wrote: > Hi, > > > > The wiki on facet.limit > (http://wiki.apache.org/solr/SimpleFacetParameters#facet.limit) says > "This parameter can be specified on a per field basis to indicate a > separate limit for certain fields." But it is not specified how to > specify a specific field. How do you do this? > > > > I tried > > > > a_id > > 30 > > > > b_id > > 3 > > > > Which didn't work, as well as plain 'facet.mincount' twice which also > didn't work. > > > > Cheers, > > David. > > >
RE: Per field facet limit
Sorry for the typo, I meant mincount, not limit... :p Cheers, David -Original Message- From: David Yang [mailto:dy...@nextjump.com] Sent: Wednesday, November 17, 2010 6:15 PM To: solr-user@lucene.apache.org Subject: RE: Per field facet limit Thanks! Is there any way to apply this to facet queries as well? (I could just apply a f.field.facet.limit to each and every field, and then apply a global facet.limit for facet queries.) Cheers david -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, November 17, 2010 6:12 PM To: solr-user@lucene.apache.org Subject: Re: Per field facet limit f.name_of_field.facet.limit The f.name_of_field.original_value thing is a common pattern in Solr, but, yeah, sometimes it's hard to find it in the documentation. So same with any of the other facet parameters. f.name_of_field.facet.mincount, whatever. David Yang wrote: > Hi, > > > > The wiki on facet.limit > (http://wiki.apache.org/solr/SimpleFacetParameters#facet.limit) says > "This parameter can be specified on a per field basis to indicate a > separate limit for certain fields." But it is not specified how to > specify a specific field. How do you do this? > > > > I tried > > > > a_id > > 30 > > > > b_id > > 3 > > > > Which didn't work, as well as plain 'facet.mincount' twice which also > didn't work. > > > > Cheers, > > David. > > >
RE: Per field facet limit
Makes sense. The processing is already done and there is no reason to not return it, since it won't explode into a horribly long list, unlike a field facet. Thanks! -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, November 17, 2010 6:21 PM To: solr-user@lucene.apache.org Subject: Re: Per field facet limit I don't think facet.limit or facet.mincount applies to facet queries; it's not applicable, whether global or field-specific. Keep in mind that a single facet query just returns ONE count, for the query you supplied. It's up to you to supply a query that will give the count you want; it won't use facet.limit or facet.mincount. Those parameters apply to ordinary faceting where you get many values per field, to filter the values per field. Each facet.query only gives you one count already. David Yang wrote: > Thanks! > > Is there any way to apply this to facet queries as well? > (I could just apply a f.field.facet.limit to each and every field, and > then apply a global facet.limit for facet queries.) > > Cheers > david > > -Original Message- > From: Jonathan Rochkind [mailto:rochk...@jhu.edu] > Sent: Wednesday, November 17, 2010 6:12 PM > To: solr-user@lucene.apache.org > Subject: Re: Per field facet limit > > f.name_of_field.facet.limit > > The f.name_of_field.original_value thing is a common pattern in Solr, > but, yeah, sometimes it's hard to find it in the documentation. > > So same with any of the other facet parameters. > f.name_of_field.facet.mincount, whatever. > > David Yang wrote: > >> Hi, >> >> >> >> The wiki on facet.limit >> (http://wiki.apache.org/solr/SimpleFacetParameters#facet.limit) says >> "This parameter can be specified on a per field basis to indicate a >> separate limit for certain fields." But it is not specified how to >> specify a specific field. How do you do this? 
>> >> >> >> I tried >> >> >> >> a_id >> >> 30 >> >> >> >> b_id >> >> 3 >> >> >> >> Which didn't work, as well as plain 'facet.mincount' twice which also >> didn't work. >> >> >> >> Cheers, >> >> David. >> >> >> >> > >
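To make the f.name_of_field.facet.limit pattern from this thread concrete, here is a small Python sketch that builds the request parameters for the a_id and b_id fields from the original question. The function name facet_params is made up for the example; the parameter names follow the pattern given in the reply.

```python
from urllib.parse import urlencode


def facet_params(limits):
    """Build facet parameters with a separate facet.limit for each field."""
    params = [("facet", "true")]
    for field, limit in limits.items():
        params.append(("facet.field", field))
        params.append((f"f.{field}.facet.limit", str(limit)))
    return params


if __name__ == "__main__":
    print(urlencode(facet_params({"a_id": 30, "b_id": 3})))
```

The same prefix works for other per-field facet parameters, for example f.a_id.facet.mincount.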
Tokenizer that Protects Phrases
Hi, I am trying to tokenize a string field of products. Two different products are: "camera", "security camera". What I would like is for "security camera" to be treated differently from "camera" - and only be displayed when the search is for "security camera"; otherwise, the results should only display "camera". In other words, even though they share the English word "camera", their meanings are different. My guess about the best way to deal with this is to manually provide a file of phrases that should each be treated as a single token, for example "laptop battery", "security camera". Kind of like protwords, but protphrases. Is this a good way to solve this problem? How do I implement it if it is the right way? If there is a better way of dealing with this, what is it? Thanks for your time, David
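The "protphrases" idea above can be illustrated with a short Python sketch: a tokenizer that emits each listed phrase as one token and falls back to single words otherwise. This is not a Solr component, just a model of the intended behaviour; the function name and the phrase list are hypothetical.

```python
def tokenize_with_phrases(text, phrases):
    """Split text on whitespace, but emit each protected phrase as a single token."""
    words = text.lower().split()
    # Try longer phrases first so the longest protected phrase wins.
    phrase_tuples = sorted(
        (tuple(p.lower().split()) for p in phrases if p.split()),
        key=len,
        reverse=True,
    )
    tokens = []
    i = 0
    while i < len(words):
        for phrase in phrase_tuples:
            n = len(phrase)
            if tuple(words[i:i + n]) == phrase:
                tokens.append(" ".join(phrase))
                i += n
                break
        else:
            # No protected phrase starts here, so emit the single word.
            tokens.append(words[i])
            i += 1
    return tokens


if __name__ == "__main__":
    protected = ["security camera", "laptop battery"]
    print(tokenize_with_phrases("Security camera with night vision", protected))
    print(tokenize_with_phrases("camera", protected))
```

With the same tokenization at index and query time, a search for "camera" produces the token camera, which does not equal the single token "security camera".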
RE: DIH and updating specific record
Chris Hostetter answered this just recently: http://wiki.apache.org/solr/DataImportHandler#Accessing_request_parameters My addition: Pass a parameter like command=delta-import&idz=31415 and access it via 'sql where id=${dataimporter.request.idz}'. If the idz is a string you might need to prequote the idz value. -Original Message- From: Olson, Ron [mailto:rol...@lbpc.com] Sent: Tuesday, February 22, 2011 3:18 PM To: solr-user@lucene.apache.org Subject: DIH and updating specific record Hi all- I am trying to determine if there is a way to tell Solr to update its index with a specific ID to a record in the database. All the examples and documentation seem to discuss using a "last updated" date/time field, but in this case modifying the table would not be an option. Instead, I'd like to invoke Solr's DIH delta query with a specific ID to say "here's something new or updated, please update your index with it". I apologize if this is a trivial thing, but I can't seem to find any documentation on how to do it. Thanks, Ron
RE: Sort Stability With Date Boosting and Rounding
One suggestion: use logarithms to compress the large time range into something easier to compare: 1/log(ms(NOW,date)) -Original Message- From: Stephen Duncan Jr [mailto:stephen.dun...@gmail.com] Sent: Tuesday, February 22, 2011 6:03 PM To: solr-user@lucene.apache.org Subject: Sort Stability With Date Boosting and Rounding I'm trying to use http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents as a bf parameter to my dismax handler. The problem is, the value of NOW can cause documents in a similar range (date value within a few seconds of each other) to sometimes round to be equal, and sometimes not, changing their sort order (when equal, falling back to a secondary sort). This, in turn, screws up paging. The problem is that score is rounded to a lower level of precision than what the suggested formula produces as a difference between two values within seconds of each other. It seems to me if I could round the value to minutes or hours, where the difference will be large enough to not be rounded-out, then I wouldn't have problems with order changing on me. But it's not legal syntax to specify something like: recip(ms(NOW,manufacturedate_dt/HOUR),3.16e-11,1,1) Is this a problem anyone has faced and solved? Anyone have suggested solutions, other than indexing a copy of the date field that's rounded to the hour? -- Stephen Duncan Jr www.stephenduncanjr.com
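To compare the two boost shapes discussed above outside of Solr, here is a small Python sketch. recip_boost mirrors recip(x,m,a,b) = a/(m*x+b) with the constants from the quoted formula; log_boost mirrors 1/log(ms(NOW,date)), assuming log means base 10. The function names and the clamp that avoids dividing by zero for very small ages are additions for the example, not part of the suggestion.

```python
import math


def recip_boost(age_ms, m=3.16e-11, a=1.0, b=1.0):
    """recip(x, m, a, b) = a / (m * x + b), using the constants from the quoted formula."""
    return a / (m * age_ms + b)


def log_boost(age_ms):
    """1 / log10(age in ms); the clamp to 10 ms is an added guard against log(1) = 0."""
    return 1.0 / math.log10(max(age_ms, 10.0))


if __name__ == "__main__":
    day_ms = 24 * 60 * 60 * 1000
    for age in (day_ms, day_ms + 5000, 30 * day_ms, 365 * day_ms):
        print(age, recip_boost(age), log_boost(age))
```

Printing both columns for documents a few seconds apart shows how small the per-second differences are in either form, which is the precision issue the original question is about.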
Dismax and worddelimiterfilter
Hi, I am having some really strange issues matching "N61JQ-B2". If I had a field "N61JQ-B2", and I wanted to match "N61JQ", "N61JQB2", "N61JQ-B2" and "N61JQ B2" in dismax, what fieldtype should it have? My final fallback is to use ngrams but that would impose a pretty large overhead, since the field could be a long normal string with one model number in it. I noticed when I used WordDelimiterFilterFactory the dismax would convert the parsed query to some pre-analyzed query. Cheers, David
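As a way to reason about which forms should match, the Python sketch below models a simplified word-delimiter analysis: split on non-alphanumeric characters, index each part, and also index the concatenation of all parts. It is a deliberate simplification and not the behaviour of WordDelimiterFilterFactory itself (which, depending on its options, may also split at letter/digit boundaries); the function names are hypothetical.

```python
import re

_SPLIT = re.compile(r"[^A-Za-z0-9]+")


def index_variants(value):
    """Tokens indexed for a value: each delimited part plus all parts concatenated."""
    parts = [p.lower() for p in _SPLIT.split(value) if p]
    variants = set(parts)
    if parts:
        variants.add("".join(parts))
    return variants


def matches(query, value):
    """True if the query's concatenation, or every one of its parts, is indexed."""
    q_parts = [p.lower() for p in _SPLIT.split(query) if p]
    if not q_parts:
        return False
    indexed = index_variants(value)
    return "".join(q_parts) in indexed or all(p in indexed for p in q_parts)


if __name__ == "__main__":
    for q in ("N61JQ", "N61JQB2", "N61JQ-B2", "N61JQ B2", "XYZ"):
        print(q, matches(q, "N61JQ-B2"))
```

Under this model all four forms from the message match the indexed value, which suggests that a combination of part generation and concatenation, applied consistently at index and query time, is the behaviour to aim for.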