Re: numFound is changing when query across distributed-seach with the same query.
Yonik Seeley-2 wrote:
>
> On Thu, Dec 31, 2009 at 2:29 AM, johnson hong wrote:
>>
>> Hi, all.
>> I found a problem with distributed search.
>> When I use "?q=keyword&start=0&rows=20" to query across
>> distributed search, it returns numFound="181". Then I
>> change the start param from 0 to 100, and it returns numFound="131".
>
> You probably have duplicates (docs on different shards with the same id).
> Deeper paging will detect more of them.
> It does raise the question of whether we should be changing numFound, or
> indicating a separate duplicate count. Duplicates aren't eliminated
> from things like faceting or statistics, so it might be nice to have a
> number that was consistent with those numbers.
>
> -Yonik
> http://www.lucidimagination.com
>

Thank you Yonik, Happy New Year all.
I will check the index soon after the festival.

--
View this message in context: http://old.nabble.com/numFound-is-changing-when-query-across-distributed-seach-with-the-same-query.-tp26976128p26984236.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: numFound is changing when query across distributed-seach with the same query.
On Thu, Dec 31, 2009 at 10:26 PM, Chris Hostetter wrote:
> why do we bother detecting/removing the duplicates?
>
> strictly speaking docs with duplicate IDs on multiple shards is a "garbage
> in" situation, i can understand Solr taking a little extra effort to
> not fail hard if this situation is encountered, but why update the
> numFound at all, or remove the duplicates from the list? ... why not leave
> them in as is? (then numFound would never change)

Distrib search keys some things off of the unique id, so when we
encountered duplicates in the past it failed hard.
IIRC only keeping one doc with the same id was actually the easiest
way to not fail hard.

-Yonik
http://www.lucidimagination.com
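
The effect described in this thread can be illustrated with a small simulation. This is only a simplified model, not Solr's actual merge code: it assumes the coordinator sums the per-shard hit counts, asks each shard for its top (start + rows) documents, and lowers numFound by one for every duplicate id it sees in that window. The class name, method name and sample ids are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistribNumFoundSketch {

    /** Simplified model of merging shard results; NOT Solr's real implementation. */
    static int mergedNumFound(List<List<String>> shardIds, int start, int rows) {
        int numFound = 0;
        for (List<String> shard : shardIds) {
            numFound += shard.size();            // sum of the per-shard hit counts
        }
        int window = start + rows;               // each shard returns its top (start + rows) ids
        Set<String> seen = new HashSet<>();
        for (List<String> shard : shardIds) {
            int limit = Math.min(window, shard.size());
            for (int i = 0; i < limit; i++) {
                if (!seen.add(shard.get(i))) {
                    numFound--;                  // duplicate id: keep one doc, lower the count
                }
            }
        }
        return numFound;
    }

    public static void main(String[] args) {
        // Ids "c" and "d" exist on both shards, but only deeper in the result lists.
        List<List<String>> shards = Arrays.asList(
                Arrays.asList("a", "b", "c", "d"),
                Arrays.asList("e", "f", "c", "d"));
        System.out.println(mergedNumFound(shards, 0, 2)); // shallow page: no duplicates seen yet
        System.out.println(mergedNumFound(shards, 2, 2)); // deeper page: both duplicates detected
    }
}
```

In this model the first page reports 8 and the deeper page reports 6, which mirrors the 181 versus 131 observation: the count only drops for duplicates that fall inside the window being merged.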
Has anyone got Carrot2 working with Solr without using ant?
Hi,

I'm about to start using Ant to get Carrot2 working with Solr. However, I
first tried to get it working without Ant by placing jars into a lib
directory in the quickstart example directory, but I couldn't find any
documentation to guide me in this.

If anyone can suggest how to accomplish this I would be happy to hear about
it.

Happy New Year!
Regards

--
Alex
https://sites.google.com/a/utg.edu.gm/alex
Re: Help with creating a solr schema
On Thu, Dec 31, 2009 at 10:26 AM, JaredM wrote:
>
> Hi,
>
> I'm new to Solr but so far I think it's great. I've spent 2 weeks reading
> through the wiki and mailing list info.
>
> I have a use case and I'm not sure what the best way is to implement it. I
> am keeping track of people's calendar schedules in a really simple way: each
> user can login and input a number of date ranges where they are available
> (so for example - User Alice might be available between 1-Jan-2010 -
> 15-Jan-2010 and 20-Feb-2010 - 22-Feb-2010 and 1-Mar-2010 - 5-Mar-2010).
>
> In my data model I have this modelled as a one-to-many with a User table
> (consisting of username, some metadata) and an Availability table
> (consisting of start date and end date).
>
> Now I need to search which users are available between a given date range.
> The bit I'm having trouble with is how to store multiple start & end date
> pairs. Can someone provide some guidance?
> --
> View this message in context:
> http://old.nabble.com/Help-with-creating-a-solr-schema-tp26979319p26979319.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

I have done something similar to this before.

You will have to store the username, firstname and lastname as single-valued
fields.

However, the start and end dates should be multivalued tint types.

I decided to store the dates as UNIX timestamps. The start dates are stored
as the UNIX timestamp at midnight at the beginning of that date (00:00:00).
The end dates are stored as the UNIX timestamp at 11:59:59 PM (23:59:59) on
the end date.

This (storing the dates as Trie integers) gave me faster range query results.

When searching, you will also have to convert the dates to UNIX timestamps
using similar logic before using them in the Solr search query.

You should use the username of the user as the uniqueKey.

If a user has multiple dates of availability you will enter it like so:

exampleun
examplefn
exampleln
137865661
137865662
137865663
137865681
137865682
137865683

--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/
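
The date-to-timestamp conversion described in this message could look roughly like the sketch below. It assumes the dates are interpreted in UTC (the message does not say which time zone was used), and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

public class AvailabilityTimestamps {

    /** UNIX timestamp (seconds) at 00:00:00 on the given date; UTC is an assumption. */
    static long startOfDay(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toEpochSecond();
    }

    /** UNIX timestamp (seconds) at 23:59:59 on the given date; UTC is an assumption. */
    static long endOfDay(LocalDate date) {
        return date.atTime(23, 59, 59).toEpochSecond(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        // Alice's first availability from the original question: 1-Jan-2010 to 15-Jan-2010.
        System.out.println(startOfDay(LocalDate.of(2010, 1, 1)));  // 1262304000
        System.out.println(endOfDay(LocalDate.of(2010, 1, 15)));   // 1263599999
    }
}
```

The same two helpers would be applied to the query dates before building the range query, so that indexed values and query bounds use identical rounding.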
Re: Has anyone got Carrot2 working with Solr without using ant?
You should just be able to copy those files down to the same location, as
this is all Ant is doing.

On Jan 1, 2010, at 2:11 PM, Alex Muir wrote:

> Hi,
>
> I'm about to start using Ant to get Carrot2 working with solr however
> I was first trying to get it working without Ant by placing jars into
> a lib directory in the quickstart example directory however I couldn't
> find any documentation to guide me in this.
>
> If anyone can suggest how to accomplish this I would be happy to hear about
> it.
>
> Happy New Year!
> Regards
>
> --
>
> Alex
> https://sites.google.com/a/utg.edu.gm/alex
Re: Requesting feedback on solr-spatial plugin
On Dec 30, 2009, at 9:54 PM, Mat Brown wrote:

> Hi Grant,
>
> Thanks for the info and your point is well taken. I should have been
> clearer that I have no intention of this project being a long-term
> solution for spatial search in Solr - rather I was looking to build a
> rough and ready solution that gives some basic spatial search
> capabilities to tide us over until the real deal is available in Solr
> 1.5. That being said, I'd love to be of use in the official spatial
> efforts, so I'll be sure to take a look at the related tickets and see
> if there is anywhere I can help out.

FWIW, most of the functionality for spatial is now already committed on trunk.

-Grant
solr 1.4 csv import -- Document missing required field: id
Hi,

I am trying to import a CSV file (without an "id" field) into Solr 1.4.
In schema.xml the "id" field is set with required="false".
But I am getting "org.apache.solr.common.SolrException: Document missing
required field: id"

Following is the schema.xml fields section:

    ...
    <uniqueKey>id</uniqueKey>

Following is the CSV file:

    company_id,customer_name,active
    58,Apache,Y
    58,Solr,Y
    58,Lucene,Y
    60,IBM,Y

Following is the SolrJ import client:

    SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr");
    ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/csv");
    req.addFile(new File(filename));
    req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    NamedList<Object> result = server.request(req);
    System.out.println("Result: " + result);

Could any of you help out please?

Thanks
--
View this message in context: http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-tp26990048p26990048.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Help with creating a solr schema
Thanks Ahmet and Israel. I prefer Israel's approach since the amount of
metadata for the user is quite high, but I'm not clear how to get around one
problem:

If I had 2 availabilities (I've left them in human-readable form instead of
as UNIX timestamps only for ease of understanding):

    10-Jan-2010 - 20-Jan-2010
    25-Jan-2010 - 28-Jan-2010

and I wanted to query for availability between 12-Jan-2010 and 26-Jan-2010,
then wouldn't the above document be returned (even though the user would
not be available 20-25 Jan)?
--
View this message in context: http://old.nabble.com/Help-with-creating-a-solr-schema-tp26979319p26990178.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr 1.4 csv import -- Document missing required field: id
On Fri, Jan 1, 2010 at 9:13 PM, evana wrote:
>
> Hi,
>
> I am trying to import a csv file (without "id" field) on solr 1.4
> In schema.xml "id" field is set with required="false".
> But I am getting "org.apache.solr.common.SolrException: Document missing
> required field: id"
>
> Following is the schema.xml fields section
>     ... required="false" />
>     ... multiValued="true"/>
>     ...
>     id
>
> Following is the csv file
>     company_id,customer_name,active
>     58,Apache,Y
>     58,Solr,Y
>     58,Lucene,Y
>     60,IBM,Y
>
> Following is the solrj import client
>     SolrServer server = new
>         CommonsHttpSolrServer("http://localhost:8080/solr");
>     ContentStreamUpdateRequest req = new
>         ContentStreamUpdateRequest("/update/csv");
>     req.addFile(new File(filename));
>     req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
>     NamedList result = server.request(req);
>     System.out.println("Result: " + result);
>
> Could any of you help out please.
>
> Thanks
> --
> View this message in context:
> http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-tp26990048p26990048.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

The presence of the uniqueKey definition implies that the id field is a
required field in the document, even though the attribute is set to false on
the field definition.

Try removing the uniqueKey definition for the id field in the schema.xml
file and then run the update script or application again.

The uniqueKey definition is not needed if you are going to build the index
from scratch each time you do the import.

However, if you are doing incremental updates, this field is required and
the uniqueKey definition is also needed to specify what the "primary key"
for the document is.

http://wiki.apache.org/solr/UniqueKey

--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/
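
For the incremental-update case, where the uniqueKey has to stay, one possible alternative (not something proposed in this thread) is to add an id column to the CSV before uploading it. The sketch below is a rough illustration under stated assumptions: it builds the id from company_id and customer_name, which is only a guess at what identifies a row in this data, and it uses a naive comma split that does not handle quoted fields. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvIdColumnSketch {

    /**
     * Prepends an "id" column to every CSV line. The id is built from the first
     * two columns; that choice is an assumption made for this illustration.
     * Naive split: quoted fields containing commas are NOT handled.
     */
    static List<String> addIdColumn(List<String> csvLines) {
        List<String> out = new ArrayList<>();
        out.add("id," + csvLines.get(0));                  // extend the header row
        for (int i = 1; i < csvLines.size(); i++) {
            String line = csvLines.get(i);
            String[] cols = line.split(",", -1);
            out.add(cols[0] + "-" + cols[1] + "," + line); // e.g. 58-Apache,58,Apache,Y
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList(
                "company_id,customer_name,active",
                "58,Apache,Y",
                "60,IBM,Y");
        for (String line : addIdColumn(csv)) {
            System.out.println(line);
        }
    }
}
```

The rewritten file could then be sent with the same ContentStreamUpdateRequest client shown in the original question.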
Re: Help with creating a solr schema
On Fri, Jan 1, 2010 at 9:47 PM, JaredM wrote:
>
> Thanks Ahmet and Israel. I prefer Israel's approach since the amount of
> metadata for the user is quite high but I'm not clear how to get around one
> problem:
>
> If I had 2 availabilities (I've left it in human-readable form instead of
> as a UNIX timestamp only for ease of understanding):
>
>     10-Jan-2010 - 20-Jan-2010
>     25-Jan-2010 - 28-Jan-2010
>
> and I wanted to query for availability between 12-Jan-2010 to 26-Jan-2010
> then wouldn't the above document be returned (even though the user would
> not be available 20-25 Jan)?
> --
> View this message in context:
> http://old.nabble.com/Help-with-creating-a-solr-schema-tp26979319p26990178.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

Unfortunately, for this particular use case, if you are using only the
out-of-the-box features available in Solr 1.4 (that is, without a custom
Solr plugin using a custom Lucene filter and some special value storage
arrangement for the fields), you will have to store each start and end date
pair as a separate document.

So, there will be N separate documents for each username if that user has N
distinct periods of availability.

The start date and end date fields would also have to be single-valued
instead of multivalued as I specified in the earlier post.

Sorry.

--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/
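
The false match raised in the question, and why one document per period avoids it, can be sketched in plain Java. This is a model of the matching logic only, not Solr code: it assumes the search is expressed as two independent range conditions (some start date on or before the query start, some end date on or after the query end), and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

public class AvailabilityMatchSketch {

    /** One user document with multivalued dates: each condition may match a different period. */
    static boolean multiValuedMatch(List<LocalDate> starts, List<LocalDate> ends,
                                    LocalDate from, LocalDate to) {
        boolean someStartOk = starts.stream().anyMatch(s -> !s.isAfter(from));
        boolean someEndOk = ends.stream().anyMatch(e -> !e.isBefore(to));
        return someStartOk && someEndOk;
    }

    /** One document per period: both conditions must hold for the same period. */
    static boolean perPeriodMatch(LocalDate[][] periods, LocalDate from, LocalDate to) {
        for (LocalDate[] p : periods) {
            if (!p[0].isAfter(from) && !p[1].isBefore(to)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LocalDate[][] periods = {
            {LocalDate.of(2010, 1, 10), LocalDate.of(2010, 1, 20)},
            {LocalDate.of(2010, 1, 25), LocalDate.of(2010, 1, 28)}
        };
        List<LocalDate> starts = Arrays.asList(periods[0][0], periods[1][0]);
        List<LocalDate> ends = Arrays.asList(periods[0][1], periods[1][1]);
        LocalDate from = LocalDate.of(2010, 1, 12);
        LocalDate to = LocalDate.of(2010, 1, 26);

        System.out.println(multiValuedMatch(starts, ends, from, to)); // true: the false match
        System.out.println(perPeriodMatch(periods, from, to));        // false: no single period covers it
    }
}
```

With multivalued fields the 10-Jan start and the 28-Jan end satisfy the two conditions even though they belong to different periods; with one document per period, both conditions are checked against the same pair.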