thanks for your reply... it kind of solved our problem! we were in fact using Tokenizers that produce multiple tokens... so i guess there is no other way for us than to use the copyField workaround.
it would maybe be a good idea to have Lucene check the *stored* value for duplicate keys... that seems so much more logical to me! (imho, it makes no sense to check the *indexed* value for duplicate keys, but maybe there is a reason?) or maybe give us the option to choose whether Lucene should check the *stored* or *indexed* value for duplicate keys. it is really confusing (and kind of frustrating) to get duplicate unique key *stored* values back from the server...

since we now use a copyField to perform searches on the IDs, there is no more reason to index our unique key field... what would happen if I set indexed=false on my unique id field??

Maarten :-)


Chris Hostetter <[EMAIL PROTECTED]>
16/03/2007 19:14
Please respond to solr-user@lucene.apache.org
To: solr-user@lucene.apache.org
cc:
Subject: Re: Bug ? unique id

: but can someone please answer my question :'(
: is it illegal to put filters on the unique id ?
: or is it a bug that we get duplicate id's?
: or is this a known issue (since everybody is using copyfields?)

there's nothing illegal about using an Analyzer on your uniqueKey, but you
have to ensure that your Analyzer:
 1) never produces multiple tokens (ie: KeywordTokenizer is fine)
 2) never produces duplicate output for different (legal) input.

...so if your dataset can legally contain two different documents whose
keys are "foo bar" and "Foo Bar", you certainly wouldn't want to use a
Whitespace or StandardTokenizer -- but you also wouldn't ever want to use
the LowerCaseFilter. If however you really wanted to ignore all
punctuation in keys when clients upload documents to you, and trust that
doc "1234-56-7890" is the same as doc "1234567890", then something like the
pattern stripping filter would be fine.

the thing to understand is that it's the *indexed* value of the uniqueKey
that must be unique in order for Solr to do things properly... it has to
be able to search on that uniqueKey term to delete/replace a doc properly.

-Hoss
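For reference, the two constraints Hoss lists (single token, no colliding output) plus the copyField workaround can be sketched as a schema.xml fragment. This is a minimal sketch only, assuming standard Solr schema conventions; the field names `id` and `id_search` and the `text` type are hypothetical placeholders:

```xml
<!-- sketch only: hypothetical field names, assuming a standard Solr schema.xml -->
<types>
  <!-- uniqueKey type: KeywordTokenizer emits exactly one token,
       and no filters are applied that could map two legal keys to the same term -->
  <fieldType name="uniqueId" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <!-- the key itself stays single-token so delete/replace by term works -->
  <field name="id" type="uniqueId" indexed="true" stored="true"/>
  <!-- tokenized copy used only for searching on parts of the id -->
  <field name="id_search" type="text" indexed="true" stored="false"/>
</fields>

<uniqueKey>id</uniqueKey>
<copyField source="id" dest="id_search"/>
```

Note that this keeps `indexed="true"` on the `id` field: as Hoss points out, Solr searches on the indexed uniqueKey term to delete/replace documents, so the key field must remain indexed even when a copyField handles user-facing searches.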