Re: Is it possible to search for the empty string?

Shawn Heisey Mon, 18 May 2015 21:31:17 -0700

On 5/18/2015 7:34 PM, Walter Underwood wrote:
> Not out of the box.
> 
> Fields are parsed into tokens and queries search on tokens. An empty string 
> has no tokens for that field and a missing field has no tokens for that field.
> 
> If you really need to do this, then you’ll need to turn the empty string into a 
> special token that means “empty string”, choosing a token that won’t conflict 
> with any real token. At this point, we’re moving into Ugly Hack Land, but 
> sometimes that is the best we can do.
> 
> For example, you could create an update request processor that checked for a 
> field with an empty value, then replaced that with a rare character, perhaps 
> a composed, compatibility Unicode character, like Angstrom (same as circle 
> capital A), or one of the TV Guide symbols (numbers in TV-shaped surrounds), 
> or my personal favorite, the IPA symbol for “audible gnashing of teeth”, 
> which could be an appropriate response to this request.
> 
> That character is U+02AD, “LATIN LETTER BIDENTAL PERCUSSIVE” 
> (http://www.fileformat.info/info/unicode/char/2ad/index.htm).
> 
> Then you would need to craft a query using this special token to mean “empty 
> string”.
> 
> Of course, none of this works if some upstream processing of the update 
> document strips fields with empty values.


This field is already being handled by a custom update processor that I
wrote.  The empty string in the source data is leading to an
IndexOutOfBounds exception because of the way my parsing works on the
info that is sent to Solr.  Instead of throwing the exception, I could
trap it and replace the empty string with something we can look for in a
query.

I have added an additional parameter to my update processor config that
makes it replace empty strings with a specific string.
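
For anyone reading this in the archive, the idea can be sketched roughly as
follows. This is only an illustration in plain Python, not the actual update
processor (a real one would be Java code running inside Solr). The function
names, the field name "ip", and the choice of U+02AD as the replacement are
all assumptions made for the example.

```python
# Hypothetical sentinel: U+02AD, the character suggested earlier in the thread.
# Any string that cannot occur in real data would do.
EMPTY_SENTINEL = "\u02AD"


def replace_empty_values(doc, fields, sentinel=EMPTY_SENTINEL):
    """Return a copy of doc where empty-string values in the named fields
    are replaced by the sentinel. Missing fields are left missing, so
    "empty" and "absent" stay distinguishable."""
    out = dict(doc)
    for name in fields:
        if out.get(name) == "":
            out[name] = sentinel
    return out


def empty_value_query(field, sentinel=EMPTY_SENTINEL):
    """Build a query string that matches documents whose field held an
    empty string at index time (standard field:"term" syntax)."""
    return f'{field}:"{sentinel}"'


if __name__ == "__main__":
    print(replace_empty_values({"id": "1", "ip": ""}, ["ip"]))
    print(empty_value_query("ip"))
```

The replacement has to happen before the document is indexed, and the query
has to use exactly the same string, so it probably makes sense to keep the
sentinel in one place in the config. This also assumes the field's analysis
chain passes the sentinel through unchanged, as an untokenized string field
would.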

Thanks,
Shawn
