Sure. Use copyField to copy it into a new indexed, non-stored field with the
following type definition:
    <fieldType name="address_email" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
        <filter class="solr.TypeTokenFilterFactory"
                types="filter_email.txt"
                enablePositionIncrements="true"
                useWhitelist="true"/>
      </analyzer>
    </fieldType>

The content of filter_email.txt is a single line (including the <> signs):
<EMAIL>

Only the email addresses will be left as tokens in that field. You can't
display them easily, but you can certainly search on them.
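
For completeness, a rough sketch of the wiring in schema.xml. The field
names "content" (your existing full-text field) and "content_emails" (the
new email-only field) are assumptions on my part, so adjust them to
whatever your schema actually uses:

    <!-- Hypothetical names: "content" is assumed to be the existing
         full-text field, "content_emails" is the new email-only field. -->
    <field name="content_emails" type="address_email"
           indexed="true" stored="false"/>

    <!-- Copy the raw text into the new field at index time; the
         address_email analyzer then keeps only <EMAIL> tokens. -->
    <copyField source="content" dest="content_emails"/>

After reindexing, a query against the new field along the lines of
Ahmet's suggestion, e.g. q=content_emails:*@gmail.com, is the kind of
thing to try. Check the analysis page first to confirm that the addresses
really come out as single tokens.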
Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Mar 14, 2013 at 2:33 PM, Jorge Luis Betancourt Gonzalez <
jlbetanco...@uci.cu> wrote:

> Sorry for the duplicate mail :-(. Any advice on a configuration for
> searching email addresses in a field that does not contain only email
> addresses, i.e. where the addresses are embedded in larger text messages?
>
> ----- Original Message -----
> From: "Ahmet Arslan" <iori...@yahoo.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, March 14, 2013 11:23:47
> Subject: Re: Question about email search
>
> Hi,
>
> Since you have a word delimiter filter in your analysis chain, I am not
> sure e-mail addresses are recognised as single tokens. You can check that
> on the analysis page of the Solr admin UI.
>
> If e-mail addresses are kept as one token, I would use a leading wildcard
> query:
> &q=*@gmail.com
>
> There was a similar question recently:
> http://search-lucene.com/m/XF2ejnM6Vi2
>
> --- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez <jlbetanco...@uci.cu>
> wrote:
>
> > From: Jorge Luis Betancourt Gonzalez <jlbetanco...@uci.cu>
> > Subject: Question about email search
> > To: solr-user@lucene.apache.org
> > Date: Thursday, March 14, 2013, 5:11 PM
> > I'm using Solr 3.6.2 to index data crawled with Nutch. In my
> > schema I have one field with all the content extracted from
> > the page, which may include email addresses. This is the
> > configuration of my schema:
> >
> >         <fieldType name="text" class="solr.TextField"
> >                    positionIncrementGap="100"
> >                    autoGeneratePhraseQueries="true">
> >             <analyzer type="index">
> >                 <tokenizer class="solr.StandardTokenizerFactory"/>
> >                 <filter class="solr.StandardFilterFactory"/>
> >                 <filter class="solr.ISOLatin1AccentFilterFactory"/>
> >                 <filter class="solr.SnowballPorterFilterFactory"
> >                         language="Spanish"/>
> >                 <charFilter class="solr.HTMLStripCharFilterFactory"/>
> >                 <filter class="solr.StopFilterFactory"
> >                         ignoreCase="true" words="stopwords.txt"/>
> >                 <filter class="solr.WordDelimiterFilterFactory"
> >                         generateWordParts="1" generateNumberParts="1"
> >                         catenateWords="1" catenateNumbers="1"
> >                         catenateAll="0" splitOnCaseChange="1"/>
> >                 <filter class="solr.LowerCaseFilterFactory"/>
> >                 <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >             </analyzer>
> >         </fieldType>
> >
> > The thing is that I'm trying to search against a field of
> > this type (text) with a value like "@gmail.com", and I
> > expect to get the documents containing that text. Any advice?
> >
> > Regards
> > --
> > "It is only in the mysterious equation of love that any
> > logical reasons can be found."
> > "Good programmers often confuse halloween (31 OCT) with
> > christmas (25 DEC)"
> >
> >
>
