Subject: Re: analyzer for Code
I'll have a look at it, thanks to everyone.
--
Gian Maria Ricci
Mobile: +39 320 0136949
-----Original Message-----
From: Steve Rowe [mailto:sar...@gmail.com]
Sent: Thursday, June 13, 2013 9:03 PM
To: solr-user@lucene.apache.org
Subject: Re: analyzer for Code
Hi Gian Maria,
Ope
Gian,
Lucene in Action has a case study from Krugle about their analysis for a
code search engine, if you want to look there.
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Thu, Jun 13, 2013 at 4:19 AM, Gian Maria Ricci wrote:
> I did a little search around and did not find an
e. J
>
> --
> Gian Maria Ricci
> Mobile: +39 320 0136949
>
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, June 13, 2013 1:24 PM
> To: solr-user@lucene.apache.org; Gian Maria Ricci
> Subject: Re: analyzer for Code
>
> We
Gian Maria Ricci
Subject: Re: analyzer for Code
It could be pretty complicated to do well.
I'm pretty sure that Krugle is based on Solr: http://opensearch.krugle.org/
You might also look at the UI for Ohloh (used to be Koders):
http://code.ohloh.net/
wunder
On Jun 13, 2013, at 1:19 AM, Gian Maria Ricci wrote:
> I did a little search around
Well, WordDelimiterFilterFactory would split on the punctuation, so
you could add it to the analyzer chain along with StandardAnalyzer.
You could use one of the regex filters to break up tokens that make it
through the analyzer as you see fit.
But in general, this will be a bunch of compromises s
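
To make the splitting idea above concrete, here is a rough sketch in plain Python of the kind of sub-word splitting a word-delimiter style filter performs on code tokens. It is only an emulation for illustration, not Solr's implementation: the names split_code_token and analyze, the regex, and the "keep the original token only when it was split" rule are all assumptions made for this example.

```python
import re

# Sub-word pattern (an assumption for this sketch, not Solr's rules):
#   1. an acronym followed by a capitalized word ("XML" in "XMLParser")
#   2. a capitalized or lowercase word ("User", "get")
#   3. any remaining run of capitals
#   4. a run of digits
# Everything else (punctuation, underscores) acts as a delimiter.
_PART = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_code_token(token, preserve_original=True):
    """Split one whitespace-delimited token into lowercase sub-word parts.

    If the token was split into more than one part and preserve_original
    is set, the lowercased original is appended so exact matches still work.
    """
    parts = [m.group(0).lower() for m in _PART.finditer(token)]
    if preserve_original and len(parts) > 1:
        parts.append(token.lower())
    return parts


def analyze(text):
    """Whitespace-tokenize a line of code, then split each token."""
    tokens = []
    for raw in text.split():
        tokens.extend(split_code_token(raw))
    return tokens


if __name__ == "__main__":
    print(split_code_token("getUserName"))
    print(split_code_token("foo.bar_baz(42)"))
    print(analyze("int maxCount = 0;"))
```

Even this small sketch shows where the compromises come in: pure punctuation tokens such as "=" are dropped entirely, so operators become unsearchable unless they are handled separately.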