be using CJK
in addition to WhitespaceTokenizerFactory.
I've found some references to using copyFields or NGrams but I can't quite
grasp what the whole solution would look like.
--
Jacob Elder
@jelder
(646) 535-3379
> > I've found some references to using copyFields or NGrams but I can't
> quite
> > grasp what the whole solution would look like.
>
--
Jacob Elder
@jelder
(646) 535-3379
StandardTokenizer doesn't handle some of the tokens we need, like
@twitteruser, and as far as I can tell, doesn't handle Chinese, Japanese or
Korean. Am I wrong about that?
On Mon, Nov 29, 2010 at 5:31 PM, Robert Muir wrote:
> On Mon, Nov 29, 2010 at 5:30 PM, Jacob Elder wrote:
>
+1
That's exactly what we need, too.
On Mon, Nov 29, 2010 at 5:28 PM, Shawn Heisey wrote:
> On 11/29/2010 3:15 PM, Jacob Elder wrote:
>
>> I am looking for a clear example of using more than one tokenizer for a
>> source single field. My application has a single &
wrote:
> On Mon, Nov 29, 2010 at 5:35 PM, Jacob Elder wrote:
> > StandardTokenizer doesn't handle some of the tokens we need, like
> > @twitteruser, and as far as I can tell, doesn't handle Chinese, Japanese
> or
> > Korean. Am I wrong about that?
>
> it uses
this could be achieved?
>
> Thanks in advance for any help.
> Regards,
> Tommaso
>
--
Jacob Elder
@jelder
(646) 535-3379
rd/StandardTokenizer.java
> )
>
> So you can read the UAX#29 report and then you know how it tokenizes text.
> You can also just use this demo app to see how the new one works:
> http://unicode.org/cldr/utility/breaks.jsp (choose "Word")
>
What does this mean to those of us on Solr 1.4 and Lucene 2.9.3? Does the
current stable StandardTokenizer handle CJK?
--
Jacob Elder
@jelder
(646) 535-3379
On Tue, Nov 30, 2010 at 10:07 AM, Robert Muir wrote:
> On Tue, Nov 30, 2010 at 9:45 AM, Jacob Elder wrote:
> > Right. CJK doesn't tend to have a lot of whitespace to begin with. In the
> > past, we were using a patched version of StandardTokenizer which treated
> >
.
Does setting commitWithin=1000 mean that only this update will be committed
within 1s, or that all pending documents will be committed within 1s?
--
Jacob Elder
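For readers following the commitWithin question, here is a minimal sketch of how such an update message might be built. It assumes the XML update format, with commitWithin carried as an attribute on the add element; the helper name build_add_message and the document field "id" with value "example-1" are hypothetical and not taken from the thread. It only shows the shape of the request and does not settle which documents the resulting commit covers.

```python
import xml.etree.ElementTree as ET


def build_add_message(doc_fields, commit_within_ms):
    """Build an XML <add> message (hypothetical helper for illustration)."""
    # Assumption: commitWithin is passed as an attribute on <add>, in milliseconds.
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    doc = ET.SubElement(add, "doc")
    for name, value in doc_fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = str(value)
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(add, encoding="unicode")


# Hypothetical document with a single "id" field.
message = build_add_message({"id": "example-1"}, 1000)
print(message)
```

The resulting string would then be POSTed to the update handler of whatever Solr instance is in use.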
of pop-up ads now. It would be great to
get some
consolidation and agreement from the community.
--
Jacob Elder
Hello,
Is there any way to get the number of deleted records from a delete request?
I'm sending:
type_i:(2 OR 3) AND creation_time_rl:[0 TO 124426080]
And getting:
02
This is Solr 1.3.
--
Jacob Elder
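As a rough illustration of the delete request discussed above, the query can be wrapped in an XML delete-by-query message. This is a sketch only: the helper name build_delete_by_query is hypothetical, and the snippet does not show how the message was actually sent in the original post.

```python
import xml.etree.ElementTree as ET


def build_delete_by_query(query):
    """Wrap a query string in a <delete><query>...</query></delete> message."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(delete, encoding="unicode")


# The query text is the one quoted in the message above.
message = build_delete_by_query(
    "type_i:(2 OR 3) AND creation_time_rl:[0 TO 124426080]"
)
print(message)
```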
1. Real time or near-real time updates.
2. First-class spatial search.
On Wed, Feb 24, 2010 at 9:42 AM, Grant Ingersoll wrote:
> What would it be?
>
--
Jacob Elder