Re: Problem with html code inside xml

2007-10-03 Thread [EMAIL PROTECTED]
well... the xml output has changed and I receive  
hh   sic!


So the problem is not a problem...

Thanks

Steve

On 3 Oct 07, at 01:09, Chris Hostetter wrote:


: I created a field type:
:
: positionIncrementGap="100">


...

: Everything works (the div tags and p tags are removed), but some
: nnn   or  tags are still in the text after indexing.


I cut and pasted that fieldtype into the example schema.xml and experimented
with the analysis tool (http://localhost:8983/solr/admin/analysis.jsp), and
both of those examples were correctly stripped.

do you have a more specific example of something that doesn't work?

Hmm... it seems like maybe the problem is examples like this...
blahblahnnn
...if the tag is directly adjacent to other text, it may not get stripped
off... I'm not sure if that's specific to the HtmlWhitespaceTokenizer.





-Hoss




Re: Indexing HTML

Hi Erik, All,

I escaped the HTML text into entities before sending it to Solr, and indexing
went fine.  The problem now is that when I get back a snippet with
highlighted text for the html field, it's not well formed, because the
highlighting sometimes doesn't include the entire tag if one is present.  For
example:



ound-color: #FF; text-align: left; text-indent: 0px;
line-height: normal ; margin-top: 0px; margin-ri






/TR>
 > href="foobar">... @4:37)
> >
> > Or is that just a byproduct of how SOLR reports the errors back -
> > always escaping them?
> >
> > Thanks guys - I'll have another crack at this tonight.
> >
> >
> > On 8/27/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> >>
> >> Michael,
> >>
> >> I think the issue is that you're not escaping the  values.
> >> Send something like this to Solr instead:
> >>
> >>    >> href="foobar">linktext >> a>
> >>
> >> Erik
> >>
> >>
> >> On Aug 27, 2007, at 9:29 AM, Michael Kimsal wrote:
> >>
> >>> Hello
> >>>
> >>> I'm trying to index individual lines of an HTML file, and I'm
> >>> hitting this error:
> >>>
> >>> TEXT must be immediately followed by END_TAG and not START_TAG
> >>>
> >>> I've got something that looks like
> >>>
> >>> 
> >>> 
> >>> 4
> >>> linktext >>> field>
> >>> 
> >>> 
> >>>
> >>> Actually, that sample code above, as its own data file POSTed to SOLR,
> >>> throws
> >>>
> >>> parser must be on START_TAG or TEXT to read text (position: START_TAG seen
> >>> ...... @4:37
> >>>
> >>> as an error.
> >>>
> >>> Any clues as to how I can do this?  I'd like to keep the original copy of
> >>> each line intact in the index.
> >>>
> >>> Thanks!
> >>>
> >>> --
> >>> Michael Kimsal
> >>> http://webdevradio.com
> >>
> >>
> >
> >
> > --
> > Michael Kimsal
> > http://webdevradio.com
>
>
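
Erik's advice in this thread, escaping the markup before it goes inside a field value, could look roughly like the Python sketch below. It is only an illustration: the field names `id` and `line` and the helper `build_add_doc` are assumptions, not taken from Michael's schema. The standard-library `xml.sax.saxutils.escape` converts `&`, `<` and `>` to entities, so the XML parser sees plain text rather than nested tags.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_add_doc(doc_id, html_line):
    """Build a Solr <add> message with the HTML line stored as escaped text.

    The field names "id" and "line" are hypothetical; use your own schema's.
    """
    # escape() replaces &, < and > with &amp;, &lt; and &gt;
    return (
        "<add><doc>"
        '<field name="id">' + escape(doc_id) + "</field>"
        '<field name="line">' + escape(html_line) + "</field>"
        "</doc></add>"
    )


original = '<a href="foobar">linktext</a>'
message = build_add_doc("4", original)
print(message)

# Parsing the message back shows that it is well formed and that the
# original HTML line survives the round trip unchanged.
root = ET.fromstring(message)
fields = {f.get("name"): f.text for f in root.iter("field")}
print(fields["line"])
```

Because the markup is escaped only for transport, the value recovered by the parser is the original line, which matches the goal of keeping each line intact in the index.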


[no subject]

Hello All,

I am interested in some of the joys, tribulations and processes of running a
replicated Solr environment. Can anyone point me to any particular links,
documents and/or personal experiences?

Thanks,
Eric Treece
[EMAIL PROTECTED]



Re: Solr live at Netflix

Yes. Congratulations on your launch. I'd love some sort of case study. I
think SOLR could really benefit from a good "here's our schema, here's
the site, this is the type of server/JVM, etc." sort of thing.


The example app is fine and all, but a real-life example with a site
that uses facets like this would really make it easier to get up and
running with a non-trivial installation.


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Oct 2, 2007, at 5:53 PM, Norberto Meijome wrote:


On Tue, 02 Oct 2007 15:26:33 -0700
Walter Underwood <[EMAIL PROTECTED]> wrote:

Here at Netflix, we switched over our site search to Solr two weeks ago.

We've seen zero problems with the server. We average 1.2 million
queries/day on a 250K item index. We're running four Solr servers
with simple round-robin HTTP load-sharing.


Hi Walter,
Would you mind sharing hardware specs, OS, index size, VM settings, and
OS-specific tunings?


unless that will be added to the wiki... :)

thanks in advance,
B

_
{Beto|Norberto|Numard} Meijome

"Have the courage to take your own thoughts
seriously, for they will shape you."
   Albert Einstein

I speak for myself, not my employer. Contents may be hot. Slippery  
when wet. Reading disclaimers makes you go blind. Writing them is  
worse. You have been Warned.
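
For a rough sense of scale, the figures Walter quotes in this thread (1.2 million queries/day over four servers) can be turned into average rates with a few lines of Python. This is back-of-the-envelope arithmetic only: it assumes traffic is spread evenly over the day and evenly across the servers, which real traffic is not, so peak rates would be higher.

```python
QUERIES_PER_DAY = 1_200_000   # figure quoted in the thread
SERVERS = 4                   # figure quoted in the thread
SECONDS_PER_DAY = 24 * 60 * 60

# Average load, assuming a perfectly even spread (an assumption, not a measurement)
avg_qps = QUERIES_PER_DAY / SECONDS_PER_DAY
avg_qps_per_server = avg_qps / SERVERS

print(f"average: {avg_qps:.1f} queries/s overall")
print(f"average: {avg_qps_per_server:.1f} queries/s per server")
```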






Seeing if an entry exists in an index for a set of terms


Hi.

I was wondering if there is an easy way to give Solr a list of things
and find out which have entries.



i.e., I pass it a list

Bill Clinton
George Bush
Mary Papas
(and possibly 20 others)

to a Solr index which contains news articles about presidents. I would
like a response saying


Bill Clinton was found in 20 records
George Bush was found in 15.

possibly with the links, but that's not too important.

I know I can do this by doing ~20 individual queries, but I thought
there may be a more efficient way.


Regards
Ian


Re: Seeing if an entry exists in an index for a set of terms

On 10/3/07, Ian Holsman <[EMAIL PROTECTED]> wrote:
> Hi.
>
> I was wondering if there is an easy way to give Solr a list of things
> and find out which have entries.
>
>
> i.e., I pass it a list
>
> Bill Clinton
> George Bush
> Mary Papas
> (and possibly 20 others)
>
> to a Solr index which contains news articles about presidents. I would
> like a response saying
>
> Bill Clinton was found in 20 records
> George Bush was found in 15.
>
> possibly with the links, but that's not too important.
>
> I know I can do this by doing ~20 individual queries, but I thought
> there may be a more efficient way.

How about
facet.query=Bill Clinton&facet.query=George Bush, etc.

That will give you counts, but not the links.

-Yonik
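
A request along the lines Yonik suggests might be assembled like the Python sketch below. Several details are assumptions rather than part of his reply: the select URL, the searched field name `text`, the `q=*:*` match-all query, and wrapping each name in double quotes so it is treated as a phrase. `rows=0` is used because only the counts are wanted. `urlencode` takes care of escaping the spaces and quotes.

```python
from typing import List
from urllib.parse import parse_qs, urlencode, urlsplit


def build_facet_url(base_url: str, names: List[str], field: str = "text") -> str:
    """Build one Solr request with a facet.query parameter per name.

    `field` is a hypothetical field name; substitute one from your schema.
    """
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    for name in names:
        # Double quotes make each name a phrase query (an assumption about intent)
        params.append(("facet.query", f'{field}:"{name}"'))
    return f"{base_url}?{urlencode(params)}"


names = ["Bill Clinton", "George Bush", "Mary Papas"]
url = build_facet_url("http://localhost:8983/solr/select", names)
print(url)

# Decoding the query string again shows one facet.query per name.
facet_queries = parse_qs(urlsplit(url).query)["facet.query"]
print(facet_queries)
```

The response then carries one count per `facet.query`, so names with a count of zero are the ones with no entries.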


Re: Seeing if an entry exists in an index for a set of terms



That will work.
Thanks Yonik.