Re: Problem with html code inside xml
Well... the XML output has changed and I receive hh sic! So the problem is not a problem...

Thanks Steve

On 3 Oct. 07 at 01:09, Chris Hostetter wrote:

: I created a field type:
:
: positionIncrementGap="100"> ...
:
: Everything works (the div tags, p tags are removed) but some
: nnn or tags are still in the text after indexing.

i cut/paste that fieldtype into the example schema.xml, and experimented with the analysis tool (http://localhost:8983/solr/admin/analysis.jsp) and both of those examples were correctly stripped. do you have a more specific example of something that doesn't work?

Hmm... it seems like maybe the problem is examples like this...

blahblahnnn

...if the tag is directly adjacent to other text, it may not get stripped off ... i'm not sure if that's specific to the HtmlWhitespaceTokenizer.

-Hoss
Re: Indexing HTML
Hi Erik, All,

I escaped the HTML text into entities before sending it to Solr, and indexing went fine. The problem now is that when I get back a snippet with highlighted text for the HTML field, it's not well formed, because the highlighting sometimes doesn't include the entire tag if one is present. For example:

− − ound-color: #FF; text-align: left; text-indent: 0px; line-height: normal ; margin-top: 0px; margin-ri − − /TR>> href="foobar">... @4:37)
> >
> > Or is that just a byproduct of how SOLR reports the errors back - always
> > escaping them?
> >
> > Thanks guys - I'll have another crack at this tonight.
> >
> > On 8/27/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> >>
> >> Michael,
> >>
> >> I think the issue is that you're not escaping the values.
> >> Send something like this to Solr instead:
> >>
> >> >> href="foobar">linktext
> >> a>
> >>
> >> Erik
> >>
> >> On Aug 27, 2007, at 9:29 AM, Michael Kimsal wrote:
> >>
> >>> Hello
> >>>
> >>> I'm trying to index individual lines of an HTML file, and I'm hitting this
> >>> error:
> >>>
> >>> TEXT must be immediately followed by END_TAG and not START_TAG
> >>>
> >>> I've got something that looks like
> >>>
> >>> 4
> >>> linktext >>> field>
> >>>
> >>> Actually, that sample code above, as its own data file POSTed to SOLR,
> >>> throws
> >>>
> >>> parser must be on START_TAG or TEXT to read text (position: START_TAG seen
> >>> ... ... @4:37
> >>>
> >>> as an error.
> >>>
> >>> Any clues as to how I can do this? I'd like to keep the original copy of
> >>> each line intact in the index.
> >>>
> >>> Thanks!
> >>>
> >>> --
> >>> Michael Kimsal
> >>> http://webdevradio.com
> >
> > --
> > Michael Kimsal
> > http://webdevradio.com
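Erik's suggestion in the thread above (escape the markup before it goes into the field value) can be sketched roughly as follows. This is a minimal Python sketch, not code from the thread: the field names "id" and "line" are hypothetical, since the original field names did not survive in the archive, and the add/doc/field wrapper assumes Solr's standard XML update message format.

```python
from xml.sax.saxutils import escape, quoteattr


def solr_add_xml(doc):
    """Build a Solr XML <add> message for a single document.

    Every field value is XML-escaped, so markup such as <a href="...">
    travels as character data (&lt;a href=...&gt;) rather than as nested
    tags that the update parser would reject.
    """
    fields = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
        for name, value in doc.items()
    )
    return "<add><doc>%s</doc></add>" % fields


# Hypothetical document: the field names "id" and "line" are assumptions.
message = solr_add_xml({"id": 4, "line": '<a href="foobar">linktext</a>'})
print(message)
```

With the value escaped this way, an XML parser sees the anchor markup as plain text inside the field, and reading the field back yields the original line unchanged.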
[no subject]
Hello All, I am interested in some of the joys, tribulations and processes of running a replicated Solr environment. Can anyone point me to any particular links, documents and/or personal experiences? Thanks, Eric Treece [EMAIL PROTECTED]
Re: Solr live at Netflix
Yes. Congratulations on your launch.

I'd love to see some sort of case study. I think SOLR could really benefit from a good "here's our schema, here's the site, this is the type of server/JVM, etc." sort of thing. The example app is fine and all, but a real-life example from a site that uses facets like this would really make it easier to get up and running with a non-trivial installation.

Matthew Runo
Zappos Development
[EMAIL PROTECTED]
702-943-7833

On Oct 2, 2007, at 5:53 PM, Norberto Meijome wrote:

On Tue, 02 Oct 2007 15:26:33 -0700 Walter Underwood <[EMAIL PROTECTED]> wrote:

Here at Netflix, we switched over our site search to Solr two weeks ago. We've seen zero problems with the server. We average 1.2 million queries/day on a 250K item index. We're running four Solr servers with simple round-robin HTTP load-sharing.

Hi Walter, would you mind sharing hardware specs, OS, index size, VM settings, OS-specific tunings? Unless that will be added to the wiki... :)

Thanks in advance,
B
_
{Beto|Norberto|Numard} Meijome

"Have the courage to take your own thoughts seriously, for they will shape you." Albert Einstein

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Seeing if an entry exists in an index for a set of terms
Hi.

I was wondering if there is an easy way to give Solr a list of things and find out which of them have entries.

I.e. I pass a list

Bill Clinton
George Bush
Mary Papas
(and possibly 20 others)

to a Solr index which contains news articles about presidents. I would like a response saying

Bill Clinton was found in 20 records
George Bush was found in 15

possibly with the links, but that's not too important.

I know I can do this by doing ~20 individual queries, but I thought there might be a more efficient way.

Regards
Ian
Re: Seeing if an entry exists in an index for a set of terms
On 10/3/07, Ian Holsman <[EMAIL PROTECTED]> wrote:
> Hi.
>
> I was wondering if there is an easy way to give Solr a list of things
> and find out which of them have entries.
>
> I.e. I pass a list
>
> Bill Clinton
> George Bush
> Mary Papas
> (and possibly 20 others)
>
> to a Solr index which contains news articles about presidents. I would
> like a response saying
>
> Bill Clinton was found in 20 records
> George Bush was found in 15
>
> possibly with the links, but that's not too important.
>
> I know I can do this by doing ~20 individual queries, but I thought
> there might be a more efficient way.

How about facet.query=Bill Clinton&facet.query=George Bush, etc.?

That will give you counts, but not the links.

-Yonik
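Yonik's suggestion of one facet.query parameter per name can be sketched as a single request. This is a rough Python sketch under stated assumptions rather than anything posted in the thread: the /solr/select path on localhost:8983, the q=*:* match-all query, rows=0, wt=json, and the quoting of each name as a phrase are all assumptions, as are the helper names.

```python
from urllib.parse import urlencode


def build_facet_count_url(base_url, phrases):
    """Return one Solr request URL that counts matches for every phrase."""
    params = [
        ("q", "*:*"),       # assumed match-all query; only the counts matter
        ("rows", "0"),      # do not return the documents themselves
        ("facet", "true"),
        ("wt", "json"),     # assumed response format
    ]
    for phrase in phrases:
        # Quote each name so it is searched as a phrase (an assumption about
        # the matching Ian wants; unquoted words are parsed as separate terms).
        params.append(("facet.query", '"%s"' % phrase))
    return base_url + "?" + urlencode(params)


def facet_counts(response):
    """Pull the per-query counts out of a parsed JSON response dict."""
    return dict(response["facet_counts"]["facet_queries"])


names = ["Bill Clinton", "George Bush", "Mary Papas"]
url = build_facet_count_url("http://localhost:8983/solr/select", names)
print(url)
```

Fetching that URL and passing the parsed JSON to facet_counts would give a mapping from each facet query string to its document count. As Yonik notes, this returns counts only; getting the matching documents for a name still needs a separate query.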
Re: Seeing if an entry exists in an index for a set of terms
Yonik Seeley wrote:

On 10/3/07, Ian Holsman <[EMAIL PROTECTED]> wrote:

Hi.

I was wondering if there is an easy way to give Solr a list of things and find out which of them have entries.

I.e. I pass a list

Bill Clinton
George Bush
Mary Papas
(and possibly 20 others)

to a Solr index which contains news articles about presidents. I would like a response saying

Bill Clinton was found in 20 records
George Bush was found in 15

possibly with the links, but that's not too important.

I know I can do this by doing ~20 individual queries, but I thought there might be a more efficient way.

How about facet.query=Bill Clinton&facet.query=George Bush, etc.?

That will give you counts, but not the links.

-Yonik

That will work. Thanks, Yonik.