RE: Highlighting externally stored text

2013-07-31 Thread JohnRodey
Just an update. The change was pretty straightforward (at least for my simple test case); just a few lines in the getBestFragments method seemed to do the trick. -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-externally-stored-text-tp4078387p4081748.html Sent from

RE: Highlighting externally stored text

2013-07-31 Thread JohnRodey
Hey Bryan, Thanks for the response! To make use of the FastVectorHighlighter you need to enable termVectors, termPositions, and termOffsets, correct? That takes a considerable amount of space, but it is good to know and I may pursue this solution as well. Just starting to look at the code

Luke's analysis of Trie Dates

2013-07-18 Thread JohnRodey
I have a TrieDateField dynamic field set up in my schema, pretty standard... In my code I only set one field, "creation_tdt", and I round it to the nearest second before storing it. However, when I analyze it with Luke I get: tdate IT--OF-- *_tdt (unstored field) 22404 -1 22404

Highlighting externally stored text

2013-07-16 Thread JohnRodey
Does anyone know if issue SOLR-1397 (It should be possible to highlight external text) is actively being worked on, by chance? Looks like the last update was May 2012. https://issues.apache.org/jira/browse/SOLR-1397 I'm trying to find a way to best highlight search results even though those results

Re: Benefits of Solr over Lucene?

2013-02-12 Thread JohnRodey
So I have had a fair amount of experience using Solr. However on a separate project we are considering just using Lucene directly, which I have never done. I am trying to avoid finding out late that Lucene doesn't offer what we need and being like "aw snap, it doesn't support geospatial" (or hig

Benefits of Solr over Lucene?

2013-02-12 Thread JohnRodey
I know that Solr web-enables a Lucene index, but I'm trying to figure out what other things Solr offers over Lucene. On the Solr features list it says "Solr uses the Lucene search library and extends it!", but what exactly are the extensions from the list and what did Lucene give you? Also if I h

Propagating accurate exceptions to the end user

2011-06-21 Thread JohnRodey
Solr 3.1 using SolrJ. So I have a GUI that allows folks to search my Solr repository and I want to show appropriate errors when something bad happens, but my problem is that the Solr exceptions are not very pretty and sometimes are not very descriptive. For instance if I enter a bad query the messag

Re: Hitting the URI limit, how to get around this?

2011-06-03 Thread JohnRodey
Yep that was my issue. And like Ken said on Tomcat I set maxHttpHeaderSize="65536".
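A rough sketch of why a long shard list can run into a header size limit. The shard names (nodeN:8983/solr/coreN), the host, and the 8192-byte figure are assumptions made up for illustration; 65536 is the value quoted in the message above. It only estimates the URL length and does not talk to a server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class ShardUrlLengthSketch {

    // Builds a comma-separated shards value from hypothetical host and core names.
    static String buildShardsParam(int shardCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < shardCount; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("node").append(i).append(":8983/solr/core").append(i);
        }
        return sb.toString();
    }

    // URL-encodes one parameter value; UTF-8 is guaranteed to exist on every JVM.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Builds the GET URL with each parameter value URL-encoded.
    static String buildSelectUrl(String shards) {
        return "http://host:8983/solr/select?q=" + encode("*:*")
                + "&shards=" + encode(shards);
    }

    // True if the URL alone stays under the given limit (encoded URL is ASCII, so chars == bytes).
    static boolean fitsWithin(String url, int maxHeaderBytes) {
        return url.length() < maxHeaderBytes;
    }

    public static void main(String[] args) {
        String url = buildSelectUrl(buildShardsParam(250));
        System.out.println("URL length: " + url.length());
        // 8192 is an assumed default limit; 65536 is the value from the thread.
        System.out.println("fits in 8192:  " + fitsWithin(url, 8192));
        System.out.println("fits in 65536: " + fitsWithin(url, 65536));
    }
}
```

Note that encoding inflates the list: every ":" , "/" and "," in the shards value becomes a three-character escape, so the encoded URL is noticeably longer than the raw shard list.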

Re: Hitting the URI limit, how to get around this?

2011-06-03 Thread JohnRodey
So here's what I'm seeing: I'm running Solr 3.1. I'm running a Java client that executes a Httpget (I tried HttpPost) with a large shard list. If I remove a few shards from my current list it returns fine; when I use my full shard list I get an "HTTP/1.1 400 Bad Request". If I execute it in Firefox

Re: Better to have lots of smaller cores or one really big core?

2011-06-03 Thread JohnRodey
Thanks Erick for the response. So my data structure is the same, i.e. they all use the same schema. Though I think it makes sense for us to somehow break apart the data, for example by the date it was indexed. I'm just trying to get a feel for how large we should aim to keep those (by day, by we

Better to have lots of smaller cores or one really big core?

2011-06-02 Thread JohnRodey
I am trying to decide what the right approach would be: to have one big core or many smaller cores hosted by a Solr instance. I think there may be trade-offs either way but wanted to see what others do. And by small I mean about 5-10 million documents; large may be 50 million. It seems like sma

Hitting the URI limit, how to get around this?

2011-06-02 Thread JohnRodey
I have a master Solr instance that I send my requests to; it hosts no documents, it just farms the request out to a large number of shards. All the other Solr instances that host the data contain multiple cores. Therefore my search string looks like "http://host:port/solr/select?...&shards=nodeA:123

Long list of shards breaks solrj query

2011-03-29 Thread JohnRodey
So I have a simple class that builds a SolrQuery and sets the "shards" param. I have a really long list of shards, over 250. My search seems to work until I get my shard list up to a certain length. As soon as I add one more shard I get: org.apache.commons.httpclient.HttpMethodDirector executeWi

Architecture question about solr sharding

2011-03-22 Thread JohnRodey
I have an issue and I'm wondering if there is an easy way around it with just SOLR. I have multiple SOLR servers and a field in my schema is a relative path to a binary file. Each SOLR server is responsible for a different subset of data that belongs to a different base path. For Example... My

General questions about distributed solr shards

2010-08-11 Thread JohnRodey
1) Is there any information on preferred maximum sizes for a single Solr index? I've read some people say 10 million, some say 80 million, etc... Is there any official recommendation or has anyone experimented with large datasets into the tens of billions? 2) Is there any downside to running m

RE: Re: Disable Solr Response Formatting

2010-06-30 Thread JohnRodey
Thanks! I was looking for things to change in the solrconfig.xml file. indent=off

Re: Disable Solr Response Formatting

2010-06-30 Thread JohnRodey
Oops, let me try that again... By default my SOLR response comes back formatted, like such Is there a way to tell it to return it unformatted? like:

Disable Solr Response Formatting

2010-06-30 Thread JohnRodey
By default my SOLR response comes back formatted, like such Is there a way to tell it to return it unformatted? like:

Can solr return pretty text as the content?

2010-06-23 Thread JohnRodey
When I feed pretty text into solr for indexing from lucene and search for it, the content is always returned as one long line of text. Is there a way for solr to return the pretty formatted text to me?

Re: Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread JohnRodey
I was assuming that I needed to leave the special characters in the http get, but running the solr admin it looks like it converts them the same way that URLEncoder.encode does. What is the need to preserve special characters? http://localhost:8983/solr/select?indent=on&version=2.2&q=%22mr.+bill

Re: Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread JohnRodey
Thanks Sean, that was exactly what I needed. One question though... How do I correctly retain the Solr-specific characters? I tried adding escape chars but URLEncoder doesn't seem to care about that: Example: String s1 = "\"mr. bill\" oh n?"; String s2 = "\\\"mr. bill\\\" oh n\\?"; String encoded1
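A small sketch of what happens to the two strings from the example above. The escapeLiteral helper and its character list are hypothetical and only for illustration (SolrJ ships ClientUtils.escapeQueryChars for real use). The point it tries to show: URLEncoder does not drop the backslashes, it percent-encodes them as %5C, so they should still reach the query parser once the server decodes the URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class EscapeThenEncodeSketch {

    // Assumed subset of query-syntax characters, chosen for illustration only.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;";

    // Hypothetical helper: puts a backslash before each character listed in SPECIAL.
    static String escapeLiteral(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // URL-encodes a parameter value; UTF-8 is guaranteed to exist on every JVM.
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String s1 = "\"mr. bill\" oh n?";  // quotes and ? left as query syntax
        String s2 = escapeLiteral(s1);     // quotes and ? escaped as literals
        System.out.println(encode(s1));    // %22mr.+bill%22+oh+n%3F
        System.out.println(encode(s2));    // %5C%22mr.+bill%5C%22+oh+n%5C%3F
    }
}
```

So the choice between s1 and s2 is about query semantics (syntax versus literal characters), not about URL encoding; either string can be encoded the same way.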

Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread JohnRodey
I would like to leverage whatever SOLR provides to properly url-encode a search string. For example a user enters: "mr. bill" oh no The URL submitted by the admin page is: http://localhost:8983/solr/select?indent=on&version=2.2&q=%22mr.+bill%22+oh+no&fq=&start=0&rows=10&fl=*%2Cscore&qt=standa
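For reference, a minimal sketch using only the JDK's java.net.URLEncoder, which produces the same q value as the admin page URL above for this input. The class and method names are made up for the example; whether SOLR offers its own helper for this is not something this sketch settles.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class QueryEncodingSketch {

    // URL-encodes a parameter value (form encoding: space becomes '+').
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Reverses encode(); shows that no characters are lost in the round trip.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String userInput = "\"mr. bill\" oh no";
        String q = encode(userInput);
        // Prints http://localhost:8983/solr/select?q=%22mr.+bill%22+oh+no
        System.out.println("http://localhost:8983/solr/select?q=" + q);
        // Prints true: decoding gives back exactly what the user typed.
        System.out.println(decode(q).equals(userInput));
    }
}
```

Only the parameter value is encoded here, not the whole URL; encoding the full URL would also escape the "?", "&" and "=" separators.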