Re: Query in Solr plugin across shards

2015-03-23 Thread Kevin Osborn
do the right thing" in terms of refreshing itself. > > FWIW, > Erick > > On Mon, Mar 23, 2015 at 9:23 AM, Kevin Osborn > wrote: > > I have created a PostFilter. PostFilter creates a DelegatingCollector, > > which provides a Lucene IndexSearcher. > > > > How

Query in Solr plugin across shards

2015-03-23 Thread Kevin Osborn
I have created a PostFilter. PostFilter creates a DelegatingCollector, which provides a Lucene IndexSearcher. However, I need to query for an object that may or may not be located on the shard that I am filtering on. Normally, I would do something like: searcher.search(new TermQuery(new Term("fi

Re: PostFilter does not seem to work across shards

2015-03-23 Thread Kevin Osborn
ering in, it works fine. I retrieve the object and do my intersections. But, on the other shard, I don't have my user document. So, I have nothing to intersect with. That is a separate issue that I need to figure out. On Mon, Mar 23, 2015 at 8:09 AM, Kevin Osborn wrote: > A little mor

Re: PostFilter does not seem to work across shards

2015-03-23 Thread Kevin Osborn
the filter is called for any documents on the second shard. On Fri, Mar 20, 2015 at 4:12 PM, Kevin Osborn wrote: > I developed a post filter. My documents to be filtered are on two > different shards. So, in a single-shard environment, > DelegatingCollector.doSetNextReader is called t

PostFilter does not seem to work across shards

2015-03-20 Thread Kevin Osborn
I developed a post filter. My documents to be filtered are on two different shards. So, in a single-shard environment, DelegatingCollector.doSetNextReader is called twice. And collect is called the correct number of times. Everything went well and I got my correct number of results back. So, I the

Re: copy field from boolean to int

2015-03-18 Thread Kevin Osborn
to match "true" and "false" and > > replace them with "0" and "1" ? > > > > > > : Date: Tue, 17 Mar 2015 17:57:03 -0700 > > : From: Kevin Osborn > > : Reply-To: solr-user@lucene.apache.org > > : To: solr-user@lucen

copy field from boolean to int

2015-03-17 Thread Kevin Osborn
I was hoping to use DocValues, but one of my fields is a boolean, which is not currently supported by DocValues. I can use a copyField to convert my boolean to a string. Is there any way to use a copyField to convert from a boolean to a tint?

Re: get Multi-Valued field data from DocValues

2015-03-13 Thread Kevin Osborn
RDS).map(b => NumericUtils.prefixCodedToLong(docValues.lookupOrd(b))).toSet Basically, we set the document, then iterate through the Ords. And then convert the BytesRef to a long. On Fri, Mar 13, 2015 at 2:33 PM, Kevin Osborn wrote: > getSortedNumeric throws the following exception: > > unexpected docvalu

Re: get Multi-Valued field data from DocValues

2015-03-13 Thread Kevin Osborn
getSortedNumeric throws the following exception: unexpected docvalues type SORTED_SET for field 'space_list' (expected one of [SORTED_NUMERIC, NUMERIC]). Use UninvertingReader or index with docvalues. If I am reading the documentation correctly, getSortedNumeric sorts the values, but it is still f

get Multi-Valued field data from DocValues

2015-03-13 Thread Kevin Osborn
If I am finding the values of a long field for a single numeric field, I just do: DocValues.getNumeric(context.reader(), "myField").get(docNumber). This returns the value of the field and everything is good. However, my field is a multi-valued long field. So, I need to do: DocValues.getSortedSet(

Re: Solr Cloud hangs when replicating updates

2013-09-06 Thread Kevin Osborn
ck suggestion would at least tell us if it's > the issue we think it is. > > FWIW, > Erick > > > On Wed, Sep 4, 2013 at 12:51 PM, Mark Miller > wrote: > > > It would be great if you could give this patch a try: > > http://pastebin.com/raw.php?i=aaRWwSGP

Re: Solr Cloud hangs when replicating updates

2013-09-04 Thread Kevin Osborn
> On Sep 3, 2013, at 2:15 PM, Kevin Osborn wrote: > > > I was having problems updating SolrCloud with a large batch of records. > The > > records are coming in bursts with lulls between updates. > > > > At first, I just tried large updates of 100,000 records at a t

Re: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Kevin Osborn
help me understand where to look next. > 3) It seems all threads in this state are waiting for > "0x0007216e68d8", is there a way to tell what "0x0007216e68d8" is? > 4) Is there a limit to how many updates you can do in SolrCloud? > 5) Wild-ass-theory: would more shards provide more locks (whatever they > are) on update, and thus more update throughput? > > To those interested, I've provided a stacktrace of 1 of 3 nodes at this > URL in gzipped form: > https://s3.amazonaws.com/timvaillancourt.com/tmp/solr-jstack-2013-08-23.gz > > Any help/suggestions/ideas on this issue, big or small, would be much > appreciated. > > Thanks so much all! > > Tim Vaillancourt > -- *KEVIN OSBORN* LEAD SOFTWARE ENGINEER CNET Content Solutions OFFICE 949.399.8714 CELL 949.310.4677 SKYPE osbornk 5 Park Plaza, Suite 600, Irvine, CA 92614 [image: CNET Content Solutions]

Solr Cloud hangs when replicating updates

2013-09-03 Thread Kevin Osborn
ection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) It basically appears that Solr gets stuck while trying to acquire a semaphore that never becomes available. Anyone have any ideas? This is definitely causing major problems for us.

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Kevin Osborn
I may have a bit of good news. The ulimit of open files was set to 4096. I just chose a random high limit (10) and it seems to be working better now. I still have more testing to do though, but the initial results are hopeful. On Wed, Aug 14, 2013 at 4:22 PM, Kevin Osborn wrote

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Kevin Osborn
, Aug 14, 2013 at 9:51 AM, Kevin Osborn wrote: > Thanks so much for your help and for the explanations. Eventually, we will > be doing several batches in parallel. But at least now I know where to look > and can do some testing on various scenarios. > > Since we may be doing a lot of

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Kevin Osborn
> > with the same frequency, but it still isn't free. It's still > > good to have the soft commit interval as long as you > > can tolerate. > > > > It's perfectly reasonable to have a hard commit interval > > that's much shorter than your soft commit int

Re: Indexing hangs when more than 1 server in a cluster

2013-08-13 Thread Kevin Osborn
logging and how it's managed when your hard commit occurs. > > If you can give that a try and let us know how that fares we might have > some further input to share. > > > On Aug 13, 2013, at 11:54 AM, Kevin Osborn wrote: > > > I am using Solr Cloud 4.4. It i

Indexing hangs when more than 1 server in a cluster

2013-08-13 Thread Kevin Osborn
server as well. I had also noticed similar behavior before with Solr 4.3. It definitely has something to do with the clustering, but I am not sure what. And I don't see any error message (or really anything else) in the Solr logs. Thanks.

Re: How to improve the Solr "OR" query performance

2013-07-03 Thread Kevin Osborn
struct an AND-based query that hits about as many > documents as your slow OR query. > > With an index size of just 9GB, I am surprised that you use sharding. > Have you tried using just a single instance to avoid the merge-overhead? > > - Toke Eskildsen, State and University Li

Re: Normalizing/Returning solr scores between 0 to 1

2013-06-27 Thread Kevin Osborn
to-1-tp4073797.html > Sent from the Solr - User mailing list archive at Nabble.com. >

Re: Is it possible to search Solr with a longer query string?

2013-06-25 Thread Kevin Osborn
solrj.SolrServerException: Server at > http://127.0.0.1:8087/solr430/medline returned > non ok status:400, message:Bad Request >QueryResponse solrRes = solrServer.query( new SolrQuery( q ) ); >long found = solrRes.getResults().getNumFoun

Re: how to replicate Solr Cloud

2013-06-25 Thread Kevin Osborn
just a slave that could be a repeater. > > You say that sending the data in both directions is not ideal, but it works > and is conceptually very simple. What is the reasoning behind wanting to > get away from that approach? > > Jason > > On Jun 25, 2013, at 10:07 AM, Kevin

Re: how to replicate Solr Cloud

2013-06-25 Thread Kevin Osborn
formance Monitoring -- http://sematext.com/spm > > > > On Tue, Jun 25, 2013 at 1:07 PM, Kevin Osborn > wrote: > > We are going to have two datacenters, each with their own SolrCloud and > > ZooKeeper quorums. The end result will be that they should be replicas of >

how to replicate Solr Cloud

2013-06-25 Thread Kevin Osborn
use replication to push or pull data from one datacenter to another? In my case, NRT is not a requirement. And I will also be dealing with about 3 collections and 5 or 6 shards. Thanks.

Setup SolrCloud for multiple datacenters

2013-06-24 Thread Kevin Osborn
collections separately in each data center. And then use Solr's replication update handler to transfer data from one datacenter to another? Is there any other suggested method that we should investigate?

load balancing internal Solr on Azure

2013-05-24 Thread Kevin Osborn
hanks.

Using alternate Solr index location for SolrCloud

2013-05-22 Thread Kevin Osborn
e instance directory. I see this is available for Core Admin, but I don't see it for the Collections API itself. Or failing that, solr.xml would be better. Does anyone have any suggestions? Thanks.

Re: how to quickly export data from SolrCloud

2013-05-06 Thread Kevin Osborn
10:48 AM, Kevin Osborn wrote: > >> I am looking to export a large amount of data from Solr. This export will >> be done by a Java application and then written to file. Initially, I was >> thinking of using direct HTTP calls and using the CSV response writer. And >> then m

how to quickly export data from SolrCloud

2013-05-06 Thread Kevin Osborn
, with SolrCloud, I prefer to use SolrJ due to its communication with Zookeeper. Is there any way to use the CSV response writer with SolrJ? Would the overhead of using SolrJ's "solrbin" format be much slower than the CSV response writer?

Re: SolrJ CloudSolrServer throws ClassCastException

2012-10-24 Thread Kevin Osborn
or alpha to 4 at some > point? > > - Mark > > On Wed, Oct 24, 2012 at 1:14 AM, Kevin Osborn > wrote: > > It looks like this is where the problem lies. Here is the JSON that SolrJ > > is receiving from Zookeeper: > > > > "data":"{\\"manuf

Re: SolrJ CloudSolrServer throws ClassCastException

2012-10-23 Thread Kevin Osborn
ection\\":\\"manufacturer\\",\\n \\"node_name\\":\\"myhost:5275_solr\\",\\n \\"base_url\\":\\" http://myhost:5275/solr\\",\\n \\"leader\\":\\"true\\"}"}},{"data":{ Where SolrJ is expecting th

SolrJ CloudSolrServer throws ClassCastException

2012-10-23 Thread Kevin Osborn
so tried an external Zookeeper with the same results.

Constant Score queries in functions

2012-03-29 Thread Kevin Osborn
is not a range, but that is the type of scoring I want. I suspect that would make it much faster, but am not sure if that is possible. Thanks.

performance between ExternalFileField and Join

2012-02-27 Thread Kevin Osborn
would have the access information. So, the number of unique terms on the key would be quite high. Would this be too slow? If someone has any knowledge about the performance issues on these two methods, please advise. Thanks.

DataImportHandler using new connection on each query

2011-08-17 Thread Kevin Osborn
I have a data import handler that is importing data in full mode from SQL Server. It has one main entity and three sub-entities. Against a good database, it appears to open 4 connections total. One for the main query and the other 3 subqueries just re-use their connections. This works well enoug

Re: unique terms and multi-valued fields

2011-08-11 Thread Kevin Osborn
ably don't care much because this data is only referenced when you assemble a document for return to the client, it's irrelevant for searching. Best Erick On Tue, Aug 9, 2011 at 8:02 PM, Kevin Osborn wrote: > Please verify my understanding. I have a field called "category" a

unique terms and multi-valued fields

2011-08-09 Thread Kevin Osborn
Please verify my understanding. I have a field called "category" and it has a value "computers". If I use this same field and value for all of my documents, it is really only stored on disk once because "category:computers" is a unique term. Is this correct? But, what about multi-valued fields.

sending results of function query to range query

2011-06-16 Thread Kevin Osborn
I am not sure if I can use function queries this way. I have a query like this: "attributeX:[* TO ?]" in my DB. I replace the ? with input from the front end. Obviously, this works fine. However, what I really want to do is "attributeX:[* TO (3 * ?)]". Is there any way to embed the results of a func

problems with custom SolrCache.init() - fails on startup

2010-12-01 Thread Kevin Osborn
My project has a couple custom caches that descend from FastLRUCache. These worked fine in Solr 1.3. Then I started migrating my project to Solr 1.4.1 and had problems during startup. I believe the problem is that I attempt to access the core in the init process. I currently use the deprecated

calling other core from request handler

2010-07-22 Thread Kevin Osborn
I have a multi-core environment and a custom request handler. However, I have one place where I would like my request handler on coreA to query coreB. This is not distributed search. This is just an independent query to get some additional data. I am also guaranteed that each server wi

caching on unique queries

2010-05-19 Thread Kevin Osborn
Pretty much every one of my queries is going to be unique. However, the query is fairly complex and also contains both unique and non-unique data. In the query, some fields will be unique (e.g. description), but other fields will be fairly common (e.g. category). If we could use those common fiel

Re: How to tell which field matched?

2010-05-17 Thread Kevin Osborn
In our case, we had specific matching that we needed to return, so I can't really contribute this to the code base, but we did get this working. Basically, we have a custom request handler. After it receives the search results, we then send this to our matcher algorithm. We then go through each

Re: LucidWorks Solr

2010-03-16 Thread Kevin Osborn
Stemmer, Porter, etc)? I'm unsure of which stemmer would work best. Thanks again! Kevin Osborn-2 wrote: > > I used it mostly for KStemmer, but I also liked the fact that it included > about a dozen or so stable patches since Solr 1.4 was released. We just > use the included WAR in

Re: LucidWorks Solr

2010-03-16 Thread Kevin Osborn
I used it mostly for KStemmer, but I also liked the fact that it included about a dozen or so stable patches since Solr 1.4 was released. We just use the included WAR in our project however. We don't use the installer or anything like that. From: blargy To:

DataInputHandlers and dynamic fields

2010-03-08 Thread Kevin Osborn
If my query were something like this: "select col1, col2 from table", my dynamic field would be something like "fld_${col1}". But I could not find any information on how to setup the DIH with dynamic fields. I saw that dynamic fields should be supported with SOLR-742, but am not sure how to proc

Re: Logging in Embedded SolrServer - What a nightmare.

2010-03-02 Thread Kevin Osborn
Not sure if it will solve your specific problem. We use Solr as a WAR as well as Solrj. So the main solr distribution comes with slf4j-jdk-1.5.5.jar. I just deleted that and replaced it with slf4j-log4j12-1.5.5.jar. And then it used my existing log4j.properties file. ___

Re: SOLR Multivalued field and length norm

2010-03-01 Thread Kevin Osborn
I too wish it worked this way, but it doesn't. I believe that this all takes place within Lucene, so there is no concept of single values or multi-valued fields. They are all just terms. The same is true with term frequency. In my case, I set omitNorms=true and then created a custom similarity

Re: filter result by catalog

2010-02-23 Thread Kevin Osborn
Like you, all of my research has come to the conclusion of "it depends". For this particular product, we have an index of a million documents or so. And each document can belong to many catalogs. Initially, it will be a small number, but there could be up to 200 or so catalogs (probably much les

Re: filter result by catalog

2010-02-19 Thread Kevin Osborn
proach described there? > > >Otis >Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch >Hadoop ecosystem search :: http://search-hadoop.com/ > > > >- Original Message >> From: Kevin Osborn >> To: Solr >> Sent: Fri, February 19, 2010 6:06:51

filter result by catalog

2010-02-19 Thread Kevin Osborn
So, I am looking at better ways to filter a resultset by catalog. So, I have an index of products. And based on the user, I want to filter the search results to what they are allowed to see. I will probably have up to 200 or so different catalogs.

Re: parsing strings into phrase queries

2010-02-18 Thread Kevin Osborn
The PositionFilter worked great for my purpose along with another filter that I built. In my case, my indexed data may be something like "X150". So, a query for "Nokia X150" should match. But I don't want random matches on "x". However, if my indexed data is "G7", I do want a query on "PowerSho

Re: cannot match on phrase queries

2010-02-16 Thread Kevin Osborn
Erick On Fri, Feb 12, 2010 at 8:28 PM, Kevin Osborn wrote: > I am seeing this in several of my fields. I have something like "Samsung > X150" or "Nokia BH-212". And my query will not match on X150 or BH-212. > > So, my query is something like +model:(Samsung X150).

parsing strings into phrase queries

2010-02-12 Thread Kevin Osborn
Right now if I have the query model:(Nokia BH-212V), the parser turns this into +(model:nokia model:"bh 212 v"). The problem is that I might have a model called Nokia BH-212, so this is completely missed. In my case, I would like my query to be +(model:nokia model:bh model:212 model:v). This is
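The query shape asked for above can be illustrated with a small client-side sketch. This is only an assumption about one way to get there: it builds the query string before it reaches Solr rather than changing how the Solr query parser or analyzers behave. The class and method names (`ModelQueryBuilder`, `tokens`, `build`) are hypothetical, and the splitting rule (runs of letters or runs of digits, lowercased) is a simplification chosen to reproduce the example in the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Solr or Lucene.
final class ModelQueryBuilder {
    // A token is a run of ASCII letters or a run of digits; everything else is a separator.
    private static final Pattern RUN = Pattern.compile("[A-Za-z]+|[0-9]+");

    private ModelQueryBuilder() {}

    /** Splits the input on punctuation and on letter/digit boundaries, lowercasing each piece. */
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            out.add(m.group().toLowerCase(Locale.ROOT));
        }
        return out;
    }

    /** Builds a required group of independent term clauses instead of a phrase. */
    static String build(String field, String input) {
        StringBuilder sb = new StringBuilder("+(");
        for (String token : tokens(input)) {
            if (sb.length() > 2) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(token);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(build("model", "Nokia BH-212V"));
    }
}
```

Because every clause is optional inside the group, a document indexed as "Nokia BH-212" can still match on nokia, bh, and 212 even though it has no "v" term.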

Re: cannot match on phrase queries

2010-02-12 Thread Kevin Osborn
It appears that omitTermFreqAndPositions is indeed the culprit. I assume it has to do with the fact that the index parsing of BH-212 puts multiple terms in the same position. From: Kevin Osborn To: Solr Sent: Fri, February 12, 2010 5:28:08 PM Subject

cannot match on phrase queries

2010-02-12 Thread Kevin Osborn
I am seeing this in several of my fields. I have something like "Samsung X150" or "Nokia BH-212". And my query will not match on X150 or BH-212. So, my query is something like +model:(Samsung X150). Through debugQuery, I see that this gets converted to +(model:samsung model:"x 150"). It matches

Re: LongField not stripping leading zeros

2010-01-12 Thread Kevin Osborn
Thanks. Is there any performance penalty vs. LongField? I don't need to do any range queries on these value. I am basically treating them as numerical strings. I thought it would just be a shortcut to strip leading zeros, which I can easily do on my own. From
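Stripping the leading zeros by hand, as mentioned above, could look roughly like the following. This is a minimal sketch, assuming the same normalization is applied both to values before indexing and to user input before querying; the class and method names are hypothetical.

```java
// Hypothetical helper: not part of Solr.
final class PartNumberNormalizer {
    private PartNumberNormalizer() {}

    /** Strips leading zeros from a digit string; an all-zero input is reduced to "0". */
    static String stripLeadingZeros(String value) {
        String trimmed = value.trim();
        int i = 0;
        // Stop one character before the end so that "000" keeps a single "0".
        while (i < trimmed.length() - 1 && trimmed.charAt(i) == '0') {
            i++;
        }
        return trimmed.substring(i);
    }

    public static void main(String[] args) {
        System.out.println(stripLeadingZeros("0088698183939"));
        System.out.println(stripLeadingZeros("88698183939"));
    }
}
```

With both sides normalized, the value can be indexed as a plain string field, and a search with no leading zeros or several leading zeros resolves to the same term.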

LongField not stripping leading zeros

2010-01-12 Thread Kevin Osborn
This is in Solr 1.3. I have some text in our database in the form 0088698183939. The leading zeros are useless, but I want to be able to search it with no leading zeros or several leading zeros. So, I decided to index this as a long, expecting it to just store it as a number. But, instead, I see t

how to get highlighter to only show matched term

2009-08-24 Thread Kevin Osborn
If my query is something like manufacturer:IBM OR productTitle:Thinkpad, I actually just want to print out "IBM" or "Thinkpad" in any of the highlighted fields. I don't want to parse through any HTML or other text. Basically, I just want to know which of the terms in my query matched and in which

Re: multiple languages in result set

2009-07-28 Thread Kevin Osborn
BTW, the search will always be in a single language. From: Kevin Osborn To: Solr Sent: Tuesday, July 28, 2009 12:23:43 PM Subject: multiple languages in result set As of Solr 1.3, it looks like my choices for searching in multiple languages are either one

multiple languages in result set

2009-07-28 Thread Kevin Osborn
As of Solr 1.3, it looks like my choices for searching in multiple languages are either one language per core or using different fields per language (productTitle_en, productTitle_de, etc.). However, I may want my results back in multiple languages as well. For example, I could search for a term

Re: query on part number not matching

2009-04-21 Thread Kevin Osborn
r not matching On Mon, Apr 20, 2009 at 8:50 PM, Kevin Osborn wrote: > Looks like the format didn't come through in the email. ch, vxrch, and > cisco7204xvrch are all in position 4. Ah... the traditional way to "handle" that case is to use a little slop with the phrase query. -Yonik

Re: query on part number not matching

2009-04-20 Thread Kevin Osborn
not matching On Mon, Apr 20, 2009 at 6:59 PM, Kevin Osborn wrote: > > I have a manufacturer part number: CISCO7204VXR-CH. The indexer produces: > > 12 3 4 > cisco7204vxrch >vxrch >cisco7204vxrch

query on part number not matching

2009-04-20 Thread Kevin Osborn
I have a manufacturer part number: CISCO7204VXR-CH. The indexer produces: 12 3 4 cisco7204vxrch vxrch cisco7204vxrch If I query on CISCO7204VXR-CH, I get: 12 3 4 cisco7204vxrch Everyt

Re: logging

2009-04-10 Thread Kevin Osborn
, 2009, at 4:56 PM, Kevin Osborn wrote: > We built our own webapp that used the Solr JARs. We used Apache Commons/log4j > logging and just put log4j.properties in the Resin conf directory. The > commons-logging and log4j jars were put in the Resin lib directory. > Everything worked grea

logging

2009-04-09 Thread Kevin Osborn
We built our own webapp that used the Solr JARs. We used Apache Commons/log4j logging and just put log4j.properties in the Resin conf directory. The commons-logging and log4j jars were put in the Resin lib directory. Everything worked great and we got log files for our code only. So, I upgraded

Re: newSearcher doesn't fire

2009-04-02 Thread Kevin Osborn
ation.com On Thu, Apr 2, 2009 at 5:25 PM, Kevin Osborn wrote: > I am trying to figure this out. I have a firstSearcher and a newSearcher > event. They are almost identical. Upon startup, I see all the firstSearcher > events in my log. I also see log events for Added SolrEventLis

newSearcher doesn't fire

2009-04-02 Thread Kevin Osborn
I am trying to figure this out. I have a firstSearcher and a newSearcher event. They are almost identical. Upon startup, I see all the firstSearcher events in my log. I also see log events for Added SolrEventListener for both firstSearcher and newSearcher. Next, I push out a new index. I see th

negative boosts

2008-12-12 Thread Kevin Osborn
My index has a category field and I would like to apply a negative boost to certain categories. For example, if I search for "thinkpad", it should push results for the laptop bag and other accessory categories to the bottom. So, I first tried altering the bq field with category:(batteries bags

SynonymFilter and inch/foot symbols

2008-09-19 Thread Kevin Osborn
How would I handle a search for 21" or 3'? The " and ' symbols appear to get stripped away by Lucene before passing the query off to the analyzers. Here is my analyzer in the schema.xml: I could certainly replace X" with X inch using regex in my custom request handler. B
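The regex rewrite mentioned above (replacing X" with X inch before the query reaches the parser) might be sketched like this. The class name, the choice of "inch" and "foot" as replacement words, and the rule that the symbol must directly follow a number are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing step for a custom request handler.
final class UnitSymbolRewriter {
    // A number (optionally with a decimal part) immediately followed by " or '.
    private static final Pattern INCHES = Pattern.compile("(\\d+(?:\\.\\d+)?)\"");
    private static final Pattern FEET = Pattern.compile("(\\d+(?:\\.\\d+)?)'");

    private UnitSymbolRewriter() {}

    /** Replaces 21" with "21 inch" and 3' with "3 foot"; other text is left untouched. */
    static String rewrite(String query) {
        String withInches = INCHES.matcher(query).replaceAll("$1 inch");
        return FEET.matcher(withInches).replaceAll("$1 foot");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("21\" monitor"));
        System.out.println(rewrite("3' cable"));
    }
}
```

One known limitation of this sketch: a quoted phrase that ends in a digit, such as "thinkpad 21", would have its closing quote rewritten as well, so a real handler would need to tell phrase quotes apart from unit symbols.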

Re: Less aggressive stemmer?

2008-08-21 Thread Kevin Osborn
We had similar problems and then switched to KStem and have been pretty happy with the results. http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi - Original Message From: Jason Rennie <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, August 21, 2008 2:23:36 PM

Re: QueryResultsCache and DocSet filter

2008-08-14 Thread Kevin Osborn
, 2008 1:41:50 PM Subject: Re: QueryResultsCache and DocSet filter On Thu, Aug 14, 2008 at 3:15 PM, Kevin Osborn <[EMAIL PROTECTED]> wrote: > The problem here is that the calls in SolrIndexSearcher don't appear to use > the QueryResultsCache if the filter is a DocSet rather th

QueryResultsCache and DocSet filter

2008-08-14 Thread Kevin Osborn
We have a bunch of user caches that return DocSet objects. So, we intersect them and send a DocSet filter and the actual query to getDocListAndSet or getDocList. The problem here is that the calls in SolrIndexSearcher don't appear to use the QueryResultsCache if the filter is a DocSet rather than

Re: Fetching float or int fields from index by Lucene document

2008-07-02 Thread Kevin Osborn
ee the methods on FieldType, esp toExternal() -Yonik On Wed, Jul 2, 2008 at 5:39 PM, Kevin Osborn <[EMAIL PROTECTED]> wrote: > As part of my results, I am building a lot of facet information. For example, > an Attribute ID also needs to return the Attribute Text. > > So, I have

Fetching float or int fields from index by Lucene document

2008-07-02 Thread Kevin Osborn
As part of my results, I am building a lot of facet information. For example, an Attribute ID also needs to return the Attribute Text. So, I have code like the following (really in a cache): Term term = new Term ("AtrID", "A0001"); Document doc = searcher.doc(searcher.getFirstMatch(term)); retu

Re: DocSet to BitSet

2008-05-22 Thread Kevin Osborn
In v1.3, it is public. In v1.2, it is still protected. - Original Message From: Chris Hostetter <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, May 22, 2008 1:50:22 PM Subject: Re: DocSet to BitSet : That is more or less what I did. Once I found that function, it ju

Re: DocSet to BitSet

2008-05-22 Thread Kevin Osborn
That is more or less what I did. Once I found that function, it just took a small patch to expose that functionality, and then the problem was solved. - Original Message From: Chris Hostetter <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, May 22, 2008 12:32:56 PM Su

Re: DocSet to BitSet

2008-05-20 Thread Kevin Osborn
One of the primary reasons that I was doing it this way is because I am sending several filters: one is a big docset and the others are BooleanQuery objects (products in stock, etc.). Since the interface for SolrIndexSearcher.getDocListAndSet supports only (Query, DocSet,...) or (Query, List,...),

DocSet to BitSet

2008-05-15 Thread Kevin Osborn
I have a custom query object that extends ConstantScoreQuery. I give it a key which pulls some documents out of a cache. Thinking to make it more efficient, I used DocSet, backed by OpenBitSet or OpenHashSet. However, I need to set the BitSet object for the Lucene filter. Any idea on how to bes

Re: How do I use KStem with Solr?

2008-05-07 Thread Kevin Osborn
There is nothing super special that you need to do to get KStem compiled. However, you will need the Solr JAR file on your classpath when you compile KStem. You can do this on command-line, ANT, Eclipse, etc. This will produce the class files. It will also be the easiest to use if you put this

Re: complex queries

2008-05-06 Thread Kevin Osborn
- Original Message From: Erik Hatcher <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Tuesday, May 6, 2008 6:03:34 PM Subject: Re: complex queries On May 6, 2008, at 8:57 PM, Kevin Osborn wrote: > I don't think this is possible, but I figure that I would ask. >

complex queries

2008-05-06 Thread Kevin Osborn
I don't think this is possible, but I figure that I would ask. So, I want to find documents that match a search term and where a field in those documents is also in the results of a subquery. Basically, I am looking for the Solr equivalent of doing a SQL IN clause. As I said, I don't think it

Re: access control list

2008-05-01 Thread Kevin Osborn
I thought of that method. The problem I was thinking of is that if a new customer is added, that could potentially cause an update of about 2,000,000 records or so. Fortunately, this does not happen every day. It also makes indexing a little difficult because I now have to check permissions on eac

access control list

2008-04-30 Thread Kevin Osborn
I have an index of about 3,000,000 products and about 8500 customers. Each customer has access to about 50 to about 500,000 of the products. Our current method was using a bitset in the filter. So, for each customer, they have a bitset in the cache. For each docId that they have access to, the
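The per-customer bitset idea described above can be sketched with the JDK's own `java.util.BitSet`. This is an illustration under assumptions, not the poster's code: the original used Lucene/Solr filter classes, while the class and method names below (`CustomerAccessCache`, `grant`, `filter`) are hypothetical and the "search matches" are just another bitset of docIds.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-customer filter cache.
final class CustomerAccessCache {
    // One bitset per customer; bit N is set when the customer may see docId N.
    private final Map<String, BitSet> allowedDocs = new HashMap<>();

    /** Records that the customer may see the given docId. */
    void grant(String customerId, int docId) {
        allowedDocs.computeIfAbsent(customerId, k -> new BitSet()).set(docId);
    }

    /** Returns the subset of matching docIds the customer may see, leaving the input unchanged. */
    BitSet filter(String customerId, BitSet matches) {
        BitSet visible = (BitSet) matches.clone();
        BitSet allowed = allowedDocs.get(customerId);
        if (allowed == null) {
            visible.clear();       // unknown customer: nothing is visible
        } else {
            visible.and(allowed);  // intersection of matches and permissions
        }
        return visible;
    }

    public static void main(String[] args) {
        CustomerAccessCache cache = new CustomerAccessCache();
        cache.grant("customerA", 1);
        cache.grant("customerA", 3);
        BitSet matches = new BitSet();
        matches.set(1);
        matches.set(2);
        matches.set(3);
        System.out.println(cache.filter("customerA", matches));
    }
}
```

The memory cost is the trade-off to watch: an uncompressed bitset over 3,000,000 docIds is roughly 375 KB, so 8500 customers would need on the order of 3 GB if every customer's bitset were cached at once.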

Re: IOException: read past EOF during optimize phase

2008-01-16 Thread Kevin Osborn
PM Subject: Re: IOException: read past EOF during optimize phase This may be a Lucene bug... IIRC, I saw at least one other lucene user with a similar stack trace. I think the latest lucene version (2.3 dev) should fix it if that's the case. -Yonik On Jan 16, 2008 3:07 PM, Kevin Osborn

Re: IOException: read past EOF during optimize phase

2008-01-16 Thread Kevin Osborn
-slave setup. This will separate your indexing from searching. Don't have the URL, but it's on zee Wiki. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message ---- From: Kevin Osborn <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent:

Re: IOException: read past EOF during optimize phase

2008-01-16 Thread Kevin Osborn
s -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message ---- From: Kevin Osborn <[EMAIL PROTECTED]> To: Solr Sent: Wednesday, January 16, 2008 3:07:23 PM Subject: IOException: read past EOF during optimize phase I am using the embedded Solr API for my indexing

IOException: read past EOF during optimize phase

2008-01-16 Thread Kevin Osborn
I am using the embedded Solr API for my indexing process. I created a brand new index with my application without any problem. I then ran my indexer in incremental mode. This process copies the working index to a temporary Solr location, adds/updates any records, optimizes the index, and then co

Re: Embedded about 50% faster for indexing

2007-08-27 Thread Kevin Osborn
At 10,000 documents per post, I was actually finding that embedded Solr was providing a significant performance boost. It has been a while since I did any comparisons, but it was probably on the order of 40% or so. - Original Message From: climbingrose <[EMAIL PROTECTED]> To: solr-user@

Re: facet query counts

2007-06-14 Thread Kevin Osborn
is exactly the same as 39f -Yonik On 6/14/07, Kevin Osborn <[EMAIL PROTECTED]> wrote: > I have a large subset (47640) of my total index. Most of them (45335) have a > single field, which we will call Field1. Field1 is a sfloat. > > If my query restricts the resultset

facet query counts

2007-06-14 Thread Kevin Osborn
I have a large subset (47640) of my total index. Most of them (45335) have a single field, which we will call Field1. Field1 is a sfloat. If my query restricts the resultset to my subset and I do a facet count on Field1, then the number of records returned is 47640. And if I sum up the facet co

Re: field display values

2007-05-25 Thread Kevin Osborn
I had a similar issue with a heavy use of dynamic fields. You first want to get those spaces out of there. Lucene does not like spaces in field names. So, I just replaced the space with a rarely used character (ASCII 8 or something like that). I did this in my indexing. And then I just translate
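The space-replacement trick described above might look like the following. This is a minimal sketch: the post only says "ASCII 8 or something like that", so the placeholder character here is an assumption, as are the class and method names.

```java
// Hypothetical helper applied at indexing time (encode) and at display time (decode).
final class FieldNameCodec {
    // Assumed placeholder: ASCII 8 (backspace), which should not occur in real field names.
    static final char PLACEHOLDER = '\u0008';

    private FieldNameCodec() {}

    /** Converts a display name into a field name without spaces. */
    static String encode(String displayName) {
        if (displayName.indexOf(PLACEHOLDER) >= 0) {
            throw new IllegalArgumentException("display name already contains the placeholder");
        }
        return displayName.replace(' ', PLACEHOLDER);
    }

    /** Restores the original display name from an indexed field name. */
    static String decode(String indexedName) {
        return indexedName.replace(PLACEHOLDER, ' ');
    }

    public static void main(String[] args) {
        String indexed = encode("s_Screen Size");
        System.out.println(decode(indexed));
    }
}
```

Unlike replacing spaces with an underscore, a character that never appears in legitimate names keeps the mapping reversible, so an underscore that is genuinely part of a field name survives the round trip.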

Re: Alphabetical Facets

2007-05-11 Thread Kevin Osborn
I don't have any pointers, but I would love to have this feature. - Original Message From: Ryan McKinley <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Friday, May 11, 2007 9:23:02 AM Subject: Alphabetical Facets Has anyone given any thought to alphabetical faceting? I'd like

Re: listing dynamic field in search results

2007-04-12 Thread Kevin Osborn
range. I couldn't find this information in the Lucene documentation. - Original Message From: Yonik Seeley <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Sent: Thursday, April 12, 2007 9:56:18 AM Subject: Re: listing dynamic field in search results On 4/12/07, Kevin Osborn <[E

Re: listing dynamic field in search results

2007-04-12 Thread Kevin Osborn
I discovered the issue. Some of my dynamic field names have spaces in them. If I replace ' ' with '_', then it works fine. This really isn't a great solution, since a '_' could be a legitimate part of the field name. I tried enclosing the entire field name with quotes, so the query is: http://l

listing dynamic field in search results

2007-04-11 Thread Kevin Osborn
I have quite a few dynamic fields. However, I will usually only want to return just a couple of those fields. So, if my static fields are StaticField1 and StaticField2 and my dynamic fields are s_DynamicField1, sf_DynamicField2, etc., the following line will work as expected: http://localhost:

Re: maximum index size

2007-03-27 Thread Kevin Osborn
- Original Message From: Mike Klaas <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Tuesday, March 27, 2007 3:20:40 PM Subject: Re: maximum index size If you are going to store a document for each customer then some field must indicate to which customer the document instance b