Field compression

2011-04-15 Thread Charlie Jackson
I know I'm late to the party, but I recently learned that field compression was removed as of Solr 1.4.1. I think a lot of sites were relying on that feature, so I'm curious what people are doing now that it's gone. Specifically, what are people doing to efficiently store *and highlight* large f

RE: How to extend IndexSchema and SchemaField

2010-09-10 Thread Charlie Jackson
Have you already explored the idea of using a custom analyzer for your field? Depending on your use case, that might work for you. - Charlie

Status of Solr in the cloud?

2010-08-26 Thread Charlie Jackson
There seem to be a few parallel efforts at putting Solr in a cloud configuration. See http://wiki.apache.org/solr/KattaIntegration, which is based off of https://issues.apache.org/jira/browse/SOLR-1395. Also http://wiki.apache.org/solr/SolrCloud which is https://issues.apache.org/jira/browse/SOLR-1

Allow custom overrides

2010-07-23 Thread Charlie Jackson
I need to implement a search engine that will allow users to override pieces of data and then search against or view that data. For example, a doc that has the following values:

DocId  Fulltext             Meta1  Meta2  Meta3
1      The quick brown fox  foo    foo    foo

RE: Odd query result

2010-04-20 Thread Charlie Jackson
I'll take another look and see if it makes sense to have the index and query time parameters the same or different. As far as the initial issue, I think you're right Tom, it is hitting on both. I think what threw me off was the highlighting -- in one of my matching documents, the term "I-CAR" is h

Odd query result

2010-04-20 Thread Charlie Jackson
I've got an odd scenario with a query a user's running. The user is searching for the term "I-Car". It will hit if the document contains the term "I-CAR" (all caps) but not if it's "I-Car". When I throw the terms into the analysis page, the resulting tokens look identical, and my "I-Car" tokens hi

RE: HTTP caching and distributed search

2010-02-09 Thread Charlie Jackson
I tried your suggestion, Hoss, but committing to the new coordinator core doesn't change the indexVersion and therefore the ETag value isn't changed. I opened a new JIRA issue for this http://issues.apache.org/jira/browse/SOLR-1765 Thanks, Charlie -Original Message- From: Chris Hostett

HTTP caching and distributed search

2010-02-02 Thread Charlie Jackson
Currently, I've got a Solr setup in which we're distributing searches across two cores on a machine, say core1 and core2. I'm toying with the notion of enabling Solr's HTTP caching on our system, but I noticed an oddity when using it in combination with distributed searching. Say, for example, I ha

RE: Rounding dates on sort and filter

2010-01-19 Thread Charlie Jackson
-- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message ---- > From: Charlie Jackson > To: solr-user@lucene.apache.org > Sent: Tue, January 19, 2010 1:20:02 PM > Subject: Rounding dates on sort and filter > > I've got a legacy date field that I'd

Rounding dates on sort and filter

2010-01-19 Thread Charlie Jackson
I've got a legacy date field that I'd like to round for sorting and filtering. Right now, the index is large enough that sorting or filtering on a date field takes 10-20 seconds (unless it's cached). I know this is because the date field's precision is down to the millisecond, and I don't really ne
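One way to picture the idea of dropping the millisecond precision is to truncate each timestamp before it is indexed, so that many documents share the same date value. The snippet below is only a sketch under that assumption: the function name is made up for illustration, truncating to the day is an arbitrary choice, and it is not necessarily what the replies in this thread recommended.

```python
from datetime import datetime, timezone

def truncate_to_day_utc(ts):
    """Drop everything below day precision and format as a Solr-style UTC date string.

    Hypothetical helper: the name and the choice of day granularity are assumptions.
    """
    day = ts.astimezone(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    return day.strftime("%Y-%m-%dT%H:%M:%SZ")
```

The truncated value would go into a separate field used only for sorting and filtering, leaving the original field untouched.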

RE: NGram query failing

2009-10-23 Thread Charlie Jackson
as the min/max gram size is met. In other words, for any queries two or more characters long, this works for me. Less than two characters and it fails. I don't know exactly why that is, but I'll take it anyway! - Charlie -Original Message----- From: Charlie Jackson [mailto:ch

NGram query failing

2009-10-23 Thread Charlie Jackson
I have a requirement to be able to find hits within words in a free-form id field. The field can have any type of alphanumeric data - it's as likely it will be something like "123456" as it is to be "SUN-123-ABC". I thought of using NGrams to accomplish the task, but I'm having a problem. I set up
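To make the NGram idea concrete, here is a small sketch of which substrings end up indexed for an id value. It is a simplified model, not Solr's analysis chain: the minimum gram size of 2 is taken from the follow-up message in this thread, the maximum of 4 is an arbitrary assumption, and the query is treated as a single term that must equal one of the indexed grams.

```python
def ngrams(text, min_gram=2, max_gram=4):
    """Return every substring of text whose length is between min_gram and max_gram."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams

def can_match(query, indexed_value, min_gram=2, max_gram=4):
    """True if the query string is exactly one of the grams indexed for the value."""
    return query in ngrams(indexed_value, min_gram, max_gram)
```

Under this model, can_match("345", "123456") holds, while a one-character query such as "3" has no gram to match, which is consistent with the behaviour described in the follow-up.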

RE: Sorting/paging problem

2009-10-01 Thread Charlie Jackson
Oops, the missing trailing Z was probably just a cut and paste error. It might be tough to come up with a case that can reproduce it -- it's a sticky issue. I'll post it if I can, though. -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Tuesday, Septembe

Sorting/paging problem

2009-09-24 Thread Charlie Jackson
I've run into a strange issue with my Solr installation. I'm running queries that are sorting by a DateField field but from time to time, I'm seeing individual records very much out of order. What's more, they appear on multiple pages of my result set. Let me give an example. Starting with a basic

Availability during merge

2009-07-13 Thread Charlie Jackson
The wiki page for merging solr cores (http://wiki.apache.org/solr/MergingSolrIndexes) mentions that the cores being merged cannot be indexed to during the merge. What about the core being merged *to*? In terms of the example on the wiki page, I'm asking if core0 can add docs while core1 and core2 a

RE: Entity extraction?

2008-10-27 Thread Charlie Jackson
concept, thanks for that info. ________ Charlie Jackson [EMAIL PROTECTED] -Original Message- From: Walter Underwood [mailto:[EMAIL PROTECTED] Sent: Monday, October 27, 2008 11:17 AM To: solr-user@lucene.apache.org Subject: Re: Entity extraction? Th

RE: Entity extraction?

2008-10-27 Thread Charlie Jackson
e any experience with any of these? Charlie Jackson 312-873-6537 [EMAIL PROTECTED] -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Monday, October 27, 2008 10:23 AM To: solr-user@lucene.apache.org Subject: Re: E

RE: Entity extraction?

2008-10-24 Thread Charlie Jackson
simple demo: http://www.cortex-intelligence.com/tech/ > > Rossini > > > On Fri, Oct 24, 2008 at 6:18 PM, Charlie Jackson < > [EMAIL PROTECTED] > > wrote: > > > During a recent sales pitch to my company by FAST, they mentioned entity > > extraction. I'd

Entity extraction?

2008-10-24 Thread Charlie Jackson
During a recent sales pitch to my company by FAST, they mentioned entity extraction. I'd never heard of it before, but they described it as basically recognizing people/places/things in documents being indexed and then being able to do faceting on this data at query time. Does anything like this al

RE: Shared index base

2008-02-26 Thread Charlie Jackson
How do you handle commits to the index? By that, I mean that Solr recreates its searcher when you issue a commit, but only for the system that does the commit. Wouldn't you be left with searchers on the other machines that are stale? - Charlie -Original Message- From: Matthew Runo [mail

RE: Boosting a token with space at the end

2008-02-13 Thread Charlie Jackson
If you haven't explicitly set the sort parameter, Solr will default to ordering by score. Information about Lucene scoring can be found here http://lucene.apache.org/java/docs/scoring.html And, specifically, the score formula can be found here http://hudson.zones.apache.org/hudson/job/Lucene-trun

RE: highlighting marks wrong words

2008-01-15 Thread Charlie Jackson
I believe changing the "AND id: etc etc " part of the query to its own filter query will take care of your highlighting problem. In other words, try a query like this: q=(auto)&fq=id:(100 OR 1 OR 2 OR 3 OR 5 OR 6)&fl=score&hl.fl=content&hl=true&hl.fragsize=200&hl.snippets=2&hl.simple.pre=%3Cb%3
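As a rough sketch of the suggestion above, the following builds the query string with the id restriction moved out of q and into fq, so the restriction is kept apart from the terms being highlighted. The function name is hypothetical, and only parameters that appear in the message are used.

```python
from urllib.parse import urlencode

def build_highlight_query(terms, ids, highlight_field="content"):
    """Build a Solr query string with the id restriction in fq rather than q.

    Hypothetical helper: the name and argument layout are assumptions.
    """
    params = [
        ("q", "(" + terms + ")"),
        ("fq", "id:(" + " OR ".join(str(i) for i in ids) + ")"),
        ("fl", "score"),
        ("hl", "true"),
        ("hl.fl", highlight_field),
        ("hl.fragsize", "200"),
        ("hl.snippets", "2"),
    ]
    return urlencode(params)
```

For example, build_highlight_query("auto", [100, 1, 2, 3, 5, 6]) yields a URL-encoded string that can be appended to the select URL.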

RE: Backup of a Solr index

2008-01-03 Thread Charlie Jackson
ry 03, 2008 11:00 AM To: solr-user@lucene.apache.org Subject: Re: Backup of a Solr index Charlie Jackson wrote: > Solr indexes are file-based, so there's no need to "dump" the index to a > file. > But however one has first to shutdown the Solr server before copying the i

RE: Backup of a Solr index

2008-01-02 Thread Charlie Jackson
Solr indexes are file-based, so there's no need to "dump" the index to a file. In terms of how to create backups and move those backups to other servers, check out this page http://wiki.apache.org/solr/CollectionDistribution. Hope that helps. -Original Message- From: Jörg Kiegeland

RE: Successful project based on SOLR

2007-12-20 Thread Charlie Jackson
Yeah I remember seeing that at one point when I was first looking at the solrj client. I had plans to build on it but I got pulled away on something else. Maybe it's time to take another look and see what I can do with it. As Jonathan said, it's a good project to work on. -Original Message--

RE: Successful project based on SOLR

2007-12-20 Thread Charlie Jackson
ifference with that and Hibernate Search<http://www.hibernate.org/410.html> ? On Dec 20, 2007 2:09 PM, Charlie Jackson <[EMAIL PROTECTED]> wrote: > Congratulations! > > > It uses an custom hibernate-SOLR > bridge which allows transparent persistence of entities on differ

RE: Successful project based on SOLR

2007-12-20 Thread Charlie Jackson
Congratulations! > It uses an custom hibernate-SOLR bridge which allows transparent persistence of entities on different SOLR servers. Any chance of this code making its way back to the SOLR community? Or, if not, can you give me an idea how you did it? This seamless integration of Hibernate an

RE: Tomcat6?

2007-12-03 Thread Charlie Jackson
$CATALINA_HOME/conf/Catalina/localhost doesn't exist by default, but you can create it and it will work exactly the same way it did in Tomcat 5. It's not created by default because it's not needed by the manager webapp anymore. -Original Message- From: Matthew Runo [mailto:[EMAIL PROTECTED

RE: Forced Top Document

2007-10-24 Thread Charlie Jackson
bogus field in descending order before any other sorting criteria are applied. Either way, the document only appears when it matches the search criteria, and it will always be on top. kyle On 10/24/07, Charlie Jackson <[EMAIL PROTECTED]> wrote: > Yes, this will only work if the results a

RE: Forced Top Document

2007-10-24 Thread Charlie Jackson
all the results are returned based on the query specified, but then resorted as specified. Boosting (which modifies the document's score) should not change the order unless the results are sorted by score. Mark On Oct 24, 2007, at 1:05 PM, Charlie Jackson wrote: > Do you know which d

RE: Forced Top Document

2007-10-24 Thread Charlie Jackson
Do you know which document you want at the top? If so, I believe you could just add an "OR" clause to your query to boost that document very high, such as ?q=foo OR id:bar^1000 Tried this on my installation and it did, indeed push the document specified to the top. -Original Message-
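The boosted "OR" clause can be sketched as a tiny helper that wraps the user's query. The function name and the default boost of 1000 are illustrative assumptions; as the later messages in this thread point out, the trick only has an effect when results are sorted by score.

```python
def pin_document(query, doc_id, id_field="id", boost=1000):
    """Append a heavily boosted id clause so the given document scores highest.

    Hypothetical helper: only meaningful when results are sorted by score.
    """
    return "({0}) OR {1}:{2}^{3}".format(query, id_field, doc_id, boost)
```

Calling pin_document("foo", "bar") gives the same shape of query as the example in the message.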

RE: Timeout Settings

2007-10-23 Thread Charlie Jackson
The CommonsHttpSolrServer has a setConnectionTimeout method. For my import, which was on a similar scale as yours, I had to set it up to 1000 (1 second). I think messing with this setting may take care of your timeout problem. -Original Message- From: Daniel Clark [mailto:[EMAIL PROTECTE

RE: quick allowDups questions

2007-10-11 Thread Charlie Jackson
Cool, thanks for the clarification, Ryan. -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 10, 2007 5:28 PM To: solr-user@lucene.apache.org Subject: Re: quick allowDups questions the default solrj implementation should do what you need. > > As

RE: quick allowDups questions

2007-10-10 Thread Charlie Jackson
r the help! -Original Message- From: Mike Klaas [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 10, 2007 3:58 PM To: solr-user@lucene.apache.org Subject: Re: quick allowDups questions On 10-Oct-07, at 1:11 PM, Charlie Jackson wrote: > Anyway, I need to update some docs in my index

quick allowDups questions

2007-10-10 Thread Charlie Jackson
Normally this is the type of thing I'd just scour through the online docs or the source code for, but I'm under the gun a bit. Anyway, I need to update some docs in my index because my client program wasn't accurately putting these docs in (values for one of the fields were missing). I'm hoping

RE: dataset parameters suitable for lucene application

2007-09-26 Thread Charlie Jackson
acceptable beyond 8.8M records, or that you only had 8.8M records to index? If the former, can you share the particular symptoms? On 9/26/07, Charlie Jackson <[EMAIL PROTECTED]> wrote: > My experiences so far with this level of data have been good. > > Number of records: Maxed out at 8

RE: dataset parameters suitable for lucene application

2007-09-26 Thread Charlie Jackson
My experiences so far with this level of data have been good. Number of records: Maxed out at 8.8 million Database size: friggin huge (100+ GB) Index size: ~24 GB 1) It took me about a day to index 8 million docs using a non-optimized program I wrote. It's non-optimized in the sense that it's not

RE: UTF-8 encoding problem on one of two Solr setups

2007-08-17 Thread Charlie Jackson
You might want to check out this page http://wiki.apache.org/solr/SolrTomcat Tomcat needs a small config change out of the box to properly support UTF-8. Thanks, Charlie -Original Message- From: Mario Knezovic [mailto:[EMAIL PROTECTED] Sent: Friday, August 17, 2007 12:58 PM To: solr-

RE: Solrsharp highlighting

2007-08-15 Thread Charlie Jackson
le to add facets; the example application implements one form of it. The nice thing about the facet support is that it utilizes generics to allow you to have strongly typed name/value pairs for the fieldname/count data. Hope this helps. -- jeff r. On 8/10/07, Charlie Jackson <[EMAIL PROTECTED

RE: Solrsharp highlighting

2007-08-10 Thread Charlie Jackson
Also, are there any examples out there of how to use Solrsharp's faceting capabilities? Charlie Jackson 312-873-6537 [EMAIL PROTECTED] -Original Message- From: Charlie Jackson [mailto:[EMAIL PROTECTED] Sent: Friday, August 10, 2007 3:51

Solrsharp highlighting

2007-08-10 Thread Charlie Jackson
Trying to use Solrsharp (which is a great tool, BTW) to get some results in a C# application. I see the HighlightFields method of the QueryBuilder object and I've set it to my highlight field, but how do I get at the results? I don't see anything in the SearchResults code that does anything with th

RE: fast update handlers

2007-05-10 Thread Charlie Jackson
What about issuing separate commits to the index on a regularly scheduled basis? For example, you add documents to the index every 2 seconds, or however often, but these operations don't commit. Instead, you have a cron'd script or something that just issues a commit every 5 or 10 minutes or whatev
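A scheduled commit along these lines could be a small script run from cron. The sketch below assumes Solr's XML update handler is reachable at http://localhost:8983/solr/update; that URL, the timeout, and the function names are assumptions to adjust for a real installation.

```python
import urllib.request

# Assumed location of the update handler; adjust for the actual installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"

def build_commit_request(update_url=SOLR_UPDATE_URL):
    """Build (but do not send) an HTTP POST carrying an XML commit command."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )

def commit(update_url=SOLR_UPDATE_URL, timeout=60):
    """Send the commit and return the HTTP status code."""
    with urllib.request.urlopen(build_commit_request(update_url), timeout=timeout) as resp:
        return resp.status
```

The indexing clients would keep posting adds without committing, and a cron entry running a script that calls commit() every 5 or 10 minutes would make the accumulated documents visible.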

Index corruptions?

2007-05-03 Thread Charlie Jackson
I have a couple of questions regarding index corruptions. 1) Has anyone using Solr in a production environment ever experienced an index corruption? If so, how frequently do they occur? 2) It seems like the CollectionDistribution setup would be a good way to put in place a recovery plan fo

RE: NullPointerException (not schema related)

2007-05-02 Thread Charlie Jackson
y.com/ - Tag - Search - Share - Original Message From: Charlie Jackson <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Tuesday, May 1, 2007 5:31:13 PM Subject: RE: NullPointerException (not schema related) I went with the first approach which got me up and running.

RE: NullPointerException (not schema related)

2007-05-01 Thread Charlie Jackson
I went with the first approach which got me up and running. Your other example config (using ./snapshooter) made me realize how foolish my original problem was! Anyway, I've got the whole thing up and running and it looks pretty awesome! One quick question, though. As stated in the wiki, one of

RE: NullPointerException (not schema related)

2007-05-01 Thread Charlie Jackson
Nevermind this...looks like my problem was tagging the "args" as an node instead of an node. Thanks anyway! Charlie -Original Message----- From: Charlie Jackson [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 01, 2007 12:02 PM To: solr-user@lucene.apache.org Subject: NullPointe

NullPointerException (not schema related)

2007-05-01 Thread Charlie Jackson
Hello, I'm evaluating solr for potential use in an application I'm working on, and it sounds like a really great fit. I'm having trouble getting the Collection Distribution part set up, though. Initially, I had problems setting up the postCommit listener. I first used this xml to configure the