Re: Using Tesseract OCR to extract PDF files in EML file attachment

2017-04-04 Thread AJ Weber
You'll need to use something like javax mail (or some of the jars that have been built on top of it for higher-level access) to open the EML files and extract the attachments, then operate on the extracted attachments as you would any file. There are alternative, paid, libraries to parse and e
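
As a rough illustration of the "open the EML, extract the attachments" step, here is a minimal sketch. It swaps in Python's standard `email` package rather than javax mail, and the helper names (`extract_pdf_attachments`, `build_sample_eml`) and the sample message are made up for the example. It only looks at top-level attachments; PDFs nested inside forwarded messages would need a recursive walk.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def extract_pdf_attachments(eml_bytes):
    """Return (filename, payload_bytes) for each top-level PDF attachment."""
    msg = BytesParser(policy=policy.default).parsebytes(eml_bytes)
    pdfs = []
    for part in msg.iter_attachments():
        if part.get_content_type() == "application/pdf":
            # decode=True undoes the base64 transfer encoding
            pdfs.append((part.get_filename(), part.get_payload(decode=True)))
    return pdfs


def build_sample_eml():
    """Build a tiny in-memory EML message (hypothetical data) for testing."""
    msg = EmailMessage()
    msg["Subject"] = "sample"
    msg["From"] = "a@example.com"
    msg["To"] = "b@example.com"
    msg.set_content("See attached.")
    msg.add_attachment(
        b"%PDF-1.4 fake",
        maintype="application",
        subtype="pdf",
        filename="scan.pdf",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    for name, data in extract_pdf_attachments(build_sample_eml()):
        print(name, len(data), "bytes")
```

The extracted bytes can then be written to disk and processed like any other PDF file.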

Re: Solr Cloud - how to implement local indexing without SSL and distributed search with SSL

2016-07-19 Thread Sarit Weber
Hi again, Do you think it's possible to do that with one server dedicated to indexing and another dedicated to search, both working on the same collections? Thanks, Sarit Weber Guardium Software Developer IBM Israel Software Labs, Jerusalem Phone: +972-2-649-1712

Solr Cloud - how to implement local indexing without SSL and distributed search with SSL

2016-07-17 Thread Sarit Weber
wondering if there is an option to remove SSL from indexing and keep using it for Searching. The solution will have to require the indexing to be done locally, not calling the remote zookeeper. Is there any way to achieve this with Solr Cloud? Thanks, Sarit Weber

Re: DataImportHandler - Automatic scheduling of delta imports in Solr in windows 7

2016-03-08 Thread B Weber
harshrossi gmail.com> writes: > > I am using *DeltaImportHandler* for indexing data in Solr. Currently I am > manually indexing the data into Solr by selecting commands full-import or > delta-import from the Solr Admin screen. > > I am using Windows 7 and would like to automate the process by

Solr 4.5 spatial search - distance and score

2013-09-12 Thread Weber
I'm trying to get score by using a custom boost and also get the distance. I found David's code* to get it using "Intersects", which I want to replace by {!geofilt} or geodist() *David's code: https://issues.apache.org/jira/browse/SOLR-4255 He told me geodist() will be available again for this ki

Different 'fl' for first X results

2013-07-15 Thread Weber
How to get a different field list in the first X results? For example, in the first 5 results I want fields A, B, C, and on the next results I need only fields A, and B. -- View this message in context: http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html Sent from

Re: The book: Solr 4.x Deep Dive - Early Access Release #1

2013-06-21 Thread AJ Weber
On 6/21/2013 9:22 AM, Alexandre Rafalovitch wrote: I might, however, be confused regarding your strategy. I thought you were going to do several different volumes, rather than one large one. Or is this all a 'first' volume discussion so far? Pricing: $7.99 feels better for a book this size. U

newbie questions about cache stats & query perf

2013-01-09 Thread AJ Weber
Sorry, I did search for an answer, but didn't find an applicable one. I'm currently stuck on 1.4.1 (running in Tomcat 6 on 64bit Linux) for the time being... When I see stats like this: name: documentCache class: org.apache.solr.search.LRUCache version: 1.0 description: LRU

LetterTokenizer + EdgeNGram + apostrophe in query = invalid result

2011-02-25 Thread Matt Weber
's containing an apostrophe and they all stop returning results after 4 characters. Thanks, Matt Weber

Re: Ramdirectory

2011-02-25 Thread Matt Weber
I have used this without issue. In the example solrconfig.xml replace this line: with this one: Thanks, Matt Weber On Thu, Feb 24, 2011 at 7:47 PM, Bill Bell wrote: > Thanks - yeah that is why I asked how to use it. But I still don't know > how to use it. > > https://

Re: Using solr with woodstox 4.0.8

2010-06-14 Thread Weber, Alexander
Hi Peter! Yes, we do. Thanks for the hint! Cheers, Alex On 14.06.10 at 16:49, "Peter Karich" wrote: > Hi Alex! > >> Am I missing something? Anything more to test? >> > > Are you using solrj too? If so, beware of: > https://issues.apache.org/jira/browse/SOLR-1950 > > Regards, > Peter

Using solr with woodstox 4.0.8

2010-06-14 Thread Weber, Alexander
Hi all, we are using woodstox-4.0 and solr-1.4 in our project. As solr is using woodstox-3.2.7, there is a version clash. So I tried to check if solr would run with woodstox-4.0. I downloaded a clean solr-1.4.0 and replaced wstx-asl-3.2.7.jar with stax2-api-3.0.2.jar and woodstox-core-lgpl-4.0.8

Re: field collapsing sums

2009-09-30 Thread Matt Weber
You might want to see how the stats component works with field collapsing. Thanks, Matt Weber On Sep 30, 2009, at 5:16 PM, Uri Boness wrote: Hi, At the moment I think the most appropriate place to put it is in the AbstractDocumentCollapser (in the getCollapseInfo method). Though, it

Re: Showing few results for each category (facet)

2009-09-29 Thread Matt Weber
r query on the specific category they want to see, for example: &fq=category:people. Hope this helps. Thanks, Matt Weber On Sep 29, 2009, at 4:55 AM, Marian Steinbach wrote: On Tue, Sep 29, 2009 at 11:36 AM, Varun Gupta wrote: ... One way that I can think of doing this is by making a

Re: Usage of Sort and fq

2009-09-29 Thread Matt Weber
A description and examples of both parameters can be found here: http://wiki.apache.org/solr/CommonQueryParameters Thanks, Matt Weber On Sep 29, 2009, at 4:10 AM, Avlesh Singh wrote: /?q=*:*&fq=category:animal&sort=child_count%20asc Search for all documents (of animals), and fi
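
A filter query combined with a sort, as discussed in this thread, can be assembled and URL-encoded programmatically instead of by hand. The sketch below is only an illustration: the host, port, and path are assumptions, while the `category` and `child_count` field names come from the quoted query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_select_url(base_url, params):
    """Append URL-encoded query parameters to a Solr select handler URL."""
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint; adjust to your own Solr install.
BASE_URL = "http://localhost:8983/solr/select"

url = build_select_url(
    BASE_URL,
    [
        ("q", "*:*"),                # match all documents
        ("fq", "category:animal"),   # restrict to animals without affecting the score
        ("sort", "child_count asc"), # order by child_count, ascending
    ],
)

if __name__ == "__main__":
    print(url)
    print(parse_qs(urlsplit(url).query))
```

Letting `urlencode` handle the escaping avoids hand-written sequences such as `%20` and makes a mistyped `name=value` separator harder to miss.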

Re: Using two Solr documents to represent one logical document/file

2009-09-26 Thread Matt Weber
Check out the field collapsing patch: http://wiki.apache.org/solr/FieldCollapsing https://issues.apache.org/jira/browse/SOLR-236 Thanks, Matt Weber On Sep 25, 2009, at 3:15 AM, Peter Ledbrook wrote: Hi, I want to index both the contents of a document/file and metadata associated with

Re: Is it possible to query for "everything" ?

2009-09-14 Thread Matt Weber
Query for *:* Thanks, Matt Weber On Sep 14, 2009, at 4:18 PM, Jonathan Vanasco wrote: I'm using Solr for search and faceted browsing. Is it possible to have solr search for 'everything', at least as far as q is concerned? The request handlers I've found don't l

Re: Searching for the '+' character

2009-09-14 Thread Matt Weber
Why don't you create a synonym for + that expands to your customer's product name that includes the plus? You can even have your FE do this sort of replacement BEFORE submitting to Solr. Thanks, Matt Weber On Sep 14, 2009, at 11:42 AM, AHMET ARSLAN wrote: Thanks Ahmet, That's exce

Re: When to optimize?

2009-09-13 Thread Matt Weber
to optimize your index. Thanks, Matt Weber On Sep 13, 2009, at 6:21 PM, William Pierce wrote: Folks: Are there good rules of thumb for when to optimize? We have a large index consisting of approx 7M documents and we currently have it set to optimize once a day. But sometimes there are

Re: Solr Cell

2009-07-23 Thread Matt Weber
Found my own answer, use the literal parameter. Should have dug around before asking. Sorry. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On Jul 23, 2009, at 2:26 PM, Matt Weber wrote: Is it possible to supply additional metadata along with the binary file when

Solr Cell

2009-07-23 Thread Matt Weber
post the binary data for somefile.pdf to Solr Cell AND map my metadata into other fields in the same document that has the extracted text from the pdf. I know I could do this using Tika and SolrJ directly, but it would be much easier if Solr Cell can do it. Thanks, Matt Weber eSr

Re: Solr relevancy score - conversion

2009-06-09 Thread Matt Weber
, Matt Weber eSr Technologies http://www.esr-technologies.com On Jun 8, 2009, at 10:05 PM, Vijay_here wrote: Hi, I am using solr to index some legal documents, where I need the solr search engine to return a relevancy ranking score for each search result. As of now I am getting

Re: Solr Home on Linux JBoss ignored

2009-06-05 Thread Matt Weber
Check the dataDir setting in solrconfig.xml. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On Jun 5, 2009, at 6:03 AM, Dean Pullen wrote: I lied, it's actually saving data to: /usr/local/jboss-portal-2.7.1.GA/bin/C:\home\jboss\solr\data Which is a tad crazy!

Re: Facet counts limit

2009-05-20 Thread Matt Weber
1. The limit parameter takes a signed integer, so the max value is 2,147,483,647. 2. I don't think there is a defined limit, which would mean you are only limited by what your system can handle. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 20, 2009,
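
The maximum quoted above is simply the largest value a 32-bit signed integer can hold. A quick check, assuming a 32-bit width as in a Java `int`:

```python
# Largest 32-bit signed integer: one bit is the sign, leaving 31 value bits.
INT32_MAX = 2**31 - 1

if __name__ == "__main__":
    print(f"{INT32_MAX:,}")  # prints 2,147,483,647
```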

Re: Search Query Questions

2009-05-14 Thread Matt Weber
I think you will want to look at the Field Collapsing patch for this. http://issues.apache.org/jira/browse/SOLR-236 . Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 14, 2009, at 5:52 PM, Chris Miller wrote: Oh, one more question 3) Is there a way to

Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
r can access each result. Good luck. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 1:21 PM, Jay Hill wrote: The only downside would be that you would have to update a document anytime a user was granted or denied access. You would have to query b

Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 12:26 PM, Terence Gannon wrote: Paul -- thanks for the reply, I appreciate it. That's a very practical approach, and is worth taking a closer look at. Actually, taking your idea one step further, per

Re: Facet counts for common terms of the searched field

2009-05-12 Thread Matt Weber
&facet.field=textfieldfacet&facet.limit=5 This will give you the top 5 words in the textfieldfacet. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 7:57 AM, sachin78 wrote: Thanks Matt for your reply. What do you mean by frequency(the d
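
For illustration, the facet parameters above can be put into a full request, and the resulting counts paired up on the client side. This is a sketch under assumptions: the `q`, `rows`, and `wt` values are not from the thread, `facet=true` is added because faceting must be switched on, and the sample list in the usage line is invented to show the flat `[term, count, term, count, ...]` layout that Solr's JSON writer uses for facet fields by default.

```python
from urllib.parse import urlencode

# facet.field and facet.limit are from the thread; the rest are assumptions.
FACET_PARAMS = {
    "q": "*:*",
    "rows": 0,                       # only the facet counts are wanted
    "wt": "json",
    "facet": "true",
    "facet.field": "textfieldfacet",
    "facet.limit": 5,
}


def facet_query_string(params):
    """URL-encode the facet request parameters."""
    return urlencode(params)


def top_terms(flat_facet_list):
    """Turn [term, count, term, count, ...] into [(term, count), ...]."""
    return list(zip(flat_facet_list[0::2], flat_facet_list[1::2]))


if __name__ == "__main__":
    print(facet_query_string(FACET_PARAMS))
    # Hypothetical facet_fields["textfieldfacet"] value from a response:
    print(top_terms(["solr", 12, "lucene", 7]))
```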

Re: Facet counts for common terms of the searched field

2009-05-12 Thread Matt Weber
will be able to facet on this new field and sort the facet by frequency (the default) to get the most popular words. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 12, 2009, at 7:33 AM, sachin78 wrote: Does anybody have an answer to this post? I have a similar

Re: solr + wordpress

2009-05-08 Thread Matt Weber
I actually wrote a plugin that integrates Solr with WordPress. http://www.mattweber.org/2009/04/21/solr-for-wordpress/ http://wordpress.org/extend/plugins/solr-for-wordpress/ https://launchpad.net/solr4wordpress Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 8

Re: Solr autocompletion in rails

2009-05-07 Thread Matt Weber
/. After you have your rails controller working you can hook it into your FE with some javascript like I did in the example on my blog. Hope this helps. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 7, 2009, at 7:37 AM, manisha_5 wrote: Thanks

Re: Conditional/Calculated Fields (is it possible?)

2009-05-06 Thread Matt Weber
I do not think this is possible. You will probably want to handle this logic on your side during indexing. Index the document with the first price, then as that price expires, update the document with the new price. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com

Re: Multi-index Design

2009-05-05 Thread Matt Weber
ou would have a core for each domain. Each domain will then have its own index that contains documents of all types. See http://wiki.apache.org/solr/MultipleIndexes . Thanks, Matt Weber On May 5, 2009, at 11:14 AM, Michael Ludwig wrote: Chris Masters wrote: - flatten the

Re: Solr autocompletion in rails

2009-05-04 Thread Matt Weber
servlet you can write a rails handler parsing the json output directly. http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/ . Thanks, Matt Weber On May 4, 2009, at 9:39 AM, manisha_5 wrote: Hi, I am new to solr. I am using solr server to index the data

Re: autoSuggest

2009-05-04 Thread Matt Weber
just as you would with a facet. So in your schema.xml, add a new string field, then use a copyField to copy the value of title into the new field and, after reindexing, set terms.fl to the new field you just created. Thanks, Matt Weber On May 4, 2009, at 6:46 AM, sunnyfr wrote: Hi, I would

Re: Highlight MoreLikeThis results?

2009-05-04 Thread Matt Weber
There was a thread about this last week and the verdict is that you currently can't highlight MoreLikeThis results. Thanks, Matt Weber On May 4, 2009, at 1:22 AM, jli...@gmail.com wrote: My query returns a number of MoreLikeThis results for a given document. I wonder if there is a w

Re: Term highlighting with MoreLikeThisHandler?

2009-04-30 Thread Matt Weber
milar. Thanks, Matt Weber On Apr 29, 2009, at 9:27 PM, Walter Underwood wrote: Think about this for a moment. When you use MoreLikeThis, the query is a document. How do you highlight a document in another document? wunder On 4/29/09 9:21 PM, "Matt Weber" wrote: Any luck

Re: Term highlighting with MoreLikeThisHandler?

2009-04-29 Thread Matt Weber
Any luck on this? I am experiencing the same issue. Highlighting works fine on all other request handlers, but breaks when I use the MoreLikeThisHandler. Thanks, Matt Weber On Apr 28, 2009, at 5:29 AM, Eric Sabourin wrote: Yes... at least I think so. the highlighting works correctly

Re: Strange Sorting results on a Text Field

2006-09-11 Thread Tom Weber
text1 text2 text3. Thanks, tom On 11 Sep, 2006, at 15:14 , Yonik Seeley wrote: On 9/11/06, Tom Weber <[EMAIL PROTECTED]> wrote: Hello, have a strange response in a query with sorting. I sort on a field which is : I think you probably want a type="strin

Strange Sorting results on a Text Field

2006-09-11 Thread Tom Weber
Hello, have a strange response in a query with sorting. I sort on a field which is : multiValued="true"/> in this field mostly 32 byte md5's are saved, mostly only a single entry but also up to 5. when I do a search like this : "+testfield: (fde34c51739462d9486140601dcfb7bf 63a

Double Solr Installation on Single Tomcat (or Double Index)

2006-09-06 Thread Tom Weber
Hello, I need to have a second separate index (separate data) on the same server. Is there a possibility to do this in a single solr install on a tomcat server or do I need to have a second instance in the same tomcat install ? If either one is possible, does somebody have some adv

Re: Own Similarity Class in Solr

2006-07-27 Thread Tom Weber
Hi Chris, thanks for the details, I am meanwhile poking around with my own class which I defined in the schema.xml everything is working perfectly there. But I have still the problem with the normalization, I try to change several parameters to fix it to 1.0, this does indeed change

Recompilation of latest lucene seems to break update of Solr - Solution

2006-07-26 Thread Tom Weber
Hi again, sorry to spam the list, but wanted to share the solution which let the system compile. I had to use the latest lucene version, not available on their site, but only in their Subversion repository. (Version called 2.1 - 419723 2006-07-06 22:14:07) To get this version, use th

Recompilation of latest lucene seems to break update of Solr - Addum

2006-07-26 Thread Tom Weber
Hi again, with the latest nightly build of solr and the latest lucene, the error still persists, but the error message is different. Just to be complete, here is the actual error message: java.lang.NoSuchMethodError: org.apache.lucene.document.Document.add (Lorg/apache/lucene/document/Fieldabl

Recompilation of latest lucene seems to break update of Solr

2006-07-26 Thread Tom Weber
Hello, just compiled the latest version of lucene (), updated the webapps/solr/WEB-INF/lib/ with the 3 jar files: lucene-core-2.0.1-dev.jar lucene-snowball-2.0.1-dev.jar lucene-highlighter-2.0.1-dev.jar Restarted solr. The Admin interface of solr is still running, but trying

Own Similarity Class in Solr

2006-07-26 Thread Tom Weber
Hello, I would like to alter the similarity behaviour of solr to remove the fieldnorm factor in the similarity calculations. As far as I read, I need to recreate my own similarity class and import it into solr using the config in schema.xml. Has anybody already tweaked or played wit

docFreq disable / disable end of word letter removal

2006-07-12 Thread Tom Weber
Hello, for my specific project, I would like to ask if the following settings can be made on the solr system: - Currently, I see that the docFreq is also playing in the scoring. Is it possible to disable this feature so that this is not calculated in the score ? - I see that solr is