RE: Solr Basic Configuration - Highlight - Begginer

2015-12-16 Thread Teague James
Hi Evert, I recently needed help with phrase highlighting and was pointed to the FastVectorHighlighter which worked out great. I just made a change to the configuration to add generateWordParts="0" and generateNumberParts="0" so that searches for things like "1a" would get highlighted correctly

RE: Solr Basic Configuration - Highlight - Begginer

2015-12-16 Thread Teague James
.pdf" with the same result: no highlight in the response, as below: "highlighting": { "pdf1": {} } =( Really... I do not know what to do... Thanks for your time. If you have any more suggestions about where I could be missing something, please let me know.

Re: Solr Basic Configuration - Highlight - Begginer

2015-12-17 Thread Teague James
> insight into why that would be. The most frequent reason is that the
> > field is a "string" type which is not broken up into words. Another
> > possibility is that your analysis chain is leaving in the quotes or
> > something similar. As James says, looking at a

Re: highlighting

2015-10-01 Thread Teague James
Mark! For what it's worth - up vote. Thanks! Cheers! -Teague James
> On Oct 1, 2015, at 6:12 PM, Koji Sekiguchi wrote:
>
> Hi Mark,
>
> I think I saw similar requirement recently in mailing list. The feature
> sounds reasonable to me.
>
> > If not, how

URL Encoding on Import

2015-11-25 Thread Teague James
Hi everyone! Does anyone have any suggestions on how to URL encode URLs that I'm importing from SQL using the DIH? The importer pulls in something like "http://www.downloadsite.com/document that is being downloaded.doc" and then the Tika parser can't download the document because it ends up trying
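For illustration only, here is a minimal Python sketch of what the percent-encoded form of the example URL above looks like. The helper name `encode_url` is hypothetical, and the sketch assumes only characters outside `:` and `/` need escaping. It does not show how to apply the encoding inside the DIH itself.

```python
from urllib.parse import quote


def encode_url(raw_url):
    """Percent-encode characters such as spaces, leaving ':' and '/' intact.

    Hypothetical helper for illustration; not part of Solr, the DIH, or Tika.
    """
    return quote(raw_url, safe=":/")


if __name__ == "__main__":
    # Example URL taken from the message above.
    print(encode_url("http://www.downloadsite.com/document that is being downloaded.doc"))
```

Each space becomes `%20`, which is the form an HTTP client needs in order to fetch the document.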

Help With Phrase Highlighting

2015-12-01 Thread Teague James
Hello everyone, I am having difficulty enabling phrase highlighting and am hoping someone here can offer some help. This is what I have currently: Solr 4.9 solrconfig.xml (partial snip) xml explicit 10

Re: Help With Phrase Highlighting

2015-12-01 Thread Teague James
tiguous=true ?
>
> On Tue, Dec 1, 2015 at 3:36 PM, Teague James wrote:
>
>> Hello everyone,
>>
>> I am having difficulty enabling phrase highlighting and am hoping someone
>> here can offer some help. This is what I have currently:

Re: highlight

2015-12-02 Thread Teague James
the phrase correct. It is the method of highlighting (I'm trying to get one set of tags per phrase) and the occasional breaking of a single phrase into 2 hits. Given that setup, what do you recommend? I'm not sure I understand the approach you're describing. I appreciate the help!

RE: Help With Phrase Highlighting

2015-12-03 Thread Teague James
Thanks everyone who replied! The FastVectorHighlighter did the trick. Here is how I configured it: In solrconfig.xml: In the requestHandler I added: on text true 100 In schema.xml: I modified the text field: I restarted Solr, re-indexed the documents and tested. All phrases are correctly highli

Alternate Port Not Working for Solr 6.0.0

2016-05-31 Thread Teague James
Hello, I am trying to install Solr 6.0.0 and have been successful with the default installation, following the instructions provided on the Apache Solr website. However, I do not want Solr running on port 8983; I want it to run on port 80. I started a new Ubuntu 14.04 VM, installed OpenJDK 8, the

RE: Alternate Port Not Working for Solr 6.0.0

2016-06-02 Thread Teague James
d be appreciated. If this is a "feature" of the application, it'd be nice to see that in the documentation. Thanks Shawn! -Teague
-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Tuesday, May 31, 2016 4:31 PM
To: solr-user@lucene.apache.org
Su

Solr 6 Highlighting Not Working

2016-10-21 Thread Teague James
Can someone please help me troubleshoot my Solr 6.0 highlighting issue? I have a production Solr 4.9.0 unit configured to highlight responses and it has worked for a long time now without issues. I have recently been testing Solr 6.0 and have been unable to get highlighting to work. I used my 4.9 c

Solr 6.0 Highlighting Not Working

2016-10-24 Thread Teague James
Can someone please help me troubleshoot my Solr 6.0 highlighting issue? I have a production Solr 4.9.0 unit configured to highlight responses and it has worked for a long time now without issues. I have recently been testing Solr 6.0 and have been unable to get highlighting to work. I used my 4.9 c

RE: Solr 6.0 Highlighting Not Working

2016-10-25 Thread Teague James
Hi - Thanks for the reply, I'll give that a try.
-----Original Message-----
From: jimtronic [mailto:jimtro...@gmail.com]
Sent: Monday, October 24, 2016 3:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 6.0 Highlighting Not Working
Perhaps you need to wrap your inner "" and "" tags in t

Solr 6.0.0 Returns Blank Highlights for Certain Queries

2017-01-18 Thread Teague James
MS Why does highlighting fail for "1a" type searches? Any help is appreciated! Thanks! -Teague James

Solr 6.0.0 Returns Blank Highlights for alpha-numeric combos

2017-02-01 Thread Teague James
pha-numeric two-character value, such as n0, b1, 1z, etc.: ... Where "8667" is the document ID of the record that had the hit, but no highlight. Other searches, "ms" for example, return: ... MS Why does highlighting fail for "1a" type searches? Any help is appreciated! Thanks! -Teague James

RE: Solr 6.0.0 Returns Blank Highlights for alpha-numeric combos

2017-02-01 Thread Teague James
mind you but possible. Best, Erick
On Wed, Feb 1, 2017 at 8:24 AM, Teague James wrote:
> Hello everyone! I'm still stuck on this issue and could really use
> some help. I have a Solr 6.0.0 instance that is storing documents
> peppered with text like "1a", "2e",

How to Get Highlighting Working in Velocity (Solr 4.8.0)

2014-05-27 Thread Teague James
My Solr 4.8.0 index includes a field called 'dom_title'. The field is displayed in the result set. I want to be able to highlight keywords from this field in the displayed results. I have tried configuring solrconfig.xml and I have tried adding parameters to the query "&hl=true&hl.fl=dom_title" but
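As a rough sketch, the two highlighting parameters quoted above can be assembled into a full request URL as follows. The base URL, the `/select` handler path, the `collection1` core name, and the helper name `highlight_query` are assumptions for illustration. Only `hl=true` and `hl.fl=dom_title` come from the message.

```python
from urllib.parse import urlencode


def highlight_query(base_url, keywords, field="dom_title"):
    """Build a Solr query URL that requests highlighting on one field.

    Hypothetical helper; base_url is assumed to point at a /select handler.
    """
    params = {
        "q": keywords,   # the user's search terms
        "hl": "true",    # turn highlighting on
        "hl.fl": field,  # field to highlight
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed host and core name, for illustration only.
    print(highlight_query("http://localhost/solr/collection1/select", "science"))
```

Highlights normally come back in a separate "highlighting" section of the response, keyed by document ID, rather than inside the document fields themselves.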

RE: Highlighting not working

2014-06-18 Thread Teague James
Vicky, I resolved this by making sure that the field that is searched has "stored=true". By default "text" is searched, which is the destination of the copyFields and is not stored. If you change your copyField destination to a field that is stored and use that field as the default search field th
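One way to check for this condition is to scan the schema for fields that are explicitly marked as not stored. The sketch below is an assumption-laden illustration. The schema fragment is made up, the function name `explicitly_unstored_fields` is hypothetical, and fields with no `stored` attribute are ignored because their default depends on the field type.

```python
import xml.etree.ElementTree as ET

# Made-up schema.xml fragment for illustration; not taken from the thread.
SCHEMA_FRAGMENT = """
<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
  <field name="content" type="text_general" indexed="true" stored="true"/>
  <copyField source="content" dest="text"/>
</schema>
"""


def explicitly_unstored_fields(schema_xml):
    """Return names of <field> elements that declare stored="false".

    Fields that omit the attribute are skipped (their default is not known here).
    """
    root = ET.fromstring(schema_xml.strip())
    return [
        field.get("name")
        for field in root.iter("field")
        if field.get("stored", "").lower() == "false"
    ]


if __name__ == "__main__":
    print(explicitly_unstored_fields(SCHEMA_FRAGMENT))
```

Any field this reports cannot supply text for highlighting, so a highlighted field (or the default search field, if it is the one highlighted) should not appear in the list.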

Of, To, and Other Small Words

2014-07-14 Thread Teague James
Hello all, I am working with Solr 4.9.0 and am searching for phrases that contain words like "of" or "to" that Solr seems to be ignoring at index time. Here's what I tried: curl http://localhost/solr/update?commit=true -H "Content-Type: text/xml" --data-binary '100blah blah blah knowledge of scie

RE: Of, To, and Other Small Words

2014-07-14 Thread Teague James
is the stopword.txt. You could either empty that file out or change the field type for your field.
On Mon, Jul 14, 2014 at 12:53 PM, Teague James wrote:
> Hello all,
>
> I am working with Solr 4.9.0 and am searching for phrases that contain
> words like "of" or "to"

RE: Of, To, and Other Small Words

2014-07-14 Thread Teague James
ing) by default uses lang/stopwords_en.txt (which wouldn't be empty if you check). What you're looking at is the stopword.txt. You could either empty that file out or change the field type for your field.
On Mon, Jul 14, 2014 at 12:53 PM, Teague James wrote:
> Hello all,
>

RE: Of, To, and Other Small Words

2014-07-14 Thread Teague James
t.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
On Tue, Jul 15, 2014 at 8:10 AM, Teague James wrote:
> Hi Anshum,
>
> Thanks for replying and suggesting this, but the field type I am using (a
> modified text_general) in my schema ha

Contiguous Phrase Highlighting Example

2014-07-17 Thread Teague James
Hi everyone! Does anyone have any good examples of generating a contiguous highlight for a phrase? Here's what I have done: curl http://localhost/solr/collection1/update?commit=true -H "Content-Type: text/xml" --data-binary '100blah blah blah knowledge of science blah blah blah' Then, using a br

Indexing URLs for Binaries

2014-01-03 Thread Teague James
I am using Nutch 1.7 with Solr 4.6.0 to index websites that have links to binary files, such as Word, PDF, etc. The crawler crawls the site but I am not getting the URLs of the links for the binary files no matter how deep I set the settings for the site. I see the labels for the links in the conte

RE: Indexing URLs for Binaries

2014-01-03 Thread Teague James
: Re: Indexing URLs for Binaries
Check suffix-urlfilter.txt in your conf directory for Nutch. You might be prohibiting those filetypes from the crawl. - Mark
On 1/3/14, 10:29 AM, "Teague James" wrote:
>I am using Nutch 1.7 with Solr 4.6.0 to index websites that have links
>

Indexing URLs from websites

2014-01-07 Thread Teague James
I am trying to index a website that contains links to documents such as PDF, Word, etc. The intent is to be able to store the URLs for the links to the documents. For example, when indexing www.example.com which has links on the page like "Example Document" which points to www.example.com/docs/ex

Re: Indexing URLs from websites

2014-01-15 Thread Teague James
I am still unsuccessful in getting this to work. My expectation is that the index-anchor plugin should produce values for the field anchor. However this field is not showing up in my Solr index no matter what I try. Here's what I have in my nutch-site.xml for plugins: protocol-http|urlfilter-regex

RE: Indexing URLs from websites

2014-01-16 Thread Teague James
Hello Markus, I do get a linkdb folder in the crawl folder that gets created - but it is created automatically by Nutch at the time that I execute the command. I just tried to use solrindex against yesterday's crawl and did not get any errors, but did not get the anchor field or any of the outli

RE: Indexing URLs from websites

2014-01-16 Thread Teague James
Okay. I changed my solrindex to this:
bin/nutch solrindex http://localhost/solr/ crawl/crawldb crawl/linkdb crawl/segments/20140115143147
I got the same errors:
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/.../crawl/linkdb/crawl_fetch
Input path does

RE: Indexing URLs from websites

2014-01-16 Thread Teague James
Okay. I had used that previously and I just tried it again. The following generated no errors:
bin/nutch solrindex http://localhost/solr/ crawl/crawldb -linkdb crawl/linkdb -dir crawl/segments/
Solr is still not getting an anchor field and the outlinks are not appearing in the index anywhere e

RE: Indexing URLs from websites

2014-01-17 Thread Teague James
Progress! I changed the value of that property in nutch-default.xml and I am getting the anchor field now. However, the stuff going in there is a bit random and doesn't seem to correlate to the pages I'm crawling. The primary objective is that when there is something on the page that is a link

RE: Indexing URLs from websites

2014-01-21 Thread Teague James
What I'm getting is just the anchor text. In cases where there are multiple anchors I am getting a comma separated list of anchor text - which is fine. However, I am not getting all of the anchors that are on the page, nor am I getting any of the URLs. The anchors I am getting back never include

RE: Indexing URLs from websites

2014-01-22 Thread Teague James
Markus, With some help from another user on the Nutch list I did a dump and found that the URLs I am trying to capture are in Nutch. However, when I index them with Solr I am not getting them. What I get in the dump is this:
http://www.example.com/pdfs/article1.pdf
Status: 2 (db_fetched)
Fetch

Partial Word Search

2014-02-05 Thread Teague James
I cannot get Solr 4.6.0 to do partial word search on a particular field that is used for faceting. Most of the information I have found suggests modifying the fieldType "text" to include either the NGramFilterFactory or EdgeNGramFilterFactory in the filter. However since I am copying many other fie

RE: Partial Word Search

2014-02-06 Thread Teague James
enerally work since it will query for all the sub-terms, but AND will only work if all the sub-terms occur in the document field. -- Jack Krupansky
-----Original Message-----
From: Teague James
Sent: Wednesday, February 5, 2014 4:52 PM
To: solr-user@lucene.apache.org
Subject: Partial Word Search

RE: Partial Word Search

2014-02-06 Thread Teague James
l the sub-terms, but AND will only work if all the sub-terms occur in the document field. -- Jack Krupansky
-----Original Message-----
From: Teague James
Sent: Wednesday, February 5, 2014 4:52 PM
To: solr-user@lucene.apache.org
Subject: Partial Word Search
I cannot get Solr 4.6.0 to do partial

DIH and Tika

2014-02-17 Thread Teague James
Is there a way to specify the document types that Tika parses? In my DIH I index the content of a SQL database which has a field that points to the SQL record's binary file (which could be Word, PDF, JPG, MOV, etc.). Tika then uses the document URL to index that document's content. However there ar

Update with non UTF-8 characters

2014-10-01 Thread Teague James
Hello! I am indexing Solr 4.9.0 using the /update request handler and am getting errors from Tika - Illegal IOException from org.apache.tika.parser.xml.DcXMLParser@74ce3bea which is caused by MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence. I believe that this is the result

Tika HTTP 400 Errors with DIH

2014-12-02 Thread Teague James
Hi all, I am using Solr 4.9.0 to index a DB with DIH. In the DB there is a URL field. In the DIH Tika uses that field to fetch and parse the documents. The URL from the field is valid and will download the document in the browser just fine. But Tika is getting HTTP response code 400. Any ideas why

RE: Tika HTTP 400 Errors with DIH

2014-12-04 Thread Teague James
Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Tuesday, December 02, 2014 1:45 PM
To: solr-user
Subject: Re: Tika HTTP 400 Errors with DIH
On 2 December 2014 at 13:19, Teague James wrote:
> clob="true"
What is ClobTransformer doing on the DownloadURL field? Is it possibl

RE: Tika HTTP 400 Errors with DIH

2014-12-05 Thread Teague James
Regards,
>> Alex.
>> Personal: http://www.outerthoughts.com/ and @arafalov Solr resources
>> and newsletter: http://www.solr-start.com/ and @solrstart Solr
>> popularizers community: https://www.linkedin.com/groups?gid=6713853
>>
>>
>> On 4 December 2014 at