Not able to use the highlighting feature! Want to return snippets of text

2012-05-20 Thread 12rad

I indexed a structured pdf document in Solr. The problem is when I search
for a simple string - I get the entire content field as the response! I
don't know how to change that.

My requirement is that if, let's say, I search for "metadata", it should give me

"*Metadata*Discussion . . . 4 matches ... make sure that Tika users have a
chance to get to all of the* metadata* created and/or extracted by Tika. ==
Original Problem == The original inspiration for this page was a Tika ...
10.7k - rev: 2 (current) last modified: 2010-08-02 18:09:45 "

But it gives me the whole document!- the entire string that was indexed. It
seems like Lucene can only tell me in which field it occurred, not where in
the field it occurred

I posted the document like this:
   curl "http://localhost:8983/solr/update/extract?stream.file=/home/Desktop/DOCUMENTS/T.pdf&stream.contentType=application/pdf&literal.id=DOC_N&commit=true&captureAttr=true"
 

A query of *:* gives me the entire content of the document indexed in the
 field, and any other search returns the same thing.

Any help will be greatly appreciated!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-tp3985012.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Not able to use the highlighting feature! Want to return snippets of text. Urgent!!

2012-05-20 Thread 12rad
Also, 

the response just returns 

   


That is the name of the document. 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-Urgent-tp3985012p3985013.html


Re: Not able to use the highlighting feature! Want to return snippets of text

2012-05-20 Thread 12rad
My query parameters are these:

text:abstract&hl=true&hl.fl=text&f.text.hl.snippets=2&f.text.hl.fragsize=200&debugQuery=true

I still get the entire string as the result in the
 tag. 
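
For reference, the parameter string above can be turned into a full request URL as in the sketch below. It is only an illustration: the `/solr/select` path, the class name `HighlightQuery`, and the method names are assumptions not taken from this thread, and the parameter string as posted has no explicit `q=` key, which the sketch adds.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: assembles a Solr select URL with highlighting parameters.
class HighlightQuery {

    // URL-encode one key or value; UTF-8 is always available on the JVM.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // selectUrl is assumed to be something like http://localhost:8983/solr/select
    static String buildUrl(String selectUrl, String field, String term,
                           int snippets, int fragsize) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", field + ":" + term);          // the query itself
        params.put("hl", "true");                     // turn highlighting on
        params.put("hl.fl", field);                   // field to highlight
        params.put("f." + field + ".hl.snippets", String.valueOf(snippets));
        params.put("f." + field + ".hl.fragsize", String.valueOf(fragsize));

        StringBuilder sb = new StringBuilder(selectUrl).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            first = false;
        }
        return sb.toString();
    }
}
```

Note that Solr returns highlighted fragments in a separate `highlighting` section of the response. The document list still carries every stored field in full unless the `fl` parameter restricts which fields come back.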

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-Urgent-tp3985012p3985022.html


Re: Not able to use the highlighting feature! Want to return snippets of text

2012-05-21 Thread 12rad
The field I am trying to highlight is stored. 





In the searchHandler I've set the parameters as follows:

   on
   text
   5
   1000
   51
   true
   regex
   simple
   colored
   1000
   true
   true
   true


I still don't see any highlighting. I've managed to get snippets of text, but
the actual word is not highlighted. I don't know where I am going wrong.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-Urgent-tp3985012p3985174.html


Re: Not able to use the highlighting feature! Want to return snippets of text

2012-05-21 Thread 12rad
For the fragListBuilder
 it's 


fragment builder is 


  
  

  


 

  
  70
  
  0.5
  
  [-\w ,/\n\"']{20,200}

  


Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-Urgent-tp3985012p3985212.html


Remote streaming - posting a URL which is password protected

2012-05-21 Thread 12rad
I want to index an HTTP document that is password protected.
It sits behind a username and password login.
I tried this:

curl -u username:password "http://localhost:8983/solr/update/extract?literal.id=doc900&commit=true" -F stream.url=http://somewebsite.com/docs/DOC2609

but it indexes only the login page.
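
One point worth keeping in mind: `curl -u` supplies credentials only for the request curl itself makes, which here goes to Solr on localhost. Solr then fetches `stream.url` on its own, without those credentials. A possible workaround is to download the document first and post its bytes to Solr afterwards. The sketch below shows only the download step, and it assumes the site accepts HTTP Basic authentication, which this thread does not confirm (a form-based login would need a different approach). The class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: fetches a document protected by HTTP Basic authentication.
class ProtectedFetch {

    // Builds the value of the Authorization header for Basic authentication.
    static String basicAuthHeader(String user, String password) {
        String raw = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Downloads the document body; assumes the server honours Basic auth.
    static byte[] fetch(String url, String user, String password) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Authorization", basicAuthHeader(user, password));
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        }
    }
}
```

The downloaded bytes could then be sent to `/update/extract` as the request body instead of passing `stream.url`.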

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Remote-streaming-posting-a-URL-which-is-password-protected-tp3985221.html


clickable links as results?

2012-05-22 Thread 12rad
Hi, 

When a search matches, I want to display a clickable link to the document
along with the number of times the search query matched.
What should I be looking at?
I am fairly new to Solr and don't know how to achieve this.

Thanks for the help!
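
As a rough client-side illustration of the request above, the sketch below renders one result as an HTML link followed by a match count. It assumes the application already has the document's URL, a title, and its stored text (for example from a search response). The class `ResultLink` and its methods are hypothetical, and the count is a plain case-insensitive substring count, not Solr's own scoring or term statistics.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: renders a search hit as a link with a match count.
class ResultLink {

    // Counts case-insensitive occurrences of term as a literal substring.
    static int countMatches(String text, String term) {
        Matcher m = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE)
                .matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Minimal HTML escaping for attribute values and element text.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Produces e.g. <a href="...">title</a> (N matches)
    static String render(String url, String title, String text, String term) {
        return "<a href=\"" + escapeHtml(url) + "\">" + escapeHtml(title) + "</a> ("
                + countMatches(text, term) + " matches)";
    }
}
```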



--
View this message in context: 
http://lucene.472066.n3.nabble.com/clickable-links-as-results-tp3985505.html


Re: Not able to use the highlighting feature! Want to return snippets of text

2012-05-22 Thread 12rad
That worked! 
Thanks!
I did  
 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-Urgent-tp3985012p3985507.html


SolrJ indexing pdf documents

2012-06-16 Thread 12rad
Hi, 

I'm new to SolrJ.
Here are the steps I followed to write an application that indexes PDF
documents into a fresh Solr 3.6 install:

1 - In schema.xml:
I added the fields I wanted indexed and set stored to true.

2 - Started Solr using java -jar start.jar from the /example dir in Solr 3.6

3- In my application I start the server
solrServer = new CommonsHttpSolrServer(url);
solrServer.setParser(new XMLResponseParser());

4 - I index the data like this:
ContentStreamUpdateRequest index = new ContentStreamUpdateRequest("/update/extract");
index.setParam("literal.id", "doc");
index.setParam(CommonParams.STREAM_FILE, "/location/x.pdf");
index.setParam(CommonParams.STREAM_CONTENTTYPE, "application/pdf");
index.setParam(UpdateParams.COMMIT, "true");

5 - I commit using solrServer.commit().

When I run a simple query like *:* I don't see anything. The numDocs count
is still 0.
What am I doing incorrectly?
Any help would be greatly appreciated. 

Thanks!
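
One likely cause, not confirmed in this thread: the steps above build the `ContentStreamUpdateRequest` but never hand it to the server, so the only thing actually sent is the empty `commit()`. In SolrJ a request object is sent with `solrServer.request(index)`. To check the server side independently of SolrJ, the same upload can be sketched with plain `java.net`, posting the PDF bytes as the request body. The class name, method names, and the base URL `http://localhost:8983/solr` are assumptions for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: uploads a PDF to Solr's extracting handler without SolrJ.
class ExtractUpload {

    // Builds the /update/extract URL with a literal id and an immediate commit.
    static String extractUrl(String solrBase, String id) {
        try {
            return solrBase + "/update/extract?literal.id="
                    + URLEncoder.encode(id, "UTF-8") + "&commit=true";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Posts the file bytes as the request body and returns the HTTP status code.
    static int postPdf(String solrBase, String id, String pdfPath) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(extractUrl(solrBase, id)).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/pdf");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(Files.readAllBytes(Paths.get(pdfPath)));
        }
        return conn.getResponseCode();
    }
}
```

A call such as `ExtractUpload.postPdf("http://localhost:8983/solr", "doc", "/location/x.pdf")` should return 200 if the handler accepted the document, after which `*:*` should report a non-zero numDocs.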

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-indexing-pdf-documents-tp3989965.html


Nutch 1.4 with Solr 3.6 - compatible?

2012-07-03 Thread 12rad
Hi 

I am new to nutch and was trying it out using the instructions here 
http://wiki.apache.org/nutch/NutchTutorial

After replacing Solr's schema.xml with the one Nutch provides, I keep getting
this error.
I am unable to start the Solr server.

org.apache.solr.common.SolrException: undefined field text
    at org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1330)
    at org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getAnalyzer(IndexSchema.java:408)
    at org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer.reusableTokenStream(IndexSchema.java:383)
    at org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:574)
    at org.apache.solr.search.SolrQueryParser.getFieldQuery(SolrQueryParser.java:206)
    at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1429)
    at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1317)
    at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1245)
    at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1234)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:206)
    at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:79)
    at org.apache.solr.search.QParser.getQuery(QParser.java:143)
    at org.apache.solr.request.SimpleFacets.getFacetQueryCounts(SimpleFacets.java:233)
    at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:194)
    at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:186)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)

Has anybody faced a similar issue?
Do let me know. 
Thanks! 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Nutch-1-4-with-Solr-3-6-compatible-tp3992891.html