indexing with pdf files problem

2010-07-12 Thread satya swaroop
Hi all,
  I am running Solr on Tomcat. Indexing works for XML files, but when I
send .doc, HTML, or PDF files through curl I get a "lazy loading error".
Can you tell me how to fix this? The output below is what I get when I send
a PDF file. I am working on Ubuntu. Solr home is /opt/example and
Tomcat is in /opt/tomcat6.


Apache Tomcat/6.0.26 - Error report
HTTP Status 500 - lazy loading error

org.apache.solr.common.SolrException: lazy loading error
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:249)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException:
java.lang.NullPointerException
at
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:76)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:244)
... 16 more
Caused by: java.lang.NullPointerException
at
org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:73)
at
org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:90)
at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:99)
at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:84)
at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:61)
at
org.apache.solr.handler.extraction.ExtractingRequestHandler.inform(ExtractingRequestHandler.java:74)
... 17 more

Re: indexing with pdf files problem

2010-07-13 Thread satya swaroop
Hi,
   I installed Tika, copied its jar files into the Solr home library
directory, and also gave the path to the Tika configuration file, but the
error is the same. The Tika config file is as follows:

(The XML markup of the config file was stripped by the archive. What survives
is the namespace http://purl.org/dc/elements/1.1/ and the MIME types below.)

 application/xml
 application/msword
 application/vnd.ms-excel
 application/vnd.ms-powerpoint
 text/html
 application/x-asp
 application/rtf
 application/pdf
 text/plain
 application/vnd.sun.xml.writer
 application/vnd.oasis.opendocument.text

with regards,
swaroop


indexing rich documents

2010-07-13 Thread satya swaroop
Hi all,
 I am new to Solr. I followed the wiki and got the Solr admin page
running successfully. Indexing works well for XML files, but I am unable to
index rich documents. I followed the wiki for rich documents as well, but it
did not work: when I send a PDF or HTML file I get a "lazy loading error".
Can anyone give a detailed description of how to make rich documents
indexable?
 I use Tomcat and work on Ubuntu. The home directory for Solr is
/opt/solr/example and Catalina home is /opt/tomcat6.


thanks & regards,
 swaroop


Re: indexing rich documents

2010-07-13 Thread satya swaroop
Hi,
Yes, I followed the wiki. Can you now tell me the procedure for it?
  Regards,
   swaroop


Re: indexing rich documents

2010-07-13 Thread satya swaroop
Yes, I checked the extraction request handler page but could not find the
information I need. I installed tika-0.7 and copied the jar files into the
Solr home library directory. When I send PDF/HTML files I get a "lazy
loading error". I am using Tomcat and Solr 1.4.
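
One thing that may be worth ruling out here is two versions of the same jar sitting in the library directory after copying the Tika jars in. That this causes the error above is only an assumption, not something the thread confirms. A minimal sketch that lists artifacts present in more than one version follows; the function names and the directory you pass in are illustrative.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches names such as "tika-core-0.7.jar" -> ("tika-core", "0.7").
JAR_NAME = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def jar_versions(lib_dir):
    """Map each artifact name in lib_dir to the sorted list of versions found."""
    found = defaultdict(list)
    for jar in Path(lib_dir).glob("*.jar"):
        match = JAR_NAME.match(jar.name)
        if match:
            found[match.group("name")].append(match.group("version"))
    return {name: sorted(versions) for name, versions in found.items()}


def duplicated_jars(lib_dir):
    """Return only the artifacts that appear in more than one version."""
    return {n: v for n, v in jar_versions(lib_dir).items() if len(v) > 1}
```

Calling duplicated_jars() on the library directory returns an empty dict when every artifact is present only once. Versions are compared as plain strings, so the ordering is only for display.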


problem with storing??

2010-07-15 Thread satya swaroop
Hi all,
   I am new to Solr. I followed the wiki and got everything going.
But when I send any HTML/TXT/PDF document, the response is as
follows:

<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">576</int></lst>
</response>

But when I search in Solr I do not find the document. Can anyone tell me
what needs to be done?
The curl command I used for the above output is:

curl 'http://localhost:8080/solr/update/extract?literal.id=doc1000&commit=true&fmap.content=text' -F "myfi...@java.pdf"

regards,
 satya
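
For reference, the request URL above can also be assembled programmatically so that every parameter is encoded consistently. This is only a sketch: the host, port, and parameter names are the ones from the message, while the function name and its defaults are illustrative. It builds the URL only and does not upload the file.

```python
from urllib.parse import urlencode

# Base URL taken from the message above; adjust host and port to your setup.
EXTRACT_URL = "http://localhost:8080/solr/update/extract"


def extract_url(doc_id, commit=True, content_field="text"):
    """Build the /update/extract URL with properly encoded query parameters."""
    params = [("literal.id", doc_id), ("fmap.content", content_field)]
    if commit:
        params.append(("commit", "true"))
    return EXTRACT_URL + "?" + urlencode(params)


print(extract_url("doc1000"))
```

Whether the document can then be found by a search still depends on the schema: the field that fmap.content points at has to be indexed, and the query has to search that field.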


Re: problem with storing??

2010-07-15 Thread satya swaroop
Hi,
   I sent a commit after adding the documents, but the problem is the same.

regards,
  satya


no response

2010-07-15 Thread satya swaroop
Hi all,
I have a problem with Solr. When I send .doc documents I do
not get any response.
  Example:
 sa...@geodesic-desktop:~/Desktop$ curl "http://localhost:8080/solr/update/extract?stream.file=/home/satya/Desktop/InvestmentDecleration.doc&stream.contentType=application/msword&literal.id=Invest.doc"
sa...@geodesic-desktop:~/Desktop$


Could anybody tell me what to do?


Re: no response

2010-07-16 Thread satya swaroop
Hi,
   I am sorry, the mail you sent ended up in my sent-mail folder and I did
not see it. I am going to check now and will definitely tell you everything.

regards,
  satya


Re: problem with storing??

2010-07-16 Thread satya swaroop
Hi,
I checked the admin page and indexing works for other files. I do not
see anything in the log files when I send the documents; I checked the
Catalina (Tomcat) log. I changed the dismax handler query from q=*:* to q=   .
Now I at least get a response when I send PDF/HTML files, but I still get none
for .doc files.


regards,
  swaroop


Re: problem with storing??

2010-07-18 Thread satya swaroop
Hi all,
   Solr is working well now. I am working on Ubuntu and I was indexing
documents that did not have the required permissions, so that was the
problem. Thank you all for your replies to my queries.
  Thanking you,
   satya
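
A quick way to catch this kind of permission problem before sending files with stream.file is to check readability first. The sketch below is a minimal example with an illustrative function name. It checks access for the user running the script, so it only tells you something about Solr if it is run as the same user that Tomcat runs as.

```python
import os


def unreadable_files(paths):
    """Return the paths that the current user cannot open for reading."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]
```

An empty list means every listed file exists and is readable by the current user. Missing files are reported as well.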


spell checking....

2010-07-26 Thread satya swaroop
Hi all,
I am new to Solr and was able to get document indexing working
by following the Solr wiki. Now I am trying to add spellchecking. I
followed the spellcheck component page in the wiki but I am not getting
suggested spellings. I first built the dictionary with spellcheck.build=true.

Here is an example:

http://localhost:8080/solr/spell?q=javs&spellcheck=true&spellcheck.collate=true

(The XML response was stripped by the archive.)

The response should actually suggest "java" but it did not.

Can anyone guide me on this?
 I am using Solr 1.4 and Tomcat on Ubuntu.





Regards,
swarup


Re: spell checking....

2010-07-26 Thread satya swaroop
This is in solrconfig.xml (the XML markup was stripped by the archive; the
surviving values are listed in their original order, one group per line):

  default, solr.IndexBasedSpellChecker, spell, ./spellchecker, 0.7, true, true
  jarowinkler, lowerfilt, org.apache.lucene.search.spell.JaroWinklerDistance, ./spellchecker, true, true
  textSpell

I added the following to the standard request handler:

  explicit, default, false, false, 1
  spellcheck


spell checking problem

2010-07-29 Thread satya swaroop
Hi all,
  I need some help with spellchecking. I configured my solrconfig.xml and
schema.xml by looking through the user mailing list, and here is the
configuration I made.

my schema.xml: (contents stripped by the archive)

my solrconfig.xml (XML markup stripped by the archive; the surviving values
are listed in their original order, one group per line):

  default, false, false, 5
  spellcheck
  spellText
  default, name, ./spell, true
  jarowinkler, spell, org.apache.lucene.search.spell.JaroWinklerDistance, ./spellcheckerjaro

1) For the default dictionary the index gets created, but if I type "jawa"
the suggestions are "data" and "sata", while the expected suggestion is
"java". I have nearly 20 Java documents indexed.
2) If I build the jarowinkler dictionary, which uses the "spell" field, the
dictionary is not created; I only see segments.gen and segments_1 in its
directory.


regards,
satya
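
To see why "java" would be expected ahead of "data" and "sata", it helps to compare plain edit distances. The sketch below implements Levenshtein distance by hand. It is not how Lucene's spellchecker is implemented (that also involves an n-gram index and other factors), so treat it only as an illustration of the distance idea.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


for candidate in ("java", "data", "sata"):
    print(candidate, edit_distance("jawa", candidate))
```

This prints 1 for "java" and 2 for both "data" and "sata". If a closer word is still not suggested, one thing to check is whether that word actually occurs in the field the dictionary is built from.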


indexing???

2010-08-12 Thread satya swaroop
Hi all,
   Indexing in Solr is going well, but I got an error when indexing
a single PDF file. When I searched the mailing list for the error, I found
that it was attributed to the copyright protection on that file. Can't we
index a file that has copyright or other digital rights restrictions?

regards,
  satya


Re: indexing???

2010-08-16 Thread satya swaroop
Hi all,
   The error I got is "Unexpected RuntimeException from
org.apache.tika.parser.pdf.pdfpar...@8210fc" when I indexed a file similar
to samplerequestform.pdf, which is attached to
   https://issues.apache.org/jira/browse/PDFBOX-709
Can't we index files of that type in Solr?

regards,
satya


stream.url problem

2010-08-17 Thread satya swaroop
Hi all,
   I am indexing documents that are on my own system into Solr. Now I need
to index files that are on a remote system. I enabled remote streaming
in solrconfig.xml, but when I use stream.url I get a "Connection refused"
error.

When I send this request from my browser:

http://localhost:8080/solr/update/extract?stream.url=http://remotehost/home/san/Desktop/programming_erlang_armstrong.pdf&literal.id=schb2

I get the following error:

HTTP Status 500 - Connection refused java.net.ConnectException: Connection
refused at sun.reflect.GeneratedConstructorAccessor11.newInstance(Unknown
Source) at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at
sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1368)
at java.security.AccessController.doPrivileged(Native Method) at
sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1362)
at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1016)
at
org.apache.solr.common.util.ContentStreamBase$URLStream.getStream(ContentStreamBase.java:88)
at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:161)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:619) Caused by:
java.net.ConnectException: Connection refused at
java.net.PlainSocketImpl.socketConnect(Native Method) at
java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333) at
java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195) at
java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182) at
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at
java.net.Socket.connect(Socket.java:525) at
java.net.Socket.connect(Socket.java:475) at
sun.net.NetworkClient.doConnect(NetworkClient.java:163) at
sun.net.www.http.HttpClient.openServer(HttpClient.java:394) at
sun.net.www.http.HttpClient.openServer(HttpClient.java:529) at
sun.net.www.http.HttpClient.<init>(HttpClient.java:233) at
sun.net.www.http.HttpClient.New(HttpClient.java:306) at
sun.net.www.http.HttpClient.New(HttpClient.java:323) at
sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:860)
at
sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:801)
at
sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
at
sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2173)
at java.net.URLConnection.getContentType(URLConnection.java:485) at
org.apache.solr.common.util.ContentStreamBase$URLStream.<init>(ContentStreamBase.java:81)
at
org.apache.solr.servlet.SolrRequestParsers.buildRequestFrom(SolrRequestParsers.java:136)
at
org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:116)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:225)
...


If anybody knows the cause,
please help me with this.

regards,
satya
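
"Connection refused" means that nothing accepted the connection on the host and port Solr tried to reach, so it can help to check which port a given stream.url actually resolves to. The sketch below uses only URL parsing and makes no network calls. The function name is illustrative and port 8011 is just an example value.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def target_of(url):
    """Return the (host, port) a client would connect to for this URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port


print(target_of("http://remotehost/home/san/Desktop/programming_erlang_armstrong.pdf"))
print(target_of("http://remotehost:8011/file_download.yaws?file=apache.txt"))
```

A URL without an explicit port, like the one in the request above, goes to port 80. If the remote web server listens on another port, that port has to appear in the stream.url.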


Re: indexing???

2010-08-17 Thread satya swaroop
Hi,

1) I use Tika 0.8.

2) The URL is https://issues.apache.org/jira/browse/PDFBOX-709 and the
file is samplerequestform.pdf.

3) The entire error is:
curl "http://localhost:8080/solr/update/extract?stream.file=/home/satya/my_workings/satya_ebooks/8-Linux/samplerequestform.pdf&literal.id=linuxc"

Apache Tomcat/6.0.26 - Error report
HTTP Status 500 - org.apache.tika.exception.TikaException:
Unexpected RuntimeException from
org.apache.tika.parser.pdf.pdfpar...@1d688e2

org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Unexpected RuntimeException from
org.apache.tika.parser.pdf.pdfpar...@1d688e2
at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:214)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.tika.exception.TikaException: Unexpected
RuntimeException from org.apache.tika.parser.pdf.pdfpar...@1d688e2
at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:144)
at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:193)
... 18 more
Caused by: java.lang.ClassCastException:
org.apache.pdfbox.pdmodel.font.PDFontDescriptorAFM cannot be cast to
org.apache.pdfbox.pdmodel.font.PDFontDescriptorDictionary
at
org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.ensureFontDescriptor(PDTrueTypeFont.java:167)
at
org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:117)
at
org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:140)
at
org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:76)
at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:115)
at
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:225)
at
org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at
org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at
org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at
org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at
org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:79)
at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)
... 21 more

solr working...

2010-08-18 Thread satya swaroop
Hi all,
I am very interested in how Solr works internally. Can anyone tell me
which modules or classes get invoked when we start a servlet
container such as Tomcat, when we send a request to Solr (for example a PDF
file), and which files are read when Solr starts?

regards,
satya


/update/extract

2010-08-19 Thread satya swaroop
Hi all,
   When the extract request handler handles a request, which class gets
invoked? I need to know the sequence of classes involved when we send a file
to Solr. Can anybody point me to the classes, or to any source where I can
find the answer? Or can anyone tell me which classes get invoked when Solr
starts? I would be thankful for any help with this.

Regards,
satya


Re: stream.url problem

2010-08-24 Thread satya swaroop
>
> Hi all,
> I found the solution to my problem. I had changed my port number but
> kept the old one in the stream.url, so that was the problem.
> Thanks, all.
>
> Now I have another problem. It occurs when I send a request to the remote
> system for a file whose name contains characters that need escaping, such
> as "&" or a space. For example, for Tom&Jerry.pdf I get the error
> "Unexpected end of file from server".
>
> The request I sent is:
>
> curl "http://localhost:8080/solr/update/extract?stream.url=http://remotehost:8011/file_download.yaws?file=Wireless%20Lan.pdf&literal.id=su8"
>
> Here file_download.yaws is a module that fetches the file and hands it to
> Solr.
>
> Solr is able to index remote files whose names do not contain such
> characters, for example apache.txt and solr_apache.pdf.
>
> The error I got is:
>
> HTTP Status 500 - Unexpected end of file from server
> java.net.SocketException: Unexpected end of file from server at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at
> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1368)
> at java.security.AccessController.doPrivileged(Native Method) at
> sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1362)
> at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1016)
> at
> org.apache.solr.common.util.ContentStreamBase$URLStream.getStream(ContentStreamBase.java:88)
> at
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:161)
> at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:57)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:133)
> at
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1355) at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
> at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
> at java.lang.Thread.run(Thread.java:619) Caused by:
> java.net.SocketException: Unexpected end of file from server at
> sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769) at
> sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632) at
> sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:766) at
> sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632) at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1072)
> at
> sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2173)
> at java.net.URLConnection.getContentType(URLConnection.java:485) at
> org.apache.solr.common.util.ContentStreamBase$URLStream.<init>(ContentStreamBase.java:81)
> at
> org.apache.solr.servlet.SolrRequestParsers.buildRequestFrom(SolrRequestParsers.java:138)
> at
> org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:117)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:226)
> ...
>



Regards,
 satya


reduce the content???

2010-08-25 Thread satya swaroop
Hi all,
  I indexed nearly 100 Java PDF files, each of which is large (at least 1 MB).
Solr returns the entire indexed content with each result, which makes the
results slow to display. Can we reduce the content that is returned, or can
I get just the file names and IDs instead of the entire content in the
results?

Regards,
satya
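
One way to keep responses small is Solr's fl (field list) request parameter, which restricts the fields returned for each hit. The sketch below builds such a query URL. The /solr/select path and the function name are assumptions about this setup, and "id" is the field used as literal.id in the earlier requests.

```python
from urllib.parse import urlencode

SELECT_URL = "http://localhost:8080/solr/select"  # assumed default select handler


def search_url(query, fields=("id",), rows=10):
    """Build a query URL that asks Solr to return only the listed fields."""
    params = {"q": query, "fl": ",".join(fields), "rows": rows}
    return SELECT_URL + "?" + urlencode(params)


print(search_url("java"))
```

A file name can be returned the same way only if it was stored in a field of its own at index time; which field that would be depends on the schema.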


solr working...

2010-08-26 Thread satya swaroop
Hi all,
  I am interested in understanding how Solr works.
1) Can anyone tell me where to start?

Regards,
satya


Re: solr working...

2010-08-26 Thread satya swaroop
Hi Peter,
I am already working with Solr and it is working well. But I want
to understand the code: where the actual work happens, how indexing is done,
how requests are parsed, how responses are produced, and so on. That is why
I asked where to start in order to understand the code.

Regards,
satya


Re: solr working...

2010-08-26 Thread satya swaroop
Hi all,

  Thanks for your responses and information. I used the slf4j logger and put
a log.info call in every class of the Solr module to find out which classes
get invoked for a particular request handler or at Solr startup. I was able
to do this only in the Solr module, not in the Lucene module; I get an error
when I use it in that module. Can anyone suggest other ways to trace the
code path in Solr?

Regards,
  satya


stream.url

2010-09-02 Thread satya swaroop
Hi all,

  I am using stream.url to index files on a remote system. When I
use the URL as
1) curl "http://localhost:8080/solr/update/extract?stream.url=http://remotehost:port/file_download.yaws?file=yaws_presentation.pdf&literal.id=schb4"
it works and I get a response saying the file was indexed.

But when I use
2) curl "http://localhost:8080/solr/update/extract?stream.url=http://remotehost:port/file_download.yaws?file=solr & apache.pdf&literal.id=schb5"
I get an error in Solr. I replaced the special characters with %20
for the space and %26 for &, but the error is the same:

"Unexpected end of file from server java.net.SocketException.."

When I request http://remotehost:port/file_download.yaws?file=solr & apache.pdf
directly, without Solr, the file is downloaded to my system.

The entire error is enclosed here:

HTTP Status 500 - Unexpected end of file from server
java.net.SocketException: Unexpected end of file from server at
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at
sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1368)
at java.security.AccessController.doPrivileged(Native Method) at
sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1362)
at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1016)
at
org.apache.solr.common.util.ContentStreamBase$URLStream.getStream(ContentStreamBase.java:88)
at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:169)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:57)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:133)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1355) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:619) Caused by:
java.net.SocketException: Unexpected end of file from server at
sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769) at
sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632) at
sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:766) at
sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632) at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1072)
at
sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2173)
at java.net.URLConnection.getContentType(URLConnection.java:485) at
org.apache.solr.common.util.ContentStreamBase$URLStream.<init>(ContentStreamBase.java:81)
at
org.apache.solr.servlet.SolrRequestParsers.buildRequestFrom(SolrRequestParsers.java:138)
at
org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:117)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:226)
... 12 more


Can anybody provide information regarding this?


Regards,
Satya


Re: stream.url

2010-09-02 Thread satya swaroop
Hi Stefan,
   I used escape characters and it worked for the single file
'solr & apache', but the same problem shows up for files
like Wireless lan.ppt and Tom info.pdf.

The curl command I sent is:

curl "http://localhost:8080/solr/update/extract?stream.url=http://remotehost:port/file_download.yaws%3Ffile=solr%20%26%20apache.pdf&literal.id=schb5"

Regards,
satya


Re: stream.url

2010-09-02 Thread satya swaroop
Hi,
I ran curl from the shell (command prompt or terminal) with the
escaped characters, but the error is the same. When I looked at the remote
system, the request was not arriving there. Is there anything that must be
changed in a config file in order to allow escaped characters in stream.url?

Has anybody tried indexing files on a remote system through stream.url where
the file names contain characters such as & or a space?

regards,
satya


Re: stream.url

2010-09-03 Thread satya swaroop
Hi all,

  I am unable to index files on a remote system when their file names
contain characters that need escaping. I think there is a problem in Solr
with indexing such files from a remote system.
Has anybody tried to index remote files whose names contain escaped
characters? Solr works fine for files that have no escaped characters in
their names.


I sent the request through curl with the file name URL-encoded, but the
problem is the same.

Regards,
satya


Re: stream.url

2010-09-08 Thread satya swaroop
Hi Hoss,

 Thanks for the reply; it is working now. The reason was, as you said,
that I was not double escaping. I used %2520 for whitespace and it works.
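
A minimal sketch of what the double escaping looks like, written in Python
rather than curl. The file name is escaped once for the remote server's own
query string, and the whole remote URL is then escaped again as the value of
stream.url, which is how a space ends up as %2520. The port 8000 and the
helper name build_extract_url are placeholders for illustration only.

```python
from urllib.parse import quote, urlencode


def build_extract_url(solr_base, remote_base, filename, doc_id):
    """Build an /update/extract URL whose stream.url value is escaped twice."""
    # First pass: escape the file name for the remote server's query string.
    inner = f"{remote_base}?file={quote(filename, safe='')}"
    # Second pass: urlencode escapes the whole inner URL again, so each '%'
    # from the first pass becomes '%25' (a space ends up as '%2520').
    params = urlencode({"stream.url": inner, "literal.id": doc_id})
    return f"{solr_base}/update/extract?{params}"


url = build_extract_url(
    "http://localhost:8080/solr",
    "http://remotehost:8000/file_download.yaws",  # port 8000 is a placeholder
    "Wireless lan.ppt",
    "schb5",
)
print(url)
```

Solr decodes the request once to obtain the stream.url value, and what is
left is still a correctly escaped URL for the remote server.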

Thanks,
satya


cloud or zookeeper

2010-09-14 Thread satya swaroop
Hi All,
   What is the difference between using shards, SolrCloud and ZooKeeper?
Which is the best way to scale Solr?
 I need to reduce the index size on every system and reduce the search time
for a query.

Regards,
satya


SolrCloud new....

2010-09-20 Thread satya swaroop
Hi all,
I have 4 instances of Solr on 4 systems, one instance per system. I want
results from all of these servers. I came to know about SolrCloud, read
about it, and worked through the example; it worked as described in the wiki.
I am using Solr 1.4 and Apache Tomcat. What procedure should be followed to
implement cloud in the Solr trunk?
1) Should I copy the libraries from cloud to trunk?
2) Should I keep the cloud module on every system?
3) I am not using any cores in Solr. It is a single Solr on every
system. Can SolrCloud support that?
4) The example is given for Jetty. Is it done the same way in Tomcat?

Regards,
satya


ant package

2010-09-21 Thread satya swaroop
Hi all,
I want to build a package of my Solr and I found it can be done
using ant. When I run ant package in the solr module I get this error:


sa...@swaroop:~/temporary/trunk/solr$ ant package
Buildfile: build.xml

maven.ant.tasks-check:

BUILD FAILED
/home/satya/temporary/trunk/solr/common-build.xml:522:
##
  Maven ant tasks not found.
  Please make sure the maven-ant-tasks jar is in ANT_HOME/lib, or made
  available to Ant using other mechanisms like -lib or CLASSPATH.
  ##

Total time: 0 seconds


Can anyone tell me the procedure to build it, or give any information about
it?

Regards,
satya


Re: ant package

2010-09-21 Thread satya swaroop
Hi,
  Yes, I don't have the jar file in ant/lib. Where can I get the jar
file, or what is the procedure to build maven-artifact-ant-2.0.4-dep.jar?

regards,
satya


Re: ant package

2010-09-21 Thread satya swaroop
Hi Erick,
 Thanks for the reply. I downloaded the jar file and put it in the
ant library.
Now when I run the ant package command it fails in the middle of the build,
in generate-maven-artifacts. The error is:

sa...@geodesic-desktop:~/temporary/trunk/solr$ sudo  ant  package
---
---
---
generate-maven-artifacts:
[mkdir] Created dir: /home/satya/temporary/trunk/solr/build/maven
[mkdir] Created dir: /home/satya/temporary/trunk/solr/dist/maven
 [copy] Copying 1 file to
/home/satya/temporary/trunk/solr/build/maven/src/maven
[artifact:install-provider] Installing provider:
org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2

BUILD FAILED
/home/satya/temporary/trunk/solr/build.xml:853: The following error occurred
while executing this line:
/home/satya/temporary/trunk/solr/common-build.xml:373: artifact:deploy
doesn't support the "uniqueVersion" attribute

Total time: 1 minute 51 seconds
sa...@desktop:~/temporary/trunk/solr$

Regards,
satya


ant build problem

2010-10-04 Thread satya swaroop
Hi all,
I updated my Solr trunk to revision 1004527. When I compile the trunk
with ant I get many warnings, but the build is successful. The
warnings are here:
common.compile-core:
[mkdir] Created dir:
/home/satya/temporary/trunk/lucene/build/classes/java
[javac] Compiling 475 source files to
/home/satya/temporary/trunk/lucene/build/classes/java
[javac] warning: [path] bad path element
"/usr/share/ant/lib/hamcrest-core.jar": no such file or directory
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:455:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:705:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:812:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/queryParser/QueryParserTokenManager.java:983:
warning: [cast] redundant cast to int
[javac]  int hiByte = (int)(curChar >> 8);
[javac]   ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/search/FieldCacheImpl.java:209:
warning: [unchecked] unchecked cast
[javac] found   : java.lang.Object
[javac] required: T
[javac] key.creator.validate( (T)value, reader);
[javac]  ^
[javac]
/home/satya/temporary/trunk/lucene/src/java/org/apache/lucene/search/FieldCacheImpl.java:278:
warning: [unchecked] unchecked call to
Entry(java.lang.String,org.apache.lucene.search.cache.EntryCreator) as a
member of the raw type org.apache.lucene.search.FieldCacheImpl.Entry
[javac] return (ByteValues)caches.get(Byte.TYPE).get(reader, new
Entry(field, creator));
ptionList.addAll(exceptions);

||

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files additionally use unchecked or unsafe
operations.
[javac] 100 warnings

BUILD SUCCESSFUL
Total time: 19 seconds


Here I have included only the first part of the warnings.
After compiling I ran ant test as a check, but it failed.

I didn't find any hamcrest-core.jar in my ant library.
I use ant 1.7.1.


Regards,
satya


solr requirements

2010-10-18 Thread satya swaroop
Hi All,
I am planning to have a separate server for Solr and I have a
question about what hardware configuration is needed.
I know it is hard to say, but I just need a minimum requirement for the
following situation:


1) There are 1000 regular users of Solr and every day each user indexes
10 files of 1KB each, which adds up to 10MB a day, and it goes on like
that. What hardware would this need?

2) How much RAM does Solr use in general?

Thanks,
satya


Re: solr requirements

2010-10-18 Thread satya swaroop
Hi,
   Here is some more info about it. I use Solr to output only the file
names (file ids). Here I enclose the fields in my schema.xml; at present I
have only about 40MB of indexed data.

Regards,
satya


RAM increase

2010-10-20 Thread satya swaroop
Hi all,
  I increased my RAM to 8GB and I want 4GB of it to be used
for Solr itself. Can anyone tell me how to allocate that RAM to
Solr?


Regards,
satya


solr result....

2010-10-27 Thread satya swaroop
Hi,
  Can a Solr result show only a part of the content of a
matching document?
For example:

if I send a query to search for tika, then the result should be as follows:


-
0
79

-

-
   text/html

 1html
-
-
   Apache Tomcat/6.0.26 - Error reportHTTP Status 500 -
org.apache.tika.exception.TikaException: Unexpected RuntimeException from
org.apache.tika.parser.pdf.pdfpar...@cc9d70

org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Unexpected RuntimeException from
org.apache.tika.parser.pdf.pdfpar...@cc9d70
at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:214)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:237)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1323)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)...

 
   



The result should not show the entire content of a file. It should show
only the part of the content where the query word is present, like a
Google result or the search results at Lucid Imagination.

Regards,
satya


Re: solr result....

2010-10-28 Thread satya swaroop
Hi Lance,
  I actually copied Tika exceptions into an HTML file and indexed
it. It is just the content of a file. Here is what I mean:


if I post a query like *java*, then the response from Solr should
show only a part of the content, as follows:

http://localhost:8456/solr/select/?q=java&version=2.2&start=10&rows=10&indent=on

-
-
0
453

-
-
-
application/pdf

javaebuk
2001-07-02T11:54:10Z
-
-

A Java program with two main methods  The following is an example of a java
program with two main methods with different signatures.
Program 3
public class TwoMains
{
/** This class has two main methods with
* different signatures */
public static void main (String args[])  .
  
 
.






The doc in the result should not contain the entire content of a file. It
should have only a part of the content, starting at the first hit
of the word java in that file.


Regards,
satya


Re: RAM increase

2010-10-29 Thread satya swaroop
Hi All,

 Thanks for your reply. I am unsure whether to increase the RAM or the
heap size for Java or for Tomcat, where Solr is running.


Regards,
satya


Google like search

2010-12-14 Thread satya swaroop
Hi All,
 Can we get results like Google's, with some text about the
search? I was able to get the first 300 characters of a
file, but that is not helpful for me. Can I get the text around the
first occurrence of the search keyword in that file?

Regards,
Satya


Re: Google like search

2010-12-14 Thread satya swaroop
Hi Tanguy,
  I am not asking for highlighting. I think it can be
explained with an example. Here I illustrate it:

When I post a query like this:

http://localhost:8080/solr/select?q=Java&version=2.2&start=0&rows=10&indent=on

I get the result as follows:

-
-
0
1

-
-
Java%20debugging.pdf
122
-
-
Table of Contents
If you're viewing this document online, you can click any of the topics
below to link directly to that section.
1. Tutorial tips 2
2. Introducing debugging  4
3. Overview of the basics 6
4. Lessons in client-side debugging 11
5. Lessons in server-side debugging 15
6. Multithread debugging 18
7. Jikes overview 20






Here the str field contains the first 300 characters of the file, as I set
up a field in schema.xml to copy only 300 characters.
But I don't want the content like this. Is there any way to get output as
follows:

 Java is one of the best language,java is easy to learn...


where this content is at the start of the chapter in which the word java
first occurs in the file?
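
To make the wanted output concrete, here is a small client-side sketch in
Python. It assumes the full extracted text of the document is available to
the client (for example from a stored field) and simply cuts a snippet
starting at the first occurrence of the search word. The function name
first_hit_snippet and the width value are made up for the illustration;
Solr's own highlighting component (hl=true, hl.fl) is the server-side way to
get similar fragments.

```python
def first_hit_snippet(text, term, width=60):
    """Return up to `width` characters of `text`, starting at the first hit of `term`."""
    flat = " ".join(text.split())          # collapse newlines and runs of spaces
    pos = flat.lower().find(term.lower())  # case-insensitive first occurrence
    if pos == -1:
        return ""                          # the term does not occur at all
    end = pos + width
    snippet = flat[pos:end]
    # Mark the snippet as truncated when the text continues past it.
    return snippet + "..." if end < len(flat) else snippet


content = """Table of Contents
1. Tutorial tips 2
Java is one of the best language, java is easy to learn and widely used."""
print(first_hit_snippet(content, "java", width=40))
```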


Regards,
Satya


Re: Google like search

2010-12-14 Thread satya swaroop
Hi Tanguy,
 Thanks for your reply, and sorry to ask this type of question.
How can we index each chapter of a file as a separate document? As far as I
know, we just give the path of the file to Solr to index it. Can you point me
to any sources for this, I mean any blogs or wikis?

Regards,
satya


Re: Google like search

2010-12-16 Thread satya swaroop
Hi All,

 Thanks for your suggestions. I got the result I expected.

Cheers,
Satya


Testing Solr

2010-12-16 Thread satya swaroop
Hi All,

 I built Solr successfully and I am thinking of testing it with nearly
300 PDF files, 300 docs, 300 Excel files, and so on, with nearly 300 files
of each type.
 Is there any dummy data available for testing Solr? Otherwise I need to
download each and every file individually.
Another question: are there any benchmarks of Solr?

Regards,
satya


Different Results..

2010-12-22 Thread satya swaroop
Hi All,
 I am getting different results when I use certain special characters.
For example:
1) When I use this request
http://localhost:8080/solr/select?q=erlang!ericson
   the result obtained is
   

2) when the request is
 http://localhost:8080/solr/select?q=erlang/ericson
the result is
  


My question is: does Solr treat the two queries differently, and how does
it handle !, / and the other special characters?
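
As far as I know, in the Lucene query syntax of this period "!" is an
operator (a shorthand for NOT), while "/" has no special meaning to the
query parser and is left to the field's analyzer, which is one plausible
reason the two requests behave differently. Below is a minimal Python sketch
of escaping the documented special characters with a backslash before
sending the query. The helper name escape_query is an assumption for the
example, and the character set is the one I remember from the query parser
documentation, so it is worth checking against the Solr version in use.

```python
from urllib.parse import urlencode

# Characters the classic Lucene query parser treats as syntax.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\')


def escape_query(text):
    """Prefix every query-syntax character with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


for raw in ("erlang!ericson", "erlang/ericson"):
    escaped = escape_query(raw)
    # urlencode takes care of the URL-level escaping of the backslash.
    print("http://localhost:8080/solr/select?" + urlencode({"q": escaped}))
```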


Regards,
satya


error in html???

2010-12-23 Thread satya swaroop
Hi All,

 I am able to get the response in JSON format in the success case by
specifying wt=json in the query. But in case of any errors I get the
response in HTML format.
 1) Is there any specific reason it comes in HTML format?
 2) Can't we get the error result in JSON format?

Regards,
satya


Re: error in html???

2010-12-23 Thread satya swaroop
Hi Erick,
   Every result comes in XML format. But when we get an error
like HTTP 500 or HTTP 400, it comes in HTML format. My question is:
can't we get that HTML converted into JSON, or vice versa?

Regards,
satya


spell suggest response

2011-01-11 Thread satya swaroop
Hi All,
 Can we get just the suggestions, without the file results?
Here is an example.
When I query
http://localhost:8080/solr/spellcheckCompRH?q=java daka usar&spellcheck=true&spellcheck.count=5&spellcheck.collate=true

I get some results of java files and then the suggestions for the words
daka -> data and usar -> user. But actually I need only the spell suggestions.
Here time is being spent on returning the files before the spell
suggestions. Can't we post a query to Solr where the response contains
only the spell suggestions?

Regards,
satya


Re: spell suggest response

2011-01-11 Thread satya swaroop
Hi Gora,
   I am using Solr for file indexing and searching, but I have a
module where I don't need any file results, only the spell suggestions. So
I asked whether there is any way in Solr to get only the spell suggestion
responses. I think it is clear to you now. If not, tell me and I will try
to explain further.

Regards,
satya


Re: spell suggest response

2011-01-11 Thread satya swaroop
Hi Stefan,
  Yes, it works :). Thanks.
  But I have a question: can we get spell suggestions even if the
word is spelled correctly? I mean words near to it.
   ex:-

http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck.count=10
   In the output no suggestions come back, as
java is a correctly spelled word.
  But can't we get near suggestions such as javax, javac, etc.?
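
For reference, a small Python sketch of building the suggestions-only
request. It uses only the parameters that appear in this thread (rows=0 to
suppress the document results, plus the spellcheck parameters) and lets
urlencode take care of the spaces in a multi-word query. The helper name
spell_only_url is made up for the example; the handler name
spellcheckCompRH is the one from my own setup.

```python
from urllib.parse import urlencode


def spell_only_url(base, words, count=10):
    """Build a spellcheck request that returns no documents (rows=0)."""
    params = {
        "q": words,
        "rows": 0,                     # no document results, suggestions only
        "spellcheck": "true",
        "spellcheck.count": count,
        "spellcheck.collate": "true",
    }
    return f"{base}/spellcheckCompRH?{urlencode(params)}"


request_url = spell_only_url("http://localhost:8080/solr", "java daka usar", count=5)
print(request_url)
```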

Regards,
satya


Re: spell suggest response

2011-01-12 Thread satya swaroop
Hi Stefan,
I need the words from the index itself. If java is given,
then the relevant, similar or near words in the index should be shown,
even if the given keyword is correct. Is that possible?


ex:-

http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck.count=10
   In the output no suggestions come back, as
java is a correctly spelled word.
  But can't we get near suggestions such as javax, javac, etc. (the
terms in the index)?

I read about the Suggester in the Solr wiki at
http://wiki.apache.org/solr/Suggester . I tried to implement it but got
this error:

*error loading class org.apache.solr.spelling.suggest.suggester*

Regards,
satya


Re: spell suggest response

2011-01-12 Thread satya swaroop
Hi Juan,
 Yes, I tried onlyMorePopular and got some results, but they are
not similar or near words to the word I gave in the query.
Here is the output.

http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck.collate=true&spellcheck.onlyMorePopular=true&spellcheck.count=20

The output I get is:

data
have
can
any
all
has
each
part
make
than
also

But these words are not similar to the given word 'java'. The near words
would be javac, javax, data, java.io, etc., and those words are present in
the index.
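
To show what I mean by "near words", here is a rough Python sketch that
ranks a list of index terms by string similarity to the query word, even
when the query word itself is correct. It uses Python's difflib, which is
not the distance measure Solr's spellchecker uses, and the term list is
made up for the example, so it only illustrates the kind of output I am
after.

```python
import difflib


def near_terms(word, index_terms, limit=5, cutoff=0.6):
    """Return index terms similar to `word`, excluding the word itself."""
    candidates = [t for t in index_terms if t != word]
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    return difflib.get_close_matches(word, candidates, n=limit, cutoff=cutoff)


# Hypothetical sample of terms taken from an index.
terms = ["java", "javac", "javax", "java.io", "data", "have", "can", "any", "all"]
print(near_terms("java", terms))
```

With this similarity measure the frequent but unrelated words (can, any,
all) fall below the cutoff, while the java-prefixed terms stay.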


Regards,
satya


Re: spell suggest response

2011-01-16 Thread satya swaroop
Hi Grijesh,
As you said, you are implementing this kind of thing. Can you tell me
briefly how you did it?

Regards,
satya


Re: spell suggest response

2011-01-17 Thread satya swaroop
Hi Grijesh,
   Even if I use autosuggest I may not get the exact results; the
order is not accurate. For example, if I request
http://localhost:8080/solr/terms/?terms.fl=spell&terms.prefix=solr&terms.sort=index&terms.lower=solr&terms.upper.incl=true
 I get results such as:
solr
solr.amp
solr.datefield
solr.p
solr.pdf
   and so on. But this does not give results as accurate as those we get
from spellchecking.

I require suggestions for any word irrespective of whether it is correct or
not. Is there anything to be changed in Solr to get suggestions like the ones
we get when we type a wrong word in spellchecking? If so, please let me know.

Regards,
satya


Re: spell suggest response

2011-01-17 Thread satya swaroop
Hi Grijesh,
I added both the TermsComponent and the SpellCheckComponent to the
terms request handler. When I send a query as
http://localhost:8080/solr/terms?terms.fl=text&terms.prefix=java&&rows=7&omitHeader=true&spellcheck=true&spellcheck.q=java&spellcheck.count=20

the result i get is

-

-

6
6
6
6
6
6


-







when i send this
http://localhost:8080/solr/terms?terms.fl=text&terms.prefix=jawa&&rows=5&omitHeader=true&spellcheck=true&spellcheck.q=jawa&spellcheck.count=20
i get the result as


-



-

-

-

<int name="numFound">20</int>
0
4
-

java
away
jav
jar
ara
apa
ana
ajax


Now I need to know how to order the terms. In the 1st query the
result obtained is in order, and I want only javax, javac, javascript but not
javas, javabas. How can that be done?

Regards,
satya


spellchecking even the key is true....

2011-01-17 Thread satya swaroop
Hi All,
Can we get spellchecking results even when the keyword is correct?
Spellchecking gives suggestions only for wrong keywords. Can't we get
similar and near words for the keyword even though spellcheck.q is correct?
As an example:
http://localhost:8080/solr/spellcheck?q=java&spellcheck=true&spellcheck.count=5
the result will be

1)-

-






can we get the result as
2)

-


javax
javac
javabean
javascript



NOTE: all the keywords in the 2nd result are in the index.

Regards,
satya


is solr dynamic calculation??

2011-02-17 Thread satya swaroop
Hi All,
 I have a question: does Solr compute the scores of the result documents
dynamically, or are they precalculated and simply served?

For example:
if a query is made on q=solr in my index, I get a result of 25
documents. What is it calculating? I am very keen to know how it
calculates the score and orders the results.


Regards,
satya


Re: is solr dynamic calculation??

2011-02-17 Thread satya swaroop
Hi Markus,
As far as I have gone through Solr's scoring, the scoring is
done at search time, using the boost values that were given during
indexing.
I have a question now. If I search for the keyword java:
1) if the term "java" in the index matches 50,000 documents, does Solr
calculate the score for each and every document, filter them,
sort them and then serve the results? If it does the calculation dynamically
for each and every document it should take a long time, so how does Solr
reduce it?
 Am I right? If anything is wrong, please tell me.

Regards,
satya


solr indexing

2011-02-22 Thread satya swaroop
Hi all,
   Out of keen interest in Solr's indexing mechanism I started digging into
the Solr indexing code (/update/extract). I read about the index file formats
and the scoring procedure, and I have a question about this.
1) The scoring is performed on dynamic and precalculated values (doc
boost, field boost, lengthNorm). When calculating the score, suppose a term
in the index matches nearly one million docs. Does Solr calculate the
score for each and every doc containing the term and then pick the top docs
from the index? Or is there some mechanism that limits the score
calculation to only particular docs?
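
My current understanding, which may well be incomplete, is that Lucene
scores each matching document at query time but does not sort all of them;
it keeps only the best N in a bounded priority queue while it walks through
the matches. A toy Python sketch of that idea, with made-up document ids and
scores, and not Lucene's actual code:

```python
import heapq


def top_docs(scored_docs, n):
    """Keep the n highest-scoring (doc_id, score) pairs using a size-n min-heap."""
    heap = []  # min-heap of (score, doc_id); heap[0] is the weakest kept hit
    for doc_id, score in scored_docs:
        if len(heap) < n:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # The new hit beats the weakest kept hit, so swap them.
            heapq.heapreplace(heap, (score, doc_id))
    # Best hit first.
    return [(doc_id, score) for score, doc_id in sorted(heap, reverse=True)]


hits = [(1, 0.2), (2, 0.9), (3, 0.5), (4, 0.7)]  # hypothetical (doc_id, score)
print(top_docs(hits, 2))
```

The cost per matching document is then one score computation plus at most
one heap update, so a million matches do not require sorting a million
entries.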

If anybody knows about it, or of any documentation regarding this, please
let me know.


Regards,
satya


Solr coding

2011-03-23 Thread satya swaroop
Hi All,
  As per my project requirements I need to keep file searches private,
so I need to modify the Solr code.

For example, if there are 5 users and each user indexes some files as
  user1 -> java1, c1, sap1
  user2 -> java2, c2, sap2
  user3 -> java3, c3, sap3
  user4 -> java4, c4, sap4
  user5 -> java5, c5, sap5

   and user2 searches for the keyword "java", then only the file java2
should be displayed and not the other files.

So in order to do this filtering inside Solr itself, may I know where to
modify the code? I will access a database to check the user's indexed files
and then filter the result. I don't have any cores; I indexed all files
in a single index.

Regards,
satya


Re: Solr coding

2011-03-23 Thread satya swaroop
Hi Jayendra,
I forgot to mention that the result also depends on the user's
group. It is somewhat complex, so I didn't mention it. Now I will explain
it exactly.

  user1, group1 -> java1, c1, sap1
  user2, group2 -> java2, c2, sap2
  user3, group1, group3 -> java3, c3, sap3
  user4, group3 -> java4, c4, sap4
  user5, group3 -> java5, c5, sap5

 user1, group1 means user1 belongs to group1


Here the filter includes the group too. For example, if user1 searches for
"java" then the results should show java1 and java3, since the java3 file is
accessible to all users who belong to group1. So I thought of
editing the code.
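
For comparison, a sketch of how the same restriction might be expressed
without changing Solr's code, using a filter query (fq) built by the
application. It assumes each document is indexed with two extra fields,
here called owner and group; those field names and the helper
access_filter are hypothetical and not part of my current schema.

```python
from urllib.parse import urlencode


def access_filter(user, groups, user_field="owner", group_field="group"):
    """Build an fq value matching docs owned by the user or shared with a group."""
    clauses = [f'{user_field}:"{user}"']
    clauses += [f'{group_field}:"{g}"' for g in groups]
    return " OR ".join(clauses)


# user1 belongs to group1, so java1 (own file) and java3 (group1) should match.
fq = access_filter("user1", ["group1"])
print("http://localhost:8080/solr/select?" + urlencode({"q": "java", "fq": fq}))
```

The group list would come from the application's database at query time, so
the user-to-group mapping itself does not have to live in the index.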

Thanks,
satya


Re: Solr coding

2011-03-23 Thread satya swaroop
Hi Jayendra,
  The group field works if the number of groups is
small. If a user belongs to 1000 groups, wouldn't it be
difficult to build the query? And if a user changes groups, we have to
reindex the data again.

OK, I will try your suggestion. If it fulfills the needs then the task will be
very easy.

Regards,
satya


how to set cookie for url requesting in stream_url

2011-03-31 Thread satya swaroop
Hi All,
To index documents on another server I need to include a
cookie value in the request made through stream_url.
Can anybody tell me how to include the cookie in that request?
Has anybody done this, or are there any suggestions?

ex:
http://localhost:8456/solr/update/extract?stream_url=remote_server_url&literal.id=13748

Here I need to include a cookie value when requesting
remote_server_url.


Regards,
satya


Fwd: how to set cookie for url requesting in stream_url

2011-04-01 Thread satya swaroop
Hi Markus,
   I am using Solr branch_3x, in the Tomcat web server.
Regards,
satya


Re: how to set cookie for url requesting in stream_url

2011-04-07 Thread satya swaroop
Hi All,
 I was able to set the cookie value on the stream_url connection. I
passed the cookie value through to the ContentStreamBase.URLStream class and
added
conn.setRequestProperty("Cookie", cookie[0].getName() + "=" + cookie[0].getValue())
in the connection setup, and it is working fine now.

Regards,
satya


Search and index Result

2011-04-14 Thread satya swaroop
Hi all,
   I made two copies of SolrDispatchFilter,
solrdispatchfilter1 and solrdispatchfilter2, such that all /update and
/update/extract requests pass through solrdispatchfilter1
and all search (/select) requests pass through
solrdispatchfilter2. I did this because I need to enforce privacy for
the search results.
I need to check whether the requesting user has access to the particular files
or not. Implementing the privacy of results was a success.
One major problem is that after indexing some documents and
committing, I do not get the committed data in the search results; I
get the old data from before the commit.
I get the new results only after restarting the server. Can anyone tell me
what to modify so that search returns the results from the most recent
commit?


Thanks and Regards,
satya