indexing with pdf files problem

2010-07-12 Thread satya swaroop
Hi all, I am working with Solr on Tomcat. Indexing works for XML files, but when I send .doc, HTML, or PDF files through curl I get a "lazy error". Can you tell me how to fix it? The output when I send a PDF file is as follows. I am working on Ubuntu. Solr home is /opt/e

Re: indexing with pdf files problem

2010-07-13 Thread satya swaroop
Hi, I installed Tika, copied its jar files into the Solr home library, and also gave the path to the Tika configuration file, but the error is the same. The Tika config file is as follows: http://purl.org/dc/elements/1.1/ application/xml

indexing rich documents

2010-07-13 Thread satya swaroop
Hi all, I am new to Solr. I followed the wiki and got the Solr admin running successfully. It works well for XML files, but I am unable to index rich documents. I followed the wiki for rich documents as well, but could not get it working. The error comes when I send a pdf/html

Re: indexing rich documents

2010-07-13 Thread satya swaroop
Hi, yes, I followed the wiki. Can you now tell me the procedure for it? Regards, swaroop

Re: indexing rich documents

2010-07-13 Thread satya swaroop
Yes, I checked the extraction request handler but couldn't find the information. I installed tika-0.7 and copied the jar files into the Solr home library. When I started sending pdf/html files I got a lazy error. I am using Tomcat and Solr 1.4.

problem with storing??

2010-07-15 Thread satya swaroop
Hi all, I am new to Solr; I followed the wiki and got everything going right. But when I send any html/txt/pdf documents the response is as follows: 0576 but when I search in Solr I don't find the result. Can anyone tell me what should be done? The curl I used for the above output

Re: problem with storing??

2010-07-15 Thread satya swaroop
Hi, I sent the commit after adding the documents, but the problem is the same. Regards, satya

no response

2010-07-15 Thread satya swaroop
Hi all, I have a problem with Solr. When I send documents (.doc) I am not getting a response. Example: sa...@geodesic-desktop:~/Desktop$ curl "http://localhost:8080/solr/update/extract?stream.file=/home/satya/Desktop/InvestmentDecleration.doc&stream.contentType=application

Re: no response

2010-07-16 Thread satya swaroop
Hi, I am sorry, the mail you sent was in my sent mail and I didn't look at it. I am going to check now. I will definitely tell you the entire thing. Regards, satya

Re: problem with storing??

2010-07-16 Thread satya swaroop
Hi, I checked the admin page and it is indexing for others. I don't see anything in the log files when I send the documents; I checked the Catalina (Tomcat) log. I changed the dismax handler from q=*:* to q= . I at least get a response when I send pdf/html files, but don't even get one for

Re: problem with storing??

2010-07-18 Thread satya swaroop
Hi all, Solr is now working well. I am working on Ubuntu and was indexing documents that did not have permissions, so that was the problem. Thank you all for your replies to my queries. Thanking you, satya
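
A permission problem like the one described above can be checked before sending files to Solr. The following is a minimal sketch, not part of Solr: the helper name `unreadable_files` and the sample paths are hypothetical, and the check only reflects the user running the script, so it would need to run as the same user as the servlet container to be meaningful.

```python
import os
import tempfile

def unreadable_files(paths):
    """Return the paths that the current process cannot open for reading."""
    # os.access also returns False for paths that do not exist.
    return [p for p in paths if not os.access(p, os.R_OK)]

# Usage: one readable temporary file and one path that does not exist.
with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
    candidates = [tmp.name, "/no/such/file.pdf"]  # hypothetical paths
    print(unreadable_files(candidates))
```

Any path reported here would need its permissions fixed (or the file copied somewhere readable) before it is passed to stream.file.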

spell checking....

2010-07-26 Thread satya swaroop
Hi all, I am new to Solr and was able to implement document indexing by following the Solr wiki. Now I am trying to add spellchecking. I followed the spellcheck component page in the wiki but am not getting suggested spellings. I first built it with spellcheck.build=true,... here I give you

Re: spell checking....

2010-07-26 Thread satya swaroop
This is in solrconfig.xml::: default solr.IndexBasedSpellChecker spell ./spellchecker 0.7 true true jarowinkler lowerfilt org.apache.lucene.search.spell.JaroWinklerDistance ./spellchecker

spell checking problem

2010-07-29 Thread satya swaroop
Hi all, I need some help with spellchecking. I configured my solrconfig and schema by looking through the user mailing list, and here is the configuration I made. My schema.xml: My solrconfig.xml:

indexing???

2010-08-12 Thread satya swaroop
Hi all, the indexing part of Solr is going well, but I got an error when indexing a single PDF file. When I searched for the error in the mailing list I found that it was due to the copyright of that file. Can't we index a file that has copyright or any digital rights? Regards, satya

Re: indexing???

2010-08-16 Thread satya swaroop
Hi all, the error I got is "Unexpected RuntimeException from org.apache.tika.parser.pdf.pdfpar...@8210fc" when I indexed a file similar to the one at https://issues.apache.org/jira/browse/PDFBOX-709/samplerequestform.pdf. Can't we index files of that type in Solr? Regards, satya

stream.url problem

2010-08-17 Thread satya swaroop
Hi all, I am indexing documents that are on my own system into Solr. Now I need to index files that are on a remote system. I enabled remote streaming in solrconfig.xml, and when I use stream.url it shows the error "connection refused". The detail of the error is: w

Re: indexing???

2010-08-17 Thread satya swaroop
Hi, 1) I use Tika 0.8. 2) The URL is https://issues.apache.org/jira/browse/PDFBOX-709 and the file is samplerequestform.pdf. 3) The entire error is: curl "http://localhost:8080/solr/update/extract?stream.file=/home/satya/my_workings/satya_ebooks/8-Linux/sample

solr working...

2010-08-18 Thread satya swaroop
Hi all, I am very interested in how Solr works. Can anyone tell me which modules or classes get invoked when we start a servlet container like Tomcat, or when we send requests to Solr such as PDF files, or which files get invoked at the start of Solr? Regards, saty

/update/extract

2010-08-19 Thread satya swaroop
Hi all, when the extract request handler handles a request, which class gets invoked? I need to know the flow through the classes when we send files to Solr. Can anybody tell me the classes, or any sources where I can find the answer? Or can anyone tell me which classes get invoked when we start Solr.

Re: stream.url problem

2010-08-24 Thread satya swaroop
Hi all, I got the solution for my problem. I had changed my port number but kept the old one in the stream.url, so that was the problem. Thanks all. Now I have another problem: when I send requests to the remote system for files that have names with escape

reduce the content???

2010-08-25 Thread satya swaroop
Hi all, I indexed nearly 100 Java PDF files, which are large (min 1 MB). Solr shows the results with the entire content that it indexed, which takes time to display. Can't we reduce the content it shows, or can I just have the file names and ids instead of the entire
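
One common way to keep responses small is Solr's `fl` (field list) request parameter, which limits the fields returned for each hit. The sketch below only builds such a request URL; the helper name `select_url` and the field names `id` and `file_name` are assumptions, so substitute whatever the schema.xml in use actually defines.

```python
from urllib.parse import urlencode

def select_url(solr_base, query, fields, rows=10):
    """Build a /select URL that asks Solr to return only the given fields."""
    params = {
        "q": query,
        "fl": ",".join(fields),  # field list: only these fields come back
        "rows": rows,
    }
    return f"{solr_base}/select?{urlencode(params)}"

# Hypothetical field names; the large extracted content field is left out.
url = select_url("http://localhost:8080/solr", "java", ["id", "file_name"])
print(url)
```

Because the large content field is not listed in `fl`, it is not sent back with each result, even though it can still be searched.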

solr working...

2010-08-26 Thread satya swaroop
Hi all, I am interested in seeing how Solr works. 1) Can anyone tell me where to start in order to learn its workings? Regards, satya

Re: solr working...

2010-08-26 Thread satya swaroop
Hi Peter, I am already working with Solr and it is working well. But I want to understand the code: where the actual work happens, how indexing is done, how requests are parsed, how it responds, and so on. I asked how to start in order to understand the code.

Re: solr working...

2010-08-26 Thread satya swaroop
Hi all, thanks for your responses and information. I used the slf4j logger and put a log.info call in every class of the solr module to find out which classes get invoked for a particular request handler or at Solr startup. I was able to add it only in the solr module but not in the lucene module... I get error wh

stream.url

2010-09-02 Thread satya swaroop
Hi all, I am using stream.url to index files on a remote system. When I use the URL as 1) curl "http://localhost:8080/solr/update/extract?stream.url=http://remotehost:port/file_download.yaws?file=yaws_presentation.pdf&literal.id=schb4" it works and I get the response as the file got

Re: stream.url

2010-09-02 Thread satya swaroop
Hi Stefan, I used escape characters and it worked. It is not a problem for the single file 'solr &apache', but the same problem appears for files like Wireless lan.ppt and Tom info.pdf. The curl I sent is: curl "http://localhost:8080/solr/update/extract?stream.url=http://remot

Re: stream.url

2010-09-02 Thread satya swaroop
Hi, I ran the curl from the shell (command prompt or terminal) with the escape characters, but the error is the same. When I looked at the remote system, the request was not getting there. Is there anything to be changed in the config file in order to enable escape characters for stream.url

Re: stream.url

2010-09-03 Thread satya swaroop
Hi all, I am unable to index files on a remote system that contain escaped characters in their file names. I think there is a problem in Solr with indexing files with escaped characters on a remote system... Has anybody tried to index files on a remote system that contain the escaped

Re: stream.url

2010-09-08 Thread satya swaroop
Hi Hoss, thanks for the reply; it is working now. The reason was, as you said, that I was not double escaping. I used %2520 for whitespace and it works now. Thanks, satya
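
The double escaping mentioned above can be illustrated with a short Python sketch. It is not Solr code: the helper name `build_extract_url`, the port, and the document id are assumptions, and the remote path is modeled on the URLs quoted earlier in this thread. The file name is escaped once for the remote server, and the whole stream.url value is escaped again because it travels as a parameter of the Solr request, so a space becomes %20 and then %2520.

```python
from urllib.parse import quote

def build_extract_url(solr_base, remote_base, file_name, doc_id):
    """Build an /update/extract URL whose stream.url is double-escaped."""
    # First pass: escape the file name for the remote server (space -> %20).
    inner = remote_base + "?file=" + quote(file_name, safe="")
    # Second pass: escape the whole remote URL, because it is itself a
    # parameter value of the Solr request (%20 -> %2520).
    outer = quote(inner, safe="")
    doc = quote(doc_id, safe="")
    return f"{solr_base}/update/extract?stream.url={outer}&literal.id={doc}"

url = build_extract_url(
    "http://localhost:8080/solr",
    "http://remotehost:8080/file_download.yaws",  # hypothetical remote endpoint
    "Wireless lan.ppt",
    "doc1",  # hypothetical document id
)
print(url)
```

Solr decodes the parameter once and is left with a remote URL that still contains %20, which the remote server then decodes back to a space.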

cloud or zookeeper

2010-09-14 Thread satya swaroop
Hi all, what is the difference between using shards, SolrCloud, and ZooKeeper? Which is the best way to scale Solr? I need to reduce the index size on every system and reduce the search time for a query. Regards, satya

SolrCloud new....

2010-09-20 Thread satya swaroop
Hi all, I have 4 instances of Solr on 4 systems; each system has a single instance of Solr. I want results from all these servers. I came to know about SolrCloud, read about it, and worked through the example, and it worked as described in the wiki. I am using Solr 1.4 and Apache Tomcat

ant package

2010-09-21 Thread satya swaroop
Hi all, I want to build the package of my Solr, and I found it can be done using ant. When I type ant package in the solr module I get this error: sa...@swaroop:~/temporary/trunk/solr$ ant package Buildfile: build.xml maven.ant.tasks-check: BUILD FAILED /home/satya/temporary/trunk/solr/c

Re: ant package

2010-09-21 Thread satya swaroop
Hi, yes, I don't have the jar file in ant/lib. Where can I get the jar file, or what is the procedure to make maven-artifact-ant-2.0.4-dep.jar? Regards, satya

Re: ant package

2010-09-21 Thread satya swaroop
Hi Erick, thanks for the reply. I downloaded the jar file and put it in the ant library. Now when I run the ant package command, it fails in the middle of the build in generate-maven-artifacts. The error is: sa...@geodesic-desktop:~/temporary/trunk/solr$ sudo ant package

ant build problem

2010-10-04 Thread satya swaroop
Hi all, I updated my Solr trunk to revision 1004527. When I compile the trunk with ant I get many warnings, but the build is successful. The warnings are: common.compile-core: [mkdir] Created dir: /home/satya/temporary/trunk/lucene/build/classes/java [javac] Compi

solr requirements

2010-10-18 Thread satya swaroop
Hi all, I am planning to have a separate server for Solr and I have a question about what hardware configuration is needed. I know it is hard to say, but I just need a minimum requirement for the following situation: 1) There are 1000 regular users

Re: solr requirements

2010-10-18 Thread satya swaroop
Hi, here is some more info about it. I use Solr to output only the file names (file ids). Here I enclose the fields in my schema.xml; presently I have only about 40 MB of indexed data.

RAM increase

2010-10-20 Thread satya swaroop
Hi all, I increased my RAM to 8 GB and I want 4 GB of it to be used for Solr itself. Can anyone tell me how to allocate the RAM for Solr? Regards, satya

solr result....

2010-10-27 Thread satya swaroop
Hi, can the Solr result show only a part of the content of a document that is in the result? For example, if I send a query to search for tika, then the result should be as follows: - 0 79 - - text/html 1html - - Apache Tomcat/6.0.26 - Error reportHTT

Re: solr result....

2010-10-28 Thread satya swaroop
Hi Lance, I actually copied Tika exceptions into an HTML file and indexed it. It is just the content of a file. Here is what I mean: if I post a query like *java*, then the response from Solr should contain only a part of the content, as follows: http://localhost:

Re: RAM increase

2010-10-29 Thread satya swaroop
Hi all, thanks for your replies. I am unsure whether to increase the RAM or heap size for Java or for Tomcat, where Solr is running. Regards, satya

Google like search

2010-12-14 Thread satya swaroop
Hi all, can we get results like Google's, with some data about the search? I was able to get the first 300 characters of a file, but that is not helpful for me. Can I get the text that contains the first occurrence of the keyword in that file? Regards, Satya

Re: Google like search

2010-12-14 Thread satya swaroop
Hi Tanguy, I am not asking for highlighting. I think it can be explained with an example, which I illustrate here: when I post a query like this: http://localhost:8080/solr/select?q=Java&version=2.2&start=0&rows=10&indent=on I would get the result as follows: - - 0 1

Re: Google like search

2010-12-14 Thread satya swaroop
Hi Tanguy, thanks for your reply. Sorry to ask this type of question: how can we index each chapter of a file as a separate document? As far as I know, we just give the path of the file to Solr to index it. Can you provide me any sources for this, I mean any blogs or wikis? Regards,

Re: Google like search

2010-12-16 Thread satya swaroop
Hi all, thanks for your suggestions. I got the result I expected. Cheers, Satya

Testing Solr

2010-12-16 Thread satya swaroop
Hi all, I built Solr successfully and I am thinking of testing it with nearly 300 PDF files, 300 docs, 300 Excel files, and so on, with nearly 300 files of each type. Is there any dummy data available for testing Solr? Otherwise I need to download each and every file individually. An

Different Results..

2010-12-22 Thread satya swaroop
Hi all, I am getting different results when I use some escape characters. For example: 1) when I use this request http://localhost:8080/solr/select?q=erlang!ericson the result obtained is 2) when the request is http://localhost:80
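
One likely reason for differing results is that characters such as `!` have a special meaning in the Lucene query syntax (`!` is a NOT operator), so they need a backslash in front of them to be searched literally. The sketch below is an illustration under that assumption, not the resolution reached in this thread; the helper name `escape_query_term` is hypothetical, and the character list follows the special characters documented for the Lucene query parser.

```python
import re
from urllib.parse import quote

# Characters with special meaning in the Lucene query syntax.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\])')

def escape_query_term(term):
    """Prefix every query-syntax character with a backslash."""
    return _SPECIAL.sub(r"\\\1", term)

escaped = escape_query_term("erlang!ericson")
# The escaped term still has to be URL-encoded before it goes into q=...
url = "http://localhost:8080/solr/select?q=" + quote(escaped, safe="")
print(escaped)
print(url)
```

Whether the escaped and unescaped forms match the same documents also depends on how the field's analyzer tokenizes punctuation.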

error in html???

2010-12-23 Thread satya swaroop
Hi all, I am able to get the response in JSON format in the success case by setting wt=json in the query. But in case of errors I am getting HTML. 1) Is there any specific reason for the HTML format? 2) Can't we get the error result in JSON format? Regards, satya

Re: error in html???

2010-12-23 Thread satya swaroop
Hi Erick, every result comes in XML format. But when we get errors like HTTP 500 or HTTP 400, we get HTML. My question is: can't we turn that HTML into JSON or vice versa? Regards, satya

spell suggest response

2011-01-11 Thread satya swaroop
Hi all, can we get just the suggestions without the files response? Here is an example: when I query http://localhost:8080/solr/spellcheckCompRH?q=java daka usar&spellcheck=true&spellcheck.count=5&spellcheck.collate=true I get some results of java files and then the suggestions f

Re: spell suggest response

2011-01-11 Thread satya swaroop
Hi Gora, I am using Solr for file indexing and searching, but I have a module where I don't need any file results, only the spell suggestions. So I asked whether there is any way in Solr to get only the spell suggestion responses. I think it is clear to you now. If not, tell me, I will

Re: spell suggest response

2011-01-11 Thread satya swaroop
Hi Stefan, yes, it works :). Thanks. But I have a question: can I get spell suggestions even if the word is spelled correctly, I mean words near to it? Ex: http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck

Re: spell suggest response

2011-01-12 Thread satya swaroop
Hi Stefan, I need the words from the index itself. If java is given, then the relevant, similar, or near words in the index should be shown, even if the given keyword is correct. Is that possible? Ex: http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&s

Re: spell suggest response

2011-01-12 Thread satya swaroop
Hi Juan, yes, I tried onlyMorePopular and got some results, but they are not similar or near words to the word I gave in the query. Here is the output: http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck.collate=true&spellcheck.on

Re: spell suggest response

2011-01-16 Thread satya swaroop
Hi Grijesh, as you said, you are implementing this. Can you tell me briefly how you did it? Regards, satya

Re: spell suggest response

2011-01-17 Thread satya swaroop
Hi Grijesh, even if I use autosuggest I may not get the exact results; the order is not accurate. For example, if I type http://localhost:8080/solr/terms/?terms.fl=spell&terms.prefix=solr&terms.sort=index&terms.lower=solr&terms.upper.incl=true I get resu

Re: spell suggest response

2011-01-17 Thread satya swaroop
Hi Grijesh, I added both the terms component and the spellcheck component to the terms request handler. When I send a query as http://localhost:8080/solr/terms?terms.fl=text&terms.prefix=java&&rows=7&omitHeader=true&spellcheck=true&spellcheck.q=java&spellcheck.count=20 the result I get is -

spellchecking even the key is true....

2011-01-17 Thread satya swaroop
Hi all, can we get spellchecking results even when the keyword is correct? Spellchecking gives suggestions only for wrong keywords; can't we get similar and near words of the keyword even though the spellcheck.q is correct? As an example http://localhost:8080/solr/spellcheck?q=java&spellcheck=tr

is solr dynamic calculation??

2011-02-17 Thread satya swaroop
Hi all, I have a question: does Solr calculate the score of documents dynamically at query time, or is it precalculated and supplied? For example, if a query is made on q=solr in my index, I get results of 25 documents. What is it calculating? I am very keen to

Re: is solr dynamic calculation??

2011-02-17 Thread satya swaroop
Hi Markus, as far as I have gone through Solr's scoring, the scoring is done during searching using boost values that were given during indexing. I have a question now. If I search for the keyword java, then 1) if the term "java" in the index is contained in 50,000 documents then do s

solr indexing

2011-02-22 Thread satya swaroop
Hi all, out of keen interest in the Solr indexing mechanism I started digging into the code of Solr indexing (/update/extract). I read about the index file formats and the scoring procedure, and I have some questions regarding this. 1) The scoring is performed on the dynamic and precalculated value(doc boost, field bo

Solr coding

2011-03-23 Thread satya swaroop
Hi all, as a requirement of my project I need to keep file searches private, so I need to modify the Solr code. For example, if there are 5 users and each user indexes some files as user1 -> java1, c1,sap1 user2 -> java2, c2,sap2 user3 -> java3, c3,sap3 user4 -> java4,

Re: Solr coding

2011-03-23 Thread satya swaroop
Hi Jayendra, I forgot to mention that the result also depends on the user's groups. It is somewhat complex, so I didn't mention it. Now I explain the exact way: user1, group1 -> java1, c1,sap1 user2 ,group2-> java2, c2,sap2 user3 ,group1,group3-> java3, c3,sap3 user4 ,group3

Re: Solr coding

2011-03-23 Thread satya swaroop
Hi Jayendra, the group field can be kept if the number of groups is small. If a user may belong to 1000 groups, it would be difficult to build a query. Also, if a user changes groups then we have to reindex the data. OK, I will try your suggestion, if it can fu
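
The group-field idea discussed above can be sketched as a filter query that restricts hits to the groups of the searching user. This is only an illustration: the field name `group`, the helper names, and the example group values are assumptions, and the thread does not show the configuration that was finally used.

```python
from urllib.parse import urlencode

def group_filter(groups, field="group"):
    """Build a filter query that matches documents in any of the groups."""
    # Quote each value so that group names containing spaces stay intact.
    quoted = " OR ".join('"{}"'.format(g.replace('"', '\\"')) for g in groups)
    return f"{field}:({quoted})"

def search_url(solr_base, query, groups):
    """Build a /select URL whose fq limits results to the user's groups."""
    params = {"q": query, "fq": group_filter(groups)}
    return f"{solr_base}/select?{urlencode(params)}"

# Hypothetical user who belongs to group1 and group3.
print(group_filter(["group1", "group3"]))
print(search_url("http://localhost:8080/solr", "java", ["group1", "group3"]))
```

With very many groups per user the OR list grows long and can run into the boolean clause limit (maxBooleanClauses in solrconfig.xml), which is the scaling concern raised in the message above.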

how to set cookie for url requesting in stream_url

2011-03-31 Thread satya swaroop
Hi all, to index documents on another server I need to include a cookie value in the URL request made through stream_url. Can anybody tell me how to include the cookie in the URL? Has anybody done this? If there are any suggestions, please tell me. Ex: http://loca

Fwd: how to set cookie for url requesting in stream_url

2011-04-01 Thread satya swaroop
Hi Markus, I am using Solr branch_3x in the Tomcat web server. Regards, satya

Re: how to set cookie for url requesting in stream_url

2011-04-07 Thread satya swaroop
Hi all, I was able to set the cookie value on the stream_url connection. I was able to pass the cookie value up to the ContentStreamBase.URLStream class, and I added conn.setRequestProperty("Cookie", cookie[0].name + "=" + cookie[0].value) in the connection setup, and it is working fine now. Regards, s
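
The same idea, attaching a Cookie header to the outgoing request for the remote file, can be shown in a few lines of Python. This is a standalone sketch and not the Solr patch described above: the helper name `request_with_cookie`, the URL, and the cookie name and value are hypothetical.

```python
from urllib.request import Request

def request_with_cookie(url, name, value):
    """Create a request object that carries a single cookie."""
    req = Request(url)
    # A cookie is sent as an ordinary request header of the form name=value.
    req.add_header("Cookie", f"{name}={value}")
    return req

# Hypothetical remote download URL and session cookie.
req = request_with_cookie(
    "http://remotehost:8080/file_download.yaws?file=a.pdf", "session", "abc123"
)
print(req.get_header("Cookie"))
```

Passing `req` to `urllib.request.urlopen` would then send the cookie along with the download request.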

Search and index Result

2011-04-14 Thread satya swaroop
Hi all, I made two copies of solrdispatchfilter, solrdispatchfilter1 and solrdispatchfilter2, so that all /update and /update/extract requests pass through solrdispatchfilter1 and all search (/select) requests pass through solrdispatchfilter2. It is because I