Hi all,
I am working with Solr on Tomcat. Indexing works well for XML files,
but when I send DOC, HTML, or PDF files through curl I get an error
reported as a "lazy error". Can you tell me how to fix it? The output when I send
a PDF file is as follows. I am working on Ubuntu. Solr home is /opt/e
Hi,
I installed Tika, copied its jar files into the Solr home library, and also
set the path to the Tika configuration file, but the error is the same. The
Tika config file is as follows:
http://purl.org/dc/elements/1.1/
application/xml
Hi all,
I am new to Solr. I followed the wiki and got the Solr admin page
running successfully. Indexing works well for XML files, but I am unable
to index rich documents. I followed the wiki section on rich
documents as well, but could not get it working. The error comes when I send a PDF/HTML
Hi,
Yes, I followed the wiki. Can you now tell me the procedure for it?
regards,
swaroop
Yes, I checked the extraction request handler but couldn't find the
information. I installed tika-0.7 and copied the jar files into the Solr
home library. When I started sending PDF/HTML files I got a lazy
error. I am using Tomcat and Solr 1.4.
Hi all,
I am new to Solr. I followed the wiki and got everything working
correctly. But when I send any HTML/TXT/PDF documents the response is as
follows:
0576
But when I search in Solr I don't find the document. Can anyone tell me
what needs to be done?
The curl command I used for the above output:
Hi,
I sent the commit after adding the documents, but the problem is the same.
regards,
satya
Hi all,
I have a problem with Solr. When I send documents (.doc) I am
not getting a response.
Example:
sa...@geodesic-desktop:~/Desktop$ curl "
http://localhost:8080/solr/update/extract?stream.file=/home/satya/Desktop/InvestmentDecleration.doc&stream.contentType=application
Hi,
I am sorry, the mail you sent ended up in my sent mail and I did not look at it. I am
going to check now and will definitely tell you the entire thing.
regards,
satya
Hi,
I checked the admin page and it is indexing for others. In the log
files I don't get anything when I send the documents. I checked the log
in catalina (Tomcat). I changed the dismax handler from q=*:* to q= . I
at least get a response when I send PDF/HTML files but don't even get one for
Hi all,
Solr is working well now. I am working on Ubuntu and I was indexing
documents that did not have the required permissions, so that was the problem. Thank
you all for your replies to my queries.
thanking you,
satya
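A pre-flight check along these lines might catch the permission problem before anything is sent to Solr. This is only a sketch: the helper name `unreadable_files` is made up for illustration, and it assumes the script runs as the same OS user as Tomcat, since with stream.file it is the Solr server process that has to read the file.

```python
import os


def unreadable_files(paths):
    """Return the paths the current process cannot read.

    Hypothetical helper: a missing file and a file without read
    permission are both reported, because os.access() is False for both.
    """
    return [p for p in paths if not os.access(p, os.R_OK)]


if __name__ == "__main__":
    # The path is a placeholder; substitute the files you plan to index.
    for path in unreadable_files(["/path/to/document.doc"]):
        print("cannot read:", path)
```

If the list is non-empty, fixing ownership or mode on those files (or running the check as the Tomcat user) is probably the first thing to try.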
Hi all,
I am new to Solr and was able to set up document indexing
by following the Solr wiki. Now I am trying to add spellchecking. I
followed the SpellCheckComponent page in the wiki but am not getting the suggested
spellings. I first built the index with spellcheck.build=true,...
Here is my configuration.
This is in solrconfig.xml:
default
solr.IndexBasedSpellChecker
spell
./spellchecker
0.7
true
true
jarowinkler
lowerfilt
org.apache.lucene.search.spell.JaroWinklerDistance
./spellchecker
Hi all,
I need some help with spellchecking. I configured my solrconfig and
schema by looking through the user mailing list, and here is the configuration
I made.
My schema.xml:
My solrconfig.xml:
Hi all,
The indexing part of Solr is going well, but I got an error when indexing
a single PDF file. When I searched for the error in the mailing list I found
that it was due to the copyright protection on that file. Can't we index a file
that has copyright or other digital rights restrictions?
regards,
satya
Hi all,
The error I got is "Unexpected RuntimeException from
org.apache.tika.parser.pdf.pdfpar...@8210fc" when I indexed a file similar
to the one in
https://issues.apache.org/jira/browse/PDFBOX-709/samplerequestform.pdf
Can't we index files of that type in Solr?
regards,
satya
Hi all,
I am indexing documents that are on my own system into Solr. Now I need
to index files that are on a remote system. I set remote streaming
to true in solrconfig.xml, and when I use stream.url it shows the error
"connection refused". The details of the error are:
w
Hi,
1) I use Tika 0.8.
2) The URL is https://issues.apache.org/jira/browse/PDFBOX-709 and the
file is samplerequestform.pdf.
3) The entire error is:
curl "
http://localhost:8080/solr/update/extract?stream.file=/home/satya/my_workings/satya_ebooks/8-Linux/sample
Hi all,
I am very interested in knowing how Solr works. Can anyone tell me
which modules or classes get invoked when we start a servlet
container like Tomcat, or when we send requests to Solr such as PDF
files, or which files get invoked when Solr starts?
regards,
saty
Hi all,
Which class gets invoked when the extract request handler handles a request? I
need to know the flow of classes when we send files to Solr.
Can anybody tell me the classes, or any sources where I can find the answer?
Or can anyone tell me which classes get invoked when we start
Solr?
>
> Hi all,
> I found the solution to my problem. I had changed my port number but
> kept the old one in the stream.url, so that was the problem.
> Thanks, all.
>
> Now I have another problem: when I send requests to the remote
> system for files that have names with escape
Hi all,
I indexed nearly 100 Java PDF files, which are large (minimum 1 MB).
Solr returns the results with the entire content that it indexed,
which makes the results slow to display. Can't we reduce the content it
shows, or can I just get the file names and ids instead of the entire
Hi all,
I am interested in seeing how Solr works.
1) Can anyone tell me where to start in order to understand its workings?
Regards,
satya
Hi Peter,
I am already working with Solr and it is working well. But I want
to understand the code and know where the actual work happens:
how indexing is done, how the requests are parsed, how it
responds, and so on. To understand the code, I asked how to start.
Hi all,
Thanks for your response and information. I used the slf4j logger and put
a log.info call in every class of the Solr module to see which classes get
invoked for a particular request handler or when Solr starts. I was able to
do this only in the Solr module but not in the Lucene module... I get an error wh
Hi all,
I am using stream.url to index files on a remote system. When I
use the URL as
1) curl "
http://localhost:8080/solr/update/extract?stream.url=http://remotehost:port/file_download.yaws?file=yaws_presentation.pdf&literal.id=schb4
"
it works and I get the response that the file got
Hi Stefan,
I used escape characters and it worked. There is no problem for
the single file 'solr &apache', but the same problem shows up for files
like Wireless lan.ppt and Tom info.pdf.
The curl command I sent is:
curl "
http://localhost:8080/solr/update/extract?stream.url=http://remot
Hi,
I ran curl from the shell (command prompt or terminal) with the
escaped characters, but the error is the same. When I looked on the remote
system, the request was not getting there. Is there anything to be changed
in the config file in order to enable escaped characters for stream.url
Hi all,
I am unable to index files on a remote system that contain escaped
characters in their file names. I think there is a problem in Solr with
indexing files with escaped characters on a remote system...
Has anybody tried to index files on a remote system that contain
escaped
Hi Hoss,
Thanks for the reply; it is working now. The reason was, as you
said, that I was not double escaping. I used %2520 for whitespace and it is
working now.
Thanks,
satya
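For anyone hitting the same thing, the double escaping can be sketched in Python as below. The helper name `double_escape` is made up for illustration. The likely reason it is needed, stated here as an assumption rather than something taken from the Solr docs, is that the request parameters are decoded once on the way into Solr, so a singly escaped %20 is already a plain space again by the time Solr opens the remote URL.

```python
from urllib.parse import quote


def double_escape(file_name):
    """Percent-encode a file name twice for use inside stream.url.

    Hypothetical helper: the first pass turns a space into %20, the
    second pass turns the '%' itself into %25, which gives %2520.
    """
    once = quote(file_name, safe="")
    return quote(once, safe="")


if __name__ == "__main__":
    # File names taken from the earlier posts in this thread.
    for name in ("Tom info.pdf", "Wireless lan.ppt"):
        print(name, "->", double_escape(name))
```

The encoded name would then go in place of the file name in the `file=` part of the stream.url value.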
Hi All,
What is the difference between using shards, SolrCloud, and ZooKeeper?
Which is the best way to scale Solr?
I need to reduce the index size on every system and reduce the search time
for a query.
Regards,
satya
Hi all,
I have 4 instances of Solr on 4 systems. Each system has a
single instance of Solr. I want results from all of these servers. I came
to know about SolrCloud, read about it, and worked through the example, and it
was working as described in the wiki.
I am using Solr 1.4 and Apache Tomcat
Hi all,
I want to build a package of my Solr and found that it can be done
using ant. When I type ant package in the solr module I get this error:
sa...@swaroop:~/temporary/trunk/solr$ ant package
Buildfile: build.xml
maven.ant.tasks-check:
BUILD FAILED
/home/satya/temporary/trunk/solr/c
Hi,
Yes, I don't have the jar file in ant/lib. Where can I get the jar
file, or what is the procedure to build maven-artifact-ant-2.0.4-dep.jar?
regards,
satya
Hi Erick,
Thanks for the reply. I downloaded the jar file and put it
in the ant library.
Now when I run the ant package command it fails in the middle of the build,
in generate-maven-artifacts, and the error is
sa...@geodesic-desktop:~/temporary/trunk/solr$ sudo ant package
Hi all,
I updated my Solr trunk to revision 1004527. When I compile
the trunk with ant I get many warnings, but the build is successful. The
warnings are here:
common.compile-core:
[mkdir] Created dir:
/home/satya/temporary/trunk/lucene/build/classes/java
[javac] Compi
Hi All,
I am planning to have a separate server for Solr, and I have a question
about what hardware configuration is needed.
I know it is hard to say, but I just need the minimum requirements for the
following situation:
1) There are 1000 regular users
Hi,
Here is some more info about it. I use Solr to output only the file
names (file ids). Here I enclose the fields in my schema.xml. At present I
have only about 40 MB of indexed data.
Hi all,
I increased my RAM to 8 GB and I want 4 GB of it to be used
for Solr itself. Can anyone tell me how to allocate that RAM to
Solr?
Regards,
satya
Hi,
Can Solr show only a part of the content of a
document that appears in the results?
Example:
if I send a query to search for tika, then the result should be as follows:
-
0
79
-
-
text/html
1html
-
-
Apache Tomcat/6.0.26 - Error reportHTT
Hi Lance,
I actually copied Tika exceptions into one HTML file and indexed
it. It is just the content of a file, and here is what I mean:
if I post a query like *java*, then the response from Solr should
return only a part of the content, as follows:
http://localhost:
Hi All,
Thanks for your reply. I am not sure whether to increase the RAM or
the heap size for Java or for Tomcat, where Solr is running.
Regards,
satya
Hi All,
Can we get results like Google's, with some text shown for each
hit? I was able to get the first 300 characters of a
file, but that is not helpful for me. Can I get the text around the
first occurrence of the keyword in that file?
Regards,
Satya
Hi Tanguy,
I am not asking for highlighting. I think it can be
explained with an example. Here I illustrate it:
when I post a query like this:
http://localhost:8080/solr/select?q=Java&version=2.2&start=0&rows=10&indent=on
I would get the result as follows:
-
-
0
1
Hi Tanguy,
Thanks for your reply, and sorry for asking this type of question.
How can we index each chapter of a file as a separate document? As far as I know,
we just give Solr the path of the file to index it. Can you point me to any
sources on this, such as blogs or wikis?
Regards,
Hi All,
Thanks for your suggestions. I got the result I expected.
Cheers,
Satya
Hi All,
I built Solr successfully and I am planning to test it with nearly
300 PDF files, 300 DOC files, 300 Excel files, and so on, about 300
files of each type.
Is there any dummy data available for testing Solr? Otherwise I need to
download each and every file individually.
An
Hi All,
I am getting different results when I use certain escape characters.
For example:
1) when I use this request
http://localhost:8080/solr/select?q=erlang!ericson
the result obtained is
2) when the request is
http://localhost:80
Hi All,
I am able to get the response in JSON format in the success case by
specifying wt=json in the query. But in case of errors I am getting it in
HTML format.
1) Is there any specific reason the error comes back in HTML format?
2) Can't we get the error result in JSON format?
Regards,
satya
Hi Erick,
Every result comes in XML format. But when you get errors
like HTTP 500 or HTTP 400, the response comes in HTML format. My question is:
can't we turn that HTML response into JSON, or vice versa?
Regards,
satya
Hi All,
Can we get just the suggestions, without the file results?
Here is an example.
When I query
http://localhost:8080/solr/spellcheckCompRH?q=java daka
usar&spellcheck=true&spellcheck.count=5&spellcheck.collate=true
I get some results for Java files and then the suggestions f
Hi Gora,
I am using Solr for file indexing and searching, but I have a
module where I don't need any file results, only the spell suggestions. So
I asked whether there is any way in Solr to get only the spell suggestion
responses. I think it is clear to you now. If not, tell me and I will
Hi Stefan,
Yes, it works :). Thanks.
But I have a question: can I get spell
suggestions even if the word is spelled correctly? I mean words near
it.
Example:
http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck
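The rows=0 approach in the request above can be sketched in Python as follows. The host, handler path, and parameter names are the ones already used in this thread; the function name `spellcheck_only_url` and the default count are made up for illustration.

```python
from urllib.parse import urlencode


def spellcheck_only_url(handler_url, query, count=5):
    """Build a spellcheck request that returns no documents.

    Hypothetical helper: rows=0 suppresses the document list so that
    only the spellcheck section of the response remains.
    """
    params = {
        "q": query,
        "rows": 0,
        "spellcheck": "true",
        "spellcheck.count": count,
    }
    return handler_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(spellcheck_only_url(
        "http://localhost:8080/solr/spellcheckCompRH", "java"))
```

Using urlencode also takes care of queries that contain spaces, which would otherwise break the URL.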
Hi Stefan,
I need the words from the index itself. If java is given,
then the relevant, similar, or near words in the index should be shown,
even if the given keyword is correct. Is that possible?
Example:
http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&s
Hi Juan,
Yes, I tried onlyMorePopular and got some results, but they are
not similar or near words to the word I gave in the query.
Here is the output:
http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck.collate=true&spellcheck.on
Hi Grijesh,
You said you are implementing something of this type. Can you briefly tell me how
you did it?
Regards,
satya
Hi Grijesh,
Even if I use autosuggest I may not get the exact results, and the
order is not accurate. For example, if I request
http://localhost:8080/solr/terms/?terms.fl=spell&terms.prefix=solr&terms.sort=index&terms.lower=solr&terms.upper.incl=true
i get resu
Hi Grijesh,
I added both the TermsComponent and the SpellCheckComponent to the
terms request handler. When I send a query as
http://localhost:8080/solr/terms?terms.fl=text&terms.prefix=java&&rows=7&omitHeader=true&spellcheck=true&spellcheck.q=java&spellcheck.count=20
the result I get is
-
Hi All,
Can we get spellchecking results even when the keyword is correct?
Spellchecking gives suggestions only for wrong keywords. Can't we get
similar and near words for the keyword even though the spellcheck.q is correct?
As an example:
http://localhost:8080/solr/spellcheck?q=java&spellcheck=tr
Hi All,
I have a question: does Solr calculate the scores of result documents
dynamically at query time, or does it precalculate them and serve the stored values?
For example:
if a query q=solr is made on my index, I get 25
documents as results. What is it calculating? I am very keen to
Hi Markus,
As far as I understand Solr's scoring, the scoring is
done at search time using the boost values that were given during
indexing.
I have a question now. If I search for the keyword java, then
1) if the term "java" in the index matches 50,000 documents, then do s
Hi all,
Out of keen interest in Solr's indexing mechanism, I started digging into
the Solr indexing code (/update/extract). I read about the index file formats
and the scoring procedure, and I have some questions about them.
1) The scoring is performed on the dynamic and precalculated values (doc
boost, field bo
Hi All,
As a project requirement I need to keep file searches private,
so I need to modify the Solr code.
For example, if there are 5 users and each user indexes some files as
user1 -> java1, c1,sap1
user2 -> java2, c2,sap2
user3 -> java3, c3,sap3
user4 -> java4,
Hi Jayendra,
I forgot to mention that the result also depends on the user's
groups. It is somewhat complex, so I didn't mention it. Now I will explain the
exact setup.
user1, group1 -> java1, c1,sap1
user2 ,group2-> java2, c2,sap2
user3 ,group1,group3-> java3, c3,sap3
user4 ,group3
Hi Jayendra,
The group field can be kept if the number of groups is
small. If a user may belong to 1000 groups, wouldn't it be
difficult to build the query? And if a user changes groups, we have to
reindex the data again.
OK, I will try your suggestion, if it can fu
Hi All,
To index documents on another server I need to include a
cookie value in the request made through stream.url.
Can anybody tell me how to include the cookie in the URL?
Has anybody done this? If there are any suggestions, please tell
me.
Example:
http://loca
Hi Markus,
I am using Solr branch_3x on the Tomcat web server.
Regards,
satya
Hi All,
I was able to set the cookie value on the stream.url connection. I
passed the cookie value down to the ContentStreamBase.URLStream class and
added
conn.setRequestProperty("Cookie", cookie[0].name + "=" + cookie[0].value) in the
connection setup, and it is working fine now.
Regards,
s
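The same idea, sending the cookie as a request header rather than putting it in the URL, can be sketched outside Solr in Python. This is only an analogue of the Java change described above, not Solr code; the function name, URL, and cookie values are placeholders.

```python
from urllib.request import Request


def request_with_cookie(url, name, value):
    """Build a request that carries one cookie in the Cookie header.

    Hypothetical helper: the cookie travels as a header, so the URL
    itself stays unchanged.
    """
    req = Request(url)
    req.add_header("Cookie", f"{name}={value}")
    return req


if __name__ == "__main__":
    # Placeholder URL and cookie; nothing is sent until urlopen(req) is called.
    req = request_with_cookie("http://remotehost/file.pdf", "session", "abc")
    print(req.get_header("Cookie"))
```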
Hi all,
I just duplicated SolrDispatchFilter as
SolrDispatchFilter1 and SolrDispatchFilter2, such that all /update or
/update/extract requests pass through SolrDispatchFilter1
and all search (/select) requests pass through
SolrDispatchFilter2. It is because i
70 matches