create cores dynamically
I am not sure I understand how the "create cores dynamically" functionality is supposed to work. From what I have figured out, I need to specify instanceDir as the path to a directory which contains the conf directory. So I have a directory that serves as a template for configuration files, but when I use this path, Solr adds the data directory next to this template conf directory, which defeats the purpose. I was hoping it would copy the template files into a new directory created for the core. Is that not how it's supposed to work? Any help is appreciated.

Thanks
Adeel
--
View this message in context: http://lucene.472066.n3.nabble.com/create-cores-dynamically-tp4044279.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: create cores dynamically
Well, this is all useful information, but I am not sure it really answers my question. Let me rephrase exactly what I am trying to do. Let's say I start with core0, so the directory structure looks like this:

  solr.home
    solr.xml
    core0
      conf
      data

Now, when I dynamically add core1, I want to end up with a structure like this:

  solr.home
    solr.xml
    core0
      conf
      data
    core1
      conf
      data

Is this possible with dynamic core creation, i.e. a separate directory with its own conf and data directories for each core?

Thanks for the help
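[Editor's note: the layout above is exactly what a multi-core solr.xml describes. A sketch, with paths assumed; each core points at its own instanceDir under solr.home:]

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- each core has its own instanceDir containing conf/ and data/ -->
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```

With persistent="true", cores registered through the CoreAdmin CREATE command are written back into this file, so dynamically created cores survive a restart.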
help with facets and searchable fields
Hi there, I am trying to get familiar with Solr while setting it up on my local PC and indexing and retrieving some sample data. A couple of things I am having trouble with:

1 - In my schema, if I don't use copyField to copy data from some fields into the text field, they are not searchable. So if I just have an id and a title field with the searchable and sortable attributes set to true, I can't search on them, but as soon as I add them to the 'text' field with the copyField functionality, they become searchable.

2 - I have another field called category, which can hold values like "Category 1", "Category 2", etc. Same idea here: the category field by itself isn't searchable until I copy it into the text field, and as I understand it, all data is simply appended to the text field. So if I search for something matching on title and then facet on the category field, and my result was, say, in "Category 2", the facet comes back split in two, with "Category" being one entry and "2" being another. How can I make it consider 'Category 2' as a single value?

Any help is appreciated
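[Editor's note: faceting on a tokenized text field splits each value into terms, which is what produces the "Category" / "2" split above. The usual pattern — sketched here with field names taken from the post and attribute values assumed — is to keep an untokenized string field for faceting and copy it into a tokenized field for search:]

```xml
<!-- untokenized field for faceting: "Category 2" stays one value -->
<field name="category" type="string" indexed="true" stored="true" />
<!-- tokenized catch-all field used for free-text search -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true" />
<copyField source="category" dest="text" />
```

Faceting with facet.field=category then returns whole values, while free-text queries still match through the text copy.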
too often delta imports performance effect
We are trying to set up Solr for a website where data gets updated pretty frequently, and I want those changes reflected in the Solr indexes sooner than nightly delta-imports. So I am thinking we will probably want to set it up with delta imports running every 15 minutes or so, and Solr search will obviously be in use while this is going on. First of all, does Solr work well with new data being added, or existing data being updated, while people are running searches against it? Secondly, are these delta imports going to cause any significant performance degradation in Solr search?

Any help is appreciated
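[Editor's note: the every-15-minutes setup above amounts to hitting the DataImportHandler's delta-import command on a schedule. A minimal sketch, assuming a DIH handler registered at /dataimport on the default port; the function names are illustrative:]

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_DIH = "http://localhost:8983/solr/dataimport"

def delta_import_url(clean: bool = False, commit: bool = True) -> str:
    """Build the DIH delta-import command URL; clean=false keeps existing docs."""
    params = urlencode({"command": "delta-import",
                        "clean": str(clean).lower(),
                        "commit": str(commit).lower()})
    return f"{SOLR_DIH}?{params}"

def run_forever(interval_secs: int = 900):
    """Fire a delta-import every 15 minutes (900 s); a cron job works equally well."""
    while True:
        urlopen(delta_import_url())  # kicks off the incremental import
        time.sleep(interval_secs)
```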
Re: too often delta imports performance effect
Thank you, that helps. Actually it's not that many updates: close to 10 fields, probably, and maybe 50 document updates per 15 minutes. So I am assuming that by handling indexing and searching in parallel you mean that while it is updating some data, it will continue to show the old data until the new data has been finalized (committed), or something like that?

Jan Høydahl / Cominvent wrote:
>
> Hi,
>
> This all depends on actual volumes, HW, architecture etc.
> What exactly is "pretty frequently", how many document updates/adds per 15
> minutes?
>
> Solr is designed to be able to do indexing and search in parallel, so you
> don't need to fear this, unless you are already pushing the limits of what
> your setup can handle. The best way to go is to start out and then
> optimize when you see bottlenecks.
>
> Here is a pointer to the Wiki about indexing performance:
> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>
> --
> Jan Høydahl - search architect
> Cominvent AS - www.cominvent.com
schema design - catch all field question
Suppose in my schema I have a catch-all content field and I am copying all fields into it. My question is: what if, instead of that, I change the title field to be of type text as well and don't copy it into the content field, but still copy everything else (all the string fields) into the content field? Exactly what difference will that make?
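[Editor's note: the schema XML in this post was eaten by the archive. A sketch of the two options being compared, with field names from the thread and attribute values assumed from the fragments quoted in the reply:]

```xml
<!-- option A: title kept as string and copied into the catch-all -->
<field name="title" type="string" indexed="true" stored="true" required="true" />
<field name="content" type="text" indexed="true" stored="false" multiValued="true" />
<copyField source="title" dest="content" />

<!-- option B: title tokenized and queried directly, not copied -->
<field name="title" type="text" indexed="true" stored="true" required="true" />
```

With option B, a default-field search against content no longer matches terms that occur only in titles; you would have to query the title field explicitly (or use a handler like dismax that searches several fields).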
Re: schema design - catch all field question
I am just trying to understand the difference between the two options so I know which one to choose. It sounds like I probably should just merge all the data into the content field to maximize search results.

Erick Erickson wrote:
>
> The obvious answer is that you won't get any hits for terms
> in titles when you search the content field.
>
> But that's not very informative. What are you trying to accomplish?
> That is, what's the high-level issue you're trying to address with
> a change like that?
>
> Best
> Erick
labeling facets and highlighting question
Simple question: I want to give a label to my facet queries instead of the name of the facet field. I found documentation on the Solr site saying I can do that by specifying the key local param, with syntax something like:

facet.field={!ex=dt%20key='By%20Owner'}owner

I am just not sure what the ex=dt part does. If I take it out, it throws an error, so it seems important, but what for?

Also, I tried turning on highlighting, and I can see that it adds the highlighting list at the end of the XML, but it only points out the ids of the matching results; it doesn't actually show the text it is matching against. So I am getting back entries containing just the document ids instead of the actual text being matched. Isn't it supposed to wrap the search terms in an em tag? How come it's not doing that in my case?
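[Editor's note: the two local params do different jobs — key renames the facet in the response, while ex=dt excludes the filter query that was tagged {!tag=dt}, so the facet counts ignore that filter. A sketch of the full parameter set from the wiki example, with hypothetical field and tag names, built in Python just to keep the URL quoting straight:]

```python
from urllib.parse import urlencode

# Tag the owner filter, then exclude it when faceting so counts for
# "owner" ignore the owner filter itself; key= relabels the facet.
params = urlencode({
    "q": "mainquery",
    "fq": "{!tag=dt}owner:john",
    "facet": "on",
    "facet.field": "{!ex=dt key='By Owner'}owner",
})
print(params)
```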
Re: labeling facets and highlighting question
Okay, so if I don't want to do any excludes, I am assuming I should just put in {key=label}field. I tried that, and it doesn't work; it says undefined field {key=label}field.

Lance Norskog-2 wrote:
>
> Here's the problem: the wiki page is confusing:
>
> http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
>
> The line:
> q=mainquery&fq=status:public&fq={!tag=dt}doctype:pdf&facet=on&facet.field={!ex=dt}doctype
>
> is standalone, but the later line:
>
> facet.field={!ex=dt key=mylabel}doctype
>
> means 'change the long query from {!ex=dt}docType to {!ex=dt
> key=mylabel}docType'.
>
> 'tag=dt' creates a tag (name) for a filter query, and 'ex=dt' means
> 'exclude this filter query'.
>
> --
> Lance Norskog
> goks...@gmail.com
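[Editor's note: the "undefined field" error above is what Solr reports when the braces are not recognized as local params — the block needs a ! after the opening brace, i.e. {!key=mylabel}field, even when no exclude is used. A small sketch with a hypothetical label and field:]

```python
from urllib.parse import urlencode

# '{key=mylabel}doctype' is treated as a literal field name -> "undefined field".
# With the leading '!' it parses as local params: the facet is computed on
# doctype but returned in the response under the label 'mylabel'.
params = urlencode({"facet": "on", "facet.field": "{!key=mylabel}doctype"})
print(params)
```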
Re: some scores to 0 using omitNorms=false
I was going to ask a question about this, but you seem like you might have the answer for me. What exactly does the omitNorms attribute do (or what is it expected to do)? Also, could you please help me understand what the termVectors and multiValued options do?

Thanks for your help

Raimon Bosch wrote:
>
>
> Hi,
>
> We did some tests with omitNorms=false. We have seen that on the last
> page of results we have some scores set to 0.0. These scores set to 0 are
> problematic for our sorters.
>
> Could it be some kind of bug?
>
> Regards,
> Raimon Bosch.
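[Editor's note: these are per-field attributes in schema.xml. Briefly: omitNorms=true drops the index-time norm (the length/boost normalization factor), so field length stops influencing scores; termVectors=true stores per-document term and position information, used by features such as MoreLikeThis and faster highlighting; multiValued=true lets one document hold several values for the field. A sketch with an assumed field name:]

```xml
<!-- multiValued: a document may carry several categories
     omitNorms: skip length/boost norms (field length stops affecting score)
     termVectors: store term/position vectors for MoreLikeThis, highlighting -->
<field name="category" type="string" indexed="true" stored="true"
       multiValued="true" omitNorms="true" termVectors="true" />
```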
highlighting fragments EMPTY
Hi, I am trying to get highlighting working, and it's turning out to be a pain. My schema has string fields title, pi, and status, plus a text catch-all field named content (also the default search field) that they are all copied into. Here is how I have set up the highlighting defaults in the solrconfig file (reconstructed from the fragments quoted downthread; the XML tags were eaten by the archive):

  <str name="hl.fl">title pi status</str>
  <str name="f.name.hl.fragsize">0</str>
  <str name="f.title.hl.alternateField">content</str>
  <str name="f.pi.hl.alternateField">content</str>
  <str name="f.status.hl.alternateField">content</str>
  <str name="f.title.hl.fragmenter">regex</str>
  <str name="f.pi.hl.fragmenter">regex</str>
  <str name="f.status.hl.fragmenter">regex</str>

After this, when I search for, say, http://localhost:8983/solr/select?q=submit&hl=true I get results in the highlight section with no reference to the actual string; the only thing returned is the id of each record (which is also the unique identifier). Why am I not getting string fragments with the search terms highlighted?

Thanks for your help
Re: highlighting fragments EMPTY
Well, OK, I guess that makes sense, and I tried changing my title field to the text type, and highlighting then worked on it. But:

1) As far as not merging all fields into a catch-all field and instead configuring the dismax handler to search through them: do you mean I will then have to specify the field I want to search in, e.g. q=something&hl.fl=title or q=somethingelse&hl.fl=status? Another thing is that I have about 20 fields which I am merging into my catch-all field; with that many fields, do you still think it's better to use dismax rather than a catch-all field?

2) Secondly, for highlighting, q=title:searchterm also didn't work; it only works if I change the type of the title field from string to text. Even if I give the full string in the q param, it still doesn't highlight it unless, like I said, I change the field type to text. So why is that? And if that's just how it is and I have to change some of my fields to text, then my question is: will Solr analyze them first in their own field and then copy them to the catch-all field, doing the analysis one more time, since the catch-all field is also text? I guess this is more of an understanding question.

Thanks for all your help

Ahmet Arslan wrote:
>
>> hi
>> i am trying to get highlighting working and its turning out
>> to be a pain. [snip]
>
> You need to change the type of the fields (title, pi, status) from string
> to text (same as the content field).
>
> There should be a match/hit on that field in order to create highlighted
> snippets.
>
> For example q=title:submit should return documents so that a snippet of
> the title can be generated.
>
> FYI: You can search title, pi, status at the same time using
> http://wiki.apache.org/solr/DisMaxRequestHandler without copying all of
> them into a catch-all field.
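[Editor's note: the DisMax suggestion at the end can be sketched as a request handler configuration, so a single q parameter searches all three fields without a catch-all copy. Field names come from this thread; the handler name and boosts are illustrative assumptions:]

```xml
<!-- searches title, pi and status in one query; the ^2 boost is illustrative -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^2 pi status</str>
    <str name="hl">true</str>
    <str name="hl.fl">title pi status</str>
  </lst>
</requestHandler>
```

A query like /solr/dismax?q=submit&hl=true would then match and highlight across all three fields at once.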
Understanding delta import
Hi there, I am having some trouble understanding delta-import and how it's different from full-import. From what I can tell, the only difference is that it has the clean parameter set to false by default; otherwise, as far as setting up your query to use the last index time goes, you can do that even in a full-import.

So I tried setting up my import process, and my query is set up something like this:

  SELECT some fields (all mapped correctly to solr fields)
  FROM some tables
  WHERE dateModified > data_import_last_index_time

If I use this query and run the delta-import, it doesn't find anything and doesn't update anything. If I run full-import with clean=false, it finds the modified documents and updates them correctly. Right after that, if I run the delta-import again, it finds the same documents AGAIN (the ones the full-import found) and imports them. But after that, if I keep changing data and keep hitting the delta-import URL, nothing happens until I do the same full-import again. Any idea what's going on here?

So based on this, it seems like full-import is doing what I want: simply grab recently modified data and update or add it in the Solr index. Should I just use the full-import then? Any advice would be very helpful.

Thanks
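[Editor's note: delta-import does not reuse the entity's query attribute at all — it runs a separate deltaQuery to collect the changed primary keys, then a deltaImportQuery per key, which would explain the "nothing happens" behavior above. A data-config sketch wired that way; the table and column names are assumptions:]

```xml
<entity name="item" pk="id"
        query="SELECT id, title, dateModified FROM items"
        deltaQuery="SELECT id FROM items
                    WHERE dateModified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, title, dateModified FROM items
                          WHERE id = '${dataimporter.delta.id}'" />
```

${dataimporter.last_index_time} and ${dataimporter.delta.id} are variables DIH fills in itself; the plain query attribute is only used by full-import.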
Re: Understanding delta import
Any ideas???

adeelmahmood wrote:
>
> hi there
> I am having some trouble understanding delta import and how its different
> from full import .. [snip]
solr for reporting purposes
We are trying to use Solr for somewhat of a reporting system too (along with search), since it provides such amazing control over queries and basically over the data the user wants; they might as well be able to dump that data into an Excel file too if needed. Our data isn't too much: close to 25K docs with 15-20 fields in each doc, and mostly these reports will cover close to 500-4000 records. I am thinking about setting up a simple servlet that submits the user's query to Solr over HTTP, grabs all the result data, and dumps it into an Excel file. I was just hoping to get some idea of whether this is going to cause any performance impact on Solr search, especially since it's all on the same server and some users will be running reports while others are searching. Right now search is working GREAT, it's blazing fast, and I don't want to lose that, but at the same time reporting is an important requirement as well.

Also, I would appreciate any hints toward creative ways of doing it, something like getting 500 records in a single request and then repeating the process with a timer task.

Thanks for your help
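[Editor's note: the 500-at-a-time idea maps directly onto Solr's start/rows paging. A sketch of the export loop — the select URL, page size, and function names are assumptions, and the pure windowing helper is split out so the paging arithmetic is easy to check:]

```python
from urllib.parse import urlencode

def page_windows(total: int, page_size: int = 500):
    """Yield (start, rows) pairs covering `total` results in fixed-size pages."""
    for start in range(0, total, page_size):
        yield start, min(page_size, total - start)

def page_urls(query: str, total: int, page_size: int = 500,
              base: str = "http://localhost:8983/solr/select"):
    """Build one request URL per page; fetching these one at a time (with a
    short pause between them) keeps report exports from monopolizing Solr."""
    for start, rows in page_windows(total, page_size):
        yield f"{base}?{urlencode({'q': query, 'start': start, 'rows': rows})}"
```

For a 4000-record report this issues 8 requests of 500 rows instead of one large one; at 25K documents total, even the deepest start offset stays cheap.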
Re: solr for reporting purposes
I just want to clarify, in case it's not obvious, that the reason I am concerned about Solr's performance is that for reporting requests I will probably have to request all the result rows at the same time, instead of 10 or 20.

adeelmahmood wrote:
>
> we are trying to use solr for somewhat of a reporting system too (along
> with search) .. [snip]
Re: solr for reporting purposes
Well, thanks for your reply. As far as the load goes, again, I think most of the reports will be for 1000-4000 records, and we don't have that many users; it's an internal system, so we have about 400 users per day, and we are opening this up for only half of those people (a specific role), so close to 200 people could potentially use it. So, practically speaking, I think we could have up to 50 requests at a given time, but again, since these are reports, they won't be needed every day; once you get a report, you have it for a while. So overall, I don't think we have that much user load. What do you think?

Also, I was thinking about handling requests with a 500-record limit, so a request for 2000 records would be handled as four separate requests (refreshed on a 5-second timeout). Do you think it's a good idea to ask Solr to return 500 rows at a time but make that request four times, or is it better to just ask for 2000 rows altogether?

Ron Chan wrote:
>
> we've done it successfully for similar requirements
>
> the resource requirements depend on how many concurrent people will be
> running those types of reports
>
> up to 4000 records is not a problem at all, one report at a time, but if
> you had concurrent requests running into thousands as well then you may
> have a problem, although you will probably run into memory problems at the
> rendering end before you have problems with Solr, i.e. not a Solr problem
> as such, but a problem generally of unrestricted adhoc reporting