Re: How do I secure solr server?
Hi Mel,

One method is to limit access to the web backend by having it respond only on 127.0.0.1. I'm not certain here, but I think that to do that you need to add the access restriction in your servlet container, which may be different from ours. For instance, we edited jetty.xml in our situation.

I hope this is of some help to get you started looking. I've probably got a lot of the terminology wrong there, and some facts :-) Might help though.

matt

On 21 Feb 2008, at 06:46, Mel Brand wrote:

Hi guys,

I run solr on a separate server from the application server and I'd like to know how to protect it. I'd like to know how to prevent someone from communicating to the server and also prevent unauthorized access (through the web) to admin page.

Any help is extremely appreciated!! :)

Thanks,
Mel
Re: How do I secure solr server?
On Thu, 2008-02-21 at 01:46 -0500, Mel Brand wrote:
> Hi guys,
>
> I run solr on a separate server from the application server and I'd
> like to know how to protect it.

Best with a firewall.

> I'd like to know how to prevent
> someone from communicating to the server and also prevent unauthorized
> access (through the web) to admin page.

I would not expose http://yourServer:8983 at all. I would use an Apache httpd server as a proxy and implement the access control there.

salu2

> Any help is extremely appreciated!! :)
>
> Thanks,
>
> Mel
--
Thorsten Scherler thorsten.at.apache.org
Open Source Java consulting, training and solutions
spellchecker and extendedResults
Hi, I'm using the spellchecker from the Solr nightly build (2008-02-20) and it does not show extended results when I set the extendedResults option. What may be the reason?

It also sorts matches in a different order depending on the suggestionCount parameter. Is it normal that the sort order differs when I change suggestionCount? And I think it does not count the frequency of the suggested word at all. The situation is the same with the 1.2 release. What may be wrong?

Vic.
--
View this message in context: http://www.nabble.com/spellchecker-and-extendedResults-tp15607286p15607286.html
Re: YAML update request handler
Hi,

The format over the wire is not of great significance, because it gets unmarshalled into the corresponding language object as soon as it comes off the wire. I would say XML/JSON should meet 99% of the requirements, because all the platforms come with an unmarshaller for both of these.

But if it can offer a good performance improvement, it is worth trying.

--Noble

On Thu, Feb 21, 2008 at 3:41 AM, alexander lind <[EMAIL PROTECTED]> wrote:
> On Feb 20, 2008, at 9:31 AM, Doug Steigerwald wrote:
>
> > A few months back I wrote a YAML update request handler to see if we
> > could post documents faster than with XML. We did see some small
> > speed improvements (didn't write down the numbers), but the hacked
> > together code was probably making it slower as well. Not sure if
> > there are faster YAML libraries out there either.
> >
> > We're not actually using it, since it was just a small proof of
> > concept type of project, but is this anything people might be
> > interested in?
>
> Out of simple preference I would love to see a YAML request handler
> just because I like the YAML format. If it's also faster than XML, then
> all the better.
>
> Cheers
> Alec

--
--Noble Paul
Re: Solr in Windows XP + JDK 5 + Tomcat 6.0.13
Thanks a lot, it's running right now.

It seems that solr.solr.home should not point into the webapps directory; maybe this tip should be included in the installation guide...

Thanks again.

On Wed, Feb 20, 2008 at 10:50 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Wed, Feb 20, 2008 at 5:32 PM, Alejandro Valdez <[EMAIL PROTECTED]> wrote:
> > Hi, I changed that line to:
> >
> > set JAVA_OPTS=-Dsolr.home=C:\xampp\tomcat\webapps\solr -Duser.language=en
> >
> > But It STILL isn't working...I almost give up :-(
> >
> > When I try to open http://localhost:8080/solr/admin, I get:
> >
> > ---
> > HTTP Status 404 - /solr/admin
> > type Status report
> > message /solr/admin
> > description The requested resource (/solr/admin) is not available.
> > Apache Tomcat/6.0.13
> > ---
> >
> > Someone should fix the page http://wiki.apache.org/solr/SolrTomcat,
> > there says that should be used -Dsolr.solr.home=... :
>
> solr.solr.home is the correct variable.
> Try putting the solr home (the contents of solr/example) outside the
> webapps directory. Only solr.war should go inside webapps.
>
> You could also try the "simple example install" from here:
> http://wiki.apache.org/solr/SolrTomcat
>
> -Yonik
Different Filters
Hello all, We have a requirement for being able to switch on and off certain filters for different searches. The problem is that these filters are defined in the schema, per field; we only have one field with text so I was wondering if there was a way of setting the filters in the solrconfig.xml and create a different search a bit like the example but with different filters. thoughts? Best Regards, Martin Owens
Re: Different Filters
On Thu, Feb 21, 2008 at 1:20 PM, Owens, Martin <[EMAIL PROTECTED]> wrote:
> We have a requirement for being able to switch on and off certain filters
> for different searches.

Can the client send in which filters should be turned on and off, but leave the definition of the filters in solrconfig.xml? If so, you can get this effect with the new query parser plugin framework. Part of that includes what I call "local parameters" (not really documented yet), which includes parameter dereferencing.

So you could add something like this to your query
fq=&fq=
and have the various filters be a default defined in a handler in solrconfig.xml
Re: Solr in Windows XP + JDK 5 + Tomcat 6.0.13
Hi Alejandro.

Since this was a bit of trouble for you, could you post the steps you used to get it to work (and/or any deviation from the wiki) to summarize this thread? I have been seeing the thread on the list for some days, and a summary would leave something more useful than "I got it running" for other folks with a similar issue in future. Many thanks.

Regards
David
Companies Using Solr
Hey Folks, Reminder: http://wiki.apache.org/solr/PublicServers lists the sites using Solr. The listing is a bit thin. I know many people don't know about the list or don't have the time to add themselves to the list. I'd like to be able to promote open sourcing more systems (like Solr) and this information would help show it is helping a large community. Feel free to reply directly to me and I can add you. Thanks. --cw Clay Webster Associate VP, Platform Infrastructure CNET, Inc. (Nasdaq:CNET)
RE: Different Filters
> Can the client send in which filters should be turned on and off, but
> leave the definition of the filters in solrconfig.xml?

The client must set the property; how solr deals with that is how I want it to work.

> If so, you can get this effect with the new query parser plugin
> framework. Part of that includes what I call "local parameters" (not
> really documented yet), which includes parameter dereferencing.

In what version of Solr does this first appear? We're using a nightly build from December which was heavily hacked to do database result returning and word offset highlighting (and some other fixes), so we'd like to avoid using anything newer.

> So you could add something like this to your query
> fq=&fq=
> and have the various filters be a default defined in a handler in
> solrconfig.xml

How does this work? I'm still confused by your explanation. Are the query options turning the filters on or off? What kind of handler would go into solrconfig.xml?

Best Regards, Martin Owens
Re: Different Filters
On Thu, Feb 21, 2008 at 1:49 PM, Owens, Martin <[EMAIL PROTECTED]> wrote:
> > So you could add something like this to your query
> > fq=&fq=
> and have the various filters be a default defined in a handler in
> solrconfig.xml
>
> How does this work? I'm still confused from your explanation. Are the query
> options turning the filters on or off? what kind of hander would go into
> solrconfig.xml?

This feature was first committed 10/22/07.

It's a simple indirection.
fq=myfield:myval
is equivalent to
fq=&filter1=myfield:myval

Now put filter1 as a default in your handler (same as any other default), and the client can turn on and off filter1 without knowing what exactly it is.

-Yonik
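A minimal client-side sketch of the indirection Yonik describes. The parameter values in the archived messages did not survive (they show only `fq=`), so this uses the `{!v=$filter1}` dereferencing form documented for current Solr; that is an assumption, and the syntax accepted by a late-2007 nightly build may differ. The endpoint URL and the filter name `filter1` are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; adjust host, port, and handler path for your install.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_search_url(q, enabled_filters):
    """Build a query URL that switches named filters on by reference only."""
    params = [("q", q)]
    for name in enabled_filters:
        # {!v=$name} asks Solr to read the filter query from the parameter
        # `name`, which is expected to be a default on the server-side handler.
        params.append(("fq", "{!v=$" + name + "}"))
    return SOLR_SELECT + "?" + urlencode(params)

url = build_search_url("ipod", ["filter1"])
print(url)
# Round-trip check: the decoded fq value is the reference, not the filter itself.
print(parse_qs(urlsplit(url).query)["fq"])
```

The client never sees the filter definition: leaving `filter1` out of `enabled_filters` turns the filter off, and what `filter1` expands to stays in the handler defaults in solrconfig.xml.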
Re: YAML update request handler
Python marshal format is worth a try. It is binary and can represent the same data as JSON. It should be a good fit to Solr.

We benchmarked that against XML several years ago and it was 2X faster. Of course, XML parsers are a lot faster now.

wunder

On 2/21/08 10:50 AM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:
> XML can be a problem when it is really lengthy (lots of results, large
> results) such that a binary format could be useful in certain cases
> where we control both ends of the pipe (i.e. SolrJ.) I've seen apps
> that deal with really large files wrapped in XML where the XML parsing
> takes a significant amount of time as compared to a more compact
> binary format.
>
> I think it at least warrants profiling/testing.
>
> -Grant
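To make the marshal suggestion concrete, here is a small sketch using Python's standard `marshal` and `json` modules on a made-up document (the field names and values are illustrative only). It shows that both formats round-trip the same data; it is not a benchmark, and it says nothing about what Solr itself can emit. Note that the marshal format is specific to the Python version and should not be loaded from untrusted sources.

```python
import json
import marshal

# Hypothetical document; any JSON-style structure of dicts, lists,
# strings, and numbers can be represented in both formats.
doc = {
    "id": "doc1",
    "title": "YAML update request handler",
    "price": 9.99,
    "tags": ["solr", "yaml"],
}

binary_payload = marshal.dumps(doc)            # compact binary encoding
text_payload = json.dumps(doc).encode("utf-8") # text encoding of the same data

# Both encodings decode back to the same Python object.
decoded_binary = marshal.loads(binary_payload)
decoded_text = json.loads(text_payload.decode("utf-8"))

print(decoded_binary == doc, decoded_text == doc)
print(len(binary_payload), len(text_payload))
```

Whether the binary form is actually faster or smaller for real result sets would need the kind of profiling Grant suggests.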
Re: Solr in Windows XP + JDK 5 + Tomcat 6.0.13
Hello, yes of course.

I followed the instructions from http://wiki.apache.org/solr/SolrTomcat (see below), but instead of copying the example configuration files into the directory c:\web\solr\ as explained on that page, I copied them into c:\tomcat\webapps\solr and started Tomcat with:

-Dsolr.solr.home=c:\tomcat\webapps\solr

But it didn't work. Apparently the directory used in the solr.solr.home variable MUST NOT point inside Tomcat's webapps directory, or it will be ignored.

***

The environment I used was:
Windows XP Professional
XAMPP 1.6.4
Tomcat 6.0.13
Sun JDK 5

Updated content of http://wiki.apache.org/solr/SolrTomcat:

Tomcat on Windows
Single Solr app

1) Download and install Tomcat for Windows using the MSI installer. Install it with the tcnative.dll file. Say you installed it in c:\tomcat\
2) Check if Tomcat is installed correctly by going to http://localhost:8080/
3) Change the c:\tomcat\conf\server.xml file to add the URIEncoding Connector element as shown above.
4) Download and unzip the Solr distribution zip file into (say) c:\temp\solrZip\
5) Make a directory called solr where you intend the application server to function, say c:\web\solr\ (Important: it must be outside Tomcat's webapps directory)
6) Copy the contents of the example\solr directory c:\temp\solrZip\example\solr\ to c:\web\solr\
7) Stop the Tomcat service
8) Copy the *solr*.war file from c:\temp\solrZip\dist\ to the Tomcat webapps directory c:\tomcat\webapps\
9) Rename the *solr*.war file solr.war
10) Use the system tray icon to configure Tomcat to start with the following Java option: -Dsolr.solr.home=c:\web\solr
11) Start the Tomcat service
12) Go to the solr admin page to verify that the installation is working. It will be at http://localhost:8080/solr/admin
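The pitfall reported in this thread (a Solr home placed under Tomcat's webapps directory) can be caught before starting Tomcat with a small path check. This is only a sketch: the function name is made up, the paths are the ones mentioned in the thread, and the rule itself is the poster's observation rather than documented Solr behavior.

```python
from pathlib import PureWindowsPath

def is_inside(candidate, directory):
    """Return True if `candidate` is `directory` itself or lies beneath it.

    PureWindowsPath compares case-insensitively and accepts both slash
    styles, which matches how Windows treats these paths.
    """
    candidate_path = PureWindowsPath(candidate)
    directory_path = PureWindowsPath(directory)
    return candidate_path == directory_path or directory_path in candidate_path.parents

WEBAPPS = r"c:\tomcat\webapps"

for solr_home in (r"c:\tomcat\webapps\solr", r"c:\web\solr"):
    verdict = "inside webapps, move it" if is_inside(solr_home, WEBAPPS) else "ok"
    print(solr_home, "->", verdict)
```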
Re: Solr in Windows XP + JDK 5 + Tomcat 6.0.13
Hi Alejandro. Your summary is good and it should be of benefit to others. Thank you for taking the time to prepare it.

Regards,
David
Re: YAML update request handler
XML can be a problem when it is really lengthy (lots of results, large results) such that a binary format could be useful in certain cases where we control both ends of the pipe (i.e. SolrJ.) I've seen apps that deal with really large files wrapped in XML where the XML parsing takes a significant amount of time as compared to a more compact binary format.

I think it at least warrants profiling/testing.

-Grant

On Feb 21, 2008, at 12:07 PM, Noble Paul നോബിള് नोब्ळ् wrote:
> But,If it can offer good performance improvement it is worth trying.
RE: Different Filters
>> This feature was first committed 10/22/07

Great! It should be there then.

> Now put filter1 as a default in your handler (same as any other
> default), and the client can turn on and off filter1 without knowing
> what exactly it is.

OK, so I have to add a new search handler into solrconfig.xml with a set name, and I then use that in the query line to specify which field the search handler should use? Are you able to do an example including the solrconfig or schema changes, showing the field and how it works with the English stemmer, for instance?

Sorry for being a dunce today, I'm just not sure I'm totally understanding everything.

Best Regards, Martin Owens
Re: Different Filters
On Thu, Feb 21, 2008 at 2:17 PM, Owens, Martin <[EMAIL PROTECTED]> wrote: > > Now put filter1 as a default in your handler (same as any other > > default), and the client can turn on and off filter1 without knowing > > what exactly it is. > > OK so I have to add a new search hander into solrconfig.xml with a set name, > I then use that in the query line to specify which field the search hander > should use? What field??? or what filter? I'm not really sure I still understand what you are trying to accomplish. Perhaps if you have some explicit examples of what types of things clients would send in as query parameters to Solr, and what types of lucene queries you actually want to be generated. -Yonik
Re: Different Filters
OK, talk of different fields threw me. To enable a client to turn on/off a specific filter without knowing what that filter is, add the following parameter to the query string when you want to turn the filter on: fq= Then add a default for the filter1 param in lucene query syntax (like +cat:electronics +inStock:true) to whatever handler you want to query (refer to the examples in solrconfig.xml for how to do this). -Yonik On Thu, Feb 21, 2008 at 3:34 PM, Owens, Martin <[EMAIL PROTECTED]> wrote: > > > What field??? or what filter? > > I'm not really sure I still understand what you are trying to accomplish. > > Perhaps if you have some explicit examples of what types of things > > clients would send in as query parameters to Solr, and what types of > > lucene queries you actually want to be generated. > > Oh dear a complete break down, > > OK so our perl based software uses http to set a request to solr, we want > for our software to be able to control the query filters being used with each > search by modifying attributes in the http query string such as I think you > were suggesting. I need examples of how to impliment what you were talking > about. > > Best Regards, Martin Owens >
RE: Different Filters
> What field??? or what filter?
> I'm not really sure I still understand what you are trying to accomplish.
> Perhaps if you have some explicit examples of what types of things
> clients would send in as query parameters to Solr, and what types of
> lucene queries you actually want to be generated.

Oh dear, a complete breakdown.

OK, so our Perl-based software uses HTTP to send a request to Solr. We want our software to be able to control the query filters being used with each search by modifying attributes in the HTTP query string, as I think you were suggesting. I need examples of how to implement what you were talking about.

Best Regards, Martin Owens
Re: wildcard query question
: the record is found. I was wondering how the colon character affects
: the search, and if there is another way to write a wildcard query.

Most likely the issue is that your analyzer is stripping out the colon character. Hence your normal phrase search works (because the colon is stripped out both when indexing and when querying) but your wildcards don't, because the wildcard query string is not analyzed...

http://wiki.apache.org/lucene-java/LuceneFAQ#head-133cf44dd3dff3680c96c1316a663e881eeac35a

-Hoss
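A toy simulation of the effect Hoss describes. It does not use Lucene: the `analyze` function is a stand-in analyzer invented for illustration (lowercase, split on anything that is not a letter or digit), and the sample text is made up. The point is only that index-time analysis removes the colon, while the wildcard pattern is compared to the indexed terms exactly as typed.

```python
import fnmatch
import re

def analyze(text):
    """Toy analyzer: lowercase, then split on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]

def wildcard_search(pattern, indexed_terms):
    """Match the raw, unanalyzed pattern against the indexed terms."""
    return sorted(t for t in indexed_terms if fnmatch.fnmatchcase(t, pattern))

# Index-time analysis drops the colon, so no indexed term contains one.
indexed_terms = set(analyze("Subject: Solr wildcard question"))

print(wildcard_search("subject:*", indexed_terms))  # → []
print(wildcard_search("subj*", indexed_terms))      # → ['subject']
```

Writing the wildcard against the form the term takes in the index (no colon, lowercase) is what makes the second query match.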
Re: 2D Facet
: say I have a parameter facet.field=STATE. For example we'll take 3D
: faceting, so I'll need 2 more facet fields related to the first one.
: Should we do something like this:
: facet.field=STATE&f.STATE.facet.matrix=NAME&f.STATE.facet.matrix=INCOME
: Or for example we can have may be like this:
: facet.matrix=STATE,NAME,INCOME
: What would you suggest is better?

It's not something I've thought about too hard, but I was thinking along the lines of the first example. So STATE is the main facet for the matrix, and the other facets are identified as values of the f.STATE.facet.matrix param ("matrix" isn't really the best word, it's more like a tree of facet values ... for each of the top N values in the "main" facet, you also get the top N values of the other facets listed).

That way you could have multiple facet trees, and a single facet could be part of more than one tree; it just couldn't be the main facet of more than one tree.

For example, imagine we want to facet cars...

facet.limit=10
& facet.field=STATE
& facet.field=MODEL
& f.MODEL.facet.tree=COLOR
& f.MODEL.facet.tree=YEAR
& facet.field=TYPE
& f.TYPE.facet.tree=COLOR
& f.TYPE.facet.tree=STATE

...that would give you completely independent facet counts for STATE, MODEL, and TYPE, but it would also tell you what the top 10 COLORs and YEARs are for each of the top 10 MODELs, and what the top 10 COLORs and STATEs are for each TYPE of car (even if not enough cars are in that state to show up in the main STATE facet).

...honestly: any permutation you want is possible, it's just a question of how to express it cleanly in key=val pair style input so it's easy to express over HTTP.

: Also, where in Solr I could find something similar to take it as an
: example? Where all this logic should be placed?

The logic could go in a custom RequestHandler, or a custom Component ... if you look at the FacetComponent class in the nightly builds of Solr you can see how the current Simple faceting code is handled ... the underlying methods (for getting counts using DocSet intersections) can still be reused, you just need to pass them additional "filter" DocSets from the "main" facet.

-Hoss
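For anyone prototyping a client for the scheme sketched above, the key=val pairs can be generated from a simple description of the trees. Note that `f.<field>.facet.tree` is only the parameter name proposed in this thread, not an existing Solr parameter, and the helper name below is made up.

```python
from urllib.parse import urlencode

def facet_tree_params(limit, trees):
    """Turn (main_facet, [child_facets]) pairs into repeated key=val params.

    The f.<field>.facet.tree key is the hypothetical parameter proposed in
    this thread; a custom handler or component would have to interpret it.
    """
    params = [("facet.limit", str(limit))]
    for main_facet, children in trees:
        params.append(("facet.field", main_facet))
        for child in children:
            params.append(("f." + main_facet + ".facet.tree", child))
    return params

# The car example from the message above.
car_trees = [
    ("STATE", []),
    ("MODEL", ["COLOR", "YEAR"]),
    ("TYPE", ["COLOR", "STATE"]),
]
car_params = facet_tree_params(10, car_trees)
print(urlencode(car_params))
```

Using a list of pairs rather than a dict keeps repeated keys such as `facet.field` and preserves their order in the query string.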
Re: Wild card searching not working properly
Problems like this depend heavily on exactly what the fieldtype and "index" analyzer are for the field you are querying on. It's important to keep in mind that wildcard and fuzzy queries are not "analyzed", so things like lowercasing and stemming have to be taken into account -- typically it's useful to use copyField to have a special version of your field with simplified analysis for doing wildcard searches on.

As for your specific problem: given the limited information you've provided, no guesses immediately jump out at me as to what you should do to get things working the way you want ... it depends on your schema, and what the original text was in those 3 documents you want to match.

: For example, following are the search queries and the corresponding results
: tomcat* -> 3 results
: tomca* -> 0 results
: tom*at -> 0 results
: tom~at -> 0 results

-Hoss
Re: How to search multiphrase or a middle term in a word???
Your "master" examples work part of the time because the WordDelimiterFilter can tell at index time that the capital M in the middle of the word is a good place to split on. Without hints like that at index time, the only way to do "middle of the word" searches is with wildcard-type queries -- which are really inefficient.

You might want to read a bit about N-Grams and consider using an NGramTokenizer to chunk up your input words into ngrams for easy searching on pieces of words.

:My index data is :
: srinivasan,sweetheart,thomasmaster,thomasMaster.(totally 4 words)
:Search data : vasan
:Like that, if i search for "hear", its return nothing. The result should
:Search data : Master or master
:But mainly, i am unable to search a middle term in a word. I have

-Hoss
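To illustrate why n-grams make "middle of the word" lookups cheap, here is a plain-Python sketch, independent of Lucene's NGramTokenizer. The gram sizes (3 to 6 characters) and function names are arbitrary choices for the example; the four words are the ones from the question.

```python
from collections import defaultdict

def ngrams(word, min_n=3, max_n=6):
    """All lowercase substrings of `word` with length min_n..max_n."""
    word = word.lower()
    return {
        word[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(word) - n + 1)
    }

def build_index(words):
    """Map every n-gram to the set of original words that contain it."""
    index = defaultdict(set)
    for word in words:
        for gram in ngrams(word):
            index[gram].add(word)
    return index

def search(index, term):
    """Exact lookup of the lowercased term; no wildcard scan is needed."""
    return sorted(index.get(term.lower(), set()))

index = build_index(["srinivasan", "sweetheart", "thomasmaster", "thomasMaster"])
print(search(index, "vasan"))
print(search(index, "hear"))
print(search(index, "Master"))
```

The trade-off is index size: every word contributes many grams, and search terms shorter than `min_n` or longer than `max_n` will not match in this simple form.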
Re: Indexing very large files.
All,

A while back I was running into an issue with a Java heap out of memory error while indexing large files. I figured out that was my own error due to a misconfiguration of my Netbeans memory settings.

However, now that is fixed and I have stumbled upon a new error. When trying to upload files which include a Solr TextField value of 32MB or more in size, I get the following error (uploading with SimplePostTool):

Solr returned an error: error reading input, returned 0
javax.xml.stream.XMLStreamException: error reading input, returned 0
    at com.bea.xml.stream.MXParser.fillBuf(MXParser.java:3709)
    at com.bea.xml.stream.MXParser.more(MXParser.java:3715)
    at com.bea.xml.stream.MXParser.nextImpl(MXParser.java:1936)
    at com.bea.xml.stream.MXParser.next(MXParser.java:1333)
    at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:318)
    at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:195)
    at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:902)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:280)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:237)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:613)

I suspect there's a setting somewhere that I'm overlooking that is causing this, but after peering through the solrconfig.xml and schema.xml files I am not seeing anything obvious (to me, anyway...=). The second line of the error shows it's crashing in MXParser.fillBuf, which implies that I'm overloading the buffer (I assume due to too large of a string).

Thanks in advance for any assistance,
Dave
Re: custom handler results don't seem to match manually entered query string
Hmmm... everything seems right here. This may be a silly question, but you are calling rsp.add("response", docs_main.docList) in your custom handler, correct?

Second question: how are you building up your query object? The only thing I can think of is that you are constructing the TermQueries directly (without using the analyzer) so they don't match what's really in the index (i.e. things aren't being lowercased, not splitting on "." and "_"), but when you cut/paste the query string into the standard request handler it uses the QueryParser, which does the proper analysis.

What does debugQuery=true say about your query when you cut/paste the query string? Can you post the full code of your custom request handler?

: Hi,
: my problem is as follows: my request handler's code
:
: filters = null;
: DocListAndSet docs_main = searcher.getDocListAndSet(query, filters, null,
: start, rows, flags);
: String querystr = query.toString();
: rsp.add("QUERY_main", querystr);
:
: gives zero responses:
:
: ((text:Travel text:Home text:Online_Archives
: text:Ireland text:Consumer_Information text:Regional text:Europe text:News
: text:Complaints text:CNN.com text:February text:Transport
: text:Airlines)^0.3)
:
: While copying the "QUERY_main" string into Solr admin returns full of them:
:
: (text:Travel text:Home text:Online_Archives text:Ireland
: text:Consumer_Information text:Regional text:Europe text:News
: text:Complaints text:CNN.com text:February text:Transport text:Airlines)^0.3
:
: 10
: 2.2
:
: Please help me understand what's going on, I'm a bit confused atm. Thanks
: :-)

-Hoss
Re: Multi field queries
: Documents in my Solr index have three fields: name, content and summary.
: Suppose the user query is "java sky democratic". I want the resulting
: documents to have all the terms in the query ("java sky democratic") in
: either name, content or the summary (for example, java and sky are in the
: content and democratic is in the summary).

Take a look at the "dismax" request handler. It is designed explicitly for this purpose: http://wiki.apache.org/solr/DisMaxRequestHandler

(NOTE: if you want all the input words to be required, set mm=100%)

-Hoss
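
As a rough illustration of that suggestion, the sketch below only builds the request URL; it does not talk to Solr. The parameters q, qf and mm are the DisMax parameters for the user query, the fields to search and the minimum-should-match setting. Everything else is an assumption about the local setup: the base URL http://localhost:8983/solr, the /select path, and a handler registered under the name "dismax" so that qt=dismax selects it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DismaxUrlSketch {

    // URL-encodes one parameter value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Builds a select URL for a handler assumed to be registered as "dismax".
    static String buildUrl(String baseUrl, String userQuery, String fields) {
        return baseUrl + "/select"
                + "?qt=dismax"
                + "&q=" + enc(userQuery)   // raw user input, no field prefixes
                + "&qf=" + enc(fields)     // fields each term is searched in
                + "&mm=" + enc("100%");    // require every term to match
    }

    public static void main(String[] args) {
        // Field names come from the question; host and port are assumptions.
        System.out.println(buildUrl("http://localhost:8983/solr",
                "java sky democratic", "name content summary"));
    }
}
```

With mm=100% each of the three terms must match, but each one may match in any of the listed fields, which is the behaviour asked for in the question.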
Offsets in results?
Hi all, I apologize if this question has been asked, but I've been unable to find the answer in the archives. Is there a way to get the offsets for results from a search in JSON format, or even from SOLR in general (regardless of format, XML even)? As in, I have fields set that I am searching over, and am returning those fields in my JSON object, and would like to know either a) the offsets of those fields or b) the offsets of the part that matched. I understand that there is a highlighting plugin, but for various reasons I'd like to get my hands on the offsets themselves. Is this something I need to hack up on my own? Thanks, Steve Suppe
Re: YAML update request handler
For the case where we use SolrJ (we control both ends), it is best to resort to a custom binary format. It works fastest and with the least cost/bandwidth. We can use a custom, lightweight object serialization/deserialization mechanism (Java standard serialization is verbose). I can create a patch for this if you think it is useful.

--Noble

On Fri, Feb 22, 2008 at 12:20 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> XML can be a problem when it is really lengthy (lots of results, large
> results) such that a binary format could be useful in certain cases
> where we control both ends of the pipe (i.e. SolrJ.) I've seen apps
> that deal with really large files wrapped in XML where the XML parsing
> takes a significant amount of time as compared to a more compact
> binary format.
>
> I think it at least warrants profiling/testing.
>
> -Grant
>
> On Feb 21, 2008, at 12:07 PM, Noble Paul നോബിള് नोब्ळ् wrote:
>
> > hi,
> > The format over the wire is not of great significance because it gets
> > unmarshalled into the corresponding language object as soon as it
> > comes off the wire. I would say XML/JSON should meet 99% of the
> > requirements because all the platforms come with an unmarshaller for
> > both of these.
> >
> > But if it can offer a good performance improvement, it is worth trying.
> > --Noble
> >
> > On Thu, Feb 21, 2008 at 3:41 AM, alexander lind <[EMAIL PROTECTED]>
> > wrote:
> >
> >> On Feb 20, 2008, at 9:31 AM, Doug Steigerwald wrote:
> >>
> >>> A few months back I wrote a YAML update request handler to see if we
> >>> could post documents faster than with XML. We did see some small
> >>> speed improvements (didn't write down the numbers), but the
> >>> hacked-together code was probably making it slower as well. Not sure
> >>> if there are faster YAML libraries out there either.
> >>>
> >>> We're not actually using it, since it was just a small proof of
> >>> concept type of project, but is this anything people might be
> >>> interested in?
> >>>
> >>
> >> Out of simple preference I would love to see a YAML request handler
> >> just because I like the YAML format. If it's also faster than XML,
> >> then all the better.
> >>
> >> Cheers
> >> Alec
> >>
> >
> > --
> > --Noble Paul
>
> --
> Grant Ingersoll
> http://www.lucenebootcamp.com
> Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ

--
--Noble Paul
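
For a feel of the size difference being discussed, here is a small stand-alone sketch that encodes the same document twice and compares byte counts. Both encodings are invented for illustration: toXml only loosely resembles Solr's XML (and does no escaping), and toBinary is a hypothetical length-prefixed layout written with java.io.DataOutputStream, not the format of any existing or proposed patch. It measures size only; parsing speed, as noted in the thread, would still need profiling.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class WireFormatSizeSketch {

    // XML-like rendering of one document (illustrative only, no escaping).
    static byte[] toXml(Map<String, String> doc) {
        StringBuilder sb = new StringBuilder("<doc>");
        for (Map.Entry<String, String> e : doc.entrySet()) {
            sb.append("<field name=\"").append(e.getKey()).append("\">")
              .append(e.getValue()).append("</field>");
        }
        return sb.append("</doc>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical compact encoding: a field count followed by
    // length-prefixed name/value pairs.
    static byte[] toBinary(Map<String, String> doc) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(doc.size());
            for (Map.Entry<String, String> e : doc.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up sample document.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("name", "Solr in a nutshell");
        System.out.println("xml bytes:    " + toXml(doc).length);
        System.out.println("binary bytes: " + toBinary(doc).length);
    }
}
```

The binary form drops the repeated tag text, so for small fields most of the XML bytes are markup. Whether that saving matters in practice depends on document size and on how much time is actually spent parsing.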