Re: Issue on Facet field and exact match
On Mon, Sep 14, 2009 at 10:49 AM, dharhsana wrote: > > > This is my code where I add fields for blog details to Solr: > > SolrInputDocument solrInputDocument = new SolrInputDocument(); > solrInputDocument.addField("blogTitle","$Never Fails$"); > solrInputDocument.addField("blogId","$Never Fails$"); > solrInputDocument.addField("userId",1); > > This is my code to add fields for post details to Solr: > > solrInputDocument.addField("blogId","$Never Fails$"); > solrInputDocument.addField("postId","$Never Fails post$"); > solrInputDocument.addField("postTitle","$Never Fails post$"); > solrInputDocument.addField("postMessage","$Never Fails post message$"); > > While I am querying it from Solr, this is my code: > > SolrQuery queryOfMyBlog = new SolrQuery("blogId_exact:Never Fails"); > queryOfMyBlog.setFacet(true); > queryOfMyBlog.addFacetField("blogTitle_exact"); > queryOfMyBlog.addFacetField("userId_exact"); > queryOfMyBlog.addFacetField("blogId_exact"); > queryOfMyBlog.setFacetMinCount(1); > queryOfMyBlog.setIncludeScore(true); > > You are indexing "$Never Fails$" into the field, so you should search for the same. You don't need to add $ for exact searches on a string-type field, so you can just index "Never Fails". Also, to make it an exact phrase search, enclose the query in quotes, e.g. blogId_exact:"Never Fails" -- Regards, Shalin Shekhar Mangar.
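A minimal SolrJ sketch of the fix Shalin describes, assuming blogId_exact is a string-typed copy of blogId as the thread implies:

    // index without the '$' markers -- a string field already matches exactly
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("blogId", "Never Fails");

    // quote the value so the whole phrase must match, not just the token "Never"
    SolrQuery query = new SolrQuery("blogId_exact:\"Never Fails\"");
    query.setFacet(true);
    query.addFacetField("blogTitle_exact");
    query.setFacetMinCount(1);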
Re: [DIH] Multiple repeat XPath stmts
As I said, copying is not an option. That will break everything else. On Sep 14, 2009, at 1:07 AM, Noble Paul നോബിള് नोब्ळ् wrote: The XPathRecordReader has a limit of one mapping per XPath, so copying is the best solution. On Mon, Sep 14, 2009 at 2:54 AM, Fergus McMenemie wrote: I'm trying to import several RSS feeds using DIH and running into a bit of a problem. Some feeds define a GUID value that I map to my Solr ID, while others don't. I also have a link field which I fill in with the RSS link field. For the feeds that don't have the GUID value set, I want to use the link field as the id. However, if I define the same XPath twice, but map it to two diff. columns, I don't get the id value set. For instance, I want to do: (schema.xml and DIH config snipped) Because I am consolidating multiple fields, I'm not able to do copyFields, unless of course I wanted to implement conditional copy fields (only copy if the field is not defined), which I would rather not. How do I solve this? How about the following (config snipped; a reconstruction appears after this message)? The TemplateTransformer does nothing if its source expression is null. So the first transform assigns the fallback value to id; this is overwritten by the GUID if it is defined. You can now sort of do if-then-else using a combination of template and regex transformers. Adding a bit of maths to the transformers and I think we will have a Turing-complete language :-) fergus. Thanks, Grant -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer === -- - Noble Paul | Principal Engineer| AOL | http://aol.com -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
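A rough sketch of the fallback trick Fergus describes, with made-up entity and field names (his actual config did not survive in the archive); it relies on TemplateTransformer skipping any field whose source variable is null, so the declaration order supplies the if-then-else:

    <entity name="feed" processor="XPathEntityProcessor"
            url="http://example.com/rss" forEach="/rss/channel/item"
            transformer="TemplateTransformer">
      <field column="link" xpath="/rss/channel/item/link"/>
      <field column="guid" xpath="/rss/channel/item/guid"/>
      <!-- fallback: id starts out as the link -->
      <field column="id" template="${feed.link}"/>
      <!-- overwritten only when the feed actually defines a guid -->
      <field column="id" template="${feed.guid}"/>
    </entity>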
Re: [DIH] Multiple repeat XPath stmts
If you wish to use conditional copy you can use a RegexTransformer (sketched below); this means that if guid != null, 'id' will be set to guid. On Mon, Sep 14, 2009 at 4:16 PM, Grant Ingersoll wrote: > As I said, copying is not an option. That will break everything else. > > On Sep 14, 2009, at 1:07 AM, Noble Paul നോബിള് नोब्ळ् wrote: > >> The XPathRecordReader has a limit of one mapping per XPath, so copying is >> the best solution >> >> On Mon, Sep 14, 2009 at 2:54 AM, Fergus McMenemie >> wrote: I'm trying to import several RSS feeds using DIH and running into a bit of a problem. Some feeds define a GUID value that I map to my Solr ID, while others don't. I also have a link field which I fill in with the RSS link field. For the feeds that don't have the GUID value set, I want to use the link field as the id. However, if I define the same XPath twice, but map it to two diff. columns, I don't get the id value set. For instance, I want to do: schema.xml (field definition mangled; only required="true" survives) DIH config: (snipped) Because I am consolidating multiple fields, I'm not able to do copyFields, unless of course I wanted to implement conditional copy fields (only copy if the field is not defined), which I would rather not. How do I solve this? >>> >>> How about the following (config snipped)? >>> >>> The TemplateTransformer does nothing if its source expression is null. >>> So the first transform assigns the fallback value to id; this is >>> overwritten by the GUID if it is defined. >>> >>> You can now sort of do if-then-else using a combination of template >>> and regex transformers. Adding a bit of maths to the transformers and >>> I think we will have a Turing-complete language :-) >>> >>> fergus. >>> Thanks, Grant >>> >>> -- >>> >>> === >>> Fergus McMenemie Email:fer...@twig.me.uk >>> Techmore Ltd Phone:(UK) 07721 376021 >>> >>> Unix/Mac/Intranets Analyst Programmer >>> === >>> >> >> >> >> -- >> - >> Noble Paul | Principal Engineer| AOL | http://aol.com > > -- > Grant Ingersoll > http://www.lucidimagination.com/ > > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using > Solr/Lucene: > http://www.lucidimagination.com/search > > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
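Noble's RegexTransformer variant, again as a hedged sketch under the same assumed names; RegexTransformer with sourceColName leaves the target column untouched when the source column is null:

    <entity name="feed" processor="XPathEntityProcessor"
            url="http://example.com/rss" forEach="/rss/channel/item"
            transformer="TemplateTransformer,RegexTransformer">
      <field column="link" xpath="/rss/channel/item/link"/>
      <field column="guid" xpath="/rss/channel/item/guid"/>
      <field column="id" template="${feed.link}"/>
      <!-- copies guid over id only when a guid was actually extracted -->
      <field column="id" regex="(.*)" sourceColName="guid"/>
    </entity>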
Re: Seeking help setting up solr in eclipse
I'm not familiar w/ Eclipse, but do you need to set solr.solr.home? Perhaps http://wiki.apache.org/solr/SolrTomcat can help too (see the sketch after this message). On Sep 13, 2009, at 7:12 PM, Markus Fischer wrote: Hi, I'd like to set up Eclipse to run Solr (in Tomcat for example), but I'm struggling with the issue that I can't get index.jsp and other files to be properly executed, for debugging and working on a plugin. I've checked out Solr via the Subclipse plugin and created a Dynamic Web Project. It seems that I have to know in advance which directories contain the proper web files. Since I can't find a definitive UI to change that afterwards, I modified .settings/org.eclipse.wst.common.component by hand, but I can't get it to work. When I open solr/src/webapp/web/index.jsp via Run as/Run on Server, Tomcat gets started and the browser window opens the URL http://localhost:8080/solr/index.jsp which only gives me a HTTP Status 404 - /solr/index.jsp. That's straight to the point for me, but I'm not sure where to fix this. My org.eclipse.wst.common.component looks like this: (XML snipped) I see that Tomcat gets started with these values (stripped path to workspace): /usr/lib/jvm/java-6-sun-1.6.0.15/bin/java -Dcatalina.base=/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0 -Dcatalina.home=/apache-tomcat-6.0.20 -Dwtp.deploy=/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/wtpwebapps -Djava.endorsed.dirs=/apache-tomcat-6.0.20/endorsed -Dfile.encoding=UTF-8 -classpath /apache-tomcat-6.0.20/bin/bootstrap.jar:/usr/lib/jvm/java-6-sun-1.6.0.15/lib/tools.jar org.apache.catalina.startup.Bootstrap start The configuration files in "/workspace/Servers/Tomcat v6.0 Server at localhost-config", e.g. server.xml, contain: (snipped) I see files copied, e.g. /workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/wtpwebapps/solr/WEB-INF/classes/index.jsp I'm bumping against a wall currently; I can't see the wood for the trees anymore... thanks for any help, - Markus -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
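If solr.solr.home is indeed the missing piece, the usual Tomcat wiring (a sketch; the paths are placeholders) is either a JVM argument added to the server launch configuration:

    -Dsolr.solr.home=/path/to/solr/home

or a context fragment along the lines the SolrTomcat wiki shows:

    <Context docBase="/path/to/solr.war" debug="0" crossContext="true">
      <Environment name="solr/home" type="java.lang.String"
                   value="/path/to/solr/home" override="true"/>
    </Context>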
Questions on copyField
Hello, I have a few questions regarding the copyField directive in schema.xml. 1. Does the destination field store a reference or the actual data? If I have something like the directives sketched after this message, will the values in the 'name' field get copied into the 'text' field, or will the 'text' field only store a reference to the 'name' field? To put it more simply, if I later delete the 'name' field from the index, will I lose the corresponding data in the 'text' field? 2. Is there any inbuilt API which I can use to do the copyField action programmatically? 3. Can I do a copyField from the schema as well as programmatically for the same destination field? Suppose I want the 'text' field to contain values for name, age and location. In my index only 'name' and 'age' are defined as fields. So I can add directives like the ones sketched below. The location, however, I want to add to the 'text' field programmatically. I don't want to store the location as a separate field in the index. Can I do this? Thank you. Regards Rahul
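The directives Rahul refers to were stripped by the archive; the standard form, using his field names, would be:

    <copyField source="name" dest="text"/>
    <copyField source="age" dest="text"/>

For what it's worth, copyField duplicates the actual value at index time, so the destination holds data rather than a reference. The programmatic equivalent is simply adding the extra value to the destination field yourself when building the document, e.g. (a sketch, with location standing in for whatever value he computes):

    solrInputDocument.addField("text", location);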
Searching for the '+' character
Hi all, I need some help with a curious problem I can't find a solution for. I am somewhat of a newbie with the various analyzers and handlers and how they work together, so I'm looking for advice on how to proceed with my issue. I have content with text like 'product+' which has been indexed as text. I need to search for the character '+', but try as I might I can't do this. From the docs it should just be a matter of escaping: http://lucene.apache.org/java/2_4_1/queryparsersyntax.html#Escaping%20Special%20Characters So queries like: http://localhost:8983/solr/select/?q=\+&debugQuery=true or http://localhost:8983/solr/select/?q=\%2B&debugQuery=true should do the trick, but they don't. I get: http://pastie.org/616055 and http://pastie.org/616052, respectively. Only with the + URL-encoded does it appear in the output, but no results are returned. I believe that the + is being stripped somehow but I'm not sure where exactly to look. I included the debug info from the query but I'm not sure if the output is helpful. Does anyone have ideas on this issue, and how I should try to proceed? Many thanks, Paul
Solr results filtered on MoreLikeThis
Hi, I hope someone can help me in my search for finding the right solution for my search application. I hope I'm not repeating a question that has been asked before, but I could not find a similar question out there. So that is why I'm asking it here... Here goes: My index contains documents which could also contain duplicates based on content. The sources of these documents are from various locations on the internet. In some cases these documents look the same and in some cases they are the same. What I am trying to achieve is a result with matching documents, but where the results are unique based on the MoreLikeThis. So I want to provide matching documents only in the details, not in the results. The results should state the number of MoreLikeThis matches. So if 3 documents match and another 4 documents match, I only want 2 results like this: - document1 (3 similar documents) - document2 (4 similar documents) And when users click further I will let them see all the similar documents, but not in the search result. I have used the MoreLikeThis via the standard query, not the MoreLikeThisHandler. And I can see that the results are separate from the "morelikethis" element in the result. I would like to have the MoreLikeThis results be filtered on the actual result list. Sorry if I'm repeating myself, but I'm just trying to explain it as best as I can. Regards, Marcel -- View this message in context: http://www.nabble.com/Solr-results-filtered-on-MoreLikeThis-tp25434881p25434881.html Sent from the Solr - User mailing list archive at Nabble.com.
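For reference, the "MoreLikeThis via the standard query" usage Marcel mentions is driven by request parameters along these lines (a sketch; the field name is an example):

    http://localhost:8983/solr/select?q=...&mlt=true&mlt.fl=content&mlt.count=4&mlt.mintf=1

Each hit then carries its own list of similar documents in the separate moreLikeThis section of the response, which is exactly the part he wants folded into the main result list.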
Re: "standard" requestHandler components
I just copied this information to the wiki at http://wiki.apache.org/solr/SolrRequestHandler -Peter On Fri, Sep 11, 2009 at 7:43 PM, Jay Hill wrote: > RequestHandlers are configured in solrconfig.xml. If no components are > explicitly declared in the request handler config then the defaults are used. > They are: > - QueryComponent > - FacetComponent > - MoreLikeThisComponent > - HighlightComponent > - StatsComponent > - DebugComponent > > If you want a custom list of components (either omitting defaults > or adding custom ones) you can specify the components for a handler directly > (the XML for these examples was stripped; a reconstruction appears after this message): > > query > facet > mlt > highlight > debug > someothercomponent > > > You can add components before or after the main ones like this: > > mycomponent > > > > myothercomponent > > > and that's how the spell check component can be added: > > spellcheck > > > Note that a component (except the defaults) must be configured in > solrconfig.xml with the name used in the str element as well. > > Have a look at the solrconfig.xml in the example directory > (".../example/solr/conf/") for examples of how to set up the spellcheck > component, and of how the request handlers are configured. > > -Jay > http://www.lucidimagination.com > > > On Fri, Sep 11, 2009 at 3:04 PM, michael8 wrote: >> >> Hi, >> >> I have a newbie question about the 'standard' requestHandler in >> solrconfig.xml. What I'd like to know is where the config information for >> this requestHandler is kept. When I go to http://localhost:8983/solr/admin, I >> see the following info, but am curious where the supposedly 'chained' >> components (e.g. QueryComponent, FacetComponent, MoreLikeThisComponent) are >> configured for this requestHandler. I see timing and process debug output >> from these components with "debugQuery=true", so somewhere these components >> must have been configured for this 'standard' requestHandler. >> >> name: standard >> class: org.apache.solr.handler.component.SearchHandler >> version: $Revision: 686274 $ >> description: Search using components: >> >> org.apache.solr.handler.component.QueryComponent,org.apache.solr.handler.component.FacetComponent,org.apache.solr.handler.component.MoreLikeThisComponent,org.apache.solr.handler.component.HighlightComponent,org.apache.solr.handler.component.DebugComponent, >> stats: handlerStart : 1252703405335 >> requests : 3 >> errors : 0 >> timeouts : 0 >> totalTime : 201 >> avgTimePerRequest : 67.0 >> avgRequestsPerSecond : 0.015179728 >> >> >> What I'd like to do with this understanding is to properly integrate the >> spellcheck component into the standard requestHandler as suggested in a Solr >> spellcheck example. >> >> Thanks for any info in advance. >> Michael >> -- >> View this message in context: >> http://www.nabble.com/%22standard%22-requestHandler-components-tp25409075p25409075.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> > -- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com
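The XML in Jay's examples was stripped by the archive; from the surviving component names, the solrconfig.xml he is describing would look roughly like this (handler names are assumed):

    <requestHandler name="/mysearch" class="solr.SearchHandler">
      <arr name="components">
        <str>query</str>
        <str>facet</str>
        <str>mlt</str>
        <str>highlight</str>
        <str>debug</str>
        <str>someothercomponent</str>
      </arr>
    </requestHandler>

    <!-- or keep the defaults and bolt components on either end: -->
    <requestHandler name="/mysearch2" class="solr.SearchHandler">
      <arr name="first-components">
        <str>mycomponent</str>
      </arr>
      <arr name="last-components">
        <str>myothercomponent</str>
      </arr>
    </requestHandler>

Spellcheck is added the same way, as <str>spellcheck</str> inside a last-components array, with a matching searchComponent named "spellcheck" declared elsewhere in solrconfig.xml.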
Re: stopfilterFactory isn't removing field name
Thanks, I'll see if I can reproduce... -Yonik http://www.lucidimagination.com On Mon, Sep 14, 2009 at 2:10 AM, mike anderson wrote: > Yeah.. that was weird. Removing the line "forever,for ever" from my synonyms > file fixed the problem. In fact, I was having the same problem for every > double word like that. I decided I didn't really need the synonym filter for > that field so I just took it out, but I'd really like to know what the > problem is. > -mike > > On Mon, Sep 14, 2009 at 1:10 AM, Yonik Seeley > wrote: >> >> That's pretty strange... perhaps something to do with your synonyms >> file mapping "for" to a zero-length token? >> >> -Yonik >> http://www.lucidimagination.com >> >> On Mon, Sep 14, 2009 at 12:13 AM, mike anderson >> wrote: >> > I'm kind of stumped by this one.. is it something obvious? >> > I'm running the latest trunk. In some cases the stopFilterFactory isn't >> > removing the field name. >> > >> > Thanks in advance, >> > >> > -mike >> > >> > From debugQuery (both words are in the stopwords file; the XML tags were stripped by the archive, leaving just the values): >> > >> > http://localhost:8983/solr/select?q=citations:for&debugQuery=true >> > >> > rawquerystring: citations:for >> > querystring: citations:for >> > parsedquery: citations: >> > parsedquery_toString: citations: >> > >> > http://localhost:8983/solr/select?q=citations:the&debugQuery=true >> > >> > rawquerystring: citations:the >> > querystring: citations:the >> > parsedquery: (empty) >> > parsedquery_toString: (empty) >> > >> > schema analyzer for this field (tags mangled; a reconstruction appears after this message): identical index and query chains with positionIncrementGap="100", a SynonymFilterFactory with synonyms="substitutions.txt" ignoreCase="true" expand="false", and a StopFilterFactory with words="citationstopwords.txt". >> >
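From the attribute fragments that survive, mike's field type was presumably shaped roughly like this (the tokenizer did not survive the archive, so WhitespaceTokenizerFactory is a guess, and the type name is made up):

    <fieldType name="citationsText" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="substitutions.txt"
                ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" words="citationstopwords.txt"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="substitutions.txt"
                ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" words="citationstopwords.txt"/>
      </analyzer>
    </fieldType>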
Re: Searching for the '+' character
> Hi all, > > I need some help with a curious problem I can't find a > solution for. I am somewhat of a newbie with the various > analyzers and handlers and how they work together, so I'm > looking for advice on how to proceed with my issue. > > I have content with text like 'product+' which has been > indexed as text. I need to search for the character '+', but > try as I might I can't do this. > > From the docs it should just be a matter of escaping: > I believe that the + is being stripped somehow but I'm not > sure where exactly to look. I think your analyzer is eating up '+'; which tokenizer are you using in it? Do you want to return documents containing 'product+' by searching '+'?
Configuring "slaves" for a "master" backup without restarting
Hi, A question about scalability. Let's imagine the following architecture based on a Master/Slave schema: - A "master" for indexing, called Master 1 - A backup of Master 1 (called Master 2) - Several "slaves" for search, linked to Master 1 Can I configure the "slaves" to be automatically linked to Master 2 if Master 1 fails, without restarting the JVMs? Thanks in advance. Nourredine.
Re: shards and facet_count
Shalin Shekhar Mangar wrote: On Fri, Sep 11, 2009 at 2:35 AM, Paul Rosen wrote: Hi again, I've mostly gotten the multicore working except for one detail. (I'm using solr 1.3 and solr-ruby 0.0.6 in a rails project.) I've done a few queries and I appear to be able to get hits from either core. (yeah!) I'm forming my request like this: req = Solr::Request::Standard.new( :start => start, :rows => max, :sort => sort_param, :query => query, :filter_queries => filter_queries, :field_list => @field_list, :facets => {:fields => @facet_fields, :mincount => 1, :missing => true, :limit => -1}, :highlighting => {:field_list => ['text'], :fragment_size => 600}, :shards => @cores) If I leave ":shards => @cores" out, then the response includes: 'facet_counts' => { 'facet_dates' => {}, 'facet_queries' => {}, 'facet_fields' => { 'myfacet' => [ etc...], etc... } which is what I expect. If I add the ":shards => @cores" back in (so that I'm doing the exact request above), I get: 'facet_counts' => { 'facet_dates' => {}, 'facet_queries' => {}, 'facet_fields' => {} so I've lost my facet information. Why would it correctly find my documents, but not report the facet info? I'm not a ruby guy but the response format in both cases is exactly the same, so I don't think there is any problem with the ruby client parsing. Can you check the Solr logs to see if there were any exceptions when you sent the shards parameter? I don't see any exceptions. The Solr activity is pretty different for the two cases. Without the shards, it makes one call that looks something like this (I elided the id and field parameters for clarity): Sep 14, 2009 9:32:09 AM org.apache.solr.core.SolrCore execute INFO: [resources] webapp=/solr path=/select params={facet.limit=-1&wt=ruby&rows=30&start=0&facet=true&facet.mincount=1&q=(rossetti)&fl=archive,...,license&qt=standard&facet.missing=true&hl.fl=text&facet.field=genre&facet.field=archive&facet.field=freeculture&hl.fragsize=600&hl=true} hits=27 status=0 QTime=6 Note that "facet=true".
With the shards, it has five lines for the single call that I make: Sep 14, 2009 9:37:18 AM org.apache.solr.core.SolrCore execute INFO: [exhibits] webapp=/solr path=/select params={wt=javabin&rows=30&start=0&facet=true&fl=uri,score&q=(rossetti)&version=2.2&isShard=true&facet.missing=true&hl.fl=text&fsv=true&hl.fragsize=600&facet.field=genre&facet.field=archive&facet.field=freeculture&hl=false} hits=6 status=0 QTime=0 Sep 14, 2009 9:37:18 AM org.apache.solr.core.SolrCore execute INFO: [resources] webapp=/solr path=/select params={wt=javabin&rows=30&start=0&facet=true&fl=uri,score&q=(rossetti)&version=2.2&isShard=true&facet.missing=true&hl.fl=text&fsv=true&hl.fragsize=600&facet.field=genre&facet.field=archive&facet.field=freeculture&hl=false} hits=27 status=0 QTime=3 Sep 14, 2009 9:37:18 AM org.apache.solr.core.SolrCore execute INFO: [resources] webapp=/solr path=/select params={facet.limit=-1&wt=javabin&rows=30&start=0&ids=...,...&facet=false&facet.mincount=1&q=(rossetti)&fl=archive,...,uri&version=2.2&facet.missing=true&isShard=true&hl.fl=text&facet.field=genre&facet.field=archive&facet.field=freeculture&hl.fragsize=600&hl=true} status=0 QTime=35 Sep 14, 2009 9:37:18 AM org.apache.solr.core.SolrCore execute INFO: [exhibits] webapp=/solr path=/select params={facet.limit=-1&wt=javabin&rows=30&start=0&ids=...,...&facet=false&facet.mincount=1&q=(rossetti)&fl=archive,...,uri&version=2.2&facet.missing=true&isShard=true&hl.fl=text&facet.field=genre&facet.field=archive&facet.field=freeculture&hl.fragsize=600&hl=true} status=0 QTime=41 Sep 14, 2009 9:37:18 AM org.apache.solr.core.SolrCore execute INFO: [resources] webapp=/solr path=/select params={facet.limit=-1&wt=ruby&rows=30&start=0&facet=true&facet.mincount=1&q=(rossetti)&fl=archive,...,license&qt=standard&facet.missing=true&hl.fl=text&facet.field=genre&facet.field=archive&facet.field=freeculture&hl.fragsize=600&hl=true&shards=localhost:8983/solr/resources,localhost:8983/solr/exhibits} status=0 QTime=57 Note that on the third and fourth lines, "facet=false". Is that significant? Thanks, Paul
Re: Searching for the '+' character
Hi Ahmet, I believe it's the WhitespaceTokenizerFactory, but I may be wrong. I've pasted the schema.xml into http://pastie.org/616162 On 14 Sep 2009, at 14:29, AHMET ARSLAN wrote: Hi all, I need some help with a curious problem I can't find a solution for. I am somewhat of a newbie with the various analyzers and handlers and how they work together, so I'm looking for advice on how to proceed with my issue. I have content with text like 'product+' which has been indexed as text. I need to search for the character '+', but try as I might I can't do this. From the docs it should just be a matter of escaping: I believe that the + is being stripped somehow but I'm not sure where exactly to look. I think your analyzer is eating up '+'; which tokenizer are you using in it? Do you want to return documents containing 'product+' by searching '+'? Best regards, Paul Forsyth mail: p...@ez.no skype: paulforsyth
Dataimport MySQLNonTransientConnectionException: No operations allowed after connection closed
I know that my issue is related to http://www.nabble.com/dataimporthandler-and-multiple-delta-import-td19160129.html#a19160129 and https://issues.apache.org/jira/browse/SOLR-728 but my case is quite different. As I understand it, the patch at https://issues.apache.org/jira/browse/SOLR-728 prevents concurrent execution of the import operation but does NOT put the command in a queue. I have only a few records to index. When I run a full reindex it works very fast. But when I try to rerun this even after a couple of seconds, I am getting: Caused by: com.mysql.jdbc.exceptions.MySQLNonTransientConnectionException: No operations allowed after connection closed. At this time, when I check the status, it says that the status is idle and everything was indexed successfully. I can only rerun the reindex without an exception after 10 seconds. It does not work for me! If I apply the patch from https://issues.apache.org/jira/browse/SOLR-728 I will be unable to reindex in the next 10 seconds as well. Any suggestions? -- View this message in context: http://www.nabble.com/Dataimport-MySQLNonTransientConnectionException%3A-No-operations-allowed-after-connection-closed-tp25436605p25436605.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Configuring "slaves" for a "master" backup without restarting
You can put both Master 1 and Master 2 behind a VIP. If Master 1 goes down, make the VIP point to Master 2. On Mon, Sep 14, 2009 at 7:11 PM, nourredine khadri wrote: > Hi, > > A question about scalability. > > Let's imagine the following architecture based on a Master/Slave schema: > > - A "master" for indexing called Master 1 > - A backup of Master 1 (called Master 2) > - Several "slaves" for search linked to Master 1 > > Can I configure the "slaves" to be automatically linked to Master 2 if Master > 1 fails without restarting the JVMs? > > Thanks in advance. > > Nourredine. > > > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Searching for the '+' character
> Hi Ahmet, > > I believe it's the WhitespaceTokenizerFactory, but I may be > wrong. > > I've pasted the schema.xml into http://pastie.org/616162 > I looked at your field type named text. WordDelimiterFilterFactory is eating up '+'. You can use the .../solr/admin/analysis.jsp tool to see the behaviour of each tokenizer/tokenfilter for a particular input. But more importantly: do you want to return documents containing 'product+' by searching '+'? You said you need to search for the character '+'. What is that query supposed to return?
Re: Single Core or Multiple Core?
The problem is that, if we use multicore, it forces you to use a core name. This is inconvenient. We must get rid of this restriction before we move single-core to multicore. On Sat, Sep 12, 2009 at 3:14 PM, Uri Boness wrote: > +1 > Can you add a JIRA issue for that so we can vote for it? > > Chris Hostetter wrote: >> >> : > For the record: even if you're only going to have one SolrCore, using the >> : > multicore support (ie: having a solr.xml file) might prove handy from a >> : > maintence standpoint ... the ability to configure new "on deck cores" with >> ... >> : Yeah, it is a shame that single-core deployments (no solr.xml) does not have >> : a way to enable CoreAdminHandler. This is something we should definitely >> : look at in Solr 1.5. >> >> I think the most straight forward starting point is to switch how we >> structure the examples so that all of the examples uses a solr.xml with >> multicore support. >> >> Then we can move forward on deprecating the specification of "Solr Home" >> using JNDI/systemvars and switch to having the location of the solr.xml be >> the one master config option with everything else coming after that. >> >> >> >> -Hoss >> >> >> > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Searching for the '+' character
--- On Mon, 9/14/09, Paul Forsyth wrote: > From: Paul Forsyth > Subject: Re: Searching for the '+' character > To: solr-user@lucene.apache.org > Date: Monday, September 14, 2009, 5:55 PM > With words like 'product+' I'd expect > a search for '+' to return results like any other character > or word, so '+' would be found within 'product+' or similar > text. > > I've tried removing the worddelimiter from the query > analyzer, restarting and reindexing but I get the same > result. Nothing is found. I assume one of the filters could > be adjusted to keep the '+'. > > Weird thing is that I tried to remove all filters from the > analyzer and I get the same result. > > Paul When you remove all filters '+' is kept, but still '+' won't match 'product+', because you want to search inside a token. If the + sign is always at the end of your text, and you want to search only the last character of your text, EdgeNGramFilterFactory can do that, with the settings side="back" maxGramSize="1" minGramSize="1". The fieldType below (stripped by the archive; a reconstruction appears after this message) will match '+' to 'product+'. But this time 'product+' will be reduced to only '+'. You won't be able to search it in other ways, for example product*. Along with the last character, if you want to keep the original word itself you can set maxGramSize to 512. By doing this the token 'product+' will produce 8 tokens (and the queries product* or product+ will return it): + word t+ word ct+ word uct+ word duct+ word oduct+ word roduct+ word product+ word If the + sign can be anywhere inside the text you can use NGramTokenFilter. Hope this helps.
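Ahmet's fieldType was stripped by the archive; judging from the attribute fragments that survived in quoted copies of this mail (synonyms="synonyms.txt" ignoreCase="true" expand="true", language="English", side="back" maxGramSize="1" minGramSize="1"), it was roughly the following — a reconstruction, not his exact config:

    <fieldType name="text_plus" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English"/>
        <filter class="solr.EdgeNGramFilterFactory" side="back"
                maxGramSize="1" minGramSize="1"/>
      </analyzer>
    </fieldType>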
Re: Searching for the '+' character
With words like 'product+' I'd expect a search for '+' to return results like any other character or word, so '+' would be found within 'product+' or similar text. I've tried removing the worddelimiter from the query analyzer, restarting and reindexing, but I get the same result. Nothing is found. I assume one of the filters could be adjusted to keep the '+'. Weird thing is that I tried to remove all filters from the analyzer and I get the same result. Paul On 14 Sep 2009, at 15:17, AHMET ARSLAN wrote: Hi Ahmet, I believe it's the WhitespaceTokenizerFactory, but I may be wrong. I've pasted the schema.xml into http://pastie.org/616162 I looked at your field type named text. WordDelimiterFilterFactory is eating up '+'. You can use the .../solr/admin/analysis.jsp tool to see the behaviour of each tokenizer/tokenfilter for a particular input. But more importantly: do you want to return documents containing 'product+' by searching '+'? You said you need to search for the character '+'. What is that query supposed to return? Best regards, Paul Forsyth mail: p...@ez.no skype: paulforsyth
Re: Single Core or Multiple Core?
I concur with Uri, but I would also add that it might be helpful to specify a default core somewhere in the configuration file, so that if no core is specified, the default one is implicitly selected (see the sketch after this message). I am not sure if this feature is available yet. What do you think? On Mon, Sep 14, 2009 at 10:46 AM, Uri Boness wrote: > Is it really a problem? I mean, as I see it, solr to cores is what RDBMS is > to databases. When you connect to a database you also need to specify the > database name. > > Cheers, > Uri > > > On Sep 14, 2009, at 16:27, Noble Paul നോബിള് नोब्ळ् < > noble.p...@corp.aol.com> wrote: > > The problem is that, if we use multicore, it forces you to use a core >> name. This is inconvenient. We must get rid of this restriction before >> we move single-core to multicore. >> >> >> >> On Sat, Sep 12, 2009 at 3:14 PM, Uri Boness wrote: >> >>> +1 >>> Can you add a JIRA issue for that so we can vote for it? >>> >>> Chris Hostetter wrote: >>> : > For the record: even if you're only going to have one SolrCore, using the : > multicore support (ie: having a solr.xml file) might prove handy from a : > maintence standpoint ... the ability to configure new "on deck cores" with ... : Yeah, it is a shame that single-core deployments (no solr.xml) does not have : a way to enable CoreAdminHandler. This is something we should definitely : look at in Solr 1.5. I think the most straight forward starting point is to switch how we structure the examples so that all of the examples uses a solr.xml with multicore support. Then we can move forward on deprecating the specification of "Solr Home" using JNDI/systemvars and switch to having the location of the solr.xml be the one master config option with everything else coming after that. -Hoss >>> >> >> >> -- >> - >> Noble Paul | Principal Engineer| AOL | http://aol.com >> > -- "Good Enough" is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once.
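A sketch of the default-core idea being floated; later Solr releases expose exactly this as a defaultCoreName attribute on the <cores> element in solr.xml (core names here are examples):

    <solr persistent="true">
      <cores adminPath="/admin/cores" defaultCoreName="core0">
        <core name="core0" instanceDir="core0"/>
        <core name="core1" instanceDir="core1"/>
      </cores>
    </solr>

With this in place, requests to /solr/select go to core0, while /solr/core1/select still addresses core1 explicitly.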
Re: Single Core or Multiple Core?
On Mon, Sep 14, 2009 at 8:16 PM, Uri Boness wrote: > Is it really a problem? I mean, as i see it, solr to cores is what RDBMS is > to databases. When you connect to a database you also need to specify the > database name. > > The problem is compatibility. If we make solr.xml compulsory then we only force people to do a configuration change. But if we make a core name mandatory, then we force them to change their applications (or the applications' configurations). It is better if we can avoid that. Besides, if there's only one core, why need a name? -- Regards, Shalin Shekhar Mangar.
Re: Searching for the '+' character
Thanks Ahmet, That's excellent, thanks :) I may have to increase the gram size to take into account other possible uses, but I can now read around these filters to make the adjustments. With regard to WordDelimiterFilterFactory: is there a way to place a delimiter on this filter to still get most of its functionality without it absorbing the + signs? Will I lose a lot of 'good' functionality by removing it? 'preserveOriginal' sounds promising and seems to work, but is it a good idea to use it? On 14 Sep 2009, at 16:16, AHMET ARSLAN wrote: >> --- On Mon, 9/14/09, Paul Forsyth wrote: >> From: Paul Forsyth >>> Subject: Re: Searching for the '+' character >>> To: solr-user@lucene.apache.org >>> Date: Monday, September 14, 2009, 5:55 PM >>> With words like 'product+' I'd expect >>> a search for '+' to return results like any other character >>> or word, so '+' would be found within 'product+' or similar >>> text. >>> >>> I've tried removing the worddelimiter from the query >>> analyzer, restarting and reindexing but I get the same >>> result. Nothing is found. I assume one of the filters could >>> be adjusted to keep the '+'. >>> >>> Weird thing is that I tried to remove all filters from the >>> analyzer and I get the same result. >>> >>> Paul >> When you remove all filters '+' is kept, but still '+' won't match 'product+', because you want to search inside a token. >> If the + sign is always at the end of your text, and you want to search only the last character of your text, EdgeNGramFilterFactory can do that, with the settings side="back" maxGramSize="1" minGramSize="1". >> The fieldType quoted earlier will match '+' to 'product+' (fieldType snipped; see the reconstruction earlier in the thread). >> But this time 'product+' will be reduced to only '+'. You won't be able to search it in other ways, for example product*. Along with the last character, if you want to keep the original word itself you can set maxGramSize to 512. By doing this the token 'product+' will produce 8 tokens (and the queries product* or product+ will return it): >> + word >> t+ word >> ct+ word >> uct+ word >> duct+ word >> oduct+ word >> roduct+ word >> product+ word >> If the + sign can be anywhere inside the text you can use NGramTokenFilter. Hope this helps. Best regards, Paul Forsyth mail: p...@ez.no skype: paulforsyth
Re: Searching for the '+' character
Paul Forsyth wrote: Hi Erick, In this specific case my client does have a new product with a '+' at the end. It's just one of those odd ones! Customers are expected to put + into the search box so I have to have results to show. I hear your concerns though. Originally I thought I would need to transform the + into something else, and do this back and forwards to get a match! Sorry for jumping into the discussion with my little knowledge - but I actually think transforming the '+' into something else in the index (something like 'pluzz' that has a low probability to appear as such in the regular input) is a good solution. You just have to do the same on the query side. You could have your own filter for that to put in the schema, or just do it "manually" at index and query time (a stock-filter sketch follows this message). Is that a possibility? Chantal Hopefully this will be a standard solr install, but with this tweak for escaped chars Paul On 14 Sep 2009, at 17:01, Erick Erickson wrote: Before you go too much further with this, I've just got to ask whether the use case for searching "product+" really serves your customers. If you mess around with analyzers to make things include the "+", what does that mean for "&"? "*"? "."? Any other weird character you can think of? Would it be a bad thing for "product" to match "product+" and vice versa? Would it be more or less confusing for your users to have "product" FAIL to match "product+"? Of course only you really know your problem space, but think carefully about this issue before you take on the work of making "product+" work, because it'll inevitably be way more work than you think. Imagine the bug reports when "product&" fails to match "product+", both of which fail to match "product". I'd also get a copy of Luke and look at the index to be sure what you *think* is in there is *actually* there. It'll also help you understand what analyzers do better. Don't forget that using different analyzers when indexing and querying will lead to...er..."interesting" results. Best Erick On Mon, Sep 14, 2009 at 11:38 AM, Paul Forsyth wrote: Thanks Ahmet, That's excellent, thanks :) I may have to increase the gram size to take into account other possible uses but I can now read around these filters to make the adjustments. With regard to WordDelimiterFilterFactory: is there a way to place a delimiter on this filter to still get most of its functionality without it absorbing the + signs? Will I lose a lot of 'good' functionality by removing it? 'preserveOriginal' sounds promising and seems to work but is it a good idea to use it? On 14 Sep 2009, at 16:16, AHMET ARSLAN wrote: --- On Mon, 9/14/09, Paul Forsyth wrote: From: Paul Forsyth Subject: Re: Searching for the '+' character To: solr-user@lucene.apache.org Date: Monday, September 14, 2009, 5:55 PM With words like 'product+' I'd expect a search for '+' to return results like any other character or word, so '+' would be found within 'product+' or similar text. I've tried removing the worddelimiter from the query analyzer, restarting and reindexing but I get the same result. Nothing is found. I assume one of the filters could be adjusted to keep the '+'. Weird thing is that I tried to remove all filters from the analyzer and I get the same result. Paul When you remove all filters '+' is kept, but still '+' won't match 'product+', because you want to search inside a token. If the + sign is always at the end of your text, and you want to search only the last character of your text, EdgeNGramFilterFactory can do that, with the settings side="back" maxGramSize="1" minGramSize="1". The fieldType quoted earlier will match '+' to 'product+' (fieldType snipped). But this time 'product+' will be reduced to only '+'. You won't be able to search it in other ways, for example product*. Along with the last character, if you want to keep the original word itself you can set maxGramSize to 512. By doing this the token 'product+' will produce 8 tokens (and the queries product* or product+ will return it): + word t+ word ct+ word uct+ word duct+ word oduct+ word roduct+ word product+ word If the + sign can be anywhere inside the text you can use NGramTokenFilter. Hope this helps. Best regards, Paul Forsyth mail: p...@ez.no skype: paulforsyth Best regards, Paul Forsyth mail: p...@ez.no skype: paulforsyth
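Chantal's substitution idea does not even require custom filter code: Solr's stock PatternReplaceFilterFactory can rewrite the token on both the index and query side. A sketch (the replacement token is an arbitrary choice):

    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\+" replacement="pluzz" replace="all"/>

Using the same analyzer chain for index and query keeps the mapping symmetric, which is the crux of her suggestion.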
Re: Searching for the '+' character
Hi Erick, In this specific case my client does have a new product with a '+' at the end. It's just one of those odd ones! Customers are expected to put + into the search box so I have to have results to show. I hear your concerns though. Originally I thought I would need to transform the + into something else, and do this back and forwards to get a match! Hopefully this will be a standard solr install, but with this tweak for escaped chars Paul On 14 Sep 2009, at 17:01, Erick Erickson wrote: Before you go too much further with this, I've just got to ask whether the use case for searching "product+" really serves your customers. If you mess around with analyzers to make things include the "+", what does that mean for "&"? "*"? "."? Any other weird character you can think of? Would it be a bad thing for "product" to match "product+" and vice versa? Would it be more or less confusing for your users to have "product" FAIL to match "product+"? Of course only you really know your problem space, but think carefully about this issue before you take on the work of making "product+" work, because it'll inevitably be way more work than you think. Imagine the bug reports when "product&" fails to match "product+", both of which fail to match "product". I'd also get a copy of Luke and look at the index to be sure what you *think* is in there is *actually* there. It'll also help you understand what analyzers do better. Don't forget that using different analyzers when indexing and querying will lead to...er..."interesting" results. Best Erick On Mon, Sep 14, 2009 at 11:38 AM, Paul Forsyth wrote: Thanks Ahmet, That's excellent, thanks :) I may have to increase the gram size to take into account other possible uses but I can now read around these filters to make the adjustments. With regard to WordDelimiterFilterFactory: is there a way to place a delimiter on this filter to still get most of its functionality without it absorbing the + signs? Will I lose a lot of 'good' functionality by removing it? 'preserveOriginal' sounds promising and seems to work but is it a good idea to use it? On 14 Sep 2009, at 16:16, AHMET ARSLAN wrote: --- On Mon, 9/14/09, Paul Forsyth wrote: From: Paul Forsyth Subject: Re: Searching for the '+' character To: solr-user@lucene.apache.org Date: Monday, September 14, 2009, 5:55 PM With words like 'product+' I'd expect a search for '+' to return results like any other character or word, so '+' would be found within 'product+' or similar text. I've tried removing the worddelimiter from the query analyzer, restarting and reindexing but I get the same result. Nothing is found. I assume one of the filters could be adjusted to keep the '+'. Weird thing is that I tried to remove all filters from the analyzer and I get the same result. Paul When you remove all filters '+' is kept, but still '+' won't match 'product+', because you want to search inside a token. If the + sign is always at the end of your text, and you want to search only the last character of your text, EdgeNGramFilterFactory can do that, with the settings side="back" maxGramSize="1" minGramSize="1". The fieldType quoted earlier will match '+' to 'product+' (fieldType snipped). But this time 'product+' will be reduced to only '+'. You won't be able to search it in other ways, for example product*. Along with the last character, if you want to keep the original word itself you can set maxGramSize to 512. By doing this the token 'product+' will produce 8 tokens (and the queries product* or product+ will return it): + word t+ word ct+ word uct+ word duct+ word oduct+ word roduct+ word product+ word If the + sign can be anywhere inside the text you can use NGramTokenFilter. Hope this helps. Best regards, Paul Forsyth mail: p...@ez.no skype: paulforsyth Best regards, Paul Forsyth mail: p...@ez.no skype: paulforsyth
Re: Dataimport MySQLNonTransientConnectionException: No operations allowed after connection closed
I am using 1.3. Do you suggest 1.4 from the developer trunk? I am concerned about whether it is stable. Is it safe to use in a big commerce app? Noble Paul നോബിള് नोब्ळ्-2 wrote: > > Which version of Solr are you using? Can you try with a recent one and > confirm this? > > On Mon, Sep 14, 2009 at 7:45 PM, palexv wrote: >> >> I know that my issue is related to >> http://www.nabble.com/dataimporthandler-and-multiple-delta-import-td19160129.html#a19160129 >> and https://issues.apache.org/jira/browse/SOLR-728 >> but my case is quite different. >> As I understand it, the patch at https://issues.apache.org/jira/browse/SOLR-728 >> prevents concurrent execution of the import operation but does NOT put the command >> in a queue. >> >> I have only a few records to index. When I run a full reindex it works very >> fast. But when I try to rerun this even after a couple of seconds, I am >> getting >> Caused by: >> com.mysql.jdbc.exceptions.MySQLNonTransientConnectionException: >> No operations allowed after connection closed. >> >> At this time, when I check the status, it says that the status is idle and >> everything was indexed successfully. >> I can only rerun the reindex without an exception after 10 seconds. >> It does not work for me! If I apply the patch from >> https://issues.apache.org/jira/browse/SOLR-728 I will be unable to reindex in >> the next 10 seconds as well. >> Any suggestions? >> -- >> View this message in context: >> http://www.nabble.com/Dataimport-MySQLNonTransientConnectionException%3A-No-operations-allowed-after-connection-closed-tp25436605p25436605.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> > > > > -- > - > Noble Paul | Principal Engineer| AOL | http://aol.com > > -- View this message in context: http://www.nabble.com/Dataimport-MySQLNonTransientConnectionException%3A-No-operations-allowed-after-connection-closed-tp25436605p25436948.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Searching for the '+' character
Before you go too much further with this, I've just got to ask whether the use case for searching "product+" really serves your customers. If you mess around with analyzers to make things include the "+", what does that mean for "&"? "*"? "."? Any other weird character you can think of? Would it be a bad thing for "product" to match "product+" and vice versa? Would it be more or less confusing for your users to have "product" FAIL to match "product+"? Of course only you really know your problem space, but think carefully about this issue before you take on the work of making "product+" work, because it'll inevitably be way more work than you think. Imagine the bug reports when "product&" fails to match "product+", both of which fail to match "product". I'd also get a copy of Luke and look at the index to be sure what you *think* is in there is *actually* there. It'll also help you understand what analyzers do better. Don't forget that using different analyzers when indexing and querying will lead to...er..."interesting" results. Best Erick On Mon, Sep 14, 2009 at 11:38 AM, Paul Forsyth wrote: > Thanks Ahmet, > > That's excellent, thanks :) I may have to increase the gram size to take into > account other possible uses, but I can now read around these filters to make > the adjustments. > > With regard to WordDelimiterFilterFactory: is there a way to place a > delimiter on this filter to still get most of its functionality without it > absorbing the + signs? Will I lose a lot of 'good' functionality by > removing it? 'preserveOriginal' sounds promising and seems to work but is it > a good idea to use it? > > > On 14 Sep 2009, at 16:16, AHMET ARSLAN wrote: > >> >> --- On Mon, 9/14/09, Paul Forsyth wrote: >> >> From: Paul Forsyth >>> Subject: Re: Searching for the '+' character >>> To: solr-user@lucene.apache.org >>> Date: Monday, September 14, 2009, 5:55 PM >>> With words like 'product+' I'd expect >>> a search for '+' to return results like any other character >>> or word, so '+' would be found within 'product+' or similar >>> text. >>> >>> I've tried removing the worddelimiter from the query >>> analyzer, restarting and reindexing but I get the same >>> result. Nothing is found. I assume one of the filters could >>> be adjusted to keep the '+'. >>> >>> Weird thing is that I tried to remove all filters from the >>> analyzer and I get the same result. >>> >>> Paul >> >> When you remove all filters '+' is kept, but still '+' won't match >> 'product+', because you want to search inside a token. >> >> If the + sign is always at the end of your text, and you want to search >> only the last character of your text, EdgeNGramFilterFactory can do that, >> with the settings side="back" maxGramSize="1" minGramSize="1". >> >> The fieldType quoted earlier will match '+' to 'product+' (fieldType snipped). >> >> But this time 'product+' will be reduced to only '+'. You won't be able to >> search it in other ways, for example product*. Along with the last character, if >> you want to keep the original word itself you can set maxGramSize to 512. By doing this the token 'product+' will produce 8 tokens (and the queries product* or >> product+ will return it): >> >> + word >> t+ word >> ct+ word >> uct+ word >> duct+ word >> oduct+ word >> roduct+ word >> product+ word >> >> If the + sign can be anywhere inside the text you can use NGramTokenFilter. >> Hope this helps. > Best regards, > Paul Forsyth > mail: p...@ez.no > skype: paulforsyth
50% discount on "Taming Text" , "Lucene in Action", etc
http://www.manning.com/ingersoll/ And other books too, such as Lucene in Action 3rd edition... PDF only (MEAP) "Today Only! Save 50% on any ebook! This offer applies to all final ebooks or ebook editions purchased through the Manning Early Access Program. Enter code pop0914 in the Promotional Code box when you check out at manning.com."
"Only one usage of each socket address" error
Hi guys, I'm getting an exception while in the middle of a batch indexing job. Can anybody help me figure this out? Error: Only one usage of each socket address (protocol/network address/port) is normally permitted 127.0.0.1:8080 Solr is 1.4 on Tomcat. Big thanks. Rihaed
Re: Single Core or Multiple Core?
Is it really a problem? I mean, as I see it, solr to cores is what RDBMS is to databases. When you connect to a database you also need to specify the database name. Cheers, Uri On Sep 14, 2009, at 16:27, Noble Paul നോബിള് नोब्ळ् wrote: The problem is that, if we use multicore, it forces you to use a core name. This is inconvenient. We must get rid of this restriction before we move single-core to multicore. On Sat, Sep 12, 2009 at 3:14 PM, Uri Boness wrote: +1 Can you add a JIRA issue for that so we can vote for it? Chris Hostetter wrote: : > For the record: even if you're only going to have one SolrCore, using the : > multicore support (ie: having a solr.xml file) might prove handy from a : > maintence standpoint ... the ability to configure new "on deck cores" with ... : Yeah, it is a shame that single-core deployments (no solr.xml) does not have : a way to enable CoreAdminHandler. This is something we should definitely : look at in Solr 1.5. I think the most straight forward starting point is to switch how we structure the examples so that all of the examples uses a solr.xml with multicore support. Then we can move forward on deprecating the specification of "Solr Home" using JNDI/systemvars and switch to having the location of the solr.xml be the one master config option with everything else coming after that. -Hoss -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re : Configuring "slaves" for a "master" backup without restarting
Good idea. Thanks. Also, in such an architecture (Master/Slave), are there any best practices for an index stored on an NFS-mounted filesystem? Especially about the rsync step, when the "slaves" want to synchronize their index from a remote filesystem (the problem of inconsistent views of the directory). Nourredine. From: Noble Paul നോബിള് नोब्ळ् To: solr-user@lucene.apache.org Sent: Monday, September 14, 2009, 16:15:55 Subject: Re: Configuring "slaves" for a "master" backup without restarting You can put both Master 1 and Master 2 behind a VIP. If Master 1 goes down, make the VIP point to Master 2. On Mon, Sep 14, 2009 at 7:11 PM, nourredine khadri wrote: > Hi, > > A question about scalability. > > Let's imagine the following architecture based on a Master/Slave schema: > > - A "master" for indexing called Master 1 > - A backup of Master 1 (called Master 2) > - Several "slaves" for search linked to Master 1 > > Can I configure the "slaves" to be automatically linked to Master 2 if Master > 1 fails without restarting the JVMs? > > Thanks in advance. > > Nourredine. > > > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Dataimport MySQLNonTransientConnectionException: No operations allowed after connection closed
Which version of Solr are you using? Can you try with a recent one and confirm this? On Mon, Sep 14, 2009 at 7:45 PM, palexv wrote: > > I know that my issue is related to > http://www.nabble.com/dataimporthandler-and-multiple-delta-import-td19160129.html#a19160129 > and https://issues.apache.org/jira/browse/SOLR-728 > but my case is quite different. > As I understand it, the patch at https://issues.apache.org/jira/browse/SOLR-728 > prevents concurrent execution of the import operation but does NOT put the command > in a queue. > > I have only a few records to index. When I run a full reindex it works very > fast. But when I try to rerun this even after a couple of seconds, I am > getting > Caused by: com.mysql.jdbc.exceptions.MySQLNonTransientConnectionException: > No operations allowed after connection closed. > > At this time, when I check the status, it says that the status is idle and > everything was indexed successfully. > I can only rerun the reindex without an exception after 10 seconds. > It does not work for me! If I apply the patch from > https://issues.apache.org/jira/browse/SOLR-728 I will be unable to reindex in > the next 10 seconds as well. > Any suggestions? > -- > View this message in context: > http://www.nabble.com/Dataimport-MySQLNonTransientConnectionException%3A-No-operations-allowed-after-connection-closed-tp25436605p25436605.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Single Core or Multiple Core?
Yes, I think it is better to be backward compatible; otherwise the impact of moving to the new Solr version would be big. On Mon, Sep 14, 2009 at 12:24 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Mon, Sep 14, 2009 at 8:16 PM, Uri Boness wrote: > > > Is it really a problem? I mean, as I see it, solr to cores is what RDBMS is > > to databases. When you connect to a database you also need to specify the > > database name. > > > > > The problem is compatibility. If we make solr.xml compulsory then we only > force people to do a configuration change. But if we make a core name > mandatory, then we force them to change their applications (or the > applications' configurations). It is better if we can avoid that. Besides, > if there's only one core, why need a name? > > -- > Regards, > Shalin Shekhar Mangar. >
Re: Searching for the '+' character
Interesting. I thought that would be the 'hard' approach rather than add a filter, but i guess thats all it really is anyway. Has this been done before? Build a filter to transform a word there and back? On 14 Sep 2009, at 17:17, Chantal Ackermann wrote: Paul Forsyth schrieb: Hi Erick, In this specific case my client does have a new product with a '+' at the end. Its just one of those odd ones! Customers are expected to put + into the search box so i have to have results to show. I hear your concerns though. Originally i thought I would need to transform the + into something else, and do this back and forwards to get a match! sorry for jumping into the discussion with my little knowledge - but I actually think transforming the '+' into something else in the index (something like 'pluzz' that has a low probability to appear as such in the regular input) is a good solution. You just have to do the same on the query side. You could have your own filter for that to put it in the schema or just do it "manually" at index and query time. is that a possibility? Chantal Hopefully this will be a standard solr install, but with this tweak for escaped chars Paul On 14 Sep 2009, at 17:01, Erick Erickson wrote: Before you go too much further with this, I've just got to ask whetherthe use case for searching "product+" really serves your customers. If you mess around with analyzers to make things include the "+", what does that mean for "&"? "*"? "."? any other weird character you can think of? Would it be a bad thing for "product" to match "product+" and vice versa? Would it be more or less confusing for your users to have "product" FAIL to match "product+"? Of course only you really know your problem space, but think carefully about this issue before you take on the work of making "product+" work because it'll inevitably be wy more work than you think. Imagine the bug reports when "product&" fails to match "product+", both of which fail to match "product" I'd also get a copy of Luke and look at the index to be sure what you *think* is in there is *actually* there. It'll also help you understand what analyzers do better. Don't forget that using different analyzers when indexing and querying will lead to...er..."interesting" results. Best Erick On Mon, Sep 14, 2009 at 11:38 AM, Paul Forsyth wrote: Thanks Ahmet, Thats excellent, thanks :) I may have to increase the gramsize to take into account other possible uses but i can now read around these filters to make the adjustments. With regard to WordDelimiterFilterFactory. Is there a way to place a delimiter on this filter to still get most of its functionality without it absorbing the + signs? Will i loose a lot of 'good' functionality by removing it? 'preserveOriginal' sounds promising and seems to work but is it a good idea to use this? On 14 Sep 2009, at 16:16, AHMET ARSLAN wrote: --- On Mon, 9/14/09, Paul Forsyth wrote: From: Paul Forsyth Subject: Re: Searching for the '+' character To: solr-user@lucene.apache.org Date: Monday, September 14, 2009, 5:55 PM With words like 'product+' i'd expect a search for '+' to return results like any other character or word, so '+' would be found within 'product+' or similar text. I've tried removing the worddelimiter from the query analyzer, restarting and reindexing but i get the same result. Nothing is found. I assume one of the filters could be adjusted to keep the '+'. Weird thing is that i tried to remove all filters from the analyzer and i get the same result. 
Paul,

When you remove all filters, '+' is kept, but '+' still won't match 'product+', because you want to search inside a token.

If the + sign is always at the end of your text, and you only want to search the last character of your text, EdgeNGramFilterFactory can do that with the settings side="back" maxGramSize="1" minGramSize="1". The fieldType below will match '+' to 'product+'. But this time 'product+' will be reduced to only '+'; you won't be able to search it other ways, for example product*.

Along with the last character, if you want to keep the original word itself, you can set maxGramSize to 512. By doing this the token 'product+' will produce 8 tokens (and a query of product* or product+ will return it):

+        word
t+       word
ct+      word
uct+     word
duct+    word
oduct+   word
roduct+  word
product+ word

If the + sign can be anywhere inside the text, you can use NGramTokenFilter.

Hope this helps.
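The fieldType XML itself was stripped by the list archive; a minimal reconstruction of what Ahmet describes (the type name and tokenizer are guesses, only the EdgeNGram settings come from his mail) would be:

<fieldType name="text_lastchar" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- side="back" emits grams from the end of the token; maxGramSize="512"
         keeps grams up to the whole token, so 'product+' itself survives -->
    <filter class="solr.EdgeNGramFilterFactory" side="back" minGramSize="1" maxGramSize="512"/>
  </analyzer>
</fieldType>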
Disabling tf (term frequency) during indexing and/or scoring
Hello,

Let me preface this by admitting that I'm still fairly new to Lucene and Solr, so I apologize if any of this sounds naive; I'm open to thinking about my problem differently.

I'm currently responsible for a rather large dataset of business records that I'm trying to build a Lucene/Solr infrastructure around, to replace an in-house solution that we've been using for a few years. These records are sourced from multiple providers and there's often a fair bit of overlap in the business coverage. I have a set of fuzzy correlation libraries that I use to identify these documents and I ultimately create a super-record that includes metadata from each of the providers. Given the nature of things, these providers often have slight variations in wording or spelling in the overlapping fields (it's amazing how many ways people find to refer to the same business or address). I'd like to capture these variations, as they facilitate searching, but TF considerations are currently borking field scoring here.

For example, taking business names into consideration, I have a Solr schema similar to: [field definitions stripped by the list archive; the surviving fragments show stored="false" multiValued="true" fields, one with omitNorms="true"]

For any given business record, there may be 1..N business names present in the nameNorm field (some with naming variations, some identical). With TF enabled, however, I'm getting different match scores on this field simply based on how many providers contributed to the record, which is not meaningful to me. For example, a record containing "foo bar" twice necessarily scores higher than a record containing "foo bar" just once. Although I wouldn't mind TF data being considered within each discrete field value, I need to find a way to prevent score inflation based simply on the number of contributing providers.

Looking at the mailing list archive and searching around, it sounds like the omitTf boolean in Lucene used to function somewhat in this manner, but it has since taken on a broader interpretation (and name) that now also disables positional and payload data. Unfortunately, phrase support for fields like this is absolutely essential.

So what's the best way to address a need like this? I guess I don't mind whether this is handled at index time or search time, but I'm not sure what I may need to override or whether there's some existing provision I should take advantage of.

Thank you for any help you may have.

Best regards,
Aaron
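One approach that fits this description - offered here as a sketch, not something from the thread itself - is a custom Similarity that flattens tf while leaving positions and payloads untouched, so phrase queries keep working. The package and class names below are invented, and note that a schema-level <similarity> in Solr 1.3/1.4 applies to the whole index, not per field:

package com.example.search; // hypothetical package

import org.apache.lucene.search.DefaultSimilarity;

// Sketch: score a field the same whether a term occurs once or five times.
// Only tf() is overridden; positions and payloads are untouched, so
// phrase queries still work.
public class FlatTfSimilarity extends DefaultSimilarity {
  @Override
  public float tf(float freq) {
    return freq > 0 ? 1.0f : 0.0f;
  }
}

Wired in via schema.xml:

<similarity class="com.example.search.FlatTfSimilarity"/>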
Re: KStem download
Pascal Dimassimo wrote:
>
> Hi,
>
> I want to try KStem. I'm following the instructions on this page:
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem
>
> ... but the download link doesn't work.
>
> Is anyone know the new location to download KStem?
>

I am stuck with the same issue. Its link has not been working for a long time. Is there an alternate link? Please let us know.

darniz
-- View this message in context: http://www.nabble.com/KStem-download-tp24375856p25440432.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: KStem download
On Mon, Sep 14, 2009 at 1:56 PM, darniz wrote: > Pascal Dimassimo wrote: >> >> Hi, >> >> I want to try KStem. I'm following the instructions on this page: >> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem >> >> ... but the download link doesn't work. >> >> Is anyone know the new location to download KStem? >> > I am stuck with the same issue > its link is not working for a long time > > > is there any alternate link > Please let us know *shrug* - looks like they changed their download structure (or just took it down). I searched around their site a bit but couldn't find another one (and google wasn't able to find it either). The one from Lucid is functionally identical, free, and much, much faster though - I'd just use that. -Yonik http://www.lucidimagination.com
Return one word - Auto Complete Request Handler
I am trying to configure a request handler that will be used for the auto-complete query. I am limiting the result to one field by using the "fl" parameter, which can be used to specify the fields to return. How do I make the field return only one word, not full sentences?

Thanks/Regards, Parvez
Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th
Greetings,

It's time for another Hadoop/Lucene/Apache "Cloud" Stack meetup! This month it'll be on Wednesday, the 30th, at 6:45 pm. We should have a few interesting guests this time around -- someone from Facebook may be stopping by to talk about Hive :) We've had great attendance in the past few months, let's keep it up! I'm always amazed by the things I learn from everyone.

We're back at the University of Washington, Allen Computer Science Center (not Computer Engineering). Map: http://www.washington.edu/home/maps/?CSE Room: 303 -or- the Entry level. If there are changes, signs will be posted.

More Info: The meetup is about 2 hours (and there's usually food): we'll have two in-depth talks of 15-20 minutes each, and then several "lightning talks" of 5 minutes; if no one offers, we'll just have general discussion. We'll then have discussion and 'social time'. Let me know if you're interested in speaking or attending. We'd like to focus on education, so every presentation *needs* to ask some questions at the end. We can talk about these after the presentations, and I'll record what we've learned in a wiki and share it with the rest of us.

Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com

Cheers,
Bradford
-- http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science
Re: Searching for the '+' character
> Thanks Ahmet,
>
> Thats excellent, thanks :) I may have to increase the
> gramsize to take into account other possible uses but i can
> now read around these filters to make the adjustments.
>
> With regard to WordDelimiterFilterFactory. Is there a way
> to place a delimiter on this filter to still get most of its
> functionality without it absorbing the + signs?

Yes, you are right: preserveOriginal="1" will cause the original token to be indexed without modifications.

> Will i loose a lot of 'good' functionality by removing it?

It depends on your input data. It is used to break one token into subwords, like: "Wi-Fi" -> "Wi", "Fi" and "PowerShot" -> "Power", "Shot". If your input data set contains such words, you may need it.

But I think that just to make the last character searchable, using NGramFilter(s) is not an optimal solution. I don't know what type of dataset you have, but I think using two separate fields (with different types) is more suitable. One field will contain the actual data itself. The other will hold only the last character(s). You can achieve this with a copyField or programmatically during indexing. The type of the field lastCharsField will use EdgeNGramFilter so that only the last character(s) of the token(s) will pass that filter.

During searching you will search those two fields: originalField:\+ OR lastCharsField:\+

The query lastCharsField:\+ will return you all the products ending with +.

Hope this helps.
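A sketch of that two-field layout (the field names follow Ahmet's examples; the types are invented, with lastCharsField using the side="back" EdgeNGramFilter settings from his earlier mail):

<field name="originalField" type="text" indexed="true" stored="true"/>
<field name="lastCharsField" type="text_lastchar" indexed="true" stored="false"/>

<copyField source="originalField" dest="lastCharsField"/>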
Re: KStem download
Is the source for the Lucid KStemmer available? From the Lucid Solr package I only found the compiled jars.

On Mon, Sep 14, 2009 at 11:04 AM, Yonik Seeley wrote:
> On Mon, Sep 14, 2009 at 1:56 PM, darniz wrote:
>> Pascal Dimassimo wrote:
>>>
>>> Hi,
>>>
>>> I want to try KStem. I'm following the instructions on this page:
>>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem
>>>
>>> ... but the download link doesn't work.
>>>
>>> Is anyone know the new location to download KStem?
>>>
>> I am stuck with the same issue
>> its link is not working for a long time
>>
>>
>> is there any alternate link
>> Please let us know
>
> *shrug* - looks like they changed their download structure (or just
> took it down). I searched around their site a bit but couldn't find
> another one (and google wasn't able to find it either).
>
> The one from Lucid is functionally identical, free, and much, much
> faster though - I'd just use that.
>
> -Yonik
> http://www.lucidimagination.com
>
Re: KStem download
Ok, I downloaded the Lucid Imagination version of Solr. From the lib directory I copied the two jars, lucid-kstem.jar and lucid-solr-kstem.jar, and put them in my local Solr instance at C:\solr\apache-solr-1.3.0\lib. When I declare a field type like this it throws a class not found exception. Are there some other files that I am missing? Please let me know.

thanks
darniz

Yonik Seeley-2 wrote:
>
> On Mon, Sep 14, 2009 at 1:56 PM, darniz wrote:
>> Pascal Dimassimo wrote:
>>>
>>> Hi,
>>>
>>> I want to try KStem. I'm following the instructions on this page:
>>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem
>>>
>>> ... but the download link doesn't work.
>>>
>>> Is anyone know the new location to download KStem?
>>>
>> I am stuck with the same issue
>> its link is not working for a long time
>>
>>
>> is there any alternate link
>> Please let us know
>
> *shrug* - looks like they changed their download structure (or just
> took it down). I searched around their site a bit but couldn't find
> another one (and google wasn't able to find it either).
>
> The one from Lucid is functionally identical, free, and much, much
> faster though - I'd just use that.
>
> -Yonik
> http://www.lucidimagination.com
>
>
-- View this message in context: http://www.nabble.com/KStem-download-tp24375856p25440692.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: 50% discount on "Taming Text" , "Lucene in Action", etc
3rd edition?! *whew* - let's get the 2nd edition in print first ;) Erik On Sep 14, 2009, at 12:10 PM, Fuad Efendi wrote: http://www.manning.com/ingersoll/ And other books too, such as Lucene in Action 3rd edition... PDF only (MEAP) "Today Only! Save 50% on any ebook! This offer applies to all final ebooks or ebook editions purchased through the Manning Early Access Program. Enter code pop0914 in the Promotional Code box when you check out at manning.com."
Re: Searching for the '+' character
Why don't you create a synonym for + that expands to your customer's product name that includes the plus? You can even have your FE do this sort of replacement BEFORE submitting to Solr.

Thanks,
Matt Weber

On Sep 14, 2009, at 11:42 AM, AHMET ARSLAN wrote:

Thanks Ahmet, Thats excellent, thanks :) I may have to increase the gramsize to take into account other possible uses but i can now read around these filters to make the adjustments. With regard to WordDelimiterFilterFactory. Is there a way to place a delimiter on this filter to still get most of its functionality without it absorbing the + signs? Yes you are right, preserveOriginal="1" will causes the original token to be indexed without modifications. Will i loose a lot of 'good' functionality by removing it? It depends of your input data. It is used to break one token into subwords. Like: "Wi-Fi" -> "Wi", "Fi" and "PowerShot" -> "Power", "Shot" If you input data set contains such words, you may need it. But I think just to make last character searchable, using NGramFilter(s) is not an optimal solution. I don't know what type of dataset you have but, I think using separate two fields (with different types) for that is more suitable. One field will contain actual data itself. The other will hold only the last character(s). You can achieve this by a copyField or programatically during indexing. The type of the field lastCharsField will be using EdgeNGramFilter so that only last character of token(s) will pass that filter. During searching you will search those two fields: originalField:\+ OR lastCharsField:\+ The query lastCharsField:\+ will return you all the products ending with +. Hope this helps.
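A sketch of that idea, assuming the indexed token is literally "product+" and that the synonym filter runs before anything strips the '+' (the mapping itself is invented):

# synonyms.txt -- hypothetical entry tying a plus-less form to the real token
productplus, product+

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>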
RE: 50% discount on "Taming Text" , "Lucene in Action", etc
Yes, 2nd edition; but subscription-based "Manning Early Access Program" (MEAP) is available, $13.75 (today only...), plus Author Online: http://www.manning-sandbox.com/forum.jspa?forumID=451 http://www.manning.com/hatcher3/ > -Original Message- > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > Sent: September-14-09 3:05 PM > To: solr-user@lucene.apache.org > Subject: Re: 50% discount on "Taming Text" , "Lucene in Action", etc > > 3rd edition?! *whew* - let's get the 2nd edition in print first ;) > > Erik > > On Sep 14, 2009, at 12:10 PM, Fuad Efendi wrote: > > > http://www.manning.com/ingersoll/ > > And other books too, such as Lucene in Action 3rd edition... PDF > > only (MEAP) > > > > "Today Only! Save 50% on any ebook! This offer applies to all final > > ebooks > > or ebook editions purchased through the Manning Early Access > > Program. Enter > > code pop0914 in the Promotional Code box when you check out at > > manning.com." > > > >
Load synonyms dynamically
Is there a way to load the synonyms dynamically? I mean, if the synonym.txt file changes, the newly added synonyms should be active at query time. Currently it requires a reindex.

Thanks/Regards, Parvez
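For the query-side analyzer, a core reload is usually enough to pick up an edited synonyms file without reindexing (index-side synonyms do still require a reindex). A sketch, assuming a multicore setup with a core named "core0":

http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0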
Solr 1.4 - autoSuggest - is it a default service
Hi, I am trying to use autoSuggest in Solr 1.4. Is the autoSuggest service available by default, like select, or should I configure anything? Solrconfig.xml contains the TermsComponent defined.

Thanks
R
-- View this message in context: http://www.nabble.com/Solr-1.4---autoSuggestis-it-a-default-service-tp25443128p25443128.html Sent from the Solr - User mailing list archive at Nabble.com.
Difficulty with Multi-Word Synonyms
I'm running into an odd issue with multi-word synonyms in Solr (using the latest [9/14/09] nightly). Things generally seem to work as expected, but I sometimes see words that are the leading term in a multi-word synonym being replaced with the token that follows them in the stream when they should just be ignored (i.e. there's no synonym match for just that token). When I preview the analysis at admin/analysis.jsp it looks fine, but at runtime I see problems like the one in the unit test below. It's a simple case, so I assume I'm making some sort of configuration and/or usage error.

package org.apache.solr.analysis;

import java.io.*;
import java.util.*;

import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

public class TestMultiWordSynonmys extends junit.framework.TestCase {

  public void testMultiWordSynonmys() throws IOException {
    List<String> rules = new ArrayList<String>();
    rules.add( "a b c,d" );
    SynonymMap synMap = new SynonymMap( true );
    SynonymFilterFactory.parseRules( rules, synMap, "=>", ",", true, null );

    SynonymFilter ts = new SynonymFilter( new WhitespaceTokenizer( new StringReader( "a e" ) ), synMap );
    TermAttribute termAtt = (TermAttribute) ts.getAttribute( TermAttribute.class );

    ts.reset();
    List<String> tokens = new ArrayList<String>();
    while ( ts.incrementToken() ) tokens.add( termAtt.term() );

    // This fails because ["e","e"] is the value of the token stream
    assertEquals( Arrays.asList( "a", "e" ), tokens );
  }
}

Any help would be much appreciated. Thanks.

--Gregg
multicore shards and relevancy score
Hi,

I've done a few experiments with searching two cores with the same schema using the shard syntax (using Solr 1.3). My use case is that I want to have multiple cores because a few different people will be managing the indexing, and that will happen at different times. The data, however, is homogeneous.

I've noticed in my tests that the results are not interwoven, but it might just be my test data. In other words, all the results from one core appear, then all the results from the other core. In thinking about it, it would make sense if the relevancy scores for each core were completely independent of each other. And that would mean that there is no way to compare the relevancy scores between the cores.

In other words, I'd like the following results:
- really relevant hit from core0
- pretty relevant hit from core1
- kind of relevant hit from core0
- not so relevant hit from core1

but I get:
- really relevant hit from core0
- kind of relevant hit from core0
- pretty relevant hit from core1
- not so relevant hit from core1

So, are the results supposed to be interwoven, and I need to study my data more, or is this just not something that is possible? Also, if this is insurmountable, I've discovered two show stoppers that will prevent using multicore in my project (counting the lack of support for faceting in multicore). Are these issues addressed in Solr 1.4?

Thanks,
Paul
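For reference, a sketch of the two-core shard query being described (host, port, and core names are assumptions); note that each shard computes IDF from its own index, so scores coming back from different cores are not strictly comparable:

http://localhost:8983/solr/core0/select?q=foo+bar&shards=localhost:8983/solr/core0,localhost:8983/solr/core1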
Re: 50% discount on "Taming Text" , "Lucene in Action", etc
Hi, I can confirm it works! :-) Regards, Lukas On Mon, Sep 14, 2009 at 10:20 PM, Fuad Efendi wrote: > > Yes, 2nd edition; but subscription-based "Manning Early Access Program" > (MEAP) is available, $13.75 (today only...), plus Author Online: > http://www.manning-sandbox.com/forum.jspa?forumID=451 > http://www.manning.com/hatcher3/ > > > > -Original Message- > > From: Erik Hatcher [mailto:erik.hatc...@gmail.com] > > Sent: September-14-09 3:05 PM > > To: solr-user@lucene.apache.org > > Subject: Re: 50% discount on "Taming Text" , "Lucene in Action", etc > > > > 3rd edition?! *whew* - let's get the 2nd edition in print first ;) > > > > Erik > > > > On Sep 14, 2009, at 12:10 PM, Fuad Efendi wrote: > > > > > http://www.manning.com/ingersoll/ > > > And other books too, such as Lucene in Action 3rd edition... PDF > > > only (MEAP) > > > > > > "Today Only! Save 50% on any ebook! This offer applies to all final > > > ebooks > > > or ebook editions purchased through the Manning Early Access > > > Program. Enter > > > code pop0914 in the Promotional Code box when you check out at > > > manning.com." > > > > > > > > > >
Re: Solr 1.4 - autoSuggest - is it a default service
I guess you are looking for terms. It's in 1.4; just use a query like http://localhost:port/solr/terms/?terms=true&terms.fl=field_name&terms.prefix=da

Thanks/Regards, Parvez

On Mon, Sep 14, 2009 at 3:35 PM, Yerraguntla wrote: > > Hi, > > I am trying to use autoSuggest in Solr 1.4. Is autoSugest service available > by default like select? or should I configure anything? > > > Solrconfig.xml contains the termcomponent defined. > > Thanks > R > > > -- > View this message in context: > http://www.nabble.com/Solr-1.4---autoSuggestis-it-a-default-service-tp25443128p25443128.html > Sent from the Solr - User mailing list archive at Nabble.com. > >
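A sketch of the solrconfig.xml wiring that the /terms URL above assumes (the names follow the Solr 1.4 example config, so treat them as assumptions):

<searchComponent name="termsComponent" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <bool name="terms">true</bool>
  </lst>
  <arr name="components">
    <str>termsComponent</str>
  </arr>
</requestHandler>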
Re: Single Core or Multiple Core?
IMO forcing the users to do a configuration change in Solr or in their application is the same thing - it all boils down to a configuration change (I'll be very surprised if someone is actually hardcoding the Solr URL in their system - most probably it is configurable, and if it's not, forcing them to change it is actually a good thing).

Besides, if there's only one core, why need a name?

Consistency. Having a default core as Israel suggested can probably do the trick. At first it might seem that having a default core and not needing to specify the core name will make things easier for users. But I actually disagree - don't underestimate the power of being consistent. I'd rather have a manual telling me "this is how it works and it always works like that in all scenarios" than something like "this is how it works, but if you have scenario A then it works differently and you have to do this instead".

Shalin Shekhar Mangar wrote:

On Mon, Sep 14, 2009 at 8:16 PM, Uri Boness wrote:

Is it really a problem? I mean, as i see it, solr to cores is what RDBMS is to databases. When you connect to a database you also need to specify the database name.

The problem is compatibility. If we make solr.xml compulsory then we only force people to do a configuration change. But if we make a core name mandatory, then we force them to change their applications (or the applications' configurations). It is better if we can avoid that. Besides, if there's only one core, why need a name?
Is it possible to query for "everything" ?
I'm using Solr for search and faceted browsing. Is it possible to have Solr search for 'everything', at least as far as q is concerned? The request handlers I've found don't like it if I don't pass in a q parameter.
Re: Is it possible to query for "everything" ?
Query for *:* Thanks, Matt Weber On Sep 14, 2009, at 4:18 PM, Jonathan Vanasco wrote: I'm using Solr for seach and faceted browsing Is it possible to have solr search for 'everything' , at least as far as q is concerned ? The request handlers I've found don't like it if I don't pass in a q parameter
Re: Is it possible to query for "everything" ?
Use: ?q=*:* -Jay http://www.lucidimagination.com On Mon, Sep 14, 2009 at 4:18 PM, Jonathan Vanasco wrote: > I'm using Solr for seach and faceted browsing > > Is it possible to have solr search for 'everything' , at least as far as q > is concerned ? > > The request handlers I've found don't like it if I don't pass in a q > parameter >
Re: Is it possible to query for "everything" ?
Thanks Jay & Matt.

I tried *:* on my app, and it didn't work. I tried it on the Solr admin, and it did.

I checked the Solr config file and realized that it works on standard queries, but not on dismax. So I have my app checking *:* on a standard qt, and then filtering what I need on other qts!

I would never have figured this out without you two!
Re: Is it possible to query for "everything" ?
With dismax you can use q.alt when the q param is missing: q.alt=*:* should work. -Jay On Mon, Sep 14, 2009 at 5:38 PM, Jonathan Vanasco wrote: > Thanks Jay & Matt > > I tried *:* on my app, and it didn't work > > I tried it on the solr admin, and it did > > I checked the solr config file, and realized that it works on standard, but > not on dismax, queries > > So i have my app checking *:* on a standard qt, and then filtering what I > need on other qts! > > I would never have figured this out without you two! >
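A sketch of a match-all dismax request (everything except q.alt is an invented example; note that q is deliberately omitted):

http://localhost:8983/solr/select?qt=dismax&q.alt=*:*&facet=true&facet.field=category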
Re: KStem download
I was able to declare a field type when I use the Lucid distribution of Solr. But if I copy the two jars and put them in the lib directory of the Apache Solr distribution, it still gives me the following error:

SEVERE: java.lang.NoClassDefFoundError: org/apache/solr/util/plugin/ResourceLoaderAware
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
    at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:375)
    at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:337)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:257)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:278)
    at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:83)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
    at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:781)
    at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:56)
    at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:413)
    at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:431)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:440)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:92)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:412)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
    at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
    at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
    at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
    at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
    at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
    at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
    at org.mortbay.jetty.Server.doStart(Server.java:210)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
    at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.mortbay.start.Main.invokeMain(Main.java:183)
    at org.mortbay.start.Main.start(Main.java:497)
    at org.mortbay.start.Main.main(Main.java:115)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.util.plugin.ResourceLoaderAware
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
    ... 53 more

Even though i chec
Re: KStem download
The two jar files are all you should need, and the configuration is correct. However I noticed that you are on Solr 1.3. I haven't tested the Lucid KStemmer on a non-Lucid-certified distribution of 1.3. I have tested it on recent versions of 1.4 and it works fine (just tested with the most recent nightly build). So there are two options, but I don't know if either will work for you: 1. Move up to Solr 1.4, copy over the jars and configure. 2. Get the free Lucid certified distribution of 1.3 which already has the Lucid KStemmer (and other fixes which are an improvement over the standard 1.3). -Jay http://www.lucidimagination.com On Mon, Sep 14, 2009 at 6:09 PM, darniz wrote: > > i was able to declare a field type when the i use the lucid distribution of > solr > > > > class="com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory" > protected="protwords.txt" /> > > > > But if i copy the two jars and put it in lib directory of apache solr > distribution it still gives me the following error. > > SEVERE: java.lang.NoClassDefFoundError: > org/apache/solr/util/plugin/ResourceLoaderAware >at java.lang.ClassLoader.defineClass1(Native Method) >at java.lang.ClassLoader.defineClass(ClassLoader.java:621) >at > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124) >at java.net.URLClassLoader.defineClass(URLClassLoader.java:260) >at java.net.URLClassLoader.access$000(URLClassLoader.java:56) >at java.net.URLClassLoader$1.run(URLClassLoader.java:195) >at java.security.AccessController.doPrivileged(Native Method) >at java.net.URLClassLoader.findClass(URLClassLoader.java:188) >at java.lang.ClassLoader.loadClass(ClassLoader.java:307) >at java.lang.ClassLoader.loadClass(ClassLoader.java:252) >at > > org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:375) >at > > org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:337) >at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320) >at java.lang.Class.forName0(Native Method) >at java.lang.Class.forName(Class.java:247) >at > > org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:257) >at > > org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:278) >at > > org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:83) >at > > org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140) >at > org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:781) >at > org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:56) >at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:413) >at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:431) >at > > org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140) >at > org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:440) >at org.apache.solr.schema.IndexSchema.(IndexSchema.java:92) >at org.apache.solr.core.SolrCore.(SolrCore.java:412) >at > > org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119) >at > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) >at > org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99) >at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) >at > > org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594) >at org.mortbay.jetty.servlet.Context.startContext(Context.java:139) >at > > 
org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218) >at > org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500) >at > org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448) >at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) >at > > org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) >at > > org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161) >at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) >at > > org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) >at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) >at > org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117) >at org.mortbay.jetty.Server.doStart(Server.java:210) >at > org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) >at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at >
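For reference, the stripped fieldType fragments quoted above appear to come from a declaration along these lines - a reconstruction, so the type name and tokenizer are guesses; only the filter line itself is from the original mail:

<fieldType name="text_kstem" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>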
Re: Is it possible to query for "everything" ?
For the standard query handler, try [* TO *]. Bill On Mon, Sep 14, 2009 at 8:46 PM, Jay Hill wrote: > With dismax you can use q.alt when the q param is missing: > q.alt=*:* > should work. > > -Jay > > > On Mon, Sep 14, 2009 at 5:38 PM, Jonathan Vanasco > wrote: > > > Thanks Jay & Matt > > > > I tried *:* on my app, and it didn't work > > > > I tried it on the solr admin, and it did > > > > I checked the solr config file, and realized that it works on standard, > but > > not on dismax, queries > > > > So i have my app checking *:* on a standard qt, and then filtering what I > > need on other qts! > > > > I would never have figured this out without you two! > > >
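Note that the range form is applied per field - a sketch, with the field name invented:

q=id:[* TO *]

This matches every document that has a value in the id field, so on a required unique-key field it behaves like a match-all query.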
Re: Misleading log messages while deploying solr
I downloaded and installed a fresh JBoss (jboss-4.2.1.GA), updated run.bat and jboss-service.xml, then deployed the Solr war in the deploy folder. But it is still showing the same message on JBoss startup.

con wrote:
>
> Thanks Hossman
>
> As per my understandings and investigations, if we disable STDERR from the
> jboss configs, we will not be able to see any STDERR coming from any of
> the APIs - which can be real error messages.
> So if we know the exact reason why this message from solr is showing up,
> we can block this at solr level or may be jboss level.
>
> Any suggestion which points out a reason for this or a solution that hides
> these messages only is really appreciable.
>
> thanks
>
> hossman wrote:
>>
>> : But the log message that is getting print in the server console, in my
>> case
>> : jboss, is showing status as error.
>> : Why is this showing as ERROR, even though things are working fine.
>>
>> Solr is not declaring that those messages are ERRORs, solr is just
>> logging
>> informational messages (hence then "INFO" lines) using the java logging
>> framework.
>>
>> My guess: since the logs are getting prefixed with "ERROR [STDERR]"
>> something about the way your jboss container is configured is probably
>> causing those log messages to be written to STDERR, and then jboss is
>> capturing the STDERR and assuming that if it went there it mist be an
>> "ERROR" of some kind and logging it to the console (using it's own log
>> format, hence the touble timestamps per line message)
>>
>> In short: jboss is doing this in response to normal logging from solr.
>> you should investigate your options for configuriring jboss and how it
>> deals with log messages from applications.
>>
>>
>> : 11:41:19,030 INFO [TomcatDeployer] deploy, ctxPath=/solr,
>> : warUrl=.../tmp/deploy/tmp43266solr-exp.war/
>> : 11:41:19,948 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
>> : org.apache.solr.servlet.SolrDispatchFilter init
>> : INFO: SolrDispatchFilter.init()
>> : 11:41:19,975 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
>> : org.apache.solr.core.SolrResourceLoader locateInstanceDir
>> : INFO: No /solr/home in JNDI
>> : 11:41:19,976 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
>> : org.apache.solr.core.SolrResourceLoader locateInstanceDir
>> : INFO: using system property solr.solr.home: C:\app\Search
>> : 11:41:19,984 ERROR [STDERR] 8 Sep, 2009 11:41:19 AM
>> : org.apache.solr.core.CoreContainer$Initializer initialize
>> : INFO: looking for solr.xml: C:\app\Search\solr.xml
>> : 11:41:20,084 ERROR [STDERR] 8 Sep, 2009 11:41:20 AM
>> : org.apache.solr.core.SolrResourceLoader
>> : INFO: Solr home set to 'C:\app\Search'
>> : 11:41:20,142 ERROR [STDERR] 8 Sep, 2009 11:41:20 AM
>> : org.apache.solr.core.SolrResourceLoader createClassLoader
>> : INFO: Adding
>> : 'file:/C:/app/Search/lib/apache-solr-dataimporthandler-1.3.0.jar' to
>> Solr
>> : classloader
>> : 11:41:20,144 ERROR [STDERR] 8 Sep, 2009 11:41:20 AM
>> : org.apache.solr.core.SolrResourceLoader createClassLoader
>> : INFO: Adding 'file:/C:/app/Search/lib/jsp-2.1/' to Solr classloader
>> :
>> : ...
>> : INFO: Reusing parent classloader >> : 11:41:21,870 ERROR [STDERR] 8 Sep, 2009 11:41:21 AM >> : org.apache.solr.core.SolrConfig >> : INFO: Loaded SolrConfig: solrconfig.xml >> : 11:41:21,909 ERROR [STDERR] 8 Sep, 2009 11:41:21 AM >> : org.apache.solr.schema.IndexSchema readSchema >> : INFO: Reading Solr Schema >> : 11:41:22,092 ERROR [STDERR] 8 Sep, 2009 11:41:22 AM >> : org.apache.solr.schema.IndexSchema readSchema >> : INFO: Schema name=contacts schema >> : 11:41:22,121 ERROR [STDERR] 8 Sep, 2009 11:41:22 AM >> : org.apache.solr.util.plugin.AbstractPluginLoader load >> : INFO: created string: org.apache.solr.schema.StrField >> : >> : . >> : -- >> : View this message in context: >> http://www.nabble.com/Misleading-log-messages-while-deploying-solr-tp25354654p25354654.html >> : Sent from the Solr - User mailing list archive at Nabble.com. >> : >> >> >> >> -Hoss >> >> >> > > -- View this message in context: http://www.nabble.com/Misleading-log-messages-while-deploying-solr-tp25354654p25448607.html Sent from the Solr - User mailing list archive at Nabble.com.