Re: wildcards and German umlauts
Hi,

"if i type complete word (such as "übersicht"). But there are no hits, if i use wildcards (such as "über*") Searching with wildcards and without umlauts works as well."

I can confirm that.

Greetz,
Sebastian

--
View this message in context: http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2998425.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: wildcards and German umlauts
Ah, BTW,

since the problem seems to be a query-parser issue, a simple workaround is to replace all umlauts with ASCII characters (ä = ae, ö = oe, ü = ue, for example) before sending the query to Solr, and to use a solr.MappingCharFilterFactory with the same replacements (ä = ae, ö = oe, ü = ue) while indexing.

It's inflexible in some cases, but it works so far.

Greetz,
Sebastian
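The client-side half of this workaround could look roughly like the sketch below. It is plain Java with no Solr dependency; the class name UmlautFolder and the method fold are made up for illustration, and it assumes the index side applies exactly the same three replacements through solr.MappingCharFilterFactory.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: folds umlauts in a query string before it is sent to Solr. */
class UmlautFolder {
    // Same replacements as described for the index side (assumption: only these three).
    private static final Map<Character, String> MAPPING = new HashMap<>();
    static {
        MAPPING.put('\u00e4', "ae"); // ä
        MAPPING.put('\u00f6', "oe"); // ö
        MAPPING.put('\u00fc', "ue"); // ü
    }

    /** Replaces each mapped character and leaves everything else, including '*', untouched. */
    static String fold(String query) {
        StringBuilder out = new StringBuilder(query.length() + 4);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            String replacement = MAPPING.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("\u00fcber*")); // prints "ueber*"
    }
}
```

Because the wildcard term is rewritten before it reaches the query parser, it no longer matters that the parser skips the analyzer for it. Uppercase umlauts and ß are not handled here and would need their own entries on both sides.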
Re: wildcards and German umlauts
Wildcard queries are not passed through an analyzer.
Re: Results with and without whitespace (soccer club and soccerclub)
You might use the "replace" mapping for things like "soccerclub => soccer club" rather than mutual synonyms.

Use the analysis page from the admin console to understand what transformations are possible with the various syntaxes; then you'll be in a position to decide the details.

Best
Erick

On Tue, May 24, 2011 at 6:01 AM, roySolr wrote:
> Ok, I will do it with synonyms.
>
> What does the list look like?
>
> soccerclub,soccer club
>
> The index looks like this:
>
> Manchester united soccerclub
> Chelsea soccer club
>
> I want them both in my results if i search for "soccer club" or
> "soccerclub".
> How can i configure this in schema.xml?
Re: wildcards and German umlauts
I don't follow. Did I write anything about an analyzer? Actually not.
Re: wildcards and German umlauts
Ah, NOW I got it. It's not a bug, it's a feature.

But that would mean that every character manipulation (e.g. char mapping/replacement, the Porter stemmer in some cases ...) would cause a wildcard query to fail. That's too bad.

But why? What's the problem with passing the prefix through the analyzer/filter chain?

Greetz,
Sebastian
GeoJSON Response Writer
All,

Has anyone modified the current JSON response writer to include the GeoJSON geospatial encoding standard? See here: http://geojson.org/

Just curious...
Adam
Re: GeoJSON Response Writer
Hey Adam,

I haven't done GeoJSON, but I did whip up a GeoRSS one; check it out here:

https://issues.apache.org/jira/browse/SOLR-2074

Cheers,
Chris

On May 29, 2011, at 11:14 AM, Adam Estrada wrote:
> All,
>
> Has anyone modified the current json response writer to include the GeoJSON
> geospatial encoding standard. See here: http://geojson.org/
>
> Just curious...
> Adam

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: GeoJSON Response Writer
Thanks Chris!

Adam
Re: TermFreqVector Problem
Has nobody ever used TermFreqVector?

-
Zeki ama calismiyor... Calissa yapar...
Re: Index content behind siteminder
Take a look at TikaEntityProcessor or the Tika package. I'm on restricted internet access, so I can't look up the exact class.

Erick

On May 24, 2011 6:45 AM, "Thumuluri, Sai" wrote:
> Good morning, I am trying to index some PDFs which are protected by
> siteminder, any ideas as to how I can go about it? I am using Solr 1.4
Re: TermFreqVector Problem
> TermFreqVector vector = reader.getTermFreqVector(this.docId, "universal");
> String[] universalTerms = vector.getTerms();
>
> to see the length of universalTerms array, and it is 1 and only value that
> array stores is the field value:
>
> universalTerms[0] = "car house road age sex school education education tree garden"

It seems that the "universal" field is of type "string", which is indexed as a single untokenized term, so the whole value comes back as one entry. You'd want a "text" type field instead.

koji
--
http://www.rondhuit.com/en/
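To make the difference concrete, here is a small plain-Java sketch. It is not Lucene or Solr code: TermVectorSketch and termFreqs are hypothetical names, and whitespace splitting stands in for whatever tokenizer a "text" field type would actually use. It only shows why an untokenized field yields one term while a tokenized one yields one entry per distinct word.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of term frequencies for tokenized vs. untokenized fields. */
class TermVectorSketch {
    /** Returns term -> frequency for a single field value. */
    static Map<String, Integer> termFreqs(String value, boolean tokenized) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        if (tokenized && value.trim().isEmpty()) {
            return freqs; // nothing to tokenize
        }
        // An untokenized ("string"-like) field keeps the whole value as one term;
        // a tokenized field is split (here simply on whitespace, as an assumption).
        String[] terms = tokenized ? value.trim().split("\\s+") : new String[] { value };
        for (String term : terms) {
            freqs.merge(term, 1, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        String value = "car house road age sex school education education tree garden";
        System.out.println(termFreqs(value, false).size());          // prints 1
        System.out.println(termFreqs(value, true).size());           // prints 9
        System.out.println(termFreqs(value, true).get("education")); // prints 2
    }
}
```

With a tokenized field type, getTerms() would be expected to return the individual words in the same spirit, provided term vectors are enabled for the field and the documents are reindexed.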
Re: How to use StreamingUpdateSolrServer?
You use it from an external Java program. As I remember, you can configure the number of simultaneous threads to use as well, but check, since I can't look it up just now.

Best
Erick

On May 24, 2011 7:00 PM, "deniz" wrote:
> Hi all,
>
> to improve crappy indexing speed i would like to use
> StreamingUpdateSolrServer but as a newbie I am not sure where to use... I
> have checked the wiki but all i get is how to implement. not where to put
> that method... Or maybe i am missing some facts...
>
> anyway, anyone used StreamingUpdateSolrServer before?
>
> -
> Zeki ama calismiyor... Calissa yapar...
Re: newbie question for DataImportHandler
This trips up a lot of folks. Solr just marks docs as deleted; the terms etc. are left in the index until an optimize is performed or the segments are merged. The latter isn't very predictable, so just do an optimize. The deleted docs aren't returned as results, though.

Best
Erick

On May 24, 2011 10:22 PM, "antoniosi" wrote:
> Hi,
>
> I am new to Solr; apologize in advance if this is a stupid question.
>
> I have created a simple database, with only 1 table with 3 columns, id,
> name, and last_update fields.
>
> I populate the database with 1 million test rows.
> I run solr, go to the data import handler development console and do a full
> import. I use the "Luke" tool to look at the content of the lucene index.
>
> This all works fine so far.
>
> I remove all the 1 million rows from my table and populate the table with
> another million rows of data.
> I remove the index that solr previously create. I restart solr and go to the
> data import handler development console and do the full import again.
>
> I use the "Luke" tool to look at the content of the lucene index. However, I
> am seeing the old data in my new index.
>
> Doe Solr keeps a cached copy of the index somewhere?
>
> I hope I have described my problem clearly.
>
> Thanks in advance.
Re: adding results external to index
You'd probably have to do this as two calls in your app; Solr doesn't have this built in.

Best
Erick

On May 15, 2011 10:33 PM, "abhayd" wrote:
> hi
>
> I am not sure if SOLR has this feature so just wanted to confirm..
>
> Basically what I want to do is for certain query terms I would like to query
> real time web service which will return certain results and at the same time
> search in solr index.
>
> This can be implemented out side solr and I am well aware of that, but most
> search engines offer this functionality. For instance Google Search
> Appliance has a functionality called One Box.
>
> Can this be implemented in solr ?
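The "two calls in your app" approach might be sketched like this. Everything here is hypothetical: FederatedSearch, the two Function parameters, and the plain String results stand in for whatever web service client and Solr client the application really uses. Putting the web service hits first is just one possible ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch: the application queries two sources and merges the results itself. */
class FederatedSearch {
    static List<String> search(String query,
                               Function<String, List<String>> webService,
                               Function<String, List<String>> solrSearch) {
        // Call 1: the real-time web service ("one box" style results go first).
        List<String> merged = new ArrayList<>(webService.apply(query));
        // Call 2: the regular Solr search, appended after the web service hits.
        merged.addAll(solrSearch.apply(query));
        return merged;
    }

    public static void main(String[] args) {
        // Stubs standing in for the real clients.
        Function<String, List<String>> webService = q -> Arrays.asList("live result for " + q);
        Function<String, List<String>> solrSearch = q -> Arrays.asList("doc1", "doc2");
        System.out.println(search("weather", webService, solrSearch));
    }
}
```

In a real application the two calls could also run in parallel, with a timeout on the web service so that a slow external source does not hold up the Solr results.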
Re: Problem with caps and star symbol
I'd start by looking at the analysis page from the Solr admin page. That will give you an idea of the transformations the various steps carry out. It's invaluable!

Best
Erick

On May 26, 2011 12:53 AM, "Saumitra Chowdhury" <saumi...@smartitengineering.com> wrote:
> Hi all ,
> In my schema.xml i am using WordDelimiterFilterFactory,
> LowerCaseFilterFactory, StopFilterFactory for index analyzer and an extra
> SynonymFilterFactory for query analyzer. I am indexing a field name
> '*name*'.Now
> if a value with all caps like "NAME_BILL" is indexed I am able get this as
> search result with the term " *name_bill *", " *NAME_BILL *", " *namebill *",
> "*namebill** ", " *nameb** " ... But for the term like following " *
> NAME_BILL** ", " *name_bill** ", " *namebill** ", " *NAME** " the result
> does mot show this document. Can anyone please explain why this is
> happening? .In fact star " * " is not giving any result in many
> cases specially if it is used after full value of a field.
>
> Portion of my schema is given below.
>
> generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> catenateAll="0"/>
> words="stopwords.txt" enablePositionIncrements="true"/>
>
> generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> catenateAll="0"/>
> ignoreCase="true" expand="true"/>
> words="stopwords.txt" enablePositionIncrements="true"/>
>
> positionIncrementGap="100">
> generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> catenateAll="0"/>
> ignoreCase="true" expand="false"/>
> words="stopwords.txt"/>
Re: Terms Component - solr-1.4.0
Please tell us what you've tried and what problems you're having; we can't help much with such a general request.

Best
Erick

On May 26, 2011 5:02 AM, "Solr User" wrote:
> Hi All,
>
> Please help me in implementing TermsComponent in my current Solr solution.
>
> Regards,
> Solr User
>
> On Tue, May 17, 2011 at 4:12 PM, Solr User wrote:
>
>> Hi All,
>>
>> I am using Solr 1.4.0 and dismax as request handler.I have the following in
>> my solrconfig.xml in the dismax request handler tag
>>
>> spellcheck
>>
>> The above tags helps to find terms if there are spelling issues. I tried
>> configuring terms component and no luck.
>>
>> May I know how to configure terms component with dismax? or Do I need to
>> call terms component directly to get auto suggestions?
>>
>> Thank you so much in advance.
>>
>> Regards,
>> Solr User
Re: Too many Boolean Clause and Filter Query
This is usually done with roles to limit the size of the author token clause. You might search the archives for permissions, authorizations, etc. Adding a ton of author tokens in a clause doesn't scale well; you need to use a different strategy here.

Best
Erick

On May 26, 2011 5:51 AM, "Sujatha Arun" wrote:
> We have increased the now ,but since we have a number
> of instances on a single server and also number of ids that will get
> added to filter wll be increasing ...with no known limit ,I was wonderng f
> there was any other scalable method not affected by the clause>..
>
> Also on looking at Manifold CF Documentation , not sure if this is any
> dfferent than ndexing user permssion to solr and filtering .Any body has
> done ths for permisssion based document flterng
>
> Regards
> Sujatha
>
> On Thu, May 26, 2011 at 3:47 PM, pravesh wrote:
>
>> I'm sure you can fix this by increasing value to some
>> max.
>> This shld apply to filter query as well
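As a rough sketch of the role-based idea: instead of listing every permitted id in the filter, each document is indexed with the roles allowed to see it, and the filter query only lists the current user's roles. The field name acl_roles, the class RoleFilter, and the method buildFilterQuery are assumptions made up for this example; only the general fq syntax (field:("a" OR "b")) is standard.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper: builds a short filter query from a user's roles. */
class RoleFilter {
    static String buildFilterQuery(String field, List<String> roles) {
        if (roles.isEmpty()) {
            throw new IllegalArgumentException("user has no roles; nothing to filter on");
        }
        StringJoiner clause = new StringJoiner(" OR ", field + ":(", ")");
        for (String role : roles) {
            // Quote each role and escape backslashes and quotes so it is treated as one literal term.
            String escaped = role.replace("\\", "\\\\").replace("\"", "\\\"");
            clause.add("\"" + escaped + "\"");
        }
        return clause.toString();
    }

    public static void main(String[] args) {
        // prints: acl_roles:("editors" OR "staff")
        System.out.println(buildFilterQuery("acl_roles", Arrays.asList("editors", "staff")));
    }
}
```

The number of clauses then grows with the number of roles a user holds, which is typically far smaller than the number of individual ids, so the boolean clause limit stops being the bottleneck.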
Match in the process of filter, not end, does it mean "not matching"?
This is the schema:

And there is a multiValued field:

Now I want to search for this string:

Merry Christmas and Happy New Year

In "Admin Analysis" in the Solr admin, it highlights (in light blue) the matching word in LowerCaseFilterFactory, CommonGramsFilterFactory and ShingleFilterFactory. However, nothing is highlighted in NGramFilterFactory.

Then I did a search in full-interface mode in the Solr admin:

textContains_Something:"Merry Christmas and Happy New Year"

It returns NO RESULT.

Does this mean that a match only counts after all of the tokenizer and filter steps have run?

Thank you in advance for any help.
Re: parentDeltaQuery
I know about delta import. I want to know about parentDeltaQuery.

-
Thanks & Regards
Romi