Re: wildcards and German umlauts

2011-05-29 Thread mdz-munich
Hi,

"if i type complete word (such as "übersicht").
But there are no hits, if i use wildcards (such as "über*")
Searching with wildcards and without umlauts works as well." 

I can confirm that. 

Greetz,

Sebastian

--
View this message in context: 
http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2998425.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: wildcards and German umlauts

2011-05-29 Thread mdz-munich
Ah, BTW,

since the problem seems to be a query-parser issue, a simple workaround
is to replace all umlauts with ASCII characters (ä = ae, ö = oe, ü = ue,
for example) before sending the query to Solr, and to use a
solr.MappingCharFilterFactory with the same replacements (ä = ae, ö = oe, ü
= ue) while indexing.

It's inflexible in some cases, but it works so far.
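
A minimal sketch of the client-side half of this workaround, in plain Java.
The class and method names are made up for illustration, and the uppercase
and ß entries are an assumption beyond the three replacements listed above.
It only rewrites the query string; the index is assumed to apply the same
table through solr.MappingCharFilterFactory.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: folds German umlauts to ASCII digraphs before the query
// string is sent to Solr. UmlautFolding and fold() are hypothetical names.
class UmlautFolding {

    private static final Map<Character, String> MAPPING = new HashMap<>();
    static {
        MAPPING.put('\u00e4', "ae"); // ä
        MAPPING.put('\u00f6', "oe"); // ö
        MAPPING.put('\u00fc', "ue"); // ü
        // Assumed additions, not part of the original list:
        MAPPING.put('\u00c4', "Ae"); // Ä
        MAPPING.put('\u00d6', "Oe"); // Ö
        MAPPING.put('\u00dc', "Ue"); // Ü
        MAPPING.put('\u00df', "ss"); // ß
    }

    // Replaces every mapped character and leaves everything else,
    // including wildcard characters such as '*', untouched.
    static String fold(String query) {
        StringBuilder out = new StringBuilder(query.length() + 8);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            String replacement = MAPPING.get(c);
            out.append(replacement != null ? replacement : String.valueOf(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The prefix query for "über*" is sent as "ueber*".
        System.out.println(fold("\u00fcber*"));
    }
}
```

The folded string is what would be sent as the q parameter. If the index
side uses a different table, prefix queries will again fail to match.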

Greetz,

Sebastian 



Re: wildcards and German umlauts

2011-05-29 Thread Markus Jelsma
Wildcard queries are not passed through an analyzer, so char filters and
token filters configured for the field are not applied to the wildcard term.

> Ah, BTW,
> 
> since the problem seems to be a query-parser issue, a simple workaround
> is to replace all umlauts with ASCII characters (ä = ae, ö = oe, ü = ue,
> for example) before sending the query to Solr, and to use a
> solr.MappingCharFilterFactory with the same replacements (ä = ae, ö = oe,
> ü = ue) while indexing.
>
> It's inflexible in some cases, but it works so far.
> 
> Greetz,
> 
> Sebastian


Re: Results with and without whitespace (soccer club and soccerclub)

2011-05-29 Thread Erick Erickson
You might use the "replace" mapping for things like
"soccerclub => soccer club" rather than mutual synonyms.

Use the analysis page from the admin console to understand what
transformations are possible with various syntaxes, then you'll be
in a place to decide the details.
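
To illustrate the effect of a one-way "replace" rule, here is a plain-Java
sketch. It is not Solr's synonym filter, only a stand-in that applies the
rule "soccerclub => soccer club" to a lowercased, whitespace-split string;
the class and method names are hypothetical. The point is that when the same
rule runs at index time and at query time, both spellings end up as the same
tokens.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Stand-in for a one-way synonym rule; not the Solr implementation.
class ReplaceSynonymSketch {

    private static final Map<String, String> RULES = new HashMap<>();
    static {
        // Equivalent in spirit to the rule: soccerclub => soccer club
        RULES.put("soccerclub", "soccer club");
    }

    // Lowercases, splits on whitespace, and rewrites any token with a rule.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(RULES.getOrDefault(token, token));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Both indexed values and both query spellings converge.
        System.out.println(normalize("Manchester united soccerclub"));
        System.out.println(normalize("Chelsea soccer club"));
        System.out.println(normalize("soccerclub"));
    }
}
```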

Best
Erick

On Tue, May 24, 2011 at 6:01 AM, roySolr wrote:
> Ok, I will do it with synonyms.
>
> What does the list look like?
>
> soccerclub,soccer club
>
> The index looks like this:
>
> Manchester united soccerclub
> Chelsea soccer club
>
> I want them both in my results if i search for "soccer club" or
> "soccerclub".
> How can i configure this in schema.xml?
>


Re: wildcards and German umlauts

2011-05-29 Thread mdz-munich
I don't follow. Did I write anything about an analyzer? Actually, no.



Re: wildcards and German umlauts

2011-05-29 Thread mdz-munich
Ah, NOW I got it. It's not a bug, it's a feature. 

But that would mean that every character manipulation (e.g.
char-mapping/replacement, Porter stemmer in some cases ...) would cause a
wildcard query to fail. That's too bad.

But why? What's the problem with passing the prefix through the
analyzer/filter chain?

Greetz,

Sebastian



GeoJSON Response Writer

2011-05-29 Thread Adam Estrada
All,

Has anyone modified the current JSON response writer to include the GeoJSON
geospatial encoding standard? See here: http://geojson.org/

Just curious...
Adam


Re: GeoJSON Response Writer

2011-05-29 Thread Mattmann, Chris A (388J)
Hey Adam,

I haven't done GeoJSON, but I did whip up a GeoRSS one, check it out here:

https://issues.apache.org/jira/browse/SOLR-2074

Cheers,
Chris

On May 29, 2011, at 11:14 AM, Adam Estrada wrote:

> All,
> 
> Has anyone modified the current JSON response writer to include the GeoJSON
> geospatial encoding standard? See here: http://geojson.org/
> 
> Just curious...
> Adam


++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++



Re: GeoJSON Response Writer

2011-05-29 Thread Adam Estrada
Thanks Chris!

Adam

On Sun, May 29, 2011 at 2:19 PM, Mattmann, Chris A (388J) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Hey Adam,
>
> I haven't done GeoJSON, but I did whip up a GeoRSS one, check it out here:
>
> https://issues.apache.org/jira/browse/SOLR-2074
>
> Cheers,
> Chris
>
> On May 29, 2011, at 11:14 AM, Adam Estrada wrote:
>
> > All,
> >
> > Has anyone modified the current JSON response writer to include the GeoJSON
> > geospatial encoding standard? See here: http://geojson.org/
> >
> > Just curious...
> > Adam


Re: TermFreqVector Problem

2011-05-29 Thread deniz
Has nobody ever used TermFreqVector?

-
Smart, but doesn't study... If he studied, he'd make it...


Re: Index content behind siteminder

2011-05-29 Thread Erick Erickson
Take a look at TikaEntityProcessor or the Tika package. I'm on restricted
inet access so can't look at the exact class.

Erick
On May 24, 2011 6:45 AM, "Thumuluri, Sai" wrote:
> Good morning, I am trying to index some PDFs which are protected by
> siteminder, any ideas as to how I can go about it? I am using Solr 1.4
>


Re: TermFreqVector Problem

2011-05-29 Thread Koji Sekiguchi

> TermFreqVector vector = reader.getTermFreqVector(this.docId, "universal");
> String[] universalTerms = vector.getTerms();
>
> to see the length of the universalTerms array, and it is 1; the only value
> that the array stores is the whole field value:
>
> universalTerms[0] = "car house road age sex school education education tree
> garden"

It seems that the "universal" field is of type "string", which is indexed as a
single untokenized term. You'd want a "text" type field instead.
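
The difference can be illustrated without Lucene at all. The sketch below is
plain Java with hypothetical names. It only mimics what the two field types
would hand to a term vector: one term for an untokenized field, and the
distinct tokens in sorted order for a whitespace-tokenized one. A real
"text" type usually also lowercases, removes stop words and so on, which is
left out here.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Plain-Java illustration, not Lucene code.
class FieldTermsSketch {

    // An untokenized ("string") field keeps the whole value as one term.
    static String[] stringFieldTerms(String value) {
        return new String[] { value };
    }

    // A tokenized field splits the value; each distinct term is listed
    // once, in sorted order.
    static String[] tokenizedFieldTerms(String value) {
        TreeSet<String> terms =
                new TreeSet<>(Arrays.asList(value.trim().split("\\s+")));
        return terms.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String value = "car house road age sex school education education tree garden";
        System.out.println(stringFieldTerms(value).length);
        System.out.println(Arrays.toString(tokenizedFieldTerms(value)));
    }
}
```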

koji
--
http://www.rondhuit.com/en/


Re: How to use StreamingUpdateSolrServer?

2011-05-29 Thread Erick Erickson
You use it from an external Java program. As I remember, you can configure
the number of simultaneous threads to use as well, but check, since I can't
look it up just now.

Best
Erick
On May 24, 2011 7:00 PM, "deniz" wrote:
> Hi all,
>
> to improve crappy indexing speed i would like to use
> StreamingUpdateSolrServer but as a newbie I am not sure where to use... I
> have checked the wiki but all i get is how to implement. not where to put
> that method... Or maybe i am missing some facts...
>
> anyway, anyone used StreamingUpdateSolrServer before?
>
> -
> Smart, but doesn't study... If he studied, he'd make it...


Re: newbie question for DataImportHandler

2011-05-29 Thread Erick Erickson
This trips up a lot of folks. Solr just marks docs as deleted; the terms etc.
are left in the index until an optimize is performed or the segments are
merged. The latter isn't very predictable, so just do an optimize.

The docs aren't returned as results though.

Best
Erick
On May 24, 2011 10:22 PM, "antoniosi" wrote:
> Hi,
>
> I am new to Solr; apologize in advance if this is a stupid question.
>
> I have created a simple database, with only 1 table with 3 columns, id,
> name, and last_update fields.
>
> I populate the database with 1 million test rows.
> I run solr, go to the data import handler development console and do a full
> import. I use the "Luke" tool to look at the content of the lucene index.
>
> This all works fine so far.
>
> I remove all the 1 million rows from my table and populate the table with
> another million rows of data.
> I remove the index that solr previously created. I restart solr and go to the
> data import handler development console and do the full import again.
>
> I use the "Luke" tool to look at the content of the lucene index. However, I
> am seeing the old data in my new index.
>
> Does Solr keep a cached copy of the index somewhere?
>
> I hope I have described my problem clearly.
>
> Thanks in advance.
>


Re: adding results external to index

2011-05-29 Thread Erick Erickson
You'd probably have to do this as two calls in your app; Solr doesn't have
this built in.
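
A rough sketch of the "two calls in your app" approach, in plain Java. The
two lookups are passed in as functions, so nothing here is a Solr or web
service API; CombinedSearchSketch, search(), externalService and solrSearch
are hypothetical names. Both calls are started in parallel and the external
results are placed ahead of the index results, which is only one possible
way to merge them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Application-side merge of two result sources; names are illustrative.
class CombinedSearchSketch {

    static List<String> search(String query,
                               Function<String, List<String>> externalService,
                               Function<String, List<String>> solrSearch) {
        // Start both lookups at once so the slower one sets the latency.
        CompletableFuture<List<String>> external =
                CompletableFuture.supplyAsync(() -> externalService.apply(query));
        CompletableFuture<List<String>> indexed =
                CompletableFuture.supplyAsync(() -> solrSearch.apply(query));

        // Wait for both, then list external results before index results.
        List<String> combined = new ArrayList<>(external.join());
        combined.addAll(indexed.join());
        return combined;
    }

    public static void main(String[] args) {
        List<String> results = search("weather",
                q -> Arrays.asList("onebox:" + q),     // stand-in for the web service
                q -> Arrays.asList("doc1", "doc2"));   // stand-in for the Solr query
        System.out.println(results);
    }
}
```

In a real application each function would wrap an HTTP call, and a timeout
on the external service would keep a slow response from blocking the index
results.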

Best
Erick
On May 15, 2011 10:33 PM, "abhayd" wrote:
> hi
>
> I am not sure if SOLR has this feature so just wanted to confirm..
>
> Basically what I want to do is for certain query terms I would like to query
> a real-time web service which will return certain results and at the same time
> search in the solr index.
>
> This can be implemented outside solr and I am well aware of that, but most
> search engines offer this functionality. For instance Google Search
> Appliance has a functionality called One Box.
>
> Can this be implemented in solr ?
>


Re: Problem with caps and star symbol

2011-05-29 Thread Erick Erickson
I'd start by looking at the analysis page from the Solr admin page. That
will give you an idea of the transformations the various steps carry out,
it's invaluable!

Best
Erick
On May 26, 2011 12:53 AM, "Saumitra Chowdhury" <
saumi...@smartitengineering.com> wrote:
> Hi all,
> In my schema.xml I am using WordDelimiterFilterFactory,
> LowerCaseFilterFactory, StopFilterFactory for the index analyzer and an extra
> SynonymFilterFactory for the query analyzer. I am indexing a field named
> 'name'. Now if a value in all caps like "NAME_BILL" is indexed, I am able to
> get it as a search result with the terms "name_bill", "NAME_BILL",
> "namebill", "namebill*", "nameb*" ... But for terms like "NAME_BILL*",
> "name_bill*", "namebill*", "NAME*" the result does not show this document.
> Can anyone please explain why this is happening? In fact the star "*" is
> not giving any result in many cases, especially if it is used after the
> full value of a field.
>
> Portion of my schema is given below.
>
> [Schema excerpt not preserved: the XML tags were stripped in the archive.
> Surviving attributes: generateNumberParts="0" catenateWords="1"
> catenateNumbers="1" catenateAll="0"; words="stopwords.txt"
> enablePositionIncrements="true"; ignoreCase="true" with expand="true" and,
> later, expand="false"; positionIncrementGap="100".]


Re: Terms Component - solr-1.4.0

2011-05-29 Thread Erick Erickson
Please tell us what you've tried and what problems you're having; we can't
help much with such a general request.

Best
Erick
On May 26, 2011 5:02 AM, "Solr User" wrote:
> Hi All,
>
> Please help me in implementing TermsComponent in my current Solr solution.
>
> Regards,
> Solr User
>
> On Tue, May 17, 2011 at 4:12 PM, Solr User wrote:
>
>> Hi All,
>>
>> I am using Solr 1.4.0 and dismax as request handler. I have the following in
>> my solrconfig.xml in the dismax request handler tag
>>
>> 
>> spellcheck
>> 
>>
>> The above tags helps to find terms if there are spelling issues. I tried
>> configuring terms component and no luck.
>>
>> May I know how to configure terms component with dismax? or Do I need to
>> call terms component directly to get auto suggestions?
>>
>> Thank you so much in advance.
>>
>> Regards,
>> Solr User
>>


Re: Too many Boolean Clause and Filter Query

2011-05-29 Thread Erick Erickson
This is usually done with roles to limit the size of the author token
clause. You might search the archives for permissions, authorizations, etc.
Adding a ton of author tokens in a clause doesn't scale well; you need to
use a different strategy here.
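
As a sketch of the role-based idea: each document is indexed with the roles
allowed to see it, and the filter query lists only the current user's roles,
which stays short no matter how many authors or ids exist. The field name
"acl_role" and the class below are assumptions for illustration, not
something from this thread or from Solr itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Builds a filter query value from a user's roles; names are hypothetical.
class RoleFilterSketch {

    // Wraps a value in double quotes, escaping backslashes and quotes.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Produces e.g. acl_role:("editors" OR "staff") with one clause per role.
    static String buildRoleFilter(String field, List<String> roles) {
        if (roles.isEmpty()) {
            // Sketch choice: refuse to build an empty filter rather than
            // guess whether "no roles" means "see nothing" or "see all".
            throw new IllegalArgumentException("user has no roles");
        }
        return roles.stream()
                .map(RoleFilterSketch::quote)
                .collect(Collectors.joining(" OR ", field + ":(", ")"));
    }

    public static void main(String[] args) {
        System.out.println(buildRoleFilter("acl_role", Arrays.asList("editors", "staff")));
    }
}
```

The resulting string would be passed as an fq parameter. The number of
boolean clauses then grows with the number of roles per user rather than
with the number of ids.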

Best
Erick
On May 26, 2011 5:51 AM, "Sujatha Arun" wrote:
> We have increased the  now, but since we have a number
> of instances on a single server and the number of ids that will get
> added to the filter will keep increasing, with no known limit, I was
> wondering if there was any other scalable method not affected by the  clause>..
>
> Also, looking at the ManifoldCF documentation, I am not sure if this is any
> different from indexing user permissions into Solr and filtering. Has anybody
> done this for permission-based document filtering?
>
> Regards
> Sujatha
>
> On Thu, May 26, 2011 at 3:47 PM, pravesh wrote:
>
>> I'm sure you can fix this by increasing  value to some
>> max.
>> This should apply to filter queries as well
>>


Match in the process of filter, not end, does it mean "not matching"?

2011-05-29 Thread Ellery Leung
This is the schema:

 















































 

And there is a multiValued field:

[Field definition not preserved: the XML was stripped in the archive.]

Now I want to search this string: Merry Christmas and Happy New Year

In "Admin Analysis" in the Solr admin, it highlights (in light blue) the
matching words in LowerCaseFilterFactory, CommonGramsFilterFactory and
ShingleFilterFactory. However, it does not highlight anything in
NGramFilterFactory.

Now, I did a search in full-interface mode in the Solr admin:

textContains_Something:"Merry Christmas and Happy New Year"

It returns NO RESULT.

Does it mean that a match only counts after all tokenizers and filters have
run?

Thank you in advance for any help.



Re: parentDeltaQuery

2011-05-29 Thread Romi
I know about delta import. I want to know about parentDeltaQuery.

-
Thanks & Regards
Romi