Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Thank you for the feedback, Otis. Yes, I thought that such an approach is useful if the number of pages to crawl is relatively low. However, what about using Solr + Nutch? Is it still a problem that this would not scale if the index becomes too large? What about extending Nutch with fe

Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Thanks, that really helps to find the right beginning for such a journey. :-) > * Use Solr, not Nutch's search webapp > As far as I have read, Solr can't scale, if the index gets too large for one Server > The setup explained here has one significant caveat you also need to keep > in mind:

RE: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-16 Thread MitchK
Good morning! Great feedback from you all. This really helped a lot to get an impression of what is possible and what is not. What interests me now are some detailed questions. Let's assume Solr is able to work on its own with distributed indexing, so that the client does not need to know

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
> Solr doesn't know anything about OPIC, but I suppose you can feed the OPIC > score computed by Nutch into a Solr field and use it during scoring, if > you want, say with a function query. > Oh! Yes, that makes more sense than using the OPIC as doc-boost-value. :-) Anywhere at the Lucene Mail

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
h you can use today: > > http://search-lucene.com/?q=ExternalFileField&fc_project=Solr > > Otis > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Lucene ecosystem search :: http://search-lucene.com/ > > > > - Original Message >&g

Re: Document boosting troubles

2010-06-17 Thread MitchK
Hi, first of all, are you sure that row.put('$docBoost',docBoostVal) is correct? I think it should be row.put($docBoost,docBoostVal); - unfortunately I am not sure. Hm, I think, until you can solve the problem with the docBoosts itself, you should use a functionQuery. Use "div(1, rank)" as boo

Re: Document boosting troubles

2010-06-17 Thread MitchK
Sorry, I've overlooked your other question. > > rank:1^10.0 rank:2^9.0 rank:3^8.0 rank:4^7.0 rank:5^6.0 rank:6^5.0 > rank:7^4.0 rank:8^3.0 rank:9^2.0 > > This is wrong. You need to change "bf" to "bq". Bf -> boosting function Bq -> boosting query. -- View this message
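The difference between the two parameters can be sketched as request parameters. This is a minimal sketch, assuming a dismax handler and an integer "rank" field as in the thread; the query term "ipod" and the helper name dismax_params are hypothetical. "bq" takes ordinary query clauses with ^boost factors, while "bf" takes a function query such as the div(1, rank) suggested earlier in the thread.

```python
from urllib.parse import urlencode


def dismax_params(user_query, use_function=False):
    """Build dismax request parameters that boost documents by their rank field."""
    params = {"defType": "dismax", "q": user_query}
    if use_function:
        # bf expects a function query; lower rank values yield a larger boost.
        params["bf"] = "div(1,rank)"
    else:
        # bq expects ordinary query clauses with ^boost factors
        # (rank:1^10.0 down to rank:9^2.0, as in the quoted message).
        params["bq"] = " ".join(f"rank:{r}^{11 - r}.0" for r in range(1, 10))
    return params


# Print the URL-encoded query string for the bq variant.
print(urlencode(dismax_params("ipod")))
```

With the function variant the boost follows the rank value itself, so the query string does not have to change if more rank values are introduced later.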

Re: solr multi-node

2010-06-17 Thread MitchK
Antonello, here are a few links to the Solr Wiki: http://wiki.apache.org/solr/SolrReplication Solr Replication http://wiki.apache.org/solr/DistributedSearchDesign Distributed Search Design http://wiki.apache.org/solr/DistributedSearch Distributed Search http://wiki.apache.org/solr/SolrCloud So

Re: Master master?

2010-06-17 Thread MitchK
What is the usecase for such an architecture? Do you send requests to two different masters for indexing and that's why they need to be synchronized? Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/Master-master-tp884253p903233.html Sent from the Solr -

Re: Document boosting troubles

2010-06-17 Thread MitchK
Hi, > One problem down, two left! =) bf ==> bq did the trick, thanks. Now at > least if I can't get the DIH solution working I don't have to tack that on > every query string. > I would really recommend to use a boost function. If your rank will change in future implementations, you do not

Re: DismaxRequestHandler

2010-06-17 Thread MitchK
Joe, please, can you provide an example of what you are thinking of? Subqueries with Solr... I've never seen something like that before. Thank you! Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/DismaxRequestHandler-tp903641p904142.html Sent from th

Re: Re: Re: Solr and Nutch/Droids - to use or not to use?

2010-06-17 Thread MitchK
Otis, and again I wish I were registered. I will check the JIRA and when I feel comfortable with it, I will open it. Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-and-Nutch-Droids-to-use-or-not-to-use-tp900069p904145.html Sent from the Solr - U

Re: Question on dynamic fields

2010-06-17 Thread MitchK
Barani, without more background on dynamic fields, I would say that the easiest way would be to define a suffix for each of the fields you want to index into the mentioned dynamic field and to redefine your dynamic-field condition. If a suffix does not work because of other dynamic-field d

Re: solr with hadoop

2010-06-22 Thread MitchK
I wanted to open a Jira issue about exactly what Otis is asking here. Unfortunately, I don't have time for it because of my exams. However, I'd like to add a question to Otis's: If you distribute the indexing process this way, are you able to replicate the different documents correctly? Thank y

Re: Solr 1.4 - Image-Highlighting and Payloads

2010-06-24 Thread MitchK
Sebastian, sounds like an exciting project. > We've found the argument "TokenGroup" in method "highlightTerm" > implemented in SimpleHtmlFormatter. "TokenGroup" provides the method > "getPayload()", but the returned value is always "NULL". > No, Token provides this method, not TokenGroup. Bu

Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization

2010-06-24 Thread MitchK
Chantal, have a look at http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/search/similar/MoreLikeThis.html More like this to have a guess what the MLT's score concerns. The problem is that you can't compare scores. The query for the "normal" result-response was maybe something like

Re: MoreLikeThis (mlt) : use the match's maxScore for result score normalization

2010-06-25 Thread MitchK
Hi Chantal, Munich? Germany seems to be soo small :-). Chantal Ackermann wrote: > > I only want a way to show to the > user a kind of relevancy or similarity indicator (for example using a > range of 10 stars) that would give a hint on how similar the mlt hit is > to the input (match) item.

Question about the mailinglist (junk on my behalf)

2010-06-28 Thread MitchK
Hello community, for a few days now I have been receiving daily mails with suspicious content. They say that some of my mails were rejected because of the file types of the mails' attachments and other things. This surprises me a lot, because I didn't send any mails with attachments and even the eMail-

Re: Wither field compresed="true" ?

2010-06-29 Thread MitchK
David, well, I am no committer, but I noticed that Lucene no longer takes care of compression (I think this was because of the trouble it caused) and maybe this is the reason why Solr no longer offers this option. Unfortunately, I do not have a link for it, but I think this w

Re: How I can use score value for my function

2010-06-29 Thread MitchK
Ramzesua, this is not possible, because Solr does not know what is the resulting score at query-time (as far as I know). The score will be computed, when every hit from every field is combined by the scorer. Furthermore I have shown you an alternative in the other threads. It makes not exactly wh

Re: How I can use score value for my function

2010-06-29 Thread MitchK
Britske, good workaround! I did not think about the possibility of using subqueries. Regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/How-I-can-use-score-value-for-my-function-tp899662p931448.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr with hadoop

2010-07-05 Thread MitchK
I need to revive this discussion... If you do distributed indexing correctly, what about updating the documents and what about replicating them correctly? Does this work? Or wasn't this an issue? Kind regards - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/solr-wit

Problem with Solr-Mailinglist

2010-07-19 Thread MitchK
Hello, I try to post http://lucene.472066.n3.nabble.com/Solr-in-an-extra-project-what-about-replication-scaling-etc-td977961.html#a977961 this message for the fourth time to the Solr Mailinglist and everytime I get the following response from the Mailing-list's server: > solr-user@lucene.

Re: Problem with Solr-Mailinglist

2010-07-19 Thread MitchK
Thank you both. I will do what Hoss suggested, tomorrow. The mail was sent over the nabble-board and another time over my thunderbird-client. Both with the same result. So there was not more HTML-code than it was in every of my other postings. Kind regards - Mitch -- View this message in contex

Re: Autocomplete with NGrams

2010-07-19 Thread MitchK
Frank, have a look at Solr's example directory and look for 'multicore'. There you can see an example configuration for a multicore environment. Kind regards, - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/Autocomplete-with-NGrams-tp979312p979610.html Sent from

Re: Autocomplete with NGrams

2010-07-20 Thread MitchK
It sounds like the best solution here, right. However, I do not want to exclude the possibility of doing things one *should* do in different cores with different configurations and schema.xml in one core. I haven't completely read the lucidimagination article, but I would suggest that you do your wo

Re: Beginner question

2010-07-20 Thread MitchK
Here you can find params and their meanings for the dismax-handler. You may not find anything in the wiki by searching for a parser ;). Link: http://wiki.apache.org/solr/DisMaxRequestHandler Wiki: DisMaxRequestHandler Kind regards - Mitch Erik Hatcher-4 wrote: > > Consider using the dismax

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
Li Li, this is the intended behaviour, not a bug. Otherwise you could get back the same record in a response for several times, which may not be intended by the user. Kind regards, - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/a-bug-of-solr-distributed-search-tp98

nested query and number of matched records

2010-07-21 Thread MitchK
Hello community, I got a situation, where I know that some types of documents contain very extensive information and other types are giving more general information. Since I don't know whether a user searches for general or extensive information (and I don't want to ask him when he uses the defau

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
Ah, okay. I understand your problem. Why should doc x be at position 1 when searching for the first time, and when I search for the 2nd time it occurs at position 8 - right? I am not sure, but I think you can't prevent this without custom coding or making a document's occurrence unique. Kind rega

Re: nested query and number of matched records

2010-07-21 Thread MitchK
Oh,... I just see, there is no direct question ;-). How can I specify the number of returned documents in the desired way *within* one request? - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/nested-query-and-number-of-matched-records-tp983756p983773.html Sent from

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
I don't know much about the code. Maybe you can tell me to what file you are referring? However, from the comments one can see, that the problem is known but one decided to let it happen, because of System requirements in the Java version. - Mitch -- View this message in context: http://luce

Re: a bug of solr distributed search

2010-07-21 Thread MitchK
It already was sorted by score. The problem here is the following: Shard_A and shard_B contain doc_X and doc_X. If you are querying for something, doc_X could have a score of 1.0 at shard_A and a score of 12.0 at shard_B. You can never be sure which doc Solr sees first. In the bad case, Solr see
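The order dependence described in this post can be illustrated with a toy merge. This is a simplified model and not Solr's actual merging code: it assumes the coordinating node keeps the first hit it sees for each unique ID and drops later ones. doc_X and its two scores come from the example above; doc_Y and the function name are hypothetical additions for illustration.

```python
def merge_shard_results(shard_responses):
    """Merge per-shard hit lists, keeping only the first hit seen per unique ID."""
    seen = {}
    for hits in shard_responses:        # order in which the responses arrive
        for doc_id, score in hits:
            if doc_id not in seen:      # later duplicates are dropped
                seen[doc_id] = score
    # Sort the surviving hits by score, highest first.
    return sorted(seen.items(), key=lambda item: item[1], reverse=True)


shard_a = [("doc_X", 1.0), ("doc_Y", 5.0)]
shard_b = [("doc_X", 12.0)]

# The same two shard responses, merged in both possible arrival orders.
print(merge_shard_results([shard_a, shard_b]))
print(merge_shard_results([shard_b, shard_a]))
```

In this model doc_X ends up below doc_Y when shard_A answers first and above it when shard_B answers first, which is the kind of position change discussed in the thread.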

Re: nested query and number of matched records

2010-07-21 Thread MitchK
Thank you three for your feedback! Chantal, unfortunately kenf is right. Faceting won't work in this special case. > parallel calls. > Yes, this will be the solution. However, this would lead to a second HTTP-request and I hoped to be able to avoid it. Chantal Ackermann wrote: > > Sure S

Re: a bug of solr distributed search

2010-07-23 Thread MitchK
Yonik, why do we do not send the output of TermsComponent of every node in the cluster to a Hadoop instance? Since TermsComponent does the map-part of the map-reduce concept, Hadoop only needs to reduce the stuff. Maybe we even do not need Hadoop for this. After reducing, every node in the cluste

Re: a bug of solr distributed search

2010-07-23 Thread MitchK
values to disk (Which I do not suggest). Thoughts? - Mitch MitchK wrote: > > Yonik, > > why do we do not send the output of TermsComponent of every node in the > cluster to a Hadoop instance? > Since TermsComponent does the map-part of the map-reduce concept, Hadoop > only need

Re: a bug of solr distributed search

2010-07-23 Thread MitchK
That only works if the docs are exactly the same - they may not be. Ahm, what? Why? If the uniqueID is the same, the docs *should* be the same, shouldn't they? -- View this message in context: http://lucene.472066.n3.nabble.com/a-bug-of-solr-distributed-search-tp983533p990563.html Sent from the So

Re: a bug of solr distributed search

2010-07-24 Thread MitchK
Okay, but then Li Li did something wrong, right? I mean, if the document exists only at one shard, it should get the same score whenever one requests it, no? Of course, this only applies if nothing gets changed between the requests. The only remaining problem here would be that you need distribut

Re: a bug of solr distributed search

2010-07-25 Thread MitchK
Good morning, https://issues.apache.org/jira/browse/SOLR-1632 - Mitch Li Li wrote: > > where is the link of this patch? > > 2010/7/24 Yonik Seeley : >> On Fri, Jul 23, 2010 at 2:23 PM, MitchK wrote: >>> why do we do not send the output of TermsComponent of every

Re: Solr Doc Lucene Doc !?

2010-07-26 Thread MitchK
Stockii, Solr's index is a Lucene Index. Therefore, Solr documents are Lucene documents. Kind regards, - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p995968.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH : SQL query (sub-entity) is executed although variable is not set (null or empty list)

2010-07-26 Thread MitchK
Hi Chantal, did you try to write a http://wiki.apache.org/solr/DIHCustomFunctions custom DIH Function? If not, I think this will be a solution. Just check whether "${prog.vip}" is an empty string or null. If so, you need to replace it with a value that can never match anything. So the vi

Re: DIH : SQL query (sub-entity) is executed although variable is not set (null or empty list)

2010-07-27 Thread MitchK
Hi Chantal, > However, with this approach indexing time went up from 20min to more > than 5 hours. > This is 15x slower than the initial solution... wow. From MySQL I know that IN ()-clauses are the embodiment of endlessness - they perform very, very badly. New idea: Create a method which

Re: DIH : SQL query (sub-entity) is executed although variable is not set (null or empty list)

2010-07-27 Thread MitchK
Hi Chantal, instead of: /* multivalued, not required */ you do: /* multivalued, not required */ The yourCustomFunctionToReturnAQueryString(vip, querystring1, querystring2) { if(vip != n

SolrJ Response + JSON

2010-07-28 Thread MitchK
Hello community, I need to transform SolrJ - responses into JSON, after some computing on those results by another application has finished. I can not do those computations on the Solr - side. So, I really have to translate SolrJ's output into JSON. Any experiences how to do so without writing

SolrJ Response + JSON

2010-07-28 Thread MitchK
Hello , Second try to send a mail to the mailing list... I need to translate SolrJ's response into JSON-response. I can not query Solr directly, because I need to do some math with the responsed data, before I show the results to the client. Any experiences how to translate SolrJ's response i

Re: SolrJ Response + JSON

2010-07-28 Thread MitchK
you haven't already, and query with wt=json. Can't get mucht easier. Cheers, On Wednesday 28 July 2010 15:08:26 MitchK wrote: Hello , Second try to send a mail to the mailing list... I need to translate SolrJ's response into JSON-response. I can not query Solr directly, becau

Re: SolrJ Response + JSON

2010-07-28 Thread MitchK
e JacksonParser with Spring. http://json.org/ lists parsers for different programming languages. Cheers, Chantal On Wed, 2010-07-28 at 15:08 +0200, MitchK wrote: Hello , Second try to send a mail to the mailing list... I need to translate SolrJ's response into JSON-response. I can not

Re: SolrJ Response + JSON

2010-07-28 Thread MitchK
edback! Are you sure that you cannot change the SOLR results at query time according to your needs? Unfortunately, it is not possible in this case. Kind regards, Mitch Am 28.07.2010 16:49, schrieb Chantal Ackermann: Hi Mitch On Wed, 2010-07-28 at 16:38 +0200, MitchK wrote: Thank you,

Re: Nabble problems?

2010-07-29 Thread MitchK
I got some problems with Nabble, too. Nabble sends some warnings that my posts are still pending to the mailing-list, while people were already answering to my initial questions. Did you send a message to the nabble-support? Kind regards, - Mitch kenf_nc wrote: > > The Nabble.com page for Sol

Re: Boosting DisMax queries with !boost component

2010-08-01 Thread MitchK
Hi, qf needs to have spaces in it, unfortunately the local query parser can not deal with that, as Erik Hatcher mentioned some months ago. A solution would be to do something like that: {!dismax%20qf=$yourqf}yourQuery&yourqf=title^1.0 tags^2.0 Since you are using the dismax-query-parser, you c
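The parameter dereferencing described here can be sketched by building the request in a small script. This is a minimal sketch under assumptions: the helper name dismax_with_param_ref is hypothetical, and the field names and boosts are the ones from the post. The name after "$" inside the local-params block has to match the name of the extra request parameter, so the block itself never contains the space-separated field list.

```python
from urllib.parse import parse_qs, urlencode


def dismax_with_param_ref(user_query, query_fields):
    """Pass qf through a separate request parameter that the local params reference."""
    return {
        "q": "{!dismax qf=$yourqf}" + user_query,
        "yourqf": query_fields,   # may contain spaces, e.g. "title^1.0 tags^2.0"
    }


query_string = urlencode(dismax_with_param_ref("yourQuery", "title^1.0 tags^2.0"))
print(query_string)

# Decoding the query string again shows both parameters arrive intact.
print(parse_qs(query_string))
```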

Re: SolrJ Response + JSON

2010-08-02 Thread MitchK
Hi, as I promised, I want to give some feedback on transforming SolrJ's output into JSON with the package from json.org: I needed to make a small modification to the package. Since they store the JSON key-value pairs in a HashMap, I changed this to a LinkedHa

RE: Boosting DisMax queries with !boost component

2010-08-02 Thread MitchK
Jonathan Rochkind wrote: > >> qf needs to have spaces in it, unfortunately the local query parser can >> not >> deal with that, as Erik Hatcher mentioned some months ago. > > By "local query parser", you mean what I call the LocalParams stuff (for > lack of being sure of the proper term)? >

Re: How to limit rows to which highlighting applies

2010-08-22 Thread MitchK
Alex, it sounds like it would make sense. Use cases could be e.g. clustering or similar techniques. However, in my opinion the point of view for such a modification is not the right one. I.e. one wants to have several result sets. I could imagine that one does a primary-query (the query for the d

Re: Doing Shingle but also keep special single word

2010-08-22 Thread MitchK
Hi, keepword-filter is no solution for this problem, since this would lead to the problem that one has to manage a word-dictionary. As explained, this would lead to too much effort. You can easily add outputUnigrams=true and check out the analysis.jsp for this field. So you can see how much

Re: Doing Shingle but also keep special single word

2010-08-23 Thread MitchK
will be equal. Kind regards, - Mitch scott chu wrote: > > I don't quite understand additional-field-way? Do you mean making another > field that stores special words particularly but no indexing for that > field? > > Scott > > - Original Message - &g

Re: Why it's boosted up?

2010-08-24 Thread MitchK
Hi Scott, > (so shorter fields are automatically boosted up). " > The theory behind that is the following (in simple words): Let's say you have two documents, each doc consisting of 1 field (as in my example). Additionally we have a query that contains two words. Let's say doc1 contains o

Re: Solr creates whitespace in dismax query

2010-08-24 Thread MitchK
Johann, try to remove the wordDelimiterFilter from the query-analyzer of your fieldType. If your index-analyzer-wordDelimiterFilter is well configured, it will find everything you want. Does this solve the problem? Kind regards, - Mitch -- View this message in context: http://lucene.472066.n

Re: full control over norm values?

2010-08-27 Thread MitchK
Hi Micheal, have a look at SweetSpotSimilarity (Lucene). Kind regards, - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/full-control-over-norm-values-tp1366910p1367462.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: anyone use hadoop+solr?

2010-09-04 Thread MitchK
Hi, this topic started a few months ago, however there are some questions from my side, that I couldn't answer by looking at the SOLR-1301-issue nor the wiki-pages. Let me try to explain my thoughts: Given: a Hadoop-cluster, a solr-search-cluster and nutch as a crawling-engine which also perform

Re: Show a facet filter "All"

2010-09-05 Thread MitchK
Peter, take a close look at tagging and excluding filters: http://wiki.apache.org/solr/SimpleFacetParameters#LocalParams_for_faceting Another way would be to index your services_raw as services_raw/Exclusive rental services_raw/Fotoreport services_raw/Live music In this case, you can use t

Re: anyone use hadoop+solr?

2010-09-06 Thread MitchK
Thanks for your detailed feedback Andrzej! From what I understood, SOLR-1301 becomes obsolete once Solr becomes cloud-ready, right? > Looking into the future: eventually, when SolrCloud arrives we will be > able to index straight to a SolrCloud cluster, assigning documents to > shards throug

Re: anyone use hadoop+solr?

2010-09-06 Thread MitchK
Yonik, are there any discussions about SolrCloud-indexing? I would be glad to join them, if I find some interesting papers about that topic. - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/anyone-use-hadoop-solr-tp485333p1426469.html Sent from the Solr - User maili

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-06 Thread MitchK
Andrzej, thank you for sharing your experiences. > b) use consistent hashing as the mapping schema to assign documents to a > changing number of shards. There are many explanations of this schema on > the net, here's one that is very simple: > Boom. With the given explanation, I understan
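The consistent-hashing scheme quoted above can be sketched as a small hash ring. This is a toy illustration and not how SolrCloud assigns documents: the class name HashRing, the shard names, the use of MD5 and the number of points per shard are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping document IDs to shard names."""

    def __init__(self, shards, replicas=64):
        self.replicas = replicas
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # hash position -> shard name
        for shard in shards:
            self.add_shard(shard)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_shard(self, shard):
        # Each shard is placed at several points to even out the distribution.
        for i in range(self.replicas):
            point = self._hash(f"{shard}#{i}")
            self._owner[point] = shard
            bisect.insort(self._points, point)

    def shard_for(self, doc_id):
        # Walk clockwise to the first point at or after the document's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._points, self._hash(doc_id))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["shard1", "shard2", "shard3"])
doc_ids = [f"doc{i}" for i in range(1000)]
before = {d: ring.shard_for(d) for d in doc_ids}

ring.add_shard("shard4")
moved = [d for d in doc_ids if ring.shard_for(d) != before[d]]
print(len(moved), "of", len(doc_ids), "documents changed shard")
```

When a shard is added, only documents whose hash now falls in front of one of the new shard's points are reassigned, and all of them go to the new shard. Documents on the other shards keep their assignment, which is the property that makes the scheme attractive for a changing number of shards.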

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-07 Thread MitchK
every hour. :-) Thoughts? Kind regards Andrzej Bialecki wrote: > > On 2010-09-06 16:41, Yonik Seeley wrote: >> On Mon, Sep 6, 2010 at 10:18 AM, MitchK wrote: >> [...consistent hashing...] >>> But it doesn't solve the problem at all, correct me if I am wrong, but: &

Re: SolrCloud distributed indexing (Re: anyone use hadoop+solr?)

2010-09-07 Thread MitchK
I must add something to my last post: When saying it could be used together with techniques like consistent hashing, I mean it could be used at indexing time for indexing documents, since I assumed that the number of shards does not change frequently and therefore an ODV-case becomes relatively i

Re: Solr CoreAdmin create ignores dataDir Parameter

2010-09-10 Thread MitchK
Frank, have a look at SOLR-646. Do you think a workaround for the data-dir-tag in the solrconfig.xml can help? I think about something like ${solr./data/corename} for illustration. Unfortunately I am not very skilled in working with solr's variables and therefore I do not know what variables ar

Re: Swapping cores with SolrJ

2010-09-14 Thread MitchK
Hi Shaun, I think it is easier to fix this problem if we have more information about what is going on in your application. Please, could you provide the CoreAdminResponse returned by car.process() for us? Kind regards, - Mitch -- View this message in context: http://lucene.472066.n3.nabble.

Re: questions about autocommit & committing documents

2010-09-26 Thread MitchK
Hi Andy, Andy-152 wrote: > > > 1 > 1000 > > > has been commented out. > > - With commented out, does it mean that every new document > indexed to Solr is being auto-committed individually? Or that they are not > being auto-committed at all? > I am not sure, whether there is a de

Re: questions about autocommit & committing documents

2010-09-26 Thread MitchK
First: Usually you do not use post.jar for updating your index. It's a simple tool, but normally you use features like the csv- or xml-update-RequestHandler. Have a look at "UpdateCSV" and "UpdateXMLMessages" in the wiki. There you can find examples on how to commit explicitly. With the post.jar

Custom Analyzer/Tokenizer works but results were not saved

2010-01-05 Thread MitchK
Hello community, I wrote another mail today, but I think something went wrong (I can't find my post in the mailing list) - if not, I am sorry for posting a "doublepost" - I am using a mailing list for the first time. I have created a custom analyzer, which consists of a LowerCaseTokenizer, a StopFilt

No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
I have tested a lot and all the time I thought I set wrong options for my custom analyzer. Well, I have noticed that Solr isn't using ANY analyzer, filter or stemmer. It seems like it only stores the original input. I am using the example-configuration of the current Solr 1.4 release. What's wron

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
aved data afterwards. Thank you. Mitch Erick Erickson wrote: > > <<>> > > How do you know this? Because it's highly unlikely that SOLR > is completely broken on that level..... > > Erick > > On Wed, Jan 6, 2010 at 3:48 PM, MitchK wrote: >

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-06 Thread MitchK
indexed = true. My language was a little bit tricky ;). ryantxu wrote: > > > On Jan 6, 2010, at 3:48 PM, MitchK wrote: > >> >> I have tested a lot and all the time I thought I set wrong options >> for my >> custom analyzer. >> Well, I have noticed t

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-07 Thread MitchK
thinking going on implementing it such that analyzed > output is stored. > > You can, however, use the analysis request handler componentry to get > analyzed stuff back as you see it in analysis.jsp on a per-document or > per-field text basis - if you're looking to

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-07 Thread MitchK
ields? ryantxu wrote: > > > On Jan 7, 2010, at 10:50 AM, MitchK wrote: > >> >> Eric, >> >> you mean, everything is okay, but I do not see it? >> >>>> Internally for searching the analysis takes place and writes to the >>>> index in an

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-07 Thread MitchK
dex it in f1 > but do NOT store it in f1. Store it in f2 > but do NOT index it in f2. > 2> take that same data, index AND store > it in f3. > > <1> is almost entirely equivalent to <2> > in terms of index resources. > > Practically though,

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-08 Thread MitchK
use case you're trying to implement > or is this mostly theoretical? > > Erick > > On Thu, Jan 7, 2010 at 2:08 PM, MitchK wrote: > >> >> The difference between stored and indexed is clear now. >> >> You are right, if you are responsing only to

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-11 Thread MitchK
Hello Hossman, sorry for my late response. For this specific case, you are right. It makes more sense to do such work "on the fly". However, I am only testing at the moment what one can do with Solr and what not. Is the UpdateProcessor something that comes from Lucene itself or from Solr? Th

Re: No Analyzer, tokenizer or stemmer works at Solr

2010-01-11 Thread MitchK
riginal query - how can I do that? Erik Hatcher-4 wrote: > > > On Jan 11, 2010, at 7:33 AM, MitchK wrote: >> Is the UpdateProcessor something that comes froms Lucene itself or >> from >> Solr? > > It's at the Solr level - > <

Term Dictionary + scoring

2010-01-15 Thread MitchK
Hello, I have searched the wiki and the mailing-lists, but I can't find any postings for the following training-use cases. First: I want to create a Term Dictionary, which I can response to my client. The client should be able to manipulate this response in any way he wants - so I really need a

Re: Problem with text field in Solr

2010-01-15 Thread MitchK
What is analysis.jsp showing to you, when you query the words? Due to stemming the input, there could be the mistake. What happens, if you search for "aviation" without wildcards? -- View this message in context: http://old.nabble.com/Problem-with-text-field-in-Solr-tp27175346p27175827.html Sen

Re: [1.3] help with update timeout issue?

2010-01-15 Thread MitchK
If, and only if, you need to fix your problem as fast as you can, I would think about virtualization. You need to replicate your Solr and its index files. The idea is quite easy: while one Solr server does its optimization, the other one is available for searching documents without any downtime.

Re: [1.3] help with update timeout issue?

2010-01-15 Thread MitchK
The current topic "Need deployment strategy" may give you another answer quite similar to mine. It sounds much cleaner. -- View this message in context: http://old.nabble.com/-1.3--help-with-update-timeout-issue--tp27171798p27179780.html Sent from the Solr - User mailing list archive at Nab

Re: Term Dictionary + scoring

2010-01-16 Thread MitchK
Grant, thank you for the link to the wiki. TermsComponent was unknown to me until now. It sounds good! > Generally, this clickthrough tracking is tied to the query, so you need a > layer above just popularity. You >need popularity per query (or in all > likelihood a subset of the queries, since

Re: Fundamental questions of how to build up solr for huge portals

2010-01-16 Thread MitchK
Hello Peter, well, I am no expert on Solr, but what you want to do sounds like a case for several SolrCores [1]. I am thinking of one core per portal and one super-core to search over all portals. This would be redundant and several information will be stored twice or more times. Another way woul

Re: Term Dictionary + scoring

2010-01-17 Thread MitchK
An idea of mine was to set up two SolrCores, since I don't know how to create a layer above popularity. Let's call them QueryCore and DataCore. Querying for the first time against the DataCore would lead to a normal response, since there were no queries before. Let's say we are querying for "Sta

Re: Google Commerce Search

2010-01-17 Thread MitchK
mrbelvedr wrote: > > * Index our MS Sql Server 2008 product table > Yes. Have a look at http://wiki.apache.org/solr/DataImportHandlerFaq mrbelvedr wrote: > > * Spell check for product brand names - user enters brand "sharpee" and > the search engine will reply "Did you mean 'Sharpie'? " >

Re: Google Commerce Search

2010-01-17 Thread MitchK
As you know, Solr is fully written in Java and Java is still platform-independent. ;) Learn more about Solr on http://www.lucene.apache.org/solr mrbelvedr wrote: > > That sounds great. Could it also run on Windows? I am interested in > hiring an experienced Solr freelancer to help us set up S

Re: Google Commerce Search

2010-01-17 Thread MitchK
As you know, Solr is fully written in Java and Java is still platform-independent. ;) Learn more about Solr on http://lucene.apache.org/solr/ mrbelvedr wrote: > > That sounds great. Could it also run on Windows? I am interested in > hiring an experienced Solr freelancer to help us set up Solr

Re: Google Commerce Search

2010-01-17 Thread MitchK
Bill, are you comparing percentage-rates or total numbers? Keep in mind that Google owns a big part of the traffic-cake. Kind regards Mitch William Pierce-3 wrote: > > Let me give you an example from my own personal experience. We submit > data feed of products from my clients to various s

Getting the Lucene Document ID for TermVectorComponent

2010-01-30 Thread MitchK
Hello community, using the TermVectorComponent for a special set of documents, I will need the Lucene Document IDs.[1] How do I get them from a response? I have searched the Solr- and the Lucene-wiki but can't find anything to solve this problem. By the way: Are there any known bugs with using

Re: Getting the Lucene Document ID for TermVectorComponent

2010-01-31 Thread MitchK
Hoss, this will have the same effect as I expected. Why wasn't that my first thought? :) Thank you! -- View this message in context: http://old.nabble.com/Getting-the-Lucene-Document-ID-for-TermVectorComponent-tp27382599p27390985.html Sent from the Solr - User mailing list archive at Nabble.com

Force Solr to use special response-rules

2010-02-15 Thread MitchK
Hello community, with the help of the sloppy phrase query [1] I can specify that queried words have to occur within a certain number of words. Now, I want to create an extra rule: If the query consists of 9 words, I want to make sure that 6 of them have to occur within a document or else it would

Filter Query or Main Query or facetting?

2010-03-06 Thread MitchK
Hello community, I am not sure what the best way is to handle the following problem: I have got an index, let's say with 2 million documents, and there is a check-field. The check-field contains boolean values (TRUE/FALSE). What is the best way to query only documents with a TRUE check-value

Re: Filter Query or Main Query or facetting?

2010-03-06 Thread MitchK
Yes, that's possible. However I thought that the normal q-param forces Solr to look up for every check-field whether it is true or false. So I am looking for something like a tree that divides the index into two pieces - true and false. So Solr does not need to look up the check-field anymore, because

Re: Filter Query or Main Query or facetting?

2010-03-06 Thread MitchK
it sounds like you're after. > > When measuring speed, remember that the first few queries > aren't representative. > > HTH > Erick > On Sat, Mar 6, 2010 at 12:32 PM, MitchK wrote: > >> >> Yes, that's possible. >> >> However I tho

Re: Free Webinar: Mastering Solr 1.4 with Yonik Seeley

2010-03-07 Thread MitchK
Last but not least: When can we view it? :) -- View this message in context: http://old.nabble.com/Re%3A-Free-Webinar%3A-Mastering-Solr-1.4-with-Yonik-Seeley-tp27720526p27810048.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Free Webinar: Mastering Solr 1.4 with Yonik Seeley

2010-03-07 Thread MitchK
Sorry, I did not recognize that it already took place. Thank you for the link. -- View this message in context: http://old.nabble.com/Re%3A-Free-Webinar%3A-Mastering-Solr-1.4-with-Yonik-Seeley-tp27720526p27811668.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Error 400 - By search with exclamation mark ... ?! PatternReplaceFilterFactory ?

2010-03-07 Thread MitchK
According to Ahmet Arslan's Post: Solr is expecting a word after the "!", because it is an operator. If you escape it, it is part of the queried string. -- View this message in context: http://old.nabble.com/Error-400---By-search-with-exclamation-mark-...--%21-PatternReplaceFilterFactory---tp277

Re: Handling and sorting email addresses

2010-03-07 Thread MitchK
Ian, did you have a look at Solr's admin analysis.jsp? When everything on the analysis page is fine, you have misunderstood Solr's schema.xml file. You've set two attributes in your schema.xml: stored = true indexed = true What you get as a response is the stored field value. The stored fiel
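The difference between the stored and the indexed value can be sketched with a toy index. This is a simplified model and not Lucene's data structures: the analysis chain (whitespace split, lowercasing, a three-word stop list), the class name ToyIndex and the sample text are assumptions made for illustration.

```python
STOPWORDS = {"the", "a", "of"}


def analyze(text):
    """Toy analysis chain: whitespace tokenizer, lowercase filter, stop filter."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


class ToyIndex:
    def __init__(self):
        self.stored = {}     # doc id -> original, unmodified field value
        self.postings = {}   # analyzed term -> set of doc ids

    def add(self, doc_id, text):
        self.stored[doc_id] = text               # "stored": kept verbatim
        for term in analyze(text):               # "indexed": analyzed terms
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        # The query is analyzed the same way; hits return the stored value.
        hits = set()
        for term in analyze(query):
            hits |= self.postings.get(term, set())
        return [self.stored[d] for d in sorted(hits)]


index = ToyIndex()
index.add(1, "The History of Aviation")
print(index.search("AVIATION"))
print(sorted(index.postings))
```

The search matches through the analyzed terms, but the response shows the original text. That is why the output of the analyzer never appears in the returned field values even though it is used for matching.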
