Performance Drop from 1.3 to 1.4

2009-08-31 Thread Ilan Rabinovitch
Hello, We recently began migrating a few of our applications from 1.3 to 1.4 in order to take advantage of the replication and performance improvements. In practice however, we are noticing that our instances which make use of LocalSolr have experienced some performance degradation from 1.3

Re: solr and approximate string matching

2009-08-31 Thread Ryszard Szopa
Hi, On Sun, Aug 30, 2009 at 9:32 PM, Shalin Shekhar Mangar wrote: > The best way to debug these kind of problems is to look at analysis.jsp > and/or use debugQuery=on on the query to see exactly how it is being parsed. > > Can you post the output of your query with debugQuery=on? Thanks a lot fo

sql server indexing using dih problem

2009-08-31 Thread rameshgalla
Hi, I am trying to index sql server table using dih. my data-config.xml file configuration: When i have tried to debug i got the following error: - - 0 29672 - - db-data-config.xml full-import debug - - - select CustomerID,Title,For

How to set similarity to catch more results ?

2009-08-31 Thread Kaoul
Hello, I'm new to Solr and don't find in documentation how-to to set similarity. I want it more flexible, as if I make a mistake with letters, results are found like with google. Thank you in advance.

Re: How to set similarity to catch more results ?

2009-08-31 Thread rajan chandi
There are fuzzy searches which might be able to help a bit. There could be more but I am just a newbie. Regards Rajan On Mon, Aug 31, 2009 at 3:30 PM, Kaoul wrote: > Hello, > > I'm new to Solr and don't find in documentation how-to to set > similarity. I want it more flexible, as if I make a mi

Re: sql server indexing using dih problem

2009-08-31 Thread Shalin Shekhar Mangar
On Mon, Aug 31, 2009 at 3:25 PM, rameshgalla wrote: > > Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP > connection to the host 10.232.6.38, port 1433 has failed. Error: > "Connection > refused: connect. Verify the connection properties, check that an instance > of SQL Serv

Hierarchical schema design

2009-08-31 Thread Pooja Verlani
Hi all, Is there a possibility to have a hierarchical schema in solr, meaning can we have objects under objects. For example, for a doc like: ,b3> . . . . . . . I need to make schema with 3 types of such objects and all of them having different field va

Re: filtering facets

2009-08-31 Thread Mike Topper
Hi Olivier, are the facet counts on the URLs you don't want 0? If so, you can use facet.mincount to only return results greater than 0. -Mike Olivier H. Beauchesne wrote: > Hi, > > Long time lurker, first time poster. > > I have a multi-valued field, let's call it article_outlinks containing > al
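As a rough illustration of the facet.mincount suggestion, the request parameters could be assembled like this. This is only a sketch: the helper name facet_query_params and the query string "solr" are made up for the example, and article_outlinks is the placeholder field name from the quoted message.

```python
from urllib.parse import urlencode


def facet_query_params(query, facet_field, mincount=1):
    """Build a Solr query string that facets on one field and
    drops facet values whose count is below `mincount`."""
    return urlencode({
        "q": query,
        "facet": "true",
        "facet.field": facet_field,
        # mincount=1 hides every facet value with a count of 0
        "facet.mincount": mincount,
    })


# Hypothetical usage: append the result to your own /select URL.
params = facet_query_params("solr", "article_outlinks")
```

Note that this only removes zero-count values; it does not restrict a multivalued field to the values that matched the query.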

Re: Performance Drop from 1.3 to 1.4

2009-08-31 Thread Yonik Seeley
I don't know exactly how the local solr stuff currently works (it's not currently part of Solr), but it's possible to get worse memory performance if you're not careful. Solr and Lucene now do per-segment searching and sorting in a single index... and that means fieldcache entries populated at the

Re: filtering facets

2009-08-31 Thread Olivier H. Beauchesne
Hi Mike, No, my problem is that the field article_outlinks is multivalued, thus it contains several urls not related to my search. I would like to facet only urls matching my query. For example (only on one document, but my search targets over 1M docs): Doc1: article_url: url1.com/1 url2.com/2

Help! Issue with tokens in custom synonym filter

2009-08-31 Thread Lajos
Hi all, I've been writing some custom synonym filters and have run into an issue with returning a list of tokens. I have a synonym filter that uses the WordNet database to extract synonyms. My problem is how to define the offsets and position increments in the new Tokens I'm returning. For a

Re: Help! Issue with tokens in custom synonym filter

2009-08-31 Thread AHMET ARSLAN
> I've been writing some custom synonym filters and have run > into an issue with returning a list of tokens. I have a > synonym filter that uses the WordNet database to extract > synonyms. My problem is how to define the offsets and > position increments in the new Tokens I'm returning. > > For a

Is caching worth it when my whole index is in RAM?

2009-08-31 Thread Michael
Hi, If I've got my entire 20G 4MM document index in RAM (on a ramdisk), do I have a need for the document cache? Or should I set it to 0 items, because pulling field values from an index in RAM is so fast that the document cache would be a duplication of effort? Are there any other caches that I

Re: Help! Issue with tokens in custom synonym filter

2009-08-31 Thread Smiley, David W.
Although this is not a direct answer to your question, you may want to consider generating a synonyms file from wordnet. Then, you can use the standard synonym filter in Solr. The only downside to this is that the synonym file might be pretty large... but you've probably got some large file fo

Re: filtering facets

2009-08-31 Thread Michael
You could post-process the response and remove urls that don't match your domain pattern. On Mon, Aug 31, 2009 at 9:45 AM, Olivier H. Beauchesne wrote: > Hi Mike, > > No, my problem is that the field article_outlinks is multivalued thus it > contains several urls not related to my search. I would
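A minimal sketch of that kind of client-side post-processing follows. It assumes the facet counts arrive as a flat, alternating list of value and count (check what your response writer actually returns), and the function name filter_facet_counts is hypothetical.

```python
from urllib.parse import urlparse


def filter_facet_counts(flat_counts, domain):
    """Keep only facet values whose host is `domain` or one of its subdomains.

    `flat_counts` is assumed to alternate value, count, value, count, ...
    """
    kept = []
    for value, count in zip(flat_counts[0::2], flat_counts[1::2]):
        # Values such as "url1.com/1" carry no scheme, so prefix "//"
        # to let urlparse recognise the host part.
        target = value if "://" in value else "//" + value
        host = urlparse(target).hostname or ""
        if host == domain or host.endswith("." + domain):
            kept.append((value, count))
    return kept
```

As the follow-up in this thread points out, the cost of this approach is that the client may have to fetch a very large number of facet values before filtering.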

Re: Hierarchical schema design

2009-08-31 Thread Uri Boness
Hi, The search index is flat. There are no hierarchies in there. Now, I'm not sure what you're referring to with "this type of objects". But if you refer to having different types of documents in one index (and schema) that's certainly possible. You can define all the fields that you expect i

Re: How to set similarity to catch more results ?

2009-08-31 Thread Aakash Dharmadhikari
hi Kaoul, There are multiple ways that you can use to get the desired results. - Stemming - this makes all forms of a word (e.g. Run, Running, Runner) match to its stem or root word Run. - Synonyms - this will take a list of synonyms from you and would match veg = vegetarian and ev

Re: Release Date Solr 1.4

2009-08-31 Thread Yonik Seeley
Many of you probably know that Lucene went into code-freeze last Thursday... which puts a probable Lucene release date at the end of this week. My day-job colleagues and I are all traveling this week (company get-together) so that may slow things down a bit for some of us, and perhaps cause the go

Re: Dismax Wildcard Queries

2009-08-31 Thread Smiley, David W.
Hi Kurt. I'm the author of those JIRA issues. I'm glad you have interest in them. Please vote for them if you have not done so already. I updated SOLR-758 and I hope it works out okay for you. If you have further questions, please comment on the relevant issues. ~ David Smiley Author: htt

Re: Why can't have & sign in the text?

2009-08-31 Thread AHMET ARSLAN
> I use text as my field type, but whenever my field has > '&' sign, the post.jar will error out. What can I do to work around this? > Thanks. The files - that you are posting - must be valid xml. Escape special xml characters, e.g. replace & with &amp;
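If the XML files are generated by a script, the escaping can be left to a library rather than done by hand. The sketch below uses Python's standard xml.sax.saxutils; the helper name solr_add_doc and the sample field are illustrative only, not part of post.jar.

```python
from xml.sax.saxutils import escape, quoteattr


def solr_add_doc(fields):
    """Render a dict of field name -> text as a Solr <add> document,
    escaping &, < and > in the values so the XML stays well-formed."""
    parts = ["<add><doc>"]
    for name, value in fields.items():
        # quoteattr() returns the name already wrapped in quotes
        parts.append("<field name=%s>%s</field>" % (quoteattr(name), escape(value)))
    parts.append("</doc></add>")
    return "".join(parts)


# Hypothetical usage: write the result to a file and post it as usual.
xml_doc = solr_add_doc({"title": "R&D <team>"})
```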

Re: WordDelimiterFilter to QueryParser to MultiPhraseQuery?

2009-08-31 Thread jOhn
This is mostly my misunderstanding of catenateAll="1" as I thought it would break down with an OR using the full concatenated word. Thus: Jokers Wild -> { jokers, wild } OR { jokerswild } But really it becomes: { jokers, {wild, jokerswild}} which will not match. And if you have a mistyped camel

Date Faceting and Double Counting

2009-08-31 Thread Stephen Duncan Jr
If we do date faceting and start at 2009-01-01T00:00:00Z, end at 2009-01-03T00:00:00Z, with a gap of +1DAY, then documents that occur at exactly 2009-01-02T00:00:00Z will be included in both the returned counts (2009-01-01T00:00:00Z and 2009-01-02T00:00:00Z). At the moment, this is quite bad for u
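The double counting can be reproduced with a small model of the buckets. This is a sketch of the arithmetic only, not Solr's implementation: bucket_counts is a made-up helper, and the half-open variant is included purely for comparison.

```python
from datetime import datetime, timedelta


def bucket_counts(timestamps, start, end, gap, upper_inclusive=True):
    """Count timestamps per [lower, lower + gap] bucket between start and end.

    With upper_inclusive=True both bucket edges are inclusive, so a
    timestamp sitting exactly on a boundary lands in two buckets.
    """
    counts = {}
    lower = start
    while lower < end:
        upper = lower + gap
        if upper_inclusive:
            n = sum(1 for t in timestamps if lower <= t <= upper)
        else:
            n = sum(1 for t in timestamps if lower <= t < upper)
        counts[lower] = n
        lower = upper
    return counts


docs = [datetime(2009, 1, 2)]  # exactly on the boundary between two days
start, end, gap = datetime(2009, 1, 1), datetime(2009, 1, 3), timedelta(days=1)
inclusive = bucket_counts(docs, start, end, gap)
half_open = bucket_counts(docs, start, end, gap, upper_inclusive=False)
```

With inclusive edges the single document is counted once in each of the two day buckets; with half-open buckets it is counted only in the bucket that starts on 2009-01-02.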

Re: Help! Issue with tokens in custom synonym filter

2009-08-31 Thread Lajos
Hi David & Ahmet, I hadn't seen the SynonymTokenFilter from Lucene, so that helped. Ultimately, however, it seems I was pretty much doing the right thing, although my token type might have been wrong. Unfortunately, while the tokens are being returned properly (AFAIK), when I do a query usin

Why can't have & sign in the text?

2009-08-31 Thread Elaine Li
Hi, I use text as my field type, but whenever my field has '&' sign, the post.jar will error out. What can I do to work around this? Thanks. solr returned an error: comctcwstxexcWstxLazyException_Unexpected_character___code_32_missing_name__ at_javaxxmlstreamSerializableLocation587f587f__comctcws

Re: Why can't have & sign in the text?

2009-08-31 Thread Elaine Li
Thanks a lot! Really helped. On Mon, Aug 31, 2009 at 2:21 PM, AHMET ARSLAN wrote: >> I use text as my field type, but whenever my field has >> '&' sign, the post.jar will error out. What can I do to work around this? >> Thanks. > > The files - that you are posting - must be valid xml. Escape spe

Re: filtering facets

2009-08-31 Thread Olivier H. Beauchesne
yeah, but then I would have to retrieve *a lot* of facets. I think for now I'll retrieve all the subdomains with facet.prefix and then merge those queries. Not ideal, but when I will have more motivation, I will submit a patch to solr :-) Michael wrote: You could post-process the response

Re: Sorting by Unindexed Fields

2009-08-31 Thread Isaac Foster
Hi Erik, Sorry it took me a while to get back to your response. I appreciate any help I can get. The number of documents will start out small, but if we do well we'll have a lot. The fields would all be numeric (we'll map categorical fields to integers), and I would imagine the number of fields w

Re: Field names with whitespaces

2009-08-31 Thread Jay Hill
This seems to work: ?q=field\ name:something Probably not a good idea to have field names with whitespace though. -Jay 2009/8/28 Marcin Kuptel > Hi, > > Is there a way to query solr about fields which names contain whitespaces? > Indexing such data does not cause any problems but I have been
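The backslash escaping shown above can be applied programmatically when field names come from elsewhere. This is a small sketch with a hypothetical helper name; it only handles whitespace, not the other characters the query parser treats specially.

```python
import re


def escape_field_name(name):
    """Prefix every whitespace character in a field name with a backslash."""
    return re.sub(r"(\s)", r"\\\1", name)


# Builds the same query clause as in the message above.
clause = escape_field_name("field name") + ":something"
```

When the clause is placed in a URL, it still needs normal URL encoding on top of this.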

Re: How to set similarity to catch more results ?

2009-08-31 Thread Avlesh Singh
> > I want it more flexible, as if I make a mistake with letters, results are > found like with google. > You are talking about spelling mistakes? http://wiki.apache.org/solr/SpellCheckComponent Cheers Avlesh On Mon, Aug 31, 2009 at 3:30 PM, Kaoul wrote: > Hello, > > I'm new to Solr and don't f

Re: Date Faceting and Double Counting

2009-08-31 Thread Avlesh Singh
I don't think this behavior needs to be fixed. It is justified for the data you have indexed. "date minus 1 second" should definitely work for you. Cheers Avlesh On Mon, Aug 31, 2009 at 11:37 PM, Stephen Duncan Jr < stephen.dun...@gmail.com> wrote: > If we do date faceting and start at 2009-01-0

Re: Hierarchical schema design

2009-08-31 Thread Avlesh Singh
As Uri has already replied, there is no concept of "a hierarchical schema" in Solr. My gut feeling says you might be talking about multiple cores. Cheers Avlesh On Mon, Aug

Re: Is caching worth it when my whole index is in RAM?

2009-08-31 Thread Avlesh Singh
Good question! The application level cache, say filter cache, would still help because it not only caches values but also the underlying computation. Even with all the data in your RAM you will still end up doing the computations every time. Looking for responses from the more knowledgeable. Chee

Re: filtering facets

2009-08-31 Thread Avlesh Singh
> > when I will have more motivation, I will submit a patch to solr :-) > You want to add more here?- https://issues.apache.org/jira/browse/SOLR-1387 Cheers Avlesh On Tue, Sep 1, 2009 at 2:51 AM, Olivier H. Beauchesne wrote: > yeah, but then I would have to retrieve *a lot* of facets. I think fo

Monitoring split time for fq queries when filter cache is used

2009-08-31 Thread Rahul R
Hello, I am trying to measure the benefit that I am getting out of using the filter cache. As I understand, there are two major parts to an fq query. Please correct me if I am wrong : - doing full index queries of each of the fq params (if filter cache is used, this result will be retrieved from th

Drill down into hierarchical facet : how to?

2009-08-31 Thread clico
Hello, I'm looking for a way to do that. I have a hierarchical facet, e.g.: Continent / Country / City / Block: Europe/France/Paris/Saint Michel, America/USA/NYC/Chelsea, etc. I have some points of interest tagged at different levels of the same tree, e.g. some POI can be tagged Saint Michel and othe