Re: Using Solr with CouchDB

2010-04-28 Thread Uri Boness
Jumping in late here, but if you're interested, we're currently implementing an LCF connector for CouchDB at JTeam (http://www.jteam.nl). We'll make it available online and try to contribute it back to LCF. We'll also soon publish a blog post about it as an example of how to develop custom rep

Lucene User Group Meetup in Amsterdam

2010-02-03 Thread Uri Boness
Hi All, On 17th February we'll host the first Dutch Lucene User Group Meetup. This meet-up will be split into two parts: - The first part will be dedicated to the user group itself. We'll have an introduction to the members and have an open discussion about the goals of the user group and th

Re: Solr vs. Compass

2010-01-23 Thread Uri Boness
"USA" (3 characters) is repeated in few millions documents, and field "Canada" (6 characters) in another few; no any "relational", it's done automatically without any Compass/Hibernate/Table(s) Don't think "relational". I wrote this 2 years ago: h

Re: Solr vs. Compass

2010-01-21 Thread Uri Boness
There seems to be an implication that compass won't scale as well as solr - and I'm not sure that's true at all. They will both scale as well as the underlying Lucene. Lucene doesn't handle distributed search or replication out of the box, you have to implement it using some of its features (d

Re: Solr vs. Compass

2010-01-21 Thread Uri Boness
In addition, the most appealing feature of Compass is that it's transactional and therefore integrates well with your infrastructure (Spring/EJB, Hibernate, JPA, etc...). This obviously is nice for some systems (not very large scale ones) and the programming model is clean. On the other hand

Re: schema question

2010-01-18 Thread Uri Boness
Yeah, probably the SignatureUpdateProcessorFactory can do the trick, but you still need to write a custom Signature. (we should really offer a simple "ConcatSignature" implementation for generating predictable combination keys) +1 Cheers, Uri Chris Hostetter wrote: : TemplateTransformer. Other

Re: schema question

2010-01-17 Thread Uri Boness
If you're using DataImportHandler then this can easily be done with a TemplateTransformer. Otherwise, if you really must do it in Solr you can write your own custom UpdateProcessor and plug it in: DIH TemplateTransformer: http://wiki.apache.org/solr/DataImportHandler#TemplateTransformer Update

Re: Filter exclusion on query facets?

2009-12-15 Thread Uri Boness
Yes, you can tag filters using the new local params format and then explicitly exclude them when providing the facet fields. see: http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters Cheers, Uri Mat Brown wrote: Hi all, Just wondering if it's possible to do filter
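A minimal sketch of the tag/exclude syntax from the wiki page linked above, written in Python only to assemble the request parameters. The field name "doctype", the value "pdf", the tag "dt" and the helper name are placeholders chosen for illustration, not taken from the thread.

```python
from urllib.parse import urlencode


def tagged_facet_params(user_query, field, value, tag):
    """Build Solr parameters that filter on `field` but still facet on it.

    The filter query is labelled with {!tag=...}; the facet field then
    names that label in {!ex=...} so the filter is ignored when the
    facet counts for this field are computed.
    """
    return [
        ("q", user_query),
        ("facet", "true"),
        ("fq", "{!tag=%s}%s:%s" % (tag, field, value)),
        ("facet.field", "{!ex=%s}%s" % (tag, field)),
    ]


# Hypothetical field, value and tag names.
params = tagged_facet_params("mainquery", "doctype", "pdf", "dt")
query_string = urlencode(params)
```

The resulting query string can be appended to a select URL; the same {!ex=...} prefix can also be placed in front of a facet.query value.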

Re: how to do auto-suggest case-insensitive match and return original case field values

2009-12-08 Thread Uri Boness
Just updated SOLR-1625 to support regexp hints. https://issues.apache.org/jira/browse/SOLR-1625 Cheers, Uri Chris Hostetter wrote: : In my web application I want to set up auto-suggest as you type : functionality which will search case-insensitively yet return the original : case terms. It do

Re: Query time boosting with dismax

2009-12-05 Thread Uri Boness
sed though. Erik On Dec 5, 2009, at 6:54 AM, Uri Boness wrote: You can actually define boost queries to do that (bq parameter). Boost queries accept the standard Lucene query syntax and are eventually appended to the user query. Just make sure that the default operator is set to OR, otherwise t

Re: Query time boosting with dismax

2009-12-05 Thread Uri Boness
the setting in schema.xml. I think boosting queries are OR'd in automatically to the main query: From DismaxQParser#addBoostQuery() ... query.add(f, BooleanClause.Occur.SHOULD);... There is one case where query.add((BooleanClause) c); is used though. Erik On Dec 5, 2009, at 6:54

Re: Query time boosting with dismax

2009-12-05 Thread Uri Boness
You can actually define boost queries to do that (bq parameter). Boost queries accept the standard Lucene query syntax and are eventually appended to the user query. Just make sure that the default operator is set to OR, otherwise these boost queries will not only influence the boosts but also filt
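A small sketch of what a dismax request with a boost query might look like, assuming hypothetical field names ("title", "description", "category") and a made-up boost value. It only assembles the parameters; how the boost clauses combine with the main query is discussed in the replies in this thread.

```python
from urllib.parse import urlencode


def dismax_params(user_query, query_fields, boost_queries):
    """Assemble dismax parameters with one bq entry per boost query."""
    params = [
        ("defType", "dismax"),
        ("q", user_query),
        ("qf", " ".join(query_fields)),
    ]
    # Each boost query uses standard Lucene syntax, e.g. field:value^boost.
    params.extend(("bq", bq) for bq in boost_queries)
    return params


# Hypothetical fields and boost value, for illustration only.
params = dismax_params("ipod", ["title", "description"], ["category:featured^2.0"])
query_string = urlencode(params)
```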

Solr Training in Europe

2009-11-09 Thread Uri Boness
Hi All, For those who are interested, the official Lucid Solr trainings are now available in Europe. The first training - "Introduction to Solr" is a 3-day training covering the basics and some of the more advanced features of Solr. It is scheduled for 30th November (till 2nd December) and wil

latest lucene libraries in maven repo

2009-11-01 Thread Uri Boness
Hi, It seems that the latest lucene libraries are not up to date in the Solr maven repo (http://people.apache.org/repo/m2-snapshot-repository/org/apache/solr/solr-lucene-core/1.4-SNAPSHOT/) Can we expect them to be updated soon? Cheers, Uri

Re: solr web ui

2009-10-31 Thread Uri Boness
If you wish to save yourself from the hassle of applying the patch, you can also download it from http://www.jteam.nl/news/solrexplorer Uri Grant Ingersoll wrote: There is also a GWT contribution in JIRA that is pretty handy and will likely be added in 1.5. See http://issues.apache.org/jira/

Re: conditional sorting

2009-10-02 Thread Uri Boness
If the threshold is only 10, why can't you always sort by popularity and if the result set is <10 then re-sort on the client side based on date_entered? Uri Bojan Šmid wrote: Hi all, I need to perform sorting of my query hits by different criterion depending on the number of hits. For instanc
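The client-side re-sort suggested above could be sketched as follows. This assumes the hits arrive already sorted by popularity as a list of dicts, that date_entered is an ISO-8601 string (so string order matches date order), and that newest-first is the wanted order; all three are assumptions for illustration.

```python
def order_hits(hits, threshold=10):
    """Keep the server's popularity order unless the result set is small.

    When fewer than `threshold` hits come back, re-sort them on the
    client by date_entered (newest first, an assumed direction).
    """
    if len(hits) < threshold:
        return sorted(hits, key=lambda doc: doc["date_entered"], reverse=True)
    return hits


# Made-up documents, already in popularity order.
small_result = [
    {"id": "a", "date_entered": "2009-09-01T00:00:00Z"},
    {"id": "b", "date_entered": "2009-10-01T00:00:00Z"},
]
reordered = order_hits(small_result)
```

Because the small result set fits in a single page, the extra sort costs almost nothing on the client.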

Re: field collapsing sums

2009-09-30 Thread Uri Boness
Hi, At the moment I think the most appropriate place to put it is in the AbstractDocumentCollapser (in the getCollapseInfo method). Though, it might not be the most efficient. Cheers, Uri Joe Calderon wrote: hello all, i have a question on the field collapsing patch, say i have an integer f

Re: Single Core or Multiple Core?

2009-09-14 Thread Uri Boness
enario A then it works differently and you have to do this instead". Shalin Shekhar Mangar wrote: On Mon, Sep 14, 2009 at 8:16 PM, Uri Boness wrote: Is it really a problem? I mean, as i see it, solr to cores is what RDBMS is to databases. When you connect to a database you also need t

Re: Single Core or Multiple Core?

2009-09-14 Thread Uri Boness
forces you to use a core name. this is inconvenient. We must get rid of this restriction before we move single-core to multicore. On Sat, Sep 12, 2009 at 3:14 PM, Uri Boness wrote: +1 Can you add a JIRA issue for that so we can vote for it? Chris Hostetter wrote: : > For the record: even

Re: Single Core or Multiple Core?

2009-09-12 Thread Uri Boness
+1 Can you add a JIRA issue for that so we can vote for it? Chris Hostetter wrote: : > For the record: even if you're only going to have one SolrCore, using the : > multicore support (ie: having a solr.xml file) might prove handy from a : > maintenance standpoint ... the ability to configure new "

Re: Very slow first query

2009-09-11 Thread Uri Boness
"Not having any facet" and "Not using a filter cache" are two different things. If you're not using query filters, you can still have facet calculated and returned as part of the search result. The facet component uses lucene's field cache to retrieve values for the facet field. Jonathan Ariel

Re: Query runs faster without filter queries?

2009-09-10 Thread Uri Boness
If I recall correctly, in solr 1.3 there was an issue where filters didn't really behave as they should have. Basically, if you had a query and filters defined, the query would have executed normally and only after that the filter would be applied. AFAIK this is fixed in 1.4 where now the docu

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-10 Thread Uri Boness
llapse together again). Either way it's not ideal. At the time (many months ago) there was no way to account for this but it sounds like this patch could make it possible, maybe. Thanks! -- Steve On Sep 5, 2009, at 5:57 AM, Uri Boness wrote: There's work on the patch that is being d

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-10 Thread Uri Boness
es by default? On Sun, Sep 6, 2009 at 3:58 AM, Uri Boness wrote: You can check out http://www.ilocal.nl. If you search for a bank in Amsterdam then you'll see that a lot of the results are collapsed. For this we used an older version of this patch (which works on 1.3) but a lot has cha

Re: Catchall field and facet search

2009-09-09 Thread Uri Boness
Hi, This is a bit tricky but I think you can achieve it as follows: 1. have a field called "location_facet" which holds the logical path of the location for each address (e.g. /Europe/England/London) 2. have another multi-valued field "location_search" that holds all the locations - your "catc

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-07 Thread Uri Boness
requirements. Do you know any live site using field collapsing already? On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parame

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-05 Thread Uri Boness
any live site using field collapsing already? On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committe

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-05 Thread Uri Boness
he results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much. On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness wrote: The collapsed documents are represented by one "master" document

Re: Using scoring from another program

2009-09-03 Thread Uri Boness
Function queries are what you need: http://wiki.apache.org/solr/FunctionQuery Paul Tomblin wrote: Every document I put into Solr has a field "origScore" which is a floating point number between 0 and 1 that represents a score assigned by the program that generated the document. I would like it t

Re: Best way to do a lucene matchAllDocs not using q.alt=*:*

2009-09-03 Thread Uri Boness
You can use the LukeRequestHandler: http://localhost:8983/solr/admin/luke Marc Sturlese wrote: Hey there, I need a query to get the total number of documents in my index. I can get if I do this using DismaxRequestHandler: q.alt=*:*&facet=false&hl=false&rows=0 I have noticed this query is very memory
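A rough sketch of reading the document count from the Luke handler, assuming the handler is registered at /admin/luke as in the URL above and that the JSON response carries the count under index.numDocs. The numTerms=0 setting is used here only to keep the response small, and the sample response dict is hand-written for illustration rather than captured from a server.

```python
from urllib.parse import urlencode


def luke_url(solr_base):
    """Build a Luke request URL that skips per-field top-term collection."""
    return solr_base.rstrip("/") + "/admin/luke?" + urlencode({"numTerms": 0, "wt": "json"})


def num_docs(luke_response):
    """Pull the document count out of a parsed Luke JSON response."""
    return luke_response["index"]["numDocs"]


url = luke_url("http://localhost:8983/solr")

# Hand-written stand-in for a parsed response; the numbers are made up.
sample_response = {"index": {"numDocs": 1234, "maxDoc": 1300}}
count = num_docs(sample_response)
```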

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-03 Thread Uri Boness
ilable for solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, S

Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-03 Thread Uri Boness
The development on this patch is quite active. It works well for a single solr instance, but distributed search (i.e. shards) is not yet supported. Using this patch you can group search results based on a specific field. There are two flavors of field collapsing - adjacent and non-adjacent, the for

Re: Drill down into hierarchical facet : how to?

2009-09-01 Thread Uri Boness
Hi, You know the level you're currently in: America/USA. You have the values for the location facet in the form: America/USA/NYC/Chelsea (3), America/USA/NYC/East Village (2), America/USA/San Francisco/Haight-Ashbury (5), America/USA/Los A
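Since the message is cut off here, the following is only one possible way to use such path-style facet values on the client: given the current level, roll the returned counts up to the next path segment. The function name and the extra "Europe/England/London" entry are invented for illustration.

```python
def next_level_counts(facet_counts, current_path, sep="/"):
    """Sum facet counts per immediate child of `current_path`.

    `facet_counts` maps full path values to their counts, as returned
    for a facet field that stores the whole location path.
    """
    prefix = current_path.rstrip(sep) + sep
    totals = {}
    for value, count in facet_counts.items():
        if not value.startswith(prefix):
            continue  # value belongs to another branch of the hierarchy
        child = value[len(prefix):].split(sep, 1)[0]
        if child:
            totals[child] = totals.get(child, 0) + count
    return totals


counts = {
    "America/USA/NYC/Chelsea": 3,
    "America/USA/NYC/East Village": 2,
    "America/USA/San Francisco/Haight-Ashbury": 5,
    "Europe/England/London": 7,  # invented entry outside the current level
}
children = next_level_counts(counts, "America/USA")
```

Each key of the result can then be rendered as a drill-down link one level below the current path.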

Re: Hierarchical schema design

2009-08-31 Thread Uri Boness
Hi, The search index is flat. There are no hierarchies in there. Now, I'm not sure what you're referring to with "this type of objects". But if you refer to having different types of documents in one index (and schema) that's certainly possible. You can define all the fields that you expect i

Re: Can Apache Solr have more than one schema?

2009-08-27 Thread Uri Boness
On Thu, Aug 27, 2009 at 5:43 PM, Uri Boness wrote: Not in the same core. You can define multiple cores where each core is a separate solr instance except they all run within one container. each core has its own index, schema and configuration. If you want to compare it to databases, then I

Re: Can Apache Solr have more than one schema?

2009-08-27 Thread Uri Boness
Not in the same core. You can define multiple cores where each core is a separate solr instance except they all run within one container. Each core has its own index, schema and configuration. If you want to compare it to databases, then I guess a core is to Solr Server what a database is to it

Re: Updating a solr record

2009-08-27 Thread Uri Boness
I guess if you have stored="true" then there is no problem. 2. If you don't use stored="true" you can still get access to term vectors, which you can probably reuse to create fake field with same term vector in an updated document... just an idea, may be I am wrong... Reconstructing the field

Re: Solr project statisitics

2009-08-27 Thread Uri Boness
00 AM, Uri Boness wrote: Hi, Where can I find general statistics about the Solr project? The only thing I found is statistics about the Lucene project at: http://people.apache.org/~vgritsenko/stats/projects/lucene.html#Downloads-N1008F Now the question is whether these numbers include all luc

Announcing Dutch Lucene User Group

2009-08-27 Thread Uri Boness
Hi, We started a new Lucene user group in The Netherlands. In the last couple of years we've noticed an increasing demand and interest in Lucene and Solr. We thought it's about time to have a centralized place where people can have open discussions, trainings, and periodic meet-ups to share kno

Solr project statisitics

2009-08-27 Thread Uri Boness
Hi, Where can I find general statistics about the Solr project? The only thing I found is statistics about the Lucene project at: http://people.apache.org/~vgritsenko/stats/projects/lucene.html#Downloads-N1008F Now the question is whether these numbers include all lucene's sub-projects (includ

Re: solr nutch url indexing

2009-08-26 Thread Uri Boness
understand the schema better, you can read http://wiki.apache.org/solr/SchemaXml Uri last...@gmail.com wrote: Uri Boness wrote: Well... yes, it's a tool the Nutch ships with. It also ships with an example Solr schema which you can use. hi, is there any documentation to understand what

Re: Responses getting truncated

2009-08-25 Thread Uri Boness
can dig deeper? Thanks -Rupert On Mon, Aug 24, 2009 at 4:31 PM, Uri Boness wrote: It can very well be an issue with the data itself. For example, if the data contains un-escaped characters which invalidate the response. I don't know much a

Re: solr nutch url indexing

2009-08-25 Thread Uri Boness
Well... yes, it's a tool that Nutch ships with. It also ships with an example Solr schema which you can use. Fuad Efendi wrote: Thanks for the link, so, SolrIndex is NOT plugin, it is an application... I use similar approach... -Original Message- From: Uri Boness Hi, Nutch

Re: solr nutch url indexing

2009-08-25 Thread Uri Boness
It seems to me that this configuration actually does what you want - queries on "title" mostly. The default search field doesn't influence a dismax query. I suggest including the debugQuery=true parameter, it will help you figure out how the matching is performed. You can read more

Re: Responses getting truncated

2009-08-24 Thread Uri Boness
It can very well be an issue with the data itself. For example, if the data contains un-escaped characters which invalidate the response. I don't know much about ruby, but what do you get with wt=json? Rupert Fiasco wrote: I am seeing our responses getting truncated if and only if I search on

Re: multi-language search

2009-08-24 Thread Uri Boness
I can think of ways to tackle your problem: Option 1: each document will have a field indicating its language. Then, when searching, you can simply filter the query on the language you're searching on. Advantages: everything is in one index, so if in the future you will need to do a cross lang

Re: solr nutch url indexing

2009-08-24 Thread Uri Boness
Hi, Nutch comes with support for Solr out of the box. I suggest you follow the steps as described here: http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/ Cheers, Uri Fuad Efendi wrote: Is SolrIndex plugin for Nutch? Thanks! -Original Message- From: Uri Boness

Re: solr nutch url indexing

2009-08-24 Thread Uri Boness
How did you configure nutch? Make sure you have the "parse-html" and "index-basic" configured. The HtmlParser should by default extract the page title and add it to the parsed data, and the BasicIndexingFilter by default adds this title to the NutchDocument and stores it in the "title" field. All

Re: Common Solr Question

2009-08-20 Thread Uri Boness
Hi, 1. That change you made should work. Just remember that request parameters (query string parameters) override the configured defaults. 2. That is correct. 3. Not quite sure what you mean by that. 4. I guess you're asking whether your statement is correct... it is. I think you should have a look

Re: Facet filtering

2009-08-20 Thread Uri Boness
Another solution is to use hierarchical values. So for example, instead of having a "Barack Obama" value you'll have "person/Barack Obama". To filter on a person you can just use wildcards (e.g. "person/*"). Asif Rahman wrote: Is there any way to assign metadata to terms in a field and then filt
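The idea of encoding the type into the value can be illustrated outside Solr with a few lines of Python. This is only a sketch of the naming convention: the "place/Amsterdam" value and the helper name are invented, and in Solr the same wildcard pattern would go into the query or filter query against the field holding these values.

```python
from fnmatch import fnmatchcase


def values_of_type(values, type_name, sep="/"):
    """Return the values whose type prefix matches, e.g. person/*."""
    pattern = type_name + sep + "*"
    return [value for value in values if fnmatchcase(value, pattern)]


# Terms stored as "<type>/<term>"; the second entry is a made-up example.
terms = ["person/Barack Obama", "place/Amsterdam"]
people = values_of_type(terms, "person")
```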

Re: Buzz measurement - Aggregate functions

2008-10-10 Thread Uri Boness
You can try using the field collapse patch (currently in JIRA). You'll probably need to manually extract the patch code and apply it yourself, as its latest update only applies to an earlier version of solr (1.3-dev). http://issues.apache.org/jira/browse/SOLR-236 Cheers, Uri Marcus Herou wrote

Re: synonym token types and ranking

2008-06-12 Thread Uri Boness
ng of original token vs. synonym token(s) also makes sense. > Is this something you can provide a patch for? > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message > > From: Uri Boness <[EMAIL PROTECTED]> >

synonym token types and ranking

2008-06-11 Thread Uri Boness
Hi, I've noticed that currently the SynonymFilter replaces the original token with the configured tokens list (which includes the original matched token) and each one of these tokens is of type "word". Wouldn't it make more sense to only mark the original token as type "word" and the othe

Re: Getting maximum and minimum values of a field

2008-05-30 Thread Uri Boness
I guess, to generalize the idea, is to have some support for aggregation functions. average anyone ;-) ? It would also be very useful to be able to define the field that is being aggregated. For example, in a flight reservation web site we developed we needed to show facets on different flight

Re: Field Grouping

2008-05-15 Thread Uri Boness
Hi, I'm actually quite interested in this feature. What is the ranking strategy for the group? Is it based on the highest ranking document within the group? Is it configurable? cheers, Uri oleg_gnatovskiy wrote: Yes, that is the patch I am trying to get to work. It doesn't have a feature f