Re: Nested table support ability

2010-06-23 Thread Govind Kanshi
Amit - unless you test it, it won't be apparent. The key piece, as Otis
mentioned, is to "flatten everything". This requires effort on your side to
actually create documents in a manner suitable for your searches: the
relationships need to be "merged" into the document. To avoid storing text
representations, you may want to store just the "identifier" and use the front
end to translate between the human-readable text and the stored identifier.
Taking your case further: rather than storing ADMIN, store just a
representation, maybe a smallint, alongside the customer information.
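For illustration, a flattened document along those lines might look like the
sketch below (the field names and the numeric role code are made-up examples,
not taken from any real schema):

<add>
  <doc>
    <field name="customer_id">1000017</field>
    <!-- role stored as a small numeric code; the front end maps 3 back to "ADMIN" -->
    <field name="role_code">3</field>
    <!-- rows from child tables flattened into multiValued fields on the customer doc -->
    <field name="order_id">A-1</field>
    <field name="order_id">A-2</field>
  </doc>
</add>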

On Wed, Jun 23, 2010 at 11:30 AM, amit_ak  wrote:

>
> Hi Otis, Thanks for the update.
>
> My parametric search has to span across the customer table and 30 child tables.
> We have close to 1 million customers. Do you think Lucene/Solr is the right
> solution for such requirements, or would a database search be more optimal?
>
> Regards,
> Amit
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Nested-table-support-ability-tp905253p916087.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Field Collapsing SOLR-236

2010-06-23 Thread Rakhi Khatwani
Hi,
   Patching did work, but when I build the trunk, I get the following
exception:

[SolrTrunk]# ant compile
Buildfile: /testWorkspace/SolrTrunk/build.xml

init-forrest-entities:
  [mkdir] Created dir: /testWorkspace/SolrTrunk/build
  [mkdir] Created dir: /testWorkspace/SolrTrunk/build/web

compile-lucene:

BUILD FAILED
/testWorkspace/SolrTrunk/common-build.xml:207:
/testWorkspace/modules/analysis/common does not exist.

Regards,
Raakhi

On Wed, Jun 23, 2010 at 2:39 AM, Martijn v Groningen <
martijn.is.h...@gmail.com> wrote:

> What exactly did not work? Patching, compiling or running it?
>
> On 22 June 2010 16:06, Rakhi Khatwani  wrote:
> > Hi,
> >  I tried checking out the latest code (rev 956715) the patch did not
> > work on it.
> > Infact i even tried hunting for the revision mentioned earlier in this
> > thread (i.e. rev 955615) but cannot find it in the repository. (it has
> > revision 955569 followed by revision 955785).
> >
> > Any pointers??
> > Regards
> > Raakhi
> >
> > On Tue, Jun 22, 2010 at 2:03 AM, Martijn v Groningen <
> > martijn.is.h...@gmail.com> wrote:
> >
> >> Oh in that case is the code stable enough to use it for production?
> >> -  Well this feature is a patch and I think that says it all.
> >> Although bugs are fixed it is definitely an experimental feature
> >> and people should keep that in mind when using one of the patches.
> >> Does it support features which solr 1.4 normally supports?
> >>- As far as I know yes.
> >>
> >> am using facets as a workaround but then i am not able to sort on any
> >> other field. is there any workaround to support this feature??
> >>- Maybe http://wiki.apache.org/solr/Deduplication prevents you from
> >> adding duplicates in your index, but then you miss the collapse counts
> >> and other computed values
> >>
> >> On 21 June 2010 09:04, Rakhi Khatwani  wrote:
> >> > Hi,
> >> >Oh in that case is the code stable enough to use it for production?
> >> > Does it support features which solr 1.4 normally supports?
> >> >
> >> > I am using facets as a workaround but then i am not able to sort on
> any
> >> > other field. is there any workaround to support this feature??
> >> >
> >> > Regards,
> >> > Raakhi
> >> >
> >> > On Fri, Jun 18, 2010 at 6:14 PM, Martijn v Groningen <
> >> > martijn.is.h...@gmail.com> wrote:
> >> >
> >> >> Hi Rakhi,
> >> >>
> >> >> The patch is not compatible with 1.4. If you want to work with the
> >> >> trunk. I'll need to get the src from
> >> >> https://svn.apache.org/repos/asf/lucene/dev/trunk/
> >> >>
> >> >> Martijn
> >> >>
> >> >> On 18 June 2010 13:46, Rakhi Khatwani  wrote:
> >> >> > Hi Moazzam,
> >> >> >
> >> >> >  Where did u get the src code from??
> >> >> >
> >> >> > I am downloading it from
> >> >> > https://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.4
> >> >> >
> >> >> > and the latest revision in this location is 955469.
> >> >> >
> >> >> > so applying the latest patch(dated 17th june 2010) on it still
> >> generates
> >> >> > errors.
> >> >> >
> >> >> > Any Pointers?
> >> >> >
> >> >> > Regards,
> >> >> > Raakhi
> >> >> >
> >> >> >
> >> >> > On Fri, Jun 18, 2010 at 1:24 AM, Moazzam Khan 
> >> >> wrote:
> >> >> >
> >> >> >> I knew it wasn't me! :)
> >> >> >>
> >> >> >> I found the patch just before I read this and applied it to the
> trunk
> >> >> >> and it works!
> >> >> >>
> >> >> >> Thanks Mark and martijn for all your help!
> >> >> >>
> >> >> >> - Moazzam
> >> >> >>
> >> >> >> On Thu, Jun 17, 2010 at 2:16 PM, Martijn v Groningen
> >> >> >>  wrote:
> >> >> >> > I've added a new patch to the issue, so building the trunk (rev
> >> >> >> > 955615) with the latest patch should not be a problem. Due to
> >> recent
> >> >> >> > changes in the Lucene trunk the patch was not compatible.
> >> >> >> >
> >> >> >> > On 17 June 2010 20:20, Erik Hatcher 
> >> wrote:
> >> >> >> >>
> >> >> >> >> On Jun 16, 2010, at 7:31 PM, Mark Diggory wrote:
> >> >> >> >>>
> >> >> >> >>> p.s. I'd be glad to contribute our Maven build re-organization
> >> back
> >> >> to
> >> >> >> the
> >> >> >> >>> community to get Solr properly Mavenized so that it can be
> >> >> distributed
> >> >> >> and
> >> >> >> >>> released more often.  For us the benefit of this structure is
> >> that
> >> >> we
> >> >> >> will
> >> >> >> >>> be able to overlay addons such as RequestHandlers and other
> third
> >> >> party
> >> >> >> >>> support without having to rebuild Solr from scratch.
> >> >> >> >>
> >> >> >> >> But you don't have to rebuild Solr from scratch to add a new
> >> request
> >> >> >> handler
> >> >> >> >> or other plugins - simply compile your custom stuff into a JAR
> and
> >> >> put
> >> >> >> it in
> >> >> >> >> /lib (or point to it with  in solrconfig.xml).
> >> >> >> >>
> >> >> >> >>>  Ideally, a Maven Archetype could be created that would allow
> one
> >> >> >> rapidly
> >> >> >> >>> produce a Solr webapp and fire it up in Jetty in mere seconds.
> >> >> >> >>
> >> >> >> >> How's that any diff

Re: Field Collapsing SOLR-236

2010-06-23 Thread Rakhi Khatwani
Oops, this is probably because I didn't check out the modules directory from the trunk.
Doing that right now :)

Regards
Raakhi
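(For reference, the compile-lucene failure above is what you see when only the
solr/ directory has been checked out: after the merge, common-build.xml expects
the sibling lucene/ and modules/ directories, so checking out the whole
https://svn.apache.org/repos/asf/lucene/dev/trunk/ tree gives the layout the
build is looking for.)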

On Wed, Jun 23, 2010 at 1:12 PM, Rakhi Khatwani  wrote:

> Hi,
>Patching did work. but when i build the trunk, i get the following
> exception:
>
> [SolrTrunk]# ant compile
> Buildfile: /testWorkspace/SolrTrunk/build.xml
>
> init-forrest-entities:
>   [mkdir] Created dir: /testWorkspace/SolrTrunk/build
>   [mkdir] Created dir: /testWorkspace/SolrTrunk/build/web
>
> compile-lucene:
>
> BUILD FAILED
> /testWorkspace/SolrTrunk/common-build.xml:207:
> /testWorkspace/modules/analysis/common does not exist.
>
> Regards,
> Raakhi
>
> On Wed, Jun 23, 2010 at 2:39 AM, Martijn v Groningen <
> martijn.is.h...@gmail.com> wrote:
>
>> What exactly did not work? Patching, compiling or running it?
>>
>> On 22 June 2010 16:06, Rakhi Khatwani  wrote:
>> > Hi,
>> >  I tried checking out the latest code (rev 956715) the patch did not
>> > work on it.
>> > Infact i even tried hunting for the revision mentioned earlier in this
>> > thread (i.e. rev 955615) but cannot find it in the repository. (it has
>> > revision 955569 followed by revision 955785).
>> >
>> > Any pointers??
>> > Regards
>> > Raakhi
>> >
>> > On Tue, Jun 22, 2010 at 2:03 AM, Martijn v Groningen <
>> > martijn.is.h...@gmail.com> wrote:
>> >
>> >> Oh in that case is the code stable enough to use it for production?
>> >> -  Well this feature is a patch and I think that says it all.
>> >> Although bugs are fixed it is definitely an experimental feature
>> >> and people should keep that in mind when using one of the patches.
>> >> Does it support features which solr 1.4 normally supports?
>> >>- As far as I know yes.
>> >>
>> >> am using facets as a workaround but then i am not able to sort on any
>> >> other field. is there any workaround to support this feature??
>> >>- Maybe http://wiki.apache.org/solr/Deduplication prevents you from
>> >> adding duplicates in your index, but then you miss the collapse counts
>> >> and other computed values
>> >>
>> >> On 21 June 2010 09:04, Rakhi Khatwani  wrote:
>> >> > Hi,
>> >> >Oh in that case is the code stable enough to use it for
>> production?
>> >> > Does it support features which solr 1.4 normally supports?
>> >> >
>> >> > I am using facets as a workaround but then i am not able to sort on
>> any
>> >> > other field. is there any workaround to support this feature??
>> >> >
>> >> > Regards,
>> >> > Raakhi
>> >> >
>> >> > On Fri, Jun 18, 2010 at 6:14 PM, Martijn v Groningen <
>> >> > martijn.is.h...@gmail.com> wrote:
>> >> >
>> >> >> Hi Rakhi,
>> >> >>
>> >> >> The patch is not compatible with 1.4. If you want to work with the
>> >> >> trunk. I'll need to get the src from
>> >> >> https://svn.apache.org/repos/asf/lucene/dev/trunk/
>> >> >>
>> >> >> Martijn
>> >> >>
>> >> >> On 18 June 2010 13:46, Rakhi Khatwani  wrote:
>> >> >> > Hi Moazzam,
>> >> >> >
>> >> >> >  Where did u get the src code from??
>> >> >> >
>> >> >> > I am downloading it from
>> >> >> > https://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.4
>> >> >> >
>> >> >> > and the latest revision in this location is 955469.
>> >> >> >
>> >> >> > so applying the latest patch(dated 17th june 2010) on it still
>> >> generates
>> >> >> > errors.
>> >> >> >
>> >> >> > Any Pointers?
>> >> >> >
>> >> >> > Regards,
>> >> >> > Raakhi
>> >> >> >
>> >> >> >
>> >> >> > On Fri, Jun 18, 2010 at 1:24 AM, Moazzam Khan > >
>> >> >> wrote:
>> >> >> >
>> >> >> >> I knew it wasn't me! :)
>> >> >> >>
>> >> >> >> I found the patch just before I read this and applied it to the
>> trunk
>> >> >> >> and it works!
>> >> >> >>
>> >> >> >> Thanks Mark and martijn for all your help!
>> >> >> >>
>> >> >> >> - Moazzam
>> >> >> >>
>> >> >> >> On Thu, Jun 17, 2010 at 2:16 PM, Martijn v Groningen
>> >> >> >>  wrote:
>> >> >> >> > I've added a new patch to the issue, so building the trunk (rev
>> >> >> >> > 955615) with the latest patch should not be a problem. Due to
>> >> recent
>> >> >> >> > changes in the Lucene trunk the patch was not compatible.
>> >> >> >> >
>> >> >> >> > On 17 June 2010 20:20, Erik Hatcher 
>> >> wrote:
>> >> >> >> >>
>> >> >> >> >> On Jun 16, 2010, at 7:31 PM, Mark Diggory wrote:
>> >> >> >> >>>
>> >> >> >> >>> p.s. I'd be glad to contribute our Maven build
>> re-organization
>> >> back
>> >> >> to
>> >> >> >> the
>> >> >> >> >>> community to get Solr properly Mavenized so that it can be
>> >> >> distributed
>> >> >> >> and
>> >> >> >> >>> released more often.  For us the benefit of this structure is
>> >> that
>> >> >> we
>> >> >> >> will
>> >> >> >> >>> be able to overlay addons such as RequestHandlers and other
>> third
>> >> >> party
>> >> >> >> >>> support without having to rebuild Solr from scratch.
>> >> >> >> >>
>> >> >> >> >> But you don't have to rebuild Solr from scratch to add a new
>> >> request
>> >> >> >> handler
>> >> >> >> >> or other plugins - simply compile your custom stuff into a JAR

Re: Field missing when use distributed search + dismax

2010-06-23 Thread Scott Zhang
Hi. All.

I found out more about the missing fields.
I tried the default distributed search example, which configures 2 instances,
one on 8983 and another on 7574.
When I search with the standard query handler, the result fields are all
right.
When I search with the default dismax, some fields disappear. Not sure
why.

Can anyone test this and confirm the reason?

Thanks.
Regards.


On Wed, Jun 23, 2010 at 2:50 PM, Scott Zhang  wrote:

> Hi. Lance.
>
> Thanks for replying.
>
> Yes. I especially checked the schema.xml and did another simple test.
> The broker is running on localhost:7499/solr.  A solr instance is running
> on localhost:7498/solr. For this test, I only use these 2 instances. 7499's
> index is empty. 7498 has 12 documents in index. I copied the schema.xml from
> 7498 to 7499 before test.
> 1. http://localhost:7498/solr/select
> I get:
> .
> <result name="response" numFound="12" start="0">
> -
> 
> gppost_6179
> gppost
> 
> .
>
> 2. http://localhost:7499/solr/select
> I get:
> 
>
> 3. http://localhost:7499/solr/select?shards=localhost:7498/solr
> I get:
> 
> -
> 
> gppost_6179
> 
> -
> 
> gppost_6282
> 
>
> So strange!
>
> I then checked with standard searchhandler.
> 1. http://localhost:7499/solr/select?shards=localhost:7498/solr&q=marship
> 
> -
> 
> member_marship11
> member
> 2010-01-21T00:00:00Z
> 
> 
>
> And 2.
> http://localhost:7499/solr/select?shards=localhost:7498/solr&q=marship&qt=dismax
> <result name="response" numFound="1" start="0">
> -
> 
> member_marship11
> 
> 
>
> So strange!
>
>
> On Wed, Jun 23, 2010 at 11:12 AM, Lance Norskog  wrote:
>
>> Do all of the Solr instances, including the broker, use the same
>> schema.xml?
>>
>> On 6/22/10, Scott Zhang  wrote:
>> > Hi. All.
>> >I was using distributed search over 30 solr instance, the previous
>> one
>> > was using the standard query handler. And the result was returned
>> correctly.
>> > each result has 2 fields. "ID" and "type".
>> >Today I want to use search withk dismax, I tried search with each
>> > instance with dismax. It works correctly, return "ID" and "type" for
>> each
>> > result. The strange thing is when I
>> > use distributed search, the result only have "ID". The field "type"
>> > disappeared. I need that "type" to know what the "ID" refer to. Why solr
>> > "eat" my "type"?
>> >
>> >
>> > Thanks.
>> > Regards.
>> > Scott
>> >
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
>
>


Re: Searching across multiple repeating fields

2010-06-23 Thread Mark Allan

Cheers, Geert-Jan, that's very helpful.

We won't always be searching with dates and we wouldn't want  
duplicates to show up in the results, so your second suggestion looks  
like a good workaround if I can't solve the actual problem.  I didn't  
know about FieldCollapsing, so I'll definitely keep it in mind.


Thanks
Mark

On 22 Jun 2010, at 3:44 pm, Geert-Jan Brits wrote:


Perhaps my answer is useless, because I don't have an answer to your direct
question, but: you *might* want to consider whether your concept of a
solr-document is on the correct granular level, i.e.:

Your problem as posted could be tackled (afaik) by defining a document as
being a 'sub-event' with only 1 daterange. So each event-doc you have now is
replaced by several sub-event docs in this proposed situation.

Additionally, each sub-event doc gets an additional field 'parent-eventid'
which maps to something like an event-id (which you're probably already
using). So several sub-event docs can point to the same event-id.

Lastly, all sub-event docs belonging to a particular event carry all the
other fields that you may have stored in that particular event-doc.

Now you can query for events based on date ranges like you envisioned, but
instead of returning events you return sub-event docs. However, since all
data of the original event (except the multiple dateranges) is available in
the sub-event doc, this shouldn't really bother the client. If you need to
display all dates of an event (the only info missing from the returned
solr-doc) you could easily store it in an RDB and fetch it using the defined
parent-eventid.

The only caveat I see is that multiple sub-events with the same
'parent-eventid' might get returned for a particular query. This, however,
depends on the type of queries you envision, i.e.:
1) If you always issue queries with date filters, and *assuming* that
sub-events of a particular event don't temporally overlap, you will never
get multiple sub-events returned.
2) If 1) doesn't hold, and assuming you *do* mind multiple sub-events of
the same actual event, you could try to use Field Collapsing on
'parent-eventid' to only return the first sub-event per parent-eventid that
matches the rest of your query. (Note, however, that Field Collapsing is a
patch at the moment. http://wiki.apache.org/solr/FieldCollapsing)

Not sure if this helped you at all, but at the very least it was a nice
conceptual exercise ;-)

Cheers,
Geert-Jan
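To make that concrete, a quick sketch of what two such sub-event docs could
look like (the field names are purely illustrative, not from any actual
schema; the date ranges are taken from your example below):

<add>
  <doc>
    <field name="id">event42_sub1</field>
    <field name="parent_eventid">event42</field>
    <field name="daterange">19820402,19820614</field>
    <!-- plus a copy of every other field of the original event doc -->
  </doc>
  <doc>
    <field name="id">event42_sub2</field>
    <field name="parent_eventid">event42</field>
    <field name="daterange">1990,2000</field>
  </doc>
</add>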


2010/6/22 Mark Allan 


Hi all,

Firstly, I apologise for the length of this email but I need to  
describe

properly what I'm doing before I get to the problem!

I'm working on a project just now which requires the ability to  
store and
search on temporal coverage data - ie. a field which specifies a  
date range

during which a certain event took place.

I hunted around for a few days and couldn't find anything which  
seemed to
fit, so I had a go at writing my own field type based on  
solr.PointType.

It's used as follows:
schema.xml
   <fieldType name="..." class="..." ... />
   <field name="daterange" type="..." indexed="true" stored="true" multiValued="true"/>
data.xml
   <doc>
   ...
   <field name="daterange">1940,1945</field>
   </doc>

Internally, this gets stored as:
   <str name="daterange">1940,1945</str>
   <str name="daterange_0">1940</str>
   <str name="daterange_1">1945</str>

In due course, I'll declare the subfields as a proper date type,  
but in the
meantime, this works absolutely fine.  I can search for an  
individual date
and Solr will check (queryDate > daterange_0 AND queryDate <  
daterange_1 )
and the correct documents are returned.  My code also allows the  
user to
input a date range in the query but I won't complicate matters with  
that

just now!

The problem arises when a document has more than one "daterange"  
field
(imagine a news broadcast which covers a variety of topics and  
hence time

periods).

A document with two daterange fields
   <doc>
   ...
   <field name="daterange">19820402,19820614</field>
   <field name="daterange">1990,2000</field>
   </doc>
gets stored internally as
   <arr name="daterange"><str>19820402,19820614</str><str>1990,2000</str></arr>
   <arr name="daterange_0"><str>19820402</str><str>1990</str></arr>
   <arr name="daterange_1"><str>19820614</str><str>2000</str></arr>


In this situation, searching for 1985 should yield zero results as  
it is
contained within neither daterange, however, the above document is  
returned
in the result set.  What Solr is doing is checking that the  
queryDate (1985)
is greater than *any* of the values in daterange_0 AND queryDate is  
less

than *any* of the values in daterange_1.

How can I get Solr to respect the positions of each item in the  
daterange_0
and _1 arrays?  Ideally I'd like the search to use the following  
logic, thus
preventing the above document from being returned in a search for  
1985:

  (queryDate > daterange_0[0] AND queryDate < daterange_1[0]) OR
(queryDate > daterange_0[1] AND queryDate < daterange_1[1])

Someone else had a very similar problem recently on the mailing  
list with a

multiValued PointType field but the thread went cold without a final
solution.

While I could filter the results when they get back to my application
layer, it seems like it's not really the right place to

Re: OOM on sorting on dynamic fields

2010-06-23 Thread Matteo Fiandesio
Hi all,
we moved Solr with the patched Lucene FieldCache into our production environment.
During tests we noticed random ConcurrentModificationExceptions when calling
the getCacheEntries method, due to this bug:

https://issues.apache.org/jira/browse/LUCENE-2273

We applied that patch as well, and added an abstract int
getCacheSize() method to the FieldCache abstract class, with its
implementation in the abstract Cache inner class of FieldCacheImpl, which
returns the cache size without instantiating a CacheEntry array.

Response times are slower when the cache is purged, but acceptable from the
user's point of view.
Regards,
Matteo


On 22 June 2010 22:41, Matteo Fiandesio  wrote:
> Fields over i'm sorting to are dynamic so one query sorts on
> erick_time_1,erick_timeA_1 and other sorts on erick_time_2 and so
> on.What we see in the heap are a lot of arrays,most of them,filled
> with 0s maybe due to the fact that this timestamps fields are not
> present in all the documents.
>
> By the way,
> I have a script that generates the OOM in 10 minutes on our solr
> instance and with the temporary patch it runned without any problems.
> The side effect is that when the cache is purged next query that
> regenerates the  cache is a little bit slower.
>
> I'm aware that the solution is unelegant and we are investigating to
> solve the problem in another way.
> Regards,
> Matteo
>
>
> On 22 June 2010 19:25, Erick Erickson  wrote:
>> Hmmm, I'm missing something here then. Sorting over 15 fields of type long
>> shouldn't use much memory, even if all the values are unique. When you say
>> "12-15 dynamic fields", are you talking about 12-15 fields per query out of
>> XXX total fields? And is XXX large? At a guess, how many different fields
>> do
>> you think you're sorting over cumulative by the time you get your OOM?
>> Note if you sort over the field "erick_time" in 10 different queries, I'm
>> only counting that as 1 field. I guess another way of asking this is
>> "how many dynamic fields are there total?".
>>
>> If this is really a sorting issue, you should be able to force this to
>> happen
>> almost immediately by firing off enough sort queries at the server. It'll
>> tell you a lot if you can't make this happen, even on a relatively small
>> test machine.
>>
>> Best
>> Erick
>>
>> On Tue, Jun 22, 2010 at 12:59 PM, Matteo Fiandesio <
>> matteo.fiande...@gmail.com> wrote:
>>
>>> Hi Erick,
>>> the index is quite small (1691145 docs) but sorting is massive and
>>> often on unique timestamp fields.
>>>
>>> OOM occur after a range of time between three and four hours.
>>> Depending as well if users browse a part of the application.
>>>
>>> We use solrj to make the queries so we did not use Readers objects
>>> directly.
>>>
>>> Without sorting we don't see the problem
>>> Regards,
>>> Matteo
>>>
>>> On 22 June 2010 17:01, Erick Erickson  wrote:
>>> > H.. A couple of details I'm wondering about. How many
>>> > documents are we talking about in your index? Do you get
>>> > OOMs when you start fresh or does it take a while?
>>> >
>>> > You've done some good investigations, so it seems like there
>>> > could well be something else going on here than just "the usual
>>> > suspects" of sorting
>>> >
>>> > I'm wondering if you aren't really closing readers somehow.
>>> > Are you updating your index frequently and re-opening readers often?
>>> > If so, how?
>>> >
>>> > I'm assuming that if you do NOT sort on all these fields, you don't have
>>> > the problem, is that true?
>>> >
>>> > Best
>>> > Erick
>>> >
>>> > On Fri, Jun 18, 2010 at 10:52 AM, Matteo Fiandesio <
>>> > matteo.fiande...@gmail.com> wrote:
>>> >
>>> >> Hello,
>>> >> we are experiencing OOM exceptions in our single core solr instance
>>> >> (on a (huge) amazon EC2 machine).
>>> >> We investigated a lot in the mailing list and through jmap/jhat dump
>>> >> analyzing and the problem resides in the lucene FieldCache that fills
>>> >> the heap and blows up the server.
>>> >>
>>> >> Our index is quite small but we have a lot of sort queries  on fields
>>> >> that are dynamic,of type long representing timestamps and are not
>>> >> present in all the documents.
>>> >> Those queries apply sorting on 12-15 of those fields.
>>> >>
>>> >> We are using solr 1.4 in production and the dump shows a lot of
>>> >> Integer/Character and Byte Array filled up with 0s.
>>> >> With solr's trunk code things does not change.
>>> >>
>>> >> In the mailing list we saw a lot of messages related to this issues:
>>> >> we tried truncating the dates to day precision,using missingSortLast =
>>> >> true,changing the field type from slong to long,setting autowarming to
>>> >> different values,disabling and enabling caches with different values
>>> >> but we did not manage to solve the problem.
>>> >>
>>> >> We were thinking to implement an LRUFieldCache field type to manage
>>> >> the FieldCache as an LRU and preventing but, before starting a new
>>> >> development, we want to be sure that we are not doing anything wrong
>>> 

Re: Field Collapsing SOLR-236

2010-06-23 Thread Rakhi Khatwani
Hi,
I checked out modules & lucene from the trunk.
Performed a build using the following commands
ant clean
ant compile
ant example

Which compiled successfully.


I then put my existing index (using schema.xml from solr1.4.0/conf/solr/) in
the multicore folder, configured solr.xml and started the server.

When I type in http://localhost:8983/solr

I get the following error:
org.apache.solr.common.SolrException: Plugin init failure for [schema.xml]
fieldType:analyzer without class or tokenizer & filter list
at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:168)
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:480)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:122)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:429)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:286)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:198)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:123)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at
org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:662)
at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
at
org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at
org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
at org.mortbay.jetty.Server.doStart(Server.java:224)
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.mortbay.start.Main.invokeMain(Main.java:194)
at org.mortbay.start.Main.start(Main.java:534)
at org.mortbay.start.Main.start(Main.java:441)
at org.mortbay.start.Main.main(Main.java:119)
Caused by: org.apache.solr.common.SolrException: analyzer without class or
tokenizer & filter list
at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:908)
at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:60)
at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:142)
... 32 more


Then I picked up an existing index (schema.xml from solr1.3/solr/conf),
put it in the multicore folder, configured solr.xml and restarted.

Collapsing worked fine.

Any pointers on which part of schema.xml (Solr 1.4) is causing this exception?

Regards,
Raakhi



On Wed, Jun 23, 2010 at 1:35 PM, Rakhi Khatwani  wrote:

>
> Oops this is probably i didn't checkout the modules file from the trunk.
> doing that right now :)
>
> Regards
> Raakhi
>
> On Wed, Jun 23, 2010 at 1:12 PM, Rakhi Khatwani wrote:
>
>> Hi,
>>Patching did work. but when i build the trunk, i get the following
>> exception:
>>
>> [SolrTrunk]# ant compile
>> Buildfile: /testWorkspace/SolrTrunk/build.xml
>>
>> init-forrest-entities:
>>   [mkdir] Created dir: /testWorkspace/SolrTrunk/build
>>   [mkdir] Created dir: /testWorkspace/SolrTrunk/build/web
>>
>> compile-lucene:
>>
>> BUILD FAILED
>> /testWorkspace/SolrTrunk/common-build.xml:207:
>> /testWorkspace/modules/analysis/common does not exist.
>>
>> Regards,
>> Raakhi
>>
>> On Wed, Jun 23, 2010 at 2:39 AM, Martijn v Groningen <
>> martijn.is.h...@gmail.com> wrote:
>>
>>> What exactly did not work? Patching, compiling or running it?
>>>
>>> On 22 June 2010 16:06, Rakhi Khatwani  wrote:
>>> > Hi,
>>> >  I tried checking out the latest code (rev 956715) the patch did
>>> not
>>> > work on it.
>>> > Infact i even tried hunting for the revision mentioned earlier in this
>>> > thread (i.e. rev 955615) but cannot find it in the repository. (it has
>>> > revision 955569 followed by revision 955785).
>>> >
>>> > Any pointers??
>>> > Regards
>>> > Raakhi
>>> >
>>> > On Tue, Jun 22, 2010 at 2:03

Re: Field Collapsing SOLR-236

2010-06-23 Thread Govind Kanshi
The error "fieldType: analyzer without class or tokenizer & filter list" seems to
point to the config - you may want to correct it.
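For reference, a minimal sketch of the shape Solr expects here (the type and
class names below are generic placeholders, not taken from your schema): every
<analyzer> inside a <fieldType> must either name an analyzer class directly or
declare a <tokenizer> plus an optional chain of <filter> elements, e.g.

<fieldType name="text_example" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

An <analyzer> element with neither a class attribute nor a tokenizer/filter
list is what triggers the exception above.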


On Wed, Jun 23, 2010 at 3:09 PM, Rakhi Khatwani  wrote:

> Hi,
>I checked out modules & lucene from the trunk.
> Performed a build using the following commands
> ant clean
> ant compile
> ant example
>
> Which compiled successfully.
>
>
> I then put my existing index(using schema.xml from solr1.4.0/conf/solr/) in
> the multicore folder, configured solr.xml and started the server
>
> When i type in http://localhost:8983/solr
>
> i get the following error:
> org.apache.solr.common.SolrException: Plugin init failure for [schema.xml]
> fieldType:analyzer without class or tokenizer & filter list
> at
>
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:168)
> at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:480)
> at org.apache.solr.schema.IndexSchema.(IndexSchema.java:122)
> at org.apache.solr.core.CoreContainer.create(CoreContainer.java:429)
> at org.apache.solr.core.CoreContainer.load(CoreContainer.java:286)
> at org.apache.solr.core.CoreContainer.load(CoreContainer.java:198)
> at
>
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:123)
> at
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
> at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
> at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> at
>
> org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:662)
> at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
> at
>
> org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
> at
> org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
> at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
> at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> at
>
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
> at
>
> org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
> at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> at
>
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
> at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> at
> org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
> at org.mortbay.jetty.Server.doStart(Server.java:224)
> at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.mortbay.start.Main.invokeMain(Main.java:194)
> at org.mortbay.start.Main.start(Main.java:534)
> at org.mortbay.start.Main.start(Main.java:441)
> at org.mortbay.start.Main.main(Main.java:119)
> Caused by: org.apache.solr.common.SolrException: analyzer without class or
> tokenizer & filter list
> at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:908)
> at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:60)
> at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
> at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
> at
>
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:142)
> ... 32 more
>
>
> Then i picked up an existing index (schema.xml from solr1.3/solr/conf) and
> put it in multicore folder, configured solr.xml and restarted my index
>
> Collapsing worked fine.
>
> Any pointers, which part of schema.xml (solr 1.4) is causing this
> exception?
>
> Regards,
> Raakhi
>
>
>
> On Wed, Jun 23, 2010 at 1:35 PM, Rakhi Khatwani 
> wrote:
>
> >
> > Oops this is probably i didn't checkout the modules file from the trunk.
> > doing that right now :)
> >
> > Regards
> > Raakhi
> >
> > On Wed, Jun 23, 2010 at 1:12 PM, Rakhi Khatwani  >wrote:
> >
> >> Hi,
> >>Patching did work. but when i build the trunk, i get the
> following
> >> exception:
> >>
> >> [SolrTrunk]# ant compile
> >> Buildfile: /testWorkspace/SolrTrunk/build.xml
> >>
> >> init-forrest-entities:
> >>   [mkdir] Created dir: /testWorkspace/SolrTrunk/build
> >>   [mkdir] Created dir: /testWorkspace/SolrTrunk/build/web
> >>
> >> compile-lucene:
> >>
> >> BUILD FAILED
> >> /testWorkspace/SolrTrunk/common-build.xml:207:
> >> /testWorkspace/modules/analysis/common does not exist.
> >>
> >> Regards,
> >> Raakhi
> >>
> >> On Wed, Jun 23, 2010 at 2:39 AM, Martijn v Groningen <
> >> martijn.is.h...@gmail.com> wrote:
> >>
> >>> What exactly did not work? Patching, compiling or running it?
> >>>
> >>> On 22 June 2010 16:06, 

Re: Mr Erick Re: Change the Solr searcher

2010-06-23 Thread sarfaraz masood
But how do I add this change to the running Solr server?

I mean to say, how do I make my changes visible in a running Solr? Do I need to
make a plugin, a patch, or something else?

-sarfaraz
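Following Erik's suggestion below, the usual wiring is roughly this (a sketch
only; the jar and class names are invented for illustration): compile your
custom component into a jar, drop it into the core's lib directory (or point to
it with a <lib> directive), and register it in solrconfig.xml under the name of
the component you want to replace, e.g.

<config>
  <lib dir="./lib"/>  <!-- contains my-custom-search.jar -->
  <!-- replaces the default "query" component with a custom implementation -->
  <searchComponent name="query" class="com.example.MyQueryComponent"/>
  ...
</config>

Restarting the Solr webapp then picks up the new jar; no rebuild of Solr itself
is needed.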

--- On Tue, 22/6/10, Erik Hatcher  wrote:

From: Erik Hatcher 
Subject: Re: Change the Solr searcher
To: solr-user@lucene.apache.org
Date: Tuesday, 22 June, 2010, 11:11 PM

Sounds like what you want is to override Solr's "query" component.  Have a look 
at the built-in one and go from there.

    Erik

On Jun 22, 2010, at 1:38 PM, sarfaraz masood wrote:

> I am a novice in solr / lucene. but i have gone
> thru the documentations of both.I have even implemented programs in
> lucene for searching etc.
> 
> My problem is to apply a new search technique other than the one used by solr.
> 
> Now as i know that lucene has its own searcher
> which is used by solr as well.
> 
> *Ques.. Cant i replace this searcher part in
> SOLR by a java program that returns documents as per my algorithm ?
> 
> i.e I only want to change the searcher part of solr. I have
> studied abt customizing the scoring which is absolutely not my aim.My
> aim is replace the searcher.
> 
> Plz help me in this regards. I will be highly gratefull to you for your 
> assistance in this work of mine.
> 
> If any part of this mail was not clear to you then plz lemme know, i will 
> expain that you.
> 
> Regards
> 
> -sarfaraz
> 





TermsComponent - AutoComplete - Multiple Term Suggestions & Inclusive Search?

2010-06-23 Thread Saïd Radhouani
Hi,

I'm using the Terms Component to set up the autocomplete feature based on a 
String field. Here are the params I'm using:

terms=true&terms.fl=type&terms.lower=cat&terms.prefix=cat&terms.lower.incl=false

With the above params, I've been able to get suggestions for terms that start 
with the specified prefix. I'm wondering whether it's possible to:

- have inclusive search, i.e., by typing "cat," we get "category," 
"subcategory," etc.?

- start suggestion from any word in the field. i.e., by typing "cat," we get 
"The best category..."?

Thanks!

 -Saïd




Re: Field Collapsing SOLR-236

2010-06-23 Thread Rakhi Khatwani
Hi,
   But there are almost no settings in my config.
Here's a snapshot of what I have in my solrconfig.xml:














*:*






Am I going wrong anywhere?
Regards,
Raakhi

On Wed, Jun 23, 2010 at 3:28 PM, Govind Kanshi wrote:

> fieldType:analyzer without class or tokenizer & filter list seems to point
> to the config - you may want to correct.
>
>
> On Wed, Jun 23, 2010 at 3:09 PM, Rakhi Khatwani 
> wrote:
>
> > Hi,
> >I checked out modules & lucene from the trunk.
> > Performed a build using the following commands
> > ant clean
> > ant compile
> > ant example
> >
> > Which compiled successfully.
> >
> >
> > I then put my existing index(using schema.xml from solr1.4.0/conf/solr/)
> in
> > the multicore folder, configured solr.xml and started the server
> >
> > When i type in http://localhost:8983/solr
> >
> > i get the following error:
> > org.apache.solr.common.SolrException: Plugin init failure for
> [schema.xml]
> > fieldType:analyzer without class or tokenizer & filter list
> > at
> >
> >
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:168)
> > at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:480)
> > at org.apache.solr.schema.IndexSchema.(IndexSchema.java:122)
> > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:429)
> > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:286)
> > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:198)
> > at
> >
> >
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:123)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
> > at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
> > at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> > at
> >
> >
> org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:662)
> > at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
> > at
> >
> >
> org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
> > at
> > org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
> > at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
> > at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> > at
> >
> >
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
> > at
> >
> >
> org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
> > at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> > at
> >
> >
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
> > at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> > at
> > org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
> > at org.mortbay.jetty.Server.doStart(Server.java:224)
> > at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> > at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at
> >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.mortbay.start.Main.invokeMain(Main.java:194)
> > at org.mortbay.start.Main.start(Main.java:534)
> > at org.mortbay.start.Main.start(Main.java:441)
> > at org.mortbay.start.Main.main(Main.java:119)
> > Caused by: org.apache.solr.common.SolrException: analyzer without class
> or
> > tokenizer & filter list
> > at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:908)
> > at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:60)
> > at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
> > at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
> > at
> >
> >
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:142)
> > ... 32 more
> >
> >
> > Then i picked up an existing index (schema.xml from solr1.3/solr/conf)
> and
> > put it in multicore folder, configured solr.xml and restarted my index
> >
> > Collapsing worked fine.
> >
> > Any pointers, which part of schema.xml (solr 1.4) is causing this
> > exception?
> >
> > Regards,
> > Raakhi
> >
> >
> >
> > On Wed, Jun 23, 2010 at 1:35 PM, Rakhi Khatwani 
> > wrote:
> >
> > >
> > > Oops this is probably i didn't checkout the modules file from the
> trunk.
> > > doing that right now :)
> > >
> > > Regards
> > > Raakhi
> > >
> > > On Wed, Jun 23, 2010 at 1:12 PM, Rakhi Khatwani  > >wrote:
> > >
> > >> Hi,
> > >>Patching did work. but when i build the trunk, i get the
> > following
> > >> exception:
> > >>
> > >> [SolrTrunk]# ant compile
> > >> Buildfile: /testWorkspace/SolrTrunk/build.xml
> > >>
> > >> init-forrest-entities:
> > >>  

Re: collapse exception

2010-06-23 Thread Martijn v Groningen
That is a good idea. I'm trying to achieve something similar. I'm
already busy creating a Lucene collector that groups the result
set and will eventually have the same functionality as in SOLR-236.
When that is solid, the idea is to integrate it into Solr. I've
attached a patch to LUCENE-1421. It is a work in progress.

On 23 June 2010 03:00, Erik Hatcher  wrote:
> Martijn - Maybe the patches to SolrIndexSearcher could be extracted into a
> new issue so that we can put in the infrastructure at least.  That way this
> could truly be a drop-in plugin without it actually being in core.  I
> haven't looked at the specifics, but I imagine we could get the core stuff
> adjusted to suit this plugin.
>
>        Erik
>
> On Jun 22, 2010, at 5:24 PM, Martijn v Groningen wrote:
>
>> I checked your stacktrace and I can't remember putting
>> SolrIndexSearcher.getDocListAndSet(...) in the doQuery(...) method. I
>> guess the patch was modified before it was applied.
>> I think the error occurs when you do a field collapse search with a fq
>> parameter. That is the only reason I can think of why this exception
>> is thrown.
>>
>> When this component become a contrib? Using patch is so annoying
>> Patching is a bit of a hassle. This patch has some changes in the
>> SolrIndexSearcher which makes it difficult to make it a contrib or an
>> extension.
>>
>> On 22 June 2010 04:52, Li Li  wrote:
>>>
>>> I don't know because it's patched by someone else but I can't get his
>>> help. When this component become a contrib? Using patch is so annoying
>>>
>>> 2010/6/22 Martijn v Groningen :

 What version of Solr and which patch are you using?

 On 21 June 2010 11:46, Li Li  wrote:
>
> it says  "Either filter or filterList may be set in the QueryCommand,
> but not both." I am newbie of solr and have no idea of the exception.
> What's wrong with it? thank you.
>
> java.lang.IllegalArgumentException: Either filter or filterList may be
> set in the QueryCommand, but not both.
>       at
> org.apache.solr.search.SolrIndexSearcher$QueryCommand.setFilter(SolrIndexSearcher.java:1711)
>       at
> org.apache.solr.search.SolrIndexSearcher.getDocListAndSet(SolrIndexSearcher.java:1286)
>       at
> org.apache.solr.search.fieldcollapse.NonAdjacentDocumentCollapser.doQuery(NonAdjacentDocumentCollapser.java:205)
>       at
> org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.executeCollapse(AbstractDocumentCollapser.java:246)
>       at
> org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:173)
>       at
> org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:174)
>       at
> org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
>       at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203)
>       at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>       at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>       at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>       at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>       at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>       at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>       at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>       at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>       at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>       at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>       at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>       at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>       at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>       at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>       at
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>       at java.lang.Thread.run(Thread.java:619)
>



 --
 Met vriendelijke groet,

 Martijn van Groningen

>>>
>>
>>
>>
>> --
>> Met vriendelijke groet,
>>
>> Martijn van Groningen
>
>



-- 
Met vriendelijke groet,

Martijn van Groningen


Solrj throws RuntimeException - Invalid version or the data is not in javabin format

2010-06-23 Thread Villemos, Gert
I have a problem injecting data using SolrJ from a Windows client to an
Ubuntu server (see exception below). The same configuration works when
injecting from a Windows client to a Windows server.

 

I inject using a standard
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer instance,
created with the URL of the SOLR server. I do not change the injection
format and understand that the default injection format is XML, i.e. not
binary. The requesthandler configured in the solrconfig.xml for updates
is a solr.XmlUpdateRequestHandler. I have no response handler explicitly
configured.

 

The stack trace seems to indicate that SolrJ tries to decode the
response as a binary stream. 

 

What am I doing wrong? How can I fix it? Do I need to configure a
response handler explicitly? Why does this work Windows-to-Windows and not
Windows-to-Ubuntu?

 

Thanks,

Gert.

 

 

 Exception 

 

Exception in thread "Thread-2" java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:466)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:64)
        at org.esa.huginn.solr.SolrContainer.add(SolrContainer.java:188)
        at org.esa.huginn.loader.LoaderContext.execute(LoaderContext.java:69)
        at esa.dops.game.core.schedulers.AbstractComponentScheduler.run(AbstractComponentScheduler.java:57)
        at esa.dops.game.core.schedulers.RelativeTimeComponentScheduler.initialise(RelativeTimeComponentScheduler.java:48)
        at esa.dops.game.core.component.AbstractComponentContext.initialize(AbstractComponentContext.java:87)
        at org.esa.huginn.loader.LoaderContext.initialize(LoaderContext.java:51)
        at esa.dops.game.core.component.AbstractComponentContext.run(AbstractComponentContext.java:69)

 



Please help Logica to respect the environment by not printing this email.



This e-mail and any attachment is for authorised use by the intended 
recipient(s) only. It may contain proprietary material, confidential 
information and/or be subject to legal privilege. It should not be copied, 
disclosed to, retained or used by, any other party. If you are not an intended 
recipient then please promptly delete this e-mail and any attachment and all 
copies and inform the sender. Thank you.



Import XML files different format?

2010-06-23 Thread scrapy
Hi,

I'm new to solr. It looks great.

I would like to add a XML document in the following format in solr:
















etc...




Is there a way to do this? If yes, how?

Or do I need to convert it with some scripts to this:



   Patrick Eagar
   Sports
etc...


Thanks for your help

Regards


Re: Import XML files different format?

2010-06-23 Thread Erik Hatcher

You can use DataImportHandler's XML/XPath capabilities to do this:

  


or you could, of course, convert your XML to Solr's XML format.

Another fine option, given what this data looks like, is the CSV format.

I'd imagine you have the original data in a relational database, though?

Erik
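For illustration, a data-config.xml built around the XPathEntityProcessor looks
roughly like the sketch below; the file path, forEach expression, and xpath
values are placeholders that depend on the actual structure of the source XML:

<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <entity name="records" processor="XPathEntityProcessor"
            url="/path/to/your-file.xml" forEach="/records/record">
      <field column="author" xpath="/records/record/author"/>
      <field column="category" xpath="/records/record/category"/>
    </entity>
  </document>
</dataConfig>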


On Jun 23, 2010, at 7:59 AM, scr...@asia.com wrote:


Hi,

I'm new to solr. It looks great.

I would like to add a XML document in the following format in solr:




   
   
   
   
   
   
   
   
   
   


etc...




Is there a way to do this? If yes how?

Or i need to convert it with some scripts to this:



  Patrick Eagar
  Sports
etc...


Thanks for your help

Regards




Re: about function query

2010-06-23 Thread Yonik Seeley
See 
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
for more info on how to do a multiplicative boost.

-Yonik
http://www.lucidimagination.com
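For example, with that approach the user query is wrapped in a boost query so
the function score is multiplied rather than added; something along the lines
of the FAQ entry (the field name "timestamp" and the constants are illustrative):

q={!boost b=recip(ms(NOW,timestamp),3.16e-11,1,1)}your query terms

Each document's relevancy score is then multiplied by a value that decays as
its timestamp gets older, instead of having a constant-scale value added to it.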

On Tue, Jun 22, 2010 at 11:13 PM, Li Li  wrote:
> I want to integrate a document's timestamp into the scoring of a search. I
> found an example in the book "Solr 1.4 Enterprise Search Server" about
> function queries. I want to boost a document which is newer, so it may
> be a function such as 1/(timestamp+1). But the function query is
> added to the final score, not multiplied, so I can't adjust the
> parameter well.
> e.g.
> search term is term1: top docs are doc1 with score 2.0 and doc2 with score 1.5.
> search term is term2: top docs are doc1 with score 20 and doc2 with score 15.
> It is hard to adjust the relative score of these 2 docs by adding a value;
> if it were multiplied, it would be easy: if doc1 is very old, we assign it a
> score of 1, and doc2 is new, we assign it a score of 2,
> so the total scores are 2.0*1 and 1.5*2, and doc2 ranks higher than doc1.
> But when using addition, 2.0 + weight*1 vs. 1.5 + weight*2, it's hard to get a
> proper weight.
> If we let weight be 1, it works well for term1,
> but with term2 it is 20 + 1*1 vs. 15 + 1*2, so time has little influence on the
> final result.
>


Re: Import XML files different format?

2010-06-23 Thread scrapy
Thanks Erik for your answer.

I'll try to use DIH via data-config.xml, as I might index other content with
a different XML structure in the future...

Will I need a different data-config for each XML structure? And then
manually change between them?



 

 


 

 

-Original Message-
From: Erik Hatcher 
To: solr-user@lucene.apache.org
Sent: Wed, Jun 23, 2010 2:19 pm
Subject: Re: Import XML files different format?


You can use DataImportHandler's XML/XPath capabilities to do this: 
 
  
 
 
or you could, of course, convert your XML to Solr's XML format. 
 
Another fine option for what this data looks like, CSV format. 
 
I'd imagine you have the orginal data in a relational database though? 
 
   Erik 
 
On Jun 23, 2010, at 7:59 AM, scr...@asia.com wrote: 
 
> Hi, 
> 
> I'm new to solr. It looks great. 
> 
> I would like to add a XML document in the following format in solr: 
> 
>  
>  
>  
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>  
> 
> etc... 
>  
> 
> 
> 
> Is there a way to do this? If yes how? 
> 
> Or i need to convert it with some scripts to this: 
> 
>  
>  
>   Patrick Eagar 
>   Sports 
> etc... 
> 
> 
> Thanks for your help 
> 
> Regards 
 

 


Re: TermsComponent - AutoComplete - Multiple Term Suggestions & Inclusive Search?

2010-06-23 Thread Chantal Ackermann
Hi Saïd,

I think your problem is the field's type: String. You have to use a
TextField and apply tokenizers that will find "subcategory" if you put
in "cat". (Not sure which filter does that, though. I wouldn't think
that the PorterStemmer cuts off prefix syllables of that kind?)

If, however, you search on an analyzed version of the field it should
return hits as usual according to the analyzer chain, and you can thus
use the values of that field listed in the hits as suggestions.

Example:
input: potter
field type: solr.TextField (with porter stemmer)
finds: "Harry Potter and Whatever"
and also "Potters and Plums"


Cheers,
Chantal


On Wed, 2010-06-23 at 13:17 +0200, Saïd Radhouani wrote:
> Hi,
> 
> I'm using the Terms Component to se up the autocomplete feature based on a 
> String field. Here are the params I'm using:
> 
> terms=true&terms.fl=type&terms.lower=cat&terms.prefix=cat&terms.lower.incl=false
> 
> With the above params, I've been able to get suggestions for terms that start 
> with the specified prefix. I'm wondering wether it's possible to:
> 
> - have inclusive search, i.e., by typing "cat," we get "category," 
> "subcategory," etc.?
> 
> - start suggestion from any word in the field. i.e., by typing "cat," we get 
> "The best category..."?
> 
> Thanks!
> 
>  -Saïd
> 
> 





Alphabetic range

2010-06-23 Thread Sophie M.

Hello all,

I have been trying for several days to build an alphabetical range. I will explain
all the steps (I have the Solr 1.4 Enterprise Search Server book written by
Smiley and Pugh).

I want to get all artists beginning with the first two letters. If I request "mi",
I want to get as a response "michael jackson" and all artist names beginning
with "mi".

I defined a field type similar to Smiley and Pugh's example on p. 148:



 






I defined the field ArtistSort like : 


For the request:

http://localhost:8983/solr/music/select?indent=on&q=yu&qt=standard&wt=standard&facet=on&facet.field=ArtistSort&facetsort=lex&facet.missing=on&facet.method=enum&fl=ArtistSort

I get :

http://lucene.472066.n3.nabble.com/file/n916716/select.xml select.xml 

I don't understand why the pattern doesn't match exactly. For example, "An An Yu"
matches, but I only want artists whose name begins with "yu". And I know that an
artist named ReYu would match because ReYu would be interpreted as Re Yu (as
two words).

I also tried to make another type of query, like:

http://localhost:8983/solr/music/select?indent=on&version=2.2&q=ArtistSort:mi*&fq=&start=0&rows=10&fl=ArtistSort&qt=standard&wt=standard&explainOther=&hl.fl=

I get exactly what I want. I made several tries, and I get only artist names
which begin with the right first two letters.

But I get very few responses, see here:

<result name="response" numFound="6" start="0">
  <doc><str name="ArtistSort">mike manne and tiger blues</str></doc>
  <doc><str name="ArtistSort">mimika</str></doc>
  <doc><str name="ArtistSort">miduno</str></doc>
  <doc><str name="ArtistSort">milue macïro</str></doc>
  <doc><str name="ArtistSort">mister pringle</str></doc>
  <doc><str name="ArtistSort">mimmai</str></doc>
</result>

In my index there are more than 80,000 artists... I really don't understand
why I can't get more responses. I have been thinking about the problem for
days and days and now my brain freezes.

Thank you in advance.

Sophie
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Alphabetic-range-tp916716p916716.html
Sent from the Solr - User mailing list archive at Nabble.com.


Setting up Eclipse with merged Lucene Solr source tree

2010-06-23 Thread Ukyo Virgden
Hi,

I'm trying to set up an Eclipse environment for the combined lusolr tree. I've
created one project containing /trunk/lusolr/lucene
and /trunk/lusolr/modules, and another project for /trunk/lusolr/solr.
I removed the Solr libs from the Lucene project and added the Lucene
project to the dependencies of the Solr project.

The Lucene source tree is fine, but in the Solr tree I get 5 errors:

The method getTextContent() is undefined for the type Node TestConfig.java
/Solr/src/test/org/apache/solr/core line 91
The method getTextContent() is undefined for the type Node TestConfig.java
/Solr/src/test/org/apache/solr/core line 94
The method setXIncludeAware(boolean) is undefined for the type
DocumentBuilderFactory Config.java /Solr/src/java/org/apache/solr/core line
113
The method setXIncludeAware(boolean) is undefined for the type
DocumentBuilderFactory DataImporter.java
/Solr/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport
line
The method setXIncludeAware(boolean) is undefined for the type Object
TestXIncludeConfig.java /Solr/src/test/org/apache/solr/core line 32

Is this the correct way to set up Eclipse after the source tree merge?

Thanks in advance
Ukyo


dataimport.properties is not updated on delta-import

2010-06-23 Thread warb

Hello!

I am having some difficulties getting dataimport (DIH) to behave correctly
in Solr 1.4.0. Indexing itself works just as it is supposed to with both
full-import and delta-import adding modified or newly created records to the
index. The problem, however, is that the date and time of the last
delta-import is not updated in the "dataimport.properties" file. The only
time the file gets updated is when performing a full-import.

Now, this is not a huge problem since delta-import will simply disregard
records already imported (due to the primary key), but it seems wasteful to
fetch records which have already been added on previous runs. Also, as the
database grows the delta-imports will take longer and longer.

Does anyone know of anything I might have overlooked or known bugs?

Thanks in advance!

Johan Andersson
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/dataimport-properties-is-not-updated-on-delta-import-tp916753p916753.html
Sent from the Solr - User mailing list archive at Nabble.com.


Indexing Rich Format Documents using Data Import Handler (DIH) and the TikaEntityProcessor

2010-06-23 Thread Tod

Please refer to this thread for history:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201006.mbox/%3c4c1b6bb6.7010...@gmail.com%3e


I'm trying to integrate the TikaEntityProcessor as suggested.  I'm using 
Solr Version: 1.4.0 and getting the following error:


java.lang.ClassNotFoundException: Unable to load BinURLDataSource or 
org.apache.solr.handler.dataimport.BinURLDataSource


curl -s http://test.html|curl 
http://localhost:9080/solr/update/extract?extractOnly=true --data-binary 
@-  -H 'Content-type:text/html'


... works fine so presumably my Tika processor is working.


My data-config.xml looks like this:


  

  

  

  
  
  
  
  
  
  


 query="select CONTENT_URL from my_database where 
content_id='${my_database.CONTENT_ID}'">

 
  url="http://www.mysite.com/${my_database.content_url}";
  
 


  


I added the entity name="my_database_url" section to an existing 
(working) database entity to be able to have Tika index the content 
pointed to by the content_url.


Is there anything obviously wrong with what I've tried so far because 
this is not working, it keeps rolling back with the error above.
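
For reference, a bare-bones Tika entity as sketched on the DIH wiki looks
roughly like the following (an untested sketch; the dataSource name, the "text"
column and the destination field are assumptions, and BinURLDataSource /
TikaEntityProcessor only ship with Solr builds newer than 1.4.0 as far as I can
tell, which may be the real cause of the ClassNotFoundException):

<!-- declared at the top of data-config.xml; "bin" is an assumed name -->
<dataSource name="bin" type="BinURLDataSource"/>

<entity name="my_database_url" processor="TikaEntityProcessor"
        dataSource="bin" format="text"
        url="http://www.mysite.com/${my_database.content_url}">
  <!-- "text" is Tika's extracted-body column; "content" is an assumed target field -->
  <field column="text" name="content"/>
</entity>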



Thanks - Tod


Re: TermsComponent - AutoComplete - Multiple Term Suggestions & Inclusive Search?

2010-06-23 Thread Sophie M.

To build your autocompletion, you can use the NGramFilterFactory. If you type
"cat", it will match "subcategory" and "the best category".

If you change your mind and no longer want to match "subcategory", you can use
the EdgeNGramFilterFactory instead.
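
For illustration, a field type along these lines (just a sketch; the type name
and gram sizes are arbitrary) builds the grams at index time and leaves the
query side as plain lowercased tokens:

<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- swap in solr.NGramFilterFactory to also match inside words -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>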
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/TermsComponent-AutoComplete-Multiple-Term-Suggestions-Inclusive-Search-tp916530p916769.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: dataimport.properties is not updated on delta-import

2010-06-23 Thread Stefan Moises

Hi,

what I have experienced is that the primary key seems to be case
sensitive for the delta queries, at least for some JDBC drivers... see
http://lucene.472066.n3.nabble.com/Problem-with-DIH-delta-import-on-JDBC-tp763469p765262.html
... so make sure you specify it with the correct case (e.g. ID instead
of id) in your db-data-config.xml.
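
A generic delta setup with the pk spelled consistently (a sketch, not taken
from the original poster's config; table and column names are placeholders):

<entity name="item" pk="ID"
        query="select * from item"
        deltaQuery="select ID from item
                    where last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="select * from item where ID = '${dataimporter.delta.ID}'">
  <field column="ID" name="id"/>
</entity>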


Maybe that's the problem...

Cheers,
Stefan

Am 23.06.2010 15:09, schrieb warb:

Hello!

I am having some difficulties getting dataimport (DIH) to behave correctly
in Solr 1.4.0. Indexing itself works just as it is supposed to with both
full-import and delta-import adding modified or newly created records to the
index. The problem is however that the date and time of the last
delta-import is not updated in the "dataimport.properites" file. The only
time the file gets updated is when performing a full-import.

Now, this is not a huge problem since delta-import will simply disregard
records already imported (due to the primary key), but it seems wasteful to
fetch records which have already been added on previous runs. Also, as the
database grows the delta-imports will take longer and longer.

Does anyone know of anything I might have overlooked or known bugs?

Thanks in advance!

Johan Andersson
   


--
***
Stefan Moises
Senior Softwareentwickler

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck

Tel.: 0911/25566-25
Fax:  0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
***



fuzzy query performance

2010-06-23 Thread Peter Karich
Hi!

How can I improve the performance of a fuzzy search like: mihchael~0.7
through a relatively large index (~1 million docs)?
It takes over 15 seconds at the moment if we perform it on the
normal text search field.
I searched the web and the jira and couldn't find anything related to that.

Any pointers or ideas would be appreciated!

Regards,
Peter.


Re: fuzzy query performance

2010-06-23 Thread Mark Miller

On 6/23/10 9:48 AM, Peter Karich wrote:

Hi!

How can I improve the performance of a fuzzy search like: mihchael~0.7
through a relative large index (~1 million docs)?
It takes over 15 seconds at the moment if we would perform it on the
normal text search field.
I searched the web and the jira and couldn't find anything related to that.

Any pointers or ideas would be appreciated!

Regards,
Peter.


Solr trunk should have much improved fuzzy speeds (due to some very cool 
work that was done in Lucene) - you using 1.4?


--
- Mark

http://www.lucidimagination.com


Re: Setting up Eclipse with merged Lucene Solr source tree

2010-06-23 Thread Erick Erickson
Did you see this page?"
http://wiki.apache.org/solr/HowToContribute

Especially down near the end,
the section
"Development Environment Tips"

HTH
Erick

On Wed, Jun 23, 2010 at 8:57 AM, Ukyo Virgden  wrote:

> Hi,
>
> I'm trying to setup and eclipse environment for combined Lusolr tree. I've
> created a Lucene project containing /trunk/lusolr/lucene
> and /trunk/lusolr/modules as one project and /trunk/lusolr/solr as another.
> I've added lucene project as a dependency to Solr project, removed solr
> libs
> from lucene project and added Lucene project to dependencies of Solr
> project.
>
> Lucene source tree is fine but in the Solr tree I get 5 errors
>
> The method getTextContent() is undefined for the type Node TestConfig.java
> /Solr/src/test/org/apache/solr/core line 91
> The method getTextContent() is undefined for the type Node TestConfig.java
> /Solr/src/test/org/apache/solr/core line 94
> The method setXIncludeAware(boolean) is undefined for the type
> DocumentBuilderFactory Config.java /Solr/src/java/org/apache/solr/core line
> 113
> The method setXIncludeAware(boolean) is undefined for the type
> DocumentBuilderFactory DataImporter.java
>
> /Solr/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport
> line
> The method setXIncludeAware(boolean) is undefined for the type Object
> TestXIncludeConfig.java /Solr/src/test/org/apache/solr/core line 32
>
> Is this the correct way to setup eclipse after the source tree merge?
>
> Thanks in advance
> Ukyo
>


Re: Help with highlighting

2010-06-23 Thread noel
Here's my request:
q=ASA+AND+minisite_id%3A36&version=1.3&json.nl=map&rows=10&start=0&wt=json&hl=true&hl.fl=%2A&hl.simple.pre=%3Cspan+class%3D%22hl%22%3E&hl.simple.post=%3C%2Fspan%3E&hl.fragsize=0&hl.mergeContiguous=false

And here's what happened:
It didn't return results, even when I applied an asterisk for which fields 
highlight. I tried other fields and that didn't work either, however all_text 
is the only one that works. Any other ideas why the other fields won't 
highlight? Thanks.

-Original Message-
From: "Erik Hatcher" 
Sent: Tuesday, June 22, 2010 9:49pm
To: solr-user@lucene.apache.org
Subject: Re: Help with highlighting

You need to share with us the Solr request you made, any any custom  
request handler settings that might map to.  Chances are you just need  
to twiddle with the highlighter parameters (see wiki for docs) to get  
it to do what you want.

Erik

On Jun 22, 2010, at 4:42 PM, n...@frameweld.com wrote:

> Hi, I need help with highlighting fields that would match a query.  
> So far, my results only highlight if the field is from all_text, and  
> I would like it to use other fields. It simply isn't the case if I  
> just turn highlighting on. Any ideas why it only applies to  
> all_text? Here is my schema:
>
> 
>
> 
>   
>   
>   
>   
>sortMissingLast="true" omitNorms="true" />
>sortMissingLast="true" omitNorms="true" />
>   
>   
>omitNorms="true"/>
>
>   
>omitNorms="true"/>
>omitNorms="true"/>
>   
>   
>sortMissingLast="true" omitNorms="true"/>
>sortMissingLast="true" omitNorms="true"/>
>sortMissingLast="true" omitNorms="true"/>
>sortMissingLast="true" omitNorms="true"/>
>   
>   
>
>sortMissingLast="true" omitNorms="true"/>
>   
>   
>indexed="true" />
>   
>   
>positionIncrementGap="100">
>   
>class="solr.WhitespaceTokenizerFactory"/>
>   
>   
>
>   
>positionIncrementGap="100">
>   
>class="solr.WhitespaceTokenizerFactory"/>
>   
> 
> generateWordParts="1" generateNumberParts="1" catenateWords="1"  
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>   
> 
> protected="protwords.txt"/>
>class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   
>
>   
>class="solr.WhitespaceTokenizerFactory"/>
>synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> 
> generateWordParts="1" generateNumberParts="1" catenateWords="0"  
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>   
> 
> protected="protwords.txt"/>
>class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   
>   
>
>   
>positionIncrementGap="100" >
>   
>class="solr.WhitespaceTokenizerFactory"/>
>synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
>ignoreCase="true"  
> words="stopwords.txt"/>
> 
> generateWordParts="0" generateNumberParts="0" catenateWords="1"  
> catenateNumbers="1" catenateAll="0"/>
>   
> 
> protected="protwords.txt"/>
>class="solr.RemoveDuplicatesTokenFilterFactory"/>
>
>   
>   
>   
>positionIncrementGap="100" >
>   
>class="solr.StandardTokenizerFactory"/>
>   
>class="solr.RemoveDuplicatesTokenFilterFactory" />
>maxShingleSize="2"  
> outputUnigrams="false" />
>   
>
>   
>   
>sortMissingLast="true" omitNorms="true">
>   
>class="solr.KeywordTokenizerFactory"/>
>   
>   
>  pattern="([^a-z])" replacement="" 
> replace="all"
>   />
>   
>   
>
>class="solr.StrField" />
>

Re: Help with highlighting

2010-06-23 Thread dan sutton
It looks to me like a tokenisation issue, all_text content and the query
text will match, but the string fieldtype fields 'might not' and therefore
will not be highlighted.
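
For example (field names invented for illustration): a string field only
produces a single term for the exact full value, so a query for ASA will not
generate highlight fragments from it. The usual workaround is to copy it into
a tokenized text field and highlight that one:

<field name="title_s" type="string" indexed="true" stored="true"/>
<field name="title_t" type="text"   indexed="true" stored="true"/>
<copyField source="title_s" dest="title_t"/>

and then include title_t (not title_s) in hl.fl.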

On Wed, Jun 23, 2010 at 4:40 PM,  wrote:

> Here's my request:
> q=ASA+AND+minisite_id%3A36&version=1.3&json.nl
> =map&rows=10&start=0&wt=json&hl=true&hl.fl=%2A&hl.simple.pre=%3Cspan+class%3D%22hl%22%3E&hl.simple.post=%3C%2Fspan%3E&hl.fragsize=0&hl.mergeContiguous=false
>
> And here's what happened:
> It didn't return results, even when I applied an asterisk for which fields
> highlight. I tried other fields and that didn't work either, however
> all_text is the only one that works. Any other ideas why the other fields
> won't highlight? Thanks.
>

remove from list

2010-06-23 Thread Susan Rust
Hey SOLR folks -- There's too much info for me to digest, so please  
remove me from the email threads.


However, if we can build you a forum, bulletin board or other web- 
based tool, please let us know. For that matter, we would be happy to  
build you a new website.


Bill O'Connor is our CTO and the Drupal.org SOLR Redesign Lead. So we  
love SOLR! Let us know how we can support your efforts.


Susan Rust
VP of Client Services

If you wish to travel quickly, go alone
If you wish to travel far, go together

Achieve Internet
1767 Grand Avenue, Suite 2
San Diego, CA 92109

800-618-8777 x106
858-453-5760 x106

Susan-Rust (skype)
@Susan_Rust (twitter)
@Achieveinternet (twitter)
@drupalsandiego (San Diego Drupal Users' Group Twitter)



This message contains confidential information and is intended only  
for the individual named. If you are not the named addressee you  
should not disseminate, distribute or copy this e-mail. Please notify  
the sender immediately by e-mail if you have received this e-mail by  
mistake and delete this e-mail from your system. E-mail transmission  
cannot be guaranteed to be secure or error-free as information could  
be intercepted, corrupted, lost, destroyed, arrive late or incomplete,  
or contain viruses. The sender therefore does not accept liability for  
any errors or omissions in the contents of this message, which arise  
as a result of e-mail transmission. If verification is required please  
request a hard-copy version.














On Jun 23, 2010, at 1:52 AM, Mark Allan wrote:


Cheers, Geert-Jan, that's very helpful.

We won't always be searching with dates and we wouldn't want  
duplicates to show up in the results, so your second suggestion  
looks like a good workaround if I can't solve the actual problem.  I  
didn't know about FieldCollapsing, so I'll definitely keep it in mind.


Thanks
Mark

On 22 Jun 2010, at 3:44 pm, Geert-Jan Brits wrote:

Perhaps my answer is useless, bc I don't have an answer to your  
direct

question, but:
You *might* want to consider if your concept of a solr-document is  
on the

correct granular level, i.e:

your problem posted could be tackled (afaik) by defining a   
document being a

'sub-event' with only 1 daterange.
So for each event-doc you have now, this is replaced by several sub- 
event

docs in this proposed situation.

Additionally each sub-event doc gets an additional field 'parent- 
eventid'
which maps to something like an event-id (which you're probably  
using) .

So several sub-event docs can point to the same event-id.

Lastly, all sub-event docs belonging to a particular event  
implement all the

other fields that you may have stored in that particular event-doc.

Now you can query for events based on data-rages like you  
envisioned, but
instead of returning events you return sub-event-docs. However  
since all
data of the original event (except the multiple dateranges) is  
available in
the subevent-doc this shouldn't really bother the client. If you  
need to
display all dates of an event (the only info missing from the  
returned
solr-doc) you could easily store it in a RDB and fetch it using the  
defined

parent-eventid.

The only caveat I see, is that possibly multiple sub-events with  
the same

'parent-eventid' might get returned for a particular query.
This however depends on the type of queries you envision. i.e:
1)  If you always issue queries with date-filters, and *assuming*  
that
sub-events of a particular event don't temporally overlap, you will  
never

get multiple sub-events returned.
2)  if 1)  doesn't hold and assuming you *do* mind multiple sub- 
events of

the same actual event, you could try to use Field Collapsing on
'parent-eventid' to only return the first sub-event per parent- 
eventid that
matches the rest of your query. (Note however, that Field  
Collapsing is a

patch at the moment. http://wiki.apache.org/solr/FieldCollapsing)

Not sure if this helped you at all, but at the very least it was a  
nice

conceptual exercise ;-)

Cheers,
Geert-Jan


2010/6/22 Mark Allan 


Hi all,

Firstly, I apologise for the length of this email but I need to  
describe

properly what I'm doing before I get to the problem!

I'm working on a project just now which requires the ability to  
store and
search on temporal coverage data - ie. a field which specifies a  
date range

during which a certain event took place.

I hunted around for a few days and couldn't find anything which  
seemed to
fit, so I had a go at writing my own field type based on  
solr.PointType.

It's used as follows:
schema.xml
 
 stored="true"

multiValued="true"/>
data.xml
 
 
 ...
 1940,1945
 
 

Internally, this gets stored as:
 1940,1945
 1940
 1945

In due course, I'll declare the subfields as a proper date type,  
but in the
meantime, this works absolutely fine.  I can search for an  
individual date
and Solr will check (queryDate

RE: remove from list

2010-06-23 Thread Markus Jelsma
If you want to unsubscribe, then you can do so [1] without trying to sell 
something ;)

 

[1]: http://lucene.apache.org/solr/mailing_lists.html

 

Cheers!
 
-Original message-
From: Susan Rust 
Sent: Wed 23-06-2010 18:23
To: solr-user@lucene.apache.org; Erik Hatcher ; 
Subject: remove from list

Hey SOLR folks -- There's too much info for me to digest, so please  
remove me from the email threads.

However, if we can build you a forum, bulletin board or other web- 
based tool, please let us know. For that matter, we would be happy to  
build you a new website.

Bill O'Connor is our CTO and the Drupal.org SOLR Redesign Lead. So we  
love SOLR! Let us know how we can support your efforts.

Susan Rust
VP of Client Services

If you wish to travel quickly, go alone
If you wish to travel far, go together

Achieve Internet
1767 Grand Avenue, Suite 2
San Diego, CA 92109

800-618-8777 x106
858-453-5760 x106

Susan-Rust (skype)
@Susan_Rust (twitter)
@Achieveinternet (twitter)
@drupalsandiego (San Diego Drupal Users' Group Twitter)




Re: remove from list

2010-06-23 Thread Susan Rust

Will do -- but wasn't selling -- trying to donate!

Susan Rust
VP of Client Services

If you wish to travel quickly, go alone
If you wish to travel far, go together

Achieve Internet
1767 Grand Avenue, Suite 2
San Diego, CA 92109

800-618-8777 x106
858-453-5760 x106

Susan-Rust (skype)
@Susan_Rust (twitter)
@Achieveinternet (twitter)
@drupalsandiego (San Diego Drupal Users' Group Twitter)




Re: Help with highlighting

2010-06-23 Thread noel
Thanks, that's exactly the problem. I've tried different types, even a
fieldType that had no tokenizers, and that didn't work. However, the text type
gives me the results I want.

-Original Message-
From: "dan sutton" 
Sent: Wednesday, June 23, 2010 12:06pm
To: solr-user@lucene.apache.org
Subject: Re: Help with highlighting

It looks to me like a tokenisation issue, all_text content and the query
text will match, but the string fieldtype fields 'might not' and therefore
will not be highlighted.

On Wed, Jun 23, 2010 at 4:40 PM,  wrote:

> Here's my request:
> q=ASA+AND+minisite_id%3A36&version=1.3&json.nl
> =map&rows=10&start=0&wt=json&hl=true&hl.fl=%2A&hl.simple.pre=%3Cspan+class%3D%22hl%22%3E&hl.simple.post=%3C%2Fspan%3E&hl.fragsize=0&hl.mergeContiguous=false
>
> And here's what happened:
> It didn't return results, even when I applied an asterisk for which fields
> highlight. I tried other fields and that didn't work either, however
> all_text is the only one that works. Any other ideas why the other fields
> won't highlight? Thanks.
>

Highlight question

2010-06-23 Thread Gregg Hoshovsky
I just started working with the highlighting.  I am using the default 
configurations. I have a field that I can get a single highlight to occur 
marking the data.

What I would like to do is this,

Given a word say 'tumor', and the sentence

" the lower tumor grew 1.5 cm. blah blah blah  we need to remove the tumor in 
the next surgery"

I would like to get "... the lower tumor grew 1.5 cm. blah blah 
blah  we need to ... remove the tumor in the next ... surgery"

Thus finding multiple references to the word and only grabbing a few words 
around it.



In the solrconfig.xml I have been able to change the hl.simple.pre/post
variables, but when I try to change the hl.regex pattern or hl.snippets they
don't have any effect. I thought hl.snippets would allow me to find more than
one occurrence and highlight it, and I tried a bunch of regex patterns but they
didn't do anything.

here is a snippet of the config file.

Any help is appreciated.

Gregg


   
   

  
  4  70
  
  0.2
  
  [-\w ,/\n\"']{1,1}

   

   
   

  4
 100
 
 




Help with sorting

2010-06-23 Thread Adi Neacsu
Hi everyone,
I'm stuck on sorting with Solr. I have documents from some institutions,
differentiated by an id named instanta. I indexed all those documents
and, among other things, I put in the index the date the document was
created and the id of the institution. When I want to sort the documents
which contain a certain word by date or by institution, all I get is an
order that I don't understand.

 
 

 QueryOptions options = new QueryOptions 
{ 
Rows = resultsPerPage, 
Start = (pageNumber - 1) * resultsPerPage, 
OrderBy = new[] { new SortOrder("instanta", Order.DESC) } 

}; 

Thank you in advance 
 
jud. Adrian Neacsu
Presedinte Tribunalul Vrancea
http://www.adrianneacsu.jurindex.ro

www.jurisprudenta.org

www.societateapentrujustitie.ro
 (+40) 0721949875  ;  (+40) 0749182508  
fax 0337814221


  

DIH and dynamicField

2010-06-23 Thread Boyd Hemphill
I am new to the list so any coaching on asking questions is much appreciated.

I am  having a problem where importing with DIH and attempting to use
dynamicField produces no result.  I get no error, nor do I get a message in
the log.

I found this:  https://issues.apache.org/jira/browse/SOLR-742 which says the
issue was closed in bulk for the 1.4 release.  The messages above seem to
indicate the patch was in/out/good/bad, so I am not sure if the issue was
fixed as we are seeing the same behavior described in the bug.

Has this issue, in fact, been resolved?  Is anyone using DIH and
dynamicField successfully together?

Solr is truly fantastic (so is DIH for that matter).  Thank you!

Boyd Hemphill


Re: fuzzy query performance

2010-06-23 Thread Peter Karich
Hi Mark!

> Solr trunk should have much improved fuzzy speeds (due to some very
cool work that was done in Lucene) - you using 1.4?

yes.
So, you mean I should try it out here:
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/

or some 'more stable' branch?
http://svn.apache.org/viewvc/lucene/solr/branches/branch-1.5-dev/

What would you choose?

Regards,
Peter.

>> Hi!
>>
>> How can I improve the performance of a fuzzy search like: mihchael~0.7
>> through a relative large index (~1 million docs)?
>> It takes over 15 seconds at the moment if we would perform it on the
>> normal text search field.
>> I searched the web and the jira and couldn't find anything related to
>> that.
>>
>> Any pointers or ideas would be appreciated!
>>
>> Regards,
>> Peter.
>
> Solr trunk should have much improved fuzzy speeds (due to some very
> cool work that was done in Lucene) - you using 1.4?



Stemmed and/or unStemmed field

2010-06-23 Thread Vishal A.
Hello all,

 

One quick question, trying to find out what scenario would work best.

We have a huge free-text dataset containing product titles and descriptions.
Unfortunately, we don't have the data categorized, so we rely heavily on
'search relevancy + synonyms' to categorize.

Here is what I am trying to do: someone clicks on 'Comforters & Pillows',
and we want the results to be filtered to titles containing the keyword
'Comforter' or 'Pillows', but we have been getting results with the word
'comfort' in the title. I assume it is because of stemming. What is the
right way to handle this?

I am thinking of creating another, unstemmed field such as 'title_unstemmed'
which stores the data unstemmed. So basically, with dismax, I could boost the
score on the unstemmed field. I can think of other scenarios where stemming
would be needed, so the stemmed field would still match.
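
Roughly, the layout I have in mind (a sketch only; field and type names are
placeholders, and textgen is the unstemmed general-purpose text type from the
1.4 example schema):

<field name="title"           type="text"    indexed="true" stored="true"/>
<field name="title_unstemmed" type="textgen" indexed="true" stored="false"/>
<copyField source="title" dest="title_unstemmed"/>

and in the dismax handler something like <str name="qf">title_unstemmed^2.0
title^1.0</str>, so unstemmed matches outrank purely stemmed ones.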

 

Does that sound like something that will work? Any suggestions please?  

 

Much appreciated 



Can solr return pretty text as the content?

2010-06-23 Thread JohnRodey

When I feed pretty text into solr for indexing from lucene and search for it,
the content is always returned as one long line of text.  Is there a way for
solr to return the pretty formatted text to me?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-solr-return-pretty-text-as-the-content-tp917912p917912.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Can solr return pretty text as the content?

2010-06-23 Thread caman

Define Pretty text.

 

1) Are you asking whether the XML/JSON returned by Solr is not pretty?

If yes, try indent=on with your query params

 

2) Or are you talking about the data in a certain field? 

Solr returns what you feed it. Look at your filters for that field
type. Your filters/tokenizer may be stripping the formatting.

 

 

 

From: JohnRodey [via Lucene]
[mailto:ml-node+917912-920852633-124...@n3.nabble.com] 
Sent: Wednesday, June 23, 2010 1:19 PM
To: caman
Subject: Can solr return pretty text as the content?

 

When I feed pretty text into solr for indexing from lucene and search for
it, the content is always returned as one long line of text.  Is there a way
for solr to return the pretty formatted text to me? 


 


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-solr-return-pretty-text-as-the-content-tp917912p917966.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Highlight question

2010-06-23 Thread Ahmet Arslan
> In the solrconfig.xml I have been able to change the
> hl.simple.pre/post variable, but when I try to change the
> hl,regex pattern or the hl.snippets they don't have any
> effect. I thought the hl.snippets would alow me to find more
> than one and highlight it, and well I tried a bunch of regex
> patterns but they didn't do anything.

The hl.snippets=4 parameter should go under the "defaults" section of 
your default SearchHandler. 

Also, the &hl.fragmenter=regex parameter is required to activate the regular
expression based fragmenter.
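
For example (a sketch; the handler name and the regex pattern are the ones in
the stock example solrconfig.xml, so adjust them to your own handler):

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="hl.snippets">4</int>
    <str name="hl.fragmenter">regex</str>
    <str name="hl.regex.pattern">[-\w ,/\n\&quot;']{20,200}</str>
  </lst>
</requestHandler>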





Re: Help with sorting

2010-06-23 Thread Ahmet Arslan


> When I want sort the
> documents 
> wich contain a certain word by date or by instituion all I
> get is 
> an 
> order that I don't understand . 
> 
>  stored="false" /> 
>  stored="false" 
> required="true" /> 

You need to use a sortable type: sint with solr 1.3; tint with solr 1.4

field name="instanta" type="tint"


  


Re: fuzzy query performance

2010-06-23 Thread Robert Muir
On Wed, Jun 23, 2010 at 3:34 PM, Peter Karich  wrote:

>
> So, you mean I should try it out her:
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/
>
>
yes, the speedups are only in trunk.

-- 
Robert Muir
rcm...@gmail.com


Re: DIH and dynamicField

2010-06-23 Thread Robert Zotter


Boyd Hemphill-2 wrote:
> 
> I am  having a problem where importing with DIH and attempting to use
> dynamicField produces no result.  I get no error, nor do I get a message
> in
> the log.

It would help if you posted the relevant parts of your data-config.xml and
schema.xml. If you are doing a straight column to name mapping my first
guess would be you could have those backwards or there is some
misconfiguration in your schema.xml. For example if you have a database
column "foo" and you want to add it to the "foo_dynamic" field you should be
using something like this:

solrconfig.xml


data-config.xml
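
i.e. something along these lines (a sketch with placeholder names; note that
the dynamicField declaration belongs in schema.xml):

<!-- schema.xml: declare the dynamic field pattern (name/type are placeholders) -->
<dynamicField name="*_dynamic" type="string" indexed="true" stored="true"/>

<!-- data-config.xml: map the database column onto a matching field name -->
<field column="foo" name="foo_dynamic"/>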


Hope this helps. 

- Robert Zotter
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-and-dynamicField-tp917823p918189.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Stemmed and/or unStemmed field

2010-06-23 Thread Robert Muir
On Wed, Jun 23, 2010 at 3:58 PM, Vishal A.
wrote:

>
> Here is what I am trying to do :  Someone clicks on  'Comforters & Pillows'
> , we would want the results to be filtered where title has keyword
> 'Comforter' or  'Pillows' but we have been getting results with word
> 'comfort' in the title. I assume it is because of stemming. What is the
> right way to handle this?
>

from your examples, it seems a more lightweight stemmer might be an easy
option: https://issues.apache.org/jira/browse/LUCENE-2503

-- 
Robert Muir
rcm...@gmail.com


RE: Stemmed and/or unStemmed field

2010-06-23 Thread caman

Ahh,perfect.

Will take a look. thanks

 

From: Robert Muir [via Lucene]
[mailto:ml-node+918302-232685105-124...@n3.nabble.com] 
Sent: Wednesday, June 23, 2010 4:17 PM
To: caman
Subject: Re: Stemmed and/or unStemmed field

 

On Wed, Jun 23, 2010 at 3:58 PM, Vishal A. 
<[hidden email]>wrote: 

> 
> Here is what I am trying to do :  Someone clicks on  'Comforters &
Pillows' 
> , we would want the results to be filtered where title has keyword 
> 'Comforter' or  'Pillows' but we have been getting results with word 
> 'comfort' in the title. I assume it is because of stemming. What is the 
> right way to handle this? 
> 

from your examples, it seems a more lightweight stemmer might be an easy 
option: https://issues.apache.org/jira/browse/LUCENE-2503

-- 
Robert Muir 
[hidden email] 




 


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Stemmed-and-or-unStemmed-field-tp917876p918309.html
Sent from the Solr - User mailing list archive at Nabble.com.


Some minor Solritas layout tweaks

2010-06-23 Thread Ken Krugler
I grabbed the latest & greatest from trunk, and then had to make a few  
minor layout tweaks.


1. In main.css, the ".query-box input" { height} isn't tall enough (at  
least on my Mac 10.5/FF 3.6 config), so character descenders get  
clipped.


I bumped it from 40px to 50px, and that fixed the issue for me.

2. The constraint text (for removing facet constraints) overlaps with  
the Solr logo.


It looks like the div that contains this anchor text is missing a  
class="constraints", as I see a .constraints in the CSS.


I added this class name, and also (to main.css):

.constraints {
  margin-top: 10px;
}

But IANAWD, so this is probably not the best way to fix the issue.

3. And then I see a .constraints-title in the CSS, but it's not used.

Was the intent of this to set the '>' character to gray?

4. It seems silly to open JIRA issues for these types of things, but I  
also don't want to add to noise on the list.


Which approach is preferred?

Thanks,

-- Ken




Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Multiple Solr Webapps in Glassfish with JNDI

2010-06-23 Thread Kelly Taylor

Does anybody know how to setup multiple Solr webapps in Glassfish with JNDI?

-Kelly
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Multiple-Solr-Webapps-in-Glassfish-with-JNDI-tp918383p918383.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Setting up Eclipse with merged Lucene Solr source tree

2010-06-23 Thread Lance Norskog
I have found it easier to make these projects in my Eclipse workspace
and make remote links to the parts that I really want. This cuts the
total stuff in the project- cuts build times, 'search everywhere'
times, menus full of classes named '*file*', etc.

But git may have problems with this, and git is a lifesaver for
playing with patches etc.

Lance

On Wed, Jun 23, 2010 at 8:03 AM, Erick Erickson  wrote:
> Did you see this page?"
> http://wiki.apache.org/solr/HowToContribute
>
> Especially down near the end,
> the section
> "Development Environment Tips"
>
> HTH
> Erick
>
> On Wed, Jun 23, 2010 at 8:57 AM, Ukyo Virgden  wrote:
>
>> Hi,
>>
>> I'm trying to setup and eclipse environment for combined Lusolr tree. I've
>> created a Lucene project containing /trunk/lusolr/lucene
>> and /trunk/lusolr/modules as one project and /trunk/lusolr/solr as another.
>> I've added lucene project as a dependency to Solr project, removed solr
>> libs
>> from lucene project and added Lucene project to dependencies of Solr
>> project.
>>
>> Lucene source tree is fine but in the Solr tree I get 5 errors
>>
>> The method getTextContent() is undefined for the type Node TestConfig.java
>> /Solr/src/test/org/apache/solr/core line 91
>> The method getTextContent() is undefined for the type Node TestConfig.java
>> /Solr/src/test/org/apache/solr/core line 94
>> The method setXIncludeAware(boolean) is undefined for the type
>> DocumentBuilderFactory Config.java /Solr/src/java/org/apache/solr/core line
>> 113
>> The method setXIncludeAware(boolean) is undefined for the type
>> DocumentBuilderFactory DataImporter.java
>>
>> /Solr/contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport
>> line
>> The method setXIncludeAware(boolean) is undefined for the type Object
>> TestXIncludeConfig.java /Solr/src/test/org/apache/solr/core line 32
>>
>> Is this the correct way to setup eclipse after the source tree merge?
>>
>> Thanks in advance
>> Ukyo
>>
>



-- 
Lance Norskog
goks...@gmail.com


Re: DIH and dynamicField

2010-06-23 Thread Lance Norskog
A side comment about patches and JIRA: the second-to-last comment on
SOLR-742 says 'Committed'. That means one of the committers (Shalin
in this case) committed the fix. It was in 2008, so it's in Solr 1.4.

https://issues.apache.org/jira/browse/SOLR-742?focusedCommentId=12643747&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12643747

But, yes, Robert is right: post what you can of your config files.

On Wed, Jun 23, 2010 at 3:11 PM, Robert Zotter  wrote:
>
>
> Boyd Hemphill-2 wrote:
>>
>> I am  having a problem where importing with DIH and attempting to use
>> dynamicField produces no result.  I get no error, nor do I get a message
>> in
>> the log.
>
> It would help if you posted the relevant parts of your data-config.xml and
> schema.xml. If you are doing a straight column to name mapping my first
> guess would be you could have those backwards or there is some
> misconfiguration in your schema.xml. For example if you have a database
> column "foo" and you want to add it to the "foo_dynamic" field you should be
> using something like this:
>
> solrconfig.xml
> 
>
> data-config.xml
> 
>
> Hope this helps.
>
> - Robert Zotter
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/DIH-and-dynamicField-tp917823p918189.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Lance Norskog
goks...@gmail.com


Re: Multiple Solr Webapps in Glassfish with JNDI

2010-06-23 Thread Otis Gospodnetic
Hi Kelly,

I'm not much of a Glassfish user, but have you tried following the JNDI 
instructions for Tomcat? Maybe that works for Glassfish, too.

http://search-lucene.com/?q=jndi&fc_project=Solr
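
One container-agnostic sketch (paths and context roots are placeholders):
deploy solr.war once per instance and give each copy its own solr/home JNDI
entry in its web.xml, e.g.

<env-entry>
  <!-- point each deployed webapp at its own Solr home directory -->
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/path/to/solr1/home</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>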
 
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Kelly Taylor 
> To: solr-user@lucene.apache.org
> Sent: Wed, June 23, 2010 8:03:48 PM
> Subject: Multiple Solr Webapps in Glassfish with JNDI
> 
> Does anybody know how to setup multiple Solr webapps in Glassfish with JNDI?
> 
> -Kelly
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Multiple-Solr-Webapps-in-Glassfish-with-JNDI-tp918383p918383.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: fuzzy query performance

2010-06-23 Thread Otis Gospodnetic
Btw. here you can see Robert's presentation on what he did to speed up fuzzy 
queries:  http://www.slideshare.net/otisg
 Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch




- Original Message 
> From: Robert Muir 
> To: solr-user@lucene.apache.org
> Sent: Wed, June 23, 2010 5:13:10 PM
> Subject: Re: fuzzy query performance
> 
> On Wed, Jun 23, 2010 at 3:34 PM, Peter Karich <peat...@yahoo.de> wrote:
> >
> > So, you mean I should try it out here:
> > http://svn.apache.org/viewvc/lucene/dev/trunk/solr/
> >
> 
> yes, the speedups are only in trunk.
> 
> -- 
> Robert Muir
> rcm...@gmail.com


Re: Alphabetic range

2010-06-23 Thread Otis Gospodnetic
Sophie,

Go to your Solr Admin page, look for the Analysis page link, go there, enter 
some artists names, enter the query, check the verbose checkboxes, and submit.  
This will tell you what is going on with your analysis at index and at search 
time.
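
For what it's worth, a bucketing type that keeps only the first two letters as
a single lowercased token (an untested sketch, not your exact definition) would
look something like:

<fieldType name="bucketFirstTwoLetters" class="solr.TextField"
    sortMissingLast="true" omitNorms="true">
  <analyzer type="index">
    <!-- group="1" emits the first two letters as the only token -->
    <tokenizer class="solr.PatternTokenizerFactory"
        pattern="^([a-zA-Z]{2}).*" group="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

You would then facet or filter (e.g. fq=bucketField:mi) on a field of that type
rather than querying the full ArtistSort value.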
 Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Sophie M. 
> To: solr-user@lucene.apache.org
> Sent: Wed, June 23, 2010 8:56:39 AM
> Subject: Alphabetic range


Re: Performance related question on DISMAX handler..

2010-06-23 Thread Otis Gospodnetic
BB,

Dismax could be slower than standard, depending on what kinds of queries you 
throw at either handler.
"Millions of docs" is a bit imprecise (2M or 22M or 222M or 999M, tweet-sized 
docs or book sized docs), but given adequate hardware and proper treatment 
shouldn't be a problem.
 Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: bbarani 
> To: solr-user@lucene.apache.org
> Sent: Tue, June 22, 2010 2:27:05 PM
> Subject: Performance related question on DISMAX handler..
> 
> Hi,
> 
> I just want to know if there will be any overhead / performance degradation
> if I use the Dismax search handler instead of standard search handler?
> 
> We are planning to index millions of documents and not sure if using Dismax
> will slow down the search performance. Would be great if someone can share
> their thoughts.
> 
> Thanks,
> BB
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Performance-related-question-on-DISMAX-handler-tp914892p914892.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Spatial types and DIH

2010-06-23 Thread Eric Angel
I'm using solr 4.0-2010-06-23_08-05-33 and can't figure out how to add the 
spatial types (LatLon, Point, GeoHash or SpatialTile) using dataimporthandler.  
My lat/lngs from the database are in separate fields.  Does anyone know how to 
do this?
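
One approach that might work is DIH's TemplateTransformer, gluing the two
columns into the single "lat,lon" string a LatLonType field expects (an
untested sketch; the entity, column and field names are placeholders, and
"location" is the LatLonType-based type from the example schema):

<entity name="place" transformer="TemplateTransformer"
        query="select id, lat, lng from places">
  <field column="id" name="id"/>
  <!-- builds e.g. "37.77,-122.42" and feeds it to a field named "store" -->
  <field column="store" template="${place.lat},${place.lng}"/>
</entity>

with <field name="store" type="location" indexed="true" stored="true"/> in
schema.xml.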

Eric

Re: Field missing when use distributed search + dismax

2010-06-23 Thread Otis Gospodnetic
Make sure you list it in ...&fl=ID,type or set it in the defaults section of 
your handler.
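
e.g. in solrconfig.xml (the handler name is just an example):

<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="fl">ID,type</str>
  </lst>
</requestHandler>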
 Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Scott Zhang 
> To: solr-user@lucene.apache.org
> Sent: Tue, June 22, 2010 11:04:07 AM
> Subject: Field missing when use distributed search + dismax
> 
> Hi. All.
>    I was using distributed search over 30 solr instances; the previous one
> was using the standard query handler. And the result was returned correctly.
> Each result has 2 fields, "ID" and "type".
>    Today I want to use search with dismax. I tried searching each instance
> with dismax. It works correctly, returning "ID" and "type" for each result.
> The strange thing is when I use distributed search, the result only has "ID".
> The field "type" disappeared. I need that "type" to know what the "ID" refers
> to. Why does solr "eat" my "type"?
> 
> Thanks.
> Regards.
> Scott


Re: anyone use hadoop+solr?

2010-06-23 Thread Otis Gospodnetic
Marc is referring to the very informative thread by Ted Dunning from maybe a
month or so ago.

For what it's worth, we just used Hadoop Streaming, JRuby, and EmbeddedSolr to 
speed up indexing by parallelizing it.
 Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Marc Sturlese 
> To: solr-user@lucene.apache.org
> Sent: Tue, June 22, 2010 12:43:27 PM
> Subject: Re: anyone use hadoop+solr?
> 
> 
Well, the patch consumes the data from a csv. You have to modify the input to
use TableInputFormat (I don't remember if it's called exactly like that) and
it will work.
Once you've done that, you have to specify as many reducers as shards you
want.

I know 2 ways to index using hadoop:

method 1 (solr-1301 & nutch):
-Map: just get data from the source and create key-value pairs
-Reduce: does the analysis and indexes the data
So, the index is built on the reducer side.

method 2 (hadoop lucene index contrib):
-Map: does the analysis and opens an IndexWriter to add docs
-Reducer: merges the small indexes built in the map
So, indexes are built on the map side.
method 2 has no good integration with Solr at the moment.

In the jira (SOLR-1301) there's a good explanation of the advantages and
disadvantages of indexing on the map or reduce side. I recommend you to read
all the comments on the jira in detail to know exactly how it works.


--
View this message in context:
http://lucene.472066.n3.nabble.com/anyone-use-hadoop-solr-tp485333p914625.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr with hadoop

2010-06-23 Thread Otis Gospodnetic
I don't think it's ever been discussed - your Q below is #1 hit currently: 
http://search-lucene.com/?q=%2B%28dih+OR+dataimporthandler%29+hdfs
 Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Jon Baer 
> To: solr-user@lucene.apache.org
> Sent: Tue, June 22, 2010 12:47:14 PM
> Subject: Re: solr with hadoop
> 
I was playing around w/ Sqoop the other day, it's a simple Cloudera tool for
imports (mysql -> hdfs) @ http://www.cloudera.com/developers/downloads/sqoop/

It seems to me (it would be pretty efficient) to dump to HDFS and have
something like Data Import Handler be able to read from hdfs:// directly ...

Has this route been discussed / developed before (ie DIH w/ hdfs:// handler)?

- Jon

On Jun 22, 2010, at 12:29 PM, MitchK wrote:

> 
> I wanted to add a Jira issue about exactly what Otis is asking here.
> Unfortunately, I haven't had time for it because of my exams.
> 
> However, I'd like to add a question to Otis' ones:
> If you distribute the indexing process this way, are you able to replicate
> the different documents correctly?
> 
> Thank you.
> - Mitch
> 
> 
> Otis Gospodnetic-2 wrote:
>> 
>> Stu,
>> 
>> Interesting!  Can you provide more details about your setup?  By "load
>> balance the indexing stage" you mean "distribute the indexing process",
>> right?  Do you simply take your content to be indexed, split it into N
>> chunks where N matches the number of TaskNodes in your Hadoop cluster and
>> provide a map function that does the indexing?  What does the reduce
>> function do?  Does that call IndexWriter.addAllIndexes or do you do that
>> outside Hadoop?
>> 
>> Thanks,
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> 
>> ----- Original Message ----
>> From: Stu Hood <stuh...@webmail.us>
>> To: solr-user@lucene.apache.org
>> Sent: Monday, January 7, 2008 7:14:20 PM
>> Subject: Re: solr with hadoop
>> 
>> As Mike suggested, we use Hadoop to organize our data en route to Solr.
>> Hadoop allows us to load balance the indexing stage, and then we use
>> the raw Lucene IndexWriter.addAllIndexes method to merge the data to be
>> hosted on Solr instances.
>> 
>> Thanks,
>> Stu
>> 
>> 
>> -----Original Message-----
>> From: Mike Klaas <mike.kl...@gmail.com>
>> Sent: Friday, January 4, 2008 3:04pm
>> To: solr-user@lucene.apache.org
>> Subject: Re: solr with hadoop
>> 
>> On 4-Jan-08, at 11:37 AM, Evgeniy Strokin wrote:
>> 
>>> I have huge index base (about 110 millions documents, 100 fields
>>> each). But size of the index base is reasonable, it's about 70 Gb.
>>> All I need is increase performance, since some queries, which match
>>> big number of documents, are running slow.
>>> So I was thinking is any benefits to use hadoop for this? And if
>>> so, what direction should I go? Is anybody did something for
>>> integration Solr with Hadoop? Does it give any performance boost?
>>> 
>> Hadoop might be useful for organizing your data enroute to Solr, but
>> I don't see how it could be used to boost performance over a huge
>> Solr index.  To accomplish that, you need to split it up over two
>> machines (for which you might find hadoop useful).
>> 
>> -Mike
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/solr-with-hadoop-tp482688p914589.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Nested table support ability

2010-06-23 Thread Otis Gospodnetic
Amit,

I'd say it depends on the types of queries you need to run.  Maybe you 
mentioned that already, but your reply cut it off (Nabble).  I can say this 
with certainty: 1M is a small number and 30 fields is not a big deal.
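For what it's worth, once each customer is indexed as a single document (with
the child-table values folded into its fields), a parametric search is mostly a
matter of filter queries, e.g. (field names are made up for illustration):

  http://localhost:8983/solr/select?q=*:*&fq=status:active&fq=country:US&fq=orders:[5+TO+*]

Each fq is cached independently, so repeated combinations of the same filters
stay cheap even at 1M docs.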
 Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: amit_ak 
> To: solr-user@lucene.apache.org
> Sent: Wed, June 23, 2010 2:00:50 AM
> Subject: Re: Nested table support ability
> 
> 
Hi Otis, Thanks for the update.

My paramteric search has to span across customer table and 30 child tables.
We have close to 1 million customers. Do you think Lucene/Solr is the right
fsolution for such requirements? or database search would be more optimal.

Regards,
Amit

--
View this message in context:
http://lucene.472066.n3.nabble.com/Nested-table-support-ability-tp905253p916087.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Non-prefix, hierarchical autocomplete? Would SOLR-1316 work? Solritas?

2010-06-23 Thread Otis Gospodnetic
Hi Andy,

I didn't check out SOLR-1316 yet, other than looking at the comments.  Sounds
more complicated than it should be, but maybe it's great and I really need to
try it.
Solritas uses TermsComponent, which should work well for individual terms
(which country and city names are not, unless you index them as single
tokens).
I don't think there is anything that will do everything you need out of the box.
You can get autocompletion on the country field, but you then need to do a bit 
of JS work to restrict cities to the country specified in the country field.  
Actually, now that I wrote this, I think we did something very much like that 
with http://sematext.com/products/autocomplete/index.html .
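If you roll your own, one rough sketch for the country -> city restriction with
plain Solr is facet.prefix rather than TermsComponent (TermsComponent can't be
filtered by another field); field names here are made up:

  http://localhost:8983/solr/select?q=*:*&rows=0&fq=country:France&facet=true&facet.field=city&facet.prefix=par&facet.limit=10

The fq scopes suggestions to the selected country, and the same pattern nests
one more level for city -> neighborhood. It is still prefix-only, though, so it
doesn't cover the non-prefix matching requirement.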
Finally, for dealing with commas or spaces as tag separators, you can peek at
the JS in a service like delicious.com and see how they do it.  Their
implementation of tag entry is nice.

And here is another slick auto-complete with extra niceness in the search form 
itself, from one of our customers: http://www.etsy.com/explorer 
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Andy 
> To: solr-user@lucene.apache.org
> Sent: Sat, June 19, 2010 3:28:15 AM
> Subject: Non-prefix, hierarchical autocomplete? Would SOLR-1316 work? Solritas?
> 
Hi,

I've seen some posts on using SOLR-1316 or Solritas for autocomplete. Wondered
what is the best solution for my use case:

1) I would like to have a "hierarchical" autocomplete. For example, I have a
"Country" dropdown list and a "City" textbox. A user would select a country
from the dropdown list, and then type out the City in the textbox. Based on
which country he selected, I want to limit the autocomplete suggestions to
cities that are relevant for the selected country.

This hierarchy could be multi-level. For example, there may be a
"Neighborhood" textbox. The autocomplete suggestions for "Neighborhood" would
be limited to neighborhoods that are relevant for the city entered by the user
in the "City" textbox.

2) I want to have autocomplete suggestions that include non-prefix matches.
For example, if the user types "auto", the autocomplete suggestions should
include terms such as "automata" and "build automation".

3) I'm doing autocomplete for tags. I would like to allow multi-word tags and
use comma (",") as a separator for tags. So when the user hits the space bar,
he is still typing out the same tag, but when he hits the comma key, he's
starting a new tag.

Would SOLR-1316 or Solritas work for the above requirements? If they do, how
do I set it up? I can't really find much documentation on SOLR-1316 or
Solritas in this area.

Thanks.


Re: Indexing Different Types

2010-06-23 Thread Otis Gospodnetic
Stephen,

Sure, multiple cores, one for each type is one approach.  Another one is just 
adding a 'type' field and restricting auto-completion by type.  In our AC 
implementation we have a piece made for very similar situations, where you have 
multiple types of entities, but want a single input field (search box) to give 
you suggestions from all entity types, yet have suggestions for different types 
visually grouped together.  I don't think we have a demo of that anywhere, 
though you can see AC in action on http://search-lucene.com/ for example.
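If you go the single-core 'type' field route, a minimal sketch (field names are
illustrative only) is to index each suggestible item as its own small document:

  <add>
    <doc>
      <field name="suggest">sports equipment</field>
      <field name="type">category</field>
    </doc>
    <doc>
      <field name="suggest">10 Downing Street</field>
      <field name="type">address</field>
    </doc>
  </add>

Each auto complete form then adds its own filter, e.g. ...&fq=type:category, on
top of a facet.prefix-style suggestion query (TermsComponent ignores fq, so for
that you'd want the separate-cores approach instead).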
 
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Divine Mercy 
> To: solr-user@lucene.apache.org
> Sent: Mon, June 21, 2010 4:59:55 PM
> Subject: Indexing Different Types
> 
> 
Hi,

I have a requirement and I am wondering what is the best way to handle this
through Solr.

I have different types of unrelated data, for example categories, tags and
some address information.

I would like to implement auto complete on this information, so there would be
an auto complete form for each one.

What would be the best way for implementing this using SOLR?

Would this be using multiple indexes, one index for tags, categories and
address?

Regards

Stephen
