date:20090912

Re: Extract info from parent node during data import

2009-09-12 Thread Noble Paul നോബിള്‍ नोब्ळ्

On Sat, Sep 12, 2009 at 12:24 PM, Fergus McMenemie  wrote:
>>On Fri, Sep 11, 2009 at 6:48 AM, venn hardy  wrote:
>>>
>>> Hi Fergus,
>>>
>>> When I debugged in the development console 
>>> http://localhost:9080/solr/admin/dataimport.jsp?handler=/dataimport
>>>
>>> I had no problems. Each category/item seems to be only indexed once, and no 
>>> parent fields are available (except the category name).
>>>
>>> I am not entirely sure how the forEach statement works, but my 
>>> interpretation of forEach="/document/category/item | /document/category" is 
>>> something like this:
>>>
>>> 1. Whenever DIH encounters a document/category it will extract the
>>> /document/category/name field as a common field
>>> 2. Whenever DIH encounters a document/category/item it will extract all of 
>>> the item fields.
>>> 3. When all fields have been encountered, save the document in solr and go 
>>> to the next category/item
>>
>>/document/category/item | /document/category
>>
>>means there are two paths which trigger a new doc (it is possible to
>>have more). Whenever it encounters the closing tag of that xpath, it
>>emits all the fields it collected since the opening of the same tag.
>>After that it clears all the fields it collected since the opening of
>>the tag.
>>
>>If there are fields it collected before the opening of the same tag, it
>>retains them.
>
>
> Nice and clear, but that is not what I see.
>
> With my test case with forEach="/record | /record/mediaBlock"
> I see that each /record/mediaBlock "document" indexed contains all
> fields from the parent "/record" document as well. A search over
> mediaBlocks returns lots of extra fields from the parent which did not
> have the commonField attribute. I will try to produce a test case.

Yes, it does. /record/mediaBlock will have all the fields collected
from /record as well. It is by design.
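
The emit-and-clear behaviour described above can be modelled with a small toy in Java. This is only a sketch of the rule as stated in this thread, not the DataImportHandler's own code: the class name ForEachSketch, its methods, and the field names "title" and "media" are all made up for illustration. Each open forEach element gets its own frame of fields; closing it emits everything collected so far (parent frames included) and then forgets only the fields collected since it opened.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not DIH code): one frame of fields per open forEach element.
class ForEachSketch {
    private final Deque<Map<String, String>> frames = new ArrayDeque<>();
    private final List<Map<String, String>> emitted = new ArrayList<>();

    // Opening tag of an element matched by one of the forEach xpaths.
    void open() {
        frames.push(new LinkedHashMap<>());
    }

    // A field value seen inside the innermost open forEach element.
    void field(String name, String value) {
        frames.peek().put(name, value);
    }

    // Closing tag: emit parent fields plus own fields, then drop only the own fields.
    void close() {
        Map<String, String> doc = new LinkedHashMap<>();
        Iterator<Map<String, String>> outerToInner = frames.descendingIterator();
        while (outerToInner.hasNext()) {
            doc.putAll(outerToInner.next());
        }
        emitted.add(doc);
        frames.pop();
    }

    List<Map<String, String>> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        ForEachSketch s = new ForEachSketch();
        s.open();                       // <record>
        s.field("title", "Record 1");
        s.open();                       // <mediaBlock>
        s.field("media", "a.jpg");
        s.close();                      // </mediaBlock>
        s.open();                       // <mediaBlock>
        s.field("media", "b.jpg");
        s.close();                      // </mediaBlock>
        s.close();                      // </record>
        s.emitted().forEach(System.out::println);
    }
}
```

In this model each mediaBlock document carries the record's "title" along with its own "media" value, which matches what Fergus reports seeing.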
>
>
>>>
>>>
 Date: Thu, 10 Sep 2009 14:19:31 +0100
 To: solr-user@lucene.apache.org
 From: fer...@twig.me.uk
 Subject: RE: Extract info from parent node during data import

 >Hi Paul,
 >The forEach="/document/category/item | /document/category/name" didn't 
 >work (no categoryname was stored or indexed).
 >However forEach="/document/category/item | /document/category" seems to 
 >work well. I am not sure why category on its own works, but not 
 >category/name...
 >But thanks for tip. It wasn't as painful as I thought it would be.
 >Venn

 Hmmm, I had bother with this. Although each occurrence of
 /document/category/item causes a new Solr document to be indexed, that
 document contained all the fields from the parent element as well.

 Did you see this?

 >
 >> From: noble.p...@corp.aol.com
 >> Date: Thu, 10 Sep 2009 09:58:21 +0530
 >> Subject: Re: Extract info from parent node during data import
 >> To: solr-user@lucene.apache.org
 >>
 >> try this
 >>
 >> add two xpaths in your forEach
 >>
 >> forEach="/document/category/item | /document/category/name"
 >>
 >> and add a field as follows
 >>
 >> <field ... commonField="true"/>
 >>
 >> Please try it out and let me know.
 >>
 >> On Thu, Sep 10, 2009 at 7:30 AM, venn hardy  
 >> wrote:
 >> >
 >> > Hello,
 >> >
 >> >
 >> >
 >> > I am using SOLR 1.4 (from a nightly build) and its URLDataSource in 
 >> > conjunction with the XPathEntityProcessor. I have successfully 
 >> > imported XML content, but I think I may have found a limitation when 
 >> > it comes to the commonField attribute in the DataImportHandler.
 >> >
 >> >
 >> >
 >> > Before writing my own parser to read in a whole XML document, I 
 >> > thought I'd post the question here (since I got some great advice 
 >> > last time).
 >> >
 >> >
 >> >
 >> > The bulk of my content is contained within each <item> tag. However, 
 >> > each item has a parent called <category> and each category has a name 
 >> > which I would like to import. In my forEach loop I specify the 
 >> > /document/category/item as the collection of items I am interested 
 >> > in. Is there any way to extract an element from underneath a parent 
 >> > node? To be more specific (see the example XML below), I would like to 
 >> > index the following:
 >> >
 >> > - category: Category 1; id: 1; author: Author 1
 >> >
 >> > - category: Category 1; id: 2; author: Author 2
 >> >
 >> > - category: Category 2; id: 3; author: Author 3
 >> >
 >> > - category: Category 2; id: 4; author: Author 4
 >> >
 >> >
 >> >
 >> > Any ideas on how I can get to a parent node from within a child 
 >> > during data import? If it can't be done, what do you suggest would be 
 >> > the best way so I can keep using the DataImportHandler... would XSLT 
 >> > be a good idea to 'flatten out' the structure a bit?
 >> >
 >> >

Re: Single Core or Multiple Core?

2009-09-12 Thread Uri Boness


+1
Can you add a JIRA issue for that so we can vote for it?

Chris Hostetter wrote:

: > For the record: even if you're only going to have one SolrCore, using the
: > multicore support (ie: having a solr.xml file) might prove handy from a
: > maintenance standpoint ... the ability to configure new "on deck cores" with
...
: Yeah, it is a shame that single-core deployments (no solr.xml) does not have
: a way to enable CoreAdminHandler. This is something we should definitely
: look at in Solr 1.5.

I think the most straightforward starting point is to switch how we 
structure the examples so that all of the examples use a solr.xml with 
multicore support.


Then we can move forward on deprecating the specification of "Solr Home" 
using JNDI/systemvars and switch to having the location of the solr.xml be 
the one master config option with everything else coming after that.




-Hoss

Re: Facet Response Structure

2009-09-12 Thread smock


As to point 1 - this is not a problem with the response structure I've
outlined.  This is exactly the problem I'm trying to solve.  NULL is not a
value in the field, it is a placeholder to indicate how many documents the
field does not exist for.  In my example response structure above, 'missing'
is placed outside of the 'facets' list, clearing up the confusion. 
'missing' could indeed be a facet value without any collisions.

To point 2 - I understand it would cause compatibility issues, that is why I
was suggesting it be incorporated into the next SOLR release.  I'd also be
willing to work 

Regarding the stats component, it does not do what you think it does.  It
reports a count of all values, not distinct values.  The stats component
also strictly works on numeric fields, which would make it impossible to use
in a lot of cases where the FacetComponent does work.
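
To make point 1 concrete, here is a small Java sketch of the proposed shape. It is only an illustration and not Solr code: FacetShapeSketch, FacetField and fromFlat are hypothetical names, and the counts are invented. It converts the current flat list, where a null key carries the missing count, into the proposed structure, and shows that a real field value spelled "missing" stays inside 'facets' without touching the separate 'missing' count.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the proposed response shape; none of this is Solr API.
class FacetShapeSketch {

    // Proposed shape: real values live in 'facets', the no-value count in 'missing'.
    static final class FacetField {
        final Map<String, Integer> facets = new LinkedHashMap<>();
        int missing;
    }

    // Convert the current flat list, where a null key marks the missing count.
    static FacetField fromFlat(List<Map.Entry<String, Integer>> flat) {
        FacetField out = new FacetField();
        for (Map.Entry<String, Integer> e : flat) {
            if (e.getKey() == null) {
                out.missing = e.getValue();
            } else {
                out.facets.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> flat = new ArrayList<>();
        flat.add(new SimpleEntry<String, Integer>("value1", 7));
        // A real field value that happens to be the string "missing".
        flat.add(new SimpleEntry<String, Integer>("missing", 2));
        // The placeholder entry: documents with no value for the field.
        flat.add(new SimpleEntry<String, Integer>(null, 5));

        FacetField field = fromFlat(flat);
        System.out.println(field.facets + " missing=" + field.missing);
    }
}
```

Because 'facets' and 'missing' sit at different levels, a client never has to special-case a null entry in the value list.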


Shalin Shekhar Mangar wrote:
> 
> On Sat, Sep 12, 2009 at 1:20 AM, smock  wrote:
> 
>>
>> I'd like to propose a change to the facet response structure.  Currently,
>> it looks like:
>>
>> {'facet_fields':{'field1':[('value1',count1),('value2',count2),(null,missingCount)]}}
>>
>> My immediate problem with this structure is that null is not of the same
>> type as the 'value's.  Also, the meaning of the (null,missingCount) tuple
>> is not the same as the meaning of the ('value',count) tuples; it is a
>> special case to represent the documents for which the field has no value.
>> I'd like to propose changing the response to:
>>
>> {'facet_fields':{'field1':{'facets':[('value1',count1),('value2',count2)],'missing':missingCount}}}
>>
>>
> Well, there are two problems:
> 1. 'missing' can be a value in the field
> 2. Facet support has been there for a long time. This would break
> compatibility with existing clients.
> 
> 
>>
>> In addition to cleaning up the 'null' issue mentioned above, I think this
>> will allow for greater flexibility moving forward with the facet
>> component. For instance, it would be great if the FacetComponent could
>> add an optional count of the 'hits', or number of distinct facet values
>> contained in the query result.  If the facet request has a limit on it,
>> this number is not available via a count of the returned facet values.
>> The response structure I've outlined above could accommodate this piece
>> of metadata very easily:
>>
>> {'facet_fields':{'field1':{'facets':[('value1',count1),('value2',count2)],'missing':missingCount,'hits':hitsCount}}}
>>
>>
> Have you looked at StatsComponent? It gives counts for total distinct
> values and a count of documents missing a value, among other things:
> 
> http://wiki.apache.org/solr/StatsComponent
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Facet-Response-Structure-tp25407363p25414267.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: "standard" requestHandler components

2009-09-12 Thread michael8


Hi Jay,

I got it from reading your response.  I did browse around in solrconfig.xml
but could not find any components configured for 'standard', and didn't
realize that there are 'defaults' hardwired.  Thanks for your quick &
detailed response and also your additional tip on spellcheck config.  You
saved me lots of time on trial-&-error.

Regards,
Michael


Jay Hill wrote:
> 
> RequestHandlers are configured in solrconfig.xml. If no components are
> explicitly declared in the request handler config then the defaults are
> used. They are:
> - QueryComponent
> - FacetComponent
> - MoreLikeThisComponent
> - HighlightComponent
> - StatsComponent
> - DebugComponent
> 
> If you wanted to have a custom list of components (either omitting
> defaults or adding custom) you can specify the components for a handler
> directly:
> 
>   <arr name="components">
>     <str>query</str>
>     <str>facet</str>
>     <str>mlt</str>
>     <str>highlight</str>
>     <str>debug</str>
>     <str>someothercomponent</str>
>   </arr>
> 
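> You can add components before or after the main ones like this:
> 
>   <arr name="first-components">
>     <str>mycomponent</str>
>   </arr>
> 
>   <arr name="last-components">
>     <str>myothercomponent</str>
>   </arr>
> 
> and that's how the spell check component can be added:
> 
>   <arr name="last-components">
>     <str>spellcheck</str>
>   </arr>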
> 
> Note that a component (except the defaults) must be configured in
> solrconfig.xml with the name used in the str element as well.
> 
> Have a look at the solrconfig.xml in the example directory
> (".../example/solr/conf/") for examples on how to set up the spellcheck
> component, and on how the request handlers are configured.
> 
> -Jay
> http://www.lucidimagination.com
> 
> 
> On Fri, Sep 11, 2009 at 3:04 PM, michael8  wrote:
> 
>>
>> Hi,
>>
>> I have a newbie question about the 'standard' requestHandler in
>> solrconfig.xml.  What I'd like to know is where the config information
>> for this requestHandler is kept.  When I go to
>> http://localhost:8983/solr/admin, I see the following info, but am
>> curious where the supposedly 'chained' components (e.g. QueryComponent,
>> FacetComponent, MoreLikeThisComponent) are configured for this
>> requestHandler.  I see timing and process debug output from these
>> components with "debugQuery=true", so somewhere these components must
>> have been configured for this 'standard' requestHandler.
>>
>> name:standard
>> class:  org.apache.solr.handler.component.SearchHandler
>> version:$Revision: 686274 $
>> description:Search using components:
>>
>> org.apache.solr.handler.component.QueryComponent,org.apache.solr.handler.component.FacetComponent,org.apache.solr.handler.component.MoreLikeThisComponent,org.apache.solr.handler.component.HighlightComponent,org.apache.solr.handler.component.DebugComponent,
>> stats:  handlerStart : 1252703405335
>> requests : 3
>> errors : 0
>> timeouts : 0
>> totalTime : 201
>> avgTimePerRequest : 67.0
>> avgRequestsPerSecond : 0.015179728
>>
>>
>> What I'd like to do from understanding this is to properly integrate
>> the spellcheck component into the standard requestHandler as suggested
>> in a Solr spellcheck example.
>>
>> Thanks for any info in advance.
>> Michael
>> --
>> View this message in context:
>> http://www.nabble.com/%22standard%22-requestHandler-components-tp25409075p25409075.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/%22standard%22-requestHandler-components-tp25409075p25414682.html
Sent from the Solr - User mailing list archive at Nabble.com.
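
As a rough illustration of the rule Jay describes in the quoted message (an explicit component list replaces the defaults, otherwise first and last components wrap the defaults), here is a minimal Java sketch. It is a toy model written for this thread, not SearchHandler's actual code: ComponentChainSketch and resolve are hypothetical names, and the short names in DEFAULTS are assumed to correspond to the default components Jay lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of how a handler's component chain could be resolved.
class ComponentChainSketch {

    // Assumed short names for the default components listed in the thread.
    static final List<String> DEFAULTS =
        List.of("query", "facet", "mlt", "highlight", "stats", "debug");

    // components == null means no explicit list was declared for the handler.
    static List<String> resolve(List<String> components,
                                List<String> first,
                                List<String> last) {
        if (components != null) {
            // An explicit list replaces the defaults entirely.
            return new ArrayList<>(components);
        }
        List<String> chain = new ArrayList<>(first);
        chain.addAll(DEFAULTS);
        chain.addAll(last);
        return chain;
    }

    public static void main(String[] args) {
        // Spellcheck appended after the defaults, as in the last-components example.
        System.out.println(resolve(null, List.of(), List.of("spellcheck")));
        // An explicit list: only these two components run.
        System.out.println(resolve(List.of("query", "facet"), List.of(), List.of()));
    }
}
```

The sketch simply ignores first and last when an explicit list is given; how Solr itself treats that combination is not covered in this thread.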

Re: Solr SVN build problem

2009-09-12 Thread Ryan McKinley


Should be fixed in trunk.  Try updating and see if it works for you.

See:
https://issues.apache.org/jira/browse/SOLR-1424



On Sep 9, 2009, at 8:12 PM, Allahbaksh Asadullah wrote:


Hi,
I am building Solr from source. During the build I am getting the
following error.

generate-maven-artifacts:
    [mkdir] Created dir: c:\Downloads\solr_trunk\build\maven
    [mkdir] Created dir: c:\Downloads\solr_trunk\dist\maven
     [copy] Copying 1 file to c:\Downloads\solr_trunk\build\maven\c:\Downloads\solr_trunk\src\maven

BUILD FAILED
c:\Downloads\solr_trunk\build.xml:741: The following error occurred while executing this line:
c:\Downloads\solr_trunk\common-build.xml:261: Failed to copy c:\Downloads\solr_trunk\src\maven\solr-parent-pom.xml.template to c:\Downloads\solr_trunk\build\maven\c:\Downloads\solr_trunk\src\maven\solr-parent-pom.xml.template due to java.io.FileNotFoundException c:\Downloads\solr_trunk\build\maven\c:\Downloads\solr_trunk\src\maven\solr-parent-pom.xml.template (The filename, directory name, or volume label syntax is incorrect)

Regards,
Allahbaksh

Re: Single Core or Multiple Core?

2009-09-12 Thread Jonathan Ariel

What do you mean by "single-core deployments does not have a way to enable
CoreAdminHandler"? I'm just trying to understand the feature that you are
talking about.

On Sat, Sep 12, 2009 at 6:44 AM, Uri Boness  wrote:

> +1
> Can you add a JIRA issue for that so we can vote for it?
>
>
> Chris Hostetter wrote:
>
>> : > For the record: even if you're only going to have one SolrCore, using
>> : > the multicore support (ie: having a solr.xml file) might prove handy
>> : > from a maintenance standpoint ... the ability to configure new "on
>> : > deck cores" with
>>...
>> : Yeah, it is a shame that single-core deployments (no solr.xml) does not
>> : have a way to enable CoreAdminHandler. This is something we should
>> : definitely look at in Solr 1.5.
>>
>> I think the most straightforward starting point is to switch how we
>> structure the examples so that all of the examples use a solr.xml with
>> multicore support.
>>
>> Then we can move forward on deprecating the specification of "Solr Home"
>> using JNDI/systemvars and switch to having the location of the solr.xml be
>> the one master config option with everything else coming after that.
>>
>>
>>
>> -Hoss
>>
>>
>>
>>
>

Re: Highlighting in SolrJ?

2009-09-12 Thread Jay Hill

Will do Shalin.

-Jay
http://www.lucidimagination.com


On Fri, Sep 11, 2009 at 9:23 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> Jay, it would be great if you can add this example to the Solrj wiki:
>
> http://wiki.apache.org/solr/Solrj
>
> On Fri, Sep 11, 2009 at 5:15 AM, Jay Hill  wrote:
>
> > Set up the query like this to highlight a field named "content":
> >
> >    SolrQuery query = new SolrQuery();
> >    query.setQuery("foo");
> >
> >    query.setHighlight(true).setHighlightSnippets(1); // set other params as needed
> >    query.setParam("hl.fl", "content");
> >
> >    QueryResponse queryResponse = getSolrServer().query(query);
> >
> > Then to get back the highlight results you need something like this:
> >
> >    Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
> >
> >    while (iter.hasNext()) {
> >      SolrDocument resultDoc = iter.next();
> >
> >      String content = (String) resultDoc.getFieldValue("content");
> >      String id = (String) resultDoc.getFieldValue("id"); // id is the uniqueKey field
> >
> >      if (queryResponse.getHighlighting().get(id) != null) {
> >        List<String> highlightSnippets =
> >            queryResponse.getHighlighting().get(id).get("content");
> >      }
> >    }
> >
> > Hope that gets you what you need.
> >
> > -Jay
> > http://www.lucidimagination.com
> >
> > On Thu, Sep 10, 2009 at 3:19 PM, Paul Tomblin 
> wrote:
> >
> > > Can somebody point me to some sample code for using highlighting in
> > > SolrJ?  I understand the highlighted versions of the field comes in a
> > > separate NamedList?  How does that work?
> > >
> > > --
> > > http://www.linkedin.com/in/paultomblin
> > >
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
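
For anyone reading this without a Solr instance at hand, the lookup in the quoted example boils down to a nested map keyed first by the document's uniqueKey and then by field name, which is the shape QueryResponse.getHighlighting() returns (Map<String, Map<String, List<String>>>). The sketch below uses plain Java collections as a stand-in for a real response; HighlightLookupSketch, the snippets helper, and the sample ids and text are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the highlighting section of a response; no SolrJ classes are used here.
class HighlightLookupSketch {

    // Returns the snippets for (id, field), or an empty list when there are none.
    static List<String> snippets(Map<String, Map<String, List<String>>> highlighting,
                                 String id,
                                 String field) {
        Map<String, List<String>> perDoc = highlighting.get(id);
        if (perDoc == null) {
            return Collections.<String>emptyList();
        }
        List<String> found = perDoc.get(field);
        return found == null ? Collections.<String>emptyList() : found;
    }

    public static void main(String[] args) {
        // Hypothetical data in the same shape as the highlighting map.
        Map<String, Map<String, List<String>>> highlighting = new HashMap<>();
        highlighting.put("doc1", Map.of("content", List.of("some <em>foo</em> text")));

        System.out.println(snippets(highlighting, "doc1", "content"));
        System.out.println(snippets(highlighting, "doc2", "content"));
    }
}
```

Wrapping the two null checks in one helper keeps the calling loop free of nested conditionals when a document or field has no highlights.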

Re: Single Core or Multiple Core?

2009-09-12 Thread Shalin Shekhar Mangar

On Sat, Sep 12, 2009 at 9:45 PM, Jonathan Ariel  wrote:

> What do you mean by "single-core deployments does not have a way to enable
> CoreAdminHandler"? I'm just trying to understand the feature that you are
> talking about.
>
>
I'm talking about the core related commands described here:

http://wiki.apache.org/solr/CoreAdmin

-- 
Regards,
Shalin Shekhar Mangar.

Re: Facet Response Structure

2009-09-12 Thread Shalin Shekhar Mangar

On Sat, Sep 12, 2009 at 6:29 PM, smock  wrote:

>
> As to point 1 - this is not a problem with the response structure I've
> outlined.  This is exactly the problem I'm trying to solve.  NULL is not a
> value in the field, it is a placeholder to indicate how many documents the
> field does not exist for.  In my example response structure above,
> 'missing' is placed outside of the 'facets' list, clearing up the
> confusion. 'missing' could indeed be a facet value without any collisions.
>
>
You are right, I missed that.


> To point 2 - I understand it would cause compatibility issues, that is why
> I was suggesting it be incorporated into the next SOLR release.  I'd also
> be willing to work
>
>
I'm not convinced that it is something that needs to be changed. I'm also
not sure about the right way to deprecate a widely used response format. Go
ahead and raise an issue if you want and we can collect thoughts from
others.


> Regarding the stats component, it does not do what you think it does.  It
> reports a count of all values, not distinct values.  The stats component
> also strictly works on numeric fields, which would make it impossible to
> use in a lot of cases where the FacetComponent does work.
>
>
Yes, my bad. Though it does report the count of missing values.

-- 
Regards,
Shalin Shekhar Mangar.

Re: Highlighting in SolrJ?

2009-09-12 Thread Shalin Shekhar Mangar

Thanks Jay!

On Sat, Sep 12, 2009 at 10:03 PM, Jay Hill  wrote:

> Will do Shalin.
>
> -Jay
> http://www.lucidimagination.com
>
>
> On Fri, Sep 11, 2009 at 9:23 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
> > Jay, it would be great if you can add this example to the Solrj wiki:
> >
> > http://wiki.apache.org/solr/Solrj
> >
> > On Fri, Sep 11, 2009 at 5:15 AM, Jay Hill 
> wrote:
> >
> > > Set up the query like this to highlight a field named "content":
> > >
> > >    SolrQuery query = new SolrQuery();
> > >    query.setQuery("foo");
> > >
> > >    query.setHighlight(true).setHighlightSnippets(1); // set other params as needed
> > >    query.setParam("hl.fl", "content");
> > >
> > >    QueryResponse queryResponse = getSolrServer().query(query);
> > >
> > > Then to get back the highlight results you need something like this:
> > >
> > >    Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
> > >
> > >    while (iter.hasNext()) {
> > >      SolrDocument resultDoc = iter.next();
> > >
> > >      String content = (String) resultDoc.getFieldValue("content");
> > >      String id = (String) resultDoc.getFieldValue("id"); // id is the uniqueKey field
> > >
> > >      if (queryResponse.getHighlighting().get(id) != null) {
> > >        List<String> highlightSnippets =
> > >            queryResponse.getHighlighting().get(id).get("content");
> > >      }
> > >    }
> > >
> > > Hope that gets you what you need.
> > >
> > > -Jay
> > > http://www.lucidimagination.com
> > >
> > > On Thu, Sep 10, 2009 at 3:19 PM, Paul Tomblin 
> > wrote:
> > >
> > > > Can somebody point me to some sample code for using highlighting in
> > > > SolrJ?  I understand the highlighted versions of the field comes in a
> > > > separate NamedList?  How does that work?
> > > >
> > > > --
> > > > http://www.linkedin.com/in/paultomblin
> > > >
> > >
> >
> >
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
> >
>



-- 
Regards,
Shalin Shekhar Mangar.
