Here's a script that converts the RankLib LambdaMART XML into a JSON model
that can be uploaded directly into Solr:

https://github.com/ryac/lambdamart-xml-to-json

It'll map the feature IDs to the names found in the feature-store as well.
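
A minimal sketch of that ID-to-name mapping, not taken from the repository
above. It assumes the feature store is a JSON list of objects that each have
a "name" key, and that the features were written to the training file in
that same order, so the first entry is feature 1. The function name is
hypothetical.

```python
import json


def feature_id_to_name(feature_store_json):
    """Map 1-based training-data feature IDs to feature-store names.

    Assumption: the order of entries in the feature store matches the
    column order used when the training data was generated.
    """
    features = json.loads(feature_store_json)
    return {i: f["name"] for i, f in enumerate(features, start=1)}
```

If the training columns were produced in a different order, the mapping has
to be built from whatever produced them instead.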

I got this error when uploading the model:

Model type does not exist
org.apache.solr.ltr.model.MultipleAdditiveTreesModel

Once I wrote the values as strings instead of floats, it uploaded fine.
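
For anyone writing their own converter, here is a rough sketch of the
conversion with every number kept as a string. It is an illustration, not the
code from the repository above. The function names are hypothetical, the JSON
keys follow the MultipleAdditiveTreesModel example in the Solr Reference
Guide, and it assumes any non-XML header lines have already been removed from
the RankLib file and that feature_names maps each 1-based feature ID to its
feature-store name.

```python
import xml.etree.ElementTree as ET

SOLR_MODEL_CLASS = "org.apache.solr.ltr.model.MultipleAdditiveTreesModel"


def convert_split(split, feature_names):
    """Recursively convert one RankLib <split> element to a Solr tree node."""
    output = split.find("output")
    if output is not None:
        # Leaf node: keep the value as a string rather than a float.
        return {"value": output.text.strip()}
    feature_id = int(split.find("feature").text)
    node = {
        "feature": feature_names[feature_id],
        "threshold": split.find("threshold").text.strip(),
    }
    # Child splits carry pos="left" or pos="right".
    for child in split.findall("split"):
        node[child.get("pos")] = convert_split(child, feature_names)
    return node


def ranklib_xml_to_solr_model(xml_text, feature_names, model_name):
    """Build a dict that can be serialized to JSON for upload."""
    ensemble = ET.fromstring(xml_text)
    trees = [
        {
            "weight": tree.get("weight"),  # XML attributes are already strings
            "root": convert_split(tree.find("split"), feature_names),
        }
        for tree in ensemble.findall("tree")
    ]
    return {
        "class": SOLR_MODEL_CLASS,
        "name": model_name,
        "features": [{"name": n} for _, n in sorted(feature_names.items())],
        "params": {"trees": trees},
    }
```

Passing the result through json.dumps gives the text to upload. Because the
weights, thresholds and leaf values are never parsed as numbers, they come out
quoted.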

Ryan




On Mon, 24 Jul 2017 at 12:14 Ryan Yacyshyn <ryan.yacys...@gmail.com> wrote:

> Thanks Doug, this is helpful.
>
> I also started something last night to output JSON for Solr; I'll post
> it up as well.
>
> Ryan
>
>
>
>
> On Sun, 23 Jul 2017 at 23:48 Doug Turnbull <
> dturnb...@opensourceconnections.com> wrote:
>
>> Yes, you're correct that the feature is the 1-based identifier from your
>> training data.
>>
>> As for a script: not one for Solr exactly, but when developing the
>> Elasticsearch plugin I started to work on a JSON serialization format, and
>> as part of that built a Python script for reading the RankLib XML and
>> outputting to my own JSON format. It could be helpful to you or anyone
>> constructing a script:
>>
>>
>> https://github.com/o19s/elasticsearch-learning-to-rank/blob/7426858c2afb168ac426cab6d857fddccb9c26fc/demo/ranklibToJson.py
>>
>> On Sun, Jul 23, 2017 at 7:18 AM Ryan Yacyshyn <ryan.yacys...@gmail.com>
>> wrote:
>>
>> > Hi everyone,
>> >
>> > I'm trying out the LTR plugin and have a couple of questions when it comes
>> > to converting the LambdaMART XML to JSON. Below is a snippet of the model
>> > generated from RankLib:
>> >
>> > <ensemble>
>> >   <tree id="1" weight="0.1">
>> >     <split>
>> >       <feature> 10 </feature>
>> >       <threshold> 0.28156844 </threshold>
>> >       <split pos="left">
>> >         <feature> 11 </feature>
>> >         <threshold> 7.111111 </threshold>
>> >         <split pos="left">
>> >           <feature> 7 </feature>
>> >           <threshold> 2.2759523 </threshold>
>> >           <split pos="left">
>> >             <output> 0.8436763 </output>
>> >           </split>
>> >           <split pos="right">
>> >             <output> 1.4320849 </output>
>> >           </split>
>> >         </split>
>> > ----------------------
>> >
>> > And a sample of training data:
>> >
>> > 1 qid:1 1:0.0 2:1.0 3:0.0 4:0.0 5:0.0 6:0.0 7:19.496738 8:0.0 9:0.0
>> > 10:0.08307255 11:7.111111 #docId: oeqzg5-165248
>> >
>> > It's probably obvious, but I just want to check that the <feature> node in
>> > the XML refers to the feature ID in my training set, and that it's possible
>> > for the model not to use all the features in the training data?
>> >
>> > I'll be mapping these feature IDs to the names I gave them in the feature
>> > store in Solr.
>> >
>> > Is there a script or utility already out there to convert the XML to
>> > this Solr JSON format? The closest thing I found was this:
>> > https://sourceforge.net/p/lemur/feature-requests/144/
>> >
>> > Thanks for your help!
>> >
>> > Ryan
>> >
>>
>
