After a whole lot of facet-wrangling, I've come up with a practical
solution that suits my situation: index each (workflow, process,
status) triple as a series of cumulative paths, with one field per
ordering I want to facet on. For example, if the "shelve" process of
the "accessionWF" workflow is "completed," it gets indexed as:

<field name="wf_wps">accessionWF</field>
<field name="wf_wps">accessionWF:shelve</field>
<field name="wf_wps">accessionWF:shelve:completed</field>
<field name="wf_wsp">accessionWF</field>
<field name="wf_wsp">accessionWF:completed</field>
<field name="wf_wsp">accessionWF:completed:shelve</field>
<field name="wf_swp">completed</field>
<field name="wf_swp">completed:accessionWF</field>
<field name="wf_swp">completed:accessionWF:shelve</field>

(I could use PathHierarchyTokenizerFactory to eliminate two-thirds of
those field entries, but doing it this way keeps me from having to
upgrade my Solr to 3.1 yet.)
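
The path expansion described above can be sketched in Ruby roughly as
follows. This is only an illustration, not the code actually in use:
the ORDERINGS table and the path_fields helper are hypothetical names,
and the mapping of the wps/wsp/swp suffixes to workflow, process, and
status is inferred from the sample values.

```ruby
# Hypothetical mapping from field name to the order in which the parts
# of a triple are joined (inferred from the sample field values).
ORDERINGS = {
  'wf_wps' => %i[workflow process status],
  'wf_wsp' => %i[workflow status process],
  'wf_swp' => %i[status workflow process]
}.freeze

# Expand one triple into the cumulative path values for each field.
def path_fields(workflow, process, status, delimiter = ':')
  triple = { workflow: workflow, process: process, status: status }
  ORDERINGS.each_with_object({}) do |(field, order), fields|
    parts = order.map { |key| triple.fetch(key) }
    # Every prefix of the path: "a", "a:b", "a:b:c"
    fields[field] = (1..parts.length).map { |n| parts.first(n).join(delimiter) }
  end
end

# Print the values as <field> elements for a Solr update document.
path_fields('accessionWF', 'shelve', 'completed').each do |name, values|
  values.each { |value| puts %(<field name="#{name}">#{value}</field>) }
end
```

Adding a fourth ordering would only mean adding another entry to the
table, at the cost of three more values per triple.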

That lets Solr return a facet structure that looks like this:

<lst name="facet_fields">
    <lst name="wf_wps">
        <int name="accessionWF">554</int>
        <int name="accessionWF:shelve">554</int>
        <int name="accessionWF:shelve:completed">550</int>
        <int name="accessionWF:shelve:error">4</int>
    </lst>
    <lst name="wf_wsp">
        <int name="accessionWF">554</int>
        <int name="accessionWF:completed">554</int>
        <int name="accessionWF:completed:shelve">550</int>
        <int name="accessionWF:error">4</int>
        <int name="accessionWF:error:shelve">4</int>
    </lst>
    <lst name="wf_swp">
        <int name="completed">554</int>
        <int name="completed:accessionWF">554</int>
        <int name="completed:accessionWF:shelve">550</int>
        <int name="error">4</int>
        <int name="error:accessionWF">4</int>
        <int name="error:accessionWF:shelve">4</int>
    </lst>
</lst>

I then use some Ruby post-processing to turn it into the following
(which also shows the "publish" process that the XML above leaves
out):

{
    "wf_wps": {
        "accessionWF": [554, {
            "shelve": [554, {
                "completed": 550,
                "error": 4
            }],
            "publish": [554, {
                "completed": 554
            }]
        }]
    },
    "wf_swp": {
        "completed": [554, {
            "accessionWF": [554, {
                "shelve": 550,
                "publish": 554
            }]
        }],
        "error": [4, {
            "accessionWF": [4, {
                "shelve": 4
            }]
        }]
    },
    "wf_wsp": {
        "accessionWF": [554, {
            "completed": [554, {
                "shelve": 550,
                "publish": 554
            }],
            "error": [4, {
                "shelve": 4
            }]
        }]
    }
}
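
A minimal sketch of that kind of restructuring, for anyone who wants
to try it. It is not the actual post-processing code: nest_facets is a
hypothetical helper, and it assumes the facet counts have already been
parsed from the Solr response into a Ruby hash of path => count for
each field.

```ruby
require 'json'

# Turn flat "a:b:c" => count pairs into a nested structure in which a
# node with children becomes [count, {children}] and a leaf stays a
# bare count.
def nest_facets(counts, delimiter = ':')
  tree = {}
  counts.each do |path, count|
    *parents, leaf = path.split(delimiter)
    node = tree
    parents.each do |segment|
      # Promote a bare count (or a not-yet-seen parent) to [count, {}]
      node[segment] = [node[segment], {}] unless node[segment].is_a?(Array)
      node = node[segment][1]
    end
    if node[leaf].is_a?(Array)
      node[leaf][0] = count # children arrived before the parent's count
    else
      node[leaf] = count
    end
  end
  tree
end

# Sample input taken from the wf_wps facet counts shown earlier.
facet_fields = {
  'wf_wps' => {
    'accessionWF' => 554,
    'accessionWF:shelve' => 554,
    'accessionWF:shelve:completed' => 550,
    'accessionWF:shelve:error' => 4
  }
}

nested = facet_fields.transform_values { |counts| nest_facets(counts) }
puts JSON.pretty_generate(nested)
```

Because parents are promoted lazily, the result should not depend on
the order in which Solr returns the facet values.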

Eventually I may try to code up something that does the restructuring
on the Solr side, but for now, this suits my purposes.

Michael
