Thanks Jack.

On 1/24/15, 3:57 PM, Jack Krupansky wrote:
Take a look at the RegexTransformer. Or, in some cases you may need to use
the raw ScriptTransformer.

See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler
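
A minimal sketch of how the RegexTransformer suggestion might be applied to the
vulnerable-software field from the config quoted below. The regex is an
assumption: it presumes every value starts with "cpe:/" followed by a single
part letter (such as "o") and a colon. It only strips that prefix; splitting the
rest on (:) is left to the field's analyzer or to a further transformer step.

```xml
<entity name="cve-2002"
        pk="id"
        url="https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip"
        processor="XPathEntityProcessor"
        forEach="/nvd/entry"
        transformer="RegexTransformer">
    <!-- regex + replaceWith rewrites each value of the column in place.
         Assumed prefix pattern: "cpe:/o:", "cpe:/a:", "cpe:/h:", etc. -->
    <field column="vulnerable-software"
           xpath="/nvd/entry/vulnerable-software-list/product"
           regex="^cpe:/[a-z]:"
           replaceWith=""
           commonField="false" />
</entity>
```

With this in place, a value such as "cpe:/o:freebsd:freebsd:1.1.5.1" would be
expected to arrive in Solr as "freebsd:freebsd:1.1.5.1".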

-- Jack Krupansky

On Sat, Jan 24, 2015 at 3:49 PM, Carl Roberts <carl.roberts.zap...@gmail.com>
wrote:
Via this rss-data-config.xml file and a class that I wrote (attached) to
download an XML file from a ZIP URL:

<dataConfig>
     <dataSource type="ZIPURLDataSource" connectionTimeout="15000"
                 readTimeout="30000"/>
     <document>
         <entity name="cve-2002"
                 pk="id"
                 url="https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2002.xml.zip"
                 processor="XPathEntityProcessor"
                 forEach="/nvd/entry">
             <field column="id" xpath="/nvd/entry/@id" commonField="false" />
             <field column="cve" xpath="/nvd/entry/cve-id" commonField="false" />
             <field column="cwe" xpath="/nvd/entry/cwe/@id" commonField="false" />
             <field column="vulnerable-configuration"
                    xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name"
                    commonField="false" />
             <field column="vulnerable-software"
                    xpath="/nvd/entry/vulnerable-software-list/product"
                    commonField="false" />
             <field column="published" xpath="/nvd/entry/published-datetime"
                    commonField="false" />
             <field column="modified" xpath="/nvd/entry/last-modified-datetime"
                    commonField="false" />
             <field column="summary" xpath="/nvd/entry/summary" commonField="false" />
         </entity>
         <entity name="cve-2003"
                 pk="id"
                 url="http://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-2003.xml.zip"
                 processor="XPathEntityProcessor"
                 forEach="/nvd/entry">
             <field column="id" xpath="/nvd/entry/@id" commonField="false" />
             <field column="cve" xpath="/nvd/entry/cve-id" commonField="false" />
             <field column="cwe" xpath="/nvd/entry/cwe/@id" commonField="false" />
             <field column="vulnerable-configuration"
                    xpath="/nvd/entry/vulnerable-configuration/logical-test/fact-ref/@name"
                    commonField="false" />
             <field column="vulnerable-software"
                    xpath="/nvd/entry/vulnerable-software-list/product"
                    commonField="false" />
             <field column="published" xpath="/nvd/entry/published-datetime"
                    commonField="false" />
             <field column="modified" xpath="/nvd/entry/last-modified-datetime"
                    commonField="false" />
             <field column="summary" xpath="/nvd/entry/summary" commonField="false" />
         </entity>
         <!--
         <entity name="nvd-rss-update"
                 pk="link"
                 url="https://nvd.nist.gov/download/nvd-rss.xml"
                 processor="XPathEntityProcessor"
                 forEach="/RDF/item"
                 transformer="DateFormatTransformer"
                 preImportDeleteQuery="">
             <field column="id" xpath="/RDF/item/title" commonField="true" />
             <field column="link" xpath="/RDF/item/link" commonField="true" />
             <field column="summary" xpath="/RDF/item/description" commonField="true" />
             <field column="date" xpath="/RDF/item/date" commonField="true" />
         </entity>
         -->
     </document>
</dataConfig>


On 1/24/15, 3:45 PM, Jack Krupansky wrote:

How are you currently importing data?

-- Jack Krupansky

On Sat, Jan 24, 2015 at 3:42 PM, Carl Roberts <carl.roberts.zap...@gmail.com>
wrote:
Sorry if I was not clear.  What I am asking is this:

How can I parse the data during import to tokenize it by (:) and strip the
cpe:/o?



On 1/24/15, 3:28 PM, Alexandre Rafalovitch wrote:

You are using keywords here that seem to contradict each other.
Or your use case is not clear.

Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
whatever you want with it!

But then at the end, you say you want to search for whatever you
stripped off. So, that should be back in Solr again?

Or are you asking something along these lines:
1. I have a multiValued field with the following sample content... (it
does not matter to Solr where it comes from)
2. I wanted it returned as is, but I want to be able to find documents
when somebody searches for X, Y, or Z
3. What would be the best analyzer chain to be able to do so?
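
One way the analyzer chain in point 3 might look, as a sketch for schema.xml
rather than a tested configuration. The type name text_cpe is hypothetical, the
field name is taken from the import config elsewhere in this thread, and the
prefix pattern assumes values always begin with "cpe:/", one part letter, and a
colon. The stored value is untouched, so documents are still returned as is;
only the indexed tokens change.

```xml
<fieldType name="text_cpe" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Drop the assumed "cpe:/o:" style prefix before tokenizing -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="^cpe:/[a-z]:" replacement=""/>
    <!-- Split on colons; whitespace is included so that a quoted query
         such as "freebsd 1.1" is split the same way as the indexed value -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[:\s]+"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="vulnerable-software" type="text_cpe"
       indexed="true" stored="true" multiValued="true"/>
```

Under these assumptions, "cpe:/o:freebsd:freebsd:1.1" would be indexed as the
tokens freebsd, freebsd, 1.1, which should let searches for freebsd or
"freebsd 1.1" match while the original string is still what comes back in
results.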

Regards,
      Alex.
----
Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 24 January 2015 at 15:04, Carl Roberts <
carl.roberts.zap...@gmail.com>
wrote:

  Hi,
How can I parse the data in a field that is returned from a query?

Basically,

I have a multi-valued field that contains values such as these that are
returned from a query:

             "cpe:/o:freebsd:freebsd:1.1.5.1",
             "cpe:/o:freebsd:freebsd:2.2.3",
             "cpe:/o:freebsd:freebsd:2.2.2",
             "cpe:/o:freebsd:freebsd:2.2.5",
             "cpe:/o:freebsd:freebsd:2.2.4",
             "cpe:/o:freebsd:freebsd:2.0.5",
             "cpe:/o:freebsd:freebsd:2.2.6",
             "cpe:/o:freebsd:freebsd:2.1.6.1",
             "cpe:/o:freebsd:freebsd:2.0.1",
             "cpe:/o:freebsd:freebsd:2.2",
             "cpe:/o:freebsd:freebsd:2.0",
             "cpe:/o:openbsd:openbsd:2.3",
             "cpe:/o:freebsd:freebsd:3.0",
             "cpe:/o:freebsd:freebsd:1.1",
             "cpe:/o:freebsd:freebsd:2.1.6",
             "cpe:/o:openbsd:openbsd:2.4",
             "cpe:/o:bsdi:bsd_os:3.1",
             "cpe:/o:freebsd:freebsd:1.0",
             "cpe:/o:freebsd:freebsd:2.1.7",
             "cpe:/o:freebsd:freebsd:1.2",
             "cpe:/o:freebsd:freebsd:2.1.5",
             "cpe:/o:freebsd:freebsd:2.1.7.1"],

And my problem is that I need to strip the cpe:/o part and I also need to
tokenize words using the (:) as a separator so that I can then search for
"freebsd 1.1" or "openbsd 2.4" or just "freebsd".

Thanks in advance.

Joe


