Yes, I am using DIH and reading the data from an XML file through the
URL data source. I want to strip the cpe:/o prefix and tokenize the values
on the colon (:) during import so that I can then search them as I
described. So my question is this:
Is there any built-in logic, via a transformer class, that could do this?
If not, how would you recommend I do it?
Regards,
Joe
On 1/24/15, 3:38 PM, Jack Krupansky wrote:
Or, maybe... he's using DIH and getting these values from an RDBMS
query and now wants to index them in Solr. Who knows!
It might be simplest to transform the colons to spaces and use a normal
text field. Alternatively, you could use a custom text field type with a
regex tokenizer that treats the colons as token separators.
-- Jack Krupansky
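To make the two options above concrete, here is a minimal Python sketch of the tokenization logic only. It is not Solr configuration, the function names are made up for illustration, and a real text field analyzer may split or normalize tokens further than this does.

```python
import re

def tokens_via_space_replacement(value):
    # Option 1: turn colons into spaces, then split on whitespace,
    # roughly what a plain text field would see after the replacement.
    return value.replace(":", " ").split()

def tokens_via_regex_split(value):
    # Option 2: treat each colon as a token separator directly,
    # dropping any empty tokens.
    return [t for t in re.split(r":", value) if t]

sample = "cpe:/o:freebsd:freebsd:1.1.5.1"
print(tokens_via_space_replacement(sample))
print(tokens_via_regex_split(sample))
```

For this sample both functions produce the same tokens. Note that "cpe" and "/o" are still among them, so the prefix would have to be stripped separately under either approach.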
On Sat, Jan 24, 2015 at 3:28 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:
You are using keywords here that seem to contradict each other,
or your use case is not clear.
Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
whatever you want with it!
But then at the end, you say you want to search for whatever you
stripped off. So, that should be back in Solr again?
Or are you asking something along these lines:
1. I have a multiValued field with the following sample content... (it
does not matter to Solr where it comes from)
2. I wanted it returned as is, but I want to be able to find documents
when somebody searches for X, Y, or Z
3. What would be the best analyzer chain to be able to do so?
Regards,
Alex.
On 24 January 2015 at 15:04, Carl Roberts <carl.roberts.zap...@gmail.com>
wrote:
Hi,
How can I parse the data in a field that is returned from a query?
Basically, I have a multi-valued field that contains values such as these,
which are returned from a query:
"cpe:/o:freebsd:freebsd:1.1.5.1",
"cpe:/o:freebsd:freebsd:2.2.3",
"cpe:/o:freebsd:freebsd:2.2.2",
"cpe:/o:freebsd:freebsd:2.2.5",
"cpe:/o:freebsd:freebsd:2.2.4",
"cpe:/o:freebsd:freebsd:2.0.5",
"cpe:/o:freebsd:freebsd:2.2.6",
"cpe:/o:freebsd:freebsd:2.1.6.1",
"cpe:/o:freebsd:freebsd:2.0.1",
"cpe:/o:freebsd:freebsd:2.2",
"cpe:/o:freebsd:freebsd:2.0",
"cpe:/o:openbsd:openbsd:2.3",
"cpe:/o:freebsd:freebsd:3.0",
"cpe:/o:freebsd:freebsd:1.1",
"cpe:/o:freebsd:freebsd:2.1.6",
"cpe:/o:openbsd:openbsd:2.4",
"cpe:/o:bsdi:bsd_os:3.1",
"cpe:/o:freebsd:freebsd:1.0",
"cpe:/o:freebsd:freebsd:2.1.7",
"cpe:/o:freebsd:freebsd:1.2",
"cpe:/o:freebsd:freebsd:2.1.5",
"cpe:/o:freebsd:freebsd:2.1.7.1"],
And my problem is that I need to strip the cpe:/o part and I also need to
tokenize words using the (:) as a separator so that I can then search for
"freebsd 1.1" or "openbsd 2.4" or just "freebsd".
Thanks in advance.
Joe
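The behavior requested above can be sketched in Python as follows. This only illustrates the intended result and is not a Solr transformer or analyzer. The names PREFIX, tokenize_cpe, and matches are hypothetical, and requiring all query terms to occur within a single value is an assumption about the desired search semantics.

```python
PREFIX = "cpe:/o:"

def tokenize_cpe(value):
    # Drop the leading "cpe:/o:" if present, then split on colons.
    if value.startswith(PREFIX):
        value = value[len(PREFIX):]
    return [t for t in value.split(":") if t]

def matches(query, values):
    # Assumed semantics: a document matches when every query term
    # equals a token of one single value (e.g. "freebsd 1.1").
    terms = query.lower().split()
    return any(
        all(term in tokenize_cpe(v.lower()) for term in terms)
        for v in values
    )

values = [
    "cpe:/o:freebsd:freebsd:1.1",
    "cpe:/o:openbsd:openbsd:2.4",
    "cpe:/o:bsdi:bsd_os:3.1",
]
print(tokenize_cpe(values[0]))
print(matches("freebsd 1.1", values))
print(matches("openbsd 2.4", values))
print(matches("freebsd", values))
```

With these three sample values, the searches "freebsd 1.1", "openbsd 2.4", and "freebsd" all match, while a mixed query such as "freebsd 2.4" does not, because no single value contains both terms.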