In this case q=name:(ana jose) will work, but if the search runs against a
full-text field it might have poor precision: it will also return documents
like "San Jose is better than Santa Ana", which was not the user's intent.
Erick's solution, "ana jose"~2, captures the intent too.
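A minimal sketch of the two query styles discussed here, assuming the request goes to Solr's standard select handler. The field name "text" is a hypothetical full-text field and is not taken from the thread; the helper name select_query is also only illustrative.

```python
from urllib.parse import urlencode, parse_qs


def select_query(q, rows=10):
    """Return the URL-encoded query string for a Solr select request."""
    return urlencode({"q": q, "rows": rows})


# Any-term match on the name field (the query quoted in the message).
any_term = select_query('name:(ana jose)')

# Phrase query with a slop of 2, so the two terms must sit close together.
# "text" is a hypothetical full-text field name (assumption).
proximity = select_query('text:"ana jose"~2')

print(any_term)
print(proximity)

# Decoding the parameters again shows the raw query Solr would receive.
print(parse_qs(proximity)["q"][0])
```

The proximity form is the one that keeps far-apart occurrences of the two terms from matching.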
On Mon, Apr
There might be an issue with your default search field. If you are
searching a field named "MyTestField", give your query as
MyTestField:Birmingham
and see if you get any results. As Matt suggested, there might also be
issues with the way you have done tokenization/analysis, etc.
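One way to run that check, sketched under assumptions: the host, port and core name in SOLR_SELECT are placeholders, and the helper names are made up for illustration. The first URL qualifies the field explicitly; the second sets it as the default field through the df parameter and adds debugQuery=true so the response shows how the query was parsed.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust host, port and core name.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def fielded_url(field, term):
    """Query one field explicitly, bypassing the default search field."""
    return SOLR_SELECT + "?" + urlencode({"q": f"{field}:{term}"})


def default_field_url(field, term):
    """Search the same field via df and ask Solr for debug output."""
    params = {"q": term, "df": field, "debugQuery": "true"}
    return SOLR_SELECT + "?" + urlencode(params)


print(fielded_url("MyTestField", "Birmingham"))
print(default_field_url("MyTestField", "Birmingham"))
```

If the first URL returns hits and a plain q=Birmingham does not, the default search field is the likely culprit; if neither returns hits, the analysis chain is the next thing to inspect.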
On Mon, A
Well, I have indexed heterogeneous sources, including a variety of NoSQL stores,
RDBMSs and rich documents (PDF, Word, etc.), using SolrJ. The only prerequisite
for using SolrJ is that you have an API to fetch data from your data
source (say, JDBC for an RDBMS, Tika for extracting text content from rich
docum
one has limited or no
experience with the areas mentioned above but is passionate about
Information Retrieval/Text Mining and has a rock-solid background in
algorithms is encouraged to apply/connect.
Check out more on GE Research: http://www.geglobalresearch.com/
Cheers,
Yavar Husain
Lead Da
What is the best pattern to index the following kind of data:
HarryPotter.PDF
HarryPotter.txt
Avengers.Docx
Avengers.txt
For each of the above files, the metadata lies in the text file that has the
same name as the rich document (as can be seen above).
(1) Now the brute force method that I can think o
Solr is an IR system where spell correction is a topping, whereas Google has
a team dedicated just to spell correction. "Did you mean" (a more general
term, and much broader than basic spell correctors) or spell correctors
require a plethora of skills. I will just discuss spell correctors here and
not
> plus sign...
>
> Best,
> Erick
>
> On Tue, Dec 23, 2014 at 9:55 PM, Yavar Husain
> wrote:
> > So my Solr date range query is as follows:
> >
> >
> &facet.range=date&facet.range.start=NOW/DAY-36MONTH&facet.range.end=NOW/DAY&facet.range.gap=%2B1MO
So my Solr date range query is as follows:
&facet.range=date&facet.range.start=NOW/DAY-36MONTH&facet.range.end=NOW/DAY&facet.range.gap=%2B1MONTH
I need facets for the past 36 months (3 years), and everything is fine except
that no data is returned for the last month.
However the facets I am getting
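For reference, a small sketch of how the facet parameters quoted above can be built so that the plus sign in the gap is encoded correctly. A literal "+" in a query string is decoded as a space, which is why the gap appears as %2B1MONTH. The variable names are illustrative only; the parameter values are the ones from the message.

```python
from urllib.parse import urlencode

# Range-facet parameters exactly as given in the message.
facet_params = [
    ("facet.range", "date"),
    ("facet.range.start", "NOW/DAY-36MONTH"),
    ("facet.range.end", "NOW/DAY"),
    ("facet.range.gap", "+1MONTH"),  # "+" must reach Solr as %2B
]

# safe="/" keeps the slashes readable; everything else is percent-encoded.
encoded = urlencode(facet_params, safe="/")
print(encoded)
```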
Although I am already in touch with Dawid (creator of Carrot2) on the Carrot2
mailing list, I wanted to post my problem to a wider audience.
I am using Solr 4.7 (on both Windows and Linux) and saved my
lingo-attributes.xml file from the Workbench, which I am using in Solr. Note
that for testing I am
Most of my experience is with Solr on Tomcat; however, I recently
started with Jetty. I am using Solr 4.7.0 on Windows 7. I have configured
Solr properly and am able to see the admin UI as well as the Velocity browse
page. The DataImportHandler screen is also displayed. However, when I do a
full im
n is a stopgap measure rather than a robust
> architecture.
>
>
> -- Jack Krupansky
>
> -Original Message- From: Yavar Husain
> Sent: Tuesday, July 22, 2014 2:22 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cassandra MySQL Best Practice Indexing
>
>
> Solr-enabled Cassandra data center just the same as with normal Solr.
>
> -- Jack Krupansky
>
> -Original Message- From: Yavar Husain
> Sent: Monday, July 21, 2014 8:37 AM
> To: solr-user@lucene.apache.org
> Subject: Solr Cassandra MySQL Best Practice Indexing
>
>
So my full-text data lives in Cassandra along with an ID. I also have a lot
of structured data linked to that ID, which lives in an RDBMS (read: MySQL). I
need this structured data because it would help me with faceting and other
needs. What is the best practice for indexing in this scenario?
My
lgorithms is encouraged to apply/connect.
Cheers,
Yavar Husain
Lead Data Scientist - Text Mining Laboratory
GE Research, Bangalore
LinkedIn: http://www.linkedin.com/pub/yavar-husain/5/805/151
Text@ yavarhus...@gmail.com