Hi all,
I'm wondering if it's possible to have spell checking performed on terms in
the synonym list?
For example, let's say I have documents with the word "lawyer" in them and
I add "lawyer, attorney" in the synonyms.txt file. Then a query is made for
the word "atorney". Is there any way to prov
Hi,
We are facing the same issues on our setup. 3 zk nodes, 1 shard, 10
collections, 1 replica. v. 5.0.0. default startup params.
Solr Servers: 2 core cpu, 7gb memory
Index size: 28g, 3gb heap
This setup was running on v. 4.6 before upgrading to 5 without any of these
errors. The timeout seems to
On 08/07/2015 20:39, Allison, Timothy B. wrote:
Unfortunately, no. We can't even do that now with straight Tika. I
imagine this is for pdf files? If you'd like to add this as a
feature, please submit a ticket over on Tika.
Another alternative is to pre-process the PDF files to remove the fir
Mikhail,
We've now overridden equals and hashCode of the custom query to use this new
param as well, and it works like a charm.
Thanks a lot,
Ami
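
For readers following the thread, a rough illustration of why this fix works: a cache looks entries up by the key's equality and hash, so a parameter left out of those methods lets two differently scored queries share one cache entry. The sketch below is a Python analogue of overriding equals and hashCode in Java; the class and field names are hypothetical and are not the poster's code.

```python
class CustomScoreQuery:
    """Hypothetical stand-in for a custom query object used as a cache key."""

    def __init__(self, text, boost_param):
        self.text = text
        self.boost_param = boost_param  # the "new param" that changes scoring

    def __eq__(self, other):
        # Two queries are equal only if the scoring parameter matches too.
        if not isinstance(other, CustomScoreQuery):
            return NotImplemented
        return (self.text, self.boost_param) == (other.text, other.boost_param)

    def __hash__(self):
        # The hash must be built from the same fields as equality.
        return hash((self.text, self.boost_param))


# A plain dict plays the role of the query result cache here.
cache = {}
cache[CustomScoreQuery("lawyer", 1.0)] = "results scored with 1.0"
cache[CustomScoreQuery("lawyer", 2.0)] = "results scored with 2.0"
```

With the parameter included in both methods, the two queries occupy separate cache entries instead of one overwriting the other.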
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-cache-when-using-custom-scoring-tp4216419p4216496.html
Sent from the Solr - U
Hi,
I want to add a question regarding copyField and LowerCaseFilterFactory.
We notice that LowerCaseFilterFactory takes a huge part of the CPU (via
profiling) for the text field.
Can we avoid it or improve that implementation (while keeping case-insensitive
search)?
Best Regards,
Nir Barel
-
Cool! So actually you were not using the default you defined in the
solrconfig, but it was loaded from a Java environment property set to
"3" ms?
Cheers
2015-07-09 4:21 GMT+01:00 Summer Shire :
> Yonik, Mikhail, Alessandro
>
> After a lot of digging around and isolation, All u guys were
Let me answer inline:
2015-07-09 9:35 GMT+01:00 Nir Barel :
> Hi,
>
> I want to add a question regarding copyField and LowerCaseFilterFactory.
> We notice that LowerCaseFilterFactory takes a huge part of the CPU (via
> profiling) for the text field.
> Can we avoid it or improve that implementati
Hi All,
Is there a way to get combined data from 2 different cores together in a
single call?
For example, data from both the CatalogEntry and CatalogGroup cores in a
single call to Solr.
--
Regards,
Santosh Sidnal
Hi Li Li,
I am experiencing the same problem. Can you explain in a little more detail?
Where do I change these methods?
I am using Solr 5.0.0. How do I query this? Is there any change at query
time?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Ranking-based-on-term-position-tp
One of the uses of synonyms is to replace a mis-spelled query term with a
correctly spelled value.
The "2 sided" synonym file format allows you to control which values "survive"
into the actual query.
lawyer, attorney, ambulance chaser, atorney, lowyor => lawyer, attorney
I am not aware, howev
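
To make the "2 sided" format concrete, here is a minimal sketch of how the rule above can be read: every term on the left of `=>` is rewritten to the terms on the right. This is a simplified model written for illustration, not Solr's actual synonym parser, and the function name is hypothetical. It ignores escaping and the way multi-word entries such as "ambulance chaser" are tokenized.

```python
def parse_explicit_rule(line):
    """Map each left-hand term of 'a, b => x, y' to the right-hand terms."""
    left, right = line.split("=>", 1)
    inputs = [term.strip() for term in left.split(",") if term.strip()]
    outputs = [term.strip() for term in right.split(",") if term.strip()]
    # Only the right-hand terms "survive" into the rewritten query.
    return {term: outputs for term in inputs}


rule = "lawyer, attorney, ambulance chaser, atorney, lowyor => lawyer, attorney"
mapping = parse_explicit_rule(rule)
rewritten = mapping.get("atorney", ["atorney"])
```

Here the misspelled "atorney" is rewritten to "lawyer" and "attorney", while a term that appears in no rule is left unchanged.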
Hi all,
How can we restore the index in Solr 5.1.0 ?
We did the following:
1:- Started Solr Cloud from:
bin/solr start -e cloud -noprompt
2:- posted some documents to solr from examples folder using :
java -Dc=gettingstarted -jar post.jar *.xml
3:- Backed up t
Hello,
I've been working to get a search engine up and running for a little while
now. I'm using Solr to index from both a database and a file system.
However, I'm using the filepath contained inside the database to find the
file in the filesystem and then merge the metadata in the DB and the
I posted the code anyway; I just forgot to get rid of that line in the post.
Sorry
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrJ-Tika-custom-indexer-not-indexing-CERTAIN-doc-text-tp4216541p4216542.html
Sent from the Solr - User mailing list archive at Nabble.com.
That should be fixable. In a past life, I generated a perfect hash to fold
case for Unicode in a locale-neutral manner and it was very fast. If I
remember right, there are only about 2500 Unicode characters that can be case
folded at all. So the generated, collision-free hash function was v
On 7/9/2015 2:35 AM, Nir Barel wrote:
> I want to add a question regarding copyField and LowerCaseFilterFactory.
> We notice that LowerCaseFilterFactory takes a huge part of the CPU (via
> profiling) for the text field.
> Can we avoid it or improve that implementation? ( keeping the insensitive
>
Ryan,
If you use index-time synonyms on the spellcheck field, this will give you what
you want.
For instance, if the document has "lawyer" and you index both terms
"lawyer","attorney", then the spellchecker will see that "atorney" is 1 edit
away from an indexed term and will suggest "attorney"
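
The "1 edit away" claim can be checked with a plain Levenshtein distance. The sketch below is only an illustration of that distance; Solr's spellchecker has its own implementation and ranking, and the `suggest` helper here is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def suggest(term, indexed_terms, max_edits=1):
    """Return indexed terms within max_edits of a term that is not indexed."""
    return sorted(t for t in indexed_terms
                  if 0 < edit_distance(term, t) <= max_edits)


# With index-time synonyms, both terms are present in the spellcheck field.
suggestions = suggest("atorney", {"lawyer", "attorney"})
```

Without the index-time synonym, only "lawyer" would be indexed, and it is too far from "atorney" to be suggested.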
-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Thursday, July 09, 2015 9:55 AM
To: solr-user@lucene.apache.org
Subject: Re: Do I really need copyField when my app can do the copy?
On 7/9/2015 2:35 AM, Nir Barel wrote:
> I want to add a question regarding copyF
I was having a problem in a 4.x version of Solr and wanted to check
5.2.1 to see if it still had the same problem, so I copied my fieldType
into a 5.2.1 example schema. My fieldType uses some ICU analysis
classes, so I also put the contrib jars into server/solr/lib.
I ran into a problem similar t
Combining under new subject to reflect new question.
Took a quick look at both the LowerCaseFilter and the Java implementation it uses.
A perfect hash would be much faster and, since LowerCaseFilter does not
consider locale, applicable.
ICUFoldingFilter is a somewhat different animal. But I ta
Company: AlphaSense https://www.alpha-sense.com/
Position: Search Engineer
AlphaSense is a one-stop financial search engine for financial research
analysts all around the world.
AlphaSense is looking for Search Engineers experienced with Lucene / Solr
and search architectures in general. Position
Hi All,
I did a load test with a total of 800 requests (at 40 concurrent requests
per second) executed against a Solr index with 14 M records. Performance
was good (< 1 second), especially once the test had been running for a short time.
BTW, the second round of the load test was even better.
The local m
You can try using the "shards" parameter. The problem will be, though,
that the score calculations may not really be comparable...
Best,
Erick
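
A minimal sketch of what such a request could look like, assuming both cores run on one local Solr instance. The host, port, and helper name are assumptions made for illustration; only the core names come from the question. The `shards` value lists each core as host:port/path without the URL scheme.

```python
from urllib.parse import urlencode


def build_distributed_query(base_url, shard_addresses, query, rows=10):
    """Build a select URL that asks Solr to search several cores at once."""
    params = {
        "q": query,
        "rows": rows,
        "wt": "json",
        # Comma-separated list of cores to query, without "http://".
        "shards": ",".join(shard_addresses),
    }
    return base_url + "/select?" + urlencode(params)


# Hypothetical local addresses for the two cores named in the question.
url = build_distributed_query(
    "http://localhost:8983/solr/CatalogEntry",
    ["localhost:8983/solr/CatalogEntry", "localhost:8983/solr/CatalogGroup"],
    "*:*",
)
```

The scoring caveat above still applies: the results are merged into one list, but relevance scores computed in different cores may not be directly comparable.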
On Thu, Jul 9, 2015 at 3:40 AM, santosh sidnal wrote:
> Hi All,
>
> Is there a way to get combined data from 2 different cores together in a
> single c
Wow, that code looks familiar ;)...
Anyway, what have you tried?
bq: It would pull it but when I got the results in Solr it would look
blank
How do you know this? Do _some_ docs have text in Solr but some
don't or are all of your text fields blank? In this case I suspect
you're not storing the da
Concur on both points. You can also use PDFBox's app "ExtractText" with
-startPage and -endPage parameters:
https://pdfbox.apache.org/1.8/commandline.html#extractText
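
A small sketch of how that command line could be assembled for batch pre-processing. The jar path and file names are placeholders, and the helper is hypothetical; the `ExtractText` command and the `-startPage` and `-endPage` options are the ones documented at the link above.

```python
def extract_text_command(jar_path, pdf_path, out_path, start_page, end_page):
    """Build the argument list for PDFBox's ExtractText command-line app."""
    return [
        "java", "-jar", jar_path, "ExtractText",
        "-startPage", str(start_page),
        "-endPage", str(end_page),
        pdf_path, out_path,
    ]


# Placeholder paths; skip the first page by starting extraction at page 2.
command = extract_text_command(
    "pdfbox-app.jar", "input.pdf", "output.txt", start_page=2, end_page=10
)
```

The resulting list can be passed to `subprocess.run(command, check=True)`, and the extracted text file can then be indexed in place of the original PDF.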
-Original Message-
From: Charlie Hull [mailto:char...@flax.co.uk]
Sent: Thursday, July 09, 2015 3:55 AM
To: solr-user@
On 7/9/2015 9:48 AM, wwang525 wrote:
> I did a load test with a total of 800 requests (at 40 concurrent requests
> per second) executed against a Solr index with 14 M records. Performance
> was good (< 1 second), especially once the test had been running for a short time.
> BTW, the second round of loa
It's actually unlikely that increasing the documentCache will
help materially. It's primarily so various components won't
have to fetch the documents off disk for a _single_ request.
I've heard some anecdotal evidence that it helps in some situations,
but that's been rare in my experience.
Your fi
Haha no need to reinvent wheels. Especially when you don't know java. Just a
prototype anyway.
I made a very strong assumption that it was pulling the text as blank
because I would copy the EXACT same text from one file in the file system
and put it into another file under a different name, but in
Rich,
I've run into various problems with group.query and highlighting. You noted
one below (SOLR-5046), and there is also SOLR-6712, which might be related to
what you are experiencing. Still waiting for that patch to be reviewed...
-Original Message-
From: Rich Hume [mailto:rh...@id
I rather doubt that it's a Solr issue. Text is text after all. If
some docs display text, then it's probably a matter of not
getting the text in the first place.
My _guess_ is that you're not getting any text at all from
the document. Either the document isn't being found
or it's not a form that T
Hi,
The real production requests will not be randomly generated, and a lot of
requests will be repeated. I think the performance will be better due to the
repeated requests. In addition, I am sure the configuration will need to be
adjusted once the application is in production.
For the time being
Hi, I have a similar use case and am still looking for a solution. I have
posted a question about it on Stack Overflow (
http://stackoverflow.com/questions/31281640/sum-field-and-sort-on-solr )
Did you solve it?
Regards.
--
Emilio Borraz
*Back-end Developer*
emilio.bor...@sonatasmx.com
I'd examine the filter queries used to see whether they make sense as well.
You really have to re-tune after you start getting real user queries though
as anything you generate won't reflect reality. I'd start _much_ smaller, 512
or 1024 and work _up_ with real data.
Raising the document cache lim
Thanks for all your help. I decided to switch to Ubuntu linux.
Allan Elkowitz elkow...@alumni.caltech.edu
On Wednesday, July 8, 2015 1:44 AM, Shawn Heisey
wrote:
On 7/7/2015 10:43 AM, Allan Elkowitz wrote:
> So I am a newbie at Solr and am having trouble getting the examples workin
Hi everyone,
I use Solr to index and search Office files (docx, pptx, ...). To reduce
the size of the Solr index, I do not store the content of the files in Solr;
however, my customer now wants to preview the content of a file.
I have read the documentation for ExtractingRequestHandler, but it seems that
I want to log queries running through DIH. Should I use LogTransformer
to do that?
One thing I noted is that you need to give the full package name when
specifying the transformer.
For example, I have added the one below
mailto:test.mi...@gmail.com]
Sent: Friday, July 10, 2015 11:08 AM
To: solr-user@lucene.apache.org
Subject: LogTransformer
I want to log query running through DIH should i use LogT
Hi Jagdish,
not working for me.
On Fri, Jul 10, 2015 at 11:21 AM, Jagdish Vasani
wrote:
> One thing I noted is that you need to give the full package name when
> specifying the transformer.
> For example, I have added the one below
>
> Hope this will help you.
>
> Thanks,
> Jagdish
> -Original Message-
> Fr