I am trying to index a large data set (not rich documents) of about 5GB, but it
is not getting indexed. Small data sets index perfectly. For the large data set,
the import XML response is: 00 data-config.xml
full-import busy A command is still running... 0:9:12.738169
181
Hi,
We have an index of approximately 400GB in size; indexing 5000 documents was
taking 20 seconds. But lately indexing is taking very long: committing the
same number of documents takes 5-20 mins.
On checking the logs I can see that there are frequent merges happening, which I
am guess
It will affect the phrase queries. That is why I am not using suggest
configuration.
On Thu, Jan 17, 2013 at 7:20 AM, Chris Hostetter
wrote:
>
> : Or there is some other way to do that?
>
> I'm late to this thread, but what was wrong with the simple suggestion of
> omitTermFreqAndPositions="true"
On Jan 15, 2013, at 10:59 AM, Otis Gospodnetic
wrote:
> Hi,
>
> Question:
> Can one add the Solr master-like replication handler (but not call it
> /replication, yes) to SolrCloud nodes and point additional slave-like
> servers (i.e. servers that are not in the SolrCloud cluster) to that?
>
>
My issue is more that the search term doll shows up in both documents on CDs
as well as documents about toys. But I have 10 CD documents for every toy
document, so my searches for "doll" tend to show the CDs most prominently.
But that's not the way a user thinks. If they want the CD documents they'
Thanks for the recommendation. I'll start this book today.
In my example, "doll" is one example of a million I might only guess at,
whereas the categories "music" and "book" tend to interfere in many places and
seem to be a more limited set of categories to deal with.
Dave
-Original Messa
: DirectXmlRequest is part of the SolrJ library, so I guess that means it
: is not commonly used. My use case is that I'm applying an XSLT to the
: raw XML on the client side, instead of leaving that up to the Solr
: master (although even if I applied the XSLT on the Solr server, I'd
I think
: Or there is some other way to do that?
I'm late to this thread, but what was wrong with the simple suggestion of
omitTermFreqAndPositions="true" ?
-Hoss
: I am trying to write a util which can parse a Lucene/Solr query and convert
: into an object representation to add more clauses to the query.
:
: Eg.
: Input: (name:John AND name:Doe)
: Output: ((firstName:John OR lastName:John) AND (firstName:John OR
: lastName:John))
edismax can support this
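A minimal sketch of one way edismax could handle this, assuming the per-field
alias override (f.<alias>.qf) is available in the Solr version in use. The
base URL, the /select handler and the helper name build_alias_query are
assumptions for illustration only; they do not come from this thread.

```python
from urllib.parse import urlencode


def build_alias_query(user_query, alias, real_fields,
                      base_url="http://localhost:8983/solr/select"):
    """Build a request URL that lets edismax expand `alias` into `real_fields`.

    base_url is a hypothetical local Solr endpoint.
    """
    params = [
        ("q", user_query),                          # the query as typed by the user
        ("defType", "edismax"),                     # select the edismax parser
        (f"f.{alias}.qf", " ".join(real_fields)),   # alias -> real fields
    ]
    return base_url + "?" + urlencode(params)


url = build_alias_query("name:John AND name:Doe", "name",
                        ["firstName", "lastName"])
print(url)
```

With this approach the client never has to parse or rewrite the query itself;
the expansion of name into firstName and lastName is left to the query parser.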
Please correct my understanding,
Use one of the factories as the global similarity.
Then extend org.apache.lucene.search.similarities.DefaultSimilarity to create
a custom sim.
Then add a similarity tag in the field type definition for the required fields.
Or there is some other way to do that?
Rgds
AJ
On 17-
On Wed, Jan 16, 2013 at 6:42 PM, Walter Underwood wrote:
> Ah, that would be it. Does 4.0 also give a stack trace if you call a function
> that doesn't exist?
Stack trace still appears in the logs, but the error message returned seems OK:
http://localhost:8983/solr/query?q=*:*&defType=edismax&b
Ah, that would be it. Does 4.0 also give a stack trace if you call a function
that doesn't exist?
I can achieve most of what I want with bq, though that has IDF, which I'd
rather avoid here.
wunder
On Jan 16, 2013, at 3:38 PM, Yonik Seeley wrote:
> On Wed, Jan 16, 2013 at 6:35 PM, Walter Unde
: None of the variants worked. I started with that syntax for both
: exists() and if(). All gave the same stack trace. --wunder
...
: We're running Solr 3.3 and I have a function query for boosting that
: works with bq but not
...i'm very confused. All of the "boolean" functions (lik
On Wed, Jan 16, 2013 at 6:35 PM, Walter Underwood wrote:
> None of the variants worked. I started with that syntax for both exists() and
> if(). All gave the same stack trace. --wunder
These boolean functions are new for 4.0, but it looks like you're using 3.3?
-Yonik
http://lucidworks.com
None of the variants worked. I started with that syntax for both exists() and
if(). All gave the same stack trace. --wunder
On Jan 16, 2013, at 3:32 PM, Yonik Seeley wrote:
> On Wed, Jan 16, 2013 at 6:11 PM, Walter Underwood
> wrote:
>> I got the syntax from:
>> http://lucidworks.lucidimagina
On Wed, Jan 16, 2013 at 6:11 PM, Walter Underwood wrote:
> I got the syntax from:
> http://lucidworks.lucidimagination.com/display/solr/Function+Queries
Oops, I've alerted our tech writers! It should be fixed now.
exists(field|function) returns true if a value exists for a given document.
Exam
Hi,
Thanks very much for the help! I checked the Solr source code; what happens is
that for XML text inside an element, Solr does not call URLDecoder (but to pass
a CTRL character, I have to call urlencode from PHP).
So either I try to remove the CTRL character on the PHP side, or I change Solr's
XMLReader
sli
First, that works as "bf".
I got the syntax from:
http://lucidworks.lucidimagination.com/display/solr/Function+Queries
Various documentation has different syntax for exists().
wunder
On Jan 16, 2013, at 3:00 PM, Jack Krupansky wrote:
> Maybe it's the semicolons in the "if", which should be co
Maybe it's the semicolons in the "if", which should be commas. Also, you're
using some odd syntax in the "exists" value data source which expects a
field name or a function.
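A small sketch of the comma-separated form, assuming Solr 4.0 or later (the
boolean functions are reported elsewhere in this thread as new in 4.0). The
field name popularity and the boost values are hypothetical.

```python
from urllib.parse import urlencode


def conditional_boost(field, if_present, if_absent):
    """Return an if(exists(...)) function query using commas, not semicolons."""
    return f"if(exists({field}),{if_present},{if_absent})"


boost = conditional_boost("popularity", 2, 1)   # hypothetical field name
params = {"q": "*:*", "defType": "edismax", "boost": boost}
query_string = urlencode(params)
print(boost)
print(query_string)
```

Here exists() receives a plain field name, which matches the description that
it expects a field name or a function.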
-- Jack Krupansky
-Original Message-
From: Walter Underwood
Sent: Wednesday, January 16, 2013 1:28 PM
To: solr
In Apache Nutch we strip non-character code points with a simple method. Check
the patch; the relevant part is easily ported to any language:
https://issues.apache.org/jira/browse/NUTCH-1016
-Original message-
> From:Zhang, Lisheng
> Sent: Wed 16-Jan-2013 20:48
> To: solr-user@lucen
I would prefer to use SchemaSimilarityFactory as a global similarity and
configure a per-field similarity of which some use a flat TF impl. Much simpler
and no need to patch anything, just build a custom sim.
-Original message-
> From:Upayavira
> Sent: Wed 16-Jan-2013 21:22
> To: solr-u
Hi,
I am trying to do something similar:-
Eg.
Input: (name:John AND name:Doe)
Output: ((firstName:John OR lastName:John) AND (firstName:John OR
lastName:John))
How can I extract the fields, change them and repackage the query?
Thanks,
Balaji
--
View this message in context:
http://lu
There's gonna be two ways to do this - for yourself or for everyone.
For yourself, you'll want to subclass
org.apache.lucene.search.similarities.DefaultSimilarity and
org.apache.solr.search.similarities.DefaultSimilarityFactory.
Alternatively, patch those two files to allow setting the TF or the
Hi Alex,
Thanks very much for the help! I switched to (I am using PHP on the client side)
createTextNode(urlencode($value))
so the CTRL character problem is avoided, but I noticed that somehow Solr did
not perform urldecode($value), so my initial value
abc xyz
becomes
abc+xyz
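The effect can be reproduced outside PHP. The sketch below uses Python's
quote_plus and unquote_plus as stand-ins for PHP's urlencode and urldecode
(an assumption about equivalence that holds for the space character): the
value only returns to its original form if the receiving side decodes it.

```python
from urllib.parse import quote_plus, unquote_plus

original = "abc xyz"

# What the client sends after URL-encoding the text node.
encoded = quote_plus(original)

# What the receiver would need to do to recover the original value.
decoded = unquote_plus(encoded)

print(encoded)
print(decoded)
```

If the receiver stores the text as-is, the encoded form is what ends up in the
index.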
I have not fully read th
This has been discussed a few times: you need to implement your own Similarity,
which will write the number of tokens as a norm during indexing, and then at
query time you can check the norm value per document.
You can also do it in a more straightforward way: preprocess docs to derive
a number_of_colors field
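A minimal sketch of the preprocessing idea, assuming documents are built as
dictionaries on the client before being sent to Solr. The source field name
colors, the count field spelling number_of_colors and the helper
add_value_count are assumptions for illustration.

```python
def add_value_count(doc, source_field, count_field):
    """Return a copy of doc with count_field set to the number of values."""
    values = doc.get(source_field, [])
    if isinstance(values, str):
        values = [values]          # treat a single value as a list of one
    enriched = dict(doc)           # leave the caller's document untouched
    enriched[count_field] = len(values)
    return enriched


doc = {"id": "1", "colors": ["blue", "red"]}   # hypothetical document
indexed = add_value_count(doc, "colors", "number_of_colors")
print(indexed)
```

Documents with more than one value could then be found with a range query on
the derived field, for example number_of_colors:[2 TO *].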
We're running Solr 3.3 and I have a function query for boosting that works with
bq but not with boost (edismax). This is the same behavior described here:
http://stackoverflow.com/questions/12128561/why-doesnt-solr-function-query-work-with-boost-parameter
Here is the first part of the stack trac
Hi,
How do I find documents that have more than one value in a field?
Example:
blue
red
Vincent Vu Nguyen
group field is timestamp… it is not multivalued.
./zahoor
On 15-Jan-2013, at 7:14 PM, Upayavira wrote:
> Is your group field multivalued? Could docs appear in more than one
> group?
>
> Upayavira
>
> On Tue, Jan 15, 2013, at 01:22 PM, J Mohamed Zahoor wrote:
>>
>> The sum of all the "count"
Hi all,
I recently discovered the group.main=true/false parameter which really has
made life simple in terms of ensuring that the format coming out of Solr
for my clients (RoR app) is backwards compatible with the non-grouped
results which ensures no special "handle grouped results" logic.
The on
Sounds like 'Doll' could be a category for you, while "Doll face" is a
title. Maybe the categories should get a higher boost in eDismax definition
over the titles?
Related, you may find the following book interesting:
http://rosenfeldmedia.com/books/searchanalytics/
Regards,
Alex.
Personal bl
A boost query and a boost function will suffice for your purpose.
Rgds
AJ
On 16-Jan-2013, at 17:20, Dariusz Borowski wrote:
> Hi,
>
> Is it possible to define priorities on fields?
>
> Lets say I have a product table which has the following fields:
>
> - id
> - title
> - description
> - code_name
>
It's all about the data set, by which I mean the index. If you have documents
containing "toy" and "doll", it will return them in the result set.
What I understood is that you are talking about the context of the query. For
example if you search "books on MK Gandhi" and "books by MK Gandhi" both
queries h
Well you can stop the solrs :-)
If you are making backup by copying the actual files stored by solr, you
probably want to stop them anyway to make sure everything is consistent
and written to disk. If you don't stop the solrs, at least make sure that
you do a "commit" (not soft) after all incomm
Is there a way to lock Solr for writes?
I don't want to use Solr's integrated backup because I'm using a Ceph cluster.
What I need is to have consistent data for a few seconds to make a backup.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Way-to-lock-solr-for-incoming-writes-tp4033
Looking at this a second time, maybe we have an X/Y problem (sp?). Why was
that symbol in there in the first place?
Was it a field separator instead of using multiple fields? Was it a
character in an encoding other than UTF-8?
My guess is that the character will not make sense to Solr during either
On Tue, Jan 15, 2013 at 3:55 PM, Alexandre Rafalovitch
wrote:
> Basically, the
> recommendation is to avoid CDATA and automatically encode characters such
> as yours, as well as less/more and ampersand.
Unfortunately that doesn't even work. Just as a raw control character
like a 0 byte is invali
Forgot the link : http://en.wikipedia.org/wiki/Valid_characters_in_XML
André
On 01/16/2013 02:24 PM, Andre Bois-Crettez wrote:
It is worth noting that some characters are completely forbidden in XML, such
as "chr(0)".
When dealing with external text input, some cleanup might be necessary
to avoid br
It is worth noting that some characters are completely forbidden in XML, such
as "chr(0)".
When dealing with external text input, some cleanup might be necessary
to avoid breaking indexing.
For example, you could replace each forbidden XML character with " ".
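A minimal sketch of that cleanup, written in Python rather than the PHP used
elsewhere in this thread. It keeps only the characters allowed by XML 1.0
(tab, line feed, carriage return, U+0020 to U+D7FF, U+E000 to U+FFFD, and
U+10000 to U+10FFFF) and replaces everything else with a space. The function
name is an assumption for illustration.

```python
import re

# Matches any character that is NOT allowed in an XML 1.0 document.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def replace_invalid_xml_chars(text, replacement=" "):
    """Replace each character forbidden in XML 1.0 with `replacement`."""
    return _INVALID_XML_CHARS.sub(replacement, text)


cleaned = replace_invalid_xml_chars("abc\x00xyz\x01!")
print(repr(cleaned))
```

Running the text through such a filter before building the XML update message
avoids having to escape or encode control characters at all.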
André
On 01/15/2013 09:55 PM, Alexandre
Hello!
What do you mean by priority? You can define an index-time or query-time
boost, which will let you specify the importance of such a
field.
A good page to look at is: http://wiki.apache.org/solr/SolrRelevancyCookbook
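A small sketch of a query-time boost, assuming the edismax parser and its qf
parameter are used. The field names come from the question in this thread;
the boost values and the helper name weighted_qf are purely illustrative.

```python
from urllib.parse import urlencode


def weighted_qf(weights):
    """Turn {"field": boost} into a qf value such as 'title^10 code_name^5'."""
    return " ".join(f"{field}^{boost}" for field, boost in weights.items())


# Illustrative weights: a match in title counts most, description least.
qf = weighted_qf({"title": 10, "code_name": 5, "description": 1})
params = {"q": "shoes", "defType": "edismax", "qf": qf}
print(qf)
print(urlencode(params))
```

Higher numbers make matches in that field contribute more to the score, so
the ordering of the weights expresses the field priority.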
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene
Hi,
Is it possible to define priorities on fields?
Lets say I have a product table which has the following fields:
- id
- title
- description
- code_name
An entry could be like this:
id: 42
title: shinny new shoes
description: Shinny new shoes made in Italy
code_name: shinny-new-shoes-42-2013
I did the same thing in Solr 3.6 and it works, but in Solr 3.6 field-level
similarity is not available, and Solr 4 has Similarity factories. So I could
not work out how to do it in Solr 4. Which class do I need to extend to
move ahead?
On Wed, Jan 16, 2013 at 4:44 PM, Upayavira wrote:
> For someone ver
This involves taking a subclass of the DefaultSimilarity class, in Java,
and adding that to your Solr setup. For someone versed in Java, this is
relatively straight-forward. For others it is non-trivial.
Upayavira
On Wed, Jan 16, 2013, at 10:57 AM, Amit Jha wrote:
> Hi,
>
> How can I do this in
Hi,
How can I do this in solr4.
Amit
On Thu, Dec 6, 2012 at 1:40 PM, Markus Jelsma wrote:
> custom similarity for that field that returns 1 for
I'm a beginner-intermediate solr admin, I've set up the basics for our
application and it runs well.
Now it's time for me to dig in and start tuning and improving queries.
My next target is searches on simple terms such as "doll" which, in google,
would return documents about, well, "toy do
And, it would make for slow queries, as the more fields you query, the
worse performance gets.
Having said that, you can query multiple fields using the edismax query
parser, with its qf param.
Upayavira
On Wed, Jan 16, 2013, at 12:23 AM, Jack Krupansky wrote:
> Semi-hard-coded.
>
> In QueryPars
Mark,
Here is the issue: https://issues.apache.org/jira/browse/SOLR-3284
ConcurrentUpdateSolrServer queues updates on the SolrJ side, not on the server
side. The Solr server processes a number of updates simultaneously, so if your
servlet container's threads are unlimited it can potentially lead to OOM.
On Wed,
On Tue, 2013-01-15 at 18:02 +0100, Nicholas Ding wrote:
> I'm thinking of storing a hierarchical data structure in Solr. I know I have to
> flatten the structure in a form like A_B_C, but is it possible to extend
> Solr to support hierarchical data?
You need to be more specific here. What is it you're tryi