Hi Gora,
Thank you for your quick reply.
I have only one data source, but more than 300 tables. I have put each table
individually in data-config.xml.
But when I try to do a full import, it only shows as many as
169.
This 169 means I took 169 tables from my data source, and each 169
Colleagues,
FWIW, bq is a DisMax parser feature. Shawn, to get that boosting syntax
with the standard parser you need something like q=foo:bar ip:sc^1000.
Specifying ^1000 inside bq never makes sense. If you show your query params and
debugQuery output, it will be much easier for us to help you.
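For example, a minimal sketch (foo, ip and sc are placeholder field names and
values):

  q=foo:bar ip:sc^1000&debugQuery=true

Here the ^1000 is attached to the ip:sc clause inside q itself, instead of
being put in bq.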
PS omitt
On 18 January 2013 12:49, ashimbose wrote:
> Hi Otis,
>
> Thank you for your reply.
>
> But I am unable to get any search result related to the error code. It does
> not respond for more than 168 data sources. I have tested it. If you have any
> other solution please let me know.
Not sure about the l
On 1/17/2013 11:41 PM, Walter Underwood wrote:
As I understand it, the bq parameter is a full Lucene query, but only used for
ranking, not for selection. This is the complement of fq.
You can use weighting: provider:fred^8
I tried bq=ip:sc^1000 and it doesn't seem to be making any difference
On 1/17/2013 11:41 PM, Walter Underwood wrote:
As I understand it, the bq parameter is a full Lucene query, but only used for
ranking, not for selection. This is the complement of fq.
You can use weighting: provider:fred^8
This will be affected by idf, so providers with fewer matches will have higher
weight than those with more matches. Th
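For example, a sketch (assuming the dismax parser; the field names are
placeholders):

  q=doll&defType=dismax&qf=product_name&bq=provider:fred^8

With dismax, the bq clause's score is added to the main query score, so the
boost is additive rather than multiplicative.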
I did try the bq parameter. Either I'm not using it correctly, or it's
not making a noticeable difference. I was not able to find any good
docs, either. Can you give me complete instructions on its use? Can I
control the boost factor? Is the boost additive or multiplicative?
For query ele
Hi Oakstream,
Coincidentally I've been thinking of porting the geohash prefixtree
intersection algorithm in Lucene 4 spatial to Accumulo (another big-table
system like HBase). There's a decent chance it'll happen this year, I
think. That doesn't help your need right now, of course, so go with Otis
A new response attribute would be better, but it also complicates the patch
in that it would require a new way to serialize DocSlices, I think
(especially when group.main=true). I was looking to set group.main=true so
that my existing clients don't have to change to parse the grouped
result set format.
Thanks again, Erick. This time I got it working :). In fact, your first
response itself had a clear explanation; somehow I did not understand it
completely!
On Thu, Jan 17, 2013 at 6:59 PM, Erick Erickson wrote:
> You could write a custom Filter (or perhaps Tokenizer), but I usually
> just do it on th
Have you tried boost query? bq=provider:fred
wunder
On Jan 17, 2013, at 9:08 PM, Jack Krupansky wrote:
> Start with "Query Elevation" and see if that helps:
> http://wiki.apache.org/solr/QueryElevationComponent
>
> Index-time document boost is a possibility.
>
> Maybe an ExternalFileField whe
Solr will ignore "required" for dynamic fields. It will be parsed and
preserved, but will not affect the check for required fields in an input
document.
Ditto for "default" value for a dynamic field.
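For example, a sketch of a dynamic field carrying both attributes:

  <dynamicField name="*_txt" type="text_general" indexed="true" stored="true"
                required="true" default="n/a"/>

Both required and default here will be parsed and preserved, but have no
effect on input documents.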
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Friday, Janu
I want to make something like Alfresco, but without that many features.
And I'd like to utilise the search capability of Solr.
On Fri, Jan 18, 2013 at 4:11 PM, Gora Mohanty wrote:
> On 18 January 2013 10:36, Nicholas Li wrote:
> > hi
> >
> > I am new to solr and I would like to use Solr as m
Yes.
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Friday, January 18, 2013 12:26 AM
To: solr-user@lucene.apache.org
Subject: Re: What is the difference in defining multiValued on field and or
fieldtype?
Thank you Jack,
I just realized that perhaps "ignored" was a bad example. But if I understood
correctly, I can specify multiValued on the type and not do so on the
field itself, and I still get multiValued entries.
That's good to know.
Regards,
Alex.
Personal blog: http://blog.outerthought
Unfortunately, it seems (
http://lucene.472066.n3.nabble.com/Nrt-and-caching-td3993612.html) that
these caches are not per-segment. In this case, I want to (soft) commit
less frequently. Am I right?
Tomás, as the fieldValueCache is very similar to Lucene's FieldCache, I
guess it has a big contribu
Specifying an attribute on the field type makes it the default for any field
of that type.
Setting multiValued=true on "ignored" simply allows it to be used for any
field, whether it is single or multi-valued, and any source data, whether it
has one or multiple values for that ignored field. O
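A small schema.xml sketch of that inheritance (names are illustrative):

  <fieldType name="str_multi" class="solr.StrField" multiValued="true"/>
  <field name="tags" type="str_multi"/>
  <field name="title" type="str_multi" multiValued="false"/>

tags inherits multiValued="true" from its type; title overrides it at the
field level.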
Or put the term in quotes.
-- Jack Krupansky
-Original Message-
From: Erick Erickson
Sent: Thursday, January 17, 2013 6:59 PM
To: solr-user@lucene.apache.org
Subject: Re: searching for q terms that start with a dash/hyphen being
interpreted as prohibited clauses
I think all you need
On 18 January 2013 10:36, Nicholas Li wrote:
> hi
>
> > I am new to Solr and I would like to use Solr as my document server, plus
> > search engine. But Solr is not CMIS compatible (while it should not be, as
> > it is not built as a pure document management server). In that sense, I
> > would build anoth
Start with "Query Elevation" and see if that helps:
http://wiki.apache.org/solr/QueryElevationComponent
Index-time document boost is a possibility.
Maybe an ExternalFileField where every document could have a dynamic boost
value that you add with a boost function.
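For the elevation route, a minimal elevate.xml sketch (query text and doc id
are placeholders):

  <elevate>
    <query text="spiderman doll">
      <doc id="PROD-123"/>
    </query>
  </elevate>

Documents listed for a query are forced to the top of the results for that
query text.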
-- Jack Krupansky
-Orig
Hi,
I am new to Solr and I would like to use Solr as my document server, plus
search engine. But Solr is not CMIS compatible (while it should not be, as
it is not built as a pure document management server). In that sense, I
would build another layer beyond Solr so that the exposed interface would
There are a couple of ways you can proceed. You can preconfigure some SolrCores in
solr.xml. Even if you don't, you want a solr.xml, because that is where a lot
of cloud properties are defined. Or you can use the collections API or the core
admin API.
I guess I'd recommend the collections API.
Yo
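For example, a sketch of a Collections API call (host, names and counts are
placeholders):

  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=1&replicationFactor=2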
You'd want to do your Solr spatial query, get IDs from the index, and then
*after* that do a multi get against your HBase table with top N IDs from
Solr's response, and thus get the data back to the caller. I don't know
how fast multi gets are, what the limitations are, etc. Maybe somebody
els
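A rough sketch of that flow (SolrJ plus the HBase client; the URL, the geo
field and the "products" table are made up, and the polygon is elided):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.HttpSolrServer;
  import org.apache.solr.common.SolrDocument;

  public class SpatialThenHBase {
    public static void main(String[] args) throws Exception {
      // 1) spatial query against Solr, fetching only the top-N IDs
      HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
      SolrQuery q = new SolrQuery("{!field f=geo}Intersects(POLYGON((...)))");
      q.setFields("id");
      q.setRows(100);
      List<Get> gets = new ArrayList<Get>();
      for (SolrDocument doc : solr.query(q).getResults()) {
        gets.add(new Get(Bytes.toBytes((String) doc.getFieldValue("id"))));
      }
      // 2) one batched multi-get against HBase for the full records
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "products");
      Result[] rows = table.get(gets);
      System.out.println("fetched " + rows.length + " rows");
      table.close();
    }
  }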
Hi,
Instead of the multi-valued fields, would a parent-child setup work for you here?
See http://search-lucene.com/?q=solr+join&fc_type=wiki
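For example, a join sketch (field names are placeholders):

  q={!join from=product_id to=id}vendor:fred

which selects the parent documents whose children match vendor:fred.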
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Thu, Jan 17, 2013 at 8:04 PM, David Parks wrote:
> The documents are individual products which
I'm trying to get a 2-node SolrCloud install off the ground with the 4.1
branch. This is a new project for a different system than my existing
Solr 3.5.0 setup. It will have one shard and two replicas.
I have part of the example in /opt/mbsolr4 -- jetty, the war file, logs,
etc. This is the
The documents are individual products which come from 1 or more vendors.
Example: a 'toy spiderman doll' is sold by 2 vendors, that is 1 document.
Most fields are multi-valued (short_description from each of the 2 vendors,
long_description, product_name, vendor, etc. the same).
I'd like to collaps
Thanks for your response! I appreciate it.
There will be cases where I want to "AND or OR" the query between HBase and
Lucene. Would it make sense to custom code querying both repositories at
the same time or sequentially? Or are there any tools out there to do
this?
Basically I'm thinking
I think fieldValueCache is not per segment, only fieldCache is. However,
unless I'm missing something, this cache is only used for faceting on
multivalued fields
On Thu, Jan 17, 2013 at 8:58 PM, Erick Erickson wrote:
> filterCache: This is bounded by (maxDoc / 8) bytes * (num filters in
> cache).
I think all you need to do is escape the hyphen, or have you tried that already?
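For example (the field name is a placeholder):

  q=itemNo:\-0004A-0436
  q=itemNo:"-0004A-0436"

Escaping (or quoting) the leading hyphen keeps it from being parsed as a
prohibited-clause operator.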
Best
Erick
On Thu, Jan 17, 2013 at 1:38 PM, geeky2 wrote:
> hello
>
> environment: solr 3.5
>
> problem statement:
>
> i have a requirement to search for part numbers that start with a dash /
> hyphen.
>
> example q
filterCache: This is bounded by (maxDoc / 8) bytes * (num filters in
cache). Notice the /8. This reflects the fact that each filter is
represented by a bitset over the _internal_ Lucene IDs, one bit per
document. UniqueId has no bearing here whatsoever. This is, in a nutshell,
why warming is required: the internal Lucene I
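A quick worked example: with maxDoc = 100M, each cached filter is a bitset of
100M / 8 = 12.5 MB, so a filterCache with 512 entries can grow to roughly
6.4 GB.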
Hmmm, maybe I'm finally getting it.
Right, that does seem odd. I would expect you to get 4x the number of
docs on any particular shard/replica in this situation.
What happens when you look at the Solr logs for each partition? You should
be able to glean the num results from the logs. I guess there are
Hi,
You certainly can do that, but you'll need to suck all data out of HBase
and index it in Solr first. And then presumably you'll want to keep the 2
more or less in sync via incremental indexing. Maybe the Lily project can
help? If not, you'll have to write something that scans HBase and indexes,
On 1/17/2013 2:01 PM, snake wrote:
I think you're not understanding the issue. Imagine www.acme.com has created a
collection.
This resides in d:\acme.com\wwwroot\collections
Then they decide to redo their website, or they get a new developer who
decides not to use collections, or they simply move h
Solr 4 most definitely ignores missing cores (just ran into that
accidentally again myself). So, if you start Solr and the directory is missing,
it will survive (but complain).
The other problem is what happens when a customer deletes the account and
the core directory disappears in the middle of open s
My knowledge of Solr is pretty limited; I have only been investigating this
in the last couple of days due to this issue.
The way Solr is implemented in ColdFusion is with a single core, so all
sites run under the same core. I presume cores are like multiple instances?
On Thu, Jan 17, 2013 at 9:03 P
You must have an Admin UI open and pointed at the Logging section. So, it
sends a ping to see if any new log entries were added.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events
I've been trying to figure this out on my own, but I've come up empty so
far. I need to boost documents from a certain provider. The idea is
that if any documents in a result match a separate query (like
provider:bigbucks), I need to multiply the score by X. It's important
that the result se
On Thu, Jan 17, 2013 at 3:40 PM, snake wrote:
> Ok, so is there any other way to stop this problem I am having, where any site
> can break Solr by deleting their collection?
> Seems odd everyone would vote to remove a feature that would make Solr more
> stable.
I agree.
abortOnConfigurationError was m
I think you're not understanding the issue. Imagine www.acme.com has created a
collection.
This resides in d:\acme.com\wwwroot\collections
Then they decide to redo their website, or they get a new developer who
decides not to use collections, or they simply move hosts, so they delete
the old one.
The
I keep seeing these in the tomcat logs:
Jan 17, 2013 3:57:33 PM org.apache.solr.core.SolrCore execute
INFO: [Lisa] webapp=/solr path=/admin/logging
params={since=1358453312320&wt=json} status=0 QTime=0
I'm just curious:
What is getting executed here? I'm not running any queries against this core
Or a different design.
You can mark collections for deletion, then delete them in an organized, safe
manner later.
wunder
On Jan 17, 2013, at 12:40 PM, snake wrote:
> Ok, so is there any other way to stop this problem I am having, where any site
> can break Solr by deleting their collection?
> Seems
Ok, so is there any other way to stop this problem I am having, where any site
can break Solr by deleting their collection?
Seems odd everyone would vote to remove a feature that would make Solr more
stable.
On 1/17/2013 12:38 PM, Chris Hostetter wrote:
: You're not only giving up the ability to monitor things, you're also giving up
: the ability to detect errors. All exceptions that get thrown by the internals
: of ConcurrentUpdateSolrServer are swallowed, your code will never know they
: happened
Hi Shawn,
"don't panic"
Due to 'historical' reasons, like comparing the different subclasses of
SolrServer, I have an HttpSolrServer for queries and commits. I've never
tried to use the CUSS for anything else than adding documents.
As I wrote, it was a home-made problem and not a bug. Sometime
Try my suggested field definition and see if it helps with faceting. It
should. Try it on a small example or a fake schema.
But I would still recommend escalating the problem up the chain to an
architect or similar. Because I bet that data is stored in multiple places
(e.g. in the database) and yo
@Alexandre Rafalovitch Thanks.
Yeah, you got my point.
training_skill:["c", "c++", "php", "java", ".net"]
but it is not possible for me to split "php,java,.net" because the data can
vary and the data is very large. I mean I have to perform this on 5 line data.
It might come ["c++,php,java",".net","c#,
I think the problem here is that the list has 3 values, but the last one is
actually a set of several as well. Anurag seems to want to be able to split them
into separate values whether they came as individual array items or as part
of a joint list. So, we have a mix of multiValue submission and a desire to
spl
: You're not only giving up the ability to monitor things, you're also giving up
: the ability to detect errors. All exceptions that get thrown by the internals
: of ConcurrentUpdateSolrServer are swallowed, your code will never know they
: happened. The client log (slf4j with whatever binding &
On 18 January 2013 00:31, anurag.jain wrote:
>
> [ { "last_name" : "jain", "training_skill":["c", "c++", "php,java,.net"]
> }
> ]
>
> actually I want to tokenize into c c++ php java .net
What do you mean by "tokenize" in this case? It has
been a while since I had occasion to use JSON input,
and
David,
What's the documents and the field? It can help to suggest workaround.
On Thu, Jan 17, 2013 at 5:51 PM, David Parks wrote:
> I want to configure Field Collapsing, but my target field is multi-valued
> (e.g. the field I want to group on has a variable # of entries per
> document,
> 1-N e
Snake,
It was killed in 4.0/trunk more than two years ago
https://issues.apache.org/jira/browse/SOLR-1846
"Setting abortOnConfigurationError==false has not worked for some time, and
based on a POLL of existing users, no one seems to need/want it,"
You might be in that rare case when it used to don
Actually [ { "last_name" : "jain", "training_skill": ["c", "c++",
"php,java,.net"] } ] - training_skill is a list, and if I want to store it in a
string field type then it will include [ and , also. So how do I avoid that? Or
will it not?
Or do you have any other field type definition through which my wor
No-no-no. Your implementation is as slow as result processing, due to using
stored fields.
The fast way is something like
org.apache.solr.schema.IntField.getValueSource(SchemaField, QParser).
It's worth checking how the standard functions are built - check the static
{} block in org.apache.solr.search.
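A minimal sketch of that fast path (Lucene/Solr 4.x; the "price" field is
made up):

  // inside a custom function, given the segment context and a local doc id;
  // IntFieldSource is backed by the FieldCache, so this is an array lookup,
  // not a stored-field read from disk
  float fastVal(AtomicReaderContext readerContext, int docNum) throws IOException {
    ValueSource vs = new IntFieldSource("price");
    FunctionValues vals = vs.getValues(new HashMap<Object, Object>(), readerContext);
    return vals.floatVal(docNum);
  }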
You mean to say that the problem is with the JSON which is being ingested.
What you are trying to achieve is to split the values on commas and
index them as multiple values.
What problem are you facing in indexing JSON in the format Solr expects? If you
don't have control over it,
[ { "last_name" : "jain", "training_skill":["c", "c++", "php,java,.net"] }
]
actually i want to tokenize in c c++ php java .net
so through this i can make them as facet.
but problem is in list
"training_skill":["c", "c++", *"php,java,.net"*]
You just need to make the field multivalued.
The type should be set based on your search requirements.
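For example, a sketch of a comma-splitting type (names are illustrative):

  <fieldType name="comma_list" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.PatternTokenizerFactory" pattern="\s*,\s*"/>
    </analyzer>
  </fieldType>
  <field name="training_skill" type="comma_list" multiValued="true"
         indexed="true" stored="true"/>

A value such as "php,java,.net" is then split into php / java / .net at index
time, which makes the individual skills usable as facets.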
On Thu, Jan 17, 2013 at 11:27 PM, anurag.jain wrote:
> my JSON file looks like
>
> [ { "last_name" : "jain", "training_skill":["c", "c++", "php,java,.net"] }]
>
> can you please suggest me how
Hello,
I have point data (lat/lon) stored in HBase/Hadoop and would like to query
the data spatially with polygons. (If I pass in a few polygons, find me all
the records that exist within those polygons. I need it to support polygons,
not just box queries.) Hadoop doesn't really have much support
hello
environment: solr 3.5
problem statement:
I have a requirement to search for part numbers that start with a dash /
hyphen.
example q= term: -0004A-0436
example query:
http://some_url:some_port/some_core/select?facet=false&sort=score+desc%2C+rankNo+asc%2C+partCnt+desc&start=0&q=*-0004A-
My JSON file looks like:
[ { "last_name" : "jain", "training_skill":["c", "c++", "php,java,.net"] }]
Can you please suggest how I should declare the field in the schema for the
"training_skill" field?
Please reply,
urgent
And if I want to search only three columns of my database, and want my
search results to show all the columns (even the ones that are not indexed,
because I don't want them to be searched), how can I do that?
On Thu, Jan 17, 2013 at 11:56 AM, hassan altaf wrote:
> Can anyone explain delta import sectio
Hi Mikhail,
Thanks for the info.
If my FunctionQuery accesses stored fields like this:
public float floatVal(int docNum) {
  // reads the *stored* document for every matched doc
  Document doc = null;
  try {
    doc = reader.document(docNum);
  } catch (Exception e) {
    // swallowing the exception leaves doc null
  }
  return getSimilarityScore(doc);
}
Is it still the same case? Is there a faster way
Hello John,
> getting all the documents and analyzing their result fields?
is almost never viable. Lucene stored fields are usually really slow.
When a FunctionQuery is backed by field values it uses the Lucene FieldCache,
which is an array of field values and is much faster.
You are welcome.
O
Hi,
Is there any performance boost when using FunctionQuery over getting all
the documents and analyzing their result fields?
As far as I understand, FunctionQuery does exactly that: for each matched
document it fetches the fields you're interested in, and then it calculates
whatever score mechan
Hello,
Here is another one from the other day:
http://search-lucene.com/m/tqmNjXO51B/SolrCloud+Performance+for+High+Query+Volume
Am I the only one seeing people reporting this? :)
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Mon, Jan 14, 2013 at 10:55 PM, Otis Gospodnetic <
Hi Erick,
It looks like we are saying the exact same thing but with different terms ;)
I looked at the Solr glossary and you might be right... maybe I should talk
about partitions instead of shards.
Since my last message, I've configured the replication between the master and
slave and everythi
I'm currently running Solr 4.0 final on Tomcat v7.0.34, with ManifoldCF v1.2
dev running on Jetty.
I have Solr multicore set up with 10 cores. (Is this too many?)
So I also have at least 10 connectors set up in ManifoldCF (1 per core, 10
JVMs per connection).
From the look of it, Solr couldn't ha
On 1/16/2013 11:22 PM, Cool Techi wrote:
We have an index of approximately 400GB in size; indexing 5000 documents was
taking 20 seconds. But lately, the indexing is taking very long; committing the
same amount of documents is taking 5-20 mins.
On checking the logs I can see that there are frequen
I'd think adding a new response attribute would be more flexible and
powerful, thinking about clients, UIs, etc.
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Thu, Jan 17, 2013 at 10:15 AM, Tomás Fernández Löbbe <
tomasflo...@gmail.com> wrote:
> But Amit is right, when you use
But Amit is right: when you use group.main, the number of groups is not
displayed, even if you set group.ngroups.
I think in this case numFound should display the number of groups instead
of the number of docs matching. Another option would be to keep "numFound" as
the number of docs matching and add
ashimbose,
It is possible that this is happening because Solr reaches a point where
it is doing so many simultaneous merges that ongoing indexing is stopped
until a huge merge finishes. This causes the JDBC driver to time out
and disconnect, and there is no viable generic way to recover from
Hi David,
I think this is where search analytics can help. If your intuition is
right and people who search for "doll" are not actually searching for "doll
face..." CD, then search analytics will confirm that. This analytics I'm
talking about involves search and click tracking and analysis. Onc
On 1/17/2013 3:32 AM, Uwe Reh wrote:
one entry in my long list of self-made problems is:
"Done the commit before the ConcurrentUpdateSolrServer was finished."
Since the ConcurrentUpdateSolrServer is asynchronous, it's very easy to
create a race condition. Make sure that your program is waiting
There's a parameter to enable that. :D
In SolrJ:
solrQuery.setParam("group.ngroups", true);
http://wiki.apache.org/solr/FieldCollapsing
Similar thoughts: I used unit tests to explore that issue with SolrJ,
originally encoding with ClientUtils. The returned results had "|" in
many places in the text, with no clear way to un-encode. I eventually
ran some tests with no encoding at all, including strings like
"hello & goodbye"; such strin
Here is what it says on the Solr info page:
Solr Specification Version: 1.4.0.2009.11.18.10.19.05
Solr Implementation Version: 1.4.1-dev exported - kvinu - 2009-11-18
10:19:05
Lucene Specification Version: 2.9.1
Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25
On Thu, Jan 17,
Hi,
That's a juicy index. Is this on a single server? Have you considered
sharding it and thus spreading the indexing work over multiple servers,
disks, etc.?
You could increase ramBufferSizeMB, which will help a bit with indexing
speed, but not with actual merging.
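For example, in solrconfig.xml (the value is only an illustration):

  <ramBufferSizeMB>256</ramBufferSizeMB>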
Otis
--
Solr & ElasticSearch
Hi,
It looks like this is the cause:
JBC0016E: Remote call failed
(return code=-2,220). SDK9019E: internal error SDK9019X:
Interestingly, Google gives just 1 hit for the above as query - your post.
But it seems you should look up what the above codes mean first...
Otis
--
Solr & ElasticSe
I want to configure Field Collapsing, but my target field is multi-valued
(e.g. the field I want to group on has a variable # of entries per document,
1-N entries).
I read on the wiki (http://wiki.apache.org/solr/FieldCollapsing) that
grouping doesn't support multi-valued fields yet.
Anything in
You're still confusing shards (or at least mixing up the terminology)
with simple replication. Shards are when you split up the index into
several sub indexes and configure the sub-indexes to "know about each
other". Say you have 1M docs in 2 shards. 500K of them would go on one
shard and 500K on t
You could write a custom Filter (or perhaps Tokenizer), but I usually
just do it on the input side before things get sent to Solr.
I don't think PatternReplaceCharFilterFactory will help; you could
easily turn the input into original:original, but then you'd need to
write a custom filter that norm
I will explain the scenario just to avoid all the potential replies asking
why.
We run ColdFusion servers (Windows) which have Solr built in (running on
Jetty).
A customer creates a collection which is stored within their own webspace;
they only have read/write access to their own webspace so canno
Hello,
QTime counts only searching and filtering, not writing the response, which
includes retrieving the stored fields (&fl=...). So, it's quite reasonable.
On Thu, Jan 17, 2013 at 7:09 AM, 张浓飞 wrote:
> I have a Solr website with about 500 docs (30 fields defined in the schema),
> and a C# cli
Hi,
I have some problems related to URL encoding.
I'm using Solr 3.6.1 on a Windows (32 bit) system.
Apache Tomcat is version 6.0.36.
I'm accessing Solr through solrj-3.3.0.
When using the Solr admin and specifying my request, the URL looks like
this (${SOLR} is there for the sake of brevity):
Hi Mark,
one entry in my long list of self-made problems is:
"Done the commit before the ConcurrentUpdateSolrServer was finished."
Since the ConcurrentUpdateSolrServer is asynchronous, it's very easy to
create a race condition. Make sure that your program is waiting ()
before it's doing the c
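A minimal sketch of that ordering (SolrJ 4.x; the URL and queue sizes are
placeholders):

  ConcurrentUpdateSolrServer server =
      new ConcurrentUpdateSolrServer("http://localhost:8983/solr", 1000, 4);
  server.add(docs);             // queued and sent asynchronously
  server.blockUntilFinished();  // drain the queue before committing
  server.commit();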
Some questions:
What version of Solr?
Has the number of documents in your index changed in the meantime?
How many before, how many now?
How does maxdocs compare to numdocs?
Has this system ever been upgraded from an older Solr?
Is it committing that is taking that long, or opening a searcher one