Hi,
A while back, I had the same 'problem'. After solving it for myself, I
built and distributed a combination of Solr and Nutch into a pre-configured
environment. While what I did was specific to Windows (I included Cygwin in
the distribution, and a bunch of other stuff for easy administration of
On Nov 9, 2012, at 1:20 PM, shreejay wrote:
> Instead of doing an optimize, I have now changed the Merge settings by
> keeping a maxBuffer = 960, a merge Factor = 40 and ConcurrentMergePolicy.
Don't you mean ConcurrentMergeScheduler?
Keep in mind that if you use the default TieredMergePolicy,
Hi
we have 20 million short docs (about 60 terms, less than 1 KB in total
each) on each box, and we wanted to rank results based only on how many
terms matched. In particular, we are only interested in the top N with
the best scores (say a small number like 5).
With some help from the forum users
Have you looked at your logs? I think at around 1000 collections, the
clusterstate.json node will become too large for zookeeper by default. It has a
default limit of 1MB per node - you should be able to raise/override that limit
with a sys prop or something when starting zookeeper. I can't reme
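For reference, the per-znode limit being described is ZooKeeper's
jute.maxbuffer system property (in bytes, roughly 1 MB by default), and it
has to be raised consistently on the ZooKeeper servers and on any clients
that read the node. A sketch, assuming the stock zkServer.sh start script
(which honors JVMFLAGS):

```shell
# Raise ZooKeeper's per-znode size limit to 4 MB (default is ~1 MB).
# jute.maxbuffer must be set on the servers AND on clients reading the node.
export JVMFLAGS="-Djute.maxbuffer=4194304"
bin/zkServer.sh start
```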
Please file a JIRA issue for this change.
- Mark
On Nov 9, 2012, at 8:41 AM, Trym R. Møller wrote:
> Hi
>
> The constructor of SolrZKClient has changed, I expect to ensure clean up of
> resources. The strategy is as follows:
> connManager = new ConnectionManager(...)
> try {
>...
> } catc
Yeah, if you want to use a new config set when you dynamically create a new
collection, you must first upload the new config set. It's pretty easy using
the cloud-scripts/zkcli.sh|bat scripts.
If someone likes the idea of being able to point to a new config set to upload
when using the collecti
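For anyone following along, the flow looks roughly like this (host, paths,
and the config/collection names are illustrative):

```shell
# 1. Upload the new config set to ZooKeeper using the bundled script:
cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd upconfig \
    -confdir /path/to/myconf -confname myconf

# 2. Create the collection against the uploaded config set:
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&collection.configName=myconf"
```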
I think I may have found my answer, but I'd like additional validation:
I believe that I can add a function to my query to get only the highest
values of 'file_version' like this -
_val_:"max(file_version, 1)"
I seem to be getting the results I want. Does this look correct?
Regards,
Tim
Hm, OK, I'm just leaving work now; next week I'll try what you suggest
and give you feedback.
Meanwhile, thank you very much for your help.
On Fri, Nov 9, 2012 at 6:30 PM, Tomás Fernández Löbbe wrote:
> I thought it was possible to upload a new configuration when creating a new
> collection
I thought it was possible to upload a new configuration when creating a new
collection through the Collections API, but it looks like the CREATE action
only takes:
replicationFactor
name
collection.configName
numShards
I think this means that you'll have to use an existing configuration
(already u
Howdy,
I have a Solr query that is almost perfect:
http://localhost:8080/apache-solr-4.0.0/v3_tag_core/select?q=tag%3A%22coat%22%5E4+%22coat%22+cid%3A136+&sort=score+desc&rows=10&fl=id+tag+cid+file_version+lang+score&wt=json&indent=true&debugQuery=true
It's grabbing data that includes the fields:
Hi, about the port, that's my mistake, I have the wrong port specified in
solr.xml.
But, now, I got the following error:
17:37:10,358 WARN
[com.datasul.technology.webdesk.indexer.engine.IndexerSearchEngine]
(http--0.0.0.0-8080-6) Fail uptading indexer synonyms/stopwords list.
17:37:10,378 INFO
Also, JBoss AS uses Tomcat, right? You may want to look at Mark Miller's
comments here:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201210.mbox/%3ccabcj++j+am6e0ghmm+hpzak5d0exrqhyxaxla6uutw1yqae...@mail.gmail.com%3E
On Fri, Nov 9, 2012 at 4:30 PM, Tomás Fernández Löbbe wrote:
> D
Do you have a stacktrace of the error you are getting? When Zookeeper runs
embedded (when you are using -DzkRun), it runs on [solr port]+1000. In
the example Jetty setup, Solr runs at 8983, and so zk runs at 9983; in your case it
should be using 9080.
Which Solr instance is the one that can't connect to
Hi Tomás, thanks for your help.
I changed the start command to:
JAVA_OPTS="-DzkRun -DnumShards=2 -Dbootstrap_conf=true -Xmx2048m
-XX:MaxPermSize=512m" ./standalone.sh
Then, I tried to add a new core like this:
http://localhost:8080/ecm-indexer/admin/collections?action=CREATE&name=2&numShards=2
&boot
Another option is to use HTTP auth, which would involve modifying
web.xml in the Solr WAR and configuring a user in your container.
Unfortunately, this won't work with distributed queries.
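A sketch of what that web.xml change typically looks like (standard
servlet-container BASIC auth; the role and realm names here are made up,
and the matching user/role must be defined in the container, e.g.
Tomcat's tomcat-users.xml):

```xml
<!-- Added to the Solr WAR's web.xml: require the (illustrative)
     "solr-admin" role for every request, using HTTP BASIC auth. -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>solr-admin</role-name>
  </auth-constraint>
</security-constraint>
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Solr realm</realm-name>
</login-config>
<security-role>
  <role-name>solr-admin</role-name>
</security-role>
```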
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Flo
Why not just do the join in the DB via your initial query? With a
sub-entity you'll be executing 1 query per *each* ID in your list, which
is expensive. If you just have your query do the joins up front, then
each row could be a complete (or nearly complete) document.
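To illustrate the difference in DataImportHandler data-config terms
(table and column names here are made up):

```xml
<!-- N+1 pattern: one sub-entity query fired per parent row (slow). -->
<entity name="item" query="SELECT id, name FROM item">
  <entity name="detail"
          query="SELECT color FROM detail WHERE item_id = '${item.id}'"/>
</entity>

<!-- Join up front: a single query returns complete documents. -->
<entity name="item"
        query="SELECT i.id, i.name, d.color
               FROM item i LEFT JOIN detail d ON d.item_id = i.id"/>
```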
On Thu, Nov 8, 2012 at 9:31 A
I think you have to use either bootstrap_conf=true or
"bootstrap_confdir=/path/to/conf"+"collection.configName=foo" (not both at
the same time). If you use the first one, Solr will upload the
configuration for all the cores that you have configured (with the name of
the core as the name of the configur
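In other words, either of these, but not both at once (paths and the
config name are examples, shown with the example Jetty start):

```shell
# Option 1: upload the config of every configured core,
# each named after its core:
java -DzkRun -Dbootstrap_conf=true -jar start.jar

# Option 2: upload one named config directory:
java -DzkRun -Dbootstrap_confdir=/path/to/conf \
     -Dcollection.configName=foo -jar start.jar
```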
Thanks Erick. I will try optimizing after indexing everything. I was
doing it after every batch since a full optimize was taking way too long
(which was expected), but it was not finishing the merge into a smaller
number of segments (1 segment).
Instead of doing an optimize, I have now changed the M
Maybe you just want to use the whitespace tokenizer - the standard
tokenizer treats the at-sign as if it were a space.
See:
http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/core/WhitespaceTokenizerFactory.html
Or, you could use the "classic" tokenizer which does keep ema
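A minimal schema.xml sketch of the whitespace-tokenized approach (the
field type name is illustrative):

```xml
<!-- Keeps "user@example.com" as a single token, since only whitespace
     splits the input; the lowercase filter is optional but common. -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```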
Lee,
I guess my question was if you are trying to prevent the "big bad world"
from doing stuff they aren't supposed to in Solr, how are you going to
prevent the big bad world from POSTing a "delete all" query? Or restrict
them from hitting the admin console, looking at the schema.xml,
solrconfig.x
Actually, I want to use it with multiple cores, and my app dynamically
adds cores to Solr.
So, my solr.xml looks like this:
So, my solr.home is jboss.home/solr, which is represented by the dot in
the instanceDir setting.
My solr.home has the following files:
conf/
-stopwords.txt
--
Are you sure you are pointing to the correct conf directory? It sounds
like you are missing the collection name in the path (maybe it should be
../solr/YOURCOLLECTIONNAME/conf?)
On Fri, Nov 9, 2012 at 1:58 PM, Carlos Alexandro Becker
wrote:
> I started my JBoss server with the following command:
>
Oh weird. I'll post URLs on their own lines next time to clarify.
Thanks guys and looking forward to any feedback!
Cheers
Amit
On Fri, Nov 9, 2012 at 2:05 AM, Dmitry Kan wrote:
> I guess the url should have been:
>
>
> http://hokiesuns.blogspot.com/2012/11/using-solrs-postfiltering-to-collect
Hi Jack,
We have an email field defined like this:-
A query like [emailAddress : bob*] would match b...@bob.com, but queries
Hi Trym
I believe one of the reasons that they started throwing
RuntimeExceptions instead of UnknownHostException, TimeoutException, etc.
is that the method signature has changed to not have a "throws" part.
They probably do not want to deal with those checked exceptions. I'm not
sure I completel
What I'm saying is if you specify "spellcheck.maxCollationTries", it will run
the suggested query against the index for you and only return valid re-written
queries. That is, a misspelled firstname will be replaced with a valid
firstname; a misspelled lastname will be replaced with a valid las
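The relevant request parameters look something like this (the handler
path, query, and values are illustrative):

```shell
# Ask the spellchecker to re-run candidate rewrites against the index and
# return only collations that actually match documents:
curl "http://localhost:8983/solr/spell?q=firstname:jame&spellcheck=true&spellcheck.collate=true&spellcheck.maxCollationTries=5&spellcheck.maxCollations=3"
```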
Here are things I would try:
- You need to package the patch from SOLR-2943 in your jar as well as SOLR-2613
(to get the class DIHCachePersistCacheProperties)
- You need to specify "cacheImpl", not "persistCacheImpl"
- You are correct using "persistCacheName" & "persistCacheBaseDir" , contra the
Hi all,
I am using the /terms request handler defined in the default
configuration with solr 3.6.1:
<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>
When issuing a normal request to this request handler it is working as
expected.
However, when I'm trying to issue a distributed search requ
Hi
The constructor of SolrZKClient has changed, I expect to ensure cleanup
of resources. The strategy is as follows:
connManager = new ConnectionManager(...)
try {
...
} catch (Throwable e) {
connManager.close();
throw new RuntimeException();
}
try {
connManager.waitForConnec
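The cleanup strategy described above can be sketched as follows. This is
a stand-in illustration of the pattern, not the actual SolrZkClient code;
the ManagedResource class and all names are made up:

```java
// Sketch of the constructor-cleanup pattern described above.
// ManagedResource is a made-up stand-in for the real ConnectionManager.
class ManagedResource {
    boolean closed = false;

    void close() {
        closed = true;
    }
}

class Client {
    final ManagedResource connManager;

    // If anything is thrown after the resource is acquired, close it
    // before propagating, so a failed constructor never leaks a live
    // resource. The checked exception ends up wrapped in RuntimeException.
    Client(ManagedResource resource, boolean failDuringSetup) {
        connManager = resource;
        try {
            if (failDuringSetup) {
                throw new IllegalStateException("simulated connect failure");
            }
        } catch (Throwable e) {
            connManager.close();
            throw new RuntimeException(e);
        }
    }
}
```

The same close-then-rethrow step would be repeated around each subsequent
blocking call (such as waiting for the connection), which is why the
checked exceptions no longer appear in the signature.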
Hi Amit
I did not do this via a servlet filter because I wanted the Solr devs to
be concerned with Solr config and kept out of any container concerns.
Specifying declarative data in a request handler is enough to produce a
service URI for an application.
Or have I missed a p
Hi Doug,
Retrieval Engines are not designed for deep paging (very large start
parameter). https://issues.apache.org/jira/browse/SOLR-1726
And your sort syntax is wrong. &sort:id
It should be &sort=id asc
--- On Fri, 11/9/12, Doug Kunzman wrote:
> From: Doug Kunzman
> Subject: sort on wild c
Yes ku3ia, I read your thread yesterday, and it looks like we are
hitting the same issue. I hope ApacheCon wraps up soon so the experts
can look into this.
Thanks again to the Solr community,
Jul
--
View this message in context:
http://lucene.472066.n3.nabble.com/Testing-Solr-Cloud-with-ZooKeeper-tp4018900p4019271
Hi -
We are using Solr 3.6 and have noticed that when the start parameter is
a very large number, Solr's performance is rather slow.
After looking at our schema, I was hoping to speed up Solr by adding a
sort order, since it could use an indexed column.
This hasn't worked. I was wonderi
(12/11/09 19:20), mechravi25 wrote:
Hi All,
I'm using Solr version 3.6.1. For the issue given in the following URL,
there is no patch file provided:
https://issues.apache.org/jira/browse/SOLR-3790
Can you tell me if there is a patch file for it?
Also, we noticed that the below URL had the c
Hi All!
Solr provides support for newSearcher events, but those are dispatched
before the new searcher becomes the current one.
Is it possible to add some code that would be called whenever the new
searcher starts to serve requests?
Thanx,
Hi, I have nearly the same problems with cloud state;
see
http://lucene.472066.n3.nabble.com/Replicated-zookeeper-td4018984.html
--
View this message in context:
http://lucene.472066.n3.nabble.com/Testing-Solr-Cloud-with-ZooKeeper-tp4018900p4019264.html
Sent from the Solr - User mailing list archi
- Shards : 2
- ZooKeeper Cluster : 3
- One collection.
Here is how I run it, and my scenario:
In the first console, I start the first node (first shard) on port 8983:
In the second console, I start the second node (second shard) on port 8984:
Here I get just 2 nodes for my 2 shards runn
Hi All,
I'm using Solr version 3.6.1. For the issue given in the following URL,
there is no patch file provided:
https://issues.apache.org/jira/browse/SOLR-3790
Can you tell me if there is a patch file for it?
Also, we noticed that the below URL had the changes that had to be done to
resolve
I guess the url should have been:
http://hokiesuns.blogspot.com/2012/11/using-solrs-postfiltering-to-collect.html
i.e. without the 'and' at the end of it.
-- Dmitry
On Fri, Nov 9, 2012 at 12:03 PM, Erick Erickson wrote:
> It's always good when someone writes up their experiences!
>
> But when I tr
It's always good when someone writes up their experiences!
But when I try to follow that link, I get to your "Random Writings", but it
tells me that the blog post doesn't exist...
Erick
On Thu, Nov 8, 2012 at 4:21 PM, Amit Nithian wrote:
> Hey all,
>
> I wanted to thank those who have helped
Hi James,
Yes, that parameter was what made the request fail.
I've edited the patch and added the new version to JIRA.
Thank you.
2012/11/7 Dyer, James
> Try specifying the "escape" parameter. This is the character your file
> uses to escape delimiters occuring in the data. If this fixe
You really should be careful about optimizes; they're generally not
needed. And optimizing is almost always wrong when done after every N
documents in a batch process. Do it at the very end or not at all.
Optimizing essentially re-writes the entire index into a single segment,
so you're copying aroun
If this went away when you made your "id" field into a string type rather
than analyzed then it's probably not worth a JIRA...
Erick
On Thu, Nov 8, 2012 at 11:39 AM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
> Looks like a bug. If Solr 4.0, maybe this needs to be in JIRA along with
You have to have at least one node per shard running for SolrCloud to
function. So when you bring down all nodes and start one, you have some
shards with no live nodes, and SolrCloud goes into a wait state.
Best
Erick
On Thu, Nov 8, 2012 at 6:17 PM, darul wrote:
> Is it same issue as one d
Robert, Tom,
That's it indeed! Using maxDoc as the numerator as opposed to docCount yields very
skewed results for an unevenly distributed multi-lingual index. We have one
language dominating the other twenty so the dominating language contains no
rare terms compared to the others.
We're now checking
Correct me if I am wrong, but wouldn't collation return alternate terms
against the master dictionary field?
So if I were to take a collated term and run a query for that term
against a specific field (say First Name), I am not guaranteed to get
back results, since that term could actually have been