How are you posting the XML? A "missing content stream" error means that the
POST data is missing.
On Wed, Mar 25, 2009 at 7:03 PM, Rui Pereira wrote:
> I'm trying to delete documents based on the following type of update
> requests:
> <delete><query>topologyid:3140</query><query>topologyid:3142</query></delete>
>
> This doesn't cause any changes on the index.
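If you are posting with curl, a minimal sketch that actually sends the delete
as the POST body (host and core path are illustrative):

  curl 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' \
       --data-binary '<delete><query>topologyid:3140</query></delete>'

Without --data-binary (or an equivalent POST body), Solr sees no content
stream and reports exactly this kind of error.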
Right now a cron job is the only option.
Building this into DIH has been a common request.
What do others think about this?
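A minimal crontab sketch for the Sunday-at-7am case (assuming DIH is
registered at /dataimport on the default port):

  0 7 * * 0 curl -s 'http://localhost:8983/solr/dataimport?command=delta-import'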
On Thu, Mar 26, 2009 at 10:11 AM, Tricia Williams
wrote:
> Hello,
>
> Is there a best way to schedule the DataImportHandler? The idea being to
> schedule a delta-import
Hello,
Is there a best way to schedule the DataImportHandler? The idea
being to schedule a delta-import every Sunday morning at 7am or perhaps
every hour without human intervention. Writing a cron job to do this
wouldn't be difficult. I'm just wondering: is this a built-in feature?
Tricia
take a look at this
http://wiki.apache.org/solr/SolrPerformanceFactors#head-4ea89b13099bdaf11d82e54303d2408220c12f22
On Wed, Mar 25, 2009 at 4:07 AM, sunnyfr wrote:
>
> Hi,
> Sorry, I still don't know what I should do.
> I can see in my log that it clearly optimizes somewhere, even if my command i
Actually solr2 is an application other than the default one (example) on which
I have configured my application.
Let me explain things in more detail:
my application path is http://localhost:8983/solr2/admin and I would like
to configure it for multiple cores, so I have placed solr.xml in the config
dir
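For reference, a minimal solr.xml sketch for a multicore setup (core names and
paths are illustrative); note that solr.xml is normally read from the Solr
home directory rather than a conf directory:

  <solr persistent="true">
    <cores adminPath="/admin/cores">
      <core name="core0" instanceDir="core0" />
      <core name="core1" instanceDir="core1" />
    </cores>
  </solr>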
Hi Alex, you may be able to use CachedSqlEntityProcessor. You can do a
delta-import using a full-import:
http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta
The inner entity can use a CachedSqlEntityProcessor.
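A sketch of what the data-config might look like (table and column names are
illustrative; the where attribute is how CachedSqlEntityProcessor matches
cached rows against the parent entity):

  <entity name="item" query="select id, name from item">
    <entity name="detail" processor="CachedSqlEntityProcessor"
            query="select item_id, body from detail"
            where="item_id=item.id"/>
  </entity>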
On Thu, Mar 26, 2009 at 1:45 AM, AlexxelA wrote:
>
> Yes my database is remot
Hi,
I'm not sure if anyone will be able to help without more detail. First
suggestion would be to look at Solr with a debugger/profiler to see where
memory is used up.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: smock
> To: solr-us
Hi,
Yes, you can use Solr for this, but index partitioning should be done outside
of Solr. That is, your app will need to know where to send each doc based on
its timestamp, when and where to create new index (new Solr core), and so on.
Similarly, deleting older than N days is done by you, u
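A shell sketch of the routing this implies (the core naming scheme and paths
are illustrative, assuming one core per day and the CoreAdmin handler at
/solr/admin/cores):

  DAY=$(date +%Y%m%d)
  # create today's core if it doesn't exist yet
  curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=day-$DAY&instanceDir=day-$DAY"
  # send today's documents to today's core
  curl "http://localhost:8983/solr/day-$DAY/update" -H 'Content-Type: text/xml' --data-binary @docs.xml
  # drop a core that has aged out
  curl "http://localhost:8983/solr/admin/cores?action=UNLOAD&core=day-20090301"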
Hi,
Without knowing the details, I'd say keep it in the same index if the
additional information shares some/enough fields with the main product data,
and in a separate index if it's sufficiently distinct (this also means 2
queries and manual merging/joining).
Otis
--
Sematext -- http://sematext.com/
My question is - from a design and query-speed point of view, should I add
a new core to handle the additional data or should I add the data to
the existing core?
Do you ever need to get results from both sets of data in the same
query? If so, putting them in the same index will be faster. If
Actually, what I meant was: if there are 100 indexed fields, then there are
100 facet fields, right?
So whenever I create a SolrQuery, I have to call addFacetField("fieldName")
for each one. Can I avoid this and just get all facet fields?
Sorry for the confusion.
Thanks again,
Ashish
Shalin Shekhar Mangar wrote:
>
I've a question. Is it safe to use 'localhost' as solr_hostname in
scripts.conf?
--
-Tim
Hi All,
In my project, I have one primary core containing all the basic
information for a product.
Now I need to add additional information which will be searched and displayed
in conjunction with the product results.
My question is - from a design and query-speed point of view, should I add a
new core to handle the additional data or should I add the data to the
existing core?
I implemented OAI-PMH for solr a few years back for the Massachusetts
library system... it appears not to be running right now, but
check... http://www.digitalcommonwealth.org/
It would be great to get that code revived and live open source
somewhere. As is, it uses a pre 1.3 release tha
I set the autowarm to 2000, which only takes about two minutes and resolves
my issues.
Thanks for your help!
best,
cloude
On Wed, Mar 25, 2009 at 9:34 AM, Ryan McKinley wrote:
> It looks like the cache is configured big enough, but the autowarm count is
> too big to have good performance.
>
>
Yes, my database is remote, MySQL 5, and I'm using Connector/J 5.1.7. My index
has 2 documents. When I try to do, let's say, 14 updates it takes about 18
sec total. Here's the resulting log of the operation:
2009-03-25 15:53:57 org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Ti
Try using a DB for permission management: when you want to make a rep public,
you just have to add its id or name to every user's permissions field. I think
you don't need to add any "is_public" field to the index, just an id or name
field saying which repository the indexed doc is in. So you can pre-filter the
reps by querying the
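In query terms, the pre-filter would look something like this (field and
repository names are illustrative):

  http://localhost:8983/solr/select?q=user+input&fq=repository:(repo1+OR+repo2)

where repo1 and repo2 are the repositories the DB says the current user may
access.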
Hi,
I've used Lucene before, but new to Solr. I've gone through the
mailing list, but unable to find any clear idea on how to partition
Solr indexes. Here is what we want,
1) Be able to partition indexes by timestamp - basically partition
per day (create a new index directory every day)
2)
Hello there,
I'm looking for a way to implement SRW/U and a OAI-PMH servers over solr,
similar to what i have found here:
http://marc.info/?l=solr-dev&m=116405019011211&w=2 . Well actually if it is
decoupled (not a plugin) would be ok, if not better =).
I wanted to know if anyone knows if there i
Would it not make more sense to wait for Lucene's IW+IR marriage and other
things happening in core Lucene that will make near-real-time search possible?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: John Wang
> To: solr-user@lucen
I think it's the latter. I don't think the term interval is exposed anywhere.
If you expose it through the config and provide a patch, I think we can add
this to the core quickly.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: "Burton-
OK, we're getting closer. I just have two final questions regarding this then:
1. This would also include all the public repositories, right? If so,
how would such a query look? Some kind of is_public:true AND ...?
2. When a repository is made public, the is_public property in the
Solr index need
Hello,
After running a nightly release from around January of Solr for about 4
weeks without any problems, I'm starting to see OutofMemory errors:
Mar 24, 2009 1:35:36 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util
OK, now I'll turn it over to the folks who actually maintain that site.
Meanwhile, here's the link to the 2.4.1 query syntax.
http://lucene.apache.org/java/2_4_1/queryparsersyntax.html
Best
Erick
On Wed, Mar 25, 2009 at 2:00 PM, nga pham wrote:
> http://lucene.apache.org/solr/tutorial.html#G
Hi Jon:
We are running various LinkedIn search systems on Zoie in production.
-John
On Thu, Feb 19, 2009 at 9:11 AM, Jon Baer wrote:
> This part:
>
> The part of Zoie that enables real-time searchability is the fact that
> ZoieSystem contains three IndexDataLoader objects:
>
>* a RAMLuc
http://lucene.apache.org/solr/tutorial.html#Getting+Started
The "lucene QueryParser syntax" link is not working.
On Wed, Mar 25, 2009 at 10:48 AM, nga pham wrote:
> Oops my mistake. Sorry for the trouble
>
> On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson <
> erickerick...@gmail.com> wrote:
>
>>
Hello all,
We are experimenting with the ShingleFilter with a very large document set (1
million full-text books). Because the ShingleFilter indexes every word pair as
a token, the number of unique terms increases tremendously. In our experiments
so far the tii and tis files are getting very l
Oops my mistake. Sorry for the trouble
On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson wrote:
> Which links? Please be as specific as possible.
>
> Erick
>
> On Wed, Mar 25, 2009 at 1:20 PM, nga pham wrote:
>
> > Hi
> >
> > Some of the getting-started links don't work. Can you please enable them?
Which links? Please be as specific as possible.
Erick
On Wed, Mar 25, 2009 at 1:20 PM, nga pham wrote:
> Hi
>
> Some of the getting-started links don't work. Can you please enable them?
>
Otis,
Absolutely. Here are the tokenizers and filters for the "text" fieldtype in
the schema. http://pastebin.com/f2bb249f3
Thanks!
That's what I suspected. Want to paste the relevant tokenizer+filters
sections of your schema? The index-time and query-time analysis has to be
the same or compatible enough, and that's not the case here.
OK, so you can create a table in a DB where you have a row for each user and a
field with the reps he/she can access. Then you just have to take a look at
the DB and include the repository name in the index. So you just have to
control (using query parameters) that the query is done over the right reps fo
Hm, I must be missing something, then.
Consider this.
There are three repositories: A, B, and C. There are two users, U1 and U2.
Repository A is public, while B and C are private. Only U1 can access
B. No one can access C.
I index this data, such that Is_Private is true for B.
Now, when U2 sear
Hi
Some of the getting-started links don't work. Can you please enable them?
That's what I suspected. Want to paste the relevant tokenizer+filters sections
of your schema? The index-time and query-time analysis has to be the same or
compatible enough, and that's not the case here.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Mess
Otis, that very much looks like what I'm after.
Curtis
> -Original Message-
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> Sent: Wednesday, March 25, 2009 12:53 PM
> To: solr-user@lucene.apache.org
> Subject: Re: REST interface for Query
>
>
> Curtis,
>
> Like this?
>
I can't see the problem with that. You can manage your users using a DB and
keep the permissions they have there, and create or delete users
without problems. You just have to manage a "working index" field for each
user with the repositories' ids he can access. Or you can create several
indexes an
Otis:
Okay, I'm not sure whether I should be including the quotes in the query
when using the analyzer, so I've run it both ways (no quotes on the index
value). I'll try to approximate the final "tables" returned for each term:
The field is dc_subject in both cases, being of type "text"
***
V
You can even create separate indexes for private and public access if you need
to (and place them on separate machines), but I think Eric's suggestion is the
best and easiest.
On Wed, Mar 25, 2009 at 5:52 PM, Jesper Nøhr wrote:
> Hi list,
>
> I've finally settled on Solr, seeing as it has almost every
On Wed, Mar 25, 2009 at 5:57 PM, Eric Pugh
wrote:
> You could index the user name or ID, and then in your application add the
> username as a filter as you pass the query to Solr. Maybe have an
> access_type that is Public or Private, and then for public searches only
> include the ones that meet the access_type of Public.
You could index the user name or ID, and then in your application add
the username as a filter as you pass the query to Solr. Maybe have
an access_type that is Public or Private, and then for public searches
only include the ones that meet the access_type of Public.
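As a sketch, the filters might look like this (field names and values are
illustrative):

  q=user+input&fq=access_type:Public
  q=user+input&fq=access_type:Public OR user_id:u1   (for a logged-in user u1)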
Eric
On Mar 25, 200
Curtis,
Like this?
https://issues.apache.org/jira/browse/SOLR-839
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: "Olson, Curtis B"
> To: solr-user@lucene.apache.org
> Sent: Wednesday, March 25, 2009 12:28:35 PM
> Subject: REST interfac
Hi list,
I've finally settled on Solr, seeing as it has almost everything I
could want out of the box.
My setup is a complicated one. It will serve as the search backend on
Bitbucket.org, a mercurial hosting site. We have literally thousands
of code repositories, as well as users and other data.
It looks like the cache is configured big enough, but the autowarm
count is too big to have good performance.
Try something smaller and see if that fixes both problems. I imagine
even just warming the most recent 100 queries would precache the most
important ones, but try some higher numbe
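For reference, the corresponding solrconfig.xml entry would be a sketch like
this (sizes are illustrative; the point is an autowarmCount much smaller than
the cache size):

  <filterCache class="solr.LRUCache"
               size="40000"
               initialSize="4096"
               autowarmCount="100"/>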
Greetings, I am a new subscriber. I'm Curtis Olson and I work for CACI
under contract at the U.S. Department of State, where we deal with
massive quantities of documents, so Solr is ideal for us.
We have a good sized index that we are starting to build up in
development. Some of the filter
Hi,
If you want to fill up the new cache set the autowarmCount to something high
(e.g. same number as the cache size), but be prepared to pay the price in
warmupTime and thus hit those onDeckSearchers warming again.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Ori
Hi,
Take the whole string to your Solr Admin -> Analysis page and analyze it. Does
it get analyzed the way you'd expect?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Kurt Nordstrom
> To: solr-user@lucene.apache.org
Thanks for the quick reply.
The box has 8 real CPUs. Perhaps a good idea then to reduce the number of
cores to 8 as well. I'm testing out a different scenario with multiple boxes
as well, where clients persist docs to multiple cores on multiple boxes (which
is what multicore was invented for, after
Hello,
We've encountered a strange issue in our Solr install regarding a particular
string that just doesn't seem to want to return results, despite the exact
same string being in the index.
What makes it even stranger is that we had the same data in a previous
install of Solr, and it worked the
Yes, I guess I'm running 40k queries when it starts :) I didn't know that
each count was equal to a query. I thought it was just copying the cache
entries from the previous searcher, but I guess that wouldn't include new
entries. I set it to the size of our filterCache. What should I set the
au
I don't understand why this sometimes takes two minutes between the start
commit & /update and sometimes takes 20 minutes? One of our caches has about
~40,000 items, but I can't imagine it taking 20 minutes to autowarm a
searcher.
What do your cache configs look like?
How big is the auto
Ah, it's hard to tell. I look at index size on disk, number of docs, query
rate, types of queries, etc.
Are you actually seeing problems with your existing servers? Or do you see specific
performance movement in one of the aspects? (e.g. increasing latency, increased
GC or memory usage, increased
Hm, where does that /solr2 come from?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: mitulpatel
> To: solr-user@lucene.apache.org
> Sent: Wednesday, March 25, 2009 12:30:11 AM
> Subject: Re: Not able to configure multicore
Hm, I can't quite tell from here, but that is just a warning, so it's not super
problematic at this point.
Could it be that one of your other caches (query cache) is large and lots of
items are copied on searcher flip?
Could it be that your JVM doesn't have a large or free enough heap? Ca
Prerna,
You could create an index snapshot with the snapshooter script and then copy
the index. You should do that while the source index is not being modified.
Re issue #2: run optimize.
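A sketch of both steps (paths and hostname are illustrative):

  cd /path/to/solr/home
  bin/snapshooter
  # copy the resulting snapshot.* directory to the other instance's data dir

  curl 'http://localhost:8983/solr/update' -H 'Content-Type: text/xml' \
       --data-binary '<optimize/>'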
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: pre
Britske,
Here are a few quick ones:
- Does that machine really have 10 CPU cores? If it has significantly less,
you may be beyond the "indexing sweet spot" in terms of indexer threads vs. CPU
cores
- Your maxBufferedDocs is super small. Comment that out anyway; use
ramBufferSizeMB and s
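A minimal solrconfig.xml sketch of that suggestion (the value is illustrative):

  <indexDefaults>
    <!-- let the RAM buffer, not a document count, decide when to flush -->
    <ramBufferSizeMB>64</ramBufferSizeMB>
    <mergeFactor>10</mergeFactor>
  </indexDefaults>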
Hi,
I'm having difficulty indexing a collection of documents in a reasonable
time.
It's now going at 20 docs/sec on a c1.xlarge instance of Amazon EC2, which
just isn't enough.
This box has 8GB RAM and the equivalent of 20 Xeon processors.
These documents have a couple of stored, indexed, m
Hi,
Issue 1:
I have 2 Solr instances; I need to copy indexes from the solr1 instance to
solr2 without restarting Solr.
Please suggest how this would work. Both Solrs are on a multicore setup.
Issue 2:
I deleted all indexes from Solr and reloaded my core; the Solr admin returns 0
results.
The size of ind
I'm trying to delete documents based on the following type of update
requests:
<delete><query>topologyid:3140</query><query>topologyid:3142</query></delete>
This doesn't cause any changes on the index, and if I try to read the
response, the following error occurs:
13:32:35,196 ERROR [STDERR] 25/Mar/2009 13:32:35
org.apache.solr.update.processor.Log
On Wed, Mar 25, 2009 at 12:42 PM, Pierre-Yves LANDRON
wrote:
>
> Hello,
>
> When I send an update or a commit to Solr via curl, the response I get is
> formatted in HTML; I can't find a way to get a machine-readable response
> file.
> Here what is said on the subject in the solr config file :
> "
On Wed, Mar 25, 2009 at 1:33 PM, ristretto.rb wrote:
> Hello, I'm a happy Solr user. Thanks for the excellent software!!
> Hopefully this is a good question, I have indeed looked around the FAQ
> and google and such first.
> I have just switched from Firefox to Opera for web browsing. (Another
On Wed, Mar 25, 2009 at 3:26 PM, Ashish P wrote:
>
> Similar to getting range facets for date where we specify start, end and
> gap.
> Can we do the same thing for numeric facets where we specify start, end and
> gap.
No. But you can do this with multiple queries by using facet.field with fq
pa
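One workaround of that sort, sketched with facet.query and a hypothetical
price field:

  q=*:*&facet=true&facet.query=price:[0 TO 99]&facet.query=price:[100 TO 199]&facet.query=price:[200 TO *]

Each facet.query comes back with its own count, which emulates a numeric
range facet.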
On Wed, Mar 25, 2009 at 7:30 AM, Ashish P wrote:
>
> Can I get all the facets in QueryResponse??
You can get all the facets that are returned by the server. Set facet.limit
to the number of facets you want to retrieve.
See
http://lucene.apache.org/solr/api/solrj/org/apache/solr/client/solrj/So
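In URL form, a sketch (the field name is illustrative; facet.limit=-1 means no
limit on the number of facet values returned):

  http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&facet.limit=-1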
Similar to getting range facets for dates, where we specify start, end and
gap: can we do the same thing for numeric facets, where we specify start, end
and gap?
Hello, I'm a happy Solr user. Thanks for the excellent software!!
Hopefully this is a good question, I have indeed looked around the FAQ
and google and such first.
I have just switched from Firefox to Opera for web browsing. (Another story)
When I use the solr/admin the home page and stats works
On Wed, Mar 25, 2009 at 12:30 PM, Paul Libbrecht wrote:
>>> could I suggest that the maven repositories are populated next time a
>>> release of "solr-specific-lucenes" is made?
>>
>> But they are? It is inside the org.apache.solr group since those lucene
>> jars are released by Solr -- http://
Hello,
When I send an update or a commit to Solr via curl, the response I get is
formatted in HTML; I can't find a way to get a machine-readable response file.
Here is what is said on the subject in the Solr config file:
"The response format differs from solr1.1 formatting and returns a standard
>> could I suggest that the maven repositories are populated next time a
>> release of "solr-specific-lucenes" is made?
>
> But they are? It is inside the org.apache.solr group since those lucene jars
> are released by Solr -- http://repo2.maven.org/maven2/org/apache/solr/

Nope,
http://repo1.maven.org/