My response to this was mangled by my email client - sorry - hopefully
this one comes through a little easier to read ;)
On 03/09/2010 04:28 PM, Shawn Heisey wrote:
> I attended the Webinar on March 4th. Many thanks to Yonik for putting
> that on. That has led to some questions about the best way
On Wed, Mar 3, 2010 at 7:51 AM, Marc Sturlese wrote:
> I am testing date facets in trunk with a huge index. Apparently, as the default
> solrconfig.xml shows, the fastest way to run date facet queries is to index
> the field with this data type:
> <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true"
>            precisionStep="6" positionIncrementGap="0"/>
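For reference, the matching date-facet parameters would be set as request
defaults in solrconfig.xml; a minimal sketch (the field and range values
here are hypothetical, not from the original message):

<lst name="defaults">
  <str name="facet">true</str>
  <str name="facet.date">pubdate_tdt</str>
  <str name="facet.date.start">NOW/YEAR-1YEAR</str>
  <str name="facet.date.end">NOW</str>
  <str name="facet.date.gap">+1MONTH</str>
</lst>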
You can usually raise the header size limit by editing the config of
your servlet container. That can only get you so far, though, and
different browsers have their own limits.
Your best bet, as Lance said, is either POSTing them or sticking them in
solrconfig.
You can POST by using the query(SolrParams, SolrRequest.METHOD.POST) call
in SolrJ.
CommonGrams is a tool for this. It makes "is a" into a token, but then
"is" and "a" are still removed as stopwords.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.CommonGramsFilterFactory
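A minimal field type sketch using it (the type name and stopwords file are
assumptions, not from the wiki page):

<fieldType name="text_cg" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- CommonGrams emits pair tokens like "is_a" for stopword bigrams -->
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- the bare stopwords themselves are then still removed -->
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>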
On 3/13/10, Christopher Ball wrote:
> Thank you for the idea Mitch, but it just doesn't se
Thank you for the idea Mitch, but it just doesn't seem right that I should
have to revert to Scoring when what I really need seems so fundamental.
Logically, what I want is a "phrase filter factory" that would match on
phrases listed in a file, like stopwords, but in this case index the match
and
On 03/09/2010 04:28 PM, Shawn Heisey wrote:
> I attended the Webinar on March 4th. Many thanks to Yonik for
> putting that on. That has led to some questions about the best way
> to bring fault tolerance to our distributed search. High level
> question: Should I go with SolrCloud, or stick with 1.
On 03/12/2010 09:44 AM, Shawn Heisey wrote:
> Does SolrCloud's notion of a "collection", which appears to use
> cores, override normal multi-core usage for building an offline index
> and quickly swapping it into production?
A collection will normally be composed of multiple cores. By default
rig
On 03/13/2010 09:07 PM, blargy wrote:
> How are you guys solving the problem of managing all of your configuration
> differences between development and production?
> For example, when deploying to production I need to change the
> data-config.xml (DataImportHandler) database settings. I also have some
Commit actions are in the jetty log. I don't have a script to pull
them out in a spread-sheet-able form, but that would be useful.
On 3/13/10, Frederico Azeiteiro wrote:
> Yes, the http request is timing out even when using values of 10m.
>
> Normally the commit takes about 10s. I did an optimize
DIH has special handling for upper & lower case field names. It is
possible your config is running afoul of this.
Try using different names for the Solr fields than the database fields.
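For example, an explicit column-to-field mapping in data-config.xml avoids
the collision (table and field names here are hypothetical):

<entity name="item" query="select ID, TITLE from item">
  <!-- map upper-case DB columns to differently named Solr fields -->
  <field column="ID" name="doc_id"/>
  <field column="TITLE" name="doc_title"/>
</entity>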
On 3/11/10, James Ostheimer wrote:
> Hi-
>
> I can't seem to make any of the transformers work; I am using the
You can use MySQL: select *, 'staticdata' as staticdata from table x.
As long as your field name is staticdata, this should add it there.
On 3/12/10 8:39 AM, "Tommy Chheng" wrote:
> Haven't tried this myself, but try adding a default value and don't
> specify it during the import.
> http://wiki.ap
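A sketch of that suggestion in schema.xml (the field name and value are
made up for illustration):

<!-- the default kicks in whenever the import leaves the field unset -->
<field name="staticdata" type="string" indexed="true" stored="true"
       default="some static value"/>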
How are you guys solving the problem of managing all of your configuration
differences between development and production?
For example, when deploying to production I need to change the
data-config.xml (DataImportHandler) database settings. I also have some ant
scripts to start/stop Tomcat as well
Yes, the HTTP request is timing out even when using values of 10m.
Normally the commit takes about 10s. I did an optimize (it took 6h) and it
looks good for now...
59m? Well, I didn't wait that long; I restarted the Solr instance and tried
again.
I'll try to use autocommit in the near future.
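For reference, autocommit lives in the update handler section of
solrconfig.xml; a sketch with illustrative thresholds:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs> <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime> <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>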
What is timing out? The external HTTP request? Commit times are a
sawtooth and slowly increase. My record is 59 minutes, but I was doing
benchmarking.
On Thu, Mar 11, 2010 at 1:46 AM, Frederico Azeiteiro
wrote:
> Hi,
>
> I'm having timeouts committing on a 125 GB index with about 2200
> docs.
I don't really follow DataImportHandler, but it looks like it's using an
unbounded cache (a simple HashMap).
Perhaps we should make the cache size configurable?
The impl seems a little odd - the caching occurs in the base class - so
caching impls that extend it don't really have full control - t
You might also try using CDATA blocks to wrap your Unicode text. It is
usually much easier to view the text while debugging these problems.
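A sketch of a CDATA-wrapped field in a Solr XML update message (the field
name is hypothetical):

<add>
  <doc>
    <!-- everything inside CDATA passes through without XML escaping -->
    <field name="body"><![CDATA[Unicode text & markup-like <content> stay readable]]></field>
  </doc>
</add>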
On Thu, Mar 11, 2010 at 12:13 AM, Eric Pugh
wrote:
> So I am using Sunspot to post over, which means an extra layer of
> indirection between me and my XML!
It is usually a limitation in the servlet container. You could try
using embedded Solr or an HTTP POST instead of an HTTP GET.
However, in this case it is probably not possible.
If these long filter queries never change, you could embed them in
the solrconfig.xml declaration for a request handler.
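A sketch of that approach, with a hypothetical handler name and filter
query:

<requestHandler name="/filtered" class="solr.SearchHandler">
  <lst name="appends">
    <!-- this fq is added to every request hitting this handler -->
    <str name="fq">category:(books OR music)</str>
  </lst>
</requestHandler>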
One way is to add magic 'beginning' and 'end' terms, then do phrase
searches with those terms.
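A sketch of the idea, with made-up marker tokens; at index time the field
value is wrapped in sentinels:

<field name="title">xxstart New York xxend</field>

A phrase query like title:"xxstart New York xxend" then matches only
documents whose entire title is "New York".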
On Wed, Mar 10, 2010 at 7:51 AM, Jan Høydahl / Cominvent
wrote:
> Hi,
>
> Sometimes you need to anchor your search to start/end of field.
>
> Example:
> 1. title=New York Yankees
> 2. title=New York
> 3
Erik,
I have seen many posts regarding out-of-memory errors, but I am not sure
whether they are using CachedSqlEntityProcessor.
I want to know if there is a way to flush the cache buffer instead of
storing everything in cache.
I can clearly see the heap size growing like anything if I use
HTMLStripCharFilter is only in the analyzer: it creates searchable
terms from the HTML input. The raw HTML is stored and fetched.
There are some bugs in term positions and highlighting. An
EntityProcessor wrapping the HTMLStripCharFilter would be really
useful.
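A field type sketch showing where the char filter sits (the type name is
an assumption):

<fieldType name="html_text" class="solr.TextField">
  <analyzer>
    <!-- strip markup before tokenizing; the stored value keeps the raw HTML -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>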
On Tue, Mar 9, 2010 at 5:31 AM, Mar
Have you searched the users' list? This question has come up multiple times,
and it has probably already been answered. Let us know if you come up
blank...
Best
Erick
On Sat, Mar 13, 2010 at 3:56 PM, JavaGuy84 wrote:
>
> Sorry forgot to attach the error log,
>
> Error Log:
Also, how would one auto-commit after a delta-import?
I click on the commit, clean and verbose checkboxes, but those seem to have
no effect.
blargy wrote:
>
> Is there any documentation on this screen? (and don't point me to
> http://wiki.apache.org/solr/DataImportHandler)
>
> When using the Full-i
Sorry forgot to attach the error log,
Error Log:
org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.OutOfMemoryError: Java heap space
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:650)
        at org.apache.solr.handler.da
Is there any documentation on this screen? (and don't point me to
http://wiki.apache.org/solr/DataImportHandler)
When using the Full-import, Status, Reload-Config, Document-Count and Full
Import With Cleaning options, everything works as expected, but when I use
any of the following I get an exception: Debug N
How can I enable logging of all the XML posted to my Solr server? Is this
possible? Right now, all I see in the logs are the request params when
querying.
While I am on the topic of logging, I have one other question. Is it
possible to use custom variables in the logging.properties file s
Hi,
I am using CachedSqlEntityProcessor in my DIH data-config to reduce the
number of queries executed against the database.
I have more than 2 million rows returned for entity 2 and around 30
rows returned for entity 1.
I have set the heap size to 1 GB, but even then I always get
Hi,
How do we combine clustering component and Dismax query handler?
Regards,
allahbaksh
Christopher,
maybe the SynonymFilter can help you solve your problem.
Let me try to explain:
If you create an extra field in the index for your use case, you can boost
matches on it in a special way.
The next step is creating an extra synonym file:
as much as => SpecialPhrase1
in amount o
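A sketch of the analyzer side of that idea (the type and file names are
assumptions):

<fieldType name="text_phrases" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- collapse listed phrases into single placeholder tokens at index time -->
    <filter class="solr.SynonymFilterFactory" synonyms="phrases.txt"
            ignoreCase="true" expand="false"/>
  </analyzer>
</fieldType>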
Ok, let me try explaining what I am hoping to achieve at a higher level:
I want to aggressively remove stop words to reduce the size of my index, but
there are certain domain-specific multiword phrases which include stop words
that I need to retain in the index.
So I want to stop out words su
On 13.03.2010, at 08:01, blargy wrote:
>
> I was actually able to accomplish what I wanted (although it's not pretty) using
> a regex transformer.
>
> <entity transformer="RegexTransformer"
>         query="select *, 'valueA, valueB' values from items">
Nice approach. In MySQL y
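A fuller sketch of the quoted config (the entity name and split pattern are
guesses at the elided parts):

<entity name="item" transformer="RegexTransformer"
        query="select *, 'valueA, valueB' values from items">
  <!-- RegexTransformer's splitBy turns the literal into multiple values -->
  <field column="values" splitBy=",\s*"/>
</entity>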
For anyone interested, my issue (I think) was because I had specified
the url field as a multivalued field. I wasn't able to create a test
case that emulated my problem. This guess is based on gradual fiddling
with my configs.
My concern is no longer pressing but I do have a couple question