RE: can indexing information stored in db rather than filesystem?

2011-09-13 Thread Jaeger, Jay - DOT
I don't think you understand. Solr does not have the code to do that. It just isn't there, nor would I expect it would ever be there. Solr is open source though. You could look at the code and figure out how to do it (though why anyone would do that remains beyond my ability to understand).

RE: can indexing information stored in db rather than filesystem?

2011-09-13 Thread Jaeger, Jay - DOT
Nicely put. ;^) -Original Message- From: Walter Underwood [mailto:wun...@wunderwood.org] Sent: Tuesday, September 13, 2011 9:16 AM To: solr-user@lucene.apache.org Subject: Re: can indexing information stored in db rather than filesystem? On Sep 13, 2011, at 6:51 AM, kiran.bodigam wrote:

RE: Out of memory

2011-09-13 Thread Jaeger, Jay - DOT
numDocs is not the number of documents in memory. It is the number of documents currently in the index (which is kept on disk). Same goes for maxDocs, except that it is a count of all of the documents that have ever been in the index since it was created or optimized (including deleted documen

RE: EofException with Solr in Jetty

2011-09-14 Thread Jaeger, Jay - DOT
Looking at the source for Jetty, line 149 in Jetty's HttpOutput java file looks like this: if (_closed) throw new IOException("Closed"); < [http://www.jarvana.com/jarvana/view/org/eclipse/jetty/aggregate/jetty-all/7.1.0.RC0/jetty-all-7.1.0.RC0-sources.jar!/org/ec

RE: index not created

2011-09-14 Thread Jaeger, Jay - DOT
> changed the configuration to point it to my solr dir and started it again You might look in your logs to see where Solr thinks the Solr home directory is and/or if it complains about not being able to find it. As a guess, it can't find it, perhaps because solr.solr.home does not point to the

RE: Schema fieldType y-m-d ?!?!

2011-09-14 Thread Jaeger, Jay - DOT
Just add a bogus 0 timestamp after it when you index it. That is what we did. Dates are not stored or indexed as characters, anyway, so space would not be any different one way or the other. JRJ -Original Message- From: stockii [mailto:stock.jo...@googlemail.com] Sent: Wednesday, Sep
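A minimal sketch of the "bogus 0 timestamp" idea above, assuming the source value arrives as a `YYYY-MM-DD` string and the target is a Solr date field, which expects values shaped like `1995-12-31T23:59:59Z`. The function name `to_solr_date` is hypothetical and would live in whatever code feeds documents to Solr.

```python
from datetime import datetime


def to_solr_date(ymd: str) -> str:
    """Pad a YYYY-MM-DD date with a midnight UTC time for a Solr date field."""
    # strptime raises ValueError on malformed input such as "2011-13-40".
    parsed = datetime.strptime(ymd, "%Y-%m-%d")
    # The time portion is a fixed placeholder; only the date carries meaning.
    return parsed.strftime("%Y-%m-%dT00:00:00Z")


print(to_solr_date("2011-09-14"))
```

Queries against such a field would then need to use the same padded form (or a date range covering the whole day).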

RE: EofException with Solr in Jetty

2011-09-14 Thread Jaeger, Jay - DOT
Mail - From: "Jay Jaeger - DOT" To: solr-user@lucene.apache.org, "JETTY user mailing list" Sent: Wednesday, September 14, 2011 15:21:19 Subject: RE: EofException with Solr in Jetty Looking at the source for Jetty, line 149 in Jetty's HttpOutput java file looks l

RE: Performance troubles with solr

2011-09-14 Thread Jaeger, Jay - DOT
I think folks are going to need a *lot* more information. Particularly 1. Just what does your "test script" do? Is it doing updates, or just queries of the sort you mentioned below? 2. If the test script is doing updates, how are those updates being fed to Solr? 3. What version of Solr

RE: Performance troubles with solr

2011-09-14 Thread Jaeger, Jay - DOT
o 6000m, particularly given your relatively modest number of documents (2,000,000). I was trying everything before asking here. 5. Machine characteristics, particularly operating system and physical memory on the machine. OS => Debian 6.0, Physical Memory => 32 GB, CPU => 2x Intel Quad Cor

RE: glassfish, solrconfig.xml and SolrException: Error loading DataImportHandler

2011-09-14 Thread Jaeger, Jay - DOT
Some things to think about: When Solr starts up, it should report the location of Solr home. Is it what you expect? Is there any security on the "dist" directory that would prevent Solr from accessing it? Is there a classloader policy set on glassfish that could be getting in the way? (y

RE: Performance troubles with solr

2011-09-14 Thread Jaeger, Jay - DOT
nd it's too much. When I send a set of random queries (10-20 queries per second), response times go crazy (8 seconds to 60+ seconds). On Wed, Sep 14, 2011 at 6:07 PM, Jaeger, Jay - DOT wrote: > I don't have enough experience with filter queries to advise well on when > to use fq vs. pu

RE: Replication and ExternalFileField

2011-09-15 Thread Jaeger, Jay - DOT
Actually, Windoze also has symbolic links. You have to manipulate them from the command line, but they do exist. http://en.wikipedia.org/wiki/NTFS_symbolic_link -Original Message- From: Per Osbeck [mailto:per.osb...@lbi.com] Sent: Thursday, September 15, 2011 7:15 AM To: solr-user@lu

RE: SOLR Index Speed

2011-09-26 Thread Jaeger, Jay - DOT
500 / second would be 1,800,000 per hour (much more than 500K documents). 1) how big is each document? 2) how big are your index files? 3) as others have recently written, make sure you don't give your JRE so much memory that your OS is starved for memory to use for file system cache. JRJ --
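The rate conversion above can be checked with a few lines of arithmetic. This is only a sketch; the function names are made up for illustration, and the 500,000 documents per hour figure is taken from the message itself.

```python
SECONDS_PER_HOUR = 60 * 60


def per_hour(rate_per_second: float) -> float:
    """Convert a documents-per-second rate to documents per hour."""
    return rate_per_second * SECONDS_PER_HOUR


def per_second(rate_per_hour: float) -> float:
    """Convert a documents-per-hour rate to documents per second."""
    return rate_per_hour / SECONDS_PER_HOUR


print(per_hour(500))                   # 500 docs/second expressed per hour
print(round(per_second(500_000), 1))   # 500K docs/hour expressed per second
```

In other words, 500K documents per hour works out to roughly 139 documents per second, well below 500 per second.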

RE: A fieldType for a address street

2011-09-26 Thread Jaeger, Jay - DOT
We used copyField to copy the address to two fields: 1. Which contains just the first token up to the first whitespace 2. Which copies all of it, but translates to lower case. Then our users can enter either a street number, a street name, or both. We copied all of it to the second field bec
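As a rough illustration of what the two copied fields end up holding, here is a small sketch in plain Python. It only imitates the effect; in Solr itself this would be done with copyField and the analyzers on the two target field types. The field names `street_number` and `address_lower` are hypothetical.

```python
def address_fields(address: str) -> dict:
    """Return the two derived values described above for one address string."""
    stripped = address.strip()
    # Field 1: just the first token, up to the first whitespace.
    parts = stripped.split(None, 1)
    first_token = parts[0] if parts else ""
    # Field 2: the whole address, lower-cased.
    return {"street_number": first_token, "address_lower": stripped.lower()}


print(address_fields("123 Main Street"))
```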

RE: strange performance issue with many shards on one server

2011-09-28 Thread Jaeger, Jay - DOT
That would still show up as the CPU being busy. -Original Message- From: Federico Fissore [mailto:feder...@fissore.org] Sent: Wednesday, September 28, 2011 6:12 AM To: solr-user@lucene.apache.org Subject: Re: strange performance issue with many shards on one server Frederik Kraus, on 28

RE: strange performance issue with many shards on one server

2011-09-28 Thread Jaeger, Jay - DOT
one server Jaeger, Jay - DOT, on 28/09/2011 18:40, wrote: > That would still show up as the CPU being busy. > I don't know how the program (top, htop, whatever) displays the value, but when the CPU has a cache miss, that thread definitely sits and waits for a number of clock cyc

RE: Trouble configuring multicore / accessing admin page

2011-09-28 Thread Jaeger, Jay - DOT
One time when we had that problem, it was because one or more cores had a broken XML configuration file. Another time, it was because solr/home was not set right in the servlet container. Another time it was because we had an older EAR pointing to a newer release Solr home directory. Given wha

RE: Trouble configuring multicore / accessing admin page

2011-09-28 Thread Jaeger, Jay - DOT
cores adminPath="/admij/cores" Was that a cut and paste? If so, the /admij/cores is presumably incorrect, and ought to be /admin/cores -Original Message- From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] Sent: Wednesday, September 28, 2011 4:10 PM To:

RE: 32-bit to 64-bit

2011-09-29 Thread Jaeger, Jay - DOT
Are you changing just the host OS or the JVM, or both, from 32 bit to 64 bit? If it is just the OS, the answer is definitely no, you don't need to do anything more than copy. If the answer is the JVM, I *think* the answer is still no, but others more authoritative than I may wish to respond. -

RE: About solr distributed search

2011-09-29 Thread Jaeger, Jay - DOT
I am no expert, but here is my take and our situation. Firstly, are you asking what the minimum number of documents is before it makes *any* sense at all to use a distributed search, or are you asking what the maximum number of documents is before a distributed search is essentially required?

RE: Errors in requesthandler statistics

2011-09-29 Thread Jaeger, Jay - DOT
I am not expert, but based on my experience, the information you are looking for should indeed be in your logs. There are at least three logs you might look for / at: - An HTTP request log - The solr log - Logging by the application server / JVM Some information is available at http://wiki.apac

RE: Errors in requesthandler statistics

2011-09-29 Thread Jaeger, Jay - DOT
If you are asking how to tell which of 94000 records failed in a SINGLE HTTP update request, I have no idea, but I suspect that you cannot necessarily tell. It might help if you copied and pasted what you find in the solr log for the failure (see my previous response for how to figure out where

RE: Weird issues when upgrading from 1.4 to 3.4

2011-10-03 Thread Jaeger, Jay - DOT
I have no idea what might be causing your memory to increase like that (we haven't run 3.4, and our index so far has been at most 28 million rows with maybe 40 fields), but just as an aside, depending upon what you meant by "we drop the whole index", I'd think it might work better to do an righ

RE: composite Unique Keys?

2011-10-05 Thread Jaeger, Jay - DOT
We generated our own concatenated key (original customer, who may historically have different addresses, etc.). If there is a way for Solr to do that automatigically, I'd love to hear about it. I don't think that the extra bytes for the key itself (String vs. binary integer) is all that much o
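A minimal sketch of generating a concatenated key on the client side before indexing. The separator and the example values (a customer number plus an address sequence number) are assumptions for illustration, not the fields used in the message above.

```python
def composite_key(*parts, sep: str = "|") -> str:
    """Join key parts into one string usable as a uniqueKey value."""
    cleaned = [str(part) for part in parts]
    for part in cleaned:
        # A separator inside a part would make two different inputs collide.
        if sep in part:
            raise ValueError(f"key part {part!r} contains separator {sep!r}")
    return sep.join(cleaned)


print(composite_key("C1001", 2))
```

Rejecting parts that contain the separator keeps the key unambiguous, so `("a|b", "c")` and `("a", "b|c")` cannot map to the same value.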

RE: "Private" text fields

2011-10-06 Thread Jaeger, Jay - DOT
My thought about this, based on some work we did when we considered using Solr to index our LAN files: 1) If it matters - if someone misusing the private tags is a real issue (and it sounds like it would be), then I think you need an application out in front to enforce this (a good idea with So

RE: what is the recommended way to store locations?

2011-10-06 Thread Jaeger, Jay - DOT
We do much the same (along with name, address, postal code, etc.). However, we use AND when we search: the more data someone can provide, the fewer and more applicable their search results. JRJ -Original Message- From: Jason Toy [mailto:jason...@gmail.com] Sent: Thursday, October 06, 2

RE: Pls help :-) ! calling external ws/db to fetch field instead of own index?

2011-10-13 Thread Jaeger, Jay - DOT
Perhaps integrate this using a javascript or other application front end to query solr, get the key to the database, and then run off to get the data? -Original Message- From: Ikhsvaku S [mailto:ikhsv...@gmail.com] Sent: Tuesday, October 11, 2011 2:47 PM To: solr-user@lucene.apache.org

RE: capacity planning

2011-10-13 Thread Jaeger, Jay - DOT
We have used a VMware VM for testing our index (currently around 3GB) and it has been just fine - at most maybe a 10 to 20% penalty, if that, even when CPU bound. We also plan to use a VM for production. What hypervisor one uses matters - sometimes a lot. -Original Messag

RE: Replication with an HA master

2011-10-13 Thread Jaeger, Jay - DOT
One thing to consider is the case where the JVM is up, but the system is otherwise unavailable (say, a NIC failure, firewall failure, load balancer failure) - especially if you use a SAN (whose connection is different from the normal network). In such a case the old master might have uncommitte

RE: Error loading class 'solr.extraction.ExtractingRequestHandler'

2011-10-17 Thread Jaeger, Jay - DOT
It sounds like maybe you either have not told Solr where the Solr home directory is, or, more likely, have not copied the jar files for this particular class into the right directory (typically a "lib" directory) so Tomcat cannot find that class. There is other correspondence on this list that

RE: Xsl for query output

2011-10-17 Thread Jaeger, Jay - DOT
It depends upon whether you want Solr to do the XSL processing, or the browser. After fussing a bit, and doing some reading and thinking, we decided it was best to let the browser do the work, at least in our case. If the browser is doing the processing, you don't need to modify solrconfig.xml

RE: Getting errors thrown from sun.nio.ch.FileDispatcher with native or simple or single lock .Please , i need help in resolving the issue.

2011-10-18 Thread Jaeger, Jay - DOT
As others have reported, I also did not get your image. I am interested in your situation because we will deploy to WAS 7 in production, and have tested there. One thing I noted that might point to a possible problem you might have: 1. "The owner of the files created in the 2 environment

RE: How to retreive multiple documents using one unique field?

2011-10-18 Thread Jaeger, Jay - DOT
I do not believe that it will work as you have written it, unless you put an application in between to read that XML and then call Solr with what it expects. See http://wiki.apache.org/solr/UpdateXmlMessages You need to have: unique-value-if-any-1 abc 123 un
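The XML in the message above lost its tags in the archive. As a sketch of the general shape described on the UpdateXmlMessages page (an `add` element containing `doc` elements, each with named `field` elements), the following builds such a message with the Python standard library. The field names `id`, `field_a` and `field_b` are placeholders, not the ones from the original post.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs) -> str:
    """Build a Solr XML <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


print(build_add_message([
    {"id": "unique-value-1", "field_a": "abc", "field_b": 123},
]))
```

An intermediate application would translate its own XML into this form and then post the result to Solr's update handler.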

RE: how was developed solr admin page and the UI part?

2011-10-19 Thread Jaeger, Jay - DOT
I believe that if you have the Solr distribution, you have the source for the web UI already: it is just .jsp pages. They are inside the solr .war file. JRJ -Original Message- From: nagarjuna [mailto:nagarjuna.avul...@gmail.com] Sent: Wednesday, October 19, 2011 12:07 AM To: solr-user@

RE: OS Cache - Solr

2011-10-19 Thread Jaeger, Jay - DOT
200 instances of what? The Solr application with lucene, etc. per usual? Solr cores? ??? Either way, 200 seems to be very very very many: unusually so. Why so many? If you have 200 instances of Solr in a 20 GB JVM, that would only be 100MB per Solr instance. If you have 200 instances of S

RE: How to update document with solrj?

2011-10-19 Thread Jaeger, Jay - DOT
Solr does not have an "update" per se: you have to re-add the document. A document with the same value for the field defined as the uniqueKey will replace any existing document with that key (you do not have to query and explicitly delete it first). JRJ -Original Message- From: hadi

RE: add thumnail image for search result

2011-10-19 Thread Jaeger, Jay - DOT
It won't do it for you automatically. I suppose you might create the thumbnail image beforehand, Base64 encode it, and add it as a stored, non-indexed, binary field (see schema: solr.BinaryField) when you index the document. JRJ -Original Message- From: hadi [mailto:md.anb...@gmail.com
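A small sketch of the Base64 step suggested above, assuming the thumbnail has already been generated and is available as raw bytes. The function name is hypothetical, and how the resulting string is attached to the document depends on the client being used.

```python
import base64


def encode_thumbnail(image_bytes: bytes) -> str:
    """Base64-encode thumbnail bytes so they can be sent as a field value."""
    return base64.b64encode(image_bytes).decode("ascii")


def decode_thumbnail(encoded: str) -> bytes:
    """Reverse step for the front end that displays the search result."""
    return base64.b64decode(encoded)


sample = b"\x89PNG\r\n\x1a\n"  # just the PNG signature, standing in for a real image
print(encode_thumbnail(sample))
```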

RE: Optimization /Commit memory

2011-10-19 Thread Jaeger, Jay - DOT
Commit does not particularly spike disk or memory usage, unless you are adding a very large number of documents between commits. A commit can cause a need to merge indexes, which can increase disk space temporarily. An optimize is *likely* to merge indexes, which will usually increase disk spa

RE: how was developed solr admin page and the UI part?

2011-10-20 Thread Jaeger, Jay - DOT
It certainly is possible to develop search pages, update pages, etc. in any architecture you like: I think I'd suggest looking at SolrJ if you want to do that. http://wiki.apache.org/solr/Solrj PLEASE: Go read through the documentation and tutorial and browse thru the Wiki and FAQ. It's a

RE: Optimization /Commit memory

2011-10-20 Thread Jaeger, Jay - DOT
e is not a single answer or formula that fits every situation. JRJ -Original Message- From: Sujatha Arun [mailto:suja.a...@gmail.com] Sent: Wednesday, October 19, 2011 11:58 PM To: solr-user@lucene.apache.org Subject: Re: Optimization /Commit memory Thanks Jay , I was trying to compute t

RE: OS Cache - Solr

2011-10-20 Thread Jaeger, Jay - DOT
Instances, not Solr cores. We get an average response time below 1 sec. The number of documents is not large in most of the instances; some of the instances have about 5 lac documents on average. Regards Sujahta On Thu, Oct 20, 2011 at 3:35 AM, Jaeger, Jay - DOT wrote: > 200 instances of what? The S

RE: Optimization /Commit memory

2011-10-24 Thread Jaeger, Jay - DOT
; > On Thu, Oct 20, 2011 at 6:23 PM, Jaeger, Jay - DOT > wrote: > >> Well, since the OS RAM includes the JVM RAM, that is part of your >> requirement, yes? Aside from the JVM and normal OS requirements, all you >> need OS RAM for is file caching. Thus, for updates, the O

RE: some basic information on Solr

2011-10-24 Thread Jaeger, Jay - DOT
1. Solr, proper, does not index "files". An adjunct called Solr Cell can. See http://wiki.apache.org/solr/ExtractingRequestHandler . That article describes which kinds of files Solr Cell can handle. 2. I have no idea what you mean by "incidents per year". Please explain. 3. Even though

RE: indexing key value pair into lucene solr index

2011-10-24 Thread Jaeger, Jay - DOT
Maybe put them in a single string field (or any other field type that is not analyzed -- certainly not text) using some character separator that will connect them, but won't confuse the Solr query parser? So maybe you start out with key value pairs of Key1 value1 Key2 value2 Key3 value3 Prepro
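The message above is cut off before it describes the preprocessing, so the following is only a guess at the general idea: join each key and value with a separator character before indexing them into a non-analyzed string field. The underscore separator is an assumption, chosen because it is not one of the query parser's special characters.

```python
def join_pairs(pairs, sep: str = "_"):
    """Turn (key, value) pairs into single tokens such as 'Key1_value1'."""
    return [f"{key}{sep}{value}" for key, value in pairs]


print(join_pairs([("Key1", "value1"), ("Key2", "value2"), ("Key3", "value3")]))
```

A search for one specific pair would then use the same joined form as the query term.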

RE: some basic information on Solr

2011-10-25 Thread Jaeger, Jay - DOT
website but found it was really technical, since we are not on the developer side and we just want some basic information or numbers about its usage. Thanks for your answer, anyway. 2011/10/24 Jaeger, Jay - DOT > 1. Solr, proper, does not index "files". An adjunct called Solr Ce

RE: sort non-roman character strings last

2011-10-25 Thread Jaeger, Jay - DOT
Could you replace it with something that will sort it last instead of an empty string? (Say, for example, replacement="{}"). This would still give something that looks empty to a person, and would sort last. BTW, it looks to me as though your pattern only requires that the input contain just
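To see why a replacement such as `{}` lands at the end, note that `{` comes right after `z` in ASCII. The sketch below uses plain Python code-point ordering as a stand-in; the order Solr actually applies depends on the field type and its analysis, so treat this as an illustration only.

```python
def sort_titles(titles):
    """Sort by code point, the way a plain string comparison would."""
    return sorted(titles)


# "" sorts first, while "{}" sorts after every lower-case ASCII word.
print(sort_titles(["zebra", "{}", "apple", ""]))
```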

RE: sort non-roman character strings last

2011-10-25 Thread Jaeger, Jay - DOT
t the same thing as: String silly = ""; JRJ -Original Message- From: themanwho [mailto:theman...@mac.com] Sent: Tuesday, October 25, 2011 9:22 AM To: solr-user@lucene.apache.org Subject: RE: sort non-roman character strings last Jay, Thanks, good call on the pattern.

RE: Points to processing hastags

2011-10-25 Thread Jaeger, Jay - DOT
Sounds like a possible application of solr.PatternTokenizerFactory http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternTokenizerFactory.html You could use copyField to copy the entire string to a separate field (or set of fields) that are processed by patterns. JRJ -Origina

RE: Replication issues with multiple Slaves

2011-10-25 Thread Jaeger, Jay - DOT
I noted that in these messages the left hand side is lower case collection, but the right hand side is upper case Collection. Assuming you did a cut/paste, could you have a core name mismatch between a master and a slave somehow? Otherwise (shudder): could you be doing a commit while the repli

RE: Loading data to SOLR first time ( taking too long)

2011-10-25 Thread Jaeger, Jay - DOT
My goodness. We do 4 million in about 1/2 HOUR (7+ million in 40 minutes). First question: Are you somehow forcing Solr to do a commit for each and every record? If so, that way leads to the house of PAIN. The thing to do next, I suppose, might be to try and figure out whether the issue is i

RE: Replication issues with multiple Slaves

2011-10-26 Thread Jaeger, Jay - DOT
download them. By keeping older commits we were able to work around this issue. > > -Original Message- > From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] > Sent: 25 October 2011 20:48 > To: solr-user@lucene.apache.org > Subject: RE: Replication issues with multi

RE: Loading data to SOLR first time ( taking too long)

2011-10-26 Thread Jaeger, Jay - DOT
No, we do not use DIH. Based on other responses I saw, it seems likely that the issue is in the DIH component somehow. JRJ -Original Message- From: Awasthi, Shishir [mailto:shishir.awas...@baml.com] Sent: Tuesday, October 25, 2011 3:24 PM To: solr-user@lucene.apache.org; Jaeger, Jay

RE: some basic information on Solr

2011-10-26 Thread Jaeger, Jay - DOT
It didn't look like that, but maybe. Our experience has been very very good. I don't think we have seen a crash in our prototype to date (though that prototype is also not very busy). We have had as many as four cores, with as many as 35 million "documents". -Original Message- From

RE: Difficulties Installing Solr with Jetty 7.x

2011-10-26 Thread Jaeger, Jay - DOT
From your logs, it looks like the Solr library is being found just fine, and that the servlet is initing OK. Does your Jetty configuration specify index.jsp in a welcome list? We had that problem in WebSphere: we got 404's the same way, and the cure was to modify the Jetty web.xml to include

RE: Difficulties Installing Solr with Jetty 7.x

2011-10-26 Thread Jaeger, Jay - DOT
ERRATA, that should be the *SOLR* web.xml (not the Jetty web.xml). Sorry for the confusion. -Original Message- From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] Sent: Wednesday, October 26, 2011 4:02 PM To: 'solr-user@lucene.apache.org' Subject: RE: Difficulties Installing

RE: Upgratding the Index from 1.4.1 to 3.4 using replication

2011-10-26 Thread Jaeger, Jay - DOT
I very much doubt that would work: different versions of Lucene involved, and Solr replication does just a streamed file copy, nothing fancy. JRJ -Original Message- From: Nemani, Raj [mailto:raj.nem...@turner.com] Sent: Wednesday, October 26, 2011 12:55 PM To: solr-user@lucene.apache.o

RE: Difficulties Installing Solr with Jetty 7.x

2011-10-27 Thread Jaeger, Jay - DOT
erbilt [mailto:li...@datagenic.com] Sent: Wednesday, October 26, 2011 5:41 PM To: solr-user@lucene.apache.org Subject: Re: Difficulties Installing Solr with Jetty 7.x Jay: Thanks for the response. $JETTY_HOME/etc/webdefault.xml is the unmodified file that came with Jetty, and it has a referencin

RE: large scale indexing issues / single threaded bottleneck

2011-11-03 Thread Jaeger, Jay - DOT
Shishir, we have 35 million "documents", and should be doing about 5000-1 new "documents" a day, but with very small "documents": 40 fields which have at most a few terms, with many being single terms. You may occasionally see some impact from top level index merges but those should be

RE: change solr url

2011-11-03 Thread Jaeger, Jay - DOT
The file that he refers to, web.xml, is inside the solr WAR file in folder web-inf. That WAR file is in ...\example\webapps. You would have to uncomment the section under and change the to something else. But, as the comments in the section explain, you would also have to make other cha

RE: Questions about Solr's security

2011-11-03 Thread Jaeger, Jay - DOT
It seems to me that this issue needs to be addressed in the FAQ and in the tutorial, and that somewhere there should be a /select lock-down "how to". This is not obvious to many (most?) users of Solr. It certainly wasn't obvious to me before I read this. JRJ -Original Message- From:

JEE servlet mapping, security and multiple Solr cores

2011-08-12 Thread Jaeger, Jay - DOT
l out. Jay R. Jaeger State of Wisconsin, Dept. of Transportation

RE: filtering non english text from my results

2011-08-15 Thread Jaeger, Jay - DOT
1. Find a dictionary with the English words you find acceptable 2. Use the KeepWordFilterFactory (doc in the "AnalyzersTokenizersTokenFilters" Wiki page). -Original Message- From: Omri Cohen [mailto:omri...@gmail.com] Sent: Monday, August 15, 2011 1:23 AM To: solr-user@lucene.apache.or
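The effect of a keep-word filter can be imitated in a few lines: only tokens found in the word list survive. This is a sketch of the idea, not Solr's implementation; the tiny word list and the case-insensitive matching are assumptions for illustration.

```python
def keep_words(tokens, allowed):
    """Keep only the tokens that appear in the allowed word list."""
    allowed_lower = {word.lower() for word in allowed}
    return [token for token in tokens if token.lower() in allowed_lower]


english = {"hello", "world"}  # stand-in for a real English dictionary file
print(keep_words(["Hello", "monde", "world"], english))
```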

RE: ideas for indexing large amount of pdf docs

2011-08-15 Thread Jaeger, Jay - DOT
Note on i: Solr replication provides pretty good clustering support out-of-the-box, including replication of multiple cores. Read the Wiki on replication (Google +solr +replication if you don't know where it is). In my experience, the problem with indexing PDFs is it takes a lot of CPU on t

RE: Product data schema question

2011-08-16 Thread Jaeger, Jay - DOT
On the surface, you could simply add some more fields to your schema. But as far as I can tell, you would have to have a separate Solr "document" for each SKU/size combination, and store the rest of the information (brand, model, color, SKU) redundantly and make the unique key a combination of

RE: Product data schema question

2011-08-16 Thread Jaeger, Jay - DOT
s an index. -Original Message- From: Steve Cerny [mailto:sjce...@gmail.com] Sent: Tuesday, August 16, 2011 11:37 AM To: solr-user@lucene.apache.org Subject: Re: Product data schema question Jay, this is great information. I don't know enough about Solr whether this is possible...Can we setup

RE: Product data schema question

2011-08-16 Thread Jaeger, Jay - DOT
Not particularly. Just trying to do my part to answer some questions on the list. -Original Message- From: Steve Cerny [mailto:sjce...@gmail.com] Sent: Tuesday, August 16, 2011 11:49 AM To: solr-user@lucene.apache.org Subject: Re: Product data schema question Thanks Jay, if we come to

RE: Unable to get multicore working

2011-08-16 Thread Jaeger, Jay - DOT
Perhaps your admin doesn’t work because you don't have defaultCoreName="whatever-core-you-want-by-default" in your tag? E.g.: Perhaps this was enough to prevent it starting any cores -- I'd expect a default to be required. Also, from experience, if you add cores, and you have securi

RE: Unable to get multicore working

2011-08-16 Thread Jaeger, Jay - DOT
them, besides 404 errors. On Tuesday, 16 August, 2011 at 1:10 PM, Jaeger, Jay - DOT wrote: > Perhaps your admin doesn’t work because you don't have > defaultCoreName="whatever-core-you-want-by-default" in your tag? E.g.: > > > > Perhaps this was enough

RE: Unable to get multicore working

2011-08-16 Thread Jaeger, Jay - DOT
I tried on my own test environment -- pulling the default core parameter out, under Solr 3.1 I got exactly your symptom: an error 404. HTTP ERROR 404 Problem accessing /solr/admin/index.jsp. Reason: missing core name in path The log showed: 2011-08-

RE: Unable to get multicore working

2011-08-16 Thread Jaeger, Jay - DOT
Whoops: That was Solr 4.0 (which pre-dates 3.1). I doubt very much that the release matters, though: I expect the behavior would be the same. -Original Message- From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] Sent: Tuesday, August 16, 2011 4:04 PM To: solr-user

RE: Unable to get multicore working

2011-08-16 Thread Jaeger, Jay - DOT
now. Excellent! The site schemas are loading! Looks like the site schemas have an issue: "SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'long' specified on field area_id" Errr. Why would `long` be an invalid type? On Tuesday, 16 August, 2011 at 2:06 PM, Jaeg

RE: Unable to get multicore working

2011-08-17 Thread Jaeger, Jay - DOT
okay, now. Thanks for the help. You guys saved me from the insane asylum. On Tuesday, 16 August, 2011 at 2:32 PM, Jaeger, Jay - DOT wrote: > That said, the logs are showing a different error now. Excellent! The site > schemas are loading! > > Great! > > "SEVERE: org.apa

RE: master unreachable - attempting simple replication

2011-08-17 Thread Jaeger, Jay - DOT
I'd suggest looking at the logs of the master to see if the request is getting thru or not, or if there are any errors logged there. If the master has a replication config error, it might show up there. We just went thru some master/slave troubleshooting. Here are some things that you might l

RE: Solr 1.4.1 vs 3.3 (Speed)

2011-08-17 Thread Jaeger, Jay - DOT
It would perhaps help if you reported what you mean by "noticeably less time". What were your timings? Did you run the tests multiple times? One thing to watch for in testing: Solr performance is greatly affected by the OS file system cache. So make sure when testing that you use the same

RE: Most current tik jar files that work with Solr 1.4.1

2011-08-17 Thread Jaeger, Jay - DOT
> What is the latest version of Tika that I can use with Solr 1.4.1? it > comes packaged with 0.4. I tried 0.8 and it no workie. When I was testing Tika last year, I used Solr build 1271 to get the most recent Tika I could get my hands on at the time. That was before Solr 3.1, so I expect it

RE: 'Stable' 4.0 version

2011-08-17 Thread Jaeger, Jay - DOT
> geospatial requirements Looking at your email address, no surprise there. 8^) > What insight can you share (if any) regarding moving forward to a later > nightly build? I used build 1271 (Solr 1.4.1, which seemed to be called Solr 4 at the time) during some testing, and it performed well

RE: Synonym and Whitespaces and optional TokenizerFactory

2011-08-18 Thread Jaeger, Jay - DOT
You could presumably do it with solr.PatternTokenizerFactory with the pattern set to .* as your Or, maybe, if Solr allows it, you don't use any tokenizer at all? Or, maybe you could use solr.WhitespaceTokenizerFactory, allowing it to split up the words, along with solr.WordDelimiterFilterFacto

RE: Solr Copyfields

2011-08-18 Thread Jaeger, Jay - DOT
I would suggest #3, unless you have some very unusual performance requirements. It has the advantage of isolating your index environment requirements from the database. -Original Message- From: Nicholas Fellows [mailto:n...@djdownload.com] Sent: Thursday, August 18, 2011 8:40 AM To:

RE: XSLT Exception

2011-08-18 Thread Jaeger, Jay - DOT
I am not an XSLT expert, but believe that in XSLT, "not" is a function, rather than an operator. http://www.w3.org/TR/xpath-functions/#func-not So, not(contains(...)) rather than not contains(...) should presumably do the trick. -Original Message- From: Christopher Gross [mailto:cog

RE: how to deal with URLDatasource which needs authorization?

2011-08-24 Thread Jaeger, Jay - DOT
You could run the HTML import from Tika (see the Solr tutorial on the Solr website). The job that ran Tika would need the user/password of the site to be indexed, but Solr would not. (You might have to write a little script to get the HTML page using curl or wget or Nutch). Users could then s

RE: query

2011-08-24 Thread Jaeger, Jay - DOT
One way I had thought of doing this kind of thing: include in the index an "ACL" of some sort. The problem I see in your case is that the list of "friends" can presumably change over time. So, given that, one way would be to have a little application in between. The request goes to the appli

RE: Best way to anchor solr searches?

2011-08-25 Thread Jaeger, Jay - DOT
I don't think it has to be quite so bleak as that, depending upon the number of queries done over a given timeframe, and the size of the result sets. Solr does cache the identifiers of "documents" returned by search results. See http://wiki.apache.org/solr/SolrCaching paying particular attent

RE: Solr in a windows shared hosting environment

2011-08-25 Thread Jaeger, Jay - DOT
Yes, but since Solr is written in Java to run in a JEE container, you would host Solr in a web application server, either Jetty (which comes packaged), or something else (say, Tomcat or WebSphere or something like that). As a result, you aren't going to find anything that says how to run Solr un

RE: How to copy and extract information from a multi-line text before the tokenizer

2011-08-25 Thread Jaeger, Jay - DOT
"A programmer had a problem. He tried to solve it with regular expressions. Now he has two problems" :). A. That just isn't fair... 8^) (I can't think of very many things that have allowed me to perform more magic over my career than regular expressions, starting with SNOBOL. Uh oh: I ju

RE: Solr in a windows shared hosting environment

2011-08-25 Thread Jaeger, Jay - DOT
ndows shared hosting environment Thank you! Since it's shared hosting, how do I install java? -Original Message- From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] Sent: Thursday, August 25, 2011 4:34 PM To: solr-user@lucene.apache.org Subject: RE: Solr in a windows shared hosting e
