Re: xpath processing

2010-10-22 Thread pghorpade
processor="FileListEntityProcessor" fileName=".*xml" recursive="true" baseDir="C:\data\sample_records\mods\starr"> processor="XPathEntityProcessor" url="${f.fileAbsolutePath}" stream="false" forEach="/mods" transformer="DateFormatTransformer,RegexTransformer,TemplateTransformer">

Re: xpath processing

2010-10-22 Thread Ken Stanley
Parinita, In its simplest form, what does your entity definition for DIH look like; also, what does one record from your xml look like? We need more information before we can really be of any help. :) - Ken

Re: how well does multicore scale?

2010-10-22 Thread Lance Norskog
http://wiki.apache.org/solr/CoreAdmin Since Solr 1.3 On Fri, Oct 22, 2010 at 1:40 PM, mike anderson wrote: > Thanks for the advice, everyone. I'll take a look at the API mentioned and > do some benchmarking over the weekend. > > -Mike > > > On Fri, Oct 22, 2010 at 8:50 AM, Mark Miller wrote: >

Re: xpath processing

2010-10-22 Thread pghorpade
Quoting pghorp...@ucla.edu: Can someone help me please? I am trying to import mods xml data in solr using the xml/http datasource This does not work with XPathEntityProcessor of the data import handler xpath="/mods/name/namepa...@type = 'date']" I actually have 143 records with type attribute

RE: How to index long words with StandardTokenizerFactory?

2010-10-22 Thread Steven A Rowe
Hi Sergey, What does your ~34kb field value look like? Does StandardTokenizer think it's just one token? What doesn't work? What happens? Steve > -Original Message- > From: Sergey Bartunov [mailto:sbos@gmail.com] > Sent: Friday, October 22, 2010 3:18 PM > To: solr-user@lucene.apa

Re: Date faceting +1MONTH problem

2010-10-22 Thread Yonik Seeley
On Fri, Oct 22, 2010 at 6:02 PM, Shawn Heisey wrote: > On 10/22/2010 3:01 PM, Yonik Seeley wrote: >> >> On Fri, Sep 17, 2010 at 9:51 PM, Chris Hostetter >>  wrote: >>> >>>  the default query parser >>> doesn't support range queries with mixed upper/lower bound inclusion. >> >> This has just been

Solr ExtractingRequestHandler with Compressed files

2010-10-22 Thread Joey Hanzel
Hi, Has anyone had success using ExtractingRequestHandler and Tika with any of the compressed file formats (zip, tar, gz, etc.)? I am sending Solr the archived.tar file using curl. curl "http://localhost:8983/solr/update/extract?literal.id=doc1&fmap.content=body_texts&commit=true" -H 'Content-t

Re: Date faceting +1MONTH problem

2010-10-22 Thread Shawn Heisey
On 10/22/2010 3:01 PM, Yonik Seeley wrote: On Fri, Sep 17, 2010 at 9:51 PM, Chris Hostetter wrote: the default query parser doesn't support range queries with mixed upper/lower bound inclusion. This has just been added to trunk. Things like [0 TO 100} now work. Awesome! Is it easily port

Question about gaze

2010-10-22 Thread Lingley, Dean R
Anyone have an idea about the error below. Installed gaze and it works ok, then when trying to import after installing received the following error. Commented out the line in solrconfig.xml and it imports fine again, but now gaze Any ideas? Thanks, Dean ./import-marc.sh /home/filemove/

RE: Confusion about entities and documents

2010-10-22 Thread Olson, Ron
Hmm, okay, I guess I wasn't taking the hierarchy-flattening aspect of Solr seriously enough. :) Based on your reply from the other thread, I guess the best solution, as far as I can tell, is to maintain the multiple value lists and take advantage of the fact that the arrays will always be in th

Re: Confusion about entities and documents

2010-10-22 Thread harrysmith
>What I get when I search for, say, "XYZ", is a document that has XYZ Corp as a manufacturer name, but the >array of parts_manu appears to be a child of the document, not the parts array. > >Is this the correct behavior, insofar as a document has a single level of elements, and that's it? If so,

Re: Date faceting +1MONTH problem

2010-10-22 Thread Yonik Seeley
On Fri, Sep 17, 2010 at 9:51 PM, Chris Hostetter wrote: > the default query parser > doesn't support range queries with mixed upper/lower bound inclusion. This has just been added to trunk. Things like [0 TO 100} now work. -Yonik http://www.lucidimagination.com
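To make the bracket notation in this message concrete: a square bracket marks an inclusive bound and a curly brace an exclusive one, so [0 TO 100} covers 0 but not 100. The following is a minimal Python sketch of that semantics only. The helper names `parse_range` and `matches` are hypothetical, numeric bounds are assumed, and this is not how Solr's query parser is implemented.

```python
import re

# One opening bracket, two whitespace-free bounds separated by TO, one closing bracket.
RANGE_RE = re.compile(r"^([\[{])\s*(\S+)\s+TO\s+(\S+)\s*([\]}])$")

def parse_range(expr):
    """Return (low, high, low_inclusive, high_inclusive) for a range expression."""
    m = RANGE_RE.match(expr.strip())
    if m is None:
        raise ValueError("not a range expression: %r" % expr)
    open_b, low, high, close_b = m.groups()
    return float(low), float(high), open_b == "[", close_b == "]"

def matches(expr, value):
    """Check whether a numeric value falls inside the range expression."""
    low, high, incl_low, incl_high = parse_range(expr)
    above = value >= low if incl_low else value > low
    below = value <= high if incl_high else value < high
    return above and below

print(matches("[0 TO 100}", 0))    # lower bound is inclusive
print(matches("[0 TO 100}", 100))  # upper bound is exclusive
```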

Re: how well does multicore scale?

2010-10-22 Thread mike anderson
Thanks for the advice, everyone. I'll take a look at the API mentioned and do some benchmarking over the weekend. -Mike On Fri, Oct 22, 2010 at 8:50 AM, Mark Miller wrote: > On 10/22/10 1:44 AM, Tharindu Mathew wrote: > > Hi Mike, > > > > I've also considered using a separate cores in a multi

Re: Using different schemas when syncing with PostgreSQL and DIH

2010-10-22 Thread Juan Manuel Alvarez
Thank you Shawn! That was exactly what I was looking for! =o) On Fri, Oct 22, 2010 at 4:29 PM, Shawn Heisey wrote: > On 10/22/2010 10:06 AM, Juan Manuel Alvarez wrote: >> >> My question is: >> Every time I do an import operation (delta or full) with DIH, I only >> need to sync the index with one

Re: Using different schemas when syncing with PostgreSQL and DIH

2010-10-22 Thread Shawn Heisey
On 10/22/2010 10:06 AM, Juan Manuel Alvarez wrote: My question is: Every time I do an import operation (delta or full) with DIH, I only need to sync the index with one schema only, so... is there a way to pass a custom parameter with the schema name to DIH so I can build the query with the corres
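The reply above is truncated, so Shawn's full answer is not visible here. What follows is only a sketch of one DIH mechanism that fits the question: request parameters passed to the handler can be referenced in data-config.xml as ${dataimporter.request.<name>}. The host, port, handler path, and the parameter name `schema` are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed location of the DataImportHandler; adjust to your deployment.
DIH_URL = "http://localhost:8983/solr/dataimport"

def import_url(schema, command="full-import"):
    """Build a DIH request URL that carries the schema name as a custom parameter."""
    return DIH_URL + "?" + urlencode({"command": command, "schema": schema})

# In data-config.xml the entity query could then be written (assumption) as:
#   SELECT * FROM "${dataimporter.request.schema}"."Objects"
print(import_url("schema1"))
print(import_url("schema2", command="delta-import"))
```

Each import request then names the one schema it should sync, and the same data-config.xml serves all of them.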

Re: A bug in ComplexPhraseQuery ?

2010-10-22 Thread Ahmet Arslan
> class="org.apache.solr.search.ComplexPhraseQParserPlugin"> >     name="inOrder">false >   > I added this change to SOLR-1604, can you test it and give us feedback?

Re: How to index long words with StandardTokenizerFactory?

2010-10-22 Thread Sergey Bartunov
I'm using Solr 1.4.1. I've now succeeded in replacing the lucene-core jar, but maxTokenValue seems to be used in a very strange way. Currently it's set to 1024*1024 for me, but I couldn't index a field of only ~34kb. I understand that it's a little weird to index such big data, but I just want

Re: SolrJ addField with Reader

2010-10-22 Thread Bojan Vukojevic
Is there an example of how to use ContentStreamBase.FileStream from SolrJ during indexing to reduce memory footprint? "addField" requires a string. The only example I could find in the JUnit tests is below and does not show indexing... thx! public void testFileStream() throws IOException {

Re: A bug in ComplexPhraseQuery ?

2010-10-22 Thread Ahmet Arslan
> In my opinion, ordering term in a proximity search does not > make sense! > So the work around for us is to generate the opposite > search every time a > proximity operator is used. > not very elegant! If you want I can make it configurable. You can define your choice in solrconfig.xml like thi

Confusion about entities and documents

2010-10-22 Thread Olson, Ron
Hi all- I've been checking the online docs about this, but I haven't found a suitable explanation about how entities and sub-entities work within a document. I am loading records from a SQL database and everything seems to be getting flattened in a way I was not expecting. For example, I have

RE: How to index long words with StandardTokenizerFactory?

2010-10-22 Thread Steven A Rowe
Hi Sergey, I've opened an issue to add a maxTokenLength param to the StandardTokenizerFactory configuration: https://issues.apache.org/jira/browse/SOLR-2188 I'll work on it this weekend. Are you using Solr 1.4.1? I ask because of your mention of Lucene 2.9.3. I'm not sure there will

Re: Failing to successfully import international characters via DIH

2010-10-22 Thread Dennis Gearon
Sounds like one of three things: 1/ Everything is set to UTF-*, but the content has another encoding. 2/ Something 'microsoftish' is adding a BOM (byte order mark) that is being incorrectly interpreted. 3/ The byte order is wrong somewhere along the way and not being translated correctly across

Re: Solr Javascript+JSON not optimized for SEO

2010-10-22 Thread Dennis Gearon
How can we see what each will do? Dennis Gearon --- On Fri, 10/22/10, PeterKerk wrote: > From: PeterKerk > Subject: Solr Javascript+JSON not optimized for SEO > To: solr-user@lucene.apache.org > Date: Friday, October 22, 2010, 2:59 AM > > Hi, > > When I retrieve data via javascript+JSON meth

How to index long words with StandardTokenizerFactory?

2010-10-22 Thread Sergey Bartunov
I'm trying to force solr to index words whose length is more than 255 symbols (this constant is DEFAULT_MAX_TOKEN_LENGTH in lucene StandardAnalyzer.java) using StandardTokenizerFactory as 'filter' tag in schema configuration XML. Specifying the maxTokenLength attribute won't work. I'd tried to mak
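For readers unfamiliar with the limit being discussed: as I understand the Lucene documentation for the analyzer's maximum token length, a token that exceeds the limit is discarded rather than truncated, which is why very long words never reach the index. Below is a simplified Python sketch of that behaviour. It splits on whitespace only, which is an assumption made for brevity and not what StandardTokenizer really does, and `tokenize` is a hypothetical helper.

```python
DEFAULT_MAX_TOKEN_LENGTH = 255  # the constant quoted in the message above

def tokenize(text, max_token_length=DEFAULT_MAX_TOKEN_LENGTH):
    """Split on whitespace and drop (not truncate) tokens longer than the limit."""
    return [tok for tok in text.split() if len(tok) <= max_token_length]

long_word = "x" * 300
# With the default limit the 300-character token is dropped.
print(tokenize("short " + long_word + " words"))
# With a larger limit the same token is kept.
print(len(tokenize("short " + long_word, max_token_length=1024)))
```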

Using different schemas when syncing with PostgreSQL and DIH

2010-10-22 Thread Juan Manuel Alvarez
Hello everyone! I am using Solr synced with a PostgreSQL database using DIH and I am facing an issue. The thing is that I use one Solr server and different PostgreSQL schemas in the same database, with the same tables inside each one, so the following queries: SELECT * FROM "schema1"."Objects"; and

Re: Failing to successfully import international characters via DIH

2010-10-22 Thread Pradeep Singh
Holy cow, you already have this in place. I apologize. This looked like exactly the kind of problem I have solved this way. On Fri, Oct 22, 2010 at 8:38 AM, Pradeep Singh wrote: > > >> What would you recommend changing or checking? >> >> > Tomcat *Connector* URIEncoding. I have done this several time

Re: Failing to successfully import international characters via DIH

2010-10-22 Thread Pradeep Singh
> > What would you recommend changing or checking? > > Tomcat *Connector* URIEncoding. I have done this several times on tomcat, might be at a loss on other servers though. - Pradeep

Failing to successfully import international characters via DIH

2010-10-22 Thread virtas
Hi, wanted to share problem i have got with importing text from different languages. All international text looks wrong on luke and on AJAX solr. What I see for chinese and japanese characters is this: æ˜ ç”»ã‚„éŸ³æ¥½ãŒæ¥½ã—ã„ï¼AIのサイモンのファンです。アダムやマットã
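The first characters of the garbled sample above (æ˜ ç”») are what the UTF-8 bytes of 映画 look like when they are decoded as Windows-1252, which suggests that somewhere in the pipeline UTF-8 content is being read with a single-byte Western encoding. The Python sketch below reproduces the effect and reverses it. Where in this particular setup the wrong decoding happens is not known from the message, and the helper names are hypothetical.

```python
def to_mojibake(text):
    """Encode as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(garbled):
    """Reverse the mistake: recover the raw bytes and decode them as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")

original = "映画や"
garbled = to_mojibake(original)
print(garbled)                      # same shape as the start of the sample in the message
print(repair(garbled) == original)  # the damage is reversible when no bytes were lost
```

Note that Windows-1252 leaves a few byte values undefined, so this round trip does not work for every possible input; it is a diagnostic aid, not a general fix.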

Re: Import From MYSQL database

2010-10-22 Thread virtas
In the main directory of Jetty there should be a directory called 'logs'. The log name usually looks like this: 2010_07_31.request.log. Change the date and try searching your system -- View this message in context: http://lucene.472066.n3.nabble.com/Import-From-MYSQL-database-tp1738753p1752946.html Sent

Re: different results depending on result format

2010-10-22 Thread Mike Sokolov
OK I solved the problem. It turns out that I was connecting to the server using its FQDN (rosen.ifactory.com). When, instead, I connect to it using the name "rosen" (which maps to the same IP using the default domain name configured in my resolver, ifactory.com), I get results back. I am loo

Re: Strange file name after installing solr

2010-10-22 Thread Grant Ingersoll
On Oct 21, 2010, at 11:52 PM, Bac Hoang wrote: > apache-solr-1.4.1Hello folks, > > I'm very new user to solr. Please help > > What I have in hand: 1) apache-solr-1.4.1; 2) Geronimo > > After installing solr.war using Geronimo administration GUI, I got a > "strange" file, under the >

Re: Solr sorting problem

2010-10-22 Thread Moazzam Khan
For anyone who faced the same problem, changing the field to string from text worked! -Moazzam On Fri, Oct 22, 2010 at 8:50 AM, Moazzam Khan wrote: > The field type of the first name and last name is text. Could that be > why it's not sorting properly? I just changed it to string and started > a

Re: Solr sorting problem

2010-10-22 Thread Moazzam Khan
The field type of the first name and last name is text. Could that be why it's not sorting properly? I just changed it to string and started a full-import. Hopefully that will work. Thanks, Moazzam On Thu, Oct 21, 2010 at 7:42 PM, Jayendra Patil wrote: > need additional information . > Sorti

Re: different results depending on result format

2010-10-22 Thread Mike Sokolov
Yes - I really only have the one solr instance. And I have plenty of other cases where I am getting good results back via solrj. It's really a mystery. Unfortunately I have to catch up on other stuff I have been neglecting, but I'll follow up when I'm able to get a solution... -Mike On 10

solr performance

2010-10-22 Thread Markus.Rietzler
Last week we put our Solr into production. It was a very smooth start. Solr really works great and without any problems so far. It's a huge improvement over our old intranet search. I wonder, however, whether we can increase the search performance of our Solr installation, just to make the search ex

Re: how well does multicore scale?

2010-10-22 Thread Mark Miller
On 10/22/10 1:44 AM, Tharindu Mathew wrote: > Hi Mike, > > I've also considered using a separate cores in a multi tenant > application, ie a separate core for each tenant/domain. But the cores > do not suit that purpose. > > If you check out documentation no real API support exists for this so >

Re: how well does multicore scale?

2010-10-22 Thread Tharindu Mathew
On Fri, Oct 22, 2010 at 11:18 AM, Lance Norskog wrote: > There is an API now for dynamically loading, unloading, creating and > deleting cores. > Restarting a Solr with thousands of cores will take, I don't know, hours. > Is this in the trunk? Any docs available? > On Thu, Oct 21, 2010 at 10:44 PM

Re: facet Prefix (or term prefix)

2010-10-22 Thread Markus Jelsma
Hi, There is no facet.contains facility, but there are alternatives. Instead of using the faceting engine, you will need to create a field that has an NGramTokenizer. Properly configured, you can use this field to query upon and it will return what you would expect from a facet.contains feature. H
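The idea behind the n-gram suggestion can be illustrated outside Solr: if every substring of a value within a length range is indexed as its own term, then a query for any fragment matches as an exact term lookup. The Python sketch below shows only that idea. It is not Solr's NGramTokenizer, the lowercasing step is an assumption, and `ngrams` is a hypothetical helper.

```python
def ngrams(text, min_gram, max_gram):
    """Return every substring of text whose length is between min_gram and max_gram."""
    text = text.lower()  # assumption: case-insensitive matching is wanted
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams

indexed = ngrams("Television", 2, 5)
print("vis" in indexed)     # a fragment from the middle of the value is indexed
print("tele" in indexed)    # so is a prefix
print("televi" in indexed)  # fragments longer than max_gram are not
```

The trade-off is index size: the number of terms per value grows with both the value length and the width of the gram range.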

facet Prefix (or term prefix)

2010-10-22 Thread Jason Brown
I am aware of the facet.prefix facility. I am using SOLR to return a faceted field's contents - I use the facet.prefix to restrict what returns from SOLR - this is very useful for predictive search functionality (autocomplete). My only issue is that the field I facet on is a string and could ha

Re: mincount doesn't work with FacetQuery

2010-10-22 Thread Mark Allan
This is a response to a thread from several months ago ( http://lucene.472066.n3.nabble.com/mincount-doesn-t-work-with-FacetQuery-tp473162p473162.html ) Sorry, I don't know where to get the thread number to request that specific thread from listserv and reply properly via email. Anyway, I've

Re: different results depending on result format

2010-10-22 Thread Savvas-Andreas Moysidis
strange..are you absolutely sure the two queries are directed to the same Solr instance? I'm running the same query from the admin page (which specifies the xml format) and I get the exact same results as solrj. On 21 October 2010 22:25, Mike Sokolov wrote: > quick follow-up: I also notice that

Re: MoreLikeThis explanation?

2010-10-22 Thread Darren Govoni
Hi Koji, I tried to apply your patch to the 1.4.0 tagged branch, but it didn't take completely. What branch does it work for? Darren On Thu, 2010-10-21 at 23:03 +0900, Koji Sekiguchi wrote: > (10/10/21 20:33), dar...@ontrenet.com wrote: > > Hi, > >Does the latest Solr provide an explanat

Solr Javascript+JSON not optimized for SEO

2010-10-22 Thread PeterKerk
Hi, When I retrieve data via javascript+JSON method (instead of REST via URL), the link which I click does not reflect what the user will end up seeing. Example for showing the features belonging to a LED TV product: JSON getFeatureFacets('LEDTV') Get features for LED TV REST www.domain.com/T

Re: Import From MYSQL database

2010-10-22 Thread do3do3
I have really tried to index tables with English keywords in a MySQL database but failed. I also tried to import data from this database using Java and succeeded. I don't know how to use the dataimport folder in the contrib folder; maybe this is the problem. What I did was build configuration files (schema.xml,

Re: why sorl is slower than lucene so much?

2010-10-22 Thread kafka0102
thanks a lot. I got it. On 2010-10-21 22:36, Yonik Seeley wrote: 2010/10/21 kafka0102: I found the problem's cause. It's the DocSetCollector. My filter query result's size is about 300, so the DocSetCollector.getDocSet() is OpenBitSet. And 300 OpenBitSet.fastSet(doc) op is too slow.