On Thu, Aug 12, 2010 at 5:42 AM, harrysmith wrote:
>
> To follow up on my own question, it appears this is only an issue when
> using
> the DataImport console debugging tools. It looks like when submitting the
> debugging request, the data-config.xml is sent via a GET request, which
> would fail.
Try to define the image Solr fields <-> DB columns mapping explicitly in
the "image" entity, i.e.
See
http://www.lucidimagination.com/search/document/c8f2ed065ee75651/dih_and_multivariable_fields_problems
On Thu, Aug 12, 2010 at 2:30 AM, Manali Joshi wrote:
> I tried making the schema
Hi,
When indexing large amounts of data I hit a problem whereby Solr
becomes unresponsive
and doesn't recover (even when left overnight!). I think I've hit some
GC problems (some GC tuning is required) and I wanted to know if anyone
has ever hit this problem.
I can replicate this error (albeit taking
Thanks - Splunk looks like overkill.
We're extremely small scale - we're hoping for something open source :-)
- Original Message
From: Jan Høydahl / Cominvent
To: solr-user@lucene.apache.org
Sent: Wed, August 11, 2010 11:14:37 PM
Subject: Re: Analysing SOLR logfiles
Have a look at www.splunk
Hi Robert!
> Since the example given was "http" being slow, it's worth mentioning that if
> queries are "one word" urls [for example http://lucene.apache.org] these
> will actually form slow phrase queries by default.
>
Do you mean that http://lucene.apache.org will be split up into "http
luce
Hi Tom,
I tried again with:
and even now the hitratio is still 0. What could be wrong with my setup?
('free -m' shows that the cache has over 2 GB free.)
Regards,
Peter.
> Hi Peter,
>
> Can you give a few more examples of slow queries?
> Are they phrase queries? Boolean queries? prefix or
Hi Tom!
> Hi Peter,
>
> Can you give a few more examples of slow queries?
> Are they phrase queries? Boolean queries? prefix or wildcard queries?
>
I am experimenting with one word queries only at the moment.
> If one word queries are your slow queries, than CommonGrams won't help.
> Comm
We've just started using awstats, as suggested by the Solr 1.4 book.
It's open source!:
http://awstats.sourceforge.net/
On 12 August 2010 18:18, Jay Flattery wrote:
> Thanks - Splunk looks like overkill.
> We're extremely small scale - we're hoping for something open source :-)
>
>
> - Original Me
exactly!
On Thu, Aug 12, 2010 at 5:26 AM, Peter Karich wrote:
> Hi Robert!
>
> > Since the example given was "http" being slow, it's worth mentioning that
> if
> > queries are "one word" urls [for example http://lucene.apache.org] these
> > will actually form slow phrase queries by default.
> >
On 05/08/2010 09:59, Raphaël Droz wrote:
Hi,
I saw this post :
http://lucene.472066.n3.nabble.com/Multiple-Facet-Dates-td495480.html
I didn't see work in progress or plans about this feature on the list
and bugtracker.
Has someone already created a patch, pof, ...? I wouldn't have been
able
Hi,
I'm having OOME (OutOfMemoryError) problems with Solr. From random browsing
I'm getting an impression that a lot of memory fixes happened
recently in solr and lucene.
Could you give me a quick summary how (un)stable are different
lucene / solr branches and how much improvement I can expect?
I'm surprised, too, that there isn't a dedicated tool which analyzes Solr
logfiles (e.g. parses QTime and the parameters q, fq, ...),
because there are some other open source log analyzers out there:
http://yaala.org/ http://www.mrunix.net/webalizer/
Another free tool is newrelic.com (you will submit
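Lacking a dedicated tool, a few lines of code go a long way. A minimal sketch in Java (the log-line shape below is an assumption modeled on typical Solr 1.4 request logs; adjust the regex to your servlet container's actual format):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SolrLogParser {
    // Assumed request-log shape:
    //   ... path=/select params={q=foo&fq=type:bar&rows=10} hits=5 status=0 QTime=42
    private static final Pattern LINE =
        Pattern.compile("params=\\{([^}]*)\\}.*?QTime=(\\d+)");

    /** Returns the request parameters plus QTime, or null if the line doesn't match. */
    public static Map<String, String> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) return null;
        Map<String, String> out = new HashMap<>();
        for (String pair : m.group(1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) out.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        out.put("QTime", m.group(2));
        return out;
    }
}
```

From there it is easy to aggregate QTime percentiles or count the most frequent q/fq values.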
Hi all,
The indexing part of Solr is going well, but I got an error indexing
a single PDF file. When I searched for the error in the mailing list I found
that the error was due to the copyright of that file. Can't we index a file
which has copyrights or any digital rights?
regards,
satya
Hi,
I'm trying to index a txt file (~150MB) using Solr Cell/Tika.
The curl command aborts due to a java.lang.OutOfMemoryError.
*
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3209)
(10/08/12 21:06), Tomasz Wegrzanowski wrote:
Hi,
I'm having oome problems with solr. From random browsing
I'm getting an impression that a lot of memory fixes happened
recently in solr and lucene.
Could you give me a quick summary how (un)stable are different
lucene / solr branches and how much
One way I've handled this, and it works only for some types of data,
is to put the searchable part of the sub-doc in a search field
(indexed=true) and put an xml or json representation of the sub-doc in a
stored only field. Then if the main doc is hit via search I can grab the xml
or json,
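A sketch of what that schema arrangement might look like (field names are hypothetical, not from the original message):

```xml
<!-- Searchable text extracted from the sub-document: indexed, not stored. -->
<field name="subdoc_text" type="text" indexed="true" stored="false" multiValued="true"/>
<!-- Raw XML/JSON representation of the sub-document: stored only, returned with the hit. -->
<field name="subdoc_raw" type="string" indexed="false" stored="true" multiValued="true"/>
```

The search hits match on `subdoc_text`, and the application then deserializes `subdoc_raw` from the stored response.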
I am a little confused - how did 180k documents become 100m index documents?
We have over 20 indices (for different content sets), one with 5m
documents (about a couple of pages each) and another with 100k+ docs.
We can index the 5m collection in a couple of days (limitation is in
the source) w
On 12 August 2010 13:46, Koji Sekiguchi wrote:
> (10/08/12 21:06), Tomasz Wegrzanowski wrote:
>>
>> Hi,
>>
>> I'm having oome problems with solr. From random browsing
>> I'm getting an impression that a lot of memory fixes happened
>> recently in solr and lucene.
>>
>> Could you give me a quick su
On Thu, 12 Aug 2010 14:32:19 +0200
Lannig Carina wrote:
> Hi,
>
> I'm trying to index a txt-File (~150MB) using Solr Cell/Tika.
> The curl command aborts due to a java.lang.OutOfMemoryError.
[...]
> AFAIK Tika keeps the whole file in RAM and posts it as one single
> string to Solr. I'm using JVM
Sorry -- I used the term "documents" too loosely!
180k scientific articles with between 500-1000 sentences each,
and we index sentence-level index documents,
so I'm guessing about 100 million Lucene index documents in total.
An update on my progress:
I used GC settings of:
-XX:+UseConcMarkSweepGC
I'm doing deletes with the DIH but getting mixed results. Sometimes the
documents get deleted, other times I can still find them in the index. What
would prevent a doc from getting deleted?
For example, I delete 594039 and get this in the logs;
2010-08-12 14:41:55,625 [Thread-210] INFO [DataImp
I wrote a simple Java program to import a PDF file. I can get a result when I
do a *:* search from the admin page. I get nothing if I search a word. I wonder
if I did something wrong or missed setting something.
Here is part of result I get when do *:* search:
*
To help you we need the description of your fields in your schema.xml and
the query that you do when you search only a single word.
Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42
2010/8/12 Ma, Xiao
Thanks so much. I didn't know how to make any changes in schema.xml for PDF
files. I used Solr's default schema.xml. Please tell me what I need to do in
schema.xml.
The simple Java program I use is below. I also attached the PDF file. I
really appreciate your help!
*
Please excuse this newbie question, but:
I want to upgrade Solr, but not to the latest version in
trunk (because there are so many changes that I would have to test against,
and modify my custom classes for, and behavior changes, and the
Lucene index change to deal with, etc.).
My t
1) I assume you are doing batching interspersed with commits
2) Why do you need sentence level Lucene docs?
3) Are your custom handlers/parsers part of the Solr JVM? I would not be
surprised if you have a memory/connection leak there (or something is not
releasing some resource explicitly)
In general, we have NEV
Another option is the 3x branch - that should still be able to read
indexes from Solr 1.4/Lucene 2.9
I personally don't expect a 1.5 release to ever materialize.
There will eventually be a Lucene/Solr 3.1 release off of the 3x
branch, and a Lucene/Solr 4.0 release off of trunk.
-Yonik
http://www.l
no help ? =(
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p1114172.html
Sent from the Solr - User mailing list archive at Nabble.com.
Thanks Yonik but
http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/solr/CHANGES.txt
says that the lucene index has changed
"
Upgrading from Solr 1.4
--
* The Lucene index format has changed and as a result, once you upgrade,
previous versions of Solr will no lo
hi,
> 1) I assume you are doing batching interspersed with commits
As each file I crawl is article-level, each contains all the
sentences for the article, so they are naturally batched into about
500 documents per post in LCF.
I use auto-commit in Solr:
50
90
>
On Thu, Aug 12, 2010 at 12:24 PM, solr-user wrote:
> Thanks Yonik but
> http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/solr/CHANGES.txt
> says that the lucene index has changed
Right - but it will be able to read your older index.
Do you need Solr 1.4 to be able to read the new ind
Short summary:
Is there any way I can specify that I want a lot
of phrase slop for the "pf" parameter, but none
at all for the "pf2" parameter?
I find the 'pf' parameter with a pretty large 'ps' to do a very
nice job for providing a modest boost to many documents that are
quite well rela
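For what it's worth, later Solr releases added per-field phrase slop for the shingled phrase fields (`ps2`/`ps3` on edismax); assuming a version that supports them, the request would look something like this (field names illustrative):

```
defType=edismax&qf=text&pf=text^10&ps=100&pf2=text^5&ps2=0
```

Here `ps=100` gives the full-phrase boost generous slop while `ps2=0` requires the bigram phrases to be exact.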
No, once upgraded I wouldn't need to have an older Solr read the indexes.
I misunderstood the note.
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-update-solr-to-older-1-5-builds-instead-of-to-trunk-tp1113863p1115694.html
Sent from the Solr - User mailing list arch
Does anyone know if I need to define fields in schema.xml for indexing PDF files?
If I do, please tell me how.
I defined fields in schema.xml and created a data-configuration file using
XPath for XML files. Would you please tell me if I need to do the same for PDF
files, and how?
T
Hi Peter,
If hits aren't showing up, and you aren't getting any queryResultCache hits
even with the exact query being repeated, something is very wrong. I'd suggest
first getting the query result cache working, and then moving on to look at
other possible bottlenecks.
What are your settings
Maybe this helps:
http://www.packtpub.com/article/indexing-data-solr-1.4-enterprise-search-server-2
Cheers,
Stefan
Am 12.08.2010 19:45, schrieb Ma, Xiaohui (NIH/NLM/LHC) [C]:
Does anyone know if I need define fields in schema.xml for indexing pdf files?
If I need, please tell me how I can do
Are you just trying to learn the tiny details of how Solr and DIH work? Is
this just an intellectual curiosity? Or are you having some specific problem
that you are trying to solve? If you have a problem, could you describe the
symptoms of the problem? I am using Solr, DIH, and several other relat
Hello users,
I tried to get results from more than one core,
but I don't know how.
Maybe you have an idea?
I need it in PHP.
King
Thanks so much for your help! I defined a dynamic field in schema.xml as
follows:
But I wonder what I should put for .
I really appreciate your help!
-Original Message-
From: Stefan Moises [mailto:moi...@shoptimax.de]
Sent: Thursday, August 12, 2010 1:58 PM
To: solr-user@lucene.ap
I'm writing a little thesis about this, and I need to know how Solr uses
Lucene - in which way, for example when using DIH and searching. Just for my
better understanding... ;-)
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p1118089.html
Sent from t
Thanks so much. I got it working now. I really appreciate your help!
Xiaohui
-Original Message-
From: Stefan Moises [mailto:moi...@shoptimax.de]
Sent: Thursday, August 12, 2010 1:58 PM
To: solr-user@lucene.apache.org
Subject: Re: index pdf files
Maybe this helps:
http://www.packtpub.com/a
I was looking at the ability to sort by function that was added to Solr.
For the most part it seems to work. However, Solr doesn't seem to like
sorting by certain functions.
For example, this sum works:
http://10.0.11.54:8994/solr/select?q=*:*&sort=sum(1,Latitude,Longitude,sum(Latitude,Longitu
Small typo in my last email: the second sum should have been hsin, but I notice
that the problem also occurs when I leave it as sum.
--
View this message in context:
http://lucene.472066.n3.nabble.com/possible-bug-in-sorting-by-Function-tp1118235p1118260.html
Sent from the Solr - User mailing list arc
I'm attempting to make use of PatternReplaceCharFilterFactory, but am running
into issues on both 1.4.1 ( I ported it) and on nightly (4.0-2010-07-27). It
seems that on a real query the charFilter isn't executed prior to the
tokenizer.
I modified the example configuration included in the dis
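For reference, a char filter has to be declared inside the analyzer, before the tokenizer, and for both the index and query chains if you want it applied at query time too. A minimal sketch (field type name and pattern are illustrative only, not from the original report):

```xml
<fieldType name="text_repl" class="solr.TextField">
  <analyzer>
    <!-- Applied to the raw character stream, before tokenization. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="(\d+)-(\d+)" replacement="$1$2"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

If separate `<analyzer type="index">` and `<analyzer type="query">` blocks are used, the charFilter must appear in each one, or queries will bypass it.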
Hello,
I'm customizing my XML response using the XSLTResponseWriter with
"&wt=xslt&tr=transform.xsl". Because I have a few use-cases to support, I
wanted to break up the common bits and import/include them from multiple top
level xslt files, but it appears that the base directory of the tran
Hi,
I am new to text search and mining and have been doing research for
different available products. My application requires reading a SMS message
(unstructured) and finding out entities such as person name, area, zip ,
city and skills associated with the person. SMS would be in form of free
text.
I got the following error when I index some PDF files. I wonder if anyone has
had this issue before and how to fix it. Thanks so much in advance!
***
Error 500
HTTP ERROR: 500org.apache.tika.exception.TikaException:
Unexpected RuntimeException from org.apache.tik
Here's perhaps the coolest webinar we've done to date, IMO :)
I attended Tyler's presentation at Lucene EuroCon* and thoroughly
enjoyed it. Search UI/UX is a fascinating topic to me, and really
important to do well for the applications most of us are building.
I'm pleased to pass along the
The problem could be related to some oddity in sum()? Some more examples
(note: Latitude and Longitude are fields of type double):
works:
http://10.0.11.54:8994/solr/select?q=*:*&sort=sum(sum(1,1.0))%20asc
http://10.0.11.54:8994/solr/select?q=*:*&sort=sum(Latitude,Latitude)%20asc
http://10.0.11.54:8
Solr is a search engine, not an entity extraction tool.
While there are some decent open source entity extraction tools, they are
focused on processing sentences and paragraphs. The structural differences in
text messages means you'd need to do a fair amount of work to get decent entity
extrac
I tried some time ago to use SOLR-788. Ultimately I was able to get
both patch versions to apply (separately), but neither worked. The
suggestion I received when I commented on the issue was to download the
specific release mentioned in the patch and then update, but the patch
was created be
Issue resolved. The problem was that solr.war was silently not being
overwritten by the new version.
I will try to spend more time debugging before posting.
--
View this message in context:
http://lucene.472066.n3.nabble.com/possible-bug-in-sorting-by-Function-tp1118235p1121349.html
Sent from the Solr -
On 8/11/2010 3:27 PM, JohnRodey wrote:
1) Is there any information on preferred maximum sizes for a single solr
index. I've read some people say 10 million, some say 80 million, etc...
Is there any official recommendation or has anyone experimented with large
datasets into the tens of billions?
Try this,
http://viewer.opencalais.com/
They have an open API for that data. With your text message of :
"John Mayer Mumbai 411004 Juhu, car driver, also capable of body guard"
It gives back:
People: John Mayer Mumbai
Positions: body guard, car driver.
It's not perfect but it's not bad eithe
Hey thanks Stanislaw! I'm going to try this against the current trunk
tonight and see what happens.
Matt
On Wed, Jul 28, 2010 at 8:41 AM, Stanislaw Osinski <
stanislaw.osin...@carrotsearch.com> wrote:
> > The patch should also work with trunk, but I haven't verified it yet.
> >
>
> I've just add
Hey all,
I am doing a search on hierarchical data, and I have a hard time
getting my head around the following problem.
I want a result as follows, in one single query only:
USA (3)
> California (2)
> Arizona (1)
Europe (4)
> Norway (3)
>> Oslo (3)
> Sweden (1)
How it looks in the XML/JSON resp
: I'm trying to match "Apple 2" but not "Apple2" using phrase search, this is
why I have it quoted.
: I was under the impression --when I use phrase search-- all the
: analyzer magic would not apply, but it is!!! Otherwise, how would I
: search for a phrase?!
well .. yes ... even with phras
: please help - how can I calculate queryresultcache size (how much RAM should
: be dedicated for that). I have 1,5 index size, 4 mio docs.
: QueryResultWindowSize is 20.
: Could I use "expire" property on the documents in this cache?
There is no "expire" property, items are automatically removed f
: I'm trying to extend the writer used by solrj
: (org.apache.solr.response.BinaryResponseWriter), i have declared it in
...
: I see that it is initialized, but when i try to set the 'wt' param to
: 'myWriter'
:
: solrQuery.setParam("wt","myWriter"), nothing happen, it's still using the
:
Collection myFL =
searcher.getReader().getFieldNames(IndexReader.FieldOption.ALL);
will return all fields in the schema (i.e. indexed, stored, and
indexed+stored).
Collection myFL =
searcher.getReader().getFieldNames(IndexReader.FieldOption.INDEXED );
likely returns all fields that are indexed (I
Thanks Alexey. That solved the issue. I am now able to get all images
information in the index.
On Thu, Aug 12, 2010 at 12:47 AM, Alexey Serba wrote:
> Try to define image solr fields <-> db columns mapping explicitly in
> "image" entity, i.e.
>
>
>
>
>
>
>
> See
> http://www.luci
: Is it possible to duplicate a core? I want to have one core contain only
: documents within a certain date range (ex: 3 days old), and one core with
: all documents that have ever been in the first core. The small core is then
: replicated to other servers which do "real-time" processing on it
Hi there,
I have a problem querying Solr for a specific field with a query string that
contains spaces. I added the following lines in the schema.xml to add my own
defined fields. The fields are: ap_name, ap_address, ap_dob, ap_desg, ap_sec.
Since all these fields begin with ap_, I included the th
: Furthermore, I would like to add its not just the highlight matches
: functionality that is horribly broken here, but the output of the analysis
: itself is misleading.
:
: lets say i take 'textTight' from the example, and add the following synonym:
:
: this is broken => broke
:
: the query t
On Thu, Aug 12, 2010 at 7:55 PM, Chris Hostetter
wrote:
>
>
> You say it's bogus because the qp will divide on whitespace first -- but
> you're assuming you know what query parser will be used ... the "field"
> query parser (to name one) doesn't split on whitespace first. That's my
> point: analys
: It returns in around a second. When I execute the attached code it takes just
: over three minutes. The optimal for me would be able get closer to the
: performance I'm seeing with curl using Solrj.
I think your problem may be that StreamingUpdateSolrServer buffers up
commands and sends them
: > You say it's bogus because the qp will divide on whitespace first -- but
: > you're assuming you know what query parser will be used ... the "field"
: > query parser (to name one) doesn't split on whitespace first. That's my
: > point: analysis.jsp doesn't make any assumptions about what quer
:
: I'm trying to set different autocommit settings to 2 separate request
: handlers...I would like a requesthandler to use an update handler and a
: second requesthandler use another update handler...
:
: can I have more than one update handler in the same solrconfig?
: how can I configure a req
We were able to get the hierarchy faceting working with a work around
approach.
e.g. if you have Europe//Norway//Oslo as an entry
1. Create a new multivalued field with string type
2. Index the field for "Europe//Norway//Oslo" with values
0//Europe
1//Europe//Norway
2//Europe//Norway//Oslo
3
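The token-building step for those depth-prefixed paths can be sketched as follows (a hypothetical helper, not part of Solr; it assumes `//` as the path separator used in the example):

```java
import java.util.ArrayList;
import java.util.List;

public class HierarchyTokens {
    /**
     * Expands a path like "Europe//Norway//Oslo" into depth-prefixed tokens:
     * ["0//Europe", "1//Europe//Norway", "2//Europe//Norway//Oslo"].
     * Each token is then indexed into the multivalued string field.
     */
    public static List<String> tokens(String path) {
        String[] parts = path.split("//");
        List<String> out = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) prefix.append("//");
            prefix.append(parts[i]);
            out.add(i + "//" + prefix);
        }
        return out;
    }
}
```

Faceting on that field with a prefix such as `facet.prefix=1//Europe` then returns exactly the children of Europe with their counts.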
: >
: > That should still be true in the official 4.0 release (I really should
: > have said "When 4.0 can no longer read Solr 1.4 indexes"), ...
: > I haven't been following the details closely, but I suspect that tool
: > hasn't been written yet because there isn't much point until the full
:
We pretty much had the same issue, ended up customizing the ExtendedDismax
code.
In your case it's just a change of a single line
addShingledPhraseQueries(query, normalClauses, phraseFields2, 2,
tiebreaker, pslop);
to
addShingledPhraseQueries(query, normalClauses, phraseFields2, 2,
Please add a JIRA issue for this.
https://issues.apache.org/jira/secure/BrowseProject.jspa
On Tue, Aug 10, 2010 at 6:59 PM, kenf_nc wrote:
>
> Glad I could help. I also would think it was a very common issue. Personally
> my schema is almost all dynamic fields. I have unique_id, content,
> last_u
Please add a JIRA issue for this.
On Wed, Aug 11, 2010 at 6:24 AM, Sascha Szott wrote:
> Sorry, there was a mistake in the stack trace. The correct one is:
>
> SEVERE: Full Import failed
> org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir'
> value: /home/doe/foo is not a dir
On Thu, Aug 12, 2010 at 8:07 PM, Chris Hostetter
wrote:
>
> : > You say it's bogus because the qp will divide on whitespace first --
> but
> : > you're assuming you know what query parser will be used ... the "field"
> : > query parser (to name one) doesn't split on whitespace first. That's
> my
: > Does not return document as expected:
: > id:1234 AND (-indexid:1 AND -indexid:2) AND -indexid:3
: >
: > Has anyone else experienced this? The exact placement of the parens isn't
: > key, just adding a level of nesting changes the query results.
...
: I could be wrong but I think this
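For context, with the standard Lucene query parser a parenthesized sub-clause containing only negated terms matches nothing on its own; the usual workaround is to anchor the negations with a match-all query:

```
id:1234 AND (*:* -indexid:1 -indexid:2) AND -indexid:3
```

The `*:*` gives the sub-clause a positive set to subtract from, so the nesting no longer changes the results.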
: Subject: index pdf files
: References:
: <4c63ed43.4030...@r.email.ne.jp>
:
: In-Reply-To:
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh
: Subject: Indexing large files using Solr Cell causes OutOfMemory error
: References:
: In-Reply-To:
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a
There was a major Lucene change in filter handling from Solr 1.3 to
Solr 1.4. They are much much faster in 1.4. Really Lucene 2.4.1 to
Lucene 2.9.2. The filter is now consulted much earlier in the search
process, thus weeding out many more documents early.
It sounds like in Solr 1.3, you should on
: Subject: PDF file
: References: <20100729152139.321c4...@ibis>
:
: In-Reply-To:
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even
: In-Reply-To:
: References:
:
: Subject: In multicore env, can I make it access core0 by default
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a f
: Subject: hl.usePhraseHighlighter
: References: <1281125904548-1031951.p...@n3.nabble.com>
: <960560.55971...@web52904.mail.re2.yahoo.com>
: In-Reply-To: <960560.55971...@web52904.mail.re2.yahoo.com>
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When startin
This is probably true about Luke. The trunk has a new Lucene format
and does not read any previous format. The trunk is a busy code base.
The 3.1 branch is slated to be the next Solr release, and is probably
a better base for your testing. Best of all is to use the Solr 1.4.1
binary release.
On W
Which version of Solr is this? How many documents are there in the
index? Etc. It is hard for us to help you without more details.
On Thu, Aug 12, 2010 at 8:32 AM, Qwerky wrote:
>
> I'm doing deletes with the DIH but getting mixed results. Sometimes the
> documents get deleted, other times I can
Can you provide more details? What is the error you're receiving?
What do you "think" is going on?
It might be helpful if you reviewed:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Thu, Aug 12, 2010 at 8:21 AM, satya swaroop wrote:
> Hi all,
> The indexing part of solr is
There is no information to go on here. Please review
http://wiki.apache.org/solr/UsingMailingLists
and add some more details...
Best
Erick
On Thu, Aug 12, 2010 at 2:09 PM, Jörg Agatz wrote:
> Hallo Users...
>
> I tryed to get results from more then one Cores..
> But i dont know how..
>
> Maby y
You'll get a lot of insight into what's actually happening if you append
&debugQuery=true to your queries, or check the "debug" checkbox
in the solr admin page.
But I suspect (and it's a guess since you haven't included your schema)
that your problem is that you're mixing explicit and default fiel
On Thu, Aug 12, 2010 at 8:29 PM, Chris Hostetter
wrote:
>
> It was a big part of the proposal regarding the creation of the 3x
> branch ... that index format compatibility between major versions would
> no longer be supported by silently converting on first write -- instead
> there would be
Win XP, Solr 1.4.1 out-of-the-box install, using Jetty. If I add a greater-than
or less-than (i.e. < or >) in any XML field and attempt to load or run from
the DataImport console, I receive a SAXParseException. Example follows:
If I don't have a 'less than' it works just fine. I know this must work,
be
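The usual fix: a raw `<` is illegal inside an XML attribute value, so comparison operators in data-config.xml queries must be escaped as entities (table and column names below are made up):

```xml
<entity name="item"
        query="SELECT id, name FROM items WHERE price &lt; 100 AND stock &gt; 0"/>
```

The XML parser unescapes `&lt;`/`&gt;` before the query string reaches the JDBC driver, so the database still sees plain `<` and `>`.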
I tried ap_address:(tom+cruise) and that worked. I am sure it's the same
problem as you suspected!
Thanks a lot Erick(& users!) for your time.
Moiz
On Thu, Aug 12, 2010 at 8:51 PM, Erick Erickson wrote:
> You'll get a lot of insight into what's actually happening if you append
> &debugQuery=true
On Thu, Aug 12, 2010 at 7:05 AM, Girish wrote:
> Hi,
>
> I did a load of the data with DIH, and now the data is loaded. I want to
> load the records dynamically as and when I receive them.
>
> Use cases:
>
> 1. I did load of 7MM records and now everything is working fine.
> 2. A new record is re