I don't know whether this was discussed previously, but if the SynonymFilter
breaks your synonyms apart (which might be the default), the parts of the
synonyms get new word positions. You could use a KeywordTokenizer to avoid that behaviour:
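For example, a sketch of such a field type (assuming a synonyms.txt in your
conf directory; the tokenizerFactory attribute controls how the entries of the
synonyms file are tokenized):

  <fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- KeywordTokenizerFactory keeps multi-word synonym entries whole
           instead of splitting them into parts with new word positions -->
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"
              tokenizerFactory="solr.KeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>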
with regards,
kon
There is no way to do it within DataImportHandler, but you can configure Solr
in solrconfig.xml to automatically commit pending updates by
time or number of documents.
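For example, something like this in solrconfig.xml (a sketch; tune the
thresholds to your load):

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <!-- commit after this many pending documents... -->
      <maxDocs>10000</maxDocs>
      <!-- ...or after this many milliseconds, whichever comes first -->
      <maxTime>60000</maxTime>
    </autoCommit>
  </updateHandler>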
On Tue, Aug 14, 2012 at 4:11 PM, ravicv wrote:
> Hi,
>
> Is there any way for intermediate commits while indexing data using the
> dataimport handler?
> When I send a scanned pdf to extraction request
> handler, below icon appears in my Dock.
>
> http://tinypic.com/r/2mpmo7o/6
> http://tinypic.com/r/28ukxhj/6
I found that text-extractable pdf files trigger the above weird icon too.
curl
"http://localhost:8983/solr/update/extract?literal.id=solr-
Ahmet,
The dock icon appears when AWT starts, e.g. when a font is loaded.
You can prevent it using headless mode, but this is likely to trigger an
exception.
The same applies if your user is not UI-logged-in.
Hope it helps.
Paul
On 15 August 2012 at 01:30, Ahmet Arslan wrote:
> Hi All,
>
> I have set
> the dock icon appears when AWT starts, e.g. when a font is
> loaded.
> You can prevent it using the headless mode but this is
> likely to trigger an exception.
> Same if your user is not UI-logged-in.
Hi Paul, thanks for the explanation. So is it nothing to worry about?
Hi Erick,
You are so right on the memory calculations. I am happy to now know that I
was doing something wrong. Yes, I am getting confused with SQL.
I will back up and let you know the use case. I am tracking file versions, and
I want to give an option to browse your system for the latest file
On 15 August 2012 at 13:03, Ahmet Arslan wrote:
> Hi Paul, thanks for the explanation. So is it nothing to worry about?
It is nothing to worry about, except to remember that you can't run this step in
a daemon-like process.
(On Linux, I had to set up a VNC server for similar tasks.)
paul
> Because I have posted this on Stack Overflow, I don't want there to be
> duplicate questions. Can you please read this post:
>
> http://stackoverflow.com/questions/11956608/sphinx-user-is-switching-to-solr
Your questions require Sphinx knowledge. I suggest you read these book(s):
http://lucene
Hi iorixxx, thanks for the reply.
Well, you don't need Sphinx knowledge to answer my questions.
I will write down what I want:
1. I need to have 2 separate indexes. On Stack Overflow I got the answer that I
need to start 2 cores, for example. How many cores can I run for Solr? I have
for example over 1
Hi solr-users
I have a case where I need to build an index from a database.
***Data structure***
The data is spread across multiple tables and in each table the
records are versioned - this means that one "real" record can exist
multiple times in a table, each with different validFrom/validUntil
> 1. I need to have 2 separate indexes. On Stack Overflow I
> got the answer that I
> need to start 2 cores, for example. How many cores can I run
> for Solr?
Please see : http://search-lucene.com/m/6rYti2ehFZ82
> I have for example jobs from country A, jobs from country B
> and so on until
> 100 c
Hi, Lance,
Thanks for your reply!
It seems as if RAMDirectoryFactory is being passed the correct path to
the index, as it's being logged correctly. It just doesn't recognize
it as an index.
Michael Della Bitta
Appinions | 18 East 41st St., Suite
The date checking can be implemented using a range query as a filter query,
such as
&fq=startDate:[* TO NOW] AND endDate:[NOW TO *]
(You can also use a "frange" query.)
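A full request might look like this (core and field names are hypothetical;
startDate/endDate are assumed to be date-typed fields):

  curl "http://localhost:8983/solr/select?q=*:*&fq=startDate:[*+TO+NOW]+AND+endDate:[NOW+TO+*]"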
Then you will have to flatten the database tables. Your Solr schema would
have a single "merged" record type. You will have t
You can try passing -Djava.awt.headless=true as one of the arguments
when you start Jetty to see if you can get this to go away with no ill
effects.
Michael Della Bitta
Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
> You can try passing
> -Djava.awt.headless=true as one of the arguments
> when you start Jetty to see if you can get this to go away
> with no ill
> effects.
I started Jetty using 'java -Djava.awt.headless=true -jar start.jar' and
successfully indexed two pdf files. That icon didn't appear.
On Aug 14, 2012, at 4:34 PM, Michael Della Bitta
wrote:
> Hi everyone,
>
> It looks like I found a bug with RAMDirectoryFactory (I know, I know...)
>
Fair warning - RAMDir use in Solr is like a third-class citizen. You probably
should be using the mmap dir anyway.
See http://blog.thetaphi.d
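For what it's worth, the switch is a one-line change in solrconfig.xml (a
sketch; assumes your Solr version ships solr.MMapDirectoryFactory, otherwise
solr.StandardDirectoryFactory lets Lucene pick a sensible default on 64-bit JVMs):

  <!-- replaces e.g. solr.RAMDirectoryFactory -->
  <directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>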
Yes, moving to mmap was on our roadmap. I'm in the middle of moving
our infrastructure from 1.4 to 3.6.1, and didn't want to make too many
changes at the same time. However, this bug might push us over the
edge to mmap and away from ram.
I'll file a bug regardless.
Thanks!
Michael Della Bitta
-
You would index rectangles of zero height whose left edge 'x' is the
start time and right edge 'x' is the end time. You can index a variable
number of these per Solr document and then query by either a point or
another rectangle to find documents which intersect your query shape. It
can
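To make this concrete, a hypothetical sketch (syntax borrowed from the newer
Solr 4 spatial field type; the details in your LSP iteration may differ),
mapping times onto the x axis with y fixed at 0:

  indexed value, a rectangle written as "minX minY maxX maxY":
    timeRange: 200 0 350 0
  query for everything alive at time 300, as a degenerate point rectangle:
    fq=timeRange:"Intersects(300 0 300 0)"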
On Tue, Aug 14, 2012 at 5:37 PM, Jonatan Fournier
wrote:
> On Tue, Aug 14, 2012 at 10:25 AM, Erick Erickson
> wrote:
>> This is quite odd, it really sounds like you're not
>> actually committing. So, some questions.
>>
>> 1> What happens if you search before you shut
>> down your tomcat? Do you s
These do require some Sphinx knowledge. I could answer them on StackOverflow
because I converted Chegg from Sphinx to Solr this year.
As I said there, read about Solr cores. They are independent search
configurations and indexes within one Solr server:
http://wiki.apache.org/solr/CoreAdmin
Fo
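As a sketch, a solr.xml defining one core per index might look like this
(names hypothetical):

  <solr persistent="true">
    <cores adminPath="/admin/cores">
      <core name="jobs_country_a" instanceDir="jobs_country_a"/>
      <core name="jobs_country_b" instanceDir="jobs_country_b"/>
      <!-- ...one <core> entry per independent index -->
    </cores>
  </solr>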
No problem, and thanks for posting the resolution.
If you have the time and energy, anyone can edit the Wiki if you
create a logon, so any clarification you'd like to provide to keep
others from having this problem would be most welcome!
Best
Erick
On Tue, Aug 14, 2012 at 6:13 PM, Buttler, Da
The problem you're running into is that lexical ordering of
numeric data != numeric ordering. If you have mixed
alpha and numeric data, you may not care if the alpha
stuff is first, i.e.
asdb456
asdf490
sorts fine. Problems happen with
9jsdf
100ukel
the 100ukel comes first.
So if you have a m
Please attach the results of adding &debugQuery=on
to your query in both the success and failure case; there's
very little information to go on here. You might review:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Wed, Aug 15, 2012 at 12:57 AM, chethan wrote:
> Hi,
>
> I'm trying
No, sharding into multiple cores on the same machine is still
limited by the physical memory available. It's still lots
of stuff on a limited box.
But try backing up and re-thinking the problem a bit.
Some possibilities off the top of my head:
1> have a new field "current". When you update a d
Hey solr-user, are you by chance indexing LineStrings? That is something I
never tried with this spatial index. Depending on which iteration of LSP
you are using, I figure you'd either end up indexing a vast number of points
along the line which would be slow to index and make the index quite big
I am pulling some fields from a MySQL database using DataImportHandler and
some of them have invalid XML in them. Does DataImportHandler do any kind
of filtering/sanitizing to ensure that the data will go in OK, or is it all on me?
Example bad data: orphaned ampersands ("Peanut Butter & Jelly"), curly
Hi, Jon,
As far as I know, DataImportHandler doesn't transfer data to the rest
of Solr via XML so it shouldn't be a problem...
Michael Della Bitta
Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’
Hi,
I have to index a tuple like ('blah', 'more blah info') in a multivalued
field type.
I have read about the PolyField type and it seems the best solution so far,
but I can't find documentation on how to use or implement a custom
field.
Any help is appreciated.
--
Leonardo S Souza
Hello,
I created an index => all the schema.xml & solrconfig.xml files are
created with content (I checked that they have content in the xml files).
But if I power off the system & restart again, the contents of the files
are gone. It's like they are 0-byte files.
Even the solr.xml file which got up
Just guessing: disk full?
--
Regards,
Leonardo S Souza
2012/8/15 vempap
> Hello,
>
> I created an index => all the schema.xml & solrconfig.xml files are
> created with content (I checked that they have contents in the xml files).
> But, if I poweroff the system & restart again - the conte
Nope, there is a good amount of space left on the disk.
It's happening when I'm not doing a clean shutdown. Are there any other
scenarios where it might happen?
You are not putting these files in /tmp, are you? That is sometimes wiped by
different OSes on shutdown.
-Original Message-
From: vempap [mailto:phani.vemp...@emc.com]
Sent: Wednesday, August 15, 2012 3:31 PM
To: solr-user@lucene.apache.org
Subject: Re: solr.xml entries got deleted when
No, I'm not keeping them in /tmp
: 2> Use external file fields (EFF) for the same purpose, that
: won't require you to re-index the doc. The trick
: here is you use the value in the EFF as a multiplier
: for the score (that's what function queries do). So older
: versions of the doc have scores of 0 and just d
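For reference, a minimal EFF sketch (the field name is hypothetical; modeled
on the stock example schema):

  <fieldType name="versionBoost" class="solr.ExternalFileField"
             keyField="id" defVal="0" stored="false" indexed="false"
             valType="pfloat"/>
  <field name="versionBoost" type="versionBoost"/>

The values live outside the index in a file named external_versionBoost in the
index directory, one "id=value" line per document, so flipping a version's
multiplier is a matter of rewriting that file and committing.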
Haven't managed to find a good way to do this yet. Does anyone have any
ideas on how I could implement this feature?
I really need to move docs across from one core to another atomically.
Many thanks,
Nicholas
On Mon, 02 Jul 2012 04:37:12 -0600, Nicholas Ball
wrote:
> That could work, but then ho
On 2012-7-2 at 6:37 PM, "Nicholas Ball" wrote:
>
>
> That could work, but then how do you ensure commit is called on the two
> cores at the exact same time?
That may need something like two-phase commit as in a relational DB. Lucene
has prepareCommit, but to implement 2PC, many things need to be done.
> Also, any w
Do you really need this?
Distributed transactions are a difficult problem. In 2PC, every node can
fail, including the coordinator; something like leader election is needed to make
sure it works. You could try ZooKeeper.
But if the transaction is not critically important, like transferring money in a
bank, you can
http://zookeeper.apache.org/doc/r3.3.6/recipes.html#sc_recipes_twoPhasedCommit
On Thu, Aug 16, 2012 at 7:41 AM, Nicholas Ball
wrote:
>
> Haven't managed to find a good way to do this yet. Does anyone have any
> ideas on how I could implement this feature?
> Really need to move docs across from on
Awesome, thanks a lot, I am already on it with option 1. We need to track deletes
to flip the previous version to current.
Erick Erickson wrote:
No, sharding into multiple cores on the same machine is still
limited by the physical memory available. It's still lots
of stuff on a limited box.
But.
If you want to sanitize them during indexing, the regular expression
tools can do this. You would create a regular expression that matches
bogus elements. There is a regular expression transformer in the DIH,
and a regular expression CharFilter inside the Lucene text analysis
stack.
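For instance, in a DIH config you could escape stray ampersands with something
like this (a sketch; the entity and column names are hypothetical, and the
regex is XML-escaped):

  <entity name="item" transformer="RegexTransformer"
          query="select id, title from items">
    <!-- rewrite '&' that doesn't already start an entity as '&amp;' -->
    <field column="title" regex="&amp;(?!\w+;)" replaceWith="&amp;amp;"/>
  </entity>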
On Wed, Aug 15
Hi all:
I'm using DataImportHandler to load data from MySQL.
It works fine on my development machine and in the online environment.
But I got an exception in the test environment:
> Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException:
>> Communications link failure
>
>
>> The last packet sent success
I see the problem, but there is no possibility of normalization, as the
upper limit could be anything in different cases (hard to explain).
I think it is better for me to just apply the correct type of sorting to
an array/list with some script. This is just for getting the facet values to
look