date:20140426

RE: How can I convert xml message for updating a Solr index to a javabin file

2014-04-26 Thread Elran Dvir

Does anyone know a way to do this? Thanks. -Original Message- From: Elran Dvir Sent: Thursday, April 24, 2014 4:11 PM To: solr-user@lucene.apache.org Subject: RE: How can I convert xml message for updating a Solr index to a javabin file I want to measure xml vs javabin update message i

[ANN] Apache Gora 0.4 Released

2014-04-26 Thread Lewis John Mcgibbney

Good Afternoon Everyone, The Apache Gora team are very proud to announce the immediate release of Gora 0.4 which is a major release for the project. The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column st

Re: Indexing Big Data With or Without Solr

2014-04-26 Thread Aman Tandon

Thanks vineet With Regards Aman Tandon On Wed, Apr 23, 2014 at 7:21 PM, Vineet Mishra wrote: > I did it with Tomcat and Zookeeper Ensemble, will mail you the steps > shortly. > > Cheers > > > On Sat, Apr 19, 2014 at 9:09 AM, Aman Tandon >wrote: > > > Vineet please share after you setup for sol

Re: Search for a mask that matches the requested string

2014-04-26 Thread Alan Woodward

Hi, I'm the author of luwak. I have a half-finished version sitting in a branch somewhere that pulls all the intervals-fork-specific code out of the library and would run with 4.6. It would need to be integrated into Solr as well, but I have an upcoming project which may well do just that. Fe

Re: DIH issues with 4.7.1

2014-04-26 Thread Mark Miller

bq. due to things like NTP, etc. The full sentence is very important. NTP is not the only way for this to happen - you also have leap seconds, daylight savings time, internet clock sync, a whole host of things that affect currentTimeMillis and not nanoTime. It is without question the way to go

Re: DIH issues with 4.7.1

2014-04-26 Thread Walter Underwood

NTP works very hard to keep the clock positive monotonic. But nanoTime is intended for elapsed time measurement anyway, so it is the right choice. You can get some pretty fun clock behavior by running on virtual machines, like in AWS. And some system real time clocks don't tick during a leap sec

Re: zkCli zkhost parameter

2014-04-26 Thread Mark Miller

Have you tried a comma-separated list or are you going by documentation? It should work. -- Mark Miller about.me/markrmiller On April 26, 2014 at 1:03:25 PM, Scott Stults (sstu...@opensourceconnections.com) wrote: It looks like this only takes a single host as its value, whereas the zkHost

zkCli zkhost parameter

2014-04-26 Thread Scott Stults

It looks like this only takes a single host as its value, whereas the zkHost environment variable for Solr takes a comma-separated list. Shouldn't the client also take a comma-separated list? k/r, Scott

Re: Optimal setup for multiple tools

2014-04-26 Thread Erick Erickson

Have you considered putting them in the _same_ index? There's not much penalty at all with having sparsely populated fields in a document, so the fact that the two parts of your index had orthogonal fields wouldn't cost you much and would solve the synchronization problem. You can include a type f

Re: SOLR 4 not utilizing multi CPU cores

2014-04-26 Thread Erick Erickson

I suspect your problem is that termfreq is looking at _terms_, not phrases. It has no sense of position, that's a higher-level construct. So "Research Development" is searched as a single _term_, and there are no two-word terms. What use-case are you trying to solve? This seems like an XY problem

Re: DIH issues with 4.7.1

2014-04-26 Thread Mark Miller

My answer remains the same. I guess if you want more precise terminology, nanoTime will generally be monotonic and currentTimeMillis will not be, due to things like NTP, etc. You want monotonicity for measuring elapsed times. -- Mark Miller about.me/markrmiller On April 26, 2014 at 11:25:16 AM,
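The distinction drawn in this thread can be sketched in plain Java. This is a minimal illustration only, not the DIH code: the class `ElapsedTimer` and its methods are hypothetical names introduced here. It measures elapsed time with `System.nanoTime()` and uses `System.currentTimeMillis()` only to print a wall-clock timestamp.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of Solr or DIH.
final class ElapsedTimer {
    // nanoTime() has an arbitrary origin, so it is only meaningful as a difference.
    private final long startNanos = System.nanoTime();

    // Elapsed time since construction, in whole milliseconds.
    long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }

    // Compare by subtraction, never with "now > start + limit":
    // the difference stays correct even if the raw counter wraps around.
    static boolean olderThan(long startNanos, long nowNanos, long limitNanos) {
        return nowNanos - startNanos > limitNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        ElapsedTimer timer = new ElapsedTimer();
        Thread.sleep(50);
        System.out.println("elapsed ms: " + timer.elapsedMillis());
        // Wall-clock time is fine for timestamps, but it can be adjusted
        // underneath the program, so it is not used for the measurement above.
        System.out.println("wall clock ms: " + System.currentTimeMillis());
    }
}
```

The subtraction form in `olderThan` follows the comparison style recommended in the `System.nanoTime()` javadoc.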

Re: TB scale

2014-04-26 Thread Walter Underwood

I think Hathi Trust has a few terabytes of index. They do full-text search on 10 million books. http://www.hathitrust.org/blogs/Large-scale-Search wunder On Apr 26, 2014, at 8:36 AM, Toke Eskildsen wrote: >> Anyone with experience, suggestions or lessons learned in the 10 -100 TB >> scale th

RE: TB scale

2014-04-26 Thread Toke Eskildsen

> Anyone with experience, suggestions or lessons learned in the 10 -100 TB > scale they'd like to share? > Researching optimum design for a Solr Cloud with, say, about 20TB index. We're building a web archive with a projected index size of 20TB (distributed in 20 shards). Some test results and a

Re: DIH issues with 4.7.1

2014-04-26 Thread Walter Underwood

NTP should slew the clock rather than jump it. I haven't checked recently, but that is how it worked in the 90's when I was organizing the NTP hierarchy at HP. It only does step changes if the clock is really wrong. That is most likely at reboot, when other daemons aren't running yet. wunder O

Re: DIH issues with 4.7.1

2014-04-26 Thread Mark Miller

System.currentTimeMillis can jump around due to NTP, etc. If you are trying to count elapsed time, you don’t want to use a method that can jump around with the results. -- Mark Miller about.me/markrmiller On April 26, 2014 at 8:58:20 AM, YouPeng Yang (yypvsxf19870...@gmail.com) wrote: Hi Rafał

Optimal setup for multiple tools

2014-04-26 Thread Jimmy Lin

Hello, My team has been working with SOLR for the last 2 years. We have two main indices: 1. documents -index and store main text -one record for each document 2. places (all of the geospatial places found in the documents above) -index but don't store main text -

Re: get term frequency, just only keywords search

2014-04-26 Thread Jack Krupansky

You need to use a shingle filter at index time so that pairs of adjacent words get indexed as single terms, then you can do a term frequency for the shingled pair of terms ("Research Development" as a single term). Be sure to manually apply any other filters, such as lower case or stemming. Se
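To make the idea of shingling concrete, here is a rough plain-Java sketch. It does not use Lucene's shingle filter or any Solr API: `ShingleDemo`, `shingles`, and `termFreq` are hypothetical names, and the sketch assumes whitespace tokenization, lowercasing as the only extra filter, and two-word shingles only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration of word-pair shingles; not Lucene's implementation.
final class ShingleDemo {
    // Lowercase the text, split on whitespace, and join each adjacent pair of words.
    static List<String> shingles(String text) {
        List<String> out = new ArrayList<>();
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            return out;
        }
        String[] words = normalized.split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            out.add(words[i] + " " + words[i + 1]);
        }
        return out;
    }

    // Count how often a two-word phrase occurs as a single shingled term.
    // The phrase gets the same normalization that was applied to the text.
    static int termFreq(String text, String phrase) {
        String wanted = String.join(" ",
                phrase.trim().toLowerCase(Locale.ROOT).split("\\s+"));
        int count = 0;
        for (String shingle : shingles(text)) {
            if (shingle.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String doc = "Research Development funds research development";
        System.out.println(shingles(doc));
        System.out.println(termFreq(doc, "Research Development"));
    }
}
```

The query phrase is lowercased by hand in `termFreq`, which mirrors the advice above to apply the same filters manually to the term being looked up.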

Re: DIH issues with 4.7.1

2014-04-26 Thread YouPeng Yang

Hi Rafał Kuć, I got it. The point is that many operating systems measure time in units of tens of milliseconds, and System.currentTimeMillis() is based on the operating system clock. In my case, I just run DIH from a crontab. Is there any possibility of running into that trouble? I really cannot picture w

Re: DIH issues with 4.7.1

2014-04-26 Thread YouPeng Yang

Hi Mark Miller, Sorry to pull you into this discussion. I noticed that Mark Miller reported this issue in https://issues.apache.org/jira/browse/SOLR-5734 according to https://issues.apache.org/jira/browse/SOLR-5721, but it only happened with ZooKeeper. If I just do DIH with JDBCDataSource, I d

Re: DIH issues with 4.7.1

2014-04-26 Thread Rafał Kuć

Hello! Look at the javadocs for both. The granularity of System.currentTimeMillis() depends on the operating system, so it may happen that calls to that method that are 1 millisecond away from each other still return the same value. This is not the case with System.nanoTime() - http://docs.oracle.c

Re: DIH issues with 4.7.1

2014-04-26 Thread YouPeng Yang

Hi, I have just compared versions 4.6.0 and 4.7.1. I noticed that the time in the getConnection function is taken from System.nanoTime in 4.7.1, while 4.6.0 uses System.currentTimeMillis(). I am curious about the reason for the change and its benefit. Is it necessary? I hav

'0' Status: Communication Error

2014-04-26 Thread Naresh

I've got this problem that I can't solve. Partly because I can't explain it with the right terms. I'm new to this so sorry for this clumsy question. Below you can see an overview of my goal. I'm using Magento CE1.7.0.2 & Solr 4.6.0. I'm using Magentix/Solr extension in Magento CE1.7.0.2 its work

How to sort solr results by foreign id field

2014-04-26 Thread hungctk33

I have documents with the following fields: id name parent color The parent field is an ID of another document. I want to select all documents where the color is red and sort the results by the name of the parent. Can it be done in solr? - I am a student IT -- View this message in context:

Re: get term frequency, just only keywords search

2014-04-26 Thread ksmith

Hi Jack, I have the same problem as danielitos85. I want to search for "research development", but the termfreq function does not work as described in your messages. You said to use phraseFreq, but we can get that from the debug query. My problem is that I want to sort on the "research development" count, higher count document wi

Re: SOLR 4 not utilizing multi CPU cores

2014-04-26 Thread ksmith

Hi Salman, I am running into a problem with Solr 4.6. I upgraded from Solr 1.4 to Solr 4.6 because I want to display the search term count, which I get from Solr's term frequency. When I search for a single word it works fine and I get the correct count, but when I search for multiple words within double quo


25 matches
