Re: Solr PHP client

2012-12-14 Thread Guillaume Rossolini
Hi, The various Solr PHP clients have been a great help in the past, and I do not mean to belittle their efforts. However, the Solr project has made many efforts to support several input and output data formats, including JSON and even serialized PHP, which are fairly easy to implement. Maybe I am

Re: Solrj connect to already running solr server

2012-12-14 Thread Per Steffensen
Billy Newman wrote: I have deployed the solr.war to my application server. On deploy I can see the solr server and my core "general" start up. I have a timer that fires every so often to go out and 'crawl' some services and index into Solr. I am using Solrj in my application and I am having tr

Re: Solrj connect to already running solr server

2012-12-14 Thread Per Steffensen
Per Steffensen wrote: Billy Newman wrote: I have deployed the solr.war to my application server. On deploy I can see the solr server and my core "general" start up. I have a timer that fires every so often to go out and 'crawl' some services and index into Solr. I am using Solrj in my applica

Re: Update / replication of offline indexes

2012-12-14 Thread Upayavira
I guess without knowing more about the use case, it is difficult to see whether it is best to ship pre-prepared indexes or indexable content. Certainly the latter would be far simpler, and more in keeping with the way Solr is typically used, and personally I'd start with that. Thinking through what

Re: Solr PHP client

2012-12-14 Thread Romita Saha
Hi I am using Solr-PHP client. I am not able to ping Solr. Is there any change that I need to make in Solr config files so that it can listen to the PHP client? I get the following error : ping() ) { echo 'Solr service not responding.'; exit; } Could someone please help. Thanks and regards, R

Re: Highlighting data stored outside of Solr

2012-12-14 Thread lboutros
Hi Michael, it was late yesterday when I wrote my last message. And it did not help that much. Feel free to contact me directly. I can not share the code I wrote for legal obligations. But I can help you :) Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.

Re: Strange data-loss problem on one of our cores

2012-12-14 Thread John Nielsen
I did a manual commit, and we are still missing docs, so it doesn't look like the search race condition you mention. My boss wasn't happy when I mentioned that I wanted to try out unreleased code. I'll get him won over though and return with my findings. It will probably be some time next week. Th

RE: Strange data-loss problem on one of our cores

2012-12-14 Thread Markus Jelsma
FYI, we observe the same issue, after some time (days, months) a cluster running an older trunk version has at least two shards where the leader and the replica do not contain the same number of records. No recovery is attempted, it seems it thinks everything is alright. Also, one core of one of

Re: Strange data-loss problem on one of our cores

2012-12-14 Thread John Nielsen
How did you solve the problem? -- Med venlig hilsen / Best regards *John Nielsen* Programmer *MCB A/S* Enghaven 15 DK-7500 Holstebro Kundeservice: +45 9610 2824 p...@mcb.dk www.mcb.dk On Fri, Dec 14, 2012 at 12:04 PM, Markus Jelsma wrote: > FYI, we observe the same issue, after some tim

RE: Strange data-loss problem on one of our cores

2012-12-14 Thread Markus Jelsma
We did not solve it but reindexing can remedy the problem. -Original message- > From:John Nielsen > Sent: Fri 14-Dec-2012 12:31 > To: solr-user@lucene.apache.org > Subject: Re: Strange data-loss problem on one of our cores > > How did you solve the problem? > > > -- > Med venlig hil

Re: facet count distinct and sum group by field

2012-12-14 Thread Fredrik Rødland
On 14 Dec 2012 at 06:16, cmd.ares wrote: > i want to use solr like sql: > select type,count(distinct product_name)s1,sum(price)s2 group by type > how to do it with solr? > thanks I think you should be able to do this using the StatsComponent faceting on type http://wiki.apache.org/solr/Sta
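A rough sketch of what the StatsComponent suggestion might look like as a request, assuming the `price` and `type` fields from the question. The select URL below is a placeholder, not something given in the thread. The per-facet stats include `sum` and `count`; whether a distinct count is available depends on the Solr version, so this covers only the `sum(price) group by type` part.

```python
from urllib.parse import urlencode


def stats_by_facet_url(select_url, stats_field, facet_field):
    """Build a select URL asking the StatsComponent for stats on
    stats_field, broken down by the values of facet_field."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),                 # only the stats section is needed
        ("stats", "true"),
        ("stats.field", stats_field),
        ("stats.facet", facet_field),
        ("wt", "json"),
    ]
    return select_url + "?" + urlencode(params)


# Hypothetical host and core path, for illustration only.
url = stats_by_facet_url("http://localhost:8983/solr/select", "price", "type")
print(url)
```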

Performance improvement in large OR query using boosting (also, cache doesn't work?)

2012-12-14 Thread David Radunz
Hey Guys, I have really been enjoying Solr and I can't really blame the slowness on solr as this is a pretty insane query. However, I am a little curious why a repeated query moments later also suffers from the same load time? Anyway, the queries are: // 1st Query INFO: [] webapp=/solr

RE: Performance improvement in large OR query using boosting (also, cache doesn't work?)

2012-12-14 Thread Markus Jelsma
Hi, This is insane indeed! Please enable debugging and report the prepare and process times for the query component. I think the prepare time is very high in both queries and the process time is slightly less for the second query due to caching. Cheers, -Original message- > From:D

Re: Strange data-loss problem on one of our cores

2012-12-14 Thread John Nielsen
I'm building a simple tool which will help us monitor the solr cores for this problem. Basically it does a q=*:* on both servers on each core and compares numFound of each result. Problem is that since this is a cloud setup, I can't be sure which server gets me the result. Is there a parameter I c

RE: Strange data-loss problem on one of our cores

2012-12-14 Thread Markus Jelsma
You must use the core's name and not use the collection name so you have to know which core is on which server. http://host:port/solr/corename/select You can use the cores handler to find out about the cores on the node: http://host:port/solr/admin/cores You can also use luke for this. It return
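A minimal sketch of the kind of monitoring check discussed in this thread: query each core by its core name and compare `numFound` between the cores of a shard. The host, port, and core names are placeholders. The sketch also adds `distrib=false` so that the core answers only from its own index; that is a standard Solr request parameter, but it goes beyond what this message itself says. The `fetch` argument is a hypothetical hook so the functions can be exercised without a running server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def core_select_url(base_url, core, query="*:*"):
    """Build a select URL that addresses one core directly, not the collection."""
    params = urlencode(
        {"q": query, "rows": 0, "wt": "json", "distrib": "false"}
    )
    return f"{base_url}/{core}/select?{params}"


def num_found(url, fetch=None):
    """Return numFound from a JSON select response."""
    if fetch is None:
        fetch = lambda u: urlopen(u).read()
    return json.loads(fetch(url))["response"]["numFound"]


def find_mismatches(counts):
    """counts maps shard name -> {core_url: numFound}.
    Return only the shards whose cores disagree."""
    return {
        shard: per_core
        for shard, per_core in counts.items()
        if len(set(per_core.values())) > 1
    }
```

The list of cores on each node can be read from the cores handler mentioned above and fed into `core_select_url`; any shard returned by `find_mismatches` has a leader and replica that report different document counts.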

Solr highlighting problem

2012-12-14 Thread dutchiexl
I have a solr query where I search for (webpage_text:*test* OR company_text:*test*) In my highlighting I set my fields to webpage_text, company_text. But now I always get BOTH fields in the highlighting result, even when the search term is only found in webpage_text, I also get a highlight result

Re: Strange data-loss problem on one of our cores

2012-12-14 Thread John Nielsen
Awesome! http://host:port/solr/admin/cores is exactly what i needed! -- Med venlig hilsen / Best regards *John Nielsen* Programmer *MCB A/S* Enghaven 15 DK-7500 Holstebro Kundeservice: +45 9610 2824 p...@mcb.dk www.mcb.dk On Fri, Dec 14, 2012 at 1:21 PM, Markus Jelsma wrote: > You mus

Re: Performance improvement in large OR query using boosting (also, cache doesn't work?)

2012-12-14 Thread David Radunz
Hey, Sorry for the delay, I had to enable larger header buffers in jetty to do this as a GET query (LOL). Anyway, I have put the results on pastebin to try and make it more presentable, though it's mostly failed. 1st Query: http://pastebin.com/uSGtQjA3 (query with a freshly started solr)

Re: Solr PHP client

2012-12-14 Thread Upayavira
Can you access your Solr server via a browser? I bet it is something simple like a URL being wrong. Upayavira On Fri, Dec 14, 2012, at 09:37 AM, Romita Saha wrote: > Hi I am using Solr-PHP client. I am not able to ping Solr. Is their any > change that I need to make in Solr config files so that

NGram with words

2012-12-14 Thread Arkadi Colson
Hi When "abcdefg 123456" is in Solr I would like to have match with - abcd - cdef - abcdefg 123456 - "abcdefg 123456" - "defg 1234" The last one is actually not working. What am I doing wrong? My config looks like this. /stored="false" multiValued="true" omitNorms="true" omitPositions="false"

Re: Solr PHP client

2012-12-14 Thread Bill Au
You need to configure and start Solr independent of any client you use. Bill On Fri, Dec 14, 2012 at 2:23 AM, Romita Saha wrote: > Hi, > > Can anyone please guide me to use SolrPhpClient? The documents available > are not clear. As to where to place SolrPhpClient? > > I have downloaded SolrPhpC

Re: if I only need exact search, does frequency/score matter?

2012-12-14 Thread Bill Au
If your exact search returns more than one result, then by default they are sorted by the score. Bill On Thu, Dec 13, 2012 at 11:41 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Hi > > If you are doing a pure boolean search - something matches or doesn't match > and you don't care

RE: Need help with delta import

2012-12-14 Thread Swati Swoboda
If I am not mistaken, it's supposed to be "dataimporter.delta.ID" and "dataimporter.last_index_time" You are using dataimport.delta.ID and dataimport.last_index_time http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport -Original Message- From: umajava [mailto:umaj...@gm

Re: NGram with words

2012-12-14 Thread Jack Krupansky
Yeah, the positions for ngrams have a good chance of not being what you want. But do try the Solr Admin Analysis web page for that index text and see what positions it generates for the sub-words. The two generated words used in your query may not have adjacent positions. -- Jack Krupansky

Re: NGram with words

2012-12-14 Thread Walter Underwood
Positions for edge ngrams are wrong. They should be handled like synonyms. This breaks phrase matching with ngrams. Not sure if there is a bug filed for this. wunder On Dec 14, 2012, at 8:16 AM, Jack Krupansky wrote: > Yeah, the positions for ngrams have a good chance of not being what you want

Re: NGram with words

2012-12-14 Thread Jack Krupansky
I can believe it. Note: He's using "ngrams", not "edge" ngrams. -- Jack Krupansky -Original Message- From: Walter Underwood Sent: Friday, December 14, 2012 11:21 AM To: solr-user@lucene.apache.org Cc: ark...@smartbit.be Subject: Re: NGram with words Positions for edge ngrams are wrong

Re: score calculation

2012-12-14 Thread Jack Krupansky
boost(index) is the index-time boost, the optional boost that you can specify when adding a document to the index, such as: 05991 Bridgewater The index-time boost is multiplied by the length normalization factor which gives a higher score for shorter documents. Note that fieldNor
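As a rough worked example of the multiplication described here, assuming Lucene's default similarity, where the length normalization factor is 1/sqrt(number of terms in the field). The function name is illustrative only. The value stored in the index is encoded with limited precision, so the figures in debug output are approximations of what this computes.

```python
import math


def field_norm(index_boost, num_terms):
    """Index-time boost multiplied by the length normalization
    factor 1/sqrt(num_terms) (default-similarity assumption)."""
    return index_boost / math.sqrt(num_terms)


# A 4-term field with no boost and a 16-term field boosted by 2.0
# end up with the same norm before precision loss.
print(field_norm(1.0, 4))
print(field_norm(2.0, 16))
```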

Re: how make a suggester?

2012-12-14 Thread iwo
now I have some suggest for single word by WFSTLookupFactory suggest org.apache.solr.spelling.suggest.Suggester org.apache.solr.spelling.suggest.fst.WFSTLookupFactory name 0.005 true ./suggester true tru

Re: Solr PHP client

2012-12-14 Thread Jorge Luis Betancourt Gonzalez
Hi Guillaume: I beg to differ, it's true that the native solr support has been a big aid to developers' use of solr from many programming languages. But making all the queries "by hand" is not wise and in any case is hard to maintain, it's easier using some OO library to interact with solr. For

RE: Need help with delta import

2012-12-14 Thread umajava
Thanks, but this didn't help either. Documents are not getting committed. 3202012-12-14 16:57:252012-12-14 16:57:252012-12-14 16:57:252012-12-14 16:57:25200:0:0.140 Should I do full import again as I have included email and fname in the query and start again? -- View this message in context: h

Re: NGram with words

2012-12-14 Thread Walter Underwood
I specified "edge ngrams" because that is the one I've investigated. --wunder On Dec 14, 2012, at 8:30 AM, Jack Krupansky wrote: > I can believe it. > > Note: He's using "ngrams", not "edge" ngrams. > > -- Jack Krupansky > -Original Message- From: Walter Underwood > Sent: Friday, Decemb

Re: Differentiate between correctly spelled term and mis-spelled term with no corrections

2012-12-14 Thread Nalini Kartha
Hi James, Couple more follow up questions - 1. Do changes to the response format have to be backwards compatible at this point? Seems like if we changed it to always return the origFreq even if there are no suggestions then that could break things right? 2. For our purposes, we need to be able to

LUKE (lucene index toolbox) for lucene/solr 4.0

2012-12-14 Thread solr solr
Please, could someone tell me where I can find a version of luke (Lucene Index Toolbox) compatible with lucene/solr 4.0 index format? The version-4.0.0-lukeall ALPHA.jar, currently available in http://code.google.com/p/luke/ does not work. I tried to re-build from

Re: How to change Solr UI

2012-12-14 Thread Erik Hatcher
There's absolutely nothing inherently wrong with using Velocity with lean templating to render responses from Solr. It's just a templating technology. What you've done in your patch is replace the (IMO, and perhaps I'm the minority here?) clean Velocity template approach with some JavaScript/

RE: Need help with delta import

2012-12-14 Thread umajava
I tried full import and then delta import but still the issue is same. -- View this message in context: http://lucene.472066.n3.nabble.com/Need-help-with-delta-impor

Re: User-Agent string in Solr

2012-12-14 Thread hy
Hi Victor, i have the same problem. Did you find a solution to set the user-agent in solr-cell (ExtractingRequestHandler)? Greetings -- View this message in context: http://lucene.472066.n3.nabble.com/User-Agent-string-in-Solr-tp4022869p4027067.html Sent from the Solr - User mailing list arch

Re: Strange data-loss problem on one of our cores

2012-12-14 Thread Mark Miller
On Dec 14, 2012, at 7:09 AM, John Nielsen wrote: > Is there a > parameter I can add to the GET requests that will lock the request to a > specific node in the cluster, treating the server receiving the request as > a standalone server as opposed to a member of a cluster? The param distrib=false w

Re: Strange data-loss problem on one of our cores

2012-12-14 Thread Mark Miller
Have you filed a JIRA issue for this that I don't remember Markus? We need to make sure this is fixed. Any idea around when the trunk version came from? Before or after 4.0? - Mark On Dec 14, 2012, at 6:36 AM, Markus Jelsma wrote: > We did not solve it but reindexing can remedy the problem.

optimum precisionStep for DAY granularity in a TrieDateField

2012-12-14 Thread jmlucjav
Hi I have a TrieDateField in my index, where I will index dates (range 2000-2020). I am only interested in the DAY granularity, that is, I don't care about time (I'll index all based on the same Timezone). Is there an optimum value for precisionStep that I can use so I don't index info I will not

RE: Strange data-loss problem on one of our cores

2012-12-14 Thread Markus Jelsma
Mark, no issue has been filed. That cluster runs a checkout from around the end of July/beginning of August. I'm in the process of including another cluster in the indexing and removal of documents besides the old production clusters. I'll start writing to that one Tuesday or so. If I notice a discre

Re: How to change Solr UI

2012-12-14 Thread Upayavira
I guess it is important to distinguish between the VelocityResponseWriter and the /browse request handler. I suspect you are referring to the latter. The /browse interface is both useful and problematic. It is useful because it allows users to interact with their searches and results in a more int

RE: Differentiate between correctly spelled term and mis-spelled term with no corrections

2012-12-14 Thread Dyer, James
Nalini, I don't think you can change the *default* response format until a new major release (so its ok for Trunk/5.0 but not for the 4.x branch). What you can do, however, is create a new "spellcheck.xxx" parameter to let users opt-in to the new functionality in 4.x as desired. We'd also wan

RE: Need help with delta import

2012-12-14 Thread Dyer, James
Try ${dih.delta.ID} instead of ${dataimporter.delta.id}. Also use ${dih.last_index_time} instead of ${dataimporter.last_index_time} . I noticed when updating the test cases that the wiki incorrectly used the longer name but with all the versions I tested this on only the short name works. The

Re: Need help with delta import

2012-12-14 Thread Shawn Heisey
On 12/14/2012 11:39 AM, Dyer, James wrote: Try ${dih.delta.ID} instead of ${dataimporter.delta.id}. Also use ${dih.last_index_time} instead of ${dataimporter.last_index_time} . I noticed when updating the test cases that the wiki incorrectly used the longer name but with all the versions I t

Re: optimum precisionStep for DAY granularity in a TrieDateField

2012-12-14 Thread Lance Norskog
Do you use rounding in your dates? You can index a date rounded to the nearest minute, N minutes, hour or day. This way a range query has to look at such a small number of terms that you may not need to tune the precision step. Hunt for NOW/DAY or 5DAYS in the queries. http://wiki.apache.org/s
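A small sketch of the two halves of this advice: truncating a timestamp to day granularity before indexing, and building a day-aligned range query with Solr date math. The field name `pubdate` is hypothetical. Note that this truncates to the start of the UTC day, which is how `/DAY` behaves, instead of rounding to the nearest day.

```python
from datetime import datetime, timezone


def to_solr_day(dt):
    """Truncate an aware datetime to midnight UTC and format it
    in Solr's date syntax."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    day = dt.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return day.strftime("%Y-%m-%dT%H:%M:%SZ")


def last_days_range(field, days):
    """Range query over the last `days` whole days, using date math
    so both endpoints fall on day boundaries."""
    return f"{field}:[NOW/DAY-{days}DAYS TO NOW/DAY+1DAY]"


print(to_solr_day(datetime(2012, 12, 14, 16, 57, 25, tzinfo=timezone.utc)))
print(last_days_range("pubdate", 5))
```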

Re: Solrcloud and Node.js

2012-12-14 Thread Mark Miller
Yes, you can access SolrCloud in any std way you can access Solr. The main difference when using a client that does not know how to talk to ZooKeeper about the cluster state: You have to specify a particular machine's address or set up a load balancer when using a 'dumb' client. A dumb client wil

RE: Need help with delta import

2012-12-14 Thread Swati Swoboda
I am also confused, as I've been using dataimporter.* and not dih.* and it is working fine. -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Friday, December 14, 2012 2:41 PM To: solr-user@lucene.apache.org Subject: Re: Need help with delta import On 12/14/201

small QTime but slow results to user

2012-12-14 Thread S L
Sometimes when I use curl to query solr I get a slow real time response but a short QTime. Here's an example: $ time curl "solrsandbox/testindex/select/?q=all:science,data&rows=500" > foo % Total % Received % Xferd Average Speed Time Time Time Current

RE: Need help with delta import

2012-12-14 Thread Dyer, James
Shawn, I think it only is a problem with "dih.delta.xxx" ... the longer version, "dataimport.delta.xxx" doesn't work. This is coded in DocBuilder#doDelta and this line: vri.addNamespace(ConfigNameConstants.IMPORTER_NS_SHORT + ".delta", map); There is no additional line for: vri.addNamespace(

Re: Solrcloud and Node.js

2012-12-14 Thread Luis Cappa Banda
I think that Node.js is extremely powerful for developing very light and simple REST API modules, so combining it with Solr sounds good, that's why I'm obsessed with combining them. So then with an example of numShards=2 SolrCloud it is possible to execute queries like: http://host1:8000/solr/collection1

Re: optimum precisionStep for DAY granularity in a TrieDateField

2012-12-14 Thread jmlucjav
thanks Lance. I knew about rounding in the request params, but I want to know if there is something to tweak at indexing time (by changing precisionStep in schema.xml) in order to store only needed information. At query time yes, I would round to /DAY -- View this message in context: http://l

Re: Solr 4 under Windows OK?

2012-12-14 Thread Jack Krupansky
I run under Windows 7 fine. I do run under Cygwin though. And the Solr Admin UI is completely unusable in Internet Explorer but works fine in Google Chrome, except that query results via the Admin UI will not display properly (the XML) in ANY browser (although some people claim it does work f

Re: Solr 4 under Windows OK?

2012-12-14 Thread Alexandre Rafalovitch
I run one of the smallish systems on Windows (4.0beta actually, with embedded Jetty). Natively and so far without problems. I even have a guide of setting it up under windows service, if people are interested. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.li

Re: optimum precisionStep for DAY granularity in a TrieDateField

2012-12-14 Thread Jack Krupansky
I've tried to figure this out and haven't fully resolved it. I mean, sure, you can set the precisionStep to 26, which may ignore the milliseconds per day, but supposedly it makes it much slower to lookup and may not actually throw away those 26 bits. -- Jack Krupansky -Original Message---

Re: optimum precisionStep for DAY granularity in a TrieDateField

2012-12-14 Thread Jack Krupansky
And the "official" answer when I posed the question on the Lucene User list is that the time of day bits would still be stored in the index in spite of the precisionStep. So, it doesn't really matter very much at all what precisionStep you use for trie date fields. -- Jack Krupansky -Ori

Re: Solr 4 under Windows OK?

2012-12-14 Thread Otis Gospodnetic
Thanks Jack and Alexandre. Is performance under Windows comparable with performance under Linux, assuming the same HW, JVM, and such, do you know? Otis -- SOLR Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Fri

Re: Solr 4 under Windows OK?

2012-12-14 Thread Alexandre Rafalovitch
Sorry, not at that stage yet. It works so much faster than the previous (in-house) system that we haven't even bothered with full-tuning yet, never mind comparative performance testing. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrer

Re: optimum precisionStep for DAY granularity in a TrieDateField

2012-12-14 Thread Shawn Heisey
On 12/14/2012 4:15 PM, Jack Krupansky wrote: And the "official" answer when I posed the question on the Lucene User list is that the time of day bits would still be stored in the index in spite of the precisionStep. So, it doesn't really matter very much at all what precisionStep you use for tr

Dedup component

2012-12-14 Thread Jorge Luis Betancourt Gonzalez
Hi all: I'm trying to build a query suggestion system using solr (also used to index all the data in the app). I've a separated core dedicated only for this purpose (along with some other for images, etc.). In the main app, written in Symfony2 + Solarium Bundle, we store the queries in this core

Re: small QTime but slow results to user

2012-12-14 Thread Otis Gospodnetic
Hi, It's the network or disk. Monitor both when running the query for the first time. Try the query multiple times. If it's faster the second time around it's not the network. If it is slow the second time, it is likely the network. Try fewer rows. Otis -- SOLR Performance Monitoring - http://sem

Re: small QTime but slow results to user

2012-12-14 Thread Chris Hostetter
: 500 : : I'm guessing the delay is from Lucene and not the network but I could be : wrong. 90% of my queries are 8 to 10 times faster than this. http://wiki.apache.org/solr/SolrTerminology QTime: The elapsed time (in milliseconds) between the arrival of the request (when the SolrQueryRequest
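One way to see the gap being discussed is to record the client-side elapsed time next to the QTime that Solr reports in the response header. The sketch below assumes a JSON response (`wt=json`). The `fetch` argument is a hypothetical callable that returns the response body, so any HTTP client can be plugged in. The difference between the two numbers covers everything QTime leaves out, such as writing and transferring the response.

```python
import json
import time


def timed_query(url, fetch):
    """Return (QTime reported by Solr, wall-clock milliseconds seen
    by the client) for a single request."""
    start = time.perf_counter()
    body = fetch(url)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    qtime_ms = json.loads(body)["responseHeader"]["QTime"]
    return qtime_ms, elapsed_ms
```

Running the same query with fewer rows, or with a shorter field list, and watching how the wall-clock figure moves while QTime stays flat is a quick way to tell response size apart from search cost.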

Re: small QTime but slow results to user

2012-12-14 Thread Yonik Seeley
On Fri, Dec 14, 2012 at 3:43 PM, S L wrote: > Does anyone have an idea why a query that takes solr just half a second (500 > ms) to execute would take 3 seconds to transfer the data? Normally this is due to slow reading of the stored fields (i.e. slow disk IO). For scalability, we don't read all