Re: Can not find solr core on admin page after setup

2013-10-29 Thread engy.morsy
Yes, I do. I installed the Solr example instance. Engy.

Re: Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Dileepa Jayakody
Thanks guys for your ideas. I will go through them and come back with questions. Regards, Dileepa On Wed, Oct 30, 2013 at 7:00 AM, Erick Erickson wrote: > Third time tonight I've been able to paste this link > > Also, you can consider just moving to SolrJ and > taking DIH out of the proces

Re: Apache-Solr with Tomcat: displaying the format of search result

2013-10-29 Thread pyramesh
Hi Erick, I guess you misunderstood. Let me explain my problem in detail. Field Name :* PROBLEM* *Content of PROBLEM* Title: Title of issue Description: Description of issue Other Detail :OtherDetails of the issue

Re: Many Dynamic Fields + Indexing Strategy

2013-10-29 Thread Jack Krupansky
Every multitenant situation is going to be different, but at the extreme a single core per tenant is the cleanest and provides the best separation, optimal performance, and supports full tf-idf relevancy of document fields for each tenant. You can also do a hybrid, where you have separate core

Re: Replace document title with filename if it's empty

2013-10-29 Thread Bayu Widyasanyata
Hi Erick, Thanks for the info. Regards, On Wed, Oct 30, 2013 at 8:01 AM, Erick Erickson wrote: > You can write a custom bit of update code that lives on the Solr server > that > would essentially copy the filename field to title if title wasn't present. > > You could write a SolrJ program that

Re: Background merge errors with Solr 4.4.0 on Optimize call

2013-10-29 Thread Robert Muir
I think it's a bug, but that's just my opinion. I sent a patch to dev@ for thoughts. On Tue, Oct 29, 2013 at 6:09 PM, Erick Erickson wrote: > Hmmm, so you're saying that merging indexes where a field > has been removed isn't handled. So you have some documents > that do have a "what" field, but you

Re: Solr 4.5.1 replication Bug? "Illegal to have multiple roots (start tag in epilog?)."

2013-10-29 Thread Sai Gadde
I just opened a JIRA issue https://issues.apache.org/jira/browse/SOLR-5402 SOLR-5331 was closed and I could not reopen it, so I created a new one. Thanks Sai On Wed, Oct 30, 2013 at 5:49 AM, Mark Miller wrote: > Has someone filed a JIRA issue with the current known info yet? > > - Mark > >

Many Dynamic Fields + Indexing Strategy

2013-10-29 Thread Alejandro Calbazana
Hi, I have an application that has a fair number of dynamic fields in addition to static fields. The use case is that a customer can create any number of dynamic fields and associate them with domain objects that we then pull into an indexed document. I have no way to know these fields in advanc

Re: Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Erick Erickson
Third time tonight I've been able to paste this link Also, you can consider just moving to SolrJ and taking DIH out of the process, see: http://searchhub.org/2012/02/14/indexing-with-solrj/ Whichever approach fits your needs of course. Best, Erick On Tue, Oct 29, 2013 at 7:15 PM, Alexandre

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread Erick Erickson
In addition to Shawn's comments... bq: we're close to beta release, so I can't upgrade right now WHOA! You say you're close to release but you haven't successfully crawled the data even once? Upgrading to 4.5.1 is a trivial risk compared to that statement! This is setting itself up for a real

Re: Can not find solr core on admin page after setup

2013-10-29 Thread Erick Erickson
Do you have your schema etc in the instanceDir? Best, Erick On Tue, Oct 29, 2013 at 9:54 AM, engy.morsy wrote: > Hi, > > I setup solr4.2 under apache tomcat on windows m/c. I created solr.xml > under > catalina/localhost that holds the solr/home path, I have only one core, so > the solr.xml und

Re: distributed search is significantly slower than direct search

2013-10-29 Thread Erick Erickson
You can't. There will inevitably be some overhead in the distributed case. That said, 7 seconds is quite long. 5,000 rows is excessive, and probably where your issue is. You're having to go out and fetch the docs across the wire. Perhaps there is some batching that could be done there, I don't kno

Re: Apache-Solr with Tomcat: displaying the format of search result

2013-10-29 Thread Erick Erickson
This looks to me like strictly a Velocity display issue. Try just submitting the URL in the browser (NOT to the /browse handler, I'd use the /select handler OR comment out the bits in the browse handler that talk about the velocity response writer). I'd bet that you need to go into the velocity te

Re: Data import handler with multi tables

2013-10-29 Thread dtphat
Yes, I've just used concat(id, '_', tableName) instead of using a compound key. I think this is an easy way. Thanks. - Phat T. Dong
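The concat(id, '_', tableName) idea from this message can be illustrated outside SQL. The following is only a minimal Python sketch of the same key scheme; the function name `make_unique_key` and the table names are hypothetical and are not part of Solr or the Data Import Handler.

```python
def make_unique_key(row_id, table_name):
    """Build a key that stays unique across tables, mirroring concat(id, '_', tableName)."""
    return f"{row_id}_{table_name}"


# Two rows share the numeric id 7 but come from different (hypothetical) tables.
rows = [(7, "books"), (7, "authors")]
keys = [make_unique_key(row_id, table) for row_id, table in rows]
```

Because the table name is part of the key, rows with the same numeric id from different tables no longer overwrite each other in the index.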

Re: Solr 4.5.1 Overseer error

2013-10-29 Thread Erick Erickson
Absolutely sounds like it's worth a JIRA to me Erick On Mon, Oct 28, 2013 at 8:54 PM, Shawn Heisey wrote: > On 10/28/2013 5:50 PM, dboychuck wrote: > >> I am upgrading from 4.4 to 4.5.1 >> >> I used to just upload my configurations to zookeeper and then install >> solr >> with no default

Re: Background merge errors with Solr 4.4.0 on Optimize call

2013-10-29 Thread Erick Erickson
Hmmm, so you're saying that merging indexes where a field has been removed isn't handled. So you have some documents that do have a "what" field, but your schema doesn't have it, is that true? It _seems_ like you could get by by putting the _what_ field back into your schema, just not sending any

Re: Replace document title with filename if it's empty

2013-10-29 Thread Erick Erickson
You can write a custom bit of update code that lives on the Solr server that would essentially copy the filename field to title if title wasn't present. You could write a SolrJ program that does the Tika processing and add it before you sent the doc, see: http://searchhub.org/2012/02/14/indexing-w
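The fallback rule described here (use the filename when the title is missing) can be sketched on the client side before a document is sent to Solr. This is a rough Python illustration only, assuming documents are plain dictionaries with `title` and `filename` keys; the helper `fill_missing_title` is hypothetical and is neither a Solr update processor nor SolrJ code.

```python
def fill_missing_title(doc):
    """Return a copy of doc whose title falls back to the filename when empty or absent."""
    result = dict(doc)
    title = result.get("title")
    if not title or not title.strip():
        # No usable title: copy the filename over, as suggested in the thread.
        result["title"] = result.get("filename", "")
    return result


untitled = fill_missing_title({"filename": "report.pdf"})
blank = fill_missing_title({"filename": "notes.txt", "title": "   "})
titled = fill_missing_title({"filename": "a.pdf", "title": "Annual Report"})
```

Documents that already carry a non-blank title pass through unchanged.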

Re: return value from SolrJ client to php

2013-10-29 Thread Erick Erickson
Why bring SolrJ into it in the first place? I've heard of, but not used, php Solr clients. And you also have the possibility of just using HTTP calls... Best, Erick On Mon, Oct 28, 2013 at 10:54 AM, Anshum Gupta wrote: > Hi Amit, > > I haven't personally tried it, but have a look at the options

Re: Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Alexandre Rafalovitch
It's also possible to combine Update Request Processor with DIH. That way if a debug entry needs to be inserted it could go through the same Stanbol process. Just define a processing chain in the DIH handler and write a custom URP to call out to the Stanbol web service. You have access to a full record in

Re: Solr - what's the next big thing?

2013-10-29 Thread Michael Sokolov
On 10/26/2013 8:31 PM, Bill Bell wrote: Full JSON support deep complex object indexing and search Game changer Bill Bell Sent from mobile Not JSON (yet?) but take a look at http://luxdb.org which does XML indexing and search. We index all the text of all the nodes in your tree: no nee

Re: Solr 4.5.1 replication Bug? "Illegal to have multiple roots (start tag in epilog?)."

2013-10-29 Thread Mark Miller
Has someone filed a JIRA issue with the current known info yet? - Mark > On Oct 29, 2013, at 12:36 AM, Sai Gadde wrote: > > Hi Michael, > > I downgraded to Solr 4.4.0 and this issue is gone. No additional settings > or tweaks are done. > > This is not a fix or solution I guess but, in our cas

Re: Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Michael Della Bitta
Hi Dileepa, You can write your own Transformers in Java. If it doesn't make sense to run Stanbol calls in a Transformer, maybe setting up a web service that grabs a record out of MySQL, sends the data to Stanbol, and displays the results could be used in conjunction with HttpDataSource rather than

RE: Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Dyer, James
Would an "onImportEnd" event listener serve your needs? See http://wiki.apache.org/solr/DataImportHandler#EventListeners James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Dileepa Jayakody [mailto:dileepajayak...@gmail.com] Sent: Tuesday, October 29, 2013 3:48 PM To

Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Dileepa Jayakody
Hi All, I'm a newbie to Solr, and I have a requirement to import data from a mysql database; enhance the imported content to identify Persons mentioned and index it as a separate field in Solr along with the other fields defined for the original db query. I'm using Apache Stanbol [1] for the co

Re: Reclaiming disk space from (large, optimized) segments

2013-10-29 Thread Jason Hellman
If I gauge Otis’ intent here it is to create shards on the basis of intervals of time. A shard represents a single interval (let’s say a year’s worth of data) and when that data is no longer necessary it is simply shut down and no longer included in queries. So, for example, you could have thre
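The time-interval sharding idea can be sketched roughly as follows. This is a Python illustration under assumed names: the `shard_<year>` naming scheme and both helper functions are hypothetical, and the retention window of two years is chosen only for the example.

```python
from datetime import date


def shard_for(doc_date):
    """Route a document to the shard that holds its year of data."""
    return f"shard_{doc_date.year}"


def active_shards(current_year, years_to_keep):
    """List the shards still included in queries; older ones can simply be shut down."""
    first_year = current_year - years_to_keep + 1
    return [f"shard_{year}" for year in range(first_year, current_year + 1)]


target = shard_for(date(2013, 10, 29))
queried = active_shards(2013, 2)
```

Dropping an interval then means removing a whole shard rather than deleting documents and waiting for segment merges to reclaim the space.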

Re: Reclaiming disk space from (large, optimized) segments

2013-10-29 Thread Gun Akkor
Hello Chris, Thank you for the response, I am following up on the e-mail chain for Scott. I guess we can try using a commit with expungeDeletes=true, but that does not really address the underlying problem. If we hadn't issued the "optimize" in the past, thereby creating the 2 big segments, my unders
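For reference, a commit with expungeDeletes=true is typically issued as request parameters on the update handler. The sketch below only builds such a URL in Python; the host, port, and core name are assumptions for illustration, and `commit_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def commit_url(base_url, expunge_deletes=True):
    """Build an update-handler URL that commits and optionally expunges deletes."""
    params = {"commit": "true"}
    if expunge_deletes:
        params["expungeDeletes"] = "true"
    return f"{base_url}/update?{urlencode(params)}"


# Assumed base URL; substitute the real host and core.
url = commit_url("http://localhost:8983/solr/collection1")
```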

Re: character encoding issue...

2013-10-29 Thread Rajani Maski
Hi, If you are using Apache Tomcat Server, hope you are not missing the below mentioned configuration: I had faced similar issue with Chinese Characters and had resolved with the above config. Links for reference : http://zensarteam.wordpress.com/2011/11/25/6-steps-to-configure-solr-on-apa

Re: When is/should qf different from pf?

2013-10-29 Thread Jason Hellman
It is probable that with no additional boost to pf fields the sum of the scores will be higher. But it is *possible* that they are not, and adding a boost to pf gives greater probability that they will be. All of this bears testing to confirm what search use cases merit what level of boost.
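A request that sets qf and pf separately, with an explicit boost on the pf fields, might look like the sketch below. The field names and boost values are assumptions chosen for illustration, not recommendations from the thread; as the message says, the right levels need testing.

```python
from urllib.parse import urlencode

# Assumed fields and boosts; only the parameter names (defType, q, qf, pf) are Solr's.
params = {
    "defType": "edismax",
    "q": "john doe",
    "qf": "title description",       # fields searched term by term
    "pf": "title^10 description^5",  # fields where the whole phrase earns an extra boost
}
query_string = urlencode(params)
```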

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread Shawn Heisey
On 10/29/2013 10:44 AM, eShard wrote: Offhand, how do I control how much of the index is held in RAM? Can you point me in the right direction? This is automatically handled by the operating system. For quite some time, Solr (Lucene) has by default used the MMap functionality provided by all

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread eShard
P.S. Offhand, how do I control how much of the index is held in RAM? Can you point me in the right direction? Thanks,

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread eShard
Wow, thanks for your response. You raise a lot of great questions; I wish I had the answers! We're still trying to get enough resources to finish crawling the repository, so I don't even know what the final size of the index will be. I've thought about excluding the videos and other large files and

Re: When is/should qf different from pf?

2013-10-29 Thread xavier jmlucjav
I am confused, wouldn't a doc that matches both the phrase and the term queries have a better score than a doc matching only the term query, even if qf and pf are the same? On Mon, Oct 28, 2013 at 7:54 PM, Upayavira wrote: > There'd be no point having them the same. > > You're likely to include

Re: Single multilingual field analyzed based on other field values

2013-10-29 Thread davetroiano
Hi Trey, I was reading v9 of the Solr in Action MEAP but browsing your github repo, so I think I'm looking at the latest stuff. Agreed that the thread caching idea is dangerous. Perhaps it would work now, but it could easily break in a later version of Solr. I didn't mention another reason why

Re: solr wiki edit privs (or just fix a typo)

2013-10-29 Thread Steve Rowe
Thanks for reporting, Chuck. I’ve added your username to the ContributorsGroup page on the Solr wiki, so you should be able to make this change now. Steve On Oct 29, 2013, at 11:37 AM, chuck wrote: > > > There's a mistake in the sample xml at > https://wiki.apache.org/solr/AnalyzersTokeniz

character encoding issue...

2013-10-29 Thread Chris
Hi All, I get characters like - �� - CTA - in the solr index. I am adding Java beans to solr by the addBean() function. This seems to be a character encoding issue. Any pointers on how to resolve this one? I have seen that this occurs mostly for japanese chinese ch
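One common way such garbled characters arise is UTF-8 bytes being decoded with a single-byte charset somewhere between the client and the index. The Python sketch below reproduces that effect with a sample string; it is an assumption that this is what happened in the poster's setup, and the sample text is made up.

```python
original = "日本語"  # sample CJK text, chosen only for the demonstration

utf8_bytes = original.encode("utf-8")

# Decoding UTF-8 bytes with the wrong charset yields one junk character per byte.
garbled = utf8_bytes.decode("latin-1")

# Reversing the wrong step recovers the text, which confirms the diagnosis.
restored = garbled.encode("latin-1").decode("utf-8")
```

If a round trip like this restores the text, the fix is usually to make every layer (client, servlet container, Solr) agree on UTF-8.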

Re: Reclaiming disk space from (large, optimized) segments

2013-10-29 Thread Gun Akkor
Otis, Thank you for your response, Could you elaborate a bit more on what you have in mind when you say "time-based" indices? Gun --- Senior Software Engineer Carbon Black, Inc. gun.ak...@carbonblack.com On Thu, Oct 24, 2013 at 11:56 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote:

Re: Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread Shawn Heisey
On 10/29/2013 7:24 AM, eShard wrote: > Good morning, > I have a 1 TB repository with approximately 500,000 documents (that will > probably grow from there) that needs to be indexed. > I'm limited to Solr 4.0 final (we're close to beta release, so I can't > upgrade right now) and I can't use SolrC

solr wiki edit privs (or just fix a typo)

2013-10-29 Thread chuck
There's a mistake in the sample xml at https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternCaptureGroupFilterFactory In the XML snippet instead of   class="solr.PatternCaptureGroupTokenFilter" you should have   class="solr.PatternCaptureGroupFilterFactory" (ie, the title

RE: Solr Highlighting Best Practices Guideline

2013-10-29 Thread Erwin Gunadi
Hi Jack, thank you for the hint. Best Regards Erwin -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Tuesday, October 29, 2013 4:14 PM To: solr-user@lucene.apache.org Subject: Re: Solr Highlighting Best Practices Guideline There are some detailed examples

Re: Phrase query combined with term query for maximum accuracy

2013-10-29 Thread Jack Krupansky
You need some parentheses: title:john doe^30 OR description:john doe^10 should be: title:(john doe)^30 OR description:(john doe)^10 -- Jack Krupansky -Original Message- From: michael.boom Sent: Tuesday, October 29, 2013 7:20 AM To: solr-user@lucene.apache.org Subject: Phrase query c
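Without the parentheses, the field prefix applies only to the first term and the remaining term is searched in the default field. A small Python sketch of building the corrected query string follows; `field_query` is a hypothetical helper, not part of any Solr client.

```python
def field_query(field, terms, boost):
    """Group the terms in parentheses so the field prefix and boost cover all of them."""
    return f"{field}:({' '.join(terms)})^{boost}"


terms = ["john", "doe"]
query = " OR ".join([
    field_query("title", terms, 30),
    field_query("description", terms, 10),
])
```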

Re: Solr Highlighting Best Practices Guideline

2013-10-29 Thread Jack Krupansky
There are some detailed examples in my e-book as well. -- Jack Krupansky -Original Message- From: Erwin Gunadi Sent: Tuesday, October 29, 2013 8:21 AM To: solr-user@lucene.apache.org Subject: RE: Solr Highlighting Best Practices Guideline Hi Furkan, thanks for the reply. I was trying

Can not find solr core on admin page after setup

2013-10-29 Thread engy.morsy
Hi, I setup solr4.2 under apache tomcat on windows m/c. I created solr.xml under catalina/localhost that holds the solr/home path, I have only one core, so the solr.xml under the solr instance looks like: after starting the apache service, I did not find the core on the admin page. I ch

Configuration and specs to index a 1 terabyte (TB) repository

2013-10-29 Thread eShard
Good morning, I have a 1 TB repository with approximately 500,000 documents (that will probably grow from there) that needs to be indexed. I'm limited to Solr 4.0 final (we're close to beta release, so I can't upgrade right now) and I can't use SolrCloud because work currently won't allow it for

Re: How to set the shardid?

2013-10-29 Thread Thomas Egense
You can specify the shard in core.properties ie: core.properties: name=collection2 shard=shard2 Did this solve it ? From, Thomas Egense On Mon, Feb 25, 2013 at 5:13 PM, Mark Miller wrote: > > On Feb 25, 2013, at 10:00 AM, "Markus.Mirsberger" < > markus.mirsber...@gmx.de> wrote: > > > How ca

RE: Solr Highlighting Best Practices Guideline

2013-10-29 Thread Erwin Gunadi
Hi Furkan, thanks for the reply. I was trying to get some guidelines on how to apply highlighter types correctly and maybe some comparison results between the three highlighter types (standard highlighter, fast vector highlighter and posting highlighter) Best Regards Erwin -Original Messa

Re: HTTP Basic Authentication with solr's jetty

2013-10-29 Thread Fabiano Sidler
Thus spake Furkan KAMACI: > First of all did you read here: > http://wiki.apache.org/solr/SolrSecurity Yes. Many times. As I did with the related page on the Jetty website. > What is your motivation for using security at your Solr Jetty? > On Tuesday, 29 October 2013, Fabiano Sidler >> I'm goin

Re: Solr Highlighting Best Practices Guideline

2013-10-29 Thread Furkan KAMACI
This question is too broad. If you have specific questions or want to see other users' problems with the highlighting feature in Solr you can check here: http://search-lucene.com/?q=highlighting&fc_project=Solr&fc_type=mail+_hash_+user On the other hand you can read Using Additional Solr Functionali

Re: HTTP Basic Authentication with solr's jetty

2013-10-29 Thread Furkan KAMACI
First of all did you read here: http://wiki.apache.org/solr/SolrSecurity What is your motivation for using security at your Solr Jetty? On Tuesday, 29 October 2013, Fabiano Sidler wrote: > Hi folks! > > I was asking this question last week already on the jetty mailing list, but

Re: Compound words

2013-10-29 Thread Parvesh Garg
Hi Erick, I tried with expand=true and got exactly the same tokens i.e., seabiscuit sea bird at 1,2 and 3 positions respectively. As per solr documentation at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory, explicit mappings ignore the expand parameter in the

Phrase query combined with term query for maximum accuracy

2013-10-29 Thread michael.boom
For maximum search accuracy on my SolrCloud system I was thinking of combining phrase search with term search in the following way: search term: john doe search fields: title, description - a match in the title is more relevant than one in the description What I want to achieve - the following doc

HTTP Basic Authentication with solr's jetty

2013-10-29 Thread Fabiano Sidler
Hi folks! I was asking this question last week already on the jetty mailing list, but haven't got any answer. I'm going to run multiple Solr instances on one server, which raises the need for user authentication in front of Solr. I've done the following steps (after a lot of others which didn't wo
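Once HTTP Basic Authentication is enabled on the server, clients have to send an Authorization header with each request. The sketch below shows only that client-side header in Python; it says nothing about the Jetty configuration itself, and the user name and password are placeholders.

```python
import base64


def basic_auth_header(user, password):
    """Build the HTTP Basic Authorization header for the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only.
header = basic_auth_header("solr", "secret")
```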

Re: Data import handler with multi tables

2013-10-29 Thread Giovanni Bricconi
maybe So you can keep the original id, maybe also add an originalTable field if you don't like parsing the id column to discover the table from which the data was read. 2013/10/29 Stefan Matheis > I'v

Solr Highlighting Best Practices Guideline

2013-10-29 Thread Erwin Gunadi
Hi, After having done the official Solr-Tutorial (https://lucene.apache.org/solr/4_5_1/) and read the Solr-Reference-Guide (https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/), I would like to ask whether someone can point me to best practices and/or tutorial on how to setup the high

Re: Data import handler with multi tables

2013-10-29 Thread Stefan Matheis
I've never looked for another way, what's the problem using a compound key? On Monday, October 28, 2013 at 1:38 PM, dtphat wrote: > Hi, > is there no another way to import all data for this case instead Only the > way using compound key? > Thanks. > > > > - > Phat T. Dong > -- > View this