Re: Negative Boosting at Recent Versions of Solr?

2013-05-13 Thread Toke Eskildsen
On Fri, 2013-05-10 at 16:49 +0200, Jason Hellman wrote: > -23.0 = product(float(price)=11.5,const(-2)) > I wonder how fantastically this can be abused now? Mmm... Products of negative scores. I foresee "The products matching an uneven number of search terms gets much higher scores. Why?"-questio

Re: Unable to load environment info from /solr/collection1/admin/system?wt=json

2013-05-13 Thread Furkan KAMACI
I have tried to open that URL: ip:8983/solr/ I get that error: INFO: [collection1] CLOSING SolrCore org.apache.solr.core.SolrCore@62ad1b5c May 13, 2013 10:38:40 AM org.apache.solr.update.DirectUpdateHandler2 close INFO: closing DirectUpdateHandler2{commits=0,autocommit maxTime=15000ms,autocommit

Re: MultiValue

2013-05-13 Thread manju16832003
Hi All, I managed to *solve* the issue I had posted earlier with respect to multiValued. Here is how the query is supposed to be configured in *data-config.xml*. Description: in the configuration below, the first query has an associated table, images. Each person would have many images. Here the JSON/XML would return all

Re: multiValued schema example (SOLVED)

2013-05-13 Thread manju16832003
Hi All, I managed to *solve* the issue I had posted earlier with respect to multiValued. Here is how the query is supposed to be configured in *data-config.xml*. Description: in the configuration below, the first query has an associated table, images. Each person would have many images. Here the JSON/XML would return all

Best way to design a "story and comments" schema.

2013-05-13 Thread samabhiK
Hi, I wish to know how to best design a schema to store comments in stories / articles posted. I have a set of fields:

Solr Sorting Algorithm

Good Morning All, The alphabetical sorting is causing slight issues as below: I have 3 documents with title value as below: 1) "Acer Palmatum (Tree)" 2) "Aceraceae (Tree Family)" 3) "Acer Pseudoplatanus (Tree)" I have created title_sort field which is defined with field type as alphaNumericalSo

Mandatory words search in SOLR

Hi SOLR Experts, When I search documents with the keywords *java, mysql*, I get the documents containing either *java* or *mysql* or both. Is it possible to get only the documents that contain both *java* and *mysql*? In that case, what would the query look like? Thanks a lot, Kamal

Re: Mandatory words search in SOLR

Hello! Change the default query operator. For example add the q.op=AND to your query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > Hi SOLR Experts > When I search documents with keyword as *java, mysql* then I get the > documents contai

CJK question

A question about CJK, how will U+3000 be handled? U+3000 belongs to "CJK Symbols and Punctuation" and is named "IDEOGRAPHIC SPACE". Is it wrong if I just map it to U+0020 (SPACE)? What is CJK Analyzer doing with U+3000? If "two CJK words" have U+3000 inside, does it mean these "two CJK words"

RE: CJK question

Hi, It uses the StandardAnalyzer which does split on IDEOGRAPHIC SPACE. Cheers, Markus -Original message- > From:Bernd Fehling > Sent: Mon 13-May-2013 13:36 > To: solr-user@lucene.apache.org > Subject: CJK question > > A question about CJK, how will U+3000 be handled? > > U+3000 be

Re: Mandatory words search in SOLR

Hi Rafał Kuć, I added q.op=AND as you suggested. Although the first few documents contain both keywords (*java* and *mysql*), towards the end there are still a number of documents that have only one keyword, either *java* or *mysql*. Is this the SOLR behaviour or can I ask for a *stric

Re: Mandatory words search in SOLR

Kamal You could also use the 'mm' parameter to require a minimum match, or you could prepend '+' to each required term. Cheers François On May 13, 2013, at 7:57 AM, Kamal Palei wrote: > Hi Rafał Kuć > I added q.op=AND as per you suggested. I see though some initial record > document contain

Re: Log Monitor System for SolrCloud and Logging to log4j at SolrCloud?

Sorry but do you mean that I can use log4j with Solr 4.2.1? 2013/5/6 Steve Rowe > Done - see http://markmail.org/message/66vpwk42ih6uxps7 > > On May 6, 2013, at 5:29 AM, Furkan KAMACI wrote: > > > Is there any road map for Solr when will Solr 4.3 be tagged at svn? > > > > 2013/4/26 Mark Miller

Solr Licensing (Sizzle)

In the source code of Apache Solr 4.2.0 there is an unclear license reference in * \solr-4.2.0\solr\webapp\web\js\lib\jquery-1.7.2.min.js and * \solr-4.2.0\solr\webapp\web\js\require.jstxt Can you please tell me what kind of license does this refer to exactly: "* Sizzle C

maximum number of simultaneous threads

I am seeing the following in solrconfig.xml Is it possible to specify the max number of threads for query time too? -- View this message in context: http://lucene.472066.n3.nabble.com/maximum-number-of-simultaneous-threads-tp4062903.html Sent from the Solr - User mailing list archiv

Re: Mandatory words search in SOLR

Hi François, Thanks for the input. The major problem I face is that I use Drupal (as a framework) and the apachesolr_module provided by Drupal, and I am not sure how to modify the query directly. However, this is not the right forum to ask Drupal-related questions. If somebody here knows both Drupal

Re: Best way to design a "story and comments" schema.

Try the simplest, cleanest design first (at least on paper), before you start resorting to either dynamic fields or multi-valued fields or other messy approaches. Like, one collection for stories, which would have a story id and a second collection for comments, each with a comment id and a fiel

Re: maximum number of simultaneous threads

venkata, only blank lines between "..in solrconfig.xml" and "Is it possible.." have arrived. On Mon, May 13, 2013 at 3:25 PM, venkata wrote: > I am seeing the following in solrconfig.xml > > > > > > > It is possible to specific max number of threads for query time too? > > > > > -- > View this

Re: Best way to design a "story and comments" schema.

Thanks for your reply. I generally get confused by a collection and a core. But just FYI, I do have two cores at the moment - one for the users and another for the Stories. Initially I thought of adding an extra core for the Comments too but realized that it would mean multiple HTTP calls to fetch

Getting "explain" information of more like this search in a more usable format

Hi, I'm executing a more like this search using the MoreLikeThisHandler. I can add "score" to the fields to be returned, but that's all I could find about getting information about how/why documents match. I would like to give my users more hints why documents are similar, so I would like to di

Solr fullname search

Hi, I'm trying to set up a fullname search in Solr. Until now I thought my work was fine until I've found something strange, and I can't figure out how to correct it. So I want to be able to do searches on full names. My index is a database where I get first name and last name and put them in one

Re: Best way to design a "story and comments" schema.

There are no transactions in Solr. Delete the Story and then the comments. "Core" is just the old Solr terminology. A "collection" is the data itself, like the data on the disk. And with SolrCloud, the collection terminology is required. How much data will you have? I mean, a news article cou

Re: Getting "explain" information of more like this search in a more usable format

Try debug.explain.structured=true, which will give you an XML response that can be traversed. Don't worry about the fact that these features are labeled "debug" - they are there simply to explain what is happening. Is there some particular concern you have about them being labeled "debug"? Alt

Re: Best way to design a "story and comments" schema.

I think I got your point. So, what I will create are three cores (or collections) - one for the users, one for the stories and the last one for comments. When I need to find all the stories posted by a single user, I first need to search the stories core with a unique userid in the filter and th

Re: Best way to design a "story and comments" schema.

Jack, Why are multi-valued fields considered messy? I think I am about to learn something.. Thanks Another Jack On Mon, May 13, 2013 at 5:29 AM, Jack Krupansky wrote: > Try the simplest, cleanest design first (at least on paper), before you > start resorting to either dynamic fields or multi-va

Re: Mandatory words search in SOLR

Hi François, As per your suggestion, I used the 'mm' param and was able to search for mandatory fields. In Drupal, one needs to call $query->addParam('mm', '100%'); in the query alter hook. Thanks a lot for guiding me. Best Regards, Kamal On Mon, May 13, 2013 at 5:56 PM, Kamal Palei wrote: >

Can we search some mandatory words and some optional words in SOLR

Dear SOLR Experts, Let's say I want to search for some mandatory words and some optional words. Say I want to find all documents that contain all of the *Java, mysql, php* keywords along with at least one keyword out of *TCL, Perl, Selenium*. *Basically I am looking at a few mandatory keywords and a few option

Re: Log Monitor System for SolrCloud and Logging to log4j at SolrCloud?

On 5/13/2013 6:09 AM, Furkan KAMACI wrote: > Sorry but do you mean that I can use log4j with Solr 4.2.1? You can. You need to obtain a war without any slf4j jars, which you can do by unpacking the original war, deleting the jars, and repackaging it. You can also build from source with the dist-e

Re: Can we search some mandatory words and some optional words in SOLR

That's simply a standard, old-fashioned Lucene query: +Java +mysql +php TCL Perl Selenium And you can decide if "min should match" (mm) is 0, 1, 2, 3, etc. for the optional terms (TCL, Perl, Selenium) -- Jack Krupansky -Original Message- From: Kamal Palei Sent: Monday, May 13, 2013
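The query in the reply above can also be assembled programmatically. The following is a minimal Python sketch, not anything from the thread: the helper name build_query and its require_one_optional flag are hypothetical. It prefixes each required term with '+' and leaves optional terms bare; with the flag set, it wraps the optional terms in a required group, which is standard Lucene syntax for "at least one of these".

```python
from urllib.parse import urlencode


def build_query(required, optional, require_one_optional=False):
    """Build a Lucene-style query string (hypothetical helper).

    Required terms get a '+' prefix; optional terms are left bare.
    With require_one_optional=True the optional terms are wrapped in a
    required group, so at least one of them has to match.
    """
    must = " ".join("+" + term for term in required)
    if not optional:
        return must
    if require_one_optional:
        should = "+(" + " ".join(optional) + ")"
    else:
        should = " ".join(optional)
    return (must + " " + should).strip()


required_terms = ["Java", "mysql", "php"]
optional_terms = ["TCL", "Perl", "Selenium"]

plain = build_query(required_terms, optional_terms)
grouped = build_query(required_terms, optional_terms, require_one_optional=True)

print(plain)
print(grouped)
# URL-encoded form, ready to append to a select request
print(urlencode({"q": plain}))
```

The plain form leaves the optional terms as pure score boosters unless mm is raised, as the reply notes; the grouped form is one way to insist on at least one of them without touching mm.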

Re: Best way to design a "story and comments" schema.

Multi-valued fields don't have the same full support as simple fields and documents (since they are effectively a sub-document). Although we do now have the ability to "add" to a multi-valued field with atomic update, we can't directly edit them, like delete/replace the kth item or insert befor

Quick question about indexing with SolrJ.

Is it possible to index plain String JSON documents using SolrJ? I already know annotating POJOs works fine, but I need a more flexible way to index data without any intermediate POJO. That's because when changing, adding or removing new fields I don't want to change continously that POJO again an

Re: Solr Licensing (Sizzle)

On May 13, 2013, at 14:15 , Polhodzik Peter (ext) wrote: > In the source code of Apache Solr 4.2.0 there is an unclear license reference > in > > · \solr-4.2.0\solr\webapp\web\js\lib\jquery-1.7.2.min.js > and > · \solr-4.2.0\solr\webapp\web\js\require.jstxt > > Can you please

Re: Quick question about indexing with SolrJ.

Do your POJOs follow a simple flat data model that is 100% compatible with Solr? If so, maybe you can simply ingest them by setting the Content-type to "application/json" and maybe having to put some minimal wrapper around the raw JSON. But... if they DON'T follow a simple, flat data model,

Re: Quick question about indexing with SolrJ.

Hello, Jack. I don't want to use POJOs, that's the main problem. I know that you can send AJAX POST HTTP Requests with JSON data to index new documents and I would like to do that with SolrJ, that's all, but I don't find the way to do that, :-/ . What I would like to do is simple retrieve an Strin

Re: Quick question about indexing with SolrJ.

You can send JSON to Solr as update documents: http://wiki.apache.org/solr/UpdateJSON. Not sure if SolrJ supports it, but it is just an HTTP post, so you may not even need SolrJ. But the issue is that your own JSON probably does not match JSON expected by Solr. So, you need to map it somehow, righ

Re: maximum number of simultaneous threads

I am seeing a configuration point for indexing threads. However, I am not finding anything for search. How many simultaneous threads can SOLR spin up during search time? -- View this message in context: http://lucene.472066.n3.nabble.com/maximum-number-of-simultaneous-threads-tp4062

Making protwords.txt changes effective

Hi, I added some words to protwords.txt, but there doesn't seem to be any effect on the resulting search. Do I need to restart Apache or Solr or rebuild the index?

Re: Need solr query help

Hi Abhishek, I've had a look into this problem and have come up with a solution. Following instructions assume you have downloaded the 4.3.0 release of Solr from:- http://www.apache.org/dyn/closer.cgi/lucene/solr/4.3.0 First add to:- solr-4.3.0/solr/example/solr/collection1/conf/schema.xml th

Re: Disabling tf (term frequency) during indexing and/or scoring

This is an old post, now there is a solution in SOLR omitTermFreqAndPositions="true" http://wiki.apache.org/solr/SchemaXml#Data_Types -- View this message in context: http://lucene.472066.n3.nabble.com/Disabling-tf-term-frequency-during-indexing-and-or-scoring-tp502956p4062595.html Sent from

Re: Need solr query help

Hi Abhishek, I forgot to explain why it works. It uses the frange filter which is mentioned here:- http://wiki.apache.org/solr/CommonQueryParameters and it works because it filters in results where the geodist minus the shopMaxDeliveryDistance is less than zero (that's what the u=0 means, upper

Re: Making protwords.txt changes effective

Yes, restart Solr. Not to reindex, but simply to reload the file. Well... depending on where you use the protected words, you may need to reindex as well. For a query-time filter you don't need to reindex, but for index-time filters, you must reindex. -- Jack Krupansky -Original Message-

Re: Looking for Best Practice of Spellchecker

Thank you for you help, guys. I agreed, "wall mart" should be a synonyms, it's not a good example. I did an experiment by using KeywordTokenizer + DirectSolrSpellChecker, I can get suggestion even for "wall mart" to "walmart". But I don't know whether it's a good practice or not. It's much like a

How to improve performance of geodist()

Hi guys, I'm using geodist() in a recip boost function. I noticed a performance impact to the response time. I did a profiling session, the geodist() calculation took 30% of CPU time. I'm wondering is there any alternative to Haversine function that can reduce CPU calculation? I don't need very a

Re: rename a core to same name of existing core

Has anyone verified that the following is true? > the Description on http://wiki.apache.org/solr/CoreAdmin#CREATE is: > > *quote* > If a core with the same name exists, while the "new" created core is > initalizing, the "old" one will continue to accept requests. Once it > has finished, all new request

Re: SOLR guidance required

If this is for the US, remove the age range feature before you get sued. On 05/09/2013 08:41 PM, Kamal Palei wrote: Dear SOLR experts I might be asking a very silly question. As I am new to SOLR kindly guide me. I have a job site. Using SOLR to search resumes. When a HR user enters some keywor

Re: SOLR guidance required

Jason can you explain what you mean at here: "Where OR operators apply, this does not matter. But your Solr cache will be much more savvy with the first construct." 2013/5/13 Lance Norskog > If this is for the US, remove the age range feature before you get sued. > > > On 05/09/2013 08:41 PM, Ka

Re: How to improve performance of geodist()

On Mon, May 13, 2013 at 1:12 PM, Nicholas Ding wrote: > I'm using geodist() in a recip boost function. I noticed a performance > impact to the response time. I did a profiling session, the geodist() > calculation took 30% of CPU time. Are you also using an "fq" with geofilt to narrow down the num

How to force a document to be indexed in a given shard at SolrCloud?

I want to run some test cases on SolrCloud at my pre-prototype system. How can I force a document to be indexed in a given shard at SolrCloud (I use Solr 4.2.1)? Would something like shard.keys work for me?

Re: SOLR guidance required

Multiple fq params are ANDed. So if you have fq=clause1 AND clause2, you should implement that as fq=clause1&fq=clause2. However, if you want fq=clause1 OR clause2, you have no choice but to keep it as a single filter query. Upayavira On Mon, May 13, 2013, at 06:55 PM, Furkan KAMACI wrote: > Jaso
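The advice above can be sketched as a small request builder. This is an illustrative Python snippet under stated assumptions: the helper build_params and the field names (city, experience, degree) are made up for the example and do not come from the thread. Clauses that must all hold become separate fq parameters; clauses that are alternatives are joined with OR into a single fq.

```python
from urllib.parse import parse_qs, urlencode


def build_params(q, and_filters=(), or_filters=()):
    """Encode a query with filter queries (hypothetical helper).

    Each clause in and_filters becomes its own fq parameter, since
    multiple fq parameters are ANDed together. Clauses in or_filters
    have to share one fq, joined with OR.
    """
    params = [("q", q)]
    params.extend(("fq", clause) for clause in and_filters)
    if or_filters:
        joined = " OR ".join("(" + clause + ")" for clause in or_filters)
        params.append(("fq", joined))
    return urlencode(params)


query_string = build_params(
    "java",
    and_filters=["city:london", "experience:[2 TO 5]"],  # assumed field names
    or_filters=["degree:bsc", "degree:msc"],             # assumed field names
)

print(query_string)
print(parse_qs(query_string)["fq"])
```

Splitting the ANDed clauses also means each one is a separate filter that can be cached and reused on its own, which appears to be the point of the "more savvy" cache remark quoted elsewhere in this thread.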

Re: SOLR guidance required

On 5/13/2013 11:55 AM, Furkan KAMACI wrote: Jason can you explain what you mean at here: "Where OR operators apply, this does not matter. But your Solr cache will be much more savvy with the first construct." If you need to OR different filters together, you have to have all those in the same

Faceting json response - odd format

Hello, Relatively new to SOLR, I am quite happy with the API. I am a bit challenged by the faceting response in JSON though. This is what i am getting which mirrors what is in the documentation: "facet_counts":{"facet_queries":{}, "facet_fields":{"metadata_meta_last_author":["Nick",330,"

Re: rename a core to same name of existing core

On 5/13/2013 11:46 AM, Jie Sun wrote: did any one verified the following is ture? the Description on http://wiki.apache.org/solr/CoreAdmin#CREATE is: *quote* If a core with the same name exists, while the "new" created core is initalizing, the "old" one will continue to accept requests. Once it

RE: Looking for Best Practice of Spellchecker

The Word Break spellchecker will incorporate the broken & combined words in the collations. Its designed to work seamlessly in conjunction with a "regular" spellchecker (IndexBased- or Direct-). James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Nicholas Ding [m

Re: SOLR guidance required

Best advice in this thread. :) Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com The science of influence marketing. On Mon, May 13, 2013 at 1:29 PM, Lance Norskog wrote: > If this is for

Re: How to improve performance of geodist()

Yes, I did. But instead of sorting by geodist(), I use function query to boost by distance. That's why I noticed the heavy calculation happened in the processing. Example: bf=recip(geodist(), 50, 5) Basically, I think the boost function will iterate all the results, and calculate the distance.

Anybody knows what IBM FileNet search looks like?

And how does it compare to Solr. I am not buying (or selling), just trying to get some technical details and my GoogleFoo is failing me. I thought they were one of the purchased companies, but Autonomy/Verity seems to be referred to as 'old' search engine with FileNet's as new. Regards, Alex.

Re: Making protwords.txt changes effective

I think you can put it in your data dir and it'll get reloaded on commit. Try it and report back. Upayavira On Mon, May 13, 2013, at 06:01 PM, Jack Krupansky wrote: > Yes, restart Solr. Not to reindex, but simply to reload the file. > > Well... depending on where you use the protected words, you

Re: rename a core to same name of existing core

Thanks for the information. You are right, I was using the same instance dir. I agree with you; I would like to see an error if I create a core with the name of an existing core. Right now I have to do a ping first and check whether the returned code is 404 or not. Jie -- View this message

Re: Anybody knows what IBM FileNet search looks like?

:-) Alex, it seems to be a copyright ... Think about Lucene + ManifoldCF. FileNet is file repository saved in DB2. ManifoldCF has a connector that helps retrieve files/directories from DB & using Lucene it may index the context of the files. I am not sure if Solr has such handler like Tika, howeve

Re: How to force a document to be indexed in a given shard at SolrCloud?

Hi, Yes, shard.keys should work for this case. Please check this link: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-force-a-document-to-be-indexed-in-a-given-shard-at-SolrCloud-

Re: Faceting json response - odd format

: This is what i am getting which mirrors what is in the documentation: : : "facet_counts":{"facet_queries":{}, : "facet_fields":{"metadata_meta_last_author":["Nick",330,"standarduser",153,"Mohan",52,"wwd",49,"gerald",45,"Riggins",36,"fallon",31,"blister",28," ",26,"morfitelli",24,"Adminis

Re: Faceting json response - odd format

thank you Hoss, What i would prefer to see as we do with all other parameters is a normal key/value pairing. this might look like: {"metadata_meta_last_author":[{"value": "Nick", "count": 330},{"value": "standard user","count": 153},{"value": "Mohan","count": 52},{"value":"wwd","count": 49}…

RE: spellcheker and exact match

I tried those parameters and it does suggest keywords but not the ones I'm interested in -- View this message in context: http://lucene.472066.n3.nabble.com/spellcheker-and-exact-match-tp4061672p4063060.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Faceting json response - odd format

: What i would prefer to see as we do with all other parameters is a : normal key/value pairing. this might look like: a true key value pairing via a "map" type structure is what you get with json.nl=map -- but in most client languages that would lose whatever sorting you might have specified
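For clients that receive the default flat list, regrouping it on the client side keeps the order that Solr returned. A minimal Python sketch follows; the function name pair_up is hypothetical and the sample values are the first few from the response quoted earlier in this thread.

```python
def pair_up(flat):
    """Regroup a flat [value, count, value, count, ...] facet list.

    Returns a list of (value, count) tuples in the original order.
    """
    return list(zip(flat[0::2], flat[1::2]))


# First few entries of the facet_fields list quoted in the thread
flat = ["Nick", 330, "standarduser", 153, "Mohan", 52, "wwd", 49]

pairs = pair_up(flat)

# The shape asked for in the thread: one object per facet value
as_objects = [{"value": value, "count": count} for value, count in pairs]

print(pairs)
print(as_objects)
```

A list of pairs, unlike a map, preserves the sort order, which is the concern raised in the reply above.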

Solr 4.3 core swap

Since upgrading to solr 4.3 we get the following errors on our slaves when we swap cores on our master: Solr index directory '/usr/local/solr_aggregate/solr_aggregate/data/index.20130513152644966' is locked. Throwing exception SEVERE: Unable to reload core: production org.apache.solr.common.Solr

Solr Boolean query help

I am trying to form a Solr query. Our documents have a multi-valued field named tag_id. I want to get documents that either do not have tag_id 1 or have both tag_id 1 and 2 i.e. q=(tag_id:(1 AND 2) OR tag_id:(NOT 1)) This is not giving the desired results. The result is the same as that of q=tag_i

Re: Solr Boolean query help

Inner purely negative clauses aren't allowed by Lucene. (Solr supports top-level negative clauses, though, so works as expected) To get a nested negative clause to work, try this: q=tag_id:(1 AND 2) OR (*:* AND -tag_id:1) On May 13, 2013, at 16:11 , Arun Rangarajan wrote: > I am trying

Re: Solr Boolean query help

Pure negative queries only work at the top level. So, try: q=(tag_id:(1 AND 2) OR tag_id:(*:* NOT 1)) -- Jack Krupansky -Original Message- From: Arun Rangarajan Sent: Monday, May 13, 2013 4:11 PM To: solr-user@lucene.apache.org Subject: Solr Boolean query help I am trying to for

Re: Solr Boolean query help

Erik, Jack, Thanks for your quick replies! That works. On Mon, May 13, 2013 at 1:18 PM, Jack Krupansky wrote: > Pure negative queries only work at the top level. > > So, try: > > q=(tag_id:(1 AND 2) OR tag_id:(*:* NOT 1)) > > > -- Jack Krupansky > -Original Message- From: Arun Rangarajan

writing a custom Filter plugin?

Does anyone know of any tutorials, basic examples, and/or documentation on writing your own Filter plugin for Solr? For Solr 4.x/4.3? I would like a Solr 4.3 version of the normalization filters found here for Solr 1.4: https://github.com/billdueber/lib.umich.edu-solr-stuff But those are old,

.skip.autorecovery=Y + restart solr after crash + losing many documents

Hi all, We write to two same-named cores in the same collection for redundancy, and are not taking advantage of the full benefits of solr cloud replication. We use solrcloud.skip.autorecovery=true so that Solr doesn't try to sync the indexes when it starts up. However, we find that if the core

Re: How to get/set customized Solr data source properties?

: learned it should work. And this is my actual code. I create this : DataSource for testing my ideas. I am blocked at the very beginning...sucks : :( but you only showed us one line of code w/o any context. nothing in your email was reproducible for other people to try to compile/run themselve

Solritas truncates content

Hi, I'm playing around with the example that comes with SOLR 4. I've indexed some documents using the Tika extractor. I'm looking at the velocity templates and trying to figure out how the /browse (solritas) functionality works because I would like to add functionality to view the complete docume

Re: How to deal with cache for facet search when index is always increment?

: > For real time seach, the docs would be import to index anytime. In this : > case, the cache is nealy always need to create again, which cause the facet : > seach is very slowly. : > Do you have any idea to deal with such problem? : We're in a similar situation and have had better performance

Request to be added to ContributorsGroup

Hello Wiki Admins, Request you to please add me to the ContributorsGroup. I have been using Solr for a few years now and I would like to contribute back by adding more information to the wiki Pages. Wiki User Name : Shreejay --Shreejay

Re: Quick question about indexing with SolrJ.

: I don't want to use POJOs, that's the main problem. I know that you can : send AJAX POST HTTP Requests with JSON data to index new documents and I : would like to do that with SolrJ, that's all, but I don't find the way to : do that, :-/ . What I would like to do is simple retrieve an String wit

Re: Request to be added to ContributorsGroup

On May 13, 2013, at 6:54 PM, Shreejay Nair wrote: > Hello Wiki Admins, > > Request you to please add me to the ContributorsGroup. > > I have been using Solr for a few years now and I would like to contribute > back by adding more information to the wiki Pages. > > Wiki User Name : Shreejay > >

Re: How to get/set customized Solr data source properties?

If the property has a full stop, it is probably going through the scoped resolver which may be causing issues. I would start with very basic property name format and see what happens. Otherwise, it is probably a breakpoint and debug time. Regards, Alex. Personal blog: http://blog.outerthoughts

Re: Solritas truncates content

#field is defined in conf/velocity/VM_global_library.vm as: #macro(field $f) #if($response.response.highlighting.get($docId).get($f).get(0)) #set($pad = "") #foreach($v in $response.response.highlighting.get($docId).get($f)) $pad$v## #set($pad = " ... ") #end #else #forea

Re: How to improve performance of geodist()

Hi Nicholas, Given that boosting is generally inherently fuzzy / inexact thing, you can likely get away with using simpler calculations. dist() can do the Euclidean distance (i.e. the Pythagorean theorem). If your data is in just one region of the world, you can project your data into a 2-D plan
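To get a feel for how little accuracy is lost over regional distances, the two calculations can be compared outside Solr. The Python sketch below is only an illustration of the idea, not Solr's implementation of geodist() or dist(): it compares the haversine formula with a flat-plane (equirectangular) approximation followed by the Pythagorean theorem. The function names, the Earth radius constant, and the sample coordinates are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def planar_km(lat1, lon1, lat2, lon2):
    """Equirectangular approximation: project onto a plane, then Pythagoras."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    x = math.radians(lon2 - lon1) * math.cos(mean_lat)
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)


# Sample points roughly at New York and Philadelphia (illustrative values)
exact = haversine_km(40.7128, -74.0060, 39.9526, -75.1652)
approx = planar_km(40.7128, -74.0060, 39.9526, -75.1652)

print(round(exact, 2), round(approx, 2))
print("relative error:", abs(exact - approx) / exact)
```

The planar version needs one cosine and one square root per pair instead of several trigonometric calls, which is the kind of saving the reply above is pointing at; the approximation degrades over long distances and near the poles.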

Request to be added to Contributor Group

Hi Admins, My name is Eric. I have an account at http://wiki.apache.org/solr/ with the user name Eric D. Please add me to the Contributor Group. We currently have JobSearcher.com.au up and running, which uses Solr. I am sure we can add comments and share some experience with Solr up there. Thank