Re: Search results are not coming if the query contains()

2010-11-15 Thread sivaprasad
I tried escaping the special characters. Now it is working fine. Thanks, guys. -- View this message in context: http://lucene.472066.n3.nabble.com/Search-results-are-not-coming-if-the-query-contains-tp1906181p1909589.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: hash uniqueKey generation?

2010-11-15 Thread Chris Hostetter
: I just finished reading on the wiki about deduplication and the solr.UUIDField : type. What I'd like to do is generate an ID for a document by hashing a subset : of its fields. One route I thought would be to do this ahead of time to CSV : data, but I would think sticking something into the Upda

encoding messy code

2010-11-15 Thread xu cheng
Hi all, I configured an app with Solr to index documents, and there is some Chinese content in the documents. I've configured the Apache Tomcat URIEncoding to be UTF-8, and I use curl to send the documents in XML format. However, when I query the documents, all the Chinese content becom

Re: Term component sort is not working

2010-11-15 Thread sivaprasad
Hi, I have given the terms component configuration. true autoSuggestTerm index termsComponent And I have two fields in the schema file. Now I am trying to sort the terms which are returned by the terms component based on WEIGHT

Re: Problem with synonyms

2010-11-15 Thread sivaprasad
I made changes to the schema file as shown below. And I have an entry in the synonym.txt file as shown below. hdtv => High Definition Television, High Definition TV, High Definition Televisions, High Definition TVs Now I

Re: Search results are not coming if the query contains()

2010-11-15 Thread sivaprasad
Hi Here is the query. hp laserjet 2200 (h3978a) mainten kit (h3978-60001) hp laserjet 2200 (h3978a) mainten kit (h3978-60001) +searchtext:hp +searchtext:laserjet +searchtext:2200 +searchtext:h3978a +searchtext:mainten +searchtext:kit +searchtext:h3978-60001 +searchte

Re: hash uniqueKey generation?

2010-11-15 Thread Lance Norskog
I think the deduplication signature field will work as a multiValued field. So you can do copyField to it from all of the source fields. Dan Lynn wrote: Hi, I just finished reading on the wiki about deduplication and the solr.UUIDField type. What I'd like to do is generate an ID for a docume

strange problem

2010-11-15 Thread Li Li
Hi all, I ran into a strange problem when feeding data to Solr. I started feeding and then pressed Ctrl+C to kill the feed program (post.jar). Because the XML stream was terminated abnormally, DirectUpdateHandler2 threw an exception. I then went to the index directory and sorted it by date. The newest files are f

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Koji Sekiguchi
(10/11/16 8:36), Jonathan Rochkind wrote: In Solr 1.4, facet.method=enum DOES work on multi-valued fields, I'm pretty certain. Correct, and I didn't say that facet.method=enum doesn't work for multiValued/tokenized field in my previous mail. I think Koji's explanation is based on before So

Re: Corename after Swap in MultiCore

2010-11-15 Thread Shawn Heisey
On 11/12/2010 3:00 PM, Shawn Heisey wrote: I have not tried reloading the core instead of restarting Solr, I should do that. Just so everyone's aware: Reloading the core is not enough to get solr.core.name to be updated in the healthcheck filename. Solr must be restarted.

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Jonathan Rochkind
Koji Sekiguchi wrote: Usually, you do not need to set facet.method because Solr automatically uses most appropriate facet method for each field type: boolean: TermEnum multiValued/tokenized: UnInvertedField other than those above: FieldCache As I understand it, in Solr 1.4, (and I may NOT un
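
The default selection Koji describes can be written down as a small lookup. This is only a sketch of the rule as quoted in this thread, not Solr's actual code; the function name and arguments are made up for illustration.

```python
def default_facet_strategy(field_type, multi_valued=False, tokenized=False):
    """Return the facet strategy named in the thread for a given field.

    Hypothetical helper: it only mirrors the three cases quoted above.
    """
    if field_type == "boolean":
        return "TermEnum"
    if multi_valued or tokenized:
        return "UnInvertedField"
    return "FieldCache"


print(default_facet_strategy("boolean"))                   # TermEnum
print(default_facet_strategy("text", tokenized=True))      # UnInvertedField
print(default_facet_strategy("string"))                    # FieldCache
```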

Re: Dismax - Boosting

2010-11-15 Thread Ahmet Arslan
> 1. Do we need to change the above DisMax handler > configuration as per our > requirements? Or Leave it as it is? What changes? Yes, you need to edit it. At least the field names. Does your schema have a field named sku? > 2. Do we need make DisMax as a default request > handler?  Do I need to add

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Koji Sekiguchi
(10/11/16 6:43), Dennis Gearon wrote: fc='field collapsing'? fc of facet.method=fc stands for Lucene's FieldCache. enum of facet.method=enum stands for Lucene's TermEnum. Usually, you do not need to set facet.method because Solr automatically uses the most appropriate facet method for each field t

Re: Search results are not coming if the query contains()

2010-11-15 Thread Erick Erickson
I don't see the query you're submitting. Try submitting your query with &debugQuery=on and pasting the results... But first I'd try escaping the parentheses in your query since they are part of the query syntax... Best Erick On Mon, Nov 15, 2010 at 1:04 PM, sivaprasad wrote: > > Hi, > I have t
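
A minimal sketch of the escaping Erick suggests, assuming the raw user text is escaped on the client before it is sent to Solr. The function name is made up for illustration; the character list is the set of special characters from the Lucene query parser syntax.

```python
import re

# Characters and operators that the Lucene query syntax treats as special.
_SPECIAL = re.compile(r'(&&|\|\||[+\-!(){}\[\]^"~*?:\\])')


def escape_query_text(text):
    """Prefix every special character or operator with a backslash."""
    return _SPECIAL.sub(r"\\\1", text)


print(escape_query_text("mainten kit (h3978-60001)"))
# mainten kit \(h3978\-60001\)
```

Escaping everything also disables intentional operators such as quotes or wildcards, so this only fits input that is meant to be plain text.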

Re: Term component sort is not working

2010-11-15 Thread Erick Erickson
Would it work to sort on your Weightage field? Note that your weightage field has to be comparable. That is, if you store it as a string, it must be normalized such that the dictionary comparison works. 5 would sort after 1000 unless you padded your 5 to something like 0005... Best Erick On Mon
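
The padding point can be seen with plain string sorting. A small sketch, assuming weights are non-negative integers and that a width of 4 is enough (both are assumptions, not from the thread):

```python
def pad_weight(weight, width=4):
    """Left-pad a non-negative integer weight with zeros to a fixed width."""
    return str(weight).zfill(width)


weights = [5, 1000, 42]

# Sorted as raw strings, "5" ends up after "1000".
print(sorted(str(w) for w in weights))      # ['1000', '42', '5']

# Sorted as padded strings, dictionary order matches numeric order.
print(sorted(pad_weight(w) for w in weights))  # ['0005', '0042', '1000']
```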

Re: Problem with synonyms

2010-11-15 Thread Ahmet Arslan
> Do i need to expand the synonyms at index time? Probably yes. You can play with its parameters and experiment.

Re: Boosting on a document value

2010-11-15 Thread Ahmet Arslan
> I've got a document with a "type" > field.  If the type is 1, I want to boost the > document's relevancy, but type=1 is not a > requirement.  Types other than 1 > should still be returned and scored as normal, just without > the boost. > > How do I do this? You can do it with one of the solutio
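
One commonly used approach, which may or may not be among the solutions Ahmet goes on to list, is to make the user's query required and add the type clause as an optional, boosted clause. The sketch below only builds the query string; the helper name and the boost value of 10 are made up for illustration, and it assumes the standard query parser with OR as the default operator so that the second clause stays optional.

```python
def with_optional_boost(user_query, field="type", value="1", boost=10):
    """Require the user's query and add an optional boosted clause."""
    return "+({}) {}:{}^{}".format(user_query, field, value, boost)


print(with_optional_boost("ipod"))
# +(ipod) type:1^10
```

Documents with other types still match through the required part; they just do not receive the extra score from the optional clause.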

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Peter Karich
I think it stands for field cache (according to http://wiki.apache.org/solr/SimpleFacetParameters this could be true ;-)) fc='field collapsing'? Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Jonathan Rochkind
Don't know, don't care. It may have begun standing for that, I don't know. It's now more of a 'strategy' than a method, it uses different algorithms depending on the nature of your facets, including whether they are multi-term or not. I don't entirely understand it. I've looked at the source a

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Dennis Gearon
fc='field collapsing'? Dennis Gearon

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Peter Karich
Hi Jonathan, I too am using fc because it was simply faster. Not sure if this can be applied in general. I will add this info to the wiki. Regards, Peter. Awesome. I'm not sure his point 1 about facet.method=enum is still valid in Solr 1.4+. The "fc" facet.method was changed significantly

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Jonathan Rochkind
Awesome. I'm not sure his point 1 about facet.method=enum is still valid in Solr 1.4+. The "fc" facet.method was changed significantly in 1.4, and generally no longer takes a lot of memory -- for facets with "many" unique values, method fc in fact should take less than enum, I think? Peter Ka

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-15 Thread Peter Karich
Just in case someone is interested: I put the emails of Peter Sturge with some minor edits in the wiki: http://wiki.apache.org/solr/NearRealtimeSearchTuning I found myself searching the thread again and again ;-) Feel free to add and edit content! Regards, Peter. Hi Erik, I thought this woul

Possibilities of (near) real time search with solr

2010-11-15 Thread Peter Karich
Hi, I want to make my indexed docs (tweets) available relatively fast: 1 to 10 sec or even 30 sec would be ok. At the moment I am using the read only core scenario described here (point 5)* with a commit frequency of 180 seconds, which was fine until a few days ago. (I am using solr1.4.1) Now the time

Re: Using jetty's GzipFilter in the example solr.war

2010-11-15 Thread Jay Luker
On Sun, Nov 14, 2010 at 12:49 AM, Kiwi de coder wrote: > try to put u filter on top of web.xml (instead of middle or bottom), i try > this few day and it just only a simple solution (not sure is a spec to put > on top or is a bug) Thank you. An explanation of why this worked is probably better e

Search results are not coming if the query contains()

2010-11-15 Thread sivaprasad
Hi, I have the below analysis settings at index time and query time. When I analyze the query using the admin screen I got the below result. hp laserjet2200(h3978a)mainten kit (h3978-60001) But when I submit the query for search

Boosting on a document value

2010-11-15 Thread Jon Drukman
I've got a document with a "type" field. If the type is 1, I want to boost the document's relevancy, but type=1 is not a requirement. Types other than 1 should still be returned and scored as normal, just without the boost. How do I do this? -jsd-

Dismax - Boosting

2010-11-15 Thread Solr User
Hi, Currently we are using StandardRequestHandler and the configuration in SolrConfig.xml is as below: explicit We would like to switch to DisMax request handler and the configuration in SolrConfig.xml is: dismax explicit 0.01

Re: Problem with synonyms

2010-11-15 Thread sivaprasad
Do I need to expand the synonyms at index time?

Re: Term component sort is not working

2010-11-15 Thread sivaprasad
How can I utilize the weightage of terms which I captured from the end user? Any ideas?

hash uniqueKey generation?

2010-11-15 Thread Dan Lynn
Hi, I just finished reading on the wiki about deduplication and the solr.UUIDField type. What I'd like to do is generate an ID for a document by hashing a subset of its fields. One route I thought would be to do this ahead of time to CSV data, but I would think sticking something into the Upd
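
The "ahead of time" route Dan mentions could look roughly like the sketch below: compute the ID from a subset of fields before the data is sent to Solr. The field names, the separator and the choice of MD5 are assumptions for illustration; this is not the Solr deduplication feature itself.

```python
import hashlib


def hashed_id(doc, fields):
    """Build a stable ID by hashing the chosen fields of a document dict.

    A unit separator keeps ("ab", "c") and ("a", "bc") from colliding.
    """
    joined = "\x1f".join(str(doc.get(name, "")) for name in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


row = {"title": "Solr in Action", "author": "someone", "price": "10"}
print(hashed_id(row, ["title", "author"]))
```

Because only the listed fields go into the hash, two rows that differ in other fields (here, price) get the same ID and overwrite each other on indexing.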

Re: Where is the lock file?

2010-11-15 Thread Bharat Jain
Hi guys, We are also running into an unusually high number of LockObtainFailedException errors in our production environment. We have a very simple setup: a master and a slave with a multi-core setup. We are using SOLR 1.3. What is the use of lockType? Thanks Bharat Jain On Tue, Oct 12, 2010 at 4:23 AM,

Re: Searching with acronyms

2010-11-15 Thread Savvas-Andreas Moysidis
yes, a synonyms filter should allow you to achieve what you want. On 15 November 2010 03:14, sivaprasad wrote: > > Hi, > > I have a requirement where a user enters acronym of a word, then the search > results should come for the expandable word.Let us say. If the user enters > 'TV', the search r

NoVA/DC - Lucene/Solr Meetup - Wednesday, Nov. 17

2010-11-15 Thread Erik Hatcher
We still have some open spots for the meetup we're hosting this Wednesday night in DC. Come on out, it'll be a great time. Erik

Re: Term component sort is not working

2010-11-15 Thread Ahmet Arslan
> As part of terms component we have a parameter > terms.sort=index|count. > > If we put terms.sort=index, will be returns the terms in > index order. terms.sort=index means sorting in lexicographic order, like a normal dictionary.

Re: Problem with synonyms

2010-11-15 Thread Ahmet Arslan
Multi-word synonyms are meant to be used at index time. QueryParser will split your query on white spaces unless you use quotes. "The Lucene QueryParser tokenizes on white space before giving any text to the Analyzer" [1] [1]http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.S
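
A toy model of the effect Ahmet describes, in which the query is split on white space before any analysis happens, so a multi-word synonym entry never sees the whole phrase unless it is quoted. This is not Solr or Lucene code; the synonym table and function names are made up for illustration.

```python
import shlex

# Toy synonym table keyed on the lowercased phrase (hypothetical data).
SYNONYMS = {"high definition television": "hdtv"}


def parser_chunks(query):
    """Split on white space, keeping double-quoted phrases together."""
    return shlex.split(query)


def apply_synonyms(query):
    """Look up each chunk separately, as an analyzer would receive it."""
    return [SYNONYMS.get(chunk.lower(), chunk) for chunk in parser_chunks(query)]


print(apply_synonyms('High Definition Television'))
# ['High', 'Definition', 'Television']
print(apply_synonyms('"High Definition Television"'))
# ['hdtv']
```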

Term component sort is not working

2010-11-15 Thread sivaprasad
Hi, As part of the terms component we have a parameter terms.sort=index|count. If we put terms.sort=index, it will return the terms in index order. While doing the import, I used the below query to index. SELECT ID,SEARCH_KEY,WEIGHTAGE FROM SEARCH_KEY_WEIGHTAGE ORDER BY weightage DESC So t

Problem with synonyms

2010-11-15 Thread sivaprasad
Hi, I have a set of synonyms in the synonyms.txt file. For example: hdtv,High Definition Television, High Definition TV In the admin screen, when I type "High Definition Television" as the query term to analyze, I get hdtv as the result of the analysis. But when I search for the term hdtv and "High D

Re: segments_7eq4 not found !!!! HELP

2010-11-15 Thread Erick Erickson
260K documents isn't even close to straining Solr, so it must be something else... What version of Solr are you using? Can we see your solrconfig.xml and schema.xml? What exactly is the command you issue when you get this error? Have you looked at the admin page in Solr (localhost:8983/solr/admi

Re: Link to download solr4.0 is not working?

2010-11-15 Thread Jan Høydahl / Cominvent
Yes, the project is not good enough at communicating the roadmap clearly. We often hide behind the fact that nobody knows since it's open source, but I think the PMC would benefit from trying to maintain some sort of no-guarantee roadmap clarifying to all what most people think will happen going

Re: Link to download solr4.0 is not working?

2010-11-15 Thread kenf_nc
Thanks Jan. I didn't know about 1.4.2, I'll give it a look. However, your link is something I've already seen. I understand the different Solr versions, my question was more on what is the process, and timeline, for the community to turn the current trunk into a 'release'. From that link, and other

Re: Link to download solr4.0 is not working?

2010-11-15 Thread Jan Høydahl / Cominvent
Hi, Added a link to the wiki to the latest stable 1.4 branch that will become 1.4.2. You should check out and build this branch if you have a requirement to use only a released version. 1.4.2 only contains critical bug fixes over 1.4.1 and is considered stable. See here for a clarification of S

Re: Link to download solr4.0 is not working?

2010-11-15 Thread kenf_nc
While we are on this subject...my company is kind of new to the whole open source as a production tool concept. I can't push anything to production that isn't labeled as 'release' or similar designation. So, 1.4.1 is what I have right now. I can play with other versions but that's about it. I'm fa

Re: Link to download solr4.0 is not working?

2010-11-15 Thread Jan Høydahl / Cominvent
Fixed the Wiki. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 12. nov. 2010, at 03.44, Deche Pangestu wrote: > Hello, > Does anyone know where to download solr4.0 source? > I tried downloading from this page: > http://wiki.apache.org/solr/FrontPage#solr_developmen

Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-15 Thread Robert Muir
https://issues.apache.org/jira/browse/SOLR-2237 On Mon, Nov 15, 2010 at 5:04 AM, Jakub Godawa wrote: > I tried to reach the authors twice, but with no luck. I've seen some > posts where people finally were able to launch it (without much pain). > I don't know. If any pro would be so nice to try to

Re: Solr Negative query

2010-11-15 Thread Yonik Seeley
On Mon, Nov 15, 2010 at 12:42 AM, Viswa S wrote: > > Apologies for starting a new thread again, my mailing list subscription > didn't finalize till later than Yonik's response. > > Using "Field1:Val1 AND (*:* NOT Field2:Val2)" works, thanks. > > Does my original query "Field1:Value1 AND (NOT Fiel
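
Why the `*:*` helps can be pictured with sets. The sketch below is a toy model under the assumption that a boolean clause holding only prohibited terms has nothing to subtract from and so matches nothing; the document IDs and set names are invented for illustration.

```python
ALL_DOCS = {1, 2, 3, 4}        # what *:* matches
FIELD1_VAL1 = {1, 2, 3}        # docs matching Field1:Val1
FIELD2_VAL2 = {2, 4}           # docs matching Field2:Val2


def boolean_clause(positive, negative):
    """Start from the positive matches, then remove the negative ones.

    With no positive part (None) the starting set is empty.
    """
    base = set() if positive is None else set(positive)
    return base - negative


# Field1:Val1 AND (NOT Field2:Val2)
print(FIELD1_VAL1 & boolean_clause(None, FIELD2_VAL2))       # set()

# Field1:Val1 AND (*:* NOT Field2:Val2)
print(FIELD1_VAL1 & boolean_clause(ALL_DOCS, FIELD2_VAL2))   # {1, 3}
```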

Re: simple dismax with OR

2010-11-15 Thread Jakub Godawa
thank you, that works well. 2010/11/15 Matti Oinas : > Define mm(Minimum 'should' match) value for dismax. The default is > 100% so every clause must match. > > http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29 > > 2010/11/15 Jakub Godawa : >> Hi! I have my dismax

Re: simple dismax with OR

2010-11-15 Thread Matti Oinas
Define the mm (Minimum 'Should' Match) value for dismax. The default is 100%, so every clause must match. http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29 2010/11/15 Jakub Godawa : > Hi! I have my dismax that is searching through two fields. > > >   >    dismax >    
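
A sketch of passing mm per request instead of (or in addition to) setting it in the handler defaults. The field names come from Jakub's configuration; the mm value of 1 (at least one clause must match, which gives OR-like behaviour) and the use of the defType parameter are assumptions for illustration.

```python
from urllib.parse import urlencode


def dismax_query_string(user_query, mm="1"):
    """Build the query-string part of a dismax request with an explicit mm."""
    params = {
        "defType": "dismax",
        "q": user_query,
        "qf": "name_en^1.0 answe_en^1.5",
        "mm": mm,  # minimum number of optional clauses that must match
    }
    return urlencode(params)


print(dismax_query_string("various appliances"))
```

The resulting string would be appended to the select URL of the Solr instance, for example after `/solr/select?`.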

simple dismax with OR

2010-11-15 Thread Jakub Godawa
Hi! I have my dismax that is searching through two fields. dismax name_en^1.0 answe_en^1.5 Now I have a document that has "Various appliances can be installed here" in the answen_en field, indexed with English analyzer. When I query "installation" I have the result of

Re: DIH URLDataSource and useSolrAddSchema=true

2010-11-15 Thread Dario Rigolin
On Monday, November 15, 2010 11:18:47 am Lance Norskog wrote: > This is more complex than you need. The Solr update command can accept > streamed data, with the stream.url and stream.file options. You can just > use solr/update with stream.url=http://your.machine/your.php.script and > it will read

Re: A Newbie Question

2010-11-15 Thread Lance Norskog
"There is no current feature" is what I meant. Yes, it would be very handy to do this. I handled this problem in the DIH by creating two documents, both with the same unique ID. The first doc just had the metadata. The second document parsed the input with Tika, but had 'skip doc on error' set

Re: DIH URLDataSource and useSolrAddSchema=true

2010-11-15 Thread Lance Norskog
This is more complex than you need. The Solr update command can accept streamed data, with the stream.url and stream.file options. You can just use solr/update with stream.url=http://your.machine/your.php.script and it will read as fast as it wants. There is no "parallel indexing" support, but y
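
A sketch of the URL Lance describes, built with the standard library. The Solr base URL and the commit parameter are assumptions for illustration; the script address is the placeholder from the message. Remote streaming has to be enabled in solrconfig.xml for stream.url to be accepted.

```python
from urllib.parse import urlencode


def streamed_update_url(solr_base, source_url, commit=True):
    """Build an update URL that asks Solr to pull its input from source_url."""
    params = {"stream.url": source_url}
    if commit:
        params["commit"] = "true"
    return solr_base.rstrip("/") + "/update?" + urlencode(params)


print(streamed_update_url("http://localhost:8983/solr",
                          "http://your.machine/your.php.script"))
```

Requesting the printed URL (for example with curl) makes Solr fetch the add-XML from the PHP script itself, so the PHP side only has to emit the documents.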

Re: XML to solr

2010-11-15 Thread Lance Norskog
The XPathEntityProcessor has a very limited grammar of path expressions. It has the ability to use an XSL script, which would then let you do anything, but I have not used it. Chantal Ackermann wrote: Hi Jörg, you could use the DataImportHandler's XPathEntityProcessor. There you can specify f

Re: my index has 500 million docs ,how to improve solr search performance?

2010-11-15 Thread Lance Norskog
It's not that EC2 instances have slow disks, it's that they have no quota system to guarantee you X amount of throughput. I've benchmarked 1x to 3x on the same instance type at different times. That is, 300% variation in disk speeds. Filter queries are only slow once; after that they create a

DIH URLDataSource and useSolrAddSchema=true

2010-11-15 Thread Dario Rigolin
I'm looking to index data in Solr using a PHP page feeding the index. In my application I have all docs already "converted" to a solr/add xml document and I need to make solr able to get all changed documents into the index. Looking at DIH I decided to use URLDataSource and useSolrAddSchema=true

Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-15 Thread Jakub Godawa
I tried to reach the authors twice, but with no luck. I've seen some posts where people finally were able to launch it (without much pain). I don't know. If any pro would be so nice to try to run the stempel on his/her machine and paste me some verbose step by step solution I would really appreciate.

Re: my index has 500 million docs ,how to improve solr search performance?

2010-11-15 Thread Toke Eskildsen
On Mon, 2010-11-15 at 06:35 +0100, lu.rongbin wrote: > In addition,my index has only two store fields, id and price, and other > fields are index. I increase the document and query cache. the ec2 > m2.4xLarge instance is 8 cores, 68G memery. all indexs size is about 100G. Looking at http://aws.ama

Re: XML to solr

2010-11-15 Thread Chantal Ackermann
Hi Jörg, you could use the DataImportHandler's XPathEntityProcessor. There you can specify for each Solr field the XPath at which its value is stored in the original file (your first example snippet). The value of field "FIEL_ITEMS_DATEINAME" for example would have the XPath //fie...@name='DATEIN

Re: full text search in multiple fields

2010-11-15 Thread PeterKerk
@Erick: Nope, those fields indeed aren't chainable, I used iorixxx's solution and now it works. :)

XML to solr

2010-11-15 Thread Jörg Agatz
Hi users. I have a question: I have a lot of XML to index. At the moment I have two XML files, one original and one for Solr (Search_xml), for example: 6483030ed18d8b7a58a701c8bb638d20 0012_20101105111938206.pdf PDM

segments_7eq4 not found !!!! HELP

2010-11-15 Thread Jörg Agatz
Hello, I have a problem. Last month I got an error, segment "7eq4 " not found, so I can't index anything but can search in the index. I removed the index and created a new one, but now I get the same error again... Maybe you can help me and tell me what is wrong with my Solr. At the moment I have 2