Personalized Search

2010-05-19 Thread Rih
Has anybody done personalized search with Solr? I'm thinking of adding per-member/visitor fields such as "bought" or "like" to a product search schema via dynamic fields. Another option is a multi-valued field that contains user IDs. What are the possible performance issues with this s

Subclassing DIH

2010-05-19 Thread Blargy
I am trying to subclass DIH to add I am having a hard time trying to get access to the current Solr Context. How is this possible? Is there any way to get access to the current DataSource, DataImporter, etc.? On a related note... when working with an onImportEnd or onImportStart, how can I get a r

caching on unique queries

2010-05-19 Thread Kevin Osborn
Pretty much every one of my queries is going to be unique. However, the query is fairly complex and contains both unique and non-unique data. In the query, some fields will be unique (e.g. description), but other fields will be fairly common (e.g. category). If we could use those common fiel

Re: Moving from Lucene to Solr?

2010-05-19 Thread Peter Karich
Sorry. Wasn't intended as a hijacking :-( : Subject: Moving from Lucene to Solr? : References: : In-Reply-To: http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instea

Query Timings increase after system is idle

2010-05-19 Thread ST ST
Folks, We have a problem in our environment: after the system has been idle for 9 hours, query time goes up from a few hundred ms to 4+ seconds. System Details: - Solr 1.4 - 10 Million Index. - Use MMAP for mapping the index files in memory Test Details: - 8 hour perf

Re: Stemming Filters in wiki

2010-05-19 Thread Chris Hostetter
: : These entries were moved here: http://wiki.apache.org/solr/LanguageAnalysis but there doesn't seem to be a link to that page from AnalyzersTokenizersTokenFilters (or from anywhere on the wiki according to the wiki link search feature) ... so i'll add some verbiage about it. : : On Wed, May

Re: Moving from Lucene to Solr?

2010-05-19 Thread Chris Hostetter
: Subject: Moving from Lucene to Solr? : References: : In-Reply-To: http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change

Re: Stemming Filters in wiki

2010-05-19 Thread Robert Muir
Hi Asif, These entries were moved here: http://wiki.apache.org/solr/LanguageAnalysis On Wed, May 19, 2010 at 2:49 PM, Asif Rahman wrote: > I see that the entries for PorterStemFilterFactory, > EnglishPorterFilterFactory, and SnowballPorterFilterFactory have been > removed from the Analyzers, Tok

Re: Embedded Server, Caching, Stats page updates

2010-05-19 Thread Chris Hostetter
: "Switched" works for the specific setup i'm using - the server would refer : to itself in the CommonHttpSolrServer request sent, i.e. it would run both : the server and client sides. Removing this and simply using : EmbeddedSolrServer just made the setup a little more sane in that aspect. : Does

Stemming Filters in wiki

2010-05-19 Thread Asif Rahman
I see that the entries for PorterStemFilterFactory, EnglishPorterFilterFactory, and SnowballPorterFilterFactory have been removed from the Analyzers, Tokenizers, and Token Filters wiki page. Is there a reason for this? Thanks, asif -- Asif Rahman Lead Engineer - NewsCred a...@newscred.com htt

RE: disable caches in real time

2010-05-19 Thread Nagelberg, Kallin
I suppose you are still losing some performance on the replicated box since it needs to use some resources to warm the cache. It would be nice if a warmed cache could be replicated from the master though perhaps that's not practical. Chris is right though: The newly updated index created by a co

Re: disable caches in real time

2010-05-19 Thread Chris Hostetter
: I've always understood that if you do a commit (replication does it), a new : searcher is opened, and you lose performance (queries per second) while the : caches are regenerated. I think i don't explain correctly my situation not if you configure your caches with autowarming -- then solr will war

Re: Custom sorting

2010-05-19 Thread Daniel Cassiano
Hi Dan, It seems that you want a SearchComponent[1], something like the QueryElevationComponent[2]. Take a look at how it works, and I think you can build your custom solution. [1]- http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html [2]- http://wiki.apache.org/solr

The Seven Deadly Sins of Solr Spanish translation

2010-05-19 Thread Juan Pedro Danculovic
Hello, I translated this article into Spanish. It is very helpful for avoiding common mistakes in Solr installations. http://www.linebee.com/?p=434&lang=es Thanks, Juan

Re: index merge

2010-05-19 Thread Ahmet Arslan
> I am running Solr on a 64-bit HP-UX system. The total > index size is about > 5GB and when I try to load any new document, Solr tries to > merge the existing > segments first and results in the following error. I could see > a temp file is > growing within the index dir to around 2GB in size and later it > fails

Re: index merge

2010-05-19 Thread uma m
Hi All, I am running Solr on a 64-bit HP-UX system. The total index size is about 5GB, and when I try to load any new document, Solr tries to merge the existing segments first, which results in the following error. I could see a temp file growing within the index dir to around 2GB in size, and later it fails with

Solr Delta Queries

2010-05-19 Thread Vladimir Sutskever
I have an "indexed_timestamp" field in my index, which lets me know when a document was indexed. For some reason, when doing delta indexing via DIH, this field is not being updated. Are timestamp fields updated during DELTA updates? Kind regards, Vladimir Sutskever Investment Bank - Techno

Re: defaultSearchField

2010-05-19 Thread Antonello Mangone
thank you all ;) 2010/5/19 Jan Kammer > There is something called dismax-requesthandler. I think this is what you > are looking for. > > greetz, Jan > > > On 19.05.2010 15:47, Antonello Mangone wrote: > > Hi to everyone, I'd like to know if it's possible to use the * >> defaultSearchField* on

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen
Yes, I think that will make a good solution. In Danish, "sku" is a bad word ;-), but thanks for the info. Nagelberg, Kallin wrote: > > Sorry, in North America 'sku' (stock keeping unit) is the common term in > business to specifically identify a particular product, > http://lmgtfy.com/?q=sku. >

Re: DIH. behavior after a import. Log, delete table !?

2010-05-19 Thread Ahmet Arslan
> created a JAR file. This JAR file deletes my table. > > But Solr absolutely refuses to start this JAR. I put a > run.bat file into the > folder where my JAR is saved. This batch file runs and > deletes the table, but > when Solr starts this batch file, it doesn't work. I don't > know why. !?!?!? > i

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Nagelberg, Kallin
Sorry, in North America 'sku' (stock keeping unit) is the common term in business to specifically identify a particular product, http://lmgtfy.com/?q=sku. And yes, I think you understand me. I am imagining you can structure your products in a hierarchy. For each node in the tree you traverse a

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen
Sorry, what does "sku" mean? I understand you like this: index the base and variants, and include all attributes (for one base and its variants) in each document. I think that would work. Thanks. Nagelberg, Kallin wrote: > > I agree that pulling all attributes into the parent sku during indexing

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen
You are right; in that case an arbitrary one would have to be chosen, or probably both should be in the result set. Difficult to say what the marketing department would like ;-) Leonardo Menezes wrote: > > if that is so, and maybe, you have for example, two variants of cars with > automati

RE: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Nagelberg, Kallin
I agree that pulling all attributes into the parent sku during indexing could work well. Define a Boolean field like 'isVirtual' to identify the non-leaf skus, and use a multi-valued field for each of the attributes. For now you can do a search like (isVirtual:true AND doorType:screen). If at a
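The search suggested in this message can be sent as an ordinary query parameter. The following is a minimal sketch, not a definitive implementation: the endpoint URL and the helper name `build_virtual_sku_query` are assumptions made for illustration, while the field names `isVirtual` and `doorType` come from the message itself.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_virtual_sku_query(attribute, value):
    """Build a select URL that matches only non-leaf (virtual) skus
    whose multi-valued attribute field contains the given value."""
    params = {"q": f"(isVirtual:true AND {attribute}:{value})"}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_virtual_sku_query("doorType", "screen")
print(url)
```

Because each parent document carries the attributes of all its variants, a single query of this form returns base products without needing to know which variant matched.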

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Leonardo Menezes
If that is so, and you have, for example, two variants of cars with automatic, what would define which one was the hit? Or do fields not share common information across variants? If they do share, you wouldn't be able to determine which one was the hit (because it was on both of them) and would

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen
thanks. Currently not, but requirements change all the time as always ;-) If we get a requirement, that a facet shall be "material of doors", we will need to know which variant was the hit. I would like to be prepared for that. Leonardo Menezes wrote: > > would you then need to know in which

Re: Embedded Server, Caching, Stats page updates

2010-05-19 Thread Antoniya Statelova
> > The way you phrased that paragraph makes me think that one of us doesn't > understand what exactly you did when you "switched" ... > "Switched" works for the specific setup i'm using - the server would refer to itself in the CommonHttpSolrServer request sent, i.e. it would run both the server

Re: Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread Leonardo Menezes
Would you then need to know which variant produced your match? Because if not, you can just index the whole thing as one single document... On Wed, May 19, 2010 at 4:23 PM, hkmortensen wrote: > > I do searching for products. Each base product exist in variants as well. > One > variant has

Re: DIH. behavior after a import. Log, delete table !?

2010-05-19 Thread stockii
Hey, thanks. I did everything you said and created a JAR file. This JAR file deletes my table, but Solr absolutely refuses to start this JAR. I put a run.bat file into the folder where my JAR is saved. This batch file runs and deletes the table, but when Solr starts this batch file, it doesn't work. I don't k

Challenge: Searching for variant products and get basic products in result set

2010-05-19 Thread hkmortensen
I search for products. Each base product exists in variants as well. One variant has a glass door, another a steel door, etc. The variants can have different prices. The base product does not really exist; only the variants exist IRL. The case corresponds to cars: the car model is the base prod

Re: defaultSearchField

2010-05-19 Thread Jan Kammer
There is something called dismax-requesthandler. I think this is what you are looking for. greetz, Jan On 19.05.2010 15:47, Antonello Mangone wrote: Hi to everyone, I'd like to know if it's possible to use the * defaultSearchField* on more fields ??? i.e. field1, field2, field3 Thanks you all

Re: defaultSearchField

2010-05-19 Thread Ahmet Arslan
> Hi to everyone, I'd like to know if > it's possible to use the * > defaultSearchField* on more fields ??? > > i.e. > > field1, field2, field3 > > No. But you can query multiple fields using dismax. qf=field1,field2,field3&defType=dismax http://wiki.apache.org/solr/DisMaxRequestHandler
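The dismax suggestion above can be sketched as a small URL builder. This is an illustration under stated assumptions: the endpoint URL and the helper name `build_dismax_url` are hypothetical, and the field names are the placeholders from the question. The DisMax documentation shows `qf` as a whitespace-separated list of fields, so this sketch joins the fields with spaces rather than commas.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_dismax_url(user_query, fields):
    """Build a select URL that searches several fields at once via dismax."""
    params = {
        "q": user_query,
        "defType": "dismax",     # switch to the dismax query parser
        "qf": " ".join(fields),  # query fields, space-separated
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


dismax_url = build_dismax_url("steel door", ["field1", "field2", "field3"])
print(dismax_url)
```

Per-field boosts, if wanted, are written inline in `qf` (for example `field1^2.0`); the sketch leaves them out to stay close to the original question.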

defaultSearchField

2010-05-19 Thread Antonello Mangone
Hi everyone, I'd like to know if it's possible to use *defaultSearchField* on more than one field, i.e. field1, field2, field3. Thank you all

Re: jmx issue with solr

2010-05-19 Thread Na_D
Thanks for the info, using the above properties solved the issue.

Re: Storing RandomSortField

2010-05-19 Thread Alexandre Rocco
Leonardo, I was able to use the feature with a dynamic field as pointed out in the documentation. I was just curious to take a peek at the values that are generated, even when the field is not dynamic, so I tried to figure out a way to do so. Maybe some output when the debug query is enabled would

Re: jmx issue with solr

2010-05-19 Thread Jean-Sebastien Vachon
Hi, Try adding these options... -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false On 2010-05-19, at 3:44 AM, Na_D wrote: > > Hi, > > I am trying to start solr with the following command : > > java -Dsolr.solr.home="./example-DIH/solr/" -Dcom.sun.man

Re: Deduplication

2010-05-19 Thread Ahmet Arslan
> TermsComponent maybe? > > or faceting? > q=*:*&facet=true&facet.field=signatureField&defType=lucene&rows=0&start=0 > > if you append &facet.mincount=1 to above url you can > see your duplications > After re-reading your message: sometimes you want to show duplicates, sometimes you don't wan
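The faceting approach quoted above can be sketched as follows. This is a hedged illustration, not the poster's code: the endpoint URL and both helper names are assumptions, and `signatureField` is the placeholder field name from the message. The parser assumes Solr's default JSON layout for facet counts, a flat list that alternates term and count. Note that `facet.mincount=1` only hides unused terms; a value of 2 would restrict the list to signatures shared by more than one document.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_signature_facet_url(signature_field, min_count=1):
    """Build a facet-only request (rows=0) that counts documents per signature."""
    params = {
        "q": "*:*",
        "defType": "lucene",
        "rows": 0,
        "start": 0,
        "facet": "true",
        "facet.field": signature_field,
        "facet.mincount": min_count,
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


def duplicated_signatures(facet_list):
    """Return signatures that occur more than once.

    facet_list is assumed to be a flat list: [term, count, term, count, ...].
    """
    terms = facet_list[0::2]
    counts = facet_list[1::2]
    return [term for term, count in zip(terms, counts) if count > 1]


facet_url = build_signature_facet_url("signatureField")
print(facet_url)
print(duplicated_signatures(["a1f3", 3, "9c2e", 1, "77b0", 2]))
```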

Re: Deduplication

2010-05-19 Thread Ahmet Arslan
> Basically for some uses cases I would like to show > duplicates for other I > wanted them ignored. > > If I have overwriteDupes=false and I just create the dedup > hash how can I > query for only unique hash values... ie something like a > SQL group by. TermsComponent maybe? or faceting? q

Re: Moving from Lucene to Solr?

2010-05-19 Thread findbestopensource
Hi Peter, You need to use Lucene, - To have more control - You cannot depend on any Web server - To use termvector, termdocs etc - You could easily extend to have your own Analyzer You need to use Solr, - To index and search docs easily by writing little code - Solr is a standal

Moving from Lucene to Solr?

2010-05-19 Thread Peter Karich
Hi all, while asking a question on stackoverflow [1] some other questions came up: Is SolrJ a recommended way to access Solr or should I prefer the HTTP interface? How can I (j)unit-test Solr? (e.g. create+delete index via Java call) Is Lucene faster than Solr? ... do you have experiences, prefer

Re: TikaEntityProcessor on Solr 1.4?

2010-05-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess it should work because TikaEntityProcessor does not use any new 1.4 APIs. On Wed, May 19, 2010 at 1:17 AM, Sixten Otto wrote: > Sorry to repeat this question, but I realized that it probably > belonged in its own thread: > > The TikaEntityProcessor class that enables DataImportHandler to

Custom sorting

2010-05-19 Thread dan sutton
Hi, I have a requirement to do the following: For up to the first 10 results (i.e. only on the first page) show sponsored category ads, in order of bid, but no more than 2 / category, and only if all sponsored cat' ads are more than min% of the highest score. e.g. If I had the following: min% =1
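One reading of the selection rule in this message can be sketched in plain Python, independent of Solr. Everything here is an assumption made for illustration: the function name `select_sponsored`, the dictionary keys, and in particular the interpretation that an ad qualifies only when its own score exceeds min% of the highest score in the result set. The message is truncated, so the intended rule may differ.

```python
from collections import Counter


def select_sponsored(ads, highest_score, min_pct, max_total=10, max_per_category=2):
    """Pick sponsored ads in descending bid order.

    Each ad is a dict with hypothetical keys "category", "bid" and "score".
    An ad is kept only if its score is above min_pct percent of highest_score,
    and at most max_per_category ads are kept per category.
    """
    threshold = highest_score * min_pct / 100.0
    chosen = []
    per_category = Counter()
    for ad in sorted(ads, key=lambda a: a["bid"], reverse=True):
        if len(chosen) == max_total:
            break  # first page is full
        if ad["score"] <= threshold:
            continue  # not relevant enough to be shown
        if per_category[ad["category"]] >= max_per_category:
            continue  # category quota already used
        per_category[ad["category"]] += 1
        chosen.append(ad)
    return chosen


example_ads = [
    {"id": 1, "category": "a", "bid": 5, "score": 10.0},
    {"id": 2, "category": "a", "bid": 4, "score": 9.0},
    {"id": 3, "category": "a", "bid": 3, "score": 9.0},
    {"id": 4, "category": "b", "bid": 2, "score": 0.5},
    {"id": 5, "category": "b", "bid": 1, "score": 8.0},
]
picked = select_sponsored(example_ads, highest_score=10.0, min_pct=10)
print([ad["id"] for ad in picked])
```

Inside Solr, logic of this kind would live in a custom component that post-processes the ranked results; the sketch only shows the selection step.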

Re: Solr Architecture discussion

2010-05-19 Thread rabahb
Do you have any insights that could help me and other people that might be interested in that discussion? Thanks.

Re: Storing RandomSortField

2010-05-19 Thread Leonardo Menezes
Hey, for random sorting, random values are generated at runtime using the seed you pass as one of the parameters, among other things. This way, if the value you use as seed is the same in different requests, the sorting order should be the same. You could also, for debbuin

Re: Storing RandomSortField

2010-05-19 Thread Marco Martinez
Hi Alexandre, I am not totally sure about this, but the random sort field is only used to do a random sort on your searches, and you will have to pass different values to get different sorts. This only applies to searches, so no value is indexed. You will find more information here: http://luc

jmx issue with solr

2010-05-19 Thread Na_D
Hi, I am trying to start solr with the following command : java -Dsolr.solr.home="./example-DIH/solr/" -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3000 On doing so an error is reported : Error: Password file read access must be restricted: C:\Program Files\Java\jdk1. 6.

Re: disable caches in real time

2010-05-19 Thread Marco Martinez
Hi Chris, Thank you for your answer. I've always understood that if you do a commit (replication does one), a new searcher is opened, and you lose performance (queries per second) while the caches are regenerated. I think I didn't explain my situation correctly before; with my schema I want to avoid t