Re: No "group by"? looking for an alternative.

2010-08-04 Thread Mickael Magniez
Thanks for your response. Unfortunately, I don't think it'll be enough. In fact, I have many products other than shoes in my index, with many other facet fields. I simplified my schema: in reality the facets are dynamic fields. -- View this message in context: http://lucene.472066.n3.nabble.com/

how to take a value from the query result

2010-08-04 Thread twojah
this is my query in browser navigation toolbar http://172.16.17.126:8983/search/select/?q=AUC_ID:607136 and this is the result in browser page: ... 1 1.0 576 27017 Bracket Ceiling untuk semua merk projector, panjang 60-90 cm Bahan Besi Cat Hitam = 325rb Bahan Sta /aksesoris-batere-dan-tripod/up
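A minimal sketch of one way to take a single value out of such a result, using only Python's standard library. The sample XML is a hand-written stand-in shaped like Solr's default XML response; the field types and the TITLE field are assumptions, since the real response was flattened in this archive.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a Solr XML response.
# Field types and the TITLE field are assumptions, not the poster's schema.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <int name="AUC_ID">607136</int>
      <str name="TITLE">example title</str>
    </doc>
  </result>
</response>"""


def field_values(xml_text, field_name):
    """Return the text of every field called `field_name` across all <doc> elements."""
    root = ET.fromstring(xml_text)
    values = []
    for doc in root.iter("doc"):
        for child in doc:
            # Solr stores the field name in the "name" attribute of each child.
            if child.get("name") == field_name:
                values.append(child.text)
    return values


print(field_values(SAMPLE_RESPONSE, "AUC_ID"))
```

In practice the XML text would come from the HTTP response of the select URL rather than from a constant.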

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-04 Thread jayendra patil
ContentStreamUpdateRequest seems to read the file contents and transfer them over HTTP, which slows down the indexing. Try using StreamingUpdateSolrServer with stream.file param @ http://wiki.apache.org/solr/SolrPerformanceFactors#Embedded_vs_HTTP_Post e.g. SolrServer server = new StreamingUpdateS

XML Format

2010-08-04 Thread twojah
1 1.0 27017 Bracket Ceiling untuk semua merk projector, panjang 60-90 cm Bahan Besi Cat Hitam = 325rb Bahan Sta /aksesoris-batere-dan-tripod/update-bracket-projector-dan-lcd-plasma-tv-607136.html 607136 Nego 7 270/27017/bracket_lcd_plasma_3a-1274291780.JPG 2010-05-19 17:56:45 [UPDATE] BRACKET Pr

Re: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread Erick Erickson
I suspect you're running afoul of tokenizers and filters. The parts of your schema that you published aren't the ones that really count. What you probably need to look at is the FieldType definitions, i.e. what analysis is done for, say, text_ws (see http://wiki.apache.org/solr/AnalyzersTokenizers

Re: analysis tool vs. reality

2010-08-04 Thread Lance Norskog
"there is some kind of caching of query results going on that doesnt get flushed on a restart of tomcat." Yes. Solr by default has http caching on if there is no configuration, and the example solrconfig.xml has it configured on. You should edit solrconfig.xml to use the alternative described in t

Re: No "group by"? looking for an alternative.

2010-08-04 Thread Lance Norskog
Hello- A way to do this is to create one faceting field that includes both the size and the color. I assume you have a different shoe product document for each model. Each model would include the color & size 'red' and '14a' fields, but you would add a field with 'red-14a'. On Wed, Aug 4, 2010 at
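The combined-field idea can be sketched roughly as follows. This is plain Python run before indexing, not Solr code; the documents and the field names (model, color, size, color_size) are hypothetical.

```python
from collections import Counter

# Hypothetical shoe documents; field names are illustrative only.
docs = [
    {"model": "converse-a", "color": "red", "size": "14a"},
    {"model": "converse-a", "color": "red", "size": "12"},
    {"model": "converse-b", "color": "blue", "size": "14a"},
]


def add_combined_field(doc, sep="-"):
    """Return a copy of `doc` with an extra color+size field to facet on."""
    combined = dict(doc)
    combined["color_size"] = f"{doc['color']}{sep}{doc['size']}"
    return combined


indexed = [add_combined_field(d) for d in docs]

# Counting the combined values mimics what a facet on that field would report.
facet_counts = Counter(d["color_size"] for d in indexed)
print(facet_counts)
```

The separator only needs to be a character that cannot occur in either value, so the pair can be split apart again on the client.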

Re: Some basic DataImportHandler questions

2010-08-04 Thread harrysmith
Thanks, I think part of my issue may be that I am misunderstanding how to use the entity and field tags to import data in a particular format, and I am looking for a few more examples. Let's say I have a database table with 2 columns that contain metadata fields and values, and would like to import this i

Re: Is there a better for solor server side loadbalance?

2010-08-04 Thread Peter Karich
>> The default solr solution is client side loadbalance. >> Is there a solution provide the server side loadbalance? >> >> >> > No. Most of us stick a HTTP load balancer in front of multiple Solr servers. > E.g. mod_jk is a very easy solution (maybe too simple/stupid?) for a load balancer

Re: DIH and Cassandra

2010-08-04 Thread Dennis Gearon
If data is stored in the index, isn't the index of Solr pretty much already a 'Big/Cassandra Table', except with tokenized columns to make searching easier? How are Cassandra/Big/Couch DBs doing text/weighted searching? Seems a real duplication to use Cassandra AND Solr. OTOH, I don't know how m

RE: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread PeterKerk
Well the example you provided is 100% relevant to me :) I've read the wiki now (SchemaXml,SolrFacetingOverview,Query Syntax, SimpleFacetParameters), but still do not have an exact idea of what you mean. My situation: a city field is something that I want users to search on via text input, so let

Re: DIH and Cassandra

2010-08-04 Thread Shalin Shekhar Mangar
On Wed, Aug 4, 2010 at 9:11 PM, Mark wrote: > Is it possible to use DIH with Cassandra either out of the box or with > something more custom? Thanks > It will take some modifications but DIH is built to create denormalized documents so it is possible. Also see https://issues.apache.org/jira/bro

RE: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread Markus Jelsma
Hmm, you should first read a bit more on schema design on the wiki and learn about indexing and querying Solr. The copyField directive is what is commonly used in a faceted navigation system: search on analyzed fields, and show faceting results using the primitive string field type. With copyFie

RE: Indexing boolean value

2010-08-04 Thread Michael Griffiths
Copyfield copies the field so you can have multiple versions. Useful to dump all fields into one "super" field you can search on, for perf reasons. If the column isn't being indexed, I'd suggest the problem is in DIH. No suggestions as to why, I'm afraid. -Original Message- From: PeterK

RE: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread Michael Griffiths
Echoing Markus - use the tokenized field to return results, but have a duplicate field of fieldtype="string" to show the untokenized results. E.g. facet on that field. -Original Message- From: Markus Jelsma [mailto:markus.jel...@buyways.nl] Sent: Wednesday, August 04, 2010 4:18 PM To: s

RE: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread PeterKerk
Sorry, but I'm a newbie to Solr... how would I change my schema.xml to match your requirements? And what do you mean by "it will mess with your results"? What will happen then? -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-fieldvalues-with-dashes-and-spaces-tp102369

RE: Indexing boolean value

2010-08-04 Thread PeterKerk
Hi, I tried that already, so that would make this: (still not sure what copyField does though) But even that won't work. I also don't see the officallocation columns indexed in the documents: http://localhost:8983/solr/db/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on -- View this me

RE: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread Markus Jelsma
You shouldn't fetch faceting results from analyzed fields; it will mess with your results. Search on analyzed fields but don't retrieve values from them. -Original message- From: PeterKerk Sent: Wed 04-08-2010 22:15 To: solr-user@lucene.apache.org; Subject: RE: Indexing fieldvalues wi
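A rough illustration of why facets taken from an analyzed field look wrong. This is a plain Python simulation, not Solr; the whitespace split stands in for a tokenizing field type, and the value "Strand & Zee" is a made-up example.

```python
from collections import Counter

# Hypothetical values of a multi-valued "theme" field, one list per document.
themes_per_doc = [
    ["Gemeentehuis", "Strand & Zee"],
    ["Gemeentehuis"],
]


def facet_counts(docs, analyzed):
    """Count, per indexed term, the number of documents containing it."""
    counts = Counter()
    for values in docs:
        terms = set()
        for value in values:
            # Analyzed field: every whitespace-separated token becomes a term.
            # String field: the whole value is kept as a single term.
            terms.update(value.split() if analyzed else [value])
        counts.update(terms)
    return counts


print(facet_counts(themes_per_doc, analyzed=True))
print(facet_counts(themes_per_doc, analyzed=False))
```

With the analyzed variant a bare "&" shows up as its own facet value; with the string variant each value stays whole, which is the behaviour wanted for display.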

RE: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread PeterKerk
I changed values to text_ws. Now I only seem to have problems with field values that hold spaces; see below: It has now become: "facet_counts":{ "facet_queries":{}, "facet_fields":{ "theme":[ "Gemeentehuis",2, "&",1, < still & is c

RE: Indexing boolean value

2010-08-04 Thread Michael Griffiths
I could be wrong, but I thought bit was an integer. Try changing fieldtype to integer. -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Wednesday, August 04, 2010 3:42 PM To: solr-user@lucene.apache.org Subject: Indexing boolean value Im trying to index a boolea

RE: Indexing fieldvalues with dashes and spaces

2010-08-04 Thread Michael Griffiths
Your schema.xml setting for the field is probably tokenizing the punctuation. Change the field type to one that doesn't tokenize on punctuation; e.g. use "text_ws" and not "text" -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Wednesday, August 04, 2010 3:36 PM

Indexing boolean value

2010-08-04 Thread PeterKerk
I'm trying to index a boolean location, but for some reason it does not show up in my indexed data. data-config.xml OFFICIALLOCATION is a MSSQL database field of type 'bit' schema.xml (I'm not sure why I would use http://lucene.47

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Ken Krugler
Hi Geert-jan, On Aug 4, 2010, at 12:04pm, Geert-Jan Brits wrote: If I understand correctly: you want to sort your collapsed results by 'nr of collapsed results'/ hits. It seems this can't be done out-of-the-box using this patch (I'm not entirely sure, at least it doesn't follow from the wiki

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Geert-Jan Brits
If I understand correctly: you want to sort your collapsed results by 'nr of collapsed results'/ hits. It seems this can't be done out-of-the-box using this patch (I'm not entirely sure, at least it doesn't follow from the wiki-page. Perhaps best is to check the jira-issues to make sure this isn't

Solrj ContentStreamUpdateRequest Slow

2010-08-04 Thread Tod
I'm running a slight variation of the example code referenced below and it takes a real long time to finally execute. In fact it hangs for a long time at solr.request(up) before finally executing. Is there anything I can look at or tweak to improve performance? I am also indexing a local pdf

Re: Is there a better for solor server side loadbalance?

2010-08-04 Thread Andrei Savu
Check this article [1] that explains how to set up HAProxy to do load balancing. The steps are the same even if you are not using Drupal. By using this approach you can easily add more replicas without changing the application configuration files. You should also check SolrCloud [2] which does auto

Re: DIH and Cassandra

2010-08-04 Thread Andrei Savu
DIH only works with relational databases and XML files [1]; you need to write custom code in order to index data from Cassandra. It should be pretty easy to map documents from Cassandra to Solr. There are a lot of client libraries available [2] for Cassandra. [1] http://wiki.apache.org/solr/DataI

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Ken Krugler
Hi Geert-Jan, On Aug 4, 2010, at 5:30am, Geert-Jan Brits wrote: Field Collapsing (currently as patch) is exactly what you're looking for imo. http://wiki.apache.org/solr/FieldCollapsing Thanks for the ref, good stuff. I think it's close, but if I understand this correctly, then I could

Re: analysis tool vs. reality

2010-08-04 Thread Robert Muir
Furthermore, I would like to add that it's not just the highlight-matches functionality that is horribly broken here; the output of the analysis itself is misleading. Let's say I take 'textTight' from the example and add the following synonym: this is broken => broke the query time analysis is wro

Re: analysis tool vs. reality

2010-08-04 Thread Robert Muir
On Wed, Aug 4, 2010 at 1:45 PM, Chris Hostetter wrote: > > it really only attempts to identify when there is overlap between > analysis at query time and at indexing time so you can easily spot when > one analyzer or the other "breaks" things so that they no longer line up > (or when it "fixes"

Re: analysis tool vs. reality

2010-08-04 Thread Chris Hostetter
: I think I agree with Justin here, I think the way analysis tool highlights : 'matches' is extremely misleading, especially considering it completely : ignores queryparsing. it really only attempts to identify when there is overlap between analysis at query time and at indexing time so you can

Re: Sharing index files between multiple JVMs and replication

2010-08-04 Thread Kelly Taylor
Is anybody else with a similar setup encountering these same issues? And is there a way to configure certain Solr web-apps as read-only (basically dummy instances) so that index changes are not allowed? - Original Message From: Kelly Taylor To: solr-user@lucene.apache.org Sent

Re: Setting up apache solr in eclipse with Tomcat

2010-08-04 Thread Hando420
Thanks, now it's clear and works fine. Regards, Hando -- View this message in context: http://lucene.472066.n3.nabble.com/Setting-up-apache-solr-in-eclipse-with-Tomcat-tp1021673p1023404.html Sent from the Solr - User mailing list archive at Nabble.com.

can't use strdist as functionquery?

2010-08-04 Thread solr-user
I want to sort my results by how closely a given resultset field matches a given string. For example, say I am searching for a given product, and the product can be found in many cities including "seattle". I want to sort the results so that results from city of "seattle" are at the top, and all
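A small sketch of the ordering being asked for, done client-side in Python. It uses a normalized edit-distance score as a stand-in for a string-distance measure; it is not Solr's strdist implementation, and the city list is invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Score in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical city values taken from a result set.
cities = ["portland", "seattle", "seatac"]
ranked = sorted(cities, key=lambda c: similarity(c, "seattle"), reverse=True)
print(ranked)
```

Sorting on the client only reorders the rows already fetched, so it is not a substitute for a server-side sort over the full result set.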

Re: Setting up apache solr in eclipse with Tomcat

2010-08-04 Thread jayendra patil
The Solr home is configured in the web.xml of the application, which points to the folder holding the conf files and the data directory: <env-entry> <env-entry-name>solr/home</env-entry-name> <env-entry-value>D:/multicore</env-entry-value> <env-entry-type>java.lang.String</env-entry-type> </env-entry> Regards, Jayendra On Wed, Aug 4, 2010 at 12:21 PM, Hando420 wrote: > > Thanks man i haven'

Re: Setting up apache solr in eclipse with Tomcat

2010-08-04 Thread Hando420
Thanks man, I haven't tried this, but where do I put that XML configuration? Is it in the web.xml in Solr? Cheers, Hando -- View this message in context: http://lucene.472066.n3.nabble.com/Setting-up-apache-solr-in-eclipse-with-Tomcat-tp1021673p1023188.html Sent from the Solr - User mailing list ar

Re: enhancing auto complete

2010-08-04 Thread Avlesh Singh
I preferred to answer this question privately earlier. But I have received innumerable requests to unveil the architecture. For the benefit of all, I am posting it here (after hiding as much info as I should, in my company's interest). The context: Auto-suggest feature on http://askme.in *Solr se

DIH and Cassandra

2010-08-04 Thread Mark
Is it possible to use DIH with Cassandra either out of the box or with something more custom? Thanks

Re: Is there a better for solor server side loadbalance?

2010-08-04 Thread Shalin Shekhar Mangar
2010/8/4 Chengyang > The default solr solution is client side loadbalance. > Is there a solution provide the server side loadbalance? > > No. Most of us stick a HTTP load balancer in front of multiple Solr servers. -- Regards, Shalin Shekhar Mangar.

Re: analysis tool vs. reality

2010-08-04 Thread Shalin Shekhar Mangar
On Wed, Aug 4, 2010 at 7:52 PM, Robert Muir wrote: > I think I agree with Justin here, I think the way analysis tool highlights > 'matches' is extremely misleading, especially considering it completely > ignores queryparsing. > > it would be better if it put your text in a memoryindex and actuall

Re: Support loading queries from external files in QuerySenderListener

2010-08-04 Thread Shalin Shekhar Mangar
On Wed, Aug 4, 2010 at 3:27 PM, Stanislaw wrote: > Hi all! > I can't load my custom queries from the external file, as written here: > https://issues.apache.org/jira/browse/SOLR-784 > > This option seems not to be implemented in current version 1.4.1 of > Solr. > It was deleted or it comes first

analysis tool vs. reality

2010-08-04 Thread Justin Lolofie
Wow, I got to work this morning and my query results now include the 'ABC12' document. I'm not sure what that means. Either I made a mistake in the process I described in the last email (I don't think this is the case) or there is some kind of caching of query results going on that doesn't get flushe

Re: analysis tool vs. reality

2010-08-04 Thread Robert Muir
I think I agree with Justin here, I think the way analysis tool highlights 'matches' is extremely misleading, especially considering it completely ignores queryparsing. it would be better if it put your text in a memoryindex and actually parsed the query w/ queryparser, ran it, and used the highli

No "group by"? looking for an alternative.

2010-08-04 Thread Mickael Magniez
Hello, I've been dealing with a problem for a few days: I want to index and search shoes. Each shoe can have several sizes and colors, at different prices. So what I want is: when I search for "Converse", I want to retrieve one "shoe per model", i.e. one color and one size, but having colors and siz

analysis tool vs. reality

2010-08-04 Thread Justin Lolofie
Erik: Yes, I did re-index if that means adding the document again. Here are the exact steps I took: 1. analysis.jsp "ABC12" does NOT match title "ABC12" (however, ABC or 12 does) 2. changed schema.xml WordDelimiterFilterFactory catenate-all 3. restarted tomcat 4. deleted the document with title "A
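To make the effect of the catenate-all change in step 2 concrete, here is a deliberately simplified Python imitation of a word-delimiter style filter. It only splits on letter/digit boundaries; the real WordDelimiterFilterFactory has many more rules and options, and the function name and flag below are invented for this sketch.

```python
import re


def word_delimiter(token, catenate_all=False):
    """Split a token into runs of letters and runs of digits.

    With catenate_all=True the joined parts are emitted as an extra token,
    loosely imitating the idea behind the filter's catenate-all option.
    """
    parts = re.findall(r"[A-Za-z]+|[0-9]+", token)
    tokens = list(parts)
    if catenate_all and len(parts) > 1:
        tokens.append("".join(parts))
    return tokens


print(word_delimiter("ABC12"))
print(word_delimiter("ABC12", catenate_all=True))
```

Without the extra token only "ABC" and "12" are indexed, which fits the observation in step 1 that the parts match but the whole value does not. Documents indexed before the schema change keep their old tokens until they are re-indexed.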

Re: Setting up apache solr in eclipse with Tomcat

2010-08-04 Thread jayendra patil
I have got Solr working in Eclipse and deployed on Tomcat through the Eclipse plugin. The crude approach was to: 1. Import the Solr war into Eclipse, which will be imported as a web project and can be deployed on Tomcat. 2. Add multiple source folders to the Project, linked to the checked ou

Re: Multi word synomyms

2010-08-04 Thread Qwerky
It would be nice if you could configure some kind of filter to be processed before the query string is passed to the parser. The QueryComponent class seems a nice place for this; a filter could be run against the raw query and ResponseBuilder's queryString value could be modified before the QParse

Re: Date faceting

2010-08-04 Thread Eric Grobler
Thanks Koji, It works :-) Have a nice day. regards ericz On Wed, Aug 4, 2010 at 12:08 PM, Koji Sekiguchi wrote: > (10/08/04 19:42), Eric Grobler wrote: > >> Hi Solr community, >> >> How do I facet on timestamp for example? >> >> I tried something like this - but I get no result. >> >> facet=t

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Geert-Jan Brits
Field Collapsing (currently as patch) is exactly what you're looking for imo. http://wiki.apache.org/solr/FieldCollapsing Geert-Jan 2010/8/4 Ken Krugler > Hi all, > > I've got a situation where the key result from an initial search request > (let's

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread kenf_nc
Not sure the processing would be any faster than just querying again, but, in your original result set the first doc that has a field value that matches a top 10 facet will be the number 1 item if you fq on that facet value. So you don't need to query it again. You would only need to query those t

Re: Date faceting

2010-08-04 Thread Koji Sekiguchi
(10/08/04 19:42), Eric Grobler wrote: Hi Solr community, How do I facet on timestamp for example? I tried something like this - but I get no result. facet=true facet.date=timestamp f.facet.timestamp.date.start=2010-01-01T00:00:00Z f.facet.timestamp.date.end=2010-12-31T00:00:00Z f.facet.timesta
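For reference, a sketch that builds date-facet parameters in the per-field form the Solr wiki documents, f.<field>.facet.date.<option>; the parameters in the question put facet before the field name. Whether this matches the truncated reply above is an assumption. The helper name is invented, and the values are copied from the question.

```python
from urllib.parse import urlencode


def date_facet_params(field, start, end, gap):
    """Return Solr date-facet parameters for `field` as (name, value) pairs."""
    return [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.date", field),
        # Per-field overrides take the form f.<field>.facet.date.<option>.
        (f"f.{field}.facet.date.start", start),
        (f"f.{field}.facet.date.end", end),
        (f"f.{field}.facet.date.gap", gap),
    ]


params = date_facet_params(
    "timestamp",
    "2010-01-01T00:00:00Z",
    "2010-12-31T00:00:00Z",
    "+1HOUR",
)
query_string = urlencode(params)
print(query_string)
```

urlencode also takes care of escaping the "+" in the gap value, which would otherwise be read as a space when typed directly into a URL.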

Date faceting

2010-08-04 Thread Eric Grobler
Hi Solr community, How do I facet on timestamp for example? I tried something like this - but I get no result. facet=true facet.date=timestamp f.facet.timestamp.date.start=2010-01-01T00:00:00Z f.facet.timestamp.date.end=2010-12-31T00:00:00Z f.facet.timestamp.date.gap=+1HOUR f.facet.timestamp.dat

Support loading queries from external files in QuerySenderListener

2010-08-04 Thread Stanislaw
Hi all! I can't load my custom queries from the external file, as written here: https://issues.apache.org/jira/browse/SOLR-784 This option seems not to be implemented in the current version, 1.4.1, of Solr. Was it deleted, or will it first arrive with a new version? regards, Stanislaw

Is there a better for solor server side loadbalance?

2010-08-04 Thread Chengyang
The default Solr solution is client-side load balancing. Is there a solution that provides server-side load balancing?

Re: Migrating from Lucene 2.9.1 to Solr 1.4.0 - Performance issues under heavy load

2010-08-04 Thread Ophir Adiv
On Wed, Aug 4, 2010 at 10:50 AM, Peter Karich wrote: > Ophir, > > this sounds a bit strange: > > > CommonsHttpSolrServer.java, line 416 takes about 95% of the application's > total search time > > Is this only for heavy load? > > I think this makes sense, since the hard work is done by Solr - onc

AW: Migrating from Lucene 2.9.1 to Solr 1.4.0 - Performance issues under heavy load

2010-08-04 Thread Bastian Spitzer
I'm not sure if I understand your problem, but basically it isn't Solr vs Lucene but HttpURLConnection vs SolrJ's CommonsHttpSolrServer, because server query times haven't changed at all from what you say? Why aren't you querying the server the same way you did before when you want to compare Solr to lu

RE: wildcard and proximity searches

2010-08-04 Thread Frederico Azeiteiro
Thanks for your idea. At this point I'm logging each query time. My idea is to divide my queries into "normal queries" and "heavy queries". I have some heavy queries that take 1 or 2 minutes to return results. But they have, for instance, (*word1* AND *word2* AND word3*). I guess that this will be alw

Re: Migrating from Lucene 2.9.1 to Solr 1.4.0 - Performance issues under heavy load

2010-08-04 Thread Peter Karich
Ophir, this sounds a bit strange: > CommonsHttpSolrServer.java, line 416 takes about 95% of the application's > total search time Is this only for heavy load? Some other things: * with lucene you accessed the indices with MultiSearcher in a LAN, right? * did you look into the logs of the se

Setting up apache solr in eclipse with Tomcat

2010-08-04 Thread Hando420
I would like to set up Apache Solr in Eclipse using Tomcat. It is easy to set up with Jetty, but with Tomcat it doesn't run Solr at runtime. Has anyone done this before? Hando -- View this message in context: http://lucene.472066.n3.nabble.com/Setting-up-apache-solr-in-eclipse-with-Tomcat-tp102167

Migrating from Lucene 2.9.1 to Solr 1.4.0 - Performance issues under heavy load

2010-08-04 Thread Ophir Adiv
[posted this yesterday in lucene-user mailing list, and was advised to post it here instead. excuse me for spamming] Hi, I'm currently involved in a project of migrating from Lucene 2.9.1 to Solr 1.4.0. During stress testing, I encountered this performance problem: While actual search times i