Hi Geert-Jan,

Thanks for replying.

I know Solr has a query cache and that it speeds up a search from the second time on. But when I talk about search speed, I don't mean the speed of the cache. When a user searches on our site, I don't want the first search to cost 10s and all the following ones to cost 0s. That is unacceptable. I want the first search to be as fast as it can be, so all my speed tests only count the first time a query runs.

As for fq: yes, I need it. We have 5 different types. For a general search, the user doesn't need to specify which type to search over. But sometimes he needs to search over just one, e.g. type:product. That is when I use "fq", and I believe I understand it correctly.

Until today I was always testing against simple searches such as "design". Before today even the simple search speed was not acceptable, so I didn't care how fast "fq" would be. Today the simple search speed is acceptable, so I moved on to check "fq", and it looks like it is sometimes much slower than the simple search (slower meaning it takes more than 2s, maybe 10s).
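
A minimal sketch of how such a first-hit comparison could be timed from a client, in case it helps others reproduce the numbers. The base URL and the type:product filter come from this thread; the helper names build_url and time_query are made up for illustration. It measures wall-clock time for the whole HTTP round trip, not Solr's QTime, and it only gives a "first time" number if the core has not already served that query since it was started.

```python
import time
import urllib.parse
import urllib.request

# Base URL as used in the examples in this thread; adjust host/port per core.
BASE = "http://localhost:7550/solr/select/"


def build_url(q, fq=None, rows=10):
    """Build a select URL; the fq parameter is added only when given."""
    params = [("q", q), ("version", "2.2"), ("start", "0"), ("rows", str(rows))]
    if fq is not None:
        params.append(("fq", fq))
    return BASE + "?" + urllib.parse.urlencode(params)


def time_query(url):
    """Return wall-clock seconds for one full HTTP request/response."""
    started = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()  # read the whole body so transfer time is included
    return time.perf_counter() - started
```

For example, time_query(build_url("design")) and time_query(build_url("tom", fq="type:product")) could be run once each against a freshly started core. Using a different keyword for each run keeps the second measurement from benefiting from whatever the first one already pulled from disk.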

>The only thing that helps you here would be a big solr querycache, depending on how often queries are repeated.

I don't agree. I don't really care about the speed of the cache, as I know it is always super fast. What I want is for Solr to consume as much memory as it can to pre-load the Lucene index (maybe 50% or even 100% of it). Then, when it needs to run a keyword for the first time, it is fast. (I haven't got an answer to this question yet.)

Thanks.
Regards.

On 2010-07-17 19:30:26, "Geert-Jan Brits" <gbr...@gmail.com> wrote:
>>My query string is always simple like "design", "principle of design", "tom"
>>EG:
>>URL:
>>http://localhost:7550/solr/select/?q=design&version=2.2&start=0&rows=10&indent=on
>
>IMO, indeed with these types of simple searches caching (and thus RAM usage) can not be fully exploited, i.e: there isn't really anything to cache (no sort-ordering, faceting (Lucene fieldcache), no documentsets, faceting (Solr filtercache))
>
>The only thing that helps you here would be a big solr querycache, depending on how often queries are repeated.
>Just execute the same query twice, the second time you should see a fast response (say < 20ms) that's the querycache (and thus RAM) working for you.
>
>>Now the issue I found is search with "fq" argument looks slow down the search.
>
>This doesn't align with your previous statement that you only use search with a q-param (e.g: http://localhost:7550/solr/select/?q=design&version=2.2&start=0&rows=10&indent=on)
>For your own sake, explain what you're trying to do, otherwise we really are guessing in the dark.
>
>Anyway the FQ-param lets you cache (using the Solr-filtercache) individual documentsets that can be used to efficiently intersect your resultset.
>Also the first time, caches should be warmed (i.e: the fq-query should be executed and results saved to cache, since there isn't anything there yet). Only on the second time would you start seeing improvements.
>
>For instance:
>http://localhost:7550/solr/select/?q=design&fq=doctype:pdf&version=2.2&start=0&rows=10&indent=on
>
>would only show documents containing "design" when the doctype=pdf (Again this is just an example here where I'm just assuming that you have defined a field 'doctype')
>since the nr of values of documenttype would be pretty low and would be used independently of other queries, this would be an excellent candidate for the FQ-param.
>
>http://wiki.apache.org/solr/CommonQueryParameters#fq
>
>This was a longer reply than I wanted to. Really think about your use-cases first, then present some real examples of what you want to achieve and then we can help you in a more useful manner.
>
>Cheers,
>Geert-Jan
>
>2010/7/17 marship <mars...@126.com>
>
>> Hi. Peter and All.
>> I merged my indexes today. Now each index stores 10M document. Now I only have 10 solr cores.
>> And I used
>>
>> java -Xmx1g -jar -server start.jar
>> to start the jetty server.
>>
>> At first I deployed them all on one search. The search speed is about 3s. Then I noticed from cmd output when search start, 4 of 10's QTime only cost about 10ms-500ms. The left 5 cost more, up to 2-3s. Then I put 6 on web server, 4 on another(DB, high load most time). Then the search speed goes down to about 1s most time.
>> Now most search takes about 1s. That's great.
>>
>> I watched the jetty output on cmd windows on web server, now when each search start, I saw 2 of 6 costs 60ms-80ms. The another 4 cost 170ms - 700ms. I do believe the bottleneck is still the hard disk. But at least, the search speed at the moment is acceptable. Maybe i should try memdisk to see if that help.
>>
>>
>> And for -Xmx1g, actually I only see jetty consume about 150M memory, consider now the index is 10x bigger. I don't think that works. I googled -Xmx is to enlarge the heap size. Not sure can that help search. I still have 3.5G memory free on server.
>>
>> Now the issue I found is search with "fq" argument looks slow down the search.
>>
>> Thanks All for your help and suggestions.
>> Thanks.
>> Regards.
>> Scott
>>
>>
>> On 2010-07-17 03:36:19, "Peter Karich" <peat...@yahoo.de> wrote:
>>
>> >> > Each solr(jetty) instance on consume 40M-60M memory.
>> >
>> >> java -Xmx1024M -jar start.jar
>> >
>> >That's a good suggestion!
>> >Please, double check that you are using the -server version of the jvm and the latest 1.6.0_20 or so.
>> >
>> >Additionally you can start jvisualvm (shipped with the jdk) and hook into jetty/tomcat easily to see the current CPU and memory load.
>> >
>> >> But I have 70 solr cores
>> >
>> >if you ask me: I would reduce them to 10-15 or even less and increase the RAM.
>> >try out tomcat too
>> >
>> >> solr distriubted search's speed is decided by the slowest one.
>> >
>> >so, try to reduce the cores
>> >
>> >Regards,
>> >Peter.
>> >
>> >> you mentioned that you have a lot of mem free, but your yetty containers only using between 40-60 mem.
>> >>
>> >> probably stating the obvious, but have you increased the -Xmx param like for instance:
>> >> java -Xmx1024M -jar start.jar
>> >>
>> >> that way you're configuring the container to use a maximum of 1024 MB ram instead of the standard which is much lower (I'm not sure what exactly but it could well be 64MB for non -server, aligning with what you're seeing)
>> >>
>> >> Geert-Jan
>> >>
>> >> 2010/7/16 marship <mars...@126.com>
>> >>
>> >>
>> >>> Hi Tom Burton-West.
>> >>>
>> >>> Sorry looks my email ISP filtered out your replies. I checked web version of mailing list and saw your reply.
>> >>>
>> >>> My query string is always simple like "design", "principle of design", "tom"
>> >>>
>> >>> EG:
>> >>>
>> >>> URL:
>> >>> http://localhost:7550/solr/select/?q=design&version=2.2&start=0&rows=10&indent=on
>> >>>
>> >>> Response:
>> >>>
>> >>> <response>
>> >>>   <lst name="responseHeader">
>> >>>     <int name="status">0</int>
>> >>>     <int name="QTime">16</int>
>> >>>     <lst name="params">
>> >>>       <str name="indent">on</str>
>> >>>       <str name="start">0</str>
>> >>>       <str name="q">design</str>
>> >>>       <str name="version">2.2</str>
>> >>>       <str name="rows">10</str>
>> >>>     </lst>
>> >>>   </lst>
>> >>>   <result name="response" numFound="5981" start="0">
>> >>>     <doc>
>> >>>       <str name="id">product_208619</str>
>> >>>     </doc>
>> >>>
>> >>>
>> >>> EG:
>> >>> http://localhost:7550/solr/select/?q=Principle&version=2.2&start=0&rows=10&indent=on
>> >>>
>> >>> <response>
>> >>>   <lst name="responseHeader">
>> >>>     <int name="status">0</int>
>> >>>     <int name="QTime">94</int>
>> >>>     <lst name="params">
>> >>>       <str name="indent">on</str>
>> >>>       <str name="start">0</str>
>> >>>       <str name="q">Principle</str>
>> >>>       <str name="version">2.2</str>
>> >>>       <str name="rows">10</str>
>> >>>     </lst>
>> >>>   </lst>
>> >>>   <result name="response" numFound="104" start="0">
>> >>>     <doc>
>> >>>       <str name="id">product_56926</str>
>> >>>     </doc>
>> >>>
>> >>>
>> >>> As I am querying over single core and other cores are not querying at same time. The QTime looks good.
>> >>>
>> >>> But when I query the distributed node: (For this case, 6422ms is still a not bad one.
>> >>> Many cost ~20s)
>> >>>
>> >>> URL:
>> >>> http://localhost:7499/solr/select/?q=the+first+world+war&version=2.2&start=0&rows=10&indent=on&debugQuery=true
>> >>>
>> >>> Response:
>> >>>
>> >>> <response>
>> >>>   <lst name="responseHeader">
>> >>>     <int name="status">0</int>
>> >>>     <int name="QTime">6422</int>
>> >>>     <lst name="params">
>> >>>       <str name="debugQuery">true</str>
>> >>>       <str name="indent">on</str>
>> >>>       <str name="start">0</str>
>> >>>       <str name="q">the first world war</str>
>> >>>       <str name="version">2.2</str>
>> >>>       <str name="rows">10</str>
>> >>>     </lst>
>> >>>   </lst>
>> >>>   <result name="response" numFound="4231" start="0">
>> >>>
>> >>>
>> >>> Actually I am thinking and testing a solution: As I believe the bottleneck is in harddisk and all our indexes add up is about 10-15G. What about I just add another 16G memory to my server then use "MemDisk" to map a memory disk and put all my indexes into it. Then each time, solr/jetty need to load index from harddisk, it is loading from memory. This should give solr the most throughout and avoid the harddisk access delay. I am testing ....
>> >>>
>> >>> But if there are way to make solr use better use our limited resource to avoid adding new ones. that would be great.
>> >>>
>> >>
>> >
>> >--
>> >http://karussell.wordpress.com/
>> >
>>
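
On the idea of serving the index from memory: one approach sometimes tried before buying RAM for a memory disk is to read every index file once after startup, so that the operating system's file cache holds as much of it as free memory allows. The sketch below is only an illustration of that idea, not a Solr feature. The function name warm_index and the directory passed to it are hypothetical, and whether the data stays cached depends on the OS and on how much memory is free.

```python
import os


def warm_index(index_dir, chunk_size=1 << 20):
    """Read every file under index_dir once; return total bytes read.

    The data is discarded. The only purpose is to make the OS pull the
    files through its file cache, so later reads may avoid the disk.
    """
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

Calling warm_index on a core's index directory and comparing the returned byte count with the free memory on the server gives a rough idea of whether the whole index can fit in the file cache at all.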