The problem has been resolved. My disk subsystem was the bottleneck for fast searches. I put my indexes in RAM and now I see very nice QTimes :) Sorry for taking your time, guys.
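For anyone curious what "indexes in RAM" can look like in practice, here is a minimal sketch — not the setup from this thread, and the index path is an assumption — that gets a similar effect without a dedicated RAM disk by pre-warming the OS page cache. The same mechanism explains the second run of the test script below finishing in ~0.06 s: the first run had already pulled the files into the page cache.

```shell
#!/bin/sh
# Hypothetical sketch, not taken from this thread: warm the OS page
# cache by reading every index file once, so subsequent Solr reads come
# from RAM instead of the ~48 MB/s disk measured with hdparm.
warm_index() {
    dir=$1
    [ -d "$dir" ] || { echo "no such dir: $dir" >&2; return 1; }
    # cat each file once; the kernel keeps the pages cached for as long
    # as memory pressure allows
    find "$dir" -type f -exec cat {} + > /dev/null
    echo "warmed: $dir"
}

# /var/solr/data/index is an assumed path -- point this at the real dataDir
warm_index "${INDEX_DIR:-/var/solr/data/index}"
```

Unlike a tmpfs copy this needs no root and no config change, but cached pages can be evicted under memory pressure, so an explicit RAM disk is the more deterministic option.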
On Mon, Nov 28, 2011 at 4:02 PM, Artem Lokotosh <arco...@gmail.com> wrote:
> Hi all again. Thanks to all for your replies.
>
> This weekend I ran some interesting tests, and I would like to share the
> results with you.
>
> First of all, I measured the speed of my HDD:
>
> root@LSolr:~# hdparm -t /dev/sda9
>
> /dev/sda9:
>  Timing buffered disk reads: 146 MB in 3.01 seconds = 48.54 MB/sec
>
> Then I tested my network with iperf:
>
> [ 4] 0.0-18.7 sec 2.00 GBytes 917 Mbits/sec
>
> Then I tried to post my queries using the shards parameter with a single
> shard, so my queries looked like:
>
> http://localhost:8080/solr1/select/?q=(test)&qt=requestShards
>
> where "requestShards" is:
>
> <requestHandler name="requestShards" class="solr.SearchHandler" default="false">
>   <lst name="defaults">
>     <str name="echoParams">explicit</str>
>     <int name="rows">10</int>
>     <str name="shards">127.0.0.1:8080/solr1</str>
>   </lst>
> </requestHandler>
>
> Maybe it's not correct, but:
>
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(genuflections)&qt=requestShards&rows=2000} status=0 QTime=6525
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(tunefulness)&qt=requestShards&rows=2000} status=0 QTime=20170
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(societal)&qt=requestShards&rows=2000} status=0 QTime=44958
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(euchre's)&qt=requestShards&rows=2000} status=0 QTime=32161
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(monogram's)&qt=requestShards&rows=2000} status=0 QTime=85252
>
> When I posted similar queries directly to solr1 without "requestShards" I got:
>
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(reopening)&rows=2000} hits=712 status=0 QTime=10
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(housemothers)&rows=2000} hits=0 status=0 QTime=446
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(harpooners)&rows=2000} hits=76 status=0 QTime=399
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(coaxing)&rows=2000} hits=562 status=0 QTime=2820
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(superstar's)&rows=2000} hits=4748 status=0 QTime=672
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(sedateness's)&rows=2000} hits=136 status=0 QTime=923
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(petrolatum)&rows=2000} hits=8 status=0 QTime=6183
> INFO: [] webapp=/solr1 path=/select/ params={fl=*,score&ident=true&start=0&q=(everlasting's)&rows=2000} hits=1522 status=0 QTime=2625
>
> And finally I found a bug report:
>
> https://issues.apache.org/jira/browse/SOLR-1524
>
> Why is there no activity on it? Is it no longer relevant?
>
> Today I wrote a bash script:
>
> #!/bin/bash
> ds=$(date +%s.%N)
> echo "START: $ds" > ./data/east_2000
> curl -s 'http://127.0.0.1:8080/solr1/select/?fl=*,score&ident=true&start=0&q=(east)&rows=2000' \
>   -H 'Content-type:text/xml; charset=utf-8' >> ./data/east_2000
> de=$(date +%s.%N)
> ddf=$(echo "$de - $ds" | bc)
> echo "END: $de" >> ./data/east_2000
> echo "DIFF: $ddf" >> ./data/east_2000
>
> Before starting Tomcat I dropped the caches:
>
> root@LSolr:~# echo 3 > /proc/sys/vm/drop_caches
>
> Then I started Tomcat and ran the script. The result is below:
>
> START: 1322476131.783146691
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int name="QTime">125</int><lst name="params"><str name="fl">*,score</str><str name="ident">true</str><str name="start">0</str><str name="q">(east)</str><str name="rows">2000</str></lst></lst><result name="response" numFound="21439" start="0" maxScore="4.387605">
> ...
> </response>
> END: 1322476180.262770244
> DIFF: 48.479623553
>
> The file size is:
>
> root@LSolr:~# ls -l | grep east
> -rw-r--r-- 1 root root 1063579 Nov 28 12:29 east_2000
>
> I used nmon to monitor HDD activity. It was near 100% while the script ran.
> But when I ran the script again, the result was:
>
> DIFF: .063678709
>
> and there was not much HDD activity in nmon.
>
> One thing I can't understand: is the problem my hardware (such a slow HDD),
> or is it a Solr issue? And why has there been no activity on the bug
> https://issues.apache.org/jira/browse/SOLR-1524 since 27/Oct/09 07:19?
>
> On 11/25/2011 10:02 AM, Dmitry Kan wrote:
>> 45 000 000 per shard approx, Tomcat, caching was tweaked in solrconfig and
>> each shard given 12GB of RAM max.
>>
>> <!-- Filter Cache
>>
>>      Cache used by SolrIndexSearcher for filters (DocSets),
>>      unordered sets of *all* documents that match a query. When a
>>      new searcher is opened, its caches may be prepopulated or
>>      "autowarmed" using data from caches in the old searcher.
>>      autowarmCount is the number of items to prepopulate. For
>>      LRUCache, the autowarmed items will be the most recently
>>      accessed items.
>>
>>      Parameters:
>>        class - the SolrCache implementation
>>                (LRUCache or FastLRUCache)
>>        size - the maximum number of entries in the cache
>>        initialSize - the initial capacity (number of entries) of
>>                the cache. (see java.util.HashMap)
>>        autowarmCount - the number of entries to prepopulate from
>>                an old cache.
>> -->
>>
>> <filterCache class="solr.FastLRUCache" size="1200" initialSize="1200"
>>              autowarmCount="128"/>
>>
>> <!-- Query Result Cache
>>
>>      Caches results of searches - ordered lists of document ids
>>      (DocList) based on a query, a sort, and the range of
>>      documents requested.
>> -->
>>
>> <queryResultCache class="solr.LRUCache" size="512" initialSize="512"
>>                   autowarmCount="32"/>
>>
>> <!-- Document Cache
>>
>>      Caches Lucene Document objects (the stored fields for each
>>      document). Since Lucene internal document ids are transient,
>>      this cache will not be autowarmed.
>> -->
>>
>> <documentCache class="solr.LRUCache" size="512" initialSize="512"
>>                autowarmCount="0"/>
>>
>> <!-- Field Value Cache
>>
>>      Cache used to hold field values that are quickly accessible
>>      by document id. The fieldValueCache is created by default
>>      even if not configured here.
>> -->
>>
>> <!--
>> <fieldValueCache class="solr.FastLRUCache"
>>                  size="512"
>>                  autowarmCount="128"
>>                  showItems="32" />
>> -->
>>
>> <!-- Custom Cache
>>
>>      Example of a generic cache. These caches may be accessed by
>>      name through SolrIndexSearcher.getCache(), cacheLookup(), and
>>      cacheInsert(). The purpose is to enable easy caching of
>>      user/application level data. The regenerator argument should
>>      be specified as an implementation of solr.CacheRegenerator
>>      if autowarming is desired.
>> -->
>>
>> <!--
>> <cache name="myUserCache"
>>        class="solr.LRUCache"
>>        size="4096"
>>        initialSize="1024"
>>        autowarmCount="1024"
>>        regenerator="com.mycompany.MyRegenerator"
>>        />
>> -->
>>
>> <!-- Lazy Field Loading
>>
>>      If true, stored fields that are not requested will be loaded
>>      lazily. This can result in a significant speed improvement
>>      if the usual case is to not load all stored fields,
>>      especially if the skipped fields are large compressed text
>>      fields.
>> -->
>>
>> <enableLazyFieldLoading>true</enableLazyFieldLoading>
>>
>> <!-- Use Filter For Sorted Query
>>
>>      A possible optimization that attempts to use a filter to
>>      satisfy a search. If the requested sort does not include
>>      score, then the filterCache will be checked for a filter
>>      matching the query. If found, the filter will be used as the
>>      source of document ids, and then the sort will be applied to
>>      that.
>>
>>      For most situations, this will not be useful unless you
>>      frequently get the same search repeatedly with different sort
>>      options, and none of them ever use "score".
>> -->
>>
>> <!--
>> <useFilterForSortedQuery>true</useFilterForSortedQuery>
>> -->
>>
>> <!-- Result Window Size
>>
>>      An optimization for use with the queryResultCache. When a search
>>      is requested, a superset of the requested number of document ids
>>      are collected. For example, if a search for a particular query
>>      requests matching documents 10 through 19, and queryWindowSize is
>>      50, then documents 0 through 49 will be collected and cached. Any
>>      further requests in that range can be satisfied via the cache.
>> -->
>>
>> <queryResultWindowSize>50</queryResultWindowSize>
>>
>> <!-- Maximum number of documents to cache for any entry in the
>>      queryResultCache.
>> -->
>>
>> <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
>>
>> In your case I would first check whether network throughput is the
>> bottleneck.
>>
>> It would be nice if you could check the timestamps of completing a request
>> on each of the shards and the arrival time (via some HTTP sniffer) at the
>> frontend SOLR servers. Then you will see whether it is the frontend taking
>> so much time or a network issue.
>>
>> By the way, are your shards well balanced?
>>
>> On Thu, Nov 24, 2011 at 7:06 PM, Artem Lokotosh <arco...@gmail.com> wrote:
>>
>>>>> Can you merge, e.g. 3 shards together or is it much effort for your
>>>>> team?
>>> Yes, we can merge.
>>> We'll try to do this and review how it works.
>>>
>>> Merging does not help :( I've tried to merge two shards into one, and
>>> three shards into one, but the results are similar to the results of the
>>> first configuration with 30 shards. This solution also has one big minus:
>>> the optimization process may take more time.
>>>
>>>>> In our setup we currently have 16 shards with ~30GB each, but we
>>>>> rarely search in all of them at once.
>>> How many documents per shard in your setup? Any difference between
>>> Tomcat, Jetty or other servlet containers?
>>> Have you configured your servlet container more specifically than the
>>> default configuration?
>>>
>>> On Wed, Nov 23, 2011 at 4:38 PM, Artem Lokotosh <arco...@gmail.com> wrote:
>>>>> Is this log from the frontend SOLR (aggregator) or from a shard?
>>>> from aggregator
>>>>
>>>>> Can you merge, e.g. 3 shards together or is it much effort for your
>>>>> team?
>>>> Yes, we can merge. We'll try to do this and review how it works.
>>>> Thanks, Dmitry
>>>>
>>>> Any other ideas?
>>>>
>>>> On Wed, Nov 23, 2011 at 4:01 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
>>>>> Hello,
>>>>>
>>>>> Is this log from the frontend SOLR (aggregator) or from a shard?
>>>>> Can you merge, e.g. 3 shards together or is it much effort for your
>>>>> team?
>>>>>
>>>>> In our setup we currently have 16 shards with ~30GB each, but we rarely
>>>>> search in all of them at once.
>>>>>
>>>>> Best,
>>>>> Dmitry
>>>>>
>>>>> On Wed, Nov 23, 2011 at 3:12 PM, Artem Lokotosh <arco...@gmail.com> wrote:
>>>>
>>>> --
>>>> Best regards,
>>>> Artem Lokotosh mailto:arco...@gmail.com
>>>
>>> --
>>> Best regards,
>>> Artem Lokotosh mailto:arco...@gmail.com
>
> --
> Best regards,
> Artem Lokotosh mailto:arco...@gmail.com

--
Best regards,
Artem Lokotosh mailto:arco...@gmail.com