@Erick it is a lot of HW, but basically I'm trying to create a "best case scenario" to take HW out of the question. Will try increasing the heap size tomorrow. I haven't seen it get close to the max heap size yet, but it's worth trying.

Note that these queries look something like:

q=*:*
fq=[date range]
fq=geo query

On the fq for the geo query I've added {!cache=false} to prevent it from
ending up in the filter cache. Once it's in the filter cache, queries come
back in 10-20ms. For my use case I need the first unique geo search query
to come back in a more reasonable time, so I am currently bypassing the cache.

@Bill will look into that. I'm not certain it will support the particular
queries that are being executed, but I'll investigate.

steve


On Mon, Jul 29, 2013 at 6:25 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> This is very strange. I'd expect slow queries on
> the first few queries while these caches were
> warmed, but after that I'd expect things to
> be quite fast.
>
> For a 12G index and 256G RAM, you have on the
> surface a LOT of hardware to throw at this problem.
> You can _try_ giving the JVM, say, 18G but that
> really shouldn't be a big issue, your index files
> should be MMaped.
>
> Let's try the crude thing first and give the JVM
> more memory.
>
> FWIW
> Erick
>
> On Mon, Jul 29, 2013 at 4:45 PM, Steven Bower <smb-apa...@alcyon.net> wrote:
> > I've been doing some performance analysis of a spatial search use case I'm
> > implementing in Solr 4.3.0. Basically I'm seeing search times a lot higher
> > than I'd like them to be and I'm hoping people may have some suggestions
> > for how to optimize further.
> >
> > Here are the specs of what I'm doing now:
> >
> > Machine:
> > - 16 cores @ 2.8GHz
> > - 256GB RAM
> > - 1TB (RAID 1+0 on 10 SSD)
> >
> > Content:
> > - 45M docs (not very big, only a few fields with no large textual content)
> > - 1 geo field (using config below)
> > - index is 12GB
> > - 1 shard
> > - Using MMapDirectory
> >
> > Field config:
> >
> > <fieldType name="geo" class="solr.SpatialRecursivePrefixTreeFieldType"
> >   distErrPct="0.025" maxDistErr="0.00045"
> >   spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
> >   units="degrees"/>
> >
> > <field name="geopoint" indexed="true" multiValued="false"
> >   required="false" stored="true" type="geo"/>
> >
> >
> > What I've figured out so far:
> >
> > - Most of my time (98%) is being spent in
> > java.nio.Bits.copyToByteArray(long,Object,long,long) which is being
> > driven by BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock()
> > which from what I gather is basically reading terms from the .tim file
> > in blocks
> >
> > - I moved from Java 1.6 to 1.7 based upon what I read here:
> > http://blog.vlad1.com/2011/10/05/looking-at-java-nio-buffer-performance/
> > and it definitely had some positive impact (I haven't been able to
> > measure this independently yet)
> >
> > - I changed maxDistErr from 0.000009 (which is 1m precision per the docs)
> > to 0.00045 (50m precision)
> >
> > - It looks to me like the .tim files are being memory mapped fully (i.e.
> > they show up in pmap output); the virtual size of the JVM is ~18GB
> > (heap is 6GB)
> >
> > - I've optimized the index but this doesn't have a dramatic impact on
> > performance
> >
> > Changing the precision and the JVM upgrade yielded a drop from ~18s
> > avg query time to ~9s avg query time. This is fantastic but I want to
> > get this down into the 1-2 second range.
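
For reference, the two maxDistErr values quoted above can be sanity-checked with a little arithmetic. This is only a rough sketch: it assumes a spherical Earth with a mean radius of 6,371 km (my assumption, not something taken from the Solr docs), and the helper name `degrees_to_meters` is made up for illustration.

```python
import math

# Assumed mean Earth radius in meters (spherical approximation).
EARTH_RADIUS_M = 6_371_000


def degrees_to_meters(deg):
    """Great-circle arc length, in meters, spanned by an angle in degrees."""
    return math.radians(deg) * EARTH_RADIUS_M


# The old and new maxDistErr settings from the field config discussion.
for max_dist_err in (0.000009, 0.00045):
    meters = degrees_to_meters(max_dist_err)
    print(f"maxDistErr={max_dist_err} -> about {meters:.2f} m")
```

Under that approximation the two settings come out at roughly 1 m and roughly 50 m, which matches the figures in the list above.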
> >
> > At this point it seems that I am bottlenecked on
> > copying memory out of the mapped .tim file, which leads me to think
> > that the only solution to my problem would be to read less data or
> > somehow read it more efficiently.
> >
> > If anyone has any suggestions of where to go with this I'd love to know.
> >
> >
> > thanks,
> >
> > steve
>
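
To make the query shape from the top of this message concrete, here is a minimal sketch of how the request parameters might be assembled. Only `q=*:*`, the two `fq` clauses, `{!cache=false}` and the `geopoint` field come from this thread. The `timestamp` field name, the date range, the rectangle coordinates and the `build_params` helper are placeholders, and the `Intersects(minX minY maxX maxY)` form is my assumption about the filter syntax for this field type, so check it against your Solr version.

```python
from urllib.parse import urlencode


def build_params(date_fq, geo_fq):
    """Return Solr request params; the geo filter is marked as non-cached."""
    return [
        ("q", "*:*"),
        ("fq", date_fq),                    # date range filter (cached normally)
        ("fq", "{!cache=false}" + geo_fq),  # geo filter kept out of the filter cache
    ]


params = build_params(
    # Hypothetical date field and range.
    "timestamp:[2013-07-01T00:00:00Z TO 2013-07-29T00:00:00Z]",
    # Hypothetical rectangle against the geopoint field.
    'geopoint:"Intersects(-74.093 41.042 -69.347 44.558)"',
)

# URL-encode the params so they can be appended to a /select request.
query_string = urlencode(params)
print(query_string)
```

Keeping the parameters as a list of pairs rather than a dict is deliberate, since `fq` appears twice.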