Matthew -

I (and others at Lucid) worked with The Motley Fool to convert them from the 
GSA to Solr a few years ago with great success.  They had a variety of data 
sources (relational databases, XML feeds, etc.) that went smoothly into Solr, 
which drastically improved their search relevancy and user experience and 
reduced their costs.

Here's an article about it: 
http://www.cio.com/article/682241/Why_The_Motley_Fool_Chose_Enterprise_Search_Technology_to_Integrate_Content

And to ad (heh) a little bit of commercial help to the decision-making process, 
give LucidWorks Search a try: out of the box you can set up crawling of your 
sites and indexing of relational data (and other built-in connector data 
sources), and of course you can also use it as plain ol' Solr.  See 
<http://www.lucidworks.com/lucidworks-product-suite> for more details.
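
As a rough sketch of how several of the numbered points in your message map 
onto plain Solr requests, here is a small Python snippet that only builds the 
request parameters and sends nothing over the network.  The field names 
(published_date, client_name, type, title, client_id) are hypothetical 
placeholders for whatever your schema defines; the parameter names (sort, fq, 
start, rows, facet.range.*) and the delete-by-query XML are standard Solr.  
Treat it as an illustration under those assumptions, not a tuned setup.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Hypothetical field name; replace with a date field from your own schema.
DATE_FIELD = "published_date"


def build_select_params(page, rows=10):
    """Build parameters for a Solr /select request (points 1, 2, 4, 5, 7)."""
    return [
        # (5) Boolean clauses can be nested freely inside an OR.
        ("q", "(title:solr AND type:article) OR (title:lucene AND type:blog)"),
        # (4) Date range filter with full day (and finer) precision.
        ("fq", DATE_FIELD + ":[2012-09-01T00:00:00Z TO 2012-09-30T23:59:59Z]"),
        # (1) Sort on any indexed, single-valued field (assumed: client_name).
        ("sort", "client_name asc"),
        # (2) Paging; the response's numFound is the total hit count even
        #     when start points past the last page.
        ("start", str((page - 1) * rows)),
        ("rows", str(rows)),
        # (7) Range faceting: one bucket per year, no manual bucket list.
        ("facet", "true"),
        ("facet.range", DATE_FIELD),
        ("facet.range.start", "2010-01-01T00:00:00Z"),
        ("facet.range.end", "2013-01-01T00:00:00Z"),
        ("facet.range.gap", "+1YEAR"),
        ("wt", "json"),
    ]


def delete_by_query_xml(query):
    """(6) Body to POST to /update to remove only the matching documents."""
    return "<delete><query>" + escape(query) + "</query></delete>"


params = build_select_params(page=3)
query_string = urlencode(params)
delete_body = delete_by_query_xml("type:article AND client_id:42")
```

Appending query_string to your core's /select URL and POSTing delete_body 
(followed by a commit) to /update would exercise these; switching 
facet.range.gap to +1MONTH gives month-by-month buckets instead.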

        Erik


On Sep 25, 2012, at 12:18 , Matthew Shapiro wrote:

> Hi all,  I don't know if this is the correct mailing list, so I apologize
> if it isn't.  I wasn't sure what other list it would go to.
> 
> Anyways, my company a while back (before I started) got Google envy and
> decided to purchase a GSA system to store our searchable data.  While the
> GSA seems ok for a web-crawler it seems woefully inadequate for quick
> searching of application/non-web data.  Unfortunately, since they already
> purchased a GSA license and support I am trying to put together a
> non-direct cost argument on why I need to switch our search infrastructure
> from GSA to Solr.  To preface this, while I have used Lucene in various
> projects in the past (though not too extensively, just for basic search
> implementations) I have never used Solr.
> 
> I was hoping someone could comment on some of the areas below where I have
> encountered friction with the GSA and let me know if / how Solr is an
> improvement.
> 
> 1) Sorting by anything other than last modified date or relevancy is
> impossible with the GSA.  I need to be able to sort results based on a
> specific piece of metadata
> 
> 2) When performing a search outside of the page bounds (e.g. there are only
> 2 pages of results but the user queries for data on page 3) the GSA returns
> a total results count of zero, making it impossible to know if you have
> paged too far or if there were actually zero results
> 
> 3) No insight into data being fed into the GSA.  When I send data to the
> GSA it lists the data feed in the "feeds" page, but it's impossible to know
> which feed contained what data, and if an error occurs (depending on the
> error) you have no idea which piece of data was rejected or caused the
> failure.  Due to this I had to cut down and only send data to the system in
> very small chunks, just so one bad entry doesn't hold back too many records
> being updated.
> 
> 4) The GSA does not allow searching for data between two dates.  The most
> it lets you do is define a numerical data field with the dates (e.g.
> 20120901) but the GSA only supports numerical searching up to 6 significant
> digits, which means it only gives month accuracy but not day.
> 
> 5) The GSA does not allow operations nested within OR statements.  For
> example, you cannot do (x and y) or (a and b).
> 
> 6) No way to selectively flush mass data.  If I need to flush all the data
> in a collection to re-index it I have to deny a whole URL so the indexer
> clears the data out, then re-enable that URL.  Sometimes I need to flush
> only data flagged as articles or data for a specific client.
> 
> 7) Setting up facet groups is a very manual process in the GSA.  Also
> there's no easy way to have date ranges as search facets (date ranges all
> have to be explicitly defined through the web interface and
> manually maintained, I'd rather be able to have it give me facets on a year
> by year basis, or month by month).
> 
> Those are the main pain points.  There are others, such as community
> support (which between the mailing list and stack overflow I'm not worried
> about) but if anyone can give me a quick rundown on whether Solr addresses
> any of these issues I would be immensely thankful.
