Re: Adding pdf/word file using JSON/XML

Jack Krupansky Sun, 16 Jun 2013 20:20:07 -0700

I won't assert "total" mastery as a requirement. Degrees of mastery are sufficient. But even then, even "partial" mastery of some rather basic areas of Solr can be quite daunting.

It is enlightening to consider just how many nooks and crannies of Solr there are to master, and how many reasonable levels of mastery there are.


Spatial... the final frontier.

-- Jack Krupansky

-----Original Message----- From: Walter Underwood

Sent: Sunday, June 16, 2013 7:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Adding pdf/word file using JSON/XML

1. Total mastery of a product is a strange requirement. That would be a huge trivia contest that would include all the vestigial bad bits. For example, I feel no need to master the Porter stemmer. I have no idea how to do geo search in Solr, though I'm sure I could learn it pretty quickly if needed.

2. Someone who expects partial update in a search engine, or transactions, has a deep misunderstanding of the tradeoffs you make for what search can do. That isn't mastery of arcane details, that is search 101.


Here are Rob Pike's rules for a good software architecture:

1. Simple things are simple.
2. Hard things are possible.
3. You don't need to understand the entire system to use part of it.

I think Solr comes pretty close to that. It doesn't do as well on #1 as Ultraseek did, but it is better on #2.

If you really need search with transactions and field updates, that is really hard. You can buy it from MarkLogic. It works great and they charge what it is worth.


wunder
Former Principal Architect Infoseek/Inktomi/Verity Ultraseek
Former Search Guy Netflix
Search Guy Chegg

On Jun 16, 2013, at 3:05 PM, Jack Krupansky wrote:

Jan, you made no mention of "mastering" Solr - which was the crux of my comments.
I think everyone agrees that anyone can download and "use" Solr, in a basic sense, with minimal effort. The issue is how far the average application developer can get beyond "start" towards "mastery" without a detailed cheat sheet and eventually intensive guidance, if not outright exasperation and pain. How many of the many thousands of Solr deployments didn't hit some kind of wall where they had the impression that Solr should be able to do something easily and found that was not the case (multi-word synonyms come to mind)?
Oh, and yes, by my standards, MOST software IS "bad" and "hard to use". The level of training and books is certainly an indicator of the level of "badness". Some of Solr is indeed "not so bad" - while other parts have at least some elements of "extreme badness" (an NPE for a missing or invalid parameter is a mark of extreme badness).
[Again, my apologies to Roland - none of these comments reflect on his original inquiry! Except that Solr's divergence from a true, pure REST API is certainly one of the elements of its "badness". The fact that SolrCell does not support partial update, as a true REST CRUD API should, is a good example of relative "badness" in Solr.]
-- Jack Krupansky

-----Original Message----- From: Jan Høydahl
Sent: Sunday, June 16, 2013 4:16 PM
To: solr-user@lucene.apache.org
Subject: Re: Adding pdf/word file using JSON/XML

Hi,
I've never heard the complaint that Solr is hard to use. To the contrary, most people I come across have downloaded Solr themselves, walked through the tutorial and praise the simplicity with which they can start indexing and searching content.
When they come to us asking for consultancy or training, they are already in love with the product; they use it but realize that great search is so much more than just getting the HTTP requests or XML right. So while any "average Java developer" will be able to download and use Solr within an hour or two (my statement - even PHP developers can do that :-) ), that's just the beginning of it all.
With your reasoning, all software for which training classes exist is bad and hard to use. Our training classes do not focus on the technology itself, but on best practices to achieve a good search user experience *using* Solr. This is a skill not even seasoned SQL developers have.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 15 June 2013 at 21:39, Jack Krupansky <j...@basetechnology.com> wrote:
[My apologies to Roland for "hijacking" his original thread for this rant! Look what you started!!]
And I will stand by my statement: "Solr is too much of a beast for average app developers to master."
And the key word there, in case a too-casual reader missed it, is "master" - not "use" in the sense of hacking something together or solving a niche application for a typical Solr deployment, but master in the sense of having a high level of confidence about the vast bulk (even if not absolutely 100%) of the subject matter, Solr itself.
I mean, generally, on average, what percentage of Solr's many features has the average Solr app-deployer actually "mastered"?
And, what I am really referring to is not what expertise the pioneers and "expert" Solr solution consultants have had, but the level of expertise required for those who are to come in the years ahead who simply want to focus on their application without needing to become a "Solr expert" first.
The context of my statement was the application "devs" referenced earlier in this thread who were struggling because the Solr API was not 100% pure RESTful. As the respondent indicated, they were much happier to have a cleaner, more RESTful API that they as app developers can deal with, so that they wouldn't have to "master" all of the bizarre inconsistencies of Solr itself (e.g., just the knowledge that SolrCell doesn't support partial/atomic update).
And, the real focus of my statement, again in this particular context, is the actual application devs, the guys focused on the actual application subject matter itself, not the "Solr Experts" or "Solr solution architects" who do have a lot higher mastery of Solr than the "average" application devs.
And if my statement were in fact false, questions such as the one that began this thread would never have come up. The level of traffic for Solr User would be essentially zero if it were really true that average application developers can easily "master" Solr.
And there would be zero need for so many of these Solr training classes if Solr were so easy to "master". In fact, the very existence of so many Solr training classes effectively proves my point. And that's just for "basic" Solr, not any of the many esoteric points such as the one at the heart of this particular thread (i.e., SolrCell not supporting partial/atomic update).
And, in conclusion, my real interest is in helping the many "average" application developers who post inquiries on this Solr user list for the simple reason that they ARE in fact "struggling" with Solr.
Personally, I would suggest that a typical (average) successful deployer of Solr would be more readily characterized as having "survived" the Solr deployment process rather than having achieved a truly deep "mastery" of Solr. They may have achieved confidence about exactly what they have deployed, but do they also have great confidence that they know exactly what will happen if they make slight and subtle changes, or what exactly the fix will be for certain runtime errors? That is for the "average application developer" I'm talking about, not the elite expert Solr consultants.
One final way of putting it: if a manager or project leader wanted to staff a dev position to be "in-house Solr expert", can they just hire any old average Java programmer with no Solr experience and expect that he will rapidly "master" Solr?
I mean, why would so many recruiters be looking for a "Solr expert" or engaging the services of Solr consultancies if mastery of Solr by "average application developers" was a reality?!
[I want to hear Otis' take on this!]

-- Jack Krupansky

-----Original Message----- From: Grant Ingersoll
Sent: Saturday, June 15, 2013 1:47 PM
To: solr-user@lucene.apache.org
Subject: Re: Adding pdf/word file using JSON/XML
On Jun 15, 2013, at 12:54 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
On Sat, Jun 15, 2013 at 10:35 AM, Grant Ingersoll <gsing...@apache.org> wrote:
That being said, it truly amazes me that people were ever able to implement Solr, given some of the FUD in this thread. I guess those tens of thousands of deployments out there were all done by above average devs...
I would not classify the thread as FUD.
I was just referring to the part about how Solr isn't something average devs can do, which I think is FUD.
At any rate, I think the ExtractingReqHandler could be updated to allow for metadata, etc. to be passed in with the raw document itself and a patch would be welcome. It's something the literals stand in for now as a lightweight proxy, but clearly there is an opportunity for more to be passed in.
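
[Illustration of the "literals" mentioned above: a minimal sketch, assuming the extracting handler is mapped at /update/extract as in the stock example configuration, and that metadata is supplied as literal.<field> request parameters. The core name "collection1", the field names, and the helper build_extract_url are hypothetical and only there for the example. The sketch only builds the request URL; the PDF/Word file itself would go in the POST body.]

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, metadata, commit=True):
    """Build a Solr Cell request URL that carries metadata as literal.* params.

    base_url and the field names in `metadata` are assumptions for
    illustration; adjust them to the actual core and schema.
    """
    # The unique key is passed as a literal, since the raw file has no id.
    params = [("literal.id", doc_id)]
    # Each extra metadata field becomes literal.<fieldname>=<value>.
    for field, value in sorted(metadata.items()):
        params.append(("literal." + field, value))
    if commit:
        params.append(("commit", "true"))
    return base_url + "/update/extract?" + urlencode(params)


url = build_extract_url(
    "http://localhost:8983/solr/collection1",  # hypothetical core URL
    "doc1",
    {"author": "Jane Doe", "category": "manuals"},
)
print(url)
```

[Since the literals travel in the query string rather than inside the document, anything richer than flat field values has no place to go, which seems to be the gap a patch would address.]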


--
Walter Underwood
wun...@wunderwood.org
