Michael,

  Thanks for this. I already have a multi-core setup in place, at least for 
my different non-prod environments. I have not yet determined whether I need 
multiple indexes within a single environment, though I think I will. I have 
seen that I need to define the schemas but had not yet found that excellent 
page on it. I'll keep you posted as I learn more. 

Andrew Farnsworth
(804) 405-3630

> On Jul 21, 2015, at 10:29 AM, Michael Chaney <[email protected]> 
> wrote:
> 
> You probably want to look at this page:
> 
> https://wiki.apache.org/solr/Solrj
> 
> About halfway down, it starts explaining how to index.  I'm assuming that 
> if you use Java you can get the server set up and running (I don't use Java 
> and was able to get it up and running under Tomcat in a matter of a few 
> hours).
> 
> Basically, you have to define your schema as an xml document.  Each field 
> that you want to store must be defined along with a data type, whether to 
> store the original value, whether to index it, whether it is multi-valued, 
> etc.  See here:
> 
> https://wiki.apache.org/solr/SchemaXml
> 
> As they mention there, you usually add an "id" field that is unique.  For you 
> that might be the url or some part of the url that makes it unique.  You can 
> add text, numbers, etc. and it'll all be indexed.  But note that you have to 
> define up front what you'll be storing.  You can use "dynamicField" to give 
> you great flexibility in storing data without having to define the exact 
> fields up front.
> 
> After that, you can see on the first wiki link above how to create documents 
> and add them to the index.  You'll also want to create something to keep 
> your index up to date as your data is added, changed, or removed.
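
For reference, here is a minimal sketch of the "create documents and add them" step done by hand, without SolrJ. It only builds the XML update message that Solr accepts; posting it to the core's /update handler and sending a commit afterwards is left out. The class name, the field names ("id", "title", "content"), and the example URL are all assumptions for illustration and would have to match the real schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: builds a Solr XML update message for one page, keyed by URL.
// Field names "id", "title", "content" are assumptions, not a real schema.
final class SolrAddXmlSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Render one document as <add><doc><field name="...">...</field></doc></add>.
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> page = new LinkedHashMap<>();
        page.put("id", "http://intranet.example/docs/page1"); // hypothetical URL
        page.put("title", "Q&A");
        page.put("content", "Plain text of the page, no markup.");
        System.out.println(toAddXml(page));
    }
}
```

Since we generate the pages ourselves, the "content" value can be the plain text we already have, so no HTML, CSS, or JavaScript ever reaches the index. SolrJ wraps the same idea in its own document and client classes, so this is mainly useful for seeing what goes over the wire.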
> 
> On the search side you just create queries and run them.  The query will 
> return the data that you want, probably just the id but you might also want 
> it to do highlighting for text fields that you've stored.  You can look 
> through the documentation to determine how to set field weights but it's not 
> difficult.  You can also do faceting if that's helpful.
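
A rough sketch of what such a query might look like as a plain HTTP request, again without SolrJ. The parameters themselves (q, defType, qf, fl, hl, hl.fl, facet, facet.field, wt) are standard Solr request parameters; the core URL, the field names "title", "content", and "section", and the boost of 2 on the title are assumptions for illustration only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: builds the URL for a Solr /select request.
// Field names and the boost value are assumptions, not a real schema.
final class SolrQueryUrlSketch {

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String buildSelectUrl(String coreUrl, String userQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);
        params.put("defType", "edismax");      // parser that understands qf
        params.put("qf", "title^2 content");   // weight title matches twice as much
        params.put("fl", "id");                // return only the unique key
        params.put("hl", "true");              // turn highlighting on
        params.put("hl.fl", "content");        // highlight snippets from this field
        params.put("facet", "true");           // turn faceting on
        params.put("facet.field", "section");  // count hits per site section
        params.put("wt", "json");              // response format

        StringBuilder url = new StringBuilder(coreUrl).append("/select?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            first = false;
            url.append(e.getKey()).append('=').append(encode(e.getValue()));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical core URL; adjust host, port, and core name.
        System.out.println(buildSelectUrl("http://localhost:8983/solr/pages", "vacation policy"));
    }
}
```

Weighting results by where the user is on the site could probably start from the same request by varying the boosts per section, though that is a guess until it has been tried against real data.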
> 
> There are also various other plugins that you can throw into the mix.  One 
> that is on my backlog to use is the autocompleter.  You also have to 
> determine which stemming to use although I think it has one by default that 
> works well.
> 
> I also recommend this page for more info:
> 
> https://wiki.apache.org/solr/FrontPage#Tips.2C_Tricks_and_Use_Cases
> 
> This is a really complete search engine that will do anything you need for 
> searching.  It takes a few days to learn everything about it that you need 
> for basic indexing but it's all well-documented and pretty straightforward.
> 
> One other thing to note is that if you're going to index more than one data 
> set (or think that you might in the future) it probably makes sense to create 
> a multi-core setup right off the bat.
> 
> That should be enough to keep you busy for a while.
> 
> Michael
> 
>> On Mon, Jul 20, 2015 at 10:48 AM, Andrew Farnsworth <[email protected]> 
>> wrote:
>> We are planning on using it for the internal websites we create and 
>> providing detailed search into the content but also weighting the search 
>> results depending on where you are on the site.  So the content we would be 
>> indexing would use the URL as the key (not sure of Solr terminology for this) 
>> and it would index the content of the page.  Since we are generating the 
>> page we can just feed the search engine the content rather than crawling the 
>> pages.  I would rather not index the web pages' HTML, CSS, and JavaScript 
>> though, which is another reason I want to feed it rather than just spidering 
>> the website and indexing that.
>> 
>> Andy
>> 
>>> On Mon, Jul 20, 2015 at 9:11 AM, Michael Chaney 
>>> <[email protected]> wrote:
>>> My search shows "SolrJ" as a Java client for accessing Solr.  What kind of 
>>> data are you trying to index?
>>> 
>>> To give you an idea, I use Solr for doing music catalog searches.  I'm 
>>> using Ruby and there's a Ruby gem that allows pretty easy indexing of data 
>>> from a Rails app.  I get to describe the data that I want to be indexed and 
>>> searching is really simple.  Even if you're going directly to the server 
>>> it's not difficult to search.
>>> 
>>> I was using Thinking Sphinx (based on the Sphinx search engine), but Solr 
>>> is much faster at indexing.
>>> 
>>> Michael
>>> 
>>>> On Sat, Jul 18, 2015 at 10:35 AM, Andrew Farnsworth <[email protected]> 
>>>> wrote:
>>>> Java is the language of choice at the moment, though I would be interested 
>>>> in Perl too. 
>>>> 
>>>> Andrew Farnsworth
>>>> (804) 405-3630
>>>> 
>>>>> On Jul 18, 2015, at 9:36 AM, Michael Chaney <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> What language are you using?  Typically there'll be a library to make it 
>>>>> easy to interface.
>>>>> 
>>>>>> On Friday, July 17, 2015, Andrew Farnsworth <[email protected]> wrote:
>>>>>> Does anyone have any experience with Apache Solr?  I'm trying to get my 
>>>>>> own content into it and am not sure how it works.
>>>>>> 
>>>>>> Andy Farnsworth
>>>>>> 
>>>>>> --
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "NLUG" group.
>>>>>> To post to this group, send email to [email protected]
>>>>>> To unsubscribe from this group, send email to 
>>>>>> [email protected]
>>>>>> For more options, visit this group at 
>>>>>> http://groups.google.com/group/nlug-talk?hl=en
>>>>>> 
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "NLUG" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>>> an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Michael Darrin Chaney, Sr.
>>>>> [email protected]
>>>>> http://www.michaelchaney.com/
