Michael,

Thanks for this. I do already have a multi-core setup in place, at least for my different non-prod environments. I have not yet worked out whether I need multiple indexes within a single environment, though I think I will. I have seen that I need to define the schema but had not yet found that excellent page on it. I'll keep you posted as I learn more.

Andrew Farnsworth
(804) 405-3630

> On Jul 21, 2015, at 10:29 AM, Michael Chaney <[email protected]>
> wrote:
>
> You probably want to look at this page:
>
> https://wiki.apache.org/solr/Solrj
>
> If you get about half way down it starts explaining how to index. I'm
> assuming if you use Java you can get the server set up and running (I don't
> use Java and was able to get it up and running under Tomcat in a matter of a
> few hours).
>
> Basically, you have to define your schema as an xml document. Each field
> that you want to store must be defined along with a data type, whether to
> store the original value, whether to index it, whether it is multi-valued,
> etc. See here:
>
> https://wiki.apache.org/solr/SchemaXml
>
> As they mention there you usually add an "id" field that is unique. For you
> that might be the url or some part of the url that makes it unique. You can
> add text, numbers, etc. and it'll all be indexed. But note that you have to
> define up front what you'll be storing. You can use "dynamicField" to give
> you great flexibility in storing data without having to define the exact
> fields up front.
>
> After that, you can see on the first wiki link above how to create documents
> and add them to the index. You'll also want to create something to properly
> keep your index updated as your data changes, is added, or removed.
>
> On the search side you just create queries and run them. The query will
> return the data that you want, probably just the id but you might also want
> it to do highlighting for text fields that you've stored. You can look
> through the documentation to determine how to set field weights but it's not
> difficult. You can also do faceting if that's helpful.
>
> There are also various other plugins that you can throw in to the mix. One
> that is on my backlog to use is the autocompleter. You also have to
> determine which stemming to use although I think it has one by default that
> works well.
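
A minimal sketch of the indexing and search steps described above, in Java since that is the language in use here. It only builds the XML update message and the /select URL with the standard library; it does not talk to a server, and it is not SolrJ. The field names (id, section, content), the core name (intranet) and the localhost:8983 base URL are placeholders made up for illustration, and the fields would have to be defined in the schema first.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: field names, core name and base URL are hypothetical.
class SolrFeedSketch {

    // Escape characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Build an <add><doc>...</doc></add> update message for a single document.
    // Every key must match a field (or dynamicField pattern) in the schema.
    static String buildAddXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
              .append(escapeXml(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    // URL-encode one query parameter value as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always available", ex);
        }
    }

    // Build a /select URL returning only ids, with highlighting on one field.
    static String buildSelectUrl(String coreUrl, String query, String highlightField) {
        return coreUrl + "/select?q=" + enc(query)
                + "&fl=id&hl=true&hl.fl=" + enc(highlightField)
                + "&wt=json";
    }

    public static void main(String[] args) {
        Map<String, String> page = new LinkedHashMap<>();
        page.put("id", "/hr/benefits?year=2015");   // the page URL as unique key
        page.put("section", "hr");
        page.put("content", "Open enrollment & benefits overview");

        System.out.println(buildAddXml(page));
        System.out.println(buildSelectUrl(
                "http://localhost:8983/solr/intranet", "content:benefits", "content"));
    }
}
```

The XML would be POSTed to the core's /update handler (followed by a commit) and the URL fetched with any HTTP client. SolrJ wraps both steps if building them by hand is not appealing.
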
>
> I also recommend this page for more info:
>
> https://wiki.apache.org/solr/FrontPage#Tips.2C_Tricks_and_Use_Cases
>
> This is a really complete search engine that will do anything you need for
> searching. It takes a few days to learn everything about it that you need
> for basic indexing but it's all well-documented and pretty straight-forward.
>
> One other thing to note is that if you're going to index more than one data
> set (or think that you might in the future) it probably makes sense to create
> a multi-core setup right off the bat.
>
> That should be enough to keep you busy for awhile.
>
> Michael
>
>> On Mon, Jul 20, 2015 at 10:48 AM, Andrew Farnsworth <[email protected]>
>> wrote:
>> We are planning on using it for the internal websites we create and
>> providing detailed search into the content but also weighting the search
>> results depending on where you are on the site. So the content we would be
>> indexing would use URL as the key (not sure of SOLR terminology for this)
>> and it would index the content of the page. Since we are generating the
>> page we can just feed the search engine the content rather than crawling the
>> pages. I would rather not index the web pages html, css, and javascript
>> though, which is another reason I want to feed it rather than just spidering
>> the website and indexing that.
>>
>> Andy
>>
>>> On Mon, Jul 20, 2015 at 9:11 AM, Michael Chaney
>>> <[email protected]> wrote:
>>> My search shows "SolrJ" as a Java client for accessing solr. What kind of
>>> data are you trying to index?
>>>
>>> To give you an idea, I use Solr for doing music catalog searches. I'm
>>> using Ruby and there's a Ruby gem that allows pretty easy indexing of data
>>> from a Rails app. I get to describe the data that I want to be indexed and
>>> searching is really simple. Even if you're going directly to the server
>>> it's not difficult to search.
>>>
>>> I was using thinking sphinx (based on the sphinx search engine) but solr is
>>> much faster in indexing.
>>>
>>> Michael
>>>
>>>> On Sat, Jul 18, 2015 at 10:35 AM, Andrew Farnsworth <[email protected]>
>>>> wrote:
>>>> Java is the language of choice at the moment though I would be interested
>>>> in perl too.
>>>>
>>>> Andrew Farnsworth
>>>> (804) 405-3630
>>>>
>>>>> On Jul 18, 2015, at 9:36 AM, Michael Chaney <[email protected]>
>>>>> wrote:
>>>>>
>>>>> What language are you using? Typically there'll be a library to make it
>>>>> easy to interface.
>>>>>
>>>>>> On Friday, July 17, 2015, Andrew Farnsworth <[email protected]> wrote:
>>>>>> Does anyone have any experience with Apache SOLR? I'm trying to get my
>>>>>> own content into it and am not sure how it works.
>>>>>>
>>>>>> Andy Farnsworth
>>>>>>
>>>>>> --
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "NLUG" group.
>>>>>> To post to this group, send email to [email protected]
>>>>>> To unsubscribe from this group, send email to
>>>>>> [email protected]
>>>>>> For more options, visit this group at
>>>>>> http://groups.google.com/group/nlug-talk?hl=en
>>>>>>
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "NLUG" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>>> an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>> --
>>>>> Michael Darrin Chaney, Sr.
>>>>> [email protected]
>>>>> http://www.michaelchaney.com/
>>>
>>> --
>>> Michael Darrin Chaney, Sr.
>>> [email protected]
>>> http://www.michaelchaney.com/
>
> --
> Michael Darrin Chaney, Sr.
> [email protected]
> http://www.michaelchaney.com/
