Re: Solr8.7 - How to optmize my index ?

2020-12-02 Thread Dave
I’m going to go against the advice SLIGHTLY, it really depends on how you have things set up as far as your solr server hosting is done. If you’re searching off the same solr server you’re indexing to, yeah, don’t ever optimize; it will take care of itself, people much smarter than us, like Erick/

Re: Recovering deleted files without backup

2020-11-13 Thread Dave
Just rebuild the index. Pretty sure they’re gone if they aren’t in your vm backup, and solr isn’t a document storage tool, it’s a place to index the data from your document store, so it’s understood more or less that it can always be rebuilt when needed > On Nov 13, 2020, at 9:52 PM, Alex Hanna

Re: Need help to resolve Apache Solr vulnerability

2020-11-12 Thread Dave
Solr isn’t meant to be public facing. Not sure how anyone would send these commands since it can’t be reached from the outside world > On Nov 12, 2020, at 7:12 AM, Sheikh, Wasim A. > wrote: > > Hi Team, > > Currently we are facing the below vulnerability for Apache Solr tool. So can > you

Re: Avoiding single digit and single charcater ONLY query by putting them in stopwords list

2020-10-27 Thread Dave
Agreed. Just a JavaScript check on the input box would work fine for 99% of cases, unless something automatic is running them in which case just server side redirect back to the form. > On Oct 27, 2020, at 11:54 AM, Mark Robinson wrote: > > Hi Konstantinos , > > Thanks for the reply. > I t

Re: Solr endpoint on the public internet

2020-10-08 Thread Dave
#1. This is a HORRIBLE IDEA #2 If I was going to do this I would destroy the update request handler as well as the entire admin ui from the solr instance, set up a replication from a secure solr instance on an interval. This way no one could send an update /delete command, you could still update

Re: solr startup

2020-08-08 Thread Dave
er@lucene.apache.org > Subject: RE: solr startup > > suggester? what do i need to look for in the configs? > > Tony > > > > Sent from my Verizon, Samsung Galaxy smartphone > > > > Original message > From: Dave mailto:hastings.

Re: solr startup

2020-08-07 Thread Dave
It sounds like you have suggester indexes being built on startup. Without them they just come up in a second or so > On Aug 7, 2020, at 6:03 PM, Schwartz, Tony wrote: > > I have many collections. When I start solr, it takes 30 - 45 minutes to > start up and load all the collections. My col

Re: sorting help

2020-07-15 Thread Dave
That’s a good place to start. The idea was to make sure titles that started with a date would not always be at the forefront and the actual title of the doc would be sorted. > On Jul 15, 2020, at 4:58 PM, Erick Erickson wrote: > > Yeah, it’s always a question “how much is enough/too much”. >

Re: ***URGENT***Re: Questions about Solr Search

2020-07-03 Thread Dave
Seriously. Doug answered all of your questions. > On Jul 3, 2020, at 6:12 AM, Atri Sharma wrote: > > Please do not cross post. I believe your questions were already answered? > >> On Fri, Jul 3, 2020 at 3:08 PM Gautam K wrote: >> >> Since it's a bit of an urgent request so if could please h

Re: Getting rid of zookeeper

2020-06-09 Thread Dave
Is it horrible that I’m already burnt out from just reading that? I’m going to stick to the classic solr master slave set up for the foreseeable future, at least that lets me focus more on the search theory rather than the back end system non stop. > On Jun 9, 2020, at 5:11 PM, Vincenzo D'Amo

Re: How to determine why solr stops running?

2020-06-09 Thread Dave
I’ll add that whenever I’ve had a solr instance shut down, for me it’s been a hardware failure. Either the ram or the disk got a “glitch” and both of these are relatively fragile and wear and tear type parts of the machine, and should be expected to fail and be replaced from time to time. Solr i

Re: Script to check if solr is running

2020-06-08 Thread Dave
A simple Perl script would be able to cover this, I have a cron job Perl script that does a search with an expected result, if the result isn’t there it fails over to a backup search server, sends me an email, and I fix what’s wrong. The backup search server is a direct clone of the live server
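The check described above can be sketched in Python rather than Perl. This is a minimal sketch under stated assumptions, not the poster's script: the function names `has_expected_hit` and `choose_server`, the base URLs, and the probe query are all hypothetical, and the email notification step is left out.

```python
import json
import urllib.parse
import urllib.request


def has_expected_hit(base_url, query, timeout=5.0):
    """Return True if a search against base_url finds at least one document.

    base_url is assumed to point at a core, e.g. http://host:8983/solr/mycore
    (a hypothetical address).
    """
    params = urllib.parse.urlencode({"q": query, "rows": 0, "wt": "json"})
    try:
        with urllib.request.urlopen(f"{base_url}/select?{params}", timeout=timeout) as resp:
            body = json.load(resp)
        return body["response"]["numFound"] > 0
    except (OSError, ValueError, KeyError, TypeError):
        # Network errors, bad JSON, or an unexpected response shape all count as "down".
        return False


def choose_server(primary, backup, query, probe=has_expected_hit):
    """Pick the primary if the probe search succeeds, otherwise the backup."""
    return primary if probe(primary, query) else backup
```

Run from cron, `choose_server` would decide which server the front end points at; passing a stub `probe` makes the failover logic testable without a running Solr.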

Re: Upgrading Solrcloud indexes from 7.2 to 8.4.1

2020-03-06 Thread Dave
You're best off doing a full reindex to a single solr cloud 8.x node and then when done start taking down 7.x nodes, upgrade them to 8.x and add them to the new cluster. Upgrading indexes has so many potential issues, > On Mar 6, 2020, at 9:21 PM, lstusr 5u93n4 wrote: > > Hi Webster, > > When

Re: Clarity on Stable Release

2020-01-29 Thread Dave
But! If we don’t have people throwing a new release into production and finding real world problems we can’t trust that the current release problems will be exposed and then remedied, so it’s a double edged sword. I personally agree with staying a major version back, but that’s because it takes

Re: Solr cloud production set up

2020-01-18 Thread Dave
If you’re not getting values, don’t ask for the facet. Facets are expensive as hell, maybe you should think more about your queries than your infrastructure, solr cloud won’t help you at all especially if you’re asking for things you don’t need > On Jan 18, 2020, at 1:25 PM, Rajdeep Sahoo wrote:

Re: Solr cloud production set up

2020-01-18 Thread Dave
Agreed with the above. what’s your idea of “huge”? I have 600 ish gb in one core plus another 250x2 in two more on the same standalone solr instance and it runs more than fine > On Jan 18, 2020, at 11:31 AM, Shawn Heisey wrote: > > On 1/18/2020 1:05 AM, Rajdeep Sahoo wrote: >> Our Index size

Re: Failed to connect to server

2020-01-17 Thread Dave
It doesn’t need to be identical, just anything with a buildon reload statement > On Jan 17, 2020, at 12:17 PM, rhys J wrote: > > On Fri, Jan 17, 2020 at 12:10 PM David Hastings < > hastings.recurs...@gmail.com> wrote: > >> something like this in your solr config: >> >> autosuggest > "exactMa

Re: Solr 7.5 speed up, accuracy details

2019-12-28 Thread Dave
There is no increase in speed, but features. Doc values add some but it’s hard to quantify, and some people think solr cloud has speed increases but I don’t think they exist when hardware cost is nonexistent and it adds too much complexity to something that should be simple. > On Dec 28, 2019

Re: does copyFields increase indexe size ?

2019-12-25 Thread Dave
#1 merry Xmas thing #2 you initially said you were talking about 1k documents. That will not be a large enough sample size to see the index size differences with this new field, in any case the index size should never really matter. But if you go to a few million you will notice the size has

Re: Indexing strategies for user profiles

2019-12-10 Thread Dave
I would index the products a user purchased as well as the number of times purchased, then I would take a user, search their bought products boosted by how many times purchased, against other users, have a facet for products and filter out the top bought products that are not on the users alre

Re: How to add a new field to already an existing index in Solr 6.6 ?

2019-12-08 Thread Dave
Or just do it the lazy way and use a dynamic field. I’ve found little to no drawbacks with them aside from a complete lack of documentation of the field in the schema itself > On Dec 8, 2019, at 8:07 AM, David Barnett wrote: > > Also - look at adding fields using Solr admin, this will these

Re: xms/xmx choices

2019-12-06 Thread Dave
Actually at about that time the replication finished and added about 20-30gb to the index from the master. My current set up goes Indexing master -> indexer slave/production master (only replicated on command)-> three search slaves (replicate each 15 minutes) We added about 2.3m docs, then I re

Re: Is it possible to have different Stop words depending on the value of a field?

2019-12-02 Thread Dave
.org If you can, please build on your explanation as It > sounds relevant. > -Original Message- > From: Dave > Sent: Monday, December 2, 2019 7:38 PM > To: solr-user@lucene.apache.org > Cc: jornfra...@gmail.com > Subject: Re: Is it possible to have different Stop

Re: Is it possible to have different Stop words depending on the value of a field?

2019-12-02 Thread Dave
It clarifies yes. You need new fields. In this case something like Address_us Address_uk And index and search them accordingly with different stopword files used in different field types, hence the copy field from “address” into as many new fields as needed > On Dec 2, 2019, at 7:33 PM, wrote:

Re: A Last Message to the Solr Users

2019-11-30 Thread Dave
I’m young here I think, not even 40 and only been using solr since like 2008 or so, so like 1.4 give or take. But I know a really good therapist if you want to talk about it. > On Nov 30, 2019, at 6:56 PM, Mark Miller wrote: > > Now I have sacrificed to give you a new chance. A little for my

Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Dave
https://doc.sitecore.com/developers/90/platform-administration-and-architecture/en/using-solr-auto-suggest.html If you need more references. Set all parameters yourself, don’t rely on defaults. > On Nov 21, 2019, at 3:41 PM, Dave wrote: > > https://lucidworks.com/post/solr-

Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Dave
https://lucidworks.com/post/solr-suggester/ You must set buildonstartup to false, the default is true. Try it > On Nov 21, 2019, at 3:21 PM, Koen De Groote > wrote: > > Erick: > > No suggesters. There is 1 spellchecker for > > text_general > > But no buildOnCommit or buildOnStartup setting

Re: Active directory integration in Solr

2019-11-20 Thread Dave
I guess I don’t understand why one wouldn’t simply make a basic front end for solr, it’s literally the easiest thing to throw together and then you control all authentication and filters per user. Even a basic one would be some w3 school tutorials with php+json+whatever authentication Mech you

Re: POS Tagger

2019-10-25 Thread Dave
me against the query >> >> On Fri, Oct 25, 2019 at 12:11 PM Audrey Lorberfeld - >> audrey.lorberf...@ibm.com wrote: >> >>> So then you do run your POS tagger at query-time, Dave? >>> >>> -- >>> Audrey Lorberfeld >>> Data Scientist, w3

Re: Sample JWT Solr configuration

2019-09-19 Thread Dave
I know this has nothing to do with the issue at hand but if you have a public facing solr instance you have much bigger issues. > On Sep 19, 2019, at 10:16 PM, Tyrone Tse wrote: > > I finally got JWT Authentication working on Solr 8.1.1. > This is my security.json file contents > { > "authe

Re: Need more info on MLT (More Like This) feature

2019-09-13 Thread Dave
As a side note, if you use shingles with the mlt handler I believe you will get better scores/relevant results. So “to be free” becomes indexed as “to_be” “to_be_free” and “be_free” but also as each word. It makes the index significantly larger but creates better “unique terms” in my opinion and
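To make the shingle idea concrete, here is a small Python illustration of which terms come out of a phrase. It is only a sketch of the concept, not Solr's own shingle filter; the function name `shingles` and its parameters are made up for this example.

```python
def shingles(tokens, min_size=2, max_size=3, sep="_"):
    """Join every run of min_size..max_size adjacent tokens into one term."""
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end <= len(tokens):
                out.append(sep.join(tokens[start:end]))
    return out


words = "to be free".split()
# Shingles plus the original single words, as described in the message above.
print(shingles(words) + words)
```

The extra terms are why the index grows: every position contributes up to `max_size - min_size + 1` additional tokens on top of the single words.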

Re: Solr 7.7.2 Autoscaling policy - Poor performance

2019-09-03 Thread Dave
You’re going to want to start by having more than 3gb for memory in my opinion but the rest of your set up is more complex than I’ve dealt with. On Sep 3, 2019, at 1:10 PM, Andrew Kettmann wrote: >> How many zookeepers do you have? How many collections? What is there size? >> How much CPU / m

Re: Best way to retrieve parent documents with children using getBeans method?

2019-08-12 Thread Dave Durbin
Unsubscribe

Sort date stored in text field?

2019-06-10 Thread Dave Beckstrom
Hi Everyone, Running SOLR 7.3.1 I have a field called metatag.date that is field-type: org.apache.solr.schema.TextField. The field is being populated by NUTCH, which grabs the date from the html: and stores it in the metatag.date field in SOLR. I'm trying to sort by date (metatag.date de

Re: Using Solr as a Database?

2019-06-02 Thread Dave
You *can use solr as a database, in the same sense that you *can use a chainsaw to remodel your bathroom. Is it the right tool for the job? No. Can you make it work? Yes. As for HA and cluster rdbms gallera cluster works great for Maria db, and is acid compliant. I’m sure any other database h

Re: SOLR Text Field

2019-04-08 Thread Dave Beckstrom
nly for example purposes only. That would have been most helpful. Even a FAQ somewhere would have been helpful. Anyway, you're the best and thank you again Best, Dave Beckstrom

Re: SOLR Text Field

2019-04-06 Thread Dave
Wow. Ok dude relax and take a nap. It sounds like you don’t even have a core defined. Maybe you’d do and I’m reaching a bit but start there solr is super simple and only gets complicated when you’re complicated. > On Apr 6, 2019, at 8:59 AM, Dave Beckstrom wrote: > > Hi Everyone,

SOLR Text Field

2019-04-06 Thread Dave Beckstrom
Hi Everyone, I'm really hating SOLR. All I want is to define a text field that data can be indexed into and which is searchable. Should be super simple. But I run into issue after issue. I'm running SOLR 7.3 because it's compatible with the version of NUTCH I'm running. The docs say that SOL

Error on text field

2019-03-26 Thread Dave Beckstrom
Hi Everyone, I'm using Nutch to crawl and index some content. It failed on a SOLR field defined as a text field when it was trying to insert the following value for the field: 33011-54192-EWHServer1234-3BA9D1CA-05B6-42BA-9D88-BAD970CAEEC6 The field was defined in the schema.xml as: The error

Solr 7.6 Shard name - possible issue?

2019-03-17 Thread Dave Durbin
-name-2018-10-30_shard1_2_replica_n1 I have a couple who’s replica number exceeds a couple of hundred. collection-name-2018-10-30_shard2_1_replica_n213 Does this seem reasonable? Does it suggest a problem with these shard replicas or this shard in general ? Thanks Dave -- *P.S. We've lau

Boolean Searches?

2019-03-14 Thread Dave Beckstrom
Hi Everyone, I'm building a SOLR search application and the customer wants the search to work like google search. They want the user to be able to enter boolean searches like: train OR dragon. which would find any matches that has the word "train" or the word "dragon" in the title. I know tha

Re: MLT and facetting

2019-02-28 Thread Dave
I’m more curious what you’d expect to see, and what possible benefit you could get from it > On Feb 28, 2019, at 8:48 PM, Zheng Lin Edwin Yeo wrote: > > Hi Martin, > > I have no idea on this, as the case has not been active for almost 2 years. > Maybe I can try to follow up. > > Faceting by d

Re: MLT and facetting

2019-02-25 Thread Dave
Use the mlt to get the queries to use for getting facets in a two search approach > On Feb 25, 2019, at 10:18 PM, Zheng Lin Edwin Yeo > wrote: > > Hi Martin, > > I think there are some pictures which are not being sent through in the > email. > > Do send your query that you are using, and wh

Re: edismax: sorting on numeric fields

2019-02-16 Thread Dave
Sounds like you need to use code and post process your results as it sounds too specific to your use case. Just my opinion, unless you want to get into spatial queries which is a whole different animal and something I don’t think many have experience with, including myself > On Feb 16, 2019, a

Re: English Analyzer

2019-02-05 Thread Dave
This will tell you pretty much everything you need to get started https://lucene.apache.org/solr/guide/6_6/language-analysis.html > On Feb 5, 2019, at 4:55 AM, akash jayaweera wrote: > > Hello All, > > Can i get details how to use English analyzer with stemming, > lemmatizatiion, stopword removal t

Re: Large Number of Collections takes down Solr 7.3

2019-01-22 Thread Dave
Do you mind if I ask why so many collections rather than a field in one collection that you can apply a filter query to each customer to restrict the result set, assuming you’re the one controlling the middle ware? > On Jan 22, 2019, at 4:43 PM, Monica Skidmore > wrote: > > We have been runni

Re: Solr Cloud configuration

2018-11-20 Thread Dave
But then I would lose the streaming expressions right? > On Nov 20, 2018, at 6:00 PM, Edward Ribeiro wrote: > > Hi David, > > Well, as a last resort you can resort to classic schema.xml if you are > using standalone Solr and don't bother to give up schema API. Then you are > back to manually edi

Re: Index optimization takes too long

2018-11-03 Thread Dave
On a side note, does adding docvalues to an already indexed field, and then optimizing, prevent the need to reindex to take advantage of docvalues? I was under the impression you had to reindex the content. > On Nov 3, 2018, at 4:41 AM, Deepak Goel wrote: > > I would start by monitoring the h

7.3 to 7.5

2018-10-18 Thread Dave
Would a minor solr upgrade such as this require a reindexing in order to take advantage of the skg functionality, or would it work regardless? A full reindex is quite a large operation in my use case

cursorMark and sort order

2018-07-24 Thread Dave Durbin
just : sort = asc and have Solr understand that the sort is only for tie break purposes? Thanks Dave

Re: Sorting on ip address

2018-06-18 Thread Dave
Store it as an atom rather than an IP address. > On Jun 18, 2018, at 12:14 PM, root23 wrote: > > Hi all, > is there a built in data type which i can use for ip address which can > provide me sorting ip address based on the class? if not then what is the > best way to sort based on ip address ?

Re: Scaling issue with Solr

2017-12-27 Thread Dave
You may find that buying some more memory will be your best bang for the buck in your set up. 32-64 gb isn’t expensive, > On Dec 27, 2017, at 6:57 PM, Suresh Pendap wrote: > > What is the downside of configuring ramBufferSizeMB to be equal to 5GB ? > Is it only that the window of time for flu

Re: Migrating from Solr 6.X to Solr 7.X: "non legacy mode coreNodeName missing"

2017-10-30 Thread Dave Seltzer
Thanks Erick, I've looked over the documentation. Quick follow-up question: What are the consequences of running with legacyCloud=true? Would I need to point a new Solr cluster at a new Zookeeper instance to avoid this? Many thanks! -Dave On Mon, Oct 30, 2017 at 1:36 PM, Erick Erickson

Re: Migrating from Solr 6.X to Solr 7.X: "non legacy mode coreNodeName missing"

2017-10-30 Thread Dave Seltzer
indicate the state of the cluster. Does that mean I'm using the "zookeeper is the truth" system or the old system? Thanks! -Dave On Mon, Oct 30, 2017 at 11:55 AM, Erick Erickson wrote: > You may have to set legacyCloud=true in your cluster properties. In > the Solr

Migrating from Solr 6.X to Solr 7.X: "non legacy mode coreNodeName missing"

2017-10-30 Thread Dave Seltzer
: non legacy mode coreNodeName missing {collection.configName=content, shard=shard1, collection=content_collection_20171013} Is there something I have to do to prepare this collection for Solr 7.x? Thanks, -Dave [root@crompcoreph02 ~]# curl "http:// [solrclusterloadbalancer]/solr/admin/

Re: Semantic Knowledge Graph

2017-10-09 Thread Dave
>> dont suppose any one knows where i may be able to find them, or point me in >> a direction to get more information about this tool. >> >> Thanks - dave >>

Re: length of indexed value

2017-10-03 Thread Dave
I’d personally use your second option. Simple and straightforward if you can afford the time for a reindex > On Oct 3, 2017, at 6:23 PM, John Blythe wrote: > > hey all. > > was hoping to find a query function that would allow me to filter based on > the length of an indexed value. only things

Re: Performance Test

2017-09-04 Thread Dave
Get the raw logs from normal use, script out something to replicate the searches and have it fork to as many cores as the solr server has is what I'd do. > On Sep 4, 2017, at 5:26 AM, Daniel Ortega wrote: > > I would recommend you Solrmeter cloud > > This fork supports solr cloud: > https:

Re: query with wild card with AND taking lot of time

2017-09-03 Thread Dave
My other concern would be your p's and q's. If you start mixing in Boolean logic and solrs weak respect for it, it could be unpredictable > On Sep 3, 2017, at 5:43 PM, Phil Scadden wrote: > > 5 seems a reasonable limit to me. After that revert to slow. > > -Original Message- > From: E

Re: Different order of docs between SOLR-4.10.4 to SOLR-6.5.1

2017-08-13 Thread Dave
Rebuild your index. It's just the safest way. On Aug 13, 2017, at 2:02 PM, SOLR4189 wrote: >> If you are changing things like WordDelimiterFilterFactory to the graph >> version, you'll definitely want to reindex > > What does it mean "*want to reindex*"? If I change > WordDelimiterFilterFacto

Re: MongoDb vs Solr

2017-08-12 Thread Dave
Personally I say use a rdbms for data storage, it's what it's for. Solr is for search and retrieve and the expense of possible loss of all data, in which case you rebuild it. > On Aug 12, 2017, at 11:26 AM, Muwonge Ronald wrote: > > Hi Solr can use mongodb for storage and you can play with th

Re: Fetch a binary field

2017-08-11 Thread Dave
Why didn't you set it to be indexed? Sure it would be a small dent in an index > On Aug 11, 2017, at 5:20 PM, Barbet Alain wrote: > > Re, > I take a look on the source code where this msg happen > https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/schema/SchemaF

Re: Need help with query syntax

2017-08-10 Thread Dave
Eric you going to vegas next month? > On Aug 10, 2017, at 7:38 PM, Erick Erickson wrote: > > Omer: > > Solr does not implement pure boolean logic, see: > https://lucidworks.com/2011/12/28/why-not-and-or-and-not/. > > With appropriate parentheses it can give the same results as you're > discov

Re: MongoDb vs Solr

2017-08-05 Thread Dave
o the mailing list that's supposed to serve as a source of help, which, you asked for. > On Aug 5, 2017, at 7:54 AM, Dave wrote: > > Also I wouldn't really recommend mongodb at all, it should only to be used as > a fast front end to an acid compliant relational db same with

Re: MongoDb vs Solr

2017-08-05 Thread Dave
; wunder >>>> Walter Underwood >>>> wun...@wunderwood.org >>>> http://observer.wunderwood.org/ (my blog) >>>> >>>> >>>>> On Aug 4, 2017, at 8:13 PM, David Hastings >>>> wrote: >>>>> >>

Re: MongoDb vs Solr

2017-08-05 Thread Dave
hip in a nosql db >> as you described, since that's a rdbms concept. If it exists in a nosql >> environment I would like to learn how... >> >>> On Aug 4, 2017, at 10:56 PM, Dave wrote: >>> >>> Uhm. Dude are you drinking? >>> >>&g

Re: MongoDb vs Solr

2017-08-04 Thread Dave
Uhm. Dude are you drinking? 1. Lucidworks would never say that. 2. Maria is not a json +MySQL. Maria is a fork of the last open source version of MySQL before oracle bought them 3.walter is 100% correct. Solr is search. The only complex data structure it has is an array. Something like mongo c

Re: MongoDb vs Solr

2017-08-04 Thread Dave
One's a search engine and the other is a nosql db. They're nothing alike and are completely different tools for completely different jobs. > On Aug 4, 2017, at 7:16 PM, Francesco Viscomi wrote: > > Hi all, > why i have to choose solr if mongoDb is easier to learn and to use? > Both are NoSql

Re: Move index directory to another partition

2017-08-01 Thread Dave
To add to this, not sure if solr cloud uses it, but you're going to want to destroy the write.lock file as well > On Aug 1, 2017, at 9:31 PM, Shawn Heisey wrote: > >> On 8/1/2017 7:09 PM, Erick Erickson wrote: >> WARNING: what I currently understand about the limitations of AWS >> could fill vo

Re: solr cloud vs standalone solr

2017-07-29 Thread Dave
There is no solid rule. Honestly stand alone solr can handle quite a bit, I don't think there's a valid reason to go to cloud unless you are starting from scratch and want to use the newest buzz word, stand alone can handle well over half a terabyte index at sub second speeds all day long. >

Re: Network segmentation of replica

2017-07-06 Thread Dave
Sorry that should have read have not tested in solr cloud. > On Jul 6, 2017, at 6:37 PM, Dave wrote: > > I have tested that out in solr cloud, but for solr master slave replication > the config sets will not go without a reload, even if specified in the > slave settings

Re: Network segmentation of replica

2017-07-06 Thread Dave
I have tested that out in solr cloud, but for solr master slave replication the config sets will not go without a reload, even if specified in the slave settings. > On Jul 6, 2017, at 5:56 PM, Erick Erickson wrote: > > I'm not entirely sure what happens if the sequence is > 1> node dro

Re: Solr Web Crawler - Robots.txt

2017-06-01 Thread Dave
And I mean that in the context of stealing content from sites that explicitly declare they don't want to be crawled. Robots.txt is to be followed. > On Jun 1, 2017, at 5:31 PM, David Choi wrote: > > Hello, > > I was wondering if anyone could guide me on how to crawl the web and > ignore the

Re: Solr Web Crawler - Robots.txt

2017-06-01 Thread Dave
If you are not capable of even writing your own indexing code, let alone crawler, I would prefer that you just stop now. No one is going to help you with this request, at least I'd hope not. > On Jun 1, 2017, at 5:31 PM, David Choi wrote: > > Hello, > > I was wondering if anyone could gui

Re: Solr in NAS or Network Shared Drive

2017-05-26 Thread Dave
This could be useful in a space expensive situation, although the reason I wanted to try it is multiple solr instances in one server reading one index on the ssd. This use case where on the nfs still leads to a single point of failure situation on one of the most fragile parts of a server, the d

Re: Best practices for backup & restore

2017-05-16 Thread Dave
I think it depends what you are backing up and restoring from. Hardware failure? Accidental delete? For my use case my master indexer stores the index on a San with daily snap shots for reliability, then my live searching master is on a San as well, my live slave searchers are all on SSD driv

Re: SOLR as nosql database store

2017-05-08 Thread Dave
You will want to have both solr and a sql/nosql data storage option. They serve different purposes > On May 8, 2017, at 10:43 PM, bharath.mvkumar > wrote: > > Hi All, > > We have a use case where we have mysql database which stores documents and > also some of the fields in the document is

Re: Filter Facet Query

2017-04-17 Thread Dave
Min.count is what you're looking for to get non 0 facets > On Apr 17, 2017, at 6:51 PM, Furkan KAMACI wrote: > > My query: > > /select?facet.field=research&facet=on&q=content:test > > Q1) Facet returns research values with 0 counts which has a research value > that is not from a document match
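As a rough illustration of that suggestion, the sketch below rebuilds the quoted query with Solr's `facet.mincount` parameter set to 1, so that facet values with a zero count are left out. The helper name `facet_query` is hypothetical; only the field and query values come from the quoted message.

```python
from urllib.parse import urlencode


def facet_query(field, q, mincount=1):
    """Build a /select query string that drops facet values below mincount."""
    params = [
        ("q", q),
        ("facet", "on"),
        ("facet.field", field),
        ("facet.mincount", mincount),
    ]
    return "/select?" + urlencode(params)


print(facet_query("research", "content:test"))
```

The same parameter can also be set per field (for example `f.research.facet.mincount`) if only one facet should hide its empty values.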

Re: Phrase Fields performance

2017-04-01 Thread Dave
Maybe commongrams could help this but it boils down to speed/quality/cheap. Choose two. Thanks > On Apr 1, 2017, at 10:28 AM, Shawn Heisey wrote: > >> On 3/31/2017 1:55 PM, David Hastings wrote: >> So I un-commented out the line, to enable it to go against 6 important >> fields. Afterwards thro

Re: Facet? Search problem

2017-03-13 Thread Dave
https://wiki.apache.org/solr/FieldCollapsing > On Mar 13, 2017, at 9:59 PM, Dave wrote: > > Perhaps look into grouping on that field. > >> On Mar 13, 2017, at 9:08 PM, Scott Smith wrote: >> >> I'm trying to solve a search problem and wondering if facets

Re: Facet? Search problem

2017-03-13 Thread Dave
Perhaps look into grouping on that field. > On Mar 13, 2017, at 9:08 PM, Scott Smith wrote: > > I'm trying to solve a search problem and wondering if facets (or something > else) might solve the problem. > > Let's assume I have a bunch of documents (100 million+). Each document has a > cate

Re: SOLR JOIN

2017-02-28 Thread Dave
That seems difficult if not impossible. The joins are just complex queries, with the same data set. > On Feb 28, 2017, at 11:37 PM, Nitin Kumar wrote: > > Hi, > > Can we use join query for more than 2 cores in solr. If yes, please provide > reference or example. > > Thanks, > Nitin

Re: solr warning - filling logs

2017-02-26 Thread Dave
e? > Sure, will try to move out to external zookeeper > >> On Sun, Feb 26, 2017 at 7:07 PM Dave wrote: >> >> You shouldn't use the embedded zookeeper with solr, it's just for >> development not anywhere near worthy of being out in production. Otherwise &

Re: solr warning - filling logs

2017-02-26 Thread Dave
You shouldn't use the embedded zookeeper with solr, it's just for development not anywhere near worthy of being out in production. Otherwise it looks like you may have a port scanner running. In any case don't use the zk that comes with solr > On Feb 26, 2017, at 6:52 PM, Satya Marivada wrote

Re: Question about best way to architect a Solr application with many data sources

2017-02-21 Thread Dave
could re-read and send to Solr. > > Best, > Erick > >> On Tue, Feb 21, 2017 at 5:17 PM, Dave wrote: >> B is a better option long term. Solr is meant for retrieving flat data, >> fast, not hierarchical. That's what a database is for and trust me you would

Re: Question about best way to architect a Solr application with many data sources

2017-02-21 Thread Dave
B is a better option long term. Solr is meant for retrieving flat data, fast, not hierarchical. That's what a database is for and trust me you would rather have a real database on the end point. Each tool has a purpose, solr can never replace a relational database, and a relational database cou

Re: Issues with Solr Morphline reading RFC822 files

2017-02-13 Thread Dave
Can't see what's color coded in the email. > On Feb 13, 2017, at 5:35 PM, Anatharaman, Srinatha (Contractor) > wrote: > > Hi, > > I am loading email files which are in RFC822 format into SolrCloud using Flume > But some meta data of the emails is not getting loaded to Solr. > Please find belo

Re: Solr Data Import Handler

2017-02-12 Thread Dave
That sounds pretty much like a hack. So if two imports happen at the same time they have to wait for each other? > On Feb 12, 2017, at 4:01 PM, Shawn Heisey wrote: > >> On 2/12/2017 10:30 AM, Minh wrote: >> Hi everyone, >> How can i run multithreads of DIH in a cluster for a collection? > > Th

Re: Configuring Solr for Maximum Concurrency

2016-12-29 Thread Dave Seltzer
cause more fundamental issues in Solr's performance Or maybe I missed something stupid at the OS level. Sigh. Many thanks for all the help! -Dave On Wed, Dec 28, 2016 at 7:11 PM, Erick Erickson wrote: > You'll see some lines with three different times in them, "user" &

Re: Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
;13188910K(16078208K), 1.2403791 secs] [Times: user=4.60 sys=0.00, real=1.24 secs] Is there something I should be grepping for in this enormous file? Many thanks! -Dave On Wed, Dec 28, 2016 at 12:44 PM, Erick Erickson wrote: > Threads are usually a container parameter I think. True, Solr

Re: Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
xy. -D On Wed, Dec 28, 2016 at 12:42 PM, Pablo Anzorena wrote: > Dave, > > there is something similar like MAX_CONNECTIONS and > MAX_CONNECTIONS_PER_HOST which control the number of connections. > > Are you leaving open the connection to zookeeper after you establish it? > Ar

Re: Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
ervers are dead based on the fact that responses are so very sluggish. You've mentioned lots of timeouts, but are there any settings which control the number of available threads? Or is this something which is largely handled automagically? Many thanks! -Dave On Wed, Dec 28, 2016 at 11:56 AM,

Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
e concurrent requests. Many thanks! -Dave

Re: Cloud Behavior when using numShards=1

2016-12-27 Thread Dave Seltzer
Thanks Erick, That's pretty much where I'd landed on the issue. To me Solr Cloud is clearly the preferable option here - especially when it comes to indexing and cluster management. I'll give "preferLocalShards" a try and see what happens. Many thanks for your in-dep

Re: Cloud Behavior when using numShards=1

2016-12-27 Thread Dave Seltzer
ed queries more likely to be distributed in this fashion? -Dave q=_query_:"{!edismax mm=5}hashTable_0:359079936 hashTable_1:440999735 hashTable_2:1376147226 hashTable_3:35668745 hashTable_4:671810129 hashTable_5:536885545 hashTable_6:453337089 hashTable_7:1279281410 hashTable_8:772478009 h

Cloud Behavior when using numShards=1

2016-12-27 Thread Dave Seltzer
m curious why SERVER1 would be proxying requests to SERVER3 in a situation where the sf_fingerprints index is completely present on the local system. Is this a situation where I should be using generic replication rather than Cloud? Many thanks! -Dave

Cloud Behavior when using

2016-12-27 Thread Dave Seltzer
ric replication rather than Cloud? Dave Seltzer Chief Systems Architect TVEyes (203) 254-3600 x222

Re: Poor Solr Cloud Query Performance against a Small Dataset

2016-11-03 Thread Dave Seltzer
Good tip Rick, I'll dig in and make sure everything is set up correctly. Thanks! -D Dave Seltzer Chief Systems Architect TVEyes (203) 254-3600 x222 On Wed, Nov 2, 2016 at 9:05 PM, Rick Leir wrote: > Here is a wild guess. Whenever I see a 5 second delay in networking, I > think D

Poor Solr Cloud Query Performance against a Small Dataset

2016-11-01 Thread Dave Seltzer
,'71309060'), termfreq(hashTable_21,'1125848323'), termfreq(hashTable_22,'1077548043'), termfreq(hashTable_23,'117638159'), termfreq(hashTable_24,'-1408039642')) The schema looks like this: subFingerprintId I'

RE: Performance of facet contain search in 5.2.1

2015-07-22 Thread Lo Dave
Yes. I am going to provide autocomplete with facet count as rank.i.e. when yours input "owe a duty", the system will suggest "xxx owe a duty yyy" with highest count. Thanks. Dave > Date: Wed, 22 Jul 2015 14:35:40 +0100 > Subject: Re: Performance of facet cont
