[CVE-2020-13957] The checks added to unauthenticated configset uploads in Apache Solr can be circumvented

2020-10-12 Thread Tomas Fernandez Lobbe
Severity: High Vendor: The Apache Software Foundation Versions Affected: 6.6.0 to 6.6.5, 7.0.0 to 7.7.3, 8.0.0 to 8.6.2 Description: Solr prevents some features considered dangerous (which could be used for remote code execution) from being configured in a ConfigSet that's uploaded via API without auth

[SECURITY] CVE-2019-12401: XML Bomb in Apache Solr versions prior to 5.0

2019-09-09 Thread Tomas Fernandez Lobbe
Severity: Medium Vendor: The Apache Software Foundation Versions Affected: 1.3.0 to 1.4.1, 3.1.0 to 3.6.2, 4.0.0 to 4.10.4 Description: Solr versions prior to 5.0.0 are vulnerable to an XML resource consumption attack (a.k.a. Lol Bomb) via its update handler. By leveraging XML DOCTYPE and ENTITY

CVE-2019-0192 Deserialization of untrusted data via jmx.serviceUrl in Apache Solr

2019-03-06 Thread Tomas Fernandez Lobbe
Severity: High Vendor: The Apache Software Foundation Versions Affected: 5.0.0 to 5.5.5, 6.0.0 to 6.6.5 Description: ConfigAPI allows configuring Solr's JMX server via an HTTP POST request. By pointing it to a malicious RMI server, an attacker could take advantage of Solr's unsafe deserializatio

[SECURITY] CVE-2017-3164 SSRF issue in Apache Solr

2019-02-12 Thread Tomas Fernandez Lobbe
CVE-2017-3164 SSRF issue in Apache Solr Severity: High Vendor: The Apache Software Foundation Versions Affected: Apache Solr versions from 1.3 to 7.6.0 Description: The "shards" parameter does not have a corresponding whitelist mechanism, so it can request any URL. Mitigation: Upgrade to Apach

Re: Exception writing document xxxxxx to the index; possible analysis error.

2018-07-11 Thread Tomas Fernandez Lobbe
Hi Daphne, the “possible analysis error” is a misleading error message (to be addressed in SOLR-12477). The important piece is the “java.lang.ArrayIndexOutOfBoundsException”; it looks like your index may be corrupted in some way. Tomás > On Jul 11, 2018, at 3:01 PM, Liu, Daphne wrote: > > He

Re: User queries end up in filterCache if facetting is enabled

2018-05-09 Thread Tomas Fernandez Lobbe
I'd never noticed this before, but I believe it happens because, once you say `facet=true`, Solr will need the full docset (the set of all matching docs, not just the top matches) and does so by using the filter cache. > On May 3, 2018, at 7:10 AM, Markus Jelsma wrote: > > By the way, the quer

Re: Solr 7.2.1 DELETEREPLICA automatically NRT replica appears

2018-03-07 Thread Tomas Fernandez Lobbe
This shouldn’t be happening. Did you see anything related in the logs? Does the new NRT replica ever become active? Is there a new core created or do you just see the replica in the clusterstate? Tomas Sent from my iPhone > On Mar 7, 2018, at 8:18 PM, Greg Roodt wrote: > > Hi

Re: solr cloud unique key query request is sent to all shards!

2018-02-18 Thread Tomas Fernandez Lobbe
this implicit request handler is configured correctly Any > thoughts, what I might be missing? > > > > On Sun, Feb 18, 2018 at 11:18 PM, Tomas Fernandez Lobbe > wrote: > >> I think real-time get should be directed to the correct shard. Try: >> [COLLECTION]/ge

Re: solr cloud unique key query request is sent to all shards!

2018-02-18 Thread Tomas Fernandez Lobbe
I think real-time get should be directed to the correct shard. Try: [COLLECTION]/get?id=[YOUR_ID] Sent from my iPhone > On Feb 18, 2018, at 3:17 PM, Ganesh Sethuraman > wrote: > > Hi > > I am using Solr 7.2.1. I have 8 shards in two nodes (two different m/c) > using Solr Cloud. The data was
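The real-time get request suggested above can be sketched as a URL built from a collection name and a document id. This is only an illustration: the base URL `http://localhost:8983/solr`, the collection name `demo`, the id `book1` and the helper name `realtime_get_url` are assumptions made for the example, not details from the thread.

```python
from urllib.parse import quote, urlencode


def realtime_get_url(base_url: str, collection: str, doc_id: str) -> str:
    """Build a URL of the form [COLLECTION]/get?id=[YOUR_ID].

    base_url is an assumed Solr root, e.g. http://localhost:8983/solr.
    """
    # Escape the collection name so it stays a single path segment.
    path = f"{base_url.rstrip('/')}/{quote(collection, safe='')}/get"
    # urlencode takes care of escaping the document id.
    return f"{path}?{urlencode({'id': doc_id})}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(realtime_get_url("http://localhost:8983/solr", "demo", "book1"))
```

Passing the unique key through the `id` parameter of the `/get` handler is what the reply proposes, in place of a regular query on the unique key field.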

Re: Request routing / load-balancing TLOG & PULL replica types

2018-02-12 Thread Tomas Fernandez Lobbe
>>> Is my understanding correct? >>> >>> Is this sensible to do, or is it not worth it due to the smart proxying >>> that SolrCloud can do anyway? >>> >>> If the TLOG and PULL replicas are so similar, is there any real advantage >>>

Re: Request routing / load-balancing TLOG & PULL replica types

2018-02-11 Thread Tomas Fernandez Lobbe
On the last question: For Writes: Yes. Writes are going to be sent to the shard leader, and since PULL replicas can’t be leaders, it’s going to be a TLOG replica. If you are using CloudSolrClient, then this routing will be done directly from the client (since it will send the update to the lead

Re: 7.2.1 cluster dies within minutes after restart

2018-02-02 Thread Tomas Fernandez Lobbe
Hi Markus, If the same code that runs OK in 7.1 breaks 7.2.1, it is clear to me that there is some bug in Solr introduced between those releases (maybe an increase in memory utilization? or maybe some decrease in query throughput making threads pile up?). I’d hate to have this issue lost in

Re: Master Slave Replication Issue

2018-02-01 Thread Tomas Fernandez Lobbe
This seems pretty serious. Please create a Jira issue. Sent from my iPhone > On Feb 1, 2018, at 12:15 AM, dennis nalog > wrote: > > Hi, > We are using Solr 7.1 and our Solr setup is master-slave replication. > We encounter this issue that when we disable the replication in master via UI > or U

Re: Mixing simple and nested docs in same update?

2018-01-30 Thread Tomas Fernandez Lobbe
I believe the problem is that: * BlockJoin queries do not know about your “types”; in the BlockJoin query world, everything that’s not a parent (matches the parentFilter) is a child. * All docs indexed before a parent are considered children of that doc. That’s why in your first case it considers “f
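The two rules above can be illustrated with a small sketch that groups a flat, index-ordered list of documents into blocks. It is not Solr code: the function name `group_blocks`, the dictionary-based documents and the `is_parent` predicate (standing in for the parentFilter) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, str]


def group_blocks(
    docs: List[Doc], is_parent: Callable[[Doc], bool]
) -> List[Tuple[Doc, List[Doc]]]:
    """Group index-ordered docs into (parent, children) blocks.

    Every doc that does not match is_parent is treated as a child of the
    next parent that follows it, whatever "type" it carries.
    """
    blocks: List[Tuple[Doc, List[Doc]]] = []
    pending: List[Doc] = []
    for doc in docs:
        if is_parent(doc):
            # All non-parent docs seen since the last parent join this block.
            blocks.append((doc, pending))
            pending = []
        else:
            pending.append(doc)
    # Non-parent docs after the last parent have no block in this sketch.
    return blocks


if __name__ == "__main__":
    docs = [
        {"id": "1", "type": "simple"},
        {"id": "2", "type": "child"},
        {"id": "3", "type": "parent"},
    ]
    for parent, children in group_blocks(docs, lambda d: d["type"] == "parent"):
        print(parent["id"], [c["id"] for c in children])
```

In this toy run the "simple" document ends up in the parent's block alongside the real child, which is the effect described in the reply.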

Re: Limit search queries only to pull replicas

2018-01-08 Thread Tomas Fernandez Lobbe
This feature is not currently supported. I was thinking of implementing it by extending the work done in SOLR-10880. I haven’t had time to work on it yet, though. There is a patch for SOLR-10880 that doesn’t implement support for replica types, but could be used as a base. Tomás > On Jan 8, 2

Re: Solr cloud optimizer

2017-09-07 Thread Tomas Fernandez Lobbe
applies AFAIK): [3] [1] https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/index/TieredMergePolicy.html [2] https://lucene.apache.org/solr/guide/6_6/indexconfig-in-solrconfig.html [3] http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html Tomas > On Sep 7, 2

Re: Request to be added to the ContributorsGroup

2017-08-23 Thread Tomas Fernandez Lobbe
I just added you to the wiki. Note that the official documentation is now in the "solr-ref-guide" directory of the code base, and you can create patches/PRs to it. Tomás > On Aug 23, 2017, at 10:58 AM, Kevin Grimes wrote: > > Hi there, > > I would like to contribute to the Solr wiki. My user

Re: Query not working with DatePointField

2017-06-15 Thread Tomas Fernandez Lobbe
The query field:* doesn't work with point fields (numerics or dates), only exact or range queries are supported, so an equivalent query would be field:[* TO *] Sent from my iPhone > On Jun 15, 2017, at 5:24 PM, Saurabh Sethi wrote: > > Hi, > > We have a fieldType specified for date. Earlier
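As a sketch of the rewrite described above, the helper below always emits the open-ended range form. The name `exists_query` and the field name `created_dt` are hypothetical; the only behaviour taken from the message is that `field:[* TO *]` is the equivalent of `field:*` for point fields.

```python
from urllib.parse import urlencode


def exists_query(field: str) -> str:
    """Return a 'field has any value' query in the open-ended range form."""
    return f"{field}:[* TO *]"


if __name__ == "__main__":
    # "created_dt" is a made-up field name for illustration.
    q = exists_query("created_dt")
    print(q)
    # The same query as a URL-encoded q parameter.
    print(urlencode({"q": q}))
```

The range form is the one the message says is supported for point fields, so a single helper like this avoids depending on the field type.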

Re: Solr 6: how to get SortedSetDocValues from index by field name

2017-06-14 Thread Tomas Fernandez Lobbe
Hi, To answer your first question: “How do I get SortedSetDocValues from index by field name?”, DocValues.getSortedSet(LeafReader reader, String field) (which is what you want to use to assert the existence and type of the DV) will give you the dv instance for a single leaf reader. In general,

Re: Got a 404 trying to update a solr. 6.5.1 server. /solr/update not found.

2017-06-05 Thread Tomas Fernandez Lobbe
I think you are missing the collection name in the path. Tomás Sent from my iPhone > On Jun 5, 2017, at 9:08 PM, Phil Scadden wrote: > > Simple piece of code. Had been working earlier (though against a 6.4.2 > instance). > > ConcurrentUpdateSolrClient solr = new > ConcurrentUpdateSolrC
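A quick way to spot the problem described above is to check whether the update URL has a segment between the context path and the handler. The sketch below assumes a `solr` context path and an `update` handler as in the thread subject; the function name `has_collection_segment` and the collection name `mycollection` are made up for the example.

```python
from urllib.parse import urlsplit


def has_collection_segment(
    url: str, context: str = "solr", handler: str = "update"
) -> bool:
    """Return True if the path looks like /<context>/<collection>/<handler>."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return (
        len(segments) == 3
        and segments[0] == context
        and segments[2] == handler
    )


if __name__ == "__main__":
    # The first URL mirrors the path from the subject line; the second adds
    # a hypothetical collection name.
    print(has_collection_segment("http://localhost:8983/solr/update"))
    print(has_collection_segment("http://localhost:8983/solr/mycollection/update"))
```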

Re: A working example to play with Naive Bayes classifier

2016-07-15 Thread Tomas Ramanauskas
Hi, Alessandro, sorry for the delay. What do you mean? As I mentioned earlier, I followed a super simple set of steps. 1. Download Solr 2. Configure classification 3. Create some documents using curl over HTTP. Is it difficult to reproduce the steps / problem? Tomas > On 23 Jun 2

Re: A working example to play with Naive Bayes classifier

2016-06-22 Thread Tomas Ramanauskas
nsume.java:156)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)\n\tat java.lang.Thread.run(Thread.java:745)\n","code":500}} Tomas On 22 Jun 2016, at 17:

Re: A working example to play with Naive Bayes classifier

2016-06-22 Thread Tomas Ramanauskas
e_t":["The Way of Kings"], "author_s":"Brandon Sanderson", "pubyear_i":2010, "ISBN_s":"978-0-7653-2635-5", "_version_":1537854598189940736}} I don’t see “cat_s” field in the results at all. Tomas On 22 J

Re: A working example to play with Naive Bayes classifier

2016-06-22 Thread Tomas Ramanauskas
I also tried this configuration, but couldn't get the feature to work: classification title_t,author_s cat_s bayes Tomas On 22 Jun 2016, at 13:46, Tomas Ramanauskas mailto:tomas.ramanaus...@springer.com>> wrote: P.S. The version

Re: A working example to play with Naive Bayes classifier

2016-06-22 Thread Tomas Ramanauskas
' {"responseHeader":{"status":0,"QTime":0}} $ curl http://localhost:8983/solr/demo/get?id=book1 { "doc": { "id":"book1", "title_t":["The Way of Kings"], "author_s":"Brandon Sanderson",

A working example to play with Naive Bayes classifier

2016-06-22 Thread Tomas Ramanauskas
Hi, everyone, would someone be able to share a working example (step by step) that demonstrates the use of Naive Bayes classifier in Solr? I followed this Blog post: https://alexbenedetti.blogspot.co.uk/2015/07/solr-document-classification-part-1.html?showComment=1464358093048#c248990230208500

Re: SOLR 4.0 + ReversedWildcardFilterFactory + DefaultSolrHighlighter + multibyte chars => crash?

2012-10-29 Thread Tomas Zerolo
On Mon, Oct 29, 2012 at 08:55:27AM -0700, Ahmet Arslan wrote: > Hi Tomas, > > I think this is same case Marian reported before. > > https://issues.apache.org/jira/browse/SOLR-3193 > https://issues.apache.org/jira/browse/SOLR-3901 Thanks, Ahmet. Yes, by the descriptions they

SOLR 4.0 + ReversedWildcardFilterFactory + DefaultSolrHighlighter + multibyte chars => crash?

2012-10-29 Thread Tomas Zerolo
Hi, SOLR gurus we're experiencing a crash with SOLR 4.0 whenever the results contain multibyte characters (more precisely: German umlauts, utf-8 encoded). The crashes only occur when using ReversedWildcardFilterFactory (which is necessary in 4.0 to be able to have wildcards at the beginning of th

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread Tomas Zerolo
On Fri, Sep 07, 2012 at 08:50:58AM +0200, Paul Libbrecht wrote: > Erick, > > I think that should be described differently... > You need to set-up protected access for some paths. > /update is one of them. > And you could make this protected at the jetty level or using Apache proxies > and rewrite

Re: AW: Indexing wildcard patterns

2012-08-13 Thread Tomas Zerolo
> stored in the table data, but used at query time I don't know about others, but PostgreSQL copes just fine: | tomas@rasputin:~$ psql template1 | psql (9.1.2) | Type "help" for help. | | template1=# create database test; | CREATE DATABASE | template1=# create table foo

Re: Lexical analysis tools for German language data

2012-04-12 Thread Tomas Zerolo
On Thu, Apr 12, 2012 at 03:46:56PM +, Michael Ludwig wrote: > > Von: Walter Underwood > > > German noun decompounding is a little more complicated than it might > > seem. > > > > There can be transformations or inflections, like the "s" in > > "Weihnachtsbaum" (Weihnachten/Baum). > > I remembe

Re: Solr as an part of api to unburden databases

2012-02-15 Thread Tomas Zerolo
On Wed, Feb 15, 2012 at 11:48:14AM +0100, Ramo Karahasan wrote: > Hi, > > > > does anyone of the mailing list users use Solr as an API to avoid database > queries? [...] Like in a... cache? Why not use a cache then? (memcached, for example, but there are more). Regards -- tomás

Re: how to avoid OOM while merge index

2012-01-09 Thread Tomas Zerolo
On Mon, Jan 09, 2012 at 01:29:39PM +0800, James wrote: > I am building the Solr index on Hadoop, and at the reduce step I run the task > that merges the indexes. Each part of the index is about 1G, and I have 10 indexes to > merge together. I always get Java heap memory exhausted; the heap > size i

Re: Poor performance on distributed search

2011-12-20 Thread Tomas Zerolo
On Mon, Dec 19, 2011 at 01:32:22PM -0800, ku3ia wrote: > >>Uhm, either I misunderstand your question or you're doing > >>a lot of extra work for nothing > > >>The whole point of sharding is exactly to collect the top N docs > >>from each shard and merge them into a single result [...] > >>

Re: Don't snowball depending on terms

2011-11-29 Thread Tomas Zerolo
On Tue, Nov 29, 2011 at 01:53:44PM -0500, François Schiettecatte wrote: > It won't and depending on how your analyzer is set up the terms are most > likely stemmed at index time. > > You could create a separate field for unstemmed terms though, or use a less > aggressive stemmer such as EnglishM

Re: Filtering results based on a set of values for a field

2011-08-18 Thread Tomas Zerolo
On Thu, Aug 18, 2011 at 02:32:48PM -0400, Erick Erickson wrote: > Hmmm, I'm still not getting it... > > You have one or more lists. These lists change once a month or so. Are > you trying > to include or exclude the documents in these lists? In our specific case to include *only* the documents ha

Re: Filtering results based on a set of values for a field

2011-08-18 Thread Tomas Zerolo
On Thu, Aug 18, 2011 at 08:36:08AM -0400, Erick Erickson wrote: > How does this list of authors get selected? The reason I'm asking is > I'm wondering > if you can "define the problem away". In other words, I'm wondering if this > is an XY problem (http://people.apache.org/~hossman/#xyproblem). :-

Re: Filtering results based on a set of values for a field

2011-08-17 Thread Tomas Zerolo
On Tue, Aug 16, 2011 at 07:56:51AM +, tomas.zer...@axelspringer.de wrote: > Hello, Solrs > > we are trying to filter out documents written by (one or more of) the authors > from > a mediumish list (~2K). The document set itself is in the millions. [...] Sorry. Forgot to say that we are usin

Re: Faceted Search Patent Lawsuit - Please Read

2011-08-17 Thread Tomas Zerolo
On Tue, Aug 16, 2011 at 03:58:29PM -0400, Grant Ingersoll wrote: > I know you mean well and are probably wondering what to do next [...] Still, a short heads-up like Johnson's would seem OK? After all, this is of concern to us all. Regards -- tomás

Re: Searching with AND + OR and spaces

2010-11-12 Thread Tomas Fernandez Lobbe
Hi Jon, for the first query: title:"Call of Duty" OR subhead:"Call of Duty" If you are sure that you have documents with the same phrase, make sure you don't have a problem with stop words and with token positions. I recommend checking the analysis page in the Solr admin. Pay special attent

Re: analyzer type

2010-11-12 Thread Tomas Fernandez Lobbe
For a field type, the analysis applied at index time (when you are adding documents to Solr) can be slightly different from the analysis applied at query time (when a user executes a query). For example, if you know you are going to be indexing HTML pages, you might need to use the HTMLStripCh

Re: Search with accent

2010-11-10 Thread Tomas Fernandez Lobbe
eady working by default? Is this something to config on my schema.xml? Tks!! On Wed, Nov 10, 2010 at 6:40 PM, Tomas Fernandez Lobbe < tomasflo...@yahoo.com.ar> wrote: > That's what the ASCIIFoldingFilter does, it removes the accents, that's why > you > have to add it to t

Re: Search with accent

2010-11-10 Thread Tomas Fernandez Lobbe
e same way you should be able to find all documents as you require. On 10 November 2010 20:25, Tomas Fernandez Lobbe wrote: > It looks like ISOLatin1AccentFilter is deprecated on Solr 1.4.1, If you are > on > that version, you should use the ASCIIFoldingFilter instead. > > Like with

Re: Search with accent

2010-11-10 Thread Tomas Fernandez Lobbe
ith accent Tomas, Let me try to explain better. For example. - I have 10 documents, where 7 have the word pereque (without accent) and 3 have the word perequê (with accent) When I do a search pereque, solr is returning just 7, and when I do a search perequê solr is returning 3. But for me, t

Re: Search with accent

2010-11-10 Thread Tomas Fernandez Lobbe
I don't understand. When the user searches for perequê, do you want the results for both perequê and pereque? If that's the case, any field type with ISOLatin1AccentFilterFactory should work. The accent should be removed at index time and at query time (make sure the filter is being applied in both cases)
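The idea of removing accents on both sides can be illustrated outside Solr with a small sketch. It is not how ISOLatin1AccentFilterFactory is implemented; it only approximates accent folding with Unicode decomposition from the Python standard library, and the names `fold_accents` and `matches` are invented for the example.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks after decomposition, e.g. 'ê' becomes 'e'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(indexed_term: str, query_term: str) -> bool:
    """Compare terms after folding both, mimicking index-time and query-time folding."""
    return fold_accents(indexed_term) == fold_accents(query_term)


if __name__ == "__main__":
    print(fold_accents("perequê"))
    print(matches("perequê", "pereque"))
```

When the folding is applied to both the indexed term and the query term, the accented and unaccented spellings compare equal, which is why the filter needs to be present in both analysis chains.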

Re: How to use protwords.txt

2010-08-31 Thread Tomas
Shuai, are you using a WordDelimiterFilterFactory in the analysis? That's the filter that might be transforming "met1" into "met" and "1" and not the stemmer. Check the Analysis page on Solr admin. From: Shuai Weng To: solr-user@lucene.apache.org Sent: Mon

Stress Test Solr

2010-08-02 Thread Tomas
Hi All, we've been building an open source tool for load tests on Solr installations. The tool is called SolrMeter. It's on Google Code at http://code.google.com/p/solrmeter/. Here is some information about it: SolrMeter is a stress testing / performance benchmarking tool for Apache Solr instal

Index size on disk

2010-03-11 Thread Tomas
Hello, I needed an easy way to see the index size (the actual size on disk, not just the number of documents indexed) and as I didn't find anything for doing that in the documentation or on the list, I coded a fast solution. I added the Index size as a statistic of the searcher, that way the va

Re: user feedback in solr

2010-02-05 Thread Tomas
I'm responding to this old mail because I implemented something like this, similar to http://wiki.apache.org/solr/SolrSnmp . Maybe we could discuss if this is a good solution. I'm using Solr 1.4 on a JBoss 4.0.5 and Java 1.5. In my particular case, what I'm trying to find out is how often the use