Re: Checking Optimal Values for BM25

2016-12-15 Thread Sascha Szott
Hi Furkan, in order to change the BM25 parameter values k1 and b, the following XML snippet needs to be added to your schema.xml configuration file: <similarity class="solr.BM25SimilarityFactory"><float name="k1">1.3</float><float name="b">0.7</float></similarity> It is even possible to specify the SimilarityFactory on individual index fields. See [1] for more details. Best, Sascha [1] htt
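To get a feeling for what k1 and b do before touching schema.xml, the per-term BM25 score can be sketched in a few lines of Python. This is only an illustration of the formula in the form I believe Lucene's BM25Similarity uses; the function names and the sample statistics are made up, and Lucene's lossy encoding of the field length is deliberately left out.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """Inverse document frequency, log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_term_score(freq, field_length, avg_field_length,
                    doc_freq, doc_count, k1=1.2, b=0.75):
    """Score contribution of one term in one field (hypothetical helper).

    k1 controls how quickly the term frequency saturates,
    b controls how strongly long fields are penalised (b=0: not at all).
    """
    length_norm = k1 * (1 - b + b * field_length / avg_field_length)
    return bm25_idf(doc_freq, doc_count) * freq * (k1 + 1) / (freq + length_norm)


if __name__ == "__main__":
    # Invented statistics: one matching term, 1000 docs, term occurs in 10 of them.
    for length in (5, 9, 50):
        print(length, bm25_term_score(1, length, 9, 10, 1000, k1=1.3, b=0.7))
```

With b set to 0 the field length drops out of the score entirely, which is a quick way to check whether length normalisation is what changes a ranking.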

Re: field length within BM25 score calculation in Solr 6.3

2016-12-15 Thread Sascha Szott
fieldLength does not match 8. Is there some "magic" applied to the value of field length that goes beyond the standard BM25 score formula? If so, what is the idea behind this modification? If not, is this a Lucene / Solr bug? Best regards, Sascha -- Sascha Szott :: KOBV/ZIB :: +49 30 84185-457

field length within BM25 score calculation in Solr 6.3

2016-12-04 Thread Sascha Szott
Hi folks, my Solr index consists of one document with a single valued field "title" of type "text_general". The title field was indexed with the content: 1 2 3 4 5 6 7 8 9. The field type text_general uses a StandardTokenizer which should result in 9 tokens. The corresponding length of field titl

Re: Problem of facet on 170M documents

2013-11-02 Thread Sascha SZOTT
Hi Ming, which Solr version are you using? In case you use one of the latest versions (4.5 or above) try the new parameter facet.threads with a reasonable value (4 to 8 gave me a massive performance speedup when working with large facets, i.e. nTerms >> 10^7). -Sascha Mingfeng Yang wrote: > I h
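A facet request with facet.threads set could be assembled like this. It is a minimal sketch: the base URL and the field names are placeholders, and the helper function is not part of any Solr client library.

```python
from urllib.parse import urlencode


def facet_url(base_url, fields, threads):
    """Build a facet-only select URL (hypothetical helper).

    base_url and fields are placeholders; facet.threads is the
    parameter mentioned above (Solr 4.5 or later).
    """
    params = [("q", "*:*"), ("rows", 0), ("facet", "true")]
    params += [("facet.field", field) for field in fields]
    params.append(("facet.threads", threads))
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(facet_url("http://localhost:8983/solr", ["author", "subject"], 4))
```

Trying a few values in the suggested range of 4 to 8 and comparing the QTime of the responses is probably the easiest way to find a reasonable setting.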

intersection of filter queries with raw query parser

2013-05-31 Thread Sascha Szott
Hi folks, is it possible to use the raw query parser with a disjunctive filter query? Say, I have a field 'foo' and two values 'v1' and 'v2' (the field values are free text and can contain any character). What I want is to retrieve all documents satisfying fq=foo:(v1 OR v2). In case only one f

Re: Does SolrCloud support distributed IDFs?

2012-10-22 Thread Sascha SZOTT
Hi Mark, Mark Miller wrote: Still waiting on that issue. I think Andrzej should just update it to trunk and commit - it's option and defaults to off. Go vote :) Sounds like the problem is already solved and the remaining work consists of code integration? Can somebody estimate how much work tha

Does SolrCloud support distributed IDFs?

2012-10-21 Thread Sascha Szott
Hi folks, a known limitation of the "old" distributed search feature is the lack of distributed/global IDFs (#SOLR-1632). Does SolrCloud bring some improvements in this direction? Best regards, Sascha

Re: indexing documents in Apache Solr using php-curl library

2012-07-02 Thread Sascha SZOTT
Hi, perhaps it's better to use a PHP Solr client library. I used https://code.google.com/p/solr-php-client/ in a project of mine and it worked just fine. -Sascha Asif wrote: > I am indexing the file using php curl library. I am stuck here with the code > echo "Stored in: " . "upload/" . $_F

Re: Prefix query is not analysed?

2012-07-02 Thread Sascha Szott
Hi, I suppose you are using Solr 3.6. Then take a look at http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/ -Sascha Alok Bhandari schrieb: Thanks for reply. If I check the debug query through solr-admin I can see that the lower case

Re: Prefix query is not analysed?

2012-07-02 Thread Sascha Szott
Hi, wildcard and fuzzy queries are not analyzed. -Sascha Alok Bhandari schrieb: Hello , I am pushing "Chuck Follett'.?.?" in solr and when I query for this field with query string field:Follett'.* I am getting 0 results. field type declared is and parser we are using is EdisMax .

Re: querying thru solritas gives me zero results

2012-06-30 Thread Sascha Szott
Hi, Solritas uses the dismax query parser. The dismax config parameter 'qf' specifies the index fields to be searched in. Make sure that 'name' is your default search field. -Sascha Giovanni Gherdovich schrieb: Hi all, this morning I was very proud of myself since I managed to set up sol

Re: how to retrieve a doc from its docID ?

2012-06-30 Thread Sascha Szott
Hi, did you include the fl parameter in the Solr query URL? If that's the case make sure that the field name 'text' is mentioned there. You should also make sure that the field definition (in schema.xml) for 'text' says stored="true", otherwise the field will not be returned. -Sascha Giovan

Re: Searching for digits with strings

2012-06-27 Thread Sascha Szott
Hi, as far as I know Solr does not provide such a feature. If you cannot make any assumptions on the numbers, choose an appropriate library that is able to transform between numerical and non-numerical representations and populate the search field with both versions at index-time. -Sascha Ali

Re: getting started

2011-06-16 Thread Sascha SZOTT
Hi Mari, it depends ... * How many records are stored in your MySQL databases? * How often will updates occur? * How many db records / index documents are changed per update? I would suggest starting with a single Solr core first. Thereby, you can concentrate on the basics and do not need to d

Re: Search failing for matched text in large field

2011-03-23 Thread Sascha Szott
become even slower. Is there any way to get all the results in a way that is quick enough that my user won't get bored waiting? Is there some optimization of this coming in solr 3.0? On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott wrote: Hi Paul, did you increase the value of the maxFieldL

Re: Search failing for matched text in large field

2011-03-23 Thread Sascha Szott
Hi Paul, did you increase the value of the maxFieldLength parameter in your solrconfig.xml? -Sascha On 23.03.2011 17:05, Paul wrote: I'm using solr 1.4.1. I have a document that has a pretty big field. If I search for a phrase that occurs near the start of that field, it works fine. If I se

Re: Solr coding

2011-03-23 Thread Sascha Szott
Hi, depending on your needs, take a look at Apache ManifoldCF. It adds document-level security on top of Solr. -Sascha On 23.03.2011 14:20, satya swaroop wrote: Hi All, As for my project Requirement i need to keep privacy for search of files so that i need to modify the code of so

Re: Index MS office

2011-02-02 Thread Sascha Szott
Hi, have a look at Solr's ExtractingRequestHandler: http://wiki.apache.org/solr/ExtractingRequestHandler -Sascha On 02.02.2011 16:49, Thumuluri, Sai wrote: Good Morning, I am planning to get started on indexing MS office using ApacheSolr - can someone please direct me where I should start?

Re: Malformed XML with exotic characters

2011-02-01 Thread Sascha Szott
works perfectly with the same returned data in some AJAX environment. On Tuesday 01 February 2011 18:29:06 Sascha Szott wrote: Hi folks, I've made the same observation when working with Solr's ExtractingRequestHandler on the command line (no browser interaction). When issuing the followi

Re: Malformed XML with exotic characters

2011-02-01 Thread Sascha Szott
Hi folks, I've made the same observation when working with Solr's ExtractingRequestHandler on the command line (no browser interaction). When issuing the following curl command curl 'http://mysolrhost/solr/update/extract?extractOnly=true&extractFormat=text&wt=xml&resource.name=foo.pdf' --da

Re: missing type check when working with pint field type

2011-01-18 Thread Sascha Szott
's happening is that the field is being indexed as a text type and, I suspect, being tokenized. The error you're getting is when trying to sort against a tokenized field which is undefined. At least that's my story and I'm sticking to it Best Erick On Tue, Jan 18, 2011 at 8:10

missing type check when working with pint field type

2011-01-18 Thread Sascha Szott
Hi folks, I've noticed an unexpected behavior while working with the various built-in integer field types (int, tint, pint). It seems that the first two are subject to type checking, while the last one is not. I'll give you an example based on the example schema that is shipped with

Re: post search using solrj

2010-12-30 Thread Sascha SZOTT
Hi Don, you could give the HTTP method to be used as a second argument to the QueryRequest constructor: [http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/request/QueryRequest.html#QueryRequest(org.apache.solr.common.params.SolrParams,%20org.apache.solr.client.solrj.SolrRequest.ME

Re: DataImportHandler in Solr 1.4.1: exception handling in FileListEntityProcessor

2010-08-11 Thread Sascha Szott
FullImport(DataImporter.java:331) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) -Sascha On 11.08.2010 15:18, Sascha Szott wrote: Hi folks, why does FileListEntityProcess

DataImportHandler in Solr 1.4.1: exception handling in FileListEntityProcessor

2010-08-11 Thread Sascha Szott
Hi folks, why does FileListEntityProcessor ignore onError="continue" and abort indexing if a directory or a file does not exist? I'm using both XPathEntityProcessor and FileListEntityProcessor with onError set to continue. In case a directory or file is not present an Exception is thrown an

Re: problem with formulating a negative query

2010-07-06 Thread Sascha Szott
Hi, Chris Hostetter wrote: AND, OR, and NOT are just syntactic-sugar for modifying the MUST, MUST_NOT, and SHOULD. The default op of "OR" only affects the first clause of your query (R) because it doesn't have any modifiers -- Thanks for pointing that out! -Sascha the second clause has that

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Sascha Szott
Hi, take a look inside Solr's log file. Are there any error messages with respect to the update request? Furthermore, you could try the following two commands instead: curl "http://host:port/solr/update" --form-string stream.body="<delete><query>uid:6-HOST*</query></delete>" curl "http://host:port/solr/update" --form-s

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Sascha Szott
Hi, does /select?q=uid:6-HOST* return any documents? -Sascha bbarani wrote: Hi, Thanks a lot for your reply.. I tried the below query update?commit=true%20-H%20"Content-Type:%20text/xml"%20--data-binary%20'uid:6-HOST*' But even now none of the documents are getting deleted.. Am I forming

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Sascha Szott
Hi, you can delete all docs that match a certain query: <delete><query>uid:6-HOST*</query></delete> -Sascha bbarani wrote: Hi, I am trying to delete a group of documents using wildcard. Something like update?commit=true%20-H%20"Content-Type:%20text/xml"%20--data-binary%20'6-HOST*' I want to delete all documents which co
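For those who prefer a script over curl, a delete-by-query update request could be prepared roughly as follows. This is a sketch under assumptions: the update URL is a placeholder, the helper names are invented, and the request is only constructed here, not sent.

```python
from urllib.request import Request
from xml.sax.saxutils import escape


def delete_by_query_body(query):
    """XML update message that deletes every document matching the query."""
    return "<delete><query>" + escape(query) + "</query></delete>"


def delete_by_query_request(update_url, query):
    """Prepare (but do not send) the POST request for the update handler.

    update_url is a placeholder such as http://host:port/solr/update.
    """
    return Request(
        update_url + "?commit=true",
        data=delete_by_query_body(query).encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    req = delete_by_query_request("http://localhost:8983/solr/update", "uid:6-HOST*")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

Note that the XML goes into the request body with a text/xml content type. The curl options (-H, --data-binary) must not be URL-encoded into the query string, which is what went wrong in the quoted attempt.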

Re: problem with formulating a negative query

2010-06-30 Thread Sascha Szott
results. HTH Erick On Tue, Jun 29, 2010 at 12:40 PM, Sascha Szott wrote: Hi Ahmet, it works, thanks a lot! To be honest, I have no idea what's the problem with defType=lucene&q.op=OR&df=topic&q=R NOT [* TO *] -Sascha Ahmet Arslan wrote: I have a (multi-valued) field topic

Re: problem with formulating a negative query

2010-06-29 Thread Sascha Szott
Hi Ahmet, it works, thanks a lot! To be honest, I have no idea what's the problem with defType=lucene&q.op=OR&df=topic&q=R NOT [* TO *] -Sascha Ahmet Arslan wrote: I have a (multi-valued) field topic in my index which does not need to exist in every document. Now, I'm struggling with formulating

problem with formulating a negative query

2010-06-29 Thread Sascha Szott
Hi folks, I have a (multi-valued) field topic in my index which does not need to exist in every document. Now, I'm struggling with formulating a query that returns all documents that either have no topic field at all *or* whose topic field value is R. Unfortunately, the query /select?q={!lu
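For reference, a commonly used way to express "field is missing or has a given value" with the Lucene query parser is to subtract the range query from a match-all clause inside parentheses. The sketch below builds such a query string; it shows the general idiom, not necessarily the exact query suggested in this thread, and the helper name is made up.

```python
def missing_or_equals(field, value):
    """Query matching docs where `field` is absent or equals `value`.

    A purely negative clause matches nothing on its own, so the
    "no value at all" part is written as (*:* -field:[* TO *]).
    """
    quoted = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{quoted}" OR (*:* -{field}:[* TO *])'


if __name__ == "__main__":
    print(missing_or_equals("topic", "R"))
```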

Re: Specifiying multiple mlt.fl fields

2010-06-19 Thread Sascha Szott
Hi Darren, try mlt.fl=field1 field2 Best, Sascha Darren Govoni wrote: Hi, I read the wiki and tried about a dozen variations such as: ...&mlt.fl=field1&mlt.fl=field2 and ...&mlt.fl=field1,field2&... to specify more than one MLT field and it won't take. What's the trick? Also, how to do

Re: federated / meta search

2010-06-18 Thread Sascha Szott
's but if each node is queried with fields that it supports, it should return useful results. [1]: http://wiki.apache.org/solr/DistributedSearch Cheers. -Original message----- From: Sascha Szott Sent: Thu 17-06-2010 19:44 To: solr-user@lucene.apache.org; Subject: federated / meta search

federated / meta search

2010-06-17 Thread Sascha Szott
Hi folks, if I'm seeing it right Solr currently does not provide any support for federated / meta searching. Therefore, I'd like to know if anyone has already put efforts into this direction? Moreover, is federated / meta search considered a scenario Solr should be able to deal with at all or

Re: strange results with query and hyphened words

2010-05-31 Thread Sascha Szott
the WordDelimiterFilter. What about using the PatternReplaceCharFilter at query time to eliminate all intra-word hyphens? -Sascha Sascha Szott wrote: Hi Markus, the default-config for index is: and for query: That's not true. The default configuration for query-time processing is: By

Re: strange results with query and hyphened words

2010-05-31 Thread Sascha Szott
imiterFilterFactory's catenate* parameters should only be used in the index-time analysis stack. Otherwise the strange behaviour you mentioned (a search for profi-auskunft is translated into "profi followed by (auskunft or profiauskunft)") will occur. Best, Sascha -Ursprünglich

Re: strange results with query and hyphened words

2010-05-30 Thread Sascha Szott
Hi Markus, I was facing the same problem a few days ago and found an explanation in the mail archive that clarifies my question regarding the usage of Solr's WordDelimiterFilterFactory: http://markmail.org/message/qoby6kneedtwd42h Best, Sascha markus.rietz...@rzf.fin-nrw.de wrote: i am won

Re: sort by field length

2010-05-26 Thread Sascha Szott
tigate "exceptional" data items that are not appropriately handled by my curation process. Best, Sascha Best Erick On Tue, May 25, 2010 at 3:40 AM, Sascha Szott wrote: Hi Erick, Erick Erickson wrote: Are you sure you want to recompute the length when sorting? It's

Re: Faceted search not working?

2010-05-25 Thread Sascha Szott
explicitly mentioned (i.e., QueryComponent and FacetComponent in your example). To avoid this, use name="last-components" instead. -Sascha Jean-Sebastien Vachon wrote: Is the FacetComponent loaded at all? query facet On 2010-05-25, at 3:32 AM, Sascha Szot

Re: Highlighting is not happening

2010-05-25 Thread Sascha Szott
onse which contains query should be bold Regards Prakash -Original Message----- From: Sascha Szott [mailto:sz...@zib.de] Sent: Monday, May 24, 2010 10:55 PM To: solr-user@lucene.apache.org Subject: Re: Highlighting is not happening Hi Prakash, can you provide 1. the definition of th

Re: sort by field length

2010-05-25 Thread Sascha Szott
uery time requires only a simple lookup. -Sascha But you could consider payloads for storing the length, although that would still be redundant... Best Erick On Mon, May 24, 2010 at 8:30 AM, Sascha Szott wrote: Hi folks, is it possible to sort by field length without having to (red

Re: Faceted search not working?

2010-05-25 Thread Sascha Szott
Hi Birger, Birger Lie wrote: I don't think the bolean fields is mapped to "on" and "off" :) You can use true and on interchangeably. -Sascha -birger -Original Message- From: Ilya Sterin [mailto:ster...@gmail.com] Sent: 24. mai 2010 23:11 To: solr-user@lucene.apache.org Subject: Fa

Re: Faceted search not working?

2010-05-24 Thread Sascha Szott
Hi Ilya, Ilya Sterin wrote: I'm trying to perform a faceted search without any luck. Result set doesn't return any facet information... http://localhost:8080/solr/select/?q=title:*&facet=on&facet.field=title I'm getting the result set, but no facet information present? Is there something else

Re: Highlighting is not happening

2010-05-24 Thread Sascha Szott
al Message- From: Sascha Szott [mailto:sz...@zib.de] Sent: Monday, May 24, 2010 10:29 PM To: solr-user@lucene.apache.org Subject: Re: Highlighting is not happening Hi Prakash, more importantly, check the field type and its associated analyzer. In case you use a "non-tokenized" type (e.

Re: Highlighting is not happening

2010-05-24 Thread Sascha Szott
Hi Prakash, more importantly, check the field type and its associated analyzer. In case you use a "non-tokenized" type (e.g., string), highlighting will not appear if only a partial field match exists (only exact matches, i.e. the query coincides with the field value, will be highlighted). If

sort by field length

2010-05-24 Thread Sascha Szott
Hi folks, is it possible to sort by field length without having to (redundantly) save the length information in a separate index field? At first, I thought to accomplish this using a function query, but I couldn't find an appropriate one. Thanks in advance, Sascha

Re: Wildcard queries

2010-05-21 Thread Sascha Szott
all stems. but, this is impossible, since the * operator allows an infinite language. On Fri, May 21, 2010 at 10:11 AM, Sascha Szott wrote: Hi folks, what's the idea behind the fact that no text analysis (e.g. lowercasing) is performed on wildcarded search terms? In my context this behaviou

Wildcard queries

2010-05-21 Thread Sascha Szott
Hi folks, what's the idea behind the fact that no text analysis (e.g. lowercasing) is performed on wildcarded search terms? In my context this behaviour seems to be counter-intuitive (I guess that's the case in the majority of applications) and my application needs to lowercase any input ter
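The client-side workaround mentioned above (lowercasing wildcard terms before they reach Solr) might look like this. It is a rough sketch that assumes whitespace-separated clauses and only handles the simple cases; quoted phrases and escaped characters are not treated specially, and the function name is made up.

```python
_OPERATORS = {"AND", "OR", "NOT", "TO"}


def lowercase_wildcard_terms(query):
    """Lowercase only those terms that contain a wildcard (* or ?).

    Field names and boolean operators are left untouched because the
    query parser treats them case-sensitively. Runs of whitespace are
    collapsed to single spaces as a side effect.
    """
    def fix(token):
        if token in _OPERATORS or not any(ch in token for ch in "*?"):
            return token
        field, sep, term = token.rpartition(":")
        return field + sep + term.lower()

    return " ".join(fix(token) for token in query.split())


if __name__ == "__main__":
    print(lowercase_wildcard_terms("title:Foll* AND Author"))
```

This only helps if the index-time analysis of the field lowercases its tokens as well.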

Re: How to tell which field matched?

2010-05-15 Thread Sascha Szott
Hi, I'm not sure if debugQuery=on is a feasible solution in a production environment, as generating such extra information requires a reasonable amount of computation. -Sascha Jon Baer wrote: Does the standard debug component (?debugQuery=on) give you what you need? http://wiki.apache.org/

Re: Autosuggest

2010-05-15 Thread Sascha Szott
Hi, maybe you would like to have a look at solr.ShingleFilterFactory [1] to expand your autosuggest to more than one term. -Sascha [1] http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory Blargy wrote: Thanks for your help and especially your analyzer.. p

Re: Solr Schema Question

2010-04-17 Thread Sascha Szott
Hi Serdar, take a look at Solr's DataImportHandler: http://wiki.apache.org/solr/DataImportHandler Best, Sascha Serdar Sahin wrote: Hi, I am rather new to Solr and have a question. We have around 200.000 txt files which are placed into the file cloud. The file path is something similar to th

Re: StreamingUpdateSolrServer hangs

2010-04-16 Thread Sascha Szott
you! Anything that would help us reproduce this is much appreciated. Is there anyone else who has experienced the same problem? -Sascha On Fri, Apr 16, 2010 at 8:57 AM, Sascha Szott wrote: Hi Yonik, Yonik Seeley wrote: Stephen, were you running stock Solr 1.4, or did you apply any of th

Re: StreamingUpdateSolrServer hangs

2010-04-16 Thread Sascha Szott
Hi Yonik, Yonik Seeley wrote: Stephen, were you running stock Solr 1.4, or did you apply any of the SolrJ patches? I'm trying to figure out if anyone still has any problems, or if this was fixed with SOLR-1711: I'm using the latest trunk version (rev. 934846) and constantly running into the sam

How to sort facet values lexicographically in descending order?

2010-03-11 Thread Sascha Szott
Hi folks, is there a way to sort facet values lexicographically in descending order? If it's not possible right now, are there any feasible workarounds to accomplish this? Note: I've seen issue SOLR-1672, but it does not solve my problem since it deals with facet counts only. Best, Sascha

Re: (default) maximum chars per field

2010-02-05 Thread Sascha Szott
markus.rietz...@rzf.fin-nrw.de wrote: ok, i was looking for all types of "max" but somehow didn't see the maxFieldLength. this is a global parameter, right? can this be defined on a field basis? It's a global parameter counting the maximum number of tokens(!) - not the number of characters
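The effect of a token-based limit can be illustrated with a toy model. The sketch below assumes plain whitespace tokenization and an arbitrary limit; it does not reproduce Solr's actual analysis chain, and the function name is invented.

```python
def indexed_tokens(text, max_field_length):
    """Toy model: only the first max_field_length tokens get indexed."""
    return text.split()[:max_field_length]


if __name__ == "__main__":
    text = " ".join(["filler"] * 10 + ["needle"])
    # "needle" is the 11th token, so a limit of 10 drops it from the index.
    print("needle" in indexed_tokens(text, 10))
    print("needle" in indexed_tokens(text, 11))
```

This is why a phrase near the start of a very large field can be found while one near the end cannot, regardless of how many characters the field contains.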

Re: Deploying Solr 1.3 in JBoss 5

2010-02-05 Thread Sascha Szott
to deploy solr 1.3 to jBoss 5? I should update the wiki page too, maybe. Thank you very much, Sascha. Bye L.M. On 2 February 2010 18:02, Sascha Szott wrote: Luca Molteni wrote: Actually, if I hard-code the value, it gives me the same error... interesting. According to the error message: Th

Re: Deploying Solr 1.3 in JBoss 5

2010-02-02 Thread Sascha Szott
he order of elements within env-entry (env-entry-value before env-entry-type)? -Sascha On 2 February 2010 17:14, Sascha Szott wrote: Hi, I'm not sure if that's a Solr issue. However, what happens if you set env-entry-value to C:/mypath/solr instead of ${solr.home.myhome}? -Sascha

Re: java.lang.NullPointerException with MySQL DataImportHandler

2010-02-02 Thread Sascha Szott
ored for the item with productId 220213. Since no default value is specified, Solr raises an error when creating the index document. -Sascha Jean-Michel Philippon-Nadeau wrote: Hi, Thanks for the reply. On Tue, 2010-02-02 at 16:57 +0100, Sascha Szott wrote: * the output of MySQL's

Re: Deploying Solr 1.3 in JBoss 5

2010-02-02 Thread Sascha Szott
eploy/Solrrepo/solr-mysolr.war/WEB-INF/web.xml[146,14] at org.jboss.xb.binding.parser.sax.SaxJBossXBParser$MetaDataErrorHandler.error(SaxJBossXBParser.java:426) Notice that the same .war and properties-services.xml works flawlessly in JBoss 4.2.3 Any ideas? Thank you very much. L.M. -- Sascha Szott Kooperativer Biblioth

Re: java.lang.NullPointerException with MySQL DataImportHandler

2010-02-02 Thread Sascha Szott
Hi, can you post * the output of MySQL's describe command for all tables/views referenced in your DIH configuration * the DIH configuration file (i.e., data-config.xml) * the schema definition (i.e., schema.xml) -Sascha Jean-Michel Philippon-Nadeau wrote: Hi, It is my first install of Solr

Re: How to display Highlight with VelocityResponseWriter?

2010-01-13 Thread Sascha Szott
(only a bit of syntactic sugar). -Sascha [1] http://wiki.apache.org/solr/VelocityResponseWriter#line-93 [2] http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/response/QueryResponse.html > Quoting Sascha Szott : > >> Qiuyan, >> >>> with highlight can also be di

Re: How to display Highlight with VelocityResponseWriter?

2010-01-11 Thread Sascha Szott
Qiuyan, with highlight can also be displayed in the web gui. I've added true into the standard responseHandler and it already works, i.e without velocity. But the same line doesn't take effect in itas. Should i configure anything else? Thanks in advance. First of all, just a few notes on the /it

Re: solrJ and spell check queries

2010-01-03 Thread Sascha Szott
Hi, Jay Fisher wrote: I'm trying to find a way to formulate the following query in solrJ. This is the only way I can get the desired result but I can't figure out how to get solrJ to generate the same query string. It always generates a url that starts with select and I need it to start with spe

Re: how to do a Parent/Child Mapping using entities

2009-12-30 Thread Sascha Szott
"keyword". I think that i should use a custom field analyser or some thing like that; but i don't know what to do. but thanks for all; and any supplied help will be lovable. Sascha Szott wrote: Hi, you could create an additional index field res_ranked_url tha

Re: how to do a Parent/Child Mapping using entities

2009-12-29 Thread Sascha Szott
Hi, you could create an additional index field res_ranked_url that contains the concatenated value of a URL and its corresponding rank, e.g., res_rank + " " + res_url Then, q=res_ranked_url:"1 url1" retrieves all documents with url1 as the first URL. A drawback of this wor
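The concatenation described above could be done on the client side before the documents are sent to Solr. A minimal sketch, assuming the URLs of a document are available as an ordered list; the helper names are made up, while res_ranked_url is the field name from the mail.

```python
def ranked_url_values(urls):
    """Values for the res_ranked_url field: "<rank> <url>", ranks start at 1."""
    return [f"{rank} {url}" for rank, url in enumerate(urls, start=1)]


def first_url_query(url):
    """Phrase query for documents that have `url` at rank 1."""
    return f'res_ranked_url:"1 {url}"'


if __name__ == "__main__":
    print(ranked_url_values(["url1", "url2", "url3"]))
    print(first_url_query("url1"))
```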

Re: Optimize not having any effect on my index

2009-12-18 Thread Sascha Szott
Hi Aleksander, Aleksander Stensby wrote: So i tried with curl: curl http://server:8983/solr/update --data-binary '' -H 'Content-type:text/xml; charset=utf-8' No difference here either... Am I doing anything wrong? Do i need to issue a commit after the optimize? Did you restart the Solr server i

Re: Exception from Spellchecker

2009-12-15 Thread Sascha Szott
Hi Rafael, Rafael Pappert wrote: I try to enable the spellchecker in my 1.4.0 solr (running with tomcat 6 on debian). But I always get the following exception, when I try to open http://localhost:8080/spell?: The spellcheck=true pair is missing in your request. Try http://localh

RE: search on tomcat server

2009-12-07 Thread Sascha Szott
Hi Jill, just to make sure your index contains at least one document, what is the output of Best, Sascha Jill Han wrote: > In fact, I just followed the instructions titled as Tomcat On Windows. > Here are the updates on my

How to instruct MoreLikeThisHandler to sort results

2009-12-03 Thread Sascha Szott
Hi Folks, is there any way to instruct MoreLikeThisHandler to sort results? I was surprised that MLTHandler recognizes faceting parameters, among others, but ignores the sort parameter. Best, Sascha

Re: Indexing file content with custom field

2009-12-02 Thread Sascha Szott
Piero, it sounds like you're looking for an integration of Solr Cell and Solr's DIH facility -- a feature that isn't implemented yet (but the issue is already addressed in SOLR-1358). As a workaround, you could store the extracted contents in plain text files (either by using Solr Cell or Apache

Re: Hierarchical xml

2009-12-02 Thread Sascha Szott
Pooja, have a look at Solr's DataImportHandler. XPathEntityProcessor [1] should suit your needs. Best, Sascha [1] http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor Pooja Verlani schrieb: Hi, I want to index an xml like following: John 1979-29-17T28:14:48Z

[Solved] Re: VelocityResponseWriter/Solritas character encoding issue

2009-11-27 Thread Sascha Szott
That's it! Best, Sascha Erik Hatcher schrieb: Sascha, Can you give me a test document that causes an issue? (maybe send me a Solr XML document in private e-mail). I'll see what I can do once I can see the issue first hand. Erik On Nov 18, 2009, at 2:48 PM, Sascha Szott wrote

Re: VelocityResponseWriter/Solritas character encoding issue

2009-11-18 Thread Sascha Szott
a Ubuntu Linux machine. -Sascha > > On Wed, Nov 18, 2009 at 8:15 AM, Sascha Szott wrote: >> Hi Erik, >> >> Erik Hatcher wrote: >>> >>> Can you give me a test document that causes an issue?  (maybe send me >>> a >>> Solr XML document in

Re: VelocityResponseWriter/Solritas character encoding issue

2009-11-18 Thread Sascha Szott
gainst the VelocityResponseWriter returns a lot of Unicode replacement characters (U+FFFD) instead. -Sascha On Nov 18, 2009, at 2:48 PM, Sascha Szott wrote: Hi, I've played around with Solr's VelocityResponseWriter (which is indeed a very useful feature for rapid prototyping)

VelocityResponseWriter/Solritas character encoding issue

2009-11-18 Thread Sascha Szott
Hi, I've played around with Solr's VelocityResponseWriter (which is indeed a very useful feature for rapid prototyping). I've realized that Velocity uses ISO-8859-1 as default character encoding. I've changed this setting to UTF-8 in my velocity.properties file (inside the conf directory), i.e

Re: Indexing multiple documents in Solr/SolrCell

2009-11-17 Thread Sascha Szott
at 5:02 PM, Sascha Szott wrote: Hi, the problem you've described -- an integration of DataImportHandler (to traverse the XML file and get the document urls) and Solr Cell (to extract content afterwards) -- is already addressed in issue SOLR-1358 ( https://issues.apache.org/jira/browse/SOLR-1358)

Re: Indexing multiple documents in Solr/SolrCell

2009-11-16 Thread Sascha Szott
Hi, the problem you've described -- an integration of DataImportHandler (to traverse the XML file and get the document urls) and Solr Cell (to extract content afterwards) -- is already addressed in issue SOLR-1358 (https://issues.apache.org/jira/browse/SOLR-1358). Best, Sascha Kerwin wrote:

Re: [DIH] concurrent requests to DIH

2009-11-12 Thread Sascha Szott
already read this thread. In my opinion, both support for batch processing and multi-threading are important extensions of DIH's current capabilities, though issue SOLR-1352 mainly targets the latter. Is your PDIH implementation able to deal with batch processing right now? Best, Sascha >

Re: [DIH] blocking import operation

2009-11-12 Thread Sascha Szott
Noble Paul wrote: > Yes , open an issue . This is a trivial change I've opened JIRA issue SOLR-1554. -Sascha > > On Thu, Nov 12, 2009 at 5:08 AM, Sascha Szott wrote: >> Noble, >> >> Noble Paul wrote: >>> DIH imports are really long running. There is a

[DIH] concurrent requests to DIH

2009-11-11 Thread Sascha Szott
Hi all, I'm using the DIH in a parameterized way by passing request parameters that are used inside of my data-config. All imports end up in the same index. 1. Is it considered as good practice to set up several DIH request handlers, one for each possible parameter value? 2. In case the range of

Re: [DIH] blocking import operation

2009-11-11 Thread Sascha Szott
t. There was a discussion on adding a callback url to DIH a month ago, but it seems that no issue was raised. So, up to now it's only possible to implement an appropriate Solr EventListener. Should we open an issue for supporting callback urls? Best, Sascha > > On Tue, Nov 10, 2009 at 12:12 AM, Sascha

[DIH] blocking import operation

2009-11-09 Thread Sascha Szott
Hi all, currently, DIH's import operation(s) only work asynchronously. Therefore, after submitting an import request, DIH returns immediately, while the import process (in case a large amount of data needs to be indexed) continues asynchronously behind the scenes. So, what is the recommende
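One way to emulate a blocking import from the client side is to poll DIH's status command until the handler is no longer busy. The sketch below rests on assumptions: the handler URL is a placeholder, the status response is requested with wt=json and is expected to carry a "status" entry that reads "busy" while an import is running, and the helper names are made up.

```python
import json
import time
from urllib.request import urlopen


def fetch_status(dataimport_url):
    """Ask the DIH handler for its status (assumes a JSON response
    with a top-level "status" entry such as "busy" or "idle")."""
    with urlopen(dataimport_url + "?command=status&wt=json") as resp:
        return json.load(resp)["status"]


def wait_for_import(dataimport_url, fetch=fetch_status,
                    interval=5.0, timeout=3600.0):
    """Poll until the import has finished; return False on timeout.

    `fetch` is injectable so the loop can be tried out without a server.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch(dataimport_url) != "busy":
            return True
        time.sleep(interval)
    return False


if __name__ == "__main__":
    # Dry run with a fake status source instead of a real Solr instance.
    fake = iter(["busy", "busy", "idle"])
    print(wait_for_import("http://localhost:8983/solr/dataimport",
                          fetch=lambda url: next(fake), interval=0))
```

Polling is a crude substitute for a real notification. The EventListener approach discussed in this thread avoids the repeated requests but needs server-side code.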

Re: [DIH] SqlEntityProcessor does not recognize onError attribute

2009-11-09 Thread Sascha Szott
Hi, Noble Paul നോബിള്‍ नोब्ळ् wrote: On Mon, Nov 9, 2009 at 4:24 PM, Sascha Szott wrote: Hi all, as stated in the Solr-WIKI, Solr 1.4 allows it to specify an onError attribute for *each* entity listed in the data config file (it is considered as one of the default attributes). Unfortunately

[DIH] SqlEntityProcessor does not recognize onError attribute

2009-11-09 Thread Sascha Szott
Hi all, as stated in the Solr wiki, Solr 1.4 allows specifying an onError attribute for *each* entity listed in the data config file (it is considered one of the default attributes). Unfortunately, the SqlEntityProcessor does not recognize the attribute's value -- i.e., in case an SQL

Re: How to use DataImportHandler with ExtractingRequestHandler?

2009-09-03 Thread Sascha Szott
Hi Khai, a few weeks ago, I was facing the same problem. In my case, this workaround helped (assuming, you're using Solr 1.3): For each row, extract the content from the corresponding pdf file using a parser library of your choice (I suggest Apache PDFBox or Apache Tika in case you need to pr

Re: Building documents using content residing both in database tables and text files

2009-08-11 Thread Sascha Szott
11, 2009 at 8:05 PM, Sascha Szott wrote: Hello, is it possible (and if it is, how can I accomplish it) to configure DIH to build up index documents by using content that resides in different data sources? Here is an example scenario: Let's assume we have a table T with two columns, ID (which is

Building documents using content residing both in database tables and text files

2009-08-11 Thread Sascha Szott
Hello, is it possible (and if it is, how can I accomplish it) to configure DIH to build up index documents by using content that resides in different data sources? Here is an example scenario: Let's assume we have a table T with two columns, ID (which is the primary key of T) and TITLE. Furt