RE: Problem starting solr on jetty

2011-07-27 Thread Chris Hostetter
: I tried to debug the issue by running start.jar in the Eclipse debugger and : found that the root of the issue was that the jetty.home system property : was not set. If I set the jetty.home property then the server starts : properly. Hmmm, weird ... that still doesn't really make much sense. The

RE: Problem starting solr on jetty

2011-07-27 Thread Steven A Rowe
Hi Anand, Congrats! And thanks for letting us know. Steve -Original Message- From: anand.ni...@rbs.com [mailto:anand.ni...@rbs.com] Sent: Thursday, July 28, 2011 12:00 AM To: solr-user@lucene.apache.org Subject: RE: Problem starting solr on jetty Hi All, I tried to debug the issue by

RE: Problem starting solr on jetty

2011-07-27 Thread Anand.Nigam
Hi All, I tried to debug the issue by running start.jar in the Eclipse debugger and found that the root of the issue was that the jetty.home system property was not set. If I set the jetty.home property then the server starts properly. Thanks, Anand -Original Message- From: Nigam, Anand, GBM

Store complete XML record (DIH & XPathEntityProcessor)

2011-07-27 Thread solruser@9913
I am trying to use DIH to import an XML based file with multiple XML records in it. Each record corresponds to one document in Lucene. I am using the DIH FileListEntityProcessor (to get file list) followed by the XPathEntityProcessor to create the entities. It works perfectly and I am able to

RE: Problem starting solr on jetty

2011-07-27 Thread Anand.Nigam
Thanks for your reply Steve. My environment details: Java version: 1.6.0_24 System: Microsoft Windows XP Professional Version 2002 Service Pack 3 Interestingly my colleagues who have the same environment are not facing this problem. Thanks & Regards Anand Nigam -Original Message- F

Re: Dealing with keyword stuffing

2011-07-27 Thread Chris Hostetter
: Presumably, they are doing this by increasing tf (term frequency), : i.e., by repeating keywords multiple times. If so, you can use a custom : similarity class that caps term frequency, and/or ensures that the scoring : increases less than linearly with tf. Please see in particular, using someth
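
To make the quoted suggestion concrete, here is a minimal Python sketch of the tf shapes under discussion. It is an illustration only, not Lucene code: a real fix would be a custom Similarity class written in Java, and the function names and the cap of 5 are assumptions of mine. The square-root form is the tf curve Lucene's DefaultSimilarity uses.

```python
import math


def sqrt_tf(freq: float) -> float:
    """Square-root tf: the curve Lucene's DefaultSimilarity uses."""
    return math.sqrt(freq)


def capped_tf(freq: float, cap: float = 5.0) -> float:
    """Same curve, but repetitions beyond `cap` (an assumed value) add nothing."""
    return math.sqrt(min(freq, cap))


def log_tf(freq: float) -> float:
    """A flatter alternative that grows logarithmically with freq."""
    return 1.0 + math.log(freq) if freq > 0 else 0.0


if __name__ == "__main__":
    # Compare how each curve rewards a term repeated 1, 5, 50, 500 times.
    for freq in (1, 5, 50, 500):
        print(freq, round(sqrt_tf(freq), 2), round(capped_tf(freq), 2), round(log_tf(freq), 2))
```

With a cap in place, a document that repeats a keyword 500 times gets the same tf contribution as one that uses it 5 times, which is the effect the reply is after.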

Re: Exact match not the first result returned

2011-07-27 Thread Chris Hostetter
: With your solution, RECORD 1 does appear at the top but I think that's just : blind luck more than anything else because RECORD 3 shows as having the same : score. So what more can I do to push RECORD 1 up to the top. Ideally, I'd : like all three records returned with RECORD 1 being the first li

Re: Data Import Handler Architecture Diagram

2011-07-27 Thread Chris Hostetter
: Maybe I am looking at the wrong version - the diagram (and the screenshot in : the interactive dev mode section) don't show up in the WIKI page. : : http://wiki.apache.org/solr/DataImportHandler#Architecture : : Is this a wrong link? Ugh. A while back the Infra team disabled attachments i

colocated term stats

2011-07-27 Thread Twomey, David
Given a query term, is it possible to get the top 10 collocated terms from the index, i.e., the top 10 terms that appear with this term, ranked by doc count? A plus would be to add some constraints on how near the terms are in the docs.
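
For readers unsure what is being asked, the following Python sketch computes the same statistic on the client side over a toy list of documents. It does not use any Solr or Lucene API and says nothing about whether the index can do this directly; the function name, the whitespace tokenization, and the sample documents are all assumptions for illustration.

```python
from collections import Counter


def top_cooccurring_terms(docs, term, n=10):
    """Return the n terms that share the most documents with `term`."""
    counts = Counter()
    for doc in docs:
        terms = set(doc.lower().split())  # naive whitespace tokenization
        if term in terms:
            # Count each co-occurring term once per document (doc count).
            counts.update(terms - {term})
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["solr jetty start", "solr jetty", "solr tomcat", "lucene index"]
    print(top_cooccurring_terms(sample, "solr", n=10))
```

The proximity constraint mentioned in the question would need token positions, which this sketch deliberately ignores.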

Re: schema.xml changes, need re-indexing ?

2011-07-27 Thread Alexei Martchenko
I always run http://localhost:8983/solr/admin/cores?action=RELOAD&core=corename in the browser when I wanna reload solr and see any changes in config xmls. 2011/7/27 François Schiettecatte > I have not seen this mentioned anywhere, but I found a useful 'trick' to > restart solr without having to
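
The same reload can be scripted. Below is a small Python sketch that builds the URL quoted above and, optionally, requests it; the helper names are mine, and the host, port, and core name are just the values from the message, so adjust them to your install.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def core_reload_url(core: str, base: str = "http://localhost:8983/solr") -> str:
    """Build the CoreAdmin RELOAD URL for the given core."""
    query = urlencode({"action": "RELOAD", "core": core})
    return f"{base}/admin/cores?{query}"


def reload_core(core: str, base: str = "http://localhost:8983/solr") -> int:
    """Request the reload and return the HTTP status code (needs a running Solr)."""
    with urlopen(core_reload_url(core, base)) as response:
        return response.status


if __name__ == "__main__":
    # Only prints the URL; call reload_core("corename") against a live server.
    print(core_reload_url("corename"))
```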

Re: An idea for an intersection type of filter query

2011-07-27 Thread Shawn Heisey
On 7/27/2011 3:49 PM, Jonathan Rochkind wrote: I don't know the answer to feasibility either, but I'll just point out that boolean "OR" corresponds to set "union", not set "intersection". So I think you probably mean a 'union' type of filter query; 'intersection' does not seem to describe what

Re: Speeding up search by combining common sub-filters

2011-07-27 Thread Jonathan Rochkind
I'm pretty sure Solr/lucene have no such "optimization" already, but it's not clear to me that it would result in much of a performance benefit, just because of the way lucene works, it's not obvious to me that the second version of your query will be noticeably faster than the first version.

Re: An idea for an intersection type of filter query

2011-07-27 Thread Jonathan Rochkind
I don't know the answer to feasibility either, but I'll just point out that boolean "OR" corresponds to set "union", not set "intersection". So I think you probably mean a 'union' type of filter query; 'intersection' does not seem to describe what you are describing; ordinary 'fq' values are 'i
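
The distinction is easy to see with plain sets. In this Python sketch the document-id sets are made up and stand in for the documents each filter would match; the helper names are hypothetical. Multiple ordinary fq parameters behave like `combine_and`, while the feature being proposed in this thread behaves like `combine_or`.

```python
def combine_or(*doc_sets):
    """Boolean OR of filters: the union of the matching doc ids."""
    return set().union(*doc_sets)


def combine_and(*doc_sets):
    """Boolean AND of filters: the intersection of the matching doc ids."""
    if not doc_sets:
        return set()
    return set(doc_sets[0]).intersection(*doc_sets[1:])


if __name__ == "__main__":
    rule_a = {1, 2, 3, 4}   # docs matched by one filter (toy data)
    rule_b = {3, 4, 5}      # docs matched by another filter (toy data)
    print(sorted(combine_or(rule_a, rule_b)))
    print(sorted(combine_and(rule_a, rule_b)))
```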

Re: Indexing SharePoint from SolrJ

2011-07-27 Thread Glen Newton
+1 On 7/27/11, Twomey, David wrote: > > Does anyone have examples of indexing SP content using the Google Connectors > API and using SolrJ. > > I know Lucid Imagination has a Sharepoint connector and I have used that > successfully. > > However, I would like to create a thumbnail image of PDF's

Re: Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Robert Muir
On Wed, Jul 27, 2011 at 4:12 PM, Fuad Efendi wrote: > Thanks Robert!!! > > "Submitted On 26-JUL-2011" - yesterday. > > This option was popular in Hbase... Then you should tell them also, not to use it, if they want their loops to work. -- lucidimagination.com

RE: Spellcheck compounded words

2011-07-27 Thread Dyer, James
I could not reproduce the problem even with the two parameters you show below added to the Default handler. I tried using this default handler with different queries with correct & incorrect terms. I made sure it would sometimes successfully create collations and other times try to create col

Re: Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Fuad Efendi
Thanks Robert!!! "Submitted On 26-JUL-2011" - yesterday. This option was popular in Hbase... On 11-07-27 3:58 PM, "Robert Muir" wrote: >Don't use this option, these optimizations are buggy: > >http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134 > > >On Wed, Jul 27, 2011 at 3:56 PM, Fuad

Re: An idea for an intersection type of filter query

2011-07-27 Thread Shawn Heisey
On 7/27/2011 2:00 PM, Shawn Heisey wrote: I've seen a number of requests here for the ability to have multiple fq parameters ORed together. This is probably possible, but in the interests of compatibility between versions, very impractical. What if a new parameter was introduced? It could be

Re: schema.xml changes, need re-indexing ?

2011-07-27 Thread François Schiettecatte
I have not seen this mentioned anywhere, but I found a useful 'trick' to restart solr without having to restart tomcat. All you need to do is 'touch' the solr.xml in the solr.home directory. It can take a few seconds but solr will restart and reload any config. Cheers François On Jul 27, 201
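
If you want to script the 'touch' on a machine without the Unix command, a Python sketch like the one below does the same thing. The function name is mine and the solr.home path is whatever your install uses; whether the container actually reloads Solr afterwards depends on the Tomcat setup described in the message, which this code does not check.

```python
from pathlib import Path


def touch_solr_xml(solr_home) -> Path:
    """Update the modification time of solr.xml under the given solr.home."""
    solr_xml = Path(solr_home) / "solr.xml"
    if not solr_xml.is_file():
        # Refuse to create an empty solr.xml by accident.
        raise FileNotFoundError(f"no solr.xml found in {solr_home}")
    solr_xml.touch()
    return solr_xml
```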

An idea for an intersection type of filter query

2011-07-27 Thread Shawn Heisey
I've been looking at the slow queries our Solr installation is receiving. They are dominated by queries with a simple q parameter (often *:* for all docs) and a VERY complicated fq parameter. The filter query is built by going through a set of rules for the user and putting together each rule

Re: Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Robert Muir
Don't use this option, these optimizations are buggy: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7070134 On Wed, Jul 27, 2011 at 3:56 PM, Fuad Efendi wrote: > Anyone tried this? I can not start Solr-Tomcat with following options on > Ubuntu: > > JAVA_OPTS="$JAVA_OPTS -Xms2048m -Xmx2048m

Solr Performance Tuning: -XX:+AggressiveOpts

2011-07-27 Thread Fuad Efendi
Anyone tried this? I can not start Solr-Tomcat with following options on Ubuntu: JAVA_OPTS="$JAVA_OPTS -Xms2048m -Xmx2048m -Xmn256m -XX:MaxPermSize=256m" JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/data/solr -Dfile.encoding=UTF8 -Duser.timezone=GMT -Djava.util.logging.config.file=/data/solr/logging.pr

Indexing SharePoint from SolrJ

2011-07-27 Thread Twomey, David
Does anyone have examples of indexing SP content using the Google Connectors API and using SolrJ. I know Lucid Imagination has a Sharepoint connector and I have used that successfully. However, I would like to create a thumbnail image of PDF's and PPT docs and add that to my index and I assu

Re: Filter content upon indexing

2011-07-27 Thread Emmanuel Espina
I want to add, that since the stored text (not the indexed) is not analyzed, if you retrieve the title you will get all the html. If you want to extract the title for storage in a separate field that will have to be done with a different tool not just with the analysis. My previous answer was focus

Re: schema.xml changes, need re-indexing ?

2011-07-27 Thread Alexei Martchenko
I believe you're fine with that. Don't need to reindex all solr database. 2011/7/27 Charles-Andre Martin > Hi, > > > > We currently have a big index in production. We would like to add 2 > non-required fields to our schema.xml : > > > > required="false"/> > > required="false" multiValued="true

RE: schema.xml changes, need re-indexing ?

2011-07-27 Thread Michael Ryan
You should be fine - no need to re-index your data. Adding and removing fields is generally safe to do without a re-index. Changing a field (its type, analyzers, etc) requires more caution and generally does require a re-index. -Michael

Data Import Handler Diagram

2011-07-27 Thread solruser@9913
Maybe I am looking at the wrong version - the diagram (and the screenshot in the interactive dev mode section) don't show up in the WIKI page. http://wiki.apache.org/solr/DataImportHandler#Architecture Is this a wrong link? I did an inspect element and this is what I see ... ... img alt=

Data Import Handler Architecture Diagram

2011-07-27 Thread solruser@9913
Maybe I am looking at the wrong version - the diagram (and the screenshot in the interactive dev mode section) don't show up in the WIKI page. http://wiki.apache.org/solr/DataImportHandler#Architecture Is this a wrong link? I did an inspect element and this is what I see ... /solr/DataImport

Re: Filter content upon indexing

2011-07-27 Thread Emmanuel Espina
If you can express what you want with a regular expression then the pattern Filter should work! I'm thinking that maybe you tokenized the field and that invalidated the structure of the html. I would use a "contents" field analyzed with a http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

schema.xml changes, need re-indexing ?

2011-07-27 Thread Charles-Andre Martin
Hi, We currently have a big index in production. We would like to add 2 non-required fields to our schema.xml : I made some tests: - I stopped tomcat - I changed the schema.xml - I started tomcat The data was still there and

Re: Autocomplete with Solr 3.1

2011-07-27 Thread scorpking
Hi Klein, Thanks for your reply. I tried some of the suggestions with Solr and the results returned are good, but I want to use the search component with Solr 3.1. I now have some problems with the Suggester; I think the problem is perhaps in my schema file. This is the schema file:

Jetty Logs - Max line size?

2011-07-27 Thread alexander sulz
Hello I enabled Jetty Logs but my GET requests seem so long that they get truncated and without a line break, so in the end it looks like this: notice the logged ping and where it begins. How can i change this? thank you very much 000.000.000.000 - - [27/Jul/2011:17:38:04 +0100] "GET /sol

Filter content upon indexing

2011-07-27 Thread Rafael Ribeiro
Hi all, I am trying to index html documents using Solr and I am having difficulties extracting certain parts of the main content of the document and storing them separately into other fields. I saw on the docs that it is possible to achieve this using xpath but in my particular case I need to do a re

Solr Master-slave master failover without data loss

2011-07-27 Thread Nagendraprasad
Suppose master goes down immediately after the index updates, while the updates haven't been replicated to the slaves, data loss seems to happen. Does Solr have any mechanism to deal with that? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Master-slave-master-failover-w

Re: Why Slop doens't match anything?

2011-07-27 Thread Gora Mohanty
On Wed, Jul 27, 2011 at 8:38 PM, Alexander Ramos Jardim wrote: > Hello pals, > > Using solr 1.4.0. Trying to understand something. When I run the query > *fieldA:"nokia > c3"*, I get 5 results. All with "nokia c3", as expected. But when I run > fieldA:"nokia c3"~100, I don't get any result! > > As f

Re: Exact match not the first result returned

2011-07-27 Thread Brian Lamb
Thanks Emmanuel for that explanation. I implemented your solution but I'm not quite there yet. Suppose I also have a record: RECORD 3 Fred G. Anderson Fred Anderson With your solution, RECORD 1 does appear at the top but I think that's just blind luck more than anything else because RECORD 3

Re: Dealing with keyword stuffing

2011-07-27 Thread Gora Mohanty
On Wed, Jul 27, 2011 at 7:15 PM, Pranav Prakash wrote: > I guess most of you have already handled and many of you might still be > handling keyword stuffing. Here is my scenario. We have a huge index > containing about 6m docs. (Not sure if that is huge :-) And every document > contains title, des

Why Slop doens't match anything?

2011-07-27 Thread Alexander Ramos Jardim
Hello pals, Using solr 1.4.0. Trying to understand something. When I run the query *fieldA:"nokia c3"*, I get 5 results. All with "nokia c3", as expected. But when I run fieldA:"nokia c3"~100, I don't get any result! As far as I understand the "~100" should make my query bring even more results as

RE: Problem starting solr on jetty

2011-07-27 Thread Steven A Rowe
Hi Anand, Someone else reported this exact same error with Solr v1.4.0: http://www.lucidimagination.com/search/document/fd5b83f3595a1c6c/can_t_start_solr_by_java_jar_start_jar I downloaded the apache-solr-3.3.0.zip, unpacked it, then ran 'java -jar start.jar' from the cmdline. It worked. (Win

Re: using distributed search with the suggest component

2011-07-27 Thread Tobias Rübner
Thanks, but this does not work. Looking at the log files, I see only one request, when executing a search. Executing a request to the default servlet (/select) with multiple shards, each core gets ask for the current query. Any other suggestions? Tobias On Tue, Jul 26, 2011 at 2:11 PM, mdz-munic

Re: Delete by range query

2011-07-27 Thread Mohammad Shariq
Thanks Koji. It's working now. On 27 July 2011 19:30, Koji Sekiguchi wrote: > time:[1296777600+TO+1296778000] >> > > Should be time:[1296777600 TO > 1296778000] ? > > koji > -- > http://www.rondhuit.com/en/ > -- Thanks and Regards Mohammad Shariq

Re: Spellcheck compounded words

2011-07-27 Thread O. Klein
All the talk about logging derailed the thread. So can someone test if adding 2 2 to the default requesthandler in solrconfig.xml using collations causes system to hang? O. Klein wrote: > > Anyways. I was testing on 3.3 and found that when I added > &spellcheck.maxCollations=2&spe

Re: Solr vs ElasticSearch

2011-07-27 Thread Yonik Seeley
On Wed, Jul 27, 2011 at 7:17 AM, Tarjei Huse wrote: > On 06/01/2011 08:22 AM, Jason Rutherglen wrote: >> Thanks Shashi, this is oddly coincidental with another issue being put >> into Solr (SOLR-2193) to help solve some of the NRT issues, the timing >> is impeccable. > Hmm, does anyone have an ide

Re: Delete by range query

2011-07-27 Thread Koji Sekiguchi
time:[1296777600+TO+1296778000] Should be time:[1296777600 TO 1296778000] ? koji -- http://www.rondhuit.com/en/
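
A short Python sketch of why the '+' form fails. A '+' stands for a space only inside a URL query string; in a request body it stays a literal plus sign, so the range syntax no longer parses. The thread does not say how the delete was sent, so the XML delete message below is an assumption, and the helper name is mine.

```python
from urllib.parse import unquote_plus
from xml.sax.saxutils import escape


def delete_by_query_xml(query: str) -> str:
    """Wrap a query in a delete-by-query XML message for the update handler."""
    return f"<delete><query>{escape(query)}</query></delete>"


if __name__ == "__main__":
    url_form = "time:[1296777600+TO+1296778000]"
    # Decoding as a URL query string turns each '+' back into a space.
    plain_form = unquote_plus(url_form)
    print(plain_form)
    print(delete_by_query_xml(plain_form))
```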

Re: what data type for geo fields?

2011-07-27 Thread Yonik Seeley
On Wed, Jul 27, 2011 at 9:01 AM, Peter Wolanin wrote: > Looking at the example schema: > > http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml > > the solr.PointType field type uses double (is this just an example > field, or used for geo search?)

Re: Solr vs ElasticSearch

2011-07-27 Thread Jeff Schmidt
You might also check out Solandra: https://github.com/tjake/Solandra With Solr's configuration and indexes in Cassandra, you can benefit from replication, distribution etc., and still have Cassandra available for non-Solr specific purposes. Cheers, Jeff On Jul 27, 2011, at 5:17 AM, T

Dealing with keyword stuffing

2011-07-27 Thread Pranav Prakash
I guess most of you have already handled and many of you might still be handling keyword stuffing. Here is my scenario. We have a huge index containing about 6m docs. (Not sure if that is huge :-) And every document contains title, description, tags, content (textual data). People have been doing k

Delete by range query

2011-07-27 Thread Mohammad Shariq
Hi, I want to delete a bunch of docs from my solr using rangeQuery. I have one field called 'time' which is tint. I am deleting using the query : time:[1296777600+TO+1296778000] but solr is returning an error, saying bad request. However I am able to delete one by one using below deleteQuery: time

what data type for geo fields?

2011-07-27 Thread Peter Wolanin
Looking at the example schema: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/solr/example/solr/conf/schema.xml the solr.PointType field type uses double (is this just an example field, or used for geo search?), while the solr.LatLonType field uses tdouble and it's unclear ho

Re: How to make a valid date facet query?

2011-07-27 Thread Tomás Fernández Löbbe
Hi Floyd, yes, those queries are supported. Make sure you use the right encoding for the plus sign: facet.query=onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR%2B5YEARS] the result of this facet query will be the number of documents in the result set that match that range. You'll have to use different fa
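
The encoding point can be checked with Python's standard library. This sketch percent-encodes the facet.query value from the message and shows what a server-side decoder makes of an unencoded plus sign; the constant and function names are mine.

```python
from urllib.parse import quote, unquote_plus

RAW_QUERY = "onlinedate:[NOW/YEAR-3YEARS TO NOW/YEAR+5YEARS]"


def encode_param(value: str) -> str:
    """Percent-encode a parameter value; '+' becomes %2B, spaces become %20."""
    return quote(value)  # '/' is left as-is by default


if __name__ == "__main__":
    encoded = encode_param(RAW_QUERY)
    print("facet.query=" + encoded)
    # Sent unencoded, the '+' is decoded as a space and the date math breaks:
    print(unquote_plus(RAW_QUERY))
    # Encoded properly, the value survives the round trip:
    print(unquote_plus(encoded))
```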

Re: Autocomplete with Solr 3.1

2011-07-27 Thread O. Klein
I know the solution, just not how to actually implement it, but maybe somebody can help with that :) From Wiki: If you want to use a dictionary file that contains phrases (actually, strings that can be split into multiple tokens by the default QueryConverter) then define a different QueryConvert

Problem starting solr on jetty

2011-07-27 Thread Anand.Nigam
Hi, I am new to solr. I have downloaded the solr 3.3.0 distribution and am trying to run it using java -jar start.jar from the apache-solr-3.3.0\example directory (start.jar is present here). But I am getting the following error on running this command: C:\downloads\apache-solr-3.3.0\apache-solr-3.3.

Re: Solr vs ElasticSearch

2011-07-27 Thread Tarjei Huse
On 06/01/2011 08:22 AM, Jason Rutherglen wrote: > Thanks Shashi, this is oddly coincidental with another issue being put > into Solr (SOLR-2193) to help solve some of the NRT issues, the timing > is impeccable. Hmm, does anyone have an idea on when this will be finished? I'm considering if I shoul

Re: Different options for autocomplete/autosuggestion

2011-07-27 Thread scorpking
HI Bell, i used autocomplete in solr 3.1. same this: autocomplete org.apache.solr.spelling.suggest.Suggester org.apache.solr.spelling.suggest.jaspell.JaspellLookup autocomplete true and i make following URL* http://solr.pl/en/2010/11/15/solr-and-autocomp

Re: Conditional field values in DataImport

2011-07-27 Thread Gora Mohanty
On Wed, Jul 27, 2011 at 7:20 AM, solruser@9913 wrote: > This may be a trivial question - I am noob :). > In the dataimport of a CSV file, am trying to assign a field based on a > conditional check on another field. > > E.g. >   > >   this works well.  However I need to create another field A that

Re: how often do you boys restart your tomcat?

2011-07-27 Thread Bernd Fehling
It is definitely Lucene's fieldCache making the trouble. Restart your solr and monitor it with jvisualvm, especially OldGen heap. When it gets to 100 percent filled use jmap to dump heap of your system. Then use Eclipse Memory Analyzer http://www.eclipse.org/mat/ and open the heap dump. You will s

Re: how often do you boys restart your tomcat?

2011-07-27 Thread Paul Libbrecht
On curriki.org, our solr's Tomcat saturates memory after 2-4 weeks. I am still investigating if I am accumulating something or something else is. To check it, I am running a "query all, return num results" every minute to measure the time it takes. It's generally when it meets a big GC that gives