Re: Term searches with colon(:)

2012-09-07 Thread Chris Hostetter
: I was wondering if anybody has run into this issue before. Solr is not : returning any search results for words that contain a colon ( : ) in them : when we perform a term search containing a colon. We do escape this : correctly, I believe as shown in the sample (taken from tomcat logs) ... :

Term searches with colon(:)

2012-09-07 Thread Nemani, Raj
All, I was wondering if anybody has run into this issue before. Solr is not returning any search results for words that contain a colon ( : ) in them when we perform a term search containing a colon. We do escape this correctly, I believe as shown in the sample (taken from tomcat logs) Sep 06, 2012
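For readers following this thread, backslash-escaping a colon in a query term can be sketched as below. This is only an illustration, not the poster's actual code: the function name `escape_term` is hypothetical, and the character list is an assumption based on the Lucene query parser syntax, so it may need adjusting for a given version.

```python
# Characters the classic Lucene query parser treats as syntax.
# ASSUMPTION: this set is taken from the Lucene query syntax description
# and is not specific to the poster's Solr version.
LUCENE_SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term: str) -> str:
    """Backslash-escape every query-syntax character in a single term.

    `escape_term` is a hypothetical helper name used for illustration.
    """
    return "".join(
        "\\" + ch if ch in LUCENE_SPECIAL_CHARS else ch
        for ch in term
    )


if __name__ == "__main__":
    # The colon is escaped, so it is no longer read as a field separator.
    print(escape_term("foo:bar"))
```

Note that escaping only affects how the query is parsed; whether the term then matches still depends on how the field was analyzed at index time.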

Re: N-gram ranking based on term position

2012-09-07 Thread Kiran Jayakumar
Since Edge N-gram tokens are a subset of N-gram tokens, I was wondering if I could be a bit more space efficient. On Fri, Sep 7, 2012 at 3:07 PM, Amit Nithian wrote: > I think your thought about using the edge ngram as a field and > boosting that field in the qf/pf sections of the dismax handler

Re: N-gram ranking based on term position

2012-09-07 Thread Amit Nithian
I think your thought about using the edge ngram as a field and boosting that field in the qf/pf sections of the dismax handler sounds reasonable. Why do you have qualms about it? On Fri, Sep 7, 2012 at 12:28 PM, Kiran Jayakumar wrote: > Hi, > > Is it possible to score documents with a match "earl

RE: [Solr4 beta] error 503 on commit

2012-09-07 Thread Markus Jelsma
Hi, We've seen this too on one of the test nodes yesterday, it ran on a build of a few days old. The node receiving documents complained it could not forward them to the fifth node and returned a 503. The fifth node itself only logged a NPE and the 503, nothing more, no stack traces. There was

Re: Importing of unix date format from mysql database and dates of format 'Thu, 06 Sep 2012 22:32:33 +0000' in Solr 4.0

2012-09-07 Thread Shawn Heisey
On 9/6/2012 6:54 PM, kiran chitturi wrote: The error I am getting is 'org.apache.solr.common.SolrException: Invalid Date String: '1345743552'. I think it was being saved as a string in DB, so I will use the DateFormatTransformer. To go along with all the other replies that you have gotten:

Re: Why is using edismax in Admin UI puts edismax=true but not defType=edismax?

2012-09-07 Thread Chris Hostetter
: I am not sure edismax=true as a flag actually does anything (Solr4 beta): Alexandre: You are 100% correct, this appears to be a bug in the Admin UI. Thank you for reporting it... https://issues.apache.org/jira/browse/SOLR-3811 -Hoss
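As a rough sketch of the difference discussed here, the query parser is selected with the defType parameter rather than an edismax=true flag. The helper name `edismax_query_string` is hypothetical; the q and qf values are the ones from the original post.

```python
from urllib.parse import urlencode


def edismax_query_string(q: str, qf: str) -> str:
    """Build a query string that selects edismax via defType.

    Hypothetical helper for illustration only.
    """
    params = {
        "q": q,
        "defType": "edismax",  # selects the query parser
        "qf": qf,              # fields to search across
        "debugQuery": "true",
    }
    return urlencode(params)


if __name__ == "__main__":
    # q and qf values as they appear in the original post.
    print(edismax_query_string("text", "TitleEN DescEN"))
```

With debugQuery=true, the debug section of the response can be used to check which parser actually handled the query.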

Why is using edismax in Admin UI puts edismax=true but not defType=edismax?

2012-09-07 Thread Alexandre Rafalovitch
Hello, I am not sure edismax=true as a flag actually does anything (Solr4 beta): 'responseHeader'=>{ 'status'=>0, 'QTime'=>1, 'params'=>{ 'debugQuery'=>'true', 'indent'=>'true', 'edismax'=>'true', 'q'=>'text', 'qf'=>'TitleEN DescEN', 'wt'=>'ruby',

Re: [Solr4 beta] error 503 on commit

2012-09-07 Thread Chris Hostetter
: I get sometimes (not often): : SolrException e where e.code() == : SolrException.ErrorCode.SERVICE_UNAVAILABLE.code Are there any errors in your solr server logs? Are you using the DistributedUpdateProcessor (ie: SolrCloud) ? There aren't many places in Solr that will throw a 503 status cod

Re: Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Tom Burton-West
Thanks Robert, >>if not, just customize blocktree's params with a CodecFactory in solr, >>or even pick another implementation (FixedGap, VariableGap, whatever). Still trying to get my head around 4.0 and flexible indexing. I'll take another look at Mike's and your presentations. I'm trying to f

Re: Importing of unix date format from mysql database and dates of format 'Thu, 06 Sep 2012 22:32:33 +0000' in Solr 4.0

2012-09-07 Thread Chris Hostetter
: > When i index a text field which has arabic and English like this tweet : > “@anaga3an: هو سعد الحريري بيعمل ايه غير تحديد الدوجلاس ويختار الكرافته ؟؟” : > #gcc #ksa #lebanon #syria #kuwait #egypt #سوريا : > with field_type as 'text_ar' and when i try to see the same field again in : > solr, it

Re: Solr search not working after copying a new field to an existing Indexed Field

2012-09-07 Thread Mani
yes..I do have this uniquekey defined properly. id Before the schema change... After the schema change... -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-search-not-working-after-copying-a-new-field-to-an-existing-Indexed-Field-tp4005993p4006217.html Sent f

Re: Solr search not working after copying a new field to an existing Indexed Field

2012-09-07 Thread Kiran Jayakumar
Do you have the unique key set up in your schema.xml? It should be automatic if you have the ID field and define it as the unique key. ID On Thu, Sep 6, 2012 at 11:50 AM, Mani wrote: > I have made a schema change to copy an existing field "name" (Source > Field) > to an existing search fiel

Re: Problem with verifying signature ?

2012-09-07 Thread Kiran Jayakumar
Thank you. On Thu, Sep 6, 2012 at 9:51 AM, Chris Hostetter wrote: > > : gpg: Signature made 08/06/12 19:52:21 Pacific Daylight Time using RSA key > : ID 322 > : D7ECA > : gpg: Good signature from "Robert Muir (Code Signing Key) < > rm...@apache.org>" > : *gpg: WARNING: This key is not certified w

Re: How to preserve source column names in multivalue catch all field

2012-09-07 Thread Kiran Jayakumar
Thank you Erick. I think #2 is the best for me because I have more than a hundred fields and don't want to construct a huge query each time. On Thu, Sep 6, 2012 at 9:38 PM, Erick Erickson wrote: > Try using edismax to distribute the search across the fields rather > than using the catch-all field. The

[Solr4 beta] error 503 on commit

2012-09-07 Thread Antoine LE FLOC'H
Hello, Using "package org.apache.solr.client.solrj;" when I do: UpdateResponse ur = solrServer.commit(false, false); I get sometimes (not often): SolrException e where e.code() == SolrException.ErrorCode.SERVICE_UNAVAILABLE.code When I catch this exception, I try to commit again, the call d

Re: Re: Schema model to store additional field metadata

2012-09-07 Thread sysrq
> Why would you store the actual images in SOLR? No, the images are files on the filesystem. Only the path to the image should be stored in Solr. > And you are most likely looking at dynamic fields as the solution > > 1) Define *_Path, *_Size, *_Alt as a dynamic field with appropriate types > 2

Re: Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Robert Muir
On Fri, Sep 7, 2012 at 2:19 PM, Tom Burton-West wrote: > Thanks Robert, > > I'll have to spend some time understanding the default codec for Solr 4.0. > Did I miss something in the changes file? http://lucene.apache.org/core/4_0_0-BETA/ see the file formats section, especially http://lucene.apac

Solr 4: Private master, public slave?

2012-09-07 Thread Alexandre Rafalovitch
Hello, I have a bunch of documents that I would like to index on a local server behind the firewall. But then, the actual search will happen on a public infrastructure (Amazon, etc). The documents themselves are not quite public, so I want just the index content (indexed, not stored) being availab

Re: Version Migration from solr 1.3

2012-09-07 Thread Mani
If you have time, you might as well wait for 4.0 to be released otherwise 3.6.1 -- View this message in context: http://lucene.472066.n3.nabble.com/Version-Migration-from-solr-1-3-tp4006193p4006200.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Tom Burton-West
Thanks Robert, I'll have to spend some time understanding the default codec for Solr 4.0. Did I miss something in the changes file? I'll be digging into the default codec docs and testing sometime in the next week or two (with a 2 billion term index). If I understand it well enough, I'll be happy t

Re: Version Migration from solr 1.3

2012-09-07 Thread Sujatha Arun
I see that 4.0 alpha has been released after 3.6.1, so should I look at 3.5 as the most stable release currently? Version Source : https://issues.apache.org/jira/browse/SOLR?selectedTab=com.atlassian.jira.plugin.system.project%3Aversions-panel Regards Sujatha On Fri, Sep 7, 2012 at 11:17 PM, Su

Re: Schema model to store additional field metadata

2012-09-07 Thread Alexandre Rafalovitch
Why would you store the actual images in SOLR? There is no way to really search the bytes of image, is there? What you probably want to do is extract all searchable metadata out of that image, name, alt, EXIF, etc. And you are most likely looking at dynamic fields as the solution 1) Define *_Path

Re: Indexing CSV files with filenames

2012-09-07 Thread edvicif
My problem is more like the left hand side of the equation. Is it ${f.name} or something? On Sep 7, 2012 5:36 PM, "Rafał Kuć-3 [via Lucene]" < ml-node+s472066n4006179...@n3.nabble.com> wrote: > Hello! > > You can just pass the name of the file to the 'literal' parameter. For > example adding >

Re: Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Robert Muir
Hi Tom: I already enhanced the javadocs about this for Lucene, putting warnings everywhere in bold: NOTE: This parameter does not apply to all PostingsFormat implementations, including the default one in this release. It only makes sense for term indexes that are implemented as a fixed gap between

Schema model to store additional field metadata

2012-09-07 Thread sysrq
Hi, I want to create a Solr index of articles. Each article should have a title, content, published date and an arbitrary number of images attached to. An article could look like this: title: An article about Foo and Bar content: This is some text about Foo and Bar. published: 2012.09.07T

Solr 4.0 Beta, termIndexInterval vs termIndexDivisor vs termInfosIndexDivisor

2012-09-07 Thread Tom Burton-West
Hello all, Due to multiple languages and dirty OCR, our indexes have over 2 billion unique terms ( http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again). In Solr 3.6 and previous we needed to reduce the memory used for storing the in-memory representation of the tii file. We o

Re: Indexing CSV files with filenames

2012-09-07 Thread Rafał Kuć
Hello! You can just pass the name of the file to the 'literal' parameter. For example adding literal.filename=my_file.csv would set the 'filename' field of your document with the value of 'my_file.csv'. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Elastic
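A minimal sketch of the suggestion above, building an update URL that carries the literal.filename parameter. The base URL, the /update/csv handler path, and the helper name `csv_update_url` are assumptions for illustration; check them against your own Solr setup.

```python
import os
from urllib.parse import urlencode


def csv_update_url(base_url: str, csv_path: str) -> str:
    """Return a CSV update URL that tags every document with the file name.

    ASSUMPTIONS: `base_url` points at a Solr core and the CSV handler is
    mounted at /update/csv; `csv_update_url` is a hypothetical helper.
    """
    params = {
        # literal.<fieldname>=<value> adds the same field value to
        # every document loaded from this file.
        "literal.filename": os.path.basename(csv_path),
    }
    return f"{base_url}/update/csv?{urlencode(params)}"


if __name__ == "__main__":
    print(csv_update_url("http://localhost:8983/solr", "/data/my_file.csv"))
```

To store the absolute path instead, the same parameter could carry the full path string rather than the base name; the target field must exist in the schema either way.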

Re: Indexing CSV files with filenames

2012-09-07 Thread edvicif
Thx for the quick answer. Can you help a little more? I don't really get the concept of literal. How can I set a field with the source absolute path? I mean how can I find out the parameter names? An example would be really helpful. -- View this message in context: http://lucene.472066.n3.

Re: solrcloud setup using tomcat, single machine

2012-09-07 Thread Mark Miller
The above does not look right - you probably would want /usr/solr/example/solr for your solrhome based on other info you give. You also reference /usr/solr/data/conf as your conf folder, but I'd expect it to be something like /usr/solr/example/solr/collection1/conf -DhostPort=8080" #mi

Re: Indexing CSV files with filenames

2012-09-07 Thread Rafał Kuć
Hello! In Solr 4.0 you will have the ability to add an arbitrary field along with all documents from a single file - http://wiki.apache.org/solr/UpdateCSV#literal -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > Hi! > I have a set of CSV file

Indexing CSV files with filenames

2012-09-07 Thread edvicif
Hi! I have a set of CSV files. I want to index them by certain columns, but I also want to store the filename they were indexed from. The reason is that the queries I want to run are meant to identify files. David -- View this message in context: http://lucene.472066.n3.nabble.com/Index

Re: Access and copy lucene index data

2012-09-07 Thread Jack Krupansky
You can use the Solr admin "analysis" web page to enter a term or even a passage of text and see how it would be analyzed/indexed for any specified field or field type. -- Jack Krupansky -Original Message- From: Bill_78 Sent: Friday, September 07, 2012 11:23 AM To: solr-user@lucene.a

Re: SOLR 4.0 DataImport frozen or fails with WARNING: Unable to read: dataimport.properties?

2012-09-07 Thread Travis Low
Change your data-config.xml connection XML to this: Then try again. This keeps the driver from trying to fetch the entire result set at the same time. cheers, Travis On Fri, Sep 7, 2012 at 4:17 AM, deniz wrote: > Hi all, > > I have been trying to index my data from mysql db, but somehow

Re: Website (crawler for) indexing

2012-09-07 Thread Dominique Bejean
Maybe you can take a look at Crawl-Anywhere, which has an administration web interface, Solr indexer and search web application. www.crawl-anywhere.com Regards. Dominique On 05/09/12 17:05, Lochschmied, Alexander wrote: This may be a bit off topic: How do you index an existing website and c

Access and copy lucene index data

2012-09-07 Thread Bill_78
Dear all, Similar subjects about index data have already been posted, but I would like your advice. I use solr analysers to process fields, like synonyms, stopwords, ... and I cannot see the result without using a special tool (like LukeRequestHandler for example). I would like to copy the index

Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

2012-09-07 Thread Erick Erickson
Thank the guys who actually fixed it! Thanks for bringing this up, and please let us know if Yonik's patch fixes your problem Best Erick On Thu, Sep 6, 2012 at 11:39 PM, guenter.hip...@unibas.ch wrote: > Erick, thanks for response! > Our use case is very straight forward and basic. > - no c

Re: groups.limit=0 in sharding core results in IllegalArgumentException

2012-09-07 Thread yriveiro
Hi, I have the same issue using solr 4.0-ALPHA. -- View this message in context: http://lucene.472066.n3.nabble.com/groups-limit-0-in-sharding-core-results-in-IllegalArgumentException-tp4006086p4006110.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 5:04 PM, Yonik Seeley wrote: > On Fri, Sep 7, 2012 at 9:39 AM, Erik Hatcher wrote: >> A "trie" field probably doesn't work properly, as it indexes multiple terms >> per value and you'd get odd values. > > I don't know about pivot faceting, but all of the other types of > f

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 4:39 PM, Erik Hatcher wrote: >> Just to be clear, as I'm not logged onto the dev server at the moment >> but it was implied in an earlier mail: Any field that is to be pivoted >> on needs to be a string field? Is that documented, as I cannot find >> that in the docs. > > No,

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Yonik Seeley
On Fri, Sep 7, 2012 at 9:39 AM, Erik Hatcher wrote: > A "trie" field probably doesn't work properly, as it indexes multiple terms > per value and you'd get odd values. I don't know about pivot faceting, but all of the other types of faceting take this into account (hence faceting works fine on t

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Erik Hatcher
On Sep 7, 2012, at 09:29 , Dotan Cohen wrote: > On Fri, Sep 7, 2012 at 4:05 PM, Erik Hatcher wrote: > >> Ranges won't work at all; pivots are purely by individual term currently. >> >> If you want to pivot by ranges, and you can define those ranges during >> indexing, then you could make a

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 4:05 PM, Erik Hatcher wrote: > Ranges won't work at all; pivots are purely by individual term currently. > > If you want to pivot by ranges, and you can define those ranges during > indexing, then you could make a field that represented which range each > document is i
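The index-time workaround quoted above (a field that records which range each document falls in) could look roughly like the sketch below. The boundaries, labels, field names, and helper names are all hypothetical, and the sketch assumes non-negative numeric values; the thread itself does not specify any of them.

```python
from bisect import bisect_right

# HYPOTHETICAL range boundaries and labels; choose ones that fit your data.
BOUNDS = [10, 100, 1000]
LABELS = ["0-9", "10-99", "100-999", "1000+"]


def range_label(value: int) -> str:
    """Map a non-negative value to the label of the range containing it."""
    return LABELS[bisect_right(BOUNDS, value)]


def with_range_field(doc: dict, source_field: str, range_field: str) -> dict:
    """Return a copy of `doc` with an extra single-term range field.

    The extra field holds one plain term per document, so it can be
    pivoted on like any other string field.
    """
    enriched = dict(doc)
    enriched[range_field] = range_label(doc[source_field])
    return enriched


if __name__ == "__main__":
    print(with_range_field({"id": "1", "price": 42}, "price", "price_range"))
```

The enriched document would then be sent to Solr as usual, with the range field declared as a string type in the schema.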

Re: Solr 4.0alpha: edismax complaints on certain characters

2012-09-07 Thread Alexandre Rafalovitch
Thank you. I can confirm that moving to Beta has made that problem go away. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't see

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Erik Hatcher
On Sep 7, 2012, at 08:36 , Dotan Cohen wrote: > On Fri, Sep 7, 2012 at 12:23 PM, Erik Hatcher wrote: >> Pivot facets currently only work with individual terms, not ranges. >> >> The response you provided does look odd in that there are duplicate >> timestamps listed, but pivots were only imple

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Dotan Cohen
On Fri, Sep 7, 2012 at 12:23 PM, Erik Hatcher wrote: > Pivot facets currently only work with individual terms, not ranges. > > The response you provided does look odd in that there are duplicate > timestamps listed, but pivots were only implemented for textual (string being > the most common typ

RE: Is Boilerpipe usable through Solr ExtractingUpdateHandler or the DIH?

2012-09-07 Thread Markus Jelsma
It works indeed: https://issues.apache.org/jira/browse/SOLR-3808 -Original message- > From:Markus Jelsma > Sent: Fri 07-Sep-2012 10:40 > To: solr-user@lucene.apache.org > Subject: RE: Is Boilerpipe usable through Solr ExtractingUpdateHandler or the > DIH? > > Hi, > > It should not b

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread dan sutton
Hi, If like most people you have application server(s) in front of solr, the simplest and most secure option is to bind solr to a local address (192.168.* or 10.0.0.*). The app server talks to solr via the local (a.k.a blackhole) ip address that no-one from outside can ever access as it's not rout

Marco Scalone is out of the office.

2012-09-07 Thread Marco Scalone
I will be out of the office from Fri 07/09/2012 and will not return until Thu 20/09/2012. I will reply to your message when I return.

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Erik Hatcher
Pivot facets currently only work with individual terms, not ranges. The response you provided does look odd in that there are duplicate timestamps listed, but pivots were only implemented for textual (string being the most common type) fields initially. Erik On Sep 6, 2012, at 19:04 ,

groups.limit=0 in sharding core results in IllegalArgumentException

2012-09-07 Thread mechravi25
Hi, I'm using Solr version 3.6.1. I have kept corex as the common core, i.e. I've used the sharding concept on this core to get the indexed data from all the other cores. Here, if I use grouping with groups.limit=0, it results in the following exception: numHits must be > 0; please use TotalHit

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread Tomas Zerolo
On Fri, Sep 07, 2012 at 08:50:58AM +0200, Paul Libbrecht wrote: > Erick, > > I think that should be described differently... > You need to set-up protected access for some paths. > /update is one of them. > And you could make this protected at the jetty level or using Apache proxies > and rewrite

RE: Is Boilerpipe usable through Solr ExtractingUpdateHandler or the DIH?

2012-09-07 Thread Markus Jelsma
Hi, It should not be so hard but it looks like the current SolrContentHandler builds up the document via SAX-events. You could pass a BoilerpipeContentHandler((ContentHandler)parsingHandler, BoilerpipeExtractor) to the parser in ExtractingDocumentLoader.java. It should work. Markus -O

SOLR 4.0 DataImport frozen or fails with WARNING: Unable to read: dataimport.properties?

2012-09-07 Thread deniz
Hi all, I have been trying to index my data from a MySQL db, but somehow I can't index anything, and I don't see any exception / error in the logs, except a warning which is highlighted below... Here is my db-config's connection string: (I can connect to the db from command line by using the above sett