Re: Solr and DateTimes - bug?

2011-09-12 Thread Nicklas Overgaard
Hi Mauricio, Thanks for the suggestions :) I'm already running Mono 2.10.5, so I should be safe. And thanks to everybody for the quick answers and friendly attitude. Best regards, Nicklas On 2011-09-13 03:01, Mauricio Scheffer wrote: Hi Nicklas, Use a nullable DateTime type instead of MinValue

Re: question about Field Collapsing/ grouping

2011-09-12 Thread Jayendra Patil
At the time we implemented the feature, there was no straightforward solution. What we did was facet on the grouped-by field and count the facets. This gives you the distinct count for the groups. You may also want to check the patch at https://issues.apache.org/jira/browse/SOLR-2242, whic
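The facet-counting workaround described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the helper names and the hand-written sample response are assumptions, and the field name `industry` is borrowed from the original question in this thread. It relies on Solr's default JSON layout for facet fields, a flat list of alternating values and counts.

```python
from urllib.parse import urlencode


def group_count_params(query, group_field):
    """Build request parameters that facet on the field used for grouping.

    rows=0 skips the documents themselves; facet.limit=-1 asks for every
    facet value and facet.mincount=1 drops values that match nothing.
    """
    return {
        "q": query,
        "rows": 0,
        "facet": "true",
        "facet.field": group_field,
        "facet.limit": -1,
        "facet.mincount": 1,
        "wt": "json",
    }


def count_groups(response, group_field):
    """Count the distinct facet values in a parsed JSON response.

    Assumes the default flat layout: [value, count, value, count, ...].
    """
    flat = response["facet_counts"]["facet_fields"][group_field]
    return len(flat) // 2


# Hand-written sample response (hypothetical data, for illustration only).
sample = {
    "facet_counts": {
        "facet_fields": {"industry": ["banking", 12, "retail", 7, "energy", 3]}
    }
}

params = group_count_params("+(Content:is Content:the)", "industry")
print(urlencode(params))
print(count_groups(sample, "industry"))  # 3
```

With facet.limit=-1 the response grows with the number of distinct values, so this is probably only practical when the grouping field has a modest number of values.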

question about Field Collapsing/ grouping

2011-09-12 Thread Ahson Iqbal
Hi, Is it possible to get the number of groups that matched a specified query? Let's say there are three fields in the index (DocumentID, Content, Industry) and I want to query as +(Content:is Content:the) group=true&group.field=industry. Is it possible to get how many industries matched wit

Re: Re; DIH Scheduling

2011-09-12 Thread Bill Bell
You can easily use cron with curl to do what you want to do. On 9/12/11 2:47 PM, "Pulkit Singhal" wrote: >I don't see anywhere in: >http://issues.apache.org/jira/browse/SOLR-2305 >any statement that shows the code's inclusion was "decided against" >when did this happen and what is needed from th

Re: indexing data from rich documents - Tika with solr3.1

2011-09-12 Thread scorpking
Hi, Can you explain this problem to me? I have indexed data from multiple files using the Tika libs, and I have indexed data over HTTP, but only from one file (ex: http://myweb/filename.pdf). Now I have many file formats under an HTTP path (ex: http://myweb/files/). I tried to index data from an HTTP path but it's not

Re: Solr and DateTimes - bug?

2011-09-12 Thread Mauricio Scheffer
Hi Nicklas, Use a nullable DateTime type instead of MinValue. It's semantically more correct, and SolrNet will do the right mapping. I also heard that Mono had a bug in date parsing, it didn't behave just like .NET : https://github.com/mausch/SolrNet/commit/f3a76ea5535633f4b301e644e25eb2dc7f0cb7ef

RE: Weird behaviors with not operators.

2011-09-12 Thread Patrick Sauts
I mean it's a known bug. Hostetter AND (-chris *:*) Should do the trick. Depending on your request. NAME:(-chris *:*) -Original Message- From: Patrick Sauts [mailto:patrick.via...@gmail.com] Sent: Monday, September 12, 2011 3:57 PM To: solr-user@lucene.apache.org Subject: RE: Weird b

RE: Weird behaviors with not operators.

2011-09-12 Thread Patrick Sauts
Maybe this will answer your question http://wiki.apache.org/solr/FAQ Why does 'foo AND -baz' match docs, but 'foo AND (-bar)' doesn't ? Boolean queries must have at least one "positive" expression (ie; MUST or SHOULD) in order to match. Solr tries to help with this, and if asked to execute a Bool
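The rule quoted from the FAQ, together with the `*:*` workaround suggested earlier in this thread, can be illustrated with a small query-building helper. This is only a sketch: the function name `positive_form` is hypothetical, and it handles plain terms only (no escaping or field prefixes).

```python
def positive_form(negated_terms):
    """Return a parenthesised clause that excludes the given terms.

    A nested clause such as (-bar) has no positive expression, so it
    matches nothing. Adding the match-all query *:* gives the clause
    something positive to subtract from.
    """
    if not negated_terms:
        raise ValueError("at least one term to exclude is required")
    negatives = " ".join("-" + term for term in negated_terms)
    return "(*:* " + negatives + ")"


query = "foo AND " + positive_form(["bar"])
print(query)  # foo AND (*:* -bar)
```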

Re: Weird behaviors with not operators.

2011-09-12 Thread Chris Hostetter
: I'm crashing into a weird behavior with - operators. I went ahead and added a FAQ on this using some text from a previous nearly identical email ... https://wiki.apache.org/solr/FAQ#Why_does_.27foo_AND_-baz.27_match_docs.2C_but_.27foo_AND_.28-bar.29.27_doesn.27t_.3F please reply if you have

How to return a function result instead of doclist in the Solr collapsing/grouping feature?

2011-09-12 Thread Pablo Ricco
I have the following solr fields in schema.xml: - id (string) - name (string) - category(string) - latitude (double) - longitude(double) Is it possible to make a query that groups by category and returns the average of latitude and longitude instead of the doclist? Thanks, Pablo

Re: Solr: Return field names that contain search term

2011-09-12 Thread Rahul Warawdekar
Thanks Chris ! Will try out the second approach you suggested and share my findings. On Mon, Sep 12, 2011 at 5:03 PM, Chris Hostetter wrote: > > : > Would highly appreciate if someone can suggest other efficient ways to > : > address this kind of a requirement. > > one approach would be to index

Re: Solr: Return field names that contain search term

2011-09-12 Thread Chris Hostetter
: > Would highly appreciate if someone can suggest other efficient ways to : > address this kind of a requirement. one approach would be to index each attachment as its own document and search those. you could then use things like the group collapsing features to return only the "main" type

Re: Re; DIH Scheduling

2011-09-12 Thread Pulkit Singhal
I don't see anywhere in: http://issues.apache.org/jira/browse/SOLR-2305 any statement that shows the code's inclusion was "decided against" when did this happen and what is needed from the community before someone with the powers to do so will actually commit this? 2011/6/24 Noble Paul നോബിള്‍ नोब

Re: pagination with grouping

2011-09-12 Thread alxsss
Is case #2 planned to be coded in the future releases? Thanks. Alex. -Original Message- From: Bill Bell To: solr-user Sent: Thu, Sep 8, 2011 10:17 pm Subject: Re: pagination with grouping There are 2 use cases: 1. rows=10 means 10 groups. 2. rows=10 means to results (irr

[Commercial training announcement] Lucene training at Lucene EuroCon, Barcelona - Oct. 17,18, 2011

2011-09-12 Thread Erik Hatcher
http://www.lucidimagination.com/blog/2011/09/12/learn-lucene/ - pasted below too Hi everyone... I'm not usually much on advertising/hyping events where I speak and teach, but I'm really interested in drumming up a solid attendance for our Lucene training that I'll be teaching at Lucene EuroCon i

Re: Parameter not working for master/slave

2011-09-12 Thread Pulkit Singhal
Hello Bill, I can't really answer your question about replication being supported on Solr3.3 (I use trunk 4.x myself) BUT I can tell you that if each Solr node has just one core ... only then does it make sense to use -Denable.master=true and -Denable.slave=true ... otherwise, as Yury points out,

How to combine RSS w/ Tika when using Data Import Handler (DIH)

2011-09-12 Thread Pulkit Singhal
Given an RSS raw feed source link such as the following: http://persistent.info/cgi-bin/feed-proxy?url=http%3A%2F%2Fwww.amazon.com%2Frss%2Ftag%2Fblu-ray%2Fnew%2Fref%3Dtag_rsh_hl_ersn I can easily get to the value of the description for an item like so: But the content of "description" happens to

Re: Running solr on small amounts of RAM

2011-09-12 Thread Chris Hostetter
Beyond the suggestions already made, i would add: a) being really aggressive about stop words can help keep the index size down, which can help reduce the amount of memory needed to scan the term lists b) faceting w/o any caching is likely going to be too slow to be acceptable. c) don't sor

RE: select query does not find indexed pdf document

2011-09-12 Thread Bob Sandiford
Hi, Michael. Well, the stock answer is, 'it depends' For example - would you want to be able to search filename without searching file contents, or would you always search both of them together? If both, then copy both the file name and the parsed file content from the pdf into a single searc

Re: Parameter not working for master/slave

2011-09-12 Thread Erik Hatcher
On Sep 11, 2011, at 23:24 , William Bell wrote: > I am using 3.3 SOLR. I tried passing in -Denable.master=true and > -Denable.slave=true on the Slave machine. > Then I changed solrconfig.xml to reference each as per: > > http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BA

Re: Parameter not working for master/slave

2011-09-12 Thread Yury Kats
On 9/11/2011 11:24 PM, William Bell wrote: > I am using 3.3 SOLR. I tried passing in -Denable.master=true and > -Denable.slave=true on the Slave machine. > Then I changed solrconfig.xml to reference each as per: > > http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slav

Re: Solr: Return field names that contain search term

2011-09-12 Thread darren
I also would like to know the answer to this. But my feeling is that you can't do what you want. I also had to use the highlighting workaround and aggregate dynamic field to accomplish the inability of multivalued fields to accommodate it. On Mon, 12 Sep 2011 11:44:01 -0400, Rahul Warawdekar wro

Solr: Return field names that contain search term

2011-09-12 Thread Rahul Warawdekar
Hi, I have a query on Solr search as follows. I am indexing an entity which includes a multivalued field using DIH. This multivalued field contains content from multiple attachments for a single entity. Now, e.g., if I search for the term "solr", will I be able to know which field contains t

London Open Source Search Social - Tuesday 18th October

2011-09-12 Thread Richard Marr
Hi all, That's right, hold on to your hats, we're holding another London Search Social on the 18th Oct. http://www.meetup.com/london-search-social/events/33218292/ Venue is still TBD, but highly likely to be a quiet(ish) central London pub. There's usually a healthy mix of experience and backgro

Re: Stemming and other tokenizers

2011-09-12 Thread Jan Høydahl
Hi, Do they? Can you explain the layout of the documents? There are two ways to handle multi lingual docs. If all your docs have both an English and a Norwegian version, you may either split these into two separate documents, each with the "language" field filled by LangId - which then also l

Re: FastVectorHighlighter with wildcard queries

2011-09-12 Thread Rahul Warawdekar
Hi Koji, Thanks for the information ! I will try the patches provided by you. On 9/8/11, Koji Sekiguchi wrote: > (11/09/09 6:16), Rahul Warawdekar wrote: >> Hi, >> >> I am currently evaluating the FastVectorHighlighter in a Solr search based >> project and have a couple of questions >> >> 1. Is

RE: How to serach on specific file types ?

2011-09-12 Thread Jaeger, Jay - DOT
Some possibilities: 1) Put the file extension into your index (that is what we did when we were testing indexing documents with Solr) 2) Put a mime type for the document into your index. 3) Put the whole file name / URL into your index, and match on part of the name. This will give some false p
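Option 1 above (indexing the file extension) could be queried with a filter like the one built below. This is a sketch under assumptions: the field name `extension` is hypothetical and would have to exist in your schema, and the helper name is made up for illustration.

```python
def file_type_filter(field, extensions):
    """Build a filter query that keeps only documents whose extension
    field matches one of the given file types."""
    if not extensions:
        raise ValueError("at least one extension is required")
    # Normalise ".PDF" or "Pdf" to "pdf" so values match a lower-cased field.
    cleaned = [ext.lower().lstrip(".") for ext in extensions]
    return "%s:(%s)" % (field, " OR ".join(cleaned))


# "extension" is an assumed field name, not one taken from the thread.
fq = file_type_filter("extension", ["doc", "docx", ".PDF"])
print(fq)  # extension:(doc OR docx OR pdf)
```

Passing the result as an fq parameter rather than folding it into q keeps the restriction out of relevance scoring.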

Re: MMapDirectory failed to map a 23G compound index segment

2011-09-12 Thread Rich Cariens
Thanks. It's definitely repeatable and I may spend some time plumbing this further. I'll let the list know if I find anything. The problem went away once I optimized the index down to a single segment using a simple IndexWriter driver. This was a bit strange since the resulting index contained sim

RE: Master Slave Question

2011-09-12 Thread Jaeger, Jay - DOT
You could prevent queries to the master by limiting what IP addresses are allowed to communicate with it, or by modifying web.xml to put different security on /update vs. /select . We took a simplistic approach. We did some load testing, and discovered that we could handle our expected update

RE: question about StandardAnalyzer, differences between solr 1.4 and solr 3.3

2011-09-12 Thread Jaeger, Jay - DOT
Looking at the Wiki ( http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters ), it looks like the solr.StandardTokenizerFactory changed with Solr 3.1 . We use solr.KeywordTokenizerFactory for our middle names (and then also throw in solr.LowerCaseFilterFactory to normalize to lower case).

RE: select query does not find indexed pdf document

2011-09-12 Thread Bob Sandiford
Um - looks like you specified your id value as "pdfy", which is reflected in the results from the "*:*" query, but your id query is searching for "vpn", hence no matches... What does this query yield? http://www/SearchApp/select/?q=id:pdfy Bob Sandiford | Lead Software Engineer | SirsiDynix P:

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Martijn v Groningen
The changes to that class were minor. There is only support for parsing a grouped response. Check the QueryResponse class; there is a method getGroupResponse(). I ran into similar exceptions when creating the QueryResponseTest#testGroupResponse test. The test uses an XML response from a file. On 12 Se

Re: is it possibler to do a scheduling internally in solr application?

2011-09-12 Thread O. Klein
The easiest way is to use CRON and cURL for this. So add something like curl http://localhost:8080/solr/dataimport?command=full-import to your cron. -- View this message in context: http://lucene.472066.n3.nabble.com/is-it-possibler-to-do-a-scheduling-internally-in-solr-application-tp3329381p3329
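The same idea can be expressed as a small Python script run from cron in place of the curl command. This is a sketch, not part of the original reply: the function names are hypothetical, and the base URL is the one from the message above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# The two DIH commands mentioned in this thread.
ALLOWED_COMMANDS = ("full-import", "delta-import")


def dataimport_url(base_url, command):
    """Build the DataImportHandler URL for the given command."""
    if command not in ALLOWED_COMMANDS:
        raise ValueError("unsupported command: %s" % command)
    return base_url.rstrip("/") + "/dataimport?" + urlencode({"command": command})


def trigger(base_url, command, timeout=30):
    """Send the request and return the HTTP status code.

    Not called here, because it needs a running Solr instance.
    """
    with urlopen(dataimport_url(base_url, command), timeout=timeout) as resp:
        return resp.status


print(dataimport_url("http://localhost:8080/solr", "full-import"))
# http://localhost:8080/solr/dataimport?command=full-import
```

A cron entry would then invoke a script that calls `trigger(...)` on whatever schedule suits the data source.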

Re: select query does not find indexed pdf document

2011-09-12 Thread Michael Dockery
http://www/SearchApp/select/?q=id:vpn yields this: - - 0 15 - id:vpn * http://www/SearchApp/select/?q=*:* yields this: - - 0 16 - *.* - - doc - application/pdf pdfy 2011-05-20T

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Kirill Lykov
Martijn, I can't find the fixed version. I've got the last version of SolrJ but I see only minor changes in XMLResponseParser.java. And it doesn't support grouping yet. I also checked branch_3x, branch for 3.4. On Mon, Sep 12, 2011 at 5:45 PM, Martijn v Groningen wrote: > Also the error you desc

Re: Solandra - select query error

2011-09-12 Thread tom135
It's complicated to give you sample data. But this error depends on the size of the data. I indexed 200 docs and this error did not occur. But I need many more (i.e. 5 000 000), so if I try to index 2000 docs then this error appears. -- View this message in context: http://lucene.472066.n3.nabble

Re: Solandra - select query error

2011-09-12 Thread Jake Luciani
Hi, Solandra-specific issues should be raised on http://github.com/tjake/Solandra/issues Could you also provide some sample data and schema I can try to reproduce with? Thanks, Jake On Mon, Sep 12, 2011 at 7:57 AM, tom135 wrote: > Hello, > > I have some index and two search query: > 1. http:/

Solandra - select query error

2011-09-12 Thread tom135
Hello, I have an index and two search queries: 1. http://127.0.0.1:8983/solandra/INDEX_NAME/select?q=type:(3 2 1) AND category:(2 1) AND text:(WORD1 WORD2 WORD3 WORD4 WORD5)&facet.field=creation_date&facet=true&wt=javabin&version=2 This query works well. 2. http://127.0.0.1:8983/solandra/INDEX

Re: solr equivalent of "select distinct"

2011-09-12 Thread lee carroll
if you have a limited set of searches which need to use this and they act on a limited known set of fields you can concat fields at index time and then facet PK FLD1 FLD2FLD3 FLD4 FLD5 copy45 AB0 AB 0 x yx y AB1 AB 1 x

is it possibler to do a scheduling internally in solr application?

2011-09-12 Thread vighnesh
Hi all, I am unable to do scheduling in Solr to execute commands like full-import and delta-import. Is it possible to do scheduling internally in the Solr application? -- View this message in context: http://lucene.472066.n3.nabble.com/is-it-possibler-to-do-a-scheduling-internally-in

Re: Nested documents

2011-09-12 Thread Martijn v Groningen
To support this, we also need to implement indexing block of documents in Solr. Basically the UpdateHandler should also use this method: IndexWriter#addDocuments(Collection documents) On 12 September 2011 01:01, Michael McCandless wrote: > Even if it applies, this is for Lucene.  I don't think we

Fwd: How to serach on specific file types ?

2011-09-12 Thread ahmad ajiloo
Hello, I want to search articles, so I need to find only specific file types like doc, docx, and pdf. I don't need any HTML pages. Thus the results of our search should consist only of doc, docx, and pdf files. Can you help me?

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Martijn v Groningen
Also the error you described when wt=xml and using SolrJ is also fixed in 3.4 (and in trunk / branch3x). You can wait for the 3.4 release or use a nightly 3x build. Martijn On 12 September 2011 12:41, Sanal K Stephen wrote: > Kirill, > > Parsing the grouped result using SolrJ is not releas

Re: Problem with SolrJ and Grouping

2011-09-12 Thread Sanal K Stephen
Kirill, Parsing the grouped result using SolrJ is not released yet, I think; it's going to be released with Solr 3.4.0. SolrJ client cannot parse grouped and range facets results: SOLR-2523. See the release notes of Solr 3.4.0: http://wiki.apache.org/solr/ReleaseNote34 On Mon, Sep 12, 2011 at 3

Problem with SolrJ and Grouping

2011-09-12 Thread Kirill Lykov
I found that SolrQuery doesn’t work with grouping. I constructed SolrQuery this way: solrQuery = constructFullSearchQuery(searchParams); solrQuery.set("group", true); solrQuery.set("group.field", "GeonameId"); Solr successfully handles request and writes about that in log: INFO: [] webapp=/solr

Re: Solr and DateTimes - bug?

2011-09-12 Thread Nicklas Overgaard
I see. I'm using that date to flag that my entity "has not yet ended". I can just use another constant which Solr is capable of returning in the correct format. The nice thing about DateTime.MinValue is that it's just part of the .net framework :) Hope that the issue is resolved at some point.

Re: Document row in solr Result

2011-09-12 Thread Eric Grobler
Hi Pierre, Great idea, that will speed things up! Thank you very much. Regards Ericz On Mon, Sep 12, 2011 at 10:19 AM, Pierre GOSSE wrote: > Hi Eric, > > If you want a query informing one customer of its product row at any given > time, the easiest way is to filter on submission date greater

Re: OOM issue

2011-09-12 Thread Manish Bafna
Reducing the cache sizes is definitely going to reduce heap usage. Can you run those xlsx files separately with Tika and see if you get the OOM issue? On Mon, Sep 12, 2011 at 3:09 PM, abhijit bashetti wrote: > I am facing the OOM issue. > > OTHER than increasing the RAM , Can we change some other par

OOM issue

2011-09-12 Thread abhijit bashetti
I am facing the OOM issue. Other than increasing the RAM, can we change some other parameters to avoid the OOM issue, such as minimizing the filter cache size, document cache size, etc.? Can you suggest some other option to avoid the OOM issue? Thanks in advance! Regards, Abhijit

Re: Stemming and other tokenizers

2011-09-12 Thread Manish Bafna
What if a single document has multiple languages? On Mon, Sep 12, 2011 at 2:23 PM, Jan Høydahl wrote: > Hi > > Everybody else use dedicated field per language, so why can't you? > Please explain your use case, and perhaps we can better help understand > what you're trying to do. > Do you always kn

RE: Document row in solr Result

2011-09-12 Thread Pierre GOSSE
Hi Eric, If you want a query informing one customer of its product row at any given time, the easiest way is to filter on submission date greater than this customer's and return the result count. If you have 500 products with an earlier submission date, your row number is 501. Hope this helps,
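The arithmetic in this suggestion can be sketched as below. It assumes results are sorted newest first, so products submitted later than the customer's are the ones ranked ahead of it. The field name `submission_date` and the helper names are hypothetical; the count of newer products would come from `numFound` in the response to a rows=0 query using the filter.

```python
def newer_than_filter(date_field, submitted_at):
    """Build a range filter matching documents submitted strictly after
    the given timestamp (exclusive bounds on both ends)."""
    return "%s:{%s TO *}" % (date_field, submitted_at)


def row_number(num_found_newer):
    """Row of the product: one past the number of products ranked ahead of it."""
    return num_found_newer + 1


def page_number(row, rows_per_page):
    """1-based page on which a given 1-based row appears."""
    return (row - 1) // rows_per_page + 1


# "submission_date" and the timestamp are made-up values for illustration.
fq = newer_than_filter("submission_date", "2011-09-01T00:00:00Z")
print(fq)  # submission_date:{2011-09-01T00:00:00Z TO *}

row = row_number(500)
print(row)                   # 501
print(page_number(row, 10))  # 51
```

Only the count is fetched, so the cost does not depend on how far down the list the product sits.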

Re: Document row in solr Result

2011-09-12 Thread Eric Grobler
Hi Manish, Thank you for your time. For upselling reasons I want to inform the customer that: "your product is on the last page of the search result. However, click here to put your product back on the first page..." Here is an example: I have a phone with productid 635001 in the iphone categor

Re: select query does not find indexed pdf document

2011-09-12 Thread Jan Høydahl
Hi, What do you get from a query http://www/SearchApp/select/?q=*:* or http://www/SearchApp/select/?q=id:vpn ? You may not have mapped the fields correctly to your schema? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 12. sep.

Re: Stemming and other tokenizers

2011-09-12 Thread Jan Høydahl
Hi Everybody else uses a dedicated field per language, so why can't you? Please explain your use case, and perhaps we can better help understand what you're trying to do. Do you always know the query language in advance? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Sol

Re: Document row in solr Result

2011-09-12 Thread Manish Bafna
You might not be able to find the row index. Can you post your query in detail, with the kind of inputs and outputs you are expecting? On Mon, Sep 12, 2011 at 2:01 PM, Eric Grobler wrote: > Hi Manish, > > Thanks for your reply - but how will that return me the row index of the > original query. > > Re

Re: Document row in solr Result

2011-09-12 Thread Eric Grobler
Hi Manish, Thanks for your reply - but how will that return me the row index of the original query. Regards Ericz On Mon, Sep 12, 2011 at 9:24 AM, Manish Bafna wrote: > fq -> filter query parameter searches within the results. > > On Mon, Sep 12, 2011 at 1:49 PM, Eric Grobler >wrote: > > > Hi

Re: Document row in solr Result

2011-09-12 Thread Manish Bafna
fq -> filter query parameter searches within the results. On Mon, Sep 12, 2011 at 1:49 PM, Eric Grobler wrote: > Hi Solr experts, > > If you have a site with products sorted by submission date, the product of > a > customer might be on page 1 on the first day, and then move down to page x > as ot

Document row in solr Result

2011-09-12 Thread Eric Grobler
Hi Solr experts, If you have a site with products sorted by submission date, the product of a customer might be on page 1 on the first day, and then move down to page x as other customers submit newer entries. To find the row of a product you can of course run the query and loop through the resul

Re:OOM issue

2011-09-12 Thread abhijit bashetti
Yes, I am using Tika for content extraction. The xlsx file size is 25MB. Is there any other option to resolve the OOM issue rather than increasing the RAM? Can we change some other configuration param of Solr to avoid the OOM issue? Are you using Tika to do the extraction of content? You might

Re: Running solr on small amounts of RAM

2011-09-12 Thread Toke Eskildsen
On Fri, 2011-09-09 at 18:48 +0200, Mike Austin wrote: > Our index is very small with 100k documents and a light load at the moment. > If I wanted to use the smallest possible RAM on the server, how would I do > this and what are the issues? The index size depends just as much on the size of the do

Re: OOM issue

2011-09-12 Thread Manish Bafna
Are you using Tika to do the extraction of content? You might be getting OOM because of huge xlsx file. Try having bigger RAM and you might not get the issue. On Mon, Sep 12, 2011 at 12:44 PM, abhijit bashetti < abhijitbashe...@gmail.com> wrote: > Hi, > > I am getting the OOM error. > > I am wor

OOM issue

2011-09-12 Thread abhijit bashetti
Hi, I am getting the OOM error. I am working with multi-core for solr . I am using DIH for indexing. I have also integrated TIKA for content extraction. I am using ORACLE 10g DB. In the solrconfig.xml , I have added native My indexing server is on linux with 8GB of ram. I am indexing

Re: Will Solr/Lucene crawl multi websites (aka a mini google with faceted search)?

2011-09-12 Thread dpt9876
Thank you for the clarification and help guys, I will try them. On Sep 12, 2011 10:29 AM, "kkrugler [via Lucene]" < ml-node+s472066n332847...@n3.nabble.com> wrote: > > > > On Sep 11, 2011, at 7:04pm, dpt9876 wrote: > >> Hi thanks for the reply. >> >> How does nutch/solr handle the scenario where 1 we