Re: is it possible to run deltaimport command without delta query?

2012-02-15 Thread Ramo Karahasan
Hi, you may want to have a look at http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport hth, Ramo -Original Message- From: nagarjuna [mailto:nagarjuna.avul...@gmail.com] Sent: Thursday, 16 February 2012 07:27 To: solr-user@lucene.apache.org Subject: is it possible to

Using Solr for a rather busy "Yellow Pages"-type index - good idea or not really?

2012-02-15 Thread Alexey Verkhovsky
Hi, all, I'm new here. I've used Solr on a couple of projects before, but didn't need to dive deep into anything until now. These days, I'm doing a spike for a "yellow pages" type search server with the following technical requirements: ~10 million listings in the database. A listing has a name, address,

is it possible to run deltaimport command without delta query?

2012-02-15 Thread nagarjuna
Hi all, I am new to Solr. Can anybody explain delta-import and the delta query to me? I also have the questions below: 1. Is it possible to run delta-import without a delta query? 2. Is it possible to write a delta query without having a last_modified column in the database? If yes, please explain m

RE: Search for hashtags and mentions

2012-02-15 Thread Rohit
Got the problem: I need to use the "types=" parameter to ignore characters like # and @ in WordDelimiterFilterFactory. Regards, Rohit Mobile: +91-9901768202 About Me: http://about.me/rohitg -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: 16 February 2012 06:22 To: so

Re: Spatial Search and faceting

2012-02-15 Thread William Bell
One way to do it is to group by city and then sort=geodist() asc select?group=true&group.field=city&sort=geodist() desc&rows=10&fl=city It might require 2 calls to SOLR to get it the way you want. On Wed, Feb 15, 2012 at 5:51 PM, Eric Grobler wrote: > Hi Solr community, > > I am doing a spatial
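
As a rough illustration of the grouped request described above, the following Python sketch assembles the query string. The group, group.field, sort, rows and fl parameters come from the reply; sfield and pt are borrowed from the original question in this thread. The function name is hypothetical, and the ascending sort is only one reading of the reply, which mentions both asc and desc.

```python
from urllib.parse import urlencode

def build_grouped_geo_query(q, lat, lon, sfield="geopoint", rows=10):
    """Build a 'group by city, nearest first' query string (illustrative only)."""
    params = [
        ("q", q),
        ("group", "true"),           # one group per city
        ("group.field", "city"),
        ("sfield", sfield),          # spatial field that geodist() reads
        ("pt", f"{lat},{lon}"),      # reference point for geodist()
        ("sort", "geodist() asc"),   # assumption: nearest cities first
        ("rows", str(rows)),
        ("fl", "city"),
    ]
    return "select?" + urlencode(params)

print(build_grouped_geo_query("iphone", 49.594857, 8.468614))
```

As the reply notes, a second request may still be needed to fetch the per-city hit counts.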

Re: Can I rebuild an index and remove some fields?

2012-02-15 Thread Li Li
Great. I think you could make it a public tool. Maybe others also need such functionality. On Thu, Feb 16, 2012 at 5:31 AM, Robert Stewart wrote: > I implemented an index shrinker and it works. I reduced my test index > from 6.6 GB to 3.6 GB by removing a single shingled field I did not > need a

Re: Search for hashtags and mentions

2012-02-15 Thread Robert Muir
On Wed, Feb 15, 2012 at 2:04 PM, Rohit wrote: > generateNumberParts="1" catenateWords="1" catenateNumbers="1" > catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0" > preserveOriginal="1" handleAsChar="@#"/> There is no such parameter as 'handleAsChar'. If you want to do this, you need to u

Spatial Search and faceting

2012-02-15 Thread Eric Grobler
Hi Solr community, I am doing a spatial search and then do a facet by city. Is it possible to then sort the faceted cities by distance? We would like to display the hits per city, but sort them by distance. Thanks & Regards Ericz q=iphone fq={!bbox} sfield=geopoint pt=49.594857,8.468614 d=50 fl

Re: Search for hashtags and mentions

2012-02-15 Thread Erick Erickson
We need the rest of your fieldType, it's quite possible that other parts of it are stripping out the characters in question. Try looking at the admin/analysis page. If that doesn't help, please show us the whole fieldType definition and the results of attaching &debugQuery=on to the URL. Best Eri

Re: Language specific tokenizer for purpose of multilingual search in single-core solr,

2012-02-15 Thread Chris Hostetter
: I want to do multilingual search in single-core solr. That requires me to : define language-specific tokenizers in schema.xml. Say for example, I have : two tokenizers, one for English ("en") and one for simplified Chinese : ("zh-cn"). Can I just put following definitions together in one schema.xml

Re: Query in starting solr 3.5

2012-02-15 Thread Chris Hostetter
: WARNING: XML parse warning in "solrres:/dataimport.xml", line 2, column 95: : Include operation failed, reverting to fallback. Resource error reading file : as XML (href='solr/conf/solrconfig_master.xml'). Reason: Can't find resource : 'solr/conf/solrconfig_master.xml' in classpath or : '/solr/a

Re: Size of suggest dictionary

2012-02-15 Thread Em
Hello Mike, have a look at Solr's Schema Browser. Click on "FIELDS", select "label" and have a look at the number of distinct (term-)values. Regards, Em On 15.02.2012 23:07, Mike Hugo wrote: > Hello, > > We're building an auto suggest component based on the "label" field of > documents. Is

Date formatting issue

2012-02-15 Thread Zajkowski, Radoslaw
Hi all, here's an interesting one, in my xml imported if I use very simple xpath like this I will get the date properly imported, however if I use this expression for another node which is nested: I will receive this type of exception: java.text.ParseException: Unparseable date: "Tue Aug

Size of suggest dictionary

2012-02-15 Thread Mike Hugo
Hello, We're building an auto suggest component based on the "label" field of documents. Is there a way to see how many terms are in the dictionary, or how much memory it's taking up? I looked on the statistics page but didn't find anything obvious. Thanks in advance, Mike ps- here's the conf

Re: Can I rebuild an index and remove some fields?

2012-02-15 Thread Robert Stewart
I implemented an index shrinker and it works. I reduced my test index from 6.6 GB to 3.6 GB by removing a single shingled field I did not need anymore. I'm actually using Lucene.Net for this project so code is C# using Lucene.Net 2.9.2 API. But basic idea is: Create an IndexReader wrapper that

Re: feeding mahout cluster output back to solr

2012-02-15 Thread abhayd
I was looking at this http://java.dzone.com/videos/configuring-mahout-clustering It seems possible, but can anyone shed more light, especially on the part about mapping clusters to the original docs? abhay -- View this message in context: http://lucene.472066.n3.nabble.com/feeding-mahout-cluster-output

Re: update extracted docs

2012-02-15 Thread Emmanuel Espina
Solr or Lucene does not update documents. It deletes the old one and replaces it with a new one when it has the same id. So if you create a document with the changed fields only, and the same id, and upload that one, the old one will be erased and replaced with the new one. So THAT behaviour is exp
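
The replace-by-id behaviour described above can be pictured with a toy in-memory model. This is a minimal sketch, not Solr or Lucene code: ToyIndex and resend_with_changes are hypothetical names, and the client-side merge assumes the full old document is available to the client, which a real index only provides for stored fields.

```python
class ToyIndex:
    """In-memory stand-in for an index keyed by a unique id (not Solr code)."""

    def __init__(self):
        self._docs = {}

    def add(self, doc):
        # A document with an existing id replaces the old one wholesale;
        # fields missing from the new document are simply gone.
        self._docs[doc["id"]] = dict(doc)

    def get(self, doc_id):
        return self._docs.get(doc_id)


def resend_with_changes(index, doc_id, changes):
    """Client-side workaround: merge changes into the old document, then re-add it."""
    current = index.get(doc_id) or {"id": doc_id}
    merged = {**current, **changes}
    index.add(merged)
    return merged


index = ToyIndex()
index.add({"id": "1", "title": "old", "body": "extracted text"})
index.add({"id": "1", "title": "new"})   # body is lost
print(index.get("1"))
```

Sending only the changed fields drops everything else, so the usual approach is to resend the complete document.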

Re: Search for hashtags and mentions

2012-02-15 Thread Emmanuel Espina
Do you want to index the hashtags and usernames to different fields? Probably using http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternTokenizerFactory will solve your problem. However I don't fully understand the problem when you search Thanks Emmanuel 2012/2/15 Rohit :

update extracted docs

2012-02-15 Thread Harold Frayman
Hi, I have a Solr 3.5 database which is populated by using /update/extract (configured pretty much as per the examples) and additional metadata. The uploads are handled by a Perl-driven webapp which uses WebService::Solr (which uses behind-the-scenes POSTing). That all works fine. When I come to up

Re: problem with accents

2012-02-15 Thread Erick Erickson
Did you specify the correct field with the search? If you just entered the word in the search box without the field, the search would be made against your default search field (defined in schema.xml). If you go to the "full interface" link on the admin page, you can then click the debug

problem with accents

2012-02-15 Thread R M
Hi, I've got a problem with the configuration of Solr. I have defined a new data type, "text_fr", to handle accents like "é à è". I have added this to my fieldtype definition: Everything seems to be OK, and the data is added correctly. But when I go to this address: http://localhost:8983/solr/admin

Search for hashtags and mentions

2012-02-15 Thread Rohit
Hi, We are using Solr version 3.5 to search through Tweets. I am using WordDelimiterFilterFactory with the following setting, to be able to search for @username or #hashtags. I saw the following patch, but this doesn't seem to be working as I expected. Am I missing something? https://issu

Re: Solr soft commit feature

2012-02-15 Thread Dipti Srivastava
Hi Nagendra, Certainly interesting! Would this work in a Master/slave setup where the reads are from the slaves and all writes are to the master? Regards, Dipti Srivastava On 2/15/12 5:40 AM, "Nagendra Nagarajayya" wrote: > >If you are looking for NRT functionality with Solr 3.5, you may want

Re: Solr multiple cores - multiple databases approach

2012-02-15 Thread Em
Hello Radu, > - I want to index multiple entities from those databases Do you want to combine data of both databases within one document or are you just interested in indexing both databases on their own? If the second applies: You can do it within one core by using a field (i.e. "source") to f

Solr multiple cores - multiple databases approach

2012-02-15 Thread Radu Toev
Hello, I have a use case where I'm trying to integrate Solr: - 2 databases with the same schema - I want to index multiple entities from those databases My question is what is the best way of approaching this topic: - should I create a core for each database and inside that core create a document w

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Chantal Ackermann
I've done something like that by calculating the hours during indexing time (in the script part of the DIH config using java.util.Calendar which gives you all those field values without effort). I've also extracted information on which weekday it is (using the integer constants of Calendar). If you
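
The index-time approach described above can be sketched as follows. The original used java.util.Calendar inside a DIH script; this Python version is only a stand-in for the same idea, and the field names purchase_date, purchase_hour and purchase_weekday are hypothetical. Note that isoweekday() numbers days 1 (Monday) to 7 (Sunday), which differs from the java.util.Calendar constants.

```python
from datetime import datetime

def add_time_facet_fields(doc, date_field="purchase_date"):
    """Return a copy of doc with hour-of-day and weekday fields for faceting."""
    # Assumes Solr-style UTC timestamps such as 2012-02-15T13:45:00Z
    ts = datetime.strptime(doc[date_field], "%Y-%m-%dT%H:%M:%SZ")
    enriched = dict(doc)
    enriched["purchase_hour"] = ts.hour               # 0..23, facet on this
    enriched["purchase_weekday"] = ts.isoweekday()    # 1=Monday .. 7=Sunday
    return enriched

print(add_time_facet_fields({"id": "42", "purchase_date": "2012-02-15T13:45:00Z"}))
```

With fields like these in place, a date range filter on the original date field can be combined with an ordinary field facet on the hour.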

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Jamie Johnson
Thanks guys, that's what I figured; I just wanted to make sure I was going down the right path. On Wed, Feb 15, 2012 at 9:55 AM, Ted Dunning wrote: > Use multiple fields and you get what you want. The extra fields are going > to cost very little and will have a big positive impact. > > On Wed, Feb

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Ted Dunning
Use multiple fields and you get what you want. The extra fields are going to cost very little and will have a big positive impact. On Wed, Feb 15, 2012 at 9:30 AM, Jamie Johnson wrote: > I think it would if I indexed the time information separately. Which > was my original thought, but I was h

Re: Semantic autocomplete with Solr

2012-02-15 Thread Octavian Covalschi
Thank you! I'll check them out. On Wed, Feb 15, 2012 at 6:50 AM, Jan Høydahl wrote: > Check out > http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ > You can feed it anything, such as a log of previous searches, or a > pre-computed dictionary of "item" + "color" combinat

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Yonik Seeley
On Wed, Feb 15, 2012 at 9:30 AM, Jamie Johnson wrote: > I think it would if I indexed the time information separately. Which > was my original thought, but I was hoping to store this in one field > instead of 2. So my idea was I'd store the time portion as a > number (an int might suffice fro

Re: MoreLikeThis Question

2012-02-15 Thread Jamie Johnson
Yes, I agree that ID would be one that would need to be ignored. I don't think specifying them is too difficult; I was just curious if it was possible to do this or not. On Wed, Feb 15, 2012 at 8:41 AM, Chantal Ackermann wrote: > Hi, > > you would not want to include the unique ID and similar stuff

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Jamie Johnson
I think it would if I indexed the time information separately. Which was my original thought, but I was hoping to store this in one field instead of 2. So my idea was I'd store the time portion as a number (an int might suffice from 0 to 24 since I only need this to have that level of granular

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Yonik Seeley
On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson wrote: > I would like to be able to facet based on the time of > day items are purchased across a date span.  I was hoping that I could > do a query of something like date:[NOW-1WEEK TO NOW] and then specify > I wanted facet broken into hourly bins.  

Re: Error Indexing in solr 3.5

2012-02-15 Thread Chantal Ackermann
Hi, I've got these errors when my client used a different SolrJ version from the SOLR server it connected to: SERVER 3.5 responding ---> CLIENT some other version You haven't provided any information on your client, though. Chantal On Wed, 2012-02-15 at 13:09 +0100, mechravi25 wro

Re: Solr as an part of api to unburden databases

2012-02-15 Thread Chantal Ackermann
> > > > does anyone of the mailing list users use solr as an API to avoid database > > queries? [...] > > Like in a... cache? > > Why not use a cache then? (memcached, for example, but there are more). > Good point. A cache only uses lookup by one kind of cache key while SOLR provides lookup b

Re: MoreLikeThis Question

2012-02-15 Thread Chantal Ackermann
Hi, you would not want to include the unique ID and similar stuff, though? No idea whether it would impact the number of hits but it would most probably influence the scoring if nothing else. E.g. if you compare by certain fields, I would expect that a score of 1.0 indicates a match on all of tho

Re: Solr soft commit feature

2012-02-15 Thread Nagendra Nagarajayya
If you are looking for NRT functionality with Solr 3.5, you may want to take a look at Solr 3.5 with RankingAlgorithm. This allows you to add/update documents without a commit while being able to search concurrently. The add/update performance to add 1m docs is about 5000 docs in about 498 ms

Re: Solr as an part of api to unburden databases

2012-02-15 Thread Tomas Zerolo
On Wed, Feb 15, 2012 at 11:48:14AM +0100, Ramo Karahasan wrote: > Hi, > > > > does anyone of the mailing list users use solr as an API to avoid database > queries? [...] Like in a... cache? Why not use a cache then? (memcached, for example, but there are more). Regards -- tomás

Re: Semantic autocomplete with Solr

2012-02-15 Thread Jan Høydahl
Check out http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ You can feed it anything, such as a log of previous searches, or a pre-computed dictionary of "item" + "color" combinations that exist in your DB etc. -- Jan Høydahl, search solution architect Cominvent AS - www

Re: Stemming and accents (HunspellStemFilterFactory)

2012-02-15 Thread Jan Høydahl
Or if you know that you'll always strip accents in your search you may pre-process your pt_PT.dic to remove accents from it and use that custom dictionary instead in Solr. Another alternative could be to extend HunSpellFilter so that it can take in the class name of a TokenFilter class to apply
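
The dictionary pre-processing suggested above could look roughly like this. It is a minimal sketch that only folds accents line by line; it assumes the usual .dic layout of one entry per line, and it does not deduplicate entries that collide after folding, fix the entry count on the first line, or touch the matching .aff file, all of which may need attention in practice. The function names are hypothetical.

```python
import unicodedata

def strip_accents(text):
    """Remove combining marks after canonical decomposition (é -> e, ç -> c)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def strip_accents_from_dic(lines):
    """Apply strip_accents to each line of a .dic-style word list."""
    return [strip_accents(line) for line in lines]

print(strip_accents_from_dic(["2", "ação/A", "café"]))
```

Reading and writing the actual file is left out here because the right encoding depends on the dictionary in use.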

Re: Solr binary response for C#?

2012-02-15 Thread Jan Høydahl
Hi, I just created a JIRA to investigate an Avro based serialization format for Solr: https://issues.apache.org/jira/browse/SOLR-3135 You're welcome to contribute. Guess we'll first need to define schemas, then create an AvroResponseWriter and then support in the C# Solr client. -- Jan Høydahl,

Re: Highlighting stopwords

2012-02-15 Thread O. Klein
Koji Sekiguchi wrote > > (12/02/14 22:25), O. Klein wrote: >> I have not been able to find any logic in the behavior of hl.q and how it >> analyses the query. Could you explain how it is supposed to work? > > Nothing special on hl.q. If you use hl.q, the value of it will be used for > highlighti

Error Indexing in solr 3.5

2012-02-15 Thread mechravi25
Hi, When I tried to index in Solr 3.5 I got the following exception: org.apache.solr.client.solrj.SolrServerException: Error executing query at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95) at com.quartz.test.FullImport.callIndex(FullImport.java:80

MoreLikeThis Requesthandler

2012-02-15 Thread Molidor, Robert
Hi, I'm quite new to Solr. We want to find similar documents based on a MoreLikeThis query. In general this works fine and gives us reasonable results. Now we want to influence the result score by ranking more recent documents higher than older documents. Is this possible with the MoreLikeThis

Solr as an part of api to unburden databases

2012-02-15 Thread Ramo Karahasan
Hi, does anyone of the mailing list users use Solr as an API to avoid database queries? I know that this depends on the type of data. Imagine you have something like Quora's "Q&A" system, which is mostly just "text". If I would embed some of these "Q&A" into my personal site, and would invoke the Q

RE: OR-FilterQuery

2012-02-15 Thread spring
> In other words, there's no attempt to decompose the fq clause > and store parts of it in the cache, it's exact-match or > nothing. Ah ok, thank you.
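
The exact-match behaviour quoted above can be pictured with a toy cache. This is a simplified sketch and not Solr's filterCache implementation: ToyFilterCache is a hypothetical class, and it keys on the raw fq string purely for illustration.

```python
class ToyFilterCache:
    """Caches one document set per complete filter query (illustrative only)."""

    def __init__(self):
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def doc_set(self, fq, compute):
        # The whole fq is the key; clauses inside it are never cached separately.
        if fq in self._cache:
            self.hits += 1
        else:
            self.misses += 1
            self._cache[fq] = frozenset(compute(fq))
        return self._cache[fq]


cache = ToyFilterCache()
lookup = lambda fq: {1, 2}                   # stand-in for the real index search
cache.doc_set("id:(1 OR 2)", lookup)         # miss, result stored
cache.doc_set("id:(1 OR 2)", lookup)         # hit, same fq
cache.doc_set("id:(1 OR 2 OR 3)", lookup)    # miss, different fq
print(cache.hits, cache.misses)
```

In this model an fq that shares clauses with a cached one gets no benefit, which matches the "exact-match or nothing" description.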

RE: OR-FilterQuery

2012-02-15 Thread spring
> > q=some text > > fq=id:(1 OR 2 OR 3...) > > > > Should I better use q:some text AND id:(1 OR 2 OR 3...)? > > > 1. These two options have different scoring. > 2. if you hit the same fq=id:(1 OR 2 OR 3...) many times you have > a benefit due > to reading docset from heap instead of searching on disk

Re: MoreLikeThis Question

2012-02-15 Thread Michael Jakl
Hi! On Wed, Feb 15, 2012 at 07:27, Jamie Johnson wrote: > Is there any way with MLT to say get similar based on all fields or is > it always a requirement to specify the fields? That does not seem to be the case. But you could append the fields parameter in solrconfig.xml: ... Cheers, Micha