Re: SolR - Index problems

2011-12-09 Thread jiggy
Hello guys, thanks for the fast replies. I think Solr is now working with Tomcat and with the Magento shop. My problem now is: how can I index the articles from Magento into Solr? How can I make Solr reindex? In the tutorial I only see how to reindex the documents in the example folder. Many than

SmartChineseAnalyzer

2011-12-09 Thread waynelam
Hi all, I checked the documentation of SmartChineseAnalyzer, and it looks like it is for Simplified Chinese only. Has anyone tried to include Traditional Chinese characters as well? The analyzer is based on a dictionary from ICTCLAS1.0. My first thought is that maybe I can get it to work by simply conver

Highlighting uses lots of memory and eventually slows down Solr

2011-12-09 Thread Pranav Prakash
Hi Group, I would like to have highlighting for search and I have the fields indexed with the following schema (Solr 3.4) And the following config 100 20 0.5 [-\w ,/\n\"']{20,200} The problem is that when I turn on highlighting, I face memory issues.

snaptshot rotation - replication handler argument available?

2011-12-09 Thread Torsten Krah
Configured my replication handler on the master with this option: optimize I am running an optimize call on a regular basis (e.g. every week or every day, not the question here) and a snapshot is created. I am wondering where the option is to specify how many snapshots should be kept. Index is ve

new NRT in my case quite useful?

2011-12-09 Thread stockii
I read this article from Mark Miller http://www.lucidimagination.com/blog/2011/07/11/benchmarking-the-new-solr-%E2%80%98near-realtime%E2%80%99-improvements/ Now I want to know if it's useful to update to a new Solr version. My version is: 4.0.0.2010.10.26.08.43.14 I need a really good NRT search fo

Solr Best Practice Configuration

2011-12-09 Thread BenMccarthy
Good Morning. I have now been through the various Solr tutorials and read the SOLR 3 Enterprise server book. I'm now at the point of figuring out if Solr can help us with a scaling problem. I'm looking for advice on the following scenario; any pointers or references will be great: I have two sets

Re: r1201855 broke stats.facet on long fields

2011-12-09 Thread Luis Neves
On 12/08/2011 11:16 PM, Chris Hostetter wrote: ...so if you don't have a version param, or your version param is "1.0" then that would explain this error I have the version param set to "1.4". (If that doesn't fix the problem for you. It doesn't. > then I'm genuinely baffled, and plea

Re: Solr Best Practice Configuration

2011-12-09 Thread Marc SCHNEIDER
Hi, What about using the delta-import command of the DIH? http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command If you want to have two separate indexes, you could play with the "swap" command. One index would be continuously updated and the other one used for the user requests
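As a rough illustration of the two commands mentioned in this reply, the sketch below only builds the request URLs. The host, port, and core names are hypothetical, and it assumes the DataImportHandler is registered at /dataimport in solrconfig.xml and that the CoreAdmin handler is reachable at /admin/cores.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # hypothetical host and port


def delta_import_url(core: str) -> str:
    """URL asking the DataImportHandler of `core` to run a delta import."""
    # Assumption: the handler is registered at /dataimport in solrconfig.xml.
    return f"{SOLR_BASE}/{core}/dataimport?" + urlencode({"command": "delta-import"})


def swap_cores_url(core: str, other: str) -> str:
    """CoreAdmin URL that swaps the names of two cores."""
    params = {"action": "SWAP", "core": core, "other": other}
    return f"{SOLR_BASE}/admin/cores?" + urlencode(params)


# Hypothetical core names: one core is rebuilt while the other serves searches.
print(delta_import_url("adverts_build"))
print(swap_cores_url("adverts_live", "adverts_build"))
```

Keeping URL construction separate from the HTTP call makes it easy to check the requests before pointing them at a real server.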

Setting group.ngroups=true considerable slows down queries

2011-12-09 Thread Michael Jakl
Hi, I'm using the grouping feature of Solr to return a list of unique documents together with a count of the duplicates. Essentially I use Solr's signature algorithm to create the "signature" field and use grouping on it. To provide good numbers for paging through my result list, I'd like to comp
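To compare timings with and without the group count, it can help to generate both query strings from one place. This is a minimal sketch: the field name "signature" comes from the post, while the query text and the helper name are made up for illustration.

```python
from urllib.parse import urlencode


def grouped_query(q: str, field: str, count_groups: bool) -> str:
    """Build the query string for a grouped search, optionally with group.ngroups."""
    params = [("q", q), ("group", "true"), ("group.field", field)]
    if count_groups:
        # group.ngroups=true asks Solr to also report the number of groups.
        params.append(("group.ngroups", "true"))
    return urlencode(params)


print(grouped_query("title:solr", "signature", count_groups=False))
print(grouped_query("title:solr", "signature", count_groups=True))
```

Sending both variants against the same index isolates the cost of computing the group count from the cost of grouping itself.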

Re: Solr Best Practice Configuration

2011-12-09 Thread Chantal Ackermann
Hi Ben, what I understand from your post is: Advertiser (1) <-> (*) Advert (one-to-many where there can be 50,000 per single Advertiser) Your index entity is based on Advert which means that there can be 50,000 documents in the index that need to be changed if a field of an Advertiser is updated

Re: NRT or similar for Solr 3.5?

2011-12-09 Thread Nagendra Nagarajayya
Steven: Please take a look at Solr with RankingAlgorithm. It offers NRT functionality. You can set your autoCommit to about 15 mins. You can get more information from here: http://solr-ra.tgels.com/wiki/en/Near_Real_Time_Search_ver_3.x Regards, - Nagendra Nagarajayya http://solr-ra.tgels.o

Re: [Announce] Solr-RA, Solr with RankingAlgorithm

2011-12-09 Thread Nagendra Nagarajayya
Spark: BTW, for NRT, you do not need to commit, just set your autocommit to about 15 mins. Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org On 12/6/2011 6:33 PM, yu shen wrote: thanks for the information 2011/12/6 Nagendra Nagarajayya Spark:

Re: Solr Best Practice Configuration

2011-12-09 Thread BenMccarthy
Thanks for the replies guys. The Advert index would be around 1 million records and we have a churn of around 400K record changes per day. Our current system does around 400K updates a day to the index and we have over 72 million searches on the index. I'm more wondering what type of configuratio

Re: Solr Best Practice Configuration

2011-12-09 Thread BenMccarthy
What does (3) imply? You will not be able to facet or sort or group on Adverts using any of the Advertiser fields (as they reside in a different index core). In relation to my last reply, this is exactly what I need to do. Return all adverts where a postcode (translated to lat/long) is within

Re: Replication downtime?? - master slave

2011-12-09 Thread roySolr
Thanks Erick, It's good to hear the slave doesn't notice anything. Roy

Re: debugging failed documents

2011-12-09 Thread Alan Miller
Thanks Cody, I forgot that I had redirected my logs to webapps/solr/log/solr.log The error there was clear enough for me. The output of my DIH query contained some string values and my schema was trying to read the field as an integer. I thought I had to set the field type to integer to be able t

Re: UI support for Multi-Select Facet queries?

2011-12-09 Thread Erik Hatcher
No, multiselect is not wired into /browse. That'd be a nice addition though. Erik On Dec 8, 2011, at 16:30 , PJ Shimmer wrote: > Greetings, > > I see that we can query multiple facets for a search with a syntax like > "fq=grade:A OR grade:B". However, I only know how to do this by mo
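For anyone wiring multi-select into their own front end, Solr's tag/exclude local params ({!tag=...} on the filter, {!ex=...} on the facet field) are the usual building blocks. The sketch below only assembles the request parameters; the tag name and the helper are hypothetical, and values containing spaces would need quoting.

```python
def multiselect_params(field: str, selected: list) -> list:
    """Request parameters for a multi-select facet on `field`."""
    tag = f"{field}_tag"  # hypothetical tag name, any identifier works
    # {!ex=tag} makes the facet counts ignore the filter carrying that tag.
    params = [("facet", "true"), ("facet.field", f"{{!ex={tag}}}{field}")]
    if selected:
        values = " OR ".join(selected)
        # {!tag=tag} labels the filter so the facet above can exclude it.
        params.append(("fq", f"{{!tag={tag}}}{field}:({values})"))
    return params


for name, value in multiselect_params("grade", ["A", "B"]):
    print(f"{name}={value}")
```

With the exclusion in place, the facet still lists the counts for unselected grades, which is what a multi-select UI needs to keep offering them.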

RE: snaptshot rotation - replication handler argument available?

2011-12-09 Thread Dyer, James
Just committed to Solr 3.5 is a "numberToKeep" parameter which lets you tell it to automatically delete the older backups. See http://wiki.apache.org/solr/SolrReplication#HTTP_API , under which is a short quip about this. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311
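A minimal sketch of a backup request that uses the parameter, assuming the replication handler is mounted at /replication on the master; the host name below is hypothetical.

```python
from urllib.parse import urlencode


def backup_url(master: str, number_to_keep: int) -> str:
    """URL that triggers a backup and keeps only the newest N backups."""
    if number_to_keep < 1:
        raise ValueError("number_to_keep must be at least 1")
    params = {"command": "backup", "numberToKeep": number_to_keep}
    # Assumption: the ReplicationHandler is registered at /replication.
    return f"{master}/replication?" + urlencode(params)


print(backup_url("http://master:8983/solr", 2))
```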

Re: UI support for Multi-Select Facet queries?

2011-12-09 Thread PJ Shimmer
Thanks Erik. Is there any available UI that supports multi-select faceting? From: Erik Hatcher To: solr-user@lucene.apache.org Sent: Friday, December 9, 2011 9:46 AM Subject: Re: UI support for Multi-Select Facet queries? No, multiselect is not wired into /

Re: UI support for Multi-Select Facet queries?

2011-12-09 Thread Erik Hatcher
Nothing that I'm aware of, open-source-wise. It'd be a fairly straightforward set of changes to the /browse templates though. Blacklight - http://projectblacklight.org - is a nice UI on top of Solr, and I've seen some demos where multi-select faceting was customized into it, but I don't think

Re: Field collapsing results caching

2011-12-09 Thread Martijn v Groningen
There is no cross query cache for result grouping. The only caching option out there is the group.cache.percent option: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters Martijn On 8 December 2011 14:29, Kissue Kissue wrote: > Hi, > > I was just testing field collapsing in my solr a

Re: Setting group.ngroups=true considerable slows down queries

2011-12-09 Thread Martijn v Groningen
Hi Michael, On what field type are you grouping and what version of Solr are you using? Grouping by string field is faster. Martijn On 9 December 2011 12:46, Michael Jakl wrote: > Hi, I'm using the grouping feature of Solr to return a list of unique > documents together with a count of the dupl

Re: Grouping or Facet ?

2011-12-09 Thread Juan Pablo Mora
Sorry if I don't explain my problem clearly... I need to build a suggester of names based on a prefix. My data are from two categories of people, admins and developers for example. So when the client writes "SAN" my results should be: Prefix: San Developers: Sanchez Garcia, Juan (5)

Re: Solr 3.2 - Results not showing up

2011-12-09 Thread Erick Erickson
OK, so querying *:* from the admin interface returns no results or a specific query that you think should return results doesn't? If the former, that would be very, very weird and would likely be a lack of commit, but that's grasping at straws. Did you just copy your solrconfig.xml file from

Re: RegexQuery performance

2011-12-09 Thread Erick Erickson
Could you show us some examples of the kinds of things you're using regex for? I.e. the raw text and the regex you use to match the example? The reason I ask is that perhaps there are other approaches, especially thinking about some clever analyzing at index time. For instance, perhaps NGrams are

Re: r1201855 broke stats.facet on long fields

2011-12-09 Thread Chris Hostetter
: > (If that doesn't fix the problem for you. : : It doesn't. Definitely odd... : > please file a Jira bug with as much detail as possible about your setup : > (ideally a fully usable solrconfig.xml+schema.xml that demonstrates your : > problem) because the StatsComponentTest most certainly al

Using DisMax with Join possible?

2011-12-09 Thread Otis Gospodnetic
Hi, Is there a reason why we have the "lucene" parser baked into Join? Is there a way to use (e)DisMax with Join? For example, when I do this:     /solr/indexA/select?q={!join fromIndex=indexB from=id to=id}FooBar  Check line 60 in http://search-lucene.com/c/Solr:/core/src/java/org/apache/solr

Re: Suggest component

2011-12-09 Thread kmf
I have the same exact problem as the OP and can't seem to figure out why it's not working. The indexed field to derive suggestions from is "name_auto" which is of type "text_general" in schema.xml. I made some copyFields to have certain fields be put into "name_auto." I can see that "name_auto"

Re: Testing a custom implementation of CommonsHttpSolrServer

2011-12-09 Thread Chris Hostetter
: One thing I want to avoid is having to have a Solr instance set up on : every developer's sandbox in order : for the tests to work. What I'm looking for is an embedded solution : which is started up programmatically : BUT is accessed over HTTP. Take a look at SolrJettyTestBase and the tests that

Re: r1201855 broke stats.facet on long fields

2011-12-09 Thread Yonik Seeley
On Thu, Dec 8, 2011 at 6:16 PM, Chris Hostetter wrote: > Solr can > not reasonably compute stats on a multivalued field Wasn't that added here? https://issues.apache.org/jira/browse/SOLR-1380 -Yonik http://www.lucidimagination.com

Re: Solr Join with Dismax

2011-12-09 Thread Chris Hostetter
: Is there a specific reason why it is hard-coded to use the "lucene" : QParser? I was looking at JoinQParserPlugin.java and here it is in : createParser: : : QParser fromQueryParser = subQuery(v, "lucene"); : : I could pass another param named "fromQueryParser" and use it instead of : "lucene"

Re: Facet values that should always appear

2011-12-09 Thread Chris Hostetter
: Is there a way within Solr to instruct the system that a certain set : of values should always appear regardless of their counts when : faceting? nope ... the only way to force something like this at the moment is to request the count explicitly as a facet.query. -Hoss
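The facet.query approach can be sketched as follows: one facet.query parameter per value that must always be reported, so its count comes back even when it is zero. The field and values here are hypothetical.

```python
def forced_facet_queries(field: str, values: list) -> list:
    """One facet.query per value whose count must appear in the response."""
    params = [("facet", "true")]
    for value in values:
        # Quote the value so multi-word terms are treated as one phrase.
        params.append(("facet.query", f'{field}:"{value}"'))
    return params


for name, value in forced_facet_queries("color", ["red", "dark blue"]):
    print(f"{name}={value}")
```

The counts for these queries are returned under facet_queries in the response rather than under facet_fields, so the client has to merge the two lists itself.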

Re: Possible to facet across two indices, or document types in single index?

2011-12-09 Thread Chris Hostetter
: What you said about faceting is the key. I want to use my existing : edismax configuration to create the scored document result set of type : Y. I don't want to affect their scores, but for each document ID, I : want join it with another type of document (X), which has a field which : cont

Re: cache monitoring tools?

2011-12-09 Thread Chris Hostetter
: The culprit seems to be the merger (frontend) SOLR. Talking to one shard : directly takes substantially less time (1-2 sec). ... : >> > > >>facet.limit=50 Your problem most likely has very little to do with your caches at all -- a facet.limit that high requires sending a very lar

DIH full import and clean

2011-12-09 Thread O. Klein
Can someone explain to me why, when I run full import with clean on, it only runs the last entity, and with clean off I get the behaviour I want (runs both entities)? I thought clean was only to clear the index before running.

Maximum File Size Handled by post.jar / Speed of Deletes?

2011-12-09 Thread kingkong
Hi, We would like to know whether there is a maximum size for an XML file that can be posted to Solr using post.jar, a maximum number of docs, etc. at one time, as well as how fast deletes can be achieved. Our goal is to delete "all documents" belonging to a "given user" then add all the docs again with t
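The delete half of that goal maps onto Solr's XML delete-by-query message. A minimal sketch of building it, assuming a hypothetical user_id field identifies the owner of each document:

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(field: str, value: str) -> str:
    """Build a Solr XML update message that deletes every matching document."""
    root = ET.Element("delete")
    # A single <query> element removes all documents matching field:value.
    ET.SubElement(root, "query").text = f"{field}:{value}"
    return ET.tostring(root, encoding="unicode")


print(delete_by_query_xml("user_id", "42"))
```

The message is posted to the update handler like any other XML update, and the deletion becomes visible to searches only after a commit.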

Re: cache monitoring tools?

2011-12-09 Thread Paul Libbrecht
Allow me to chime in and ask a generic question about monitoring tools for people close to developers: are any of the tools mentioned in this thread actually able to show graphs of loads, e.g. cache counts or CPU load, in parallel to a console log or to an http request log?? I am working on such

possible to do arithmetic on returned values?

2011-12-09 Thread Gabriel Cooper
Is there a way to manipulate the results coming back from SOLR? I have a SOLR 3.5 index that contains values in cents (e.g. "100" in the index represents $1.00) and in certain contexts (e.g. CSV export) I'd like to divide by 100 for that field to provide a user-friendly "in dollars" number. To
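If no server-side option fits, one fallback is to convert on the client before writing the CSV. This is only a sketch of that alternative, not a Solr feature; the function name is made up, and Decimal is used so the division does not introduce floating-point noise.

```python
from decimal import Decimal


def cents_to_dollars(cents: int) -> str:
    """Format an integer number of cents as a dollar amount with two decimals."""
    dollars = Decimal(cents) / 100
    # quantize pins the result to exactly two decimal places.
    return str(dollars.quantize(Decimal("0.01")))


print(cents_to_dollars(100))   # 1.00
print(cents_to_dollars(1999))  # 19.99
```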

Re: solr.VelocityResponseWriter error in version 3.5.0

2011-12-09 Thread Erik Hatcher
My bad. To clarify the issue here... the problem manifests itself only on Solr 3.5 specifically when the example configuration is copied somewhere else (losing the relative path nature to the references). Generally this happens when folks want to deploy into Tomcat. In Solr 3.5, the Velocity

MoreLikeThis questions

2011-12-09 Thread Scott Smith
I'm implementing a MoreLikeThis search. I have a couple of questions. I'm implementing this with solrj so I would appreciate it if any code snippets reflect that. First, I want to provide the text that Solr should check for "interesting words" and do the search on. This means I don't want t

VelocityResponseWriter's future

2011-12-09 Thread Erik Hatcher
So I thought that Solr having a decent HTML search UI out of the box was a good idea. I still do. But it's been a bit of a pain to maintain (originally it was a contrib module, then core, then folks didn't want it as a core dependency, and now it is back as a contrib), and the UI has accumulat

Re: VelocityResponseWriter's future

2011-12-09 Thread Paul Libbrecht
Erik, The VelocityResponseWriter has solved a need of mine: provide an interface that shows off an amount of the solr capability with queries close to a developer and a UI that you can mail to colleagues. The out-of-the-box-ness is crucial here. Adjusting the vm files was also crucial (e.g. to creat

Re: possible to do arithmetic on returned values?

2011-12-09 Thread Erik Hatcher
The one trick that can be done with 3.x is something like this (try this URL on the example app with the example data indexed): http://localhost:8983/solr/browse?v.template.doc=%23set%28$cents=$doc.getFieldValue%28%27price%27%29*100%29%20$cents%20cents Un-urlencoded, this is saying to make th
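The template passed in v.template.doc is easier to read once the percent-encoding is undone. A small sketch that decodes the value from the URL above using only the standard library:

```python
from urllib.parse import unquote

# The v.template.doc value, copied verbatim from the URL in the message.
ENCODED = "%23set%28$cents=$doc.getFieldValue%28%27price%27%29*100%29%20$cents%20cents"

decoded = unquote(ENCODED)
print(decoded)  # #set($cents=$doc.getFieldValue('price')*100) $cents cents
```

The decoded text is a short Velocity template: it assigns the price field multiplied by 100 to $cents and then prints that value followed by the word "cents".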

filterQuery (fq=) vs q differences other than scoring.

2011-12-09 Thread Andrew Lundgren
I know that fq's are used to improve performance by reducing the data set that you score. I have read the documentation that says that non-cached fq's are created in parallel to your query, but would like to know more about how that is done. Does it do a match on all the FQ's, then AND the resu

Suggester for Numbers

2011-12-09 Thread Awasthi, Shishir
Is there a way to make the suggester return suggestions for numbers as well? My use case is a street address, say 123 Street1. When I type 12, the suggester doesn't return any data. Here is my config

Re: VelocityResponseWriter's future

2011-12-09 Thread Erik Hatcher
Paul - Thanks for your feedback. As for JSP... the problem with JSP's is that they must be inside the .war file and that is prohibitive for the flexibility of adjusting the vm files to "create links to the right resource" easily. Certainly choice templating languages are an opinionated kind o

Re: Setting group.ngroups=true considerable slows down queries

2011-12-09 Thread Michael Jakl
Hi! On Fri, Dec 9, 2011 at 17:41, Martijn v Groningen wrote: > On what field type are you grouping and what version of Solr are you > using? Grouping by string field is faster. The field is defined as follows: Grouping itself is quite fast, only computing the number of groups seems to increase

RE: MoreLikeThis questions

2011-12-09 Thread Scott Smith
I realized I probably should have said Solr 3.5 in case that makes a difference. -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Friday, December 09, 2011 2:29 PM To: solr-user@lucene.apache.org Subject: MoreLikeThis questions I'm implementing a MoreLikeThis

Re: VelocityResponseWriter's future

2011-12-09 Thread Erik Hatcher
s/choice templating languages/template language choices/ Also, meant to include * http://today.java.net/pub/a/today/2003/12/16/velocity.html On Dec 9, 2011, at 17:07 , Erik Hatcher wrote: > Paul - > > Thanks for your feedback. > > As for JSP... the problem with JSP's is that they must be ins

Re: VelocityResponseWriter's future

2011-12-09 Thread Paul Libbrecht
Erik, don't argue with me about Velocity, I'm using it several hours a day in XWiki. It's fast and easy but its testing ability is simply... unpredictable. I did not mean to say it is not documented enough but that it could be reformulated as a tutorial wiki page instead of an example software.

Re: Solr Best Practice Configuration

2011-12-09 Thread Erick Erickson
What kind of data gets changed for your adverts? Is it anything ExternalFileField could help with? In the 4.0 (trunk) code there's a limited join capability that may help eventually. Best Erick On Fri, Dec 9, 2011 at 8:33 AM, BenMccarthy wrote: > What does (3) imply? > You will not be able to fa

RE: MoreLikeThis questions

2011-12-09 Thread Scott Smith
OK. I just found Juan Grande's 7/1/2011 post. It seems like that gives me some ideas on the second question. I still don't know what to do about the first question. Maybe if I saw the Request xml, it would give me a hint what to do with the solrj stuff. Anybody have any thoughts? Scott ---

Re: NRT or similar for Solr 3.5?

2011-12-09 Thread Steven Ou
Hey Nagendra, I took a look and Solr-RA looks promising - but: - I could not figure out how to download it. It seems like all the download links just point to "#" - I wasn't looking for another ranking algorithm, so would it be possible for me to use NRT but *not* RA (i.e. just use th

Re: Using DisMax with Join possible?

2011-12-09 Thread Erick Erickson
Otis, did you see Hoss's answer to a similar question about an hour after your post? On Dec 9, 2011 2:19 PM, "Otis Gospodnetic" wrote: > Hi, > > Is there a reason why we have the "lucene" parser baked into Join? > Is there a way to use (e)DisMax with Join? > > For example, when I do this: > >

Re: Suggest component

2011-12-09 Thread Erick Erickson
Is the field indexed? I.e. indexed="true"? Seeing values there when you specify it in fl only says it's stored, not whether it's available from the indexed terms. You might show us your schema file... Best Erick On Dec 9, 2011 2:33 PM, "kmf" wrote: > I have the same exact problem as t

Images for the DataImportHandler page

2011-12-09 Thread Mike O'Leary
There is some very useful information on the http://wiki.apache.org/solr/DataImportHandler page about indexing database contents, but the page contains three images whose links are broken. The descriptions of those images sound like it would be quite handy to see them in the page. Could someone

Re: Suggest component

2011-12-09 Thread kmf
Thanks for the reply. Yes, all are indexed=true (name_auto and the copyField items). Here are the snippets from schema.xml (didn't want to post the whole thing because the following is what I've changed/added)

Virtual Memory very high

2011-12-09 Thread Rohit
Hi All, I don't know if this question is directly related to this forum. I am running Solr in Tomcat on a Linux server. The moment I start Tomcat, the virtual memory shown by the top command goes to its max of 31.1G and then remains there. Is this the right behaviour? Why is the virtual memory usage

Re: VelocityResponseWriter's future

2011-12-09 Thread David Smiley (@MITRE.org)
Erik, The /browse UI pages have definitely gotten a bit long in the tooth, to the point that it's a maintenance nightmare IMO. Perhaps it was inevitable given the nature of the technology; it would be the same situation with JSP/ASP/PHP. FWIW, in my experience with Endeca (a Solr competitor) the