Re: Can I add custom fields to the input XML file?

2008-09-18 Thread Otis Gospodnetic
The format is fixed, you can't change it -- something on the Solr end needs to parse that XML and expects specific XML elements and structure, so it can't handle whatever one throws at it. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: co
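
To illustrate the point above, here is a minimal Python sketch of how custom data still fits the fixed format: the `add`/`doc`/`field` structure stays as Solr expects it, and the custom part goes into each field's `name` attribute. The field names `id`, `dept` and `budget` are hypothetical and would need to be declared in schema.xml (or match a dynamic field pattern).

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Serialize a list of dicts into the <add><doc><field name="..."> update format."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # The element is always <field>; only the name attribute varies.
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical documents, for illustration only.
    print(build_add_xml([{"id": "100100", "dept": "BPO", "budget": 1500}]))
```

The resulting string is what would be posted to the update handler; the element names themselves are not up for negotiation, only the field names and values.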

Re: error when post xml data to solr

2008-09-18 Thread Otis Gospodnetic
Could you paste the XML you are posting? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: 李学健 <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, September 18, 2008 11:27:26 PM > Subject: error when post xml data to solr >

Re: Some new SOLR features

2008-09-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
Why restart Solr? Reloading a core may be sufficient. SOLR-561 already supports this. On Thu, Sep 18, 2008 at 5:17 PM, Jason Rutherglen <[EMAIL PROTECTED]> wrote: > Servlets is one thing. For SOLR the situation is different. There > are always small changes people want to make, a new stop

Re: Setting request method to post on SolrQuery causes ClassCastException

2008-09-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is surprising that this happens. The javabin format offers significant perf improvements over the XML one. You can probably also try javabin. On Thu, Sep 18, 2008 at 10:17 PM, syoung <[EMAIL PROTECTED]> wrote: > > I tried setting the 'wt' parameter to both 'xml' and 'javabin

Can I add custom fields to the input XML file?

2008-09-18 Thread convoyer
Hi guys. Is the XML format for inputting data a standard one, or can I change it? That is, instead of: 3007WFP Dell Widescreen UltraSharp 3007WFP Dell, Inc. can I enter something like: 100100 BPO 1500 100200 ITES 2500 Thanks -- View this message in context: http://www.nab

Re: Filtering results

2008-09-18 Thread ristretto . rb
Thanks Otis for reply! Always appreciated! That is indeed what we are looking for implementing. But, I'm running out of time to prototype or experiment for this release. I'm going to run the two index thing for now, unless I find something saying is really easy and sensible to run one and collap

error when post xml data to solr

2008-09-18 Thread 李学健
hi, all when i post an xml file to solr, some errors happen as below: == com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog at [row,col {unknown-source}]: [1,0] at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:686) at c

Re: Filtering results

2008-09-18 Thread Otis Gospodnetic
Gene, I haven't looked at Field Collapsing for a while, but if you have a single index and collapse hits on your category field, then won't first 10 hits be items you are looking for - top 1 item for each category x 10 using a single query. Otis -- Sematext -- http://sematext.com/ -- Lucene -

Re: firstSearcher and newSearcher events

2008-09-18 Thread Shalin Shekhar Mangar
On Fri, Sep 19, 2008 at 5:55 AM, oleg_gnatovskiy < [EMAIL PROTECTED]> wrote: > > Hello. I am using the spellcheck component > (https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker > index is kept in RAM, it gets erased every time the Solr server gets > restarted. I was thinking

firstSearcher and newSearcher events

2008-09-18 Thread oleg_gnatovskiy
Hello. I am using the spellcheck component (https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker index is kept in RAM, it gets erased every time the Solr server gets restarted. I was thinking of using either the firstSearcher or the newSearcher to reload the index every time So

RE: Hardware config for SOLR

2008-09-18 Thread Andrey Shulinskiy
Matthew, Thanks, a very good point. Andrey. > -Original Message- > From: Matthew Runo [mailto:[EMAIL PROTECTED] > Sent: Thursday, September 18, 2008 11:38 AM > To: solr-user@lucene.apache.org > Subject: Re: Hardware config for SOLR > > I can't speak to a lot of this - but regarding the

Re: Filtering results

2008-09-18 Thread ristretto . rb
Otis, would it be reasonable to run a query like this http://localhost:8280/solr/select/?q=terms_x&version=2.2&start=0&rows=0&indent=on 10 times, one for each result from an initial category query on a different index. So, it's still 1+10, but I'm not returning values. This would give me the numbe
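
A rough Python sketch of the count-only idea described above: ask for `rows=0` and read the hit count from the `numFound` attribute of the `result` element in the XML response. The base URL and query term are taken from the message; the helper names `count_url` and `num_found` are made up for this example, and the HTTP call itself is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def count_url(base, query):
    """Build a select URL that returns no documents, only the hit count."""
    return base + "?" + urlencode({"q": query, "rows": 0})


def num_found(xml_text):
    """Read the total hit count from a Solr XML response body."""
    result = ET.fromstring(xml_text).find("result")
    return int(result.get("numFound"))


if __name__ == "__main__":
    print(count_url("http://localhost:8280/solr/select", "terms_x"))
    # Hand-written sample response, not captured from a real server.
    sample = '<response><result name="response" numFound="42" start="0"/></response>'
    print(num_found(sample))
```

This still costs one request per category (the 1+10 pattern), but each response stays small because no stored fields are returned.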

RE: Searching for future or "null" dates

2008-09-18 Thread Chris Maxwell
Here is what I was able to get working with your help. (productId:(102685804)) AND liveDate:[* TO NOW] AND ((endDate:[NOW TO *]) OR ((*:* -endDate:[* TO *]))) the *:* is what I was missing. Thanks for your help. hossman wrote: > > > : If the query stars with a negative clause Lucene return
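
For anyone assembling this query in application code, a small Python sketch follows. It only builds the query string shown above; the function name `active_product_query` is hypothetical. The `*:*` matters because a purely negative clause has nothing to subtract from, so the "no endDate" branch starts from all documents and removes those that have any endDate.

```python
from urllib.parse import urlencode


def active_product_query(product_id):
    """Match a product that is already live and has a future or missing end date."""
    live = "liveDate:[* TO NOW]"
    # Either the end date is still ahead, or the document has no endDate at all.
    not_ended = "(endDate:[NOW TO *] OR (*:* -endDate:[* TO *]))"
    return f"productId:({product_id}) AND {live} AND {not_ended}"


if __name__ == "__main__":
    q = active_product_query(102685804)
    print(q)
    # URL-encode before sending it as the q parameter.
    print(urlencode({"q": q}))
```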

Re: Unable to filter fq param on a dynamic field

2008-09-18 Thread Otis Gospodnetic
Barry, You are seeing the value of the field as it was saved (as the original), but perhaps something is funky with how it was analyzed/tokenized at index time and how it is being analyzed now at query time. Double-check your fieldType/analysis settings for this field and make sure you are us

Re: Dismax + Dynamic fields

2008-09-18 Thread Jon Drukman
Daniel Papasian wrote: Norberto Meijome wrote: Thanks Yonik. ok, that matches what I've seen - if i know the actual name of the field I'm after, I can use it in a query it, but i can't use the dynamic_field_name_* (with wildcard) in the config. Is adding support for this something that is desir

RE: Unable to filter fq param on a dynamic field

2008-09-18 Thread Barry Harding
Hi Otis, no, that does not seem to bring back the correct results either; in fact it's still zero results. It's also not bringing back results if I use the standard handler http://127.0.0.1:8080/apache-solr-1.3.0/select?q=Output-Type-facet:Monochrome but the field is visible in the documents retur

Re: Setting request method to post on SolrQuery causes ClassCastException

2008-09-18 Thread syoung
I tried setting the 'wt' parameter to both 'xml' and 'javabin'. Neither worked. However, setting the parser on the server to XMLResponseParser did fix the problem. Thanks for the help. Susan Noble Paul നോബിള്‍ नोब्ळ् wrote: > > I guess the post is not sending the correct 'wt' parameter. tr

snapshot.yyyymmdd ... can't found them?

2008-09-18 Thread sunnyfr
Hi, sorry. I think I've started rsyncd properly: [EMAIL PROTECTED]:/# ./data/solr/books/bin/rsyncd-enable [EMAIL PROTECTED]:/# ./data/books/video/bin/rsyncd-start but then I can't find the snapshot.current files. How can I check that I did it properly? My rsyncd.log: 2008/09/18 18:06:0

Re: Solr vs Autonomy

2008-09-18 Thread Otis Gospodnetic
Geoff, In short: all items that you listed are not a problem for Solr. Indices can be sharded, distributed search is possible, custom ranking is possible, 30 fields is possible, etc. etc. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From:

Re: Field level security

2008-09-18 Thread Otis Gospodnetic
Hi, If all you have to do is "hide" certain fields from search results for some users, then your application -- the application that sends search requests to Solr can just use different fl=XXX parameters based on user's permission. I think that's all you need and the custom fieldType should n
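
A minimal Python sketch of the approach described above, where the application picks the `fl` parameter from the user's permission level. The role names and field lists in `ROLE_FIELDS` are invented for illustration; a real application would take them from its own permission model.

```python
from urllib.parse import urlencode

# Hypothetical mapping from a user's role to the fields that role may see.
ROLE_FIELDS = {
    "public": ["id", "title"],
    "staff": ["id", "title", "price", "supplier"],
}


def select_params(query, role):
    """Build select parameters with an fl list limited by the user's role."""
    # Unknown roles fall back to the most restrictive field list.
    fields = ROLE_FIELDS.get(role, ROLE_FIELDS["public"])
    return urlencode({"q": query, "fl": ",".join(fields)})


if __name__ == "__main__":
    print(select_params("ipod", "staff"))
```

This only hides fields in the returned documents. It does not stop a user from searching on a hidden field, so it fits the "hide from results" case and nothing stronger.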

Re: AW: Date field mystery

2008-09-18 Thread Otis Gospodnetic
Hi Christian, While I can't tell you whether the problem with "-" will be solved when you try it on 1.3, I can tell you that you should probably trim your dates so they are not as fine as you currently have them, unless you need such precision. We need to add this to the FAQ. :) Otis -- Semat

Re: Unable to filter fq param on a dynamic field

2008-09-18 Thread Otis Gospodnetic
Barry, does this return the correct hits: http://127.0.0.1:8080/apache-solr-1.3.0/IvolutionSearch?q=Output-Type-facet:Monochrome Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Barry Harding <[EMAIL PROTECTED]> > To: "solr-user@lucene.apach

RE: Solr vs Autonomy

2008-09-18 Thread Kashyap, Raghu
Hi Geoff, I cannot vouch for Autonomy; however, earlier this year we did evaluate Endeca & Solr and we went with Solr. Some of the reasons were: 1. Freedom of open source with Solr 2. Very good & active Solr open source community 3. Features pretty much overlap between Solr & Endeca 4. Endeca how

Re: Hardware config for SOLR

2008-09-18 Thread Matthew Runo
I can't speak to a lot of this - but regarding the servers I'd go with the more powerful ones, if only for the amount of ram. Your index will likely be larger than 1 gig, and with only two you'll have a lot of your index not stored in ram, which will slow down your QPS. Thanks for your time

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
this is my log file : [EMAIL PROTECTED]:/home# tail -f /var/log/tomcat5.5/catalina.$(date +%Y-%m-%d).log Sep 18, 2008 5:25:02 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO: Creating a connection for entity books with URL: jdbc:mysql://master-spare.vip.books.com/books Sep 18, 2

Re: Solr vs Autonomy

2008-09-18 Thread Walter Underwood
I would do the field visibility one layer up from the search engine. That layer already knows about the user and can request the appropriate fields. Or request them all (better HTTP caching) and only show the appropriate ones. As I understand your application, putting access control in Solr doesn'

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread Shalin Shekhar Mangar
On Thu, Sep 18, 2008 at 8:45 PM, sunnyfr <[EMAIL PROTECTED]> wrote: > > I agree about that but the last time 4hours later the number wasn't > different > : Do you mean that the number doesn't change at all on refreshing the page? Can you check the solr log file for exceptions? I suspect that yo

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
I agree with that, but last time the number was no different 4 hours later, and if I check now, nothing has changed. Does it have to go across all the data like a full import? I thought it would bring back just the ids which need to be modified. 0:39:36.943 3447914 9054602 492558 0 2008-09-18 16:29

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread Shalin Shekhar Mangar
Well it shows the number of documents that have changed, you can't expect 1603970 documents to be indexed instantly. On Thu, Sep 18, 2008 at 8:24 PM, sunnyfr <[EMAIL PROTECTED]> wrote: > > It is exactly what I've done but it can't works like that ... > > - what would that mean ... cron job c

Re: Solr vs Autonomy

2008-09-18 Thread Ryan McKinley
On Sep 18, 2008, at 3:23 AM, Geoff Hopson wrote: As per other thread 1) security down to field level how complex of a security model do you need? Is each users field visibility totally distinct? are there a few basic groups? If you are willing to write (or hire someone to write) a cus

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
That is exactly what I've done, but it doesn't work like that ... - What would that mean ... the cron job can't hit it properly? - I've browsed to /dataimport but it looked like nothing was running, so I finally went back to /dataimport?command=delta-import and then to /dataimport, and I refresh it of

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread Shalin Shekhar Mangar
Hit /dataimport again from a browser and refresh periodically to see the progress (number of documents indexed). On Thu, Sep 18, 2008 at 7:55 PM, sunnyfr <[EMAIL PROTECTED]> wrote: > > It was too long so I finally restart tomcat .. then 5mn later my cron job > started : > but it looks like nothin
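
Instead of refreshing by hand, the same check could be scripted. The Python sketch below parses a /dataimport status response and returns the top-level status value (for example "idle"). The response layout used in the sample is an assumption based on the output pasted elsewhere in this thread, and `import_status` is a made-up helper name; fetching the page is left out.

```python
import xml.etree.ElementTree as ET


def import_status(xml_text):
    """Return the handler status from a /dataimport response, or None if absent."""
    root = ET.fromstring(xml_text)
    # Only direct children are checked, so the numeric status inside
    # responseHeader is not picked up by mistake.
    for el in root.findall("str"):
        if el.get("name") == "status":
            return el.text
    return None


if __name__ == "__main__":
    # Hand-written sample that mirrors the assumed response layout.
    sample = (
        '<response>'
        '<lst name="responseHeader"><int name="status">0</int>'
        '<int name="QTime">0</int></lst>'
        '<str name="status">idle</str>'
        '</response>'
    )
    print(import_status(sample))
```

A cron job or monitoring script could call this after fetching /dataimport and only start a new delta-import when the status is idle.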

Re: No server response code on insert: how do I avoid this at high speed?

2008-09-18 Thread Paleo Tek
Otis Gospodnetic wrote: Perhaps the container logs explain what happened? How about just throttling to the point where the failure rate is 0%? Too slow? Otis's questions regarding dropped inserts sent me back to the drawing board. The system had been tuned to a slower database to optimiz

Re: Solr vs Autonomy

2008-09-18 Thread Geoff Hopson
My project is looking to index 10s of millions of documents, providing search across a live-live environment (hence index distribution/replication is important). Most searches have to be done (ie to end user) in 5 seconds or less. The index has about 30 fields, and I reckon that the security access

Re: Solr vs Autonomy

2008-09-18 Thread Walter Underwood
It depends entirely on the needs of the project. For some things, Solr is superior to Autonomy, for other things, not. I used to work at Autonomy (and Verity and Inktomi and Infoseek), and I chose Solr for Netflix. It is working great for us. wunder == Walter Underwood Former Ultraseek Architect

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
It was taking too long, so I finally restarted Tomcat .. then 5 min later my cron job started, but it looks like nothing is happening from the cron job. This is my OUTPUT file: tot.txt 0 0 data-config.xml delta-import, idle This response format is experimental. It is likely to change in the future. This is my C

Re: problem index accented character with release version of solr 1.3

2008-09-18 Thread Sean Timm
From the XML 1.0 spec.: "Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646." So, \005 is not a legal XML character. It appears the old StAX implementation was more lenient than it should have been and Woodstox is doing the corr
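
One way to deal with this on the client side is to drop such characters before building the XML. The Python sketch below follows the Char production of the XML 1.0 spec (tab, line feed, carriage return, and the listed Unicode ranges); the function name `strip_illegal_xml_chars` is made up for this example, and whether silently dropping characters is acceptable depends on the data.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text):
    """Remove characters that a conforming XML 1.0 parser must reject."""
    return _ILLEGAL_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # \x05 is the control character from this thread; accented letters are kept.
    print(repr(strip_illegal_xml_chars("caf\u00e9\x05 ok\n")))
```

Accented characters are untouched by this filter, so it addresses the stray control characters only.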

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
Yes, so it's probably best to make the changes through a remote interface so that the app will be able to make the appropriate internal changes. File based system changes are less than ideal, agreed, however I suppose with an open source project such as SOLR the kitchen sink effect happens and it

delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
0 0 data-config.xml idle 4:26:16.934 3451431 9165885 493061 0 2008-09-18 10:01:01 2008-09-18 10:01:01 2008-09-18 10:01:43 2008-09-18 10:01:43 1587889

Re: Some new SOLR features

2008-09-18 Thread Mark Miller
Dynamic changes are not what I'm against...I'm against dynamic changes that are triggered by the app noticing that the config has changed. Jason Rutherglen wrote: Servlets is one thing. For SOLR the situation is different. There are always small changes people want to make, a new stop word,

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
Servlets is one thing. For SOLR the situation is different. There are always small changes people want to make, a new stop word, a small tweak to an analyzer. Rebooting the server for these should not be necessary. Ideally this is handled via a centralized console and deployed over the network

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
> multi-core allows you to instantiate a completely > new core and swap it for the old one, but it's a bit of a heavyweight > approach. Multi core seems like more of a hack to get around running multiple JVMs. It doesn't seem like the most elegant solutions for most problems because usually the s

Re: Some new SOLR features

2008-09-18 Thread Mark Miller
Isn't this done in servlet containers for debugging type work? Maybe an option, but I disagree that this should drive anything in solr. It should really be turned off in production in servlet containers imo as well. This can really be such a pain in the ass on a live site...someone touches we

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
> That would allow a single request to see a stable view of the > schema, while preventing having to make every aspect of the schema > thread-safe. Yes that is the best approach. > Nothing will stop one from using java serialization for config > persistence, Persistence should not be serialized.

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
This should be done. Great idea. On Wed, Sep 17, 2008 at 3:41 PM, Lance Norskog <[EMAIL PROTECTED]> wrote: > My vote is for dynamically scanning a directory of configuration files. When > a new one appears, or an existing file is touched, load it. When a > configuration disappears, unload it. Th

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
Hi Yonik, One approach I have been working on that I will integrate into SOLR is the ability to use serialized objects for the analyzers so that the schema can be defined on the client side if need be. The analyzer classes will be dynamically loaded. Or there is no need for a schema and plain Ja

Unable to filter fq param on a dynamic field

2008-09-18 Thread Barry Harding
Hi, I have a fairly simple solr setup with several predefined fields that are indexed and stored and also depending on the type of product I also add various dynamic fields of type string to a record, and I should mention that I am using the solr.DisMaxRequestHandler request handler called "/

Re: Special character matching 'x' ?

2008-09-18 Thread Sanjay Suri
Thanks Akshay and Norberto, I am still trying to make it work. I know the solution is what you pointed me to but is just taking me some time to make it work. thanks, -Sanjay On Thu, Sep 18, 2008 at 12:34 PM, Norberto Meijome <[EMAIL PROTECTED]>wrote: > On Thu, 18 Sep 2008 10:53:39 +0530 > "Sanja

Re: recip(myfield,m,a,b)

2008-09-18 Thread sunnyfr
I don't think it can work at index time, because when somebody looks for a book I want to boost the search according to the user's language ... so I don't think it can work, unless I didn't get it. Thanks for your answer, hossman wrote: > > > : Is there a way to convert to integer

Re: Setting request method to post on SolrQuery causes ClassCastException

2008-09-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess the post is not sending the correct 'wt' parameter. try setting wt=javabin explicitly . wt=xml may not work because the parser still is binary. check this http://wiki.apache.org/solr/Solrj#xmlparser On Thu, Sep 18, 2008 at 11:49 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > A qui

Re: cron job update index

2008-09-18 Thread sunnyfr
Ok Thanks it's very clear. Just do you know why my cron job doesn't work : # m h dom mon dow command */5 * * * * /usr/bin/wget http://solr-test.books.com:8080/solr/books/dataimport?command=delta-import When I go to check the date in conf/dataimport.properties, the date and hour doesn't change

AW: Date field mystery

2008-09-18 Thread Kolodziej Christian
Hi Chris, it was a long night for our solr server today because we rebuilt the complete index using "well formed" date strings. And the date field is stored now so that we can see if something went wrong :-) But our problems are solved completely. Now I can give you a very exact descripti

Re: Solr vs Autonomy

2008-09-18 Thread Geoff Hopson
As per other thread 1) security down to field level Otherwise I am mostly happy that Solr gives me everything that Autonomy does. 2008/9/18 Otis Gospodnetic <[EMAIL PROTECTED]>: > Geoff, > > Perhaps you can find out the list of features/functionalities that your > project requires and we can gi

Re: Field level security

2008-09-18 Thread Geoff Hopson
Hi Otis, Thanks for the response. I'll try and inline some clarity... 2008/9/18 Otis Gospodnetic <[EMAIL PROTECTED]>: >> I am trying to put together a security model around fields in my >> index. My requirement is that a user may not have permission to view >> certain fields in the index when he

Re: Special character matching 'x' ?

2008-09-18 Thread Norberto Meijome
On Thu, 18 Sep 2008 10:53:39 +0530 "Sanjay Suri" <[EMAIL PROTECTED]> wrote: > One of my field values has the name "Räikkönen" which contains special > characters. > > Strangely, as I see it anyway, it matches on the search query 'x' ? > > Can someone explain or point me to the solution/doc