RE: How do I set up an embedded server with version 1.3.0 ?
From: Manepalli, Kalyan [mailto:kalyan.manepa...@orbitz.com]
Sent: 24 June 2009 19:47
To: solr-user@lucene.apache.org
Subject: RE: How do I set up an embedded server with version 1.3.0 ?

Hi Ian,
I use the EmbeddedSolrServer from a Solr component. The code for invoking it looks like this:

    SolrServer locServer = new EmbeddedSolrServer(
        SolrCore.getCoreDescriptor().getCoreContainer(), "locationCore");

where "locationCore" is the core name in a multicore environment. In a single-core environment you can pass "".

Thanks,
Kalyan Manepalli

---

Hi Kalyan,

Thanks for the reply, but it does not work for me, as getCoreDescriptor() is NOT a static method of SolrCore. So I am still left trying to instantiate a SolrCore instance to pass to the EmbeddedSolrServer constructor.

Can you or someone else possibly help me with a working SolrCore constructor call?

TIA,
Ian.

Website Content Management
Tamar Science Park, 15 Research Way, Plymouth, PL6 8BT
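For reference, a minimal sketch of the non-static form of the call described above, for the case where the component has a SolrQueryRequest in hand. The class name is a placeholder and the accessor chain is an assumption based on the Solr 1.3 core API, not a confirmed recipe:

    // Hypothetical sketch: obtaining the CoreContainer from inside a
    // request-scoped component instead of via a static call.
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
    import org.apache.solr.core.CoreContainer;
    import org.apache.solr.core.SolrCore;
    import org.apache.solr.request.SolrQueryRequest;

    public class LocationCoreLookup {
        public SolrServer getLocationServer(SolrQueryRequest req) {
            // The request carries the core that is handling it.
            SolrCore core = req.getCore();
            // getCoreDescriptor() is an instance method, so it needs a SolrCore object.
            CoreContainer container = core.getCoreDescriptor().getCoreContainer();
            // "locationCore" is the target core name; use "" for a single-core setup.
            return new EmbeddedSolrServer(container, "locationCore");
        }
    }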
Fwd: [Solr Wiki] Update of "SolrReplication" by NoblePaul
Some of the replication commands have been changed in trunk: https://issues.apache.org/jira/browse/SOLR-1216. Please keep this in mind if you are already using them and are upgrading to a new build. Refer to the wiki for the latest commands.

-- Forwarded message --
From: Apache Wiki
Date: Thu, Jun 25, 2009 at 2:24 PM
Subject: [Solr Wiki] Update of "SolrReplication" by NoblePaul
To: solr-comm...@lucene.apache.org

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification. The following page has been changed by NoblePaul:
http://wiki.apache.org/solr/SolrReplication

The comment on the change is:
command names changed

[diff of a configuration snippet not recoverable from this copy; it referenced: optimize and schema.xml,stopwords.txt,elevate.xml]

@@ -107, +107 @@
  == HTTP API ==
  These commands can be invoked over HTTP to the !ReplicationHandler
-  * Abort copying snapshot from master to slave. Command: http://slave_host:port/solr/replication?command=abort
+  * Abort copying index from master to slave. Command: http://slave_host:port/solr/replication?command=abortfetch
   * Force a snapshot on master. This is useful to take periodic backups. Command: http://master_host:port/solr/replication?command=snapshoot
-  * Force a snap pull on slave from master. Command: http://slave_host:port/solr/replication?command=snappull
+  * Force a snap pull on slave from master. Command: http://slave_host:port/solr/replication?command=fetchindex
   * It is possible to pass an extra attribute 'masterUrl' or other attributes like 'compression' (or any other parameter which is specified in the tag) to do a one-time replication from a master. This obviates the need for hardcoding the master in the slave.
   * Disable polling for snapshot from slave. Command: http://slave_host:port/solr/replication?command=disablepoll
   * Enable polling for snapshot from slave. Command: http://slave_host:port/solr/replication?command=enablepoll

--
- Noble Paul | Principal Engineer | AOL | http://aol.com
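As an illustration of the renamed HTTP commands, here is a minimal sketch of triggering fetchindex on a slave from Java. The host, port, and path are placeholders, and plain java.net is used rather than any Solr client API:

    // Hypothetical sketch: invoking the renamed replication command over HTTP.
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class TriggerFetchIndex {
        public static void main(String[] args) throws Exception {
            // Post-SOLR-1216 command name: fetchindex (previously snappull).
            URL url = new URL("http://slave_host:8983/solr/replication?command=fetchindex");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            System.out.println("HTTP status: " + conn.getResponseCode());
            InputStream in = conn.getInputStream();
            // Drain the response; the handler replies with a small status document.
            while (in.read() != -1) { /* ignore body */ }
            in.close();
            conn.disconnect();
        }
    }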
Re: How do I set up an embedded server with version 1.3.0 ?
On Thu, Jun 25, 2009 at 2:05 PM, Ian Smith wrote:
>
> Can you or someone else possibly help me with a working SolrCore
> constructor call?
>

Here is a working example for a single index/core:

    System.setProperty("solr.solr.home",
        "/home/shalinsmangar/work/oss/branch-1.3/example/solr");
    CoreContainer.Initializer initializer = new CoreContainer.Initializer();
    CoreContainer coreContainer = initializer.initialize();
    EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "101");
    server.add(doc);
    server.commit(true, true);

    SolrQuery query = new SolrQuery();
    query.setQuery("id:101");
    QueryResponse response = server.query(query);
    SolrDocumentList list = response.getResults();
    System.out.println("list.size() = " + list.size());

    coreContainer.shutdown();

Make sure your dataDir in solrconfig.xml is fixed (absolute), otherwise your data directory will get created relative to the current working directory (or you could set a system property for solr.data.dir).

Hope that helps. I'll add it to the wiki too.

--
Regards,
Shalin Shekhar Mangar.
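As a small illustration of that last point, a sketch of the two ways to pin the data directory. The absolute path is a placeholder, and the property-based form assumes solrconfig.xml actually references solr.data.dir (as the example config does):

    // Option 1: set the solr.data.dir system property before the container is initialized
    // (assumes solrconfig.xml references it, e.g. <dataDir>${solr.data.dir:./solr/data}</dataDir>).
    System.setProperty("solr.data.dir", "/var/data/solr");

    // Option 2: hard-code an absolute path in solrconfig.xml instead:
    //   <dataDir>/var/data/solr</dataDir>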
Top tf_idf in TermVectorComponent
In order to perform any further study of the result set, like clustering, the TermVectorComponent gives the list of words with the corresponding tf and idf, but this list can be huge for each document, and most of the terms may have a low tf or a too-high df. It might be useful to compare the relative increase of df against the collection in order to improve the facets (show only those terms whose relative df in the query result is higher than in the full collection).

To do this, it could be interesting if the TermVectorComponent could sort the results by one of these options:

  * tf
  * df
  * tf/df (to simplify), or tf*idf where idf is computed as log(total_docs/df)

and truncate the list to a number of words or to a given threshold. Or maybe there is another way to do this?

Joan
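A minimal sketch of the kind of client-side ranking being asked for here, assuming the term/tf/df triples have already been parsed out of a TermVectorComponent response. The TermStat holder class and method names are hypothetical, not an existing Solr API:

    // Hypothetical sketch: rank terms by tf*idf, with idf = log(total_docs/df).
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    public class TermRanker {
        // Hypothetical holder for one term's statistics.
        static class TermStat {
            final String term;
            final int tf;
            final int df;
            TermStat(String term, int tf, int df) { this.term = term; this.tf = tf; this.df = df; }
            double tfIdf(long totalDocs) { return tf * Math.log((double) totalDocs / df); }
        }

        // Returns the topN terms by descending tf*idf.
        static List<TermStat> topTerms(List<TermStat> stats, long totalDocs, int topN) {
            List<TermStat> sorted = new ArrayList<>(stats);
            sorted.sort(Comparator.comparingDouble((TermStat t) -> t.tfIdf(totalDocs)).reversed());
            return sorted.subList(0, Math.min(topN, sorted.size()));
        }
    }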
RE: How do I set up an embedded server with version 1.3.0 ?
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: 25 June 2009 10:41
To: solr-user@lucene.apache.org
Subject: Re: How do I set up an embedded server with version 1.3.0 ?

> On Thu, Jun 25, 2009 at 2:05 PM, Ian Smith wrote:
>> Can you or someone else possibly help me with a working SolrCore
>> constructor call?
>
> Here is a working example for single index/core:
>
> [working example snipped; see Shalin's message earlier in this thread]
>
> Make sure your dataDir in solrconfig.xml is fixed (absolute), otherwise your
> data directory will get created relative to the current working directory
> (or you could set a system property for solr.data.dir).
>
> Hope that helps. I'll add it to the wiki too.
>
> --
> Regards,
> Shalin Shekhar Mangar.

---

Fantastic, I now have the embedded server working, thank you!

PS. Sorry about all the huge sigs, I don't have the facility to suppress them from work, I'll post from a webmail account in future . . .

Ian.

Website Content Management
Tamar Science Park, 15 Research Way, Plymouth, PL6 8BT
Re: Solr document security
On Wed, 24 Jun 2009 23:20:26 -0700 (PDT) pof wrote:
>
> Hi, I am wanting to add document-level security that works as follows: an
> external process makes a query to the index and, depending on the user's security
> allowances based on a login id, a list of hits is returned minus any the
> user isn't meant to know even exist. I was thinking maybe a custom filter with
> a JDBC connection to check the security of the user vs. the document. I'm not
> sure how I would add the filter, how to write the filter, or how to get the
> login id from a GET parameter. Any suggestions, comments etc.?

Hi Brett,
(keeping in mind that I've been away from SOLR for 8 months, but I don't think this was added of late)

The standard approach is to manage security @ your application layer, not @ SOLR. Ie, search, return documents (which should contain some kind of data to identify their ACL), and then you can decide whether to show each one or not.

HIH
_____
{Beto|Norberto|Numard} Meijome

"They never open their mouths without subtracting from the sum of human knowledge." Thomas Brackett Reed

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
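A minimal sketch of the ACL-in-the-index pattern described above, assuming each document was indexed with an "acl" field listing the groups allowed to see it. The field name, group values, and helper class are assumptions, not an existing schema:

    // Hypothetical sketch: restrict results to documents whose "acl" field
    // matches one of the requesting user's groups.
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class AclSearch {
        public static QueryResponse search(SolrServer server, String userQuery,
                                           java.util.List<String> userGroups) throws Exception {
            SolrQuery query = new SolrQuery(userQuery);
            // Build a filter query such as: acl:(sales OR managers)
            StringBuilder fq = new StringBuilder("acl:(");
            for (int i = 0; i < userGroups.size(); i++) {
                if (i > 0) fq.append(" OR ");
                fq.append(userGroups.get(i));
            }
            fq.append(")");
            query.addFilterQuery(fq.toString());
            return server.query(query);
        }
    }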
Re: Function query using Map
Noble Paul നോബിള്‍ नोब्ळ् wrote:
> The five-parameter feature was added in Solr 1.4. Which version of Solr are you using?
>
> On Wed, Jun 24, 2009 at 12:57 AM, David Baker wrote:
>> Hi,
>> I'm trying to use the map function with a function query. I want to map a
>> particular value to 1 and all other values to 0. We currently use the map
>> function that has 4 parameters with no problem. However, for the map function
>> with 5 parameters, I get a parse error. The following are the query and the error returned:
>>
>> query:
>>   id:[* TO *] _val_:"map(ethnicity,3,3,1,0)"
>>
>> error message:
>>   type: Status report
>>   message: org.apache.lucene.queryParser.ParseException: Cannot parse
>>     'id:[* TO *] _val_:"map(ethnicity,3,3,1,0)"': Expected ')' at position 20
>>     in 'map(ethnicity,3,3,1,0)'
>>   description: The request sent by the client was syntactically incorrect
>>     (org.apache.lucene.queryParser.ParseException, as above).
>>
>> It appears that the parser never evaluates the map string for anything other
>> than the 4-parameter version. Could anyone give me some insight into this?
>> Thanks in advance.
>
> --
> Noble Paul | Principal Engineer | AOL | http://aol.com

We're running 1.3, which explains this. Thanks for the response.
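For reference, a sketch of the two query forms discussed above. The field name is the one from the thread, and the described behaviour of the 4-argument form (non-matching values pass through unchanged) is an assumption worth verifying against your Solr version:

    Parses in Solr 1.3 (4 arguments: values in [3,3] map to 1, other values pass through):
      id:[* TO *] _val_:"map(ethnicity,3,3,1)"

    Requires Solr 1.4 (the 5th argument supplies the value for non-matching documents):
      id:[* TO *] _val_:"map(ethnicity,3,3,1,0)"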
Python Response Bug?
I'm not sure, but I think I ran across some unexpected behavior in the python response in Solr 1.3 (1.3.0 694707 - grantingersoll - 2008-09-12 11:06:47).

I'm running Python 2.5 and using eval to convert the string Solr returns into Python data objects.

I have a blank field in my xml file that I am importing labeled "contacthours". If I make the field a "string" type, the interpreter doesn't throw an error. If I make the field a "float" type, python throws an error on the eval function. Here is a portion of the output from Solr, with the two differences near the end by the "contacthours" variable:

Broken (Python didn't like this):

  {'responseHeader':{'status':0,'QTime':0,'params':{'q':'id:"161"','wt':'python'}},'response':{'numFound':1,'start':0,'docs':[{'approvalpending':' ','assessmentcourseprograminstitutionimprovement':u'The students\u2019 final reports along with the \u201cBiology Externship Completion Form\u201d, which is completed by externship supervisors, will be used for this assessment.','contacthourrationale':'Variable hours - no explanation given','contacthours':,'coursearea':'BIO',

Good (this worked):

  {'responseHeader':{'status':0,'QTime':0,'params':{'q':'id:"161"','wt':'python'}},'response':{'numFound':1,'start':0,'docs':[{'approvalpending':' ','assessmentcourseprograminstitutionimprovement':u'The students\u2019 final reports along with the \u201cBiology Externship Completion Form\u201d, which is completed by externship supervisors, will be used for this assessment.','contacthourrationale':'Variable hours - no explanation given','contacthours':'','coursearea':'BIO',

Any insights? Is this a bug or am I missing something?

Mike Beccaria
Systems Librarian
Head of Digital Initiatives
Paul Smith's College
518.327.6376
mbecca...@paulsmiths.edu
Re: Python Response Bug?
The first response is invalid, as you see, because of the missing value:

  :,

is not valid syntax.

The reason it works for the string type is because the empty string '' is a valid field value, but _nothing_ (the first example) is not. There is no empty float placeholder that the format likes. So either it has to be an empty string or some legitimate float value.

AFAIK, Solr doesn't generate empty numeric values compatible with this output. Zero and null are different things.

> I'm not sure, but I think I ran across some unexpected behavior in the
> python response in solr 1.3 (1.3.0 694707 - grantingersoll - 2008-09-12 11:06:47).
>
> I'm running Python 2.5 and using eval to convert the string solr returns
> to python data objects.
>
> I have a blank field in my xml file that I am importing labeled
> "contacthours". If I make the field a "string" type, the interpreter
> doesn't throw an error. If I make the field a "float" type, python
> throws an error on the eval function.
>
> [response examples snipped]
>
> Any insights? Is this a bug or am I missing something?
>
> Mike Beccaria
> Systems Librarian
> Head of Digital Initiatives
> Paul Smith's College
> 518.327.6376
> mbecca...@paulsmiths.edu
RE: Python Response Bug?
Regardless, I think it should return valid output so programs don't crash when trying to interpret it. I don't think about these things often, so maybe I'm missing something obvious, but I think putting in an empty string is better than putting in nothing and having it break.

My 2 cents.
Mike

-----Original Message-----
From: dar...@ontrenet.com [mailto:dar...@ontrenet.com]
Sent: Thursday, June 25, 2009 11:11 AM
To: solr-user@lucene.apache.org
Cc: solr-user@lucene.apache.org
Subject: Re: Python Response Bug?

> The first response is invalid as you see because of the missing value.
>
>   :,
>
> is not valid syntax.
>
> The reason it works for string type is because the empty string '' is a
> valid field value but _nothing_ (the first example) is not. There is no
> empty float placeholder that the format likes. So either it has to be an
> empty string or some legitimate float value.
>
> AFAIK, Solr doesn't generate empty numeric values compatible with this
> output. Zero and null are different things.
>
> [earlier quoted message snipped]
Re: building custom RequestHandlers
Ok, I glued all the stuff together and ended up extending handler.component.SearchHandler, because I want to use all of its functionality and only adjust the q param before it gets processed.

How exactly can I get the q and set it back later? From digging in the code it seems this is the way:

    SolrParams p = req.getParams();
    String words = p.get("q");

but I get "SolrParams is deprecated" and a type mismatch "cannot convert SolrParams to SolrParams". Even like this:

    SolrParams p = (SolrParams) req.getParams();

I get an error and a 500 when trying to use it. Any pointers on how to get and set are more than welcome (a sketch of one approach follows the quoted thread below). At the end of it all I am calling super.handleRequestBody(req, rsp), so there is nothing else to mess with.

Mats Lindh wrote:
> I wrote a small post regarding how to create an analysis filter about a year
> ago. I'm guessing that the process is quite similar when developing a custom
> request handler:
>
> http://e-mats.org/2008/06/writing-a-solr-analysis-filter-plugin/
>
> Hope that helps.
>
> --mats
>
> On Wed, Jun 24, 2009 at 12:54 PM, Julian Davchev wrote:
>> Well it's really lovely what's in there, but this is just the configuration
>> aspect. Is there a sample of where I should place my class etc., and how to
>> compile and all? Just a simple top-to-bottom example. I guess most of those
>> aspects might be Java, but they are Solr-related as well.
>>
>> Noble Paul നോബിള്‍ नोब्ळ् wrote:
>>> this part of the doc explains what you should do to write a custom
>>> requesthandler:
>>> http://wiki.apache.org/solr/SolrPlugins#head-7c0d03515c496017f6c0116ebb096e34a872cb61
>>>
>>> On Wed, Jun 24, 2009 at 3:35 AM, Julian Davchev wrote:
>>>> Is it just me or is this a thread steal? Nothing to do with what the thread
>>>> is originally about.
>>>>
>>>> Cheers
>>>>
>>>> Bill Dueber wrote:
>>>>> Is it possible to change the javascript output? I find some of the
>>>>> information choices (e.g., that facet information is returned in a flat
>>>>> list, with facet names in the even-numbered indexes and number-of-items
>>>>> following them in the odd-numbered indexes) kind of annoying.
>>>>>
>>>>> On Tue, Jun 23, 2009 at 12:16 PM, Eric Pugh <ep...@opensourceconnections.com> wrote:
>>>>>> Like most things JavaScript, I found that I had to just dig through it and
>>>>>> play with it. However, the Reuters demo site was very easy to customize to
>>>>>> interact with my own Solr instance, and I went from there.
>>>>>>
>>>>>> On Jun 23, 2009, at 11:30 AM, Julian Davchev wrote:
>>>>>>> Never used it.. I am just looking in the docs for how I can extend Solr,
>>>>>>> but no luck so far :(
>>>>>>> Hoping for some docs or a real extend example.
>>>>>>>
>>>>>>> Eric Pugh wrote:
>>>>>>>> Are you using the JavaScript interface to Solr?
>>>>>>>> http://wiki.apache.org/solr/SolrJS
>>>>>>>> It may provide much of what you are looking for!
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>> On Jun 23, 2009, at 10:27 AM, Julian Davchev wrote:
>>>>>>>>> I am using solr and php quite nicely.
>>>>>>>>> Currently the workflow includes some manipulation on the php side so I
>>>>>>>>> correctly format the query string and pass it to tomcat/solr.
>>>>>>>>> I somehow want to build my own request handler in java so I can skip the
>>>>>>>>> whole apache/php request that is just for formatting.
>>>>>>>>> This will save me tons of requests to apache since I use solr directly
>>>>>>>>> from javascript.
>>>>>>>>>
>>>>>>>>> Would like to ask if there is something ready that I can use and adjust.
>>>>>>>>> I am kinda new to Java, but once I get the pointers I think I should be
>>>>>>>>> able to pull it out.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> JD
>>>>>>>>
>>>>>>>> -
>>>>>>>> Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 |
>>>>>>>> http://www.opensourceconnections.com
>>>>>>>> Free/Busy: http://tinyurl.com/eric-cal
>>>>>>
>>>>>> -
>>>>>> Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 |
>>>>>> http://www.opensourceconnections.com
>>>>>> Free/Busy: http://tinyurl.com/eric-cal
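A minimal sketch of the get-and-set idiom asked about above, assuming the Solr 1.3 package layout, the ModifiableSolrParams class, and SolrQueryRequest.setParams(). The "SolrParams to SolrParams" mismatch usually points at importing the deprecated org.apache.solr.request.SolrParams instead of org.apache.solr.common.params.SolrParams, though that is an inference, not a confirmed diagnosis. The handler name and the rewriteQuery() helper are placeholders:

    // Hypothetical sketch: rewrite the q parameter before delegating to SearchHandler.
    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.apache.solr.handler.component.SearchHandler;
    import org.apache.solr.request.SolrQueryRequest;
    import org.apache.solr.request.SolrQueryResponse;

    public class AdjustedQueryHandler extends SearchHandler {

        @Override
        public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
            // Copy the read-only request params into a mutable wrapper.
            ModifiableSolrParams params = new ModifiableSolrParams(req.getParams());
            String q = params.get("q");
            if (q != null) {
                params.set("q", rewriteQuery(q));   // adjust the query string
                req.setParams(params);              // put the modified params back on the request
            }
            super.handleRequestBody(req, rsp);      // let SearchHandler do the rest
        }

        private String rewriteQuery(String original) {
            // Placeholder for whatever adjustment is needed.
            return original;
        }
    }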
RE: Python Response Bug?
The problem is: what should Solr put for a float that is undefined? There is no such value.

Typically a value anomaly should not exist, and so when data structures break, they reveal these situations. The correct action is to explore why a non-value is making its way into the index, and correct it before that point by applying your own logic to populate the field with a tangible value.

> Regardless, I think it should return valid output so programs don't crash
> when trying to interpret it. I don't think about these things often, so
> maybe I'm missing something obvious, but I think putting in an empty
> string is better than putting in nothing and having it break.
>
> My 2 cents.
> Mike
>
> [earlier quoted messages snipped]
Re: Python Response Bug?
: Subject: Python Response Bug?
: In-Reply-To: <20090625214339.415e6...@suspectum.octantis.com.au>

http://people.apache.org/~hossman/#threadhijack

Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to an existing message; instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to, and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult.

See Also: http://en.wikipedia.org/wiki/Thread_hijacking

-Hoss
Re: Reverse querying
Otis Gospodnetic wrote:
>
> Alex & Oleg,
>
> Look at MemoryIndex in Lucene's contrib. It's the closest thing to what
> you are looking for. What you are describing is sometimes referred to as
> "prospective search", sometimes "saved searches", and a few other names.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message -----
>> From: AlexElba
>> To: solr-user@lucene.apache.org
>> Sent: Wednesday, June 24, 2009 7:47:20 PM
>> Subject: Reverse querying
>>
>> Hello,
>>
>> I have a problem which I am trying to solve using solr.
>>
>> I have search text (a term) and I have an index full of words which are mapped to ids.
>>
>> Is there any query that I can run to do this?
>>
>> Example:
>>
>> Term:
>> "3) A recommendation to use VAR=value in the configure command line will
>> not work with some 'configure' scripts that comply to GNU standards
>> but are not generated by autoconf."
>>
>> Index docs:
>>
>> id:1 name:recommendation
>> ...
>> id:3 name:GNU
>> id:4 name:food
>>
>> After running the "query" I want to get 1 and 3 as results.
>>
>> Thanks

Hello,

I looked into this MemoryIndex; the search there returns only a score, which just tells me whether it matched or not. I built a test method based on the example.

Term:
"On my last night in the Silicon Valley area, I decided to head up the east side of San Francisco Bay to visit Vito’s Pizzeria located in Newark, California. I have to say it was excellent! I met the owner (Vito!) and after eating a couple slices I introduced myself. I was happy to know he was familiar with the New York Pizza Blog and the New York Pizza Finder directory. Once we got to talking he decided I NEEDED to try some bread sticks and home-made marinara sauce and they were muy delicioso. I finished off my late night snack with a meatball dipped in the same marinara."

Data: {Silicon Valley, New York, Chicago}

    public static void find(String term, Set data) throws Exception {
        Analyzer analyzer = PatternAnalyzer.EXTENDED_ANALYZER;
        MemoryIndex index = new MemoryIndex();
        int i = 0;
        for (String str : data) {
            index.addField("bn" + i, str, analyzer);
            i++;
        }
        QueryParser parser = new QueryParser("bn*", analyzer);
        Query query = parser.parse(URLEncoder.encode(term, "UTF-8"));
        float score = index.search(query);
        if (score > 0.0f) {
            System.out.println("it's a match");
        } else {
            System.out.println("no match found");
        }
        // System.out.println("indexData=" + index.toString());
    }

The output is "no match found". What am I doing wrong?

Thanks,
Alex
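For comparison, a minimal sketch of the usual MemoryIndex pattern for this kind of prospective matching: the incoming text is indexed once into a single field, and each saved term is then run against it as a query. The field and class names are assumptions, and Lucene's contrib MemoryIndex API is assumed:

    // Hypothetical sketch: index the incoming text once, then run each saved
    // term against it. A positive score means the term occurs in the text.
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.memory.MemoryIndex;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Set;

    public class ProspectiveMatch {
        public static Map<String, Float> match(String text, Set<String> savedTerms) throws Exception {
            StandardAnalyzer analyzer = new StandardAnalyzer();
            MemoryIndex index = new MemoryIndex();
            // One in-memory document, one field, holding the full incoming text.
            index.addField("content", text, analyzer);

            Map<String, Float> hits = new LinkedHashMap<>();
            QueryParser parser = new QueryParser("content", analyzer);
            for (String term : savedTerms) {
                // Quote multi-word terms so they are treated as phrases.
                Query query = parser.parse("\"" + term + "\"");
                float score = index.search(query);
                if (score > 0.0f) {
                    hits.put(term, score);
                }
            }
            return hits;
        }
    }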
Empty results after merging index via IndexMergeTool
Hi All,

I am trying to merge two indexes using IndexMergeTool. The two indexes were created using Solr 1.4 and are fine, showing results. I used the command below, as per http://wiki.apache.org/solr/MergingSolrIndexes#head-feb9246bab59b54c0ba361d84981973976566c2a, to merge the two indexes:

    java -cp C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\lib\lucene-core-2.9-dev.jar
         C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\lib\lucene-misc-2.4.1.jar
         org.apache.lucene.misc.IndexMergeTool
         C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\core\data
         C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\core1\data\index
         C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\core2\data\index

After executing the above command I got the output:

    Merging...
    Optimizing...
    Done.

The core data folder contains the files "_0.cfs", "segments.gen", "segments_2". But once I check the results from the merged data, the response comes back with zero results: no documents found.

I am using the lucene-core-2.9-dev.jar and lucene-misc-2.4.1.jar files.

Please help resolve the issue. Thanks in advance.

Jay
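For comparison, a sketch of how this invocation is commonly written on Windows. This is only an illustration of the usual form, not a confirmed fix: the two jars are joined with a ';' classpath separator (a space-separated list would normally make java treat the second jar path as the main class), and the merge target is typically the core's data\index directory rather than data itself:

    java -cp "C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\lib\lucene-core-2.9-dev.jar;C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\lib\lucene-misc-2.4.1.jar"
         org.apache.lucene.misc.IndexMergeTool
         C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\core\data\index
         C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\core1\data\index
         C:\jbdevstudio\jboss-eap\jboss-as\bin\Core\core2\data\index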
Is it possible to apply index-time synonyms just for a section of the index
I've posted a few questions on synonyms before and finally understood how they work, and settled on index-time synonyms. They seem to work much better than query-time synonyms. But now at my work they have a special request: they want certain synonyms to be applied only to certain sections of the index.

For example, we have legal faqs, forms, etc., and we have attorneys in our index, with synonyms such as:

  california,san diego
  florida,miami

So for a search for 'real estate san diego', it makes sense to return all faqs and forms for 'california' in the index, but it doesn't make sense to return a real estate attorney elsewhere in California (like Burbank); the attorney results should be restricted to San Diego.

To be more clear: I want to be able to return all California faqs and forms for 'real estate san diego', but not all California attorneys for the same query. That means I should index the faqs and forms with the state => city mappings as above, but not the attorneys.

I could index all the other resources like faqs and forms first with these synonyms, then remove them and index the attorneys. But that wouldn't work well in my case, because we have a scheduler set up that runs every night to index any new resources from our database.

Can someone suggest a good solution for this?
Re: Is it possible to apply index-time synonyms just for a section of the index
What is stopping you from defining different field types for the faqs and the attorneys? One with index-time synonyms and one without.

anuvenk wrote:
>
> I've posted a few questions on synonyms before and finally understood how
> they work, and settled on index-time synonyms. They seem to work much better
> than query-time synonyms. But now at my work they have a special request:
> they want certain synonyms to be applied only to certain sections of the index.
>
> [rest of the original question snipped]
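A minimal sketch of what that could look like in schema.xml, assuming a synonyms file containing the state/city mappings. The field type names, field names, and the rest of the analyzer chain here are assumptions, not an existing schema:

    <!-- Hypothetical field type with index-time synonyms, for faq/form text -->
    <fieldType name="text_with_synonyms" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    <!-- Hypothetical field type without synonyms, for attorney text -->
    <fieldType name="text_plain" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    <field name="faq_text" type="text_with_synonyms" indexed="true" stored="true"/>
    <field name="attorney_text" type="text_plain" indexed="true" stored="true"/>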
Re: Is it possible to apply index-time synonyms just for a section of the index
That's right. Simple. I can very well do that. Why didn't I think of it? Thanks.

rswart wrote:
>
> What is stopping you from defining different field types for the faqs and
> the attorneys? One with index-time synonyms and one without.
>
> [earlier quoted question snipped]
Re: THIS WEEK: PNW Hadoop / Apache Cloud Stack Users' Meeting, Wed Jun 24th, Seattle
Hey all,

Just writing a quick note of "thanks", we had another solid group of people show up! As always, we learned quite a lot about interesting use cases for Hadoop, Lucene, and the rest of the Apache 'Cloud Stack'. I couldn't get it taped, but we talked about:

 - Scaling Lucene with Katta and the Katta infrastructure
 - The need for low-latency BI on distributed document stores
 - Lots and lots of detail on Amazon Elastic MapReduce

We'll be doing it again next month -- July 29th.

On Mon, Jun 22, 2009 at 5:40 PM, Bradford Stephens wrote:
> Hey all, just a friendly reminder that this is Wednesday! I hope to see
> everyone there again. Please let me know if there's something interesting
> you'd like to talk about -- I'll help however I can. You don't even need a
> Powerpoint presentation -- there are many whiteboards. I'll try to have a
> video cam, but no promises.
>
> Feel free to call at 904-415-3009 if you need directions or have any questions :)
>
> ~~~
> Greetings,
>
> On the heels of our smashing success last month, we're going to be
> convening the Pacific Northwest (Oregon and Washington)
> Hadoop/HBase/Lucene/etc. meetup on the last Wednesday of June, the
> 24th. The meeting should start at 6:45, organized chats will end
> around 8:00, and then there shall be discussion and socializing :)
>
> The meeting will be at the University of Washington in Seattle again.
> It's in the Computer Science building (not electrical engineering!),
> room 303, located here:
> http://www.washington.edu/home/maps/southcentral.html?80,70,792,660
>
> If you've ever wanted to learn more about distributed computing, or
> just see how other people are innovating with Hadoop, you can't miss
> this opportunity. Our focus is on learning and education, so every
> presentation must end with a few questions for the group to research
> and discuss. (But if you're an introvert, we won't mind.)
>
> The format is two or three 15-minute "deep dive" talks, followed by
> several 5-minute "lightning chats". We had a few interesting topics
> last month:
>
> - Building a Social Media Analysis company on the Apache Cloud Stack
> - Cancer detection in images using Hadoop
> - Real-time OLAP on HBase -- is it possible?
> - Video and Network Flow Analysis in Hadoop vs. Distributed RDBMS
> - Custom Ranking in Lucene
>
> We already have one "deep dive" scheduled this month, on truly
> scalable Lucene with Katta. If you've been looking for a way to handle
> those large Lucene indices, this is a must-attend!
>
> Looking forward to seeing everyone there again.
>
> Cheers,
> Bradford
>
> http://www.roadtofailure.com -- The Fringes of Distributed Computing,
> Computer Science, and Social Media.
Re: Solr document security
That's what I was going to do originally; however, what is stopping a user from simply running a search through http://localhost:8983/solr/admin/ on the index server?

Norberto Meijome-6 wrote:
>
> On Wed, 24 Jun 2009 23:20:26 -0700 (PDT) pof wrote:
>
>> Hi, I am wanting to add document-level security that works as follows: an
>> external process makes a query to the index and, depending on the user's
>> security allowances based on a login id, a list of hits is returned minus
>> any the user isn't meant to know even exist. [...]
>
> Hi Brett,
> (keeping in mind that I've been away from SOLR for 8 months, but I don't
> think this was added of late)
>
> The standard approach is to manage security @ your application layer,
> not @ SOLR. Ie, search, return documents (which should contain some kind
> of data to identify their ACL), and then you can decide whether to show
> each one or not.
>
> HIH
> _____
> {Beto|Norberto|Numard} Meijome
Re: Solr document security
That URL to your Solr Admin page should never be exposed to the outside world. You can play with the network, routing, DNS and other similar things to make sure no one can get to it from the outside, even if the URL is known.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
> From: pof
> To: solr-user@lucene.apache.org
> Sent: Thursday, June 25, 2009 7:40:12 PM
> Subject: Re: Solr document security
>
> That's what I was going to do originally; however, what is stopping a user
> from simply running a search through http://localhost:8983/solr/admin/ on
> the index server?
>
> [earlier quoted messages snipped]