Suggester - how to return exact match?
Hi,

We implemented a Solr suggester (http://wiki.apache.org/solr/Suggester) that uses a file-based dictionary. We use the results of the suggester to populate a dropdown on a search field on a webpage. Our dictionary (autosuggest.txt) contains:

  foo
  bar

Our suggester shows the following behavior: we can make a request with the search query "fo" and get a response with the suggestion "foo". This is great. However, if we make a request with the query "foo" (an exact match), we get no suggestions. We would expect the response to return the suggestion "foo". How can we configure the suggester to also return the perfect match as a suggestion?

This is the config for our search component:

  spellCheck default org.apache.solr.spelling.suggest.Suggester autosuggest.txt

Thanks for your help!
Mirko
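A sketch of what such a searchComponent configuration typically looks like, assembled from the surviving values above (the component and spellchecker names are assumptions, not the poster's exact config):

  <searchComponent name="spellCheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <!-- file-based dictionary, one entry per line -->
      <str name="sourceLocation">autosuggest.txt</str>
    </lst>
  </searchComponent>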
Re: Suggester - how to return exact match?
Hi,

I'd like to clarify our use case a bit more. We want to return the exact search query as a suggestion only if it is present in the index. So in my example we would expect to get the suggestion "foo" for the query "foo", but no suggestion "abc" for the query "abc" (because "abc" is not in the dictionary).

To me this use case seems quite common. Say we have three products in our store: "foo", "foo 1", "foo 2". If the user types "foo" in the product search, we want to suggest all our products in the dropdown. Is this something we can do with the Solr suggester?

Mirko

2013/11/20 Developer

> Maybe there is a way to do this, but it doesn't make sense to return the
> same search query as a suggestion (the search query is not a suggestion,
> as it might or might not be present in the index).
>
> AFAIK you can use various lookup algorithms to get the suggestion list,
> and they look up the terms based on the query value (some algorithms
> implement fuzzy logic too). So searching "Foo" will return "FooBar" and
> "Foo2" but not "foo".
>
> You should fetch the suggestions only if numFound is greater than 0;
> otherwise you don't have any suggestions.
Parse eDisMax queries for keywords
Hi,

We would like to implement special handling for queries that contain certain keywords. Our particular use case: in the example query "Footitle season 1" we want to discover the keyword "season", get the subsequent number, and boost (or filter for) documents that match "1" on the field name="season".

We have two fields in our schema, "title" and "season".

Our idea was to use a keyword tokenizer and a regex on the "season" field to extract the season number from the complete query (see the sketch below).

However, we use an ExtendedDisMax query parser in our search handler (defType=edismax, qf="title season").

The problem is that eDisMax tokenizes the query, so that our field "season" receives the tokens ["Foo", "season", "1"] without any order, instead of the complete query. How can we pass the complete query (untokenized) to the season field? We don't understand which tokenizer is used here and why our "season" field received tokens instead of the complete query.

Or is there another approach to solve this use case with Solr?

Thanks,
Mirko
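A minimal sketch of the kind of "keyword tokenizer + regex" field type described above (the type name and regex are illustrative assumptions, not the poster's actual schema):

  <fieldType name="season_extract" class="solr.TextField">
    <analyzer>
      <!-- keep the whole input as a single token -->
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <!-- reduce the token to the number following "season",
           e.g. "Footitle season 1" -> "1" -->
      <filter class="solr.PatternReplaceFilterFactory"
              pattern=".*season\s+(\d+).*" replacement="$1"/>
    </analyzer>
  </fieldType>

As the reply below explains, this analyzer never sees the full query string under eDisMax, because the query parser splits on whitespace before the analyzer is called.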
Re: Parse eDisMax queries for keywords
Hi Jack,

Thanks for your reply. OK, in this case I agree that "enriching" the query in the application layer is a good idea. We are still a bit puzzled about what the enriched query should look like. I'll post here when we find a solution. If somebody has suggestions, I'd be happy to hear them.

Mirko

2013/11/21 Jack Krupansky

> The query parser does its own tokenization and parsing before your
> analyzer tokenizer and filters are called, assuring that only one
> whitespace-delimited token is analyzed at a time.
>
> You're probably best off having an application-layer preprocessor for the
> query that "enriches" the query in the manner that you're describing.
>
> Or simply settle for a "heuristic" approach that may give you 70% of what
> you want using only existing Solr features on the server side.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Mirko
> Sent: Thursday, November 21, 2013 5:30 AM
> To: solr-user@lucene.apache.org
> Subject: Parse eDisMax queries for keywords
>
> Hi,
> We would like to implement special handling for queries that contain
> certain keywords. Our particular use case:
>
> In the example query "Footitle season 1" we want to discover the keyword
> "season", get the subsequent number, and boost (or filter for) documents
> that match "1" on field name="season".
>
> We have two fields in our schema, "title" and "season".
>
> Our idea was to use a keyword tokenizer and a regex on the "season" field
> to extract the season number from the complete query.
>
> However, we use an ExtendedDisMax query parser in our search handler
> (defType=edismax, qf="title season").
>
> The problem is that eDisMax tokenizes the query, so that our field
> "season" receives the tokens ["Foo", "season", "1"] without any order,
> instead of the complete query.
>
> How can we pass the complete query (untokenized) to the season field? We
> don't understand which tokenizer is used here and why our "season" field
> received tokens instead of the complete query.
>
> Or is there another approach to solve this use case with Solr?
>
> Thanks,
> Mirko
Re: Suggester - how to return exact match?
Thanks! We solved this issue in the front-end now, i.e. we add the exact match to the list of suggestions there.

Mirko

2013/11/22 Developer

> Might not be a perfect solution, but you can use an EdgeNGram filter,
> copy all your field data to that field, and use it for suggestions.
>
>   positionIncrementGap="100" ... maxGramSize="250"
>
> http://localhost:8983/solr/core1/select?q=name:iphone
>
> The above query will return
> iphone
> iphone5c
> iphone4g
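The field type the reply refers to would presumably look something like this (a hedged reconstruction; only positionIncrementGap="100" and maxGramSize="250" survive from the original mail, the rest is a common EdgeNGram autocomplete setup):

  <fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- index prefixes "i", "ip", "iph", ... up to 250 characters -->
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="250"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

With this setup the query term "iphone" matches the indexed prefix "iphone" itself, so the exact match comes back along with the longer completions.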
Automatically build spellcheck dictionary on replicas
Hi all,

We use a Solr SpellcheckComponent with a file-based dictionary. We run a master and some replica slave servers. To update the dictionary, we copy the dictionary txt file to the master, from where it is automatically replicated to all slaves. However, it seems we need to run the "spellcheck.build" query on all servers individually.

Is there a way to automatically build the spellcheck dictionary on all servers without calling "spellcheck.build" on all slaves individually?

We use Solr 4.0.0.

Thanks,
Mirko
Re: Automatically build spellcheck dictionary on replicas
Yes, I have that, but it doesn't help. It seems Solr still needs the query with the "spellcheck.build" parameter to build the spellchecker index.

2013/12/3 Kydryavtsev Andrey

> Did you try to add
>   true
> parameter to your slave's spellcheck configuration?
>
> 03.12.2013, 12:04, "Mirko":
> > Hi all,
> > We use a Solr SpellcheckComponent with a file-based dictionary. We run a
> > master and some replica slave servers. To update the dictionary, we copy
> > the dictionary txt file to the master, from where it is automatically
> > replicated to all slaves. However, it seems we need to run the
> > "spellcheck.build" query on all servers individually.
> >
> > Is there a way to automatically build the spellcheck dictionary on all
> > servers without calling "spellcheck.build" on all slaves individually?
> >
> > We use Solr 4.0.0
> >
> > Thanks,
> > Mirko
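The "true" parameter referenced above is presumably the spellchecker's build-on-commit option, which in solrconfig.xml usually looks like this (a hedged guess at the stripped snippet):

  <lst name="spellchecker">
    ...
    <!-- rebuild the spellcheck index whenever a commit happens -->
    <str name="buildOnCommit">true</str>
  </lst>

As the follow-up below notes, this does not help for file-based dictionaries, which still require an explicit spellcheck.build request.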
Solr Suggester ranked by boost
I want to implement a Solr Suggester (http://wiki.apache.org/solr/Suggester) that ranks suggestions by document boost factor. As I understand the documentation, the following config should work:

solrconfig.xml:

  ... true 7 true suggest default suggesttext org.apache.solr.spelling.suggest.Suggester org.apache.solr.spelling.suggest.fst.WFSTLookupFactory true ...

schema.xml:

  ... ... ...

I added three documents with a document boost:

  {
    "add": { "commitWithin": 5000, "overwrite": true, "boost": 3.0,
             "doc": { "id": "1", "suggesttext": "text bb" } },
    "add": { "commitWithin": 5000, "overwrite": true, "boost": 2.0,
             "doc": { "id": "2", "suggesttext": "text cc" } },
    "add": { "commitWithin": 5000, "overwrite": true, "boost": 1.0,
             "doc": { "id": "3", "suggesttext": "text aa" } }
  }

A query to the suggest handler (with spellcheck.q=te) gives the following response:

  {
    "responseHeader": { "status": 0, "QTime": 6 },
    "command": "build",
    "response": { "numFound": 3, "start": 0, "docs": [
      { "id": "1", "suggesttext": ["text bb"] },
      { "id": "2", "suggesttext": ["text cc"] },
      { "id": "3", "suggesttext": ["text aa"] } ]
    },
    "spellcheck": { "suggestions": [
      "te", {
        "numFound": 3,
        "startOffset": 0,
        "endOffset": 2,
        "suggestion": ["text aa", "text bb", "text cc"] } ] }
  }

The search results are ranked by boost as expected. However, the suggestions are not ranked by boost (but alphabetically instead). I also tried the TSTLookup and FSTLookup implementations, with the same result. Any ideas what I'm missing?

Thanks,
Mirko
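For comparison, a typical suggester configuration of that era looks roughly like this (a sketch assembled from the surviving values above; the handler name and defaults are assumptions). When the suggester is built from an index field like this, it typically derives its weights from term frequencies rather than from document boost, which would be consistent with the alphabetical ordering described above:

  <searchComponent name="suggest" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">suggest</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
      <str name="field">suggesttext</str>
      <str name="buildOnCommit">true</str>
    </lst>
  </searchComponent>

One way to get explicit per-suggestion weights is a file-based dictionary, where each line can carry a weight (term and weight separated by a tab):

  text aa<TAB>1.0
  text bb<TAB>3.0
  text cc<TAB>2.0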
Re: Automatically build spellcheck dictionary on replicas
Ok, thanks for pointing that out!

2013/12/3 Kydryavtsev Andrey

> Yep, sorry, it doesn't work for file-based dictionaries:
>
> > In particular, you still need to index the dictionary file once by
> > issuing a search with &spellcheck.build=true on the end of the URL; if
> > your system doesn't update that dictionary file, then this only needs
> > to be done once. This manual step may be required even if your
> > configuration sets build=true and reload=true.
>
> http://wiki.apache.org/solr/FileBasedSpellChecker
>
> 03.12.2013, 21:27, "Mirko":
> > Yes, I have that, but it doesn't help. It seems Solr still needs the
> > query with the "spellcheck.build" parameter to build the spellchecker
> > index.
Indexing XML files
Hi,

I am trying to index an XML file as a field in Lucene; see the example below:

  <add>
    <doc>
      <field name="title">As You Like it</field>
      <field name="author">Shakespeare, William</field>
      <field name="record"><myxml>here goes the xml...</myxml></field>
    </doc>
  </add>

I can index the title and author fields because they are strings, but the record field is an XML document itself, and I bump into some problems, as I cannot directly input an XML file using the post.sh script (Solr complains). I wonder what would be the correct (and relatively simple) way of doing this.

Ideally, I would like to store the XML as-is, index only the content with the XML tags removed (I believe there is HTMLStripWhitespaceAnalyzer for that), and output the result as XML (so simple escaping does not work for me).

So far, my idea is to escape the XML record and then unescape it for internal storage, and use the analyzer for indexing (which would possibly require creating a class like XMLField or such).

thanks,
mirko
Re: Indexing XML files
Hi,

Thanks for the quick response. Now I have one more question. Is it possible to get the result for a query back in the following form (considering the input is the escaped XML you mentioned before):

  0 0 As You Like It (Promptbook of McVicars 1860) Shakespeare, William, ...

Note that here the XML data is not escaped. If yes, what do I have to do to get such results back? Would <str> need to be replaced with a type, say <xml>, which has a different write method? Or will I only be able to display escaped XML within <str> (and any other types)? If so, why?

thanks,
mirko

Quoting Chris Hostetter <[EMAIL PROTECTED]>:

> Since XML is the transport for sending data to Solr, you need to make sure
> all field values are XML escaped.
>
> If you wanted to index a plain text "title" and that title contained an
> ampersand character
>
>   Sense & Sensability
>
> ...you would need to XML escape that as...
>
>   Sense &amp; Sensability
>
> ...Solr internally will treat that consistently as the Java string "Sense
> & Sensability", and when it comes time to return that string back to your
> query clients, will output it in whatever form is appropriate for your
> ResponseWriter -- if that's XML, then it will be XML escaped again; if
> it's JSON or something like it, it can probably be left alone.
>
> The same holds true for any other characters you want to include in your
> field values: Solr doesn't care that the *value* itself is an XML string,
> just that you properly escape the value in your XML message to Solr...
>
>   <doc>
>     <field name="title">As You Like it</field>
>     <field name="author">Shakespeare, William</field>
>     <field name="record">&lt;myxml&gt;here goes the xml...&lt;/myxml&gt;</field>
>   </doc>
>
> ...does that make sense?
>
> : Ideally, I would like to store the xml as is, and index only the content
> : removing the xml-tags (I believe there is HTMLStripWhitespaceAnalyzer for
> : that).
> : And output the result as an xml (so, simple escaping does not work for me).
>
> The escaping is just to send the data to Solr -- once sent, Solr will
> process the unescaped string when dealing with analyzers, etc. exactly as
> you'd expect.
>
> -Hoss
Re: Indexing XML files
You are right, it is escaped. But my question is: (how) can I make it unescaped?

mirko

Quoting Yonik Seeley <[EMAIL PROTECTED]>:

> I bet it is escaped, but your browser has helpfully displayed it as
> unescaped. Try doing CTRL-U in Firefox to see the real source of the reply.
>
> -Yonik
Re: Indexing XML files
Hi,

The idea is to apply an XSLT transformation on the result. But it seems that I would have to apply two transformations in a row: one which unescapes the escaped node, and a second which performs the actual transformation...

mirko

Quoting Yonik Seeley <[EMAIL PROTECTED]>:

> On 12/5/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > You are right, it is escaped. But my question is: (how) can I
> > make it unescaped?
>
> For what purpose?
> If you use an XML parser, the values it gives back to you will be unescaped.
>
> -Yonik
Re: Indexing XML files
Thank you all for the quick responses. They were very helpful. My XML is well-formed, so I ended up implementing my own FieldType:

  import java.io.IOException;
  import org.apache.lucene.document.Fieldable;
  import org.apache.solr.request.XMLWriter;   // package paths assume the Solr 1.x API
  import org.apache.solr.schema.TextField;

  public class XMLField extends TextField {
      public void write(XMLWriter xmlWriter, String name, Fieldable f) throws IOException {
          // write the stored value as a raw "xml" element instead of an escaped string
          xmlWriter.writePrim("xml", name, f.stringValue(), false);
      }
  }

I looked at the XSD and there is one thing I don't understand: if the desired way is to conform to the XSD (and hence the types used in the XSD), then how would it be possible to use user-defined field types as plugins? Wouldn't they violate the same principle?

thanks,
mirko

Quoting Chris Hostetter <[EMAIL PROTECTED]>:
...
> I think Walter's got the right idea ... as a general rule, we want to make
> the XmlResponseWriter "bullet proof" so that no matter what data you put
> into your index, it is guaranteed to produce a well-formed XML document
> that conforms to a specified DTD or XSD (see SOLR-17 for one we already
> have but we haven't figured out what to do with yet)
> ...
> If you're interested in writing a bit of custom Java code, you could in
> fact write a new FieldType (which could easily subclass TextField) with a
> custom "write" method that just outputs the raw value directly, and then
> load your field type as a plugin...
>
> http://wiki.apache.org/solr/SolrPlugins
>
> -Hoss
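To use such a plugin, the custom class would then be declared as a field type in schema.xml, roughly like this (a sketch; the package name and field declaration are hypothetical):

  <!-- assumes XMLField is on the classpath in package com.example -->
  <fieldType name="xml" class="com.example.XMLField"/>
  <field name="record" type="xml" indexed="true" stored="true"/>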
solr + cocoon problem
Hi,

I am trying to implement a Cocoon-based application using Solr for searching. In particular, I would like to forward the request from my response page to Solr. I have tried several alternatives, but none of them worked for me. One which would seem a logical way to me is to have a response page which is forwarded to Solr with Cocoon's file generator. It works fine if I perform queries which contain only alphanumeric characters, but it gives the following error if I try to query for a string containing non-alphanumeric characters:

  http://hostname/cocoon/mywebapp/response?q=a+b

  java.io.IOException: Server returned HTTP response code: 505 for URL:
  http://hostname/solr/select/?q=a b

The interesting thing is that if I access http://hostname/solr/select/?q=a b directly, it works.

The relevant part of my sitemap.xmap:

  <map:generate src="http://hostname/solr/select/?q={request-param:q}" type="file"/>

Any ideas on how to implement a Cocoon layer above Solr?

thanks,
mirko

ps. I realize this question might be more of a Cocoon question, but I am posting it here because I got the idea of using Cocoon on top of Solr from http://wiki.apache.org/solr/XsltResponseWriter. So I assume some of you have already run into similar issues and/or know the solution...
Re: solr + cocoon problem
Hi,

I agree, this is not a legal URL. But the thing is that Cocoon itself is sending the unescaped URL. That is why I thought I am not using the right tools from Cocoon.

mirko

Quoting Chris Hostetter <[EMAIL PROTECTED]>:

> : java.io.IOException: Server returned HTTP response code: 505 for URL:
> : http://hostname/solr/select/?q=a b
> :
> : The interesting thing is that if I access
> : http://hostname/solr/select/?q=a b directly it works.
>
> I don't know anything about Cocoon, but that is not a legal URL; URLs
> can't have spaces in them ... if you type a space into your browser, it's
> probably being nice and URL-escaping it for you (that's what most browsers
> seem to do nowadays).
>
> I'm guessing Cocoon automatically un-escapes the input to your app, and
> you need to re-URL-escape it before sending it to Solr.
>
> -Hoss
Re: solr + cocoon problem
Thanks Thorsten, that really was helpful. Cocoon's url-encode module does solve my problem.

mirko

Quoting Thorsten Scherler <[EMAIL PROTECTED]>:

> On Wed, 2007-01-17 at 10:25 -0500, [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > I agree, this is not a legal URL. But the thing is that cocoon itself is
> > sending the unescaped URL.
>
> ...because you told it so.
>
> You use
>   src="http://hostname/solr/select/?q={request-param:q}" type="file"
>
> The request-param module will not escape the param by default.
>
> salu2
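The fix presumably amounts to wrapping the request parameter in Cocoon's url-encode input module before it is interpolated into the Solr URL, something along these lines (a sketch; the exact module name depends on how it is declared in your cocoon.xconf):

  <map:generate
      src="http://hostname/solr/select/?q={url-encode:{request-param:q}}"
      type="file"/>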
SolrSearchGenerator for Cocoon (2.1)
Hi,

I looked at the SolrSearchGenerator (this is the part which is of interest to me), but I could not get it to work for Cocoon 2.1 yet. It seems that there is no getParameters method in the org.apache.cocoon.environment Request interface:

http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/environment/Request.html

I guess using the getParameterNames and getParameter methods instead should do the trick. Or am I missing something?

mirko

Quoting Thorsten Scherler <[EMAIL PROTECTED]>:

> On Mon, 2007-03-26 at 09:30 -0400, Winona Salesky wrote:
> > Thanks Chris, I'll take another look at the forrest plugin.
>
> Have a look as well at http://wiki.apache.org/solr/SolrForrest
> it points out the cocoon components.
>
> salu2
> --
> Thorsten Scherler thorsten.at.apache.org
> Open Source Java & XML consulting, training and solutions
Re: Filter query doesn't always work...
Hi,

You might want to use the sint (sortable integer) field type instead. If you use the plain integer field type, I guess the range queries are evaluated lexicographically, as on strings (like in [Ab TO Ch]). You can find some documentation about it in the example schema.xml:

http://svn.apache.org/viewvc/lucene/solr/trunk/example/solr/conf/schema.xml

mirko

Quoting escher2k <[EMAIL PROTECTED]>:

> I have a strange problem, and I don't seem to see any issue with the data.
> I am filtering on a field called reviews_positive_6_mos. The field is
> declared as an integer.
>
> If I specify -
> (a) fq=reviews_positive_6mos%3A[*+TO+*]   => 36033 records are retrieved.
> (b) fq=reviews_positive_6mos%3A[*+TO+100] => 35996 records are retrieved.
> (c) fq=reviews_positive_6mos%3A[80+TO+100] => 0 records are retrieved.
> (d) fq=reviews_positive_6mos%3A[80+TO+*]   => 9 records are retrieved.
> (e) fq=reviews_positive_6mos%3A[100+TO+100] => 764 records are retrieved.
>
> I am not sure what could be wrong in cases (c) and (d), especially when
> there is a lot of data where reviews_positive_6mos = 100. Any suggestions
> would be most appreciated.
>
> Thanks.
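In the Solr 1.x example schema, the relevant definitions look roughly like this (a sketch; the field declaration is an assumption matching the poster's field name):

  <fieldType name="sint" class="solr.SortableIntField"
             sortMissingLast="true" omitNorms="true"/>

  <field name="reviews_positive_6mos" type="sint" indexed="true" stored="true"/>

With sint, integers are encoded so that their lexicographic order matches their numeric order, which makes range queries like [80 TO 100] behave as expected.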
numFound for facet results
Hi,

Could you tell me what is the (simplest | most elegant | fastest) way of implementing the following? I use faceted browsing, but I limit the number of facet counts to 5 (i.e., facet.limit=5).

1. I would like to be able to show whether there are more facet values. (This can be achieved with the trick of asking for 6 values and only displaying 5; if the 6th is non-empty, obviously there are more than 5 :)

2. I would like to be able to tell how many facet values there are in total. (This would be a value like numFound for the results.) Is there such a thing, or a workaround like the one for 1?

thanks,
mirko
problem with schema.xml
Hi,

I just started playing around with Solr 1.2. It has some nice improvements. I noticed that errors in the schema.xml get reported in a verbose way now, but the following steps cause a problem for me:

1. Start with a correct schema.xml - Solr works fine.
2. Edit it in a way that is no longer correct (say, remove a closing tag) - Solr works fine.
3. Restart the webapp (through the Tomcat manager interface) - Solr complains that the schema.xml does not parse, fine.
4. Now restart again (without fixing the schema.xml!) - Solr won't even start up.
5. Fix the above problem (add the closing tag) and restart via Tomcat's manager - the webapp cannot restart, showing that there is a problem: FAIL - Application at context path /furness could not be started

These steps might seem artificial, but assume you don't manage to fix all the typos in your schema.xml on the first attempt. It seems that after the restart Solr gets stuck in some state, and I cannot get it up and running via Tomcat's manager, only by restarting Tomcat. Am I missing something?

Thanks,
mirko
Re: problem with schema.xml
Hi Ryan,

I have my .war file located outside the webapps folder (I am using multiple Solr instances with a config as suggested on the wiki: http://wiki.apache.org/solr/SolrTomcat). Nevertheless, I touched the .war file, the config file, and the directory under webapps, but nothing seems to be working.

Any other suggestions? Is someone else experiencing the same problem?

thanks,
mirko

Quoting Ryan McKinley <[EMAIL PROTECTED]>:

> I don't use tomcat, so I can't be particularly useful. The behavior you
> describe does not happen with resin or jetty...
>
> My guess is that tomcat is caching the error state. Since fixing the
> problem is outside the webapp directory, it does not think it has
> changed, so it stays in a broken state.
>
> if you "touch" the .war file, does it restart ok?
>
> but i'm just guessing...
Create date field using file name
Hi folks,

Hopefully this is an easy question, but I couldn't do it after several hours. I created a new field (adding indexed="true" stored="true"/>) and I'd like to use the file name value to fill it. The file names look like TEXT_CRE_MMGG_X-XXX-XXX.txt or TEXT_CRE_MMGG_X-XXX.txt (where every X is a random number). I'd like to use a date field type to be able to use some grouping functions.

Thanks in advance. Have a nice week,

Mirko
Re: Create date field using file name
I forgot to add that the txt files are divided into directories following this rule: //MM/**files**.

Regards,
Mirko
Invalid Date String:'1992-07-10T17'
Hi all,

I am very new to Solr (and Lucene) and I use the latest version of it. I do not understand why I obtain this:

  Exception in thread "main" org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
  Error from server at http://localhost:8983/solr/Collection1: Invalid Date String:'1992-07-10T17'
      at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:558)
      at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:214)
      at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:210)
      at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
      at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:302)
      at Update.main(Update.java:18)

Here is the code that creates this error:

  SolrQuery query = new SolrQuery();
  String a = "speechDate:1992-07-10T17:33:18Z";
  query.set("fq", a);
  //query.setQuery( a );  <-- I also tried using this one.

According to https://cwiki.apache.org/confluence/display/solr/Working+with+Dates, it should be right. I tried with other dates, or just YYYY-MM-DD, with no success.

My goal is to group these speeches (hopefully using the date math syntax). I would like to know whether you suggest using date or tdate or something else, because I have not understood the difference.

Thanks in advance,
Mirko
Re: Invalid Date String:'1992-07-10T17'
Thanks very much for each of your replies. These resolved my problem and taught me something important. I have just discovered that I have another problem, but I guess that I have to open another discussion.

Cheers,
Mirko

On 10/03/15 20:30, Chris Hostetter wrote:

":" is a syntactically significant character to the query parser, so it's getting confused by it in the text of your query. You're seeing the same problem as if you tried to search for "foo:bar" in the "yak" field using q=yak:foo:bar.

You either need to backslash-escape the ":" characters, or wrap the date in quotes, or use a different parser that doesn't treat colons as special characters (but remember that since you are building this up as a Java string, you have to deal with *Java* string escaping as well)...

  String a = "speechDate:1992-07-10T17\\:33\\:18Z";
  String a = "speechDate:\"1992-07-10T17:33:18Z\"";
  String a = "speechDate:" + ClientUtils.escapeQueryChars("1992-07-10T17:33:18Z");
  String a = "{!field f=speechDate}1992-07-10T17:33:18Z";

: My goal is to group these speeches (hopefully using date math syntax). I would

Unless you are truly searching for only documents that have an *exact* date value matching your input (down to the millisecond), then searching for a single date value is almost certainly not what you want -- you most likely want to do a range search...

  String a = "speechDate:[1992-07-10T00:00:00Z TO 1992-07-11T00:00:00Z]";

(which doesn't require special escaping, because the query parser is smart enough to know that ":" isn't special inside the "[..]")

: like to know if you suggest me to use date or tdate or other because I have
: not understood the difference.

The difference between date and tdate has to do with how you want to trade index size (on disk & in RAM) against search speed for range queries like these -- tdate takes up a little more room in the index, but can make range queries faster.

-Hoss
http://www.lucidworks.com/
how to store _text field
Hi folks,

I googled and tried without success, so I ask you: how can I modify the settings of a field to store it?

It is interesting to note that I did not add the _text field, so I guess it is a default one. Maybe it is normal that it is not shown in the results, but actually this is my real problem. It would also be great to copy it into a new field, but I do not know how to do that with the latest Solr (5) and the new kind of schema. I know that I have to use curl, but I do not know how to use it to copy a field.

Thank you in advance!
Cheers,

Mirko
Re: how to store _text field
Hi Alexandre,

I need to visualize the content of _text. For some reason, it is currently not shown in the results (the "response"). I guess that happens because it isn't stored (due to some default setting that I'd like to change).

Thanks for your help,

Mirko

On 13/03/15 00:27, Alexandre Rafalovitch wrote:

Wait, step back. This is confusing. What's your real problem you are trying to solve?

Regards,
Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 12 March 2015 at 19:50, Mirko Torrisi wrote:

Hi folks,

I googled and tried without success, so I ask you: how can I modify the settings of a field to store it?

It is interesting to note that I did not add the _text field, so I guess it is a default one. Maybe it is normal that it is not shown in the results, but actually this is my real problem. It would also be great to copy it into a new field, but I do not know how to do that with the latest Solr (5) and the new kind of schema. I know that I have to use curl, but I do not know how to use it to copy a field.

Thank you in advance!
Cheers,

Mirko
Re: how to store _text field
Hi Erick,

I'm sorry for the delay, but I've just seen this reply.

I'm using the latest version of Solr, and the default setting is to use the new kind of schema management; it doesn't use schema.xml, so I have no idea how to set "stored" for this field. The content is grabbed, because I've obtained results using the search function, but it is not shown because it is not set to "stored".

I hope this is clear. Thanks very much.

All the best,

Mirko

On 14/03/15 17:58, Erick Erickson wrote:

Right, your schema.xml file will define, perhaps, some "dynamic fields". First ensure that stored="true" is specified. If you change this, you have to re-index the docs.

Second, ensure that your "fl" parameter with the field is specified on the requests, something like q=*:*&fl=eoe_txt.

Third, ensure that you are actually sending content to that field when you index docs.

If none of this helps, show us the definition from schema.xml, a sample input document, and a query that illustrate the problem, please.

Best,
Erick

On Fri, Mar 13, 2015 at 1:20 AM, Mirko Torrisi wrote:

Hi Alexandre,

I need to visualize the content of _text. For some reason, it is currently not shown in the results (the "response"). I guess that happens because it isn't stored (due to some default setting that I'd like to change).

Thanks for your help,

Mirko
...
Addition to Solr wiki editor list
Hi there!

I'd like to be added to the list of people who are able to edit the Solr wiki at https://wiki.apache.org/solr. I'm working as a Java developer for a German company that uses Solr a lot (and I like it a lot), and I would like to be able to correct things as soon as I find them, without going to the IRC channel to get things changed.

My wiki name should be campfire.

Thanks in advance
Re: how to store _text field
Hi guys,

I used Erick's suggestions (thanks again!!) to create a new field and copy the _text content into it:

  curl -X POST -H 'Content-type:application/json' --data-binary '{
    "add-field": { "name":"content", "type":"string", "indexed":true, "stored":true },
    "add-copy-field": { "source":"_text", "dest": [ "content" ] }
  }' http://localhost:8983/solr/Test/schema

That seems a good way, but I discovered the presence of "bias" in every content field. Indeed, each one starts with a string of this kind:

  \n \n stream_content_type text/plain \n stream_size 1556 \n Content-Encoding UTF-8 \n
  X-Parsed-By org.apache.tika.parser.DefaultParser \n
  X-Parsed-By org.apache.tika.parser.txt.TXTParser \n
  Content-Type text/plain; charset=UTF-8 \n
  resourceName /home/mirko/Desktop/data sample/sample1/TEXT_CRE_20110608_3-114-500.txt

Now I need to cut off this part, but I have no idea how, partly because the path (present in the last part) has a dynamic length. For some, it could be a problem to have two fields with the same content (double the space needed). I don't have this problem because I use SolrJ to import, modify and export each document. Maybe I could use it to do this too, but hopefully you know a cleaner method.

Cheers,

Mirko

On 19 March 2015 at 20:11, Erick Erickson wrote:

> Hmm, not all that sure. That's one thing about schemaless indexing: it
> has to guess. It does the best it can, but it's quite possible that it
> guesses wrong.
>
> If this is a "managed schema", you can use the REST API commands to
> make whatever field you want. Or you can start over with a concrete
> schema.xml and use _that_. Otherwise, I'm not sure what to say without
> actually being on your system.
>
> Wish I could help more.
> Erick
>
> On Thu, Mar 19, 2015 at 5:39 AM, Mirko Torrisi wrote:
> > Hi Erick,
> > ...
Near-Realtime-Search, CommitWithin and AtomicUpdates
Hi @all,

I'm using Solr 6.6 and trying to validate my setup for atomic updates and near-real-time search. Some questions are boggling my mind, so maybe someone can give me a hint to make things clearer.

I am posting regular updates to a collection using the UpdateHandler and the Solr command syntax, including updates and deletes. These changes are committed using the commitWithin configuration every 30 seconds. Now I want to use atomic updates on multivalued fields, so I post the "add" commands for these fields only. Sometimes I have to post multiple Solr commands affecting the same document within the same commitWithin interval.

The question now is: what is the final new value of the field after the atomic update "add" operations? From my point of view, the final value should be the old value plus the newly added values, which is committed to the index in the next commitWithin period. So can I combine multiple atomic update commands affecting the same document within the same commitWithin interval?

Another thing that is bugging me: can I combine multiple atomic updates for the same document with copyFields? Does Solr use some kind of dirty read of pending uncommitted changes to get the right value of the source field, or is the source always the last committed value?

So in summary, does Solr's atomic update feature use some kind of dirty-read mechanism to do its "magic"?

Thanks in advance,
Mirko
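To make the scenario concrete, here is a minimal sketch of two consecutive atomic-update commands against the same document within one commitWithin window, in Solr's XML update syntax (the document ID and field names are made up for illustration):

  <!-- first update: append "red" to the multivalued "tags" field -->
  <add commitWithin="30000">
    <doc>
      <field name="id">doc1</field>
      <field name="tags" update="add">red</field>
    </doc>
  </add>

  <!-- second update, a few seconds later and before the commit: append "blue" -->
  <add commitWithin="30000">
    <doc>
      <field name="id">doc1</field>
      <field name="tags" update="add">blue</field>
    </doc>
  </add>

The question above is whether, after the next commit, "tags" ends up containing the old values plus both "red" and "blue".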