Couple of questions on solr
[Reposting because for some reason I can't find this on the list, so apologies for the double post]

Hi All,

I am quite new to Solr and am trying to use it with a .NET web site (!). So far, Solr hasn't given me any major jitters, but I've been stuck on a few things of late; hopefully I can get them answered here.

1) Overview: Currently we have around 20,000 documents to index, with an individual doc size of around 5k. We have set up faceting on a multi-valued field (there will be ~20 facets per document).

2) Faceted navigation: I've read that faceting on a multi-valued field has some performance implications. Unfortunately, the current site requires multi-valued faceting and I cannot break the values out into separate single-valued fields. What is the best way to get maximum performance when faceting on a multi-valued field?

3) We support three kinds of search on our site: free text search, faceted navigation, and a negation search. For free text search, we have created a text field (as defined in the example schema.xml) and use copyField to copy the other fields into it; the free text search runs against this text field only. For example, say a doc has two fields, title and description: both are copied into the text field using copyField, and free text search happens on that field. Is there any way to assign per-field weights for relevancy while searching this combined field? I.e., if the user searches for "sunny" and there is a document (doc1) with title "sunny something" and another document (doc2) with description "sunny description", is it possible to return doc1 before doc2, given that the search is happening on the copied field? If that's not possible, is there any other way to achieve this? (Please keep in mind that we have around 8 fields per document, 4 of them multi-valued, so searching all of them explicitly and assigning boosts at query time would not be much fun.)
Can I use the DisMax handler for this kind of copyField field? If yes, is there any way to define the DisMax handler as the default handler in solrconfig.xml, without having to provide it explicitly at query time?

Thanks in advance,
Sachin
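[For reference, the copyField setup described in 3) would look roughly like this in schema.xml. A minimal sketch: the field names come from the question above, but the types and indexed/stored flags are assumptions based on the example schema.]

```xml
<!-- source fields; "text" field type as defined in the example schema.xml -->
<field name="title" type="text" indexed="true" stored="true"/>
<field name="description" type="text" indexed="true" stored="true"/>

<!-- catch-all field for free text search; multiValued because several
     source fields are copied into it -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<copyField source="title" dest="text"/>
<copyField source="description" dest="text"/>
```

Note that once everything is flattened into one field, per-source boosts are lost at search time, which is why the DisMax approach (searching title and description directly with different weights) comes up in the replies below in the thread.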
Re: Problem with add a XML
Yes, my file is UTF-8. I have uploaded my file.

Grant Ingersoll-6 wrote:
>
> On Jun 11, 2008, at 3:46 AM, Thomas Lauer wrote:
>
>> now I want to add the files to Solr. I have started Solr on Windows
>> in the example directory with java -jar start.jar
>>
>> I have the following error message:
>>
>> C:\test\output>java -jar post.jar *.xml
>> SimplePostTool: version 1.2
>> SimplePostTool: WARNING: Make sure your XML documents are encoded in
>> UTF-8, other encodings are not currently supported
>
> This is your issue right here. You have to save that second file in
> UTF-8.
>
>> SimplePostTool: POSTing files to http://localhost:8983/solr/update..
>> SimplePostTool: POSTing file 1.xml
>> SimplePostTool: POSTing file 2.xml
>> SimplePostTool: FATAL: Connection error (is Solr running at
>> http://localhost:8983/solr/update ?): java.io.IOException: Server
>> returned HTTP response code: 400 for URL:
>> http://localhost:8983/solr/update
>>
>> C:\test\output>
>>
>> Regards, Thomas Lauer
>>
>> __ Notice from ESET NOD32 Antivirus, signature database version 3175
>> (20080611) __
>> E-mail was checked with ESET NOD32 Antivirus.
>> http://www.eset.com
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ

http://www.nabble.com/file/p17794387/2.xml 2.xml
--
View this message in context: http://www.nabble.com/Problem-with-add-a-XML-tp17772018p17794387.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: synonym token types and ranking
yes... I actually implemented it. I'll just clean up the code and add it to JIRA. Uri On Thu, Jun 12, 2008 at 5:48 AM, Otis Gospodnetic < [EMAIL PROTECTED]> wrote: > Hi Uri, > > Yes, I think that would make sense (word vs. synonym token types). Custom > boosting/weighting of original token vs. synonym token(s) also makes sense. > Is this something you can provide a patch for? > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message > > From: Uri Boness <[EMAIL PROTECTED]> > > To: solr-user@lucene.apache.org > > Sent: Wednesday, June 11, 2008 8:56:02 PM > > Subject: synonym token types and ranking > > > > Hi, > > > > I've noticed that currently the SynonymFilter replaces the original > > token with the configured tokens list (which includes the original > > matched token) and each one of these tokens is of type "word". Wouldn't > > it make more sense to only mark the original token as type "word" and > > the the other tokens as "synonym" types? In addition, once payloads are > > integrated with Solr, it would be nice if it would be possible to > > configure a payload for synonyms. One of the requirements we're > > currently facing in our project is that matches on synonyms should weigh > > less than exact matches. > > > > cheers, > > Uri > >
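[For context, the SynonymFilter being discussed is wired into an analyzer chain in schema.xml along these lines. A minimal sketch: the field type name and the expand/ignoreCase settings are assumptions, not taken from the thread.]

```xml
<fieldType name="textSyn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- expand="true" emits the matched original token plus its synonyms;
         today every emitted token carries the type "word" -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Under Uri's proposal, only the original token would keep type "word" and the injected tokens would be typed "synonym", making them addressable for down-weighting.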
Re: Problem with add a XML
Usually you get better error messages from the start.jar console, you don't see anything there?

On Thu, Jun 12, 2008 at 7:49 AM, Thomas Lauer <[EMAIL PROTECTED]> wrote:
> Yes, my file is UTF-8. I have uploaded my file.
> [...]
Re: Couple of questions on solr
Hi,

The answer for 3) is: use the DisMax request handler. In solrconfig.xml, assign weights/boosts to different fields. No need to use copyField then, as you can search multiple fields with DisMax by just specifying them in solrconfig.xml.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
> From: Sachin <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Thursday, June 12, 2008 3:06:31 AM
> Subject: Couple of questions on solr
>
> [...]
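[Otis's suggestion would look something like this in solrconfig.xml for the 1.2/1.3-era DisMax handler. A sketch: the handler name, boost values, and default="true" are assumptions chosen to match the question.]

```xml
<!-- default="true" makes this the handler used when no qt parameter is given -->
<requestHandler name="dismax" class="solr.DisMaxRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!-- search these fields directly, title weighted above description -->
    <str name="qf">title^4.0 description^1.0</str>
  </lst>
</requestHandler>
```

With this in place a plain q=sunny query would rank a title match (doc1) ahead of a description-only match (doc2), with no copyField needed.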
Re: Strategy for presenting fresh data
On Wed, 11 Jun 2008 22:13:24 -0700 (PDT) rohit arora <[EMAIL PROTECTED]> wrote:

> I am new to Solr/Lucene. I have only one default core and am working on creating multiple cores.
> Can you help me in this matter.

hi Rohit,
please do NOT hijack the thread. You are far more likely to get useful, helpful answers if you state your question in a new email, with an appropriate subject.

thanks,
B
_
{Beto|Norberto|Numard} Meijome

"Some cause happiness wherever they go; others, whenever they go." Oscar Wilde

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: Strategy for presenting fresh data
On Wed, 11 Jun 2008 20:49:54 -0700 (PDT) Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > Hi James, > > Yes, this makes sense. I've recommended doing the same to others before. It > would be good to have this be a part of Solr. There is one person (named > Jason) working on adding more real-time search support to both Lucene and > Solr. v. interesting - do you have any pointers handy on this? In the meantime, I had imagined that, although clumsy, federated search could be used for this purpose - posting the new documents to a group of servers ('latest updates servers') with v limited amount of documents with v. fast "reload / refresh" times, and sending them again (on a work queue, possibly), to the 'core servers'. Regularly cleaning the 'latest updates servers' of the already posted documents to 'core servers' would keep them lean... of course, this approach sucks compared to a proper solution like what James is suggesting :) B _ {Beto|Norberto|Numard} Meijome "Ugly programs are like ugly suspension bridges: they're much more liable to collapse than pretty ones, because the way humans (especially engineer-humans) perceive beauty is intimately related to our ability to process and understand complexity. A language that makes it hard to write elegant code makes it hard to write good code." Eric Raymond I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: searching only within allowed documents
climbingrose wrote:
> It depends on your query. The second query is better if you know that the
> fieldb:bar filter query will be reused often, since it will be cached
> separately from the main query. The first query occupies one cache entry,
> while the second one occupies two cache entries: one in the
> queryResultCache and one in the filterCache. Therefore, if you're not
> going to reuse fieldb:bar, the first query is better.

ok, that makes more sense. thanks.

--Geoff
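[The two caches discussed here are sized in solrconfig.xml; the example config ships with entries along these lines. The numbers below are the example defaults, not recommendations.]

```xml
<!-- caches filter queries (fq parameters / cached filter clauses) as document sets,
     so one cached filter can be intersected with many different main queries -->
<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="256"/>

<!-- caches ordered result lists for whole queries -->
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="256"/>
```

This is why splitting a reusable restriction out into fq pays off: the filterCache entry survives across different q values, while a combined query is cached as one opaque result list.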
Error loading class 'solr.RandomSortField'
Hi,

I configured multicore in Solr, but when giving the "java -jar start.jar" command it throws an error:

"Caused by: java.lang.ClassNotFoundException: solr.RandomSortField"

Can you help me with this problem?

with regards
Rohit Arora
Re: Strategy for presenting fresh data
What you are describing is pretty much what the original poster intends to do, as far as I understand.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
> From: Norberto Meijome <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Thursday, June 12, 2008 9:13:39 AM
> Subject: Re: Strategy for presenting fresh data
>
> [...]
Re: Strategy for presenting fresh data
> In the meantime, I had imagined that, although clumsy, federated search
> could be used for this purpose - posting the new documents to a group of
> servers ('latest updates servers') with v limited amount of documents
> with v. fast "reload / refresh" times [...]

Otis - is there an issue I should be looking at for more information on this?

Yes, in principle, sending updates both to a fresh, forgetful and fast index and a larger, slower index is what I'm thinking of doing. The only difference is that I'm talking about having the fresh index be implemented as a RAMDirectory in the same JVM as the large index. This means that I can avoid the slowness of cross-disk or cross-machine replication, I can avoid having to index all documents in two places, and I cut out the extra moving part of federated search. On the other hand, I am going to have to write my own piece to handle the index flushes and federate searches to the fast and large indices.

Thanks for your input!
James
Re: Error loading class 'solr.RandomSortField'
Hi Rohit,

It seems like you may be using a Solr 1.2 war file. You must use the 1.3 nightly builds to use the newer multicore features.

http://people.apache.org/builds/lucene/solr/nightly/

On Thu, Jun 12, 2008 at 2:03 PM, rohit arora <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I configured multicore in Solr, but when giving the "java -jar
> start.jar" command it throws an error:
>
> "Caused by: java.lang.ClassNotFoundException: solr.RandomSortField"
>
> can you help me in this problem.
>
> with regards
> Rohit Arora

--
Regards,
Shalin Shekhar Mangar.
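[In the 1.3 nightlies the multicore setup is driven by a multicore.xml in the Solr home directory, roughly of the shape below. A sketch only: the core names and instanceDir values match the nightly example, but the exact format changed during 1.3 development, so treat the details as version-dependent.]

```xml
<multicore adminPath="/admin/multicore" persistent="true">
  <!-- each core gets its own conf/ (schema.xml, solrconfig.xml) and data/
       under its instanceDir -->
  <core name="core0" instanceDir="core0" />
  <core name="core1" instanceDir="core1" />
</multicore>
```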
Re: Analytics e.g. "Top 10 searches"
I keep this information on a separate index that I call moreSearchedWords. I use it to generate tag clouds.

2008/6/6 Matthew Runo <[EMAIL PROTECTED]>:
> I'm nearly certain that everyone who maintains these stats does it
> themselves in their 'front end'. It's very easy to log terms and whatever
> else just before or after sending the query off to Solr.
>
> Thanks!
>
> Matthew Runo
> Software Developer
> Zappos.com
> 702.943.7833
>
> On Jun 6, 2008, at 3:51 AM, McBride, John wrote:
>
>> Hello,
>>
>> Is anybody familiar with any SOLR-based analytical tools which would
>> allow us to extract "top ten searches", for example.
>>
>> I imagine at the query parse level, where the query is tokenized and
>> filtered, would be the best place to log this, due to the many
>> permutations possible at the user input level.
>>
>> Is there an existing plugin to do this, or could you suggest how to
>> architect this?
>>
>> Thanks,
>> John

--
Alexander Ramos Jardim
Re: Re[2]: "null" in admin page
Sorry for the late response. Too many messages, I got distracted.

Steps as follows:
1. I download the Solr example app.
2. Unpack it.
3. cd
4. java -jar start.jar
5. Try to use one of the links in the admin webapp
6. Get core=null

2008/5/30 Chris Hostetter <[EMAIL PROTECTED]>:
>
> : It surely comes on the example, as I got this problem all times I get the
> : example, and I have to remove the file multicore.xml or I get the error.
>
> something is wrong then. if you are running "java -jar start.jar" in the
> "example" directory then "example/solr" will be used as your solr home
> directory, and it has no multicore configuration files.
> the directory "example/multicore" does contain a multicore.xml file,
> and if you use that directory as your Solr Home then Solr will see that
> file and go into "MultiCore" -- but you shouldn't have to remove that
> multicore.xml file unless you are explicitly using "example/multicore".
>
> If you are seeing different behavior, can you describe in more detail what
> steps you are taking (from a clean checkout) and why you think
> multicore.xml is getting used when you do those steps?
>
> -Hoss

--
Alexander Ramos Jardim
Re: Re: Analytics e.g. "Top 10 searches"
Hello Jon,

These are the fields in my search index [the field markup was lost in the archive; in short]:
- where on the site the search was made
- the search text
- the number of times the search was made

How it works:
1. When someone hits the search functionality I put the search made on a JMS queue to process search statistics asynchronously.
2. The search information on the JMS queue is read in short time intervals and condensed. This way I get beans that contain exactly the information that I want to put on the index.
3. I retrieve all the X most executed searches sorted by hits and update their information using the one I got in (2).
4. I empty the index.
5. I update the search index using the information generated in (3).

2008/6/12 Jon Lehto <[EMAIL PROTECTED]>:
> Are you doing anything 'fancy'?
>
> Thanks,
> Jon
>
> [...]

--
Alexander Ramos Jardim
Re: Problem with add a XML
You need to define fields in the schema.xml (and otherwise change the schema to match your data).

-Yonik

On Wed, Jun 11, 2008 at 3:46 AM, Thomas Lauer <[EMAIL PROTECTED]> wrote:
> [the posted XML's markup was stripped by the archive; its field values were:]
> 85f4fdf9-e596-4974-a5b9-57778e38067b
> 143885
> 28.10.2005 13:06:15
> Rechnung 2005-025235
> Rechnungsduplikate
> 2002
> 330T.doc
> KIS
> Bonow
> 25906
> Hofma GmbH
> Mandant
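[The update XML that post.jar expects wraps each document like this. The field names below are hypothetical, since Thomas's actual field names did not survive the archive; what matters is that every name used must exist as a <field> definition in schema.xml.]

```xml
<add>
  <doc>
    <!-- each name="" here must match a <field name="..."> in schema.xml -->
    <field name="id">85f4fdf9-e596-4974-a5b9-57778e38067b</field>
    <field name="title">Rechnung 2005-025235</field>
    <field name="filename">330T.doc</field>
  </doc>
</add>
```

A mismatch between these names and the schema is one common cause of the HTTP 400 response seen earlier in this thread.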
Re: Re: Analytics e.g. "Top 10 searches"
Just as a thought, would it be possible to expose the original query text from the QueryResultCache keys (Query) somehow? If that is possible, it would allow us to query the top N most frequent queries anytime for reasonable values of N.

On Fri, Jun 13, 2008 at 12:18 AM, Alexander Ramos Jardim <[EMAIL PROTECTED]> wrote:
> Hello Jon,
>
> These are the fields in my search index [...]

--
Regards,
Shalin Shekhar Mangar.
Re: Re: Analytics e.g. "Top 10 searches"
On Thu, Jun 12, 2008 at 3:04 PM, Shalin Shekhar Mangar <[EMAIL PROTECTED]> wrote: > Just as a thought, would it be possible to expose the original query text > from the QueryResultCache keys (Query) somehow? If that is possible, it > would allow us to query the top N most frequent queries anytime for > reasonable values of N. That would only give most recent, not most frequent. -Yonik
Re: Num docs
Cacti, Nagios you name it already in use :) Well I'm the CTO so the one really really interested in estimating perf. The id's come from a db initially and is later used for retrieval from a distributed on disk caching system which I have written. I'm in the process of moving from MySQL to HBase or Hypertable. /M On Tue, Jun 10, 2008 at 10:03 PM, Otis Gospodnetic < [EMAIL PROTECTED]> wrote: > Marcus, > > It sounds like you may just want to use a good server monitoring package > that collects server data and prints out pretty charts. Then you can show > them to your IT/budget people when the charts start showing increased query > latency times, very little available RAM, swapping, high CPU usage and such. > Nagios, Ganglia, any of those things will do. > > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message > > From: Marcus Herou <[EMAIL PROTECTED]> > > To: solr-user@lucene.apache.org > > Sent: Tuesday, June 10, 2008 3:29:40 PM > > Subject: Re: Num docs > > > > Well guys you are right... Still I want to have a clue about how much > each > > machine stores to predict when we need more machines (measure performance > > degradation per new document). But it's harder to collect that kind of > data. > > It sure is doable no doubt and is a normal sharding "algo" for MySQL. > > > > The best approach I think is to have some bg threads run X number of > queries > > and collect the response times, throw away the n lowest/highest response > > times and calc an avg time which is used for in sharding and query > lb'ing. > > > > Little off topic but interesting > > What would you guys say about a good correlation between the index size > on > > disk (no stored text content) and available RAM and having good response > > times. > > > > How long is a rope would you perhaps say...but I think some rule of thumb > > could be established... 
> >
> > One of the schemas of concern [the <field .../> definitions' markup was
> > stripped by the archive; only attribute fragments such as indexed="true",
> > stored="false", required="true"/"false" and multiValued="true" survive]
> >
> > and a normal solr query (taken from the log):
> > /select?start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc
> >
> > //Marcus
> >
> > On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> >
> > > Exactly. I think I mentioned this once before several months ago. One can
> > > take various hardware specs (# cores, CPU speed, FSB, RAM, etc.),
> > > performance numbers, etc. and come up with a number for each server's
> > > overall capacity.
> > >
> > > As a matter of fact, I think this would be useful to have right in Solr,
> > > primarily for use when allocating and sizing shards for Distributed Search.
> > > JIRA enhancement/feature issue?
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > > - Original Message
> > > > From: Alexander Ramos Jardim
> > > > To: solr-user@lucene.apache.org
> > > > Sent: Monday, June 9, 2008 6:42:17 PM
> > > > Subject: Re: Num docs
> > > >
> > > > I even think that such a decision should be based on the overall machine
> > > > performance at a given time, and not the index size. Unless you are talking
> > > > solely about HD space and not having any performance issues.
> > > >
> > > > 2008/6/7 Otis Gospodnetic:
> > > >
> > > > > Marcus,
> > > > >
> > > > > For that you can rely on du, vmstat, iostat, top and such, too. :)
> > > > >
> > > > > Otis
> > > > > --
> > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > >
> > > > > - Original Message
> > > > > > From: Marcus Herou
> > > > > > To: solr-user@lucene.apache.org
> > > > > > Sent: Saturday, June 7, 2008 12:33:10 PM
> > > > > > Subject: Re: Num docs
> > > > > >
> > > > > > Thanks, I wanna ask the indices how much more each shard can handle
> > > > > > before they're considered "full" and scream for a budget to get a
> > > > > > new machine :)
> > > > > >
> > > > > > /M
> > > > > >
> > > > > > On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic wrote:
> > > > > >
> > > > > > > Marcus, check out the Luke request handler. You can get it from
> > > > > > > its output. It may also be possible to get *just* that number,
> > > > > > > but I'm not looking at docs/code right now to know for sure.
> > > > > > >
> > > > > > > Otis
> > > > > > > --
> > > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
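[The Luke request handler Otis mentions is registered in solrconfig.xml and exposes index statistics, including numDocs, over HTTP. A sketch of a typical registration, in case it is not already present in your config:]

```xml
<!-- exposes index metadata; requesting /admin/luke?numTerms=0 returns a cheap
     numDocs/maxDoc summary without computing per-field top-term statistics -->
<requestHandler name="/admin/luke"
                class="org.apache.solr.handler.admin.LukeRequestHandler" />
```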
Re: Re: Analytics e.g. "Top 10 searches"
Ah! I see. But can the original queries be exposed? I guess exposing this through a SearchComponent would be appropriate. This can help in displaying things like "What users are searching for right now?" On Fri, Jun 13, 2008 at 12:44 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On Thu, Jun 12, 2008 at 3:04 PM, Shalin Shekhar Mangar > <[EMAIL PROTECTED]> wrote: > > Just as a thought, would it be possible to expose the original query text > > from the QueryResultCache keys (Query) somehow? If that is possible, it > > would allow us to query the top N most frequent queries anytime for > > reasonable values of N. > > That would only give most recent, not most frequent. > > -Yonik > -- Regards, Shalin Shekhar Mangar.
Re: Problem with add a XML
This is the error message from the console. SCHWERWIEGEND: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: [EMAIL PROTECTED]:\Dokumente und E instellungen\tla\Desktop\solr\apache-solr-1.2.0\apache-solr-1.2.0\example\solr\data\index\write.lock at org.apache.lucene.store.Lock.obtain(Lock.java:70) at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:579) at org.apache.lucene.index.IndexWriter.(IndexWriter.java:341) at org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:65) at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:120) at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:181) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:259) at org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:166) at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77) at org.apache.solr.core.SolrCore.execute(SolrCore.java:658) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:191) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:159) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) at org.mortbay.jetty.Server.handle(Server.java:285) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) Jón Helgi Jónsson wrote: > > Usually you get better error messages from the start.jar console, you > don't see anything there? > > On Thu, Jun 12, 2008 at 7:49 AM, Thomas Lauer <[EMAIL PROTECTED]> wrote: >> >> Yes, my file is UTF-8. I have uploaded my file. >> >> >> >> >> Grant Ingersoll-6 wrote: >>> >>> >>> On Jun 11, 2008, at 3:46 AM, Thomas Lauer wrote: now I want to add the files to solr. I have started solr on windows in the example directory with java -jar start.jar I have the following Error Message: C:\test\output>java -jar post.jar *.xml SimplePostTool: version 1.2 SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported >>> >>> >>> This is your issue right here. You have to save that second file in >>> UTF-8. >>> SimplePostTool: POSTing files to http://localhost:8983/solr/update.. SimplePostTool: POSTing file 1.xml SimplePostTool: POSTing file 2.xml SimplePostTool: FATAL: Connection error (is Solr running at http://localhost:8983/solr/update ?): java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update C:\test\output> Regards Thomas Lauer __ Notice from ESET NOD32 Antivirus, signature database version 3175 (20080611) __ E-mail was checked by ESET NOD32 Antivirus. 
http://www.eset.com >>> >>> -- >>> Grant Ingersoll >>> http://www.lucidimagination.com >>> >>> Lucene Helpful Hints: >>> http://wiki.apache.org/lucene-java/BasicsOfPerformance >>> http://wiki.apache.org/lucene-java/LuceneFAQ >>> >>> >> http://www.nabble.com/file/p17794387/2.xml 2.xml >> -- >> View this message in context: >> http://www.nabble.com/Problem-with-add-a-XML-tp17772018p17794387.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> > -- View this message in context: http://www.nabble.com/Problem-with-add-a-XML-tp17772018p17808276.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problem with add a XML
That can happen if the JVM died or got a critical error. You can remove the lock file manually or configure Solr to remove it automatically on startup (see solrconfig.xml). -Yonik On Thu, Jun 12, 2008 at 3:57 PM, Thomas Lauer <[EMAIL PROTECTED]> wrote: > > This is the error message from the console. > > SCHWERWIEGEND: org.apache.lucene.store.LockObtainFailedException: Lock > obtain timed out: [EMAIL PROTECTED]:\Dokumente und Einstellungen\tla\Desktop\solr\apache-solr-1.2.0\apache-solr-1.2.0\example\solr\data\index\write.lock > [...]
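For reference, the startup-unlock setting Yonik alludes to lives in the `<mainIndex>` section of solrconfig.xml in Solr 1.2/1.3-era example configs; a sketch (check your own solrconfig.xml for the exact section and spelling in your version):

```xml
<mainIndex>
  <!-- If true, any write or commit lock left behind by a crashed JVM is
       removed when Solr starts. Only enable this when you are certain no
       other process is writing to the same index directory. -->
  <unlockOnStartup>true</unlockOnStartup>
</mainIndex>
```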
Re: Strategy for presenting fresh data
On Thu, 12 Jun 2008 07:14:04 -0700 (PDT) Otis Gospodnetic <[EMAIL PROTECTED]> wrote: > What you are describing is pretty much what the original poster intends to > do, as far as I understand. ah right, i am reading it again in the morning and it makes sense . thanks for shaking the cobwebs off my mind :P B _ {Beto|Norberto|Numard} Meijome Lack of planning on your part does not constitute an emergency on ours. I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
RE: Strategy for presenting fresh data
You can also use a shared file system mounted on a common SAN. (This is a high-end server configuration.) -Original Message- From: James Brady [mailto:[EMAIL PROTECTED] Sent: Thursday, June 12, 2008 9:59 AM To: solr-user@lucene.apache.org Subject: Re: Strategy for presenting fresh data >> >> In the meantime, I had imagined that, although clumsy, federated >> search could be used for this purpose - posting the new documents to >> a group of servers ('latest updates servers') with v limited amount >> of documents with v. fast "reload / refresh" times, and sending them >> again (on a work queue, possibly), to the 'core servers'. Regularly >> cleaning the 'latest updates servers' >> of the >> already posted documents to 'core servers' would keep them lean... >> of course, >> this approach sucks compared to a proper solution like what James is >> suggesting >> :) >> Otis - is there an issue I should be looking at for more information on this? Yes, in principle, sending updates both to a fresh, forgetful and fast index and a larger, slower index is what I'm thinking of doing. The only difference is that I'm talking about having the fresh index be implemented as a RAMDirectory in the same JVM as the large index. This means that I can avoid the slowness of cross-disk or cross- machine replication, I can avoid having to index all documents in two places and I cut out the extra moving part of federated search. On the other hand, I am going to have to write my own piece to handle the index flushes and federate searches to the fast and large indices. Thanks for your input! James
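James's two-tier idea (fresh RAM index plus a large disk index in one JVM) can be sketched, implementation details aside, as:

```
on update(doc):
    add doc to small RAM index        # reopened every few seconds, so it is "fresh"
    enqueue doc for the large disk index

periodically:
    flush queued docs into the disk index and commit
    drop from the RAM index whatever the disk index has now committed

on search(q):
    hits = search(disk index, q) + search(RAM index, q)
    de-duplicate by unique key, preferring the fresher copy
```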
Best type to use for enum-like behavior
I am going to store two totally different types of documents in a single solr instance. Eventually I may separate them into separate instances but we are a long way from having either the size or traffic to require that. I read somewhere that a good approach is to add a 'type' field to the data and then use a filter query. What data type would you use for the type field? I could just use an integer, but then we have to remember that 1=user, 2=item, and so on. In mysql there's an enum type where you use text labels that are mapped to integers behind the scenes (good performance and user friendly). Is there something similar in solr or should I just use a string? -jsd-
Re: Best type to use for enum-like behavior
Just use a string. Any ol' string that suits your domain will do. Just be sure the field type is untokenized (the "string" type in the example configuration will do). Erik On Jun 12, 2008, at 8:07 PM, Jon Drukman wrote: I am going to store two totally different types of documents in a single solr instance. Eventually I may separate them into separate instances but we are a long way from having either the size or traffic to require that. I read somewhere that a good approach is to add a 'type' field to the data and then use a filter query. What data type would you use for the type field? I could just an integer but then we have to remember that 1=user, 2=item, and so on. In mysql there's an enum type where you use text labels that are mapped to integers behind the scenes (good performance and user friendly). Is there something similar in solr or should I just use a string? -jsd-
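A minimal sketch of what Erik describes; the field name `type` and values `user`/`item` come from Jon's question, and the `string` type is the untokenized one from the example schema:

```xml
<!-- schema.xml: untokenized discriminator field -->
<field name="type" type="string" indexed="true" stored="true" required="true"/>
```

At query time, restrict by a filter query rather than the main query, e.g. `q=sunny&fq=type:item`, so the type filter is cached and reused independently of the search terms.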
Re: Num docs
Or, if you want to go with something older/more stable, go with BDB. :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Marcus Herou <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, June 12, 2008 3:17:52 PM > Subject: Re: Num docs > > Cacti, Nagios you name it already in use :) > > Well I'm the CTO so the one really really interested in estimating perf. > > The id's come from a db initially and is later used for retrieval from a > distributed on disk caching system which I have written. > I'm in the process of moving from MySQL to HBase or Hypertable. > > /M > > On Tue, Jun 10, 2008 at 10:03 PM, Otis Gospodnetic < > [EMAIL PROTECTED]> wrote: > > > Marcus, > > > > It sounds like you may just want to use a good server monitoring package > > that collects server data and prints out pretty charts. Then you can show > > them to your IT/budget people when the charts start showing increased query > > latency times, very little available RAM, swapping, high CPU usage and such. > > Nagios, Ganglia, any of those things will do. > > > > > > Otis > > -- > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > - Original Message > > > From: Marcus Herou > > > To: solr-user@lucene.apache.org > > > Sent: Tuesday, June 10, 2008 3:29:40 PM > > > Subject: Re: Num docs > > > > > > Well guys you are right... Still I want to have a clue about how much > > each > > > machine stores to predict when we need more machines (measure performance > > > degradation per new document). But it's harder to collect that kind of > > data. > > > It sure is doable no doubt and is a normal sharding "algo" for MySQL. > > > > > > The best approach I think is to have some bg threads run X number of > > queries > > > and collect the response times, throw away the n lowest/highest response > > > times and calc an avg time which is used for in sharding and query > > lb'ing. 
> > > Little off topic but interesting: > > > What would you guys say about a good correlation between the index size on disk (no stored text content) and available RAM for having good response times? > > > How long is a rope, you would perhaps say... but I think some rule of thumb could be established... > > > One of the schemas of concern: > > > [the schema's field definitions were stripped by the mail archive; only attribute fragments such as required="true", stored="false", and multiValued="true" survived] > > > and a normal solr query (taken from the log): > > > /select?start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc > > > //Marcus > > > On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic < > > > [EMAIL PROTECTED]> wrote: > > > > Exactly. I think I mentioned this once before several months ago. One can take various hardware specs (# cores, CPU speed, FSB, RAM, etc.), performance numbers, etc. and come up with a number for each server's overall capacity. > > > > As a matter of fact, I think this would be useful to have right in Solr, primarily for use when allocating and sizing shards for Distributed Search. JIRA enhancement/feature issue? 
> > > > Otis > > > > -- > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > > > > > > > - Original Message > > > > > From: Alexander Ramos Jardim > > > > > To: solr-user@lucene.apache.org > > > > > Sent: Monday, June 9, 2008 6:42:17 PM > > > > > Subject: Re: Num docs > > > > > > > > > > I even think that such a decision should be based on the overall > > machine > > > > > performance at a given time, and not the index size. Unless you are > > > > talking > > > > > solely about HD space and not having any performance issues. > > > > > > > > > > 2008/6/7 Otis Gospodnetic : > > > > > > > > > > > Marcus, > > > > > > > > > > > > > > > > > > For that you can rely on du, vmstat, iostat, top and such, too. :) > > > > > > > > > > > > Otis > > > > > > -- > > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > > > > > > > > > > > > > - Original Message > > > > > > > From: Marcus Herou > > > > > > > To: solr-user@lucene.apache.org > > > > > > > Sent: Saturday, June 7, 2008 12:33:10 PM > > > > > > > Subject: Re: Num docs > > > > > > > > > > > > > > Thanks, I wanna ask the indices how much more each shard can > > handle > > > > > > before > > > > > > > they're considered "full" and scream for a budget to get a
Re: Strategy for presenting fresh data
Hi James, Right, you'll have to write some custom components. It may be wiser to spend your time looking at what Jason R (sorry, can't remember the last name off the top of my head) put in JIRA (you'll have to search, don't recall issue IDs). Actually, having a full Solr email folder helps sometimes - see SOLR-564. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: James Brady <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, June 12, 2008 12:58:44 PM > Subject: Re: Strategy for presenting fresh data > [...]
My First Solr
Hi, I have installed my first Solr on Tomcat. I have modified my schema.xml for my XMLs and imported some xml files with post.jar. tomcat runs solr/admin runs post.jar imports files but I can't find my files. The response is always [response XML with tags stripped by the archive; the surviving values suggest status 0, QTime 0, rows 10, start 0, indent on, q KIS, version 2.2]. My files are in the attachment. Regards Thomas [sample document, field tags stripped; surviving field names and values:] guid beschreibung 00098d72-c03a-4075-b8af-80bd7d6fd7c5 143882 28.10.2005 13:05:52 Rechnung 2005-025232 Rechnungsduplikate 2002 330Q.doc KIS Bonow 29536 Guardus Solutions AG Mandant
Re: My First Solr
Hello Thomas, Have you performed a commit? Try adding <commit/> as the last line of the document you are adding. I would suggest you read up on commits, how often you should perform them, and how to do auto commits. Brian On Friday, 13.06.2008, 07:20 +0200, Thomas Lauer wrote: > HI, > > i have installed my first solr on tomcat. I have modify my shema.xml > for my XML´s and I have import with the post.jar some xml files. > [...]
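For completeness, a commit is just another update message posted to the update handler. A sketch, assuming the example Jetty setup from this thread (http://localhost:8983/solr/update):

```xml
<!-- POST this to the update handler after your <add>...</add> documents;
     nothing is searchable until a commit (or autocommit) happens -->
<commit/>
```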
AW: My First Solr
Hi Brian, I have tested: SimplePostTool: version 1.2 SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported SimplePostTool: POSTing files to http://localhost:8983/solr/update.. SimplePostTool: POSTing file import_sample.xml SimplePostTool: COMMITting Solr index changes.. but I still can't find the document. Regards Thomas -----Original Message----- From: Brian Carmalt [mailto:[EMAIL PROTECTED] Sent: Friday, 13 June 2008 07:36 To: solr-user@lucene.apache.org Subject: Re: My First Solr Hello Thomas, Have you performed a commit? [...]
Re: My First Solr
Hi, the string you're looking for is in the "anwendung" field, while your default search field is the "beschreibung" field; try specifying the search field like this: anwendung:"KIS". Geert Thomas Lauer wrote: > HI, i have installed my first solr on tomcat. [...]
Re: AW: My First Solr
Do you see if the document update is successful? When you start solr with java -jar start.jar for the example, Solr will list the document ids of the docs that you are adding and tell you how long the update took. A simple but brute-force method to find out if a document has been committed is to stop the server and then restart it. You can also use the solr/admin/stats.jsp page to see if the docs are there. After looking at your query in the results you posted, I would bet that you are not specifying a search field. Try searching for "anwendung:KIS" or "id:[1 TO *]" to see all the docs in your index. Brian On Friday, 13.06.2008, 07:40 +0200, Thomas Lauer wrote: > i have tested: > SimplePostTool: version 1.2 > SimplePostTool: WARNING: Make sure your XML documents are encoded in > UTF-8, other encodings are not currently supported > SimplePostTool: POSTing files to http://localhost:8983/solr/update.. > SimplePostTool: POSTing file import_sample.xml > SimplePostTool: COMMITting Solr index changes..
AW: AW: My First Solr
ok, I found my files now. Can I make all fields the default search field? Regards Thomas -----Original Message----- From: Brian Carmalt [mailto:[EMAIL PROTECTED] Sent: Friday, 13 June 2008 08:03 To: solr-user@lucene.apache.org Subject: Re: AW: My First Solr Do you see if the document update is successful? [...]
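One common way to search everything by default is a catch-all field populated via copyField. A sketch for schema.xml, assuming field names seen in this thread (beschreibung, anwendung); adjust to the real schema:

```xml
<!-- catch-all field; not stored, since the originals hold the values -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="beschreibung" dest="text"/>
<copyField source="anwendung" dest="text"/>
<!-- unqualified queries now search the catch-all field -->
<defaultSearchField>text</defaultSearchField>
```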
Re: AW: AW: My First Solr
The DisMaxRequestHandler is your friend. On Friday, 13.06.2008, 08:29 +0200, Thomas Lauer wrote: > ok, i find my files now. can I make all files to the default search file? > > Regards Thomas > [...]
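A sketch of the handler Brian means, registered in solrconfig.xml; field names and boosts here are placeholders taken from this thread, not a known-good config. Marking it default="true" also answers the question in the original post about using dismax without passing qt on every request:

```xml
<requestHandler name="dismax" class="solr.DisMaxRequestHandler" default="true">
  <lst name="defaults">
    <!-- query these fields, boosting matches in beschreibung -->
    <str name="qf">beschreibung^2 anwendung</str>
  </lst>
</requestHandler>
```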