Re: two cores but have single result set in solr
I do not know how to search both cores without defining the "shards" parameter. Could you show me a solution to my issue?

On 9/24/11, Yury Kats [via Lucene] wrote:
>
> On 9/23/2011 6:00 PM, hadi wrote:
>> I index my files with solrj and crawl my sites with nutch 1.3. As you
>> know, I have to overwrite the solr schema with the nutch schema in order
>> to view the results in solr/browse. In this case I have to define two
>> cores, but I want a single result set, so the user can search both
>> core indexes at the same time.
>
> Can you not use the 'shards' parameter and specify both cores there?
matching response and request
Hi, sorry for this question, but I am hoping it has a quick solution.

I am sending multiple GET request queries to solr, but solr is not returning the responses in the sequence I send the requests; the shortest responses arrive back first.

I am wondering whether I can add a tag to the request which will be given back to me in the response, so that when the response comes I can connect it to the original request and handle it in the appropriate manner.

If this is possible, how? Help appreciated!

Regards,

Roland.
Re: levenshtein ranked results
Thanks Otis, this helps me tremendously.

Kind regards,
Roland

Otis Gospodnetic wrote:

Hi Roland,

I did this: http://search-lucene.com/?q=sort+by+function&fc_project=Solr&fc_type=wiki
Which took me to this: http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function
And further on that page you'll find the strdist function documented:
http://wiki.apache.org/solr/FunctionQuery#strdist

I hope this helps.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
From: Roland Tollenaar
To: "solr-user@lucene.apache.org"
Sent: Friday, September 23, 2011 1:50 AM
Subject: levenshtein ranked results

Hi,

I tried an internet search to find out how to query solr to get the results ranked (ordered) by levenshtein distance. This appears to be possible, but I could not find a concrete example of how to formulate the query, or, if it is a schema setting on a particular field, how to set up the schema.

I am new to solr, any help appreciated. tia. Roland.
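For the archives, a sort-by-function query with strdist looks roughly like this (Solr 3.1 or later; the field name here is made up):

http://localhost:8983/solr/select?q=*:*&sort=strdist("john doe", name_s, edit) desc

(the quotes and spaces need URL-encoding when sent over HTTP). The third argument picks the distance measure: edit (Levenshtein), jw (Jaro-Winkler) or ngram.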
Sending pdf files to solr for indexing
Hi all
I want to send a pdf file to solr for indexing. There is a command to send Solr a file via HTTP POST:
http://wiki.apache.org/solr/ExtractingRequestHandler#Getting_Started_with_the_Solr_Example
but "curl" is for linux and I want to use Solr in Windows.
thanks a lot.
Re: Sending pdf files to solr for indexing
Also, when I use that command in Linux, I see this error:
---
Error 400 ERROR:unknown field 'ignored_meta'
HTTP ERROR 400
Problem accessing /solr/update/extract. Reason:
ERROR:unknown field 'ignored_meta'
Powered by Jetty://
---
My command is:
curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F "myfile=@ebrat.pdf"

On Sat, Sep 24, 2011 at 1:33 PM, ahmad ajiloo wrote:
> Hi all
> I want to send a pdf file to solr for indexing. There is a command to send
> Solr a file via HTTP POST:
> http://wiki.apache.org/solr/ExtractingRequestHandler#Getting_Started_with_the_Solr_Example
> but "curl" is for linux and I want to use Solr in Windows.
> thanks a lot.
Re: two cores but have single result set in solr
On 9/24/2011 3:09 AM, hadi wrote:
> I do not know how to search both cores without defining the "shards"
> parameter. Could you show me a solution to my issue?

See this: http://wiki.apache.org/solr/DistributedSearch
indexing an xml file
hello
The Solr Tutorial page explains how to index an xml file, but when I try to index an xml file with this command:
~/Desktop/apache-solr-3.3.0/example/exampledocs$ java -jar post.jar solr.xml
I get this error:
SimplePostTool: FATAL: Solr returned an error #400 ERROR:unknown field 'name'

can anyone help me?
thanks
Re: indexing an xml file
I think the xml to be indexed has to follow a certain schema, defined in schema.xml under the conf directory. Maybe your solr.xml is not doing that.

Sent from my iPhone

On 24 Sep 2011, at 18:15, ahmad ajiloo wrote:

hello
The Solr Tutorial page explains how to index an xml file, but when I try to index an xml file with this command:
~/Desktop/apache-solr-3.3.0/example/exampledocs$ java -jar post.jar solr.xml
I get this error:
SimplePostTool: FATAL: Solr returned an error #400 ERROR:unknown field 'name'

can anyone help me?
thanks
Re: two cores but have single result set in solr
I read the link, but
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=ipod+solr
gives an XML response that is not useful for me. I want to create the query in solr/browse, so the template engine needs to change. Do you know how to change that to search both cores? thanks

On 9/24/11, Yury Kats [via Lucene] wrote:
>
> On 9/24/2011 3:09 AM, hadi wrote:
>> I do not know how to search both cores without defining the "shards"
>> parameter. Could you show me a solution to my issue?
>
> See this: http://wiki.apache.org/solr/DistributedSearch
Re: Sending pdf files to solr for indexing
You should get cygwin for windows and make sure to select curl as one of the many packages that come with cygwin when its installer runs.

Sent from my iPhone

On Sep 24, 2011, at 5:29 AM, ahmad ajiloo wrote:
> Also, when I use that command in Linux, I see this error:
> ---
> Error 400 ERROR:unknown field 'ignored_meta'
> HTTP ERROR 400
> Problem accessing /solr/update/extract. Reason:
> ERROR:unknown field 'ignored_meta'
> Powered by Jetty://
> ---
> My command is:
> curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F "myfile=@ebrat.pdf"
>
> On Sat, Sep 24, 2011 at 1:33 PM, ahmad ajiloo wrote:
>> Hi all
>> I want to send a pdf file to solr for indexing. There is a command to send
>> Solr a file via HTTP POST:
>> http://wiki.apache.org/solr/ExtractingRequestHandler#Getting_Started_with_the_Solr_Example
>> but "curl" is for linux and I want to use Solr in Windows.
>> thanks a lot.
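If installing cygwin is more than you want, the same POST can also be issued from plain Java with SolrJ, which behaves the same on Windows. A minimal sketch, reusing the parameters from the curl example above (server URL and file name taken from the earlier mails):

import java.io.File;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class PostPdf {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    // equivalent of: curl ".../update/extract?literal.id=doc1&commit=true" -F "myfile=@ebrat.pdf"
    ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
    req.addFile(new File("ebrat.pdf"));
    req.setParam("literal.id", "doc1");
    req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    server.request(req);
  }
}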
Re: strategy for post-processing answer set
ok. this is a very basic question so please bear with me. I see where the velocity templates are, and I have looked at the documentation and get the idea of how to write them. It looks to me as if Solr just brings back the URLs. What I want to do is get the actual documents in the answer set, simplify their HTML, remove all the javascript, ads, etc., and append them into a single document. Now ... does Nutch already have the documents? Can I get them from its db? Or do I have to go get the documents again with something like a wget?

Fred

On Fri, Sep 23, 2011 at 16:02, Erik Hatcher wrote:
> conf/velocity by default. See Solr's example configuration.
>
> Erik
>
> On Sep 23, 2011, at 12:37, Fred Zimmerman wrote:
>
>> ok, answered my own question, found the velocity response writer in solrconfig.xml.
>> next question: where does velocity look for its templates?
>>
>> On Fri, Sep 23, 2011 at 11:57, Fred Zimmerman wrote:
>>
>>> This seems to be out of date. I am running Solr 3.4.
>>>
>>> * the file structure of apachehome/contrib is different and I don't see
>>> velocity anywhere underneath
>>> * the page referenced below only talks about Solr 1.4 and 4.0
>>>
>>> ?
>>>
>>> On Thu, Sep 22, 2011 at 19:51, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>>
>>>> Hi,
>>>>
>>>> Solr supports the Velocity template engine and the support is very good. Ideal
>>>> for generating properly formatted output from the search engine. There's a
>>>> clustering example and it's easy to format documents indexed by Nutch.
>>>>
>>>> http://wiki.apache.org/solr/VelocityResponseWriter
>>>>
>>>> Cheers
>>>>
>>>>> Hi,
>>>>> I would like to take the HTML documents that are the result of a Solr
>>>>> search and combine them into a single HTML document that combines the
>>>>> body text of each individual document. What is a good strategy for this?
>>>>> I am crawling with Nutch and using Carrot2 for clustering.
>>>>> Fred
Re: Is <doc> verboten?
Does wrapping your content in CDATA sections work?

Best
Erick

On Mon, Sep 19, 2011 at 6:39 PM, chadsteele.com wrote:
> It seems xml docs that use <doc> fail to be indexed properly, and I've
> recently discovered the following fails on my installation:
>
> /solr/update?stream.body=
>
> thoughts?
>
> I need to allow content to have <doc> in the xml.
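For reference, a CDATA section in a posted document looks like this (the field names here are made up):

<add>
  <doc>
    <field name="id">42</field>
    <field name="body"><![CDATA[markup such as <doc> survives verbatim in here]]></field>
  </doc>
</add>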
Re: shareSchema="true" - location of schema.xml?
I have 300 cores, so I feel your pain :-)

What we do is use a relative path for the file. It works if you use ../../common/schema.xml for each core: just create a common directory off your solr home and put your schema file there. I found this works great with solrconfig.xml and all of its dependencies as well.

Another choice is to look at the sharedLib parameter, which adds some directory to your classpath. I played with this for a little bit and couldn't get it working, so I went with the relative path solution.
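For anyone finding this in the archives, the relative-path trick can be spelled out per core in the legacy solr.xml layout. A sketch, assuming your version supports the per-core schema attribute (core names and paths here are illustrative):

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" schema="../../common/schema.xml"/>
    <core name="core1" instanceDir="core1" schema="../../common/schema.xml"/>
  </cores>
</solr>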
Re: matching reponse and request
I don't think you can do this. If you are sending multiple GET requests, you are doing it across different HTTP connections, so the web service has no way of knowing they are related.

One solution would be to pass a spare, unused parameter in your request, like sequenceId=NNN, and get the response to echo that back. Then at least you can tell which one is coming back and fix the order up in your program.
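A sketch of that approach with SolrJ. The parameter name is arbitrary; Solr echoes back parameters it doesn't recognize as long as echoParams is not 'none' (the example solrconfig ships with 'explicit', which is enough):

SolrQuery query = new SolrQuery("ipod");
query.set("sequenceId", "42");  // spare, unused parameter

QueryResponse rsp = server.query(query);
// the echoed request params live under "params" in the response header
NamedList<?> params = (NamedList<?>) rsp.getResponseHeader().get("params");
String seq = (String) params.get("sequenceId");  // "42" -> match to the original request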
resource to see which versions build from trunk?
Hi all, I am testing various versions of solr from trunk, and I am finding that often the example doesn't build, so I can't test out the version. Is there a resource that shows which versions build correctly so that we can test them out?
RE: JdbcDataSource and threads
My guess on this is that you're making a LOT of database requests, have a million TIME-WAIT connections, and your range of local ports is running out. You should first confirm that's true by running netstat on the machine while the load is running, and see if it gives a lot of output.

One way to solve this problem is to use a connection pool. Look at adding a pooled JNDI connection into your web service and connect with that instead.

The best way is to avoid making the extra connections. If the data in the subqueries is really short, look into caching the results using a CachedSqlEntityProcessor instead.

I wasn't able to use this approach because I had a lot of data in the inner queries. What I ended up doing was writing my own OrderedSqlEntityProcessor, which correlates an outer ordered query with an inner ordered query. This ran a lot faster and reduced my load times from 20 hours to 20 minutes. Let me know if you're interested in that code.
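For the caching variant, the change is just a processor swap on the inner entity of the DIH config. A sketch, with invented table and column names:

<entity name="item" query="select id, name from item">
  <entity name="feature"
          query="select item_id, description from feature"
          processor="CachedSqlEntityProcessor"
          where="item_id=item.id"/>
</entity>

The inner query runs once and is held in memory, so each outer row becomes a map lookup instead of a database round trip. That is also why it only works when the inner result set fits in RAM.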
Re: field value getting null with special char
I can't imagine that the ( or ) is a problem. So I think we need to see how you're using SolrJ. In particular, are you asking for the field in question to be returned (e.g. SolrQuery.setFields or addField)?

Second question: are you sure your SolrJ is connecting to the same server you connect to with the browser? You should see activity in the logs for both cases.

Best
Erick

On Tue, Sep 20, 2011 at 6:23 PM, Ranveer Kumar wrote:
> Is there any help? I am unable to figure it out.
> On 20-Sep-2011 2:22 PM, "Ranveer" wrote:
>> Hi All,
>>
>> I am facing a problem getting the value of a particular field from the solr server.
>> My environment is:
>> Red hat 5.3
>> Solr 3.3
>> Jdk 1.6.24
>> Tomcat 6.2x
>> Fetching values using SolrJ
>>
>> When using *:* in the browser the values show, but when using solrj all values
>> come back except a few fields, those that have special chars:
>>
>> Scoring (TEST)
>> Scoring rate 3/4 (5)
>>
>> The above values come back in the browser but are blank in solrj. I also noticed
>> that all fields with '(' or ')' have this kind of problem.
>>
>> If this is related to '(', then how do I skip the special char so I can get the
>> value in solrj?
>>
>> regards
>> Ranveer
Re: q and fq in solr 1.4.1
Why is it important? What are you worried about that makes this implementation detail necessary to know?

But the short answer is that the fq's are calculated against the whole index and the results are efficiently cached. That's the only way the fq can be re-used against a different search term. The fq clauses are applied before sorting.

Best
Erick

On Tue, Sep 20, 2011 at 10:55 PM, roz dev wrote:
> Hi All
>
> I am sure that the q vs fq question has been answered several times.
> But I still have a question which I would like to know the answer to.
>
> If we have a solr query like this:
>
> q=*&fq=field_1:XYZ&fq=field_2:ABC&sortBy=field_3+asc
>
> how does SolrIndexSearcher fire the query in 1.4.1?
>
> Will it fire the query against the whole index first (because q=*) and then filter
> the results against field_1 and field_2, or is it done in parallel?
>
> And, if we say to get only 20 rows at a time, will solr do the following:
> 1) get all the docs (because q is set to *) and sort them by field_3
> 2) then filter the results by field_1 and field_2
>
> Or will it apply sorting after doing the filter?
>
> Please let me know how Solr 1.4.1 works.
>
> Thanks
> Saroj
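To make the caching point concrete: these two requests share the same two filterCache entries, so the second one only has to evaluate the new q term (field names taken from the question):

http://localhost:8983/solr/select?q=shirts&fq=field_1:XYZ&fq=field_2:ABC
http://localhost:8983/solr/select?q=shoes&fq=field_1:XYZ&fq=field_2:ABC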
Re: JSON response with SolrJ
Hmmm, what advantage does JSON have over the SolrDocument you get back? Perhaps if you describe that, we can offer better suggestions.

Best
Erick

On Wed, Sep 21, 2011 at 5:01 AM, Kissue Kissue wrote:
> Hi,
>
> I am using solr 3.3 with SolrJ. Does anybody have any idea how I can
> retrieve a JSON response with SolrJ? Is it possible? It seems to be more
> focused on XML and Beans.
>
> Thanks.
Re: Selective values for facets
You don't do anything special for facets at index time unless you, say, want to remove some value from the facet field, but then it would NEVER be available.

So if you're saying that at index time you have certain 'New Year's Offers' documents that should ONLY EVER map to NEWA, NEWB, NEWY, you could just take care of that at index time (don't index the other values for those documents).

Assuming it isn't that deterministic, have you looked at facet queries? See the sketch after this message.

Best
Erick

On Wed, Sep 21, 2011 at 7:58 AM, ntsrikanth wrote:
> Hi,
>
> The dataset I have got is for special offers.
> We have a lot of offer codes, but I need to create facets for specific
> conditions only.
>
> For example, I have the following codes: ABCD, AGTR, KUYH, NEWY, NEWA, NEWB,
> EAS1, EAS2
>
> And I need to create a facet like
> 'New Year Offers' mapped to NEWA, NEWB, NEWY and
> 'Easter Offers' mapped to EAS1, EAS2
>
> I don't want other codes returned in the facet when I query it. How do I get
> the other values ignored while creating the facet at indexing time?
>
> Thanks,
> Srikanth NT
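A sketch of the facet-query approach (the field name is invented, and this assumes the default OR operator):

q=*:*&facet=true
  &facet.query={!key="New Year Offers"}offer_code:(NEWA NEWB NEWY)
  &facet.query={!key="Easter Offers"}offer_code:(EAS1 EAS2)

Each facet.query comes back with its own count, and codes you never mention simply never show up.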
Re: Production Issue: SolrJ client throwing this error even though field type is not defined in schema
You might want to review:
http://wiki.apache.org/solr/UsingMailingLists

There's really not much to go on here.

Best
Erick

On Wed, Sep 21, 2011 at 12:13 PM, roz dev wrote:
> Hi All
>
> We are getting this error in our Production Solr Setup:
>
> Message: Element type "t_sort" must be followed by either attribute
> specifications, ">" or "/>".
>
> Solr version is 1.4.1.
>
> The stack trace indicates that solr is returning a malformed document.
>
> Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
>     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
>     at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
>     at com.gap.gid.search.impl.SearchServiceImpl.executeQuery(SearchServiceImpl.java:232)
>     ... 15 more
> Caused by: org.apache.solr.common.SolrException: parsing error
>     at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:140)
>     at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:101)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
>     ... 17 more
> Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[3,136974]
> Message: Element type "t_sort" must be followed by either attribute
> specifications, ">" or "/>".
>     at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
>     at org.apache.solr.client.solrj.impl.XMLResponseParser.readArray(XMLResponseParser.java:282)
>     at org.apache.solr.client.solrj.impl.XMLResponseParser.readDocument(XMLResponseParser.java:410)
>     at org.apache.solr.client.solrj.impl.XMLResponseParser.readDocuments(XMLResponseParser.java:360)
>     at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:241)
>     at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:125)
>     ... 21 more
Re: NRT and commit behavior
No. The problem is that "number of documents" isn't a reliable indicator of resource consumption. Consider the difference between indexing a twitter message and a book: I can put a LOT more docs of 140 chars on a single machine of size X than I can books.

Unfortunately, the only way I know of is to test. Use something like jMeter or SolrMeter to fire enough queries at your machine to determine when you're over-straining resources, and shard at that point (or get a bigger machine).

Best
Erick

On Wed, Sep 21, 2011 at 8:24 PM, Tirthankar Chatterjee wrote:
> Okay, but is there any number (on index size, total docs in the index, or the
> size of physical memory) at which sharding should be considered?
>
> I am trying to find the winning combination.
> Tirthankar
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Friday, September 16, 2011 7:46 AM
> To: solr-user@lucene.apache.org
> Subject: Re: NRT and commit behavior
>
> Uhm, you're putting a lot of index into not very much memory. I really think
> you're going to have to shard your index across several machines to get past
> this problem. Simply increasing the size of your caches is still limited by
> the physical memory you're working with.
>
> You really have to put a profiler on the system to see what's going on. At
> that size there are too many things that it *could* be to definitively answer
> it with e-mails.
>
> Best
> Erick
>
> On Wed, Sep 14, 2011 at 7:35 AM, Tirthankar Chatterjee wrote:
>> Erick,
>> Also, we have tried increasing the caches in our solrconfig. Setting the
>> autowarmCount below to 0 helps the commit call return within a second, but
>> that will slow us down on searches:
>>
>> <filterCache class="solr.FastLRUCache"
>>              size="16384"
>>              initialSize="4096"
>>              autowarmCount="4096"/>
>>
>> <queryResultCache class="solr.LRUCache"
>>                   size="16384"
>>                   initialSize="4096"
>>                   autowarmCount="4096"/>
>>
>> <documentCache class="solr.LRUCache"
>>                size="512"
>>                initialSize="512"
>>                autowarmCount="512"/>
>>
>> -----Original Message-----
>> From: Tirthankar Chatterjee [mailto:tchatter...@commvault.com]
>> Sent: Wednesday, September 14, 2011 7:31 AM
>> To: solr-user@lucene.apache.org
>> Subject: RE: NRT and commit behavior
>>
>> Erick,
>> Here are the answers to your questions:
>> Our index is 267 GB.
>> We are not optimizing...
>> No, we have not profiled yet to check the bottleneck, but the logs indicate
>> that opening the searchers is taking time...
>> Nothing except SOLR is running on the machine.
>> Total memory is 16GB; tomcat has 8GB allocated. Everything is 64 bit: OS, JVM
>> and Tomcat.
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: Sunday, September 11, 2011 11:37 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: NRT and commit behavior
>>
>> Hmm, OK. You might want to look at the non-cached filter query stuff, it's
>> quite recent. The point here is that it is a filter that is applied only
>> after all of the less expensive filter queries are run. One of its uses is
>> exactly ACL calculations: rather than calculate the ACL for the entire doc
>> set, it only calculates access for docs that have made it past all the other
>> elements of the query. See SOLR-2429 and note that it is in 3.4 (currently
>> being released) only.
>>
>> As to why your commits are taking so long, I have no idea, given that you
>> really haven't given us much to work with.
>>
>> How big is your index? Are you optimizing? Have you profiled the application
>> to see what the bottleneck is (I/O, CPU, etc.)? What else is running on your
>> machine? It's quite surprising that it takes that long. How much memory are
>> you giving the JVM? etc...
>>
>> You might want to review:
>> http://wiki.apache.org/solr/UsingMailingLists
>>
>> Best
>> Erick
>>
>> On Fri, Sep 9, 2011 at 9:41 AM, Tirthankar Chatterjee wrote:
>>> Erick,
>>> What you said is correct. For us, the searches are based on some Active
>>> Directory permissions which are populated in the filter query parameter. So
>>> we don't have any warming query concept, as we cannot fire one for every
>>> user ahead of time.
>>>
>>> What we do here is that when a user logs in, we do an invalid query (which
>>> returns no results, instead of '*') with the correct filter query (his
>>> permissions based on the login). This way the cache gets warmed up with
>>> valid docs.
>>>
>>> It works then.
>>>
>>> Also, can you please let me know why commit is taking 45 mins to 1 hour on
>>> well-resourced hardware with multiple processors, 16gb RAM, a 64 bit VM,
>>> etc.? We tried passing waitSearcher as false and found that inside the code
>>> it is hard coded to be true. Is there any specific reason? Can we change
>>> that value to honor what is being passed?
>>>
>>> Thanks,
>>> Tirthankar
>>>
>>> -----Original Message-----
>>> From: Erick
Re: Solr Indexing - Null Values in date field
Solr dates are very specific, and your parsing exception is expected. See:
http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html

Best
Erick

On Thu, Sep 22, 2011 at 6:28 AM, mechravi25 wrote:
> Hi,
>
> Thanks for the suggestions. This is the option I tried.
>
> I changed the data type in my source to date and then indexed the field once
> again.
>
> For the particular field, in my query in the dataimport file, I gave the
> following condition: IFNULL(startdate,NULL).
>
> The document was indexed successfully, but the field startdate was not
> present in the document.
>
> I have a few other records in my source where there is a value present in
> startdate, but when I index those I get this exception:
>
> org.apache.solr.common.SolrException: Invalid Date String:'2011-09-21 18:28:32.733'
>     at org.apache.solr.schema.DateField.parseMath(DateField.java:163)
>     at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171)
>     at org.apache.solr.schema.SchemaField.createField(SchemaField.java:95)
>     at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:204)
>     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:277)
>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
>     at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:75)
>     at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:292)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:618)
>     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:261)
>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:185)
>     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
>     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
>     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
>
> Please help.
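Concretely, the value has to be full ISO 8601 with a trailing Z, e.g. 2011-09-21T18:28:32.733Z rather than 2011-09-21 18:28:32.733. If the source column can't be changed, DIH's DateFormatTransformer can do the conversion. A sketch, with the column name taken from the mail above:

<entity name="doc" transformer="DateFormatTransformer" query="...">
  <field column="startdate" dateTimeFormat="yyyy-MM-dd HH:mm:ss.SSS"/>
</entity>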
Re: mlt content stream help
What version of Solr? When you copied the default, did you set up default values for MLT? Showing us the request you used and the relevant portions of your solrconfig file would help a lot. You might want to review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Thu, Sep 22, 2011 at 9:08 AM, dan whelan wrote:
> I would like to use MLT and the content stream feature in solr like on this
> page:
>
> http://wiki.apache.org/solr/MoreLikeThisHandler
>
> How should the request handler / solrconfig be set up?
>
> I enabled streaming, and I set up a requestHandler by copying the default
> request handler and changing the name to:
>
> name="/mlt"
>
> but when accessing the url like the example on the wiki I get an NPE because
> q is not supplied.
>
> I'm sure I am just doing it wrong, just not sure what.
>
> Thanks,
>
> dan
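For reference, a dedicated MLT handler in solrconfig.xml usually looks something like this (the field list is whatever makes sense for your schema):

<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.fl">name,features</str>
    <int name="mlt.mintf">1</int>
    <int name="mlt.mindf">1</int>
  </lst>
</requestHandler>

Copying the default search handler instead leaves you with solr.SearchHandler, which does want a q parameter, which would be consistent with the NPE described.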
Update ingest rate drops suddenly
Just looking for hints on where to look...

We were testing single-threaded ingest rate on solr, trunk version, on an atypical collection (a lot of small documents), and we noticed something we are not able to explain.

Setup: we use defaults for index settings, windows 64 bit, jdk 7 U2, on SSD, a machine with enough memory and 8 cores. The schema has 5 stored fields, 4 of them indexed, no positions, no norms. Average net document size (optimized index size / number of documents) is around 100 bytes.

On a test with 40 Mio documents:
- we had an update ingest rate on the first 4.4 Mio documents @ an incredible 34k records/second...
- then it dropped, suddenly, to 20k records per second, and this rate remained stable (variance 1k) until...
- we hit 13 Mio, where the ingest rate dropped really hard again, from one instant to the next, to 10k records per second.

It stayed there until we reached the end @ 40 Mio (slightly reducing, to ca 9k, but this is not long enough to see a trend).

Nothing unusual was happening with jvm memory (a fully regular 200-450M sawtooth). CPU in turn was following the ingest rate trend, indicating that we were waiting on something. No searches, no commits, nothing. autoCommit was turned off. Updates were streaming directly from the database.

I did not expect something like this, knowing lucene merges in the background. Also, having such sudden drops in ingest rate indicates that we are not leaking something (a leak's drop would have been much more gradual). It is some cache, but why two really significant drops? 33k/sec to 20k and then to 10k... We would love to keep it @ 34k/second :)

I am not really acquainted with the new MergePolicy and flushing settings, but I suspect there is something there we could tweak.

Could it be that windows is somehow, hmm, quirky with the solr default directory on win64/jvm (I think it is MMAP by default)? We did not saturate IO with such small documents, I guess; it is just a couple of Gig over 1-2 hours.

All in all, it works well, but are such hard update ingest rate drops normal?

Thanks,
eks.
Re: Production Issue: SolrJ client throwing - Element type must be followed by either attribute specifications, ">" or "/>".
I suspect this is an issue with, say, your servlet container truncating the response or some such, but that's a guess...

Best
Erick

On Thu, Sep 22, 2011 at 9:09 PM, roz dev wrote:
> Wanted to update the list with our finding.
>
> We reduced the number of documents being retrieved from Solr and this error
> did not appear again. It might be the case that due to the high number of
> documents, solr was returning incomplete documents.
>
> -Saroj
>
> On Wed, Sep 21, 2011 at 12:13 PM, roz dev wrote:
>> Hi All
>>
>> We are getting this error in our Production Solr Setup.
>>
>> Message: Element type "t_sort" must be followed by either attribute
>> specifications, ">" or "/>".
>> Solr version is 1.4.1
>>
>> The stack trace indicates that solr is returning a malformed document.
>> ...
Re: How to map database table for facted search?
In general, you flatten the data when you put things into Solr. I know that's anathema to DB training, but this is searching...

If you have a reasonable number of distinct column names, you could just define your schema to have an entry for each and index the associated values that way. Then your facets become easy: you're just faceting on the "facet_hobby" field in your example.

If that's impractical (say you can add arbitrary columns), you can do something very similar with dynamic fields; see the sketch after this message.

You could also create a field with the column/name pairs (watch your tokenizer!) in a single field and facet by prefix, where the prefix is the column name (e.g. index tokens like hobby_sailing, hobby_camping, interest_reading, then facet with facet.prefix=hobby_).

There are tradeoffs for each that you'll have to experiment with. Note that there is no penalty in Solr for defining fields in your schema but not using them.

Best
Erick

On Fri, Sep 23, 2011 at 12:06 AM, Chorherr Nikolaus wrote:
> Hi All!
>
> We are working with solr for the first time and have a simple data model:
>
> Entity Person (column surname) has 1:n Attribute (column name) has 1:n
> Value (column text)
>
> We need faceted search on the content of Attribute:name, not on Attribute:name
> itself. E.g. if an Attribute of a person has name=hobby, we would like to have
> something like "facet=true&facet.name=hobby" and get back all related Values
> with counts. (We do not need a "facet.name=name" to get back all distinct
> values of the name column of Attribute.)
>
> How do we have to map our database, define our document and/or define our
> schema?
>
> Any help is highly appreciated - Thx in advance
>
> Niki
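A sketch of the dynamic-field variant (the type and naming pattern are illustrative):

<dynamicField name="facet_*" type="string" indexed="true" stored="false"/>

Index each attribute name/value pair as facet_hobby, facet_interest, etc., and a request like

q=*:*&facet=true&facet.field=facet_hobby

then returns every hobby value with its count.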
Re: Solr wildcard searching
Really, really get in the habit of looking at your query with &debugQuery=on appended; it'll save you a world of pain.

customer_name:John Do* doesn't do what you think. It parses into

customer_name:John OR default_search_field:Do*

You want something like

customer_name:(+John +Do*)
or
+customer_name:John +customer_name:Do*

Look particularly at the parsedquery part of the return; the scoring stuff is useful for understanding scoring...

And watch out for your default operator. In the absence of you setting it in your schema.xml file, it is OR, so a result (with &debugQuery=on) of

customer_name:John customer_name:Do*

translates as though each term is a SHOULD clause...

Best
Erick

On Thu, Sep 22, 2011 at 7:08 PM, jaystang wrote:
> Hey guys,
> Very new to solr. I'm using the data import handler to pull customer data
> out of my database and index it. All works great so far. Now I'm trying to
> query against a specific field and I seem to be struggling with doing a
> wildcard search. See below.
>
> I have several indexed documents with a "customer_name" field containing
> "John Doe". I have a UI that contains a listing of this indexed data as
> well as a keyword filter field (filter as you type). So when the user starts
> typing "J", "John Doe" should return, for "Jo", "John Doe" should return,
> "Joh"... etc, etc...
>
> I've tried the following:
>
> Search: customer_name:Joh*
> Returns: the correct "John Doe" record
>
> Search: customer_name:John Do*
> Returns: no results (nothing returns w/ 2 words since I don't have the
> string in quotes)
>
> Search: customer_name:"Joh*"
> Returns: no results
>
> Search: customer_name:"John Do*"
> Returns: no results
>
> Search: customer_NAME:"John Doe*"
> Returns: the correct "John Doe" record
>
> I feel like I'm close, the only issue is when there are multiple words.
>
> Any advice would be appreciated.
>
> Thanks!
Re: Solrj - when a request fails
Hmmm. I'm a little confused. Are you sure your log is going somewhere and that you are NOT seeing any stack traces? Because it looks like you *are* seeing them, in which case re-throwing the error breaks your file-fetch loop and stops your processing. I'd actually expect that you're losing some files beforehand as well, since even when you do trap those errors you're not doing a commit operation, unless perhaps autocommit is saving you.

BTW, committing after every document is probably a bad idea; it'll create lots and lots of segments unnecessarily. I'd rely on the autocommit features and optionally commit after the run is completed.

Best
Erick

On Fri, Sep 23, 2011 at 5:55 AM, Walter Closenfleight wrote:
> I have a java program which sends thousands of Solr XML files up to Solr
> using the following code. It works fine until there is a problem with one of
> the Solr XML files. The code fails on the solrServer.request(up) line, but it
> does not throw an exception my application can catch and recover from, and my
> whole application dies.
>
> I've fixed the individual file that made it fail, but want to better trap
> these so my application does not die.
>
> Thanks for any insight you can provide. Java code and log below.
>
> // ... start of a loop to process each file removed ...
>
> try {
>     String xml = read(filename);
>     DirectXmlRequest up = new DirectXmlRequest( "/update", xml );
>     solrServer.request( up );
>     solrServer.commit();
> } catch (SolrServerException e) {
>     log.warn("Exception: "+ e.toString());
>     throw new MyException(e);
> } catch (IOException e) {
>     log.warn("Exception: "+ e.toString());
>     throw new MyException(e);
> }
>
> DEBUG >> "[\n]" - (Wire.java:70)
> DEBUG Request body sent - (EntityEnclosingMethod.java:508)
> DEBUG << "HTTP/1.1 400 Bad Request[\r][\n]" - (Wire.java:70)
> DEBUG << "Server: Apache-Coyote/1.1[\r][\n]" - (Wire.java:70)
> DEBUG << "Content-Type: text/html;charset=utf-8[\r][\n]" - (Wire.java:70)
> DEBUG << "Content-Length: 1271[\r][\n]" - (Wire.java:70)
> DEBUG << "Date: Fri, 23 Sep 2011 12:08:05 GMT[\r][\n]" - (Wire.java:70)
> DEBUG << "Connection: close[\r][\n]" - (Wire.java:70)
> DEBUG << "[\r][\n]" - (Wire.java:70)
> DEBUG << Apache Tomcat/6.0.29 - Error report: HTTP Status 400 - Unexpected
> character 'x' (code 120) in prolog; expected '<' at [row,col {unknown-source}]: [3,1]
> DEBUG << The request sent by the client was syntactically incorrect (Unexpected
> character 'x' (code 120) in prolog; expected '<' at [row,col {unknown-source}]: [3,1]).
> DEBUG Should close connection in response to directive: close - (HttpMethodBase.java:1008)
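For the autocommit route, the knobs live in the updateHandler section of solrconfig.xml. The numbers below are illustrative, not recommendations:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs> <!-- commit after this many added docs -->
    <maxTime>60000</maxTime> <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>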
Re: Solr 3.4 Problem with integrating Query Parser Plug In
Could you please add some details here? It's really hard to figure out what the problem is. Perhaps you could review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Fri, Sep 23, 2011 at 9:28 AM, Ahson Iqbal wrote:
> Hi
>
> I have indexed some 1M documents, just for performance testing. I have
> written a query parser plugin; when I add it to the solr lib folder under
> tomcat's webapps folder and try to load the solr admin page, it keeps on
> loading, and when I delete the plugin's jar file from lib it works fine. But
> the jar file works fine with solr 3.3 and also with solr 1.4.
>
> please help.
>
> Regards
> Ahsan
Re: what are the disdvantages of using dynamic fields?
There are really no differences between dynamic and static fields performance-wise that I know of.

Personally, though, I tend to prefer static over dynamic from a maintenance/debugging perspective. At issue is tracking down why results weren't as expected, then spending several days discovering that I managed to mis-spell some field in my docs, in my SolrJ program, or in my queries. A variant of the "fail early" notion.

Dynamic fields have great uses, but I think you're better off using static when possible.

Best
Erick

On Fri, Sep 23, 2011 at 3:14 PM, Jason Toy wrote:
> Hi all,
>
> I'd like to know what the specific disadvantages are of using dynamic fields
> in my schema. About half of my fields are dynamic, but I could move all of
> them to be static fields. Will my searches run faster? If there are no
> disadvantages, can I just set all my fields to be dynamic?
>
> Jason
Re: two cores but have single result set in solr
I think you should step back and consider what you're asking for, as Ken pointed out. You have different schemas, and presumably different documents in each schema. The scores from the different cores are NOT comparable, so how could you combine them meaningfully? Further, assuming that the documents have different characteristics, the term frequencies and document frequencies will be different. Solr really only supports this notion if your schemas are identical and you're indexing similar documents to shards; using shards with this intent probably won't do what you expect.

But why not just index your SolrJ documents directly into the same core that you use for Nutch and search that one index? You don't have to provide values for any fields that don't have 'required="true"' set. If you do this, I suspect you'll have trouble with relevance, but at least you'll get started.

Best
Erick

On Sat, Sep 24, 2011 at 8:02 AM, hadi wrote:
> I read the link, but
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=ipod+solr
> gives an XML response that is not useful for me. I want to create the query
> in solr/browse, so the template engine needs to change. Do you know how to
> change that to search both cores? thanks
>
> On 9/24/11, Yury Kats [via Lucene] wrote:
>>
>> On 9/24/2011 3:09 AM, hadi wrote:
>>> I do not know how to search both cores without defining the "shards"
>>> parameter. Could you show me a solution to my issue?
>>
>> See this: http://wiki.apache.org/solr/DistributedSearch
Re: resource to see which versions build from trunk?
Hmmm, why are you doing this? Why not use the latest successful trunk build?

You can get a series of built artifacts at:
https://builds.apache.org//view/S-Z/view/Solr/job/Solr-trunk/
but I'm not sure how far back they go. How are you getting the trunk source code? And *how* do they fail to build?

But I really question how useful this is unless it's curiosity, since none of the trunk builds will be officially supported until the release...

Best
Erick

On Sat, Sep 24, 2011 at 11:08 AM, Jason Toy wrote:
> Hi all, I am testing various versions of solr from trunk, and I am finding
> that often the example doesn't build, so I can't test out the version. Is
> there a resource that shows which versions build correctly so that we can
> test them out?
Re: Solr wildcard searching
And to complete Erick's answer:

In this search,

customer_name:"Joh*"

* is not considered a wildcard; it is an exact search.

Another thing (it is not your problem here): words with wildcards are not analyzed, so if your analyzer contains a lower-case filter, these words are stored in lower case in the index:

John -> john
Do -> do

so customer_name:Do* will not find anything.

Ludovic.

-----
Jouve
France.
Re: Solr wildcard searching
Thanks Ludovic, you're absolutely right, I should have added that.

BTW, there are patches for this that haven't been committed, see:
https://issues.apache.org/jira/browse/SOLR-1604
and similar.

Best
Erick

On Sat, Sep 24, 2011 at 1:32 PM, lboutros wrote:
> And to complete Erick's answer:
>
> In this search,
>
> customer_name:"Joh*"
>
> * is not considered a wildcard; it is an exact search.
>
> Another thing (it is not your problem here): words with wildcards are not
> analyzed, so if your analyzer contains a lower-case filter, these words are
> stored in lower case in the index:
>
> John -> john
> Do -> do
>
> so customer_name:Do* will not find anything.
>
> Ludovic.
Re: SOLR error with custom FacetComponent
Erik,
Unfortunately the facet fields are not static. The fields are dynamic SOLR fields and are generated by different applications. The field names will be populated into a data store (like memcache) and the facets have to be driven from that data store. I need to write a custom FacetComponent which picks up the facet fields from the data store.
Thanks for your response.
-Ravi Bulusu

Subject: Re: SOLR error with custom FacetComponent
From: Erik Hatcher
Date: 2011-09-21 18:18

Why create a custom facet component for this? Simply add lines like this to your request handler(s):

<str name="facet.field">manu_exact</str>

in either the "defaults" or "appends" sections.

Erik

On Wed, Sep 21, 2011 at 2:00 PM, Ravi Bulusu wrote:
> Hi All,
>
> I'm trying to write a custom SOLR facet component, and I'm getting some
> errors when I deploy my code into the SOLR server.
>
> Can you please let me know what I'm doing wrong? I appreciate your help on
> this issue. Thanks.
>
> *Issue*
>
> I'm getting an error saying "Error instantiating SearchComponent: <my custom
> class> is not a org.apache.solr.handler.component.SearchComponent".
>
> My custom class inherits from *FacetComponent*, which extends *SearchComponent*.
>
> My custom class is defined as follows...
>
> I implemented the process method to meet our functionality.
>
> We have some default facets that have to be sent every time, irrespective
> of the query request:
>
> /**
>  *
>  * @author ravibulusu
>  */
> public class MyFacetComponent extends FacetComponent {
>     ....
> }
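For completeness, the wiring for a custom component looks like this in solrconfig.xml (the component and class names here are invented):

<searchComponent name="myFacetComponent" class="com.mycompany.MyFacetComponent"/>

<requestHandler name="/select" class="solr.SearchHandler">
  <arr name="components">
    <str>query</str>
    <str>myFacetComponent</str> <!-- in place of the stock "facet" component -->
    <str>debug</str>
  </arr>
</requestHandler>

An 'is not a SearchComponent' error at that point often means the container loaded a second copy of the Solr classes (e.g. the custom jar bundles the solr/lucene jars, or sits in a lib directory with its own copy), so the type check fails even though the class itself is fine.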
Re: resource to see which versions build from trunk?
Hey, the more hammering on trunk the better!

On Sep 24, 2011, at 13:31, Erick Erickson wrote:
> Hmmm, why are you doing this? Why not use the latest successful trunk build?
> ...
Re: resource to see which versions build from trunk?
Agreed, but I'd rather see hammering on the latest code.

On Sat, Sep 24, 2011 at 1:53 PM, Erik Hatcher wrote:
> Hey, the more hammering on trunk the better!
> ...
Re: indexing an xml file
Send us the example "solr.xml" and "schema.xml'". You are missing fields in the schema.xml that you are referencing. On 9/24/11 8:15 AM, "ahmad ajiloo" wrote: >hello >Solr Tutorial page explains about index a xml file. but when I try to >index >a xml file with this command: >~/Desktop/apache-solr-3.3.0/example/exampledocs$ java -jar post.jar >solr.xml >I get this error: >SimplePostTool: FATAL: Solr returned an error #400 ERROR:unknown field >'name' > >can anyone help me? >thanks
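For anyone hitting this later, the shape of the two files is roughly as follows; the field names here just mirror the tutorial example. The document being posted:

<add>
  <doc>
    <field name="id">SOLR1000</field>
    <field name="name">Solr, the Enterprise Search Server</field>
  </doc>
</add>

and every field name used above must be declared (or matched by a dynamicField) in schema.xml:

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="name" type="text_general" indexed="true" stored="true"/>

The 'unknown field' error means the declaration is missing for that name, e.g. because the stock schema was replaced by one (such as nutch's) that has no 'name' field.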
Re: Update ingest rate drops suddenly
eks,

This is clear as day: you're using Winblows! Kidding. I'd:

* watch IO with something like vmstat 2 and see if the rate drops correlate with increased disk IO or IO wait time
* monitor the DB from which you were pulling the data (maybe the DB or the server that runs it had issues)
* monitor the network over which you pull data from the DB

If none of the above reveals the problem, I'd still:

* grab all the data you need to index and copy it locally
* index everything locally

Out of curiosity, how big are your ramBufferSizeMB and your -Xmx? And on that 8-core box, do you have ~8 indexing threads going?

Otis
Sematext is Hiring -- http://sematext.com/about/jobs.html

> From: eks dev
> To: solr-user
> Sent: Saturday, September 24, 2011 3:18 PM
> Subject: Update ingest rate drops suddenly
>
> Just looking for hints on where to look...
>
> We were testing single-threaded ingest rate on solr, trunk version, on an
> atypical collection (a lot of small documents), and we noticed something we
> are not able to explain.
> ...
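(For reference, that buffer is set in solrconfig.xml; 32 is what the example config ships with:

<ramBufferSizeMB>32</ramBufferSizeMB>

Larger values mean fewer flushes to disk during a bulk load.)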
Re: matching response and request
Hi Roland,

Check this:

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">0</int>
  <lst name="params">
    <str name="indent">on</str>
    <str name="start">0</str>
    <str name="q">solr</str>
    <str name="foo">1</str>        <=== from &foo=1
    <str name="version">2.2</str>
    <str name="rows">10</str>
  </lst>
</lst>

I added &foo=1 to the request to Solr and got the above back.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

> From: Roland Tollenaar
> To: solr-user@lucene.apache.org
> Sent: Saturday, September 24, 2011 4:07 AM
> Subject: matching response and request
>
> Hi,
>
> sorry for this question, but I am hoping it has a quick solution.
>
> I am sending multiple GET request queries to solr, but solr is not returning
> the responses in the sequence I send the requests.
> ...
Best Solr escaping?
What is the best algorithm for escaping strings before sending them to Solr? Does someone have some code?

A few things I have witnessed in "q" using the DIH handler:

* Double quotes (") that are not balanced can cause several issues, from an error to no results. (Strip the unbalanced double quote?)
* Should we use + or %20, and which cases make sense:
  * "Dr. Phil Smith" or "Dr.+Phil+Smith" or "Dr.%20Phil%20Smith", and what is the impact of the double quotes?
* Unmatched parentheses, i.e. opening ( and not closing:
  * (Dr. Holstein
  * Cardiologist+(Dr. Holstein

Regular encoding of strings does not always work for the whole string due to several issues, like white space:

* White space works better when we escape it with a backslash, "Bill\ Bell", especially when using facets.

Thoughts? Code? Ideas? Better Wikis?
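If you're building the query string in Java, SolrJ already ships an escaper that handles the Lucene special characters (quotes, parens, whitespace, etc.); URL-encoding is then a separate, second step applied by your HTTP client. A small sketch:

import org.apache.solr.client.solrj.util.ClientUtils;

public class EscapeDemo {
  public static void main(String[] args) {
    String raw = "Cardiologist (Dr. Holstein";
    // escapes characters such as ( ) " + - ! : ^ [ ] { } ~ * ? | & ; and whitespace with a backslash
    String escaped = ClientUtils.escapeQueryChars(raw);
    System.out.println("customer_name:" + escaped); // customer_name:Cardiologist\ \(Dr.\ Holstein
  }
}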
Re: Search query doesn't work in solr/browse pnnel
Yes. It appears that "&" cannot be encoded in the URL or there is really bad results. For example we get an error on first request, but if we refresh it goes away. On 9/23/11 2:57 PM, "hadi" wrote: >When I create a query like "something&fl=content" in solr/browse the "&" >and >"=" in URL converted to %26 and %3D and no result occurs. but it works in >solr/admin advanced search and also in URL bar directly, How can I solve >this problem? Thanks > >-- >View this message in context: >http://lucene.472066.n3.nabble.com/Search-query-doesn-t-work-in-solr-brows >e-pnnel-tp3363032p3363032.html >Sent from the Solr - User mailing list archive at Nabble.com.
RE: JdbcDataSource and threads
Thanks a lot for your response! I think that is exactly what's happening. It runs ok for a short time and then starts throwing that error, while some of the queries run successfully. I had it set up with 10 threads; maybe that was too much.

I'd be very interested in that code if you don't mind sharing. I'm migrating code from pure Lucene to Solr, and indexing time went from less than one hour to more than 4 because it's using only one thread.

Thanks a lot again, very helpful.
Maria

Sent from my Motorola ATRIX™ 4G on AT&T

-----Original message-----
From: rkuris
To: solr-user@lucene.apache.org
Sent: Sat, Sep 24, 2011 18:11:20 GMT+00:00
Subject: RE: JdbcDataSource and threads

My guess on this is that you're making a LOT of database requests, have a million TIME-WAIT connections, and your range of local ports is running out.
...
Solr UpdateJSON - extra fields
If JSON being posted to the http://localhost:8983/solr/update/json URL has extra fields that are not defined in the index schema, will those be silently ignored or will an error be thrown?