Spell checking the synonym list?
Hi all, I'm wondering if it's possible to have spell checking performed on terms in the synonym list? For example, let's say I have documents with the word "lawyer" in them and I add "lawyer, attorney" in the synonyms.txt file. Then a query is made for the word "atorney". Is there any way to provide spell checking on this? Thanks, Ryan
Re: Lost connection to Zookeeper
Hi, We are facing the same issues on our setup: 3 ZK nodes, 1 shard, 10 collections, 1 replica, v. 5.0.0, default startup params.
Solr servers: 2-core CPU, 7 GB memory. Index size: 28 GB, 3 GB heap.
This setup was running on v. 4.6 before upgrading to 5 without any of these errors. The timeout seems to happen randomly and only to 1 of the replicas (fortunately) at a time.
Joe: did you get anywhere with the perf hints? If not, any other tips appreciated.

null:org.apache.solr.common.SolrException: CLUSTERSTATUS the collection time out:180s
        at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:630)
        at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:582)
        at org.apache.solr.handler.admin.CollectionsHandler.handleClusterStatus(CollectionsHandler.java:932)
        at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:256)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
        at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:736)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:261)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:204)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:368)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)

- Eirik

On Fri, Jun 5, 2015 at 15:58, Joseph Obernberger <j...@lovehorsepower.com> wrote:
> Thank you Shawn!
> Yes - it is now a Solr 5.1.0 cloud on 27 nodes and we use the startup scripts. The current index size is 3.0T - about 115G per node. The index is stored in HDFS, which is spread across those 27 nodes and about (a guess) 256 spindles. Each node has 26G of HDFS cache (MaxDirectMemorySize) allocated to Solr. Zookeeper storage is on local disk. Solr and HDFS run on the same machines. Each node is connected to a switch over 1G Ethernet, but the backplane is 40G.
> Do you think the clusterstatus and the zookeeper timeouts are related to performance issues talking to HDFS?
>
> The JVM parameters are:
> -
> -DSTOP.KEY=solrrocks
> -DSTOP.PORT=8100
> -Dhost=helios
> -Djava.net.preferIPv4Stack=true
> -Djetty.port=9100
> -DnumShards=27
> -Dsolr.clustering.enabled=true
> -Dsolr.install.dir=/opt/solr
> -Dsolr.lock.type=hdfs
> -Dsolr.solr.home=/opt/solr/server/solr
> -Duser.timezone=UTC
> -DzkClientTimeout=15000
> -DzkHost=eris.querymasters.com:2181,daphnis.querymasters.com:2181,triton.querymasters.com:2181,oberon.querymasters.com:2181,portia.querymasters.com:2181,puck.querymasters.com:2181/solr5
>
> -XX:+CMSParallelRemarkEnabled
> -XX:+CMSScavengeBeforeRemark
> -XX:+ParallelRefProcEnabled
> -XX:+PrintGCApplicationStoppedTime
> -XX:+PrintGCDateStamps
> -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps
> -XX:+PrintHeapAtGC
> -XX:+PrintTenuringDistribution
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+UseConcMarkSweepGC
> -XX:+UseLarge
Re: Can I instruct the Tika Entity Processor to skip the first page using the DIH?
On 08/07/2015 20:39, Allison, Timothy B. wrote: Unfortunately, no. We can't even do that now with straight Tika. I imagine this is for pdf files? If you'd like to add this as a feature, please submit a ticket over on Tika. Another alternative is to pre-process the PDF files to remove the first page. I've used the command line version of PDFtk for this kind of thing in the past: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/ I'd also recommend using Tika outside Solr rather than via the DIH: certain nasty PDFs can kill Tika, which then can kill Solr. Charlie -Original Message- From: Paden [mailto:rumsey...@gmail.com] Sent: Wednesday, July 08, 2015 12:14 PM To: solr-user@lucene.apache.org Subject: Can I instruct the Tika Entity Processor to skip the first page using the DIH? Hello, I'm using the DIH to import some files from one of my local directories. However, every single one of these files has the same first page. So I want to skip that first page in order to optimize search. Can this be accomplished by an instruction within the dataimporthandler or, if not, how could you do this? -- View this message in context: http://lucene.472066.n3.nabble.com/Can-I-instruct-the-Tika-Entity-Processor-to-skip-the-first-page-using-the-DIH-tp4216373.html Sent from the Solr - User mailing list archive at Nabble.com. -- Charlie Hull Flax - Open Source Enterprise Search tel/fax: +44 (0)8700 118334 mobile: +44 (0)7767 825828 web: www.flax.co.uk
Re: Solr cache when using custom scoring
Mikhail, We've now overridden the equals & hashCode methods of the custom query to take this new param into account as well, and it works like a charm. Thanks a lot, Ami
--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-cache-when-using-custom-scoring-tp4216419p4216496.html
Sent from the Solr - User mailing list archive at Nabble.com.
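For reference, a minimal sketch of the kind of override being discussed. The class name MyScoredQuery and the boostParam field are made up for illustration (the real custom query will differ); the point is only that the extra parameter must participate in equals()/hashCode(), otherwise Solr's caches treat two queries that differ only in that parameter as identical.

import org.apache.lucene.queries.CustomScoreQuery;
import org.apache.lucene.search.Query;

public class MyScoredQuery extends CustomScoreQuery {
    private final String boostParam;   // the new parameter that changes scoring

    public MyScoredQuery(Query subQuery, String boostParam) {
        super(subQuery);
        this.boostParam = boostParam;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!super.equals(o) || getClass() != o.getClass()) return false;
        // include the custom parameter so cache keys differ when it differs
        return boostParam.equals(((MyScoredQuery) o).boostParam);
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + boostParam.hashCode();
    }
}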
RE: Do I really need copyField when my app can do the copy?
Hi, I want to add a question regarding copyField and LowerCaseFilterFactory. We notice that LowerCaseFilterFactory takes a large part of the CPU (via profiling) for the text field. Can we avoid it or improve that implementation (while keeping case-insensitive search)? Best Regards, Nir Barel

-Original Message-
From: Petersen, Robert [mailto:robert.peter...@rakuten.com]
Sent: Thursday, July 09, 2015 1:59 AM
To: solr-user@lucene.apache.org
Subject: RE: Do I really need copyField when my app can do the copy?

Perhaps some people like maybe those using DIH to feed their index might not have that luxury and copyfield is the better way for them. If you have an application you can do it either way. I have done both ways in different situations. Robi

-Original Message-
From: Steven White [mailto:swhite4...@gmail.com]
Sent: Wednesday, July 08, 2015 3:38 PM
To: solr-user@lucene.apache.org
Subject: Do I really need copyField when my app can do the copy?

Hi Everyone, What good is the use of copyField in Solr's schema.xml if my application can do it into the designated field? Having my application do so helps me simplify the schema.xml maintains task thus my motivation. Thanks Steve
Re: Too many Soft commits and opening searchers realtime
Cool ! So actually you were not using the default you defined in th Solrconfig, but it was loaded from a java environment property set to be "3" ms ? Cheers 2015-07-09 4:21 GMT+01:00 Summer Shire : > Yonik, Mikhail, Alessandro > > After a lot of digging around and isolation, All u guys were right. I was > using property based value and there was one place where it was 30 secs and > that was overriding my main props. > > Also Yonik thanks for the explanation on the real time searcher. I wasn't > sure if the maxwarmingSearcher error I was getting also had something to do > with it. > > Thanks a lot > > > On Jul 8, 2015, at 5:28 AM, Yonik Seeley wrote: > > > > A realtime searcher is necessary for internal bookkeeping / uses if a > > normal searcher isn't opened on a commit. > > This searcher doesn't have caches and hence doesn't carry the weight > > that a normal searcher would. It's also invisible to clients (it > > doesn't change the view of the index for normal searches). > > > > Your hard autocommit at 8 minutes with openSearcher=false will trigger > > a realtime searcher to open on every 8 minutes along with the hard > > commit. > > > > -Yonik > > > > > >> On Tue, Jul 7, 2015 at 5:29 PM, Summer Shire > wrote: > >> HI All, > >> > >> Can someone help me understand the following behavior. > >> I have the following maxTimes on hard and soft commits > >> > >> yet I see a lot of Opening Searchers in the log > >> org.apache.solr.search.SolrIndexSearcher - Opening Searcher@1656a258[main] > realtime > >> also I see a soft commit happening almost every 30 secs > >> org.apache.solr.update.UpdateHandler - start > commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false} > >> > >> 48 > >> false > >> > >> > >> > >> 18 > >> > >> I tried disabling softCommit by setting maxTime to -1. > >> On startup solrCore recognized it and logged "Soft AutoCommit: disabled" > >> but I could still see softCommit=true > >> org.apache.solr.update.UpdateHandler - start > commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false} > >> > >> -1 > >> > >> > >> Thanks, > >> Summer > -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti "Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry?" William Blake - Songs of Experience -1794 England
Re: Do I really need copyField when my app can do the copy?
Let me answer in line : 2015-07-09 9:35 GMT+01:00 Nir Barel : > Hi, > > I wants to add a question regarding copyField and LowerCaseFilterFactory > We notice that LowerCaseFilterFactory takes huge part of the CPU ( via > profiling ) for the text filed > Can we avoid it or improve that implementation? ( keeping the insensitive > case search ) > If you want to have case insensitive search you DO need the Lower case token filter and i suggest to use it. Or possibly a tokenizer that already produce lowercase tokens ( example : lower case tokenizer ). To answer your second question, of course you can improve the lower case token filter implementation ! I never checked it, I think it is already good but if you believe you can improve it, i encourage you to do so ! Cheers > > Best Regards, > Nir Barel > > > -Original Message- > From: Petersen, Robert [mailto:robert.peter...@rakuten.com] > Sent: Thursday, July 09, 2015 1:59 AM > To: solr-user@lucene.apache.org > Subject: RE: Do I really need copyField when my app can do the copy? > > Perhaps some people like maybe those using DIH to feed their index might > not have that luxury and copyfield is the better way for them. If you have > an application you can do it either way. I have done both ways in > different situations. > > Robi > > -Original Message- > From: Steven White [mailto:swhite4...@gmail.com] > Sent: Wednesday, July 08, 2015 3:38 PM > To: solr-user@lucene.apache.org > Subject: Do I really need copyField when my app can do the copy? > > Hi Everyone, > > What good is the use of copyField in Solr's schema.xml if my application > can do it into the designated field? Having my application do so helps me > simplify the schema.xml maintains task thus my motivation. > > Thanks > > Steve > &j)ly˫y�� > ���"� > -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti "Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry?" William Blake - Songs of Experience -1794 England
Data from CatalogGroup and CatalogEntry cores
Hi All, Is there a way to get combined data from 2 different cores in a single call? For example, data from both the CatalogEntry and CatalogGroup cores in a single call to Solr. -- Regards, Santosh Sidnal
Re: Ranking based on term position
Hi Li Li, I am experiencing the same problem. Can you explain in a little more detail? Where do I change these methods? I am using Solr 5.0.0. And how do I query this? Does anything change in the query? -- View this message in context: http://lucene.472066.n3.nabble.com/Ranking-based-on-term-position-tp979271p4216522.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Spell checking the synonym list?
One of the uses of synonyms is to replace a mis-spelled query term with a correctly spelled value. The "2 sided" synonym file format allows you to control which values "survive" into the actual query. lawyer, attorney, ambulance chaser, atorney, lowyor => lawyer, attorney I am not aware, however, of any integration between synonym processing and a spellcheck dictionary. Makes sense, though. But I think additional metadata would be required, per dictionary entry, to govern synonym processing. Thus, building the dictionary would not be a transparent/automatic process. https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-SynonymFilter -Original Message- From: Ryan Yacyshyn [mailto:ryan.yacys...@gmail.com] Sent: Thursday, July 09, 2015 3:28 AM To: solr-user@lucene.apache.org Subject: Spell checking the synonym list? Hi all, I'm wondering if it's possible to have spell checking performed on terms in the synonym list? For example, let's say I have documents with the word "lawyer" in them and I add "lawyer, attorney" in the synonyms.txt file. Then a query is made for the word "atorney". Is there any way to provide spell checking on this? Thanks, Ryan * This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it. TIAA-CREF *
Restore index API does not work in Solr 5.1.0?
Hi all, How can we restore the index in Solr 5.1.0? We did the following:
1. Started SolrCloud with: bin/solr start -e cloud -noprompt
2. Posted some documents to Solr from the examples folder using: java -Dc=gettingstarted -jar post.jar *.xml
3. Backed up the index using: http://localhost:8983/solr/gettingstarted/replication?command=backup
4. Deleted 1 document using: http://localhost:8983/solr/gettingstarted/update?stream.body=<delete><query>id:"IW-02"</query></delete>&commit=true
5. Restored the index using: http://localhost:8983/solr/gettingstarted/replication?command=restore
The restore works fine with the same steps on 5.2, but not on 5.1. Is there any other way to restore the index in Solr 5.1.0? -- Best Regards, Dinesh Naik
SolrJ/Tika custom indexer not indexing CERTAIN .doc text?
Hello, I've been working to get a search engine up and running for a little while now. I'm using Solr to index from both a database and a file system. However, I'm using the filepath contained inside the database to find the file in the filesystem and then merge the metadata from the DB and the file system. I pretty much figured out I had two options. I could use the DIH or I could create my own custom indexer in Java. I got pretty far on the indexer, almost complete actually. But I defaulted to the DIH because it indexed all the files I had at the time well. Now I'm taking the project to the next stage of development and I'm worried that the larger PDFs that I have to index might just kill Tika/Solr, thereby stopping me in my tracks. So I want to have that custom indexer as a backup.

As I said, I got pretty far with the custom indexer but I encountered one problem at the end. Tika wouldn't index the text of all the .doc files. It would pull the document, but when I got the results in Solr the text would look blank:

{
  Author: "Some name"
  text: ""
}

Some context: I got these files from a .zip that was given to me by another department, so they were all sitting in a single file system. After trying a few things I finally created a NEW .doc and copied the text from another .doc file in the system to see if that would work. And it did. So it's not that it wasn't indexing the text of .doc files. It was just THOSE .doc's I was given in the .zip. I didn't request another zip with fresh files because that would mean jumping through some hoops, but I wonder if I should.

Now I haven't posted the code because I don't feel like this is really a code issue. I feel like it might be some bizarre file issue. I've posted the code below, but really I was just wondering whether or not anyone has run into this particular brand of problem before and how they solved it. I'm using a Linux file system, so there's that, and ALL data except for the text comes from the database. That means author, id, ...etc. all come from the database. That's why I could get the "Some name" author above in the response.
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;
import org.apache.pdfbox.pdmodel.PDDocument;
/* Tika jars need to be retrieved online */
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.sql.*;
import java.util.ArrayList;
import java.util.Collection;

public class TikaSqlIndexer {
    private SolrClient server = new HttpSolrClient("http://localhost:8983/solr/Testcore3");
    private long _start = System.currentTimeMillis();
    private AutoDetectParser autoParser;
    private PDFParser pdfParser;
    private int _totalTika = 0;
    private int _totalSql = 0;
    private Collection _docs = new ArrayList();

    public static void main(String[] args) {
        try {
            TikaSqlIndexer idxer = new TikaSqlIndexer("http://localhost:8983/solr/Testcore3");
            //idxer.Index();
            idxer.doTikaDocuments(new File("/home/paden/Documents/LWP_Files/BIGDATA"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private TikaSqlIndexer(String url) throws IOException, SolrServerException {
        // creates a channel with the Solr server
        server = new HttpSolrClient(url);
        //server.setParser(new XMLResponseParser());
        autoParser = new AutoDetectParser();
        pdfParser = new PDFParser();
    }

    private void Index() throws SQLException, SolrServerException {
        Connection con = null;
        try {
            Class.forName("com.mysql.jdbc.Driver").newInstance();
            log("Driver Loaded ..");
            String URL = "jdbc:mysql://localhost:3306/EDMS_Metadata";
Re: SolrJ/Tika custom indexer not indexing CERTAIN .doc text?
I posted the code anyway just forgot to get rid of that line in the post. Sorry -- View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-Tika-custom-indexer-not-indexing-CERTAIN-doc-text-tp4216541p4216542.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: LowerCaseFilterFactory burns CPU
That should be fixable. In a past life, I generated a perfect hash to fold case for Unicode in a locale-neutral manner and it was very fast. If I remember right, there are only about 2500 Unicode characters that can be case folded at all. So the generated, collision-free hash function was very small and fast and the lookup table was small. I used Bob Jenkins' tool suite for a C application. http://burtleburtle.net/bob/hash/perfect.html But there are a number of other open source tools available. Bob Jenkins currently recommends this one by Botelho and Ziviani: http://homepages.dcc.ufmg.br/~nivio/papers/cikm07.pdf -Original Message- From: Nir Barel [mailto:ni...@checkpoint.com] Sent: Thursday, July 09, 2015 4:35 AM To: solr-user@lucene.apache.org Subject: RE: Do I really need copyField when my app can do the copy? Hi, I wants to add a question regarding copyField and LowerCaseFilterFactory We notice that LowerCaseFilterFactory takes huge part of the CPU ( via profiling ) for the text filed Can we avoid it or improve that implementation? ( keeping the insensitive case search ) Best Regards, Nir Barel -Original Message- From: Petersen, Robert [mailto:robert.peter...@rakuten.com] Sent: Thursday, July 09, 2015 1:59 AM To: solr-user@lucene.apache.org Subject: RE: Do I really need copyField when my app can do the copy? Perhaps some people like maybe those using DIH to feed their index might not have that luxury and copyfield is the better way for them. If you have an application you can do it either way. I have done both ways in different situations. Robi -Original Message- From: Steven White [mailto:swhite4...@gmail.com] Sent: Wednesday, July 08, 2015 3:38 PM To: solr-user@lucene.apache.org Subject: Do I really need copyField when my app can do the copy? Hi Everyone, What good is the use of copyField in Solr's schema.xml if my application can do it into the designated field? Having my application do so helps me simplify the schema.xml maintains task thus my motivation. Thanks Steve &j)ly˫y " * This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it. TIAA-CREF *
Re: Do I really need copyField when my app can do the copy?
On 7/9/2015 2:35 AM, Nir Barel wrote: > I wants to add a question regarding copyField and LowerCaseFilterFactory > We notice that LowerCaseFilterFactory takes huge part of the CPU ( via > profiling ) for the text filed > Can we avoid it or improve that implementation? ( keeping the insensitive > case search ) I don't know what the CPU usage is like compared to LCF, but I use ICUFoldingFilterFactory instead. This does several things in one pass, including lowercasing (which it calls case folding), and it is aware of the all characters in Unicode. https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUFoldingFilterFactory The ICU classes require additional jars to be loaded into Solr before they will work. Thanks, Shawn
RE: Spell checking the synonym list?
Ryan, If you use index-time synonyms on the spellcheck field, this will give you what you want. For instance, if the document has "lawyer" and you index both terms "lawyer","attorney", then the spellchecker will see that "atorney" is 1 edit away from an indexed term and will suggest "attorney". You'll need to have the same synonyms set up against the query field, but you have the option of making these query-time synonyms if you prefer. James Dyer Ingram Content Group -Original Message- From: Ryan Yacyshyn [mailto:ryan.yacys...@gmail.com] Sent: Thursday, July 09, 2015 2:28 AM To: solr-user@lucene.apache.org Subject: Spell checking the synonym list? Hi all, I'm wondering if it's possible to have spell checking performed on terms in the synonym list? For example, let's say I have documents with the word "lawyer" in them and I add "lawyer, attorney" in the synonyms.txt file. Then a query is made for the word "atorney". Is there any way to provide spell checking on this? Thanks, Ryan
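A rough SolrJ sketch of querying such a setup. It assumes a /spell request handler that wires in the spellcheck component against the synonym-expanded field; the handler name, core name, and spellcheck.count value are illustrative, not something from the original mails.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SpellCheckResponse;

HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/mycore");
SolrQuery q = new SolrQuery("atorney");
q.setRequestHandler("/spell");       // handler configured with the spellcheck component (assumption)
q.set("spellcheck", true);
q.set("spellcheck.count", 5);
QueryResponse rsp = client.query(q);

SpellCheckResponse spell = rsp.getSpellCheckResponse();
if (spell != null && !spell.isCorrectlySpelled()) {
    // with index-time synonyms on the spellcheck field, "attorney" should show up as a suggestion
    for (SpellCheckResponse.Suggestion s : spell.getSuggestions()) {
        System.out.println(s.getToken() + " -> " + s.getAlternatives());
    }
}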
Class loading problem from ${solr.solr.home}/lib in 5.2.1
I was having a problem in a 4.x version of Solr and wanted to check 5.2.1 to see if it still had the same problem, so I copied my fieldType into a 5.2.1 example schema. My fieldType uses some ICU analysis classes, so I also put the contrib jars into server/solr/lib. I ran into a problem similar to SOLR-4852. https://issues.apache.org/jira/browse/SOLR-4852 The Solr log shows the icu jars being loaded twice, and I got a class not found exception for the ICU classes. If I change the schema from the "short" factory class name (solr.ICUTokenizerFactory) to the full class name (org.apache.lucene.analysis.icu.segmentation.ICUTokenizerFactory), then it works. I'm guessing that the root of the problem is the fact that the jars are loaded twice ... but I cannot see any reason for them to be loaded twice. They are not mentioned in any <lib> directives in solrconfig.xml, and there is no sharedLib declaration in solr.xml. The core that I created was using the techproducts sample config. Should I open an issue for this problem? I can provide a very clear step-by-step minimal process for showing the problem with a stock 5.2.1 download. Thanks, Shawn
RE: LowerCaseFilterFactory burns CPU
Combining under new subject to reflect new question. Took a quick look at both the LowerCaseFilter and Java implementation it uses. A perfect hash would be much faster and, since LowerCaseFilter does not consider locale, applicable. ICUFoldingFilter is a somewhat different animal. But I take your point, for an indexed/searchable field (vs. stored/returned) that may contain accented characters from a wider variety of locales, it makes a lot of sense. It seems like a single filter would perform the tasks that we use 3 filters to do. Have you ever looked at the ICU internals? Are they fairly efficient wrt character attributes and folding? I used ICU C++ libs a while back and they were never a bottleneck, but that doesn't mean they wouldn't be in this context or that the Java libs have the same performance characteristics. We use the following for US-only content (which may be a common use case): Thus my interest in the question. -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Thursday, July 09, 2015 9:55 AM To: solr-user@lucene.apache.org Subject: Re: Do I really need copyField when my app can do the copy? I don't know what the CPU usage is like compared to LCF, but I use ICUFoldingFilterFactory instead. This does several things in one pass, including lowercasing (which it calls case folding), and it is aware of the all characters in Unicode. https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUFoldingFilterFactory The ICU classes require additional jars to be loaded into Solr before they will work. Thanks, Shawn -Original Message- From: Reitzel, Charles [mailto:charles.reit...@tiaa-cref.org] Sent: Thursday, July 09, 2015 9:47 AM To: solr-user@lucene.apache.org Subject: RE: LowerCaseFilterFactory burns CPU That should be fixable. In a past life, I generated a perfect hash to fold case for Unicode in a locale-neutral manner and it was very fast. If I remember right, there are only about 2500 Unicode characters that can be case folded at all. So the generated, collision-free hash function was very small and fast and the lookup table was small. I used Bob Jenkins' tool suite for a C application. http://burtleburtle.net/bob/hash/perfect.html But there are a number of other open source tools available. Bob Jenkins currently recommends this one by Botelho and Ziviani: http://homepages.dcc.ufmg.br/~nivio/papers/cikm07.pdf -Original Message- From: Nir Barel [mailto:ni...@checkpoint.com] Sent: Thursday, July 09, 2015 4:35 AM To: solr-user@lucene.apache.org Subject: RE: Do I really need copyField when my app can do the copy? Hi, I wants to add a question regarding copyField and LowerCaseFilterFactory We notice that LowerCaseFilterFactory takes huge part of the CPU ( via profiling ) for the text filed Can we avoid it or improve that implementation? ( keeping the insensitive case search ) Best Regards, Nir Barel * This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it. TIAA-CREF *
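One way to answer the efficiency question outside Solr is to time a bare Lucene analysis chain with each filter. A sketch, assuming Lucene 5.x's CustomAnalyzer and the analysis-extras/ICU jars on the classpath; the sample text and iteration count are arbitrary and the numbers are only indicative:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.custom.CustomAnalyzer;

public class FoldTimer {
    public static void main(String[] args) throws Exception {
        Analyzer lower = CustomAnalyzer.builder()
                .withTokenizer("standard")
                .addTokenFilter("lowercase")
                .build();
        Analyzer folding = CustomAnalyzer.builder()
                .withTokenizer("standard")
                .addTokenFilter("icuFolding")   // requires the ICU analysis jars
                .build();

        String text = "Sample TEXT with Accented Fields to analyze over and over";
        String[] names = { "lowercase", "icuFolding" };
        Analyzer[] analyzers = { lower, folding };

        for (int n = 0; n < analyzers.length; n++) {
            long start = System.nanoTime();
            for (int i = 0; i < 100000; i++) {
                try (TokenStream ts = analyzers[n].tokenStream("f", text)) {
                    ts.reset();
                    while (ts.incrementToken()) { /* consume tokens */ }
                    ts.end();
                }
            }
            System.out.println(names[n] + ": " + (System.nanoTime() - start) / 1000000 + " ms");
        }
    }
}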
[JOB] Financial search engine company AlphaSense is looking for Search Engineers
Company: AlphaSense https://www.alpha-sense.com/ Position: Search Engineer AlphaSense is a one-stop financial search engine for financial research analysts all around the world. AlphaSense is looking for Search Engineers experienced with Lucene / Solr and search architectures in general. Positions are open in Helsinki ( http://www.visitfinland.com/helsinki/). Daily routine topics for our search team: 1. Sharding 2. Commit vs query performance 3. Performance benchmarking 4. Custom query syntax, lucene / solr grammars 5. Relevancy 6. Query optimization 7. Search system monitoring: cache, RAM, throughput etc 8. Automatic deployment 9. Internal tool development We have evolved the system through a series of Solr releases starting from 1.4 to 4.10. Requirements: 1. Core Java + web services 2. Understanding of distributed search engine architecture 3. Java concurrency 4. Understanding of performance issues and their solutions 5. Clean and beautiful code + design patterns Our search team members are active in the open source search scene, in particular we support and develop luke toolbox: https://github.com/dmitrykey/luke, participate in search / OS conferences (Lucene Revolution, ApacheCon, Berlin buzzwords), review books on Solr. Send your CV over and let's have a chat. -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info
How to determine cache setting in Solr Search Instance
Hi All, I did a load test with a total of 800 requests (at 40 concurrent requests per second) against a Solr index with 14 M records. Performance was good (< 1 second), especially after the test had been running for a short while. BTW, the second round of the load test was even better. The local machine had about 15 GB of free memory during the load test. I observed the following from the stats page: (1) documentCache reached the configured size for documentCache with a hit ratio of 0.66 (2) filterCache has 2519 hits with a hit ratio of 0.63. The size is 1465 (less than the configured size: 16384) (3) queryResultCache has a hit ratio of 0 (4) fieldValueCache has a hit ratio of 0. Below is the cache configuration from solrconfig.xml. It looks like I need to increase the size of documentCache. The hit ratio of zero for queryResultCache and fieldValueCache was surprising. Is it possible that this is due to randomly generated requests? What are some guidelines for tuning the cache parameters? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-determine-cache-setting-in-Solr-Search-Instance-tp4216562.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Data from CatalogGroup and CatalogEntry cores
You can try using the "shards" parameter. The problem will be, though, that the score calculations may not really be comparable... Best, Erick On Thu, Jul 9, 2015 at 3:40 AM, santosh sidnal wrote: > Hi All, > > Is there a way to get a combined data from 2 different cores together in a > single call? > > > like a data form both CatalogEntry and CatalogGroup cores in a single call > to solr. > > > > -- > Regards, > Santosh Sidnal
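A small SolrJ sketch of what Erick describes, using the poster's core names. This is only illustrative: the cores need compatible schemas and a shared unique key field for a merged result to make sense, and the caveat about scores still applies.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/CatalogEntry");
SolrQuery q = new SolrQuery("*:*");
// fan the request out over both cores; results come back merged in one response
q.set("shards", "localhost:8983/solr/CatalogEntry,localhost:8983/solr/CatalogGroup");
QueryResponse rsp = client.query(q);
System.out.println("Combined hits: " + rsp.getResults().getNumFound());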
Re: SolrJ/Tika custom indexer not indexing CERTAIN .doc text?
Wow, that code looks familiar ;)... Anyway, what have you tried? bq: It would pull it but when I got the results in Solr it would look blank How do you know this? Do _some_ docs have text in Solr but some don't or are all of your text fields blank? In this case I suspect you're not storing the data. What I'd do is isolate just the one file and look at the processing in the debugger to see if any text is extracted. Then I'd look at the doc in Word (or whatever) to insure that there _is_ text in it. Then. Perhaps the program is swallowing the error. Perhaps the file is mal-formed and isn't being analyzed appropriately. Perhaps the file isn't there at all. And sending one doc to Solr at a time isn't very efficient, but perhaps some of your files are so big that it's better that way. Best, Erick On Thu, Jul 9, 2015 at 6:36 AM, Paden wrote: > I posted the code anyway just forgot to get rid of that line in the post. > Sorry > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SolrJ-Tika-custom-indexer-not-indexing-CERTAIN-doc-text-tp4216541p4216542.html > Sent from the Solr - User mailing list archive at Nabble.com.
RE: Can I instruct the Tika Entity Processor to skip the first page using the DIH?
Concur on both points. You can also use PDFBox's app "ExtractText" with -startPage and -endPage parameters: https://pdfbox.apache.org/1.8/commandline.html#extractText -Original Message- From: Charlie Hull [mailto:char...@flax.co.uk] Sent: Thursday, July 09, 2015 3:55 AM To: solr-user@lucene.apache.org Subject: Re: Can I instruct the Tika Entity Processor to skip the first page using the DIH? On 08/07/2015 20:39, Allison, Timothy B. wrote: > Unfortunately, no. We can't even do that now with straight Tika. I > imagine this is for pdf files? If you'd like to add this as a > feature, please submit a ticket over on Tika. Another alternative is to pre-process the PDF files to remove the first page. I've used the command line version of PDFtk for this kind of thing in the past: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/ I'd also recommend using Tika outside Solr rather than via the DIH: certain nasty PDFs can kill Tika, which then can kill Solr. Charlie > > -Original Message- From: Paden [mailto:rumsey...@gmail.com] > Sent: Wednesday, July 08, 2015 12:14 PM To: > solr-user@lucene.apache.org Subject: Can I instruct the Tika Entity > Processor to skip the first page using the DIH? > > Hello, I'm using the DIH to import some files from one of my local > directories. However, every single one of these files has the same > first page. So I want to skip that first page in order to optimize > search. > > Can this be accomplished by an instruction within the > dataimporthandler or, if not, how could you do this? > > > > -- View this message in context: > http://lucene.472066.n3.nabble.com/Can-I-instruct-the-Tika-Entity-Processor-to-skip-the-first-page-using-the-DIH-tp4216373.html > > Sent from the Solr - User mailing list archive at Nabble.com. > -- Charlie Hull Flax - Open Source Enterprise Search tel/fax: +44 (0)8700 118334 mobile: +44 (0)7767 825828 web: www.flax.co.uk
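The same idea done programmatically, as a rough PDFBox sketch (PDFBox 1.8-style imports; the class moved to org.apache.pdfbox.text in 2.x, and the file path is illustrative):

import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;

PDDocument doc = PDDocument.load(new File("/path/to/input.pdf"));
try {
    PDFTextStripper stripper = new PDFTextStripper();
    stripper.setStartPage(2);                     // skip the boilerplate first page
    stripper.setEndPage(doc.getNumberOfPages());
    String text = stripper.getText(doc);          // index this instead of the full extract
    System.out.println(text);
} finally {
    doc.close();
}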
Re: How to determine cache setting in Solr Search Instance
On 7/9/2015 9:48 AM, wwang525 wrote: > I did a load test with a total of 800 requests (at 40 concurrent requests > per second) to be executed against Solr index with 14 M records. Performance > was good (< 1 second) especially after a short period of time of the test. > BTW, the second round of load test was even better. > > The local machine has a free memory of about 15 G during the load test > > I observed the following from the stats page: > > (1) documentCache reached the configured size for documentCache with a hit > ratio of 0.66 > (2) filterCache has 2519 hits with a hit ratio of 0.63. The size is 1465 > (less than a configured size: 16384) > (3) queryResultCache has a hit ratio of 0 > (4) fieldValueCache has a hit ratio of 0 > > The following are the cache configuration in solrconfig.xml > > class="solr.LRUCache" > size="16384" > initialSize="512" > autowarmCount="0"/> > > >class="solr.LRUCache" > size="16384" > initialSize="4096" > autowarmCount="256"/> > > >class="solr.LRUCache" > size="16384" > initialSize="4096" > autowarmCount="256"/> > > It looks like I need to increase the size of documentCache. The hit ratio of > zero for queryResultCache and fieldValueCache was surprising (zero). Is it > possible that this is due to randomly generated requests? I would say that a hitrate of 0.66 for your documentCache is pretty good. You can increase the size to try and make it better, but that will use more java heap memory. Note that I am not specifically talking about the 15GB of free memory that you mentioned -- that's the operating system, not Java. Depending on what exactly is happening with Java's memory management, that additional heap memory *might* come out of the 15GB, but it might not. Cache sizes of 16384 are VERY large. Your filterCache could actually be dropped quite a bit, because it only has 1465 entries. The autowarmCount settings are a little high. Executing 256 filters (which would be required on every single commit) can be slow. You may want to use FastLRUCache instead of LRUCache for documentCache and filterCache, because your hitrate is good. LRUCache is better when the hitrate is lower. The fieldValueCache is an internal cache that Solr and Lucene use and manage. I don't know much about it. If you don't send the same query more than once, then your queryResultCache will have a hitrate of zero. Thanks, Shawn
Re: How to determine cache setting in Solr Search Instance
It's actually unlikely that increasing the documentCache will help materially. It's primarily so various components won't have to fetch the documents off disk for a _single_ request. I've heard some anecdotal evidence that it helps in some situations, but that's been rare in my experience. Your filterCache is a bomb waiting to go off. Each filterCache entry can be up to maxDoc/8 bytes long, so if you let you don't re-open searchers (thus causing it to flush), that cache could eventually grow to on the order of 28G. The autowarm count is also quite high. I'd start by knocking these back a _lot_. Also, look at why your hit ratio is this low. If you have a lot of fq clauses that you _know_ aren't going to be re-used, use {!cache=false}. If you are using NOW, that's an anti-pattern, see: http://lucidworks.com/blog/date-math-now-and-filter-queries/ queryResultCache is largely used for paging, having 0 hits is a good thing for stress tests since it insures that you're not getting false stats because of this. But there's no good reason to have this over 16K max. It will be significantly smaller than the filterCache, but still a waste. Again, the autowarm count is very high. Taking it back to 32 or so is where I'd start. bq: especially after a short period of time of the test. right, this is due to filling up the lower-level caches. If you haven't restarted your system then the autowarming will fill them up. But you can also specify newSearcher and firstSearcher queries to do the same thing at far less expense than executing 256 queries every time you open a new searcher. In summary, your settings are far too high for most people, I'd start by examining the usage rather than making them bigger. Best, Erick On Thu, Jul 9, 2015 at 8:48 AM, wwang525 wrote: > Hi All, > > I did a load test with a total of 800 requests (at 40 concurrent requests > per second) to be executed against Solr index with 14 M records. Performance > was good (< 1 second) especially after a short period of time of the test. > BTW, the second round of load test was even better. > > The local machine has a free memory of about 15 G during the load test > > I observed the following from the stats page: > > (1) documentCache reached the configured size for documentCache with a hit > ratio of 0.66 > (2) filterCache has 2519 hits with a hit ratio of 0.63. The size is 1465 > (less than a configured size: 16384) > (3) queryResultCache has a hit ratio of 0 > (4) fieldValueCache has a hit ratio of 0 > > The following are the cache configuration in solrconfig.xml > > class="solr.LRUCache" > size="16384" > initialSize="512" > autowarmCount="0"/> > > >class="solr.LRUCache" > size="16384" > initialSize="4096" > autowarmCount="256"/> > > >class="solr.LRUCache" > size="16384" > initialSize="4096" > autowarmCount="256"/> > > It looks like I need to increase the size of documentCache. The hit ratio of > zero for queryResultCache and fieldValueCache was surprising (zero). Is it > possible that this is due to randomly generated requests? > > What are some guideline in tuning the cache parameter? > > Thanks > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/How-to-determine-cache-setting-in-Solr-Search-Instance-tp4216562.html > Sent from the Solr - User mailing list archive at Nabble.com.
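To illustrate the two points about filter queries from the client side (field names and ranges below are made up, not from the poster's schema):

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery q = new SolrQuery("some user query");
// Reusable filter: rounded date math, so the same fq string (and cache entry) is reused all day
q.addFilterQuery("departure_date:[NOW/DAY TO NOW/DAY+30DAYS]");
// One-off filter that would only pollute the filterCache: tell Solr not to cache it
q.addFilterQuery("{!cache=false}price:[100 TO 500]");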
Re: SolrJ/Tika custom indexer not indexing CERTAIN .doc text?
Haha, no need to reinvent wheels. Especially when you don't know Java. Just a prototype anyway.

I made a very strong assumption that it was pulling the text as blank because I would copy the EXACT same text from one file in the file system and put it into another file under a different name, but instead of it showing as

{
  Author: "Some author"
  text: "blank"
}

it would show as

{
  Author: "Some author"
  text: "text that should have shown up in the other file but appeared as blank"
}

But I'm more familiar with Solr now than I was about 4 weeks ago, so I'll run that debugger and see if I can find something that's a problem. I just find it weird that it was ONLY .doc files, and when I put the text into another .doc it actually pulled. Thanks for the post and let me know if there's any new info I should know. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-Tika-custom-indexer-not-indexing-CERTAIN-doc-text-tp4216541p4216576.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Problem with distributed search using grouping and highlighting
Rich, I've run into various problems with group.query and highlighting. You noted one below (SOLR-5046), and there is also SOLR-6712, which might be related to what you are experiencing. Still waiting for that patch to be reviewed... -Original Message- From: Rich Hume [mailto:rh...@identifix.com] Sent: Monday, June 08, 2015 2:23 PM To: solr-user@lucene.apache.org Subject: Problem with distributed search using grouping and highlighting I am currently using Solr 4.5.1. In the hopes of seeing better query performance, I have sharded an index and I am trying to use the shards parameter along with grouping and highlighting. I am not currently using Solr cloud. I got past an earlier problem by adding a second sort parameter (as described in JIRA Solr-5046). Unfortunately, I have found nothing related to my latest index out of bounds problem. I do not believe that JIRA Solr-5709 is related since my unique keys are in fact unique across the shards. If anyone can point out something that I am doing wrong it would be greatly appreciated. Thanks, Rich I am seeing the following error, the parameters I am passing are below the stack trace. null:java.lang.ArrayIndexOutOfBoundsException: 35 at org.apache.solr.handler.component.HighlightComponent.finishStage(HighlightComponent.java:185) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) Here are the parameters I am passing: group=true&group.offset=0&group.limit=10&group.field=DeDup &group.query=DocumentTypes:34&group.query=DocumentTypes:35&group.query=DocumentTypes:32 &shards=localhost:8983/solr/IX1,localhost:8983/solr/IX2 &fq=+DocumentTypes:(34 35 32) &defType=edismax&qf=csTitle^100 csContent &q="any matching search string" &start=0&rows=10 &fl=PageNumber,FilePath,DocumentGUID,ResultDisplayContent,DocumentTypes &sort=score desc,DocumentGUID asc &hl=on &hl.fl=csTitle,csContent
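For reference, the same request expressed through SolrJ, built from the parameters in the original mail. This is only a construction sketch; it does not work around the highlighting issues (SOLR-5046, SOLR-6712) discussed above.

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery q = new SolrQuery("\"any matching search string\"");
q.set("defType", "edismax");
q.set("qf", "csTitle^100 csContent");
q.set("group", true);
q.set("group.offset", 0);
q.set("group.limit", 10);
q.set("group.field", "DeDup");
q.add("group.query", "DocumentTypes:34", "DocumentTypes:35", "DocumentTypes:32");
q.set("shards", "localhost:8983/solr/IX1,localhost:8983/solr/IX2");
q.addFilterQuery("+DocumentTypes:(34 35 32)");
q.setHighlight(true);
q.set("hl.fl", "csTitle,csContent");
q.addSort("score", SolrQuery.ORDER.desc);
q.addSort("DocumentGUID", SolrQuery.ORDER.asc);   // second sort, per SOLR-5046 workaround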
Re: SolrJ/Tika custom indexer not indexing CERTAIN .doc text?
I rather doubt that it's a Solr issue. Text is text after all. If some docs display text, then it's probably a matter of not getting the text in the first place. My _guess_ is that you're not getting any text at all from the document. Either the document isn't being found or it's not a form that Tika expects (perhaps the file's extension has been changed and it's really an Libre Office file. Or Tika has a bug. Or your database doesn't have a value for TextContentURL. Or... So what I'd do, since you know the name of the file in question is print out what text you get from it to try to put in the Solr doc and go from there. Best, Erick On Thu, Jul 9, 2015 at 9:59 AM, Paden wrote: > Haha no need to reinvent wheels. Especially when you don't know java. Just a > prototype anyway. > > I made a very strong assumption that it was pulling the text as blank > because I would copy the EXACT same text from one file in the file system > and put it into another file under a different name, but instead of it show > as > > } > Author:"Some author" > text:"blank" > } > > It would show as > > } > Author:"Some author" > text:"text that should have shown up in the other file but appeared as > blank"' > } > > But I'm a more familiar with solr now than I was about 4 weeks ago so I'll > run that debugger and see if I can find something that's a problem. I just > find it weird that it was ONLY .doc files and when I put it into another > .doc it actually pulled. Thanks for the post and let me know if there's any > new info I should know. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SolrJ-Tika-custom-indexer-not-indexing-CERTAIN-doc-text-tp4216541p4216576.html > Sent from the Solr - User mailing list archive at Nabble.com.
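Following Erick's suggestion, a quick standalone Tika check on one of the problem files might look like the sketch below. The file name is hypothetical, and BodyContentHandler(-1) removes the default character limit so a large document isn't silently truncated.

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

File f = new File("/home/paden/Documents/LWP_Files/BIGDATA/problem.doc");
try (InputStream in = new FileInputStream(f)) {
    BodyContentHandler handler = new BodyContentHandler(-1);
    Metadata md = new Metadata();
    new AutoDetectParser().parse(in, handler, md, new ParseContext());
    System.out.println("Detected type: " + md.get("Content-Type"));
    System.out.println("Extracted chars: " + handler.toString().length());
    System.out.println(handler.toString());
}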
Re: How to determine cache setting in Solr Search Instance
Hi, The real production requests will not be randomly generated, and a lot of requests will be repeated. I think the performance will be better due to the repeated requests. In addition, I am sure the configuration will need to be adjusted once the application is in production. For the time being, I can drop the size of filterCache to 4096 or 2048 since it is now only 1465 in the stats page. I forgot to mention that the size I saw in the stats page for documentCache is already 16384 after the test, and this is the configured size in solrconfig.xml. This is why I was asking if I need to raise the number in the configuration. Is there any issue or will there be any performance improvement if I raise up the size for documentCache? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-determine-cache-setting-in-Solr-Search-Instance-tp4216562p4216591.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr Grouping - sorting groups based on the sum of the scores of the documents within each group
Hi, I have a similar use case and am still looking for a solution. I have posted a question about it on Stack Overflow ( http://stackoverflow.com/questions/31281640/sum-field-and-sort-on-solr ). Did you solve it? Regards. -- Emilio Borraz *Back-end Developer* emilio.bor...@sonatasmx.com
Re: How to determine cache setting in Solr Search Instance
I'd examine the filter queries used to see whether they make sense as well. You really have to re-tune after you start getting real user queries though as anything you generate won't reflect reality. I'd start _much_ smaller, 512 or 1024 and work _up_ with real data. Raising the document cache limit is not where I'd start. Given that the docs will eventually be held in MMapDirectory space (i.e. O/S system memory) probably anyway all you're _really_ saving is decompression upon occasion. Best, Erick On Thu, Jul 9, 2015 at 10:58 AM, wwang525 wrote: > Hi, > > The real production requests will not be randomly generated, and a lot of > requests will be repeated. I think the performance will be better due to the > repeated requests. In addition, I am sure the configuration will need to be > adjusted once the application is in production. > > For the time being, I can drop the size of filterCache to 4096 or 2048 since > it is now only 1465 in the stats page. > > I forgot to mention that the size I saw in the stats page for documentCache > is already 16384 after the test, and this is the configured size in > solrconfig.xml. This is why I was asking if I need to raise the number in > the configuration. > > Is there any issue or will there be any performance improvement if I raise > up the size for documentCache? > > Thanks > > > > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/How-to-determine-cache-setting-in-Solr-Search-Instance-tp4216562p4216591.html > Sent from the Solr - User mailing list archive at Nabble.com.
Re: Windows Version
Thanks for all your help. I decided to switch to Ubuntu linux. Allan elkowitzelkow...@alumni.caltech.edu On Wednesday, July 8, 2015 1:44 AM, Shawn Heisey wrote: On 7/7/2015 10:43 AM, Allan Elkowitz wrote: > So I am a newbie at Solr and am having trouble getting the examples working > on Windows 7. > I downloaded and unzipped the distribution and have been able to get Solr up > and running. I can access the admin page. However, when I try to follow the > instructions for loading the examples I find that there is a file that I am > supposed to have called post.jar which I cannot find in the directory > specified, exampledocs. There is a file called "post" in another directory > but it does not seem to be a .jar file. > Two questions: > 1. Has this been addressed on some site that I am not yet aware of? > 2. What am I missing here? The post.jar file is in example\exampledocs in the Solr 5.2.1 download. The "bin\post" file is a shell script for Linux/UNIX systems that offers easier access to the SimplePostTool class included in the solr-core jar. Unfortunately, no Windows equivalent (post.cmd) exists yet. If you're getting the impression that Windows is a second-class citizen around here, you are not really wrong. A typical Solr user has found that the free operating systems offer better performance and stability, with the added advantage that they don't have to pay Microsoft a pile of money in order to get useful work done. Windows, especially the server operating systems, is a perfectly good platform, but it's not free. Thanks, Shawn
Get content in response from ExtractingRequestHandler
Hi everyone, I use Solr to index and search office files (docx, pptx, ...). To reduce the size of the Solr index, I do not store the content of the files in Solr; however, my customer now wants to preview the content of the files. I have read the documentation of ExtractingRequestHandler, but it seems that to return content in the response from Solr, the only option is to set extractOnly=true, and in that case Solr does not index the file. My question is: is there any way for Solr to extract the content with Tika, index the content (without storing it) and then give me the content in the response? Thanks in advance, and sorry if my explanation is confusing. Trung.
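One possible approach, in the spirit of the earlier advice in this digest to run Tika outside Solr: extract the text in the client, index it as a non-stored field, and hand the same extracted string back to the preview UI (or a cheaper store) yourself. A sketch only; the core name, field names, and the indexed=true/stored=false "content" field in schema.xml are assumptions, not an existing setup.

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

HttpSolrClient solr = new HttpSolrClient("http://localhost:8983/solr/mycore");
File f = new File("/path/to/presentation.pptx");

String text;
try (InputStream in = new FileInputStream(f)) {
    BodyContentHandler handler = new BodyContentHandler(-1);
    new AutoDetectParser().parse(in, handler, new Metadata(), new ParseContext());
    text = handler.toString();
}

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", f.getName());
doc.addField("content", text);   // assumed indexed=true, stored=false in schema.xml
solr.add(doc);
solr.commit();

// "text" is still in hand here, so the application can return it for previewing
// without it ever being stored in the Solr index.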
LogTransformer
I want to log the queries run through the DIH. Should I use LogTransformer to do that?
RE: LogTransformer
One thing I noted is that you need to give the full package name when specifying the transformer. Like I have added below.

Hope this will help you.

Thanks,
Jagdish

-Original Message-
From: Midas A [mailto:test.mi...@gmail.com]
Sent: Friday, July 10, 2015 11:08 AM
To: solr-user@lucene.apache.org
Subject: LogTransformer

I want to log query running through DIH should i use LogTransformer to do that
Re: LogTransformer
Hi Jagdish, not working for me. On Fri, Jul 10, 2015 at 11:21 AM, Jagdish Vasani wrote: > One thing I noted that you need to give full package detail while > mentioning transformer. > Like, I have added bellow > > Hope this will help you. > > Thanks, > Jagdish > -Original Message- > From: Midas A [mailto:test.mi...@gmail.com] > Sent: Friday, July 10, 2015 11:08 AM > To: solr-user@lucene.apache.org > Subject: LogTransformer > > I want to log query running through DIH should i use LogTransformer to do > that > > > logLevel="info" name="products" pk="product_id" query = "SELECT > p.product_id > > in log i am getting text "Query:" but not query variable > > my Solr version : 4.2 > > > Please correct me what is wrong .or other ways to do this . > > Regards, > Abhishek >