Apache Solr Reference Guide 5.0
Greetings,

I was looking at the PDF version of the Apache Solr Reference Guide 5.0 and noticed that it has no TOC nor any section numbering.

http://apache.claz.org/lucene/solr/ref-guide/apache-solr-ref-guide-5.0.pdf

The lack of a TOC and section numbering makes navigation difficult. I have just started making suggestions on the documentation and was wondering if there is a reason (not apparent from the document itself) why the TOC and section numbering are missing?

Thanks! Hope everyone is nearing a great weekend!

Patrick
Re: Apache Solr Reference Guide 5.0
Shawn,

Thanks! I was using Document Viewer rather than Adobe Acrobat, so that wasn't clear to me. The TOC I meant was the kind found in a traditional print publication, with section numbers and so on, not a navigation-only TOC without numbering like the one Adobe displays.

The Confluence documentation here (I can't see the actual stylesheet in use, I don't think):

https://confluence.atlassian.com/display/DOC/Customising+Exports+to+PDF

says:

* Disabling the Table of Contents: To prevent the table of contents from being generated in your PDF document, add the div.toc-macro rule to the PDF Stylesheet and set its display property to none.

That is why I was asking whether there was a reason for the TOC and section numbering not appearing. They can be suppressed, but that does not appear to be the default setting.

This came up because a section said it would cover topics N - S and I could not determine whether all of those topics fell in that section or not.

Thanks! Hope you are having a great day!

Patrick

On 03/06/2015 12:28 PM, Shawn Heisey wrote:
> On 3/6/2015 10:20 AM, Patrick Durusau wrote:
>> I was looking at the PDF version of the Apache Solr Reference Guide 5.0 and noticed that it has no TOC nor any section numbering.
>>
>> http://apache.claz.org/lucene/solr/ref-guide/apache-solr-ref-guide-5.0.pdf
>>
>> The lack of a TOC and section numbering makes navigation difficult. I have just started making suggestions on the documentation and was wondering if there is a reason (not apparent from the document itself) why the TOC and section numbering are missing?
>
> The TOC is built into the PDF and it's up to the PDF viewer to display it. Here's a screenshot of the ref guide in Adobe Reader with a clickable TOC open:
>
> https://www.dropbox.com/s/3ajuri1emj61imu/refguide-5.0-TOC.png?dl=0
>
> Section numbering might be a good idea, if it's not too intrusive or difficult.
>
> Thanks,
> Shawn
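For reference, the rule the quoted Confluence page describes would look something like the following in the PDF Stylesheet. This is only a sketch based on that page's instructions; the div.toc-macro selector comes from the Confluence documentation and has not been checked against the stylesheet the Solr ref guide actually uses.

    /* Hide the generated table of contents in PDF exports,
       per the Confluence documentation quoted above. */
    div.toc-macro {
        display: none;
    }

The point in the thread is the reverse: since hiding the TOC requires deliberately adding a rule like this, a PDF export with no visible TOC and no section numbering does not look like the default behavior.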
Solr-3.5.0/Nutch-1.4 - SolrDeleteDuplicates fails
Greetings!

This may be a Nutch question and if so, I will repost to the Nutch list.

I can run the following commands with Solr-3.5.0/Nutch-1.4:

  bin/nutch crawl urls -dir crawl -depth 3 -topN 5

then:

  bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

successfully. But if I run:

  bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5

it fails with the following messages:

SolrIndexer: starting at 2011-12-11 14:01:27
Adding 11 documents
SolrIndexer: finished at 2011-12-11 14:01:28, elapsed: 00:00:01
SolrDeleteDuplicates: starting at 2011-12-11 14:01:28
SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

I am running on Ubuntu 10.10 with 12 GB of memory, Java version 1.6.0_26. I can delete the crawl directory and replicate this error consistently.

Suggestions? Other than "...use the way that doesn't fail." ;-)

I am concerned that an invocation of Solr that fails consistently represents something that may cause trouble elsewhere when least expected (and be hard to isolate as the problem).

Thanks! Hope everyone is having a great weekend!

Patrick

PS: From the hadoop log (when it fails), if that's helpful:

2011-12-11 15:21:51,436 INFO  solr.SolrWriter - Adding 11 documents
2011-12-11 15:21:52,250 INFO  solr.SolrIndexer - SolrIndexer: finished at 2011-12-11 15:21:52, elapsed: 00:00:01
2011-12-11 15:21:52,251 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-12-11 15:21:52
2011-12-11 15:21:52,251 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
2011-12-11 15:21:52,330 WARN  mapred.LocalJobRunner - job_local_0020
java.lang.NullPointerException
        at org.apache.hadoop.io.Text.encode(Text.java:388)
        at org.apache.hadoop.io.Text.set(Text.java:178)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)

-- 
Patrick Durusau
patr...@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
OASIS Technical Advisory Board (TAB) - member

Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau
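The NullPointerException thrown from Text.set() in the hadoop log suggests that one of the field values SolrDeleteDuplicates reads back from Solr was null for at least one of the indexed documents. One way to narrow that down is to query Solr directly for the fields the dedup job is commonly understood to rely on (id, boost, tstamp, digest); treat that field list as an assumption, since it is not confirmed anywhere in this thread. A minimal SolrJ 3.x sketch, assuming the same http://localhost:8983/solr/ URL used in the failing command:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class CheckDedupFields {
    public static void main(String[] args) throws Exception {
        // Assumption: same Solr URL as in the failing crawl command above.
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/");

        // Ask only for the fields the dedup job is assumed to read back.
        SolrQuery query = new SolrQuery("*:*");
        query.setFields("id", "boost", "tstamp", "digest");
        query.setRows(100);

        QueryResponse rsp = server.query(query);
        for (SolrDocument doc : rsp.getResults()) {
            // A null digest (or other missing field) here would be consistent
            // with the NPE seen in Text.set() during SolrDeleteDuplicates.
            if (doc.getFieldValue("digest") == null) {
                System.out.println("Missing digest for id=" + doc.getFieldValue("id"));
            }
        }
    }
}

If documents come back with a null digest (or whichever field turns out to be missing), that would point at the Solr schema used for the Nutch index rather than at the crawl command itself, which would explain why the two-step solrindex invocation (which skips the dedup step) succeeds while the one-step crawl fails.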