[ https://issues.apache.org/jira/browse/SOLR-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erick Erickson resolved SOLR-14959.
-----------------------------------
    Resolution: Not A Problem

Please raise questions like this on the user's list, we try to reserve JIRAs for known bugs/enhancements rather than usage questions.

The JIRA system is not a support portal.

See: http://lucene.apache.org/solr/community.html#mailing-lists-irc there are links to both Lucene and Solr mailing lists there.

A _lot_ more people will see your question on that list and may be able to help more quickly.

You might want to review: https://wiki.apache.org/solr/UsingMailingLists

If it's determined that this really is a code issue or enhancement to Lucene or Solr and not a configuration/usage problem, we can raise a new JIRA or reopen this one.

> Getting an error trying to web crawl a website
> ----------------------------------------------
>
>                 Key: SOLR-14959
>                 URL: https://issues.apache.org/jira/browse/SOLR-14959
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: website
>    Affects Versions: 8.6.3
>         Environment: OS: Mac
>
>            Reporter: Ravisher Singh
>            Priority: Major
>              Labels: starter
>
> Hi,
> I am getting following error when trying to crawl a website please direct me
> in right direction.
> Ravishers-MacBook-Air:solr-8.6.3 ravishersingh$ bin/post -c solrhelp
> -filetypes html https://factorpad.com/tech/solr/index.html
> java -classpath
> /Users/ravishersingh/desktop/solr-8.6.3/dist/solr-core-8.6.3.jar -Dauto=yes
> -Dfiletypes=html -Dc=solrhelp -Ddata=web org.apache.solr.util.SimplePostTool
> https://factorpad.com/tech/solr/index.html
> SimplePostTool version 5.0.0
> Posting web pages to Solr url
> http://localhost:8983/solr/solrhelp/update/extract
> Entering auto mode. Indexing pages with content-types corresponding to file
> endings html
> Entering crawl at level 0 (1 links total, 1 new)
> SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url:
> http://localhost:8983/solr/solrhelp/update/extract?literal.id=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html&literal.url=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html
> SimplePostTool: WARNING: Response: <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
> <title>Error 404 Not Found</title>
> </head>
> <body><h2>HTTP ERROR 404 Not Found</h2>
> <table>
> <tr><th>URI:</th><td>/solr/solrhelp/update/extract</td></tr>
> <tr><th>STATUS:</th><td>404</td></tr>
> <tr><th>MESSAGE:</th><td>Not Found</td></tr>
> <tr><th>SERVLET:</th><td>default</td></tr>
> </table>
>
> </body>
> </html>
> SimplePostTool: WARNING: IOException while reading response:
> java.io.FileNotFoundException:
> http://localhost:8983/solr/solrhelp/update/extract?literal.id=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html&literal.url=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html
> SimplePostTool: WARNING: An error occurred while posting
> https://factorpad.com/tech/solr/index.html
> 0 web pages indexed.
> COMMITting Solr index changes to
> http://localhost:8983/solr/solrhelp/update/extract...
> SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url:
> http://localhost:8983/solr/solrhelp/update/extract?commit=true
> SimplePostTool: WARNING: Response: <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
> <title>Error 404 Not Found</title>
> </head>
> <body><h2>HTTP ERROR 404 Not Found</h2>
> <table>
> <tr><th>URI:</th><td>/solr/solrhelp/update/extract</td></tr>
> <tr><th>STATUS:</th><td>404</td></tr>
> <tr><th>MESSAGE:</th><td>Not Found</td></tr>
> <tr><th>SERVLET:</th><td>default</td></tr>
> </table>
>
> </body>
> </html>
> Time spent: 0:00:01.356

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
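For anyone reading the quoted log, it may help to see how the failing request URL breaks down into its parts. The following is only a minimal sketch in Python, not SimplePostTool's actual code: the helper name `extract_url` is hypothetical, and it assumes nothing beyond the URL shape visible in the log (a base Solr URL, the collection name `solrhelp`, the `/update/extract` path, and the crawled page passed as both `literal.id` and `literal.url`). It makes no claim about why the server answered 404.

```python
from urllib.parse import urlencode


def extract_url(base, collection, page):
    """Hypothetical helper: build a URL shaped like the one in the quoted log."""
    # urlencode percent-encodes ':' and '/' in the values, which matches
    # the %3A and %2F sequences seen in the log.
    params = urlencode({"literal.id": page, "literal.url": page})
    return f"{base}/{collection}/update/extract?{params}"


# The URL exactly as it appears in the reporter's log output.
logged = (
    "http://localhost:8983/solr/solrhelp/update/extract"
    "?literal.id=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html"
    "&literal.url=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html"
)

built = extract_url(
    "http://localhost:8983/solr",
    "solrhelp",
    "https://factorpad.com/tech/solr/index.html",
)

# Compare the reconstructed URL with the logged one.
print(built == logged)
```

Read this way, the 404 in the log is reported for the path `/solr/solrhelp/update/extract` itself, which is the detail worth quoting when taking the question to the mailing list.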