[ 
https://issues.apache.org/jira/browse/SOLR-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-14959.
-----------------------------------
    Resolution: Not A Problem

Please raise questions like this on the users' list. We try to reserve JIRAs 
for known bugs/enhancements rather than usage questions; the JIRA system is not 
a support portal.

See: 
http://lucene.apache.org/solr/community.html#mailing-lists-irc
That page has links to both the Lucene and Solr mailing lists.

A _lot_ more people will see your question on that list and may be able to help 
more quickly.

You might want to review: 
https://wiki.apache.org/solr/UsingMailingLists

If it's determined that this really is a code issue or enhancement to Lucene or 
Solr and not a configuration/usage problem, we can raise a new JIRA or reopen 
this one.



> Getting an error trying to web crawl a website
> ----------------------------------------------
>
>                 Key: SOLR-14959
>                 URL: https://issues.apache.org/jira/browse/SOLR-14959
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: website
>    Affects Versions: 8.6.3
>         Environment: OS: Mac
>  
>            Reporter: Ravisher Singh
>            Priority: Major
>              Labels: starter
>
> Hi,
> I am getting the following error when trying to crawl a website. Please 
> point me in the right direction.
> Ravishers-MacBook-Air:solr-8.6.3 ravishersingh$ bin/post -c solrhelp 
> -filetypes html https://factorpad.com/tech/solr/index.html
> java -classpath 
> /Users/ravishersingh/desktop/solr-8.6.3/dist/solr-core-8.6.3.jar -Dauto=yes 
> -Dfiletypes=html -Dc=solrhelp -Ddata=web org.apache.solr.util.SimplePostTool 
> https://factorpad.com/tech/solr/index.html
> SimplePostTool version 5.0.0
> Posting web pages to Solr url 
> http://localhost:8983/solr/solrhelp/update/extract
> Entering auto mode. Indexing pages with content-types corresponding to file 
> endings html
> Entering crawl at level 0 (1 links total, 1 new)
> SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url: 
> http://localhost:8983/solr/solrhelp/update/extract?literal.id=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html&literal.url=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html
> SimplePostTool: WARNING: Response: <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
> <title>Error 404 Not Found</title>
> </head>
> <body><h2>HTTP ERROR 404 Not Found</h2>
> <table>
> <tr><th>URI:</th><td>/solr/solrhelp/update/extract</td></tr>
> <tr><th>STATUS:</th><td>404</td></tr>
> <tr><th>MESSAGE:</th><td>Not Found</td></tr>
> <tr><th>SERVLET:</th><td>default</td></tr>
> </table>
>  
> </body>
> </html>
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.FileNotFoundException: 
> http://localhost:8983/solr/solrhelp/update/extract?literal.id=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html&literal.url=https%3A%2F%2Ffactorpad.com%2Ftech%2Fsolr%2Findex.html
> SimplePostTool: WARNING: An error occurred while posting 
> https://factorpad.com/tech/solr/index.html
> 0 web pages indexed.
> COMMITting Solr index changes to 
> http://localhost:8983/solr/solrhelp/update/extract...
> SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url: 
> http://localhost:8983/solr/solrhelp/update/extract?commit=true
> SimplePostTool: WARNING: Response: <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
> <title>Error 404 Not Found</title>
> </head>
> <body><h2>HTTP ERROR 404 Not Found</h2>
> <table>
> <tr><th>URI:</th><td>/solr/solrhelp/update/extract</td></tr>
> <tr><th>STATUS:</th><td>404</td></tr>
> <tr><th>MESSAGE:</th><td>Not Found</td></tr>
> <tr><th>SERVLET:</th><td>default</td></tr>
> </table>
>  
> </body>
> </html>
> Time spent: 0:00:01.356



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
