RE: [Non-DoD Source] Re: SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Musshorn, Kris T CTR USARMY RDECOM ARL (US)
Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, July 15, 2016 12:30 PM
To: solr-user
Subject: [Non-DoD Source] Re: SimplePostTool error (UNCLASSIFIED)
simplePostTool is just that, simple. It's intended to get you started. It is not a full-featured web crawler. As such,

Re: SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Yonik Seeley
On Fri, Jul 15, 2016 at 12:29 PM, Erick Erickson wrote:
> simplePostTool is just that, simple. It's intended to get you started.
> It is not a full-featured web crawler. As such, if you're encountering
> wonky web pages that are not well formed HTML there's no guarantee
> that it'll handle them gr

Re: SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Erick Erickson
simplePostTool is just that, simple. It's intended to get you started. It is not a full-featured web crawler. As such, if you're encountering wonky web pages that are not well formed HTML there's no guarantee that it'll handle them gracefully. Crawling websites is a pain, so if you require somethi

SimplePostTool error (UNCLASSIFIED)

2016-07-15 Thread Musshorn, Kris T CTR USARMY RDECOM ARL (US)
CLASSIFICATION: UNCLASSIFIED
How do I correct this error when running the simple post tool against a website? The tool successfully indexed for about 30 mins before throwing this error and terminating.
[Fatal Error] :642:15: XML document structures must start and end within the same entity.
Exc
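A parser message of this kind typically means the input ended while elements were still open, and the two numbers before it are usually the line and column where the parser gave up. One way to find the offending page is to run each fetched document through a strict XML parser before posting it. The following is only a minimal sketch using Python's standard library rather than the post tool itself; the function name `find_xml_error` and the sample strings are hypothetical and not part of Solr.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def find_xml_error(text: str) -> Optional[Tuple[int, int]]:
    """Return (line, column) of the first parse error, or None if well formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError.position holds the (line, column) where parsing failed.
        return err.position
    return None


if __name__ == "__main__":
    # Hypothetical sample inputs, not taken from the original report.
    complete = "<html><body><p>ok</p></body></html>"
    truncated = "<html><body><p>cut off mid-page"
    print(find_xml_error(complete))   # no error expected for a complete document
    print(find_xml_error(truncated))  # a (line, column) tuple for the truncated one
```

Pages that fail such a check could be skipped or cleaned up first. Real-world HTML is often not well-formed XML at all, so a strict check like this will likely flag many pages that a browser renders without complaint.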