rzo1 commented on PR #1834: URL: https://github.com/apache/stormcrawler/pull/1834#issuecomment-4104435574
Yes, that is the downside of the URL deprecation. A follow-up issue documenting this as a known behavior change seems like the right call, with a note in the upgrade/migration guide pointing out that pre-existing URLs in the status index may need to be re-normalized, or that a sanitization pass on the seeds (or the seed store) may be advisable before upgrading. An alternative way to handle this would be to move the sanitization from the normalizer into `URLUtil#toURL(...)`.
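
To make the alternative concrete, here is a rough sketch of what sanitizing inside a `toURL`-style helper could look like. Everything in it is an assumption for illustration: the class `UrlSanitizeSketch`, the `sanitize` method, and the set of characters it escapes are hypothetical and do not reproduce the existing normalizer rules or the actual `URLUtil#toURL(...)` signature.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical stand-in, NOT the real StormCrawler URLUtil.
final class UrlSanitizeSketch {

    // Assumed set of characters to escape; the real normalizer may differ.
    private static final String ILLEGAL = " |^`{}<>\"\\";

    // Trim surrounding whitespace and percent-encode the assumed illegal characters.
    static String sanitize(String raw) {
        String trimmed = raw.trim();
        StringBuilder sb = new StringBuilder(trimmed.length());
        for (int i = 0; i < trimmed.length(); i++) {
            char c = trimmed.charAt(i);
            if (ILLEGAL.indexOf(c) >= 0) {
                sb.append(String.format("%%%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Sanitize first, then go through URI instead of the deprecated URL constructors.
    static URL toURL(String raw) throws MalformedURLException {
        try {
            return new URI(sanitize(raw)).toURL();
        } catch (URISyntaxException | IllegalArgumentException e) {
            MalformedURLException mue = new MalformedURLException(e.getMessage());
            mue.initCause(e);
            throw mue;
        }
    }
}
```

With this shape, a stored URL such as `http://example.com/a b` would be escaped before parsing rather than rejected by `URI`, so callers would not depend on the normalizer having run first. Whether that is preferable to a one-off re-normalization of the status index is a separate question.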
