Hi everyone!

Does anyone have suggestions on how to URL-encode URLs that I'm
importing from SQL using the DIH? The importer pulls in something like
"http://www.downloadsite.com/document that is being downloaded.doc", and then
the Tika parser can't download the document because the URL gets cut off at
the first space: it ends up trying to access
"http://www.downloadsite.com/document" and gets a 404 error. What I
need to do is transform the URL to
"http://www.downloadsite.com/document%20that%20is%20being%20downloaded.doc".
I added a regex transformer to the DIH field, but I have not found a
regex that accomplishes this. Thoughts?
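
To make the target concrete, here is a minimal sketch in plain Java (outside of Solr) of the replacement I'm after. The class and method names are made up for illustration, and it only handles literal spaces, not other characters that would need escaping. I'm assuming the DIH regex transformer could be made to do the equivalent of Java's String.replaceAll, but I have not confirmed that.

```java
// Hypothetical helper, not part of Solr or the DIH.
class UrlEncodeSketch {

    // Replace every literal space with its percent-encoded form.
    // The regex is a single space; the replacement is the text "%20".
    static String encodeSpaces(String rawUrl) {
        return rawUrl.replaceAll(" ", "%20");
    }

    public static void main(String[] args) {
        String raw = "http://www.downloadsite.com/document that is being downloaded.doc";
        // prints http://www.downloadsite.com/document%20that%20is%20being%20downloaded.doc
        System.out.println(encodeSpaces(raw));
    }
}
```

So the pattern itself would just be a space, with "%20" as the replacement text. URLs that are already encoded pass through unchanged because they contain no spaces.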

Any advice would be appreciated! Thanks!

-Teague
