This is what libcurl says about proxy support:

   Proxies

What "proxy" means according to Merriam-Webster: "a person authorized to act for another" but also "the agency, function, or office of a deputy who acts as a substitute for another".

Proxies are exceedingly common these days. Companies often only offer Internet access to employees through their proxies. Network clients or user-agents ask the proxy for documents, the proxy does the actual request and then it returns them.

libcurl supports SOCKS and HTTP proxies. When a given URL is wanted, libcurl will ask the proxy for it instead of trying to connect to the actual host identified in the URL.

If you're using a SOCKS proxy, you may find that libcurl doesn't quite support all operations through it.

For HTTP proxies: the fact that the proxy is an HTTP proxy puts certain restrictions on what can actually happen. A requested URL that might not be an HTTP URL will still be passed to the HTTP proxy to deliver back to libcurl. This happens transparently, and an application may not need to know. I say "may", because at times it is very important to understand that all operations over an HTTP proxy use the HTTP protocol. For example, you can't invoke your own custom FTP commands or even proper FTP directory listings.

*Proxy Options*

To tell libcurl to use a proxy at a given port number:

curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080");

Some proxies require user authentication before allowing a request, and you pass that information similar to this:

curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password");

If you want to, you can specify the host name only in the CURLOPT_PROXY option, and set the port number separately with CURLOPT_PROXYPORT.

Tell libcurl what kind of proxy it is with CURLOPT_PROXYTYPE (if you don't, it defaults to assuming an HTTP proxy):

curl_easy_setopt(easyhandle, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);

*Environment Variables*

libcurl automatically checks and uses a set of environment variables to know what proxies to use for certain protocols. The names of the variables follow an ancient de facto standard and are built up as "[protocol]_proxy" (note the lower casing). So the variable 'http_proxy' is checked for the name of a proxy to use when the input URL is HTTP. Following the same rule, the variable named 'ftp_proxy' is checked for FTP URLs. Again, the proxies are always HTTP proxies; the different variable names simply allow different HTTP proxies to be used.
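
To illustrate the naming rule only, here is a minimal sketch in plain C of how a "[protocol]_proxy" name could be built and looked up. The helper names proxy_env_name and proxy_from_env are made up for this example; this is not libcurl's own code.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "<scheme>_proxy" in lower case,
 * e.g. "HTTP" -> "http_proxy".
 * Returns 0 on success, -1 if the output buffer is too small. */
int proxy_env_name(const char *scheme, char *out, size_t outlen)
{
    static const char suffix[] = "_proxy";
    size_t n = strlen(scheme);

    if (n + sizeof suffix > outlen)   /* sizeof counts the trailing NUL */
        return -1;
    for (size_t i = 0; i < n; i++)
        out[i] = (char)tolower((unsigned char)scheme[i]);
    memcpy(out + n, suffix, sizeof suffix);
    return 0;
}

/* Hypothetical helper: return the value of the variable for a scheme,
 * or NULL when it is unset or the name does not fit the buffer. */
const char *proxy_from_env(const char *scheme)
{
    char name[64];

    if (proxy_env_name(scheme, name, sizeof name) != 0)
        return NULL;
    return getenv(name);
}
```

With this sketch, proxy_from_env("ftp") would read 'ftp_proxy' and proxy_from_env("HTTP") would read 'http_proxy'.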

The proxy environment variable contents should be in the format "[protocol://][user:passw...@]machine[:port]". The protocol:// part is simply ignored if present (so http://proxy and bluerk://proxy will do the same), and the optional port number specifies which port the proxy operates on. If not specified, the internal default port number will be used, and that is most likely *not* the one you would like it to be.
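
As a rough illustration of that format, the following plain C sketch splits such a value into host and port. The struct proxy_spec and the function parse_proxy_spec are hypothetical names, and the code is a simplification (no IPv6 literals, no trailing slash), not how libcurl itself parses the value.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: host name plus port (-1 when none was given). */
struct proxy_spec {
    char host[256];
    long port;
};

/* Parse "[protocol://][user:password@]machine[:port]".
 * Returns 0 on success, -1 on a malformed or oversized value. */
int parse_proxy_spec(const char *value, struct proxy_spec *out)
{
    const char *p = value;

    const char *sep = strstr(p, "://");
    if (sep)
        p = sep + 3;                  /* the protocol prefix is ignored */

    const char *at = strrchr(p, '@');
    if (at)
        p = at + 1;                   /* skip the "user:password@" part */

    const char *colon = strrchr(p, ':');
    size_t hostlen = colon ? (size_t)(colon - p) : strlen(p);
    if (hostlen == 0 || hostlen >= sizeof out->host)
        return -1;
    memcpy(out->host, p, hostlen);
    out->host[hostlen] = '\0';

    out->port = -1;                   /* caller decides on a default port */
    if (colon) {
        char *end;
        long port = strtol(colon + 1, &end, 10);
        if (end == colon + 1 || *end != '\0' || port < 1 || port > 65535)
            return -1;
        out->port = port;
    }
    return 0;
}
```

Both "http://proxy" and "bluerk://proxy" yield the host "proxy" with no port here, which mirrors the statement that the protocol part does not matter.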

There are two special environment variables. 'all_proxy' sets the proxy for any URL when the protocol-specific variable isn't set, and 'no_proxy' defines a list of hosts that should not use a proxy even though a variable may say so. If 'no_proxy' is a plain asterisk ("*") it matches all hosts.
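
The precedence between the three kinds of variables can be sketched in plain C as follows. The function names are made up, the values are passed in as strings rather than read from the environment, and the exact-name, comma-separated comparison for 'no_proxy' is an assumption made for this example; libcurl's real matching rules may differ.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of host against one list entry of length n. */
static int entry_matches(const char *host, const char *entry, size_t n)
{
    if (strlen(host) != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)host[i]) != tolower((unsigned char)entry[i]))
            return 0;
    return 1;
}

/* Returns 1 if host is excluded by the given no_proxy value, else 0.
 * Assumption: entries are separated by commas and/or spaces. */
int host_bypasses_proxy(const char *host, const char *no_proxy)
{
    if (!no_proxy || !*no_proxy)
        return 0;
    if (strcmp(no_proxy, "*") == 0)    /* a plain asterisk matches all hosts */
        return 1;

    const char *p = no_proxy;
    while (*p) {
        while (*p == ',' || *p == ' ')
            p++;                       /* skip separators */
        size_t n = strcspn(p, ", ");
        if (n > 0 && entry_matches(host, p, n))
            return 1;
        p += n;
    }
    return 0;
}

/* Pick the proxy for a host: no_proxy wins, then the protocol-specific
 * value, then all_proxy. Empty strings are treated as unset (assumption).
 * Returns NULL when no proxy should be used. */
const char *choose_proxy(const char *host, const char *specific,
                         const char *all_proxy, const char *no_proxy)
{
    if (host_bypasses_proxy(host, no_proxy))
        return NULL;
    if (specific && *specific)
        return specific;
    if (all_proxy && *all_proxy)
        return all_proxy;
    return NULL;
}
```

A caller would pass the results of getenv("http_proxy"), getenv("all_proxy") and getenv("no_proxy") as the last three arguments.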

To explicitly disable libcurl's checking for and using the proxy environment variables, set the proxy name to "" - an empty string - with CURLOPT_PROXY.

So, setting your environment variable 'http_proxy' to 'proxy-host:8080' should work. Does it?
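
If exporting the variable in the shell is not convenient, it could also be set from inside the program before any parsing starts. The following is only a sketch: it assumes a POSIX system (setenv is not available on Windows), and ensure_http_proxy is a made-up helper name, not a Xerces or libcurl function.

```c
#define _POSIX_C_SOURCE 200809L   /* makes setenv() visible in strict modes */
#include <stdlib.h>

/* Hypothetical helper: set 'http_proxy' for the current process unless
 * the user already exported one. Returns 0 on success, -1 on failure. */
int ensure_http_proxy(const char *proxy)
{
    /* overwrite = 0 keeps a value that is already in the environment */
    return setenv("http_proxy", proxy, 0);
}
```

Calling ensure_http_proxy("proxy-host:8080") early in main(), before the parser fetches anything, leaves the value in place for libcurl's environment lookup described above.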

Alberto

Dantzler, DeWayne C wrote:
I've looked at the libcurl doc, but I'm using Xerces to parse, and the internet access occurs during the parsing of the XML file, since the XML references entities that must be resolved via a URL. Since Xerces can be configured with libcurl, my assumption is that Xerces must use it somehow. Given Alberto's comment "If you use --enable-netaccessor-curl Xerces will use the APIs provided by libcurl, so if you use the same API to setup a global proxy, you will be able to use it." and based on the libcurl doc, my assumption is that setting the env 'http_proxy' using libcurl's API is how Xerces will know the correct proxy to use. Is that correct?
-----Original Message-----
From: Vitaly Prapirny [mailto:[email protected]]
Sent: Thursday, November 05, 2009 12:18 AM
To: [email protected]
Subject: Re: What is the difference between configuring Xerces 3.0.1 
with --enable-netaccessor-curl vs --enable-netaccessor-socket?

Proxy settings should be recognized by libcurl, not Xerces.
So please look at http://curl.haxx.se/libcurl/c/libcurl-tutorial.html
as I suggested in my answer to your previous message
http://marc.info/?l=xerces-c-users&m=125541575925143&w=2

Good luck!
        Vitaly

Dantzler, DeWayne C wrote:
OK, if Xerces will use the APIs provided by libcurl, then which Xerces APIs 
must I use to get Xerces to recognize my proxy settings, or how does Xerces 
determine the proxy settings? I've tried googling for the answer, but came up 
empty. I'm not sure of the right search terms to get a hit.

Thanks

-----Original Message-----
From: Alberto Massari [mailto:[email protected]]
Sent: Tuesday, November 03, 2009 11:27 PM
To: [email protected]
Subject: Re: What is the difference between configuring Xerces 3.0.1 
with --enable-netaccessor-curl vs --enable-netaccessor-socket?

If you use --enable-netaccessor-socket you will not be able to specify a proxy, 
as the code that reads from the Internet is simply working with plain TCP 
sockets. If you use --enable-netaccessor-curl Xerces will use the APIs provided 
by libcurl, so if you use the same API to set up a global proxy, you will be 
able to use it.

Alberto

Dantzler, DeWayne C wrote:
Hello

Problem: there is a proxy between Xerces and the outside world and I need Xerces to perform XML 
validation against a schema which includes online references to an external schema (e.g. <xs:import 
namespace="http://www.w3.org/myspace" 
schemaLocation="http://www.w3.org/schema.xsd"/>).

What is the difference between configuring Xerces with 
--enable-netaccessor-curl vs --enable-netaccessor-socket? Basically, how does 
this affect the socket behavior of Xerces and why would I choose one over the 
other?

Thanks




