Proxies are special-purpose HTTP servers designed to transfer data from remote servers to local clients. One typical use of proxies is lightening network load for users behind a slow connection. This is achieved by channeling all HTTP and FTP requests through the proxy, which caches the transferred data. When a cached resource is requested again, the proxy returns the data from its cache. Another use for proxies is for companies that separate (for security reasons) their internal networks from the rest of the Internet. In order to obtain information from the Web, their users connect and retrieve remote data using an authorized proxy.
Wget supports proxies for both HTTP and FTP retrievals. The standard way to specify proxy location, which Wget recognizes, is using the following environment variables:
http_proxy
https_proxy
If set, the http_proxy and https_proxy variables should contain the URLs of the proxies for HTTP and HTTPS connections respectively.
ftp_proxy
This variable should contain the URL of the proxy for FTP connections. It is quite common that http_proxy and ftp_proxy are set to the same URL.
no_proxy
This variable should contain a comma-separated list of domain extensions for which the proxy should not be used. For instance, if the value of no_proxy is ‘.mit.edu’, the proxy will not be used to retrieve documents from MIT.
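As a minimal sketch of the variables described above, the following shell fragment points all three protocols at one proxy and exempts a single domain. The proxy host and port reuse the ‘proxy.company.com’ example from this section and are placeholders, not real values.

```shell
# Placeholder proxy address; substitute the URL of your own proxy.
export http_proxy="http://proxy.company.com:8001/"
export https_proxy="$http_proxy"
export ftp_proxy="$http_proxy"

# Hosts under .mit.edu are fetched directly, bypassing the proxy.
export no_proxy=".mit.edu"

# Wget invocations started from this shell now inherit these settings, e.g.:
#   wget http://www.gnu.org/
```

Setting the variables in the shell affects every program that honors them, not just Wget, which is often the desired behavior on a proxied network.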
In addition to the environment variables, proxy location and settings may be specified from within Wget itself.
The ‘--no-proxy’ option and the corresponding .wgetrc command ‘use_proxy = off’ may be used to suppress the use of a proxy, even if the appropriate environment variables are set.
The startup file variables http_proxy, https_proxy, ftp_proxy and no_proxy allow you to override the proxy settings specified by the environment.
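One way to try the startup file settings without touching your real .wgetrc is sketched below. It writes a throwaway startup file and points Wget at it through the WGETRC environment variable. The file location and the proxy address are illustrative assumptions.

```shell
# Throwaway startup file; the location is an arbitrary choice for this sketch.
rc="${TMPDIR:-/tmp}/wgetrc.example"

cat > "$rc" <<'EOF'
# Placeholder proxy address taken from the example in this section.
http_proxy = http://proxy.company.com:8001/
https_proxy = http://proxy.company.com:8001/
ftp_proxy = http://proxy.company.com:8001/
no_proxy = .mit.edu
EOF

# Tell Wget to read this file instead of the default startup file.
export WGETRC="$rc"

# A single run can still bypass the proxy entirely:
#   wget --no-proxy http://www.gnu.org/
```

Settings in the startup file take precedence over the environment variables of the same name, so this is a convenient place for a per-user override.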
Some proxy servers require authorization before you can use them. The authorization consists of a username and password, which must be sent by Wget. As with HTTP authorization, several authentication schemes exist. For proxy authorization, only the Basic authentication scheme is currently implemented.
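For orientation, the Basic scheme transmits the username and password joined by a colon and Base64-encoded, in a Proxy-Authorization request header. The sketch below builds such a header by hand, assuming a base64 utility is available; the credentials are made-up examples, and Wget constructs this header itself, so the snippet is purely illustrative.

```shell
# Made-up credentials for illustration only.
user="hniksic"
pass="mypassword"

# Basic credentials are "user:password", Base64-encoded.
token=$(printf '%s' "${user}:${pass}" | base64)

# The header a client would send to the proxy.
header="Proxy-Authorization: Basic ${token}"
printf '%s\n' "$header"
```

Base64 is an encoding, not encryption, so anyone who can observe the connection to the proxy can recover the password.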
You may specify your username and password either through the proxy URL or through the command-line options. Assuming that the company’s proxy is located at ‘proxy.company.com’ on port 8001, a proxy URL containing authorization data might look like this:
http://hniksic:mypassword@proxy.company.com:8001/
Alternatively, you may use the ‘--proxy-user’ and ‘--proxy-password’ options, or the equivalent .wgetrc settings proxy_user and proxy_password, to set the proxy username and password.
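The two approaches can be sketched side by side as follows. The credentials and proxy address are the illustrative ones used in this section, and fetch_via_proxy is a hypothetical helper name introduced only for this example.

```shell
# Illustrative credentials; replace with your own.
proxy_user="hniksic"
proxy_password="mypassword"

# Approach 1: embed the credentials in the proxy URL.
export http_proxy="http://${proxy_user}:${proxy_password}@proxy.company.com:8001/"

# Approach 2: keep the proxy URL free of credentials and pass them as options.
# fetch_via_proxy is a hypothetical wrapper, not part of Wget.
fetch_via_proxy() {
    http_proxy="http://proxy.company.com:8001/" \
        wget --proxy-user="$proxy_user" --proxy-password="$proxy_password" "$@"
}

# Usage (performs a real download, so it is left commented out):
#   fetch_via_proxy http://www.gnu.org/
```

Either way the password may be visible to other users, through the environment or the process list, so storing it in a .wgetrc file readable only by you is often the safer choice.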