Deduplicate parsed URLs to reduce crawl requests
bazelbuild#14700 added a couple more URLs for fetching the JDK package, which seems to be causing infrastructure issues, as discussed in bazelbuild#14700.

This patch works around the issue by removing the duplicated URLs, reducing the number of crawl requests.
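For context, the uniqueness test added in the diff below is a common bash idiom: join the already-checked array with spaces and do a quoted substring match against it. A minimal standalone sketch of the same pattern (the sample URLs here are illustrative, not from the patch):

  #!/bin/bash
  # Sketch of the dedup idiom used by this patch; sample URLs are made up.
  seen=()
  for url in "https://a.example/x" "https://b.example/y" "https://a.example/x"; do
    # The quoted right-hand side of =~ makes this a literal substring
    # match against the space-joined array rather than a regex match.
    # The length guard avoids an "unbound variable" error for an empty
    # array when running under `set -u` on older bash versions.
    if [[ ${#seen[@]} == 0 ]] || [[ ! " ${seen[@]} " =~ " ${url} " ]]; then
      seen+=("${url}")
      echo "Checking ${url} ..."
    fi
  done

Wrapping both sides of the match in spaces keeps one URL from matching as a mere prefix or suffix of another, since URLs themselves cannot contain spaces.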

Closes bazelbuild#14763.

PiperOrigin-RevId: 427464876
niyas-sait authored and copybara-github committed Feb 9, 2022
1 parent 91c8085 commit f936e1b
Showing 1 changed file with 8 additions and 3 deletions.
src/test/shell/bazel/verify_workspace.sh
@@ -46,13 +46,18 @@ function test_verify_urls() {
   # Find url-shaped lines, skipping jekyll-tree (which isn't a valid URL), and
   # skipping comments.
   invalid_urls=()
+  checked_urls=()
   for file in "${WORKSPACE_FILES[@]}"; do
     for url in $(grep -E '"https://|http://' "${file}" | \
       sed -e '/jekyll-tree/d' -e '/^#/d' -r -e 's#^.*"(https?://[^"]+)".*$#\1#g' | \
       sort -u); do
-      echo "Checking ${url} ..."
-      if ! curl --head --silent --show-error --fail --output /dev/null --retry 3 "${url}"; then
-        invalid_urls+=("${url}")
+      # add only unique url to the array
+      if [[ ${#checked_urls[@]} == 0 ]] || [[ ! " ${checked_urls[@]} " =~ " ${url} " ]]; then
+        checked_urls+=("${url}")
+        echo "Checking ${url} ..."
+        if ! curl --head --silent --show-error --fail --output /dev/null --retry 3 "${url}"; then
+          invalid_urls+=("${url}")
+        fi
+      fi
       fi
     done
   done
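The membership scan above is linear per URL (quadratic overall), which is negligible for a workspace-sized URL list. If the list grew large, an associative array would give constant-time lookups; a hypothetical variant (not what the commit does, and it requires bash 4+ for declare -A, which macOS's default bash 3.2 lacks):

  #!/bin/bash
  # Hypothetical seen-set variant using an associative array (bash 4+ only).
  declare -A checked
  urls=("https://a.example/x" "https://a.example/x" "https://b.example/y")
  for url in "${urls[@]}"; do
    if [[ -z "${checked[$url]:-}" ]]; then
      checked["$url"]=1
      echo "Checking ${url} ..."
    fi
  done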