Random DNS issue when using Github Actions #107
Comments
Are you using a Dockerfile runner or the Tailscale-supplied action.yml? What's your GitHub runner type/version? |
I tried both the GitHub Action and a manual installation, running on ubuntu-latest and on 20.04 (which seems more stable). |
I was running into a lot of transient DNS resolution failures, followed this recommendation and it seems to be working a lot better: #51 (comment) |
I too encounter a lot of transient DNS errors; my deployment pipelines randomly fail like this:
It was working fine a few weeks ago, now I have to restart my deployment pipelines a lot. |
I saw this a while back, but it seemed to go away for a while, then it became a problem again about a week ago. We are using the standard hosted runner and the following action. When it started causing us problems last week, we added the Tailscale version based on the same issue @matthewjthomas referenced, #51. It has not made a difference. This is our action:
name: 'connect_tailscale'
description: 'Connects to Tailscale'
inputs:
  ts_oauth_client_id:
    description: 'TS_OAUTH_CLIENT_ID'
    required: true
  ts_oauth_secret:
    description: 'TS_OAUTH_SECRET'
    required: true
runs:
  using: 'composite'
  steps:
    - name: Tailscale
      uses: tailscale/github-action@v2
      with:
        version: 1.64.0
        oauth-client-id: ${{ inputs.TS_OAUTH_CLIENT_ID }}
        oauth-secret: ${{ inputs.TS_OAUTH_SECRET }}
        tags: tag:github
        args: --accept-routes --accept-dns
|
We are also experiencing DNS timeouts with tailscale in our ci. Our setup
|
We found that the tailscale action is "reporting ready" too quickly. I'd like to have a more consistent way of waiting for DNS to become ready though. |
I've just hit this problem (again) and it took several minutes for tailscale network to be in a working state (I put a sleep 600 and tried to ssh into the github runner, it took at least 3 minutes before my ssh went through). I'm wondering if the problem could be caused by an overloaded github network. In any case, I agree with @arnecls, it would be nice to have a tailscale command that could wait until magic dns is in working order. |
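A minimal sketch of the kind of wait the two comments above ask for, assuming the tailnet is usable once a name resolves through the system resolver. `wait_for` is a hypothetical helper, not a Tailscale command, and `internal-host.example.ts.net` is a placeholder for a hostname on your own tailnet.

```shell
# wait_for ATTEMPTS DELAY_SECONDS COMMAND...
# Re-run COMMAND until it succeeds or ATTEMPTS is exhausted.
wait_for() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}

# Hypothetical usage in a step after the Tailscale action
# (placeholder hostname, up to 30 tries, 5 seconds apart):
# wait_for 30 5 getent hosts internal-host.example.ts.net
```

Unlike a fixed sleep, this returns as soon as the lookup succeeds and fails the step with a non-zero exit code if it never does.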
Also seeing this at the moment, trying a |
@lukeramsden I'm glad someone is having the issue at the same time as me. My hypothesis:
|
@sylr same for us, especially today |
|
It's a bit hackish but less dumb than a sleep: sylr@338b779 |
I've forked the github action and added this at the end:
Currently seeing this:
|
It makes sense, maybe the us-east is running workflows just before lunch :D |
we're seeing this with github actions:
|
this is failing 100% of the time now when SplitDNS is used on Github Action hosted runners:
|
Yep, all my deploys are failing atm. Seems to be correlated with other people. I'm also hosting my runners on https://blacksmith.sh/ and not using GitHub Actions hosted runners, so maybe it's on Tailscale's end? |
tailscale status now showing degraded performance in Coordinator server https://status.tailscale.com/ |
I've made a feature request for this: #146 On this note, even before this outage we'd sometimes still get timeouts because of what I assume is coordination latency - is this something other folks have experienced? |
it's working now, it seems like it was the coordinator server after all |
We are still encountering lots of timeouts. Is there another incident ongoing? The status page is reporting all green. |
Has anyone else tried to ring tailscale support about this ? I've sent 2 mails without response. |
it's not great, they acknowledged it in their status page today but it's been like this since Saturday, it's pretty bad for Split DNS. |
not heard anything either |
We faced same issue without response from support too :/ |
we're seeing this issue again, it's timing out not just in Github actions but also our on-premise ubuntu machines with split DNS |
yes, same for us |
Experiencing exactly this behavior too. It became worse last quarter. |
We primarily use Tailscale to access subnets hosted on AWS. To work around this problem, I've written a script that does the following:
- name: Wait for propagation
  uses: Wandalen/wretry.action@master
  with:
    command: |
      check_dns() {
        local server=$1
        if dig @$server example.com +short +timeout=5; then
          echo "DNS check against $server successful"
          return 0
        else
          echo "DNS check against $server failed"
          return 1
        fi
      }
      # https://docs.aws.amazon.com/vpc/latest/userguide/AmazonDNS-concepts.html
      # This is our VPC addresses + 2
      servers=(
        "10.152.0.2"
        "10.160.0.2"
        "10.168.0.2"
        "10.176.0.2"
      )
      for server in "${servers[@]}"; do
        if ! check_dns "$server"; then
          echo "Failed checking DNS server $server"
          exit 1
        fi
      done
      echo "All DNS checks successful!"
    attempt_limit: 20
    attempt_delay: 5000 # 5 seconds in milliseconds
This is done after connecting to tailscale in the GH action, and has effectively worked around this problem. I imagine that you can find another IP to ping/dig to verify that the connection has in fact fully propagated to the entire tailnet. |
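For readers wondering where the `.2` addresses in that script come from: the Amazon-provided DNS server sits at the VPC network base address plus two. A small sketch of that arithmetic, where `vpc_resolver` is a hypothetical helper and the input is assumed to be the base address of the VPC CIDR (with or without the prefix length):

```shell
# vpc_resolver CIDR
# Print the VPC base address plus two, e.g. the resolver for 10.152.0.0/16.
# Assumes the argument is the network base address, so the last octet
# can simply be incremented without carrying into the third octet.
vpc_resolver() {
  local base=${1%/*}       # drop the /prefix length, if present
  local head=${base%.*}    # first three octets
  local last=${base##*.}   # fourth octet
  echo "$head.$((last + 2))"
}

# Example (CIDR taken from the first server in the list above):
# vpc_resolver 10.152.0.0/16
```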
I'm trying to use the Tailscale GitHub Action with the latest version of Tailscale (I also tested other versions) and am getting these DNS issues.
It also happens when installing Tailscale manually on the machine while it is running.
I've tried injecting the nameserver and the search entries into
/etc/resolv.conf
but it doesn't help in this case.
In the admin console, I've defined the machine as ephemeral and pre-approved.
It happens only on GitHub Actions machines.
The issue is intermittent: sometimes it happens and sometimes it does not.
Thanks.
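When debugging this kind of setup, it can help to print which nameservers the runner's resolver file actually lists after Tailscale comes up. The sketch below is only a diagnostic, with hypothetical helper names; it assumes a resolv.conf-style file and checks for 100.100.100.100, the address Tailscale uses for MagicDNS. On runners where systemd-resolved manages DNS, the file may list only a local stub address even when Tailscale DNS is configured, so a negative result is not conclusive.

```shell
# list_nameservers [FILE]
# Print the address from each "nameserver" line (default: /etc/resolv.conf).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# uses_magicdns [FILE]
# Succeed if 100.100.100.100 is among the listed nameservers.
uses_magicdns() {
  list_nameservers "$1" | grep -qx '100\.100\.100\.100'
}

# Hypothetical usage in a workflow step:
# list_nameservers
# uses_magicdns && echo "resolv.conf points at Tailscale DNS"
```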