With Ansible, is it possible to connect to hosts that are behind Cloud IAP (Identity-Aware Proxy) in GCP?

Cloud IAP is a sort of proxy for Google Cloud Platform that lets you connect to compute instances that don’t have public IP addresses, without a VPN. You stand up an instance and then you can use the gcloud utility to connect to it by name like so: gcloud compute ssh my-server-01. This handles authenticating you through the proxy and logging you into the target server with your own Google account (using a feature called OS Login).

I figure that to make Ansible do what the gcloud tool is doing, I would need a custom connection plugin.

Asked By: mat

||

I figured out a way to make this work without a connection plugin. Basically you can write a script that wraps the gcloud tool and point the ansible_ssh_executable parameter at this script, which you can define at the inventory level. You do need to make sure the gcp_compute inventory plugin identifies hosts by name, because this is what gcloud compute ssh expects.

Here’s the script:

#!/bin/sh
set -o errexit
# Wraps the gcloud utility to enable connecting to instances which are behind
# GCP Cloud IAP. Used by setting the `ansible_ssh_executable` setting for a play
# or inventory. Parses out the relevant information from Ansible's call to the
# script and injects into the right places of the gcloud utility.

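# Flatten all arguments Ansible passed into a single string.
arg_string="$*"

# Extended regular expressions, so grep and sed are both called with -E.
grep_hostname_regex='[a-z]*[0-9]{2}(live|test)'
sed_hostname_regex='[a-z]*[0-9]{2}(live|test)'

# The instance name that matches the naming convention.
target_host=$(
  echo "$arg_string" | grep -E -o "$grep_hostname_regex"
)

# Everything before the host name: options meant for ssh itself.
ssh_args=$(
  echo "$arg_string" | sed -E "s# ${sed_hostname_regex}.*##"
)

# Everything after the host name: the command to run remotely.
cmd=$(
  echo "$arg_string" | sed -E "s#.*${sed_hostname_regex} ##"
)

# $ssh_args is deliberately unquoted so it splits into separate options.
gcloud compute ssh "$target_host" \
  --command="$cmd" \
  --tunnel-through-iap \
  -- $ssh_args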

Note:

  • This is tested on macOS. The sed options might be different on Linux, for example.
  • The “host” regexes will need to fit your naming convention. If you don’t have a consistent naming convention that would work like it does for me you’ll need to find some other way to parse out the information.
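
To check a naming convention before wiring the wrapper into Ansible, the parsing step can be tried on its own. The following is only a sketch: the function name split_ssh_call, the host name web01live and the shortened argument string are made up for illustration, and the real argument list Ansible passes is longer.

```shell
#!/bin/sh
# Hypothetical helper: splits an ssh-style argument string into the three
# parts the wrapper needs. The regex is the example naming convention from
# the script above; adjust it to your own host names.
hostname_regex='[a-z]*[0-9]{2}(live|test)'

split_ssh_call() {
  arg_string="$1"
  # The token matching the naming convention is the target host.
  target_host=$(printf '%s\n' "$arg_string" | grep -E -o "$hostname_regex")
  # Everything before the host: options meant for ssh itself.
  ssh_args=$(printf '%s\n' "$arg_string" | sed -E "s# ${hostname_regex}.*##")
  # Everything after the host: the remote command.
  cmd=$(printf '%s\n' "$arg_string" | sed -E "s#.*${hostname_regex} ##")
}

# Made-up, shortened stand-in for what Ansible would pass to the wrapper.
split_ssh_call "-o ControlPersist=60s -o ConnectTimeout=10 web01live /bin/sh -c 'echo ok'"
echo "host:     $target_host"
echo "ssh args: $ssh_args"
echo "command:  $cmd"
```

If the host line comes back empty or the command line still contains ssh options, the regex does not fit the host names and needs adjusting.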
Answered By: mat

After a discussion on https://www.reddit.com/r/ansible/comments/e9ve5q/ansible_slow_as_a_hell_with_gcp_iap_any_way_to/ I altered the solution to use SSH connection sharing via a control socket.

It is about twice as fast as @mat's solution, and I run it in production. Here is an implementation that doesn't depend on host name patterns!

The proper solution is to use a bastion/jump host, because the gcloud command still spawns a Python interpreter that in turn spawns ssh, so it is still inefficient!

ansible.cfg:

[ssh_connection]
pipelining = True
ssh_executable = misc/gssh.sh
ssh_args =
transfer_method = piped

[privilege_escalation]
become = True
become_method = sudo

[defaults]
interpreter_python = /usr/bin/python
gathering = False
# Somehow important to enable parallel execution...
strategy = free

gssh.sh:

#!/bin/bash

# ansible/ansible/lib/ansible/plugins/connection/ssh.py
# exec_command(self, cmd, in_data=None, sudoable=True) calls _build_command(self, binary, *other_args) as:
#   args = (ssh_executable, self.host, cmd)
#   cmd = self._build_command(*args)
# So "host" is next to the last, cmd is the last argument of ssh command.

host="${@: -2: 1}"
cmd="${@: -1: 1}"

# ControlMaster=auto & ControlPath=... speedup Ansible execution 2 times.
socket="/tmp/ansible-ssh-${host}-22-iap"

gcloud_args="
--tunnel-through-iap
--zone=europe-west1-b
--quiet
--no-user-output-enabled
--
-C
-o ControlMaster=auto
-o ControlPersist=20
-o PreferredAuthentications=publickey
-o KbdInteractiveAuthentication=no
-o PasswordAuthentication=no
-o ConnectTimeout=20"

exec gcloud compute ssh "$host" $gcloud_args -o ControlPath="$socket" "$cmd"
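
The script relies on bash's positional-parameter slicing to pick out the host and the command, which can be checked in isolation. This is a minimal sketch: the function name parse_tail and the sample arguments are hypothetical, and it needs bash rather than plain sh.

```shell
#!/bin/bash
# Hypothetical demo of the slicing used in gssh.sh.
# "${@: -2:1}" is the second-to-last argument, "${@: -1:1}" is the last.
# The space before the minus sign is required; without it bash parses
# ":-" as the "use a default value" expansion.
parse_tail() {
  host="${@: -2:1}"
  cmd="${@: -1:1}"
}

# Made-up arguments in the order the ssh connection plugin builds them:
# options first, then the host, then the remote command as one argument.
parse_tail -o ControlMaster=auto -o ConnectTimeout=20 my-server-01 "/bin/sh -c 'uptime'"
echo "host=$host"
echo "cmd=$cmd"
```

Because the host is taken by position rather than by pattern, this approach works for any instance name, as long as Ansible keeps passing the host and command as the last two arguments.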

UPDATE: A Google engineer has responded that gcloud isn't supposed to be called in parallel! See "gcloud compute ssh" can’t be used in parallel

My experiments showed that with Ansible forks=5 I almost always hit an error. With forks=2 I never experienced one.

UPDATE 2: Time has passed, and as of the end of 2020 I can run gcloud compute ssh in parallel (in WSL I used forks = 10) without locking errors.

Answered By: gavenkoa

Thanks for posting these, @gavenkoa and @mat. One suggestion is to add the following so you don’t need to hard-code the zone.

Snippet:

ZONE=$(gcloud compute instances list --filter="name:${host}" --format='value(zone)')

gcloud_args="
--tunnel-through-iap 
--zone=${ZONE}
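
Since the wrapper runs for every connection Ansible opens, the zone lookup would also run every time. One possible way to soften that is to cache the result per host. This is only a sketch and not part of the original answers: the function lookup_zone and the ZONE_CACHE_DIR location are assumptions, and a cached zone goes stale if an instance is recreated in another zone.

```shell
#!/bin/sh
# Hypothetical helper: resolve the zone of an instance and cache the answer
# in a file, so gcloud is not queried on every SSH connection.
# ZONE_CACHE_DIR is an assumed location, not something gcloud or Ansible defines.
ZONE_CACHE_DIR="${ZONE_CACHE_DIR:-/tmp/ansible-gcp-zones}"

lookup_zone() {
  host="$1"
  cache_file="${ZONE_CACHE_DIR}/${host}"
  # Query gcloud only when there is no non-empty cache entry yet.
  if [ ! -s "$cache_file" ]; then
    mkdir -p "$ZONE_CACHE_DIR"
    # Same query as in the snippet above; keep only the first result.
    gcloud compute instances list \
      --filter="name:${host}" \
      --format='value(zone)' | head -n 1 > "$cache_file"
  fi
  cat "$cache_file"
}
```

In the wrapper this would be used as ZONE=$(lookup_zone "$host") in place of the direct gcloud call. Deleting the cache directory forces a fresh lookup.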
Answered By: Steveno