NOTE: This post is only relevant if you use Linux or one of the modern UNIX-flavored operating systems.
The Problem
Recently, I came across a problem where I needed to sync files (actually, whole directory trees) from one host to another, with the latter hiding behind a third host. Here’s the arrangement, just to be clear:

Arrangement of hosts for file syncing
Let’s call the host in the middle the gateway host, or just “gw”. To review the situation again, I need to copy files from host A to host B (with the gw host in the middle) and back after making some changes on host B. As you can see, there are two parts to this puzzle:
- sync files between two hosts
- make the sync happen through the gw.
The Digression
There are a few scenarios when you’ll run into a syncing requirement, but a common one is when you are developing for multiple platforms. You have your source repository on A, and host B is a different hardware or software platform, or both. You copy your sources from A to B; compile; make changes on B and when you are done, copy all changes back to A. And as far as the gateway host requirement is concerned, you may need to ssh into a server on a distant network and access other hosts on an internal network connected to it.
The traditional ways to solve this problem are:
- Copy the files from A to gw and then copy them from gw to B. Reverse the procedure for file-transfer in the other direction.
- Copy the files to an external host publicly visible to both A and B. Then download these files from B. This of course requires access to an external host and network connectivity from A and B to the external host.
But, as it turns out, there’s an easier solution: exploit the ability of ssh to act as a remote shell. ssh is not only a secure shell for logging into a host; it is also a remote shell (remember rsh?). In other words, besides logging you into a remote host and giving you a terminal, it can also operate in a mode where it just runs a command on the remote host and returns.
Assuming that you already know how to run a remote command using ssh, I’m going to jump to the fact that you can chain multiple ssh sessions to reach a host hidden behind a remote host. For example, to log in directly to B from A, use the command:
ssh -t user1@gw ssh user2@hostB
The “-t” forces ssh to allocate a pseudo-tty on the gw host so that it can run a terminal-based program (which is ssh in this case) on it. You can do this multiple times to reach hosts hidden more than one level behind another host.
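To make the multi-level case concrete, here is a minimal sketch of a shell helper that prints the chained command for any number of hops. The function name chain_ssh and the host names are my own placeholders, not something ssh provides. It only prints the command instead of running it, and it assumes every hop except the last needs “-t” so that the next ssh in the chain has a terminal to run in.

```shell
#!/bin/sh
# chain_ssh: print an ssh command that hops through each host in order.
# Arguments are user@host pairs, outermost gateway first (all hypothetical).
chain_ssh() {
    cmd=""
    while [ "$#" -gt 0 ]; do
        if [ "$#" -gt 1 ]; then
            # Intermediate hop: force a pseudo-tty for the ssh that runs on it.
            cmd="${cmd}ssh -t $1 "
        else
            # Final hop: a plain interactive login.
            cmd="${cmd}ssh $1"
        fi
        shift
    done
    printf '%s\n' "$cmd"
}
```

Calling chain_ssh user1@gw user2@hostB prints the two-hop command from this post; putting more gateways in front of the final host extends the chain.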
The Solution
rsync supports an external shell for connecting to a host through the “-e” command-line argument. Take a look at the rsync man page. Combining this feature with the ability to chain ssh sessions to reach hidden hosts, we can easily sync files. Here’s the command to sync the contents of dir1 on host A to dir2 on host B:
rsync -av -e 'ssh -t user1@gw ssh' dir1/ user2@hostB:dir2
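Since the original problem also needs the changes copied back from B to A, here is a rough sketch that prints the rsync command for either direction. The function name sync_cmd and the GW and TARGET variables are placeholders of mine, using the host names from this post. It appends a trailing slash to the source, which tells rsync to copy the contents of the directory rather than the directory itself.

```shell
#!/bin/sh
# Placeholder hosts from this post; adjust for your own setup.
GW="user1@gw"
TARGET="user2@hostB"

# sync_cmd DIRECTION LOCAL_DIR REMOTE_DIR
# Prints (does not run) the rsync command for:
#   push: contents of LOCAL_DIR on A  -> REMOTE_DIR on B
#   pull: contents of REMOTE_DIR on B -> LOCAL_DIR on A
sync_cmd() {
    case "$1" in
        push)
            src="$2/"
            dst="$TARGET:$3"
            ;;
        pull)
            src="$TARGET:$3/"
            dst="$2"
            ;;
        *)
            echo "usage: sync_cmd push|pull LOCAL_DIR REMOTE_DIR" >&2
            return 1
            ;;
    esac
    # The -e argument is the same chained ssh used for the interactive login.
    printf "rsync -av -e 'ssh -t %s ssh' %s %s\n" "$GW" "$src" "$dst"
}
```

Running sync_cmd pull dir1 dir2 prints the command for the return trip. Before running either command for real, it may be worth adding rsync’s “-n” (dry run) option to see what would be transferred.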