AWS and ProFTPD

So you want to run ProFTPD on an AWS EC2 instance? Due to FTP's nature as a multi-connection protocol, it is not as straightforward to use FTP within AWS EC2, but it can be done. Read on to find out how.

Note that the following documentation assumes that you know how to install and configure ProFTPD already. If you are only running individual FTP servers, then the sections on AWS security groups and addresses are relevant. If you want to provide a "scalable" pool/cluster of FTP servers, then the AWS Elastic Load Balancing and AWS Route53 sections will also be of interest.
Security Groups
Clients wishing to make a connection to the

  $ aws ec2 authorize-security-group-ingress \
      --group-id sg-XXXX \
      --protocol tcp \
      --port 21 \
      --cidr 0.0.0.0/0

Note that you do not need to allow access to port 20! Many, many sites/howtos recommend opening port 20 in addition to port 21 for FTP access, but it is simply not needed. For active data transfers (i.e. where the FTP server actively connects back to the client machine for the data transfer), the source port will be port 20. But incoming connections for FTP will never be to port 20.
If you are allowing SFTP/SCP connections, e.g. to your
  $ aws ec2 authorize-security-group-ingress \
      --group-id sg-YYYY \
      --protocol tcp \
      --port 22 \
      --cidr 0.0.0.0/0

Note: I recommend using different SGs for your FTP/FTPS rules and your SFTP/SCP rules. FTP/FTPS rules are more complex, and it is clearer to manage an SG named "FTP", with all of the related FTP rules, and separately to have an SG named "SFTP", with the SFTP/SCP related rules.

If you are only allowing SFTP/SCP access, that should suffice for the security group configuration for your instance. Allowing FTP/FTPS connections requires more security group tweaks.

FTP uses multiple TCP connections: one for the control connection, and separate other connections for data transfers (directory listings and file uploads/downloads). The ports used for these data connections are dynamically negotiated over the control connection; it is this dynamic nature of the data connections which causes complexity with network access rules. This site does a great job of describing these issues in more detail:

  http://slacksite.com/other/ftp.html

Remember how I said that SGs are similar to NAT rules? This similarity is one of the reasons why the ProFTPD NAT howto is relevant here as well.
We want to configure ProFTPD to use a known range of ports for its passive
data transfers, and then we want to configure our FTP SG to allow access to
that known port range. Thus we would use something like this in the
  PassivePorts 60000 65535

And then, to configure the SG to allow those ports:

  $ aws ec2 authorize-security-group-ingress \
      --group-id sg-XXXX \
      --protocol tcp \
      --port 60000-65535 \
      --cidr 0.0.0.0/0

The SFTP/SCP protocols only use a single TCP connection, and thus they do not require any other special configuration/access rules.
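Since the SG rule and the PassivePorts directive must describe the same range, one way to keep the two in sync is to derive the --port value from the directive itself. The following is only a minimal sketch: the passive_port_range function name is made up for illustration, and it assumes it is handed a single "PassivePorts MIN MAX" line.

```shell
#!/bin/sh
# Hypothetical helper (not part of ProFTPD or the AWS CLI): turn a
# "PassivePorts MIN MAX" directive line into the "MIN-MAX" form used
# for the --port option of authorize-security-group-ingress.
passive_port_range() (
  set -f        # no filename globbing while splitting the line
  set -- $1     # split the line on whitespace into $1 $2 $3
  if [ "$#" -ne 3 ] || [ "$1" != "PassivePorts" ]; then
    echo "not a PassivePorts directive" >&2
    exit 1
  fi
  case "$2$3" in
    *[!0-9]*) echo "ports must be numeric" >&2; exit 1 ;;
  esac
  if [ "$2" -gt "$3" ]; then
    echo "minimum port is larger than maximum port" >&2
    exit 1
  fi
  echo "$2-$3"
)

passive_port_range "PassivePorts 60000 65535"
```

The printed value could then be supplied as the --port argument of the aws ec2 authorize-security-group-ingress command, so that a change to the directive is less likely to leave the SG rule behind.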
Public vs Private Instance Addresses
If your EC2 instance will be supporting FTP/FTPS sessions, then you will need
to determine whether your instance has a public address. If so, that address
needs to be configured using the
So how can you tell what the public address of your EC2 instance is, if it
even has one? You can use the EC2 instance metadata, via
  $ curl http://169.254.169.254/latest/meta-data/public-hostname

If your instance has a public address, the DNS name to use would be returned. Otherwise, you might see something like this:

  $ curl http://169.254.169.254/latest/meta-data/public-hostname
  <?xml version="1.0" encoding="iso-8859-1"?>
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
           "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
    <title>404 - Not Found</title>
   </head>
   <body>
    <h1>404 - Not Found</h1>
   </body>
  </html>

which indicates that your EC2 instance does not have a public address. And if your instance does not have a public address, then you do not need to use the MasqueradeAddress directive.
Here's one solution for handling this situation: obtain the public hostname
for your instance, store it in an environment variable, and then use that
environment variable in your

  $ export EC2_PUBLIC_HOSTNAME=`curl -f -s http://169.254.169.254/latest/meta-data/public-hostname`

The -f option is necessary, in case the instance does not have a public address. The -s option simply makes for quieter shell scripts. Then, in your proftpd.conf, you might use:

  MasqueradeAddress %{env:EC2_PUBLIC_HOSTNAME}

If the instance does not have a public address, though, that environment variable will be the empty string, and proftpd will fail to start
up because of that. Better would be to automatically handle the
"no public address" case, if we can. Assume you have a shell script for
starting proftpd which does something like this, using our
EC2_PUBLIC_HOSTNAME environment variable:
  PROFTPD_ARGS=""

  # If we have a public hostname, then the string will not be
  # zero length, and we define a property for ProFTPD's use.
  if [ ! -z "$EC2_PUBLIC_HOSTNAME" ]; then
    PROFTPD_ARGS="$PROFTPD_ARGS -DUSE_MASQ_ADDR"
  fi

Then, in your proftpd.conf, you use both that property and the environment variable notation:

  <IfDefined USE_MASQ_ADDR>
    MasqueradeAddress %{env:EC2_PUBLIC_HOSTNAME}
  </IfDefined>

Fortunately the EC2 instance addressing does not require any additional changes/tweaks to the AWS Security Groups.
Elastic Load Balancing

Yes, ELBs can be used for FTP. Like SGs, though, it's complicated by FTP's use of multiple TCP connections; for SFTP/SCP, ELBs are simpler to configure.

The first thing to keep in mind is that ELBs only distribute (i.e. "balance") connections in a round-robin fashion among the backend TCP servers; they do not distribute connections based on the load of those backend servers. (The balancing algorithm is slightly different for HTTP servers, but that does not apply to ProFTPD.) This means that any user might connect to any of your ProFTPD instances; this, in turn, means that users must be able to login on all instances, and that the files for all users should be available on all instances. These requirements lead to the requirements for centralized/shared authentication data, and for shared filesystems.

The centralized/shared authentication data can be handled by using e.g. SQL databases, LDAP directories, or even synchronized password files. For shared filesystems, the popular approaches are:

There are probably other solutions as well; the key is to have the users' files available on any/every instance.

The next thing to keep in mind is whether you have an EC2 Classic account, or whether you are using AWS VPC. Chances are that you are using a VPC. ELBs for an EC2 Classic account can only be configured to listen on a restricted list of ports, i.e.:
Let's assume that you are using a VPC, and thus you configure a TCP listener on your ELB for port 21, which uses the instance port 21. And for SFTP/SCP, it would be a TCP listener for port 22, using instance port 22. Obviously you would not use HTTP or HTTPS listeners, but what about an SSL listener, for FTPS? No. An SSL listener performs the SSL/TLS handshake first, then forwards the plaintext messages to the backend instance. But FTPS is a "STARTTLS" protocol, which means the connection is first unencrypted, and then feature negotiation happens on that connection, and then the SSL/TLS handshake happens. ELBs do not support STARTTLS protocols, thus you cannot use them for terminating SSL/TLS sessions for FTP servers.
Your ProFTPD configuration might use multiple different ports, for different
An ELB wants to perform health checks on its backend instances, to know that each instance is up, running, and available to handle connections. ELBs can perform HTTP requests as health checks, or make TCP connections. ProFTPD is not an HTTP server, so using TCP health checks is necessary. You would configure the ELB to make TCP connections to the ProFTPD port, e.g. port 21 for FTP/FTPS, and/or port 22 for SFTP/SCP.
What about the range of ports defined via

  client --- ctrl ---> ELB:21 --- ctrl ---> instance:21

The client and server negotiate a passive data transfer; the FTP server tells the client, over the control connection, an address and port to which to connect. Now, let's assume that ProFTPD gives the address of the ELB, and one of the PassivePorts; we'll use port 65000 for this example.
The FTP client connects to the address/port on the ELB, like this:
  client --- data ---> ELB:65000 --- data ---> instance:65000

This would mean that the ELB would need TCP listeners for the PassivePorts, and that MasqueradeAddress would
need to point to the ELB DNS name. So why did I say that the ELB did not
need those extra TCP listeners?
If your ELB will only ever have just one backend instance, then the above configuration would work. Your EC2 instance might be in a VPC, with no public address, and thus perhaps the only way to make your FTP server there reachable is using an ELB.

Where forcing passive data connections through an ELB starts to fail is when there are multiple backend instances. Consider the case where your ELB might have 3 instances:

           +--> instance1:21
  ELB:21 --|--> instance2:21
           +--> instance3:21

An FTP client connects to the ELB, and the ELB selects instance #2:

  client --- ctrl ---> ELB:21 --- ctrl ---> instance2:21

So far, so good. The client requests a passive data transfer; the FTP server tells the client to connect to the ELB address, port 65000, but the ELB sends that connection to instance #3, not instance #2:

  client --- data ---> ELB:65000 --- data ---> instance3:65000

This can happen because the ELB does not understand FTP; it does not know that the data connection is related, in any way, to any other connections. To the ELB, all TCP connections are independent, and thus any connection will be routed, round-robin, to any backend instance. There is no guarantee that the data connections, going through the ELB, will connect to the proper backend instance. If there is only one backend instance, though, everything will work as expected.
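To make the failure mode concrete, here is a toy simulation of a round-robin balancer that treats every TCP connection independently. It only illustrates the behavior described above and says nothing about how an ELB is actually implemented; the function and variable names are made up for this sketch.

```shell
#!/bin/sh
# Toy round-robin balancer: each new TCP connection goes to the next
# backend, with no knowledge of which FTP session it belongs to.
INSTANCES=3
next=0   # counter of connections handed out so far

# Sets the variable "picked" to the chosen backend (1..INSTANCES).
# A global variable is used, rather than command substitution, so that
# the counter survives between calls.
pick_backend() {
  picked=$(( next % INSTANCES + 1 ))
  next=$(( next + 1 ))
}

# Some other client connects first, then our client opens its control
# connection, then its passive data connection via the ELB address.
pick_backend; other=$picked
pick_backend; ctrl=$picked
pick_backend; data=$picked

echo "control connection -> instance$ctrl"
echo "data connection    -> instance$data"
if [ "$ctrl" -ne "$data" ]; then
  echo "mismatch: instance$data has no session expecting this data connection"
fi
```

With three backends, the control connection lands on instance2 and the data connection on instance3, which is the mismatch shown in the diagrams. Setting INSTANCES=1 makes both connections land on the same instance.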
In order to properly support multiple backend instances (which is one of the
goals/benefits of using an ELB in the first place) for FTP, then, the trick
is to not force data connections through the ELB. Instead, the
  client --- ctrl ---> ELB:21 --- ctrl ---> instance2:21

And for the data transfer, ProFTPD tells the client the instance public hostname, and port 65000:

  client -------------- data -------------> instance2:65000

Notice how, with this configuration, the TCP connection for the data transfer bypasses the ELB completely. This is why you do not need to configure any TCP listeners on the ELB for those PassivePorts, and why
you do not want MasqueradeAddress using the ELB DNS name;
you do not want passive data connections going through the ELB.
Now you have an ELB with multiple backend FTP servers. Success, right? Maybe.
There are some caveats. FTP clients might notice that they connect to
one name (the ELB DNS name), but for data transfers, they are being told
(by the FTP server) to connect to a different name; some FTP clients
might warn/complain about this mismatch. ProFTPD would definitely complain
about this mismatch, for it would see the control connection as originating
from the ELB, but the data connection originating from a different address,
and would refuse the data transfer. To allow data transfers to work, then,
you would need to add the following to your

  # Allow "site-to-site" transfers, since that is what FTP traffic with
  # an ELB looks like.
  AllowForeignAddress on

which has its own security implications.

Next, there is the ELB idle timeout setting to adjust. The default is 60 seconds. During a data transfer, most FTP clients will be handling the data connection, and the control connection is idle. Thus if the data transfer lasts longer than 60 seconds, the ELB might terminate the idle control connection, and the FTP session is lost. Unfortunately the maximum allowed idle timeout for ELBs is 1 hour (3600 seconds); for large (or slow) data transfers, even that timeout could be a problem. There are ways of keeping the control connection from being idle for too long, using keepalives. Note that this idle timeout is not really an issue for SFTP/SCP sessions, as all data transfers for them use the same single TCP connection.
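As a rough way to judge whether the idle timeout will matter for your transfers, you can estimate the transfer duration and compare it with the configured timeout. This is a back-of-the-envelope sketch only: the exceeds_idle_timeout function is hypothetical, and it assumes a constant transfer rate, which real transfers rarely have.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when a transfer of
# SIZE bytes at RATE bytes/second would take longer than TIMEOUT seconds.
exceeds_idle_timeout() {
  size=$1; rate=$2; timeout=$3
  # Round up, so that a partial final second still counts.
  seconds=$(( (size + rate - 1) / rate ))
  [ "$seconds" -gt "$timeout" ]
}

# 10 GiB at 2 MiB/s takes 5120 seconds, longer than the 3600 second maximum.
if exceeds_idle_timeout $(( 10 * 1024 * 1024 * 1024 )) $(( 2 * 1024 * 1024 )) 3600; then
  echo "control connection may be dropped; keepalives would be needed"
fi
```

If the estimate comes out above the timeout, raising the timeout alone will not be enough, and the control connection needs to be kept active by other means.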
Last, using an ELB only for FTP control connections, and using direct
connections for the FTP data transfers only works if your backend EC2 instances
have public hostnames; for instances in a VPC, that may not be true.
So how can we use an ELB for multiple backend instances that only have private
addresses? Sadly, the answer is: you can't. For load balancing FTP sessions
among multiple backend EC2 instances with private addresses, you need an
FTP-aware proxy, such as ProFTPD with the
DNS and AWS Route53
Instead of using an AWS ELB for balancing/distributing connections across
your pool of ProFTPD-running instances, you can use DNS tricks to implement
the same functionality. Note, however, these DNS tricks still assume that
your EC2 instances are publicly reachable, i.e. have public hostnames.
With DNS load balancing, the client resolves a DNS name to an IP address,
and connects to that IP address:
Within AWS, the Route53 service
can be used as the DNS service for your domain names. AWS Route53 calls this round robin of addresses a weighted routing
policy, as each address associated with a name can be given a "weight",
affecting the probability that that address will be returned, by Route53,
when the DNS name is resolved to an IP address. Other routing policies are
supported, e.g. latency-based routing
(so that the instance with the fastest response time is chosen), and
geolocation-based routing (the instance address
chosen is based on the location of the resolving client).
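To see how weights translate into traffic shares, the following sketch assumes the usual interpretation of a weighted policy, namely that a record is returned with probability equal to its weight divided by the sum of all weights for that name. The weight_share function is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: the first argument is the weight of one record,
# the remaining arguments are the weights of all records for the name.
# Prints that record's share of responses as a whole percentage,
# rounded down.
weight_share() (
  mine=$1
  shift
  total=0
  for w in "$@"; do
    total=$(( total + w ))
  done
  if [ "$total" -eq 0 ]; then
    echo "sum of weights is zero" >&2
    exit 1
  fi
  echo $(( 100 * mine / total ))
)

# Three instances with weights 2, 1 and 1: the first one is expected
# to be returned for roughly half of the lookups.
weight_share 2 2 1 1
```

Equal weights for all instances give the plain round-robin behavior; DNS caching by resolvers and clients means the observed distribution of connections will only approximate these shares.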
If you are using AWS Route53, then you will need to configure health checks,
just as you would for an ELB. Route53 supports TCP health checks, which
you would point at your FTP/FTPS port (21) or SFTP/SCP port (22) on your
instances.
Since any/all clients could connect to any/all of the EC2 instances associated
with your DNS name, all of the users would need to be able to login on any
instance, and have their files/data available. Thus using a shared filesystem
for the files (such as s3fs, NFS, Samba, gluster, etc) and a centralized/shared authentication
mechanism (e.g. SQL database, LDAP directory, etc) would be
needed.
Future Work
Frequently Asked Questions
Question: I need to send particular users only to
a particular instance/set of instances. How do I configure AWS to do this?
The AWS services like ELBs and Route53 understand TCP connections, and the
HTTP protocol, but they do not understand FTP. And understanding of the
protocol is necessary, so that you know how/when to expect the user name, and
how to redirect/proxy the backend connection. This is why you cannot use
AWS to do per-user balancing. However, you can use the
Question: I am using ELBs for my pool of ProFTPD
servers. I would like my logs to show the IP address of the connecting
clients, but all I get is the IP address of the ELB. Is there a way to get
the original IP address, an equivalent to the
To enable use of the
The