CommuniGate Pro: Clusters

If your site serves many Domains, you may want to install several independent CommuniGate Pro Servers and distribute the load by distributing domains between the servers. In this case you do not need to employ the special Cluster Support features. However, if you have one or several Domains with 100,000 or more Accounts in each, and you cannot guarantee that clients will always connect to the proper server, or if you need dynamic load balancing and very high availability, you should implement a CommuniGate Pro Cluster on your site.

Many vendors use the term Cluster for simple fail-over or hot stand-by configurations. The CommuniGate Pro software can be used in fail-over, as well as in Distributed Domains configurations; however, these configurations are not referred to as Cluster configurations.

A CommuniGate Pro Cluster is a set of Server computers that handle the site mail load together. Each Cluster Server hosts a set of regular, non-shared domains (the CommuniGate Pro Main Domain is always a non-shared one), and it also serves (together with other Cluster Servers) a set of Shared Domains.

To use CommuniGate Pro servers in a Cluster, you need a special CommuniGate Pro Cluster License.

Please read the Scalability section first to learn how to estimate your Server load, and how to get the most out of each CommuniGate Pro Server running in the Single-server or in the Cluster mode.

Cluster Types

There are two main types of Cluster configurations: Static and Dynamic.

Each Account in a Shared Domain served with a Static Cluster is created (hosted) on a certain Server, and only that Server can access the account data directly. When a Static Cluster Server needs to perform any operation with an account hosted on a different Server, it establishes a TCP/IP connection with the account Host Server and accesses account data via that Host Server. This architecture allows you to use local (i.e. non-shared) storage devices for account data.

Note: some vendors have "Mail Multiplexor"-type products. Those products usually implement a subset of Static Cluster Frontend functionality.

Accounts in Shared Domains served with a Dynamic Cluster are stored on shared storage, so each Cluster Server (except for Frontend Servers, see below) can access the account data directly. At any given moment, one of the Cluster Servers acts as a Cluster Controller synchronizing access to Accounts in Shared Domains. When a Dynamic Cluster Server needs to perform any operation with an account currently opened on a different Server, it establishes a TCP/IP connection with that "current host" Server and accesses account data via that Server. This architecture provides the highest availability (all accounts can be accessed as long as at least one Server is running), and does not require file-locking operations on the storage device.

Supported Services

The CommuniGate Pro Clustering features support the following services:

SMTP mail receiving
SMTP mail delivery
SIP signaling
XIMSS access
WebUser Interface access
XMPP signaling
POP3 access
IMAP access
FTP
RADIUS
TFTP
HTTP access to File Storage (including uploading)
ACAP access
PWD access and remote administration
CG/PL inter-task communication
HTTP request to external servers
RPOP polling
Remote SIP Registrations

Frontend Servers

Clusters of both types are usually equipped with Frontend Servers. Frontend Servers cannot access Account data directly - they always open connections to other (Backend) Servers to perform any operation with Account data.

Frontend servers accept TCP/IP connections from client computers (usually - from the Internet). In a pure Frontend-Backend configuration no Accounts are created on any Frontend Server, but nothing prohibits you from serving some Domains (with Accounts and mailing lists) directly on the Frontend servers.

When a client establishes a connection with one of the Frontend Servers and sends the authentication information (the Account name), the Frontend server detects on which Backend server the addressed Account can be opened, and establishes a connection with that Backend Server.

The Frontend Servers:

handle all SSL/TLS encryption/decryption operations
implement SIP Farm functionality, and most of SIP protocol features, including NAT Traversal
handle most of the SMTP relaying operations themselves
run Real-Time CG/PL applications and Media Server Channels
virtually eliminate inter-server communications between Backend Servers, and (in Dynamic Clusters) provide second-level load balancing
provide an additional layer of protection against Internet attacks and allow you to avoid exposing Backend Servers to the Internet
smooth out the external traffic (soften peaks in the site load), and protect the Backend Servers from the Denial-of-Service attacks
execute outgoing HTTP requests
execute outgoing Remote SIP Registration tasks
execute outgoing Remote POP (RPOP) polling sessions

If the Frontend Servers are directly exposed to the Internet, and the security of a Frontend Server operating system is compromised so that someone gets unauthorized access to that Server OS, the security of the site is not totally compromised. Frontend Servers do not keep any Account information (Mailboxes, passwords) on their disks. The "cracker" would then have to go through the firewall and break the security of the Backend Server OS in order to get access to any Account information. Since the network between Frontend and Backend Servers can be disabled for all types of communications except the CommuniGate Pro inter-server communications, breaking the Backend Server OS is virtually impossible.

Both Static and Dynamic Clusters can work without dedicated Frontend Servers. This is called a symmetric configuration, where each Cluster Server implements both Frontend and Backend functions.

In the example below, the domain1.dom and domain2.dom Domain Accounts are distributed between three Static Cluster Servers, and each Server accepts incoming connections for these Domains. If the Server SV1 receives a connection for the Account kate@domain1.dom located on the Server SV2, the Server SV1 starts to operate as a Frontend Server, connecting to the Server SV2 as the Backend Server hosting the addressed Account.
At the same time, an external connection established with the server SV2 can request access to the ada@domain1.dom Account located on the Server SV1. The Server SV2 acting as a Frontend Server will open a connection to the Server SV1 and will use it as the Backend Server hosting the addressed Account.
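The symmetric behavior in this example can be sketched as a simple lookup. This is only an illustration, not the Server's actual routing code: the `ACCOUNT_HOSTS` table and the `route_connection` function are hypothetical names, and the two host assignments are the ones given in the example.

```python
# Hypothetical table: which Static Cluster Server hosts each Account.
# The two entries are taken from the example in the text.
ACCOUNT_HOSTS = {
    "kate@domain1.dom": "SV2",
    "ada@domain1.dom": "SV1",
}


def route_connection(receiving_server: str, account: str) -> str:
    """Describe how a symmetric Static Cluster Server handles a connection."""
    host = ACCOUNT_HOSTS[account]
    if host == receiving_server:
        # The Account is hosted on this Server: serve it directly.
        return f"{receiving_server} serves {account} directly"
    # Otherwise act as a Frontend and use the hosting Server as the Backend.
    return f"{receiving_server} acts as Frontend, {host} acts as Backend"


print(route_connection("SV1", "kate@domain1.dom"))
print(route_connection("SV2", "ada@domain1.dom"))
```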

In a symmetric configuration, the number of inter-server connections can be equal to the number of external (user) access-type (POP, IMAP, HTTP) connections. For a symmetric Static Cluster, the average number of inter-server connections is M*(N-1)/N, where M is the number of external (user) connections, and N is the number of Servers in the Static Cluster. For a symmetric Dynamic Cluster, the average number of inter-Server connections is M*(N-1)/N * A/T, where T is the total number of Accounts in Shared Domains, and A is the average number of Accounts opened on each Server. For large ISP-type and portal-type sites, the A/T ratio is small (usually - not more than 1:100).
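The two estimates can be evaluated directly. The sketch below simply encodes the formulas from the paragraph above; the function names and the sample figures (10,000 external connections, 4 Servers, A/T of 1:100) are assumptions chosen for illustration.

```python
def static_interserver(m: float, n: int) -> float:
    """Average inter-server connections, symmetric Static Cluster: M*(N-1)/N."""
    return m * (n - 1) / n


def dynamic_interserver(m: float, n: int, a: float, t: float) -> float:
    """Average inter-server connections, symmetric Dynamic Cluster: M*(N-1)/N * A/T."""
    return static_interserver(m, n) * a / t


# Assumed sample values: M = 10,000, N = 4, A/T = 1/100.
print(static_interserver(10_000, 4))           # 7500.0
print(dynamic_interserver(10_000, 4, 1, 100))  # 75.0
```

With these assumed numbers, the Dynamic Cluster needs about one hundredth of the inter-server connections of the Static Cluster, which is the effect of the small A/T ratio.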

In a pure Frontend-Backend configuration, the number of inter-server connections is usually the same as the number of external (user) connections: for each external connection, a Frontend Server opens a connection to a Backend Server. A small number of inter-server connections can be opened between Backend Servers, too.

Withdrawing Frontend Servers from a Cluster

To remove a Frontend Server from a Cluster (for maintenance, hardware upgrade, etc.), reconfigure your Load Balancer or the round-robin DNS server to stop redirection of incoming requests to this Frontend Server address. After all current POP, IMAP, SMTP sessions are closed, the Frontend Server can be shut down. Since the WebMail sessions do not use persistent HTTP connections, a Frontend Server in a WebMail-only Cluster can be shut down almost immediately.

Access to all Shared Domain Accounts is provided without interruption as long as at least one Frontend Server is running.

If a Frontend server fails, no Account becomes unavailable and no mail is lost. While POP and IMAP sessions conducted via the failed Frontend server are interrupted, all WebUser Interface sessions remain active, and WebUser Interface clients can continue to work via remaining Frontend Servers. POP and IMAP users can immediately re-establish their connections via remaining Frontend Servers.

If the failed Frontend server cannot be repaired quickly, its Queue can be processed with a different server, as a Foreign Queue.

Cluster Server Configuration

This section specifies how each CommuniGate Pro Server should be configured to participate in a Static or Dynamic Cluster. These settings control inter-server communications in your Cluster.

First, install CommuniGate Pro Software on all Servers that will take part in your Cluster. Specify the Main Domain Name for all Cluster Servers. Those names should differ in the first domain name element only:

back1.isp.dom, back2.isp.dom, front1.isp.dom, front2.isp.dom, etc.

Remember that Main Domains are never shared, so all these names should be different. You may want to create only the Server administrator accounts in the Main Domains - these accounts can be used to connect to that particular Server and configure its local, Server-specific settings.
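A planned set of Main Domain Names can be checked against both rules before installation. This is a minimal sketch; the function name `differ_in_first_element_only` is hypothetical and the check is not part of CommuniGate Pro itself.

```python
def differ_in_first_element_only(names: list[str]) -> bool:
    """Return True if all names are distinct and share everything after the first element."""
    lowered = [name.lower() for name in names]  # domain names are case-insensitive
    # Everything after the first dot must be identical for all Servers.
    suffixes = {name.partition(".")[2] for name in lowered}
    all_different = len(set(lowered)) == len(lowered)
    return all_different and len(suffixes) == 1


print(differ_in_first_element_only(
    ["back1.isp.dom", "back2.isp.dom", "front1.isp.dom", "front2.isp.dom"]
))
```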

Cluster Network

Use the WebAdmin Interface to open the Settings->General->Cluster page on each Backend and Frontend Server, and enter all Frontend and Backend Server IP addresses. CommuniGate Pro Servers will accept Cluster connections from the specified IP addresses only. If the Frontend Servers use dedicated Network Interface Cards (NICs) to communicate with Backend Servers, specify the IP addresses the Frontend Servers have on that internal network:

Cluster Network

This Server Cluster Address:

On restart	Active
	[192.168.0.5]

Backend Server Addresses:

Frontend Server Addresses:

Dynamic Cluster Locker Log:

Admin Connections:

This Server Cluster Address: This setting specifies the local network address this Server will use to communicate with other Servers in the Cluster. Connections to other Servers will be established from this IP address. This address is used as this Server "name", identifying the Server in the Cluster.
If you change this setting value, the new value will be in effect only after Server restart.
Admin Connections: See the Security Issues section below.

Cluster Communication

In all types of CommuniGate Pro cluster, connections to Backend servers can be established from Frontend servers and from other Backend servers.

If your Backend Servers use non-standard port numbers for mail services, change the Backend Server Ports values.

For example, if your Backend Servers accept WebUser Interface connections not on the port number 8100, but on the standard HTTP port 80, set 80 in the HTTP User field and click the Update button.

Inter-Cluster Communications
Backend Port	Cache	Backend Port	Cache
`Delivery:`		`Submit:`
`POP:`		`IMAP:`
`HTTPU:`		`HTTPA:`
`XIMSS:`		`XMPP:`
`PWD:`		`ACAP:`
Log Level	Cache	Log Level	Cache
`Object Admin:`		`Mailboxes:`
`Cluster Admin:`

CommuniGate Pro can reuse inter-server connections for some services. Instead of closing a connection when an operation is completed, the connection is placed into an internal cache, and it is reused later, when this server needs to connect to the same server. The Cache parameter specifies the size of that connection cache. If there are too many connections in the cache, older connections are closed and pushed out of the cache.
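The caching behavior described above can be pictured with a small model. This is a sketch under assumptions, not the Server's implementation: the class names, the oldest-first eviction order, and the address 192.168.0.6 are illustrative, and the text does not specify how the real cache is organized internally.

```python
from collections import deque


class FakeConnection:
    """Stand-in for an inter-server connection (illustration only)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ConnectionCache:
    """Keep up to `size` idle connections; close the oldest when the cache is full."""

    def __init__(self, size: int):
        self.size = size
        self.idle = deque()  # (server, connection) pairs, oldest first

    def release(self, server: str, conn) -> None:
        """Place a finished connection into the cache instead of closing it."""
        self.idle.append((server, conn))
        while len(self.idle) > self.size:
            _, oldest = self.idle.popleft()
            oldest.close()  # pushed out of the cache

    def acquire(self, server: str):
        """Return a cached connection to `server`, or None if none is cached."""
        for index, (cached_server, conn) in enumerate(self.idle):
            if cached_server == server:
                del self.idle[index]
                return conn
        return None


cache = ConnectionCache(size=2)
first, second, third = FakeConnection(), FakeConnection(), FakeConnection()
cache.release("192.168.0.5", first)
cache.release("192.168.0.6", second)
cache.release("192.168.0.5", third)  # cache is full, so `first` is closed
print(first.closed)
print(cache.acquire("192.168.0.6") is second)
```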

Cluster members use the PWD protocol to perform administration operations remotely, on other cluster members. The port number they use to connect to other Cluster members is the same as the port specified for the PWD protocol connections. These remote administrative operations have their own Log Level settings.

Servers in a Dynamic Cluster use the SMTP modules of other Cluster Backend members for remote message delivery (though the protocol between the servers is not the SMTP protocol). Use the Delivery port setting to specify the port number used with SMTP modules on other cluster members.

Servers in a Dynamic Cluster use the SMTP modules of other Cluster members to submit messages remotely (though the protocol between the servers is not the SMTP protocol). Use the Submit port setting to specify the port number used with SMTP modules on other cluster members.

When a user session running on one Cluster member needs to access a foreign Mailbox, and the account that this Mailbox belongs to cannot be opened by the same Cluster member, the Cluster Mailbox manager is used to access Mailboxes remotely. The Cluster Mailbox manager uses the IMAP port to connect to other cluster members. The Cluster Mailbox manager has its own Log Level setting.

Service Processing
HTTP Client:		RPOP Client:

When a Cluster is configured so that only the frontend servers can access the Internet, certain services can run on those frontend servers only.

HTTP Client

This setting specifies how the outgoing HTTP requests (initiated with XIMSS sessions, CG/PL applications, Automated Rules, etc.) are executed.

Locally

when there is an HTTP request to execute, it is executed on the same Server (this is the "regular", single-server processing mode).

Locally for Others

HTTP requests are executed on the same Server.
The Dynamic Cluster Controller is informed that this Server can execute HTTP requests for other Cluster members.
The Dynamic Cluster Controller collects and distributes information about all active Cluster members that have this option selected.

Remotely

when there is an HTTP request to execute, a request is relayed to some Cluster member that has this setting set to Locally for Others.

Auto

if this Server is not a Dynamic Cluster member, same as Locally
if this Server is a Dynamic Cluster frontend, same as Locally for Others
if this Server is a Dynamic Cluster backend, same as Remotely if there are other Dynamic Cluster members configured as Locally for Others, if there are none - same as Locally
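The Auto rules above reduce to a three-way decision. The sketch below restates them as a function; the name `resolve_auto` and its parameter names are hypothetical.

```python
def resolve_auto(is_dynamic_member: bool, is_frontend: bool,
                 others_available: bool) -> str:
    """Return the effective mode for a Server whose setting is Auto.

    others_available: True if some other Dynamic Cluster member is
    configured as "Locally for Others".
    """
    if not is_dynamic_member:
        return "Locally"
    if is_frontend:
        return "Locally for Others"
    # Dynamic Cluster backend: relay only if some member can take the request.
    return "Remotely" if others_available else "Locally"


print(resolve_auto(False, False, False))  # Locally
print(resolve_auto(True, True, False))    # Locally for Others
print(resolve_auto(True, False, True))    # Remotely
print(resolve_auto(True, False, False))   # Locally
```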

RPOP Client

This setting specifies how the outgoing RPOP requests are executed. The setting values have the same meanings as the HTTP Client setting values.

Assigning IP Addresses to Shared Domains

A CommuniGate Pro Cluster can serve several Shared Domains. If you plan to provide POP and IMAP access to Accounts in those Domains, you may want to assign dedicated IP addresses to those Domains to simplify client mailer setups. See the Access section for more details.

If you use Frontend Servers, only Frontend Servers should have dedicated IP Addresses for Shared Domains. Inter-server communications always use full account names (accountname@domainname), so there is no need to dedicate IP Addresses to Shared Domains on Backend Servers.

If you use the DNS round-robin mechanisms to distribute the site load, you need to assign N IP addresses to each Shared Domain that needs dedicated IP addresses, where N is the number of your Frontend Servers. Configure the DNS Server to return these addresses in the round-robin manner:

In this example, the Cluster is serving two Shared Domains: domain1.dom and domain2.dom, and the Cluster has three Frontend Servers. Three IP addresses are assigned to each domain name in the DNS server tables, and the DNS server returns all three addresses when a client is requesting A-records for one of these domain names. Each time the DNS server "rotates" the order of the IP addresses in its responses, implementing the DNS "round-robin" load balancing (client applications usually use the first address in the DNS server response, and use other addresses only if an attempt to establish a TCP/IP connection with the first address fails).

When configuring these Shared Domains in your CommuniGate Pro Servers, you assign all three IP addresses to each Domain.
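The rotation performed by the DNS server can be simulated in a few lines. This is only a model of the round-robin idea; the function name is hypothetical and the three addresses are placeholders from the 192.0.2.0/24 documentation range, not values from a real configuration.

```python
from collections import deque


def round_robin_responses(addresses: list[str], count: int) -> list[list[str]]:
    """Return `count` successive DNS answers, rotating the address order each time."""
    ring = deque(addresses)
    responses = []
    for _ in range(count):
        responses.append(list(ring))
        ring.rotate(-1)  # the next answer starts with the next address
    return responses


# Assumed addresses for domain1.dom, one per Frontend Server.
for answer in round_robin_responses(["192.0.2.1", "192.0.2.2", "192.0.2.3"], 3):
    print(answer)
```

Since clients usually try the first address in each answer, three consecutive lookups in this model are spread over the three Frontend Servers.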

If you use a Load Balancer to distribute the site load, you need to place only one "external" IP address into DNS records describing each Shared Domain. You assign one "virtual" (LAN) IP address to each Shared Domain on each Frontend Server:

In this example, the Cluster is serving two Shared Domains: domain1.dom and domain2.dom, and the Cluster has three Frontend Servers. One IP address is assigned to each Shared Domain in the DNS server tables, and those addresses are external (Internet) addresses of your Load Balancer. You should instruct the Load Balancer to distribute connections received on each of its external IP addresses to three internal IP addresses - the addresses assigned to your Frontend Servers.

When configuring these Shared Domains in your CommuniGate Pro Servers, you assign these three internal IP addresses to each Domain.

DNS MX-records for Shared Domains can point to their A-records.

Security Issues

The Frontend-Backend topology allows you to protect the site information and Backend Servers not only when a Frontend Server crashes because of some type of network attack, but even if the Frontend Server OS is "cracked" and an intruder gets complete ("root") access to the Frontend Server OS using a security hole in that OS.

To protect the site from these "cracks":

Do not use the Frontend Servers to administer the Shared Domains as a Frontend Server administrator. In this case you can disable the Admin Connections option on the Cluster page of all Backend Servers.
Enable the Admin Connections option on the current Cluster Controller and Backup Controller Servers only when you are adding a new Server to the Cluster. When that new Server is up and running, you can disable the Admin Connections option on the Cluster Controllers.

These measures do not cause any problem for your users that have the Domain Administrator rights and want to administer their Shared Domains (using WebAdmin Interface or CLI). They also do not cause any problem for your regular users that want to use the PWD module to update their passwords.

Cluster Configuration Details

Startup Parameters

The best way to specify additional startup parameters is to create the Startup.sh file inside the base directory.

To create a Static Cluster, use the --staticBackend startup parameter with the Backend servers, and the --staticFrontend startup parameter with the Frontend servers.

To create a Dynamic Cluster, use the --clusterBackend startup parameter with the Backend servers, and the --clusterFrontend startup parameter with the Frontend servers.
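The four startup parameters form a small table keyed by Cluster type and Server role. The sketch below only selects the parameter named in the text; the table and function names are hypothetical, and how the parameter is written into Startup.sh is not shown here.

```python
# Startup parameters as listed in the text, keyed by (cluster type, role).
STARTUP_PARAMETERS = {
    ("static", "backend"): "--staticBackend",
    ("static", "frontend"): "--staticFrontend",
    ("dynamic", "backend"): "--clusterBackend",
    ("dynamic", "frontend"): "--clusterFrontend",
}


def startup_parameter(cluster_type: str, role: str) -> str:
    """Return the startup parameter for a Server of the given type and role."""
    return STARTUP_PARAMETERS[(cluster_type.lower(), role.lower())]


print(startup_parameter("Dynamic", "Frontend"))  # --clusterFrontend
```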

Listeners

To protect your site from DoS attacks, you may want to open SMTP, POP, IMAP, and other Listeners and limit the number of connections accepted from the same IP address. Set those limits on Frontend servers only, since Backend servers receive all connections from Frontends, and each Frontend can open a lot of connections from the same IP address.
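A per-address limit of this kind can be modeled as a counter of open connections. The sketch below is illustrative only: the class name, the limit of 2, and the sample addresses are assumptions, and it does not reproduce the actual Listener implementation.

```python
from collections import Counter


class PerAddressLimiter:
    """Accept at most `limit` simultaneous connections from one IP address."""

    def __init__(self, limit: int):
        self.limit = limit
        self.open = Counter()  # currently open connections per address

    def try_accept(self, ip: str) -> bool:
        """Accept the connection unless the address has reached its limit."""
        if self.open[ip] >= self.limit:
            return False
        self.open[ip] += 1
        return True

    def connection_closed(self, ip: str) -> None:
        """Record that a connection from `ip` has ended."""
        if self.open[ip] > 0:
            self.open[ip] -= 1


frontend = PerAddressLimiter(limit=2)
# A third simultaneous connection from the same client address is refused.
print([frontend.try_accept("203.0.113.7") for _ in range(3)])
```

The same limiter on a Backend would refuse legitimate traffic, because every connection there arrives from one of a few Frontend addresses.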

WebAdmin

Usually the Backend servers are not directly accessible from the Internet. If you need to change the settings or monitor one of the Backend servers from "outside", you can use the WebAdmin interface of one of the Frontend servers, using the following URL:
http://Frontendaddress:8010/Cluster/12.34.56.78/
where 12.34.56.78 is the [internal] IP address of the Backend server you want to access.
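The URL pattern above can be assembled with a small helper. The function name is hypothetical, and front1.isp.dom is used here only as a sample Frontend address.

```python
def backend_webadmin_url(frontend_address: str, backend_ip: str) -> str:
    """Build the WebAdmin URL that reaches a Backend server via a Frontend server."""
    return f"http://{frontend_address}:8010/Cluster/{backend_ip}/"


print(backend_webadmin_url("front1.isp.dom", "12.34.56.78"))
# http://front1.isp.dom:8010/Cluster/12.34.56.78/
```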

SMTP

The outgoing mail traffic generated with regular (POP/IMAP) clients is submitted to the site using the A-records of the site Domains. As a result, the submitted messages go to the Frontend Servers and the messages are distributed from there.

Messages generated with WebUser clients and messages generated automatically (using the Automated Rules) are generated on the Backend Servers. Since usually the Backend servers are behind the firewall and since you usually do not want the Backend Servers to spend their resources maintaining SMTP queues, it is recommended to use the forwarding feature of the CommuniGate Pro SMTP module.

Select the Forward to option and specify the asterisk (*) symbol. In this case all messages generated on the Backend Servers will be quickly sent to the Frontend Servers and they will be distributed from there. If you do not want to use all Frontend servers for Backend mail relaying, change the Forward To setting to include the IP addresses of some Frontend Servers, separating the addresses with the comma (,) symbol.
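The two forms of the Forward To value can be produced as follows. This is a sketch of the value format described above; the function name and the sample addresses are assumptions.

```python
from typing import Optional


def forward_to_value(frontend_ips: Optional[list[str]] = None) -> str:
    """Return "*" to use all Frontend Servers, or a comma-separated address list."""
    if not frontend_ips:
        return "*"
    return ",".join(frontend_ips)


print(forward_to_value())                              # *
print(forward_to_value(["192.168.0.1", "192.168.0.2"]))  # 192.168.0.1,192.168.0.2
```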

RPOP

RPOP activity takes place on Backend servers in a Static Cluster, and on the Cluster Controller in a Dynamic Cluster. As a result, it is essential for those servers to be able to initiate outgoing TCP connections to remote servers. If the Backend servers are connected to a private LAN behind a firewall, you should install some NAT server software on that network and configure the Backend servers (using their OS TCP/IP settings) to route all non-local packets via the NAT server(s). Frontend servers can be used to run NAT services.

FTP

The FTP module does not "proxy" connections to Backend servers. Instead, it uses CLI to manage Account File Storage data on Backend servers. This eliminates a problem of Backend servers opening FTP connections directly to clients. If all FTP connections come to the Frontend servers, the FTP services on Backends can be switched off.

The FTP module running on cluster Frontends behind a load balancer and/or a NAT has the same problems as any FTP server running in such a configuration. To support the active mode, make sure that Frontend servers can open outgoing connections to client FTP ports (when running via a NAT, make sure that the "address in use" problems are addressed by the NAT software). To support the passive mode, make sure that your load balancer allows clients to connect directly to the Frontend ports the FTP module opens for individual clients.

LDAP

The LDAP module does not "proxy" connections to Backend servers. Instead, it uses CLI to authenticate users and, optionally, to perform LDAP Provisioning operations. If all LDAP connections come to the Frontend servers, the LDAP services on Backends can be switched off.

RADIUS

The RADIUS module does not "proxy" connections to Backend servers. Instead, it uses CLI to authenticate users and to update their statistical data. If all RADIUS connections come to the Frontend servers, the RADIUS services on Backends can be switched off.

SIP

See the Cluster Signals section.

Postmaster Account

Do not remove the "postmaster" Account from the Main Domains on your Backend servers. This Account is opened "by name" (bypassing the Router) when any other Cluster member has to connect to that Backend. You should also keep at least the "Can Modify All Domains and Accounts Settings" Access Right for the postmaster Account.

Cluster Of Clusters

For extremely large sites (more than 5,000,000 active accounts), you can deploy a Static Cluster of Dynamic Clusters. It is essentially the same as a regular Static Cluster with Frontend Servers, but instead of Backend Servers you install Dynamic Clusters. This solves the redundancy problem of Static Clusters, but does not require the extremely large Shared Storage devices or incur the excessive network traffic of extra-large Dynamic Clusters:

Frontend Servers in a "Cluster of Clusters" need access to the Directory in order to implement Static Clustering. The Frontend Servers only read the information from the Directory, while the Backend Servers modify the Directory when accounts are added, renamed, or removed. The hostServer attribute of Account directory records contains the name of the Backend Dynamic Cluster hosting the Account (the name of Backend Cluster Servers without the first domain name element).
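The hostServer value described above is a Backend Server name with its first domain name element removed. A minimal sketch, in which the function name and the sample name back1.cluster1.isp.dom are hypothetical:

```python
def host_server_name(backend_server_name: str) -> str:
    """Drop the first domain name element to get the Backend Dynamic Cluster name."""
    return backend_server_name.partition(".")[2]


# Every Server of the same Backend Dynamic Cluster yields the same value.
print(host_server_name("back1.cluster1.isp.dom"))  # cluster1.isp.dom
print(host_server_name("back2.cluster1.isp.dom"))  # cluster1.isp.dom
```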

Frontend Servers can be grouped into subsets for traffic segmentation. Each subset can have its own load balancer(s), and a switch that connects this Frontend Subset with every Backend Dynamic Cluster.

If you plan to deploy many (50 and more) Frontend Servers, the Directory Server itself can become the main site bottleneck. To remove this bottleneck and to provide redundancy on the Directory level, you can deploy several Directory Servers (each Server serving one or several Frontend subsets). Backend Dynamic Clusters can be configured to update only one "Master" Directory Server, and other Directory Servers can use replication mechanisms to synchronize with the Master Directory Server, or the Backend Clusters can be configured to modify all Directory Servers at once.