Kafka Security / SSL Protocols
Security is a top concern in modern enterprises: it is essential to mitigate the risks of attacks, unauthorised access, and data corruption. It must be addressed carefully, with both compliance and performance implications in mind.
Apache Kafka comes with built-in, configurable security protocols (the Kafka SSL protocols); which of them you implement (any, or all) is driven by the enterprise use case. Each security feature takes a toll on Kafka's streaming performance, so the impact needs to be analysed thoroughly and the appropriate trade-off understood before implementation. The best way to quantify this cost is to benchmark the cluster with each security layer enabled.
In this article we will cover the Kafka SSL protocol, authentication and encryption, and the types of and need for security in Kafka.
The diagram below visualizes the different security standards applied at different channels and areas.
Need for security in Kafka
In the real world, any organisational asset holding data needs to be secured, for obvious reasons including data integrity. Shrinking budgets and cuts in maintenance expenditure give way to multi-tenant systems, where different entities share a common playing ground. Hence selective access, along with proper authorization and authentication protocols, becomes the need of the hour in a Kafka SSL implementation.
Types of Kafka security
Security is implemented in four layers:
- Secure Kafka with SSL
Encryption of data over the wire using SSL (Secure Sockets Layer)/TLS (Transport Layer Security):
Data is encoded with a secret key before being sent over the network. This is a common protocol used by financial institutions and during online transactions.
- Authentication using SSL or SASL
Allows producers and consumers to authenticate to your Kafka cluster, which verifies their identity. It is also a secure way for your clients to prove their identity.
- Authorization using ACLs
Once your clients are authenticated, your Kafka brokers can check them against access control lists (ACLs) to determine whether a particular client is authorized to write to or read from a given topic.
- Kafka security at rest
Sensitive values in configuration files can be masked with at-rest encryption using the ConfigProvider interface in Java. This is essential, as all secrets should be masked and externalized (read from a single place) wherever possible.
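As an illustrative sketch of that externalization (the file paths and key names here are placeholders, not values from this article), Kafka's built-in FileConfigProvider can resolve a secret from an external file at broker startup:

```properties
# server.properties (illustrative fragment; paths are placeholders)
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider
# the keystore password is resolved at startup from an external, locked-down file
ssl.keystore.password=${file:/etc/kafka/secrets/ssl.properties:keystore.password}
```

The secrets file itself can then be restricted with filesystem permissions, so the main broker configuration never contains credentials in the clear.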
Encryption with SSL (Secure Sockets Layer)/TLS (Transport Layer Security)
Encryption solves the problem of the man-in-the-middle (MITM) attack. While being routed to and from your Kafka cluster, your packets travel the network and hop from machine to machine, traversing private and public networks. There is a real risk of these packets being intercepted and read along the way. Encryption protocols keep the packets safe while they enjoy their travels.
With Kafka SSL protocols enabled, your data is encrypted and securely transmitted over the network. With SSL, only the two endpoints of a connection (for example, a client and a broker) possess the ability to decrypt the packets being sent.
But this insurance comes at a price: the process of encryption and decryption introduces extra latency, which in some cases may even disqualify the system from meeting its streaming requirements.
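To make this concrete, enabling wire encryption on a broker amounts to a handful of properties such as the following sketch (hostnames, file paths, and passwords are placeholders):

```properties
# server.properties — enable a TLS listener (illustrative fragment)
listeners=SSL://kafka1.example.com:9093
ssl.keystore.location=/etc/kafka/ssl/kafka.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
ssl.truststore.location=/etc/kafka/ssl/kafka.truststore.jks
ssl.truststore.password=changeit
# set to "required" to also demand client certificates (mutual TLS)
ssl.client.auth=required
```

Clients then need a matching truststore (and, for mutual TLS, a keystore) in their own configuration before they can connect to the encrypted listener.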
Authentication using SSL or SASL
SASL stands for Simple Authentication and Security Layer. It is popular with Big Data systems; most likely your Hadoop setup already leverages it.
SASL/PLAIN: This is a classic username/password combination. These usernames and passwords have to be stored on the Kafka brokers in advance, and each change requires a rolling restart.
SASL/SCRAM: This is a username/password combination alongside a salted challenge-response mechanism, which makes it more secure. Username and password hashes are stored in ZooKeeper, which enables scaling security without restarting brokers. If you use SASL/SCRAM, make sure to enable SSL encryption so that credentials aren't sent in plaintext over the network.
SASL/GSSAPI (Kerberos): Kerberos is a network authentication protocol designed to provide strong authentication for client/server applications by using secret-key cryptography. A free implementation of this protocol is available from the Massachusetts Institute of Technology, and Kerberos is available in many commercial products as well.
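As a sketch of how SASL/SCRAM credentials are provisioned, the kafka-configs tool can create a user's hash in ZooKeeper without touching the brokers (the ZooKeeper address, username, and password below are placeholders):

```shell
# Illustrative: create a SCRAM-SHA-512 credential for user "alice"
kafka-configs.sh --zookeeper zk1.example.com:2181 --alter \
  --add-config 'SCRAM-SHA-512=[password=alice-secret]' \
  --entity-type users --entity-name alice
```

Because the credential lands in ZooKeeper rather than in a broker properties file, no rolling restart is needed when users are added or rotated.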
Once your Kafka clients are authenticated, Kafka needs to regulate what they can and cannot do. This is where authorization comes in, controlled by access control lists (ACLs).
ACLs are great because they can help you prevent disasters. For example, you may have a topic that needs to be writable from only a subset of clients or hosts. You want to prevent an average user from writing anything to these topics, thereby avoiding data corruption or deserialization errors. ACLs are also great if you have some sensitive data and need to prove to regulators that only authorized apps or users can access that data.
To add ACLs, you can use the kafka-acls command. It even has shortcuts for adding common producer or consumer permission sets.
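A sketch of those producer/consumer shortcuts (the principal, topic name, and ZooKeeper address are placeholders):

```shell
# Illustrative: allow user "alice" to produce to topic "payments"
kafka-acls.sh --authorizer-properties zookeeper.connect=zk1.example.com:2181 \
  --add --allow-principal User:alice \
  --producer --topic payments
```

The --producer flag expands into the individual Write, Describe, and Create permissions a producer needs, so you don't have to grant each operation by hand.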
Securing traffic between Kafka brokers
Securing inter-broker communication is important, as it can also be intercepted and compromised. To address this, Kafka ships with four listener security protocols: PLAINTEXT, SSL, SASL_PLAINTEXT and SASL_SSL. Kerberos is supported only with SASL_PLAINTEXT and SASL_SSL, and, as the names suggest, SSL and SASL_SSL provide SSL encryption.
The best protocol to go with is SASL_SSL, as it satisfies most organisations' compliance needs.
These can be configured with standard JAAS files and keytabs (which also help automate authentication).
Also, when first implementing these, it is better to open two ports, one on PLAINTEXT and one on SASL_PLAINTEXT (or SASL_SSL). Once the applications and Kafka clients are configured against the secured port, the unsecured port can be removed via rolling restarts.
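A minimal sketch of that migration step (the hostname and ports are placeholders):

```properties
# server.properties — expose both listeners during migration (illustrative)
listeners=PLAINTEXT://kafka1.example.com:9092,SASL_SSL://kafka1.example.com:9093
# once every client has moved to port 9093, drop the PLAINTEXT listener
# and apply the change with a rolling restart
```

Running both listeners side by side lets clients migrate at their own pace, with no cluster-wide cutover moment.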
More elaborate documentation on Kafka security can be found in the official Apache Kafka documentation.