Is Kafka secure? What is Kafka security?
Security is a top concern in modern enterprises: it is essential to mitigate the risks of attacks, unauthorised access, and data corruption, and it must be addressed carefully with both compliance and performance implications in mind. With that in mind, Apache Kafka users naturally ask: what does Apache Kafka offer in terms of security?
Apache Kafka ships with built-in, configurable security protocols; which of them (or all) you implement is driven by the enterprise use case. Each security feature takes a toll on Kafka's streaming performance, so the trade-offs need to be analysed thoroughly and understood before implementation. The best way to quantify this cost is to benchmark the cluster with each security layer enabled.
In any ‘enterprise Kafka’ implementation, you need a solid governance framework that ensures the security of confidential data and tracks who handles it and what operations are performed on it. Moreover, an effective governance framework ensures proper access to, and judicious use of, data.
The fundamental data element in Kafka is the topic, and you should define all your governance processes around topics. Topics are further split into partitions and replicated across brokers, but from a security perspective those units are too low-level to secure individually.
The need for security in Kafka
In the real world, any organisational asset holding data needs to be secured, not least to preserve the integrity of that data. Shrinking budgets and pressure to cut maintenance expenditure give way to multi-tenant systems where different entities share common infrastructure. Hence selective access, backed by proper authentication and authorization protocols, becomes the need of the hour.
Types of Kafka security
Kafka security is implemented in three layers (encryption in transit, authentication, and authorization), complemented by protection of secrets at rest:
- Secure Kafka with SSL
Encryption of data over the wire using SSL (Secure Sockets Layer)/TLS (Transport Layer Security): data is encrypted before it is sent over the network. This is the same class of protocol used by financial institutions and during online transactions.
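As a rough sketch, enabling TLS on a broker looks like the following server.properties fragment; the hostname, keystore/truststore paths, and passwords are illustrative placeholders:

```properties
# Broker-side TLS settings (server.properties).
# Paths, hostname and passwords below are placeholders.
listeners=SSL://kafka-broker1:9093
security.inter.broker.protocol=SSL
ssl.keystore.location=/var/private/ssl/kafka.broker.keystore.jks
ssl.keystore.password=keystore-secret
ssl.key.password=key-secret
ssl.truststore.location=/var/private/ssl/kafka.broker.truststore.jks
ssl.truststore.password=truststore-secret
```

Clients then set `security.protocol=SSL` and point at their own truststore to verify the broker's certificate.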
- Authentication using SSL or SASL
Allows producers and consumers to authenticate to your Kafka cluster, which verifies their identity. It also gives your clients a secure way to prove who they are.
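For example, a client authenticating with SASL/SCRAM over TLS might use a properties fragment like the one below; the username, password, and truststore details are placeholders, and the SCRAM credential must first be created on the cluster (e.g. with `kafka-configs.sh`):

```properties
# Client-side SASL/SCRAM-over-TLS settings (e.g. producer.properties).
# Username, password and truststore values are illustrative.
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="alice" \
  password="alice-secret";
ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
ssl.truststore.password=truststore-secret
```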
- Authorization using ACLs
Once your clients are authenticated, your Kafka brokers can check them against access control lists (ACLs) to determine whether a particular client is authorized to write to or read from a given topic.
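Granting and inspecting ACLs is done with the `kafka-acls.sh` tool that ships with Kafka. The principal, topic, and broker address below are hypothetical, and an authorizer must already be enabled on the brokers:

```shell
# Grant the (hypothetical) principal User:alice read access to topic "orders".
bin/kafka-acls.sh --bootstrap-server kafka-broker1:9093 \
  --command-config admin.properties \
  --add --allow-principal User:alice \
  --operation Read --topic orders

# List the ACLs now attached to the topic.
bin/kafka-acls.sh --bootstrap-server kafka-broker1:9093 \
  --command-config admin.properties \
  --list --topic orders
```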
- Kafka security at rest
Secrets in configuration files can be masked with at-rest protection using Kafka's ConfigProvider interface in Java. This is essential, as all secrets should be masked and externalized (read from a single place) wherever possible.
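A minimal sketch using Kafka's built-in `FileConfigProvider`, which resolves placeholders from a separate, tightly permissioned file; the file path and key name are illustrative:

```properties
# Externalize secrets with Kafka's built-in FileConfigProvider.
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider

# /etc/kafka/secrets.properties would contain e.g.:
#   keystore-password=keystore-secret
ssl.keystore.password=${file:/etc/kafka/secrets.properties:keystore-password}
```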
Kafka on Kubernetes
Several operators are available to deploy and manage Kafka on Kubernetes. The Strimzi operator is one of them: it is open source, comparatively easy to configure, and supports TLS together with ACLs, which is why many enterprises have adopted it.
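A sketch of a Strimzi `Kafka` custom resource with a TLS listener, mTLS client authentication, and simple (ACL-based) authorization; the cluster name, sizing, and storage choice are illustrative:

```yaml
# Illustrative Strimzi Kafka custom resource (v1beta2 API).
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: secure-cluster
spec:
  kafka:
    replicas: 3
    listeners:
      - name: tls
        port: 9093
        type: internal
        tls: true              # encrypt traffic on this listener
        authentication:
          type: tls            # clients authenticate with mTLS certificates
    authorization:
      type: simple             # enforce Kafka ACLs
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral
```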
Secure Kafka with Apache Ranger
Apache Ranger is a popular open-source tool for enforcing security policies across Hadoop-ecosystem components. It has an intuitive UI that makes monitoring and managing comprehensive data security easy, and it can be integrated with Kafka to enforce custom compliance and identity and access management (IAM) policies.
Secure Kafka with LDAP
This is an enterprise-grade setup; a directory service such as AD (Active Directory) is a prerequisite, and the LDAP service may or may not be Kerberos-enabled.
One concern with LDAP is that credentials are sent over the wire in plaintext, so TLS is recommended alongside it, and all passwords in Kafka client configurations should be externalized as secrets. Large enterprises generally go with LDAP because they need all of their applications to follow similar standards and remain compliant.
Confluent's documentation covers configuring LDAP to secure client and inter-broker communication.
Secure Kafka on cloud
With the advent of PaaS, organizations are shifting procurement costs to operational costs; in layman's terms, most infrastructure is leased from public cloud providers.
For Kafka specifically, we can either go with a fully managed service such as Confluent Cloud, or lease infrastructure from a public cloud provider and deploy our own Kafka cluster on it.