T-Pot is a system that houses Dockerized containers of some of the most popular open source honeypots and network security tools, such as Suricata, Honeytrap, Cowrie, Glastopf, and Dionaea. While there are plenty of other open source honeypots out there, this list makes for a solid foundation for a new honeypot setup (or a great addition to an existing one). The only missing feature in T-Pot, as we saw it, was a way to centralize the data from different honeypot nodes. This blog post describes how we automated the deployment of T-Pot, the changes we made to support centralizing log data into an Elasticsearch instance, and some of our design decisions.

Environment

High level diagram of honeypot deployment.

Configuration between centralized ELK server and distributed honeypots.

The distributed configuration relies on systems that listen only on a single SSH port for management and require key-based authentication. A central database server is then responsible for building SSH port-forward tunnels out to each managed sensor, with each tunnel terminating on localhost ports at both ends.
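To make the tunnel layout concrete, here's a rough sketch of the kind of forward the central server sets up for a single sensor; the user name, host name, and port numbers are placeholders rather than the exact values from our deployment:

    # Run on the central server: open port 9999 on the sensor's loopback and
    # forward anything sent to it back to port 64301 on this machine's loopback.
    # User, host, and ports are illustrative placeholders.
    ssh -N -R 9999:127.0.0.1:64301 tsec@bat-sensor.example.com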

Each sensor uses Logstash to ingest its honeypot log data and outputs the parsed results as JSON to TCP port 9999 on its own localhost. On the central server, Logstash has an input for each remote host and outputs everything into the local Elasticsearch server.
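On the sensor side, the relevant piece of the Logstash configuration is the output block. A rough sketch (plugin options trimmed to the essentials) might look like the following; the central-server counterpart is sketched later, after the step-by-step list in the Modifications section:

    # Sensor-side logstash.conf output (illustrative): POST parsed events as
    # JSON to the local SSH tunnel endpoint, which delivers them to the
    # central server.
    output {
      http {
        url         => "http://127.0.0.1:9999"
        http_method => "post"
        format      => "json"
      }
    }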

The end result as visualized in Kibana would look similar to the following:

Central ELK displaying honeypot data from our external sensors.

Design Decisions

When adding the centralized ELK component, we wanted to leverage as much of the existing default T-Pot setup as possible. Because T-Pot is already structured so that all of the Dockerized honeypots simply store their log data and artifacts in a convenient data directory, it naturally lends itself to a centralized database. This allowed us to reuse a lot of the existing configuration, including the Kibana dashboards and Logstash parsing, and augment it to support aggregating multiple honeypot servers. It also has the added benefit of being modular enough to include additional honeypot containers, which is great for us and for anyone with their own custom-made honeypots. As long as they're Dockerized, you can just toss them in!
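As a hypothetical example of what "tossing one in" might look like, a custom honeypot only needs to write its logs somewhere under the data directory that Logstash already watches; the image name and paths below are placeholders, not part of T-Pot itself:

    # Illustrative only: run a custom Dockerized honeypot and map its logs
    # into the shared data directory so the existing Logstash pipeline can
    # pick them up.
    docker run -d --name my-honeypot \
      -p 2323:2323 \
      -v /data/my-honeypot/log:/var/log/my-honeypot \
      my-registry/my-honeypot:latest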

Security

Security is always a priority in everything we do, and setting up honeypots is no exception. Even though our honeypots live in an isolated DMZ, we still wanted to minimize potential attack vectors as much as we could. For management, a single SSH port was all we wanted exposed to the Internet, which we could further restrict with source-IP rules in iptables while forcing key-based authentication for logins. Additionally, log collection from the honeypots to the centralized Elasticsearch server goes over SSH tunnels, as we didn't want to leave Logstash open in the wild.
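For a rough idea of what that hardening looks like (the management port follows T-Pot's default of 64295, and the source address here is a placeholder for a trusted management network):

    # Only allow management SSH from a trusted address; drop everything else.
    iptables -A INPUT -p tcp --dport 64295 -s 203.0.113.10 -j ACCEPT
    iptables -A INPUT -p tcp --dport 64295 -j DROP

    # /etc/ssh/sshd_config excerpt: key-based logins only
    PasswordAuthentication no
    ChallengeResponseAuthentication no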

Viewing data on the ELK server required a similar SSH port forward to bind a localhost port for the Kibana dashboard. We found this to be fairly simple, and it didn't require opening any additional ports on the central server. The process is mostly the same as accessing the standard T-Pot build's Kibana dashboard.
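The forward itself is a one-liner; assuming Kibana is bound to the central server's loopback on its default port of 5601 (the user and host name here are placeholders):

    # Bind local port 5601 to Kibana on the central server, then browse to
    # http://localhost:5601 from the workstation.
    ssh -N -L 5601:127.0.0.1:5601 admin@central-elk.example.com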

Modifications to T-Pot

The move to Ansible was driven by the fact that we wanted something easily deployable and repeatable. This allows us to quickly tear down and re-provision our external systems, where we have no reasonable expectation of reliability. Of course, we make it a point to pull data from our servers for retention and analysis, but we generally don't care what happens to the sensors themselves.

Meeting the requirements outlined above involved converting the standalone installer script into Ansible roles. This let us break apart the various components of the T-Pot install process and insert our own changes to support the distributed configuration.

This is what the playbook looks like:

Playbook
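Since the screenshot doesn't reproduce well as text, here's a rough sketch of the shape of the playbook; the role names are illustrative rather than the exact ones in our repository:

    # central.yml (structure only)
    - hosts: sensors
      become: yes
      roles:
        - tpot_honeypot     # standalone 'honeypot' build of T-Pot
        - logstash_sensor   # ship parsed logs to localhost:9999

    - hosts: central
      become: yes
      roles:
        - elk               # Elasticsearch, Logstash, Kibana
        - ssh_tunnels       # autossh tunnels out to each sensor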

These are the steps we had to go through to get the distributed honeypot configuration up and running, in no particular order:

  • Used the standalone 'honeypot' build from T-Pot and installed Logstash.
    • We took the logstash.conf from the T-Pot 'full' build and changed it to use an HTTP output to localhost:9999, which is where the SSH tunnel endpoint terminates.
  • Added an additional step to include more honeypot container types for the remote sensors, because why not?
  • Created a central server running Elasticsearch, Logstash, and Kibana.
    • The logstash.conf is generated and copied over using Jinja2 templating, based on the hosts file (example below). It listens on a set of localhost ports, one per sensor, and feeds the data into the local Elasticsearch instance (see the template sketch after this list).
  • Distributed the generated SSH keys and handled known_hosts.
    • Keys are locally generated before the Ansible playbook is kicked off.
    • Each sensor's public host key is retrieved and stored in the central server's known_hosts file.
    • The central server's public key (generated locally) is placed in each sensor's authorized_keys file.
    • At this point the central server can initiate SSH connections to sensors.
  • Central server builds out SSH tunnels to each sensor.
    • Use Jinja2 templates to create a shell script that kicks off autossh to maintain tunnel connectivity across the sensors (sketched after this list).
    • Create an "sshtun" systemd service that invokes the above script when the network is up.
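To give an idea of the templating mentioned above, the central logstash.conf might be generated from something like the following sketch; the group name, port numbering scheme, and plugin options are assumptions for illustration rather than our exact template:

    # logstash.conf.j2 (illustrative): one HTTP input per sensor, each bound
    # to the localhost port where that sensor's SSH tunnel terminates.
    input {
    {% for host in groups['sensors'] %}
      http {
        host => "127.0.0.1"
        port => {{ 64300 + loop.index }}
        type => "{{ host }}"
      }
    {% endfor %}
    }

    output {
      elasticsearch {
        hosts => ["127.0.0.1:9200"]
      }
    }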
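And the tunnel plumbing, again as a sketch; the user, host names, ports, and paths are placeholders:

    #!/bin/bash
    # /opt/sshtun.sh (illustrative): keep a remote forward alive to each
    # sensor so its localhost:9999 feeds a dedicated localhost port here.
    autossh -M 0 -f -N -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" \
      -R 9999:127.0.0.1:64301 tsec@bat-sensor.example.com
    autossh -M 0 -f -N -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" \
      -R 9999:127.0.0.1:64302 tsec@cat-sensor.example.com

    # /etc/systemd/system/sshtun.service (illustrative)
    [Unit]
    Description=Persistent SSH tunnels to honeypot sensors
    After=network-online.target
    Wants=network-online.target

    [Service]
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/opt/sshtun.sh

    [Install]
    WantedBy=multi-user.target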

That's the gist of it!

Ansible hosts file.

With a few changes to the hosts file, we can kick off Ansible to automatically roll out our distributed honeypot network, with a centralized database.

The output of running 'ansible-playbook central.yml'

Once the environment has been automatically provisioned, we quickly test our setup by SSH'ing into the Cowrie instances on the "bat" and "cat" sensors and generating some test data. Logging into the Elasticsearch server, we can easily view our results in Kibana, which displays the inputs from all of the different sensors.
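Generating a test event is as simple as poking the honeypot; for example, assuming Cowrie is the service answering on the sensor's default SSH port (the host name is a placeholder):

    # A throwaway login attempt against Cowrie on the "bat" sensor; the
    # attempt is logged and should appear in Kibana shortly afterwards.
    ssh root@bat-sensor.example.com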

Discovering Cowrie data from the central ELK server.

By using this method, we were able to simplify the rollout of external sensors and aggregate our data into a single location. Having a flashy dashboard doesn't hurt either.

Again, we can't stress enough how awesome the T-Pot Community Edition project is. If you plan on rolling out a turn-key honeypot solution, you should definitely check it out at: https://github.com/dtag-dev-sec/tpotce.

Our fork of their T-Pot autoinstall project contains the Ansible playbooks and Vagrant files required to roll out your very own distributed honeypot network. Take a look at it here: https://github.com/CyberPoint/t-pot-autoinstall.

Happy collecting!