If you are monitoring your environment with Beats, the default behaviour is to rotate indices daily, creating a new index at midnight. Over time your disk starts filling up, and it becomes hard to tell which indices you want to keep, which you want to snapshot, and which you can safely delete. To help with this, Elastic developed a tool called Curator.
In this post I will show you how to install, configure and run Curator. I will assume that you have Python installed along with pip; if you don't, now is a good time to install them before moving forward. To install Curator, all we need to do is run the following command:
#pip install -U elasticsearch-curator
By default this installs the latest available version of Curator. If you are running an older version of Elasticsearch, you can always check the version compatibility matrix below.
In case you need a different version of Curator, you can specify it like this:
#pip install -U elasticsearch-curator==<version>
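To confirm which version actually got installed, you can ask pip or Curator itself (this assumes the install above succeeded and put curator on your PATH):

```shell
# Show the installed package version as pip sees it
pip show elasticsearch-curator

# Curator also reports its own version
curator --version
```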
Once Curator is installed, it is time to create a configuration file. By default Curator looks for its config file at ~/.curator/curator.yml, but I will create mine under a slightly different name and pass its path explicitly.
#mkdir ~/.curator
#cat > ~/.curator/config.yml << EOF
---
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
EOF
As you can see, my Elasticsearch setup is pretty much the default. If yours is more complex, adapt the config file above to your needs.
Curator can be used through its regular command line interface or through a singleton command line interface. The regular command line has the following format:
#curator --config [curator.yml] [--dry-run] ACTION_FILE.yml
Now let's create an action file. I am creating the action file in the same folder as the configuration file, but that can be changed as needed.
#cat > ~/.curator/delete_indices.yml << EOF
---
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days based on index name
    options:
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^(metric|heart)beat-.*'
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
EOF
With this action file I will delete any metricbeat-* or heartbeat-* indices that are older than 30 days.
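Before letting Curator delete anything for real, it is worth doing a dry run first: with the --dry-run flag Curator only logs what it would have done, without touching any indices. Assuming the config and action file paths created above:

```shell
# Dry run: report which indices WOULD be deleted, change nothing
curator --config ~/.curator/config.yml --dry-run ~/.curator/delete_indices.yml
```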
Let's add more actions to this file. Since my Beats are configured to send monitoring data to Elasticsearch, I want to delete those indices as well once they are older than 15 days.
#cat >> ~/.curator/delete_indices.yml << EOF
  2:
    action: delete_indices
    description: >-
      Delete indices older than 15 days based on index name
    options:
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: .monitoring-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 15
EOF
Now that some cleanup is in place, let's add another action to create snapshots of the more important indices which we do not want to lose. Creating a snapshot requires some extra steps: setting up shared storage (or an AWS S3 or Google Cloud Storage bucket) and adding some extra configuration to Elasticsearch, which you can find here.
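For a filesystem ("fs") repository, every node needs the backup location whitelisted in elasticsearch.yml and the path must be writable by the Elasticsearch user. A minimal sketch, assuming the /bkp location used in this post:

```yaml
# elasticsearch.yml -- register /bkp as an allowed snapshot location
# (restart the node after changing this)
path.repo: ["/bkp"]
```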
Before anything else we need to create a snapshot repository. For that we will use a different tool called es_repo_mgr, which ships with Curator. To add a repo, run the following command:
#es_repo_mgr --config .curator/config.yml create fs --repository filebeat_backup --location /bkp --compression True --skip_repo_fs_check True
To check that the repo was created, run:
#es_repo_mgr --config .curator/config.yml show filebeat_backup
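You can also verify the repository directly against Elasticsearch via its snapshot API, bypassing es_repo_mgr; this assumes the cluster is reachable on 127.0.0.1:9200 as in the config above:

```shell
# Ask Elasticsearch directly whether the repository is registered
curl -s -XGET 'http://127.0.0.1:9200/_snapshot/filebeat_backup?pretty'
```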
Good. Now that the repo is created, we can add another action to the action file to create snapshots of the latest indices created by Filebeat.
#cat >> ~/.curator/delete_indices.yml << EOF
  3:
    action: snapshot
    description: >-
      Snapshot selected indices to 'repository' with the snapshot name or
      name pattern in 'name'
    options:
      repository: filebeat_backup
      # leaving the name blank will result in the default 'curator-%Y%m%d%H%M%S' format
      name: curator-%Y.%m.%d-%H:%M:%S
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
      skip_repo_fs_check: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: filebeat-
    - filtertype: age
      source: creation_date
      direction: younger
      unit: days
      unit_count: 1
EOF
Once that is done it is time to run our action file with curator.
#curator --config ~/.curator/config.yml ~/.curator/delete_indices.yml
It will run for a short while. If you have configured a log file, all the details will be written there; otherwise the output is shown on the console. Let's check whether the snapshots were created. For this I will use the singleton command line interface, curator_cli. It works much like the curator command, but the action and its filters are provided as command line arguments instead of an action file. Example:
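When an action takes filters, curator_cli accepts the same filter blocks as a JSON string via its --filter_list option. A sketch, reusing the config file from earlier to list only the filebeat-* indices:

```shell
# List only filebeat-* indices by passing the pattern filter as JSON
curator_cli --config ~/.curator/config.yml show_indices --verbose \
  --filter_list '{"filtertype":"pattern","kind":"prefix","value":"filebeat-"}'
```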
#curator_cli --config ~/.curator/config.yml show_indices --verbose
filebeat-6.5.0-2019.01.01   open    26.2MB    68564   3   1   2019-01-01T00:00:01Z
filebeat-6.5.0-2019.01.02   open    73.5MB   165566   3   1   2019-01-02T00:00:06Z
filebeat-6.5.0-2019.01.03   open    79.2MB   178480   3   1   2019-01-03T00:00:01Z
filebeat-6.5.0-2019.01.04   open   142.5MB   295582   3   1   2019-01-04T00:00:02Z
filebeat-6.5.0-2019.01.05   open    66.7MB   153728   3   1   2019-01-05T00:00:01Z
filebeat-6.5.0-2019.01.06   open    65.6MB   148712   3   1   2019-01-06T00:00:02Z
filebeat-6.5.0-2019.01.07   open    82.6MB   177506   3   1   2019-01-07T00:00:02Z
To check the snapshots, I will run the following:
#curator_cli --config .curator/config.yml show_snapshots --repository filebeat_backup
curator-2019.01.24-02:59:29
As I can see, the snapshot was created successfully. For more options you can run curator_cli with the --help flag.
All we need to do now is add a cronjob to run these commands periodically, and Curator will keep cleaning up your environment and creating backups of the important indices as needed.
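A minimal crontab sketch, assuming curator was installed to /usr/local/bin and using the paths from this post (the log path is an arbitrary choice); add it with crontab -e:

```shell
# m h dom mon dow  command -- run the action file every night at 01:00
0 1 * * * /usr/local/bin/curator --config /root/.curator/config.yml /root/.curator/delete_indices.yml >> /var/log/curator.log 2>&1
```

Note that cron does not expand ~ reliably, so absolute paths are used here.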