Infrastructure
This part describes tools used to manage the infrastructure.
Hosts
This section provides a summary of the backend hosts described in the rest of the document.
A full list is available at https://github.com/ooni/sysadmin/blob/master/ansible/inventory - also see Ansible.
backend-fsn.ooni.org
Public-facing production backend host, receiving the deployment of the packages:
backend-hel.ooni.org
Standby / pre-production backend host. Runs the same software stack as backend-fsn.ooni.org, plus the OONI bridges.
bridge-greenhost.ooni.org
Runs an OONI bridge in front of the production API and production Test helpers.
ams-pg-test.ooni.org
Testbed backend host. Runs the same software stack as backend-fsn.ooni.org. Database tables are not backed up and incoming measurements are not uploaded to S3. All data is considered ephemeral.
monitoring.ooni.org
Runs the internal monitoring stack, including Jupyter Notebook, Prometheus, Vector and a ClickHouse instance for logs.
The Sysadmin repository
This is a git repository living at https://github.com/ooni/sysadmin/ for internal use. It primarily contains:
- Playbooks for Ansible
- The debops-ci tool
- Scripts and tools including diagrams for DNS and Domains
Ansible
Ansible is used to configure the OSes on the backend hosts and manage the configuration of backend components. The playbooks are kept at https://github.com/ooni/sysadmin/tree/master/ansible
This manual supersedes https://github.com/ooni/sysadmin/blob/master/README.md
Installation and setup
Install Ansible using OS packages or a Python virtualenv. Ensure the same major and minor version is used across the team.
Secrets are stored in vaults using the ansible/vault script as a wrapper for ansible-vault. Store encrypted variables with a vault_ prefix so that they can be found with grep (see http://docs.ansible.com/ansible/playbooks_best_practices.html#best-practices-for-variables-and-vaults) and link the location of each variable using the same name without the prefix in the corresponding vars.yml.
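The naming convention above can be checked mechanically. The following is a minimal Python sketch, not part of the sysadmin repository: the function names and the sample variable names are hypothetical, the vault text is assumed to be already decrypted, and a vault variable is assumed to count as linked when vars.yml defines the unprefixed name with a value that references the prefixed one.

```python
import re

# Matches an unindented "name: value" line in a simple YAML file.
_KEY_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*:\s*(.*)$")


def top_level_keys(yaml_text):
    """Return {key: raw value} for unindented 'key: value' lines."""
    keys = {}
    for line in yaml_text.splitlines():
        match = _KEY_RE.match(line)
        if match:
            keys[match.group(1)] = match.group(2).strip()
    return keys


def unlinked_vault_vars(vault_text, vars_text):
    """List vault_ variables that vars.yml does not expose without the prefix."""
    plain = top_level_keys(vars_text)
    missing = []
    for name in top_level_keys(vault_text):
        if not name.startswith("vault_"):
            continue
        public_name = name[len("vault_"):]
        # The unprefixed variable must exist and reference the vault variable.
        if name not in plain.get(public_name, ""):
            missing.append(name)
    return sorted(missing)
```

An empty result means every vault_ variable has a matching entry in vars.yml. The parser only understands flat key/value lines, which is enough for an illustration but not for arbitrary YAML.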
In order to access secrets stored inside of the vault, you will need a copy of the vault password encrypted with your PGP key. This file should be stored at ~/.ssh/ooni-sysadmin.vaultpw.gpg. It should be provided by other teammates and GPG-encrypted for your own GPG key.
SSH Configuration
You should configure your ~/.ssh/config with the following:
Replace ~/REPLACE_ME/sysadmin/ext/known_hosts with the path where you have cloned the ooni/sysadmin repo. This will ensure you use the host key fingerprints from this repo instead of just relying on TOFU.
You should replace YOUR_USERNAME with your username from adm_login.
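As an illustration of the two settings described above, the following Python sketch renders an ssh_config fragment. Only the use of the repository's ext/known_hosts file and of a personal login name come from the text; the function name, the Host *.ooni.org pattern and the choice to keep ~/.ssh/known_hosts as the first file are assumptions.

```python
from pathlib import PurePosixPath


def render_ssh_config(sysadmin_repo, username):
    """Render an ssh_config fragment for OONI hosts (illustrative only)."""
    known_hosts = PurePosixPath(sysadmin_repo) / "ext" / "known_hosts"
    lines = [
        # Assumed host pattern: every host under ooni.org.
        "Host *.ooni.org",
        f"    User {username}",
        # Check the repository's fingerprints in addition to the default file.
        f"    UserKnownHostsFile ~/.ssh/known_hosts {known_hosts}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_ssh_config("~/REPLACE_ME/sysadmin", "YOUR_USERNAME"), end="")
```

The printed text can be reviewed and pasted into ~/.ssh/config by hand; the sketch deliberately does not modify the file itself.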
On macOS you may also want to add the UseKeychain yes option, to use the Keychain to store passwords.
Ansible playbooks summary
Usage:
warning Any minor error in configuration files or Ansible's playbooks can be destructive for the backend infrastructure. Always test-run playbooks with --diff and -C at first and carefully verify configuration changes. After verification, run the playbook without -C and verify the applied changes again.
note Etckeeper can be useful to verify configuration changes from a different point of view.
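The test-run workflow from the warning above can be sketched as a small Python wrapper. This is an assumption-laden illustration rather than a tool from the repository: the function names are hypothetical, the script is assumed to run from the directory containing ./play, and ./play is assumed to accept the -l, --diff, -C and -t options used elsewhere in this document.

```python
import subprocess


def play_command(playbook, host, check=True, tags=None):
    """Build the ./play command line for one run of a playbook."""
    cmd = ["./play", playbook, "-l", host, "--diff"]
    if tags:
        cmd += ["-t", ",".join(tags)]
    if check:
        cmd.append("-C")  # check mode: report changes without applying them
    return cmd


def deploy(playbook, host, tags=None, confirm=input):
    """Dry-run first, then apply only after explicit confirmation."""
    subprocess.run(play_command(playbook, host, check=True, tags=tags), check=True)
    answer = confirm("Apply these changes? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    subprocess.run(play_command(playbook, host, check=False, tags=tags), check=True)
    return True
```

For example, deploy("deploy-backend.yml", "ams-pg-test.ooni.org") would first show the diff in check mode and only run the real deployment after the operator answers y. The output of both runs still has to be reviewed by a person.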
Some notable parts of the repository:
A list of the backend hosts lives at https://github.com/ooni/sysadmin/blob/master/ansible/inventory
The backend deployment playbook lives at https://github.com/ooni/sysadmin/blob/master/ansible/deploy-backend.yml
Many playbooks depend on roles that configure the OS, named base-<os_version>, for example: https://github.com/ooni/sysadmin/blob/master/ansible/roles/base-bookworm for Debian Bookworm and https://github.com/ooni/sysadmin/tree/master/ansible/roles/base-bullseye for Debian Bullseye.
The nftables firewall is configured to read every .nft file under /etc/ooni/nftables/. This allows roles to create small files that open one port each and keeps the configuration as close as possible to the ansible step that deploys a service. For example: https://github.com/ooni/sysadmin/blob/master/ansible/roles/base-bookworm/tasks/main.yml#L110
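To make the one-file-per-port idea concrete, here is a minimal Python sketch that writes such a file. The rule text assumes an existing inet filter table with an input chain, and the <port>.nft file name, the function names and the target directory are hypothetical; the actual roles may lay things out differently.

```python
from pathlib import Path


def nft_rule(port, comment, proto="tcp"):
    """Return a one-line nftables rule accepting traffic on a port."""
    if proto not in ("tcp", "udp"):
        raise ValueError(f"unsupported protocol: {proto}")
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    # Assumes an existing "inet filter" table with an "input" chain.
    return (
        f"add rule inet filter input {proto} dport {port} "
        f'counter accept comment "{comment}"\n'
    )


def write_port_file(directory, port, comment, proto="tcp"):
    """Write <directory>/<port>.nft and return its path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{port}.nft"
    path.write_text(nft_rule(port, comment, proto))
    return path
```

Because each service owns its own small file, removing a service only requires deleting that file and reloading the firewall, without editing a shared ruleset.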
note Ansible announces its runs on ooni-bots unless running with -C.
The root account
Runbooks use ssh to log on the hosts using your own account and leveraging sudo to act as root.
The only exception is when a new host is being deployed - in that case ansible will log in as root to create individual accounts and lock out the root user.
When running the entire runbook ansible might try to run it as root. This can be avoided by selecting only the required tags using -t <tagname>.
Ideally the root user should be disabled after successfully creating user accounts.
Roles layout
Ansible playbooks use multiple roles (see example) to deploy various components.
A few roles use the meta/main.yml file to depend on other roles (see example).
note The latter method should be used sparingly because ansible does not indicate where each task in a playbook is coming from.
A diagram of the role dependencies for the deploy-backend.yml playbook:
A similar diagram for deploy-monitoring.yml:
note When deploying files or updating files already existing on the hosts it can be useful to add a note e.g. "Deployed by ansible, see <role_name>". This helps track down how files on the host were modified and why.
Creating new playbooks runbook
This runbook describes how to add new runbooks or modify existing runbooks to support new hosts.
When adding a new host to an existing group, if no customization is required it is enough to modify inventory and insert the hostname in the same locations as its peers.
If the host requires small customization e.g. a different configuration file for the API:
- add the hostname to inventory as described above
- create "custom" blocks in tasks/main.yml to adapt the deployment steps to the new host using the when: syntax.
For an example see: https://github.com/ooni/sysadmin/blob/adb22576791baae046827c79e99b71fc825caae0/ansible/roles/ooni-backend/tasks/main.yml#L65
NOTE: Complex when: rules can lower the readability of main.yml
When adding a new type of backend component that is different from anything already existing, a new dedicated role can be created:
- add the hostname to inventory as described above
- create a new playbook e.g. ansible/deploy-newcomponent.yml
- copy files from an existing role into a new ansible/roles/newcomponent directory:
  ansible/roles/newcomponent/meta/main.yml
  ansible/roles/newcomponent/tasks/main.yml
  ansible/roles/newcomponent/templates/example_config_file
- run ./play deploy-newcomponent.yml -l newhost.ooni.org --diff -C and review the output
- run ./play deploy-newcomponent.yml -l newhost.ooni.org --diff and review the output
Example: https://github.com/ooni/sysadmin/commit/50271b9f5a8fd96dad5531c01fcfdd08bac98fe9
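The directory layout listed in the steps above can be sketched in Python as follows. Note that this is a simplification: the runbook copies files from an existing role, whereas this hypothetical helper only creates empty placeholders at the same paths so that the structure is visible.

```python
from pathlib import Path

# Files listed in the runbook, relative to the role directory.
ROLE_FILES = (
    "meta/main.yml",
    "tasks/main.yml",
    "templates/example_config_file",
)


def create_role_skeleton(ansible_dir, role_name):
    """Create empty placeholder files for a new role; never overwrite."""
    role_dir = Path(ansible_dir) / "roles" / role_name
    created = []
    for relative in ROLE_FILES:
        path = role_dir / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Calling create_role_skeleton("ansible", "newcomponent") twice is harmless: the second call finds the files in place and returns an empty list, which mirrors the idempotency expected from the playbooks themselves.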
TIP: To ensure playbooks are robust and idempotent it can be beneficial to develop and test tasks incrementally by running the deployment commands often.
Monitoring deployment runbook
The monitoring stack is deployed and configured by Ansible on the monitoring.ooni.org host using the following playbook: https://github.com/ooni/sysadmin/blob/master/ansible/deploy-monitoring.yml
It includes:
- Grafana at https://grafana.ooni.org
- Jupyter Notebook at https://jupyter.ooni.org
- Vector (see Log management)
- local Netdata, Blackbox exporter, etc
- Prometheus at https://prometheus.ooni.org
It also configures the FQDNs:
- loghost.ooni.org
- monitoring.ooni.org
- netdata.ooni.org
This also includes the credentials to access the Web UIs. They are deployed as /etc/nginx/monitoring.htpasswd from ansible/roles/monitoring/files/htpasswd
Warning the following steps are dangerously broken. Applying the changes will either not work or, worse, break production.
If you must do something of this sort, you will unfortunately have to resort to specifying the particular substeps you want to run using the -t tag filter (e.g. -t prometheus-conf to update the Prometheus configuration).
Steps:
- Review Ansible playbooks summary, Deploying a new host, Grafana dashboards.
- Run ./play deploy-monitoring.yml -l monitoring.ooni.org --diff -C and review the output
- Run ./play deploy-monitoring.yml -l monitoring.ooni.org --diff and review the output
Updating Blackbox Exporter runbook
This runbook describes updating Blackbox exporter.
The blackbox_exporter role in ansible is pulled in by the deploy-monitoring.yml runbook.
The configuration file is at roles/blackbox_exporter/templates/blackbox.yml.j2 together with host_vars/monitoring.ooni.org/vars.yml.
To add a simple HTTP[S] check, for example, you can copy the "ooni website" block.
Edit it and run the deployment of the monitoring stack as described in the previous subchapter.
Etckeeper
Etckeeper (https://etckeeper.branchable.com/) is deployed on backend hosts and keeps the /etc directory under git version control. It commits automatically on package deployment and on timed runs. It also allows doing commits manually.
To check for history of the /etc directory:
Use git diff to see uncommitted changes.
Use etckeeper commit <message> to commit changes.
Team credential repository
A private repository https://github.com/ooni/private contains team credentials, including username/password tuples, GPG keys and more.
warning The credential file is GPG-encrypted as credentials.json.gpg. Do not commit the cleartext credentials.json file.
note The credentials are stored in a JSON file to allow a flexible, hierarchical layout. This allows storing metadata like descriptions of account usage, dates of account creation, expiry, and credential rotation time.
The tool checks JSON syntax and sorts keys automatically.
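The repository's own tool is not reproduced here. As a rough equivalent, the syntax check and key sorting can be expressed with the Python standard library; the function name normalize_credentials is hypothetical and the indentation width is an arbitrary choice.

```python
import json


def normalize_credentials(text):
    """Validate JSON and return it with sorted keys and stable indentation."""
    data = json.loads(text)  # raises json.JSONDecodeError on a syntax error
    return json.dumps(data, indent=2, sort_keys=True) + "\n"
```

Sorting keys on every save keeps the decrypted file in a canonical order, so two edits of the same content produce identical cleartext before encryption.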
Listing file contents
Editing contents
Extracting a credential programmatically:
note This can be used to automate credential retrieval from other tools, e.g. Ansible.
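A minimal Python sketch of programmatic extraction follows. It assumes that gpg is installed and can decrypt credentials.json.gpg with your key, and that the file holds nested JSON objects; the function names and the dotted-path notation are hypothetical conveniences, not the repository's interface.

```python
import json
import subprocess


def decrypt_credentials(path="credentials.json.gpg"):
    """Decrypt the file with gpg and parse the JSON held in memory."""
    result = subprocess.run(
        ["gpg", "--quiet", "--decrypt", path],
        check=True,
        capture_output=True,
    )
    return json.loads(result.stdout)


def lookup(credentials, dotted_path):
    """Walk nested dictionaries following a dotted path such as 'a.b.c'."""
    node = credentials
    for key in dotted_path.split("."):
        node = node[key]  # raises KeyError if the entry does not exist
    return node
```

For instance, lookup(decrypt_credentials(), "example_service.admin.password") would return one value, where example_service and admin are made-up keys. The cleartext stays in memory and is never written to disk, which is consistent with the warning about not committing credentials.json.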
Updating users allowed to decrypt the credentials file
Edit makefile to add or remove recipients (see --recipient).
Then run:
Deploying a new host
To deploy a new host:
- Choose a FQDN like $name.ooni.org based on the DNS naming policy
- Deploy the physical host or VM using Debian Stable
- Create A and AAAA records for the FQDN in the Namecheap web UI
- Follow Updating DNS diagrams
- Review the inventory file and git-commit it
- Deploy the required stack. Run ansible in test mode first. For example this would deploy a backend host:
- Update Prometheus by following Monitoring deployment runbook
- git-push the commits
Also see Monitoring deployment runbook for an example of deployment.
Deleting a host
- Remove it from inventory
- Update the monitoring deployment using:
DNS diagrams
A:
See https://raw.githubusercontent.com/ooni/sysadmin/master/ext/dnsgraph.A.svg
The image is not included here due to space constraints.
CNAME:
MX:
NS:
TXT:
HTTP Moved Permanently (HTTP code 301):
HTTP Redirects:
Updating DNS diagrams
To update the diagrams use the sysadmin repository:
Update the ./ext/dns.json file:
Then run https://github.com/ooni/sysadmin/blob/master/scripts/dnsgraph to generate the charts:
It will generate SVG files under the ./ext/ directory. Finally, commit and push the dns.json and SVG files.