Commit Graph

192 Commits

Author SHA1 Message Date
846ea18a16 Don't deploy Nextcloud 2022-07-29 13:01:40 -07:00
6d31c4e6d6 Stop duplicate nomad scraping
Already getting it from Client service
2022-07-29 13:01:22 -07:00
9d57175584 Increase promtail memory 2022-07-28 16:37:19 -07:00
3c0c74797d Make traefik a service rather than a system job
Sets it up to support auto_revert and auto_promote
2022-07-28 15:11:59 -07:00
4b6c388ed9 Traefik wildcard certs 2022-07-28 15:11:24 -07:00
6ccc5a6bcf Remove variable for consul_address for traefik
Now getting from Nomad environment
2022-07-28 15:10:39 -07:00
48d5704b72 Make lldap backup daily 2022-07-28 15:05:00 -07:00
62f59b3929 conditional dns lookups for router assigned domains 2022-07-27 22:04:46 -07:00
c074df4bc7 Working backup and restore 2022-07-27 22:04:22 -07:00
d175166045 Make traefik disk ephemeral and sticky 2022-07-27 17:30:35 -07:00
c8493b1fc5 Bump Traefik mem limit
We don't like this crashing
2022-07-27 17:26:13 -07:00
a3f59145bd Skip dump of lldap db 2022-07-27 17:25:41 -07:00
9a315eb2f7 Add lldap backup and templatize backup job
Now oneoff and system jobs are all using the same template
2022-07-27 17:02:29 -07:00
6e074c55aa Increase prometheus memory limit 2022-07-27 16:11:56 -07:00
ecaee6f8be Add lldap 2022-07-27 15:57:28 -07:00
4213b322c1 Remove set hostname because that's now done in bootstrap 2022-07-27 15:57:12 -07:00
1dd131ba9a Extend ttl for nomad tokens 2022-07-27 15:56:40 -07:00
bc040b4668 Add ddclient 2022-07-27 14:45:08 -07:00
9664802fb6 Clean up services template whitespace 2022-07-27 14:41:42 -07:00
547cd96e4c Add vault stanza to levant services 2022-07-27 14:41:13 -07:00
e39fbc41a7 Add further todos for Nomad Vault 2022-07-27 13:40:21 -07:00
25ec582eaf Update Nomad and Vault ACLs
Now nomad is read only and tokens can be retrieved from Vault
2022-07-27 13:13:11 -07:00
92a30e6709 Reduce memory for blocky sidecar 2022-07-27 11:22:02 -07:00
fb934f3b2f Hide blocky API from non-traefik route 2022-07-27 11:21:11 -07:00
fe11b03a43 Get letsencrypt certs working with Traefik 2022-07-27 11:12:08 -07:00
85fccea867 Fix consul value bootstrap and hide secrets in log 2022-07-27 11:11:03 -07:00
d70dce8ab5 Add basic auth to traefik 2022-07-26 21:48:16 -07:00
963a863e2d Make anonymous nomad read only 2022-07-26 20:20:43 -07:00
3033c581f3 Add userpass login to Vault 2022-07-26 20:09:52 -07:00
b4bb0f866e Make metrics more readable 2022-07-25 21:45:01 -07:00
4508993068 Reduce task memory 2022-07-25 16:37:51 -07:00
4ea7947b1a Fix mysql 2022-07-25 16:29:43 -07:00
465c2d9c29 WIP: Update oneoff backups 2022-07-25 16:29:35 -07:00
ee45e92534 Fix consul backup 2022-07-25 16:29:06 -07:00
3ec1d008e8 Move traefik connect intents to core 2022-07-25 15:54:23 -07:00
04bdef01b8 Allow bypass of healthcheck 2022-07-25 15:52:47 -07:00
157005ae7b Get mysql root from vault 2022-07-25 15:52:47 -07:00
4a06f31f49 Tweak memory requirements for tasks 2022-07-25 15:52:47 -07:00
9d4cd68648 Add test consul backup 2022-07-25 15:52:47 -07:00
18807de608 Clean up Grafana and Loki bootstraps 2022-07-25 15:52:47 -07:00
de82205147 Remove packer stuff 2022-07-25 15:49:07 -07:00
96263d1e99 Update lockfile 2022-07-25 15:40:54 -07:00
9bb8b39fed Add new playbook and make target for bootstrapping values to Consul and Vault 2022-07-25 15:40:22 -07:00
888b1236f1 Update playbook, move acls and comment for fixes
I found some items that are broken on first run and made some changes
2022-07-25 11:48:03 -07:00
a0aba7f2f0 Make acls module stand alone 2022-07-25 11:48:03 -07:00
fed875f852 Shorten pip installs 2022-07-25 11:48:03 -07:00
068da0d539 Add vault kv creation 2022-07-25 11:14:51 -07:00
464cdf7010 Add loki, promtail, and syslog-ng 2022-07-25 10:46:16 -07:00
391ad8dee6 Add sticky disk to service template 2022-07-25 10:44:37 -07:00
d386a839c4 Prometheus: Use env for consul address rather than variable 2022-07-25 10:38:48 -07:00
af4324db6f Move core services to new tf file
Precursor to moving to a module so it can be applied separately
2022-07-25 10:37:32 -07:00
a7e276c637 WIP: Write a consul backup job 2022-07-21 20:24:50 -07:00
842e656342 Add consul bootstrap and move vault to an example 2022-07-21 20:16:10 -07:00
47a74b6166 Fix consul address in levant 2022-07-21 20:11:21 -07:00
16813e8cb7 Deploy Nomad, Consul, and Vault using apt repo 2022-07-21 19:04:44 -07:00
60dd856666 Use vault for backups jobs 2022-07-21 19:03:40 -07:00
1b88593f88 Major grafana refactor to include automatic loading of provisioning files 2022-07-21 15:54:05 -07:00
5126f5f4d4 Go back to a single ingress node to simplify Traefik TLS
The open source version of Traefik doesn't natively support HA. Running
multiple instances means that the TLS certificates will have to be
managed outside of Traefik and distributed to running jobs via Vault and
Nomad. This is doable, but I've decided to reduce the scope for now to
simplify things and go to a single Ingress node so that Traefik cert
management can be used.
2022-07-21 15:50:13 -07:00
c58056d594 More nextcloud config using Vault 2022-07-08 16:26:26 -07:00
02b448e363 Create levant tf module
Also a template service Nomad job that can be used for some straightforward services
2022-07-08 16:24:03 -07:00
11f5c10f83 Ignore ansible_collections 2022-06-28 12:11:55 -07:00
b2b409a1fe Add example secrets 2022-06-28 12:11:24 -07:00
65ce1b55f0 Fix secrets access from nomad tasks
Probably can be cleaned up and updated to follow least access
2022-06-28 12:11:07 -07:00
c0215bf153 Improve vault bootstrap and nomad connection 2022-06-28 12:10:18 -07:00
bf1ac31cdf Bootstrap vault secrets 2022-06-28 12:09:57 -07:00
41343a6d2c Small improvement to consul kv role 2022-06-28 12:08:23 -07:00
ce09177479 Add missing role requirements file
This uses updated fork of ansible-consul
2022-06-23 20:13:17 -07:00
13e9eac407 Deploy traefik one at a time with autorevert 2022-06-23 20:12:30 -07:00
d40d585358 Install consul dns forwarding 2022-06-23 20:12:09 -07:00
0bfdddf3ee Install consul from repo 2022-06-23 20:11:48 -07:00
617d4ae676 Make blocky config a bit more stable by removing templating based on whami 2022-06-23 20:11:28 -07:00
3d6b405ab6 Fix blocky upstream tcp for quad9 2022-06-23 20:11:09 -07:00
2f4d90abdc Auto revert broken blocky
Also enable traefik
2022-06-23 20:10:36 -07:00
ffdfdeadfb Add Consul lookup for ads dns allowlist 2022-06-23 13:36:06 -07:00
fc2db88276 Add some more upstream dns options
Should pick one later
2022-06-23 13:34:08 -07:00
eb066f5d98 Increase priority of Traefik 2022-06-23 09:51:42 -07:00
e5b61d5307 Update Nomad 2022-06-23 09:51:21 -07:00
6b14507ca6 Generate blocky host mapping from Consul kv 2022-06-23 09:51:09 -07:00
5d2301c791 Update blocky one instance at a time
Avoids dns going down with all instances updating at once
2022-06-23 09:50:23 -07:00
d7fa57864f Deploy backup jobs to all hosts and dynamically determine jobs per node 2022-06-23 09:49:57 -07:00
9ab300c225 Remove csi deployment 2022-06-23 09:49:03 -07:00
520d7c56b9 Move databases to a single module 2022-06-23 09:48:01 -07:00
a02f1a2317 Make traefik a system service
For this to work, will need to put TLS certs in Vault
2022-06-17 15:20:43 -07:00
ce18650e1f Add base hostname to consul in Playbook 2022-06-17 15:19:43 -07:00
16b9440e12 WIP: Add democratic-csi storage plugin 2022-06-17 15:19:19 -07:00
252c9b4111 Make nextcloud backup a non-sidecar task
Avoids restarting the whole group if it fails
2022-06-17 15:16:45 -07:00
8cd2abc6b8 Remove some unnecessary traefik configs from tasks 2022-06-17 15:15:37 -07:00
049364df23 Make order of host configs match playbook order 2022-06-17 15:14:55 -07:00
c41babe346 Use new host name in terraform consul address 2022-05-24 20:11:57 -07:00
6cd7bae240 Use new token variable name after bootstrap 2022-05-24 20:11:41 -07:00
de4c96b104 Add autopilot 2022-05-24 20:11:18 -07:00
f50cb98d30 Add docker install 2022-05-24 20:11:07 -07:00
1995434140 Auto initialize vault 2022-05-24 20:10:47 -07:00
d6407d25a0 Wait until mysql is deployed before continuing
Otherwise dependent jobs will fail and take up time restarting
2022-05-24 20:10:26 -07:00
8eb7a58dfd Remove unused playbook 2022-05-24 20:09:45 -07:00
e677259a1d Switch to a 3 node cluster for better resilience 2022-05-24 20:09:22 -07:00
1352eeb3e8 Fix venv detection for ansible cluster target
This fixes the installation of the consul python library
2022-05-24 20:07:52 -07:00
5f9a04fa5d Make redis optional for blocky to help with resilience to a single host failing 2022-05-19 16:54:16 -07:00
38597a7eda Dynamically add dns routes to traefik instances to blocky 2022-05-19 16:53:56 -07:00
719c1b62d1 Add dedicated backup module and jobs
Possible alternative to backups deployed with each job
2022-05-18 14:23:46 -07:00