Commit Graph

577 Commits

Author SHA1 Message Date
81093cedfb Increase memory for syslog jobs
Thry were getting OOM killed
2024-06-08 13:36:23 -07:00
7b41d29eb8 Add health checks and restarts to prometheus 2024-05-30 15:01:42 -07:00
90b7740343 Move Blocky and Exporters away from system to service jobs
This is because service jobs do not get rescheduled when allocs fail
2024-05-30 11:41:40 -07:00
e88c7c250d Bump nomad to 1.8 2024-05-30 11:40:58 -07:00
ed83ab0382 Remove qnomad due to disk errors 2024-05-30 11:40:28 -07:00
3cfbda7a27 Stop using diun for nomad fixers 2024-05-28 12:18:27 -07:00
85c626c96f Use Nomad task socket from Traefik 2024-05-28 12:00:13 -07:00
634d63c26c Stop diun for traffic routes
This was causing a check for each set of dead tasks
2024-05-28 11:45:30 -07:00
205388f283 Update traefik to v3 using canary 2024-05-28 11:43:46 -07:00
bdfde48bec Add some more monitors to nomad minitor 2024-05-06 14:29:17 -07:00
9af55580e7 Update diun config to read from task socket 2024-05-01 10:18:54 -07:00
b9c35bf18f Add ability to set task identities for service module 2024-05-01 10:18:24 -07:00
e7f740a2d9 Add languagetool server 2024-05-01 09:43:28 -07:00
57efee14e9 Update Ansible inventory to split node roles
Splits servers and clients to their own groups so that plays can target
specific roles.

Prior, everything was "both", but i want to and another server for
recovery purposes but not host containers on it.
2024-05-01 09:40:21 -07:00
c711c25737 Always use CF for dns when renewing lego certs
Makes it more resilient if my servers are down, but also cuts out a hop
because CF is the nameserver as well.
2024-04-27 19:33:10 -07:00
24122c2a3e Split fixers to their own groups
Allow them to deploy as different allocs on different hosts
2024-04-22 09:07:03 -07:00
13121862ec Add new host on qnap nas 2024-04-22 09:06:33 -07:00
28da3f425b Move nomad default interface to host vars 2024-04-22 09:06:11 -07:00
2d59886378 Update diun to include ability to read nomad socket 2024-04-17 10:46:28 -07:00
da0f52dab3 Improve change detection for cluster bootstrap 2024-04-17 10:46:10 -07:00
beac302a53 Upgrade nomad to 1.7.6 2024-04-17 10:45:27 -07:00
5edcb86e7e Remove traefik grafana dashboard
Now in data backups rather than git.
2024-03-26 14:56:14 -07:00
3dcd4c44b3 Tune memory after reviewing grafana 2024-03-26 09:48:31 -07:00
e6653f6495 Migrate sonarr to postgresql
And increase postgresql memory to accomodate
2024-03-25 16:05:58 -07:00
a9a919b8f2 Increase priority for sevices with highee resources
Photoprism requires lots if mem and sonar a specific volume
2024-03-22 21:09:19 -07:00
cc66bfdbcb Update photoprism 2024-03-22 21:07:55 -07:00
b02050112e Tune some service memeory 2024-03-22 21:07:07 -07:00
d5c2a0d185 Use default diun for syslogng 2024-03-22 21:05:53 -07:00
6a3ae49d8e Update terraform modules 2024-03-11 22:02:07 -07:00
75ee09d7e6 Remove bazarr
Plex does this automatically now
2024-02-20 10:13:40 -08:00
8b90aa0d74 Add 1.1.1.1 dns back to blocky for better resiliance 2024-02-20 10:10:41 -08:00
62e120ce51 Add radarr 2024-02-20 10:09:48 -08:00
5fb510202d Fix indent for Authelia rules 2024-02-20 10:05:25 -08:00
64a085ef80 Reatart failing services
Restart services that fail checks
2024-02-18 07:49:16 -08:00
f2f415aeac Fix traefik metrics 2024-02-18 07:47:31 -08:00
bb291b1f01 Move databases to their own tf files and improve first start 2024-02-13 12:05:55 -08:00
056eac976c lldap: Make it work on first bootstrap
Can't use the job id for creating the variables and permissions because we end up
with circular dependencies. The job won't return until it's successful in Nomad and it won't
start in nomad without access to varibles
2024-02-13 12:05:21 -08:00
198f96f3f7 Add back other traefik ports and metrics 2024-02-13 12:03:03 -08:00
6b5adbdf39 Remove 404 block list 2024-02-13 12:02:35 -08:00
77ef4b4167 Use quad9 encrypted dns 2024-02-13 12:02:14 -08:00
b35b8cecd5 Blocky: Remove mysql and redis configs from stunnel if server isn't found 2024-02-13 12:01:45 -08:00
b9dfeff6d8 Have blocky use router for upstream in nomad 2024-02-13 12:01:08 -08:00
2ff954b4b5 Bump nomad 2024-02-13 12:00:43 -08:00
2528dafcc6 Make nomad restart playbook more resilient 2024-02-13 12:00:24 -08:00
0e168376b8 Add terraform destroy to makefile 2024-02-13 11:59:47 -08:00
a16dc204fe Run dummy backup more frequently to make graphs easier to read 2024-01-24 20:10:14 -08:00
93d340c182 Make sure gitea ingress uses system wesher config
It was always using wesher
2024-01-23 12:09:59 -08:00
37ee67b2e6 fix: Add job_id output to services
This should be earlier in history
2024-01-23 12:09:29 -08:00
35dfeb3093 Add service healthchecks 2024-01-23 12:08:47 -08:00
0a2eace3dd Fix lldap secrets 2024-01-23 12:07:42 -08:00