Server Admin Log/Archive 75

2024-01-31

23:11 eileen: civicrm upgraded from 6344c95e to 6e1e0d21
22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56010 and previous config saved to /var/cache/conftool/dbconfig/20240131-222853-marostegui.json
22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56009 and previous config saved to /var/cache/conftool/dbconfig/20240131-221347-marostegui.json
22:11 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: gerrit:994823 Bumping portals to master (T128546) (duration: 06m 43s)
22:05 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: gerrit:994823 Bumping portals to master (T128546) (duration: 07m 26s)
21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56008 and previous config saved to /var/cache/conftool/dbconfig/20240131-215840-marostegui.json
21:54 Dreamy_Jazz: Removed already applied patches for T347708 from /srv/patches
21:48 dancy@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.16 refs T354434 (duration: 06m 47s)
21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56007 and previous config saved to /var/cache/conftool/dbconfig/20240131-214334-marostegui.json
21:42 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.16 refs T354434
21:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56006 and previous config saved to /var/cache/conftool/dbconfig/20240131-213454-marostegui.json
21:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
21:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
21:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56005 and previous config saved to /var/cache/conftool/dbconfig/20240131-213432-marostegui.json
21:31 Dreamy_Jazz: Security deploy done
21:30 logmsgbot: dreamyjazz Deployed security patch for T356226
21:23 logmsgbot: dreamyjazz Deployed security patch for T356226
21:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56004 and previous config saved to /var/cache/conftool/dbconfig/20240131-211926-marostegui.json
21:16 Dreamy_Jazz: Doing security deploy for T356226
21:12 jforrester@deploy2002: Finished scap: Backport for [[gerrit:994716|Gadget: Bump GADGET_CLASS_VERSION (T356322)]] (duration: 08m 31s)
21:05 jforrester@deploy2002: jforrester and reedy: Continuing with sync
21:05 jforrester@deploy2002: jforrester and reedy: Backport for [[gerrit:994716|Gadget: Bump GADGET_CLASS_VERSION (T356322)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56003 and previous config saved to /var/cache/conftool/dbconfig/20240131-210419-marostegui.json
21:03 jforrester@deploy2002: Started scap: Backport for [[gerrit:994716|Gadget: Bump GADGET_CLASS_VERSION (T356322)]]
20:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56002 and previous config saved to /var/cache/conftool/dbconfig/20240131-204913-marostegui.json
20:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56001 and previous config saved to /var/cache/conftool/dbconfig/20240131-204439-marostegui.json
20:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
20:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
20:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:37 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
20:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P56000 and previous config saved to /var/cache/conftool/dbconfig/20240131-203704-marostegui.json
20:36 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
20:36 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: sync
20:35 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: sync
20:35 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
20:35 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
20:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript userOptions.php --wiki=testwiki --old-is-default --old=2 --new 1 --nowarn 'echo-subscriptions-web-reverted' # T353225
20:32 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
20:31 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
20:28 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd] (duration: 03m 35s)
20:28 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
20:27 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
20:25 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd]
20:24 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd] (duration: 00m 05s)
20:24 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd]
20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55999 and previous config saved to /var/cache/conftool/dbconfig/20240131-202158-marostegui.json
20:10 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd] (duration: 10m 51s)
20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55998 and previous config saved to /var/cache/conftool/dbconfig/20240131-200652-marostegui.json
19:59 joal@deploy2002: Started deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd]
19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55997 and previous config saved to /var/cache/conftool/dbconfig/20240131-195145-marostegui.json
19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55996 and previous config saved to /var/cache/conftool/dbconfig/20240131-193927-marostegui.json
19:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
19:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55994 and previous config saved to /var/cache/conftool/dbconfig/20240131-193905-marostegui.json
19:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55993 and previous config saved to /var/cache/conftool/dbconfig/20240131-192359-marostegui.json
19:17 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
19:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55992 and previous config saved to /var/cache/conftool/dbconfig/20240131-190852-marostegui.json
18:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55991 and previous config saved to /var/cache/conftool/dbconfig/20240131-185345-marostegui.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55990 and previous config saved to /var/cache/conftool/dbconfig/20240131-184900-marostegui.json
18:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
18:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55989 and previous config saved to /var/cache/conftool/dbconfig/20240131-184838-marostegui.json
18:40 phuedx@deploy2002: Finished deploy [airflow-dags/analytics@5078a6b]: (no justification provided) (duration: 00m 28s)
18:40 phuedx@deploy2002: Started deploy [airflow-dags/analytics@5078a6b]: (no justification provided)
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55988 and previous config saved to /var/cache/conftool/dbconfig/20240131-183332-marostegui.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55986 and previous config saved to /var/cache/conftool/dbconfig/20240131-181825-marostegui.json
18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
18:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55985 and previous config saved to /var/cache/conftool/dbconfig/20240131-180319-marostegui.json
17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55984 and previous config saved to /var/cache/conftool/dbconfig/20240131-175833-marostegui.json
17:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
17:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55983 and previous config saved to /var/cache/conftool/dbconfig/20240131-175811-marostegui.json
17:51 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
17:50 aokoth@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM vrts1001.eqiad.wmnet
17:46 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
17:45 aokoth@cumin1002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM vrts1001.eqiad.wmnet
17:45 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55982 and previous config saved to /var/cache/conftool/dbconfig/20240131-174305-marostegui.json
17:35 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2] (duration: 03m 29s)
17:31 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2]
17:31 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2] (duration: 00m 08s)
17:30 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2]
17:30 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2] (duration: 11m 05s)
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55981 and previous config saved to /var/cache/conftool/dbconfig/20240131-172758-marostegui.json
17:19 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2]
17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55980 and previous config saved to /var/cache/conftool/dbconfig/20240131-171252-marostegui.json
17:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55979 and previous config saved to /var/cache/conftool/dbconfig/20240131-170141-marostegui.json
17:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
17:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55978 and previous config saved to /var/cache/conftool/dbconfig/20240131-170120-marostegui.json
17:01 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1] (duration: 03m 35s)
16:57 ejegg: fundraising civicrm upgraded from 520337a0 to 6344c95e
16:57 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1]
16:56 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1] (duration: 00m 06s)
16:56 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1]
16:54 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
16:52 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1] (duration: 09m 52s)
16:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55977 and previous config saved to /var/cache/conftool/dbconfig/20240131-164613-marostegui.json
16:43 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1]
16:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55976 and previous config saved to /var/cache/conftool/dbconfig/20240131-163106-marostegui.json
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55974 and previous config saved to /var/cache/conftool/dbconfig/20240131-161600-marostegui.json
16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55973 and previous config saved to /var/cache/conftool/dbconfig/20240131-160624-marostegui.json
16:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
16:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55972 and previous config saved to /var/cache/conftool/dbconfig/20240131-160602-marostegui.json
16:01 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:58 moritzm: installing openssh security updates
15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moscovium.eqiad.wmnet
15:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host moscovium.eqiad.wmnet
15:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55970 and previous config saved to /var/cache/conftool/dbconfig/20240131-155055-marostegui.json
15:50 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:47 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
15:47 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
15:47 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
15:46 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:46 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
15:45 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
15:45 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:45 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
15:44 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:43 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:39 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:36 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
15:36 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=maps2009.codfw.wmnet
15:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55969 and previous config saved to /var/cache/conftool/dbconfig/20240131-153549-marostegui.json
15:34 hnowlan@puppetmaster1001: conftool action : set/weight=10; selector: name=maps1009.eqiad.wmnet
15:32 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
15:29 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1009.eqiad.wmnet
15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55968 and previous config saved to /var/cache/conftool/dbconfig/20240131-152042-marostegui.json
15:18 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
15:17 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
15:17 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
15:16 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:16 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
15:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
15:14 btullis@cumin1002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
15:14 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
15:14 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
15:14 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
15:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55967 and previous config saved to /var/cache/conftool/dbconfig/20240131-151016-marostegui.json
15:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55966 and previous config saved to /var/cache/conftool/dbconfig/20240131-150934-marostegui.json
15:09 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
15:08 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
15:08 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:07 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:06 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:05 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:58 btullis@cumin1002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55965 and previous config saved to /var/cache/conftool/dbconfig/20240131-145427-marostegui.json
14:53 brouberol: I'm going to apply kafka log compaction for {eqiad,codfw}.mediawiki.cirrussearch.page_rerender.v1 on kafka-main-eqiad only (current replica) - T354794
14:52 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.codfw.wmnet
14:46 urbanecm@deploy2002: Finished scap: Backport for [[gerrit:994176|Add WikimediaCampaignEvents to extension list (T347894)]] (duration: 10m 41s)
14:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists2001.codfw.wmnet
14:43 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:40 urbanecm@deploy2002: cmelo and urbanecm: Continuing with sync
14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55964 and previous config saved to /var/cache/conftool/dbconfig/20240131-143921-marostegui.json
14:37 urbanecm@deploy2002: cmelo and urbanecm: Backport for [[gerrit:994176|Add WikimediaCampaignEvents to extension list (T347894)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:36 urbanecm@deploy2002: Started scap: Backport for [[gerrit:994176|Add WikimediaCampaignEvents to extension list (T347894)]]
14:30 urbanecm@deploy2002: Finished scap: Backport for [[gerrit:994702|[metawiki] Let admins add/remove the event-organizer group (T356070)]], [[gerrit:994711|index.php: Restore support for forcesafemode option. (T355314)]] (duration: 10m 05s)
14:28 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55963 and previous config saved to /var/cache/conftool/dbconfig/20240131-142413-marostegui.json
14:23 urbanecm@deploy2002: daimona and matmarex and urbanecm: Continuing with sync
14:21 urbanecm@deploy2002: daimona and matmarex and urbanecm: Backport for [[gerrit:994702|[metawiki] Let admins add/remove the event-organizer group (T356070)]], [[gerrit:994711|index.php: Restore support for forcesafemode option. (T355314)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:21 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
14:20 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
14:20 urbanecm@deploy2002: Started scap: Backport for [[gerrit:994702|[metawiki] Let admins add/remove the event-organizer group (T356070)]], [[gerrit:994711|index.php: Restore support for forcesafemode option. (T355314)]]
14:19 urbanecm@deploy2002: Finished scap: Backport for [[gerrit:994234|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994235|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994708|Add an exception for ConvenientDiscussions-style permalinks (T349653)]], [[gerrit:994709|Add an exception for ConvenientDiscussions-style permalinks (T349653)]]
14:18 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript migrateUserGroup.php --wiki=metawiki campaignevents-beta-tester event-organizer # T356070
14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55962 and previous config saved to /var/cache/conftool/dbconfig/20240131-141316-marostegui.json
14:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
14:13 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Continuing with sync
14:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
14:10 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Backport for [[gerrit:994234|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994235|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994708|Add an exception for ConvenientDiscussions-style permalinks (T349653)]], [[gerrit:994709|Add an exception for ConvenientDiscuss]]
14:09 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:08 urbanecm@deploy2002: Started scap: Backport for [[gerrit:994234|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994235|decodeURI fragments before sending them to discussiontoolsfindcomment (T356199)]], [[gerrit:994708|Add an exception for ConvenientDiscussions-style permalinks (T349653)]], [[gerrit:994709|Add an exception for ConvenientDiscussions-style permalinks (T349653)]]
14:08 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:07 urbanecm@deploy2002: Finished scap: Backport for [[gerrit:994732|testwiki: Temporarily change default value for 4 Echo properties (T353225)]] (duration: 19m 37s)
14:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
14:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
14:00 urbanecm@deploy2002: urbanecm: Continuing with sync
13:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2003.codfw.wmnet
13:51 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:994732|testwiki: Temporarily change default value for 4 Echo properties (T353225)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
13:48 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people2003.codfw.wmnet
13:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet1003.eqiad.wmnet
13:48 urbanecm@deploy2002: Started scap: Backport for [[gerrit:994732|testwiki: Temporarily change default value for 4 Echo properties (T353225)]]
13:44 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet1003.eqiad.wmnet
13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55960 and previous config saved to /var/cache/conftool/dbconfig/20240131-133143-marostegui.json
13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55959 and previous config saved to /var/cache/conftool/dbconfig/20240131-131637-marostegui.json
13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4002.ulsfo.wmnet
13:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4002.ulsfo.wmnet
13:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3003.esams.wmnet
13:04 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3003.esams.wmnet
13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
13:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55957 and previous config saved to /var/cache/conftool/dbconfig/20240131-130130-marostegui.json
12:58 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
12:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
12:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55956 and previous config saved to /var/cache/conftool/dbconfig/20240131-124623-marostegui.json
12:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
12:44 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
12:44 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
12:44 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
12:42 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host netmon1003.wikimedia.org
12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55955 and previous config saved to /var/cache/conftool/dbconfig/20240131-123224-marostegui.json
12:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
12:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55954 and previous config saved to /var/cache/conftool/dbconfig/20240131-123203-marostegui.json
12:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
12:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
12:24 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host dbstore1009.eqiad.wmnet
12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
12:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
12:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55953 and previous config saved to /var/cache/conftool/dbconfig/20240131-121656-marostegui.json
12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2008.wikimedia.org
12:13 claime: Raising external traffic to mw-on-k8s to 35% - T355532
12:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards2001.codfw.wmnet
12:12 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1009.eqiad.wmnet
12:11 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dbstore1008.eqiad.wmnet
12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2008.wikimedia.org
12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2007.wikimedia.org
12:10 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
12:10 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
12:10 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
12:09 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards2001.codfw.wmnet
12:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
12:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
12:08 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
12:08 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
12:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2007.wikimedia.org
12:07 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
12:07 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
12:06 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1006.wikimedia.org
12:05 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
12:05 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
12:04 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
12:04 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
12:04 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
12:03 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
12:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet2003.codfw.wmnet
12:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1006.wikimedia.org
12:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55952 and previous config saved to /var/cache/conftool/dbconfig/20240131-120150-marostegui.json
12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1005.wikimedia.org
12:00 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1008.eqiad.wmnet
11:59 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet2003.codfw.wmnet
11:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1004.eqiad.wmnet
11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1005.wikimedia.org
11:51 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people1004.eqiad.wmnet
11:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55951 and previous config saved to /var/cache/conftool/dbconfig/20240131-114643-marostegui.json
11:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
11:38 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
11:38 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
11:37 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55950 and previous config saved to /var/cache/conftool/dbconfig/20240131-113518-marostegui.json
11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55949 and previous config saved to /var/cache/conftool/dbconfig/20240131-113456-marostegui.json
11:34 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
11:29 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1424.eqiad.wmnet with OS bullseye
11:28 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2006.codfw.wmnet
11:27 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host testvm2006.codfw.wmnet with OS bookworm
11:27 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
11:26 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1423.eqiad.wmnet with OS bullseye
11:24 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1425.eqiad.wmnet with OS bullseye
11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55948 and previous config saved to /var/cache/conftool/dbconfig/20240131-111949-marostegui.json
11:11 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
11:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
11:05 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55947 and previous config saved to /var/cache/conftool/dbconfig/20240131-110442-marostegui.json
11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
11:01 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
10:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
10:53 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
10:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55946 and previous config saved to /var/cache/conftool/dbconfig/20240131-104936-marostegui.json
10:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1424.eqiad.wmnet with OS bullseye
10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1423.eqiad.wmnet with OS bullseye
10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1425.eqiad.wmnet with OS bullseye
10:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:43 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
10:42 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
10:41 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
10:41 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55945 and previous config saved to /var/cache/conftool/dbconfig/20240131-103830-marostegui.json
10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55944 and previous config saved to /var/cache/conftool/dbconfig/20240131-103807-marostegui.json
10:36 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
10:35 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 07s)
10:35 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
10:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
10:30 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 05s)
10:30 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
10:30 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
10:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
10:29 stevemunene@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
10:25 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55943 and previous config saved to /var/cache/conftool/dbconfig/20240131-102300-marostegui.json
10:21 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
10:20 cgoubert@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host testreduce1002.eqiad.wmnet
10:20 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
10:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55942 and previous config saved to /var/cache/conftool/dbconfig/20240131-100754-marostegui.json
10:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
10:02 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
10:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
09:53 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
09:53 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host moss-be2003.codfw.wmnet
09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55941 and previous config saved to /var/cache/conftool/dbconfig/20240131-095247-marostegui.json
09:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
09:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:51 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
09:50 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
09:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55940 and previous config saved to /var/cache/conftool/dbconfig/20240131-094301-marostegui.json
09:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55939 and previous config saved to /var/cache/conftool/dbconfig/20240131-094239-marostegui.json
09:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55938 and previous config saved to /var/cache/conftool/dbconfig/20240131-092733-marostegui.json
09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
09:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4005.wikimedia.org
09:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4005.wikimedia.org
09:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55937 and previous config saved to /var/cache/conftool/dbconfig/20240131-091226-marostegui.json
09:08 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
09:07 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1003.eqiad.wmnet
09:01 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
08:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55936 and previous config saved to /var/cache/conftool/dbconfig/20240131-085719-marostegui.json
08:55 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55935 and previous config saved to /var/cache/conftool/dbconfig/20240131-084700-marostegui.json
08:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55934 and previous config saved to /var/cache/conftool/dbconfig/20240131-084637-marostegui.json
08:45 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
08:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
08:44 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host crm2001.codfw.wmnet
08:40 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
08:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host crm2001.codfw.wmnet
08:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55932 and previous config saved to /var/cache/conftool/dbconfig/20240131-083142-root.json
08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55931 and previous config saved to /var/cache/conftool/dbconfig/20240131-083130-marostegui.json
08:27 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
08:21 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
08:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
08:18 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
08:17 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55930 and previous config saved to /var/cache/conftool/dbconfig/20240131-081637-root.json
08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55929 and previous config saved to /var/cache/conftool/dbconfig/20240131-081624-marostegui.json
08:14 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
08:13 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
08:13 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
08:09 moritzm: installing ca-certificates-java bugfix updates from bookworm 12.4 point release
08:09 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
08:09 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
08:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
08:05 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
08:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55928 and previous config saved to /var/cache/conftool/dbconfig/20240131-080132-root.json
08:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55927 and previous config saved to /var/cache/conftool/dbconfig/20240131-080117-marostegui.json
07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55926 and previous config saved to /var/cache/conftool/dbconfig/20240131-075600-marostegui.json
07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55925 and previous config saved to /var/cache/conftool/dbconfig/20240131-075522-marostegui.json
07:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
07:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55924 and previous config saved to /var/cache/conftool/dbconfig/20240131-074627-root.json
07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
07:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55923 and previous config saved to /var/cache/conftool/dbconfig/20240131-074015-marostegui.json
07:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
07:38 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
07:38 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55922 and previous config saved to /var/cache/conftool/dbconfig/20240131-073121-root.json
07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55921 and previous config saved to /var/cache/conftool/dbconfig/20240131-072509-marostegui.json
07:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55920 and previous config saved to /var/cache/conftool/dbconfig/20240131-072129-root.json
07:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS bookworm
07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 5%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55919 and previous config saved to /var/cache/conftool/dbconfig/20240131-071616-root.json
07:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55918 and previous config saved to /var/cache/conftool/dbconfig/20240131-071002-marostegui.json
07:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55917 and previous config saved to /var/cache/conftool/dbconfig/20240131-070624-root.json
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 1%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55916 and previous config saved to /var/cache/conftool/dbconfig/20240131-070111-root.json
06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55915 and previous config saved to /var/cache/conftool/dbconfig/20240131-065922-marostegui.json
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55914 and previous config saved to /var/cache/conftool/dbconfig/20240131-065901-marostegui.json
06:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2114.codfw.wmnet with OS bookworm
06:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
06:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55913 and previous config saved to /var/cache/conftool/dbconfig/20240131-065118-root.json
06:47 moritzm: installing glibc security updates on bookworm
06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55912 and previous config saved to /var/cache/conftool/dbconfig/20240131-064353-marostegui.json
06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
06:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
06:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55911 and previous config saved to /var/cache/conftool/dbconfig/20240131-063613-root.json
06:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS bookworm
06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55910 and previous config saved to /var/cache/conftool/dbconfig/20240131-062846-marostegui.json
06:22 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2114.codfw.wmnet with OS bookworm
06:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55909 and previous config saved to /var/cache/conftool/dbconfig/20240131-062109-root.json
06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T354506', diff saved to https://phabricator.wikimedia.org/P55908 and previous config saved to /var/cache/conftool/dbconfig/20240131-061932-root.json
06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55907 and previous config saved to /var/cache/conftool/dbconfig/20240131-061340-marostegui.json
06:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55906 and previous config saved to /var/cache/conftool/dbconfig/20240131-060602-root.json
06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55905 and previous config saved to /var/cache/conftool/dbconfig/20240131-060337-marostegui.json
06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55904 and previous config saved to /var/cache/conftool/dbconfig/20240131-055057-root.json
05:41 eileen: civicrm upgraded from 6de61520 to 520337a0
05:30 fab@deploy2002: Finished deploy [airflow-dags/research@97c6a4e]: (no justification provided) (duration: 00m 14s)
05:30 fab@deploy2002: Started deploy [airflow-dags/research@97c6a4e]: (no justification provided)
03:29 eileen: tools upgraded from 02281338 to c823e692
03:05 fab@deploy2002: Finished deploy [airflow-dags/research@6a97a34]: (no justification provided) (duration: 00m 23s)
03:05 fab@deploy2002: Started deploy [airflow-dags/research@6a97a34]: (no justification provided)

2024-01-30

23:54 mutante: LDAP - added aklapper to group releng T356043
23:07 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1006.eqiad.wmnet
23:07 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1006.eqiad.wmnet
22:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
22:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
22:41 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
22:20 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1005.eqiad.wmnet
22:20 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1005.eqiad.wmnet
22:10 cjming: end of UTC late backport window
22:09 cjming@deploy2002: Finished scap: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]] (duration: 08m 24s)
22:02 cjming@deploy2002: cjming and superpes: Continuing with sync
22:02 cjming@deploy2002: cjming and superpes: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
22:00 cjming@deploy2002: Started scap: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]]
21:59 cjming@deploy2002: Finished scap: Backport for [[gerrit:994211|[ukwiki] Change autoconfirmed setting (T355972)]], [[gerrit:994214|[ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850)]], [[gerrit:994220|[ganwiki] Add new namespace aliases (T355854)]] (duration: 09m 32s)
21:53 cjming@deploy2002: superpes and cjming: Continuing with sync
21:51 cjming@deploy2002: superpes and cjming: Backport for [[gerrit:994211|[ukwiki] Change autoconfirmed setting (T355972)]], [[gerrit:994214|[ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850)]], [[gerrit:994220|[ganwiki] Add new namespace aliases (T355854)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:50 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
21:50 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
21:49 cjming@deploy2002: Started scap: Backport for [[gerrit:994211|[ukwiki] Change autoconfirmed setting (T355972)]], [[gerrit:994214|[ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850)]], [[gerrit:994220|[ganwiki] Add new namespace aliases (T355854)]]
21:44 cjming@deploy2002: Finished scap: Backport for gerrit:994143 Run CheckerJob against read-only clusters (T354793) (duration: 07m 41s)
21:42 mutante: LDAP - added jnuche to group releng (T356043) - already done/approved in the past in T301149
21:41 mutante: LDAP - added jhuneidi to group releng (T356043) - already done/approved in the past in T210028
21:40 mutante: LDAP - added brennen to group releng (T356043) - already done/approved in the past in T215365
21:38 cjming@deploy2002: cjming and ebernhardson: Continuing with sync
21:38 cjming@deploy2002: cjming and ebernhardson: Backport for gerrit:994143 Run CheckerJob against read-only clusters (T354793) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:37 cjming@deploy2002: Started scap: Backport for gerrit:994143 Run CheckerJob against read-only clusters (T354793)
21:36 cjming@deploy2002: Finished scap: Backport for gerrit:994142 Run CheckerJob against read-only clusters (T354793) (duration: 07m 49s)
21:34 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
21:30 cjming@deploy2002: ebernhardson and cjming: Continuing with sync
21:30 cjming@deploy2002: ebernhardson and cjming: Backport for gerrit:994142 Run CheckerJob against read-only clusters (T354793) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:28 cjming@deploy2002: Started scap: Backport for gerrit:994142 Run CheckerJob against read-only clusters (T354793)
21:01 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1004.eqiad.wmnet
21:01 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1004.eqiad.wmnet
20:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
20:51 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
20:38 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
20:38 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
20:35 urandom: bootstrapping sessionstore1004/cassandra-a — T353402
20:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::public
19:45 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::public
19:36 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
19:36 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
19:27 Lucas_WMDE: FINISHED lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168 # -- 268378 invalid signatures --
19:10 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
19:09 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
18:52 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
18:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
18:46 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 05s)
18:46 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
18:17 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
18:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
18:05 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:03 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
17:37 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
17:37 urandom: DROP test_spark3_loading keyspace, Generated Data (Cassandra) cluster — T356112
17:22 jforrester@deploy2002: Finished scap: Backport for gerrit:994202 Do not search for elements if no previews have been registered (T355933 T356186 T356193), gerrit:994203 Do not search for elements if no previews have been registered (T355933 T356186 T356193) (duration: 11m 51s)
17:21 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
17:15 jforrester@deploy2002: jforrester: Continuing with sync
17:14 jforrester@deploy2002: jforrester: Backport for gerrit:994202 Do not search for elements if no previews have been registered (T355933 T356186 T356193), gerrit:994203 Do not search for elements if no previews have been registered (T355933 T356186 T356193) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
17:13 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2005.codfw.wmnet with OS bookworm
17:10 jforrester@deploy2002: Started scap: Backport for gerrit:994202 Do not search for elements if no previews have been registered (T355933 T356186 T356193), gerrit:994203 Do not search for elements if no previews have been registered (T355933 T356186 T356193)
16:57 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1009.wikimedia.org
16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1008.wikimedia.org
16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1007.wikimedia.org
16:54 claime: Running homer 'cr*codfw*' commit 'T351074'
16:54 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: sync
16:54 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: sync
16:49 mutante: gitlab is back
16:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
16:47 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
16:44 mutante: gitlab is down for maintenance for a few minutes
16:34 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:29 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab.wikimedia.org with reason: server move
16:29 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab.wikimedia.org with reason: server move
16:28 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
16:28 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
16:25 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1466.eqiad.wmnet with OS bullseye
16:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1457.eqiad.wmnet with OS bullseye
16:18 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2366.codfw.wmnet with OS bullseye
16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1440.eqiad.wmnet with OS bullseye
16:14 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:13 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
16:13 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2370.codfw.wmnet with OS bullseye
16:11 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1482.eqiad.wmnet with OS bullseye
16:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2368.codfw.wmnet with OS bullseye
16:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
16:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1459.eqiad.wmnet with OS bullseye
16:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
15:59 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
15:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
15:58 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
15:54 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
15:53 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
15:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
15:47 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
15:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
15:41 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
15:40 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
15:29 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1466.eqiad.wmnet with OS bullseye
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1459.eqiad.wmnet with OS bullseye
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1482.eqiad.wmnet with OS bullseye
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1457.eqiad.wmnet with OS bullseye
15:27 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1440.eqiad.wmnet with OS bullseye
15:26 Lucas_WMDE: UTC afternoon backport+config window done
15:26 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2370.codfw.wmnet with OS bullseye
15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2368.codfw.wmnet with OS bullseye
15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2366.codfw.wmnet with OS bullseye
15:17 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwikiquote --fix # T355195 (two pages will need separate fixing)
15:17 claime: Recommissioning mw2366.codfw.wmnet,mw2368.codfw.wmnet,mw2370.codfw.wmnet as k8s nodes - T351074
15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host sretest2005.codfw.wmnet
15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
15:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:993458|[enwikiquote] Add a draft namespace and its talk space (T355195)]] (duration: 08m 43s)
15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [[gerrit:993458|[enwikiquote] Add a draft namespace and its talk space (T355195)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:993458|[enwikiquote] Add a draft namespace and its talk space (T355195)]]
15:06 claime: Manual run of mediawiki_job_generatecaptcha.service following timer failure - T141490
15:06 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwiktionary --fix # T354813
15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:993457|[enwiktionary] Remove the Concordance namespace and its talk space (T354813)]] (duration: 09m 57s)
14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
14:57 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [[gerrit:993457|[enwiktionary] Remove the Concordance namespace and its talk space (T354813)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:55 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:993457|[enwiktionary] Remove the Concordance namespace and its talk space (T354813)]]
14:52 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes azwiki --fix # T355041, failed at the end :(
14:52 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:993452|[azwiki] Changing 9 namespace aliases (T355041)]] (duration: 08m 37s)
14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [[gerrit:993452|[azwiki] Changing 9 namespace aliases (T355041)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:993452|[azwiki] Changing 9 namespace aliases (T355041)]]
14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:994139 CommentParser: Ignore generated timestamp links (T356142), gerrit:994140 CommentParser: Ignore generated timestamp links (T356142), gerrit:994141 Add maintenance script to list users with invalid signatures (T356168) (duration: 11m 01s)
14:40 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
14:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for gerrit:994139 CommentParser: Ignore generated timestamp links (T356142), gerrit:994140 CommentParser: Ignore generated timestamp links (T356142), gerrit:994141 Add maintenance script to list users with invalid signatures (T356168) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:31 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
14:31 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:994139 CommentParser: Ignore generated timestamp links (T356142), gerrit:994140 CommentParser: Ignore generated timestamp links (T356142), gerrit:994141 Add maintenance script to list users with invalid signatures (T356168)
14:30 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
14:30 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:26 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
14:26 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 backport Cancelled
14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:994028 Don't bail out early when there are no selectors configured (T355933) (duration: 09m 04s)
14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
14:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for gerrit:994028 Don't bail out early when there are no selectors configured (T355933) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:11 volans@cumin2002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:994028 Don't bail out early when there are no selectors configured (T355933)
14:09 volans@cumin2002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
13:56 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
13:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:55 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2005.codfw.wmnet on all recursors
13:54 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2005.codfw.wmnet on all recursors
13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:53 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host sretest2005.codfw.wmnet
13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts srestest2005.codfw.wmnet
13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
13:44 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
13:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:37 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
13:36 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts srestest2005.codfw.wmnet
13:34 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=94) for new host srestest2005.codfw.wmnet
13:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:33 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
13:32 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:31 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:26 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
13:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
13:16 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:15 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
13:12 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:10 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:08 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1159-1175].eqiad.wmnet
13:08 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1159-1175].eqiad.wmnet
13:08 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:08 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
13:06 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1158.eqiad.wmnet
13:04 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1158.eqiad.wmnet
12:19 taavi: reprepro import exim4 4.96-15+deb12u4+wmf1 to component/exim4-arc T356171
11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55896 and previous config saved to /var/cache/conftool/dbconfig/20240130-114726-ladsgroup.json
11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1005.eqiad.wmnet
11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55895 and previous config saved to /var/cache/conftool/dbconfig/20240130-113220-ladsgroup.json
11:30 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1157.eqiad.wmnet
11:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1005.eqiad.wmnet
11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
11:19 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1157.eqiad.wmnet
11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55894 and previous config saved to /var/cache/conftool/dbconfig/20240130-111713-ladsgroup.json
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::search
11:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
11:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
11:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::search
11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55893 and previous config saved to /var/cache/conftool/dbconfig/20240130-110207-ladsgroup.json
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55892 and previous config saved to /var/cache/conftool/dbconfig/20240130-105954-ladsgroup.json
10:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
10:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
10:56 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1005.eqiad.wmnet with OS bullseye
10:56 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:45 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:35 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
10:35 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:34 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:34 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:32 volans@cumin1002: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox-canary
10:32 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
10:31 volans@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
10:29 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
10:29 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:28 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:28 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:26 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
10:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:25 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
10:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host srestest2005.codfw.wmnet
10:24 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
10:23 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:23 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
10:23 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:16 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1005.eqiad.wmnet with OS bullseye
10:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab1004.eqiad.wmnet
10:00 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 37s)
10:00 gmodena@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
09:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab1004.eqiad.wmnet
09:30 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1008.eqiad.wmnet with OS bullseye
09:14 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
09:11 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: Switchover', diff saved to https://phabricator.wikimedia.org/P55891 and previous config saved to /var/cache/conftool/dbconfig/20240130-090704-root.json
09:00 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1008.eqiad.wmnet with OS bullseye
08:57 Emperor: restart swift-object-replicator on ms-be1068
08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: Switchover', diff saved to https://phabricator.wikimedia.org/P55890 and previous config saved to /var/cache/conftool/dbconfig/20240130-085159-root.json
08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55889 and previous config saved to /var/cache/conftool/dbconfig/20240130-085055-root.json
08:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55888 and previous config saved to /var/cache/conftool/dbconfig/20240130-083829-root.json
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: Switchover', diff saved to https://phabricator.wikimedia.org/P55887 and previous config saved to /var/cache/conftool/dbconfig/20240130-083654-root.json
08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55886 and previous config saved to /var/cache/conftool/dbconfig/20240130-083550-root.json
08:29 moritzm: upgrading python-pymysql on remaining DB hosts to 1.0.2-2~wmf11u1 T355531
08:28 ladsgroup@deploy2002: Finished scap: Backport for gerrit:993824 Enable PageNotice extension on testwiki (T61245) (duration: 10m 24s)
08:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55885 and previous config saved to /var/cache/conftool/dbconfig/20240130-082324-root.json
08:22 ladsgroup@deploy2002: ladsgroup and tto: Continuing with sync
08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: Switchover', diff saved to https://phabricator.wikimedia.org/P55884 and previous config saved to /var/cache/conftool/dbconfig/20240130-082149-root.json
08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55883 and previous config saved to /var/cache/conftool/dbconfig/20240130-082045-root.json
08:19 ladsgroup@deploy2002: ladsgroup and tto: Backport for gerrit:993824 Enable PageNotice extension on testwiki (T61245) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:18 ladsgroup@deploy2002: Started scap: Backport for gerrit:993824 Enable PageNotice extension on testwiki (T61245)
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55882 and previous config saved to /var/cache/conftool/dbconfig/20240130-080819-root.json
08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: Switchover', diff saved to https://phabricator.wikimedia.org/P55881 and previous config saved to /var/cache/conftool/dbconfig/20240130-080644-root.json
08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55880 and previous config saved to /var/cache/conftool/dbconfig/20240130-080540-root.json
07:55 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
07:53 ayounsi@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55879 and previous config saved to /var/cache/conftool/dbconfig/20240130-075314-root.json
07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55878 and previous config saved to /var/cache/conftool/dbconfig/20240130-075035-root.json
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2105 T356069', diff saved to https://phabricator.wikimedia.org/P55877 and previous config saved to /var/cache/conftool/dbconfig/20240130-074746-root.json
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2127 to s3 primary and set section read-write T356069', diff saved to https://phabricator.wikimedia.org/P55876 and previous config saved to /var/cache/conftool/dbconfig/20240130-074656-marostegui.json
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Set s3 codfw as read-only for maintenance - T356069', diff saved to https://phabricator.wikimedia.org/P55875 and previous config saved to /var/cache/conftool/dbconfig/20240130-074634-marostegui.json
07:46 marostegui: Starting s3 codfw failover from db2105 to db2127 - T356069
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55874 and previous config saved to /var/cache/conftool/dbconfig/20240130-073807-root.json
07:33 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2127 with weight 0 T356069', diff saved to https://phabricator.wikimedia.org/P55873 and previous config saved to /var/cache/conftool/dbconfig/20240130-073257-marostegui.json
07:32 root@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
07:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55872 and previous config saved to /var/cache/conftool/dbconfig/20240130-072734-root.json
07:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55871 and previous config saved to /var/cache/conftool/dbconfig/20240130-072302-root.json
07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55870 and previous config saved to /var/cache/conftool/dbconfig/20240130-071612-root.json
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55869 and previous config saved to /var/cache/conftool/dbconfig/20240130-071229-root.json
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2144 to x2 master T356060', diff saved to https://phabricator.wikimedia.org/P55868 and previous config saved to /var/cache/conftool/dbconfig/20240130-071202-root.json
07:07 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55867 and previous config saved to /var/cache/conftool/dbconfig/20240130-070757-root.json
07:02 marostegui@deploy2002: Finished scap: Backport for gerrit:993775 Revert "db-production.php: Disable writes on es4" (duration: 07m 48s)
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55866 and previous config saved to /var/cache/conftool/dbconfig/20240130-070107-root.json
07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
07:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
06:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55865 and previous config saved to /var/cache/conftool/dbconfig/20240130-065724-root.json
06:55 marostegui@deploy2002: marostegui: Continuing with sync
06:55 marostegui@deploy2002: marostegui: Backport for gerrit:993775 Revert "db-production.php: Disable writes on es4" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
06:54 marostegui@deploy2002: Started scap: Backport for gerrit:993775 Revert "db-production.php: Disable writes on es4"
06:48 marostegui@deploy2002: backport Cancelled
06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55864 and previous config saved to /var/cache/conftool/dbconfig/20240130-064602-root.json
06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2020 T356064', diff saved to https://phabricator.wikimedia.org/P55863 and previous config saved to /var/cache/conftool/dbconfig/20240130-064526-root.json
06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Reduce es2021 weight T356064', diff saved to https://phabricator.wikimedia.org/P55862 and previous config saved to /var/cache/conftool/dbconfig/20240130-064512-root.json
06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55861 and previous config saved to /var/cache/conftool/dbconfig/20240130-064219-root.json
06:36 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2021 to es4 primary T356064', diff saved to https://phabricator.wikimedia.org/P55860 and previous config saved to /var/cache/conftool/dbconfig/20240130-063625-root.json
06:35 marostegui: Starting es4 codfw failover from es2020 to es2021 - T356064
06:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55859 and previous config saved to /var/cache/conftool/dbconfig/20240130-063057-root.json
06:30 marostegui@deploy2002: Finished scap: Backport for gerrit:993711 db-production.php: Disable writes on es4 (T356064) (duration: 09m 11s)
06:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1224 T354591', diff saved to https://phabricator.wikimedia.org/P55858 and previous config saved to /var/cache/conftool/dbconfig/20240130-062930-root.json
06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55857 and previous config saved to /var/cache/conftool/dbconfig/20240130-062714-root.json
06:23 marostegui@deploy2002: marostegui: Continuing with sync
06:22 marostegui@deploy2002: marostegui: Backport for gerrit:993711 db-production.php: Disable writes on es4 (T356064) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2020 with weight 0 T356064', diff saved to https://phabricator.wikimedia.org/P55856 and previous config saved to /var/cache/conftool/dbconfig/20240130-062241-marostegui.json
06:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:21 marostegui@deploy2002: Started scap: Backport for gerrit:993711 db-production.php: Disable writes on es4 (T356064)
06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55855 and previous config saved to /var/cache/conftool/dbconfig/20240130-061552-root.json
06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2103 T356059', diff saved to https://phabricator.wikimedia.org/P55854 and previous config saved to /var/cache/conftool/dbconfig/20240130-061529-root.json
06:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55853 and previous config saved to /var/cache/conftool/dbconfig/20240130-061423-root.json
06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2112 to s1 primary and set section read-write T356059', diff saved to https://phabricator.wikimedia.org/P55852 and previous config saved to /var/cache/conftool/dbconfig/20240130-061305-marostegui.json
06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Set s1 codfw as read-only for maintenance - T356059', diff saved to https://phabricator.wikimedia.org/P55851 and previous config saved to /var/cache/conftool/dbconfig/20240130-061243-marostegui.json
06:12 marostegui: Starting s1 codfw failover from db2103 to db2112 - T356059
06:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55850 and previous config saved to /var/cache/conftool/dbconfig/20240130-061014-root.json
06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55849 and previous config saved to /var/cache/conftool/dbconfig/20240130-060727-root.json
05:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2112 with weight 0 T356059', diff saved to https://phabricator.wikimedia.org/P55848 and previous config saved to /var/cache/conftool/dbconfig/20240130-054410-marostegui.json
05:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T355739', diff saved to https://phabricator.wikimedia.org/P55847 and previous config saved to /var/cache/conftool/dbconfig/20240130-054154-root.json
05:40 marostegui@cumin1002: dbctl commit (dc=all): 'Set s6 codfw as read-only for maintenance - T355739', diff saved to https://phabricator.wikimedia.org/P55845 and previous config saved to /var/cache/conftool/dbconfig/20240130-054025-root.json
05:40 marostegui: Starting s6 codfw failover from db2114 to db2129 - T355739
05:19 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2129 with weight 0 T355739', diff saved to https://phabricator.wikimedia.org/P55844 and previous config saved to /var/cache/conftool/dbconfig/20240130-051952-marostegui.json
05:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
05:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
04:57 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.16 refs T354434 (duration: 52m 38s)
04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.16 refs T354434
04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.13 (duration: 02m 09s)
03:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
03:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
00:00 eileen: tools upgraded from 117e1f9c to 544301bd

2024-01-29

22:31 catrope@deploy2002: Finished scap: Backport for gerrit:993805 Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) (duration: 28m 33s)
22:24 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
22:03 catrope@deploy2002: catrope and jdlrobson: Backport for gerrit:993805 Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
22:02 catrope@deploy2002: Started scap: Backport for gerrit:993805 Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388)
21:54 catrope@deploy2002: Finished scap: Backport for gerrit:991424 Use desktop history page HTML everywhere (T353388), gerrit:992931 Begin capturing errors for Wikivoyage (duration: 12m 05s)
21:48 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
21:43 catrope@deploy2002: catrope and jdlrobson: Backport for gerrit:991424 Use desktop history page HTML everywhere (T353388), gerrit:992931 Begin capturing errors for Wikivoyage synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:42 catrope@deploy2002: Started scap: Backport for gerrit:991424 Use desktop history page HTML everywhere (T353388), gerrit:992931 Begin capturing errors for Wikivoyage
21:36 catrope@deploy2002: Finished scap: Backport for gerrit:993709 DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) (duration: 12m 19s)
21:30 catrope@deploy2002: catrope and esanders: Continuing with sync
21:25 catrope@deploy2002: catrope and esanders: Backport for gerrit:993709 DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:24 catrope@deploy2002: Started scap: Backport for gerrit:993709 DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063)
21:17 catrope@deploy2002: Finished scap: Backport for gerrit:992974 cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) (duration: 08m 40s)
21:11 catrope@deploy2002: ebernhardson and catrope: Continuing with sync
21:10 catrope@deploy2002: ebernhardson and catrope: Backport for gerrit:992974 cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:09 catrope@deploy2002: Started scap: Backport for gerrit:992974 cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335)
20:37 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:37 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
20:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55843 and previous config saved to /var/cache/conftool/dbconfig/20240129-202740-marostegui.json
20:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55842 and previous config saved to /var/cache/conftool/dbconfig/20240129-201233-marostegui.json
19:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55841 and previous config saved to /var/cache/conftool/dbconfig/20240129-195725-marostegui.json
19:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55840 and previous config saved to /var/cache/conftool/dbconfig/20240129-194218-marostegui.json
19:36 zabe@deploy2002: Finished scap: Backport for gerrit:993765 Start reading from af_actor/afh_actor everywhere (T355616) (duration: 09m 09s)
19:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55839 and previous config saved to /var/cache/conftool/dbconfig/20240129-193317-marostegui.json
19:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
19:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
19:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55838 and previous config saved to /var/cache/conftool/dbconfig/20240129-193254-marostegui.json
19:29 zabe@deploy2002: zabe: Continuing with sync
19:28 zabe@deploy2002: zabe: Backport for gerrit:993765 Start reading from af_actor/afh_actor everywhere (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
19:27 zabe@deploy2002: Started scap: Backport for gerrit:993765 Start reading from af_actor/afh_actor everywhere (T355616)
19:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55837 and previous config saved to /var/cache/conftool/dbconfig/20240129-191748-marostegui.json
19:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55836 and previous config saved to /var/cache/conftool/dbconfig/20240129-190241-marostegui.json
19:01 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:01 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
19:00 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
18:59 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
18:59 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
18:58 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
18:49 brouberol@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop test cluster: Restart of jvm daemons.
18:49 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
18:49 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55835 and previous config saved to /var/cache/conftool/dbconfig/20240129-184735-marostegui.json
18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55834 and previous config saved to /var/cache/conftool/dbconfig/20240129-182909-marostegui.json
18:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
18:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
18:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55833 and previous config saved to /var/cache/conftool/dbconfig/20240129-182846-marostegui.json
18:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55832 and previous config saved to /var/cache/conftool/dbconfig/20240129-181340-marostegui.json
17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55831 and previous config saved to /var/cache/conftool/dbconfig/20240129-175833-marostegui.json
17:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
17:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
17:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55830 and previous config saved to /var/cache/conftool/dbconfig/20240129-174327-marostegui.json
17:43 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
17:42 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
17:42 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55829 and previous config saved to /var/cache/conftool/dbconfig/20240129-173435-marostegui.json
17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55828 and previous config saved to /var/cache/conftool/dbconfig/20240129-173406-marostegui.json
17:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55824 and previous config saved to /var/cache/conftool/dbconfig/20240129-171859-marostegui.json
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55823 and previous config saved to /var/cache/conftool/dbconfig/20240129-170353-marostegui.json
16:51 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: gerrit:993728 Bumping portals to master (T128546) (duration: 06m 37s)
16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55822 and previous config saved to /var/cache/conftool/dbconfig/20240129-164846-marostegui.json
16:44 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: gerrit:993728 Bumping portals to master (T128546) (duration: 07m 04s)
16:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55821 and previous config saved to /var/cache/conftool/dbconfig/20240129-164005-marostegui.json
16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55820 and previous config saved to /var/cache/conftool/dbconfig/20240129-163926-marostegui.json
16:36 volans: installed spicerack 8.3.0 on cumin1002, cumin1001
16:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55819 and previous config saved to /var/cache/conftool/dbconfig/20240129-162420-marostegui.json
16:20 ladsgroup@deploy2002: Finished scap: Backport for gerrit:992129 Drop old virtual domain for url shortener (duration: 09m 24s)
16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:12 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:992129 Drop old virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:11 ladsgroup@deploy2002: Started scap: Backport for gerrit:992129 Drop old virtual domain for url shortener
16:10 urandom: decommissioning restbase2019/cassandra-{a,b,c} — T352469
16:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55817 and previous config saved to /var/cache/conftool/dbconfig/20240129-160913-marostegui.json
16:08 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
16:07 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
15:58 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1009.eqiad.wmnet with OS buster
15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55816 and previous config saved to /var/cache/conftool/dbconfig/20240129-155406-marostegui.json
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55815 and previous config saved to /var/cache/conftool/dbconfig/20240129-154444-marostegui.json
15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55814 and previous config saved to /var/cache/conftool/dbconfig/20240129-154422-marostegui.json
15:34 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
15:31 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55811 and previous config saved to /var/cache/conftool/dbconfig/20240129-152915-marostegui.json
15:26 Dreamy_Jazz: Running MediaModeration scanning script using `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` on a tmux session.
15:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
15:23 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
15:21 Dreamy_Jazz: Running `foreachwikiindblist group1.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
15:19 Dreamy_Jazz: Running `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405`
15:17 Dreamy_Jazz: Stopping mediamoderation scanning script
15:17 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS buster
15:15 Dreamy_Jazz: afternoon UTC backport window done
15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55810 and previous config saved to /var/cache/conftool/dbconfig/20240129-151409-marostegui.json
15:14 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752) (duration: 21m 21s)
15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts sretest1005.eqiad.wmnet
15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:12 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1006.eqiad.wmnet
15:04 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1001.eqiad.wmnet
15:04 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:00 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55809 and previous config saved to /var/cache/conftool/dbconfig/20240129-145902-marostegui.json
14:58 hashar@deploy2002: Finished deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774 (duration: 00m 07s)
14:58 hashar@deploy2002: Started deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774
14:57 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1001.eqiad.wmnet
14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55808 and previous config saved to /var/cache/conftool/dbconfig/20240129-145652-marostegui.json
14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
14:56 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts sretest1005.eqiad.wmnet
14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55807 and previous config saved to /var/cache/conftool/dbconfig/20240129-145630-marostegui.json
14:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2055.codfw.wmnet
14:54 Dreamy_Jazz: scap backport is also backporting 993499 for T355357
14:53 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host sretest1005.eqiad.wmnet
14:53 ayounsi@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
14:52 dreamyjazz@deploy2002: Started scap: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752)
14:52 ayounsi@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
14:51 dreamyjazz@deploy2002: sync-world aborted: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752) (duration: 04m 13s)
14:51 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2055.codfw.wmnet
14:50 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1005.eqiad.wmnet on all recursors
14:49 ayounsi@cumin2002: START - Cookbook sre.dns.wipe-cache sretest1005.eqiad.wmnet on all recursors
14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:48 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:47 dreamyjazz@deploy2002: Started scap: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752)
14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:993494 hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) (duration: 12m 29s)
14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1006.eqiad.wmnet
14:42 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
14:42 ayounsi@cumin2002: START - Cookbook sre.ganeti.makevm for new host sretest1005.eqiad.wmnet
14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55806 and previous config saved to /var/cache/conftool/dbconfig/20240129-144124-marostegui.json
14:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::analytics_product
14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
14:37 brouberol@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-tool1009.eqiad.wmnet with OS bullseye
14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for gerrit:993494 hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:993494 hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581)
14:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::analytics_product
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:992783 knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) (duration: 08m 58s)
14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55804 and previous config saved to /var/cache/conftool/dbconfig/20240129-142617-marostegui.json
14:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ceph2001.codfw.wmnet with OS bullseye
14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for gerrit:992783 knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:21 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:992783 knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583)
14:17 volans: upgraded spicerack to 8.3.0 on cumin2002
14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:992371 uzwiki: revert temporary logo for the 20th anniversary (T353723) (duration: 11m 01s)
14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55803 and previous config saved to /var/cache/conftool/dbconfig/20240129-141111-marostegui.json
14:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1006.eqiad.wmnet with OS bullseye
14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for gerrit:992371 uzwiki: revert temporary logo for the 20th anniversary (T353723) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:992371 uzwiki: revert temporary logo for the 20th anniversary (T353723)
14:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55802 and previous config saved to /var/cache/conftool/dbconfig/20240129-140205-marostegui.json
14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55801 and previous config saved to /var/cache/conftool/dbconfig/20240129-140142-marostegui.json
13:54 volans: uploaded spicerack_8.3.0 to apt.wikimedia.org bullseye-wikimedia
13:48 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2355.codfw.wmnet with OS bullseye
13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55799 and previous config saved to /var/cache/conftool/dbconfig/20240129-134636-marostegui.json
13:46 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2445.codfw.wmnet with OS bullseye
13:40 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2429.codfw.wmnet with OS bullseye
13:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
13:37 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2381.codfw.wmnet with OS bullseye
13:36 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
13:35 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2260.codfw.wmnet with OS bullseye
13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55798 and previous config saved to /var/cache/conftool/dbconfig/20240129-133129-marostegui.json
13:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
13:26 claime: Restarting ferm.service on k8s node kubernetes2055 - T354855
13:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
13:23 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1006.eqiad.wmnet with OS bullseye
13:23 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
13:20 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
13:18 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
13:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
13:16 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
13:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55797 and previous config saved to /var/cache/conftool/dbconfig/20240129-131623-marostegui.json
13:15 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
13:14 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
13:13 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
13:12 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
13:07 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS bullseye
13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55796 and previous config saved to /var/cache/conftool/dbconfig/20240129-130724-marostegui.json
13:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2445.codfw.wmnet with OS bullseye
12:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2429.codfw.wmnet with OS bullseye
12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
12:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2381.codfw.wmnet with OS bullseye
12:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
12:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
12:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55795 and previous config saved to /var/cache/conftool/dbconfig/20240129-125726-marostegui.json
12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2355.codfw.wmnet with OS bullseye
12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2260.codfw.wmnet with OS bullseye
12:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55794 and previous config saved to /var/cache/conftool/dbconfig/20240129-124220-marostegui.json
12:33 moritzm: installing openssh security updates
12:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55793 and previous config saved to /var/cache/conftool/dbconfig/20240129-122713-marostegui.json
12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1007.eqiad.wmnet
12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1007.eqiad.wmnet
12:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::wmde
12:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55792 and previous config saved to /var/cache/conftool/dbconfig/20240129-121205-marostegui.json
12:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55791 and previous config saved to /var/cache/conftool/dbconfig/20240129-120628-marostegui.json
12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
12:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
12:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::wmde
12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
11:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55790 and previous config saved to /var/cache/conftool/dbconfig/20240129-115953-marostegui.json
11:53 Dreamy_Jazz: Running mwscript maintenance/sql.php --wiki=testwiki --wikidb=centralauth ~/T354700-create-table-global.sql for T354700
11:45 Dreamy_Jazz: sql.php finished for T354700
11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55789 and previous config saved to /var/cache/conftool/dbconfig/20240129-114446-marostegui.json
11:41 Dreamy_Jazz: T354700 - Running `foreachwiki maintenance/sql.php ~/T354700-create-table.sql`
11:39 Dreamy_Jazz: T354700 - Ran mwscript maintenance/sql.php --wiki=testwiki ~/T354700-create-table.sql
11:38 moritzm: upload ganeti 3.0.2-3+wmf1 (bookworm package of Ganeti plus backport for SSL chain handling in RAPI) to apt.wikimedia.org T300152
11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55788 and previous config saved to /var/cache/conftool/dbconfig/20240129-112940-marostegui.json
11:28 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1007.eqiad.wmnet with OS bullseye
11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55787 and previous config saved to /var/cache/conftool/dbconfig/20240129-111434-marostegui.json
11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55786 and previous config saved to /var/cache/conftool/dbconfig/20240129-110955-marostegui.json
11:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
11:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55785 and previous config saved to /var/cache/conftool/dbconfig/20240129-110933-marostegui.json
11:05 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
11:01 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
10:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55784 and previous config saved to /var/cache/conftool/dbconfig/20240129-105427-marostegui.json
10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1054.eqiad.wmnet
10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2054.codfw.wmnet
10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2054.codfw.wmnet
10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1054.eqiad.wmnet
10:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1007.eqiad.wmnet with OS bullseye
10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55783 and previous config saved to /var/cache/conftool/dbconfig/20240129-103920-marostegui.json
10:38 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55782 and previous config saved to /var/cache/conftool/dbconfig/20240129-102414-marostegui.json
10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55781 and previous config saved to /var/cache/conftool/dbconfig/20240129-101757-marostegui.json
10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55780 and previous config saved to /var/cache/conftool/dbconfig/20240129-101735-marostegui.json
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55779 and previous config saved to /var/cache/conftool/dbconfig/20240129-100229-marostegui.json
10:01 moritzm: upload prometheus-ganeti-exporter 0.3+deb12u1 to apt.wikimedia.org T300152
09:56 XioNoX: enable Puppet on all the ganeti servers for CR990968 deployment - T300152
09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55778 and previous config saved to /var/cache/conftool/dbconfig/20240129-094722-marostegui.json
09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55777 and previous config saved to /var/cache/conftool/dbconfig/20240129-093216-marostegui.json
09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55776 and previous config saved to /var/cache/conftool/dbconfig/20240129-092724-marostegui.json
09:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
09:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55775 and previous config saved to /var/cache/conftool/dbconfig/20240129-092702-marostegui.json
09:17 godog: mark for deletion and cleanup replicated thanos blocks for prometheus=ops, older than 3 months, all resolutions - T351927
09:13 moritzm: upgrading python-pymysql in S7 DB hosts to 1.0.2-2~wmf11u1 T355531
09:13 XioNoX: disable Puppet on all the ganeti servers for CR990968 deployment - T300152
09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55773 and previous config saved to /var/cache/conftool/dbconfig/20240129-091156-marostegui.json
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55772 and previous config saved to /var/cache/conftool/dbconfig/20240129-085649-marostegui.json
08:46 marostegui@deploy2002: Finished scap: Backport for gerrit:993489 Revert "ProductionServices.php: Promote pc2014" (duration: 17m 13s)
08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55771 and previous config saved to /var/cache/conftool/dbconfig/20240129-084143-marostegui.json
08:39 marostegui@deploy2002: marostegui: Continuing with sync
08:39 marostegui@deploy2002: marostegui: Backport for gerrit:993489 Revert "ProductionServices.php: Promote pc2014" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55770 and previous config saved to /var/cache/conftool/dbconfig/20240129-083627-marostegui.json
08:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
08:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55769 and previous config saved to /var/cache/conftool/dbconfig/20240129-083603-marostegui.json
08:29 marostegui@deploy2002: Started scap: Backport for gerrit:993489 Revert "ProductionServices.php: Promote pc2014"
08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55768 and previous config saved to /var/cache/conftool/dbconfig/20240129-082057-marostegui.json
08:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55767 and previous config saved to /var/cache/conftool/dbconfig/20240129-080550-marostegui.json
07:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55766 and previous config saved to /var/cache/conftool/dbconfig/20240129-075044-marostegui.json
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55765 and previous config saved to /var/cache/conftool/dbconfig/20240129-074541-marostegui.json
07:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55764 and previous config saved to /var/cache/conftool/dbconfig/20240129-074519-marostegui.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P55763 and previous config saved to /var/cache/conftool/dbconfig/20240129-073857-root.json
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55762 and previous config saved to /var/cache/conftool/dbconfig/20240129-073012-marostegui.json
07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P55761 and previous config saved to /var/cache/conftool/dbconfig/20240129-072352-root.json
07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55760 and previous config saved to /var/cache/conftool/dbconfig/20240129-071506-marostegui.json
07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P55758 and previous config saved to /var/cache/conftool/dbconfig/20240129-070847-root.json
07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55757 and previous config saved to /var/cache/conftool/dbconfig/20240129-065959-marostegui.json
06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55756 and previous config saved to /var/cache/conftool/dbconfig/20240129-065450-marostegui.json
06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55755 and previous config saved to /var/cache/conftool/dbconfig/20240129-065427-marostegui.json
06:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P55754 and previous config saved to /var/cache/conftool/dbconfig/20240129-065341-root.json
06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55752 and previous config saved to /var/cache/conftool/dbconfig/20240129-063920-marostegui.json
06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P55751 and previous config saved to /var/cache/conftool/dbconfig/20240129-063836-root.json
06:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55750 and previous config saved to /var/cache/conftool/dbconfig/20240129-063302-marostegui.json
06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55747 and previous config saved to /var/cache/conftool/dbconfig/20240129-062414-marostegui.json
06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55746 and previous config saved to /var/cache/conftool/dbconfig/20240129-060907-marostegui.json
06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55745 and previous config saved to /var/cache/conftool/dbconfig/20240129-060400-marostegui.json
06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1134.eqiad.wmnet
05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
05:56 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
05:54 marostegui@cumin1002: START - Cookbook sre.dns.netbox
05:49 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1134.eqiad.wmnet

2024-01-28

01:11 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
01:11 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
01:10 urandom: decommissioning restbase2016/cassandra-{a,b,c} — T352469

2024-01-26

22:07 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
22:06 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
22:05 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
22:04 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
19:02 ejegg: fundraising civicrm upgraded from 8c0dc1d2 to b953d667
18:27 mutante: cloudweb1003 - OATHAuth disabled for Triciaburmeister. (after video verification - T355958)
18:16 mutante: phab1004 - removing 2fa from TBurmeister (after video verification) T355958
17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
17:57 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
17:53 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
17:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
17:34 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
17:17 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
17:12 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
17:11 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
17:09 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:09 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
17:08 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
17:04 bking@cumin2002: START - Cookbook sre.dns.netbox
16:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1010.wikimedia.org
16:33 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:33 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
16:33 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:32 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
16:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55740 and previous config saved to /var/cache/conftool/dbconfig/20240126-163057-arnaudb.json
16:29 bking@cumin2002: START - Cookbook sre.dns.netbox
16:23 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1010.wikimedia.org
16:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
15:01 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
15:00 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:47 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:46 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:37 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:36 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
14:35 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
14:34 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:34 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:33 urandom: decommissioning restbase2015/cassandra-{a,b,c} — T352469
14:27 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:27 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:24 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:08 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:08 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
13:18 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
12:35 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
12:30 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
11:43 taavi: reprepro: copy helm-diff_3.1.3-2 from bullseye-wikimedia to bookworm-wikimedia
11:28 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
10:52 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:51 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:50 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
10:44 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
10:36 moritzm: prune obsolete nginx packages from eventschema hosts after migration to new library scheme T329529
10:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55737 and previous config saved to /var/cache/conftool/dbconfig/20240126-102550-arnaudb.json
08:01 moritzm: rebalance codfw/B following switch maintenance T355549
07:54 moritzm: failover ganeti master for codfw back to ganeti2022, switch maintenance is completed T355549
01:01 dzahn@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: security release
00:07 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release
00:00 dzahn@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release

2024-01-25

23:54 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=wikimaniawiki --fix # T347622
23:54 zabe@deploy2002: Finished scap: Backport for gerrit:961963 Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) (duration: 08m 30s)
23:47 zabe@deploy2002: robertsky and zabe: Continuing with sync
23:47 zabe@deploy2002: robertsky and zabe: Backport for gerrit:961963 Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
23:45 zabe@deploy2002: Started scap: Backport for gerrit:961963 Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622)
23:29 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Sturm . # T355485
23:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
23:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
22:54 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:53 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:52 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:40 ryankemper: T351354 Restarting `cloudelastic1006` (final restart for today)
22:34 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1009`
22:33 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1007`
22:26 ryankemper: T351354 Restarting `cloudelastic1002`
22:19 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:19 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
22:15 ryankemper: T351354 Restarting `cloudelastic1004` following puppet run
22:12 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release
22:11 ryankemper: T351354 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/993038; restarting `cloudelastic1001` following puppet run
22:08 ryankemper: T351354 Downtimed `cloudelastic*`; shortly will restart `cloudelastic100[1,2,4]` one host at a time to make them no longer masters
22:08 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
22:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
21:55 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:55 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
21:19 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:19 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
21:14 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:14 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
21:13 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:13 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:58 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:58 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:57 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:57 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:56 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:56 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:55 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:55 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:54 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:51 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1001.eqiad.wmnet with OS bookworm
20:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:37 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:37 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:36 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
20:35 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:35 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:33 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
20:32 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
20:27 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
20:26 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
20:25 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
20:19 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:19 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:16 zabe@deploy2002: Finished scap: Backport for gerrit:992942 Start reading from af_actor/afh_actor in group1 wikis (T355616) (duration: 11m 27s)
20:15 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:15 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:11 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:10 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1001.eqiad.wmnet with OS bookworm
20:10 zabe@deploy2002: zabe: Continuing with sync
20:09 zabe@deploy2002: zabe: Backport for gerrit:992942 Start reading from af_actor/afh_actor in group1 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
20:06 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1001
20:05 zabe@deploy2002: Started scap: Backport for gerrit:992942 Start reading from af_actor/afh_actor in group1 wikis (T355616)
20:05 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1001
20:05 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:05 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
20:04 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
20:02 taavi@cumin1002: START - Cookbook sre.dns.netbox
20:01 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1002
20:00 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1002
19:59 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:59 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
19:58 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
19:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
19:29 bking@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
19:29 bking@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
19:28 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:28 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
19:25 bking@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:24 bking@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55736 and previous config saved to /var/cache/conftool/dbconfig/20240125-184922-root.json
18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55735 and previous config saved to /var/cache/conftool/dbconfig/20240125-184917-root.json
18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55734 and previous config saved to /var/cache/conftool/dbconfig/20240125-184911-root.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55733 and previous config saved to /var/cache/conftool/dbconfig/20240125-184906-root.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55732 and previous config saved to /var/cache/conftool/dbconfig/20240125-184900-root.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55731 and previous config saved to /var/cache/conftool/dbconfig/20240125-184853-root.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55730 and previous config saved to /var/cache/conftool/dbconfig/20240125-184845-root.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55729 and previous config saved to /var/cache/conftool/dbconfig/20240125-184839-root.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55728 and previous config saved to /var/cache/conftool/dbconfig/20240125-184823-root.json
18:47 mutante: phab2002 - rebooting
18:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
18:45 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
18:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55727 and previous config saved to /var/cache/conftool/dbconfig/20240125-183417-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55726 and previous config saved to /var/cache/conftool/dbconfig/20240125-183412-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55725 and previous config saved to /var/cache/conftool/dbconfig/20240125-183406-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55724 and previous config saved to /var/cache/conftool/dbconfig/20240125-183401-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55723 and previous config saved to /var/cache/conftool/dbconfig/20240125-183355-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55722 and previous config saved to /var/cache/conftool/dbconfig/20240125-183348-root.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55721 and previous config saved to /var/cache/conftool/dbconfig/20240125-183340-root.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55720 and previous config saved to /var/cache/conftool/dbconfig/20240125-183334-root.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55719 and previous config saved to /var/cache/conftool/dbconfig/20240125-183318-root.json
18:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55718 and previous config saved to /var/cache/conftool/dbconfig/20240125-181912-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55717 and previous config saved to /var/cache/conftool/dbconfig/20240125-181907-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55716 and previous config saved to /var/cache/conftool/dbconfig/20240125-181901-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55715 and previous config saved to /var/cache/conftool/dbconfig/20240125-181856-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55714 and previous config saved to /var/cache/conftool/dbconfig/20240125-181850-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55713 and previous config saved to /var/cache/conftool/dbconfig/20240125-181843-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55712 and previous config saved to /var/cache/conftool/dbconfig/20240125-181835-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55711 and previous config saved to /var/cache/conftool/dbconfig/20240125-181829-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55710 and previous config saved to /var/cache/conftool/dbconfig/20240125-181814-root.json
18:13 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum6001.drmrs.wmnet with OS bookworm
18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55709 and previous config saved to /var/cache/conftool/dbconfig/20240125-180407-root.json
18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55708 and previous config saved to /var/cache/conftool/dbconfig/20240125-180402-root.json
18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55707 and previous config saved to /var/cache/conftool/dbconfig/20240125-180356-root.json
18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55706 and previous config saved to /var/cache/conftool/dbconfig/20240125-180351-root.json
18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55705 and previous config saved to /var/cache/conftool/dbconfig/20240125-180345-root.json
18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55704 and previous config saved to /var/cache/conftool/dbconfig/20240125-180338-root.json
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55703 and previous config saved to /var/cache/conftool/dbconfig/20240125-180330-root.json
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55702 and previous config saved to /var/cache/conftool/dbconfig/20240125-180324-root.json
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55701 and previous config saved to /var/cache/conftool/dbconfig/20240125-180308-root.json
18:01 sukhe: running authdns-update for CR 993008: T355835
17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55700 and previous config saved to /var/cache/conftool/dbconfig/20240125-174902-root.json
17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55699 and previous config saved to /var/cache/conftool/dbconfig/20240125-174857-root.json
17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55698 and previous config saved to /var/cache/conftool/dbconfig/20240125-174851-root.json
17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55697 and previous config saved to /var/cache/conftool/dbconfig/20240125-174846-root.json
17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55696 and previous config saved to /var/cache/conftool/dbconfig/20240125-174840-root.json
17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55695 and previous config saved to /var/cache/conftool/dbconfig/20240125-174833-root.json
17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55694 and previous config saved to /var/cache/conftool/dbconfig/20240125-174825-root.json
17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55693 and previous config saved to /var/cache/conftool/dbconfig/20240125-174819-root.json
17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55692 and previous config saved to /var/cache/conftool/dbconfig/20240125-174803-root.json
17:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
17:45 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-b-codfw,lsw1-b5-codfw.mgmt
17:45 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw-b-codfw,lsw1-b5-codfw.mgmt
17:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
17:38 btullis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
17:34 btullis@deploy2002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
17:33 btullis@deploy2002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
17:30 Amir1: deploying new captchas (T141490)
17:22 btullis@deploy2002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
17:22 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
17:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host durum6001.drmrs.wmnet with OS bookworm
17:17 btullis@deploy2002: helmfile [staging] START helmfile.d/services/datahub: apply on main
17:09 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:09 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
17:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:05 taavi@cumin1002: START - Cookbook sre.dns.netbox
17:04 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit[1001-1002].wikimedia.org
17:04 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:04 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
17:01 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
17:01 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
17:00 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
16:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
16:52 sukhe: running authdns-update for CR 992936: T355835
16:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
16:49 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
16:48 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit[1001-1002].wikimedia.org
16:48 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
16:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
16:43 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 32 hosts
16:42 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for 32 hosts
16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr[1-2]-codfw
16:41 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for cr[1-2]-codfw
16:34 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2007.codfw.wmnet
16:34 claime: repooling parse2007 - T355549
16:33 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2006.codfw.wmnet
16:33 claime: repooling parse2006 - T355549
16:32 claime: uncordoning kubernetes2023 - T355549
16:32 claime: uncordoning kubernetes2032 - T355549
16:29 claime: uncordoning kubernetes2031 - T355549
16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55691 and previous config saved to /var/cache/conftool/dbconfig/20240125-161320-marostegui.json
16:03 topranks: Network maintenance codfw rack b5 underway T355549
15:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55690 and previous config saved to /var/cache/conftool/dbconfig/20240125-155813-marostegui.json
15:58 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
15:57 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
15:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'preparing to clone db2169 on db2196 as per T343674', diff saved to https://phabricator.wikimedia.org/P55689 and previous config saved to /var/cache/conftool/dbconfig/20240125-155450-arnaudb.json
15:52 topranks: disabling puppet fleet-wide to allow for maintenance in codfw rack b5 which hosts puppetmaster2003 T355549
15:46 topranks: configuring lsw1-b5-codfw switch ports for servers to be moved T355549
15:46 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
15:46 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55688 and previous config saved to /var/cache/conftool/dbconfig/20240125-154307-marostegui.json
15:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wcqs::public
15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55687 and previous config saved to /var/cache/conftool/dbconfig/20240125-152801-marostegui.json
15:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wcqs::public
15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::internal
15:20 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2006.cofw.wmnet
15:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
15:18 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
15:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::internal
14:35 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2007.codfw.wmnet
14:35 claime: Depooling parse2007 (setting inactive) - T355549
14:34 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2006.codfw.wmnet
14:34 claime: Depooling parse2006 (setting inactive) - T355549
14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55684 and previous config saved to /var/cache/conftool/dbconfig/20240125-142729-marostegui.json
14:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
14:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55683 and previous config saved to /var/cache/conftool/dbconfig/20240125-142706-marostegui.json
14:26 moritzm: installing debmonitor-client 0.3.4 fleet-wide
14:25 claime: Draining kubernetes2023 - T355549
14:25 claime: Draining kubernetes2033 - T355549
14:23 claime: Draining kubernetes2032 - T355549
14:21 claime: Draining kubernetes2031 - T355549
14:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After T355885', diff saved to https://phabricator.wikimedia.org/P55682 and previous config saved to /var/cache/conftool/dbconfig/20240125-142102-root.json
14:18 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
14:15 moritzm: failover ganeti master for codfw to ganeti2020 T355549
14:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55681 and previous config saved to /var/cache/conftool/dbconfig/20240125-141200-marostegui.json
14:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After T355885', diff saved to https://phabricator.wikimedia.org/P55680 and previous config saved to /var/cache/conftool/dbconfig/20240125-140557-root.json
14:05 btullis@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55679 and previous config saved to /var/cache/conftool/dbconfig/20240125-135653-marostegui.json
13:53 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons.
13:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After T355885', diff saved to https://phabricator.wikimedia.org/P55678 and previous config saved to /var/cache/conftool/dbconfig/20240125-135052-root.json
13:47 volans: uploaded debmonitor-client_0.3.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
13:43 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons.
13:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55677 and previous config saved to /var/cache/conftool/dbconfig/20240125-134147-marostegui.json
13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55676 and previous config saved to /var/cache/conftool/dbconfig/20240125-133935-marostegui.json
13:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
13:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55675 and previous config saved to /var/cache/conftool/dbconfig/20240125-133913-marostegui.json
13:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After T355885', diff saved to https://phabricator.wikimedia.org/P55674 and previous config saved to /var/cache/conftool/dbconfig/20240125-133547-root.json
13:32 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2022.codfw.wmnet
13:28 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
13:28 topranks: draining VMs from ganeti2022 ahead of codfw rack b5 maintenance T355549
13:27 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2022.codfw.wmnet
13:27 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
13:26 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
13:25 topranks: stopping logstash service on logstash2025 to facilitate VM migration T355549
13:25 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55673 and previous config saved to /var/cache/conftool/dbconfig/20240125-132407-marostegui.json
13:24 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
13:21 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
13:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After T355885', diff saved to https://phabricator.wikimedia.org/P55672 and previous config saved to /var/cache/conftool/dbconfig/20240125-132043-root.json
13:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
13:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55671 and previous config saved to /var/cache/conftool/dbconfig/20240125-131547-marostegui.json
13:12 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.15 refs T354433
13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55670 and previous config saved to /var/cache/conftool/dbconfig/20240125-130900-marostegui.json
13:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
13:05 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
13:02 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
13:02 topranks: draining VMs from ganeti2021 ahead of codfw rack b5 maintenance T355549
13:02 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
12:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55669 and previous config saved to /var/cache/conftool/dbconfig/20240125-125353-marostegui.json
12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
12:12 jgiannelos@deploy2002: Finished deploy [restbase/deploy@708f0f3]: (no justification provided) (duration: 20m 28s)
12:06 moritzm: installing openssh security updates
11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55667 and previous config saved to /var/cache/conftool/dbconfig/20240125-115322-marostegui.json
11:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55666 and previous config saved to /var/cache/conftool/dbconfig/20240125-115233-marostegui.json
11:52 jgiannelos@deploy2002: Started deploy [restbase/deploy@708f0f3]: (no justification provided)
11:45 zabe@deploy2002: Finished scap: Backport for gerrit:992894 Start reading from af_actor/afh_actor in group0 wikis (T355616) (duration: 08m 25s)
11:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
11:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
11:38 zabe@deploy2002: zabe: Continuing with sync
11:38 zabe@deploy2002: zabe: Backport for gerrit:992894 Start reading from af_actor/afh_actor in group0 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55665 and previous config saved to /var/cache/conftool/dbconfig/20240125-113727-marostegui.json
11:36 zabe@deploy2002: Started scap: Backport for gerrit:992894 Start reading from af_actor/afh_actor in group0 wikis (T355616)
11:29 hashar@deploy2002: Finished scap: Backport for gerrit:992781 UserGroupManager: Fix cross-wiki database access (T355813) (duration: 08m 50s)
11:26 claime: Restarting ferm.service on k8s node kubernetes2036.codfw.wmnet - T354855
11:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
11:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
11:23 hashar@deploy2002: hashar and zabe: Continuing with sync
11:22 hashar@deploy2002: hashar and zabe: Backport for gerrit:992781 UserGroupManager: Fix cross-wiki database access (T355813) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55664 and previous config saved to /var/cache/conftool/dbconfig/20240125-112220-marostegui.json
11:20 hashar@deploy2002: Started scap: Backport for gerrit:992781 UserGroupManager: Fix cross-wiki database access (T355813)
11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55663 and previous config saved to /var/cache/conftool/dbconfig/20240125-110714-marostegui.json
11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55662 and previous config saved to /var/cache/conftool/dbconfig/20240125-110521-marostegui.json
10:57 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55660 and previous config saved to /var/cache/conftool/dbconfig/20240125-105015-marostegui.json
10:39 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
10:38 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:35 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55659 and previous config saved to /var/cache/conftool/dbconfig/20240125-103509-marostegui.json
10:21 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55658 and previous config saved to /var/cache/conftool/dbconfig/20240125-102002-marostegui.json
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55657 and previous config saved to /var/cache/conftool/dbconfig/20240125-101750-marostegui.json
10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55656 and previous config saved to /var/cache/conftool/dbconfig/20240125-101728-marostegui.json
10:17 moritzm: upgrading python-pymysql in S6 DB hosts to 1.0.2-2~wmf11u1 T355531
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55655 and previous config saved to /var/cache/conftool/dbconfig/20240125-100221-marostegui.json
09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55654 and previous config saved to /var/cache/conftool/dbconfig/20240125-094714-marostegui.json
09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55653 and previous config saved to /var/cache/conftool/dbconfig/20240125-093208-marostegui.json
09:29 stran@deploy2002: Finished scap: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) (duration: 17m 24s)
09:18 stran@deploy2002: kharlan and stran: Continuing with sync
09:14 stran@deploy2002: kharlan and stran: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:12 stran@deploy2002: Started scap: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
08:45 stran@deploy2002: stran and kharlan: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55652 and previous config saved to /var/cache/conftool/dbconfig/20240125-083106-marostegui.json
08:16 stran@deploy2002: Started scap: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55651 and previous config saved to /var/cache/conftool/dbconfig/20240125-081559-marostegui.json
08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55650 and previous config saved to /var/cache/conftool/dbconfig/20240125-080053-marostegui.json
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55648 and previous config saved to /var/cache/conftool/dbconfig/20240125-074546-marostegui.json
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55647 and previous config saved to /var/cache/conftool/dbconfig/20240125-074334-marostegui.json
07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55646 and previous config saved to /var/cache/conftool/dbconfig/20240125-074312-marostegui.json
07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55645 and previous config saved to /var/cache/conftool/dbconfig/20240125-073319-root.json
07:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55644 and previous config saved to /var/cache/conftool/dbconfig/20240125-073310-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55643 and previous config saved to /var/cache/conftool/dbconfig/20240125-073252-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55642 and previous config saved to /var/cache/conftool/dbconfig/20240125-073244-root.json
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55641 and previous config saved to /var/cache/conftool/dbconfig/20240125-072806-marostegui.json
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137:3315 T355549', diff saved to https://phabricator.wikimedia.org/P55640 and previous config saved to /var/cache/conftool/dbconfig/20240125-072010-marostegui.json
07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55639 and previous config saved to /var/cache/conftool/dbconfig/20240125-071813-root.json
07:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55638 and previous config saved to /var/cache/conftool/dbconfig/20240125-071805-root.json
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55637 and previous config saved to /var/cache/conftool/dbconfig/20240125-071747-root.json
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55636 and previous config saved to /var/cache/conftool/dbconfig/20240125-071739-root.json
07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55635 and previous config saved to /var/cache/conftool/dbconfig/20240125-071259-marostegui.json
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 db2160 db2109 db2107 db2137:3314 db2135:3315 db2143 db2147 db2177 db2178 db2188 T355549', diff saved to https://phabricator.wikimedia.org/P55634 and previous config saved to /var/cache/conftool/dbconfig/20240125-071253-marostegui.json
07:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2107 T355682', diff saved to https://phabricator.wikimedia.org/P55633 and previous config saved to /var/cache/conftool/dbconfig/20240125-070604-marostegui.json
07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55632 and previous config saved to /var/cache/conftool/dbconfig/20240125-070308-root.json
07:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55631 and previous config saved to /var/cache/conftool/dbconfig/20240125-070300-root.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55630 and previous config saved to /var/cache/conftool/dbconfig/20240125-070242-root.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55629 and previous config saved to /var/cache/conftool/dbconfig/20240125-070234-root.json
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2104 to s2 primary and set section read-write T355682', diff saved to https://phabricator.wikimedia.org/P55628 and previous config saved to /var/cache/conftool/dbconfig/20240125-070153-marostegui.json
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Set s2 codfw as read-only for maintenance - T355682', diff saved to https://phabricator.wikimedia.org/P55627 and previous config saved to /var/cache/conftool/dbconfig/20240125-070120-marostegui.json
07:00 marostegui: Starting s2 codfw failover from db2107 to db2104 - T355682
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55626 and previous config saved to /var/cache/conftool/dbconfig/20240125-065535-marostegui.json
06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55625 and previous config saved to /var/cache/conftool/dbconfig/20240125-064803-root.json
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55624 and previous config saved to /var/cache/conftool/dbconfig/20240125-064755-root.json
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55623 and previous config saved to /var/cache/conftool/dbconfig/20240125-064737-root.json
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55622 and previous config saved to /var/cache/conftool/dbconfig/20240125-064729-root.json
06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55621 and previous config saved to /var/cache/conftool/dbconfig/20240125-064420-marostegui.json
06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55620 and previous config saved to /var/cache/conftool/dbconfig/20240125-064357-marostegui.json
06:37 marostegui@deploy2002: Finished scap: Backport for gerrit:992842 ProductionServices.php: Promote pc2014 (T355683) (duration: 08m 42s)
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55619 and previous config saved to /var/cache/conftool/dbconfig/20240125-063258-root.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55618 and previous config saved to /var/cache/conftool/dbconfig/20240125-063250-root.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55617 and previous config saved to /var/cache/conftool/dbconfig/20240125-063232-root.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55616 and previous config saved to /var/cache/conftool/dbconfig/20240125-063225-root.json
06:31 marostegui@deploy2002: marostegui: Continuing with sync
06:31 marostegui@deploy2002: marostegui: Backport for gerrit:992842 ProductionServices.php: Promote pc2014 (T355683) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
06:29 marostegui@deploy2002: Started scap: Backport for gerrit:992842 ProductionServices.php: Promote pc2014 (T355683)
06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55615 and previous config saved to /var/cache/conftool/dbconfig/20240125-062851-marostegui.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55614 and previous config saved to /var/cache/conftool/dbconfig/20240125-061753-root.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55613 and previous config saved to /var/cache/conftool/dbconfig/20240125-061745-root.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55612 and previous config saved to /var/cache/conftool/dbconfig/20240125-061727-root.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55611 and previous config saved to /var/cache/conftool/dbconfig/20240125-061719-root.json
06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55610 and previous config saved to /var/cache/conftool/dbconfig/20240125-061344-marostegui.json
06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2104 with weight 0 T355682', diff saved to https://phabricator.wikimedia.org/P55609 and previous config saved to /var/cache/conftool/dbconfig/20240125-061048-root.json
06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55608 and previous config saved to /var/cache/conftool/dbconfig/20240125-060249-root.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55607 and previous config saved to /var/cache/conftool/dbconfig/20240125-060240-root.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55606 and previous config saved to /var/cache/conftool/dbconfig/20240125-060222-root.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55605 and previous config saved to /var/cache/conftool/dbconfig/20240125-060214-root.json
05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55604 and previous config saved to /var/cache/conftool/dbconfig/20240125-055837-marostegui.json
05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55603 and previous config saved to /var/cache/conftool/dbconfig/20240125-055626-marostegui.json
05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
05:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
05:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
02:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
02:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
02:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55602 and previous config saved to /var/cache/conftool/dbconfig/20240125-022727-marostegui.json
02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55601 and previous config saved to /var/cache/conftool/dbconfig/20240125-021221-marostegui.json
01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55600 and previous config saved to /var/cache/conftool/dbconfig/20240125-015714-marostegui.json
01:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55599 and previous config saved to /var/cache/conftool/dbconfig/20240125-014208-marostegui.json
01:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55598 and previous config saved to /var/cache/conftool/dbconfig/20240125-013958-marostegui.json
01:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
01:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
01:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55597 and previous config saved to /var/cache/conftool/dbconfig/20240125-013936-marostegui.json
01:28 fab@deploy2002: Finished deploy [airflow-dags/research@e6aa85a]: (no justification provided) (duration: 00m 13s)
01:28 fab@deploy2002: Started deploy [airflow-dags/research@e6aa85a]: (no justification provided)
01:25 eileen: civicrm upgraded from b85b6dde to 69d4ebe3
01:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55596 and previous config saved to /var/cache/conftool/dbconfig/20240125-012430-marostegui.json
01:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55595 and previous config saved to /var/cache/conftool/dbconfig/20240125-010923-marostegui.json
00:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55594 and previous config saved to /var/cache/conftool/dbconfig/20240125-005417-marostegui.json
00:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55593 and previous config saved to /var/cache/conftool/dbconfig/20240125-005307-marostegui.json
00:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
00:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
00:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55592 and previous config saved to /var/cache/conftool/dbconfig/20240125-005245-marostegui.json
00:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55591 and previous config saved to /var/cache/conftool/dbconfig/20240125-003739-marostegui.json
00:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55590 and previous config saved to /var/cache/conftool/dbconfig/20240125-002233-marostegui.json
00:12 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2103.codfw.wmnet with OS bullseye
00:12 zabe@deploy2002: Finished scap: Backport for gerrit:992830 Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) (duration: 09m 36s)
00:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55589 and previous config saved to /var/cache/conftool/dbconfig/20240125-000726-marostegui.json
00:05 zabe@deploy2002: zabe: Continuing with sync
00:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55588 and previous config saved to /var/cache/conftool/dbconfig/20240125-000515-marostegui.json
00:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
00:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
00:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55587 and previous config saved to /var/cache/conftool/dbconfig/20240125-000452-marostegui.json
00:04 zabe@deploy2002: zabe: Backport for gerrit:992830 Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
00:02 zabe@deploy2002: Started scap: Backport for gerrit:992830 Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616)

2024-01-24

23:54 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
23:51 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55586 and previous config saved to /var/cache/conftool/dbconfig/20240124-234946-marostegui.json
23:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55585 and previous config saved to /var/cache/conftool/dbconfig/20240124-233439-marostegui.json
23:34 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
23:33 jforrester@deploy2002: Finished scap: Backport for gerrit:992775 Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433) (duration: 13m 29s)
23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2105.codfw.wmnet with OS bullseye
23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2104.codfw.wmnet with OS bullseye
23:26 jforrester@deploy2002: jforrester: Continuing with sync
23:21 jforrester@deploy2002: jforrester: Backport for gerrit:992775 Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
23:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55584 and previous config saved to /var/cache/conftool/dbconfig/20240124-231933-marostegui.json
23:19 jforrester@deploy2002: Started scap: Backport for gerrit:992775 Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433)
23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55583 and previous config saved to /var/cache/conftool/dbconfig/20240124-231723-marostegui.json
23:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
23:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55582 and previous config saved to /var/cache/conftool/dbconfig/20240124-231701-marostegui.json
23:04 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2103.codfw.wmnet with OS bullseye
23:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55581 and previous config saved to /var/cache/conftool/dbconfig/20240124-230155-marostegui.json
22:50 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2106.codfw.wmnet with OS bullseye
22:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55580 and previous config saved to /var/cache/conftool/dbconfig/20240124-224648-marostegui.json
22:39 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloduelastic maintenance
22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloduelastic maintenance
22:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55579 and previous config saved to /var/cache/conftool/dbconfig/20240124-223142-marostegui.json
22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55578 and previous config saved to /var/cache/conftool/dbconfig/20240124-222932-marostegui.json
22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55577 and previous config saved to /var/cache/conftool/dbconfig/20240124-222910-marostegui.json
22:28 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
22:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55576 and previous config saved to /var/cache/conftool/dbconfig/20240124-221403-marostegui.json
22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2105.codfw.wmnet with OS bullseye
22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2106.codfw.wmnet with OS bullseye
22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2104.codfw.wmnet with OS bullseye
22:10 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55575 and previous config saved to /var/cache/conftool/dbconfig/20240124-215857-marostegui.json
21:45 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55574 and previous config saved to /var/cache/conftool/dbconfig/20240124-214351-marostegui.json
21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55573 and previous config saved to /var/cache/conftool/dbconfig/20240124-214141-marostegui.json
21:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
21:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55572 and previous config saved to /var/cache/conftool/dbconfig/20240124-214120-marostegui.json
21:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55571 and previous config saved to /var/cache/conftool/dbconfig/20240124-212613-marostegui.json
21:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55570 and previous config saved to /var/cache/conftool/dbconfig/20240124-211107-marostegui.json
21:05 aqu@deploy2002: Finished deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc] (duration: 00m 37s)
21:05 aqu@deploy2002: Started deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc]
20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55569 and previous config saved to /var/cache/conftool/dbconfig/20240124-205600-marostegui.json
20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55568 and previous config saved to /var/cache/conftool/dbconfig/20240124-205350-marostegui.json
20:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
20:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55567 and previous config saved to /var/cache/conftool/dbconfig/20240124-205327-marostegui.json
20:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55566 and previous config saved to /var/cache/conftool/dbconfig/20240124-203821-marostegui.json
20:38 fab@deploy2002: Finished deploy [airflow-dags/research@2f514fc]: (no justification provided) (duration: 00m 33s)
20:37 fab@deploy2002: Started deploy [airflow-dags/research@2f514fc]: (no justification provided)
20:26 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=scowiki --logwiki=metawiki 'TheBabushka' 'AshotGPT' # T355743
20:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55565 and previous config saved to /var/cache/conftool/dbconfig/20240124-202315-marostegui.json
20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55564 and previous config saved to /var/cache/conftool/dbconfig/20240124-200808-marostegui.json
20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55563 and previous config saved to /var/cache/conftool/dbconfig/20240124-200659-marostegui.json
20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55562 and previous config saved to /var/cache/conftool/dbconfig/20240124-200619-marostegui.json
20:02 cstone: payments-wiki upgraded from a3691a8e to 8cfbbb4b
19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55561 and previous config saved to /var/cache/conftool/dbconfig/20240124-195113-marostegui.json
19:39 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55560 and previous config saved to /var/cache/conftool/dbconfig/20240124-193606-marostegui.json
19:35 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:34 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:24 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:23 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55559 and previous config saved to /var/cache/conftool/dbconfig/20240124-192100-marostegui.json
19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55558 and previous config saved to /var/cache/conftool/dbconfig/20240124-191850-marostegui.json
19:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
19:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
19:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55557 and previous config saved to /var/cache/conftool/dbconfig/20240124-191828-marostegui.json
19:16 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
19:13 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
19:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55555 and previous config saved to /var/cache/conftool/dbconfig/20240124-190322-marostegui.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55554 and previous config saved to /var/cache/conftool/dbconfig/20240124-184815-marostegui.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55553 and previous config saved to /var/cache/conftool/dbconfig/20240124-183308-marostegui.json
18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55552 and previous config saved to /var/cache/conftool/dbconfig/20240124-183059-marostegui.json
18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55551 and previous config saved to /var/cache/conftool/dbconfig/20240124-183001-marostegui.json
18:24 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55550 and previous config saved to /var/cache/conftool/dbconfig/20240124-181455-marostegui.json
18:09 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided) (duration: 00m 32s)
18:08 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided)
17:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55549 and previous config saved to /var/cache/conftool/dbconfig/20240124-175948-marostegui.json
17:50 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
17:50 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
17:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
17:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
17:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55548 and previous config saved to /var/cache/conftool/dbconfig/20240124-174442-marostegui.json
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55547 and previous config saved to /var/cache/conftool/dbconfig/20240124-174332-marostegui.json
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55546 and previous config saved to /var/cache/conftool/dbconfig/20240124-174251-marostegui.json
17:35 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55545 and previous config saved to /var/cache/conftool/dbconfig/20240124-172745-marostegui.json
17:24 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 07m 10s)
17:17 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
17:16 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55544 and previous config saved to /var/cache/conftool/dbconfig/20240124-171238-marostegui.json
17:10 sukhe: sudo cumin -b1 -s60 "R:Class = Bird" "enable-puppet 'CR991699' && run-puppet-agent"
17:09 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
17:06 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided) (duration: 01m 07s)
17:06 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided)
17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2053.codfw.wmnet
17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1053.eqiad.wmnet
16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2053.codfw.wmnet
16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1053.eqiad.wmnet
16:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55543 and previous config saved to /var/cache/conftool/dbconfig/20240124-165732-marostegui.json
16:56 vgutierrez: enable puppet on cp3066 - T354424
16:55 sukhe: enable puppet on durum1001 to test CR 991699
16:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55542 and previous config saved to /var/cache/conftool/dbconfig/20240124-165522-marostegui.json
16:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
16:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
16:54 XioNoX: disable puppet on all the hosts running bird to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/991699
16:39 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
16:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
16:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
16:30 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55541 and previous config saved to /var/cache/conftool/dbconfig/20240124-162532-marostegui.json
16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55540 and previous config saved to /var/cache/conftool/dbconfig/20240124-161026-marostegui.json
16:04 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:04 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
16:03 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:03 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
15:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab2002.codfw.wmnet
15:57 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/Echo/includes/Formatters/EchoRevertedPresentationModel.php: Fix EchoRevertedPresentationModel using null as string - T355751 (duration: 09m 06s)
15:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55539 and previous config saved to /var/cache/conftool/dbconfig/20240124-155519-marostegui.json
15:50 vgutierrez: disable puppet on cp3066 - T354424
15:48 sukhe: sudo cumin -b1 -s120 'A:dns-rec' "enable-puppet 'merging CR 980929' && run-puppet-agent"
15:47 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/CentralAuth/tests/phpunit/CentralAuthIdLookupTest.php: Fix CentralIdLookup tests (duration: 11m 18s)
15:45 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2446.codfw.wmnet with OS bullseye
15:42 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2430.codfw.wmnet with OS bullseye
15:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55538 and previous config saved to /var/cache/conftool/dbconfig/20240124-154013-marostegui.json
15:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2427.codfw.wmnet with OS bullseye
15:38 sukhe: sudo cumin 'A:dns-rec' "disable-puppet 'merging CR 980929'"
15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
15:38 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55537 and previous config saved to /var/cache/conftool/dbconfig/20240124-153752-marostegui.json
15:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
15:37 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
15:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55536 and previous config saved to /var/cache/conftool/dbconfig/20240124-153730-marostegui.json
15:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab2002.codfw.wmnet
15:37 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
15:36 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
15:32 moritzm: imported jenkins 2.426.3 for buster/bullseye T355503
15:25 aqu@deploy2002: Finished deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7] (duration: 00m 42s)
15:25 aqu@deploy2002: Started deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7]
15:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55534 and previous config saved to /var/cache/conftool/dbconfig/20240124-152224-marostegui.json
15:22 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
15:21 aqu: Refinery weekly deployment train - end (scap, then deployed onto hdfs) (test cluster deploy still broken T354703)
15:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
15:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
15:12 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c] (duration: 03m 28s)
15:11 moritzm: uploading pymysql 1.0.2-2~wmf11u1 to apt.wikimedia.org T355531
15:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2055.codfw.wmnet
15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c]
15:08 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c] (duration: 00m 05s)
15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c]
15:07 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c] (duration: 10m 12s)
15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55533 and previous config saved to /var/cache/conftool/dbconfig/20240124-150718-marostegui.json
15:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2055.codfw.wmnet
14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2446.codfw.wmnet with OS bullseye
14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2430.codfw.wmnet with OS bullseye
14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2427.codfw.wmnet with OS bullseye
14:57 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c]
14:57 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
14:57 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
14:56 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
14:56 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
14:56 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
14:56 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc] (duration: 03m 40s)
14:56 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
14:55 akosiaris: bump eventrouter limits/requests memory/cpu
14:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
14:55 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc]
14:52 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc] (duration: 00m 06s)
14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc]
14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55532 and previous config saved to /var/cache/conftool/dbconfig/20240124-145211-marostegui.json
14:51 Lucas_WMDE: UTC afternoon backport+config window done
14:50 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc] (duration: 09m 11s)
14:50 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992706|cswiki: remove unused birthday logo files]] (duration: 09m 36s)
14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55531 and previous config saved to /var/cache/conftool/dbconfig/20240124-144947-marostegui.json
14:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
14:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
14:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55530 and previous config saved to /var/cache/conftool/dbconfig/20240124-144925-marostegui.json
14:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2054.codfw.wmnet
14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for [[gerrit:992706|cswiki: remove unused birthday logo files]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:41 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc]
14:41 aqu@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992706|cswiki: remove unused birthday logo files]]
14:40 aqu@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
14:39 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992678|[azwiki] Add new namespace aliases (T355041)]] (duration: 10m 00s)
14:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2054.codfw.wmnet
14:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1054.eqiad.wmnet
14:37 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
14:36 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
14:36 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
14:35 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
14:35 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
14:35 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55529 and previous config saved to /var/cache/conftool/dbconfig/20240124-143419-marostegui.json
14:34 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
14:33 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
14:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1054.eqiad.wmnet
14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
14:31 aqu: analytics/refinery weekly deployment train - begin
14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2052.codfw.wmnet
14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1052.eqiad.wmnet
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [[gerrit:992678|[azwiki] Add new namespace aliases (T355041)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:29 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
14:29 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992678|[azwiki] Add new namespace aliases (T355041)]]
14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992671|[ganwiki] Change autoconfirmed setting (T355126)]] (duration: 09m 51s)
14:26 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2052.codfw.wmnet
14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1052.eqiad.wmnet
14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:24 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55527 and previous config saved to /var/cache/conftool/dbconfig/20240124-141912-marostegui.json
14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [[gerrit:992671|[ganwiki] Change autoconfirmed setting (T355126)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992671|[ganwiki] Change autoconfirmed setting (T355126)]]
14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] (duration: 10m 52s)
14:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2053.codfw.wmnet
14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2053.codfw.wmnet
14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55526 and previous config saved to /var/cache/conftool/dbconfig/20240124-140406-marostegui.json
14:04 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
14:04 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]]
14:03 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55525 and previous config saved to /var/cache/conftool/dbconfig/20240124-140142-marostegui.json
14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55524 and previous config saved to /var/cache/conftool/dbconfig/20240124-140120-marostegui.json
13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1053.eqiad.wmnet
13:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55523 and previous config saved to /var/cache/conftool/dbconfig/20240124-135424-root.json
13:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1053.eqiad.wmnet
13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55522 and previous config saved to /var/cache/conftool/dbconfig/20240124-134614-marostegui.json
13:39 samtar@deploy2002: Finished scap: Backport for [[gerrit:991100|Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790)]] (duration: 09m 14s)
13:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55521 and previous config saved to /var/cache/conftool/dbconfig/20240124-133919-root.json
13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2051.codfw.wmnet
13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1051.eqiad.wmnet
13:32 samtar@deploy2002: samtar and varnent: Continuing with sync
13:32 samtar@deploy2002: samtar and varnent: Backport for [[gerrit:991100|Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1051.eqiad.wmnet
13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55520 and previous config saved to /var/cache/conftool/dbconfig/20240124-133107-marostegui.json
13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2051.codfw.wmnet
13:30 samtar@deploy2002: Started scap: Backport for [[gerrit:991100|Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790)]]
13:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55519 and previous config saved to /var/cache/conftool/dbconfig/20240124-132414-root.json
13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55518 and previous config saved to /var/cache/conftool/dbconfig/20240124-131600-marostegui.json
13:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55517 and previous config saved to /var/cache/conftool/dbconfig/20240124-130909-root.json
12:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55516 and previous config saved to /var/cache/conftool/dbconfig/20240124-125404-root.json
12:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2052.codfw.wmnet
12:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55515 and previous config saved to /var/cache/conftool/dbconfig/20240124-123859-root.json
12:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2052.codfw.wmnet
12:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1052.eqiad.wmnet
12:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1052.eqiad.wmnet
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55514 and previous config saved to /var/cache/conftool/dbconfig/20240124-122354-root.json
12:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231 T355760', diff saved to https://phabricator.wikimedia.org/P55513 and previous config saved to /var/cache/conftool/dbconfig/20240124-122148-root.json
12:20 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1173 to s6 primary T355760', diff saved to https://phabricator.wikimedia.org/P55512 and previous config saved to /var/cache/conftool/dbconfig/20240124-122030-marostegui.json
12:19 marostegui: Starting s6 eqiad failover from db1231 to db1173 - T355760
12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
12:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
12:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55510 and previous config saved to /var/cache/conftool/dbconfig/20240124-121448-marostegui.json
12:07 ladsgroup@deploy2002: Finished scap: Backport for [[gerrit:992514|GenerateFancyCaptchas: Add ->disableSandbox() to shell command]] (duration: 09m 55s)
12:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
12:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
12:00 ladsgroup@deploy2002: ladsgroup: Continuing with sync
11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55509 and previous config saved to /var/cache/conftool/dbconfig/20240124-115942-marostegui.json
11:58 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:992514|GenerateFancyCaptchas: Add ->disableSandbox() to shell command]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test1001.eqiad.wmnet
11:57 ladsgroup@deploy2002: Started scap: Backport for [[gerrit:992514|GenerateFancyCaptchas: Add ->disableSandbox() to shell command]]
11:57 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
11:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
11:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2050.codfw.wmnet
11:55 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host acmechief-test2001.codfw.wmnet
11:55 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1050.eqiad.wmnet
11:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
11:52 hnowlan@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
11:52 hnowlan@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2050.codfw.wmnet
11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1050.eqiad.wmnet
11:47 hnowlan@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
11:46 hnowlan@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55506 and previous config saved to /var/cache/conftool/dbconfig/20240124-114435-marostegui.json
11:43 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test2001.codfw.wmnet
11:33 vgutierrez: repool cp3066 - T354424
11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1014.eqiad.wmnet with OS bullseye
11:32 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
11:32 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
11:31 vgutierrez: depooling cp3066 - T354424
11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55505 and previous config saved to /var/cache/conftool/dbconfig/20240124-112929-marostegui.json
11:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55504 and previous config saved to /var/cache/conftool/dbconfig/20240124-112705-marostegui.json
11:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
11:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55503 and previous config saved to /var/cache/conftool/dbconfig/20240124-112643-marostegui.json
11:26 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
11:26 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
11:24 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
11:24 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55501 and previous config saved to /var/cache/conftool/dbconfig/20240124-111136-marostegui.json
11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
10:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
10:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=rowikinews --fix # T350889
10:57 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1173 with weight 0 T355760', diff saved to https://phabricator.wikimedia.org/P55500 and previous config saved to /var/cache/conftool/dbconfig/20240124-105702-root.json
10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55499 and previous config saved to /var/cache/conftool/dbconfig/20240124-105630-marostegui.json
10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1014.eqiad.wmnet
10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
10:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55498 and previous config saved to /var/cache/conftool/dbconfig/20240124-104123-marostegui.json
10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55497 and previous config saved to /var/cache/conftool/dbconfig/20240124-103900-marostegui.json
10:38 hashar: deployment-server: removing `gerrit` remote from `/srv/mediawiki-staging` given it is tied to a specific username and the `origin` remote already has ssh protocol for push # ping James_F
10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55496 and previous config saved to /var/cache/conftool/dbconfig/20240124-103837-marostegui.json
10:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1014.eqiad.wmnet
10:36 moritzm: upgrading cumin1002 to pymysql 1.0.2-2~wmf11u1 T355531
10:31 hashar@deploy2002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.42.0-wmf.15" - T354433
10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55495 and previous config saved to /var/cache/conftool/dbconfig/20240124-102330-marostegui.json
10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
10:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55494 and previous config saved to /var/cache/conftool/dbconfig/20240124-100824-marostegui.json
10:00 vgutierrez: repool cp3066 - T354424
09:58 vgutierrez: depooling cp3066 - T354424
09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55493 and previous config saved to /var/cache/conftool/dbconfig/20240124-095317-marostegui.json
09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55492 and previous config saved to /var/cache/conftool/dbconfig/20240124-095054-marostegui.json
09:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
09:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55491 and previous config saved to /var/cache/conftool/dbconfig/20240124-095032-marostegui.json
09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
09:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
09:49 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
09:41 ayounsi@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
09:41 ayounsi@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55489 and previous config saved to /var/cache/conftool/dbconfig/20240124-093526-marostegui.json
09:32 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
09:32 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
09:31 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
09:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
09:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
09:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
09:27 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 06m 55s)
09:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55488 and previous config saved to /var/cache/conftool/dbconfig/20240124-092019-marostegui.json
09:20 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
09:08 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti1037.eqiad.wmnet
09:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55487 and previous config saved to /var/cache/conftool/dbconfig/20240124-090512-marostegui.json
09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55486 and previous config saved to /var/cache/conftool/dbconfig/20240124-090250-marostegui.json
09:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55485 and previous config saved to /var/cache/conftool/dbconfig/20240124-090228-marostegui.json
08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55484 and previous config saved to /var/cache/conftool/dbconfig/20240124-084721-marostegui.json
08:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1037.eqiad.wmnet
08:36 hashar@deploy2002: Finished scap: Backport for [[gerrit:992513|Use a class for 'LogActionsHandlers' (T355680)]] (duration: 08m 00s)
08:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55483 and previous config saved to /var/cache/conftool/dbconfig/20240124-083215-marostegui.json
08:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
08:30 hashar@deploy2002: hashar: Continuing with sync
08:30 hashar@deploy2002: hashar: Backport for [[gerrit:992513|Use a class for 'LogActionsHandlers' (T355680)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:28 hashar@deploy2002: Started scap: Backport for [[gerrit:992513|Use a class for 'LogActionsHandlers' (T355680)]]
08:25 logmsgbot: wmde-fisch@deploy2002 Finished scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] (duration: 08m 32s)
08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Continuing with sync
08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:17 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]]
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55482 and previous config saved to /var/cache/conftool/dbconfig/20240124-081708-marostegui.json
08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55481 and previous config saved to /var/cache/conftool/dbconfig/20240124-081445-marostegui.json
08:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
08:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
08:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55480 and previous config saved to /var/cache/conftool/dbconfig/20240124-081340-marostegui.json
08:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55479 and previous config saved to /var/cache/conftool/dbconfig/20240124-081050-root.json
08:07 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:05 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for [[gerrit:992411|Allow Cite events for reference previews baseline stats (T353798)]]
07:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55478 and previous config saved to /var/cache/conftool/dbconfig/20240124-075834-marostegui.json
07:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55477 and previous config saved to /var/cache/conftool/dbconfig/20240124-075545-root.json
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55476 and previous config saved to /var/cache/conftool/dbconfig/20240124-074327-marostegui.json
07:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55475 and previous config saved to /var/cache/conftool/dbconfig/20240124-074040-root.json
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55474 and previous config saved to /var/cache/conftool/dbconfig/20240124-072821-marostegui.json
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55473 and previous config saved to /var/cache/conftool/dbconfig/20240124-072557-marostegui.json
07:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55472 and previous config saved to /var/cache/conftool/dbconfig/20240124-072535-root.json
07:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55471 and previous config saved to /var/cache/conftool/dbconfig/20240124-072523-marostegui.json
07:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55470 and previous config saved to /var/cache/conftool/dbconfig/20240124-071954-root.json
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55469 and previous config saved to /var/cache/conftool/dbconfig/20240124-071030-root.json
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55468 and previous config saved to /var/cache/conftool/dbconfig/20240124-071016-marostegui.json
07:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55467 and previous config saved to /var/cache/conftool/dbconfig/20240124-070449-root.json
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55466 and previous config saved to /var/cache/conftool/dbconfig/20240124-065525-root.json
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55465 and previous config saved to /var/cache/conftool/dbconfig/20240124-065510-marostegui.json
06:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55464 and previous config saved to /var/cache/conftool/dbconfig/20240124-064944-root.json
06:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2129.codfw.wmnet with OS bookworm
06:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55463 and previous config saved to /var/cache/conftool/dbconfig/20240124-064020-root.json
06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55462 and previous config saved to /var/cache/conftool/dbconfig/20240124-064003-marostegui.json
06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55461 and previous config saved to /var/cache/conftool/dbconfig/20240124-063739-marostegui.json
06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
06:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55460 and previous config saved to /var/cache/conftool/dbconfig/20240124-063717-marostegui.json
06:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55459 and previous config saved to /var/cache/conftool/dbconfig/20240124-063440-root.json
06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55458 and previous config saved to /var/cache/conftool/dbconfig/20240124-062210-marostegui.json
06:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55457 and previous config saved to /var/cache/conftool/dbconfig/20240124-061934-root.json
06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
06:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55456 and previous config saved to /var/cache/conftool/dbconfig/20240124-060703-marostegui.json
06:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55455 and previous config saved to /var/cache/conftool/dbconfig/20240124-060429-root.json
05:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2129.codfw.wmnet with OS bookworm
05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129 T354506', diff saved to https://phabricator.wikimedia.org/P55454 and previous config saved to /var/cache/conftool/dbconfig/20240124-055635-marostegui.json
05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55453 and previous config saved to /var/cache/conftool/dbconfig/20240124-055157-marostegui.json
05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158 db2157 es2026 db2136 T355437', diff saved to https://phabricator.wikimedia.org/P55452 and previous config saved to /var/cache/conftool/dbconfig/20240124-055143-marostegui.json
05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55451 and previous config saved to /var/cache/conftool/dbconfig/20240124-054932-marostegui.json
05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
05:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55450 and previous config saved to /var/cache/conftool/dbconfig/20240124-054924-root.json
05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
05:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
05:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
02:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
02:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
02:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55449 and previous config saved to /var/cache/conftool/dbconfig/20240124-023210-marostegui.json
02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55448 and previous config saved to /var/cache/conftool/dbconfig/20240124-021704-marostegui.json
02:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55447 and previous config saved to /var/cache/conftool/dbconfig/20240124-020157-marostegui.json
01:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55445 and previous config saved to /var/cache/conftool/dbconfig/20240124-014651-marostegui.json
01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55444 and previous config saved to /var/cache/conftool/dbconfig/20240124-014430-marostegui.json
01:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
01:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55443 and previous config saved to /var/cache/conftool/dbconfig/20240124-014408-marostegui.json
01:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55442 and previous config saved to /var/cache/conftool/dbconfig/20240124-012902-marostegui.json
01:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55441 and previous config saved to /var/cache/conftool/dbconfig/20240124-011355-marostegui.json
00:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55440 and previous config saved to /var/cache/conftool/dbconfig/20240124-005849-marostegui.json
00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55439 and previous config saved to /var/cache/conftool/dbconfig/20240124-005627-marostegui.json
00:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
00:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55438 and previous config saved to /var/cache/conftool/dbconfig/20240124-005605-marostegui.json
00:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55437 and previous config saved to /var/cache/conftool/dbconfig/20240124-004058-marostegui.json
00:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55436 and previous config saved to /var/cache/conftool/dbconfig/20240124-002551-marostegui.json
00:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55435 and previous config saved to /var/cache/conftool/dbconfig/20240124-001044-marostegui.json
00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55434 and previous config saved to /var/cache/conftool/dbconfig/20240124-000824-marostegui.json
00:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
00:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55433 and previous config saved to /var/cache/conftool/dbconfig/20240124-000802-marostegui.json

2024-01-23

23:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55432 and previous config saved to /var/cache/conftool/dbconfig/20240123-235255-marostegui.json
23:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55430 and previous config saved to /var/cache/conftool/dbconfig/20240123-233749-marostegui.json
23:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55429 and previous config saved to /var/cache/conftool/dbconfig/20240123-232242-marostegui.json
23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55428 and previous config saved to /var/cache/conftool/dbconfig/20240123-232021-marostegui.json
23:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
23:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55427 and previous config saved to /var/cache/conftool/dbconfig/20240123-231959-marostegui.json
23:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55426 and previous config saved to /var/cache/conftool/dbconfig/20240123-230453-marostegui.json
22:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55425 and previous config saved to /var/cache/conftool/dbconfig/20240123-224946-marostegui.json
22:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55424 and previous config saved to /var/cache/conftool/dbconfig/20240123-223439-marostegui.json
22:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55423 and previous config saved to /var/cache/conftool/dbconfig/20240123-223215-marostegui.json
22:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
22:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55422 and previous config saved to /var/cache/conftool/dbconfig/20240123-223153-marostegui.json
22:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55421 and previous config saved to /var/cache/conftool/dbconfig/20240123-221646-marostegui.json
22:03 kostajh: UTC late deploys done
22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 195.70.81.86
22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 62.232.9.14
22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 195.70.81.86
22:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55420 and previous config saved to /var/cache/conftool/dbconfig/20240123-220140-marostegui.json
22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 62.232.9.14
21:59 kharlan@deploy2002: Finished scap: Backport for [[gerrit:992461|[knwiki] Removing the temporary logo (already reverted) (T338136)]], [[gerrit:992466|[itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694)]], [[gerrit:992471|[enwiki] and [enwikibooks] Throttle exemption for event (T355695)]] (duration: 15m 33s)
21:53 kharlan@deploy2002: superpes and kharlan: Continuing with sync
21:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55419 and previous config saved to /var/cache/conftool/dbconfig/20240123-214633-marostegui.json
21:45 kharlan@deploy2002: superpes and kharlan: Backport for [[gerrit:992461|[knwiki] Removing the temporary logo (already reverted) (T338136)]], [[gerrit:992466|[itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694)]], [[gerrit:992471|[enwiki] and [enwikibooks] Throttle exemption for event (T355695)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55418 and previous config saved to /var/cache/conftool/dbconfig/20240123-214413-marostegui.json
21:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
21:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55417 and previous config saved to /var/cache/conftool/dbconfig/20240123-214351-marostegui.json
21:43 kharlan@deploy2002: Started scap: Backport for [[gerrit:992461|[knwiki] Removing the temporary logo (already reverted) (T338136)]], [[gerrit:992466|[itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694)]], [[gerrit:992471|[enwiki] and [enwikibooks] Throttle exemption for event (T355695)]]
21:36 kharlan@deploy2002: Finished scap: Backport for [[gerrit:992506|revertrisk: Fix i18n message reference (T348298)]], [[gerrit:992507|revertrisk: Fix i18n messages (T348298)]] (duration: 30m 51s)
21:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55416 and previous config saved to /var/cache/conftool/dbconfig/20240123-212845-marostegui.json
21:26 kharlan@deploy2002: kharlan: Continuing with sync
21:26 kharlan@deploy2002: kharlan: Backport for [[gerrit:992506|revertrisk: Fix i18n message reference (T348298)]], [[gerrit:992507|revertrisk: Fix i18n messages (T348298)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55415 and previous config saved to /var/cache/conftool/dbconfig/20240123-211338-marostegui.json
21:05 kharlan@deploy2002: Started scap: Backport for [[gerrit:992506|revertrisk: Fix i18n message reference (T348298)]], [[gerrit:992507|revertrisk: Fix i18n messages (T348298)]]
20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55414 and previous config saved to /var/cache/conftool/dbconfig/20240123-205832-marostegui.json
20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55413 and previous config saved to /var/cache/conftool/dbconfig/20240123-205611-marostegui.json
20:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
20:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
20:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55412 and previous config saved to /var/cache/conftool/dbconfig/20240123-205549-marostegui.json
20:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55411 and previous config saved to /var/cache/conftool/dbconfig/20240123-204043-marostegui.json
20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55410 and previous config saved to /var/cache/conftool/dbconfig/20240123-202536-marostegui.json
20:23 cstone: payments-wiki upgraded from c2138768 to a3691a8e
20:23 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
20:12 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55409 and previous config saved to /var/cache/conftool/dbconfig/20240123-201030-marostegui.json
20:08 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55408 and previous config saved to /var/cache/conftool/dbconfig/20240123-200809-marostegui.json
20:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55407 and previous config saved to /var/cache/conftool/dbconfig/20240123-200726-marostegui.json
19:57 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
19:57 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55406 and previous config saved to /var/cache/conftool/dbconfig/20240123-195220-marostegui.json
19:49 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
19:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
19:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
19:45 mutante: phab1004 - /srv/phab/phabricator/bin/mail volume
19:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55405 and previous config saved to /var/cache/conftool/dbconfig/20240123-193713-marostegui.json
19:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55404 and previous config saved to /var/cache/conftool/dbconfig/20240123-192207-marostegui.json
19:21 ejegg: fundraising civicrm upgraded from d8b0c977 to b85b6dde
19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55403 and previous config saved to /var/cache/conftool/dbconfig/20240123-191945-marostegui.json
19:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
19:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55402 and previous config saved to /var/cache/conftool/dbconfig/20240123-191922-marostegui.json
19:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55401 and previous config saved to /var/cache/conftool/dbconfig/20240123-190416-marostegui.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55400 and previous config saved to /var/cache/conftool/dbconfig/20240123-184909-marostegui.json
18:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:35 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:35 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55399 and previous config saved to /var/cache/conftool/dbconfig/20240123-183403-marostegui.json
18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55398 and previous config saved to /var/cache/conftool/dbconfig/20240123-183141-marostegui.json
18:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
18:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55397 and previous config saved to /var/cache/conftool/dbconfig/20240123-183120-marostegui.json
18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55396 and previous config saved to /var/cache/conftool/dbconfig/20240123-181613-marostegui.json
18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55395 and previous config saved to /var/cache/conftool/dbconfig/20240123-180107-marostegui.json
17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55394 and previous config saved to /var/cache/conftool/dbconfig/20240123-174600-marostegui.json
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55393 and previous config saved to /var/cache/conftool/dbconfig/20240123-174339-marostegui.json
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55392 and previous config saved to /var/cache/conftool/dbconfig/20240123-174215-marostegui.json
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55391 and previous config saved to /var/cache/conftool/dbconfig/20240123-172709-marostegui.json
17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55390 and previous config saved to /var/cache/conftool/dbconfig/20240123-171202-marostegui.json
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55389 and previous config saved to /var/cache/conftool/dbconfig/20240123-165656-marostegui.json
16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55388 and previous config saved to /var/cache/conftool/dbconfig/20240123-165433-marostegui.json
16:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
16:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
16:49 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1003.eqiad.wmnet with OS bookworm
16:39 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest1003.eqiad.wmnet with OS bookworm
16:14 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
16:14 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
16:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55387 and previous config saved to /var/cache/conftool/dbconfig/20240123-161426-root.json
16:10 sukhe: enable puppet on A:lvs to merge CR 991785 and run agent on all nodes
15:59 sukhe: disable puppet on A:lvs to merge CR 991785
15:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55386 and previous config saved to /var/cache/conftool/dbconfig/20240123-155921-root.json
15:55 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
15:54 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
15:54 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
15:53 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
15:52 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
15:52 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55385 and previous config saved to /var/cache/conftool/dbconfig/20240123-155219-ladsgroup.json
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55384 and previous config saved to /var/cache/conftool/dbconfig/20240123-154416-root.json
15:41 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
15:39 claime: trafficserver: move 30% of traffic to mw on k8s - T355532
15:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
15:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
15:37 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55383 and previous config saved to /var/cache/conftool/dbconfig/20240123-153712-ladsgroup.json
15:36 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
15:36 claime: Bumping mw-api-ext replicas - T355532
15:36 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
15:36 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
15:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
15:35 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
15:35 claime: Bumping mw-web replicas - T355532
15:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
15:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55382 and previous config saved to /var/cache/conftool/dbconfig/20240123-152911-root.json
15:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
15:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55381 and previous config saved to /var/cache/conftool/dbconfig/20240123-152206-ladsgroup.json
15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
15:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55380 and previous config saved to /var/cache/conftool/dbconfig/20240123-151406-root.json
15:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55379 and previous config saved to /var/cache/conftool/dbconfig/20240123-150659-ladsgroup.json
15:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
15:00 Lucas_WMDE: UTC afternoon backport+config window done
14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992423|ORES: Enable renamed revertrisklanguageagnostic model (T348298)]] (duration: 11m 20s)
14:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55378 and previous config saved to /var/cache/conftool/dbconfig/20240123-145901-root.json
14:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55377 and previous config saved to /var/cache/conftool/dbconfig/20240123-145353-marostegui.json
14:53 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Continuing with sync
14:49 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Backport for [[gerrit:992423|ORES: Enable renamed revertrisklanguageagnostic model (T348298)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:48 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992423|ORES: Enable renamed revertrisklanguageagnostic model (T348298)]]
14:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1173.eqiad.wmnet with OS bookworm
14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55376 and previous config saved to /var/cache/conftool/dbconfig/20240123-144356-root.json
14:42 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992376|Restore support for matching 'LIKE' patterns/wildcards (T355478)]] (duration: 07m 50s)
14:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55375 and previous config saved to /var/cache/conftool/dbconfig/20240123-143846-marostegui.json
14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for [[gerrit:992376|Restore support for matching 'LIKE' patterns/wildcards (T355478)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992376|Restore support for matching 'LIKE' patterns/wildcards (T355478)]]
14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992377|Restore support for matching 'LIKE' patterns/wildcards (T355478)]] (duration: 10m 29s)
14:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
14:32 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
14:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for [[gerrit:992377|Restore support for matching 'LIKE' patterns/wildcards (T355478)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:24 pt1979@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
14:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
14:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55374 and previous config saved to /var/cache/conftool/dbconfig/20240123-142339-marostegui.json
14:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992377|Restore support for matching 'LIKE' patterns/wildcards (T355478)]]
14:20 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:991606|ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366)]] (duration: 11m 49s)
14:15 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1003.eqiad.wmnet
14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Continuing with sync
14:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1173.eqiad.wmnet with OS bookworm
14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Backport for [[gerrit:991606|ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55373 and previous config saved to /var/cache/conftool/dbconfig/20240123-140833-marostegui.json
14:07 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
14:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:991606|ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366)]]
14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T343718)', diff saved to https://phabricator.wikimedia.org/P55372 and previous config saved to /var/cache/conftool/dbconfig/20240123-140636-ladsgroup.json
14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
14:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
14:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55371 and previous config saved to /var/cache/conftool/dbconfig/20240123-135819-marostegui.json
13:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
13:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55370 and previous config saved to /var/cache/conftool/dbconfig/20240123-135757-marostegui.json
13:52 Dreamy_Jazz: Ran `foreachwikiindblist group0 extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
13:51 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
13:50 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
13:50 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
13:49 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55369 and previous config saved to /var/cache/conftool/dbconfig/20240123-134250-marostegui.json
13:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55368 and previous config saved to /var/cache/conftool/dbconfig/20240123-132744-marostegui.json
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55367 and previous config saved to /var/cache/conftool/dbconfig/20240123-131909-root.json
13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
13:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55366 and previous config saved to /var/cache/conftool/dbconfig/20240123-131237-marostegui.json
13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55365 and previous config saved to /var/cache/conftool/dbconfig/20240123-131027-marostegui.json
13:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
13:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55364 and previous config saved to /var/cache/conftool/dbconfig/20240123-131005-marostegui.json
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55363 and previous config saved to /var/cache/conftool/dbconfig/20240123-130404-root.json
12:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55362 and previous config saved to /var/cache/conftool/dbconfig/20240123-125459-marostegui.json
12:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55361 and previous config saved to /var/cache/conftool/dbconfig/20240123-124859-root.json
12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1017.eqiad.wmnet
12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55360 and previous config saved to /var/cache/conftool/dbconfig/20240123-123952-marostegui.json
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55359 and previous config saved to /var/cache/conftool/dbconfig/20240123-123354-root.json
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55358 and previous config saved to /var/cache/conftool/dbconfig/20240123-123346-root.json
12:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1017.eqiad.wmnet
12:28 claime: Restarting killed maintenance job mediawiki_job_MachineVision_prioritize_uncategorized.service
12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sretest1001.eqiad.wmnet
12:26 kamila@cumin1002: START - Cookbook sre.hosts.remove-downtime for sretest1001.eqiad.wmnet
12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55357 and previous config saved to /var/cache/conftool/dbconfig/20240123-122446-marostegui.json
12:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55356 and previous config saved to /var/cache/conftool/dbconfig/20240123-122336-marostegui.json
12:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55355 and previous config saved to /var/cache/conftool/dbconfig/20240123-122314-marostegui.json
12:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55354 and previous config saved to /var/cache/conftool/dbconfig/20240123-122105-root.json
12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55353 and previous config saved to /var/cache/conftool/dbconfig/20240123-121849-root.json
12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55352 and previous config saved to /var/cache/conftool/dbconfig/20240123-121841-root.json
12:17 claime: Restarting ferm.service on k8s node mw1495.eqiad.wmnet - T354855
12:16 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1016.eqiad.wmnet
12:14 claime: scap::dsh::scap_proxies: Replace mw1486 by mw1405 - T355622
12:13 Amir1: dropping bv2015_edits table from all wikis (T355594)
12:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55351 and previous config saved to /var/cache/conftool/dbconfig/20240123-120807-marostegui.json
12:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55350 and previous config saved to /var/cache/conftool/dbconfig/20240123-120600-root.json
12:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1016.eqiad.wmnet
12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55349 and previous config saved to /var/cache/conftool/dbconfig/20240123-120344-root.json
12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55348 and previous config saved to /var/cache/conftool/dbconfig/20240123-120335-root.json
12:03 Amir1: dropping bv2009_edits table from all wikis (T355594)
12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
11:54 godog: initial cleanup of replicated thanos blocks - T351927
11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55347 and previous config saved to /var/cache/conftool/dbconfig/20240123-115301-marostegui.json
11:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55346 and previous config saved to /var/cache/conftool/dbconfig/20240123-115055-root.json
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55345 and previous config saved to /var/cache/conftool/dbconfig/20240123-114840-root.json
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55344 and previous config saved to /var/cache/conftool/dbconfig/20240123-114831-root.json
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1173', diff saved to https://phabricator.wikimedia.org/P55343 and previous config saved to /var/cache/conftool/dbconfig/20240123-114826-marostegui.json
11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55342 and previous config saved to /var/cache/conftool/dbconfig/20240123-113754-marostegui.json
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55341 and previous config saved to /var/cache/conftool/dbconfig/20240123-113550-root.json
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55340 and previous config saved to /var/cache/conftool/dbconfig/20240123-113544-marostegui.json
11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55339 and previous config saved to /var/cache/conftool/dbconfig/20240123-113522-marostegui.json
11:35 marostegui: Starting s6 eqiad failover from db1173 to db1231 - T355660
11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
11:31 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
11:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55338 and previous config saved to /var/cache/conftool/dbconfig/20240123-112420-root.json
11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55336 and previous config saved to /var/cache/conftool/dbconfig/20240123-112016-marostegui.json
11:11 Amir1: dropping pif_edits table from all wikis (T355594)
11:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1017.eqiad.wmnet with OS bullseye
11:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55335 and previous config saved to /var/cache/conftool/dbconfig/20240123-110915-root.json
11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T355660', diff saved to https://phabricator.wikimedia.org/P55333 and previous config saved to /var/cache/conftool/dbconfig/20240123-110743-marostegui.json
11:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55332 and previous config saved to /var/cache/conftool/dbconfig/20240123-110540-root.json
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55331 and previous config saved to /var/cache/conftool/dbconfig/20240123-110509-marostegui.json
10:58 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1002.eqiad.wmnet
10:58 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:58 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:56 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2171.codfw.wmnet with OS bookworm
10:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55330 and previous config saved to /var/cache/conftool/dbconfig/20240123-105410-root.json
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55329 and previous config saved to /var/cache/conftool/dbconfig/20240123-105035-root.json
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55328 and previous config saved to /var/cache/conftool/dbconfig/20240123-105003-marostegui.json
10:48 btullis@cumin1002: START - Cookbook sre.dns.netbox
10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55327 and previous config saved to /var/cache/conftool/dbconfig/20240123-104753-marostegui.json
10:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
10:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55326 and previous config saved to /var/cache/conftool/dbconfig/20240123-104731-marostegui.json
10:43 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1002.eqiad.wmnet
10:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
10:34 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1001.eqiad.wmnet
10:34 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:34 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:32 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55325 and previous config saved to /var/cache/conftool/dbconfig/20240123-103225-marostegui.json
10:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
10:27 btullis@cumin1002: START - Cookbook sre.dns.netbox
10:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55324 and previous config saved to /var/cache/conftool/dbconfig/20240123-101718-marostegui.json
10:13 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
10:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
10:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2171.codfw.wmnet with OS bookworm
10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2171:3315 db2171:3316', diff saved to https://phabricator.wikimedia.org/P55323 and previous config saved to /var/cache/conftool/dbconfig/20240123-101056-marostegui.json
10:10 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1001.eqiad.wmnet
10:04 ayounsi@cumin1002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
10:04 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
10:03 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
10:03 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55322 and previous config saved to /var/cache/conftool/dbconfig/20240123-100212-marostegui.json
10:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55321 and previous config saved to /var/cache/conftool/dbconfig/20240123-100002-marostegui.json
09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:59 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
09:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55320 and previous config saved to /var/cache/conftool/dbconfig/20240123-095923-marostegui.json
09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55319 and previous config saved to /var/cache/conftool/dbconfig/20240123-094417-marostegui.json
09:41 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
09:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55318 and previous config saved to /var/cache/conftool/dbconfig/20240123-092910-marostegui.json
09:24 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.15 refs T354433
09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55317 and previous config saved to /var/cache/conftool/dbconfig/20240123-091404-marostegui.json
09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55316 and previous config saved to /var/cache/conftool/dbconfig/20240123-091154-marostegui.json
09:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
09:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55315 and previous config saved to /var/cache/conftool/dbconfig/20240123-091132-marostegui.json
09:04 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
09:01 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
09:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55314 and previous config saved to /var/cache/conftool/dbconfig/20240123-090104-root.json
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55313 and previous config saved to /var/cache/conftool/dbconfig/20240123-085625-marostegui.json
08:55 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992245/ https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992359/
08:51 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
08:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55312 and previous config saved to /var/cache/conftool/dbconfig/20240123-084559-root.json
08:44 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
08:44 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55311 and previous config saved to /var/cache/conftool/dbconfig/20240123-084301-ladsgroup.json
08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55310 and previous config saved to /var/cache/conftool/dbconfig/20240123-084244-ladsgroup.json
08:41 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55309 and previous config saved to /var/cache/conftool/dbconfig/20240123-084119-marostegui.json
08:39 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
08:37 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
08:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55308 and previous config saved to /var/cache/conftool/dbconfig/20240123-083054-root.json
08:28 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992244
08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55307 and previous config saved to /var/cache/conftool/dbconfig/20240123-082738-ladsgroup.json
08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55306 and previous config saved to /var/cache/conftool/dbconfig/20240123-082613-marostegui.json
08:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55305 and previous config saved to /var/cache/conftool/dbconfig/20240123-082402-marostegui.json
08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55304 and previous config saved to /var/cache/conftool/dbconfig/20240123-082340-marostegui.json
08:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55303 and previous config saved to /var/cache/conftool/dbconfig/20240123-081549-root.json
08:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55302 and previous config saved to /var/cache/conftool/dbconfig/20240123-081231-ladsgroup.json
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55301 and previous config saved to /var/cache/conftool/dbconfig/20240123-080834-marostegui.json
08:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2051.codfw.wmnet
08:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55300 and previous config saved to /var/cache/conftool/dbconfig/20240123-080044-root.json
07:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2051.codfw.wmnet
07:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55299 and previous config saved to /var/cache/conftool/dbconfig/20240123-075725-ladsgroup.json
07:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1051.eqiad.wmnet
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55298 and previous config saved to /var/cache/conftool/dbconfig/20240123-075327-marostegui.json
07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1051.eqiad.wmnet
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55297 and previous config saved to /var/cache/conftool/dbconfig/20240123-074538-root.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55296 and previous config saved to /var/cache/conftool/dbconfig/20240123-073821-marostegui.json
07:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55295 and previous config saved to /var/cache/conftool/dbconfig/20240123-073610-marostegui.json
07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
07:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55294 and previous config saved to /var/cache/conftool/dbconfig/20240123-073548-marostegui.json
07:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS bookworm
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55293 and previous config saved to /var/cache/conftool/dbconfig/20240123-073033-root.json
07:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55292 and previous config saved to /var/cache/conftool/dbconfig/20240123-073021-ladsgroup.json
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55291 and previous config saved to /var/cache/conftool/dbconfig/20240123-072041-marostegui.json
07:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55290 and previous config saved to /var/cache/conftool/dbconfig/20240123-071515-ladsgroup.json
07:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
07:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55289 and previous config saved to /var/cache/conftool/dbconfig/20240123-070535-marostegui.json
07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55288 and previous config saved to /var/cache/conftool/dbconfig/20240123-070008-ladsgroup.json
06:57 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS bookworm
06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231', diff saved to https://phabricator.wikimedia.org/P55287 and previous config saved to /var/cache/conftool/dbconfig/20240123-065606-marostegui.json
06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55285 and previous config saved to /var/cache/conftool/dbconfig/20240123-065029-marostegui.json
06:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55284 and previous config saved to /var/cache/conftool/dbconfig/20240123-064819-marostegui.json
06:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
06:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55283 and previous config saved to /var/cache/conftool/dbconfig/20240123-064757-marostegui.json
06:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55282 and previous config saved to /var/cache/conftool/dbconfig/20240123-064502-ladsgroup.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55281 and previous config saved to /var/cache/conftool/dbconfig/20240123-063250-marostegui.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55280 and previous config saved to /var/cache/conftool/dbconfig/20240123-061744-marostegui.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55279 and previous config saved to /var/cache/conftool/dbconfig/20240123-060237-marostegui.json
06:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55278 and previous config saved to /var/cache/conftool/dbconfig/20240123-060127-marostegui.json
06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
06:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
05:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
05:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
04:54 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.15 refs T354433 (duration: 51m 22s)
04:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.15 refs T354433
01:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55277 and previous config saved to /var/cache/conftool/dbconfig/20240123-011434-ladsgroup.json
01:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
01:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
00:58 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=ruwikinews --fix # T350889
00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwikinews --fix # T350889
00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwiki --fix # T350889
00:56 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=enwiki --fix # T350889
00:55 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=cywiki --fix # T350889
00:42 zabe: running 'zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=viwiki --fix' in screen
00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55276 and previous config saved to /var/cache/conftool/dbconfig/20240123-003338-ladsgroup.json
00:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
00:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55275 and previous config saved to /var/cache/conftool/dbconfig/20240123-003316-ladsgroup.json
00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55274 and previous config saved to /var/cache/conftool/dbconfig/20240123-001810-ladsgroup.json
00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55273 and previous config saved to /var/cache/conftool/dbconfig/20240123-000303-ladsgroup.json

2024-01-22

23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55272 and previous config saved to /var/cache/conftool/dbconfig/20240122-234757-ladsgroup.json
23:14 zabe@deploy2002: Finished scap: Backport for gerrit:991930 Stop setting wgShowIPinHeader (T355479), gerrit:992250 beta: Start reading from af_user(_text)/afh_user(_text) (T355616) (duration: 07m 31s)
23:08 zabe@deploy2002: zabe: Continuing with sync
23:08 zabe@deploy2002: zabe: Backport for gerrit:991930 Stop setting wgShowIPinHeader (T355479), gerrit:992250 beta: Start reading from af_user(_text)/afh_user(_text) (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
23:06 zabe@deploy2002: Started scap: Backport for gerrit:991930 Stop setting wgShowIPinHeader (T355479), gerrit:992250 beta: Start reading from af_user(_text)/afh_user(_text) (T355616)
22:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
22:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55271 and previous config saved to /var/cache/conftool/dbconfig/20240122-225618-marostegui.json
22:47 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088']
22:47 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088']
22:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55270 and previous config saved to /var/cache/conftool/dbconfig/20240122-224111-marostegui.json
22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55269 and previous config saved to /var/cache/conftool/dbconfig/20240122-222605-marostegui.json
22:24 maryum: Deployed patch for T355538
22:14 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
22:14 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
22:13 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55268 and previous config saved to /var/cache/conftool/dbconfig/20240122-221058-marostegui.json
22:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55267 and previous config saved to /var/cache/conftool/dbconfig/20240122-220850-marostegui.json
22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
22:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
22:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
22:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55266 and previous config saved to /var/cache/conftool/dbconfig/20240122-220811-marostegui.json
21:56 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
21:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55265 and previous config saved to /var/cache/conftool/dbconfig/20240122-215305-marostegui.json
21:53 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
21:51 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
21:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
21:48 taavi@cumin1002: START - Cookbook sre.dns.netbox
21:46 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
21:45 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
21:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55264 and previous config saved to /var/cache/conftool/dbconfig/20240122-213758-marostegui.json
21:33 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:32 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:24 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:24 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55263 and previous config saved to /var/cache/conftool/dbconfig/20240122-212252-marostegui.json
21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55262 and previous config saved to /var/cache/conftool/dbconfig/20240122-212144-marostegui.json
21:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
21:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55261 and previous config saved to /var/cache/conftool/dbconfig/20240122-212122-marostegui.json
21:17 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:07 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1003
21:07 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1003
21:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
21:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55260 and previous config saved to /var/cache/conftool/dbconfig/20240122-210615-marostegui.json
21:05 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
21:03 taavi@cumin1002: START - Cookbook sre.dns.netbox
20:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55259 and previous config saved to /var/cache/conftool/dbconfig/20240122-205109-marostegui.json
20:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55258 and previous config saved to /var/cache/conftool/dbconfig/20240122-203602-marostegui.json
20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55257 and previous config saved to /var/cache/conftool/dbconfig/20240122-203354-marostegui.json
20:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
20:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55256 and previous config saved to /var/cache/conftool/dbconfig/20240122-203332-marostegui.json
20:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55255 and previous config saved to /var/cache/conftool/dbconfig/20240122-201826-marostegui.json
20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55254 and previous config saved to /var/cache/conftool/dbconfig/20240122-200319-marostegui.json
19:57 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
19:56 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
19:56 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
19:55 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
19:51 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
19:50 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
19:50 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
19:48 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
19:48 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
19:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55253 and previous config saved to /var/cache/conftool/dbconfig/20240122-194813-marostegui.json
19:47 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55252 and previous config saved to /var/cache/conftool/dbconfig/20240122-194704-marostegui.json
19:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
19:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
19:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55251 and previous config saved to /var/cache/conftool/dbconfig/20240122-194642-marostegui.json
19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55250 and previous config saved to /var/cache/conftool/dbconfig/20240122-193136-marostegui.json
19:28 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
19:28 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55249 and previous config saved to /var/cache/conftool/dbconfig/20240122-191629-marostegui.json
19:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55248 and previous config saved to /var/cache/conftool/dbconfig/20240122-190123-marostegui.json
19:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55247 and previous config saved to /var/cache/conftool/dbconfig/20240122-190014-marostegui.json
19:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
19:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55246 and previous config saved to /var/cache/conftool/dbconfig/20240122-185952-marostegui.json
18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55245 and previous config saved to /var/cache/conftool/dbconfig/20240122-184446-marostegui.json
18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55244 and previous config saved to /var/cache/conftool/dbconfig/20240122-182939-marostegui.json
18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55243 and previous config saved to /var/cache/conftool/dbconfig/20240122-182432-ladsgroup.json
18:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
18:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55242 and previous config saved to /var/cache/conftool/dbconfig/20240122-182359-ladsgroup.json
18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55241 and previous config saved to /var/cache/conftool/dbconfig/20240122-181433-marostegui.json
18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55240 and previous config saved to /var/cache/conftool/dbconfig/20240122-181324-marostegui.json
18:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
18:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55239 and previous config saved to /var/cache/conftool/dbconfig/20240122-181302-marostegui.json
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55238 and previous config saved to /var/cache/conftool/dbconfig/20240122-180853-ladsgroup.json
17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55237 and previous config saved to /var/cache/conftool/dbconfig/20240122-175755-marostegui.json
17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55236 and previous config saved to /var/cache/conftool/dbconfig/20240122-175346-ladsgroup.json
17:46 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
17:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55235 and previous config saved to /var/cache/conftool/dbconfig/20240122-174249-marostegui.json
17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55234 and previous config saved to /var/cache/conftool/dbconfig/20240122-173840-ladsgroup.json
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55233 and previous config saved to /var/cache/conftool/dbconfig/20240122-172743-marostegui.json
17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55232 and previous config saved to /var/cache/conftool/dbconfig/20240122-172635-marostegui.json
17:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
17:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55231 and previous config saved to /var/cache/conftool/dbconfig/20240122-172612-marostegui.json
17:17 akosiaris: draining kubestage2001, uncordoning kubestage2002 to allow it to receive the pods. T355437
17:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55230 and previous config saved to /var/cache/conftool/dbconfig/20240122-171106-marostegui.json
17:05 vgutierrez: restore HAProxy tune.bufsize = 16684 in cp3066 - T354424
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55229 and previous config saved to /var/cache/conftool/dbconfig/20240122-165559-marostegui.json
16:53 vgutierrez: testing HAProxy tune.bufsize = 32768 in cp3066 - T354424
16:46 dcausse@deploy2002: Finished deploy [airflow-dags/search@dcf08b2]: (no justification provided) (duration: 00m 31s)
16:46 dcausse@deploy2002: Started deploy [airflow-dags/search@dcf08b2]: (no justification provided)
16:42 Daimona: T353459 Running mwscript /home/daimona/GenerateInvitationList.php to test the script before it reaches production
16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55228 and previous config saved to /var/cache/conftool/dbconfig/20240122-164053-marostegui.json
16:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1495.eqiad.wmnet with OS bullseye
16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55227 and previous config saved to /var/cache/conftool/dbconfig/20240122-163844-marostegui.json
16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55226 and previous config saved to /var/cache/conftool/dbconfig/20240122-163822-marostegui.json
16:38 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
16:38 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55225 and previous config saved to /var/cache/conftool/dbconfig/20240122-163808-ladsgroup.json
16:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:38 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
16:37 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
16:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55224 and previous config saved to /var/cache/conftool/dbconfig/20240122-163729-ladsgroup.json
16:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1486.eqiad.wmnet with OS bullseye
16:29 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:29 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
16:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55222 and previous config saved to /var/cache/conftool/dbconfig/20240122-162315-marostegui.json
16:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55221 and previous config saved to /var/cache/conftool/dbconfig/20240122-162223-ladsgroup.json
16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
16:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
16:09 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
16:08 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
16:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55220 and previous config saved to /var/cache/conftool/dbconfig/20240122-160809-marostegui.json
16:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55219 and previous config saved to /var/cache/conftool/dbconfig/20240122-160716-ladsgroup.json
15:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55218 and previous config saved to /var/cache/conftool/dbconfig/20240122-155607-root.json
15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1495.eqiad.wmnet with OS bullseye
15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1486.eqiad.wmnet with OS bullseye
15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55217 and previous config saved to /var/cache/conftool/dbconfig/20240122-155302-marostegui.json
15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55216 and previous config saved to /var/cache/conftool/dbconfig/20240122-155210-ladsgroup.json
15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55215 and previous config saved to /var/cache/conftool/dbconfig/20240122-155154-marostegui.json
15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55214 and previous config saved to /var/cache/conftool/dbconfig/20240122-155115-marostegui.json
15:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55213 and previous config saved to /var/cache/conftool/dbconfig/20240122-154102-root.json
15:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55212 and previous config saved to /var/cache/conftool/dbconfig/20240122-153608-marostegui.json
15:26 sukhe: sudo cumin -b1 -s120 "A:dns-rec and not P{dns6001*}" "enable-puppet 'do not enable' && run-puppet-agent"
15:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55211 and previous config saved to /var/cache/conftool/dbconfig/20240122-152557-root.json
15:24 sukhe: re-enable puppet on A:dns-rec and run agent to finish merging CR 979159
15:21 sukhe: enable puppet on dns6001 and run agent to test CR 979159
15:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55210 and previous config saved to /var/cache/conftool/dbconfig/20240122-152102-marostegui.json
15:13 sukhe: disable Puppet on A:dns-rec to decouple anycast-hc and pdns-rec systemd binding: CR 979159
15:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55209 and previous config saved to /var/cache/conftool/dbconfig/20240122-151052-root.json
15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55208 and previous config saved to /var/cache/conftool/dbconfig/20240122-150555-marostegui.json
15:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55207 and previous config saved to /var/cache/conftool/dbconfig/20240122-150046-marostegui.json
15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55206 and previous config saved to /var/cache/conftool/dbconfig/20240122-145548-root.json
14:55 hashar@deploy2002: Finished deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521 (duration: 00m 07s)
14:54 hashar@deploy2002: Started deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521
14:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
14:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
14:42 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:41 Lucas_WMDE: UTC afternoon backport+config window done
14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991358 Set ShowRollbackConfirmation in arwiki (T355213) (duration: 09m 07s)
14:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55205 and previous config saved to /var/cache/conftool/dbconfig/20240122-144043-root.json
14:40 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:40 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Continuing with sync
14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Backport for gerrit:991358 Set ShowRollbackConfirmation in arwiki (T355213) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991358 Set ShowRollbackConfirmation in arwiki (T355213)
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991379 Restrict pagequality-validate right to patroller in arwikisource (T354503) (duration: 09m 41s)
14:28 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
14:26 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
14:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55204 and previous config saved to /var/cache/conftool/dbconfig/20240122-142538-root.json
14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1134', diff saved to https://phabricator.wikimedia.org/P55203 and previous config saved to /var/cache/conftool/dbconfig/20240122-142530-marostegui.json
14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Continuing with sync
14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Backport for gerrit:991379 Restrict pagequality-validate right to patroller in arwikisource (T354503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991379 Restrict pagequality-validate right to patroller in arwikisource (T354503)
13:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1165.eqiad.wmnet with OS bookworm
13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
13:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
13:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1036.eqiad.wmnet
13:22 marostegui: Upgrade sanitarium master, there will be lag on s6 wiki replicas T354506
13:21 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1165.eqiad.wmnet with OS bookworm
13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1165', diff saved to https://phabricator.wikimedia.org/P55201 and previous config saved to /var/cache/conftool/dbconfig/20240122-132023-marostegui.json
13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2050.codfw.wmnet
13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2049.codfw.wmnet
13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1049.eqiad.wmnet
13:01 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2050.codfw.wmnet
13:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1050.eqiad.wmnet
12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1049.eqiad.wmnet
12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
12:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1050.eqiad.wmnet
12:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
12:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55200 and previous config saved to /var/cache/conftool/dbconfig/20240122-123351-root.json
12:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55199 and previous config saved to /var/cache/conftool/dbconfig/20240122-122634-marostegui.json
12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55198 and previous config saved to /var/cache/conftool/dbconfig/20240122-121846-root.json
12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55197 and previous config saved to /var/cache/conftool/dbconfig/20240122-121128-marostegui.json
12:06 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55195 and previous config saved to /var/cache/conftool/dbconfig/20240122-120341-root.json
11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55193 and previous config saved to /var/cache/conftool/dbconfig/20240122-115621-marostegui.json
11:56 volans@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55192 and previous config saved to /var/cache/conftool/dbconfig/20240122-114836-root.json
11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55191 and previous config saved to /var/cache/conftool/dbconfig/20240122-114115-marostegui.json
11:41 vgutierrez: update to HAProxy 2.8.5 on cp3066 - T354424
11:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55190 and previous config saved to /var/cache/conftool/dbconfig/20240122-113331-root.json
11:26 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55189 and previous config saved to /var/cache/conftool/dbconfig/20240122-112401-marostegui.json
11:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
11:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
11:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55188 and previous config saved to /var/cache/conftool/dbconfig/20240122-112339-marostegui.json
11:21 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
11:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55187 and previous config saved to /var/cache/conftool/dbconfig/20240122-111826-root.json
11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2048.codfw.wmnet
11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1048.eqiad.wmnet
11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55185 and previous config saved to /var/cache/conftool/dbconfig/20240122-110833-marostegui.json
11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2048.codfw.wmnet
11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1048.eqiad.wmnet
11:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55184 and previous config saved to /var/cache/conftool/dbconfig/20240122-110321-root.json
11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bookworm
10:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55183 and previous config saved to /var/cache/conftool/dbconfig/20240122-105326-marostegui.json
10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55182 and previous config saved to /var/cache/conftool/dbconfig/20240122-105237-root.json
10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55181 and previous config saved to /var/cache/conftool/dbconfig/20240122-105222-root.json
10:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55180 and previous config saved to /var/cache/conftool/dbconfig/20240122-103820-marostegui.json
10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55179 and previous config saved to /var/cache/conftool/dbconfig/20240122-103732-root.json
10:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55178 and previous config saved to /var/cache/conftool/dbconfig/20240122-103717-root.json
10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55177 and previous config saved to /var/cache/conftool/dbconfig/20240122-103520-ladsgroup.json
10:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
10:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55176 and previous config saved to /var/cache/conftool/dbconfig/20240122-102227-root.json
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55175 and previous config saved to /var/cache/conftool/dbconfig/20240122-102220-marostegui.json
10:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55174 and previous config saved to /var/cache/conftool/dbconfig/20240122-102212-root.json
10:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55173 and previous config saved to /var/cache/conftool/dbconfig/20240122-102158-marostegui.json
10:18 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS bookworm
10:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158', diff saved to https://phabricator.wikimedia.org/P55172 and previous config saved to /var/cache/conftool/dbconfig/20240122-101634-marostegui.json
10:13 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit[1003,2002].wikimedia.org
10:13 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit[1003,2002].wikimedia.org
10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55171 and previous config saved to /var/cache/conftool/dbconfig/20240122-100722-root.json
10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55170 and previous config saved to /var/cache/conftool/dbconfig/20240122-100707-root.json
10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55169 and previous config saved to /var/cache/conftool/dbconfig/20240122-100651-marostegui.json
10:04 hashar: gerrit: running jgit gc on every repository to regenerate potentially faulty reachability bitmaps files preventing fetches on some repositories # T355173
10:00 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
09:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2049.codfw.wmnet
09:56 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
09:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2049.codfw.wmnet
09:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1049.eqiad.wmnet
09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55167 and previous config saved to /var/cache/conftool/dbconfig/20240122-095217-root.json
09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55166 and previous config saved to /var/cache/conftool/dbconfig/20240122-095202-root.json
09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55165 and previous config saved to /var/cache/conftool/dbconfig/20240122-095145-marostegui.json
09:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1049.eqiad.wmnet
09:38 hashar: Restarted Gerrit with upgraded version 3.7.6 # T354885
09:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55164 and previous config saved to /var/cache/conftool/dbconfig/20240122-093712-root.json
09:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55163 and previous config saved to /var/cache/conftool/dbconfig/20240122-093657-root.json
09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55162 and previous config saved to /var/cache/conftool/dbconfig/20240122-093638-marostegui.json
09:26 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
09:26 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: name=mw2444.codfw.wmnet
09:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55161 and previous config saved to /var/cache/conftool/dbconfig/20240122-092207-root.json
09:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55160 and previous config saved to /var/cache/conftool/dbconfig/20240122-092152-root.json
09:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55159 and previous config saved to /var/cache/conftool/dbconfig/20240122-091916-marostegui.json
09:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
09:18 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:18 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
09:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55158 and previous config saved to /var/cache/conftool/dbconfig/20240122-091838-marostegui.json
09:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1213.eqiad.wmnet with OS bookworm
09:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
09:17 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
09:11 hashar: Gerrit: reindexing all changes for 3.6 > 3.7 migration # T354885
09:08 hashar@deploy2002: Finished deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6 (duration: 00m 10s)
09:08 hashar@deploy2002: Started deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6
09:06 hashar: Upgrading Gerrit # T354885
09:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
09:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55157 and previous config saved to /var/cache/conftool/dbconfig/20240122-090504-root.json
09:03 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2444.codfw.wmnet
09:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55156 and previous config saved to /var/cache/conftool/dbconfig/20240122-090332-marostegui.json
09:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55155 and previous config saved to /var/cache/conftool/dbconfig/20240122-090218-root.json
09:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2394.codfw.wmnet
09:01 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2394.codfw.wmnet
08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
08:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55154 and previous config saved to /var/cache/conftool/dbconfig/20240122-084959-root.json
08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55153 and previous config saved to /var/cache/conftool/dbconfig/20240122-084825-marostegui.json
08:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55152 and previous config saved to /var/cache/conftool/dbconfig/20240122-084713-root.json
08:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2048.codfw.wmnet
08:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1213.eqiad.wmnet with OS bookworm
08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213:3316 db1213:3315', diff saved to https://phabricator.wikimedia.org/P55151 and previous config saved to /var/cache/conftool/dbconfig/20240122-083812-marostegui.json
08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2048.codfw.wmnet
08:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1048.eqiad.wmnet
08:35 xSavitar: UTC morning backport window done!
08:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55150 and previous config saved to /var/cache/conftool/dbconfig/20240122-083454-root.json
08:34 derick@deploy2002: Finished scap: Backport for gerrit:988403 wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) (duration: 18m 15s)
08:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55149 and previous config saved to /var/cache/conftool/dbconfig/20240122-083319-marostegui.json
08:32 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1048.eqiad.wmnet
08:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55148 and previous config saved to /var/cache/conftool/dbconfig/20240122-083208-root.json
08:27 derick@deploy2002: d3r1ck01 and derick: Continuing with sync
08:26 derick@deploy2002: d3r1ck01 and derick: Backport for gerrit:988403 wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55147 and previous config saved to /var/cache/conftool/dbconfig/20240122-081950-root.json
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55146 and previous config saved to /var/cache/conftool/dbconfig/20240122-081727-root.json
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55145 and previous config saved to /var/cache/conftool/dbconfig/20240122-081703-root.json
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55144 and previous config saved to /var/cache/conftool/dbconfig/20240122-081618-marostegui.json
08:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:15 derick@deploy2002: Started scap: Backport for gerrit:988403 wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004)
08:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55143 and previous config saved to /var/cache/conftool/dbconfig/20240122-081545-marostegui.json
08:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55142 and previous config saved to /var/cache/conftool/dbconfig/20240122-080445-root.json
08:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55141 and previous config saved to /var/cache/conftool/dbconfig/20240122-080222-root.json
08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55140 and previous config saved to /var/cache/conftool/dbconfig/20240122-080158-root.json
08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55139 and previous config saved to /var/cache/conftool/dbconfig/20240122-080038-marostegui.json
07:54 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Shubhankar Patankar out of all services on: 2208 hosts
07:53 root@cumin2002: START - Cookbook sre.idm.logout Logging Shubhankar Patankar out of all services on: 2208 hosts
07:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55138 and previous config saved to /var/cache/conftool/dbconfig/20240122-074940-root.json
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55137 and previous config saved to /var/cache/conftool/dbconfig/20240122-074717-root.json
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55136 and previous config saved to /var/cache/conftool/dbconfig/20240122-074653-root.json
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55135 and previous config saved to /var/cache/conftool/dbconfig/20240122-074532-marostegui.json
07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS bookworm
07:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55134 and previous config saved to /var/cache/conftool/dbconfig/20240122-073435-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55133 and previous config saved to /var/cache/conftool/dbconfig/20240122-073212-root.json
07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55132 and previous config saved to /var/cache/conftool/dbconfig/20240122-073148-root.json
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55131 and previous config saved to /var/cache/conftool/dbconfig/20240122-073025-marostegui.json
07:28 kart_: Updated MinT to 2024-01-22-053144-production (T355303, T338608, T353510, T354666)
07:20 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55130 and previous config saved to /var/cache/conftool/dbconfig/20240122-071707-root.json
07:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
07:13 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55129 and previous config saved to /var/cache/conftool/dbconfig/20240122-071117-marostegui.json
07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
07:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55128 and previous config saved to /var/cache/conftool/dbconfig/20240122-071054-marostegui.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55127 and previous config saved to /var/cache/conftool/dbconfig/20240122-070202-root.json
07:02 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55126 and previous config saved to /var/cache/conftool/dbconfig/20240122-065548-marostegui.json
06:55 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
06:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS bookworm
06:52 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
06:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2169:3316 db2169:3317', diff saved to https://phabricator.wikimedia.org/P55125 and previous config saved to /var/cache/conftool/dbconfig/20240122-064929-marostegui.json
06:47 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55124 and previous config saved to /var/cache/conftool/dbconfig/20240122-064657-root.json
06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS bookworm
06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55123 and previous config saved to /var/cache/conftool/dbconfig/20240122-064041-marostegui.json
06:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
06:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55122 and previous config saved to /var/cache/conftool/dbconfig/20240122-062535-marostegui.json
06:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
06:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS bookworm
06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1187 T354506', diff saved to https://phabricator.wikimedia.org/P55121 and previous config saved to /var/cache/conftool/dbconfig/20240122-060811-marostegui.json
06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55120 and previous config saved to /var/cache/conftool/dbconfig/20240122-060529-marostegui.json
06:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
06:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55119 and previous config saved to /var/cache/conftool/dbconfig/20240122-054005-ladsgroup.json
05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55118 and previous config saved to /var/cache/conftool/dbconfig/20240122-052458-ladsgroup.json
05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55117 and previous config saved to /var/cache/conftool/dbconfig/20240122-050952-ladsgroup.json
04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55116 and previous config saved to /var/cache/conftool/dbconfig/20240122-045445-ladsgroup.json

2024-01-21

23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55115 and previous config saved to /var/cache/conftool/dbconfig/20240121-232323-ladsgroup.json
23:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
23:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55114 and previous config saved to /var/cache/conftool/dbconfig/20240121-232300-ladsgroup.json
23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55113 and previous config saved to /var/cache/conftool/dbconfig/20240121-230754-ladsgroup.json
22:55 tgr: T355491 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dawiki --logwiki=metawiki 'Radiocolono' 'GuaritaRM'
22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55112 and previous config saved to /var/cache/conftool/dbconfig/20240121-225247-ladsgroup.json
22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55111 and previous config saved to /var/cache/conftool/dbconfig/20240121-223740-ladsgroup.json
17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55110 and previous config saved to /var/cache/conftool/dbconfig/20240121-171534-ladsgroup.json
17:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55109 and previous config saved to /var/cache/conftool/dbconfig/20240121-171512-ladsgroup.json
17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55108 and previous config saved to /var/cache/conftool/dbconfig/20240121-170005-ladsgroup.json
16:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55107 and previous config saved to /var/cache/conftool/dbconfig/20240121-164459-ladsgroup.json
16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55106 and previous config saved to /var/cache/conftool/dbconfig/20240121-162952-ladsgroup.json
11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55105 and previous config saved to /var/cache/conftool/dbconfig/20240121-110344-ladsgroup.json
11:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
11:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55104 and previous config saved to /var/cache/conftool/dbconfig/20240121-110322-ladsgroup.json
10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55103 and previous config saved to /var/cache/conftool/dbconfig/20240121-104815-ladsgroup.json
10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55102 and previous config saved to /var/cache/conftool/dbconfig/20240121-103309-ladsgroup.json
10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55101 and previous config saved to /var/cache/conftool/dbconfig/20240121-101802-ladsgroup.json
09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55100 and previous config saved to /var/cache/conftool/dbconfig/20240121-091731-ladsgroup.json
09:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55099 and previous config saved to /var/cache/conftool/dbconfig/20240121-091708-ladsgroup.json
09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2175', diff saved to https://phabricator.wikimedia.org/P55098 and previous config saved to /var/cache/conftool/dbconfig/20240121-090831-marostegui.json
09:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55097 and previous config saved to /var/cache/conftool/dbconfig/20240121-090202-ladsgroup.json
08:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55096 and previous config saved to /var/cache/conftool/dbconfig/20240121-084655-ladsgroup.json
08:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55095 and previous config saved to /var/cache/conftool/dbconfig/20240121-083148-ladsgroup.json
02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55094 and previous config saved to /var/cache/conftool/dbconfig/20240121-024507-ladsgroup.json
02:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
02:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
02:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55093 and previous config saved to /var/cache/conftool/dbconfig/20240121-024445-ladsgroup.json
02:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55092 and previous config saved to /var/cache/conftool/dbconfig/20240121-022939-ladsgroup.json
02:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55091 and previous config saved to /var/cache/conftool/dbconfig/20240121-021432-ladsgroup.json
01:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55090 and previous config saved to /var/cache/conftool/dbconfig/20240121-015926-ladsgroup.json
00:29 mutante: phabricator is back and on bullseye
00:11 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 13s)
00:11 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
00:03 mutante: phab1004:/usr/bin# ln -s /var/lib/scap/scap/bin/scap .
00:00 brennen@deploy2002: Installation of scap version "latest" completed for 1 hosts
00:00 brennen@deploy2002: Installing scap version "latest" for 1 hosts

2024-01-20

23:58 mutante: phab1004 - chown -R scap:scap /var/lib/scap
23:10 brennen@deploy2002: Installing scap version "latest" for 1 hosts
22:45 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
22:44 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
22:39 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
22:39 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
22:34 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
22:34 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
22:28 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2) (duration: 00m 54s)
22:27 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2)
22:23 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (duration: 00m 55s)
22:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert
22:02 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1004.eqiad.wmnet with OS bullseye
22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
22:01 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
21:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
21:43 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
21:33 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
21:33 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
21:31 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
21:27 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host phab1004.eqiad.wmnet with OS bullseye
21:27 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
21:03 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux) (duration: 01m 35s)
21:02 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux)
20:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
20:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
20:37 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes (duration: 00m 53s)
20:36 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes
20:32 mutante: phabricator going down for maintenance
20:24 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab.wmfusercontent.org with reason: OS upgrade
20:23 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
20:23 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
20:22 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
20:22 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
20:04 brennen: start of phab/phorge bullseye update window - T334519
20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55089 and previous config saved to /var/cache/conftool/dbconfig/20240120-200154-ladsgroup.json
20:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
20:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
14:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55087 and previous config saved to /var/cache/conftool/dbconfig/20240120-095311-ladsgroup.json
09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55086 and previous config saved to /var/cache/conftool/dbconfig/20240120-093804-ladsgroup.json
09:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55085 and previous config saved to /var/cache/conftool/dbconfig/20240120-092257-ladsgroup.json
09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55084 and previous config saved to /var/cache/conftool/dbconfig/20240120-090751-ladsgroup.json
04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55083 and previous config saved to /var/cache/conftool/dbconfig/20240120-041124-ladsgroup.json
04:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
04:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55082 and previous config saved to /var/cache/conftool/dbconfig/20240120-041102-ladsgroup.json
03:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55081 and previous config saved to /var/cache/conftool/dbconfig/20240120-035555-ladsgroup.json
03:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55080 and previous config saved to /var/cache/conftool/dbconfig/20240120-034049-ladsgroup.json
03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55079 and previous config saved to /var/cache/conftool/dbconfig/20240120-032542-ladsgroup.json

2024-01-19

22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55078 and previous config saved to /var/cache/conftool/dbconfig/20240119-225906-ladsgroup.json
22:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
22:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
22:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55077 and previous config saved to /var/cache/conftool/dbconfig/20240119-225844-ladsgroup.json
22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55076 and previous config saved to /var/cache/conftool/dbconfig/20240119-224337-ladsgroup.json
22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55075 and previous config saved to /var/cache/conftool/dbconfig/20240119-222830-ladsgroup.json
22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55074 and previous config saved to /var/cache/conftool/dbconfig/20240119-221324-ladsgroup.json
22:05 ryankemper: [WDQS] Repooled `wdqs10[19,20]` (caught up on lag)
20:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55073 and previous config saved to /var/cache/conftool/dbconfig/20240119-202129-marostegui.json
20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55072 and previous config saved to /var/cache/conftool/dbconfig/20240119-200622-marostegui.json
19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55071 and previous config saved to /var/cache/conftool/dbconfig/20240119-195116-marostegui.json
19:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
19:43 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
19:38 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55070 and previous config saved to /var/cache/conftool/dbconfig/20240119-193610-marostegui.json
19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55069 and previous config saved to /var/cache/conftool/dbconfig/20240119-193028-marostegui.json
19:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
19:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55068 and previous config saved to /var/cache/conftool/dbconfig/20240119-193006-marostegui.json
19:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55067 and previous config saved to /var/cache/conftool/dbconfig/20240119-191459-marostegui.json
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55066 and previous config saved to /var/cache/conftool/dbconfig/20240119-185953-marostegui.json
18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55065 and previous config saved to /var/cache/conftool/dbconfig/20240119-184446-marostegui.json
18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55064 and previous config saved to /var/cache/conftool/dbconfig/20240119-183902-marostegui.json
18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55063 and previous config saved to /var/cache/conftool/dbconfig/20240119-183821-marostegui.json
18:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55062 and previous config saved to /var/cache/conftool/dbconfig/20240119-182314-marostegui.json
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55061 and previous config saved to /var/cache/conftool/dbconfig/20240119-180808-marostegui.json
18:02 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55060 and previous config saved to /var/cache/conftool/dbconfig/20240119-175301-marostegui.json
17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55059 and previous config saved to /var/cache/conftool/dbconfig/20240119-174735-marostegui.json
17:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
17:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55058 and previous config saved to /var/cache/conftool/dbconfig/20240119-174713-marostegui.json
17:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55057 and previous config saved to /var/cache/conftool/dbconfig/20240119-173207-marostegui.json
17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55056 and previous config saved to /var/cache/conftool/dbconfig/20240119-172715-ladsgroup.json
17:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55055 and previous config saved to /var/cache/conftool/dbconfig/20240119-172652-ladsgroup.json
17:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
17:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1010.wikimedia.org
17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1009.wikimedia.org
17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
17:22 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1007.wikimedia.org
17:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55054 and previous config saved to /var/cache/conftool/dbconfig/20240119-171700-marostegui.json
17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55053 and previous config saved to /var/cache/conftool/dbconfig/20240119-171146-ladsgroup.json
17:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
17:04 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2088.codfw.wmnet with OS bullseye
17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55052 and previous config saved to /var/cache/conftool/dbconfig/20240119-170154-marostegui.json
16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55051 and previous config saved to /var/cache/conftool/dbconfig/20240119-165639-ladsgroup.json
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55050 and previous config saved to /var/cache/conftool/dbconfig/20240119-165627-marostegui.json
16:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
16:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55049 and previous config saved to /var/cache/conftool/dbconfig/20240119-165605-marostegui.json
16:41 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55048 and previous config saved to /var/cache/conftool/dbconfig/20240119-164133-ladsgroup.json
16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55047 and previous config saved to /var/cache/conftool/dbconfig/20240119-164058-marostegui.json
16:38 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
16:31 Emperor: mark new drive as non-RAID, mount, restore to service with puppet ms-be2072 T355330
16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55046 and previous config saved to /var/cache/conftool/dbconfig/20240119-162552-marostegui.json
16:16 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55045 and previous config saved to /var/cache/conftool/dbconfig/20240119-161046-marostegui.json
16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55044 and previous config saved to /var/cache/conftool/dbconfig/20240119-160521-marostegui.json
16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55043 and previous config saved to /var/cache/conftool/dbconfig/20240119-160459-marostegui.json
15:57 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55042 and previous config saved to /var/cache/conftool/dbconfig/20240119-154953-marostegui.json
15:46 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@f32c06e]: (no justification provided) (duration: 00m 30s)
15:46 gmodena@deploy2002: Started deploy [airflow-dags/analytics@f32c06e]: (no justification provided)
15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55041 and previous config saved to /var/cache/conftool/dbconfig/20240119-153446-marostegui.json
15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55040 and previous config saved to /var/cache/conftool/dbconfig/20240119-151940-marostegui.json
15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55039 and previous config saved to /var/cache/conftool/dbconfig/20240119-151413-marostegui.json
15:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
15:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
15:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
15:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
15:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55038 and previous config saved to /var/cache/conftool/dbconfig/20240119-145930-marostegui.json
14:56 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
14:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1107.eqiad.wmnet with OS bullseye
14:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55036 and previous config saved to /var/cache/conftool/dbconfig/20240119-144423-marostegui.json
14:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1103.eqiad.wmnet with OS bullseye
14:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
14:34 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:34 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:34 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
14:31 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:29 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
14:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55034 and previous config saved to /var/cache/conftool/dbconfig/20240119-142917-marostegui.json
14:27 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:27 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
14:24 ejegg: payments-wiki upgraded from c37ddae5 to c2138768
14:21 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:21 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
14:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
14:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55033 and previous config saved to /var/cache/conftool/dbconfig/20240119-141411-marostegui.json
14:13 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1107.eqiad.wmnet with OS bullseye
14:12 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:12 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55032 and previous config saved to /var/cache/conftool/dbconfig/20240119-140746-marostegui.json
14:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55031 and previous config saved to /var/cache/conftool/dbconfig/20240119-140712-marostegui.json
14:07 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:06 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
14:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
13:58 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
13:57 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55030 and previous config saved to /var/cache/conftool/dbconfig/20240119-135206-marostegui.json
13:46 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
13:46 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
13:43 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2046.codfw.wmnet
13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1046.eqiad.wmnet
13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55029 and previous config saved to /var/cache/conftool/dbconfig/20240119-133659-marostegui.json
13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2046.codfw.wmnet
13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1046.eqiad.wmnet
13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55028 and previous config saved to /var/cache/conftool/dbconfig/20240119-132153-marostegui.json
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55027 and previous config saved to /var/cache/conftool/dbconfig/20240119-131929-marostegui.json
13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55026 and previous config saved to /var/cache/conftool/dbconfig/20240119-131906-marostegui.json
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55024 and previous config saved to /var/cache/conftool/dbconfig/20240119-130400-marostegui.json
12:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55023 and previous config saved to /var/cache/conftool/dbconfig/20240119-124853-marostegui.json
12:45 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
12:44 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
12:44 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
12:43 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
12:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
12:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55022 and previous config saved to /var/cache/conftool/dbconfig/20240119-123347-marostegui.json
12:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
12:32 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55021 and previous config saved to /var/cache/conftool/dbconfig/20240119-123023-marostegui.json
12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55020 and previous config saved to /var/cache/conftool/dbconfig/20240119-123001-marostegui.json
12:30 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
12:29 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55019 and previous config saved to /var/cache/conftool/dbconfig/20240119-121455-marostegui.json
11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55018 and previous config saved to /var/cache/conftool/dbconfig/20240119-115948-marostegui.json
11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55017 and previous config saved to /var/cache/conftool/dbconfig/20240119-114452-ladsgroup.json
11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55016 and previous config saved to /var/cache/conftool/dbconfig/20240119-114442-marostegui.json
11:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
11:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55015 and previous config saved to /var/cache/conftool/dbconfig/20240119-114424-ladsgroup.json
11:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55014 and previous config saved to /var/cache/conftool/dbconfig/20240119-114219-marostegui.json
11:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55013 and previous config saved to /var/cache/conftool/dbconfig/20240119-114140-marostegui.json
11:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55012 and previous config saved to /var/cache/conftool/dbconfig/20240119-112917-ladsgroup.json
11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55011 and previous config saved to /var/cache/conftool/dbconfig/20240119-112634-marostegui.json
11:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55010 and previous config saved to /var/cache/conftool/dbconfig/20240119-111411-ladsgroup.json
11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55009 and previous config saved to /var/cache/conftool/dbconfig/20240119-111127-marostegui.json
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55008 and previous config saved to /var/cache/conftool/dbconfig/20240119-105904-ladsgroup.json
10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55007 and previous config saved to /var/cache/conftool/dbconfig/20240119-105621-marostegui.json
10:45 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
10:42 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55006 and previous config saved to /var/cache/conftool/dbconfig/20240119-101340-marostegui.json
10:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
10:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55005 and previous config saved to /var/cache/conftool/dbconfig/20240119-101318-marostegui.json
09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55004 and previous config saved to /var/cache/conftool/dbconfig/20240119-095811-marostegui.json
09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55003 and previous config saved to /var/cache/conftool/dbconfig/20240119-094305-marostegui.json
09:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55002 and previous config saved to /var/cache/conftool/dbconfig/20240119-092758-marostegui.json
09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55001 and previous config saved to /var/cache/conftool/dbconfig/20240119-092535-marostegui.json
09:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
09:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
09:25 jnuche@deploy2002: Installation of scap version "4.65.2" completed for 531 hosts
09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P55000 and previous config saved to /var/cache/conftool/dbconfig/20240119-092513-marostegui.json
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2006.codfw.wmnet
09:24 jnuche@deploy2002: Installing scap version "4.65.2" for 531 hosts
09:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2006.codfw.wmnet
09:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2005.codfw.wmnet
09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54999 and previous config saved to /var/cache/conftool/dbconfig/20240119-091007-marostegui.json
09:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2005.codfw.wmnet
09:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2004.codfw.wmnet
08:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54998 and previous config saved to /var/cache/conftool/dbconfig/20240119-085500-marostegui.json
08:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2004.codfw.wmnet
08:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1006.eqiad.wmnet
08:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54997 and previous config saved to /var/cache/conftool/dbconfig/20240119-083954-marostegui.json
08:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1006.eqiad.wmnet
08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54996 and previous config saved to /var/cache/conftool/dbconfig/20240119-083730-marostegui.json
08:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54995 and previous config saved to /var/cache/conftool/dbconfig/20240119-083709-marostegui.json
08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1005.eqiad.wmnet
08:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1005.eqiad.wmnet
08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54994 and previous config saved to /var/cache/conftool/dbconfig/20240119-082202-marostegui.json
08:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1004.eqiad.wmnet
08:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1004.eqiad.wmnet
08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54993 and previous config saved to /var/cache/conftool/dbconfig/20240119-080655-marostegui.json
07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 100%: T354336', diff saved to https://phabricator.wikimedia.org/P54992 and previous config saved to /var/cache/conftool/dbconfig/20240119-075828-root.json
07:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54991 and previous config saved to /var/cache/conftool/dbconfig/20240119-075149-marostegui.json
07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54990 and previous config saved to /var/cache/conftool/dbconfig/20240119-074825-marostegui.json
07:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
07:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54989 and previous config saved to /var/cache/conftool/dbconfig/20240119-074752-marostegui.json
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 75%: T354336', diff saved to https://phabricator.wikimedia.org/P54988 and previous config saved to /var/cache/conftool/dbconfig/20240119-074323-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54987 and previous config saved to /var/cache/conftool/dbconfig/20240119-073245-marostegui.json
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 50%: T354336', diff saved to https://phabricator.wikimedia.org/P54986 and previous config saved to /var/cache/conftool/dbconfig/20240119-072818-root.json
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54985 and previous config saved to /var/cache/conftool/dbconfig/20240119-071739-marostegui.json
07:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 25%: T354336', diff saved to https://phabricator.wikimedia.org/P54984 and previous config saved to /var/cache/conftool/dbconfig/20240119-071313-root.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54983 and previous config saved to /var/cache/conftool/dbconfig/20240119-070233-marostegui.json
07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54982 and previous config saved to /var/cache/conftool/dbconfig/20240119-070009-marostegui.json
07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
06:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 10%: T354336', diff saved to https://phabricator.wikimedia.org/P54981 and previous config saved to /var/cache/conftool/dbconfig/20240119-065808-root.json
06:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
06:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54979 and previous config saved to /var/cache/conftool/dbconfig/20240119-063020-marostegui.json
06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:28 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P54978 and previous config saved to /var/cache/conftool/dbconfig/20240119-061827-ladsgroup.json
06:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
06:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54977 and previous config saved to /var/cache/conftool/dbconfig/20240119-061805-ladsgroup.json
06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54976 and previous config saved to /var/cache/conftool/dbconfig/20240119-060258-ladsgroup.json
05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54975 and previous config saved to /var/cache/conftool/dbconfig/20240119-054751-ladsgroup.json
05:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54974 and previous config saved to /var/cache/conftool/dbconfig/20240119-053244-ladsgroup.json
03:38 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
02:49 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1103.eqiad.wmnet with OS bullseye
02:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1106.eqiad.wmnet with OS bullseye
02:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1105.eqiad.wmnet with OS bullseye
02:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1104.eqiad.wmnet with OS bullseye
02:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
02:28 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
02:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
02:24 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
02:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
02:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
02:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
02:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
02:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1106.eqiad.wmnet with OS bullseye
02:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1105.eqiad.wmnet with OS bullseye
02:09 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
02:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1104.eqiad.wmnet with OS bullseye
02:01 tzatziki: removing 4 files for legal compliance
01:42 tzatziki: removing 3 files for legal compliance
01:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
01:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2097.codfw.wmnet with OS bullseye
01:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2096.codfw.wmnet with OS bullseye
00:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
00:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
00:50 tzatziki: removing 1 file for legal compliance
00:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
00:47 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
00:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
00:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
00:42 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2101.codfw.wmnet with OS bullseye
00:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2100.codfw.wmnet with OS bullseye
00:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2099.codfw.wmnet with OS bullseye
00:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54973 and previous config saved to /var/cache/conftool/dbconfig/20240119-002755-ladsgroup.json
00:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
00:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54972 and previous config saved to /var/cache/conftool/dbconfig/20240119-002733-ladsgroup.json
00:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
00:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2098.codfw.wmnet with OS bullseye
00:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
00:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
00:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
00:18 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
00:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
00:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
00:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
00:13 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
00:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54971 and previous config saved to /var/cache/conftool/dbconfig/20240119-001226-ladsgroup.json
00:12 inflatador: bking@wdqs1020 depool host to catch up on lag
00:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
00:05 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
00:05 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
00:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye

2024-01-18

23:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54970 and previous config saved to /var/cache/conftool/dbconfig/20240118-235720-ladsgroup.json
23:50 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
23:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
23:47 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54969 and previous config saved to /var/cache/conftool/dbconfig/20240118-234213-ladsgroup.json
23:13 tstarling@deploy2002: Synchronized php-1.42.0-wmf.14/extensions/CodeMirror/resources/mode/mediawiki/mediawiki.less: fix CodeMirror style bug T355290 (duration: 06m 33s)
22:59 bking@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host elastic2086.codfw.wmnet
22:55 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086.codfw.wmnet
22:55 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host elastic2086*
22:54 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086*
22:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
22:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
21:59 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
21:57 urbanecm@deploy2002: Finished scap: Backport for gerrit:991561 Use BetaFeatures::isFeatureEnabled instead of getOption (T354288) (duration: 06m 58s)
21:50 urbanecm@deploy2002: Started scap: Backport for gerrit:991561 Use BetaFeatures::isFeatureEnabled instead of getOption (T354288)
21:41 jforrester@deploy2002: Finished scap: Backport for gerrit:991547 Promote wikimaniawiki to Vector 2022 as default skin (T355297) (duration: 07m 33s)
21:35 jforrester@deploy2002: jforrester and msz2001: Continuing with sync
21:35 jforrester@deploy2002: jforrester and msz2001: Backport for gerrit:991547 Promote wikimaniawiki to Vector 2022 as default skin (T355297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:34 jforrester@deploy2002: Started scap: Backport for gerrit:991547 Promote wikimaniawiki to Vector 2022 as default skin (T355297)
21:15 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
21:14 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:991555 Log to statsd HTTP status codes and reduce logstash log levels (T355216) (duration: 09m 00s)
21:14 Dreamy_Jazz: Stopped MediaModeration scanning script (T351400)
21:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
21:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54968 and previous config saved to /var/cache/conftool/dbconfig/20240118-211337-marostegui.json
21:08 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
21:08 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:991555 Log to statsd HTTP status codes and reduce logstash log levels (T355216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:05 dreamyjazz@deploy2002: Started scap: Backport for gerrit:991555 Log to statsd HTTP status codes and reduce logstash log levels (T355216)
21:04 ejegg: payments-wiki upgraded from e38b24f0 to c37ddae5
20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54967 and previous config saved to /var/cache/conftool/dbconfig/20240118-205830-marostegui.json
20:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
20:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54966 and previous config saved to /var/cache/conftool/dbconfig/20240118-204324-marostegui.json
20:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54965 and previous config saved to /var/cache/conftool/dbconfig/20240118-202817-marostegui.json
20:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54964 and previous config saved to /var/cache/conftool/dbconfig/20240118-202606-marostegui.json
20:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
20:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54963 and previous config saved to /var/cache/conftool/dbconfig/20240118-202544-marostegui.json
20:24 mutante: rsyncing phab repo data, gitlab2003 pulls from phab2002 (inactive server) - test only to see how long it will take, can be stopped
20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54962 and previous config saved to /var/cache/conftool/dbconfig/20240118-201037-marostegui.json
20:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2095.codfw.wmnet with OS bullseye
19:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54961 and previous config saved to /var/cache/conftool/dbconfig/20240118-195531-marostegui.json
19:48 ryankemper: T354662 Running `sudo -i authdns-update` on `dns1004` following merge of https://gerrit.wikimedia.org/r/c/operations/dns/+/991429
19:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
19:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
19:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54960 and previous config saved to /var/cache/conftool/dbconfig/20240118-194024-marostegui.json
19:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2093.codfw.wmnet with OS bullseye
19:23 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
19:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2092.codfw.wmnet with OS bullseye
19:11 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2091.codfw.wmnet with OS bullseye
19:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
19:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2089.codfw.wmnet with OS bullseye
19:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
19:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
18:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
18:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
18:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54959 and previous config saved to /var/cache/conftool/dbconfig/20240118-185038-ladsgroup.json
18:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
18:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54958 and previous config saved to /var/cache/conftool/dbconfig/20240118-185016-ladsgroup.json
18:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
18:47 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
18:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
18:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54957 and previous config saved to /var/cache/conftool/dbconfig/20240118-184002-marostegui.json
18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54956 and previous config saved to /var/cache/conftool/dbconfig/20240118-183940-marostegui.json
18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54955 and previous config saved to /var/cache/conftool/dbconfig/20240118-183510-ladsgroup.json
18:34 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
18:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
18:25 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54954 and previous config saved to /var/cache/conftool/dbconfig/20240118-182433-marostegui.json
18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54953 and previous config saved to /var/cache/conftool/dbconfig/20240118-182003-ladsgroup.json
18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54951 and previous config saved to /var/cache/conftool/dbconfig/20240118-180927-marostegui.json
18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54950 and previous config saved to /var/cache/conftool/dbconfig/20240118-180456-ladsgroup.json
17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54949 and previous config saved to /var/cache/conftool/dbconfig/20240118-175420-marostegui.json
17:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54948 and previous config saved to /var/cache/conftool/dbconfig/20240118-175209-marostegui.json
17:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54947 and previous config saved to /var/cache/conftool/dbconfig/20240118-175147-marostegui.json
17:43 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2097.codfw.wmnet with OS bullseye
17:42 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2101.codfw.wmnet with OS bullseye
17:39 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2096.codfw.wmnet with OS bullseye
17:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54946 and previous config saved to /var/cache/conftool/dbconfig/20240118-173640-marostegui.json
17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2095.codfw.wmnet with OS bullseye
17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2100.codfw.wmnet with OS bullseye
17:33 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
17:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2102.codfw.wmnet with OS bullseye
17:30 topranks: Re-enabling PyBal on lvs2011 after network migration T352912
17:30 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2093.codfw.wmnet with OS bullseye
17:28 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2099.codfw.wmnet with OS bullseye
17:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2092.codfw.wmnet with OS bullseye
17:25 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2091.codfw.wmnet with OS bullseye
17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54945 and previous config saved to /var/cache/conftool/dbconfig/20240118-172134-marostegui.json
17:20 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2098.codfw.wmnet with OS bullseye
17:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
17:11 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
17:11 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2089.codfw.wmnet with OS bullseye
17:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54944 and previous config saved to /var/cache/conftool/dbconfig/20240118-170627-marostegui.json
17:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
17:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54943 and previous config saved to /var/cache/conftool/dbconfig/20240118-170417-marostegui.json
17:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
17:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54942 and previous config saved to /var/cache/conftool/dbconfig/20240118-170355-marostegui.json
16:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2102.codfw.wmnet with OS bullseye
16:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54941 and previous config saved to /var/cache/conftool/dbconfig/20240118-164848-marostegui.json
16:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye
16:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2090.codfw.wmnet with OS bullseye
16:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
16:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54940 and previous config saved to /var/cache/conftool/dbconfig/20240118-163342-marostegui.json
16:33 hashar@deploy2002: Finished deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895 (duration: 00m 07s)
16:33 hashar@deploy2002: Started deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895
16:27 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
16:25 sukhe: running authdns-update for T355308
16:22 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
16:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54939 and previous config saved to /var/cache/conftool/dbconfig/20240118-161834-marostegui.json
16:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
16:18 claime: Running puppet on 'P{P:kubernetes::node} and not P{F:lldp.parent ~ lsw}' - T352883
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54938 and previous config saved to /var/cache/conftool/dbconfig/20240118-161624-marostegui.json
16:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
16:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54937 and previous config saved to /var/cache/conftool/dbconfig/20240118-161602-marostegui.json
16:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
16:15 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
16:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
16:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
16:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
16:06 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
16:05 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
16:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
16:04 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
16:04 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
16:03 claime: Running puppet on 'P{P:kubernetes::node} and P{F:lldp.parent ~ lsw}' - T352883
16:02 topranks: disabling PyBal and puppet on lvs2011, moving traffic to lvs2014 ahead of network change T352912
16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54936 and previous config saved to /var/cache/conftool/dbconfig/20240118-160055-marostegui.json
15:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1461.eqiad.wmnet with OS bullseye
15:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1439.eqiad.wmnet with OS bullseye
15:54 claime: Running puppet on A:wikikube-staging-worker - T352883
15:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1469.eqiad.wmnet with OS bullseye
15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1045.eqiad.wmnet
15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2045.codfw.wmnet
15:52 claime: Running puppet on kubestage2002 - T352883
15:52 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352883
15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
15:49 claime: Running puppet on kubestage2002 - T352893
15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1045.eqiad.wmnet
15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2045.codfw.wmnet
15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54935 and previous config saved to /var/cache/conftool/dbconfig/20240118-154549-marostegui.json
15:45 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352893
15:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
15:40 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
15:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
15:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
15:31 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
15:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54933 and previous config saved to /var/cache/conftool/dbconfig/20240118-153042-marostegui.json
15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54932 and previous config saved to /var/cache/conftool/dbconfig/20240118-152832-marostegui.json
15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
15:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54931 and previous config saved to /var/cache/conftool/dbconfig/20240118-152747-marostegui.json
15:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: T355313', diff saved to https://phabricator.wikimedia.org/P54930 and previous config saved to /var/cache/conftool/dbconfig/20240118-152006-root.json
15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1439.eqiad.wmnet with OS bullseye
15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1469.eqiad.wmnet with OS bullseye
15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1461.eqiad.wmnet with OS bullseye
15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54929 and previous config saved to /var/cache/conftool/dbconfig/20240118-151241-marostegui.json
15:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: T355313', diff saved to https://phabricator.wikimedia.org/P54928 and previous config saved to /var/cache/conftool/dbconfig/20240118-150501-root.json
14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54927 and previous config saved to /var/cache/conftool/dbconfig/20240118-145734-marostegui.json
14:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: T355313', diff saved to https://phabricator.wikimedia.org/P54926 and previous config saved to /var/cache/conftool/dbconfig/20240118-144956-root.json
14:43 Dreamy_Jazz: Afternoon UTC backport window done
14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54925 and previous config saved to /var/cache/conftool/dbconfig/20240118-144228-marostegui.json
14:42 Emperor: disable puppet on ms-be2072 to try and deal with faulty drive
14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54924 and previous config saved to /var/cache/conftool/dbconfig/20240118-144214-marostegui.json
14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54923 and previous config saved to /var/cache/conftool/dbconfig/20240118-144152-marostegui.json
14:41 Dreamy_Jazz: Ran `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-tagline-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-wordmark-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki.png' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-1.5x.png' | mwscript purgeList.php`, and `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-2x.png' | mwscript purgeList.php`
14:38 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:989750 thwiki: update tagline and optimise other logos (T341407) (duration: 08m 28s)
14:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: T355313', diff saved to https://phabricator.wikimedia.org/P54922 and previous config saved to /var/cache/conftool/dbconfig/20240118-143451-root.json
14:33 dreamyjazz@deploy2002: anzx and dreamyjazz: Continuing with sync
14:31 dreamyjazz@deploy2002: anzx and dreamyjazz: Backport for gerrit:989750 thwiki: update tagline and optimise other logos (T341407) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:30 dreamyjazz@deploy2002: Started scap: Backport for gerrit:989750 thwiki: update tagline and optimise other logos (T341407)
14:28 kartik@deploy2002: Finished scap: Backport for gerrit:991002 Set MT threshold for Punjabi Wikipedia to 97 (T347789) (duration: 10m 03s)
14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54921 and previous config saved to /var/cache/conftool/dbconfig/20240118-142646-marostegui.json
14:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: aqs
14:22 kartik@deploy2002: kartik: Continuing with sync
14:19 kartik@deploy2002: kartik: Backport for gerrit:991002 Set MT threshold for Punjabi Wikipedia to 97 (T347789) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: T355313', diff saved to https://phabricator.wikimedia.org/P54920 and previous config saved to /var/cache/conftool/dbconfig/20240118-141946-root.json
14:18 kartik@deploy2002: Started scap: Backport for gerrit:991002 Set MT threshold for Punjabi Wikipedia to 97 (T347789)
14:12 Dreamy_Jazz: running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
14:11 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:991551 Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) (duration: 07m 50s)
14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54919 and previous config saved to /var/cache/conftool/dbconfig/20240118-141139-marostegui.json
14:07 Dreamy_Jazz: Stopped MediaModeration scan for commonswiki
14:07 Dreamy_Jazz: stopped MediaModeration scan for group2
14:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: aqs
14:06 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:991551 Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 5%: T355313', diff saved to https://phabricator.wikimedia.org/P54918 and previous config saved to /var/cache/conftool/dbconfig/20240118-140441-root.json
14:03 dreamyjazz@deploy2002: Started scap: Backport for gerrit:991551 Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309)
13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54917 and previous config saved to /var/cache/conftool/dbconfig/20240118-135633-marostegui.json
13:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54916 and previous config saved to /var/cache/conftool/dbconfig/20240118-135422-marostegui.json
13:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
13:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
13:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 1%: T355313', diff saved to https://phabricator.wikimedia.org/P54915 and previous config saved to /var/cache/conftool/dbconfig/20240118-134936-root.json
13:28 moritzm: installing python-requests security updates
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54914 and previous config saved to /var/cache/conftool/dbconfig/20240118-130451-marostegui.json
12:54 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54913 and previous config saved to /var/cache/conftool/dbconfig/20240118-125130-ladsgroup.json
12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
12:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54912 and previous config saved to /var/cache/conftool/dbconfig/20240118-125048-ladsgroup.json
12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54911 and previous config saved to /var/cache/conftool/dbconfig/20240118-124945-marostegui.json
12:41 godog: grafana restarted on grafana1002 after https://gerrit.wikimedia.org/r/c/operations/puppet/+/991573
12:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54910 and previous config saved to /var/cache/conftool/dbconfig/20240118-123541-ladsgroup.json
12:35 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
12:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54909 and previous config saved to /var/cache/conftool/dbconfig/20240118-123439-marostegui.json
12:34 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
12:33 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
12:31 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
12:28 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
12:27 Dreamy_Jazz: Finished security deploy for T347742
12:27 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:991552 SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) (duration: 08m 28s)
12:27 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1047.eqiad.wmnet
12:26 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
12:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2047.codfw.wmnet
12:21 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
12:20 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:991552 SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54908 and previous config saved to /var/cache/conftool/dbconfig/20240118-122035-ladsgroup.json
12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2047.codfw.wmnet
12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1047.eqiad.wmnet
12:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54907 and previous config saved to /var/cache/conftool/dbconfig/20240118-121932-marostegui.json
12:18 dreamyjazz@deploy2002: Started scap: Backport for gerrit:991552 SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742)
12:17 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
12:17 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
12:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
12:16 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
12:16 jynus: depooled db2146, lot of lag, should be investigated later
12:15 jynus@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P54906 and previous config saved to /var/cache/conftool/dbconfig/20240118-121541-jynus.json
12:07 Dreamy_Jazz: Doing security deploy for T347742
12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54905 and previous config saved to /var/cache/conftool/dbconfig/20240118-120528-ladsgroup.json
11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54904 and previous config saved to /var/cache/conftool/dbconfig/20240118-114551-marostegui.json
11:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
11:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54903 and previous config saved to /var/cache/conftool/dbconfig/20240118-114528-marostegui.json
11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54902 and previous config saved to /var/cache/conftool/dbconfig/20240118-113022-marostegui.json
11:21 godog: bounce apache2 on logstash1025 / logstash1031 - T337818
11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54901 and previous config saved to /var/cache/conftool/dbconfig/20240118-111516-marostegui.json
11:04 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
11:01 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54900 and previous config saved to /var/cache/conftool/dbconfig/20240118-110009-marostegui.json
10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54899 and previous config saved to /var/cache/conftool/dbconfig/20240118-104335-marostegui.json
10:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
10:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54898 and previous config saved to /var/cache/conftool/dbconfig/20240118-104313-marostegui.json
10:37 hashar@deploy2002: Finished deploy [integration/docroot@8f5aa9e]: Add Codex Icons package (duration: 00m 05s)
10:36 hashar@deploy2002: Started deploy [integration/docroot@8f5aa9e]: Add Codex Icons package
10:32 hashar@deploy2002: Finished deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310 (duration: 00m 07s)
10:32 hashar@deploy2002: Started deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310
10:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54896 and previous config saved to /var/cache/conftool/dbconfig/20240118-102806-marostegui.json
10:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2047.codfw.wmnet
10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
10:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2047.codfw.wmnet
10:19 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1047.eqiad.wmnet
10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1047.eqiad.wmnet
10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54894 and previous config saved to /var/cache/conftool/dbconfig/20240118-101300-marostegui.json
10:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2046.codfw.wmnet
10:09 Dreamy_Jazz: T351400 running on a tmux session `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-group2-sleep-0-non-jobqueue.txt`
10:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2046.codfw.wmnet
10:01 btullis: built and published updated openjdk-11 images based on: 11.0.21-s0-20240111
09:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54893 and previous config saved to /var/cache/conftool/dbconfig/20240118-095753-marostegui.json
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54892 and previous config saved to /var/cache/conftool/dbconfig/20240118-095522-marostegui.json
09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54891 and previous config saved to /var/cache/conftool/dbconfig/20240118-095500-marostegui.json
09:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1046.eqiad.wmnet
09:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54890 and previous config saved to /var/cache/conftool/dbconfig/20240118-093954-marostegui.json
09:30 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.14 refs T354432
09:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1046.eqiad.wmnet
09:25 godog: add 50G to prometheus@k8s-mlserve in codfw
09:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54889 and previous config saved to /var/cache/conftool/dbconfig/20240118-092447-marostegui.json
09:15 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-0-non-jobqueue.txt`
09:12 Dreamy_Jazz: stopped MediaModeration scanning script
09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54888 and previous config saved to /var/cache/conftool/dbconfig/20240118-090941-marostegui.json
09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54887 and previous config saved to /var/cache/conftool/dbconfig/20240118-090712-marostegui.json
09:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54886 and previous config saved to /var/cache/conftool/dbconfig/20240118-090649-marostegui.json
08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54885 and previous config saved to /var/cache/conftool/dbconfig/20240118-085143-marostegui.json
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54884 and previous config saved to /var/cache/conftool/dbconfig/20240118-083636-marostegui.json
08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54883 and previous config saved to /var/cache/conftool/dbconfig/20240118-082130-marostegui.json
08:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54882 and previous config saved to /var/cache/conftool/dbconfig/20240118-081900-marostegui.json
08:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54881 and previous config saved to /var/cache/conftool/dbconfig/20240118-081838-marostegui.json
08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54880 and previous config saved to /var/cache/conftool/dbconfig/20240118-080332-marostegui.json
07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54879 and previous config saved to /var/cache/conftool/dbconfig/20240118-074825-marostegui.json
07:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54878 and previous config saved to /var/cache/conftool/dbconfig/20240118-073319-marostegui.json
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54877 and previous config saved to /var/cache/conftool/dbconfig/20240118-073054-marostegui.json
07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54876 and previous config saved to /var/cache/conftool/dbconfig/20240118-073016-marostegui.json
07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54875 and previous config saved to /var/cache/conftool/dbconfig/20240118-071509-marostegui.json
07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54874 and previous config saved to /var/cache/conftool/dbconfig/20240118-070003-marostegui.json
06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54873 and previous config saved to /var/cache/conftool/dbconfig/20240118-064456-marostegui.json
06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54872 and previous config saved to /var/cache/conftool/dbconfig/20240118-064225-marostegui.json
06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54871 and previous config saved to /var/cache/conftool/dbconfig/20240118-064203-marostegui.json
06:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54870 and previous config saved to /var/cache/conftool/dbconfig/20240118-062657-marostegui.json
06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54869 and previous config saved to /var/cache/conftool/dbconfig/20240118-061150-marostegui.json
06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54868 and previous config saved to /var/cache/conftool/dbconfig/20240118-061138-ladsgroup.json
06:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
06:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54867 and previous config saved to /var/cache/conftool/dbconfig/20240118-061116-ladsgroup.json
05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54866 and previous config saved to /var/cache/conftool/dbconfig/20240118-055643-marostegui.json
05:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54865 and previous config saved to /var/cache/conftool/dbconfig/20240118-055609-ladsgroup.json
05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54864 and previous config saved to /var/cache/conftool/dbconfig/20240118-055419-marostegui.json
05:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
05:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54863 and previous config saved to /var/cache/conftool/dbconfig/20240118-054103-ladsgroup.json
05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54862 and previous config saved to /var/cache/conftool/dbconfig/20240118-052556-ladsgroup.json

2024-01-17

23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54861 and previous config saved to /var/cache/conftool/dbconfig/20240117-233655-ladsgroup.json
23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
22:01 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
21:55 catrope@deploy2002: Finished scap: Backport for gerrit:991049 Fix text overflow in history page (T354218) (duration: 09m 39s)
21:50 inflatador: bking@kafka-main2001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
21:49 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
21:47 catrope@deploy2002: jdlrobson and catrope: Backport for gerrit:991049 Fix text overflow in history page (T354218) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:47 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
21:45 catrope@deploy2002: Started scap: Backport for gerrit:991049 Fix text overflow in history page (T354218)
21:43 catrope@deploy2002: Finished scap: Backport for gerrit:990152 Enable desktop history page for all mobile logged in users (T353388) (duration: 15m 15s)
21:37 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
21:30 catrope@deploy2002: jdlrobson and catrope: Backport for gerrit:990152 Enable desktop history page for all mobile logged in users (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:28 catrope@deploy2002: Started scap: Backport for gerrit:990152 Enable desktop history page for all mobile logged in users (T353388)
21:16 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5`
21:15 inflatador: bking@kafka-main1001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
21:13 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
21:13 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
21:13 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.update.rc0 --partitions 5`
21:07 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
21:07 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
21:06 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
21:06 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
21:05 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
21:04 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
20:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54860 and previous config saved to /var/cache/conftool/dbconfig/20240117-201513-marostegui.json
20:05 mutante: LDAP - added uid=dimakoushha to groups wmde and nda (T354276)
20:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54859 and previous config saved to /var/cache/conftool/dbconfig/20240117-200006-marostegui.json
19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54858 and previous config saved to /var/cache/conftool/dbconfig/20240117-194500-marostegui.json
19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54857 and previous config saved to /var/cache/conftool/dbconfig/20240117-192953-marostegui.json
19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54856 and previous config saved to /var/cache/conftool/dbconfig/20240117-192737-marostegui.json
19:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
19:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54855 and previous config saved to /var/cache/conftool/dbconfig/20240117-192715-marostegui.json
19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54854 and previous config saved to /var/cache/conftool/dbconfig/20240117-191209-marostegui.json
19:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54853 and previous config saved to /var/cache/conftool/dbconfig/20240117-185703-marostegui.json
18:54 jnuche@deploy2002: Finished scap: deploying K8s config changes from T355243 (duration: 01m 42s)
18:52 jnuche@deploy2002: Started scap: deploying K8s config changes from T355243
18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54852 and previous config saved to /var/cache/conftool/dbconfig/20240117-184156-marostegui.json
18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54851 and previous config saved to /var/cache/conftool/dbconfig/20240117-183944-marostegui.json
18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54850 and previous config saved to /var/cache/conftool/dbconfig/20240117-183857-marostegui.json
18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54849 and previous config saved to /var/cache/conftool/dbconfig/20240117-182351-marostegui.json
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54848 and previous config saved to /var/cache/conftool/dbconfig/20240117-180844-marostegui.json
17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54847 and previous config saved to /var/cache/conftool/dbconfig/20240117-175338-marostegui.json
17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54846 and previous config saved to /var/cache/conftool/dbconfig/20240117-175120-marostegui.json
17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54845 and previous config saved to /var/cache/conftool/dbconfig/20240117-175059-marostegui.json
17:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
17:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54844 and previous config saved to /var/cache/conftool/dbconfig/20240117-173552-marostegui.json
17:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54843 and previous config saved to /var/cache/conftool/dbconfig/20240117-172045-marostegui.json
17:19 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
17:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
17:19 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host grafana2001.codfw.wmnet with OS bookworm
17:18 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
17:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
17:13 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
17:11 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
17:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54842 and previous config saved to /var/cache/conftool/dbconfig/20240117-170539-marostegui.json
17:05 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54841 and previous config saved to /var/cache/conftool/dbconfig/20240117-170327-marostegui.json
17:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
17:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54840 and previous config saved to /var/cache/conftool/dbconfig/20240117-170305-marostegui.json
17:02 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
17:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
16:57 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
16:48 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
16:48 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
16:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54839 and previous config saved to /var/cache/conftool/dbconfig/20240117-164759-marostegui.json
16:42 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host grafana2001.codfw.wmnet with OS bookworm
16:41 jforrester@deploy2002: Finished deploy [integration/docroot@f08a107]: I746134 for T354310 (duration: 00m 07s)
16:40 jforrester@deploy2002: Started deploy [integration/docroot@f08a107]: I746134 for T354310
16:39 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
16:39 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54838 and previous config saved to /var/cache/conftool/dbconfig/20240117-163252-marostegui.json
16:29 damilare: civicrm upgraded from 5ef5362f to d8b0c977
16:25 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
16:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
16:23 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
16:22 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
16:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54837 and previous config saved to /var/cache/conftool/dbconfig/20240117-161746-marostegui.json
16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54836 and previous config saved to /var/cache/conftool/dbconfig/20240117-161534-marostegui.json
16:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54835 and previous config saved to /var/cache/conftool/dbconfig/20240117-161512-marostegui.json
16:14 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
16:13 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
16:13 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
16:13 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54834 and previous config saved to /var/cache/conftool/dbconfig/20240117-160005-marostegui.json
15:54 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:54 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:49 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
15:49 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:45 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54833 and previous config saved to /var/cache/conftool/dbconfig/20240117-154459-marostegui.json
15:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2045.codfw.wmnet
15:38 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:38 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2045.codfw.wmnet
15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54832 and previous config saved to /var/cache/conftool/dbconfig/20240117-152953-marostegui.json
15:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1045.eqiad.wmnet
15:27 taavi: restart etherpad-lite.service on etherpad1003
15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54831 and previous config saved to /var/cache/conftool/dbconfig/20240117-152737-marostegui.json
15:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54830 and previous config saved to /var/cache/conftool/dbconfig/20240117-152715-marostegui.json
15:23 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1045.eqiad.wmnet
15:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::text
15:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
15:13 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54827 and previous config saved to /var/cache/conftool/dbconfig/20240117-151208-marostegui.json
15:10 Lucas_WMDE: UTC afternoon backport+config window done
15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991061 Exclude qqq from monolingual text languages (T341409) (duration: 07m 59s)
15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
15:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
15:02 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:991061 Exclude qqq from monolingual text languages (T341409) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:01 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991061 Exclude qqq from monolingual text languages (T341409)
14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54826 and previous config saved to /var/cache/conftool/dbconfig/20240117-145702-marostegui.json
14:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::text
14:51 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991062 Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), gerrit:991060 Only build result entries for used wbsearchentities results (T355053) (duration: 08m 28s)
14:49 claime: restarted rsyslog on kubernetes2048
14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:991062 Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), gerrit:991060 Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991062 Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), gerrit:991060 Only build result entries for used wbsearchentities results (T355053)
14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54824 and previous config saved to /var/cache/conftool/dbconfig/20240117-144156-marostegui.json
14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54823 and previous config saved to /var/cache/conftool/dbconfig/20240117-144039-marostegui.json
14:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54822 and previous config saved to /var/cache/conftool/dbconfig/20240117-144018-marostegui.json
14:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
14:25 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991059 Only build result entries for used wbsearchentities results (T355053) (duration: 09m 23s)
14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54821 and previous config saved to /var/cache/conftool/dbconfig/20240117-142511-marostegui.json
14:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
14:22 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
14:22 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
14:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
14:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54820 and previous config saved to /var/cache/conftool/dbconfig/20240117-142015-ladsgroup.json
14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:991059 Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991059 Only build result entries for used wbsearchentities results (T355053)
14:16 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:628773 Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) (duration: 11m 07s)
14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54819 and previous config saved to /var/cache/conftool/dbconfig/20240117-141005-marostegui.json
14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:628773 Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54818 and previous config saved to /var/cache/conftool/dbconfig/20240117-140509-ladsgroup.json
14:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:628773 Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441)
13:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54817 and previous config saved to /var/cache/conftool/dbconfig/20240117-135459-marostegui.json
13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54816 and previous config saved to /var/cache/conftool/dbconfig/20240117-135242-marostegui.json
13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54815 and previous config saved to /var/cache/conftool/dbconfig/20240117-135158-marostegui.json
13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54814 and previous config saved to /var/cache/conftool/dbconfig/20240117-135002-ladsgroup.json
13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54813 and previous config saved to /var/cache/conftool/dbconfig/20240117-133652-marostegui.json
13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1014.eqiad.wmnet
13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54812 and previous config saved to /var/cache/conftool/dbconfig/20240117-133456-ladsgroup.json
13:34 damilare: payments-wiki upgraded from 12d8ad5b to e38b24f0
13:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
13:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:30 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
13:30 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54811 and previous config saved to /var/cache/conftool/dbconfig/20240117-132145-marostegui.json
13:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
13:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54810 and previous config saved to /var/cache/conftool/dbconfig/20240117-130639-marostegui.json
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54809 and previous config saved to /var/cache/conftool/dbconfig/20240117-130422-marostegui.json
13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
13:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
13:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
13:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
13:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
12:59 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
12:58 taavi: removing vlan1119 interface on lvs1018 T355115
12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
12:47 taavi: removing vlan1119 interface on lvs1020 T355115
12:38 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54806 and previous config saved to /var/cache/conftool/dbconfig/20240117-122305-marostegui.json
12:22 hnowlan: setting mw[2267,2282,2357,2395] inactive in advance of reimaging
12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54805 and previous config saved to /var/cache/conftool/dbconfig/20240117-120758-marostegui.json
12:06 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
12:00 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
12:00 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
12:00 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
12:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2044.codfw.wmnet
12:00 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
11:59 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=mw2394.codfw.wmnet
11:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2044.codfw.wmnet
11:54 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54804 and previous config saved to /var/cache/conftool/dbconfig/20240117-115252-marostegui.json
11:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1044.eqiad.wmnet
11:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1044.eqiad.wmnet
11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
11:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: memcached
11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54803 and previous config saved to /var/cache/conftool/dbconfig/20240117-113745-marostegui.json
11:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: memcached
11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54802 and previous config saved to /var/cache/conftool/dbconfig/20240117-113432-marostegui.json
11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
11:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
11:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54801 and previous config saved to /var/cache/conftool/dbconfig/20240117-113410-marostegui.json
11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54800 and previous config saved to /var/cache/conftool/dbconfig/20240117-111904-marostegui.json
11:09 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
11:09 Dreamy_Jazz: stopped scanning script
11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54799 and previous config saved to /var/cache/conftool/dbconfig/20240117-110357-marostegui.json
10:49 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1043.eqiad.wmnet
10:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54798 and previous config saved to /var/cache/conftool/dbconfig/20240117-104851-marostegui.json
10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54797 and previous config saved to /var/cache/conftool/dbconfig/20240117-104438-marostegui.json
10:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
10:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54796 and previous config saved to /var/cache/conftool/dbconfig/20240117-104416-marostegui.json
10:43 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1043.eqiad.wmnet
10:33 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2043.codfw.wmnet
10:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54795 and previous config saved to /var/cache/conftool/dbconfig/20240117-102909-marostegui.json
10:26 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2043.codfw.wmnet
10:26 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:26 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2043.codfw.wmnet
10:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54793 and previous config saved to /var/cache/conftool/dbconfig/20240117-101403-marostegui.json
10:12 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2043.codfw.wmnet
09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54792 and previous config saved to /var/cache/conftool/dbconfig/20240117-095856-marostegui.json
09:58 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54791 and previous config saved to /var/cache/conftool/dbconfig/20240117-095544-marostegui.json
09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54790 and previous config saved to /var/cache/conftool/dbconfig/20240117-095521-marostegui.json
09:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1043.eqiad.wmnet
09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1042.eqiad.wmnet
09:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1043.eqiad.wmnet
09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1042.eqiad.wmnet
09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
09:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1042.eqiad.wmnet
09:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54789 and previous config saved to /var/cache/conftool/dbconfig/20240117-094015-marostegui.json
09:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1042.eqiad.wmnet
09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:30 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
09:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54788 and previous config saved to /var/cache/conftool/dbconfig/20240117-092507-marostegui.json
09:21 jnuche@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.14 refs T354432 (duration: 06m 15s)
09:15 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.14 refs T354432
09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54787 and previous config saved to /var/cache/conftool/dbconfig/20240117-091000-marostegui.json
09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54786 and previous config saved to /var/cache/conftool/dbconfig/20240117-090648-marostegui.json
09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54785 and previous config saved to /var/cache/conftool/dbconfig/20240117-090626-marostegui.json
09:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
08:56 dcausse@deploy2002: Finished scap: Backport for gerrit:990718 enable page_rerender for all wikis (T351503) (duration: 09m 15s)
08:55 moritzm: installing Python 2.7 security updates
08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54784 and previous config saved to /var/cache/conftool/dbconfig/20240117-085119-marostegui.json
08:50 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
08:48 dcausse@deploy2002: pfischer and dcausse: Backport for gerrit:990718 enable page_rerender for all wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:46 dcausse@deploy2002: Started scap: Backport for gerrit:990718 enable page_rerender for all wikis (T351503)
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54783 and previous config saved to /var/cache/conftool/dbconfig/20240117-083613-marostegui.json
08:23 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
08:22 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54782 and previous config saved to /var/cache/conftool/dbconfig/20240117-082106-marostegui.json
08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54781 and previous config saved to /var/cache/conftool/dbconfig/20240117-082001-ladsgroup.json
08:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54780 and previous config saved to /var/cache/conftool/dbconfig/20240117-081754-marostegui.json
08:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54779 and previous config saved to /var/cache/conftool/dbconfig/20240117-081731-marostegui.json
08:16 moritzm: installing python-git security updates
08:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54778 and previous config saved to /var/cache/conftool/dbconfig/20240117-080225-marostegui.json
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54777 and previous config saved to /var/cache/conftool/dbconfig/20240117-074719-marostegui.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54776 and previous config saved to /var/cache/conftool/dbconfig/20240117-073212-marostegui.json
07:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54775 and previous config saved to /var/cache/conftool/dbconfig/20240117-072902-marostegui.json
07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54774 and previous config saved to /var/cache/conftool/dbconfig/20240117-072824-marostegui.json
07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54773 and previous config saved to /var/cache/conftool/dbconfig/20240117-071317-marostegui.json
06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54772 and previous config saved to /var/cache/conftool/dbconfig/20240117-065811-marostegui.json
06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54771 and previous config saved to /var/cache/conftool/dbconfig/20240117-064304-marostegui.json
06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54770 and previous config saved to /var/cache/conftool/dbconfig/20240117-063951-marostegui.json
06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54769 and previous config saved to /var/cache/conftool/dbconfig/20240117-063929-marostegui.json
06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54768 and previous config saved to /var/cache/conftool/dbconfig/20240117-062422-marostegui.json
06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54767 and previous config saved to /var/cache/conftool/dbconfig/20240117-060916-marostegui.json
05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54766 and previous config saved to /var/cache/conftool/dbconfig/20240117-055409-marostegui.json
05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54765 and previous config saved to /var/cache/conftool/dbconfig/20240117-055056-marostegui.json
05:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
05:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
03:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54764 and previous config saved to /var/cache/conftool/dbconfig/20240117-033751-ladsgroup.json
03:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54763 and previous config saved to /var/cache/conftool/dbconfig/20240117-032245-ladsgroup.json
03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54762 and previous config saved to /var/cache/conftool/dbconfig/20240117-030738-ladsgroup.json
02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54761 and previous config saved to /var/cache/conftool/dbconfig/20240117-025232-ladsgroup.json
00:03 tstarling@deploy2002: Synchronized wmf-config: T344791 related cleanup (duration: 06m 22s)

2024-01-16

23:55 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: Disable wgUseSameSiteLegacyCookies T344791 (duration: 09m 19s)
21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54760 and previous config saved to /var/cache/conftool/dbconfig/20240116-214016-ladsgroup.json
21:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
21:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
20:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2297.codfw.wmnet with OS bullseye
20:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2296.codfw.wmnet with OS bullseye
20:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2295.codfw.wmnet with OS bullseye
20:26 ryankemper: T351650 Running puppet on `P:trafficserver::backend` following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/991091
20:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2294.codfw.wmnet with OS bullseye
20:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
20:20 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
20:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2292.codfw.wmnet with OS bullseye
20:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2293.codfw.wmnet with OS bullseye
20:13 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
20:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2291.codfw.wmnet with OS bullseye
20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
20:08 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
20:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
20:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2297.codfw.wmnet with OS bullseye
20:02 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
19:56 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2296.codfw.wmnet with OS bullseye
19:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
19:52 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2295.codfw.wmnet with OS bullseye
19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1375.eqiad.wmnet with OS bullseye
19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
19:48 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
19:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
19:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1376.eqiad.wmnet with OS bullseye
19:46 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2294.codfw.wmnet with OS bullseye
19:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1374.eqiad.wmnet with OS bullseye
19:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
19:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54759 and previous config saved to /var/cache/conftool/dbconfig/20240116-194509-marostegui.json
19:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1360.eqiad.wmnet with OS bullseye
19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2293.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2292.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2291.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1363.eqiad.wmnet with OS bullseye
19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54758 and previous config saved to /var/cache/conftool/dbconfig/20240116-193002-marostegui.json
19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1361.eqiad.wmnet with OS bullseye
19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1362.eqiad.wmnet with OS bullseye
19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
19:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
19:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54757 and previous config saved to /var/cache/conftool/dbconfig/20240116-191456-marostegui.json
19:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
19:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
19:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1376.eqiad.wmnet with OS bullseye
19:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
19:07 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1375.eqiad.wmnet with OS bullseye
19:06 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1374.eqiad.wmnet with OS bullseye
19:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
19:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54756 and previous config saved to /var/cache/conftool/dbconfig/20240116-185949-marostegui.json
18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54755 and previous config saved to /var/cache/conftool/dbconfig/20240116-185723-marostegui.json
18:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
18:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
18:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
18:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
18:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54754 and previous config saved to /var/cache/conftool/dbconfig/20240116-185626-marostegui.json
18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1363.eqiad.wmnet with OS bullseye
18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1362.eqiad.wmnet with OS bullseye
18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1361.eqiad.wmnet with OS bullseye
18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1360.eqiad.wmnet with OS bullseye
18:42 mutante: phab2002 - pulling repo data from phab1004 by running sync script created by rsync::quickdatacopy after gerrit:990247 T354221
18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54753 and previous config saved to /var/cache/conftool/dbconfig/20240116-184120-marostegui.json
18:38 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 1 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-non-job-queue.txt`
18:36 Dreamy_Jazz: stopped tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54752 and previous config saved to /var/cache/conftool/dbconfig/20240116-182613-marostegui.json
18:20 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
18:19 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
18:19 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
18:19 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
18:18 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
18:18 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54751 and previous config saved to /var/cache/conftool/dbconfig/20240116-181107-marostegui.json
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54750 and previous config saved to /var/cache/conftool/dbconfig/20240116-180841-marostegui.json
18:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54749 and previous config saved to /var/cache/conftool/dbconfig/20240116-180819-marostegui.json
17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54748 and previous config saved to /var/cache/conftool/dbconfig/20240116-175313-marostegui.json
17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54747 and previous config saved to /var/cache/conftool/dbconfig/20240116-173806-marostegui.json
17:32 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1460.eqiad.wmnet with OS bullseye
17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54746 and previous config saved to /var/cache/conftool/dbconfig/20240116-172300-marostegui.json
17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54745 and previous config saved to /var/cache/conftool/dbconfig/20240116-172032-marostegui.json
17:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
17:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54744 and previous config saved to /var/cache/conftool/dbconfig/20240116-172011-marostegui.json
17:14 topranks: Disabling puppet and PyBal on lvs2012 ahead of migration of network link to lsw1-b2-codfw T352909
17:12 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
17:11 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
17:11 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
17:10 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54743 and previous config saved to /var/cache/conftool/dbconfig/20240116-170503-marostegui.json
16:56 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
16:56 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
16:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw1460.eqiad.wmnet with OS bullseye
16:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54742 and previous config saved to /var/cache/conftool/dbconfig/20240116-164957-marostegui.json
16:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54741 and previous config saved to /var/cache/conftool/dbconfig/20240116-163449-marostegui.json
16:33 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
16:33 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54740 and previous config saved to /var/cache/conftool/dbconfig/20240116-163224-marostegui.json
16:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
16:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54739 and previous config saved to /var/cache/conftool/dbconfig/20240116-163203-marostegui.json
16:22 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969 (duration: 00m 50s)
16:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969
16:21 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969 (duration: 00m 27s)
16:21 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969
16:20 mutante: phabricator deploy is imminent
16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
16:20 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:19 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54738 and previous config saved to /var/cache/conftool/dbconfig/20240116-161656-marostegui.json
16:03 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
16:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54737 and previous config saved to /var/cache/conftool/dbconfig/20240116-160150-marostegui.json
16:00 Dreamy_Jazz: stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt
15:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
15:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54736 and previous config saved to /var/cache/conftool/dbconfig/20240116-154643-marostegui.json
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54735 and previous config saved to /var/cache/conftool/dbconfig/20240116-154419-marostegui.json
15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54734 and previous config saved to /var/cache/conftool/dbconfig/20240116-154357-marostegui.json
15:29 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt
15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54733 and previous config saved to /var/cache/conftool/dbconfig/20240116-152850-marostegui.json
15:28 Dreamy_Jazz: stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-25.txt
15:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
15:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
15:19 topranks: Disabling puppet and PyBal on lvs2013 ahead of migration of network link to ssw1-a1-codfw T352784
15:18 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
15:18 Dreamy_Jazz: Stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
15:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54732 and previous config saved to /var/cache/conftool/dbconfig/20240116-151344-marostegui.json
15:13 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
15:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:07 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
14:58 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
14:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54731 and previous config saved to /var/cache/conftool/dbconfig/20240116-145837-marostegui.json
14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54730 and previous config saved to /var/cache/conftool/dbconfig/20240116-145613-marostegui.json
14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
14:55 cmooney@cumin1002: START - Cookbook sre.dns.netbox
14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54729 and previous config saved to /var/cache/conftool/dbconfig/20240116-145458-marostegui.json
14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54728 and previous config saved to /var/cache/conftool/dbconfig/20240116-143951-marostegui.json
14:33 moritzm: installing ca-certificates-java bugfix updates on bookworm
14:31 Dreamy_Jazz: UTC afternoon deploys done
14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54727 and previous config saved to /var/cache/conftool/dbconfig/20240116-142444-marostegui.json
14:24 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:990760 Add more statsd counters and add logstash logging (T351419) (duration: 07m 15s)
14:18 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
14:18 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:990760 Add more statsd counters and add logstash logging (T351419) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:17 moritzm: installing 5.10.205 kernels on buster hosts running the 5.10 backport
14:16 dreamyjazz@deploy2002: Started scap: Backport for gerrit:990760 Add more statsd counters and add logstash logging (T351419)
14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1041.eqiad.wmnet
14:11 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:990754 Support parallel PhotoDNA requests (T354408) (duration: 07m 14s)
14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54726 and previous config saved to /var/cache/conftool/dbconfig/20240116-140938-marostegui.json
14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1041.eqiad.wmnet
14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54725 and previous config saved to /var/cache/conftool/dbconfig/20240116-140713-marostegui.json
14:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:990754 Support parallel PhotoDNA requests (T354408) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:04 dreamyjazz@deploy2002: Started scap: Backport for gerrit:990754 Support parallel PhotoDNA requests (T354408)
13:54 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS bullseye
13:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
13:15 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
13:09 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
13:09 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
13:08 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
13:08 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
13:06 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
13:05 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:02 effie: reimage mc-wf1001 (part of puppet7 migration)
13:01 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS bullseye
12:57 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1040.eqiad.wmnet
12:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
12:52 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1040.eqiad.wmnet
12:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
12:30 moritzm: installing systemd bugfix updates from Bullseye point release
12:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-wf1001.eqiad.wmnet
12:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
12:11 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
12:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-wf1001.eqiad.wmnet
11:56 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.14 refs T354432
11:45 jnuche@deploy2002: Finished scap: Backport for gerrit:990752 PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) (duration: 29m 32s)
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2041.codfw.wmnet
11:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2041.codfw.wmnet
11:36 jnuche@deploy2002: jnuche and kharlan: Continuing with sync
11:36 jnuche@deploy2002: jnuche and kharlan: Backport for gerrit:990752 PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2040.codfw.wmnet
11:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2040.codfw.wmnet
11:23 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1041.eqiad.wmnet
11:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
11:16 jnuche@deploy2002: Started scap: Backport for gerrit:990752 PreAuthenticationProvider: Deny account creation based on ipoid data (T354928)
11:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1041.eqiad.wmnet
11:13 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1040.eqiad.wmnet
11:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1040.eqiad.wmnet
10:59 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 29m 36s)
10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1039.eqiad.wmnet
10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1039.eqiad.wmnet
10:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2039.codfw.wmnet
10:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
10:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2039.codfw.wmnet
10:30 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432
10:29 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2038.codfw.wmnet
10:21 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1038.eqiad.wmnet
10:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2038.codfw.wmnet
10:16 godog: clean up also 1.42.0-wmf.9 1.42.0-wmf.10 1.42.0-wmf.12 from mw22* - T355117
10:15 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1038.eqiad.wmnet
10:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1039.eqiad.wmnet
10:10 godog: manually pruning php-1.42.0-wmf.7 from mw22* - T355117
10:07 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1039.eqiad.wmnet
10:06 jnuche@deploy2002: Pruned MediaWiki: 1.42.0-wmf.7, 1.42.0-wmf.9, 1.42.0-wmf.10, 1.42.0-wmf.12 (duration: 07m 08s)
10:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1038.eqiad.wmnet
10:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1038.eqiad.wmnet
09:51 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 52m 52s)
09:28 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
09:26 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
09:25 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:23 taavi@cumin1002: START - Cookbook sre.dns.netbox
09:05 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Daniram3 out of all services on: 2211 hosts
09:04 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm - T352665
09:03 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm
09:03 root@cumin2002: START - Cookbook sre.idm.logout Logging Daniram3 out of all services on: 2211 hosts
08:59 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432

2024-01-15

21:46 reedy@deploy2002: Synchronized wmf-config/: Fix more stringified class names (duration: 06m 29s)
21:37 fab@deploy2002: Finished deploy [airflow-dags/research@9b6a69a]: (no justification provided) (duration: 00m 27s)
21:37 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: Swap stringified class names in ConfirmEdit usages (duration: 06m 30s)
21:36 fab@deploy2002: Started deploy [airflow-dags/research@9b6a69a]: (no justification provided)
21:23 tgr: UTC late deploys done
21:22 tgr@deploy2002: Finished scap: Backport for gerrit:990164 Log emails in production (duration: 09m 11s)
21:15 tgr@deploy2002: tgr: Continuing with sync
21:14 tgr@deploy2002: tgr: Backport for gerrit:990164 Log emails in production synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:12 tgr@deploy2002: Started scap: Backport for gerrit:990164 Log emails in production
19:23 tzatziki: creating the u4c2024_edits table on all wikis
17:55 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
17:48 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
17:23 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
17:02 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons.
17:00 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
16:51 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
16:45 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1005.eqiad.wmnet
16:45 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:45 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
15:26 hnowlan: depooled jobrunner mw1460 to repurpose as k8s node
15:06 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
15:03 btullis@cumin1002: START - Cookbook sre.dns.netbox
14:59 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
14:47 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1005.eqiad.wmnet
14:38 Lucas_WMDE: UTC afternoon backport+config window done
14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:989747 cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) (duration: 11m 36s)
14:28 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
14:28 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
14:26 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
14:25 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
14:24 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
14:24 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for gerrit:989747 cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:989747 cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425)
13:49 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
13:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2003.codfw.wmnet
13:19 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2003.codfw.wmnet
13:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2002.codfw.wmnet
13:12 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2002.codfw.wmnet
13:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2001.codfw.wmnet
13:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1003.eqiad.wmnet
13:05 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2001.codfw.wmnet
13:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1003.eqiad.wmnet
13:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1003.eqiad.wmnet
13:00 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:00 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mediawiki::memcached::gutter
12:59 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
12:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mediawiki::memcached::gutter
12:42 btullis@cumin1002: START - Cookbook sre.dns.netbox
12:39 effie: enable puppet on mc* hosts - T349619
12:37 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1003.eqiad.wmnet
12:23 effie: stopping puppet on all mediawiki memcached hosts (mc*, mc-gp*), puppet 7 migration in progress - T349619
12:01 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 92 hosts
12:00 btullis@cumin1002: START - Cookbook sre.hosts.remove-downtime for 92 hosts
11:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
11:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1037.eqiad.wmnet
11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
11:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
11:07 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
11:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1037.eqiad.wmnet
10:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1002.eqiad.wmnet
10:51 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1002.eqiad.wmnet
10:48 moritzm: installing systemd bugfix updates from Bullseye point release
10:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1037.eqiad.wmnet
10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1037.eqiad.wmnet
10:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-gp1002.eqiad.wmnet
10:02 ladsgroup@deploy2002: Finished scap: Backport for gerrit:990424 SecurePoll: Adding updated voterlist files (T349263) (duration: 16m 04s)
09:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-gp1002.eqiad.wmnet
09:56 ladsgroup@deploy2002: ladsgroup: Continuing with sync
09:48 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:990424 SecurePoll: Adding updated voterlist files (T349263) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:46 ladsgroup@deploy2002: Started scap: Backport for gerrit:990424 SecurePoll: Adding updated voterlist files (T349263)
09:16 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:14 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
08:45 filippo@deploy2002: Finished deploy [performance/arc-lamp@67389a0]: (no justification provided) (duration: 00m 05s)
08:45 filippo@deploy2002: Started deploy [performance/arc-lamp@67389a0]: (no justification provided)
08:23 dcausse@deploy2002: Finished scap: Backport for gerrit:990029 enable page_rerender for 5th batch of wikis (T351503) (duration: 11m 40s)
08:17 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
08:13 dcausse@deploy2002: pfischer and dcausse: Backport for gerrit:990029 enable page_rerender for 5th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:12 dcausse@deploy2002: Started scap: Backport for gerrit:990029 enable page_rerender for 5th batch of wikis (T351503)
04:57 andrewbogott: restarting wikitech-static, oom

2024-01-14

15:47 taavi@deploy2002: Finished scap: Backport for gerrit:990396 Log IpReputation channel as debug (T354928) (duration: 26m 49s)
15:36 taavi@deploy2002: taavi: Continuing with sync
15:35 taavi@deploy2002: taavi: Backport for gerrit:990396 Log IpReputation channel as debug (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:20 taavi@deploy2002: Started scap: Backport for gerrit:990396 Log IpReputation channel as debug (T354928)
15:01 andrewbogott: manually emptying /srv/mediawiki/images/wikitech/archive on wikitech-static; the maintenance script didn't do it and the host is failing due to a full disk
15:01 andrewbogott: running deleteArchivedFiles.php on wikitech-static

2024-01-12

23:49 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
23:47 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
22:52 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
22:51 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
22:29 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
22:28 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
18:07 mutante: aphlict1002 - systemctl start logrotate
17:18 tchanders@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
17:18 tchanders@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
17:17 tchanders@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
17:16 tchanders@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
17:10 tchanders@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
17:09 tchanders@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
16:52 cgoubert@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
16:52 cgoubert@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
16:51 cgoubert@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
16:51 cgoubert@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
16:20 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
16:20 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
16:19 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
15:46 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
15:37 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
15:14 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
15:14 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
14:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
14:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54714 and previous config saved to /var/cache/conftool/dbconfig/20240112-140423-marostegui.json
13:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54713 and previous config saved to /var/cache/conftool/dbconfig/20240112-134916-marostegui.json
13:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54712 and previous config saved to /var/cache/conftool/dbconfig/20240112-133410-marostegui.json
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54711 and previous config saved to /var/cache/conftool/dbconfig/20240112-131904-marostegui.json
12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54710 and previous config saved to /var/cache/conftool/dbconfig/20240112-125944-marostegui.json
12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
12:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54709 and previous config saved to /var/cache/conftool/dbconfig/20240112-125921-marostegui.json
12:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54708 and previous config saved to /var/cache/conftool/dbconfig/20240112-124416-marostegui.json
12:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dewiki --logwiki=metawiki 'Osip Knecht' 'Artquichotte39'
12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54707 and previous config saved to /var/cache/conftool/dbconfig/20240112-122909-marostegui.json
12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54706 and previous config saved to /var/cache/conftool/dbconfig/20240112-121402-marostegui.json
12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54704 and previous config saved to /var/cache/conftool/dbconfig/20240112-121150-marostegui.json
12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
12:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54703 and previous config saved to /var/cache/conftool/dbconfig/20240112-121127-marostegui.json
12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
12:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54701 and previous config saved to /var/cache/conftool/dbconfig/20240112-115621-marostegui.json
11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54700 and previous config saved to /var/cache/conftool/dbconfig/20240112-114114-marostegui.json
11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54699 and previous config saved to /var/cache/conftool/dbconfig/20240112-112608-marostegui.json
11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54698 and previous config saved to /var/cache/conftool/dbconfig/20240112-112049-marostegui.json
11:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54697 and previous config saved to /var/cache/conftool/dbconfig/20240112-112027-marostegui.json
11:10 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
11:08 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54696 and previous config saved to /var/cache/conftool/dbconfig/20240112-110521-marostegui.json
11:04 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54695 and previous config saved to /var/cache/conftool/dbconfig/20240112-105014-marostegui.json
10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54694 and previous config saved to /var/cache/conftool/dbconfig/20240112-103508-marostegui.json
10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54693 and previous config saved to /var/cache/conftool/dbconfig/20240112-103250-marostegui.json
10:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
10:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54692 and previous config saved to /var/cache/conftool/dbconfig/20240112-103227-marostegui.json
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54691 and previous config saved to /var/cache/conftool/dbconfig/20240112-101721-marostegui.json
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54690 and previous config saved to /var/cache/conftool/dbconfig/20240112-100214-marostegui.json
09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54689 and previous config saved to /var/cache/conftool/dbconfig/20240112-094708-marostegui.json
09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54688 and previous config saved to /var/cache/conftool/dbconfig/20240112-094451-marostegui.json
09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54687 and previous config saved to /var/cache/conftool/dbconfig/20240112-094413-marostegui.json
09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54686 and previous config saved to /var/cache/conftool/dbconfig/20240112-092907-marostegui.json
09:25 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
09:25 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
09:17 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
09:16 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54685 and previous config saved to /var/cache/conftool/dbconfig/20240112-091400-marostegui.json
09:09 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
08:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54684 and previous config saved to /var/cache/conftool/dbconfig/20240112-085854-marostegui.json
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54683 and previous config saved to /var/cache/conftool/dbconfig/20240112-085637-marostegui.json
08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
08:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54682 and previous config saved to /var/cache/conftool/dbconfig/20240112-085614-marostegui.json
08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54681 and previous config saved to /var/cache/conftool/dbconfig/20240112-084108-marostegui.json
08:40 godog: upload and finish upgrade of prometheus 2.48 on all sites - T354399
08:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54680 and previous config saved to /var/cache/conftool/dbconfig/20240112-083837-root.json
08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54679 and previous config saved to /var/cache/conftool/dbconfig/20240112-082601-marostegui.json
08:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54678 and previous config saved to /var/cache/conftool/dbconfig/20240112-082332-root.json
08:20 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 3605
08:19 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 3605
08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54677 and previous config saved to /var/cache/conftool/dbconfig/20240112-081055-marostegui.json
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54676 and previous config saved to /var/cache/conftool/dbconfig/20240112-080837-marostegui.json
08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54675 and previous config saved to /var/cache/conftool/dbconfig/20240112-080827-root.json
08:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54674 and previous config saved to /var/cache/conftool/dbconfig/20240112-080815-marostegui.json
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54673 and previous config saved to /var/cache/conftool/dbconfig/20240112-075322-root.json
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54672 and previous config saved to /var/cache/conftool/dbconfig/20240112-075309-marostegui.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54671 and previous config saved to /var/cache/conftool/dbconfig/20240112-073817-root.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54670 and previous config saved to /var/cache/conftool/dbconfig/20240112-073802-marostegui.json
07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54669 and previous config saved to /var/cache/conftool/dbconfig/20240112-072312-root.json
07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54668 and previous config saved to /var/cache/conftool/dbconfig/20240112-072255-marostegui.json
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54667 and previous config saved to /var/cache/conftool/dbconfig/20240112-072038-marostegui.json
07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
07:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54666 and previous config saved to /var/cache/conftool/dbconfig/20240112-072015-marostegui.json
07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54665 and previous config saved to /var/cache/conftool/dbconfig/20240112-070807-root.json
07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54664 and previous config saved to /var/cache/conftool/dbconfig/20240112-070508-marostegui.json
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1168.eqiad.wmnet with OS bookworm
06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54663 and previous config saved to /var/cache/conftool/dbconfig/20240112-065002-marostegui.json
06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54662 and previous config saved to /var/cache/conftool/dbconfig/20240112-063456-marostegui.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54661 and previous config saved to /var/cache/conftool/dbconfig/20240112-063239-marostegui.json
06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
06:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
06:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
06:23 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1168.eqiad.wmnet with OS bookworm
06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1168 T354506', diff saved to https://phabricator.wikimedia.org/P54660 and previous config saved to /var/cache/conftool/dbconfig/20240112-062137-marostegui.json
06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
06:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
04:12 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
04:12 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
04:12 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
04:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
04:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
04:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
00:59 mutante: LDAP - added myself to gerritadmin group

2024-01-11

21:36 jan_drewniak: https://phabricator.wikimedia.org/T349337#9454773 running maintenance script to delete unnecessary user preferences.
21:26 jdrewniak@deploy2002: Finished scap: Backport for gerrit:985647 InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), gerrit:984288 InitialiseSettings.php: Allow thanking bots (T341388) (duration: 13m 43s)
21:20 jdrewniak@deploy2002: jdrewniak and houseblaster: Continuing with sync
21:14 jdrewniak@deploy2002: jdrewniak and houseblaster: Backport for gerrit:985647 InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), gerrit:984288 InitialiseSettings.php: Allow thanking bots (T341388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:12 jdrewniak@deploy2002: Started scap: Backport for gerrit:985647 InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), gerrit:984288 InitialiseSettings.php: Allow thanking bots (T341388)
20:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
20:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54657 and previous config saved to /var/cache/conftool/dbconfig/20240111-205021-marostegui.json
20:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54656 and previous config saved to /var/cache/conftool/dbconfig/20240111-203514-marostegui.json
20:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54655 and previous config saved to /var/cache/conftool/dbconfig/20240111-202008-marostegui.json
20:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54654 and previous config saved to /var/cache/conftool/dbconfig/20240111-200502-marostegui.json
20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54653 and previous config saved to /var/cache/conftool/dbconfig/20240111-200253-marostegui.json
20:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54652 and previous config saved to /var/cache/conftool/dbconfig/20240111-200209-marostegui.json
20:00 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@07f5320]: (no justification provided) (duration: 00m 27s)
20:00 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@07f5320]: (no justification provided)
19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54651 and previous config saved to /var/cache/conftool/dbconfig/20240111-194703-marostegui.json
19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54649 and previous config saved to /var/cache/conftool/dbconfig/20240111-193156-marostegui.json
19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54647 and previous config saved to /var/cache/conftool/dbconfig/20240111-191650-marostegui.json
19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54646 and previous config saved to /var/cache/conftool/dbconfig/20240111-191440-marostegui.json
19:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
19:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54645 and previous config saved to /var/cache/conftool/dbconfig/20240111-191418-marostegui.json
19:11 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.13 refs T350089
19:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54644 and previous config saved to /var/cache/conftool/dbconfig/20240111-185912-marostegui.json
18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54643 and previous config saved to /var/cache/conftool/dbconfig/20240111-184405-marostegui.json
18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54641 and previous config saved to /var/cache/conftool/dbconfig/20240111-182859-marostegui.json
18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54640 and previous config saved to /var/cache/conftool/dbconfig/20240111-182745-marostegui.json
18:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54639 and previous config saved to /var/cache/conftool/dbconfig/20240111-182723-marostegui.json
18:27 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org) (duration: 00m 07s)
18:27 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org)
18:25 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only) (duration: 00m 05s)
18:25 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only)
18:23 thcipriani: deploying gerrit to remove devsat survey (no restart needed)
18:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54638 and previous config saved to /var/cache/conftool/dbconfig/20240111-181217-marostegui.json
17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54637 and previous config saved to /var/cache/conftool/dbconfig/20240111-175710-marostegui.json
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54636 and previous config saved to /var/cache/conftool/dbconfig/20240111-174204-marostegui.json
17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54635 and previous config saved to /var/cache/conftool/dbconfig/20240111-173955-marostegui.json
17:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
17:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54634 and previous config saved to /var/cache/conftool/dbconfig/20240111-173933-marostegui.json
17:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54633 and previous config saved to /var/cache/conftool/dbconfig/20240111-172427-marostegui.json
17:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54632 and previous config saved to /var/cache/conftool/dbconfig/20240111-170920-marostegui.json
16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54631 and previous config saved to /var/cache/conftool/dbconfig/20240111-165414-marostegui.json
16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54630 and previous config saved to /var/cache/conftool/dbconfig/20240111-165305-marostegui.json
16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
16:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54629 and previous config saved to /var/cache/conftool/dbconfig/20240111-165244-marostegui.json
16:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54628 and previous config saved to /var/cache/conftool/dbconfig/20240111-163738-marostegui.json
16:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54626 and previous config saved to /var/cache/conftool/dbconfig/20240111-162231-marostegui.json
16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54625 and previous config saved to /var/cache/conftool/dbconfig/20240111-160725-marostegui.json
16:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::upload
16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54624 and previous config saved to /var/cache/conftool/dbconfig/20240111-160516-marostegui.json
16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
16:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54623 and previous config saved to /var/cache/conftool/dbconfig/20240111-160454-marostegui.json
15:59 sukhe: restart pybal on lvs4010
15:58 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe
15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54622 and previous config saved to /var/cache/conftool/dbconfig/20240111-154947-marostegui.json
15:47 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe
15:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::upload
15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54621 and previous config saved to /var/cache/conftool/dbconfig/20240111-153441-marostegui.json
15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54620 and previous config saved to /var/cache/conftool/dbconfig/20240111-151934-marostegui.json
15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54619 and previous config saved to /var/cache/conftool/dbconfig/20240111-151724-marostegui.json
15:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
15:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54618 and previous config saved to /var/cache/conftool/dbconfig/20240111-151702-marostegui.json
15:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54617 and previous config saved to /var/cache/conftool/dbconfig/20240111-150156-marostegui.json
14:51 reedy@deploy2002: Synchronized wmf-config/: T325147 (duration: 06m 43s)
14:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54616 and previous config saved to /var/cache/conftool/dbconfig/20240111-144649-marostegui.json
14:36 reedy@deploy2002: Synchronized wmf-config/: T344398 (duration: 07m 25s)
14:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54615 and previous config saved to /var/cache/conftool/dbconfig/20240111-143143-marostegui.json
14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54614 and previous config saved to /var/cache/conftool/dbconfig/20240111-143034-marostegui.json
14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
14:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
14:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
14:26 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
14:25 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
14:25 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
14:25 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
14:24 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:24 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:21 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: T205347 (duration: 07m 41s)
14:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54613 and previous config saved to /var/cache/conftool/dbconfig/20240111-141058-root.json
13:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54612 and previous config saved to /var/cache/conftool/dbconfig/20240111-135553-root.json
13:49 hashar@deploy2002: Finished deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959 (duration: 00m 07s)
13:49 hashar@deploy2002: Started deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959
13:41 moritzm: installing xerces-c security updates
13:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54611 and previous config saved to /var/cache/conftool/dbconfig/20240111-134048-root.json
13:29 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
13:29 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54610 and previous config saved to /var/cache/conftool/dbconfig/20240111-132543-root.json
13:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54609 and previous config saved to /var/cache/conftool/dbconfig/20240111-131038-root.json
12:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54608 and previous config saved to /var/cache/conftool/dbconfig/20240111-125533-root.json
12:47 hashar: Restarting Gerrit to apply config change https://gerrit.wikimedia.org/r/c/operations/puppet/+/989735/ # T206049
12:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54607 and previous config saved to /var/cache/conftool/dbconfig/20240111-124028-root.json
12:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2124.codfw.wmnet with OS bookworm
12:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
12:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
12:00 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
12:00 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
11:59 moritzm: installing Python 2.7 security updates on Bullseye
11:50 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2124.codfw.wmnet with OS bookworm
11:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2124 T354506', diff saved to https://phabricator.wikimedia.org/P54606 and previous config saved to /var/cache/conftool/dbconfig/20240111-114930-marostegui.json
11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54605 and previous config saved to /var/cache/conftool/dbconfig/20240111-111958-root.json
11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54604 and previous config saved to /var/cache/conftool/dbconfig/20240111-110453-root.json
10:54 moritzm: installing Linux 5.10.205 updates on Bullseye hosts
10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54603 and previous config saved to /var/cache/conftool/dbconfig/20240111-104948-root.json
10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54602 and previous config saved to /var/cache/conftool/dbconfig/20240111-103443-root.json
10:31 moritzm: installing exim4 security updates
10:31 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
10:30 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
10:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: druid::public::worker
10:26 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
10:26 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
10:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54601 and previous config saved to /var/cache/conftool/dbconfig/20240111-101938-root.json
10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: druid::public::worker
10:12 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
10:12 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
10:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54600 and previous config saved to /var/cache/conftool/dbconfig/20240111-100433-root.json
10:04 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
10:03 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
10:03 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
10:00 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
10:00 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
09:58 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
09:53 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
09:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54599 and previous config saved to /var/cache/conftool/dbconfig/20240111-094928-root.json
09:39 hashar: Gerrit back up and operational, now running version 3.6.8
09:33 hashar: Gerrit restarted and it is reindexing all changes T309870
09:23 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 07s)
09:23 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
09:22 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 27s)
09:21 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
09:21 hashar: Stopping Gerrit
09:10 hashar: gerrit: `ssh -p 29418 gerrit.wikimedia.org gerrit copy-approvals` # T309870
09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS bookworm
08:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
08:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage

2024-01-10

22:29 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
22:05 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
21:54 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
21:36 Dreamy_Jazz: UTC late deploys done
21:33 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:989569 Add comment to clarify which rate limits apply to temporary users (T331576) (duration: 08m 05s)
21:28 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Continuing with sync
21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Backport for gerrit:989569 Add comment to clarify which rate limits apply to temporary users (T331576) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:25 dreamyjazz@deploy2002: Started scap: Backport for gerrit:989569 Add comment to clarify which rate limits apply to temporary users (T331576)
21:19 taavi@deploy2002: Finished scap: Backport for gerrit:989262 Disable max width for index namespace (T352162) (duration: 14m 19s)
21:12 taavi@deploy2002: toyofuku and taavi: Continuing with sync
21:08 taavi@deploy2002: toyofuku and taavi: Backport for gerrit:989262 Disable max width for index namespace (T352162) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:05 taavi@deploy2002: Started scap: Backport for gerrit:989262 Disable max width for index namespace (T352162)
20:22 sukhe: enable puppet on lvs2013: T352758
19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
19:28 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
19:26 jhuneidi@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.13 refs T350089 (duration: 07m 58s)
19:24 cmooney@cumin1002: START - Cookbook sre.dns.netbox
19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.13 refs T350089
19:00 topranks: disabling OSPF connection from mr1-codfw to codfw core routers T348164
18:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
18:38 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
18:37 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
18:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
18:35 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for prometheus2005.codfw.wmnet
18:35 filippo@cumin1002: START - Cookbook sre.hosts.remove-downtime for prometheus2005.codfw.wmnet
18:24 sukhe: stop pybal on lvs2013: T352758
17:59 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
17:58 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
17:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
17:47 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:40 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:28 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
17:27 sukhe: enable puppet on lvs2014: T352758
17:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
17:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
17:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
17:09 cmooney@cumin1002: START - Cookbook sre.dns.netbox
16:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
16:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
16:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
16:36 godog: upgrade prometheus on prometheus2006 - T354399
16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
16:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
16:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
16:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
16:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
15:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
15:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
15:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
15:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::data
15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
15:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
15:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
15:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
15:34 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
15:24 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::data
15:24 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
15:22 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
15:21 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
15:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
15:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::collector
15:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
15:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
15:14 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
15:13 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
15:13 klausman@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-staging2001.codfw.wmnet
15:12 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
15:07 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbproxy[1018-1019].eqiad.wmnet
15:06 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:06 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
15:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
15:04 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
15:03 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
15:01 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
15:01 sukhe: disable puppet and stop pybal on lvs2014: T352758
15:00 taavi@cumin1002: START - Cookbook sre.dns.netbox
14:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
14:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::collector
14:54 topranks: adding vlans to ssw1-a8-codfw to trunk to lvs2014 T352758
14:52 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbproxy[1018-1019].eqiad.wmnet
14:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: lvs::balancer
14:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:39 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
14:27 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: lvs::balancer
14:27 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
14:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:26 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
14:26 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:25 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
14:24 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
14:22 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
14:21 moritzm: installing lapack bugfix updates
14:21 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
14:04 moritzm: installing openblas bugfix updates
14:03 hashar: Switching operations-puppet-tests-buster-docker Jenkins job from tox v3 to tox v4 | T345152
13:56 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
13:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
13:54 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
13:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
13:15 godog: test prometheus 2.48.1 on prometheus1005 - T354399
12:48 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-workers (exit_code=99) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
12:47 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1006.eqiad.wmnet
12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:37 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:37 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
12:37 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
12:37 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
12:37 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
12:35 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
12:22 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1006.eqiad.wmnet
12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1005.eqiad.wmnet
12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:20 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:18 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
12:05 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1005.eqiad.wmnet
11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1004.eqiad.wmnet
11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
11:54 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
11:51 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
11:47 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
11:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
11:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
11:41 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
11:41 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:41 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:39 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1004.eqiad.wmnet
11:37 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:37 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:36 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:36 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:03 moritzm: installing PHP 7.3 security updates
10:46 moritzm: installing curl security updates
10:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testreduce1001.eqiad.wmnet
10:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:02 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:01 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: sync
10:00 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: sync
10:00 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
10:00 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
09:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:55 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: (no justification provided) (duration: 00m 04s)
09:55 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: (no justification provided)
09:55 moritzm: installing git security updates on deployment hosts
09:53 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354 (duration: 00m 06s)
09:53 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testreduce1001.eqiad.wmnet
09:53 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354
09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133
09:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 15133
08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13150
08:57 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 13150
08:47 dcausse@deploy2002: Finished scap: Backport for gerrit:989442 enable page_rerender for 4th batch of wikis (T351503) (duration: 11m 50s)
08:42 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
08:41 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
08:41 moritzm: installing Exim security updates
08:40 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
08:37 dcausse@deploy2002: pfischer and dcausse: Backport for gerrit:989442 enable page_rerender for 4th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:35 dcausse@deploy2002: Started scap: Backport for gerrit:989442 enable page_rerender for 4th batch of wikis (T351503)
08:12 kartik@deploy2002: Finished scap: Backport for gerrit:988984 testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) (duration: 09m 10s)
08:06 kartik@deploy2002: kartik: Continuing with sync
08:04 kartik@deploy2002: kartik: Backport for gerrit:988984 testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:03 kartik@deploy2002: Started scap: Backport for gerrit:988984 testwiki: Enable Section translation on WPs with Content Translation available as default (T351882)
07:53 moritzm: installing openjdk-8 security updates
07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS bookworm
06:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
06:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
06:32 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm

2024-01-09

21:23 aqu@deploy2002: Finished deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f] (duration: 00m 28s)
21:22 aqu@deploy2002: Started deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f]
21:21 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f] (duration: 00m 12s)
21:21 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f]
21:03 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error) (duration: 00m 05s)
21:03 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error)
21:02 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (duration: 03m 33s)
20:59 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c]
20:59 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c] (duration: 00m 06s)
20:58 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c]
20:58 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c] (duration: 09m 06s)
20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2019.codfw.wmnet
20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2014.codfw.wmnet
20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2013.codfw.wmnet
20:49 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c]
20:48 aqu: about to deploy analytics/refinery - weekly train
20:40 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.13 refs T350089
20:26 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 23m 33s)
20:03 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
19:44 mutante: mwmaint1002 - rm -rf 1.42.0-wmf.7 ; mwmaint2002 - rm -rf php-1.39.0-wmf.25
19:35 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.40.0-wmf.17
19:33 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.39.0-wmf.25 after monitoring alerted about 99% disk usage on /srv
19:26 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.42.0-wmf.12 refs T350089
19:16 urandom: decommissioning cassandra, restbase2013-{a,b,c} — T352469
19:14 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 45m 48s)
18:42 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
18:40 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
18:29 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
18:02 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
18:00 cmooney@cumin1002: START - Cookbook sre.dns.netbox
17:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2143']
17:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
17:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2143']
17:21 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test2004.codfw.wmnet
17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:14 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
17:06 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test2004.codfw.wmnet
17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test[1001-1002].eqiad.wmnet
17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:04 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:02 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
16:53 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test[1001-1002].eqiad.wmnet
16:27 jayme: restart prometheus@k8s on prometheus1005 revert GOGC to 100 (default) - T354604
16:22 mutante: phabricator - differential has been disabled (T330797)
16:11 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545 (duration: 00m 56s)
16:10 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545
16:10 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit1003.wikimedia.org
16:10 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:10 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
16:09 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545 (duration: 00m 55s)
16:09 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
16:09 mutante: phabricator deployment in progress
16:08 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545
16:08 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:08 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:07 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
16:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
15:58 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit1003.wikimedia.org
15:54 jayme: restart prometheus@k8s on prometheus1005 with GOGC=60 - T354604
15:37 akosiaris: depool and reboot mw1349 for a test T354413
15:36 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
15:19 sukhe: restart pybal on lvs1019: T336043
15:19 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
15:16 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
15:16 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
15:15 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
15:14 sukhe: restart pybal on lvs1020: T336043
15:06 TheresNoTime: done UTC afternoon backport window
15:03 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
15:02 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
15:02 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
15:01 TheresNoTime: `[samtar@mwmaint2002 ~]$ echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikinews-wordmark-zh.svg' | mwscript purgeList.php` T353792
15:01 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
15:00 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki bjnwikiquote --add-prefix "BROKEN " --fix` T350235
14:59 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki zghwiki --add-prefix "BROKEN " --fix` T350241
14:58 samtar@deploy2002: Finished scap: Backport for gerrit:986659 zghwiki: add metanamespace (T350241), gerrit:986660 bjnwikiquote: add metanamespace (T350235) (duration: 12m 10s)
14:56 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
14:56 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
14:52 samtar@deploy2002: samtar and anzx: Continuing with sync
14:50 samtar@deploy2002: samtar and anzx: Backport for gerrit:986659 zghwiki: add metanamespace (T350241), gerrit:986660 bjnwikiquote: add metanamespace (T350235) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:46 samtar@deploy2002: Started scap: Backport for gerrit:986659 zghwiki: add metanamespace (T350241), gerrit:986660 bjnwikiquote: add metanamespace (T350235)
14:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2034.codfw.wmnet with OS bookworm
14:44 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
14:43 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
14:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
14:38 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki hewikinews --fix` T349581
14:38 samtar@deploy2002: Finished scap: Backport for gerrit:968318 Create draft namespace and add namespaces aliases for hewikinews (T349581) (duration: 10m 05s)
14:36 kevinbazira@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
14:35 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
14:34 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host snapshot1014.eqiad.wmnet
14:32 samtar@deploy2002: samtar and anzx: Continuing with sync
14:30 samtar@deploy2002: samtar and anzx: Backport for gerrit:968318 Create draft namespace and add namespaces aliases for hewikinews (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:28 samtar@deploy2002: Started scap: Backport for gerrit:968318 Create draft namespace and add namespaces aliases for hewikinews (T349581)
14:27 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
14:26 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
14:26 bking@cumin2002: START - Cookbook sre.wdqs.restart
14:26 TheresNoTime: deployed patch for T350739, logging bot not working?
14:24 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
14:23 samtar@deploy2002: Finished scap: Backport for gerrit:972473 [namespaces] Use correct diacritics in Romanian (duration: 14m 42s)
14:22 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
14:21 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
14:16 samtar@deploy2002: strainu and samtar: Continuing with sync
14:13 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2035.codfw.wmnet
14:12 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2035.codfw.wmnet
14:12 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2035.codfw.wmnet
14:09 samtar@deploy2002: strainu and samtar: Backport for gerrit:972473 [namespaces] Use correct diacritics in Romanian synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:08 samtar@deploy2002: Started scap: Backport for gerrit:972473 [namespaces] Use correct diacritics in Romanian
14:04 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
14:01 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
14:01 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2034.codfw.wmnet with OS bookworm
13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2033.codfw.wmnet with OS bookworm
13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
13:56 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
13:43 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host snapshot1014.eqiad.wmnet with OS bullseye
13:41 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
13:37 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
13:34 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54575 and previous config saved to /var/cache/conftool/dbconfig/20240109-133327-root.json
13:20 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
13:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
13:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54574 and previous config saved to /var/cache/conftool/dbconfig/20240109-131822-root.json
13:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
13:14 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
13:13 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
13:10 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54573 and previous config saved to /var/cache/conftool/dbconfig/20240109-130317-root.json
13:00 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
13:00 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
12:58 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
12:57 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
12:57 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54572 and previous config saved to /var/cache/conftool/dbconfig/20240109-124812-root.json
12:43 moritzm: imported mwbzutils 0.1.4~wmf-1+deb11u1 for bullseye-wikimedia T325228
12:43 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
12:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
12:39 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54571 and previous config saved to /var/cache/conftool/dbconfig/20240109-123307-root.json
12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54570 and previous config saved to /var/cache/conftool/dbconfig/20240109-121802-root.json
12:17 stevemunene@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
12:10 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
12:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1180.eqiad.wmnet with OS bookworm
12:06 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
12:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
12:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54569 and previous config saved to /var/cache/conftool/dbconfig/20240109-120257-root.json
12:01 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
11:50 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
11:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
11:49 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
11:46 cmooney@cumin1002: START - Cookbook sre.dns.netbox
11:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
11:43 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
11:38 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
11:37 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
11:37 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
11:37 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
11:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
11:30 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1180.eqiad.wmnet with OS bookworm
11:30 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=mw2394.codfw.wmnet,cluster=jobrunner
11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180 T354506', diff saved to https://phabricator.wikimedia.org/P54568 and previous config saved to /var/cache/conftool/dbconfig/20240109-112922-root.json
11:22 cgoubert@cumin2002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
11:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
11:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
11:19 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
11:18 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
11:18 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
11:17 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
11:17 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
11:15 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
11:15 taavi@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
11:14 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
11:05 moritzm: installing exim security updates
10:54 godog: restart prometheus@k8s on prometheus1005 to see if labeldrop id will yield expected results - T354604
10:45 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2033.codfw.wmnet with OS bookworm
10:38 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
10:22 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
10:21 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
10:19 btullis@cumin1002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
10:11 btullis@cumin1002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
10:00 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
09:59 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
09:54 oblivian@deploy2002: Finished scap: Backport for gerrit:987033 Always process media files via shellbox on k8s (T352515) (duration: 11m 03s)
09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
09:48 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
09:47 oblivian@deploy2002: oblivian: Continuing with sync
09:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
09:44 oblivian@deploy2002: oblivian: Backport for gerrit:987033 Always process media files via shellbox on k8s (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:43 oblivian@deploy2002: Started scap: Backport for gerrit:987033 Always process media files via shellbox on k8s (T352515)
09:39 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
09:34 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
09:27 oblivian@deploy2002: Finished scap: Backport for gerrit:987032 Use shellbox for djvu handling on kubernetes (T352515) (duration: 23m 56s)
09:20 oblivian@deploy2002: oblivian: Continuing with sync
09:15 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
09:14 moritzm: prune obsolete nginx packages from ncredir hosts after migration to new library scheme T329529
09:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
09:06 arnaudb: upload wmfdb 0.1.4 from https://gitlab.wikimedia.org/repos/sre/wmfdb/-/tree/dgit/bookworm-wikimedia to fix default ca bundle
09:05 oblivian@deploy2002: oblivian: Backport for gerrit:987032 Use shellbox for djvu handling on kubernetes (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:03 oblivian@deploy2002: Started scap: Backport for gerrit:987032 Use shellbox for djvu handling on kubernetes (T352515)
08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45287
08:54 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 45287
08:54 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
08:49 oblivian@deploy2002: Finished scap: Backport for gerrit:987031 Remove throttle exception (T352569) (duration: 09m 01s)
08:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 9902
08:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 9902
08:42 oblivian@deploy2002: oblivian: Continuing with sync
08:42 oblivian@deploy2002: oblivian: Backport for gerrit:987031 Remove throttle exception (T352569) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:40 oblivian@deploy2002: Started scap: Backport for gerrit:987031 Remove throttle exception (T352569)
08:22 kartik@deploy2002: Finished scap: Backport for gerrit:988493 testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) (duration: 15m 54s)
08:21 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2143.codfw.wmnet with OS bookworm
08:20 godog: set aside WAL for prometheus@k8s in codfw and restart - T354399
08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54567 and previous config saved to /var/cache/conftool/dbconfig/20240109-081946-root.json
08:11 kartik@deploy2002: kartik: Continuing with sync
08:10 kartik@deploy2002: kartik: Backport for gerrit:988493 testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:06 kartik@deploy2002: Started scap: Backport for gerrit:988493 testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510)
08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After a crash', diff saved to https://phabricator.wikimedia.org/P54566 and previous config saved to /var/cache/conftool/dbconfig/20240109-080558-root.json
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54565 and previous config saved to /var/cache/conftool/dbconfig/20240109-080441-root.json
07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After a crash', diff saved to https://phabricator.wikimedia.org/P54564 and previous config saved to /var/cache/conftool/dbconfig/20240109-075053-root.json
07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54563 and previous config saved to /var/cache/conftool/dbconfig/20240109-074936-root.json
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After a crash', diff saved to https://phabricator.wikimedia.org/P54562 and previous config saved to /var/cache/conftool/dbconfig/20240109-073548-root.json
07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54561 and previous config saved to /var/cache/conftool/dbconfig/20240109-073431-root.json
07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After a crash', diff saved to https://phabricator.wikimedia.org/P54560 and previous config saved to /var/cache/conftool/dbconfig/20240109-072043-root.json
07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54559 and previous config saved to /var/cache/conftool/dbconfig/20240109-071926-root.json
07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After a crash', diff saved to https://phabricator.wikimedia.org/P54558 and previous config saved to /var/cache/conftool/dbconfig/20240109-070538-root.json
07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54557 and previous config saved to /var/cache/conftool/dbconfig/20240109-070421-root.json
07:01 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm
06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2151.codfw.wmnet with OS bookworm
06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After a crash', diff saved to https://phabricator.wikimedia.org/P54556 and previous config saved to /var/cache/conftool/dbconfig/20240109-065033-root.json
06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54555 and previous config saved to /var/cache/conftool/dbconfig/20240109-064916-root.json
06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After a crash', diff saved to https://phabricator.wikimedia.org/P54554 and previous config saved to /var/cache/conftool/dbconfig/20240109-063528-root.json
06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P54553 and previous config saved to /var/cache/conftool/dbconfig/20240109-062806-root.json
06:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2151.codfw.wmnet with OS bookworm
06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2151 T354506', diff saved to https://phabricator.wikimedia.org/P54552 and previous config saved to /var/cache/conftool/dbconfig/20240109-061015-root.json
03:11 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
03:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
03:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
03:10 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
03:10 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
03:10 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
01:22 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
01:17 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED

2024-01-08

23:16 eileen: civicrm upgraded from 16b5417b to c7304245
22:58 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
22:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
22:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
22:30 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic2087\.codfw\.wmnet
22:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
21:50 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:49 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
21:37 cjming: end of UTC late backport window
21:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:15 cjming@deploy2002: Finished scap: Backport for gerrit:988714 Remove android.metrics_platform.* stream definitions (T354199) (duration: 08m 17s)
21:08 cjming@deploy2002: cjming: Continuing with sync
21:08 cjming@deploy2002: cjming: Backport for gerrit:988714 Remove android.metrics_platform.* stream definitions (T354199) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:07 cjming@deploy2002: Started scap: Backport for gerrit:988714 Remove android.metrics_platform.* stream definitions (T354199)
19:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
19:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
19:27 taavi: make puppet re-generate empty envoy config file on testreduce1002 T345220
19:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:13 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:04 sukhe: running authdns-update for CR 988684: T345220
19:04 sukhe: running authdns-update for CR 988684: T336043
18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:25 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
17:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
17:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
17:43 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: gerrit:988673 Bumping portals to master (T128546) (duration: 06m 17s)
17:36 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: gerrit:988673 Bumping portals to master (T128546) (duration: 06m 21s)
17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
17:18 godog: wipe prometheus@k8s eqiad WAL and restart - T354399
17:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:14 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:14 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
17:12 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 08m 01s)
17:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:06 ladsgroup@deploy2002: ladsgroup: Continuing with sync
17:06 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
17:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:04 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
17:04 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 12m 24s)
17:00 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
16:57 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
16:53 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2034.codfw.wmnet
16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
16:51 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
16:49 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 08m 47s)
16:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
16:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:46 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
16:44 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
16:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
16:42 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:41 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
16:37 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2034.codfw.wmnet
16:36 btullis@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dbstore1008.eqiad.wmnet on all recursors
16:36 btullis@cumin1002: START - Cookbook sre.dns.wipe-cache dbstore1008.eqiad.wmnet on all recursors
16:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
16:35 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:35 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
16:34 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2033.codfw.wmnet
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:30 btullis@cumin1002: START - Cookbook sre.dns.netbox
16:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
16:25 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
16:25 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 24m 06s)
16:24 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
16:23 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
16:21 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
16:20 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
16:18 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2033.codfw.wmnet
16:15 taavi: restart pybal on lvs1018 - T346947
16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:14 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:09 taavi: restart pybal on lvs1020 - T346947
16:01 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
15:59 sfaci@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
15:59 sfaci@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
15:58 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988655 Undeploy listing extension part II (T253216) (duration: 08m 40s)
15:57 sfaci@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
15:57 sfaci@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
15:52 ladsgroup@deploy2002: ladsgroup: Continuing with sync
15:51 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988655 Undeploy listing extension part II (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:49 ladsgroup@deploy2002: Started scap: Backport for gerrit:988655 Undeploy listing extension part II (T253216)
15:48 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
15:48 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
15:47 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988654 Undeploy Listings extension, part I (T253216) (duration: 08m 22s)
15:46 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:46 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:41 ladsgroup@deploy2002: ladsgroup: Continuing with sync
15:40 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988654 Undeploy Listings extension, part I (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:40 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:38 ladsgroup@deploy2002: Started scap: Backport for gerrit:988654 Undeploy Listings extension, part I (T253216)
15:35 claime: Draining and cordoning kubestage2002.codfw.wmnet - T352883
15:32 krinkle@deploy2002: Finished scap: Backport for gerrit:987999 Fix parsing logic when comments or hidden characters are present (T354385) (duration: 07m 52s)
15:26 krinkle@deploy2002: krinkle: Continuing with sync
15:26 krinkle@deploy2002: krinkle: Backport for gerrit:987999 Fix parsing logic when comments or hidden characters are present (T354385) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:24 krinkle@deploy2002: Started scap: Backport for gerrit:987999 Fix parsing logic when comments or hidden characters are present (T354385)
14:46 urbanecm@deploy2002: Finished scap: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297), [[gerrit:988504|agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680)]] (duration: 10m 22s)
14:40 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Continuing with sync
14:37 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297), [[gerrit:988504|agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680)]] synce
14:35 urbanecm@deploy2002: Started scap: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297), [[gerrit:988504|agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680)]]
14:34 urbanecm@deploy2002: Sync cancelled.
14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:27 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54548 and previous config saved to /var/cache/conftool/dbconfig/20240108-141717-root.json
14:14 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:13 urbanecm@deploy2002: Started scap: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297)
14:12 urbanecm@deploy2002: Finished scap: Backport for gerrit:988449 enable page_rerender for 3rd batch of wikis (T351503) (duration: 09m 35s)
14:06 urbanecm@deploy2002: pfischer and urbanecm: Continuing with sync
14:04 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:04 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:04 urbanecm@deploy2002: pfischer and urbanecm: Backport for gerrit:988449 enable page_rerender for 3rd batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:02 urbanecm@deploy2002: Started scap: Backport for gerrit:988449 enable page_rerender for 3rd batch of wikis (T351503)
14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54547 and previous config saved to /var/cache/conftool/dbconfig/20240108-140212-root.json
14:01 moritzm: installing curl security updates
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54546 and previous config saved to /var/cache/conftool/dbconfig/20240108-134707-root.json
13:33 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
13:33 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54545 and previous config saved to /var/cache/conftool/dbconfig/20240108-133202-root.json
13:32 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
13:31 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
13:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54544 and previous config saved to /var/cache/conftool/dbconfig/20240108-133016-root.json
13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54543 and previous config saved to /var/cache/conftool/dbconfig/20240108-131657-root.json
13:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54542 and previous config saved to /var/cache/conftool/dbconfig/20240108-131511-root.json
13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54541 and previous config saved to /var/cache/conftool/dbconfig/20240108-130152-root.json
13:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54540 and previous config saved to /var/cache/conftool/dbconfig/20240108-130006-root.json
12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54539 and previous config saved to /var/cache/conftool/dbconfig/20240108-124647-root.json
12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS bookworm
12:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54538 and previous config saved to /var/cache/conftool/dbconfig/20240108-124501-root.json
12:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54537 and previous config saved to /var/cache/conftool/dbconfig/20240108-122956-root.json
12:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
12:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54536 and previous config saved to /var/cache/conftool/dbconfig/20240108-121451-root.json
12:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS bookworm
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224 T354506', diff saved to https://phabricator.wikimedia.org/P54535 and previous config saved to /var/cache/conftool/dbconfig/20240108-120759-root.json
12:03 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45287
12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 45287
12:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 35847
12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 35847
12:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9902
12:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 9902
12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2117.codfw.wmnet with OS bookworm
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54534 and previous config saved to /var/cache/conftool/dbconfig/20240108-115946-root.json
11:57 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988460 Disable Listings extension everywhere except rowikivoyage (T253216) (duration: 08m 43s)
11:50 ladsgroup@deploy2002: ladsgroup: Continuing with sync
11:50 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988460 Disable Listings extension everywhere except rowikivoyage (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:48 ladsgroup@deploy2002: Started scap: Backport for gerrit:988460 Disable Listings extension everywhere except rowikivoyage (T253216)
11:45 taavi@deploy2002: Finished scap: Backport for gerrit:988252 OATHAuthServices: Fix service name (T354505), gerrit:988253 Fix disabling two-factor authentication (T354505) (duration: 09m 21s)
11:39 taavi@deploy2002: taavi: Continuing with sync
11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
11:38 taavi@deploy2002: taavi: Backport for gerrit:988252 OATHAuthServices: Fix service name (T354505), gerrit:988253 Fix disabling two-factor authentication (T354505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:36 taavi@deploy2002: Started scap: Backport for gerrit:988252 OATHAuthServices: Fix service name (T354505), gerrit:988253 Fix disabling two-factor authentication (T354505)
11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
11:29 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988456 Stop writing to the old columns of pagelinks in testwiki (T352010) (duration: 10m 02s)
11:23 ladsgroup@deploy2002: ladsgroup: Continuing with sync
11:20 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988456 Stop writing to the old columns of pagelinks in testwiki (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:19 ladsgroup@deploy2002: Started scap: Backport for gerrit:988456 Stop writing to the old columns of pagelinks in testwiki (T352010)
11:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2117.codfw.wmnet with OS bookworm
11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2117 T354506', diff saved to https://phabricator.wikimedia.org/P54533 and previous config saved to /var/cache/conftool/dbconfig/20240108-111452-root.json
10:36 XioNoX: repool eqsin - T332395
10:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:21 ladsgroup@deploy2002: Finished scap: Backport for gerrit:987861 styles: Replace obsolete WikimediaUI Base var with Codex alias (duration: 07m 32s)
10:20 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
10:20 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Continuing with sync
10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Backport for gerrit:987861 styles: Replace obsolete WikimediaUI Base var with Codex alias synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
10:14 ladsgroup@deploy2002: Started scap: Backport for gerrit:987861 styles: Replace obsolete WikimediaUI Base var with Codex alias
10:11 ladsgroup@deploy2002: Finished scap: Backport for gerrit:987657 Set commonswiki pagelinks migration stage to READ NEW (T351237) (duration: 08m 52s)
10:05 ladsgroup@deploy2002: ladsgroup: Continuing with sync
10:04 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:987657 Set commonswiki pagelinks migration stage to READ NEW (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
10:02 ladsgroup@deploy2002: Started scap: Backport for gerrit:987657 Set commonswiki pagelinks migration stage to READ NEW (T351237)
09:54 XioNoX: asw1-eqsin> request system reboot - T332395
09:32 Emperor: reboot ms-be2074-80 before adding them to the rings T353149
09:32 Emperor: reboot ms-be1072-82 before adding them to the rings T353149
09:24 XioNoX: start install process on asw1-eqsin - T332395
09:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
09:04 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
09:03 XioNoX: depool eqsin for switch upgrade - T332395
08:27 xSavitar: UTC morning backport window done.
08:26 derick@deploy2002: Finished scap: Backport for gerrit:974508 wmf-config: Remove unused wgStatsCacheType setting (T336004) (duration: 09m 11s)
08:20 derick@deploy2002: derick and d3r1ck01: Continuing with sync
08:18 derick@deploy2002: derick and d3r1ck01: Backport for gerrit:974508 wmf-config: Remove unused wgStatsCacheType setting (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:17 derick@deploy2002: Started scap: Backport for gerrit:974508 wmf-config: Remove unused wgStatsCacheType setting (T336004)

2024-01-06

22:27 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
22:27 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
22:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-05

23:49 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy) (duration: 00m 08s)
23:49 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy)
23:31 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy) (duration: 00m 06s)
23:31 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy)
23:25 thcipriani: deploying gerrit to remove survey banner https://gerrit.wikimedia.org/r/987995 (no downtime needed)
19:29 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2034.codfw.wmnet
19:29 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2034.codfw.wmnet
19:23 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2034.codfw.wmnet
19:07 mutante: vrts1001 - sudo systemctl start clamav-daemon
17:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:43 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:42 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
16:40 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
16:29 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:19 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:31 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:50 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:50 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:45 milimetric@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
14:45 milimetric@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
14:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:38 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:37 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:14 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:14 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
13:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
13:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw1379.eqiad.wmnet
11:49 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1379.eqiad.wmnet
09:26 moritzm: installing 5.10.205 kernels on Bullseye hosts
09:15 _joe_: upgrading conftool across the fleet
08:01 moritzm: installing 6.1.69 kernels on Bookworm hosts
01:27 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=arzwiki --logwiki=metawiki 'WanderingPlaywrite' 'WanderingPlaywright' # T354397
00:59 cwhite: restarted prometheus@k8s on prometheus1006 and backed up the wal for OOM loop investigation
00:52 cwhite: restarted prometheus@k8s on prometheus1005 and backed up the wal for OOM loop investigation

2024-01-04

23:10 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
23:10 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:31 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:31 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:25 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:25 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
22:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
22:22 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
22:22 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
22:22 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
22:21 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
22:21 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
22:21 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
22:00 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
22:00 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
21:38 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
21:27 brennen: end of utc late backport window
21:26 brennen@deploy2002: Finished scap: Backport for gerrit:987738 Ensure all non-okay statuses from ::getImageContents have a message (T354374) (duration: 08m 01s)
21:20 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
21:19 brennen@deploy2002: brennen and dreamyjazz: Backport for gerrit:987738 Ensure all non-okay statuses from ::getImageContents have a message (T354374) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:18 brennen@deploy2002: Started scap: Backport for gerrit:987738 Ensure all non-okay statuses from ::getImageContents have a message (T354374)
21:17 brennen@deploy2002: Finished scap: Backport for gerrit:987734 Check for invalid JSON on a good response from PhotoDNA (T354370) (duration: 07m 57s)
21:11 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
21:10 brennen@deploy2002: brennen and dreamyjazz: Backport for gerrit:987734 Check for invalid JSON on a good response from PhotoDNA (T354370) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:09 brennen@deploy2002: Started scap: Backport for gerrit:987734 Check for invalid JSON on a good response from PhotoDNA (T354370)
20:41 ryankemper: [apifeatureusage] T350703 Restarted `logstash` on `apifeatureusage[1,2]001`
20:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.12 refs T350088
20:30 mutante: mwmaint2002 - /usr/local/sbin/sync-home-mwmaint after gerrit:987778
20:20 dduvall@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.12 refs T350088 (duration: 06m 09s)
20:16 ejegg: standalone (payments listener) SmashPig upgraded from fc74ccca to 20d6434e
20:13 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.12 refs T350088
20:03 mutante: releases2003 - systemctl status rsync-srv-org-wikimedia-releases-releases2003.codfw.wmnet after gerrit:987436
20:01 mutante: releases2003 - systemctl start rsync-srv-patches-releases2003.codfw.wmnet after gerrit:987436
19:59 brett: restarting pybal on lvs5006 for testing purposes - T353760
19:59 mutante: releases1003 - systemctl start rsync-srv-patches-releases-primary after gerrit:987436
19:57 dcausse: repooling wdqs1019
19:52 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:51 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:49 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
19:47 mutante: deploy1002 - systemctl start rsync-patches_module after gerrit:987436
19:32 dduvall@deploy2002: Finished scap: Backport for gerrit:987473 Revise logic for creating compact links button on Vector 2022 (T353850) (duration: 07m 58s)
19:26 dduvall@deploy2002: jdlrobson and dduvall: Continuing with sync
19:26 dduvall@deploy2002: jdlrobson and dduvall: Backport for gerrit:987473 Revise logic for creating compact links button on Vector 2022 (T353850) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
19:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:25 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
19:24 dduvall@deploy2002: Started scap: Backport for gerrit:987473 Revise logic for creating compact links button on Vector 2022 (T353850)
19:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
19:04 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:04 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
18:46 sukhe: [second time] mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
18:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
17:57 sukhe: mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
17:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
17:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
17:35 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
17:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
17:28 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
17:10 oblivian@puppetmaster2001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=kubernetes,service=kubesvc,name=mw1377.*
16:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:42 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:36 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
16:25 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
16:00 volans@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mw1378.eqiad.wmnet
15:59 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
15:58 moritzm: installing libdatetime-timezone-perl updates
15:51 moritzm: rolling restart of FPM/apache on mw canaries to pick up curl updates
15:48 XioNoX: repool esams - T346779
15:46 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
15:38 XioNoX: undrain esams-eqiad transport - T346779
15:37 XioNoX: re-enable peering/transit on cr1-esams - T346779
15:35 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
15:30 XioNoX: reboot fpc0 on cr1-esams - T346779
15:29 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
15:26 XioNoX: disable peering/transit on cr1-esams for linecard reboot - T346779
15:19 volans: running sre.hosts.provision for mw1378 - T351074
15:19 volans@cumin2002: START - Cookbook sre.hosts.provision for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
15:16 XioNoX: drain esams-eqiad transport - T346779
15:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:12 moritzm: installing curl security updates
15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:08 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:05 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:04 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:04 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:01 XioNoX: depool esams for router work - T346779
15:00 tchanders@deploy2002: Finished scap: Backport for gerrit:984810 enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary (duration: 17m 55s)
14:59 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
14:56 volans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
14:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
14:54 tchanders@deploy2002: pfischer and tchanders: Continuing with sync
14:45 tchanders@deploy2002: pfischer and tchanders: Backport for gerrit:984810 enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:42 tchanders@deploy2002: Started scap: Backport for gerrit:984810 enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary
14:40 tchanders@deploy2002: Finished scap: Backport for gerrit:987726 Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 25s)
14:34 tchanders@deploy2002: tchanders and dreamyjazz: Continuing with sync
14:34 tchanders@deploy2002: tchanders and dreamyjazz: Backport for gerrit:987726 Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:30 tchanders@deploy2002: Started scap: Backport for gerrit:987726 Attempt to send original file to PhotoDNA if no thumbnail (T353854)
14:25 tchanders@deploy2002: Finished scap: Backport for gerrit:987485 Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 24s)
14:20 tchanders@deploy2002: dreamyjazz and tchanders: Continuing with sync
14:20 tchanders@deploy2002: dreamyjazz and tchanders: Backport for gerrit:987485 Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:16 tchanders@deploy2002: Started scap: Backport for gerrit:987485 Attempt to send original file to PhotoDNA if no thumbnail (T353854)
14:12 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:03 XioNoX: repool drmrs - T354340
14:01 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:00 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:00 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:57 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 2686
13:56 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 2686
13:53 moritzm: installing libssh security updates
13:24 dcausse: restarting blazegraph on wdqs1019 (stuck with high thread count)
13:07 zabe@deploy2002: Finished scap: Backport for gerrit:987483 Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), gerrit:987482 Revert "Support new block schema" (T354298) (duration: 10m 06s)
13:02 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mw1377.eqiad.wmnet
13:02 XioNoX: depool drmrs for router work - T354340
13:01 zabe@deploy2002: zabe: Continuing with sync
13:00 zabe@deploy2002: zabe: Backport for gerrit:987483 Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), gerrit:987482 Revert "Support new block schema" (T354298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
12:56 zabe@deploy2002: Started scap: Backport for gerrit:987483 Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), gerrit:987482 Revert "Support new block schema" (T354298)
12:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 63296
12:52 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 63296
12:10 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1377.eqiad.wmnet
12:04 moritzm: installing lua5.3 security updates
11:52 moritzm: installing libde265 security updates
11:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
11:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
11:16 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
11:01 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
10:51 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
10:33 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
10:32 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
10:17 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #3
10:17 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
09:57 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
09:38 akosiaris: delete mw1377-mw1383 from eqiad wikikube nodes
09:38 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #2
09:36 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
09:36 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
09:22 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi)
09:22 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
08:49 ladsgroup@deploy2002: Finished scap: Backport for gerrit:987134 Update virtual domain for url shortener (duration: 12m 35s)
08:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
08:38 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:987134 Update virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:36 ladsgroup@deploy2002: Started scap: Backport for gerrit:987134 Update virtual domain for url shortener
08:34 ladsgroup@deploy2002: Finished scap: Backport for gerrit:985160 Add virtual domain config for reading lists extension (T353948) (duration: 09m 05s)
08:28 ladsgroup@deploy2002: ladsgroup: Continuing with sync
08:27 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:985160 Add virtual domain config for reading lists extension (T353948) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:25 ladsgroup@deploy2002: Started scap: Backport for gerrit:985160 Add virtual domain config for reading lists extension (T353948)
07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS bookworm
06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
06:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
06:28 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS bookworm
03:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-03

23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:33 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
23:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
23:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
23:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
23:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
23:14 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
23:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
23:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
23:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
23:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
23:01 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
22:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
22:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
22:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
22:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
22:40 bking@cumin2002: START - Cookbook sre.wdqs.restart
22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
22:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
22:36 bking@cumin2002: START - Cookbook sre.wdqs.restart
22:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2087.codfw.wmnet with OS bullseye
22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
21:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
21:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
21:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: broken reimage
21:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: broken reimage
21:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
21:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
21:34 zabe@deploy2002: Finished scap: Backport for gerrit:986825 Update mediawiki/mediawiki-codesniffer to 42.0.0 (duration: 10m 34s)
21:33 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
21:28 zabe@deploy2002: zabe: Continuing with sync
21:27 zabe@deploy2002: zabe: Backport for gerrit:986825 Update mediawiki/mediawiki-codesniffer to 42.0.0 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:24 zabe@deploy2002: Started scap: Backport for gerrit:986825 Update mediawiki/mediawiki-codesniffer to 42.0.0
21:19 TheresNoTime: UTC late backport window done
21:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
21:14 samtar@deploy2002: Finished scap: Backport for gerrit:986200 Add "patroller" user group to testwiki (T354063) (duration: 12m 19s)
21:08 samtar@deploy2002: novemlinguae and samtar: Continuing with sync
21:06 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1383.eqiad.wmnet with OS bullseye
21:06 samtar@deploy2002: novemlinguae and samtar: Backport for gerrit:986200 Add "patroller" user group to testwiki (T354063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:04 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1382.eqiad.wmnet with OS bullseye
21:02 samtar@deploy2002: Started scap: Backport for gerrit:986200 Add "patroller" user group to testwiki (T354063)
20:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1381.eqiad.wmnet with OS bullseye
20:47 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1380.eqiad.wmnet with OS bullseye
20:45 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
20:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1378.eqiad.wmnet with OS bullseye
20:34 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2450.codfw.wmnet with OS bullseye
20:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2443.codfw.wmnet with OS bullseye
20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2451.codfw.wmnet with OS bullseye
20:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2442.codfw.wmnet with OS bullseye
20:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2436.codfw.wmnet with OS bullseye
19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
19:57 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2440.codfw.wmnet with OS bullseye
19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2437.codfw.wmnet with OS bullseye
19:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
19:51 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
19:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
19:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
19:39 mutante: root@doc2002: /usr/local/sbin/sync-doc-host-data-sync after gerrit:987406
19:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
19:38 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
19:36 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
19:36 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
19:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
19:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2451.codfw.wmnet with OS bullseye
19:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2450.codfw.wmnet with OS bullseye
19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2443.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
19:28 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
19:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
19:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
19:22 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
19:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
19:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2442.codfw.wmnet with OS bullseye
19:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2440.codfw.wmnet with OS bullseye
19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
19:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2437.codfw.wmnet with OS bullseye
19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2436.codfw.wmnet with OS bullseye
18:27 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519 (duration: 00m 27s)
18:27 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519
18:27 brennen: running an essentially no-op phab2002 deploy
18:11 dduvall@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 07m 23s)
18:03 dduvall@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088
17:06 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
16:45 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
16:33 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
16:27 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
16:27 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
16:27 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
16:26 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
16:26 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
16:26 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
16:25 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
16:25 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
16:24 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
16:24 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
16:23 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
16:22 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
16:16 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
16:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
16:10 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
15:39 moritzm: rebuild md RAIDs after disk swap T353324
14:55 TheresNoTime: UTC afternoon backport window done
14:54 samtar@deploy2002: Finished scap: Backport for gerrit:986658 zhwikinews: update wordmark (T353792) (duration: 09m 11s)
14:48 samtar@deploy2002: anzx and samtar: Continuing with sync
14:46 samtar@deploy2002: anzx and samtar: Backport for gerrit:986658 zhwikinews: update wordmark (T353792) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:45 samtar@deploy2002: Started scap: Backport for gerrit:986658 zhwikinews: update wordmark (T353792)
14:43 samtar@deploy2002: Finished scap: Backport for gerrit:985389 aswikiquote: change wordmark and update logo (T353934) (duration: 07m 51s)
14:38 samtar@deploy2002: samtar and anzx: Continuing with sync
14:37 samtar@deploy2002: samtar and anzx: Backport for gerrit:985389 aswikiquote: change wordmark and update logo (T353934) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:36 samtar@deploy2002: Started scap: Backport for gerrit:985389 aswikiquote: change wordmark and update logo (T353934)
14:34 samtar@deploy2002: Finished scap: Backport for gerrit:986662 Edit Recovery: fix typo in expiry field name (T347673) (duration: 07m 46s)
14:29 samtar@deploy2002: samtar: Continuing with sync
14:28 samtar@deploy2002: samtar: Backport for gerrit:986662 Edit Recovery: fix typo in expiry field name (T347673) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:27 samtar@deploy2002: Started scap: Backport for gerrit:986662 Edit Recovery: fix typo in expiry field name (T347673)
14:18 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:18 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:17 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:17 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:11 samtar@deploy2002: Finished scap: Backport for gerrit:985376 zhwikivoyage: Enable block feature for abusefilter (T353604), gerrit:987417 ganwiki: Add transwiki import sources (T354000) (duration: 09m 58s)
14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:06 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:05 samtar@deploy2002: samtar and stang: Continuing with sync
14:03 moritzm: installing qemu security updates
14:02 samtar@deploy2002: samtar and stang: Backport for gerrit:985376 zhwikivoyage: Enable block feature for abusefilter (T353604), gerrit:987417 ganwiki: Add transwiki import sources (T354000) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:01 samtar@deploy2002: Started scap: Backport for gerrit:985376 zhwikivoyage: Enable block feature for abusefilter (T353604), gerrit:987417 ganwiki: Add transwiki import sources (T354000)
13:32 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nick Ifeajika out of all services on: 2220 hosts
13:31 root@cumin2002: START - Cookbook sre.idm.logout Logging Nick Ifeajika out of all services on: 2220 hosts
13:29 moritzm: installing Java 8/11 security updates
12:34 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:34 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:29 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot-master (exit_code=0) rolling restart_daemons on A:maps-master
12:28 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot-master rolling restart_daemons on A:maps-master
12:23 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
12:18 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
12:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:14 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
12:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:08 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
12:02 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:02 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:01 moritzm: installing gnutls28 security updates on buster
11:47 oblivian@deploy2002: Finished scap: Backport for gerrit:987400 Fix timeouts detection on mw on k8s jobrunners (T354229) (duration: 11m 38s)
11:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:41 oblivian@deploy2002: oblivian: Continuing with sync
11:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:37 oblivian@deploy2002: oblivian: Backport for gerrit:987400 Fix timeouts detection on mw on k8s jobrunners (T354229) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:36 oblivian@deploy2002: Started scap: Backport for gerrit:987400 Fix timeouts detection on mw on k8s jobrunners (T354229)
11:31 oblivian@deploy2002: Finished scap: Backport for gerrit:951049 Disable things that don't work on k8s when on k8s (duration: 15m 29s)
11:25 oblivian@deploy2002: oblivian: Continuing with sync
11:25 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:22 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:22 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:18 oblivian@deploy2002: oblivian: Backport for gerrit:951049 Disable things that don't work on k8s when on k8s synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:16 oblivian@deploy2002: Started scap: Backport for gerrit:951049 Disable things that don't work on k8s when on k8s
11:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:56 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:53 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:51 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:51 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:48 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:48 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:46 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:11 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:57 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:36 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:36 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:33 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:31 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:31 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:10 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:03 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
01:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
01:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
01:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
01:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
00:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
00:55 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-02

22:42 urbanecm: mwmaint2002: Restart `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
22:29 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2087.codfw.wmnet with OS bullseye
21:08 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
20:52 urbanecm: mwmaint2002: `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
20:32 mutante: phab2002 - synced /srv/homes from phab1004 to /srv/homes on phab2002
19:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
18:29 mutante: confctl select 'name=mw2394.codfw.wmnet' set/pooled=inactive | T354193#9430654 - seems like 2396 was previously depooled instead of this 2394
17:29 dancy@deploy2002: Installation of scap version "4.65.1" completed for 566 hosts
17:28 dancy@deploy2002: Installing scap version "4.65.1" for 566 hosts
17:26 dancy@deploy2002: Installing scap version "4.65.1" for 567 hosts
14:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1008.eqiad.wmnet with OS bookworm
14:58 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1009.eqiad.wmnet with OS bookworm
14:44 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki=csbwiktionary --fix # T354114
14:43 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
14:40 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
14:37 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
14:34 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
14:32 _joe_: confctl select 'name=mw2396.codfw.wmnet' set/pooled=inactive
14:26 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1009.eqiad.wmnet with OS bookworm
14:20 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1008.eqiad.wmnet with OS bookworm
14:16 urbanecm@deploy2002: Finished scap: Backport for gerrit:985384 cswiki: Grant patrolmarks to autopatrolled (T354004), gerrit:986640 csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) (duration: 13m 46s)
14:04 urbanecm@deploy2002: urbanecm: Continuing with sync
14:04 urbanecm@deploy2002: urbanecm: Backport for gerrit:985384 cswiki: Grant patrolmarks to autopatrolled (T354004), gerrit:986640 csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:02 urbanecm@deploy2002: Started scap: Backport for gerrit:985384 cswiki: Grant patrolmarks to autopatrolled (T354004), gerrit:986640 csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114)
10:55 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
10:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
10:38 vgutierrez: fetching haproxy 2.6.16 for thirdparty/haproxy26 bullseye-wikimedia (apt.wm.o)
09:23 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
09:23 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
09:17 pfischer@deploy2002: Finished scap: Backport for gerrit:987028 configure message_key_fields for update_pipeline (duration: 15m 35s)
09:05 pfischer@deploy2002: pfischer: Continuing with sync
09:04 pfischer@deploy2002: pfischer: Backport for gerrit:987028 configure message_key_fields for update_pipeline synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:02 moritzm: installing nodejs security updates on bookworm
09:02 pfischer@deploy2002: Started scap: Backport for gerrit:987028 configure message_key_fields for update_pipeline
08:33 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
08:27 jayme: restart prometheus@k8s prometheus@k8s-aux in eqiad - T343529
08:26 akosiaris@cumin1001: START - Cookbook sre.hosts.provision for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS bookworm
06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS bookworm
05:00 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 56m 48s)
04:03 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088

2024-01-01

21:38 eileen: config revision changed from 026cf508 to 21b91455
21:13 eileen: config revision changed from 3a1a1444 to 026cf508
21:13 eileen: fork/mapping-edit-button-fix
17:11 joal@deploy2002: Finished deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567] (duration: 00m 31s)
17:11 joal@deploy2002: Started deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567]

Other archives

2000s

Archive 1: 2004 Jun - 2004 Sep
Archive 2: 2004 Oct - 2004 Nov
Archive 3: 2004 Dec - 2005 Mar
Archive 4: 2005 Apr - 2005 Jul
Archive 5: 2005 Aug - 2005 Oct, with revision history 2004-06-23 to 2005-11-25
Archive 6: 2005 Nov - 2006 Feb
Archive 7: 2006 Mar - 2006 Jun
Archive 8: 2006 Jul - 2006 Sep
Archive 9: 2006 Oct - 2007 Jan, with revision history 2005-11-25 to 2007-02-21
Archive 10: 2007 Feb - 2007 Jun
Archive 11: 2007 Jul - 2007 Dec
Archive 12: 2008 Jan - 2008 Jul
Archive 12a: 2008 Aug
Archive 12b: 2008 Sept
Archive 13: 2008 Oct - 2009 Jun
Archive 14: 2009 Jun - 2009 Dec


This article is issued from Wikimedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.