Server Admin Log/Archive 75
2024-01-31
- 23:11 eileen: civicrm upgraded from 6344c95e to 6e1e0d21
- 22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 22:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56010 and previous config saved to /var/cache/conftool/dbconfig/20240131-222853-marostegui.json
- 22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56009 and previous config saved to /var/cache/conftool/dbconfig/20240131-221347-marostegui.json
- 22:11 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: gerrit:994823 Bumping portals to master (T128546) (duration: 06m 43s)
- 22:05 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: gerrit:994823 Bumping portals to master (T128546) (duration: 07m 26s)
- 21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56008 and previous config saved to /var/cache/conftool/dbconfig/20240131-215840-marostegui.json
- 21:54 Dreamy_Jazz: Removed already applied patches for T347708 from /srv/patches
- 21:48 dancy@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.16 refs T354434 (duration: 06m 47s)
- 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56007 and previous config saved to /var/cache/conftool/dbconfig/20240131-214334-marostegui.json
- 21:42 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.16 refs T354434
- 21:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56006 and previous config saved to /var/cache/conftool/dbconfig/20240131-213454-marostegui.json
- 21:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 21:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 21:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56005 and previous config saved to /var/cache/conftool/dbconfig/20240131-213432-marostegui.json
- 21:31 Dreamy_Jazz: Security deploy done
- 21:30 logmsgbot: dreamyjazz Deployed security patch for T356226
- 21:23 logmsgbot: dreamyjazz Deployed security patch for T356226
- 21:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56004 and previous config saved to /var/cache/conftool/dbconfig/20240131-211926-marostegui.json
- 21:16 Dreamy_Jazz: Doing security deploy for T356226
- 21:12 jforrester@deploy2002: Finished scap: Backport for gerrit:994716 Gadget: Bump GADGET_CLASS_VERSION (T356322) (duration: 08m 31s)
- 21:05 jforrester@deploy2002: jforrester and reedy: Continuing with sync
- 21:05 jforrester@deploy2002: jforrester and reedy: Backport for gerrit:994716 Gadget: Bump GADGET_CLASS_VERSION (T356322) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56003 and previous config saved to /var/cache/conftool/dbconfig/20240131-210419-marostegui.json
- 21:03 jforrester@deploy2002: Started scap: Backport for gerrit:994716 Gadget: Bump GADGET_CLASS_VERSION (T356322)
- 20:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56002 and previous config saved to /var/cache/conftool/dbconfig/20240131-204913-marostegui.json
- 20:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56001 and previous config saved to /var/cache/conftool/dbconfig/20240131-204439-marostegui.json
- 20:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 20:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 20:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 20:37 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
- 20:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P56000 and previous config saved to /var/cache/conftool/dbconfig/20240131-203704-marostegui.json
- 20:36 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
- 20:36 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: sync
- 20:35 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: sync
- 20:35 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
- 20:35 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
- 20:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript userOptions.php --wiki=testwiki --old-is-default --old=2 --new 1 --nowarn 'echo-subscriptions-web-reverted' # T353225
- 20:32 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
- 20:31 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
- 20:28 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd] (duration: 03m 35s)
- 20:28 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
- 20:27 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
- 20:25 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd]
- 20:24 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd] (duration: 00m 05s)
- 20:24 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd]
- 20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55999 and previous config saved to /var/cache/conftool/dbconfig/20240131-202158-marostegui.json
- 20:10 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd] (duration: 10m 51s)
- 20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55998 and previous config saved to /var/cache/conftool/dbconfig/20240131-200652-marostegui.json
- 19:59 joal@deploy2002: Started deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd]
- 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55997 and previous config saved to /var/cache/conftool/dbconfig/20240131-195145-marostegui.json
- 19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55996 and previous config saved to /var/cache/conftool/dbconfig/20240131-193927-marostegui.json
- 19:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 19:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55994 and previous config saved to /var/cache/conftool/dbconfig/20240131-193905-marostegui.json
- 19:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55993 and previous config saved to /var/cache/conftool/dbconfig/20240131-192359-marostegui.json
- 19:17 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
- 19:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55992 and previous config saved to /var/cache/conftool/dbconfig/20240131-190852-marostegui.json
- 18:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55991 and previous config saved to /var/cache/conftool/dbconfig/20240131-185345-marostegui.json
- 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55990 and previous config saved to /var/cache/conftool/dbconfig/20240131-184900-marostegui.json
- 18:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 18:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55989 and previous config saved to /var/cache/conftool/dbconfig/20240131-184838-marostegui.json
- 18:40 phuedx@deploy2002: Finished deploy [airflow-dags/analytics@5078a6b]: (no justification provided) (duration: 00m 28s)
- 18:40 phuedx@deploy2002: Started deploy [airflow-dags/analytics@5078a6b]: (no justification provided)
- 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55988 and previous config saved to /var/cache/conftool/dbconfig/20240131-183332-marostegui.json
- 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55986 and previous config saved to /var/cache/conftool/dbconfig/20240131-181825-marostegui.json
- 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
- 18:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
- 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55985 and previous config saved to /var/cache/conftool/dbconfig/20240131-180319-marostegui.json
- 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55984 and previous config saved to /var/cache/conftool/dbconfig/20240131-175833-marostegui.json
- 17:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 17:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55983 and previous config saved to /var/cache/conftool/dbconfig/20240131-175811-marostegui.json
- 17:51 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
- 17:50 aokoth@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM vrts1001.eqiad.wmnet
- 17:46 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
- 17:45 aokoth@cumin1002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM vrts1001.eqiad.wmnet
- 17:45 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
- 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55982 and previous config saved to /var/cache/conftool/dbconfig/20240131-174305-marostegui.json
- 17:35 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2] (duration: 03m 29s)
- 17:31 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2]
- 17:31 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2] (duration: 00m 08s)
- 17:30 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2]
- 17:30 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2] (duration: 11m 05s)
- 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55981 and previous config saved to /var/cache/conftool/dbconfig/20240131-172758-marostegui.json
- 17:19 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2]
- 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55980 and previous config saved to /var/cache/conftool/dbconfig/20240131-171252-marostegui.json
- 17:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55979 and previous config saved to /var/cache/conftool/dbconfig/20240131-170141-marostegui.json
- 17:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 17:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55978 and previous config saved to /var/cache/conftool/dbconfig/20240131-170120-marostegui.json
- 17:01 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1] (duration: 03m 35s)
- 16:57 ejegg: fundraising civicrm upgraded from 520337a0 to 6344c95e
- 16:57 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1]
- 16:56 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1] (duration: 00m 06s)
- 16:56 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1]
- 16:54 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
- 16:52 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1] (duration: 09m 52s)
- 16:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55977 and previous config saved to /var/cache/conftool/dbconfig/20240131-164613-marostegui.json
- 16:43 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1]
- 16:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55976 and previous config saved to /var/cache/conftool/dbconfig/20240131-163106-marostegui.json
- 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55974 and previous config saved to /var/cache/conftool/dbconfig/20240131-161600-marostegui.json
- 16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55973 and previous config saved to /var/cache/conftool/dbconfig/20240131-160624-marostegui.json
- 16:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 16:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55972 and previous config saved to /var/cache/conftool/dbconfig/20240131-160602-marostegui.json
- 16:01 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 15:58 moritzm: installing openssh security updates
- 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moscovium.eqiad.wmnet
- 15:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host moscovium.eqiad.wmnet
- 15:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55970 and previous config saved to /var/cache/conftool/dbconfig/20240131-155055-marostegui.json
- 15:50 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 15:47 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 15:47 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 15:47 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 15:46 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 15:46 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 15:45 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 15:45 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 15:45 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 15:44 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 15:43 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
- 15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
- 15:39 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
- 15:36 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
- 15:36 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=maps2009.codfw.wmnet
- 15:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55969 and previous config saved to /var/cache/conftool/dbconfig/20240131-153549-marostegui.json
- 15:34 hnowlan@puppetmaster1001: conftool action : set/weight=10; selector: name=maps1009.eqiad.wmnet
- 15:32 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
- 15:29 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1009.eqiad.wmnet
- 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55968 and previous config saved to /var/cache/conftool/dbconfig/20240131-152042-marostegui.json
- 15:18 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 15:17 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 15:17 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 15:16 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 15:16 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 15:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 15:14 btullis@cumin1002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
- 15:14 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 15:14 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 15:14 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 15:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55967 and previous config saved to /var/cache/conftool/dbconfig/20240131-151016-marostegui.json
- 15:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 15:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 15:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55966 and previous config saved to /var/cache/conftool/dbconfig/20240131-150934-marostegui.json
- 15:09 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 15:08 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 15:08 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 15:07 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 15:06 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 15:05 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:58 btullis@cumin1002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
- 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55965 and previous config saved to /var/cache/conftool/dbconfig/20240131-145427-marostegui.json
- 14:53 brouberol: I'm going to apply kafka log compaction for {eqiad,codfw}.mediawiki.cirrussearch.page_rerender.v1 on kafka-main-eqiad only (current replica) - T354794
- 14:52 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.codfw.wmnet
- 14:46 urbanecm@deploy2002: Finished scap: Backport for gerrit:994176 Add WikimediaCampaignEvents to extension list (T347894) (duration: 10m 41s)
- 14:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists2001.codfw.wmnet
- 14:43 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:40 urbanecm@deploy2002: cmelo and urbanecm: Continuing with sync
- 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55964 and previous config saved to /var/cache/conftool/dbconfig/20240131-143921-marostegui.json
- 14:37 urbanecm@deploy2002: cmelo and urbanecm: Backport for gerrit:994176 Add WikimediaCampaignEvents to extension list (T347894) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:36 urbanecm@deploy2002: Started scap: Backport for gerrit:994176 Add WikimediaCampaignEvents to extension list (T347894)
- 14:30 urbanecm@deploy2002: Finished scap: Backport for gerrit:994702 [metawiki] Let admins add/remove the event-organizer group (T356070), gerrit:994711 index.php: Restore support for forcesafemode option. (T355314) (duration: 10m 05s)
- 14:28 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55963 and previous config saved to /var/cache/conftool/dbconfig/20240131-142413-marostegui.json
- 14:23 urbanecm@deploy2002: daimona and matmarex and urbanecm: Continuing with sync
- 14:21 urbanecm@deploy2002: daimona and matmarex and urbanecm: Backport for gerrit:994702 [metawiki] Let admins add/remove the event-organizer group (T356070), gerrit:994711 index.php: Restore support for forcesafemode option. (T355314) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:21 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
- 14:20 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
- 14:20 urbanecm@deploy2002: Started scap: Backport for gerrit:994702 [metawiki] Let admins add/remove the event-organizer group (T356070), gerrit:994711 index.php: Restore support for forcesafemode option. (T355314)
- 14:19 urbanecm@deploy2002: Finished scap: Backport for gerrit:994234 decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), gerrit:994235 decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), gerrit:994708 Add an exception for ConvenientDiscussions-style permalinks (T349653), gerrit:994709 Add an exception for ConvenientDiscussions-style permalinks (T349653)
- 14:18 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript migrateUserGroup.php --wiki=metawiki campaignevents-beta-tester event-organizer # T356070
- 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55962 and previous config saved to /var/cache/conftool/dbconfig/20240131-141316-marostegui.json
- 14:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 14:13 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Continuing with sync
- 14:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 14:10 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Backport for gerrit:994234 decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), gerrit:994235 decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), gerrit:994708 Add an exception for ConvenientDiscussions-style permalinks (T349653), gerrit:994709 Add an exception for ConvenientDiscuss
- 14:09 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 14:08 urbanecm@deploy2002: Started scap: Backport for gerrit:994234 decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), gerrit:994235 decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), gerrit:994708 Add an exception for ConvenientDiscussions-style permalinks (T349653), gerrit:994709 Add an exception for ConvenientDiscussions-style permalinks (T349653)
- 14:08 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 14:07 urbanecm@deploy2002: Finished scap: Backport for gerrit:994732 testwiki: Temporarily change default value for 4 Echo properties (T353225) (duration: 19m 37s)
- 14:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 14:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 14:00 urbanecm@deploy2002: urbanecm: Continuing with sync
- 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2003.codfw.wmnet
- 13:51 urbanecm@deploy2002: urbanecm: Backport for gerrit:994732 testwiki: Temporarily change default value for 4 Echo properties (T353225) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:48 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people2003.codfw.wmnet
- 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet1003.eqiad.wmnet
- 13:48 urbanecm@deploy2002: Started scap: Backport for gerrit:994732 testwiki: Temporarily change default value for 4 Echo properties (T353225)
- 13:44 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet1003.eqiad.wmnet
- 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55960 and previous config saved to /var/cache/conftool/dbconfig/20240131-133143-marostegui.json
- 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
- 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
- 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
- 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55959 and previous config saved to /var/cache/conftool/dbconfig/20240131-131637-marostegui.json
- 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
- 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4002.ulsfo.wmnet
- 13:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4002.ulsfo.wmnet
- 13:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3003.esams.wmnet
- 13:04 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
- 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3003.esams.wmnet
- 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
- 13:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55957 and previous config saved to /var/cache/conftool/dbconfig/20240131-130130-marostegui.json
- 12:58 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
- 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
- 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
- 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
- 12:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55956 and previous config saved to /var/cache/conftool/dbconfig/20240131-124623-marostegui.json
- 12:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 12:44 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 12:44 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 12:44 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 12:42 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host netmon1003.wikimedia.org
- 12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55955 and previous config saved to /var/cache/conftool/dbconfig/20240131-123224-marostegui.json
- 12:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 12:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55954 and previous config saved to /var/cache/conftool/dbconfig/20240131-123203-marostegui.json
- 12:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
- 12:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
- 12:24 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host dbstore1009.eqiad.wmnet
- 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
- 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
- 12:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55953 and previous config saved to /var/cache/conftool/dbconfig/20240131-121656-marostegui.json
- 12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
- 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2008.wikimedia.org
- 12:13 claime: Raising external traffic to mw-on-k8s to 35% - T355532
- 12:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards2001.codfw.wmnet
- 12:12 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1009.eqiad.wmnet
- 12:11 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dbstore1008.eqiad.wmnet
- 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2008.wikimedia.org
- 12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2007.wikimedia.org
- 12:10 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 12:10 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 12:10 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 12:09 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards2001.codfw.wmnet
- 12:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
- 12:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 12:08 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 12:08 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 12:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2007.wikimedia.org
- 12:07 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 12:07 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
- 12:06 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1006.wikimedia.org
- 12:05 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
- 12:05 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 12:04 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
- 12:04 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 12:04 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 12:03 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 12:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet2003.codfw.wmnet
- 12:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1006.wikimedia.org
- 12:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55952 and previous config saved to /var/cache/conftool/dbconfig/20240131-120150-marostegui.json
- 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1005.wikimedia.org
- 12:00 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1008.eqiad.wmnet
- 11:59 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet2003.codfw.wmnet
- 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1004.eqiad.wmnet
- 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1005.wikimedia.org
- 11:51 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people1004.eqiad.wmnet
- 11:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55951 and previous config saved to /var/cache/conftool/dbconfig/20240131-114643-marostegui.json
- 11:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
- 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
- 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
- 11:38 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
- 11:38 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
- 11:37 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
- 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
- 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55950 and previous config saved to /var/cache/conftool/dbconfig/20240131-113518-marostegui.json
- 11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55949 and previous config saved to /var/cache/conftool/dbconfig/20240131-113456-marostegui.json
- 11:34 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
- 11:29 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1424.eqiad.wmnet with OS bullseye
- 11:28 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2006.codfw.wmnet
- 11:27 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host testvm2006.codfw.wmnet with OS bookworm
- 11:27 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
- 11:26 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1423.eqiad.wmnet with OS bullseye
- 11:24 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1425.eqiad.wmnet with OS bullseye
- 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55948 and previous config saved to /var/cache/conftool/dbconfig/20240131-111949-marostegui.json
- 11:11 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
- 11:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
- 11:05 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
- 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55947 and previous config saved to /var/cache/conftool/dbconfig/20240131-110442-marostegui.json
- 11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
- 11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
- 11:01 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
- 10:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
- 10:53 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
- 10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
- 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55946 and previous config saved to /var/cache/conftool/dbconfig/20240131-104936-marostegui.json
- 10:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
- 10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1424.eqiad.wmnet with OS bullseye
- 10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1423.eqiad.wmnet with OS bullseye
- 10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1425.eqiad.wmnet with OS bullseye
- 10:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 10:43 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 10:42 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 10:41 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 10:41 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
- 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55945 and previous config saved to /var/cache/conftool/dbconfig/20240131-103830-marostegui.json
- 10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55944 and previous config saved to /var/cache/conftool/dbconfig/20240131-103807-marostegui.json
- 10:36 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
- 10:35 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 07s)
- 10:35 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
- 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
- 10:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
- 10:30 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 05s)
- 10:30 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
- 10:30 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
- 10:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
- 10:29 stevemunene@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
- 10:25 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
- 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
- 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55943 and previous config saved to /var/cache/conftool/dbconfig/20240131-102300-marostegui.json
- 10:21 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
- 10:20 cgoubert@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host testreduce1002.eqiad.wmnet
- 10:20 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
- 10:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
- 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55942 and previous config saved to /var/cache/conftool/dbconfig/20240131-100754-marostegui.json
- 10:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
- 10:02 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
- 10:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
- 09:53 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
- 09:53 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host moss-be2003.codfw.wmnet
- 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55941 and previous config saved to /var/cache/conftool/dbconfig/20240131-095247-marostegui.json
- 09:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
- 09:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
- 09:51 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
- 09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
- 09:50 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
- 09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
- 09:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
- 09:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 09:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
- 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55940 and previous config saved to /var/cache/conftool/dbconfig/20240131-094301-marostegui.json
- 09:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 09:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 09:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55939 and previous config saved to /var/cache/conftool/dbconfig/20240131-094239-marostegui.json
- 09:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
- 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55938 and previous config saved to /var/cache/conftool/dbconfig/20240131-092733-marostegui.json
- 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
- 09:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4005.wikimedia.org
- 09:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4005.wikimedia.org
- 09:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55937 and previous config saved to /var/cache/conftool/dbconfig/20240131-091226-marostegui.json
- 09:08 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
- 09:07 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1003.eqiad.wmnet
- 09:01 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
- 08:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55936 and previous config saved to /var/cache/conftool/dbconfig/20240131-085719-marostegui.json
- 08:55 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
- 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
- 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
- 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55935 and previous config saved to /var/cache/conftool/dbconfig/20240131-084700-marostegui.json
- 08:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 08:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55934 and previous config saved to /var/cache/conftool/dbconfig/20240131-084637-marostegui.json
- 08:45 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
- 08:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
- 08:44 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
- 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host crm2001.codfw.wmnet
- 08:40 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
- 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host crm2001.codfw.wmnet
- 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55932 and previous config saved to /var/cache/conftool/dbconfig/20240131-083142-root.json
- 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55931 and previous config saved to /var/cache/conftool/dbconfig/20240131-083130-marostegui.json
- 08:27 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
- 08:21 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
- 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
- 08:18 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
- 08:17 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55930 and previous config saved to /var/cache/conftool/dbconfig/20240131-081637-root.json
- 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
- 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55929 and previous config saved to /var/cache/conftool/dbconfig/20240131-081624-marostegui.json
- 08:14 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
- 08:13 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
- 08:13 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 08:09 moritzm: installing ca-certificates-java bugfix updates from bookworm 12.4 point release
- 08:09 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
- 08:09 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
- 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
- 08:05 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
- 08:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
- 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55928 and previous config saved to /var/cache/conftool/dbconfig/20240131-080132-root.json
- 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55927 and previous config saved to /var/cache/conftool/dbconfig/20240131-080117-marostegui.json
- 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
- 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55926 and previous config saved to /var/cache/conftool/dbconfig/20240131-075600-marostegui.json
- 07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55925 and previous config saved to /var/cache/conftool/dbconfig/20240131-075522-marostegui.json
- 07:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
- 07:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
- 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55924 and previous config saved to /var/cache/conftool/dbconfig/20240131-074627-root.json
- 07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
- 07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
- 07:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
- 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55923 and previous config saved to /var/cache/conftool/dbconfig/20240131-074015-marostegui.json
- 07:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 07:38 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
- 07:38 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55922 and previous config saved to /var/cache/conftool/dbconfig/20240131-073121-root.json
- 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
- 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55921 and previous config saved to /var/cache/conftool/dbconfig/20240131-072509-marostegui.json
- 07:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
- 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55920 and previous config saved to /var/cache/conftool/dbconfig/20240131-072129-root.json
- 07:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS bookworm
- 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 5%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55919 and previous config saved to /var/cache/conftool/dbconfig/20240131-071616-root.json
- 07:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
- 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55918 and previous config saved to /var/cache/conftool/dbconfig/20240131-071002-marostegui.json
- 07:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
- 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55917 and previous config saved to /var/cache/conftool/dbconfig/20240131-070624-root.json
- 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 1%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55916 and previous config saved to /var/cache/conftool/dbconfig/20240131-070111-root.json
- 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55915 and previous config saved to /var/cache/conftool/dbconfig/20240131-065922-marostegui.json
- 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
- 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
- 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55914 and previous config saved to /var/cache/conftool/dbconfig/20240131-065901-marostegui.json
- 06:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
- 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2114.codfw.wmnet with OS bookworm
- 06:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
- 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55913 and previous config saved to /var/cache/conftool/dbconfig/20240131-065118-root.json
- 06:47 moritzm: installing glibc security updates on bookworm
- 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55912 and previous config saved to /var/cache/conftool/dbconfig/20240131-064353-marostegui.json
- 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
- 06:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
- 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55911 and previous config saved to /var/cache/conftool/dbconfig/20240131-063613-root.json
- 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS bookworm
- 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55910 and previous config saved to /var/cache/conftool/dbconfig/20240131-062846-marostegui.json
- 06:22 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2114.codfw.wmnet with OS bookworm
- 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55909 and previous config saved to /var/cache/conftool/dbconfig/20240131-062109-root.json
- 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T354506', diff saved to https://phabricator.wikimedia.org/P55908 and previous config saved to /var/cache/conftool/dbconfig/20240131-061932-root.json
- 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55907 and previous config saved to /var/cache/conftool/dbconfig/20240131-061340-marostegui.json
- 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55906 and previous config saved to /var/cache/conftool/dbconfig/20240131-060602-root.json
- 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55905 and previous config saved to /var/cache/conftool/dbconfig/20240131-060337-marostegui.json
- 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55904 and previous config saved to /var/cache/conftool/dbconfig/20240131-055057-root.json
- 05:41 eileen: civicrm upgraded from 6de61520 to 520337a0
- 05:30 fab@deploy2002: Finished deploy [airflow-dags/research@97c6a4e]: (no justification provided) (duration: 00m 14s)
- 05:30 fab@deploy2002: Started deploy [airflow-dags/research@97c6a4e]: (no justification provided)
- 03:29 eileen: tools upgraded from 02281338 to c823e692
- 03:05 fab@deploy2002: Finished deploy [airflow-dags/research@6a97a34]: (no justification provided) (duration: 00m 23s)
- 03:05 fab@deploy2002: Started deploy [airflow-dags/research@6a97a34]: (no justification provided)
2024-01-30
- 23:54 mutante: LDAP - added aklapper to group releng T356043
- 23:07 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1006.eqiad.wmnet
- 23:07 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1006.eqiad.wmnet
- 22:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
- 22:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
- 22:41 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
- 22:20 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1005.eqiad.wmnet
- 22:20 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1005.eqiad.wmnet
- 22:10 cjming: end of UTC late backport window
- 22:09 cjming@deploy2002: Finished scap: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]] (duration: 08m 24s)
- 22:02 cjming@deploy2002: cjming and superpes: Continuing with sync
- 22:02 cjming@deploy2002: cjming and superpes: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 22:00 cjming@deploy2002: Started scap: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]]
- 21:59 cjming@deploy2002: Finished scap: Backport for [[gerrit:994211|[ukwiki] Change autoconfirmed setting (T355972)]], [[gerrit:994214|[ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850)]], [[gerrit:994220|[ganwiki] Add new namespace aliases (T355854)]] (duration: 09m 32s)
- 21:53 cjming@deploy2002: superpes and cjming: Continuing with sync
- 21:51 cjming@deploy2002: superpes and cjming: Backport for [[gerrit:994211|[ukwiki] Change autoconfirmed setting (T355972)]], [[gerrit:994214|[ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850)]], [[gerrit:994220|[ganwiki] Add new namespace aliases (T355854)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:50 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
- 21:50 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
- 21:49 cjming@deploy2002: Started scap: Backport for [[gerrit:994211|[ukwiki] Change autoconfirmed setting (T355972)]], [[gerrit:994214|[ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850)]], [[gerrit:994220|[ganwiki] Add new namespace aliases (T355854)]]
- 21:44 cjming@deploy2002: Finished scap: Backport for [[gerrit:994143|Run CheckerJob against read-only clusters (T354793)]] (duration: 07m 41s)
- 21:42 mutante: LDAP - added jnuche to group releng (T356043) - already done/approved in the past in T301149
- 21:41 mutante: LDAP - added jhuneidi to group releng (T356043) - already done/approved in the past in T210028
- 21:40 mutante: LDAP - added brennen to group releng (T356043) - already done/approved in the past in T215365
- 21:38 cjming@deploy2002: cjming and ebernhardson: Continuing with sync
- 21:38 cjming@deploy2002: cjming and ebernhardson: Backport for [[gerrit:994143|Run CheckerJob against read-only clusters (T354793)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:37 cjming@deploy2002: Started scap: Backport for [[gerrit:994143|Run CheckerJob against read-only clusters (T354793)]]
- 21:36 cjming@deploy2002: Finished scap: Backport for [[gerrit:994142|Run CheckerJob against read-only clusters (T354793)]] (duration: 07m 49s)
- 21:34 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
- 21:30 cjming@deploy2002: ebernhardson and cjming: Continuing with sync
- 21:30 cjming@deploy2002: ebernhardson and cjming: Backport for [[gerrit:994142|Run CheckerJob against read-only clusters (T354793)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:28 cjming@deploy2002: Started scap: Backport for [[gerrit:994142|Run CheckerJob against read-only clusters (T354793)]]
- 21:01 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1004.eqiad.wmnet
- 21:01 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1004.eqiad.wmnet
- 20:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
- 20:51 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
- 20:38 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
- 20:38 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
- 20:35 urandom: bootstrapping sessionstore1004/cassandra-a — T353402
- 20:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::public
- 19:45 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::public
- 19:36 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
- 19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
- 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 19:27 Lucas_WMDE: FINISHED lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168 # -- 268378 invalid signatures --
- 19:10 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
- 19:09 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
- 18:52 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 18:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 18:46 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 05s)
- 18:46 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
- 18:17 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
- 18:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
- 18:05 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 18:03 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 17:37 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 17:37 urandom: DROP test_spark3_loading keyspace, Generated Data (Cassandra) cluster — T356112
- 17:22 jforrester@deploy2002: Finished scap: Backport for [[gerrit:994202|Do not search for elements if no previews have been registered (T355933 T356186 T356193)]], [[gerrit:994203|Do not search for elements if no previews have been registered (T355933 T356186 T356193)]] (duration: 11m 51s)
- 17:21 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
- 17:15 jforrester@deploy2002: jforrester: Continuing with sync
- 17:14 jforrester@deploy2002: jforrester: Backport for [[gerrit:994202|Do not search for elements if no previews have been registered (T355933 T356186 T356193)]], [[gerrit:994203|Do not search for elements if no previews have been registered (T355933 T356186 T356193)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 17:13 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2005.codfw.wmnet with OS bookworm
- 17:10 jforrester@deploy2002: Started scap: Backport for [[gerrit:994202|Do not search for elements if no previews have been registered (T355933 T356186 T356193)]], [[gerrit:994203|Do not search for elements if no previews have been registered (T355933 T356186 T356193)]]
- 16:57 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1009.wikimedia.org
- 16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1008.wikimedia.org
- 16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1007.wikimedia.org
- 16:54 claime: Running homer 'cr*codfw*' commit 'T351074'
- 16:54 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: sync
- 16:54 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: sync
- 16:49 mutante: gitlab is back
- 16:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 16:47 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 16:44 mutante: gitlab is down for maintenance for a few minutes
- 16:34 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 16:29 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab.wikimedia.org with reason: server move
- 16:29 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab.wikimedia.org with reason: server move
- 16:28 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
- 16:28 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
- 16:25 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1466.eqiad.wmnet with OS bullseye
- 16:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1457.eqiad.wmnet with OS bullseye
- 16:18 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2366.codfw.wmnet with OS bullseye
- 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1440.eqiad.wmnet with OS bullseye
- 16:14 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 16:13 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
- 16:13 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2370.codfw.wmnet with OS bullseye
- 16:11 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
- 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1482.eqiad.wmnet with OS bullseye
- 16:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2368.codfw.wmnet with OS bullseye
- 16:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
- 16:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1459.eqiad.wmnet with OS bullseye
- 16:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
- 15:59 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
- 15:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
- 15:58 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
- 15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
- 15:54 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 15:53 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
- 15:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
- 15:47 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
- 15:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
- 15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
- 15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
- 15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
- 15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
- 15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
- 15:41 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
- 15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
- 15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
- 15:40 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 15:29 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168
- 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1466.eqiad.wmnet with OS bullseye
- 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1459.eqiad.wmnet with OS bullseye
- 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1482.eqiad.wmnet with OS bullseye
- 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1457.eqiad.wmnet with OS bullseye
- 15:27 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1440.eqiad.wmnet with OS bullseye
- 15:26 Lucas_WMDE: UTC afternoon backport+config window done
- 15:26 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2370.codfw.wmnet with OS bullseye
- 15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2368.codfw.wmnet with OS bullseye
- 15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2366.codfw.wmnet with OS bullseye
- 15:17 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwikiquote --fix # T355195 (two pages will need separate fixing)
- 15:17 claime: Recommissioning mw2366.codfw.wmnet,mw2368.codfw.wmnet,mw2370.codfw.wmnet as k8s nodes - T351074
- 15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host sretest2005.codfw.wmnet
- 15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
- 15:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:993458|[enwikiquote] Add a draft namespace and its talk space (T355195)]] (duration: 08m 43s)
- 15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
- 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [[gerrit:993458|[enwikiquote] Add a draft namespace and its talk space (T355195)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:993458|[enwikiquote] Add a draft namespace and its talk space (T355195)]]
- 15:06 claime: Manual run of mediawiki_job_generatecaptcha.service following timer failure - T141490
- 15:06 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwiktionary --fix # T354813
- 15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:993457|[enwiktionary] Remove the Concordance namespace and its talk space (T354813)]] (duration: 09m 57s)
- 14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
- 14:57 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [[gerrit:993457|[enwiktionary] Remove the Concordance namespace and its talk space (T354813)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:55 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:993457|[enwiktionary] Remove the Concordance namespace and its talk space (T354813)]]
- 14:52 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes azwiki --fix # T355041, failed at the end :(
- 14:52 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:993452|[azwiki] Changing 9 namespace aliases (T355041)]] (duration: 08m 37s)
- 14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
- 14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [[gerrit:993452|[azwiki] Changing 9 namespace aliases (T355041)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:993452|[azwiki] Changing 9 namespace aliases (T355041)]]
- 14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:994139|CommentParser: Ignore generated timestamp links (T356142)]], [[gerrit:994140|CommentParser: Ignore generated timestamp links (T356142)]], [[gerrit:994141|Add maintenance script to list users with invalid signatures (T356168)]] (duration: 11m 01s)
- 14:40 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
- 14:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for [[gerrit:994139|CommentParser: Ignore generated timestamp links (T356142)]], [[gerrit:994140|CommentParser: Ignore generated timestamp links (T356142)]], [[gerrit:994141|Add maintenance script to list users with invalid signatures (T356168)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:31 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 14:31 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:994139|CommentParser: Ignore generated timestamp links (T356142)]], [[gerrit:994140|CommentParser: Ignore generated timestamp links (T356142)]], [[gerrit:994141|Add maintenance script to list users with invalid signatures (T356168)]]
- 14:30 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 14:30 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 14:26 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 14:26 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 backport Cancelled
- 14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:994028|Don't bail out early when there are no selectors configured (T355933)]] (duration: 09m 04s)
- 14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
- 14:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for [[gerrit:994028|Don't bail out early when there are no selectors configured (T355933)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:11 volans@cumin2002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:994028|Don't bail out early when there are no selectors configured (T355933)]]
- 14:09 volans@cumin2002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 13:56 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
- 13:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:55 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2005.codfw.wmnet on all recursors
- 13:54 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2005.codfw.wmnet on all recursors
- 13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:53 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 13:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host sretest2005.codfw.wmnet
- 13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts srestest2005.codfw.wmnet
- 13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
- 13:44 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
- 13:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 13:37 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
- 13:36 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts srestest2005.codfw.wmnet
- 13:34 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=94) for new host srestest2005.codfw.wmnet
- 13:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:33 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
- 13:32 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
- 13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:31 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 13:26 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
- 13:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
- 13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
- 13:16 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
- 13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:15 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
- 13:12 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
- 13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:10 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 13:08 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1159-1175].eqiad.wmnet
- 13:08 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1159-1175].eqiad.wmnet
- 13:08 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 13:08 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
- 13:06 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1158.eqiad.wmnet
- 13:04 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1158.eqiad.wmnet
- 12:19 taavi: reprepro import exim4 4.96-15+deb12u4+wmf1 to component/exim4-arc T356171
- 11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55896 and previous config saved to /var/cache/conftool/dbconfig/20240130-114726-ladsgroup.json
- 11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1005.eqiad.wmnet
- 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55895 and previous config saved to /var/cache/conftool/dbconfig/20240130-113220-ladsgroup.json
- 11:30 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1157.eqiad.wmnet
- 11:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1005.eqiad.wmnet
- 11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
- 11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
- 11:19 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1157.eqiad.wmnet
- 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55894 and previous config saved to /var/cache/conftool/dbconfig/20240130-111713-ladsgroup.json
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::search
- 11:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
- 11:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
- 11:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::search
- 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55893 and previous config saved to /var/cache/conftool/dbconfig/20240130-110207-ladsgroup.json
- 10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55892 and previous config saved to /var/cache/conftool/dbconfig/20240130-105954-ladsgroup.json
- 10:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 10:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 10:56 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1005.eqiad.wmnet with OS bullseye
- 10:56 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 10:45 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 10:35 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
- 10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
- 10:35 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
- 10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 10:34 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 10:34 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 10:32 volans@cumin1002: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox-canary
- 10:32 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
- 10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 10:31 volans@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 10:29 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
- 10:29 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
- 10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:28 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 10:28 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
- 10:26 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
- 10:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 10:25 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
- 10:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host srestest2005.codfw.wmnet
- 10:24 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
- 10:23 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 10:23 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
- 10:23 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
- 10:16 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1005.eqiad.wmnet with OS bullseye
- 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab1004.eqiad.wmnet
- 10:00 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 37s)
- 10:00 gmodena@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
- 09:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab1004.eqiad.wmnet
- 09:30 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1008.eqiad.wmnet with OS bullseye
- 09:14 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
- 09:11 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
- 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: Switchover', diff saved to https://phabricator.wikimedia.org/P55891 and previous config saved to /var/cache/conftool/dbconfig/20240130-090704-root.json
- 09:00 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1008.eqiad.wmnet with OS bullseye
- 08:57 Emperor: restart swift-object-replicator on ms-be1068
- 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: Switchover', diff saved to https://phabricator.wikimedia.org/P55890 and previous config saved to /var/cache/conftool/dbconfig/20240130-085159-root.json
- 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55889 and previous config saved to /var/cache/conftool/dbconfig/20240130-085055-root.json
- 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55888 and previous config saved to /var/cache/conftool/dbconfig/20240130-083829-root.json
- 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: Switchover', diff saved to https://phabricator.wikimedia.org/P55887 and previous config saved to /var/cache/conftool/dbconfig/20240130-083654-root.json
- 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55886 and previous config saved to /var/cache/conftool/dbconfig/20240130-083550-root.json
- 08:29 moritzm: upgrading python-pymysql on remaining DB hosts to 1.0.2-2~wmf11u1 T355531
- 08:28 ladsgroup@deploy2002: Finished scap: Backport for gerrit:993824 Enable PageNotice extension on testwiki (T61245) (duration: 10m 24s)
- 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55885 and previous config saved to /var/cache/conftool/dbconfig/20240130-082324-root.json
- 08:22 ladsgroup@deploy2002: ladsgroup and tto: Continuing with sync
- 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: Switchover', diff saved to https://phabricator.wikimedia.org/P55884 and previous config saved to /var/cache/conftool/dbconfig/20240130-082149-root.json
- 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55883 and previous config saved to /var/cache/conftool/dbconfig/20240130-082045-root.json
- 08:19 ladsgroup@deploy2002: ladsgroup and tto: Backport for gerrit:993824 Enable PageNotice extension on testwiki (T61245) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:18 ladsgroup@deploy2002: Started scap: Backport for gerrit:993824 Enable PageNotice extension on testwiki (T61245)
- 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55882 and previous config saved to /var/cache/conftool/dbconfig/20240130-080819-root.json
- 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: Switchover', diff saved to https://phabricator.wikimedia.org/P55881 and previous config saved to /var/cache/conftool/dbconfig/20240130-080644-root.json
- 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55880 and previous config saved to /var/cache/conftool/dbconfig/20240130-080540-root.json
- 07:55 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
- 07:53 ayounsi@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
- 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55879 and previous config saved to /var/cache/conftool/dbconfig/20240130-075314-root.json
- 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55878 and previous config saved to /var/cache/conftool/dbconfig/20240130-075035-root.json
- 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2105 T356069', diff saved to https://phabricator.wikimedia.org/P55877 and previous config saved to /var/cache/conftool/dbconfig/20240130-074746-root.json
- 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2127 to s3 primary and set section read-write T356069', diff saved to https://phabricator.wikimedia.org/P55876 and previous config saved to /var/cache/conftool/dbconfig/20240130-074656-marostegui.json
- 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Set s3 codfw as read-only for maintenance - T356069', diff saved to https://phabricator.wikimedia.org/P55875 and previous config saved to /var/cache/conftool/dbconfig/20240130-074634-marostegui.json
- 07:46 marostegui: Starting s3 codfw failover from db2105 to db2127 - T356069
- 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55874 and previous config saved to /var/cache/conftool/dbconfig/20240130-073807-root.json
- 07:33 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
- 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2127 with weight 0 T356069', diff saved to https://phabricator.wikimedia.org/P55873 and previous config saved to /var/cache/conftool/dbconfig/20240130-073257-marostegui.json
- 07:32 root@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
- 07:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55872 and previous config saved to /var/cache/conftool/dbconfig/20240130-072734-root.json
- 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55871 and previous config saved to /var/cache/conftool/dbconfig/20240130-072302-root.json
- 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55870 and previous config saved to /var/cache/conftool/dbconfig/20240130-071612-root.json
- 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55869 and previous config saved to /var/cache/conftool/dbconfig/20240130-071229-root.json
- 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2144 to x2 master T356060', diff saved to https://phabricator.wikimedia.org/P55868 and previous config saved to /var/cache/conftool/dbconfig/20240130-071202-root.json
- 07:07 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55867 and previous config saved to /var/cache/conftool/dbconfig/20240130-070757-root.json
- 07:02 marostegui@deploy2002: Finished scap: Backport for gerrit:993775 Revert "db-production.php: Disable writes on es4" (duration: 07m 48s)
- 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55866 and previous config saved to /var/cache/conftool/dbconfig/20240130-070107-root.json
- 07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
- 07:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
- 06:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55865 and previous config saved to /var/cache/conftool/dbconfig/20240130-065724-root.json
- 06:55 marostegui@deploy2002: marostegui: Continuing with sync
- 06:55 marostegui@deploy2002: marostegui: Backport for gerrit:993775 Revert "db-production.php: Disable writes on es4" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 06:54 marostegui@deploy2002: Started scap: Backport for gerrit:993775 Revert "db-production.php: Disable writes on es4"
- 06:48 marostegui@deploy2002: backport Cancelled
- 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55864 and previous config saved to /var/cache/conftool/dbconfig/20240130-064602-root.json
- 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2020 T356064', diff saved to https://phabricator.wikimedia.org/P55863 and previous config saved to /var/cache/conftool/dbconfig/20240130-064526-root.json
- 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Reduce es2021 weight T356064', diff saved to https://phabricator.wikimedia.org/P55862 and previous config saved to /var/cache/conftool/dbconfig/20240130-064512-root.json
- 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55861 and previous config saved to /var/cache/conftool/dbconfig/20240130-064219-root.json
- 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2021 to es4 primary T356064', diff saved to https://phabricator.wikimedia.org/P55860 and previous config saved to /var/cache/conftool/dbconfig/20240130-063625-root.json
- 06:35 marostegui: Starting es4 codfw failover from es2020 to es2021 - T356064
- 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55859 and previous config saved to /var/cache/conftool/dbconfig/20240130-063057-root.json
- 06:30 marostegui@deploy2002: Finished scap: Backport for gerrit:993711 db-production.php: Disable writes on es4 (T356064) (duration: 09m 11s)
- 06:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1224 T354591', diff saved to https://phabricator.wikimedia.org/P55858 and previous config saved to /var/cache/conftool/dbconfig/20240130-062930-root.json
- 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55857 and previous config saved to /var/cache/conftool/dbconfig/20240130-062714-root.json
- 06:23 marostegui@deploy2002: marostegui: Continuing with sync
- 06:22 marostegui@deploy2002: marostegui: Backport for gerrit:993711 db-production.php: Disable writes on es4 (T356064) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2020 with weight 0 T356064', diff saved to https://phabricator.wikimedia.org/P55856 and previous config saved to /var/cache/conftool/dbconfig/20240130-062241-marostegui.json
- 06:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
- 06:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
- 06:21 marostegui@deploy2002: Started scap: Backport for gerrit:993711 db-production.php: Disable writes on es4 (T356064)
- 06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
- 06:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
- 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55855 and previous config saved to /var/cache/conftool/dbconfig/20240130-061552-root.json
- 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2103 T356059', diff saved to https://phabricator.wikimedia.org/P55854 and previous config saved to /var/cache/conftool/dbconfig/20240130-061529-root.json
- 06:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55853 and previous config saved to /var/cache/conftool/dbconfig/20240130-061423-root.json
- 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2112 to s1 primary and set section read-write T356059', diff saved to https://phabricator.wikimedia.org/P55852 and previous config saved to /var/cache/conftool/dbconfig/20240130-061305-marostegui.json
- 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Set s1 codfw as read-only for maintenance - T356059', diff saved to https://phabricator.wikimedia.org/P55851 and previous config saved to /var/cache/conftool/dbconfig/20240130-061243-marostegui.json
- 06:12 marostegui: Starting s1 codfw failover from db2103 to db2112 - T356059
- 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55850 and previous config saved to /var/cache/conftool/dbconfig/20240130-061014-root.json
- 06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55849 and previous config saved to /var/cache/conftool/dbconfig/20240130-060727-root.json
- 05:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
- 05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2112 with weight 0 T356059', diff saved to https://phabricator.wikimedia.org/P55848 and previous config saved to /var/cache/conftool/dbconfig/20240130-054410-marostegui.json
- 05:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
- 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T355739', diff saved to https://phabricator.wikimedia.org/P55847 and previous config saved to /var/cache/conftool/dbconfig/20240130-054154-root.json
- 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'Set s6 codfw as read-only for maintenance - T355739', diff saved to https://phabricator.wikimedia.org/P55845 and previous config saved to /var/cache/conftool/dbconfig/20240130-054025-root.json
- 05:40 marostegui: Starting s6 codfw failover from db2114 to db2129 - T355739
- 05:19 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2129 with weight 0 T355739', diff saved to https://phabricator.wikimedia.org/P55844 and previous config saved to /var/cache/conftool/dbconfig/20240130-051952-marostegui.json
- 05:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
- 05:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
- 04:57 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.16 refs T354434 (duration: 52m 38s)
- 04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.16 refs T354434
- 04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.13 (duration: 02m 09s)
- 03:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 03:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 00:00 eileen: tools upgraded from 117e1f9c to 544301bd
2024-01-29
- 22:31 catrope@deploy2002: Finished scap: Backport for gerrit:993805 Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) (duration: 28m 33s)
- 22:24 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
- 22:03 catrope@deploy2002: catrope and jdlrobson: Backport for gerrit:993805 Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 22:02 catrope@deploy2002: Started scap: Backport for gerrit:993805 Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388)
- 21:54 catrope@deploy2002: Finished scap: Backport for gerrit:991424 Use desktop history page HTML everywhere (T353388), gerrit:992931 Begin capturing errors for Wikivoyage (duration: 12m 05s)
- 21:48 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
- 21:43 catrope@deploy2002: catrope and jdlrobson: Backport for gerrit:991424 Use desktop history page HTML everywhere (T353388), gerrit:992931 Begin capturing errors for Wikivoyage synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:42 catrope@deploy2002: Started scap: Backport for gerrit:991424 Use desktop history page HTML everywhere (T353388), gerrit:992931 Begin capturing errors for Wikivoyage
- 21:36 catrope@deploy2002: Finished scap: Backport for gerrit:993709 DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) (duration: 12m 19s)
- 21:30 catrope@deploy2002: catrope and esanders: Continuing with sync
- 21:25 catrope@deploy2002: catrope and esanders: Backport for gerrit:993709 DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:24 catrope@deploy2002: Started scap: Backport for gerrit:993709 DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063)
- 21:17 catrope@deploy2002: Finished scap: Backport for gerrit:992974 cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) (duration: 08m 40s)
- 21:11 catrope@deploy2002: ebernhardson and catrope: Continuing with sync
- 21:10 catrope@deploy2002: ebernhardson and catrope: Backport for gerrit:992974 cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:09 catrope@deploy2002: Started scap: Backport for gerrit:992974 cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335)
- 20:37 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:37 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55843 and previous config saved to /var/cache/conftool/dbconfig/20240129-202740-marostegui.json
- 20:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55842 and previous config saved to /var/cache/conftool/dbconfig/20240129-201233-marostegui.json
- 19:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55841 and previous config saved to /var/cache/conftool/dbconfig/20240129-195725-marostegui.json
- 19:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55840 and previous config saved to /var/cache/conftool/dbconfig/20240129-194218-marostegui.json
- 19:36 zabe@deploy2002: Finished scap: Backport for gerrit:993765 Start reading from af_actor/afh_actor everywhere (T355616) (duration: 09m 09s)
- 19:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55839 and previous config saved to /var/cache/conftool/dbconfig/20240129-193317-marostegui.json
- 19:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 19:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 19:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55838 and previous config saved to /var/cache/conftool/dbconfig/20240129-193254-marostegui.json
- 19:29 zabe@deploy2002: zabe: Continuing with sync
- 19:28 zabe@deploy2002: zabe: Backport for gerrit:993765 Start reading from af_actor/afh_actor everywhere (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 19:27 zabe@deploy2002: Started scap: Backport for gerrit:993765 Start reading from af_actor/afh_actor everywhere (T355616)
- 19:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55837 and previous config saved to /var/cache/conftool/dbconfig/20240129-191748-marostegui.json
- 19:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55836 and previous config saved to /var/cache/conftool/dbconfig/20240129-190241-marostegui.json
- 19:01 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:01 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 19:00 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
- 18:59 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 18:59 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 18:58 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
- 18:49 brouberol@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop test cluster: Restart of jvm daemons.
- 18:49 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 18:49 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55835 and previous config saved to /var/cache/conftool/dbconfig/20240129-184735-marostegui.json
- 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55834 and previous config saved to /var/cache/conftool/dbconfig/20240129-182909-marostegui.json
- 18:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 18:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 18:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55833 and previous config saved to /var/cache/conftool/dbconfig/20240129-182846-marostegui.json
- 18:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55832 and previous config saved to /var/cache/conftool/dbconfig/20240129-181340-marostegui.json
- 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55831 and previous config saved to /var/cache/conftool/dbconfig/20240129-175833-marostegui.json
- 17:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 17:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 17:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55830 and previous config saved to /var/cache/conftool/dbconfig/20240129-174327-marostegui.json
- 17:43 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 17:42 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 17:42 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55829 and previous config saved to /var/cache/conftool/dbconfig/20240129-173435-marostegui.json
- 17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55828 and previous config saved to /var/cache/conftool/dbconfig/20240129-173406-marostegui.json
- 17:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55824 and previous config saved to /var/cache/conftool/dbconfig/20240129-171859-marostegui.json
- 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55823 and previous config saved to /var/cache/conftool/dbconfig/20240129-170353-marostegui.json
- 16:51 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: gerrit:993728 Bumping portals to master (T128546) (duration: 06m 37s)
- 16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55822 and previous config saved to /var/cache/conftool/dbconfig/20240129-164846-marostegui.json
- 16:44 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: gerrit:993728 Bumping portals to master (T128546) (duration: 07m 04s)
- 16:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55821 and previous config saved to /var/cache/conftool/dbconfig/20240129-164005-marostegui.json
- 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 16:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55820 and previous config saved to /var/cache/conftool/dbconfig/20240129-163926-marostegui.json
- 16:36 volans: installed spicerack 8.3.0 on cumin1002, cumin1001
- 16:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55819 and previous config saved to /var/cache/conftool/dbconfig/20240129-162420-marostegui.json
- 16:20 ladsgroup@deploy2002: Finished scap: Backport for gerrit:992129 Drop old virtual domain for url shortener (duration: 09m 24s)
- 16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 16:12 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:992129 Drop old virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:11 ladsgroup@deploy2002: Started scap: Backport for gerrit:992129 Drop old virtual domain for url shortener
- 16:10 urandom: decommissioning restbase2019/cassandra-{a,b,c} — T352469
- 16:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55817 and previous config saved to /var/cache/conftool/dbconfig/20240129-160913-marostegui.json
- 16:08 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
- 16:07 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
- 15:58 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1009.eqiad.wmnet with OS buster
- 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55816 and previous config saved to /var/cache/conftool/dbconfig/20240129-155406-marostegui.json
- 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55815 and previous config saved to /var/cache/conftool/dbconfig/20240129-154444-marostegui.json
- 15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55814 and previous config saved to /var/cache/conftool/dbconfig/20240129-154422-marostegui.json
- 15:34 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
- 15:31 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
- 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55811 and previous config saved to /var/cache/conftool/dbconfig/20240129-152915-marostegui.json
- 15:26 Dreamy_Jazz: Running MediaModeration scanning script using `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` on a tmux session.
- 15:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 15:23 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:21 Dreamy_Jazz: Running `foreachwikiindblist group1.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
- 15:19 Dreamy_Jazz: Running `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405`
- 15:17 Dreamy_Jazz: Stopping mediamoderation scanning script
- 15:17 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS buster
- 15:15 Dreamy_Jazz: afternoon UTC backport window done
- 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55810 and previous config saved to /var/cache/conftool/dbconfig/20240129-151409-marostegui.json
- 15:14 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752) (duration: 21m 21s)
- 15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts sretest1005.eqiad.wmnet
- 15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
- 15:12 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
- 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1006.eqiad.wmnet
- 15:04 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1001.eqiad.wmnet
- 15:04 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:00 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
- 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55809 and previous config saved to /var/cache/conftool/dbconfig/20240129-145902-marostegui.json
- 14:58 hashar@deploy2002: Finished deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774 (duration: 00m 07s)
- 14:58 hashar@deploy2002: Started deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774
- 14:57 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1001.eqiad.wmnet
- 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55808 and previous config saved to /var/cache/conftool/dbconfig/20240129-145652-marostegui.json
- 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 14:56 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts sretest1005.eqiad.wmnet
- 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55807 and previous config saved to /var/cache/conftool/dbconfig/20240129-145630-marostegui.json
- 14:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2055.codfw.wmnet
- 14:54 Dreamy_Jazz: scap backport is also backporting 993499 for T355357
- 14:53 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host sretest1005.eqiad.wmnet
- 14:53 ayounsi@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 14:52 dreamyjazz@deploy2002: Started scap: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752)
- 14:52 ayounsi@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 14:51 dreamyjazz@deploy2002: sync-world aborted: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752) (duration: 04m 13s)
- 14:51 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
- 14:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2055.codfw.wmnet
- 14:50 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
- 14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1005.eqiad.wmnet on all recursors
- 14:49 ayounsi@cumin2002: START - Cookbook sre.dns.wipe-cache sretest1005.eqiad.wmnet on all recursors
- 14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
- 14:48 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
- 14:47 dreamyjazz@deploy2002: Started scap: Backport for gerrit:993500 Make the email subject unique for positive match emails (T355752)
- 14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:993494 hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) (duration: 12m 29s)
- 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1006.eqiad.wmnet
- 14:42 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
- 14:42 ayounsi@cumin2002: START - Cookbook sre.ganeti.makevm for new host sretest1005.eqiad.wmnet
- 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55806 and previous config saved to /var/cache/conftool/dbconfig/20240129-144124-marostegui.json
- 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::analytics_product
- 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
- 14:37 brouberol@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-tool1009.eqiad.wmnet with OS bullseye
- 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for gerrit:993494 hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:993494 hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581)
- 14:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::analytics_product
- 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:992783 knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) (duration: 08m 58s)
- 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55804 and previous config saved to /var/cache/conftool/dbconfig/20240129-142617-marostegui.json
- 14:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ceph2001.codfw.wmnet with OS bullseye
- 14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
- 14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for gerrit:992783 knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:21 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:992783 knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583)
- 14:17 volans: upgraded spicerack to 8.3.0 on cumin2002
- 14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:992371 uzwiki: revert temporary logo for the 20th anniversary (T353723) (duration: 11m 01s)
- 14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55803 and previous config saved to /var/cache/conftool/dbconfig/20240129-141111-marostegui.json
- 14:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1006.eqiad.wmnet with OS bullseye
- 14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
- 14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for gerrit:992371 uzwiki: revert temporary logo for the 20th anniversary (T353723) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:992371 uzwiki: revert temporary logo for the 20th anniversary (T353723)
- 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55802 and previous config saved to /var/cache/conftool/dbconfig/20240129-140205-marostegui.json
- 14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55801 and previous config saved to /var/cache/conftool/dbconfig/20240129-140142-marostegui.json
- 13:54 volans: uploaded spicerack_8.3.0 to apt.wikimedia.org bullseye-wikimedia
- 13:48 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2355.codfw.wmnet with OS bullseye
- 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55799 and previous config saved to /var/cache/conftool/dbconfig/20240129-134636-marostegui.json
- 13:46 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2445.codfw.wmnet with OS bullseye
- 13:40 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2429.codfw.wmnet with OS bullseye
- 13:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
- 13:37 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2381.codfw.wmnet with OS bullseye
- 13:36 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
- 13:35 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2260.codfw.wmnet with OS bullseye
- 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55798 and previous config saved to /var/cache/conftool/dbconfig/20240129-133129-marostegui.json
- 13:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
- 13:26 claime: Restarting ferm.service on k8s node kubernetes2055 - T354855
- 13:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
- 13:23 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1006.eqiad.wmnet with OS bullseye
- 13:23 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
- 13:20 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
- 13:18 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
- 13:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
- 13:16 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
- 13:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
- 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55797 and previous config saved to /var/cache/conftool/dbconfig/20240129-131623-marostegui.json
- 13:15 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
- 13:14 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
- 13:13 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
- 13:12 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
- 13:07 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS bullseye
- 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55796 and previous config saved to /var/cache/conftool/dbconfig/20240129-130724-marostegui.json
- 13:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2445.codfw.wmnet with OS bullseye
- 12:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2429.codfw.wmnet with OS bullseye
- 12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 12:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2381.codfw.wmnet with OS bullseye
- 12:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 12:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 12:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55795 and previous config saved to /var/cache/conftool/dbconfig/20240129-125726-marostegui.json
- 12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2355.codfw.wmnet with OS bullseye
- 12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2260.codfw.wmnet with OS bullseye
- 12:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55794 and previous config saved to /var/cache/conftool/dbconfig/20240129-124220-marostegui.json
- 12:33 moritzm: installing openssh security updates
- 12:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55793 and previous config saved to /var/cache/conftool/dbconfig/20240129-122713-marostegui.json
- 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1007.eqiad.wmnet
- 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1007.eqiad.wmnet
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::wmde
- 12:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55792 and previous config saved to /var/cache/conftool/dbconfig/20240129-121205-marostegui.json
- 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55791 and previous config saved to /var/cache/conftool/dbconfig/20240129-120628-marostegui.json
- 12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 12:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 12:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::wmde
- 12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 11:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55790 and previous config saved to /var/cache/conftool/dbconfig/20240129-115953-marostegui.json
- 11:53 Dreamy_Jazz: Running mwscript maintenance/sql.php --wiki=testwiki --wikidb=centralauth ~/T354700-create-table-global.sql for T354700
- 11:45 Dreamy_Jazz: sql.php finished for T354700
- 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55789 and previous config saved to /var/cache/conftool/dbconfig/20240129-114446-marostegui.json
- 11:41 Dreamy_Jazz: T354700 - Running `foreachwiki maintenance/sql.php ~/T354700-create-table.sql`
- 11:39 Dreamy_Jazz: T354700 - Ran mwscript maintenance/sql.php --wiki=testwiki ~/T354700-create-table.sql
- 11:38 moritzm: upload ganeti 3.0.2-3+wmf1 (bookworm package of Ganeti plus backport for SSL chain handling in RAPI) to apt.wikimedia.org T300152
- 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55788 and previous config saved to /var/cache/conftool/dbconfig/20240129-112940-marostegui.json
- 11:28 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1007.eqiad.wmnet with OS bullseye
- 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55787 and previous config saved to /var/cache/conftool/dbconfig/20240129-111434-marostegui.json
- 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55786 and previous config saved to /var/cache/conftool/dbconfig/20240129-110955-marostegui.json
- 11:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 11:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55785 and previous config saved to /var/cache/conftool/dbconfig/20240129-110933-marostegui.json
- 11:05 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
- 11:01 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
- 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55784 and previous config saved to /var/cache/conftool/dbconfig/20240129-105427-marostegui.json
- 10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1054.eqiad.wmnet
- 10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2054.codfw.wmnet
- 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2054.codfw.wmnet
- 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1054.eqiad.wmnet
- 10:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1007.eqiad.wmnet with OS bullseye
- 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55783 and previous config saved to /var/cache/conftool/dbconfig/20240129-103920-marostegui.json
- 10:38 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 10:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 10:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55782 and previous config saved to /var/cache/conftool/dbconfig/20240129-102414-marostegui.json
- 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55781 and previous config saved to /var/cache/conftool/dbconfig/20240129-101757-marostegui.json
- 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55780 and previous config saved to /var/cache/conftool/dbconfig/20240129-101735-marostegui.json
- 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55779 and previous config saved to /var/cache/conftool/dbconfig/20240129-100229-marostegui.json
- 10:01 moritzm: upload prometheus-ganeti-exporter 0.3+deb12u1 to apt.wikimedia.org T300152
- 09:56 XioNoX: enable Puppet on all the ganeti servers for CR990968 deployment - T300152
- 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55778 and previous config saved to /var/cache/conftool/dbconfig/20240129-094722-marostegui.json
- 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55777 and previous config saved to /var/cache/conftool/dbconfig/20240129-093216-marostegui.json
- 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55776 and previous config saved to /var/cache/conftool/dbconfig/20240129-092724-marostegui.json
- 09:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
- 09:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
- 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55775 and previous config saved to /var/cache/conftool/dbconfig/20240129-092702-marostegui.json
- 09:17 godog: mark for deletion and clean up replicated thanos blocks for prometheus=ops, older than 3 months, all resolutions - T351927
- 09:13 moritzm: upgrading python-pymysql in S7 DB hosts to 1.0.2-2~wmf11u1 T355531
- 09:13 XioNoX: disable Puppet on all the ganeti servers for CR990968 deployment - T300152
- 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55773 and previous config saved to /var/cache/conftool/dbconfig/20240129-091156-marostegui.json
- 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55772 and previous config saved to /var/cache/conftool/dbconfig/20240129-085649-marostegui.json
- 08:46 marostegui@deploy2002: Finished scap: Backport for gerrit:993489 Revert "ProductionServices.php: Promote pc2014" (duration: 17m 13s)
- 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55771 and previous config saved to /var/cache/conftool/dbconfig/20240129-084143-marostegui.json
- 08:39 marostegui@deploy2002: marostegui: Continuing with sync
- 08:39 marostegui@deploy2002: marostegui: Backport for gerrit:993489 Revert "ProductionServices.php: Promote pc2014" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55770 and previous config saved to /var/cache/conftool/dbconfig/20240129-083627-marostegui.json
- 08:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 08:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55769 and previous config saved to /var/cache/conftool/dbconfig/20240129-083603-marostegui.json
- 08:29 marostegui@deploy2002: Started scap: Backport for gerrit:993489 Revert "ProductionServices.php: Promote pc2014"
- 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55768 and previous config saved to /var/cache/conftool/dbconfig/20240129-082057-marostegui.json
- 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55767 and previous config saved to /var/cache/conftool/dbconfig/20240129-080550-marostegui.json
- 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55766 and previous config saved to /var/cache/conftool/dbconfig/20240129-075044-marostegui.json
- 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55765 and previous config saved to /var/cache/conftool/dbconfig/20240129-074541-marostegui.json
- 07:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 07:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55764 and previous config saved to /var/cache/conftool/dbconfig/20240129-074519-marostegui.json
- 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P55763 and previous config saved to /var/cache/conftool/dbconfig/20240129-073857-root.json
- 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55762 and previous config saved to /var/cache/conftool/dbconfig/20240129-073012-marostegui.json
- 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P55761 and previous config saved to /var/cache/conftool/dbconfig/20240129-072352-root.json
- 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55760 and previous config saved to /var/cache/conftool/dbconfig/20240129-071506-marostegui.json
- 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P55758 and previous config saved to /var/cache/conftool/dbconfig/20240129-070847-root.json
- 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55757 and previous config saved to /var/cache/conftool/dbconfig/20240129-065959-marostegui.json
- 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55756 and previous config saved to /var/cache/conftool/dbconfig/20240129-065450-marostegui.json
- 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55755 and previous config saved to /var/cache/conftool/dbconfig/20240129-065427-marostegui.json
- 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P55754 and previous config saved to /var/cache/conftool/dbconfig/20240129-065341-root.json
- 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55752 and previous config saved to /var/cache/conftool/dbconfig/20240129-063920-marostegui.json
- 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P55751 and previous config saved to /var/cache/conftool/dbconfig/20240129-063836-root.json
- 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55750 and previous config saved to /var/cache/conftool/dbconfig/20240129-063302-marostegui.json
- 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55747 and previous config saved to /var/cache/conftool/dbconfig/20240129-062414-marostegui.json
- 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55746 and previous config saved to /var/cache/conftool/dbconfig/20240129-060907-marostegui.json
- 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55745 and previous config saved to /var/cache/conftool/dbconfig/20240129-060400-marostegui.json
- 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1134.eqiad.wmnet
- 05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
- 05:56 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
- 05:54 marostegui@cumin1002: START - Cookbook sre.dns.netbox
- 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1134.eqiad.wmnet
2024-01-28
- 01:11 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
- 01:11 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
- 01:10 urandom: decommissioning restbase2016/cassandra-{a,b,c} — T352469
2024-01-26
- 22:07 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
- 22:06 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
- 22:05 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
- 22:04 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
- 19:02 ejegg: fundraising civicrm upgraded from 8c0dc1d2 to b953d667
- 18:27 mutante: cloudweb1003 - OATHAuth disabled for Triciaburmeister. (after video verification - T355958)
- 18:16 mutante: phab1004 - removing 2fa from TBurmeister (after video verification) T355958
- 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
- 17:57 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
- 17:53 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
- 17:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
- 17:34 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
- 17:17 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
- 17:12 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
- 17:11 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
- 17:09 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:09 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
- 17:08 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
- 17:04 bking@cumin2002: START - Cookbook sre.dns.netbox
- 16:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1010.wikimedia.org
- 16:33 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:33 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
- 16:33 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 16:32 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
- 16:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55740 and previous config saved to /var/cache/conftool/dbconfig/20240126-163057-arnaudb.json
- 16:29 bking@cumin2002: START - Cookbook sre.dns.netbox
- 16:23 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1010.wikimedia.org
- 16:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
- 15:01 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 15:00 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:47 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:46 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:37 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:36 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
- 14:35 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
- 14:34 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:34 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:33 urandom: decommissioning restbase2015/cassandra-{a,b,c} — T352469
- 14:27 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:27 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:24 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:08 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 14:08 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 13:18 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
- 12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
- 12:35 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
- 12:30 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 11:43 taavi: reprepro: copy helm-diff_3.1.3-2 from bullseye-wikimedia to bookworm-wikimedia
- 11:28 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
- 10:52 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 10:51 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
- 10:50 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
- 10:44 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
- 10:36 moritzm: prune obsolete nginx packages from eventschema hosts after migration to new library scheme T329529
- 10:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55737 and previous config saved to /var/cache/conftool/dbconfig/20240126-102550-arnaudb.json
- 08:01 moritzm: rebalance codfw/B following switch maintenance T355549
- 07:54 moritzm: failover ganeti master for codfw back to ganeti2022, switch maintenance is completed T355549
- 01:01 dzahn@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: security release
- 00:07 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release
- 00:00 dzahn@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release
2024-01-25
- 23:54 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=wikimaniawiki --fix # T347622
- 23:54 zabe@deploy2002: Finished scap: Backport for gerrit:961963 Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) (duration: 08m 30s)
- 23:47 zabe@deploy2002: robertsky and zabe: Continuing with sync
- 23:47 zabe@deploy2002: robertsky and zabe: Backport for gerrit:961963 Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 23:45 zabe@deploy2002: Started scap: Backport for gerrit:961963 Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622)
- 23:29 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Sturm . # T355485
- 23:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
- 23:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
- 22:54 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 22:53 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 22:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 22:52 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
- 22:40 ryankemper: T351354 Restarting `cloudelastic1006` (final restart for today)
- 22:34 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1009`
- 22:33 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1007`
- 22:26 ryankemper: T351354 Restarting `cloudelastic1002`
- 22:19 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:19 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:15 ryankemper: T351354 Restarting `cloudelastic1004` following puppet run
- 22:12 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release
- 22:11 ryankemper: T351354 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/993038; restarting `cloudelastic1001` following puppet run
- 22:08 ryankemper: T351354 Downtimed `cloudelastic*`; shortly will restart `cloudelastic100[1,2,4]` one host at a time to make them no longer masters
- 22:08 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
- 22:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
- 21:55 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:55 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 21:44 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:44 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 21:44 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:44 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 21:19 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:19 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 21:14 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:14 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 21:13 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:13 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:58 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:58 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:57 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:57 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:56 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:56 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:55 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
- 20:55 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
- 20:54 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
- 20:51 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1001.eqiad.wmnet with OS bookworm
- 20:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
- 20:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
- 20:37 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:37 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:36 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
- 20:35 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:35 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:33 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:33 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:33 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
- 20:32 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
- 20:27 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
- 20:26 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
- 20:25 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
- 20:19 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
- 20:19 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
- 20:16 zabe@deploy2002: Finished scap: Backport for gerrit:992942 Start reading from af_actor/afh_actor in group1 wikis (T355616) (duration: 11m 27s)
- 20:15 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 20:15 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 20:11 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
- 20:10 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1001.eqiad.wmnet with OS bookworm
- 20:10 zabe@deploy2002: zabe: Continuing with sync
- 20:09 zabe@deploy2002: zabe: Backport for gerrit:992942 Start reading from af_actor/afh_actor in group1 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 20:06 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1001
- 20:05 zabe@deploy2002: Started scap: Backport for gerrit:992942 Start reading from af_actor/afh_actor in group1 wikis (T355616)
- 20:05 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1001
- 20:05 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:05 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
- 20:04 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
- 20:02 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 20:01 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1002
- 20:00 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1002
- 19:59 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:59 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
- 19:58 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
- 19:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 19:29 bking@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 19:29 bking@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 19:28 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:28 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 19:25 bking@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 19:24 bking@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55736 and previous config saved to /var/cache/conftool/dbconfig/20240125-184922-root.json
- 18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55735 and previous config saved to /var/cache/conftool/dbconfig/20240125-184917-root.json
- 18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55734 and previous config saved to /var/cache/conftool/dbconfig/20240125-184911-root.json
- 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55733 and previous config saved to /var/cache/conftool/dbconfig/20240125-184906-root.json
- 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55732 and previous config saved to /var/cache/conftool/dbconfig/20240125-184900-root.json
- 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55731 and previous config saved to /var/cache/conftool/dbconfig/20240125-184853-root.json
- 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55730 and previous config saved to /var/cache/conftool/dbconfig/20240125-184845-root.json
- 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55729 and previous config saved to /var/cache/conftool/dbconfig/20240125-184839-root.json
- 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55728 and previous config saved to /var/cache/conftool/dbconfig/20240125-184823-root.json
- 18:47 mutante: phab2002 - rebooting
- 18:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
- 18:45 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
- 18:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55727 and previous config saved to /var/cache/conftool/dbconfig/20240125-183417-root.json
- 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55726 and previous config saved to /var/cache/conftool/dbconfig/20240125-183412-root.json
- 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55725 and previous config saved to /var/cache/conftool/dbconfig/20240125-183406-root.json
- 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55724 and previous config saved to /var/cache/conftool/dbconfig/20240125-183401-root.json
- 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55723 and previous config saved to /var/cache/conftool/dbconfig/20240125-183355-root.json
- 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55722 and previous config saved to /var/cache/conftool/dbconfig/20240125-183348-root.json
- 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55721 and previous config saved to /var/cache/conftool/dbconfig/20240125-183340-root.json
- 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55720 and previous config saved to /var/cache/conftool/dbconfig/20240125-183334-root.json
- 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55719 and previous config saved to /var/cache/conftool/dbconfig/20240125-183318-root.json
- 18:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55718 and previous config saved to /var/cache/conftool/dbconfig/20240125-181912-root.json
- 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55717 and previous config saved to /var/cache/conftool/dbconfig/20240125-181907-root.json
- 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55716 and previous config saved to /var/cache/conftool/dbconfig/20240125-181901-root.json
- 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55715 and previous config saved to /var/cache/conftool/dbconfig/20240125-181856-root.json
- 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55714 and previous config saved to /var/cache/conftool/dbconfig/20240125-181850-root.json
- 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55713 and previous config saved to /var/cache/conftool/dbconfig/20240125-181843-root.json
- 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55712 and previous config saved to /var/cache/conftool/dbconfig/20240125-181835-root.json
- 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55711 and previous config saved to /var/cache/conftool/dbconfig/20240125-181829-root.json
- 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55710 and previous config saved to /var/cache/conftool/dbconfig/20240125-181814-root.json
- 18:13 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum6001.drmrs.wmnet with OS bookworm
- 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55709 and previous config saved to /var/cache/conftool/dbconfig/20240125-180407-root.json
- 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55708 and previous config saved to /var/cache/conftool/dbconfig/20240125-180402-root.json
- 18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55707 and previous config saved to /var/cache/conftool/dbconfig/20240125-180356-root.json
- 18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55706 and previous config saved to /var/cache/conftool/dbconfig/20240125-180351-root.json
- 18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55705 and previous config saved to /var/cache/conftool/dbconfig/20240125-180345-root.json
- 18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55704 and previous config saved to /var/cache/conftool/dbconfig/20240125-180338-root.json
- 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55703 and previous config saved to /var/cache/conftool/dbconfig/20240125-180330-root.json
- 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55702 and previous config saved to /var/cache/conftool/dbconfig/20240125-180324-root.json
- 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55701 and previous config saved to /var/cache/conftool/dbconfig/20240125-180308-root.json
- 18:01 sukhe: running authdns-update for CR 993008: T355835
- 17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55700 and previous config saved to /var/cache/conftool/dbconfig/20240125-174902-root.json
- 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55699 and previous config saved to /var/cache/conftool/dbconfig/20240125-174857-root.json
- 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55698 and previous config saved to /var/cache/conftool/dbconfig/20240125-174851-root.json
- 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55697 and previous config saved to /var/cache/conftool/dbconfig/20240125-174846-root.json
- 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55696 and previous config saved to /var/cache/conftool/dbconfig/20240125-174840-root.json
- 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55695 and previous config saved to /var/cache/conftool/dbconfig/20240125-174833-root.json
- 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55694 and previous config saved to /var/cache/conftool/dbconfig/20240125-174825-root.json
- 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55693 and previous config saved to /var/cache/conftool/dbconfig/20240125-174819-root.json
- 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55692 and previous config saved to /var/cache/conftool/dbconfig/20240125-174803-root.json
- 17:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
- 17:45 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-b-codfw,lsw1-b5-codfw.mgmt
- 17:45 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw-b-codfw,lsw1-b5-codfw.mgmt
- 17:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
- 17:38 btullis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
- 17:34 btullis@deploy2002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
- 17:33 btullis@deploy2002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
- 17:30 Amir1: deploying new captchas (T141490)
- 17:22 btullis@deploy2002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
- 17:22 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
- 17:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host durum6001.drmrs.wmnet with OS bookworm
- 17:17 btullis@deploy2002: helmfile [staging] START helmfile.d/services/datahub: apply on main
- 17:09 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 17:09 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 17:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:05 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 17:04 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit[1001-1002].wikimedia.org
- 17:04 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:04 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
- 17:01 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
- 17:01 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
- 17:00 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
- 16:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 16:52 sukhe: running authdns-update for CR 992936: T355835
- 16:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
- 16:49 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
- 16:48 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit[1001-1002].wikimedia.org
- 16:48 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
- 16:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
- 16:43 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 32 hosts
- 16:42 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for 32 hosts
- 16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr[1-2]-codfw
- 16:41 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for cr[1-2]-codfw
- 16:34 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2007.codfw.wmnet
- 16:34 claime: repooling parse2007 - T355549
- 16:33 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2006.codfw.wmnet
- 16:33 claime: repooling parse2006 - T355549
- 16:32 claime: uncordoning kubernetes2023 - T355549
- 16:32 claime: uncordoning kubernetes2032 - T355549
- 16:29 claime: uncordoning kubernetes2031 - T355549
- 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55691 and previous config saved to /var/cache/conftool/dbconfig/20240125-161320-marostegui.json
- 16:03 topranks: Network maintenance codfw rack b5 underway T355549
- 15:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
- 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55690 and previous config saved to /var/cache/conftool/dbconfig/20240125-155813-marostegui.json
- 15:58 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
- 15:57 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
- 15:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
- 15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'preparing to clone db2169 on db2196 as per TT343674', diff saved to https://phabricator.wikimedia.org/P55689 and previous config saved to /var/cache/conftool/dbconfig/20240125-155450-arnaudb.json
- 15:52 topranks: disabling puppet fleet-wide to allow for maintenance in codfw rack b5 which hosts puppetmaster2003 T355549
- 15:46 topranks: configuring lsw1-b5-codfw switch ports for servers to be moved T355549
- 15:46 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
- 15:46 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
- 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55688 and previous config saved to /var/cache/conftool/dbconfig/20240125-154307-marostegui.json
- 15:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wcqs::public
- 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55687 and previous config saved to /var/cache/conftool/dbconfig/20240125-152801-marostegui.json
- 15:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wcqs::public
- 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::internal
- 15:20 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2006.cofw.wmnet
- 15:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
- 15:18 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
- 15:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::internal
- 14:35 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2007.codfw.wmnet
- 14:35 claime: Depooling parse2007 (setting inactive) - T355549
- 14:34 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2006.codfw.wmnet
- 14:34 claime: Depooling parse2006 (setting inactive) - T355549
- 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55684 and previous config saved to /var/cache/conftool/dbconfig/20240125-142729-marostegui.json
- 14:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 14:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55683 and previous config saved to /var/cache/conftool/dbconfig/20240125-142706-marostegui.json
- 14:26 moritzm: installing debmonitor-client 0.3.4 fleet-wide
- 14:25 claime: Draining kubernetes2023 - T355549
- 14:25 claime: Draining kubernetes2033 - T355549
- 14:23 claime: Draining kubernetes2032 - T355549
- 14:21 claime: Draining kubernetes2031 - T355549
- 14:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After T355885', diff saved to https://phabricator.wikimedia.org/P55682 and previous config saved to /var/cache/conftool/dbconfig/20240125-142102-root.json
- 14:18 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
- 14:15 moritzm: failover ganeti master for codfw to ganeti2020 T355549
- 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55681 and previous config saved to /var/cache/conftool/dbconfig/20240125-141200-marostegui.json
- 14:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After T355885', diff saved to https://phabricator.wikimedia.org/P55680 and previous config saved to /var/cache/conftool/dbconfig/20240125-140557-root.json
- 14:05 btullis@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
- 13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55679 and previous config saved to /var/cache/conftool/dbconfig/20240125-135653-marostegui.json
- 13:53 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons.
- 13:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After T355885', diff saved to https://phabricator.wikimedia.org/P55678 and previous config saved to /var/cache/conftool/dbconfig/20240125-135052-root.json
- 13:47 volans: uploaded debmonitor-client_0.3.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
- 13:43 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons.
- 13:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55677 and previous config saved to /var/cache/conftool/dbconfig/20240125-134147-marostegui.json
- 13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55676 and previous config saved to /var/cache/conftool/dbconfig/20240125-133935-marostegui.json
- 13:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 13:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55675 and previous config saved to /var/cache/conftool/dbconfig/20240125-133913-marostegui.json
- 13:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After T355885', diff saved to https://phabricator.wikimedia.org/P55674 and previous config saved to /var/cache/conftool/dbconfig/20240125-133547-root.json
- 13:32 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2022.codfw.wmnet
- 13:28 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
- 13:28 topranks: draining VMs from ganeti2022 ahead of codfw rack b5 maintenance T355549
- 13:27 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2022.codfw.wmnet
- 13:27 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
- 13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
- 13:26 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
- 13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
- 13:25 topranks: stopping logstash service on logstash2025 to facilitate VM migration T355549
- 13:25 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
- 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55673 and previous config saved to /var/cache/conftool/dbconfig/20240125-132407-marostegui.json
- 13:24 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
- 13:21 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
- 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After T355885', diff saved to https://phabricator.wikimedia.org/P55672 and previous config saved to /var/cache/conftool/dbconfig/20240125-132043-root.json
- 13:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 13:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55671 and previous config saved to /var/cache/conftool/dbconfig/20240125-131547-marostegui.json
- 13:12 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.15 refs T354433
- 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55670 and previous config saved to /var/cache/conftool/dbconfig/20240125-130900-marostegui.json
- 13:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
- 13:05 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
- 13:02 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
- 13:02 topranks: draining VMs from ganeti2021 ahead of codfw rack b5 maintenance T355549
- 13:02 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
- 12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
- 12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
- 12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
- 12:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55669 and previous config saved to /var/cache/conftool/dbconfig/20240125-125353-marostegui.json
- 12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
- 12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
- 12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
- 12:12 jgiannelos@deploy2002: Finished deploy [restbase/deploy@708f0f3]: (no justification provided) (duration: 20m 28s)
- 12:06 moritzm: installing openssh security updates
- 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55667 and previous config saved to /var/cache/conftool/dbconfig/20240125-115322-marostegui.json
- 11:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 11:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55666 and previous config saved to /var/cache/conftool/dbconfig/20240125-115233-marostegui.json
- 11:52 jgiannelos@deploy2002: Started deploy [restbase/deploy@708f0f3]: (no justification provided)
- 11:45 zabe@deploy2002: Finished scap: Backport for gerrit:992894 Start reading from af_actor/afh_actor in group0 wikis (T355616) (duration: 08m 25s)
- 11:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
- 11:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
- 11:38 zabe@deploy2002: zabe: Continuing with sync
- 11:38 zabe@deploy2002: zabe: Backport for gerrit:992894 Start reading from af_actor/afh_actor in group0 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55665 and previous config saved to /var/cache/conftool/dbconfig/20240125-113727-marostegui.json
- 11:36 zabe@deploy2002: Started scap: Backport for gerrit:992894 Start reading from af_actor/afh_actor in group0 wikis (T355616)
- 11:29 hashar@deploy2002: Finished scap: Backport for gerrit:992781 UserGroupManager: Fix cross-wiki database access (T355813) (duration: 08m 50s)
- 11:26 claime: Restarting ferm.service on k8s node kubernetes2036.codfw.wmnet - T354855
- 11:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 11:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 11:23 hashar@deploy2002: hashar and zabe: Continuing with sync
- 11:22 hashar@deploy2002: hashar and zabe: Backport for gerrit:992781 UserGroupManager: Fix cross-wiki database access (T355813) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55664 and previous config saved to /var/cache/conftool/dbconfig/20240125-112220-marostegui.json
- 11:20 hashar@deploy2002: Started scap: Backport for gerrit:992781 UserGroupManager: Fix cross-wiki database access (T355813)
- 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55663 and previous config saved to /var/cache/conftool/dbconfig/20240125-110714-marostegui.json
- 11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55662 and previous config saved to /var/cache/conftool/dbconfig/20240125-110521-marostegui.json
- 10:57 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55660 and previous config saved to /var/cache/conftool/dbconfig/20240125-105015-marostegui.json
- 10:39 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
- 10:38 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 10:35 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
- 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55659 and previous config saved to /var/cache/conftool/dbconfig/20240125-103509-marostegui.json
- 10:21 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55658 and previous config saved to /var/cache/conftool/dbconfig/20240125-102002-marostegui.json
- 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55657 and previous config saved to /var/cache/conftool/dbconfig/20240125-101750-marostegui.json
- 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55656 and previous config saved to /var/cache/conftool/dbconfig/20240125-101728-marostegui.json
- 10:17 moritzm: upgrading python-pymysql in S6 DB hosts to 1.0.2-2~wmf11u1 T355531
- 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55655 and previous config saved to /var/cache/conftool/dbconfig/20240125-100221-marostegui.json
- 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55654 and previous config saved to /var/cache/conftool/dbconfig/20240125-094714-marostegui.json
- 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55653 and previous config saved to /var/cache/conftool/dbconfig/20240125-093208-marostegui.json
- 09:29 stran@deploy2002: Finished scap: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) (duration: 17m 24s)
- 09:18 stran@deploy2002: kharlan and stran: Continuing with sync
- 09:14 stran@deploy2002: kharlan and stran: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 09:12 stran@deploy2002: Started scap: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
- 08:45 stran@deploy2002: stran and kharlan: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55652 and previous config saved to /var/cache/conftool/dbconfig/20240125-083106-marostegui.json
- 08:16 stran@deploy2002: Started scap: Backport for gerrit:992123 PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
- 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55651 and previous config saved to /var/cache/conftool/dbconfig/20240125-081559-marostegui.json
- 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55650 and previous config saved to /var/cache/conftool/dbconfig/20240125-080053-marostegui.json
- 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55648 and previous config saved to /var/cache/conftool/dbconfig/20240125-074546-marostegui.json
- 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55647 and previous config saved to /var/cache/conftool/dbconfig/20240125-074334-marostegui.json
- 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55646 and previous config saved to /var/cache/conftool/dbconfig/20240125-074312-marostegui.json
- 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55645 and previous config saved to /var/cache/conftool/dbconfig/20240125-073319-root.json
- 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55644 and previous config saved to /var/cache/conftool/dbconfig/20240125-073310-root.json
- 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55643 and previous config saved to /var/cache/conftool/dbconfig/20240125-073252-root.json
- 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55642 and previous config saved to /var/cache/conftool/dbconfig/20240125-073244-root.json
- 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55641 and previous config saved to /var/cache/conftool/dbconfig/20240125-072806-marostegui.json
- 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137:3315 T355549', diff saved to https://phabricator.wikimedia.org/P55640 and previous config saved to /var/cache/conftool/dbconfig/20240125-072010-marostegui.json
- 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55639 and previous config saved to /var/cache/conftool/dbconfig/20240125-071813-root.json
- 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55638 and previous config saved to /var/cache/conftool/dbconfig/20240125-071805-root.json
- 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55637 and previous config saved to /var/cache/conftool/dbconfig/20240125-071747-root.json
- 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55636 and previous config saved to /var/cache/conftool/dbconfig/20240125-071739-root.json
- 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55635 and previous config saved to /var/cache/conftool/dbconfig/20240125-071259-marostegui.json
- 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 db2160 db2109 db2107 db2137:3314 db2135:3315 db2143 db2147 db2177 db2178 db2188 T355549', diff saved to https://phabricator.wikimedia.org/P55634 and previous config saved to /var/cache/conftool/dbconfig/20240125-071253-marostegui.json
- 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2107 T355682', diff saved to https://phabricator.wikimedia.org/P55633 and previous config saved to /var/cache/conftool/dbconfig/20240125-070604-marostegui.json
- 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55632 and previous config saved to /var/cache/conftool/dbconfig/20240125-070308-root.json
- 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55631 and previous config saved to /var/cache/conftool/dbconfig/20240125-070300-root.json
- 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55630 and previous config saved to /var/cache/conftool/dbconfig/20240125-070242-root.json
- 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55629 and previous config saved to /var/cache/conftool/dbconfig/20240125-070234-root.json
- 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2104 to s2 primary and set section read-write T355682', diff saved to https://phabricator.wikimedia.org/P55628 and previous config saved to /var/cache/conftool/dbconfig/20240125-070153-marostegui.json
- 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Set s2 codfw as read-only for maintenance - T355682', diff saved to https://phabricator.wikimedia.org/P55627 and previous config saved to /var/cache/conftool/dbconfig/20240125-070120-marostegui.json
- 07:00 marostegui: Starting s2 codfw failover from db2107 to db2104 - T355682
- 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55626 and previous config saved to /var/cache/conftool/dbconfig/20240125-065535-marostegui.json
- 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55625 and previous config saved to /var/cache/conftool/dbconfig/20240125-064803-root.json
- 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55624 and previous config saved to /var/cache/conftool/dbconfig/20240125-064755-root.json
- 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55623 and previous config saved to /var/cache/conftool/dbconfig/20240125-064737-root.json
- 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55622 and previous config saved to /var/cache/conftool/dbconfig/20240125-064729-root.json
- 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55621 and previous config saved to /var/cache/conftool/dbconfig/20240125-064420-marostegui.json
- 06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55620 and previous config saved to /var/cache/conftool/dbconfig/20240125-064357-marostegui.json
- 06:37 marostegui@deploy2002: Finished scap: Backport for gerrit:992842 ProductionServices.php: Promote pc2014 (T355683) (duration: 08m 42s)
- 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55619 and previous config saved to /var/cache/conftool/dbconfig/20240125-063258-root.json
- 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55618 and previous config saved to /var/cache/conftool/dbconfig/20240125-063250-root.json
- 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55617 and previous config saved to /var/cache/conftool/dbconfig/20240125-063232-root.json
- 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55616 and previous config saved to /var/cache/conftool/dbconfig/20240125-063225-root.json
- 06:31 marostegui@deploy2002: marostegui: Continuing with sync
- 06:31 marostegui@deploy2002: marostegui: Backport for gerrit:992842 ProductionServices.php: Promote pc2014 (T355683) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 06:29 marostegui@deploy2002: Started scap: Backport for gerrit:992842 ProductionServices.php: Promote pc2014 (T355683)
- 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55615 and previous config saved to /var/cache/conftool/dbconfig/20240125-062851-marostegui.json
- 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55614 and previous config saved to /var/cache/conftool/dbconfig/20240125-061753-root.json
- 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55613 and previous config saved to /var/cache/conftool/dbconfig/20240125-061745-root.json
- 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55612 and previous config saved to /var/cache/conftool/dbconfig/20240125-061727-root.json
- 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55611 and previous config saved to /var/cache/conftool/dbconfig/20240125-061719-root.json
- 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55610 and previous config saved to /var/cache/conftool/dbconfig/20240125-061344-marostegui.json
- 06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
- 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2104 with weight 0 T355682', diff saved to https://phabricator.wikimedia.org/P55609 and previous config saved to /var/cache/conftool/dbconfig/20240125-061048-root.json
- 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
- 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55608 and previous config saved to /var/cache/conftool/dbconfig/20240125-060249-root.json
- 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55607 and previous config saved to /var/cache/conftool/dbconfig/20240125-060240-root.json
- 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55606 and previous config saved to /var/cache/conftool/dbconfig/20240125-060222-root.json
- 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55605 and previous config saved to /var/cache/conftool/dbconfig/20240125-060214-root.json
- 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55604 and previous config saved to /var/cache/conftool/dbconfig/20240125-055837-marostegui.json
- 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55603 and previous config saved to /var/cache/conftool/dbconfig/20240125-055626-marostegui.json
- 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
- 05:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
- 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
- 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
- 05:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 02:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 02:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 02:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55602 and previous config saved to /var/cache/conftool/dbconfig/20240125-022727-marostegui.json
- 02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55601 and previous config saved to /var/cache/conftool/dbconfig/20240125-021221-marostegui.json
- 01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55600 and previous config saved to /var/cache/conftool/dbconfig/20240125-015714-marostegui.json
- 01:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55599 and previous config saved to /var/cache/conftool/dbconfig/20240125-014208-marostegui.json
- 01:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55598 and previous config saved to /var/cache/conftool/dbconfig/20240125-013958-marostegui.json
- 01:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 01:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 01:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55597 and previous config saved to /var/cache/conftool/dbconfig/20240125-013936-marostegui.json
- 01:28 fab@deploy2002: Finished deploy [airflow-dags/research@e6aa85a]: (no justification provided) (duration: 00m 13s)
- 01:28 fab@deploy2002: Started deploy [airflow-dags/research@e6aa85a]: (no justification provided)
- 01:25 eileen: civicrm upgraded from b85b6dde to 69d4ebe3
- 01:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55596 and previous config saved to /var/cache/conftool/dbconfig/20240125-012430-marostegui.json
- 01:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55595 and previous config saved to /var/cache/conftool/dbconfig/20240125-010923-marostegui.json
- 00:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55594 and previous config saved to /var/cache/conftool/dbconfig/20240125-005417-marostegui.json
- 00:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55593 and previous config saved to /var/cache/conftool/dbconfig/20240125-005307-marostegui.json
- 00:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 00:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 00:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55592 and previous config saved to /var/cache/conftool/dbconfig/20240125-005245-marostegui.json
- 00:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55591 and previous config saved to /var/cache/conftool/dbconfig/20240125-003739-marostegui.json
- 00:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55590 and previous config saved to /var/cache/conftool/dbconfig/20240125-002233-marostegui.json
- 00:12 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2103.codfw.wmnet with OS bullseye
- 00:12 zabe@deploy2002: Finished scap: Backport for gerrit:992830 Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) (duration: 09m 36s)
- 00:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55589 and previous config saved to /var/cache/conftool/dbconfig/20240125-000726-marostegui.json
- 00:05 zabe@deploy2002: zabe: Continuing with sync
- 00:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55588 and previous config saved to /var/cache/conftool/dbconfig/20240125-000515-marostegui.json
- 00:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 00:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 00:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55587 and previous config saved to /var/cache/conftool/dbconfig/20240125-000452-marostegui.json
- 00:04 zabe@deploy2002: zabe: Backport for gerrit:992830 Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 00:02 zabe@deploy2002: Started scap: Backport for gerrit:992830 Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616)
2024-01-24
- 23:54 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
- 23:51 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
- 23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55586 and previous config saved to /var/cache/conftool/dbconfig/20240124-234946-marostegui.json
- 23:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55585 and previous config saved to /var/cache/conftool/dbconfig/20240124-233439-marostegui.json
- 23:34 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
- 23:33 jforrester@deploy2002: Finished scap: Backport for [[gerrit:992775|Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433)]] (duration: 13m 29s)
- 23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2105.codfw.wmnet with OS bullseye
- 23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2104.codfw.wmnet with OS bullseye
- 23:26 jforrester@deploy2002: jforrester: Continuing with sync
- 23:21 jforrester@deploy2002: jforrester: Backport for [[gerrit:992775|Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 23:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55584 and previous config saved to /var/cache/conftool/dbconfig/20240124-231933-marostegui.json
- 23:19 jforrester@deploy2002: Started scap: Backport for [[gerrit:992775|Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433)]]
- 23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55583 and previous config saved to /var/cache/conftool/dbconfig/20240124-231723-marostegui.json
- 23:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 23:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55582 and previous config saved to /var/cache/conftool/dbconfig/20240124-231701-marostegui.json
- 23:04 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2103.codfw.wmnet with OS bullseye
- 23:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55581 and previous config saved to /var/cache/conftool/dbconfig/20240124-230155-marostegui.json
- 22:50 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2106.codfw.wmnet with OS bullseye
- 22:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55580 and previous config saved to /var/cache/conftool/dbconfig/20240124-224648-marostegui.json
- 22:39 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloduelastic maintenance
- 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloduelastic maintenance
- 22:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
- 22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55579 and previous config saved to /var/cache/conftool/dbconfig/20240124-223142-marostegui.json
- 22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55578 and previous config saved to /var/cache/conftool/dbconfig/20240124-222932-marostegui.json
- 22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55577 and previous config saved to /var/cache/conftool/dbconfig/20240124-222910-marostegui.json
- 22:28 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
- 22:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55576 and previous config saved to /var/cache/conftool/dbconfig/20240124-221403-marostegui.json
- 22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2105.codfw.wmnet with OS bullseye
- 22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2106.codfw.wmnet with OS bullseye
- 22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2104.codfw.wmnet with OS bullseye
- 22:10 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
- 21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55575 and previous config saved to /var/cache/conftool/dbconfig/20240124-215857-marostegui.json
- 21:45 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55574 and previous config saved to /var/cache/conftool/dbconfig/20240124-214351-marostegui.json
- 21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55573 and previous config saved to /var/cache/conftool/dbconfig/20240124-214141-marostegui.json
- 21:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 21:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55572 and previous config saved to /var/cache/conftool/dbconfig/20240124-214120-marostegui.json
- 21:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55571 and previous config saved to /var/cache/conftool/dbconfig/20240124-212613-marostegui.json
- 21:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55570 and previous config saved to /var/cache/conftool/dbconfig/20240124-211107-marostegui.json
- 21:05 aqu@deploy2002: Finished deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc] (duration: 00m 37s)
- 21:05 aqu@deploy2002: Started deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc]
- 20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55569 and previous config saved to /var/cache/conftool/dbconfig/20240124-205600-marostegui.json
- 20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55568 and previous config saved to /var/cache/conftool/dbconfig/20240124-205350-marostegui.json
- 20:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 20:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55567 and previous config saved to /var/cache/conftool/dbconfig/20240124-205327-marostegui.json
- 20:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55566 and previous config saved to /var/cache/conftool/dbconfig/20240124-203821-marostegui.json
- 20:38 fab@deploy2002: Finished deploy [airflow-dags/research@2f514fc]: (no justification provided) (duration: 00m 33s)
- 20:37 fab@deploy2002: Started deploy [airflow-dags/research@2f514fc]: (no justification provided)
- 20:26 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=scowiki --logwiki=metawiki 'TheBabushka' 'AshotGPT' # T355743
- 20:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55565 and previous config saved to /var/cache/conftool/dbconfig/20240124-202315-marostegui.json
- 20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55564 and previous config saved to /var/cache/conftool/dbconfig/20240124-200808-marostegui.json
- 20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55563 and previous config saved to /var/cache/conftool/dbconfig/20240124-200659-marostegui.json
- 20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55562 and previous config saved to /var/cache/conftool/dbconfig/20240124-200619-marostegui.json
- 20:02 cstone: payments-wiki upgraded from a3691a8e to 8cfbbb4b
- 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55561 and previous config saved to /var/cache/conftool/dbconfig/20240124-195113-marostegui.json
- 19:39 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55560 and previous config saved to /var/cache/conftool/dbconfig/20240124-193606-marostegui.json
- 19:35 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:34 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 19:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 19:24 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:23 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55559 and previous config saved to /var/cache/conftool/dbconfig/20240124-192100-marostegui.json
- 19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55558 and previous config saved to /var/cache/conftool/dbconfig/20240124-191850-marostegui.json
- 19:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 19:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 19:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55557 and previous config saved to /var/cache/conftool/dbconfig/20240124-191828-marostegui.json
- 19:16 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 19:13 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 19:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55555 and previous config saved to /var/cache/conftool/dbconfig/20240124-190322-marostegui.json
- 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55554 and previous config saved to /var/cache/conftool/dbconfig/20240124-184815-marostegui.json
- 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55553 and previous config saved to /var/cache/conftool/dbconfig/20240124-183308-marostegui.json
- 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55552 and previous config saved to /var/cache/conftool/dbconfig/20240124-183059-marostegui.json
- 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55551 and previous config saved to /var/cache/conftool/dbconfig/20240124-183001-marostegui.json
- 18:24 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55550 and previous config saved to /var/cache/conftool/dbconfig/20240124-181455-marostegui.json
- 18:09 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided) (duration: 00m 32s)
- 18:08 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided)
- 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55549 and previous config saved to /var/cache/conftool/dbconfig/20240124-175948-marostegui.json
- 17:50 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:50 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 17:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 17:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 17:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55548 and previous config saved to /var/cache/conftool/dbconfig/20240124-174442-marostegui.json
- 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55547 and previous config saved to /var/cache/conftool/dbconfig/20240124-174332-marostegui.json
- 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55546 and previous config saved to /var/cache/conftool/dbconfig/20240124-174251-marostegui.json
- 17:35 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55545 and previous config saved to /var/cache/conftool/dbconfig/20240124-172745-marostegui.json
- 17:24 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 07m 10s)
- 17:17 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 17:16 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
- 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55544 and previous config saved to /var/cache/conftool/dbconfig/20240124-171238-marostegui.json
- 17:10 sukhe: sudo cumin -b1 -s60 "R:Class = Bird" "enable-puppet 'CR991699' && run-puppet-agent"
- 17:09 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 17:06 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided) (duration: 01m 07s)
- 17:06 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided)
- 17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2053.codfw.wmnet
- 17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1053.eqiad.wmnet
- 16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2053.codfw.wmnet
- 16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1053.eqiad.wmnet
- 16:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55543 and previous config saved to /var/cache/conftool/dbconfig/20240124-165732-marostegui.json
- 16:56 vgutierrez: enable puppet on cp3066 - T354424
- 16:55 sukhe: enable puppet on durum1001 to test CR 991699
- 16:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55542 and previous config saved to /var/cache/conftool/dbconfig/20240124-165522-marostegui.json
- 16:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 16:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 16:54 XioNoX: disable puppet on all the hosts running bird to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/991699
- 16:39 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 16:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 16:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 16:30 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55541 and previous config saved to /var/cache/conftool/dbconfig/20240124-162532-marostegui.json
- 16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55540 and previous config saved to /var/cache/conftool/dbconfig/20240124-161026-marostegui.json
- 16:04 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 16:04 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 16:03 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 16:03 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab2002.codfw.wmnet
- 15:57 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/Echo/includes/Formatters/EchoRevertedPresentationModel.php: Fix EchoRevertedPresentationModel using null as string - T355751 (duration: 09m 06s)
- 15:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55539 and previous config saved to /var/cache/conftool/dbconfig/20240124-155519-marostegui.json
- 15:50 vgutierrez: disable puppet on cp3066 - T354424
- 15:48 sukhe: sudo cumin -b1 -s120 'A:dns-rec' "enable-puppet 'merging CR 980929' && run-puppet-agent"
- 15:47 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/CentralAuth/tests/phpunit/CentralAuthIdLookupTest.php: Fix CentralIdLookup tests (duration: 11m 18s)
- 15:45 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2446.codfw.wmnet with OS bullseye
- 15:42 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2430.codfw.wmnet with OS bullseye
- 15:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55538 and previous config saved to /var/cache/conftool/dbconfig/20240124-154013-marostegui.json
- 15:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2427.codfw.wmnet with OS bullseye
- 15:38 sukhe: sudo cumin 'A:dns-rec' "disable-puppet 'merging CR 980929'"
- 15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 15:38 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55537 and previous config saved to /var/cache/conftool/dbconfig/20240124-153752-marostegui.json
- 15:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 15:37 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 15:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55536 and previous config saved to /var/cache/conftool/dbconfig/20240124-153730-marostegui.json
- 15:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab2002.codfw.wmnet
- 15:37 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 15:36 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 15:32 moritzm: imported jenkins 2.426.3 for buster/bullseye T355503
- 15:25 aqu@deploy2002: Finished deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7] (duration: 00m 42s)
- 15:25 aqu@deploy2002: Started deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7]
- 15:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
- 15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55534 and previous config saved to /var/cache/conftool/dbconfig/20240124-152224-marostegui.json
- 15:22 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
- 15:21 aqu: Refinery weekly deployment train - end (scap, then deployed onto hdfs) (test cluster deploy still broken T354703)
- 15:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
- 15:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
- 15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
- 15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
- 15:12 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c] (duration: 03m 28s)
- 15:11 moritzm: uploading pymysql 1.0.2-2~wmf11u1 to apt.wikimedia.org T355531
- 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2055.codfw.wmnet
- 15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c]
- 15:08 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c] (duration: 00m 05s)
- 15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c]
- 15:07 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c] (duration: 10m 12s)
- 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55533 and previous config saved to /var/cache/conftool/dbconfig/20240124-150718-marostegui.json
- 15:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2055.codfw.wmnet
- 14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2446.codfw.wmnet with OS bullseye
- 14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2430.codfw.wmnet with OS bullseye
- 14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2427.codfw.wmnet with OS bullseye
- 14:57 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c]
- 14:57 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 14:57 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 14:56 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 14:56 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 14:56 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 14:56 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc] (duration: 03m 40s)
- 14:56 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 14:55 akosiaris: bump eventrouter limits/requests memory/cpu
- 14:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 14:55 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
- 14:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc]
- 14:52 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc] (duration: 00m 06s)
- 14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc]
- 14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55532 and previous config saved to /var/cache/conftool/dbconfig/20240124-145211-marostegui.json
- 14:51 Lucas_WMDE: UTC afternoon backport+config window done
- 14:50 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc] (duration: 09m 11s)
- 14:50 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992706|cswiki: remove unused birthday logo files]] (duration: 09m 36s)
- 14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55531 and previous config saved to /var/cache/conftool/dbconfig/20240124-144947-marostegui.json
- 14:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 14:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55530 and previous config saved to /var/cache/conftool/dbconfig/20240124-144925-marostegui.json
- 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2054.codfw.wmnet
- 14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
- 14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for [[gerrit:992706|cswiki: remove unused birthday logo files]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:41 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc]
- 14:41 aqu@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992706|cswiki: remove unused birthday logo files]]
- 14:40 aqu@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 14:39 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992678|[azwiki] Add new namespace aliases (T355041)]] (duration: 10m 00s)
- 14:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2054.codfw.wmnet
- 14:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1054.eqiad.wmnet
- 14:37 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 14:36 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 14:36 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 14:35 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 14:35 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 14:35 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
- 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55529 and previous config saved to /var/cache/conftool/dbconfig/20240124-143419-marostegui.json
- 14:34 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 14:33 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 14:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1054.eqiad.wmnet
- 14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
- 14:31 aqu: analytics/refinery weekly deployment train - begin
- 14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2052.codfw.wmnet
- 14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1052.eqiad.wmnet
- 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [[gerrit:992678|[azwiki] Add new namespace aliases (T355041)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:29 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
- 14:29 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992678|[azwiki] Add new namespace aliases (T355041)]]
- 14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992671|[ganwiki] Change autoconfirmed setting (T355126)]] (duration: 09m 51s)
- 14:26 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
- 14:25 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
- 14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
- 14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
- 14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
- 14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2052.codfw.wmnet
- 14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1052.eqiad.wmnet
- 14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 14:24 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 14:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
- 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
- 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55527 and previous config saved to /var/cache/conftool/dbconfig/20240124-141912-marostegui.json
- 14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [[gerrit:992671|[ganwiki] Change autoconfirmed setting (T355126)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992671|[ganwiki] Change autoconfirmed setting (T355126)]]
- 14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] (duration: 10m 52s)
- 14:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2053.codfw.wmnet
- 14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
- 14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2053.codfw.wmnet
- 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55526 and previous config saved to /var/cache/conftool/dbconfig/20240124-140406-marostegui.json
- 14:04 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
- 14:04 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:992631|Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)]]
- 14:03 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
- 14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55525 and previous config saved to /var/cache/conftool/dbconfig/20240124-140142-marostegui.json
- 14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55524 and previous config saved to /var/cache/conftool/dbconfig/20240124-140120-marostegui.json
- 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1053.eqiad.wmnet
- 13:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55523 and previous config saved to /var/cache/conftool/dbconfig/20240124-135424-root.json
- 13:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1053.eqiad.wmnet
- 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55522 and previous config saved to /var/cache/conftool/dbconfig/20240124-134614-marostegui.json
- 13:39 samtar@deploy2002: Finished scap: Backport for [[gerrit:991100|Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790)]] (duration: 09m 14s)
- 13:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55521 and previous config saved to /var/cache/conftool/dbconfig/20240124-133919-root.json
- 13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2051.codfw.wmnet
- 13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1051.eqiad.wmnet
- 13:32 samtar@deploy2002: samtar and varnent: Continuing with sync
- 13:32 samtar@deploy2002: samtar and varnent: Backport for gerrit:991100 Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1051.eqiad.wmnet
- 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55520 and previous config saved to /var/cache/conftool/dbconfig/20240124-133107-marostegui.json
- 13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2051.codfw.wmnet
- 13:30 samtar@deploy2002: Started scap: Backport for gerrit:991100 Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790)
- 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55519 and previous config saved to /var/cache/conftool/dbconfig/20240124-132414-root.json
- 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55518 and previous config saved to /var/cache/conftool/dbconfig/20240124-131600-marostegui.json
- 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55517 and previous config saved to /var/cache/conftool/dbconfig/20240124-130909-root.json
- 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55516 and previous config saved to /var/cache/conftool/dbconfig/20240124-125404-root.json
- 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2052.codfw.wmnet
- 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55515 and previous config saved to /var/cache/conftool/dbconfig/20240124-123859-root.json
- 12:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2052.codfw.wmnet
- 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1052.eqiad.wmnet
- 12:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1052.eqiad.wmnet
- 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55514 and previous config saved to /var/cache/conftool/dbconfig/20240124-122354-root.json
- 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231 T355760', diff saved to https://phabricator.wikimedia.org/P55513 and previous config saved to /var/cache/conftool/dbconfig/20240124-122148-root.json
- 12:20 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1173 to s6 primary T355760', diff saved to https://phabricator.wikimedia.org/P55512 and previous config saved to /var/cache/conftool/dbconfig/20240124-122030-marostegui.json
- 12:19 marostegui: Starting s6 eqiad failover from db1231 to db1173 - T355760
- 12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 12:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55510 and previous config saved to /var/cache/conftool/dbconfig/20240124-121448-marostegui.json
- 12:07 ladsgroup@deploy2002: Finished scap: Backport for [[gerrit:992514|GenerateFancyCaptchas: Add ->disableSandbox() to shell command]] (duration: 09m 55s)
- 12:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
- 12:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
- 12:00 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55509 and previous config saved to /var/cache/conftool/dbconfig/20240124-115942-marostegui.json
- 11:58 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:992514|GenerateFancyCaptchas: Add ->disableSandbox() to shell command]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test1001.eqiad.wmnet
- 11:57 ladsgroup@deploy2002: Started scap: Backport for [[gerrit:992514|GenerateFancyCaptchas: Add ->disableSandbox() to shell command]]
- 11:57 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 11:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 11:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2050.codfw.wmnet
- 11:55 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host acmechief-test2001.codfw.wmnet
- 11:55 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1050.eqiad.wmnet
- 11:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 11:52 hnowlan@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 11:52 hnowlan@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2050.codfw.wmnet
- 11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1050.eqiad.wmnet
- 11:47 hnowlan@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 11:46 hnowlan@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55506 and previous config saved to /var/cache/conftool/dbconfig/20240124-114435-marostegui.json
- 11:43 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test2001.codfw.wmnet
- 11:33 vgutierrez: repool cp3066 - T354424
- 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1014.eqiad.wmnet with OS bullseye
- 11:32 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 11:32 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 11:31 vgutierrez: depooling cp3066 - T354424
- 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55505 and previous config saved to /var/cache/conftool/dbconfig/20240124-112929-marostegui.json
- 11:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55504 and previous config saved to /var/cache/conftool/dbconfig/20240124-112705-marostegui.json
- 11:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 11:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55503 and previous config saved to /var/cache/conftool/dbconfig/20240124-112643-marostegui.json
- 11:26 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 11:26 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 11:24 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 11:24 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55501 and previous config saved to /var/cache/conftool/dbconfig/20240124-111136-marostegui.json
- 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
- 10:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
- 10:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=rowikinews --fix # T350889
- 10:57 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1173 with weight 0 T355760', diff saved to https://phabricator.wikimedia.org/P55500 and previous config saved to /var/cache/conftool/dbconfig/20240124-105702-root.json
- 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55499 and previous config saved to /var/cache/conftool/dbconfig/20240124-105630-marostegui.json
- 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
- 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1014.eqiad.wmnet
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
- 10:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55498 and previous config saved to /var/cache/conftool/dbconfig/20240124-104123-marostegui.json
- 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55497 and previous config saved to /var/cache/conftool/dbconfig/20240124-103900-marostegui.json
- 10:38 hashar: deployment-server: removing `gerrit` remote from `/srv/mediawiki-staging` given it is tied to a specific username and the `origin` remote already has ssh protocol for push # ping James_F
- 10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55496 and previous config saved to /var/cache/conftool/dbconfig/20240124-103837-marostegui.json
- 10:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1014.eqiad.wmnet
- 10:36 moritzm: upgrading cumin1002 to pymysql 1.0.2-2~wmf11u1 T355531
- 10:31 hashar@deploy2002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.42.0-wmf.15" - T354433
- 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55495 and previous config saved to /var/cache/conftool/dbconfig/20240124-102330-marostegui.json
- 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
- 10:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
- 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55494 and previous config saved to /var/cache/conftool/dbconfig/20240124-100824-marostegui.json
- 10:00 vgutierrez: repool cp3066 - T354424
- 09:58 vgutierrez: depooling cp3066 - T354424
- 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
- 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55493 and previous config saved to /var/cache/conftool/dbconfig/20240124-095317-marostegui.json
- 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55492 and previous config saved to /var/cache/conftool/dbconfig/20240124-095054-marostegui.json
- 09:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 09:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55491 and previous config saved to /var/cache/conftool/dbconfig/20240124-095032-marostegui.json
- 09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
- 09:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
- 09:49 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
- 09:41 ayounsi@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
- 09:41 ayounsi@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
- 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55489 and previous config saved to /var/cache/conftool/dbconfig/20240124-093526-marostegui.json
- 09:32 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
- 09:32 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
- 09:31 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
- 09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
- 09:27 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 06m 55s)
- 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55488 and previous config saved to /var/cache/conftool/dbconfig/20240124-092019-marostegui.json
- 09:20 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
- 09:08 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti1037.eqiad.wmnet
- 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55487 and previous config saved to /var/cache/conftool/dbconfig/20240124-090512-marostegui.json
- 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55486 and previous config saved to /var/cache/conftool/dbconfig/20240124-090250-marostegui.json
- 09:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55485 and previous config saved to /var/cache/conftool/dbconfig/20240124-090228-marostegui.json
- 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
- 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55484 and previous config saved to /var/cache/conftool/dbconfig/20240124-084721-marostegui.json
- 08:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1037.eqiad.wmnet
- 08:36 hashar@deploy2002: Finished scap: Backport for gerrit:992513 Use a class for 'LogActionsHandlers' (T355680) (duration: 08m 00s)
- 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55483 and previous config saved to /var/cache/conftool/dbconfig/20240124-083215-marostegui.json
- 08:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
- 08:30 hashar@deploy2002: hashar: Continuing with sync
- 08:30 hashar@deploy2002: hashar: Backport for gerrit:992513 Use a class for 'LogActionsHandlers' (T355680) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:28 hashar@deploy2002: Started scap: Backport for gerrit:992513 Use a class for 'LogActionsHandlers' (T355680)
- 08:25 logmsgbot: wmde-fisch@deploy2002 Finished scap: Backport for gerrit:992411 Allow Cite events for reference previews baseline stats (T353798) (duration: 08m 32s)
- 08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Continuing with sync
- 08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for gerrit:992411 Allow Cite events for reference previews baseline stats (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:17 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for gerrit:992411 Allow Cite events for reference previews baseline stats (T353798)
- 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55482 and previous config saved to /var/cache/conftool/dbconfig/20240124-081708-marostegui.json
- 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55481 and previous config saved to /var/cache/conftool/dbconfig/20240124-081445-marostegui.json
- 08:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 08:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 08:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 08:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 08:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55480 and previous config saved to /var/cache/conftool/dbconfig/20240124-081340-marostegui.json
- 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55479 and previous config saved to /var/cache/conftool/dbconfig/20240124-081050-root.json
- 08:07 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for gerrit:992411 Allow Cite events for reference previews baseline stats (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:05 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for gerrit:992411 Allow Cite events for reference previews baseline stats (T353798)
- 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55478 and previous config saved to /var/cache/conftool/dbconfig/20240124-075834-marostegui.json
- 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55477 and previous config saved to /var/cache/conftool/dbconfig/20240124-075545-root.json
- 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55476 and previous config saved to /var/cache/conftool/dbconfig/20240124-074327-marostegui.json
- 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55475 and previous config saved to /var/cache/conftool/dbconfig/20240124-074040-root.json
- 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55474 and previous config saved to /var/cache/conftool/dbconfig/20240124-072821-marostegui.json
- 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55473 and previous config saved to /var/cache/conftool/dbconfig/20240124-072557-marostegui.json
- 07:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55472 and previous config saved to /var/cache/conftool/dbconfig/20240124-072535-root.json
- 07:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55471 and previous config saved to /var/cache/conftool/dbconfig/20240124-072523-marostegui.json
- 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55470 and previous config saved to /var/cache/conftool/dbconfig/20240124-071954-root.json
- 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55469 and previous config saved to /var/cache/conftool/dbconfig/20240124-071030-root.json
- 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55468 and previous config saved to /var/cache/conftool/dbconfig/20240124-071016-marostegui.json
- 07:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55467 and previous config saved to /var/cache/conftool/dbconfig/20240124-070449-root.json
- 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55466 and previous config saved to /var/cache/conftool/dbconfig/20240124-065525-root.json
- 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55465 and previous config saved to /var/cache/conftool/dbconfig/20240124-065510-marostegui.json
- 06:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55464 and previous config saved to /var/cache/conftool/dbconfig/20240124-064944-root.json
- 06:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2129.codfw.wmnet with OS bookworm
- 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55463 and previous config saved to /var/cache/conftool/dbconfig/20240124-064020-root.json
- 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55462 and previous config saved to /var/cache/conftool/dbconfig/20240124-064003-marostegui.json
- 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55461 and previous config saved to /var/cache/conftool/dbconfig/20240124-063739-marostegui.json
- 06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 06:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55460 and previous config saved to /var/cache/conftool/dbconfig/20240124-063717-marostegui.json
- 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55459 and previous config saved to /var/cache/conftool/dbconfig/20240124-063440-root.json
- 06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55458 and previous config saved to /var/cache/conftool/dbconfig/20240124-062210-marostegui.json
- 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55457 and previous config saved to /var/cache/conftool/dbconfig/20240124-061934-root.json
- 06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
- 06:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
- 06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55456 and previous config saved to /var/cache/conftool/dbconfig/20240124-060703-marostegui.json
- 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55455 and previous config saved to /var/cache/conftool/dbconfig/20240124-060429-root.json
- 05:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2129.codfw.wmnet with OS bookworm
- 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129 T354506', diff saved to https://phabricator.wikimedia.org/P55454 and previous config saved to /var/cache/conftool/dbconfig/20240124-055635-marostegui.json
- 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55453 and previous config saved to /var/cache/conftool/dbconfig/20240124-055157-marostegui.json
- 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158 db2157 es2026 db2136 T355437', diff saved to https://phabricator.wikimedia.org/P55452 and previous config saved to /var/cache/conftool/dbconfig/20240124-055143-marostegui.json
- 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55451 and previous config saved to /var/cache/conftool/dbconfig/20240124-054932-marostegui.json
- 05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
- 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55450 and previous config saved to /var/cache/conftool/dbconfig/20240124-054924-root.json
- 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
- 05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
- 05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
- 05:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 05:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 02:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 02:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 02:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55449 and previous config saved to /var/cache/conftool/dbconfig/20240124-023210-marostegui.json
- 02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55448 and previous config saved to /var/cache/conftool/dbconfig/20240124-021704-marostegui.json
- 02:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55447 and previous config saved to /var/cache/conftool/dbconfig/20240124-020157-marostegui.json
- 01:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55445 and previous config saved to /var/cache/conftool/dbconfig/20240124-014651-marostegui.json
- 01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55444 and previous config saved to /var/cache/conftool/dbconfig/20240124-014430-marostegui.json
- 01:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 01:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55443 and previous config saved to /var/cache/conftool/dbconfig/20240124-014408-marostegui.json
- 01:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55442 and previous config saved to /var/cache/conftool/dbconfig/20240124-012902-marostegui.json
- 01:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55441 and previous config saved to /var/cache/conftool/dbconfig/20240124-011355-marostegui.json
- 00:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55440 and previous config saved to /var/cache/conftool/dbconfig/20240124-005849-marostegui.json
- 00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55439 and previous config saved to /var/cache/conftool/dbconfig/20240124-005627-marostegui.json
- 00:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 00:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55438 and previous config saved to /var/cache/conftool/dbconfig/20240124-005605-marostegui.json
- 00:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55437 and previous config saved to /var/cache/conftool/dbconfig/20240124-004058-marostegui.json
- 00:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55436 and previous config saved to /var/cache/conftool/dbconfig/20240124-002551-marostegui.json
- 00:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55435 and previous config saved to /var/cache/conftool/dbconfig/20240124-001044-marostegui.json
- 00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55434 and previous config saved to /var/cache/conftool/dbconfig/20240124-000824-marostegui.json
- 00:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
- 00:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
- 00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55433 and previous config saved to /var/cache/conftool/dbconfig/20240124-000802-marostegui.json
2024-01-23
- 23:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55432 and previous config saved to /var/cache/conftool/dbconfig/20240123-235255-marostegui.json
- 23:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55430 and previous config saved to /var/cache/conftool/dbconfig/20240123-233749-marostegui.json
- 23:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55429 and previous config saved to /var/cache/conftool/dbconfig/20240123-232242-marostegui.json
- 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55428 and previous config saved to /var/cache/conftool/dbconfig/20240123-232021-marostegui.json
- 23:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 23:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55427 and previous config saved to /var/cache/conftool/dbconfig/20240123-231959-marostegui.json
- 23:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55426 and previous config saved to /var/cache/conftool/dbconfig/20240123-230453-marostegui.json
- 22:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55425 and previous config saved to /var/cache/conftool/dbconfig/20240123-224946-marostegui.json
- 22:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55424 and previous config saved to /var/cache/conftool/dbconfig/20240123-223439-marostegui.json
- 22:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55423 and previous config saved to /var/cache/conftool/dbconfig/20240123-223215-marostegui.json
- 22:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 22:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55422 and previous config saved to /var/cache/conftool/dbconfig/20240123-223153-marostegui.json
- 22:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55421 and previous config saved to /var/cache/conftool/dbconfig/20240123-221646-marostegui.json
- 22:03 kostajh: UTC late deploys done
- 22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 195.70.81.86
- 22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 62.232.9.14
- 22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 195.70.81.86
- 22:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55420 and previous config saved to /var/cache/conftool/dbconfig/20240123-220140-marostegui.json
- 22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 62.232.9.14
- 21:59 kharlan@deploy2002: Finished scap: Backport for [[gerrit:992461|[knwiki] Removing the temporary logo (already reverted) (T338136)]], [[gerrit:992466|[itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694)]], [[gerrit:992471|[enwiki] and [enwikibooks] Throttle exemption for event (T355695)]] (duration: 15m 33s)
- 21:53 kharlan@deploy2002: superpes and kharlan: Continuing with sync
- 21:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55419 and previous config saved to /var/cache/conftool/dbconfig/20240123-214633-marostegui.json
- 21:45 kharlan@deploy2002: superpes and kharlan: Backport for [[gerrit:992461|[knwiki] Removing the temporary logo (already reverted) (T338136)]], [[gerrit:992466|[itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694)]], [[gerrit:992471|[enwiki] and [enwikibooks] Throttle exemption for event (T355695)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55418 and previous config saved to /var/cache/conftool/dbconfig/20240123-214413-marostegui.json
- 21:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 21:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55417 and previous config saved to /var/cache/conftool/dbconfig/20240123-214351-marostegui.json
- 21:43 kharlan@deploy2002: Started scap: Backport for [[gerrit:992461|[knwiki] Removing the temporary logo (already reverted) (T338136)]], [[gerrit:992466|[itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694)]], [[gerrit:992471|[enwiki] and [enwikibooks] Throttle exemption for event (T355695)]]
- 21:36 kharlan@deploy2002: Finished scap: Backport for gerrit:992506 revertrisk: Fix i18n message reference (T348298), gerrit:992507 revertrisk: Fix i18n messages (T348298) (duration: 30m 51s)
- 21:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55416 and previous config saved to /var/cache/conftool/dbconfig/20240123-212845-marostegui.json
- 21:26 kharlan@deploy2002: kharlan: Continuing with sync
- 21:26 kharlan@deploy2002: kharlan: Backport for gerrit:992506 revertrisk: Fix i18n message reference (T348298), gerrit:992507 revertrisk: Fix i18n messages (T348298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55415 and previous config saved to /var/cache/conftool/dbconfig/20240123-211338-marostegui.json
- 21:05 kharlan@deploy2002: Started scap: Backport for gerrit:992506 revertrisk: Fix i18n message reference (T348298), gerrit:992507 revertrisk: Fix i18n messages (T348298)
- 20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55414 and previous config saved to /var/cache/conftool/dbconfig/20240123-205832-marostegui.json
- 20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55413 and previous config saved to /var/cache/conftool/dbconfig/20240123-205611-marostegui.json
- 20:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 20:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 20:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55412 and previous config saved to /var/cache/conftool/dbconfig/20240123-205549-marostegui.json
- 20:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55411 and previous config saved to /var/cache/conftool/dbconfig/20240123-204043-marostegui.json
- 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55410 and previous config saved to /var/cache/conftool/dbconfig/20240123-202536-marostegui.json
- 20:23 cstone: payments-wiki upgraded from c2138768 to a3691a8e
- 20:23 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
- 20:12 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
- 20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55409 and previous config saved to /var/cache/conftool/dbconfig/20240123-201030-marostegui.json
- 20:08 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
- 20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55408 and previous config saved to /var/cache/conftool/dbconfig/20240123-200809-marostegui.json
- 20:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55407 and previous config saved to /var/cache/conftool/dbconfig/20240123-200726-marostegui.json
- 19:57 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
- 19:57 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
- 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55406 and previous config saved to /var/cache/conftool/dbconfig/20240123-195220-marostegui.json
- 19:49 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
- 19:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
- 19:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
- 19:45 mutante: phab1004 - /srv/phab/phabricator/bin/mail volume
- 19:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55405 and previous config saved to /var/cache/conftool/dbconfig/20240123-193713-marostegui.json
- 19:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55404 and previous config saved to /var/cache/conftool/dbconfig/20240123-192207-marostegui.json
- 19:21 ejegg: fundraising civicrm upgraded from d8b0c977 to b85b6dde
- 19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55403 and previous config saved to /var/cache/conftool/dbconfig/20240123-191945-marostegui.json
- 19:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 19:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55402 and previous config saved to /var/cache/conftool/dbconfig/20240123-191922-marostegui.json
- 19:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55401 and previous config saved to /var/cache/conftool/dbconfig/20240123-190416-marostegui.json
- 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55400 and previous config saved to /var/cache/conftool/dbconfig/20240123-184909-marostegui.json
- 18:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 18:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 18:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 18:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 18:35 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 18:35 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
- 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55399 and previous config saved to /var/cache/conftool/dbconfig/20240123-183403-marostegui.json
- 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55398 and previous config saved to /var/cache/conftool/dbconfig/20240123-183141-marostegui.json
- 18:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 18:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55397 and previous config saved to /var/cache/conftool/dbconfig/20240123-183120-marostegui.json
- 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55396 and previous config saved to /var/cache/conftool/dbconfig/20240123-181613-marostegui.json
- 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55395 and previous config saved to /var/cache/conftool/dbconfig/20240123-180107-marostegui.json
- 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55394 and previous config saved to /var/cache/conftool/dbconfig/20240123-174600-marostegui.json
- 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55393 and previous config saved to /var/cache/conftool/dbconfig/20240123-174339-marostegui.json
- 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 17:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55392 and previous config saved to /var/cache/conftool/dbconfig/20240123-174215-marostegui.json
- 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55391 and previous config saved to /var/cache/conftool/dbconfig/20240123-172709-marostegui.json
- 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55390 and previous config saved to /var/cache/conftool/dbconfig/20240123-171202-marostegui.json
- 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55389 and previous config saved to /var/cache/conftool/dbconfig/20240123-165656-marostegui.json
- 16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55388 and previous config saved to /var/cache/conftool/dbconfig/20240123-165433-marostegui.json
- 16:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 16:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 16:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 16:49 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1003.eqiad.wmnet with OS bookworm
- 16:39 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest1003.eqiad.wmnet with OS bookworm
- 16:14 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
- 16:14 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
- 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55387 and previous config saved to /var/cache/conftool/dbconfig/20240123-161426-root.json
- 16:10 sukhe: enable puppet on A:lvs to merge CR 991785 and run agent on all nodes
- 15:59 sukhe: disable puppet on A:lvs to merge CR 991785
- 15:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55386 and previous config saved to /var/cache/conftool/dbconfig/20240123-155921-root.json
- 15:55 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:54 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:54 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:53 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:52 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:52 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55385 and previous config saved to /var/cache/conftool/dbconfig/20240123-155219-ladsgroup.json
- 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55384 and previous config saved to /var/cache/conftool/dbconfig/20240123-154416-root.json
- 15:41 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 15:39 claime: trafficserver: move 30% of traffic to mw on k8s - T355532
- 15:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
- 15:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
- 15:37 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
- 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55383 and previous config saved to /var/cache/conftool/dbconfig/20240123-153712-ladsgroup.json
- 15:36 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
- 15:36 claime: Bumping mw-api-ext replicas - T355532
- 15:36 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 15:36 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 15:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 15:35 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 15:35 claime: Bumping mw-web replicas - T355532
- 15:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
- 15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
- 15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
- 15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
- 15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
- 15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
- 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55382 and previous config saved to /var/cache/conftool/dbconfig/20240123-152911-root.json
- 15:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
- 15:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55381 and previous config saved to /var/cache/conftool/dbconfig/20240123-152206-ladsgroup.json
- 15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
- 15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
- 15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
- 15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
- 15:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
- 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55380 and previous config saved to /var/cache/conftool/dbconfig/20240123-151406-root.json
- 15:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
- 15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
- 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
- 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
- 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
- 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
- 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55379 and previous config saved to /var/cache/conftool/dbconfig/20240123-150659-ladsgroup.json
- 15:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
- 15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
- 15:00 Lucas_WMDE: UTC afternoon backport+config window done
- 14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:992423 ORES: Enable renamed revertrisklanguageagnostic model (T348298) (duration: 11m 20s)
- 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55378 and previous config saved to /var/cache/conftool/dbconfig/20240123-145901-root.json
- 14:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55377 and previous config saved to /var/cache/conftool/dbconfig/20240123-145353-marostegui.json
- 14:53 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Continuing with sync
- 14:49 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Backport for gerrit:992423 ORES: Enable renamed revertrisklanguageagnostic model (T348298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:48 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:992423 ORES: Enable renamed revertrisklanguageagnostic model (T348298)
- 14:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1173.eqiad.wmnet with OS bookworm
- 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55376 and previous config saved to /var/cache/conftool/dbconfig/20240123-144356-root.json
- 14:42 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:992376 Restore support for matching 'LIKE' patterns/wildcards (T355478) (duration: 07m 50s)
- 14:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55375 and previous config saved to /var/cache/conftool/dbconfig/20240123-143846-marostegui.json
- 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
- 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for gerrit:992376 Restore support for matching 'LIKE' patterns/wildcards (T355478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:992376 Restore support for matching 'LIKE' patterns/wildcards (T355478)
- 14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:992377 Restore support for matching 'LIKE' patterns/wildcards (T355478) (duration: 10m 29s)
- 14:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
- 14:32 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
- 14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
- 14:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
- 14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for gerrit:992377 Restore support for matching 'LIKE' patterns/wildcards (T355478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:24 pt1979@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
- 14:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
- 14:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55374 and previous config saved to /var/cache/conftool/dbconfig/20240123-142339-marostegui.json
- 14:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
- 14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:992377 Restore support for matching 'LIKE' patterns/wildcards (T355478)
- 14:20 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991606 ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366) (duration: 11m 49s)
- 14:15 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1003.eqiad.wmnet
- 14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Continuing with sync
- 14:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1173.eqiad.wmnet with OS bookworm
- 14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Backport for gerrit:991606 ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55373 and previous config saved to /var/cache/conftool/dbconfig/20240123-140833-marostegui.json
- 14:07 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
- 14:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991606 ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366)
- 14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T343718)', diff saved to https://phabricator.wikimedia.org/P55372 and previous config saved to /var/cache/conftool/dbconfig/20240123-140636-ladsgroup.json
- 14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 14:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 14:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55371 and previous config saved to /var/cache/conftool/dbconfig/20240123-135819-marostegui.json
- 13:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
- 13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
- 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55370 and previous config saved to /var/cache/conftool/dbconfig/20240123-135757-marostegui.json
- 13:52 Dreamy_Jazz: Ran `foreachwikiindblist group0 extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
- 13:51 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 13:50 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 13:50 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 13:49 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
- 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55369 and previous config saved to /var/cache/conftool/dbconfig/20240123-134250-marostegui.json
- 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55368 and previous config saved to /var/cache/conftool/dbconfig/20240123-132744-marostegui.json
- 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55367 and previous config saved to /var/cache/conftool/dbconfig/20240123-131909-root.json
- 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
- 13:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
- 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55366 and previous config saved to /var/cache/conftool/dbconfig/20240123-131237-marostegui.json
- 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55365 and previous config saved to /var/cache/conftool/dbconfig/20240123-131027-marostegui.json
- 13:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 13:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55364 and previous config saved to /var/cache/conftool/dbconfig/20240123-131005-marostegui.json
- 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55363 and previous config saved to /var/cache/conftool/dbconfig/20240123-130404-root.json
- 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
- 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55362 and previous config saved to /var/cache/conftool/dbconfig/20240123-125459-marostegui.json
- 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55361 and previous config saved to /var/cache/conftool/dbconfig/20240123-124859-root.json
- 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1017.eqiad.wmnet
- 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55360 and previous config saved to /var/cache/conftool/dbconfig/20240123-123952-marostegui.json
- 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55359 and previous config saved to /var/cache/conftool/dbconfig/20240123-123354-root.json
- 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55358 and previous config saved to /var/cache/conftool/dbconfig/20240123-123346-root.json
- 12:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1017.eqiad.wmnet
- 12:28 claime: Restarting killed maintenance job mediawiki_job_MachineVision_prioritize_uncategorized.service
- 12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sretest1001.eqiad.wmnet
- 12:26 kamila@cumin1002: START - Cookbook sre.hosts.remove-downtime for sretest1001.eqiad.wmnet
- 12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
- 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55357 and previous config saved to /var/cache/conftool/dbconfig/20240123-122446-marostegui.json
- 12:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
- 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55356 and previous config saved to /var/cache/conftool/dbconfig/20240123-122336-marostegui.json
- 12:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55355 and previous config saved to /var/cache/conftool/dbconfig/20240123-122314-marostegui.json
- 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55354 and previous config saved to /var/cache/conftool/dbconfig/20240123-122105-root.json
- 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55353 and previous config saved to /var/cache/conftool/dbconfig/20240123-121849-root.json
- 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55352 and previous config saved to /var/cache/conftool/dbconfig/20240123-121841-root.json
- 12:17 claime: Restarting ferm.service on k8s node mw1495.eqiad.wmnet - T354855
- 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1016.eqiad.wmnet
- 12:14 claime: scap::dsh::scap_proxies: Replace mw1486 by mw1405 - T355622
- 12:13 Amir1: dropping bv2015_edits table from all wikis (T355594)
- 12:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55351 and previous config saved to /var/cache/conftool/dbconfig/20240123-120807-marostegui.json
- 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55350 and previous config saved to /var/cache/conftool/dbconfig/20240123-120600-root.json
- 12:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1016.eqiad.wmnet
- 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55349 and previous config saved to /var/cache/conftool/dbconfig/20240123-120344-root.json
- 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55348 and previous config saved to /var/cache/conftool/dbconfig/20240123-120335-root.json
- 12:03 Amir1: dropping bv2009_edits table from all wikis (T355594)
- 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
- 11:54 godog: initial cleanup of replicated thanos blocks - T351927
- 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55347 and previous config saved to /var/cache/conftool/dbconfig/20240123-115301-marostegui.json
- 11:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55346 and previous config saved to /var/cache/conftool/dbconfig/20240123-115055-root.json
- 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55345 and previous config saved to /var/cache/conftool/dbconfig/20240123-114840-root.json
- 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55344 and previous config saved to /var/cache/conftool/dbconfig/20240123-114831-root.json
- 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1173', diff saved to https://phabricator.wikimedia.org/P55343 and previous config saved to /var/cache/conftool/dbconfig/20240123-114826-marostegui.json
- 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55342 and previous config saved to /var/cache/conftool/dbconfig/20240123-113754-marostegui.json
- 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55341 and previous config saved to /var/cache/conftool/dbconfig/20240123-113550-root.json
- 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55340 and previous config saved to /var/cache/conftool/dbconfig/20240123-113544-marostegui.json
- 11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55339 and previous config saved to /var/cache/conftool/dbconfig/20240123-113522-marostegui.json
- 11:35 marostegui: Starting s6 eqiad failover from db1173 to db1231 - T355660
- 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
- 11:31 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
- 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55338 and previous config saved to /var/cache/conftool/dbconfig/20240123-112420-root.json
- 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55336 and previous config saved to /var/cache/conftool/dbconfig/20240123-112016-marostegui.json
- 11:11 Amir1: dropping pif_edits table from all wikis (T355594)
- 11:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1017.eqiad.wmnet with OS bullseye
- 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55335 and previous config saved to /var/cache/conftool/dbconfig/20240123-110915-root.json
- 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T355660', diff saved to https://phabricator.wikimedia.org/P55333 and previous config saved to /var/cache/conftool/dbconfig/20240123-110743-marostegui.json
- 11:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
- 11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
- 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55332 and previous config saved to /var/cache/conftool/dbconfig/20240123-110540-root.json
- 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55331 and previous config saved to /var/cache/conftool/dbconfig/20240123-110509-marostegui.json
- 10:58 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1002.eqiad.wmnet
- 10:58 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:58 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 10:56 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2171.codfw.wmnet with OS bookworm
- 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55330 and previous config saved to /var/cache/conftool/dbconfig/20240123-105410-root.json
- 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55329 and previous config saved to /var/cache/conftool/dbconfig/20240123-105035-root.json
- 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55328 and previous config saved to /var/cache/conftool/dbconfig/20240123-105003-marostegui.json
- 10:48 btullis@cumin1002: START - Cookbook sre.dns.netbox
- 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55327 and previous config saved to /var/cache/conftool/dbconfig/20240123-104753-marostegui.json
- 10:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 10:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55326 and previous config saved to /var/cache/conftool/dbconfig/20240123-104731-marostegui.json
- 10:43 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1002.eqiad.wmnet
- 10:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
- 10:34 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1001.eqiad.wmnet
- 10:34 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:34 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 10:32 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55325 and previous config saved to /var/cache/conftool/dbconfig/20240123-103225-marostegui.json
- 10:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
- 10:27 btullis@cumin1002: START - Cookbook sre.dns.netbox
- 10:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
- 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55324 and previous config saved to /var/cache/conftool/dbconfig/20240123-101718-marostegui.json
- 10:13 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
- 10:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
- 10:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2171.codfw.wmnet with OS bookworm
- 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2171:3315 db2171:3316', diff saved to https://phabricator.wikimedia.org/P55323 and previous config saved to /var/cache/conftool/dbconfig/20240123-101056-marostegui.json
- 10:10 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1001.eqiad.wmnet
- 10:04 ayounsi@cumin1002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
- 10:04 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
- 10:03 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
- 10:03 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
- 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
- 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55322 and previous config saved to /var/cache/conftool/dbconfig/20240123-100212-marostegui.json
- 10:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55321 and previous config saved to /var/cache/conftool/dbconfig/20240123-100002-marostegui.json
- 09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 09:59 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
- 09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 09:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55320 and previous config saved to /var/cache/conftool/dbconfig/20240123-095923-marostegui.json
- 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55319 and previous config saved to /var/cache/conftool/dbconfig/20240123-094417-marostegui.json
- 09:41 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
- 09:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
- 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55318 and previous config saved to /var/cache/conftool/dbconfig/20240123-092910-marostegui.json
- 09:24 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.15 refs T354433
- 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55317 and previous config saved to /var/cache/conftool/dbconfig/20240123-091404-marostegui.json
- 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55316 and previous config saved to /var/cache/conftool/dbconfig/20240123-091154-marostegui.json
- 09:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 09:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55315 and previous config saved to /var/cache/conftool/dbconfig/20240123-091132-marostegui.json
- 09:04 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
- 09:01 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
- 09:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55314 and previous config saved to /var/cache/conftool/dbconfig/20240123-090104-root.json
- 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55313 and previous config saved to /var/cache/conftool/dbconfig/20240123-085625-marostegui.json
- 08:55 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992245/ https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992359/
- 08:51 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
- 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55312 and previous config saved to /var/cache/conftool/dbconfig/20240123-084559-root.json
- 08:44 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 08:44 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55311 and previous config saved to /var/cache/conftool/dbconfig/20240123-084301-ladsgroup.json
- 08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 08:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55310 and previous config saved to /var/cache/conftool/dbconfig/20240123-084244-ladsgroup.json
- 08:41 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
- 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55309 and previous config saved to /var/cache/conftool/dbconfig/20240123-084119-marostegui.json
- 08:39 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
- 08:37 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55308 and previous config saved to /var/cache/conftool/dbconfig/20240123-083054-root.json
- 08:28 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992244
- 08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55307 and previous config saved to /var/cache/conftool/dbconfig/20240123-082738-ladsgroup.json
- 08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55306 and previous config saved to /var/cache/conftool/dbconfig/20240123-082613-marostegui.json
- 08:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55305 and previous config saved to /var/cache/conftool/dbconfig/20240123-082402-marostegui.json
- 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55304 and previous config saved to /var/cache/conftool/dbconfig/20240123-082340-marostegui.json
- 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55303 and previous config saved to /var/cache/conftool/dbconfig/20240123-081549-root.json
- 08:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55302 and previous config saved to /var/cache/conftool/dbconfig/20240123-081231-ladsgroup.json
- 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55301 and previous config saved to /var/cache/conftool/dbconfig/20240123-080834-marostegui.json
- 08:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2051.codfw.wmnet
- 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55300 and previous config saved to /var/cache/conftool/dbconfig/20240123-080044-root.json
- 07:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2051.codfw.wmnet
- 07:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55299 and previous config saved to /var/cache/conftool/dbconfig/20240123-075725-ladsgroup.json
- 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1051.eqiad.wmnet
- 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55298 and previous config saved to /var/cache/conftool/dbconfig/20240123-075327-marostegui.json
- 07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1051.eqiad.wmnet
- 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55297 and previous config saved to /var/cache/conftool/dbconfig/20240123-074538-root.json
- 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55296 and previous config saved to /var/cache/conftool/dbconfig/20240123-073821-marostegui.json
- 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55295 and previous config saved to /var/cache/conftool/dbconfig/20240123-073610-marostegui.json
- 07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 07:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55294 and previous config saved to /var/cache/conftool/dbconfig/20240123-073548-marostegui.json
- 07:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS bookworm
- 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55293 and previous config saved to /var/cache/conftool/dbconfig/20240123-073033-root.json
- 07:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55292 and previous config saved to /var/cache/conftool/dbconfig/20240123-073021-ladsgroup.json
- 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55291 and previous config saved to /var/cache/conftool/dbconfig/20240123-072041-marostegui.json
- 07:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55290 and previous config saved to /var/cache/conftool/dbconfig/20240123-071515-ladsgroup.json
- 07:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
- 07:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
- 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55289 and previous config saved to /var/cache/conftool/dbconfig/20240123-070535-marostegui.json
- 07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55288 and previous config saved to /var/cache/conftool/dbconfig/20240123-070008-ladsgroup.json
- 06:57 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS bookworm
- 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231', diff saved to https://phabricator.wikimedia.org/P55287 and previous config saved to /var/cache/conftool/dbconfig/20240123-065606-marostegui.json
- 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55285 and previous config saved to /var/cache/conftool/dbconfig/20240123-065029-marostegui.json
- 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55284 and previous config saved to /var/cache/conftool/dbconfig/20240123-064819-marostegui.json
- 06:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 06:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55283 and previous config saved to /var/cache/conftool/dbconfig/20240123-064757-marostegui.json
- 06:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55282 and previous config saved to /var/cache/conftool/dbconfig/20240123-064502-ladsgroup.json
- 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55281 and previous config saved to /var/cache/conftool/dbconfig/20240123-063250-marostegui.json
- 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55280 and previous config saved to /var/cache/conftool/dbconfig/20240123-061744-marostegui.json
- 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55279 and previous config saved to /var/cache/conftool/dbconfig/20240123-060237-marostegui.json
- 06:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55278 and previous config saved to /var/cache/conftool/dbconfig/20240123-060127-marostegui.json
- 06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 06:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 05:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 05:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 04:54 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.15 refs T354433 (duration: 51m 22s)
- 04:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.15 refs T354433
- 01:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55277 and previous config saved to /var/cache/conftool/dbconfig/20240123-011434-ladsgroup.json
- 01:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 01:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 00:58 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=ruwikinews --fix # T350889
- 00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwikinews --fix # T350889
- 00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwiki --fix # T350889
- 00:56 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=enwiki --fix # T350889
- 00:55 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=cywiki --fix # T350889
- 00:42 zabe: running 'zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=viwiki --fix' in screen
- 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55276 and previous config saved to /var/cache/conftool/dbconfig/20240123-003338-ladsgroup.json
- 00:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 00:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55275 and previous config saved to /var/cache/conftool/dbconfig/20240123-003316-ladsgroup.json
- 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55274 and previous config saved to /var/cache/conftool/dbconfig/20240123-001810-ladsgroup.json
- 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55273 and previous config saved to /var/cache/conftool/dbconfig/20240123-000303-ladsgroup.json
2024-01-22
- 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55272 and previous config saved to /var/cache/conftool/dbconfig/20240122-234757-ladsgroup.json
- 23:14 zabe@deploy2002: Finished scap: Backport for gerrit:991930 Stop setting wgShowIPinHeader (T355479), gerrit:992250 beta: Start reading from af_user(_text)/afh_user(_text) (T355616) (duration: 07m 31s)
- 23:08 zabe@deploy2002: zabe: Continuing with sync
- 23:08 zabe@deploy2002: zabe: Backport for gerrit:991930 Stop setting wgShowIPinHeader (T355479), gerrit:992250 beta: Start reading from af_user(_text)/afh_user(_text) (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 23:06 zabe@deploy2002: Started scap: Backport for gerrit:991930 Stop setting wgShowIPinHeader (T355479), gerrit:992250 beta: Start reading from af_user(_text)/afh_user(_text) (T355616)
- 22:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 22:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55271 and previous config saved to /var/cache/conftool/dbconfig/20240122-225618-marostegui.json
- 22:47 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088']
- 22:47 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088']
- 22:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55270 and previous config saved to /var/cache/conftool/dbconfig/20240122-224111-marostegui.json
- 22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55269 and previous config saved to /var/cache/conftool/dbconfig/20240122-222605-marostegui.json
- 22:24 maryum: Deployed patch for T355538
- 22:14 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 22:14 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
- 22:13 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
- 22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55268 and previous config saved to /var/cache/conftool/dbconfig/20240122-221058-marostegui.json
- 22:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55267 and previous config saved to /var/cache/conftool/dbconfig/20240122-220850-marostegui.json
- 22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
- 22:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
- 22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 22:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 22:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55266 and previous config saved to /var/cache/conftool/dbconfig/20240122-220811-marostegui.json
- 21:56 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
- 21:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55265 and previous config saved to /var/cache/conftool/dbconfig/20240122-215305-marostegui.json
- 21:53 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
- 21:51 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
- 21:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
- 21:48 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 21:46 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
- 21:45 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
- 21:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55264 and previous config saved to /var/cache/conftool/dbconfig/20240122-213758-marostegui.json
- 21:33 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 21:32 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 21:24 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 21:24 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55263 and previous config saved to /var/cache/conftool/dbconfig/20240122-212252-marostegui.json
- 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55262 and previous config saved to /var/cache/conftool/dbconfig/20240122-212144-marostegui.json
- 21:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
- 21:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
- 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55261 and previous config saved to /var/cache/conftool/dbconfig/20240122-212122-marostegui.json
- 21:17 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
- 21:07 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1003
- 21:07 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1003
- 21:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
- 21:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55260 and previous config saved to /var/cache/conftool/dbconfig/20240122-210615-marostegui.json
- 21:05 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
- 21:03 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 20:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55259 and previous config saved to /var/cache/conftool/dbconfig/20240122-205109-marostegui.json
- 20:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55258 and previous config saved to /var/cache/conftool/dbconfig/20240122-203602-marostegui.json
- 20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55257 and previous config saved to /var/cache/conftool/dbconfig/20240122-203354-marostegui.json
- 20:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 20:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55256 and previous config saved to /var/cache/conftool/dbconfig/20240122-203332-marostegui.json
- 20:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55255 and previous config saved to /var/cache/conftool/dbconfig/20240122-201826-marostegui.json
- 20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55254 and previous config saved to /var/cache/conftool/dbconfig/20240122-200319-marostegui.json
- 19:57 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 19:56 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 19:56 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 19:55 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 19:51 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 19:50 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 19:50 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 19:48 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 19:48 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 19:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55253 and previous config saved to /var/cache/conftool/dbconfig/20240122-194813-marostegui.json
- 19:47 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55252 and previous config saved to /var/cache/conftool/dbconfig/20240122-194704-marostegui.json
- 19:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
- 19:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
- 19:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55251 and previous config saved to /var/cache/conftool/dbconfig/20240122-194642-marostegui.json
- 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55250 and previous config saved to /var/cache/conftool/dbconfig/20240122-193136-marostegui.json
- 19:28 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 19:28 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55249 and previous config saved to /var/cache/conftool/dbconfig/20240122-191629-marostegui.json
- 19:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55248 and previous config saved to /var/cache/conftool/dbconfig/20240122-190123-marostegui.json
- 19:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55247 and previous config saved to /var/cache/conftool/dbconfig/20240122-190014-marostegui.json
- 19:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
- 19:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
- 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55246 and previous config saved to /var/cache/conftool/dbconfig/20240122-185952-marostegui.json
- 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55245 and previous config saved to /var/cache/conftool/dbconfig/20240122-184446-marostegui.json
- 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55244 and previous config saved to /var/cache/conftool/dbconfig/20240122-182939-marostegui.json
- 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55243 and previous config saved to /var/cache/conftool/dbconfig/20240122-182432-ladsgroup.json
- 18:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 18:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55242 and previous config saved to /var/cache/conftool/dbconfig/20240122-182359-ladsgroup.json
- 18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55241 and previous config saved to /var/cache/conftool/dbconfig/20240122-181433-marostegui.json
- 18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55240 and previous config saved to /var/cache/conftool/dbconfig/20240122-181324-marostegui.json
- 18:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
- 18:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
- 18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55239 and previous config saved to /var/cache/conftool/dbconfig/20240122-181302-marostegui.json
- 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55238 and previous config saved to /var/cache/conftool/dbconfig/20240122-180853-ladsgroup.json
- 17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55237 and previous config saved to /var/cache/conftool/dbconfig/20240122-175755-marostegui.json
- 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55236 and previous config saved to /var/cache/conftool/dbconfig/20240122-175346-ladsgroup.json
- 17:46 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 17:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55235 and previous config saved to /var/cache/conftool/dbconfig/20240122-174249-marostegui.json
- 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55234 and previous config saved to /var/cache/conftool/dbconfig/20240122-173840-ladsgroup.json
- 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55233 and previous config saved to /var/cache/conftool/dbconfig/20240122-172743-marostegui.json
- 17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55232 and previous config saved to /var/cache/conftool/dbconfig/20240122-172635-marostegui.json
- 17:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 17:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55231 and previous config saved to /var/cache/conftool/dbconfig/20240122-172612-marostegui.json
- 17:17 akosiaris: draining kubestage2001, uncordoning kubestage2002 to allow it to receive the pods. T355437
- 17:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55230 and previous config saved to /var/cache/conftool/dbconfig/20240122-171106-marostegui.json
- 17:05 vgutierrez: restore HAProxy tune.bufsize = 16684 in cp3066 - T354424
- 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55229 and previous config saved to /var/cache/conftool/dbconfig/20240122-165559-marostegui.json
- 16:53 vgutierrez: testing HAProxy tune.bufsize = 32768 in cp3066 - T354424
- 16:46 dcausse@deploy2002: Finished deploy [airflow-dags/search@dcf08b2]: (no justification provided) (duration: 00m 31s)
- 16:46 dcausse@deploy2002: Started deploy [airflow-dags/search@dcf08b2]: (no justification provided)
- 16:42 Daimona: T353459 Running mwscript /home/daimona/GenerateInvitationList.php to test the script before it reaches production
- 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55228 and previous config saved to /var/cache/conftool/dbconfig/20240122-164053-marostegui.json
- 16:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1495.eqiad.wmnet with OS bullseye
- 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55227 and previous config saved to /var/cache/conftool/dbconfig/20240122-163844-marostegui.json
- 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55226 and previous config saved to /var/cache/conftool/dbconfig/20240122-163822-marostegui.json
- 16:38 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 16:38 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55225 and previous config saved to /var/cache/conftool/dbconfig/20240122-163808-ladsgroup.json
- 16:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 16:38 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 16:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 16:37 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 16:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55224 and previous config saved to /var/cache/conftool/dbconfig/20240122-163729-ladsgroup.json
- 16:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1486.eqiad.wmnet with OS bullseye
- 16:29 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 16:29 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 16:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55222 and previous config saved to /var/cache/conftool/dbconfig/20240122-162315-marostegui.json
- 16:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55221 and previous config saved to /var/cache/conftool/dbconfig/20240122-162223-ladsgroup.json
- 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
- 16:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
- 16:09 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
- 16:08 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
- 16:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55220 and previous config saved to /var/cache/conftool/dbconfig/20240122-160809-marostegui.json
- 16:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55219 and previous config saved to /var/cache/conftool/dbconfig/20240122-160716-ladsgroup.json
- 15:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55218 and previous config saved to /var/cache/conftool/dbconfig/20240122-155607-root.json
- 15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1495.eqiad.wmnet with OS bullseye
- 15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1486.eqiad.wmnet with OS bullseye
- 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55217 and previous config saved to /var/cache/conftool/dbconfig/20240122-155302-marostegui.json
- 15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55216 and previous config saved to /var/cache/conftool/dbconfig/20240122-155210-ladsgroup.json
- 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55215 and previous config saved to /var/cache/conftool/dbconfig/20240122-155154-marostegui.json
- 15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55214 and previous config saved to /var/cache/conftool/dbconfig/20240122-155115-marostegui.json
- 15:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55213 and previous config saved to /var/cache/conftool/dbconfig/20240122-154102-root.json
- 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55212 and previous config saved to /var/cache/conftool/dbconfig/20240122-153608-marostegui.json
- 15:26 sukhe: sudo cumin -b1 -s120 "A:dns-rec and not P{dns6001*}" "enable-puppet 'do not enable' && run-puppet-agent"
- 15:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55211 and previous config saved to /var/cache/conftool/dbconfig/20240122-152557-root.json
- 15:24 sukhe: re-enable puppet on A:dns-rec and run agent to finish merging CR 979159
- 15:21 sukhe: enable puppet on dns6001 and run agent to test CR 979159
- 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55210 and previous config saved to /var/cache/conftool/dbconfig/20240122-152102-marostegui.json
- 15:13 sukhe: disable Puppet on A:dns-rec to decouple anycast-hc and pdns-rec systemd binding: CR 979159
- 15:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55209 and previous config saved to /var/cache/conftool/dbconfig/20240122-151052-root.json
- 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55208 and previous config saved to /var/cache/conftool/dbconfig/20240122-150555-marostegui.json
- 15:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55207 and previous config saved to /var/cache/conftool/dbconfig/20240122-150046-marostegui.json
- 15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55206 and previous config saved to /var/cache/conftool/dbconfig/20240122-145548-root.json
- 14:55 hashar@deploy2002: Finished deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521 (duration: 00m 07s)
- 14:54 hashar@deploy2002: Started deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521
- 14:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 14:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 14:42 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 14:41 Lucas_WMDE: UTC afternoon backport+config window done
- 14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991358 Set ShowRollbackConfirmation in arwiki (T355213) (duration: 09m 07s)
- 14:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55205 and previous config saved to /var/cache/conftool/dbconfig/20240122-144043-root.json
- 14:40 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 14:40 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Continuing with sync
- 14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Backport for gerrit:991358 Set ShowRollbackConfirmation in arwiki (T355213) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991358 Set ShowRollbackConfirmation in arwiki (T355213)
- 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991379 Restrict pagequality-validate right to patroller in arwikisource (T354503) (duration: 09m 41s)
- 14:28 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
- 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
- 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55204 and previous config saved to /var/cache/conftool/dbconfig/20240122-142538-root.json
- 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1134', diff saved to https://phabricator.wikimedia.org/P55203 and previous config saved to /var/cache/conftool/dbconfig/20240122-142530-marostegui.json
- 14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Continuing with sync
- 14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Backport for gerrit:991379 Restrict pagequality-validate right to patroller in arwikisource (T354503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991379 Restrict pagequality-validate right to patroller in arwikisource (T354503)
- 13:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1165.eqiad.wmnet with OS bookworm
- 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
- 13:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
- 13:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1036.eqiad.wmnet
- 13:22 marostegui: Upgrade sanitarium master, there will be lag on s6 wiki replicas T354506
- 13:21 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1165.eqiad.wmnet with OS bookworm
- 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1165', diff saved to https://phabricator.wikimedia.org/P55201 and previous config saved to /var/cache/conftool/dbconfig/20240122-132023-marostegui.json
- 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2050.codfw.wmnet
- 13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
- 13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2049.codfw.wmnet
- 13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1049.eqiad.wmnet
- 13:01 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2050.codfw.wmnet
- 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1050.eqiad.wmnet
- 12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1049.eqiad.wmnet
- 12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
- 12:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1050.eqiad.wmnet
- 12:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 12:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55200 and previous config saved to /var/cache/conftool/dbconfig/20240122-123351-root.json
- 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55199 and previous config saved to /var/cache/conftool/dbconfig/20240122-122634-marostegui.json
- 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55198 and previous config saved to /var/cache/conftool/dbconfig/20240122-121846-root.json
- 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55197 and previous config saved to /var/cache/conftool/dbconfig/20240122-121128-marostegui.json
- 12:06 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
- 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55195 and previous config saved to /var/cache/conftool/dbconfig/20240122-120341-root.json
- 11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
- 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55193 and previous config saved to /var/cache/conftool/dbconfig/20240122-115621-marostegui.json
- 11:56 volans@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
- 11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
- 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55192 and previous config saved to /var/cache/conftool/dbconfig/20240122-114836-root.json
- 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55191 and previous config saved to /var/cache/conftool/dbconfig/20240122-114115-marostegui.json
- 11:41 vgutierrez: update to HAProxy 2.8.5 on cp3066 - T354424
- 11:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55190 and previous config saved to /var/cache/conftool/dbconfig/20240122-113331-root.json
- 11:26 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
- 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55189 and previous config saved to /var/cache/conftool/dbconfig/20240122-112401-marostegui.json
- 11:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 11:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55188 and previous config saved to /var/cache/conftool/dbconfig/20240122-112339-marostegui.json
- 11:21 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
- 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55187 and previous config saved to /var/cache/conftool/dbconfig/20240122-111826-root.json
- 11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2048.codfw.wmnet
- 11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1048.eqiad.wmnet
- 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55185 and previous config saved to /var/cache/conftool/dbconfig/20240122-110833-marostegui.json
- 11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2048.codfw.wmnet
- 11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1048.eqiad.wmnet
- 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55184 and previous config saved to /var/cache/conftool/dbconfig/20240122-110321-root.json
- 11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bookworm
- 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55183 and previous config saved to /var/cache/conftool/dbconfig/20240122-105326-marostegui.json
- 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55182 and previous config saved to /var/cache/conftool/dbconfig/20240122-105237-root.json
- 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55181 and previous config saved to /var/cache/conftool/dbconfig/20240122-105222-root.json
- 10:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
- 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55180 and previous config saved to /var/cache/conftool/dbconfig/20240122-103820-marostegui.json
- 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55179 and previous config saved to /var/cache/conftool/dbconfig/20240122-103732-root.json
- 10:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
- 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55178 and previous config saved to /var/cache/conftool/dbconfig/20240122-103717-root.json
- 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55177 and previous config saved to /var/cache/conftool/dbconfig/20240122-103520-ladsgroup.json
- 10:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 10:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55176 and previous config saved to /var/cache/conftool/dbconfig/20240122-102227-root.json
- 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55175 and previous config saved to /var/cache/conftool/dbconfig/20240122-102220-marostegui.json
- 10:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55174 and previous config saved to /var/cache/conftool/dbconfig/20240122-102212-root.json
- 10:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55173 and previous config saved to /var/cache/conftool/dbconfig/20240122-102158-marostegui.json
- 10:18 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS bookworm
- 10:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158', diff saved to https://phabricator.wikimedia.org/P55172 and previous config saved to /var/cache/conftool/dbconfig/20240122-101634-marostegui.json
- 10:13 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit[1003,2002].wikimedia.org
- 10:13 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit[1003,2002].wikimedia.org
- 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55171 and previous config saved to /var/cache/conftool/dbconfig/20240122-100722-root.json
- 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55170 and previous config saved to /var/cache/conftool/dbconfig/20240122-100707-root.json
- 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55169 and previous config saved to /var/cache/conftool/dbconfig/20240122-100651-marostegui.json
- 10:04 hashar: gerrit: running jgit gc on every repository to regenerate potentially faulty reachability bitmaps files preventing fetches on some repositories # T355173
- 10:00 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
- 09:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2049.codfw.wmnet
- 09:56 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
- 09:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2049.codfw.wmnet
- 09:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1049.eqiad.wmnet
- 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55167 and previous config saved to /var/cache/conftool/dbconfig/20240122-095217-root.json
- 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55166 and previous config saved to /var/cache/conftool/dbconfig/20240122-095202-root.json
- 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55165 and previous config saved to /var/cache/conftool/dbconfig/20240122-095145-marostegui.json
- 09:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
- 09:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
- 09:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1049.eqiad.wmnet
- 09:38 hashar: Restarted Gerrit with upgraded version 3.7.6 # T354885
- 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55164 and previous config saved to /var/cache/conftool/dbconfig/20240122-093712-root.json
- 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55163 and previous config saved to /var/cache/conftool/dbconfig/20240122-093657-root.json
- 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55162 and previous config saved to /var/cache/conftool/dbconfig/20240122-093638-marostegui.json
- 09:26 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
- 09:26 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: name=mw2444.codfw.wmnet
- 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55161 and previous config saved to /var/cache/conftool/dbconfig/20240122-092207-root.json
- 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55160 and previous config saved to /var/cache/conftool/dbconfig/20240122-092152-root.json
- 09:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55159 and previous config saved to /var/cache/conftool/dbconfig/20240122-091916-marostegui.json
- 09:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 09:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 09:18 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
- 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
- 09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 09:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55158 and previous config saved to /var/cache/conftool/dbconfig/20240122-091838-marostegui.json
- 09:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1213.eqiad.wmnet with OS bookworm
- 09:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
- 09:17 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
- 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
- 09:11 hashar: Gerrit: reindexing all changes for 3.6 > 3.7 migration # T354885
- 09:08 hashar@deploy2002: Finished deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6 (duration: 00m 10s)
- 09:08 hashar@deploy2002: Started deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6
- 09:06 hashar: Upgrading Gerrit # T354885
- 09:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
- 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55157 and previous config saved to /var/cache/conftool/dbconfig/20240122-090504-root.json
- 09:03 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2444.codfw.wmnet
- 09:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55156 and previous config saved to /var/cache/conftool/dbconfig/20240122-090332-marostegui.json
- 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55155 and previous config saved to /var/cache/conftool/dbconfig/20240122-090218-root.json
- 09:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2394.codfw.wmnet
- 09:01 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2394.codfw.wmnet
- 08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
- 08:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
- 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55154 and previous config saved to /var/cache/conftool/dbconfig/20240122-084959-root.json
- 08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55153 and previous config saved to /var/cache/conftool/dbconfig/20240122-084825-marostegui.json
- 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55152 and previous config saved to /var/cache/conftool/dbconfig/20240122-084713-root.json
- 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2048.codfw.wmnet
- 08:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1213.eqiad.wmnet with OS bookworm
- 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213:3316 db1213:3315', diff saved to https://phabricator.wikimedia.org/P55151 and previous config saved to /var/cache/conftool/dbconfig/20240122-083812-marostegui.json
- 08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2048.codfw.wmnet
- 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1048.eqiad.wmnet
- 08:35 xSavitar: UTC morning backport window done!
- 08:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55150 and previous config saved to /var/cache/conftool/dbconfig/20240122-083454-root.json
- 08:34 derick@deploy2002: Finished scap: Backport for gerrit:988403 wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) (duration: 18m 15s)
- 08:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55149 and previous config saved to /var/cache/conftool/dbconfig/20240122-083319-marostegui.json
- 08:32 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1048.eqiad.wmnet
- 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55148 and previous config saved to /var/cache/conftool/dbconfig/20240122-083208-root.json
- 08:27 derick@deploy2002: d3r1ck01 and derick: Continuing with sync
- 08:26 derick@deploy2002: d3r1ck01 and derick: Backport for gerrit:988403 wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55147 and previous config saved to /var/cache/conftool/dbconfig/20240122-081950-root.json
- 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55146 and previous config saved to /var/cache/conftool/dbconfig/20240122-081727-root.json
- 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55145 and previous config saved to /var/cache/conftool/dbconfig/20240122-081703-root.json
- 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55144 and previous config saved to /var/cache/conftool/dbconfig/20240122-081618-marostegui.json
- 08:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 08:15 derick@deploy2002: Started scap: Backport for gerrit:988403 wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004)
- 08:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55143 and previous config saved to /var/cache/conftool/dbconfig/20240122-081545-marostegui.json
- 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55142 and previous config saved to /var/cache/conftool/dbconfig/20240122-080445-root.json
- 08:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55141 and previous config saved to /var/cache/conftool/dbconfig/20240122-080222-root.json
- 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55140 and previous config saved to /var/cache/conftool/dbconfig/20240122-080158-root.json
- 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55139 and previous config saved to /var/cache/conftool/dbconfig/20240122-080038-marostegui.json
- 07:54 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Shubhankar Patankar out of all services on: 2208 hosts
- 07:53 root@cumin2002: START - Cookbook sre.idm.logout Logging Shubhankar Patankar out of all services on: 2208 hosts
- 07:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55138 and previous config saved to /var/cache/conftool/dbconfig/20240122-074940-root.json
- 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55137 and previous config saved to /var/cache/conftool/dbconfig/20240122-074717-root.json
- 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55136 and previous config saved to /var/cache/conftool/dbconfig/20240122-074653-root.json
- 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55135 and previous config saved to /var/cache/conftool/dbconfig/20240122-074532-marostegui.json
- 07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS bookworm
- 07:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55134 and previous config saved to /var/cache/conftool/dbconfig/20240122-073435-root.json
- 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55133 and previous config saved to /var/cache/conftool/dbconfig/20240122-073212-root.json
- 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55132 and previous config saved to /var/cache/conftool/dbconfig/20240122-073148-root.json
- 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55131 and previous config saved to /var/cache/conftool/dbconfig/20240122-073025-marostegui.json
- 07:28 kart_: Updated MinT to 2024-01-22-053144-production (T355303, T338608, T353510, T354666)
- 07:20 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55130 and previous config saved to /var/cache/conftool/dbconfig/20240122-071707-root.json
- 07:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
- 07:13 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55129 and previous config saved to /var/cache/conftool/dbconfig/20240122-071117-marostegui.json
- 07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
- 07:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
- 07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
- 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55128 and previous config saved to /var/cache/conftool/dbconfig/20240122-071054-marostegui.json
- 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55127 and previous config saved to /var/cache/conftool/dbconfig/20240122-070202-root.json
- 07:02 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55126 and previous config saved to /var/cache/conftool/dbconfig/20240122-065548-marostegui.json
- 06:55 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 06:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS bookworm
- 06:52 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 06:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2169:3316 db2169:3317', diff saved to https://phabricator.wikimedia.org/P55125 and previous config saved to /var/cache/conftool/dbconfig/20240122-064929-marostegui.json
- 06:47 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55124 and previous config saved to /var/cache/conftool/dbconfig/20240122-064657-root.json
- 06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS bookworm
- 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55123 and previous config saved to /var/cache/conftool/dbconfig/20240122-064041-marostegui.json
- 06:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
- 06:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55122 and previous config saved to /var/cache/conftool/dbconfig/20240122-062535-marostegui.json
- 06:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
- 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS bookworm
- 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1187 T354506', diff saved to https://phabricator.wikimedia.org/P55121 and previous config saved to /var/cache/conftool/dbconfig/20240122-060811-marostegui.json
- 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55120 and previous config saved to /var/cache/conftool/dbconfig/20240122-060529-marostegui.json
- 06:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
- 06:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
- 06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55119 and previous config saved to /var/cache/conftool/dbconfig/20240122-054005-ladsgroup.json
- 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55118 and previous config saved to /var/cache/conftool/dbconfig/20240122-052458-ladsgroup.json
- 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55117 and previous config saved to /var/cache/conftool/dbconfig/20240122-050952-ladsgroup.json
- 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55116 and previous config saved to /var/cache/conftool/dbconfig/20240122-045445-ladsgroup.json
2024-01-21
- 23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55115 and previous config saved to /var/cache/conftool/dbconfig/20240121-232323-ladsgroup.json
- 23:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 23:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55114 and previous config saved to /var/cache/conftool/dbconfig/20240121-232300-ladsgroup.json
- 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55113 and previous config saved to /var/cache/conftool/dbconfig/20240121-230754-ladsgroup.json
- 22:55 tgr: T355491 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dawiki --logwiki=metawiki 'Radiocolono' 'GuaritaRM'
- 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55112 and previous config saved to /var/cache/conftool/dbconfig/20240121-225247-ladsgroup.json
- 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55111 and previous config saved to /var/cache/conftool/dbconfig/20240121-223740-ladsgroup.json
- 17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55110 and previous config saved to /var/cache/conftool/dbconfig/20240121-171534-ladsgroup.json
- 17:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55109 and previous config saved to /var/cache/conftool/dbconfig/20240121-171512-ladsgroup.json
- 17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55108 and previous config saved to /var/cache/conftool/dbconfig/20240121-170005-ladsgroup.json
- 16:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55107 and previous config saved to /var/cache/conftool/dbconfig/20240121-164459-ladsgroup.json
- 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55106 and previous config saved to /var/cache/conftool/dbconfig/20240121-162952-ladsgroup.json
- 11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55105 and previous config saved to /var/cache/conftool/dbconfig/20240121-110344-ladsgroup.json
- 11:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 11:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55104 and previous config saved to /var/cache/conftool/dbconfig/20240121-110322-ladsgroup.json
- 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55103 and previous config saved to /var/cache/conftool/dbconfig/20240121-104815-ladsgroup.json
- 10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55102 and previous config saved to /var/cache/conftool/dbconfig/20240121-103309-ladsgroup.json
- 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55101 and previous config saved to /var/cache/conftool/dbconfig/20240121-101802-ladsgroup.json
- 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55100 and previous config saved to /var/cache/conftool/dbconfig/20240121-091731-ladsgroup.json
- 09:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 09:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55099 and previous config saved to /var/cache/conftool/dbconfig/20240121-091708-ladsgroup.json
- 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2175', diff saved to https://phabricator.wikimedia.org/P55098 and previous config saved to /var/cache/conftool/dbconfig/20240121-090831-marostegui.json
- 09:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55097 and previous config saved to /var/cache/conftool/dbconfig/20240121-090202-ladsgroup.json
- 08:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55096 and previous config saved to /var/cache/conftool/dbconfig/20240121-084655-ladsgroup.json
- 08:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55095 and previous config saved to /var/cache/conftool/dbconfig/20240121-083148-ladsgroup.json
- 02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55094 and previous config saved to /var/cache/conftool/dbconfig/20240121-024507-ladsgroup.json
- 02:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 02:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 02:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55093 and previous config saved to /var/cache/conftool/dbconfig/20240121-024445-ladsgroup.json
- 02:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55092 and previous config saved to /var/cache/conftool/dbconfig/20240121-022939-ladsgroup.json
- 02:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55091 and previous config saved to /var/cache/conftool/dbconfig/20240121-021432-ladsgroup.json
- 01:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55090 and previous config saved to /var/cache/conftool/dbconfig/20240121-015926-ladsgroup.json
- 00:29 mutante: phabricator is back and on bullseye
- 00:11 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 13s)
- 00:11 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
- 00:03 mutante: phab1004:/usr/bin# ln -s /var/lib/scap/scap/bin/scap .
- 00:00 brennen@deploy2002: Installation of scap version "latest" completed for 1 hosts
- 00:00 brennen@deploy2002: Installing scap version "latest" for 1 hosts
2024-01-20
- 23:58 mutante: phab1004 - chown -R scap:scap /var/lib/scap
- 23:10 brennen@deploy2002: Installing scap version "latest" for 1 hosts
- 22:45 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
- 22:44 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
- 22:39 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
- 22:39 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
- 22:34 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
- 22:34 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
- 22:28 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2) (duration: 00m 54s)
- 22:27 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2)
- 22:23 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (duration: 00m 55s)
- 22:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert
- 22:02 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
- 22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
- 22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
- 22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
- 22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1004.eqiad.wmnet with OS bullseye
- 22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
- 22:01 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
- 21:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
- 21:43 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
- 21:33 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
- 21:33 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
- 21:31 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
- 21:27 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host phab1004.eqiad.wmnet with OS bullseye
- 21:27 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
- 21:03 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux) (duration: 01m 35s)
- 21:02 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux)
- 20:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
- 20:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
- 20:37 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes (duration: 00m 53s)
- 20:36 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes
- 20:32 mutante: phabricator going down for maintenance
- 20:24 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab.wmfusercontent.org with reason: OS upgrade
- 20:23 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
- 20:23 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
- 20:22 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
- 20:22 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
- 20:04 brennen: start of phab/phorge bullseye update window - T334519
- 20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55089 and previous config saved to /var/cache/conftool/dbconfig/20240120-200154-ladsgroup.json
- 20:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
- 20:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
- 14:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
- 14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
- 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55087 and previous config saved to /var/cache/conftool/dbconfig/20240120-095311-ladsgroup.json
- 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55086 and previous config saved to /var/cache/conftool/dbconfig/20240120-093804-ladsgroup.json
- 09:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55085 and previous config saved to /var/cache/conftool/dbconfig/20240120-092257-ladsgroup.json
- 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55084 and previous config saved to /var/cache/conftool/dbconfig/20240120-090751-ladsgroup.json
- 04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55083 and previous config saved to /var/cache/conftool/dbconfig/20240120-041124-ladsgroup.json
- 04:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 04:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55082 and previous config saved to /var/cache/conftool/dbconfig/20240120-041102-ladsgroup.json
- 03:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55081 and previous config saved to /var/cache/conftool/dbconfig/20240120-035555-ladsgroup.json
- 03:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55080 and previous config saved to /var/cache/conftool/dbconfig/20240120-034049-ladsgroup.json
- 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55079 and previous config saved to /var/cache/conftool/dbconfig/20240120-032542-ladsgroup.json
2024-01-19
- 22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55078 and previous config saved to /var/cache/conftool/dbconfig/20240119-225906-ladsgroup.json
- 22:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 22:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 22:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55077 and previous config saved to /var/cache/conftool/dbconfig/20240119-225844-ladsgroup.json
- 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55076 and previous config saved to /var/cache/conftool/dbconfig/20240119-224337-ladsgroup.json
- 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55075 and previous config saved to /var/cache/conftool/dbconfig/20240119-222830-ladsgroup.json
- 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55074 and previous config saved to /var/cache/conftool/dbconfig/20240119-221324-ladsgroup.json
- 22:05 ryankemper: [WDQS] Repooled `wdqs10[19,20]` (caught up on lag)
- 20:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 20:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55073 and previous config saved to /var/cache/conftool/dbconfig/20240119-202129-marostegui.json
- 20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55072 and previous config saved to /var/cache/conftool/dbconfig/20240119-200622-marostegui.json
- 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55071 and previous config saved to /var/cache/conftool/dbconfig/20240119-195116-marostegui.json
- 19:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 19:43 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 19:38 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55070 and previous config saved to /var/cache/conftool/dbconfig/20240119-193610-marostegui.json
- 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55069 and previous config saved to /var/cache/conftool/dbconfig/20240119-193028-marostegui.json
- 19:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 19:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55068 and previous config saved to /var/cache/conftool/dbconfig/20240119-193006-marostegui.json
- 19:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55067 and previous config saved to /var/cache/conftool/dbconfig/20240119-191459-marostegui.json
- 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55066 and previous config saved to /var/cache/conftool/dbconfig/20240119-185953-marostegui.json
- 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55065 and previous config saved to /var/cache/conftool/dbconfig/20240119-184446-marostegui.json
- 18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55064 and previous config saved to /var/cache/conftool/dbconfig/20240119-183902-marostegui.json
- 18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55063 and previous config saved to /var/cache/conftool/dbconfig/20240119-183821-marostegui.json
- 18:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55062 and previous config saved to /var/cache/conftool/dbconfig/20240119-182314-marostegui.json
- 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55061 and previous config saved to /var/cache/conftool/dbconfig/20240119-180808-marostegui.json
- 18:02 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55060 and previous config saved to /var/cache/conftool/dbconfig/20240119-175301-marostegui.json
- 17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55059 and previous config saved to /var/cache/conftool/dbconfig/20240119-174735-marostegui.json
- 17:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 17:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55058 and previous config saved to /var/cache/conftool/dbconfig/20240119-174713-marostegui.json
- 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55057 and previous config saved to /var/cache/conftool/dbconfig/20240119-173207-marostegui.json
- 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55056 and previous config saved to /var/cache/conftool/dbconfig/20240119-172715-ladsgroup.json
- 17:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55055 and previous config saved to /var/cache/conftool/dbconfig/20240119-172652-ladsgroup.json
- 17:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
- 17:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
- 17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1010.wikimedia.org
- 17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1009.wikimedia.org
- 17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
- 17:22 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1007.wikimedia.org
- 17:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55054 and previous config saved to /var/cache/conftool/dbconfig/20240119-171700-marostegui.json
- 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55053 and previous config saved to /var/cache/conftool/dbconfig/20240119-171146-ladsgroup.json
- 17:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 17:04 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2088.codfw.wmnet with OS bullseye
- 17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55052 and previous config saved to /var/cache/conftool/dbconfig/20240119-170154-marostegui.json
- 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55051 and previous config saved to /var/cache/conftool/dbconfig/20240119-165639-ladsgroup.json
- 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55050 and previous config saved to /var/cache/conftool/dbconfig/20240119-165627-marostegui.json
- 16:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 16:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55049 and previous config saved to /var/cache/conftool/dbconfig/20240119-165605-marostegui.json
- 16:41 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55048 and previous config saved to /var/cache/conftool/dbconfig/20240119-164133-ladsgroup.json
- 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55047 and previous config saved to /var/cache/conftool/dbconfig/20240119-164058-marostegui.json
- 16:38 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 16:31 Emperor: mark new drive as non-RAID, mount, restore to service with puppet ms-be2072 T355330
- 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55046 and previous config saved to /var/cache/conftool/dbconfig/20240119-162552-marostegui.json
- 16:16 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55045 and previous config saved to /var/cache/conftool/dbconfig/20240119-161046-marostegui.json
- 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55044 and previous config saved to /var/cache/conftool/dbconfig/20240119-160521-marostegui.json
- 16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55043 and previous config saved to /var/cache/conftool/dbconfig/20240119-160459-marostegui.json
- 15:57 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55042 and previous config saved to /var/cache/conftool/dbconfig/20240119-154953-marostegui.json
- 15:46 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@f32c06e]: (no justification provided) (duration: 00m 30s)
- 15:46 gmodena@deploy2002: Started deploy [airflow-dags/analytics@f32c06e]: (no justification provided)
- 15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55041 and previous config saved to /var/cache/conftool/dbconfig/20240119-153446-marostegui.json
- 15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55040 and previous config saved to /var/cache/conftool/dbconfig/20240119-151940-marostegui.json
- 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55039 and previous config saved to /var/cache/conftool/dbconfig/20240119-151413-marostegui.json
- 15:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 15:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 15:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 15:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
- 15:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
- 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55038 and previous config saved to /var/cache/conftool/dbconfig/20240119-145930-marostegui.json
- 14:56 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 14:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1107.eqiad.wmnet with OS bullseye
- 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55036 and previous config saved to /var/cache/conftool/dbconfig/20240119-144423-marostegui.json
- 14:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1103.eqiad.wmnet with OS bullseye
- 14:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 14:34 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 14:34 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 14:34 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 14:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
- 14:31 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
- 14:29 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
- 14:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55034 and previous config saved to /var/cache/conftool/dbconfig/20240119-142917-marostegui.json
- 14:27 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:27 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:24 ejegg: payments-wiki upgraded from c37ddae5 to c2138768
- 14:21 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:21 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
- 14:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
- 14:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55033 and previous config saved to /var/cache/conftool/dbconfig/20240119-141411-marostegui.json
- 14:13 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1107.eqiad.wmnet with OS bullseye
- 14:12 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:12 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55032 and previous config saved to /var/cache/conftool/dbconfig/20240119-140746-marostegui.json
- 14:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 14:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55031 and previous config saved to /var/cache/conftool/dbconfig/20240119-140712-marostegui.json
- 14:07 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:06 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
- 13:58 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 13:57 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55030 and previous config saved to /var/cache/conftool/dbconfig/20240119-135206-marostegui.json
- 13:46 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 13:46 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 13:43 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2046.codfw.wmnet
- 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1046.eqiad.wmnet
- 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55029 and previous config saved to /var/cache/conftool/dbconfig/20240119-133659-marostegui.json
- 13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2046.codfw.wmnet
- 13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1046.eqiad.wmnet
- 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55028 and previous config saved to /var/cache/conftool/dbconfig/20240119-132153-marostegui.json
- 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55027 and previous config saved to /var/cache/conftool/dbconfig/20240119-131929-marostegui.json
- 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55026 and previous config saved to /var/cache/conftool/dbconfig/20240119-131906-marostegui.json
- 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55024 and previous config saved to /var/cache/conftool/dbconfig/20240119-130400-marostegui.json
- 12:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55023 and previous config saved to /var/cache/conftool/dbconfig/20240119-124853-marostegui.json
- 12:45 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 12:44 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 12:44 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 12:43 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 12:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 12:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55022 and previous config saved to /var/cache/conftool/dbconfig/20240119-123347-marostegui.json
- 12:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 12:32 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55021 and previous config saved to /var/cache/conftool/dbconfig/20240119-123023-marostegui.json
- 12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 12:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55020 and previous config saved to /var/cache/conftool/dbconfig/20240119-123001-marostegui.json
- 12:30 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 12:29 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55019 and previous config saved to /var/cache/conftool/dbconfig/20240119-121455-marostegui.json
- 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55018 and previous config saved to /var/cache/conftool/dbconfig/20240119-115948-marostegui.json
- 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55017 and previous config saved to /var/cache/conftool/dbconfig/20240119-114452-ladsgroup.json
- 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55016 and previous config saved to /var/cache/conftool/dbconfig/20240119-114442-marostegui.json
- 11:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 11:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55015 and previous config saved to /var/cache/conftool/dbconfig/20240119-114424-ladsgroup.json
- 11:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55014 and previous config saved to /var/cache/conftool/dbconfig/20240119-114219-marostegui.json
- 11:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55013 and previous config saved to /var/cache/conftool/dbconfig/20240119-114140-marostegui.json
- 11:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55012 and previous config saved to /var/cache/conftool/dbconfig/20240119-112917-ladsgroup.json
- 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55011 and previous config saved to /var/cache/conftool/dbconfig/20240119-112634-marostegui.json
- 11:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55010 and previous config saved to /var/cache/conftool/dbconfig/20240119-111411-ladsgroup.json
- 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55009 and previous config saved to /var/cache/conftool/dbconfig/20240119-111127-marostegui.json
- 10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55008 and previous config saved to /var/cache/conftool/dbconfig/20240119-105904-ladsgroup.json
- 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55007 and previous config saved to /var/cache/conftool/dbconfig/20240119-105621-marostegui.json
- 10:45 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
- 10:42 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
- 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55006 and previous config saved to /var/cache/conftool/dbconfig/20240119-101340-marostegui.json
- 10:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 10:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55005 and previous config saved to /var/cache/conftool/dbconfig/20240119-101318-marostegui.json
- 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55004 and previous config saved to /var/cache/conftool/dbconfig/20240119-095811-marostegui.json
- 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55003 and previous config saved to /var/cache/conftool/dbconfig/20240119-094305-marostegui.json
- 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55002 and previous config saved to /var/cache/conftool/dbconfig/20240119-092758-marostegui.json
- 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55001 and previous config saved to /var/cache/conftool/dbconfig/20240119-092535-marostegui.json
- 09:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
- 09:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
- 09:25 jnuche@deploy2002: Installation of scap version "4.65.2" completed for 531 hosts
- 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P55000 and previous config saved to /var/cache/conftool/dbconfig/20240119-092513-marostegui.json
- 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2006.codfw.wmnet
- 09:24 jnuche@deploy2002: Installing scap version "4.65.2" for 531 hosts
- 09:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2006.codfw.wmnet
- 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2005.codfw.wmnet
- 09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54999 and previous config saved to /var/cache/conftool/dbconfig/20240119-091007-marostegui.json
- 09:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2005.codfw.wmnet
- 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2004.codfw.wmnet
- 08:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54998 and previous config saved to /var/cache/conftool/dbconfig/20240119-085500-marostegui.json
- 08:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2004.codfw.wmnet
- 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1006.eqiad.wmnet
- 08:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54997 and previous config saved to /var/cache/conftool/dbconfig/20240119-083954-marostegui.json
- 08:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1006.eqiad.wmnet
- 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54996 and previous config saved to /var/cache/conftool/dbconfig/20240119-083730-marostegui.json
- 08:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 08:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54995 and previous config saved to /var/cache/conftool/dbconfig/20240119-083709-marostegui.json
- 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1005.eqiad.wmnet
- 08:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1005.eqiad.wmnet
- 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54994 and previous config saved to /var/cache/conftool/dbconfig/20240119-082202-marostegui.json
- 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1004.eqiad.wmnet
- 08:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1004.eqiad.wmnet
- 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54993 and previous config saved to /var/cache/conftool/dbconfig/20240119-080655-marostegui.json
- 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 100%: T354336', diff saved to https://phabricator.wikimedia.org/P54992 and previous config saved to /var/cache/conftool/dbconfig/20240119-075828-root.json
- 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54991 and previous config saved to /var/cache/conftool/dbconfig/20240119-075149-marostegui.json
- 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54990 and previous config saved to /var/cache/conftool/dbconfig/20240119-074825-marostegui.json
- 07:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
- 07:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
- 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54989 and previous config saved to /var/cache/conftool/dbconfig/20240119-074752-marostegui.json
- 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 75%: T354336', diff saved to https://phabricator.wikimedia.org/P54988 and previous config saved to /var/cache/conftool/dbconfig/20240119-074323-root.json
- 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54987 and previous config saved to /var/cache/conftool/dbconfig/20240119-073245-marostegui.json
- 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 50%: T354336', diff saved to https://phabricator.wikimedia.org/P54986 and previous config saved to /var/cache/conftool/dbconfig/20240119-072818-root.json
- 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54985 and previous config saved to /var/cache/conftool/dbconfig/20240119-071739-marostegui.json
- 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 25%: T354336', diff saved to https://phabricator.wikimedia.org/P54984 and previous config saved to /var/cache/conftool/dbconfig/20240119-071313-root.json
- 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54983 and previous config saved to /var/cache/conftool/dbconfig/20240119-070233-marostegui.json
- 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54982 and previous config saved to /var/cache/conftool/dbconfig/20240119-070009-marostegui.json
- 07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
- 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
- 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 06:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 10%: T354336', diff saved to https://phabricator.wikimedia.org/P54981 and previous config saved to /var/cache/conftool/dbconfig/20240119-065808-root.json
- 06:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 06:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 06:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54979 and previous config saved to /var/cache/conftool/dbconfig/20240119-063020-marostegui.json
- 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 06:28 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P54978 and previous config saved to /var/cache/conftool/dbconfig/20240119-061827-ladsgroup.json
- 06:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 06:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54977 and previous config saved to /var/cache/conftool/dbconfig/20240119-061805-ladsgroup.json
- 06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54976 and previous config saved to /var/cache/conftool/dbconfig/20240119-060258-ladsgroup.json
- 05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54975 and previous config saved to /var/cache/conftool/dbconfig/20240119-054751-ladsgroup.json
- 05:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54974 and previous config saved to /var/cache/conftool/dbconfig/20240119-053244-ladsgroup.json
- 03:38 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 02:49 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1103.eqiad.wmnet with OS bullseye
- 02:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1106.eqiad.wmnet with OS bullseye
- 02:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1105.eqiad.wmnet with OS bullseye
- 02:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1104.eqiad.wmnet with OS bullseye
- 02:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
- 02:28 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
- 02:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
- 02:24 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
- 02:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
- 02:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
- 02:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 02:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 02:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1106.eqiad.wmnet with OS bullseye
- 02:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1105.eqiad.wmnet with OS bullseye
- 02:09 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 02:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1104.eqiad.wmnet with OS bullseye
- 02:01 tzatziki: removing 4 files for legal compliance
- 01:42 tzatziki: removing 3 files for legal compliance
- 01:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
- 01:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2097.codfw.wmnet with OS bullseye
- 01:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2096.codfw.wmnet with OS bullseye
- 00:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 00:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
- 00:50 tzatziki: removing 1 file for legal compliance
- 00:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 00:47 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
- 00:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
- 00:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
- 00:42 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2101.codfw.wmnet with OS bullseye
- 00:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2100.codfw.wmnet with OS bullseye
- 00:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2099.codfw.wmnet with OS bullseye
- 00:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
- 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54973 and previous config saved to /var/cache/conftool/dbconfig/20240119-002755-ladsgroup.json
- 00:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 00:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54972 and previous config saved to /var/cache/conftool/dbconfig/20240119-002733-ladsgroup.json
- 00:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
- 00:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2098.codfw.wmnet with OS bullseye
- 00:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
- 00:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
- 00:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
- 00:18 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
- 00:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
- 00:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
- 00:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
- 00:13 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
- 00:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54971 and previous config saved to /var/cache/conftool/dbconfig/20240119-001226-ladsgroup.json
- 00:12 inflatador: bking@wdqs1020 depool host to catch up on lag
- 00:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
- 00:05 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
- 00:05 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
- 00:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye
2024-01-18
- 23:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
- 23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54970 and previous config saved to /var/cache/conftool/dbconfig/20240118-235720-ladsgroup.json
- 23:50 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 23:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
- 23:47 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54969 and previous config saved to /var/cache/conftool/dbconfig/20240118-234213-ladsgroup.json
- 23:13 tstarling@deploy2002: Synchronized php-1.42.0-wmf.14/extensions/CodeMirror/resources/mode/mediawiki/mediawiki.less: fix CodeMirror style bug T355290 (duration: 06m 33s)
- 22:59 bking@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host elastic2086.codfw.wmnet
- 22:55 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086.codfw.wmnet
- 22:55 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host elastic2086*
- 22:54 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086*
- 22:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 22:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 21:59 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 21:57 urbanecm@deploy2002: Finished scap: Backport for gerrit:991561 Use BetaFeatures::isFeatureEnabled instead of getOption (T354288) (duration: 06m 58s)
- 21:50 urbanecm@deploy2002: Started scap: Backport for gerrit:991561 Use BetaFeatures::isFeatureEnabled instead of getOption (T354288)
- 21:41 jforrester@deploy2002: Finished scap: Backport for gerrit:991547 Promote wikimaniawiki to Vector 2022 as default skin (T355297) (duration: 07m 33s)
- 21:35 jforrester@deploy2002: jforrester and msz2001: Continuing with sync
- 21:35 jforrester@deploy2002: jforrester and msz2001: Backport for gerrit:991547 Promote wikimaniawiki to Vector 2022 as default skin (T355297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:34 jforrester@deploy2002: Started scap: Backport for gerrit:991547 Promote wikimaniawiki to Vector 2022 as default skin (T355297)
- 21:15 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
- 21:14 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:991555 Log to statsd HTTP status codes and reduce logstash log levels (T355216) (duration: 09m 00s)
- 21:14 Dreamy_Jazz: Stopped MediaModeration scanning script (T351400)
- 21:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 21:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54968 and previous config saved to /var/cache/conftool/dbconfig/20240118-211337-marostegui.json
- 21:08 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 21:08 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:991555 Log to statsd HTTP status codes and reduce logstash log levels (T355216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:05 dreamyjazz@deploy2002: Started scap: Backport for gerrit:991555 Log to statsd HTTP status codes and reduce logstash log levels (T355216)
- 21:04 ejegg: payments-wiki upgraded from e38b24f0 to c37ddae5
- 20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54967 and previous config saved to /var/cache/conftool/dbconfig/20240118-205830-marostegui.json
- 20:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 20:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54966 and previous config saved to /var/cache/conftool/dbconfig/20240118-204324-marostegui.json
- 20:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54965 and previous config saved to /var/cache/conftool/dbconfig/20240118-202817-marostegui.json
- 20:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54964 and previous config saved to /var/cache/conftool/dbconfig/20240118-202606-marostegui.json
- 20:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 20:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54963 and previous config saved to /var/cache/conftool/dbconfig/20240118-202544-marostegui.json
- 20:24 mutante: rsyncing phab repo data, gitlab2003 pulls from phab2002 (inactive server) - test only to see how long it will take, can be stopped
- 20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54962 and previous config saved to /var/cache/conftool/dbconfig/20240118-201037-marostegui.json
- 20:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2095.codfw.wmnet with OS bullseye
- 19:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54961 and previous config saved to /var/cache/conftool/dbconfig/20240118-195531-marostegui.json
- 19:48 ryankemper: T354662 Running `sudo -i authdns-update` on `dns1004` following merge of https://gerrit.wikimedia.org/r/c/operations/dns/+/991429
- 19:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
- 19:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
- 19:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54960 and previous config saved to /var/cache/conftool/dbconfig/20240118-194024-marostegui.json
- 19:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
- 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2093.codfw.wmnet with OS bullseye
- 19:23 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 19:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2092.codfw.wmnet with OS bullseye
- 19:11 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2091.codfw.wmnet with OS bullseye
- 19:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
- 19:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2089.codfw.wmnet with OS bullseye
- 19:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
- 19:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
- 18:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
- 18:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
- 18:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
- 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54959 and previous config saved to /var/cache/conftool/dbconfig/20240118-185038-ladsgroup.json
- 18:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 18:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54958 and previous config saved to /var/cache/conftool/dbconfig/20240118-185016-ladsgroup.json
- 18:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
- 18:47 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
- 18:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
- 18:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
- 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54957 and previous config saved to /var/cache/conftool/dbconfig/20240118-184002-marostegui.json
- 18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54956 and previous config saved to /var/cache/conftool/dbconfig/20240118-183940-marostegui.json
- 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54955 and previous config saved to /var/cache/conftool/dbconfig/20240118-183510-ladsgroup.json
- 18:34 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
- 18:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
- 18:25 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54954 and previous config saved to /var/cache/conftool/dbconfig/20240118-182433-marostegui.json
- 18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54953 and previous config saved to /var/cache/conftool/dbconfig/20240118-182003-ladsgroup.json
- 18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54951 and previous config saved to /var/cache/conftool/dbconfig/20240118-180927-marostegui.json
- 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54950 and previous config saved to /var/cache/conftool/dbconfig/20240118-180456-ladsgroup.json
- 17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54949 and previous config saved to /var/cache/conftool/dbconfig/20240118-175420-marostegui.json
- 17:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54948 and previous config saved to /var/cache/conftool/dbconfig/20240118-175209-marostegui.json
- 17:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54947 and previous config saved to /var/cache/conftool/dbconfig/20240118-175147-marostegui.json
- 17:43 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2097.codfw.wmnet with OS bullseye
- 17:42 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2101.codfw.wmnet with OS bullseye
- 17:39 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2096.codfw.wmnet with OS bullseye
- 17:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54946 and previous config saved to /var/cache/conftool/dbconfig/20240118-173640-marostegui.json
- 17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2095.codfw.wmnet with OS bullseye
- 17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2100.codfw.wmnet with OS bullseye
- 17:33 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
- 17:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2102.codfw.wmnet with OS bullseye
- 17:30 topranks: Re-enabling PyBal on lvs2011 after network migration T352912
- 17:30 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2093.codfw.wmnet with OS bullseye
- 17:28 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2099.codfw.wmnet with OS bullseye
- 17:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2092.codfw.wmnet with OS bullseye
- 17:25 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2091.codfw.wmnet with OS bullseye
- 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54945 and previous config saved to /var/cache/conftool/dbconfig/20240118-172134-marostegui.json
- 17:20 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2098.codfw.wmnet with OS bullseye
- 17:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
- 17:11 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
- 17:11 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2089.codfw.wmnet with OS bullseye
- 17:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54944 and previous config saved to /var/cache/conftool/dbconfig/20240118-170627-marostegui.json
- 17:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
- 17:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54943 and previous config saved to /var/cache/conftool/dbconfig/20240118-170417-marostegui.json
- 17:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 17:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54942 and previous config saved to /var/cache/conftool/dbconfig/20240118-170355-marostegui.json
- 16:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2102.codfw.wmnet with OS bullseye
- 16:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
- 16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54941 and previous config saved to /var/cache/conftool/dbconfig/20240118-164848-marostegui.json
- 16:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye
- 16:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2090.codfw.wmnet with OS bullseye
- 16:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
- 16:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54940 and previous config saved to /var/cache/conftool/dbconfig/20240118-163342-marostegui.json
- 16:33 hashar@deploy2002: Finished deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895 (duration: 00m 07s)
- 16:33 hashar@deploy2002: Started deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895
- 16:27 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
- 16:25 sukhe: running authdns-update for T355308
- 16:22 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
- 16:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
- 16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54939 and previous config saved to /var/cache/conftool/dbconfig/20240118-161834-marostegui.json
- 16:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
- 16:18 claime: Running puppet on 'P{P:kubernetes::node} and not P{F:lldp.parent ~ lsw}' - T352883
- 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54938 and previous config saved to /var/cache/conftool/dbconfig/20240118-161624-marostegui.json
- 16:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 16:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54937 and previous config saved to /var/cache/conftool/dbconfig/20240118-161602-marostegui.json
- 16:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
- 16:15 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
- 16:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
- 16:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
- 16:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
- 16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
- 16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
- 16:06 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
- 16:05 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
- 16:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
- 16:04 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
- 16:04 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
- 16:03 claime: Running puppet on 'P{P:kubernetes::node} and P{F:lldp.parent ~ lsw}' - T352883
- 16:02 topranks: disabling PyBal and puppet on lvs2011, moving traffic to lvs2014 ahead of network change T352912
- 16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54936 and previous config saved to /var/cache/conftool/dbconfig/20240118-160055-marostegui.json
- 15:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1461.eqiad.wmnet with OS bullseye
- 15:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
- 15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1439.eqiad.wmnet with OS bullseye
- 15:54 claime: Running puppet on A:wikikube-staging-worker - T352883
- 15:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1469.eqiad.wmnet with OS bullseye
- 15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1045.eqiad.wmnet
- 15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2045.codfw.wmnet
- 15:52 claime: Running puppet on kubestage2002 - T352883
- 15:52 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352883
- 15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
- 15:49 claime: Running puppet on kubestage2002 - T352893
- 15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1045.eqiad.wmnet
- 15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2045.codfw.wmnet
- 15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54935 and previous config saved to /var/cache/conftool/dbconfig/20240118-154549-marostegui.json
- 15:45 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352893
- 15:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
- 15:40 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
- 15:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
- 15:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
- 15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
- 15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
- 15:31 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
- 15:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54933 and previous config saved to /var/cache/conftool/dbconfig/20240118-153042-marostegui.json
- 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54932 and previous config saved to /var/cache/conftool/dbconfig/20240118-152832-marostegui.json
- 15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 15:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54931 and previous config saved to /var/cache/conftool/dbconfig/20240118-152747-marostegui.json
- 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: T355313', diff saved to https://phabricator.wikimedia.org/P54930 and previous config saved to /var/cache/conftool/dbconfig/20240118-152006-root.json
- 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1439.eqiad.wmnet with OS bullseye
- 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1469.eqiad.wmnet with OS bullseye
- 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1461.eqiad.wmnet with OS bullseye
- 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54929 and previous config saved to /var/cache/conftool/dbconfig/20240118-151241-marostegui.json
- 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: T355313', diff saved to https://phabricator.wikimedia.org/P54928 and previous config saved to /var/cache/conftool/dbconfig/20240118-150501-root.json
- 14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54927 and previous config saved to /var/cache/conftool/dbconfig/20240118-145734-marostegui.json
- 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: T355313', diff saved to https://phabricator.wikimedia.org/P54926 and previous config saved to /var/cache/conftool/dbconfig/20240118-144956-root.json
- 14:43 Dreamy_Jazz: Afternoon UTC backport window done
- 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54925 and previous config saved to /var/cache/conftool/dbconfig/20240118-144228-marostegui.json
- 14:42 Emperor: disable puppet on ms-be2072 to try and deal with faulty drive
- 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54924 and previous config saved to /var/cache/conftool/dbconfig/20240118-144214-marostegui.json
- 14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 14:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54923 and previous config saved to /var/cache/conftool/dbconfig/20240118-144152-marostegui.json
- 14:41 Dreamy_Jazz: Ran `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-tagline-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-wordmark-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki.png' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-1.5x.png' | mwscript purgeList.php`, and `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-2x.png' | mwscript purgeList.php`
- 14:38 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:989750 thwiki: update tagline and optimise other logos (T341407) (duration: 08m 28s)
- 14:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
- 14:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
- 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: T355313', diff saved to https://phabricator.wikimedia.org/P54922 and previous config saved to /var/cache/conftool/dbconfig/20240118-143451-root.json
- 14:33 dreamyjazz@deploy2002: anzx and dreamyjazz: Continuing with sync
- 14:31 dreamyjazz@deploy2002: anzx and dreamyjazz: Backport for gerrit:989750 thwiki: update tagline and optimise other logos (T341407) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:30 dreamyjazz@deploy2002: Started scap: Backport for gerrit:989750 thwiki: update tagline and optimise other logos (T341407)
- 14:28 kartik@deploy2002: Finished scap: Backport for gerrit:991002 Set MT threshold for Punjabi Wikipedia to 97 (T347789) (duration: 10m 03s)
- 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54921 and previous config saved to /var/cache/conftool/dbconfig/20240118-142646-marostegui.json
- 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: aqs
- 14:22 kartik@deploy2002: kartik: Continuing with sync
- 14:19 kartik@deploy2002: kartik: Backport for gerrit:991002 Set MT threshold for Punjabi Wikipedia to 97 (T347789) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: T355313', diff saved to https://phabricator.wikimedia.org/P54920 and previous config saved to /var/cache/conftool/dbconfig/20240118-141946-root.json
- 14:18 kartik@deploy2002: Started scap: Backport for gerrit:991002 Set MT threshold for Punjabi Wikipedia to 97 (T347789)
- 14:12 Dreamy_Jazz: running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
- 14:11 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:991551 Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) (duration: 07m 50s)
- 14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54919 and previous config saved to /var/cache/conftool/dbconfig/20240118-141139-marostegui.json
- 14:07 Dreamy_Jazz: Stopped MediaModeration scan for commonswiki
- 14:07 Dreamy_Jazz: Stopped MediaModeration scan for group2
- 14:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: aqs
- 14:06 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:991551 Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 5%: T355313', diff saved to https://phabricator.wikimedia.org/P54918 and previous config saved to /var/cache/conftool/dbconfig/20240118-140441-root.json
- 14:03 dreamyjazz@deploy2002: Started scap: Backport for gerrit:991551 Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309)
- 13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54917 and previous config saved to /var/cache/conftool/dbconfig/20240118-135633-marostegui.json
- 13:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54916 and previous config saved to /var/cache/conftool/dbconfig/20240118-135422-marostegui.json
- 13:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 13:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 13:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 13:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 13:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 1%: T355313', diff saved to https://phabricator.wikimedia.org/P54915 and previous config saved to /var/cache/conftool/dbconfig/20240118-134936-root.json
- 13:28 moritzm: installing python-requests security updates
- 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54914 and previous config saved to /var/cache/conftool/dbconfig/20240118-130451-marostegui.json
- 12:54 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54913 and previous config saved to /var/cache/conftool/dbconfig/20240118-125130-ladsgroup.json
- 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 12:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54912 and previous config saved to /var/cache/conftool/dbconfig/20240118-125048-ladsgroup.json
- 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54911 and previous config saved to /var/cache/conftool/dbconfig/20240118-124945-marostegui.json
- 12:41 godog: grafana restarted on grafana1002 after https://gerrit.wikimedia.org/r/c/operations/puppet/+/991573
- 12:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54910 and previous config saved to /var/cache/conftool/dbconfig/20240118-123541-ladsgroup.json
- 12:35 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 12:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54909 and previous config saved to /var/cache/conftool/dbconfig/20240118-123439-marostegui.json
- 12:34 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 12:33 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 12:31 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 12:28 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 12:27 Dreamy_Jazz: Finished security deploy for T347742
- 12:27 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:991552 SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) (duration: 08m 28s)
- 12:27 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1047.eqiad.wmnet
- 12:26 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 12:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2047.codfw.wmnet
- 12:21 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 12:20 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:991552 SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54908 and previous config saved to /var/cache/conftool/dbconfig/20240118-122035-ladsgroup.json
- 12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2047.codfw.wmnet
- 12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1047.eqiad.wmnet
- 12:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54907 and previous config saved to /var/cache/conftool/dbconfig/20240118-121932-marostegui.json
- 12:18 dreamyjazz@deploy2002: Started scap: Backport for gerrit:991552 SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742)
- 12:17 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 12:17 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 12:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 12:16 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 12:16 jynus: depooled db2146, a lot of lag, should be investigated later
- 12:15 jynus@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P54906 and previous config saved to /var/cache/conftool/dbconfig/20240118-121541-jynus.json
- 12:07 Dreamy_Jazz: Doing security deploy for T347742
- 12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54905 and previous config saved to /var/cache/conftool/dbconfig/20240118-120528-ladsgroup.json
- 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54904 and previous config saved to /var/cache/conftool/dbconfig/20240118-114551-marostegui.json
- 11:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 11:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54903 and previous config saved to /var/cache/conftool/dbconfig/20240118-114528-marostegui.json
- 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54902 and previous config saved to /var/cache/conftool/dbconfig/20240118-113022-marostegui.json
- 11:21 godog: bounce apache2 on logstash1025 / logstash1031 - T337818
- 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54901 and previous config saved to /var/cache/conftool/dbconfig/20240118-111516-marostegui.json
- 11:04 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
- 11:01 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
- 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54900 and previous config saved to /var/cache/conftool/dbconfig/20240118-110009-marostegui.json
- 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54899 and previous config saved to /var/cache/conftool/dbconfig/20240118-104335-marostegui.json
- 10:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 10:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54898 and previous config saved to /var/cache/conftool/dbconfig/20240118-104313-marostegui.json
- 10:37 hashar@deploy2002: Finished deploy [integration/docroot@8f5aa9e]: Add Codex Icons package (duration: 00m 05s)
- 10:36 hashar@deploy2002: Started deploy [integration/docroot@8f5aa9e]: Add Codex Icons package
- 10:32 hashar@deploy2002: Finished deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310 (duration: 00m 07s)
- 10:32 hashar@deploy2002: Started deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310
- 10:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
- 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54896 and previous config saved to /var/cache/conftool/dbconfig/20240118-102806-marostegui.json
- 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2047.codfw.wmnet
- 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
- 10:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2047.codfw.wmnet
- 10:19 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1047.eqiad.wmnet
- 10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1047.eqiad.wmnet
- 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54894 and previous config saved to /var/cache/conftool/dbconfig/20240118-101300-marostegui.json
- 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2046.codfw.wmnet
- 10:09 Dreamy_Jazz: T351400 running on a tmux session `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-group2-sleep-0-non-jobqueue.txt`
- 10:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2046.codfw.wmnet
- 10:01 btullis: built and published updated openjdk-11 images based on: 11.0.21-s0-20240111
- 09:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54893 and previous config saved to /var/cache/conftool/dbconfig/20240118-095753-marostegui.json
- 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54892 and previous config saved to /var/cache/conftool/dbconfig/20240118-095522-marostegui.json
- 09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54891 and previous config saved to /var/cache/conftool/dbconfig/20240118-095500-marostegui.json
- 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1046.eqiad.wmnet
- 09:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54890 and previous config saved to /var/cache/conftool/dbconfig/20240118-093954-marostegui.json
- 09:30 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.14 refs T354432
- 09:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1046.eqiad.wmnet
- 09:25 godog: add 50G to prometheus@k8s-mlserve in codfw
- 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54889 and previous config saved to /var/cache/conftool/dbconfig/20240118-092447-marostegui.json
- 09:15 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-0-non-jobqueue.txt`
- 09:12 Dreamy_Jazz: stopped MediaModeration scanning script
- 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54888 and previous config saved to /var/cache/conftool/dbconfig/20240118-090941-marostegui.json
- 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54887 and previous config saved to /var/cache/conftool/dbconfig/20240118-090712-marostegui.json
- 09:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54886 and previous config saved to /var/cache/conftool/dbconfig/20240118-090649-marostegui.json
- 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54885 and previous config saved to /var/cache/conftool/dbconfig/20240118-085143-marostegui.json
- 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54884 and previous config saved to /var/cache/conftool/dbconfig/20240118-083636-marostegui.json
- 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54883 and previous config saved to /var/cache/conftool/dbconfig/20240118-082130-marostegui.json
- 08:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54882 and previous config saved to /var/cache/conftool/dbconfig/20240118-081900-marostegui.json
- 08:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 08:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 08:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54881 and previous config saved to /var/cache/conftool/dbconfig/20240118-081838-marostegui.json
- 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54880 and previous config saved to /var/cache/conftool/dbconfig/20240118-080332-marostegui.json
- 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54879 and previous config saved to /var/cache/conftool/dbconfig/20240118-074825-marostegui.json
- 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54878 and previous config saved to /var/cache/conftool/dbconfig/20240118-073319-marostegui.json
- 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54877 and previous config saved to /var/cache/conftool/dbconfig/20240118-073054-marostegui.json
- 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54876 and previous config saved to /var/cache/conftool/dbconfig/20240118-073016-marostegui.json
- 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54875 and previous config saved to /var/cache/conftool/dbconfig/20240118-071509-marostegui.json
- 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54874 and previous config saved to /var/cache/conftool/dbconfig/20240118-070003-marostegui.json
- 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54873 and previous config saved to /var/cache/conftool/dbconfig/20240118-064456-marostegui.json
- 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54872 and previous config saved to /var/cache/conftool/dbconfig/20240118-064225-marostegui.json
- 06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
- 06:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
- 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54871 and previous config saved to /var/cache/conftool/dbconfig/20240118-064203-marostegui.json
- 06:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54870 and previous config saved to /var/cache/conftool/dbconfig/20240118-062657-marostegui.json
- 06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54869 and previous config saved to /var/cache/conftool/dbconfig/20240118-061150-marostegui.json
- 06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54868 and previous config saved to /var/cache/conftool/dbconfig/20240118-061138-ladsgroup.json
- 06:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 06:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54867 and previous config saved to /var/cache/conftool/dbconfig/20240118-061116-ladsgroup.json
- 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54866 and previous config saved to /var/cache/conftool/dbconfig/20240118-055643-marostegui.json
- 05:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54865 and previous config saved to /var/cache/conftool/dbconfig/20240118-055609-ladsgroup.json
- 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54864 and previous config saved to /var/cache/conftool/dbconfig/20240118-055419-marostegui.json
- 05:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 05:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 05:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54863 and previous config saved to /var/cache/conftool/dbconfig/20240118-054103-ladsgroup.json
- 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54862 and previous config saved to /var/cache/conftool/dbconfig/20240118-052556-ladsgroup.json
2024-01-17
- 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54861 and previous config saved to /var/cache/conftool/dbconfig/20240117-233655-ladsgroup.json
- 23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 22:01 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
- 21:55 catrope@deploy2002: Finished scap: Backport for gerrit:991049 Fix text overflow in history page (T354218) (duration: 09m 39s)
- 21:50 inflatador: bking@kafka-main2001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
- 21:49 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
- 21:47 catrope@deploy2002: jdlrobson and catrope: Backport for gerrit:991049 Fix text overflow in history page (T354218) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:47 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
- 21:45 catrope@deploy2002: Started scap: Backport for gerrit:991049 Fix text overflow in history page (T354218)
- 21:43 catrope@deploy2002: Finished scap: Backport for gerrit:990152 Enable desktop history page for all mobile logged in users (T353388) (duration: 15m 15s)
- 21:37 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
- 21:30 catrope@deploy2002: jdlrobson and catrope: Backport for gerrit:990152 Enable desktop history page for all mobile logged in users (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:28 catrope@deploy2002: Started scap: Backport for gerrit:990152 Enable desktop history page for all mobile logged in users (T353388)
- 21:16 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5`
- 21:15 inflatador: bking@kafka-main1001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
- 21:13 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 21:13 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 21:13 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.update.rc0 --partitions 5`
- 21:07 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 21:07 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 21:06 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 21:06 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 21:05 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 21:04 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 20:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 20:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54860 and previous config saved to /var/cache/conftool/dbconfig/20240117-201513-marostegui.json
- 20:05 mutante: LDAP - added uid=dimakoushha to groups wmde and nda (T354276)
- 20:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54859 and previous config saved to /var/cache/conftool/dbconfig/20240117-200006-marostegui.json
- 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54858 and previous config saved to /var/cache/conftool/dbconfig/20240117-194500-marostegui.json
- 19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54857 and previous config saved to /var/cache/conftool/dbconfig/20240117-192953-marostegui.json
- 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54856 and previous config saved to /var/cache/conftool/dbconfig/20240117-192737-marostegui.json
- 19:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 19:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54855 and previous config saved to /var/cache/conftool/dbconfig/20240117-192715-marostegui.json
- 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54854 and previous config saved to /var/cache/conftool/dbconfig/20240117-191209-marostegui.json
- 19:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54853 and previous config saved to /var/cache/conftool/dbconfig/20240117-185703-marostegui.json
- 18:54 jnuche@deploy2002: Finished scap: deploying K8s config changes from T355243 (duration: 01m 42s)
- 18:52 jnuche@deploy2002: Started scap: deploying K8s config changes from T355243
- 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54852 and previous config saved to /var/cache/conftool/dbconfig/20240117-184156-marostegui.json
- 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54851 and previous config saved to /var/cache/conftool/dbconfig/20240117-183944-marostegui.json
- 18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54850 and previous config saved to /var/cache/conftool/dbconfig/20240117-183857-marostegui.json
- 18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54849 and previous config saved to /var/cache/conftool/dbconfig/20240117-182351-marostegui.json
- 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54848 and previous config saved to /var/cache/conftool/dbconfig/20240117-180844-marostegui.json
- 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54847 and previous config saved to /var/cache/conftool/dbconfig/20240117-175338-marostegui.json
- 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54846 and previous config saved to /var/cache/conftool/dbconfig/20240117-175120-marostegui.json
- 17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54845 and previous config saved to /var/cache/conftool/dbconfig/20240117-175059-marostegui.json
- 17:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
- 17:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54844 and previous config saved to /var/cache/conftool/dbconfig/20240117-173552-marostegui.json
- 17:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
- 17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54843 and previous config saved to /var/cache/conftool/dbconfig/20240117-172045-marostegui.json
- 17:19 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 17:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
- 17:19 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host grafana2001.codfw.wmnet with OS bookworm
- 17:18 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 17:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
- 17:13 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 17:11 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 17:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
- 17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54842 and previous config saved to /var/cache/conftool/dbconfig/20240117-170539-marostegui.json
- 17:05 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
- 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54841 and previous config saved to /var/cache/conftool/dbconfig/20240117-170327-marostegui.json
- 17:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 17:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54840 and previous config saved to /var/cache/conftool/dbconfig/20240117-170305-marostegui.json
- 17:02 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
- 17:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
- 16:57 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
- 16:48 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
- 16:48 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
- 16:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54839 and previous config saved to /var/cache/conftool/dbconfig/20240117-164759-marostegui.json
- 16:42 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host grafana2001.codfw.wmnet with OS bookworm
- 16:41 jforrester@deploy2002: Finished deploy [integration/docroot@f08a107]: I746134 for T354310 (duration: 00m 07s)
- 16:40 jforrester@deploy2002: Started deploy [integration/docroot@f08a107]: I746134 for T354310
- 16:39 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
- 16:39 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
- 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54838 and previous config saved to /var/cache/conftool/dbconfig/20240117-163252-marostegui.json
- 16:29 damilare: civicrm upgraded from 5ef5362f to d8b0c977
- 16:25 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 16:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
- 16:23 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
- 16:22 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 16:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54837 and previous config saved to /var/cache/conftool/dbconfig/20240117-161746-marostegui.json
- 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54836 and previous config saved to /var/cache/conftool/dbconfig/20240117-161534-marostegui.json
- 16:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 16:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54835 and previous config saved to /var/cache/conftool/dbconfig/20240117-161512-marostegui.json
- 16:14 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 16:13 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 16:13 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 16:13 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54834 and previous config saved to /var/cache/conftool/dbconfig/20240117-160005-marostegui.json
- 15:54 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
- 15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
- 15:54 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
- 15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
- 15:49 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
- 15:49 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
- 15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 15:45 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54833 and previous config saved to /var/cache/conftool/dbconfig/20240117-154459-marostegui.json
- 15:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2045.codfw.wmnet
- 15:38 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 15:38 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2045.codfw.wmnet
- 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54832 and previous config saved to /var/cache/conftool/dbconfig/20240117-152953-marostegui.json
- 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1045.eqiad.wmnet
- 15:27 taavi: restart etherpad-lite.service on etherpad1003
- 15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54831 and previous config saved to /var/cache/conftool/dbconfig/20240117-152737-marostegui.json
- 15:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54830 and previous config saved to /var/cache/conftool/dbconfig/20240117-152715-marostegui.json
- 15:23 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1045.eqiad.wmnet
- 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::text
- 15:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
- 15:13 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
- 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54827 and previous config saved to /var/cache/conftool/dbconfig/20240117-151208-marostegui.json
- 15:10 Lucas_WMDE: UTC afternoon backport+config window done
- 15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991061 Exclude qqq from monolingual text languages (T341409) (duration: 07m 59s)
- 15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
- 15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
- 15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 15:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
- 15:02 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:991061 Exclude qqq from monolingual text languages (T341409) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:01 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991061 Exclude qqq from monolingual text languages (T341409)
- 14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
- 14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
- 14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54826 and previous config saved to /var/cache/conftool/dbconfig/20240117-145702-marostegui.json
- 14:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::text
- 14:51 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991062 Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), gerrit:991060 Only build result entries for used wbsearchentities results (T355053) (duration: 08m 28s)
- 14:49 claime: restarted rsyslog on kubernetes2048
- 14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
- 14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:991062 Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), gerrit:991060 Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991062 Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), gerrit:991060 Only build result entries for used wbsearchentities results (T355053)
- 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54824 and previous config saved to /var/cache/conftool/dbconfig/20240117-144156-marostegui.json
- 14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54823 and previous config saved to /var/cache/conftool/dbconfig/20240117-144039-marostegui.json
- 14:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 14:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54822 and previous config saved to /var/cache/conftool/dbconfig/20240117-144018-marostegui.json
- 14:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
- 14:25 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:991059 Only build result entries for used wbsearchentities results (T355053) (duration: 09m 23s)
- 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54821 and previous config saved to /var/cache/conftool/dbconfig/20240117-142511-marostegui.json
- 14:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
- 14:22 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
- 14:22 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
- 14:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 14:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54820 and previous config saved to /var/cache/conftool/dbconfig/20240117-142015-ladsgroup.json
- 14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
- 14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:991059 Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:991059 Only build result entries for used wbsearchentities results (T355053)
- 14:16 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
- 14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:628773 Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) (duration: 11m 07s)
- 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54819 and previous config saved to /var/cache/conftool/dbconfig/20240117-141005-marostegui.json
- 14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
- 14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for gerrit:628773 Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54818 and previous config saved to /var/cache/conftool/dbconfig/20240117-140509-ladsgroup.json
- 14:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:628773 Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441)
- 13:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54817 and previous config saved to /var/cache/conftool/dbconfig/20240117-135459-marostegui.json
- 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54816 and previous config saved to /var/cache/conftool/dbconfig/20240117-135242-marostegui.json
- 13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54815 and previous config saved to /var/cache/conftool/dbconfig/20240117-135158-marostegui.json
- 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54814 and previous config saved to /var/cache/conftool/dbconfig/20240117-135002-ladsgroup.json
- 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54813 and previous config saved to /var/cache/conftool/dbconfig/20240117-133652-marostegui.json
- 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1014.eqiad.wmnet
- 13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54812 and previous config saved to /var/cache/conftool/dbconfig/20240117-133456-ladsgroup.json
- 13:34 damilare: payments-wiki upgraded from 12d8ad5b to e38b24f0
- 13:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 13:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 13:30 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 13:30 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
- 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54811 and previous config saved to /var/cache/conftool/dbconfig/20240117-132145-marostegui.json
- 13:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
- 13:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54810 and previous config saved to /var/cache/conftool/dbconfig/20240117-130639-marostegui.json
- 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54809 and previous config saved to /var/cache/conftool/dbconfig/20240117-130422-marostegui.json
- 13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 13:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 13:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 13:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
- 13:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
- 12:59 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
- 12:58 taavi: removing vlan1119 interface on lvs1018 T355115
- 12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
- 12:47 taavi: removing vlan1119 interface on lvs1020 T355115
- 12:38 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
- 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54806 and previous config saved to /var/cache/conftool/dbconfig/20240117-122305-marostegui.json
- 12:22 hnowlan: setting mw[2267,2282,2357,2395] inactive in advance of reimaging
- 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54805 and previous config saved to /var/cache/conftool/dbconfig/20240117-120758-marostegui.json
- 12:06 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
- 12:00 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
- 12:00 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
- 12:00 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
- 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2044.codfw.wmnet
- 12:00 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
- 11:59 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=mw2394.codfw.wmnet
- 11:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2044.codfw.wmnet
- 11:54 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
- 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54804 and previous config saved to /var/cache/conftool/dbconfig/20240117-115252-marostegui.json
- 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1044.eqiad.wmnet
- 11:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1044.eqiad.wmnet
- 11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
- 11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
- 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: memcached
- 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54803 and previous config saved to /var/cache/conftool/dbconfig/20240117-113745-marostegui.json
- 11:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: memcached
- 11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
- 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54802 and previous config saved to /var/cache/conftool/dbconfig/20240117-113432-marostegui.json
- 11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
- 11:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 11:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54801 and previous config saved to /var/cache/conftool/dbconfig/20240117-113410-marostegui.json
- 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54800 and previous config saved to /var/cache/conftool/dbconfig/20240117-111904-marostegui.json
- 11:09 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
- 11:09 Dreamy_Jazz: stopped scanning script
- 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54799 and previous config saved to /var/cache/conftool/dbconfig/20240117-110357-marostegui.json
- 10:49 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1043.eqiad.wmnet
- 10:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54798 and previous config saved to /var/cache/conftool/dbconfig/20240117-104851-marostegui.json
- 10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54797 and previous config saved to /var/cache/conftool/dbconfig/20240117-104438-marostegui.json
- 10:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 10:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54796 and previous config saved to /var/cache/conftool/dbconfig/20240117-104416-marostegui.json
- 10:43 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1043.eqiad.wmnet
- 10:33 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2043.codfw.wmnet
- 10:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54795 and previous config saved to /var/cache/conftool/dbconfig/20240117-102909-marostegui.json
- 10:26 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2043.codfw.wmnet
- 10:26 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:26 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2043.codfw.wmnet
- 10:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54793 and previous config saved to /var/cache/conftool/dbconfig/20240117-101403-marostegui.json
- 10:12 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2043.codfw.wmnet
- 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54792 and previous config saved to /var/cache/conftool/dbconfig/20240117-095856-marostegui.json
- 09:58 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54791 and previous config saved to /var/cache/conftool/dbconfig/20240117-095544-marostegui.json
- 09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54790 and previous config saved to /var/cache/conftool/dbconfig/20240117-095521-marostegui.json
- 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1043.eqiad.wmnet
- 09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
- 09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1042.eqiad.wmnet
- 09:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1043.eqiad.wmnet
- 09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1042.eqiad.wmnet
- 09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
- 09:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1042.eqiad.wmnet
- 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54789 and previous config saved to /var/cache/conftool/dbconfig/20240117-094015-marostegui.json
- 09:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1042.eqiad.wmnet
- 09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:30 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
- 09:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
- 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54788 and previous config saved to /var/cache/conftool/dbconfig/20240117-092507-marostegui.json
- 09:21 jnuche@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.14 refs T354432 (duration: 06m 15s)
- 09:15 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.14 refs T354432
- 09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54787 and previous config saved to /var/cache/conftool/dbconfig/20240117-091000-marostegui.json
- 09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
- 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54786 and previous config saved to /var/cache/conftool/dbconfig/20240117-090648-marostegui.json
- 09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54785 and previous config saved to /var/cache/conftool/dbconfig/20240117-090626-marostegui.json
- 09:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
- 08:56 dcausse@deploy2002: Finished scap: Backport for gerrit:990718 enable page_rerender for all wikis (T351503) (duration: 09m 15s)
- 08:55 moritzm: installing Python 2.7 security updates
- 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54784 and previous config saved to /var/cache/conftool/dbconfig/20240117-085119-marostegui.json
- 08:50 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
- 08:48 dcausse@deploy2002: pfischer and dcausse: Backport for gerrit:990718 enable page_rerender for all wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:46 dcausse@deploy2002: Started scap: Backport for gerrit:990718 enable page_rerender for all wikis (T351503)
- 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54783 and previous config saved to /var/cache/conftool/dbconfig/20240117-083613-marostegui.json
- 08:23 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
- 08:22 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
- 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54782 and previous config saved to /var/cache/conftool/dbconfig/20240117-082106-marostegui.json
- 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54781 and previous config saved to /var/cache/conftool/dbconfig/20240117-082001-ladsgroup.json
- 08:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 08:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54780 and previous config saved to /var/cache/conftool/dbconfig/20240117-081754-marostegui.json
- 08:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 08:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54779 and previous config saved to /var/cache/conftool/dbconfig/20240117-081731-marostegui.json
- 08:16 moritzm: installing python-git security updates
- 08:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54778 and previous config saved to /var/cache/conftool/dbconfig/20240117-080225-marostegui.json
- 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54777 and previous config saved to /var/cache/conftool/dbconfig/20240117-074719-marostegui.json
- 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54776 and previous config saved to /var/cache/conftool/dbconfig/20240117-073212-marostegui.json
- 07:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54775 and previous config saved to /var/cache/conftool/dbconfig/20240117-072902-marostegui.json
- 07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
- 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
- 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54774 and previous config saved to /var/cache/conftool/dbconfig/20240117-072824-marostegui.json
- 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54773 and previous config saved to /var/cache/conftool/dbconfig/20240117-071317-marostegui.json
- 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54772 and previous config saved to /var/cache/conftool/dbconfig/20240117-065811-marostegui.json
- 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54771 and previous config saved to /var/cache/conftool/dbconfig/20240117-064304-marostegui.json
- 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54770 and previous config saved to /var/cache/conftool/dbconfig/20240117-063951-marostegui.json
- 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54769 and previous config saved to /var/cache/conftool/dbconfig/20240117-063929-marostegui.json
- 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54768 and previous config saved to /var/cache/conftool/dbconfig/20240117-062422-marostegui.json
- 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54767 and previous config saved to /var/cache/conftool/dbconfig/20240117-060916-marostegui.json
- 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54766 and previous config saved to /var/cache/conftool/dbconfig/20240117-055409-marostegui.json
- 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54765 and previous config saved to /var/cache/conftool/dbconfig/20240117-055056-marostegui.json
- 05:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
- 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
- 05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
- 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
- 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
- 05:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
- 03:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54764 and previous config saved to /var/cache/conftool/dbconfig/20240117-033751-ladsgroup.json
- 03:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54763 and previous config saved to /var/cache/conftool/dbconfig/20240117-032245-ladsgroup.json
- 03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54762 and previous config saved to /var/cache/conftool/dbconfig/20240117-030738-ladsgroup.json
- 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54761 and previous config saved to /var/cache/conftool/dbconfig/20240117-025232-ladsgroup.json
- 00:03 tstarling@deploy2002: Synchronized wmf-config: T344791 related cleanup (duration: 06m 22s)
2024-01-16
- 23:55 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: Disable wgUseSameSiteLegacyCookies T344791 (duration: 09m 19s)
- 21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54760 and previous config saved to /var/cache/conftool/dbconfig/20240116-214016-ladsgroup.json
- 21:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 21:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 20:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2297.codfw.wmnet with OS bullseye
- 20:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2296.codfw.wmnet with OS bullseye
- 20:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2295.codfw.wmnet with OS bullseye
- 20:26 ryankemper: T351650 Running puppet on `P:trafficserver::backend` following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/991091
- 20:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2294.codfw.wmnet with OS bullseye
- 20:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
- 20:20 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
- 20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
- 20:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2292.codfw.wmnet with OS bullseye
- 20:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2293.codfw.wmnet with OS bullseye
- 20:13 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
- 20:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2291.codfw.wmnet with OS bullseye
- 20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
- 20:08 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
- 20:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
- 20:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2297.codfw.wmnet with OS bullseye
- 20:02 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
- 19:56 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2296.codfw.wmnet with OS bullseye
- 19:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
- 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
- 19:52 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2295.codfw.wmnet with OS bullseye
- 19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
- 19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1375.eqiad.wmnet with OS bullseye
- 19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
- 19:48 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
- 19:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
- 19:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1376.eqiad.wmnet with OS bullseye
- 19:46 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2294.codfw.wmnet with OS bullseye
- 19:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1374.eqiad.wmnet with OS bullseye
- 19:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 19:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54759 and previous config saved to /var/cache/conftool/dbconfig/20240116-194509-marostegui.json
- 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1360.eqiad.wmnet with OS bullseye
- 19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2293.codfw.wmnet with OS bullseye
- 19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2292.codfw.wmnet with OS bullseye
- 19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2291.codfw.wmnet with OS bullseye
- 19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1363.eqiad.wmnet with OS bullseye
- 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54758 and previous config saved to /var/cache/conftool/dbconfig/20240116-193002-marostegui.json
- 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
- 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1361.eqiad.wmnet with OS bullseye
- 19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1362.eqiad.wmnet with OS bullseye
- 19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
- 19:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
- 19:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
- 19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
- 19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
- 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54757 and previous config saved to /var/cache/conftool/dbconfig/20240116-191456-marostegui.json
- 19:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
- 19:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
- 19:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
- 19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1376.eqiad.wmnet with OS bullseye
- 19:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
- 19:07 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1375.eqiad.wmnet with OS bullseye
- 19:06 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1374.eqiad.wmnet with OS bullseye
- 19:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
- 19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
- 19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
- 19:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
- 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54756 and previous config saved to /var/cache/conftool/dbconfig/20240116-185949-marostegui.json
- 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54755 and previous config saved to /var/cache/conftool/dbconfig/20240116-185723-marostegui.json
- 18:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 18:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 18:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 18:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 18:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54754 and previous config saved to /var/cache/conftool/dbconfig/20240116-185626-marostegui.json
- 18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1363.eqiad.wmnet with OS bullseye
- 18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1362.eqiad.wmnet with OS bullseye
- 18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1361.eqiad.wmnet with OS bullseye
- 18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1360.eqiad.wmnet with OS bullseye
- 18:42 mutante: phab2002 - pulling repo data from phab1004 by running sync script created by rsync::quickdatacopy after gerrit:990247 T354221
- 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54753 and previous config saved to /var/cache/conftool/dbconfig/20240116-184120-marostegui.json
- 18:38 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 1 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-non-job-queue.txt`
- 18:36 Dreamy_Jazz: stopped tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
- 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54752 and previous config saved to /var/cache/conftool/dbconfig/20240116-182613-marostegui.json
- 18:20 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 18:19 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 18:19 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 18:19 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 18:18 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 18:18 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54751 and previous config saved to /var/cache/conftool/dbconfig/20240116-181107-marostegui.json
- 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54750 and previous config saved to /var/cache/conftool/dbconfig/20240116-180841-marostegui.json
- 18:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 18:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54749 and previous config saved to /var/cache/conftool/dbconfig/20240116-180819-marostegui.json
- 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54748 and previous config saved to /var/cache/conftool/dbconfig/20240116-175313-marostegui.json
- 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54747 and previous config saved to /var/cache/conftool/dbconfig/20240116-173806-marostegui.json
- 17:32 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1460.eqiad.wmnet with OS bullseye
- 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54746 and previous config saved to /var/cache/conftool/dbconfig/20240116-172300-marostegui.json
- 17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54745 and previous config saved to /var/cache/conftool/dbconfig/20240116-172032-marostegui.json
- 17:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 17:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54744 and previous config saved to /var/cache/conftool/dbconfig/20240116-172011-marostegui.json
- 17:14 topranks: Disabling puppet and PyBal on lvs2012 ahead of migration of network link to lsw1-b2-codfw T352909
- 17:12 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
- 17:11 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
- 17:11 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
- 17:10 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
- 17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54743 and previous config saved to /var/cache/conftool/dbconfig/20240116-170503-marostegui.json
- 16:56 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
- 16:56 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
- 16:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw1460.eqiad.wmnet with OS bullseye
- 16:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54742 and previous config saved to /var/cache/conftool/dbconfig/20240116-164957-marostegui.json
- 16:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54741 and previous config saved to /var/cache/conftool/dbconfig/20240116-163449-marostegui.json
- 16:33 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
- 16:33 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
- 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54740 and previous config saved to /var/cache/conftool/dbconfig/20240116-163224-marostegui.json
- 16:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 16:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54739 and previous config saved to /var/cache/conftool/dbconfig/20240116-163203-marostegui.json
- 16:22 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969 (duration: 00m 50s)
- 16:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969
- 16:21 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969 (duration: 00m 27s)
- 16:21 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969
- 16:20 mutante: phabricator deploy is imminent
- 16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
- 16:20 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
- 16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
- 16:19 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
- 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54738 and previous config saved to /var/cache/conftool/dbconfig/20240116-161656-marostegui.json
- 16:03 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
- 16:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54737 and previous config saved to /var/cache/conftool/dbconfig/20240116-160150-marostegui.json
- 16:00 Dreamy_Jazz: stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt
- 15:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
- 15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
- 15:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54736 and previous config saved to /var/cache/conftool/dbconfig/20240116-154643-marostegui.json
- 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54735 and previous config saved to /var/cache/conftool/dbconfig/20240116-154419-marostegui.json
- 15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54734 and previous config saved to /var/cache/conftool/dbconfig/20240116-154357-marostegui.json
- 15:29 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt
- 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54733 and previous config saved to /var/cache/conftool/dbconfig/20240116-152850-marostegui.json
- 15:28 Dreamy_Jazz: stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-25.txt
- 15:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
- 15:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
- 15:19 topranks: Disabling puppet and PyBal on lvs2013 ahead of migration of network link to ssw1-a1-codfw T352784
- 15:18 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
- 15:18 Dreamy_Jazz: Stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
- 15:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54732 and previous config saved to /var/cache/conftool/dbconfig/20240116-151344-marostegui.json
- 15:13 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
- 15:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:07 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
- 14:58 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
- 14:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54731 and previous config saved to /var/cache/conftool/dbconfig/20240116-145837-marostegui.json
- 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54730 and previous config saved to /var/cache/conftool/dbconfig/20240116-145613-marostegui.json
- 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 14:55 cmooney@cumin1002: START - Cookbook sre.dns.netbox
- 14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54729 and previous config saved to /var/cache/conftool/dbconfig/20240116-145458-marostegui.json
- 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54728 and previous config saved to /var/cache/conftool/dbconfig/20240116-143951-marostegui.json
- 14:33 moritzm: installing ca-certificates-java bugfix updates on bookworm
- 14:31 Dreamy_Jazz: UTC afternoon deploys done
- 14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54727 and previous config saved to /var/cache/conftool/dbconfig/20240116-142444-marostegui.json
- 14:24 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:990760 Add more statsd counters and add logstash logging (T351419) (duration: 07m 15s)
- 14:18 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 14:18 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:990760 Add more statsd counters and add logstash logging (T351419) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:17 moritzm: installing 5.10.205 kernels on buster hosts running the 5.10 backport
- 14:16 dreamyjazz@deploy2002: Started scap: Backport for gerrit:990760 Add more statsd counters and add logstash logging (T351419)
- 14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
- 14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1041.eqiad.wmnet
- 14:11 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:990754 Support parallel PhotoDNA requests (T354408) (duration: 07m 14s)
- 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54726 and previous config saved to /var/cache/conftool/dbconfig/20240116-140938-marostegui.json
- 14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
- 14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1041.eqiad.wmnet
- 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54725 and previous config saved to /var/cache/conftool/dbconfig/20240116-140713-marostegui.json
- 14:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 14:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 14:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for gerrit:990754 Support parallel PhotoDNA requests (T354408) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:04 dreamyjazz@deploy2002: Started scap: Backport for gerrit:990754 Support parallel PhotoDNA requests (T354408)
- 13:54 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 13:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS bullseye
- 13:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
- 13:15 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
- 13:09 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 13:09 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 13:08 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 13:08 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 13:06 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 13:05 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 13:02 effie: reimage mc-wf1001 (part of puppet7 migration)
- 13:01 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS bullseye
- 12:57 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1040.eqiad.wmnet
- 12:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
- 12:52 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1040.eqiad.wmnet
- 12:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
- 12:30 moritzm: installing systemd bugfix updates from Bullseye point release
- 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-wf1001.eqiad.wmnet
- 12:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
- 12:11 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
- 12:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-wf1001.eqiad.wmnet
- 11:56 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.14 refs T354432
- 11:45 jnuche@deploy2002: Finished scap: Backport for gerrit:990752 PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) (duration: 29m 32s)
- 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2041.codfw.wmnet
- 11:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2041.codfw.wmnet
- 11:36 jnuche@deploy2002: jnuche and kharlan: Continuing with sync
- 11:36 jnuche@deploy2002: jnuche and kharlan: Backport for gerrit:990752 PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2040.codfw.wmnet
- 11:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2040.codfw.wmnet
- 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1041.eqiad.wmnet
- 11:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
- 11:16 jnuche@deploy2002: Started scap: Backport for gerrit:990752 PreAuthenticationProvider: Deny account creation based on ipoid data (T354928)
- 11:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1041.eqiad.wmnet
- 11:13 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1040.eqiad.wmnet
- 11:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1040.eqiad.wmnet
- 10:59 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 29m 36s)
- 10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1039.eqiad.wmnet
- 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1039.eqiad.wmnet
- 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2039.codfw.wmnet
- 10:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
- 10:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2039.codfw.wmnet
- 10:30 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432
- 10:29 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
- 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2038.codfw.wmnet
- 10:21 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1038.eqiad.wmnet
- 10:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2038.codfw.wmnet
- 10:16 godog: clean up also 1.42.0-wmf.9 1.42.0-wmf.10 1.42.0-wmf.12 from mw22* - T355117
- 10:15 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1038.eqiad.wmnet
- 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1039.eqiad.wmnet
- 10:10 godog: manually pruning php-1.42.0-wmf.7 from mw22* - T355117
- 10:07 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1039.eqiad.wmnet
- 10:06 jnuche@deploy2002: Pruned MediaWiki: 1.42.0-wmf.7, 1.42.0-wmf.9, 1.42.0-wmf.10, 1.42.0-wmf.12 (duration: 07m 08s)
- 10:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1038.eqiad.wmnet
- 10:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1038.eqiad.wmnet
- 09:51 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 52m 52s)
- 09:28 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
- 09:26 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
- 09:25 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:23 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 09:05 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Daniram3 out of all services on: 2211 hosts
- 09:04 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm - T352665
- 09:03 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm
- 09:03 root@cumin2002: START - Cookbook sre.idm.logout Logging Daniram3 out of all services on: 2211 hosts
- 08:59 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432
2024-01-15
- 21:46 reedy@deploy2002: Synchronized wmf-config/: Fix more stringified class names (duration: 06m 29s)
- 21:37 fab@deploy2002: Finished deploy [airflow-dags/research@9b6a69a]: (no justification provided) (duration: 00m 27s)
- 21:37 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: Swap stringified class names in ConfirmEdit usages (duration: 06m 30s)
- 21:36 fab@deploy2002: Started deploy [airflow-dags/research@9b6a69a]: (no justification provided)
- 21:23 tgr: UTC late deploys done
- 21:22 tgr@deploy2002: Finished scap: Backport for gerrit:990164 Log emails in production (duration: 09m 11s)
- 21:15 tgr@deploy2002: tgr: Continuing with sync
- 21:14 tgr@deploy2002: tgr: Backport for gerrit:990164 Log emails in production synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:12 tgr@deploy2002: Started scap: Backport for gerrit:990164 Log emails in production
- 19:23 tzatziki: creating the u4c2024_edits table on all wikis
- 17:55 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
- 17:48 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
- 17:23 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
- 17:02 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons.
- 17:00 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
- 16:51 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
- 16:45 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1005.eqiad.wmnet
- 16:45 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:45 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 15:26 hnowlan: depooled jobrunner mw1460 to repurpose as k8s node
- 15:06 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 15:03 btullis@cumin1002: START - Cookbook sre.dns.netbox
- 14:59 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 14:47 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1005.eqiad.wmnet
- 14:38 Lucas_WMDE: UTC afternoon backport+config window done
- 14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for gerrit:989747 cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) (duration: 11m 36s)
- 14:28 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 14:28 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
- 14:26 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 14:25 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 14:24 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 14:24 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for gerrit:989747 cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 14:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for gerrit:989747 cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425)
- 13:49 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 13:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2003.codfw.wmnet
- 13:19 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2003.codfw.wmnet
- 13:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2002.codfw.wmnet
- 13:12 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2002.codfw.wmnet
- 13:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2001.codfw.wmnet
- 13:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1003.eqiad.wmnet
- 13:05 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2001.codfw.wmnet
- 13:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1003.eqiad.wmnet
- 13:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1003.eqiad.wmnet
- 13:00 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:00 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mediawiki::memcached::gutter
- 12:59 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
- 12:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mediawiki::memcached::gutter
- 12:42 btullis@cumin1002: START - Cookbook sre.dns.netbox
- 12:39 effie: enable puppet on mc* hosts - T349619
- 12:37 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1003.eqiad.wmnet
- 12:23 effie: stopping puppet on all mediawiki memcached hosts (mc*, mc-gp*), puppet 7 migration in progress - T349619
- 12:01 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 92 hosts
- 12:00 btullis@cumin1002: START - Cookbook sre.hosts.remove-downtime for 92 hosts
- 11:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 11:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
- 11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
- 11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
- 11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
- 11:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1037.eqiad.wmnet
- 11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
- 11:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
- 11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
- 11:07 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
- 11:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1037.eqiad.wmnet
- 10:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1002.eqiad.wmnet
- 10:51 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1002.eqiad.wmnet
- 10:48 moritzm: installing systemd bugfix updates from Bullseye point release
- 10:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1037.eqiad.wmnet
- 10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1037.eqiad.wmnet
- 10:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-gp1002.eqiad.wmnet
- 10:02 ladsgroup@deploy2002: Finished scap: Backport for gerrit:990424 SecurePoll: Adding updated voterlist files (T349263) (duration: 16m 04s)
- 09:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-gp1002.eqiad.wmnet
- 09:56 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 09:48 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:990424 SecurePoll: Adding updated voterlist files (T349263) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 09:46 ladsgroup@deploy2002: Started scap: Backport for gerrit:990424 SecurePoll: Adding updated voterlist files (T349263)
- 09:16 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:14 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 08:45 filippo@deploy2002: Finished deploy [performance/arc-lamp@67389a0]: (no justification provided) (duration: 00m 05s)
- 08:45 filippo@deploy2002: Started deploy [performance/arc-lamp@67389a0]: (no justification provided)
- 08:23 dcausse@deploy2002: Finished scap: Backport for gerrit:990029 enable page_rerender for 5th batch of wikis (T351503) (duration: 11m 40s)
- 08:17 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
- 08:13 dcausse@deploy2002: pfischer and dcausse: Backport for gerrit:990029 enable page_rerender for 5th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:12 dcausse@deploy2002: Started scap: Backport for gerrit:990029 enable page_rerender for 5th batch of wikis (T351503)
- 04:57 andrewbogott: restarting wikitech-static, oom
2024-01-14
- 15:47 taavi@deploy2002: Finished scap: Backport for gerrit:990396 Log IpReputation channel as debug (T354928) (duration: 26m 49s)
- 15:36 taavi@deploy2002: taavi: Continuing with sync
- 15:35 taavi@deploy2002: taavi: Backport for gerrit:990396 Log IpReputation channel as debug (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:20 taavi@deploy2002: Started scap: Backport for gerrit:990396 Log IpReputation channel as debug (T354928)
- 15:01 andrewbogott: manually emptying /srv/mediawiki/images/wikitech/archive on wikitech-static; the maintenance script didn't do it and the host is failing due to a full disk
- 15:01 andrewbogott: running deleteArchivedFiles.php on wikitech-static
2024-01-12
- 23:49 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
- 23:47 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
- 22:52 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
- 22:51 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
- 22:29 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
- 22:28 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
- 18:07 mutante: aphlict1002 - systemctl start logrotate
- 17:18 tchanders@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 17:18 tchanders@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 17:17 tchanders@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 17:16 tchanders@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 17:10 tchanders@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 17:09 tchanders@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 16:52 cgoubert@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
- 16:52 cgoubert@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
- 16:51 cgoubert@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
- 16:51 cgoubert@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
- 16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
- 16:20 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
- 16:20 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 16:19 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 15:46 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:37 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 15:14 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 15:14 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 14:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 14:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54714 and previous config saved to /var/cache/conftool/dbconfig/20240112-140423-marostegui.json
- 13:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54713 and previous config saved to /var/cache/conftool/dbconfig/20240112-134916-marostegui.json
- 13:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54712 and previous config saved to /var/cache/conftool/dbconfig/20240112-133410-marostegui.json
- 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54711 and previous config saved to /var/cache/conftool/dbconfig/20240112-131904-marostegui.json
- 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54710 and previous config saved to /var/cache/conftool/dbconfig/20240112-125944-marostegui.json
- 12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 12:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54709 and previous config saved to /var/cache/conftool/dbconfig/20240112-125921-marostegui.json
- 12:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54708 and previous config saved to /var/cache/conftool/dbconfig/20240112-124416-marostegui.json
- 12:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dewiki --logwiki=metawiki 'Osip Knecht' 'Artquichotte39'
- 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54707 and previous config saved to /var/cache/conftool/dbconfig/20240112-122909-marostegui.json
- 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54706 and previous config saved to /var/cache/conftool/dbconfig/20240112-121402-marostegui.json
- 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54704 and previous config saved to /var/cache/conftool/dbconfig/20240112-121150-marostegui.json
- 12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 12:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54703 and previous config saved to /var/cache/conftool/dbconfig/20240112-121127-marostegui.json
- 12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 12:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54701 and previous config saved to /var/cache/conftool/dbconfig/20240112-115621-marostegui.json
- 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54700 and previous config saved to /var/cache/conftool/dbconfig/20240112-114114-marostegui.json
- 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54699 and previous config saved to /var/cache/conftool/dbconfig/20240112-112608-marostegui.json
- 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54698 and previous config saved to /var/cache/conftool/dbconfig/20240112-112049-marostegui.json
- 11:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54697 and previous config saved to /var/cache/conftool/dbconfig/20240112-112027-marostegui.json
- 11:10 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
- 11:08 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54696 and previous config saved to /var/cache/conftool/dbconfig/20240112-110521-marostegui.json
- 11:04 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54695 and previous config saved to /var/cache/conftool/dbconfig/20240112-105014-marostegui.json
- 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54694 and previous config saved to /var/cache/conftool/dbconfig/20240112-103508-marostegui.json
- 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54693 and previous config saved to /var/cache/conftool/dbconfig/20240112-103250-marostegui.json
- 10:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 10:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54692 and previous config saved to /var/cache/conftool/dbconfig/20240112-103227-marostegui.json
- 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54691 and previous config saved to /var/cache/conftool/dbconfig/20240112-101721-marostegui.json
- 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54690 and previous config saved to /var/cache/conftool/dbconfig/20240112-100214-marostegui.json
- 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54689 and previous config saved to /var/cache/conftool/dbconfig/20240112-094708-marostegui.json
- 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54688 and previous config saved to /var/cache/conftool/dbconfig/20240112-094451-marostegui.json
- 09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54687 and previous config saved to /var/cache/conftool/dbconfig/20240112-094413-marostegui.json
- 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54686 and previous config saved to /var/cache/conftool/dbconfig/20240112-092907-marostegui.json
- 09:25 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
- 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
- 09:17 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
- 09:16 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
- 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54685 and previous config saved to /var/cache/conftool/dbconfig/20240112-091400-marostegui.json
- 09:09 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
- 08:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54684 and previous config saved to /var/cache/conftool/dbconfig/20240112-085854-marostegui.json
- 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54683 and previous config saved to /var/cache/conftool/dbconfig/20240112-085637-marostegui.json
- 08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 08:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54682 and previous config saved to /var/cache/conftool/dbconfig/20240112-085614-marostegui.json
- 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54681 and previous config saved to /var/cache/conftool/dbconfig/20240112-084108-marostegui.json
- 08:40 godog: upload and finish upgrade of prometheus 2.48 on all sites - T354399
- 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54680 and previous config saved to /var/cache/conftool/dbconfig/20240112-083837-root.json
- 08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54679 and previous config saved to /var/cache/conftool/dbconfig/20240112-082601-marostegui.json
- 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54678 and previous config saved to /var/cache/conftool/dbconfig/20240112-082332-root.json
- 08:20 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 3605
- 08:19 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 3605
- 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54677 and previous config saved to /var/cache/conftool/dbconfig/20240112-081055-marostegui.json
- 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54676 and previous config saved to /var/cache/conftool/dbconfig/20240112-080837-marostegui.json
- 08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54675 and previous config saved to /var/cache/conftool/dbconfig/20240112-080827-root.json
- 08:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54674 and previous config saved to /var/cache/conftool/dbconfig/20240112-080815-marostegui.json
- 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54673 and previous config saved to /var/cache/conftool/dbconfig/20240112-075322-root.json
- 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54672 and previous config saved to /var/cache/conftool/dbconfig/20240112-075309-marostegui.json
- 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54671 and previous config saved to /var/cache/conftool/dbconfig/20240112-073817-root.json
- 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54670 and previous config saved to /var/cache/conftool/dbconfig/20240112-073802-marostegui.json
- 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54669 and previous config saved to /var/cache/conftool/dbconfig/20240112-072312-root.json
- 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54668 and previous config saved to /var/cache/conftool/dbconfig/20240112-072255-marostegui.json
- 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54667 and previous config saved to /var/cache/conftool/dbconfig/20240112-072038-marostegui.json
- 07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 07:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54666 and previous config saved to /var/cache/conftool/dbconfig/20240112-072015-marostegui.json
- 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54665 and previous config saved to /var/cache/conftool/dbconfig/20240112-070807-root.json
- 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54664 and previous config saved to /var/cache/conftool/dbconfig/20240112-070508-marostegui.json
- 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1168.eqiad.wmnet with OS bookworm
- 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54663 and previous config saved to /var/cache/conftool/dbconfig/20240112-065002-marostegui.json
- 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
- 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
- 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54662 and previous config saved to /var/cache/conftool/dbconfig/20240112-063456-marostegui.json
- 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54661 and previous config saved to /var/cache/conftool/dbconfig/20240112-063239-marostegui.json
- 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 06:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 06:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 06:23 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1168.eqiad.wmnet with OS bookworm
- 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1168 T354506', diff saved to https://phabricator.wikimedia.org/P54660 and previous config saved to /var/cache/conftool/dbconfig/20240112-062137-marostegui.json
- 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 06:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 04:12 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 04:12 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 04:12 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 04:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 04:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 04:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 00:59 mutante: LDAP - added myself to gerritadmin group
2024-01-11
- 21:36 jan_drewniak: https://phabricator.wikimedia.org/T349337#9454773 running maintenance script to delete unnecessary user preferences.
- 21:26 jdrewniak@deploy2002: Finished scap: Backport for gerrit:985647 InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), gerrit:984288 InitialiseSettings.php: Allow thanking bots (T341388) (duration: 13m 43s)
- 21:20 jdrewniak@deploy2002: jdrewniak and houseblaster: Continuing with sync
- 21:14 jdrewniak@deploy2002: jdrewniak and houseblaster: Backport for gerrit:985647 InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), gerrit:984288 InitialiseSettings.php: Allow thanking bots (T341388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:12 jdrewniak@deploy2002: Started scap: Backport for gerrit:985647 InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), gerrit:984288 InitialiseSettings.php: Allow thanking bots (T341388)
- 20:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 20:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54657 and previous config saved to /var/cache/conftool/dbconfig/20240111-205021-marostegui.json
- 20:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54656 and previous config saved to /var/cache/conftool/dbconfig/20240111-203514-marostegui.json
- 20:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54655 and previous config saved to /var/cache/conftool/dbconfig/20240111-202008-marostegui.json
- 20:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54654 and previous config saved to /var/cache/conftool/dbconfig/20240111-200502-marostegui.json
- 20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54653 and previous config saved to /var/cache/conftool/dbconfig/20240111-200253-marostegui.json
- 20:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 20:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 20:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54652 and previous config saved to /var/cache/conftool/dbconfig/20240111-200209-marostegui.json
- 20:00 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@07f5320]: (no justification provided) (duration: 00m 27s)
- 20:00 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@07f5320]: (no justification provided)
- 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54651 and previous config saved to /var/cache/conftool/dbconfig/20240111-194703-marostegui.json
- 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54649 and previous config saved to /var/cache/conftool/dbconfig/20240111-193156-marostegui.json
- 19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54647 and previous config saved to /var/cache/conftool/dbconfig/20240111-191650-marostegui.json
- 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54646 and previous config saved to /var/cache/conftool/dbconfig/20240111-191440-marostegui.json
- 19:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 19:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54645 and previous config saved to /var/cache/conftool/dbconfig/20240111-191418-marostegui.json
- 19:11 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.13 refs T350089
- 19:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54644 and previous config saved to /var/cache/conftool/dbconfig/20240111-185912-marostegui.json
- 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54643 and previous config saved to /var/cache/conftool/dbconfig/20240111-184405-marostegui.json
- 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54641 and previous config saved to /var/cache/conftool/dbconfig/20240111-182859-marostegui.json
- 18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54640 and previous config saved to /var/cache/conftool/dbconfig/20240111-182745-marostegui.json
- 18:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 18:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54639 and previous config saved to /var/cache/conftool/dbconfig/20240111-182723-marostegui.json
- 18:27 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org) (duration: 00m 07s)
- 18:27 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org)
- 18:25 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only) (duration: 00m 05s)
- 18:25 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only)
- 18:23 thcipriani: deploying gerrit to remove devsat survey (no restart needed)
- 18:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54638 and previous config saved to /var/cache/conftool/dbconfig/20240111-181217-marostegui.json
- 17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54637 and previous config saved to /var/cache/conftool/dbconfig/20240111-175710-marostegui.json
- 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54636 and previous config saved to /var/cache/conftool/dbconfig/20240111-174204-marostegui.json
- 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54635 and previous config saved to /var/cache/conftool/dbconfig/20240111-173955-marostegui.json
- 17:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
- 17:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
- 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54634 and previous config saved to /var/cache/conftool/dbconfig/20240111-173933-marostegui.json
- 17:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54633 and previous config saved to /var/cache/conftool/dbconfig/20240111-172427-marostegui.json
- 17:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54632 and previous config saved to /var/cache/conftool/dbconfig/20240111-170920-marostegui.json
- 16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54631 and previous config saved to /var/cache/conftool/dbconfig/20240111-165414-marostegui.json
- 16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54630 and previous config saved to /var/cache/conftool/dbconfig/20240111-165305-marostegui.json
- 16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 16:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54629 and previous config saved to /var/cache/conftool/dbconfig/20240111-165244-marostegui.json
- 16:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54628 and previous config saved to /var/cache/conftool/dbconfig/20240111-163738-marostegui.json
- 16:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54626 and previous config saved to /var/cache/conftool/dbconfig/20240111-162231-marostegui.json
- 16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54625 and previous config saved to /var/cache/conftool/dbconfig/20240111-160725-marostegui.json
- 16:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::upload
- 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54624 and previous config saved to /var/cache/conftool/dbconfig/20240111-160516-marostegui.json
- 16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 16:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54623 and previous config saved to /var/cache/conftool/dbconfig/20240111-160454-marostegui.json
- 15:59 sukhe: restart pybal on lvs4010
- 15:58 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe
- 15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54622 and previous config saved to /var/cache/conftool/dbconfig/20240111-154947-marostegui.json
- 15:47 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe
- 15:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::upload
- 15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54621 and previous config saved to /var/cache/conftool/dbconfig/20240111-153441-marostegui.json
- 15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54620 and previous config saved to /var/cache/conftool/dbconfig/20240111-151934-marostegui.json
- 15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54619 and previous config saved to /var/cache/conftool/dbconfig/20240111-151724-marostegui.json
- 15:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 15:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54618 and previous config saved to /var/cache/conftool/dbconfig/20240111-151702-marostegui.json
- 15:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54617 and previous config saved to /var/cache/conftool/dbconfig/20240111-150156-marostegui.json
- 14:51 reedy@deploy2002: Synchronized wmf-config/: T325147 (duration: 06m 43s)
- 14:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54616 and previous config saved to /var/cache/conftool/dbconfig/20240111-144649-marostegui.json
- 14:36 reedy@deploy2002: Synchronized wmf-config/: T344398 (duration: 07m 25s)
- 14:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54615 and previous config saved to /var/cache/conftool/dbconfig/20240111-143143-marostegui.json
- 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54614 and previous config saved to /var/cache/conftool/dbconfig/20240111-143034-marostegui.json
- 14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 14:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 14:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 14:26 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 14:25 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 14:25 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 14:25 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 14:24 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 14:24 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 14:21 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: T205347 (duration: 07m 41s)
- 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54613 and previous config saved to /var/cache/conftool/dbconfig/20240111-141058-root.json
- 13:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54612 and previous config saved to /var/cache/conftool/dbconfig/20240111-135553-root.json
- 13:49 hashar@deploy2002: Finished deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959 (duration: 00m 07s)
- 13:49 hashar@deploy2002: Started deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959
- 13:41 moritzm: installing xerces-c security updates
- 13:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54611 and previous config saved to /var/cache/conftool/dbconfig/20240111-134048-root.json
- 13:29 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 13:29 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54610 and previous config saved to /var/cache/conftool/dbconfig/20240111-132543-root.json
- 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54609 and previous config saved to /var/cache/conftool/dbconfig/20240111-131038-root.json
- 12:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54608 and previous config saved to /var/cache/conftool/dbconfig/20240111-125533-root.json
- 12:47 hashar: Restarting Gerrit to apply config change https://gerrit.wikimedia.org/r/c/operations/puppet/+/989735/ # T206049
- 12:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54607 and previous config saved to /var/cache/conftool/dbconfig/20240111-124028-root.json
- 12:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2124.codfw.wmnet with OS bookworm
- 12:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 12:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
- 12:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
- 12:00 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
- 12:00 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
- 11:59 moritzm: installing Python 2.7 security updates on Bullseye
- 11:50 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2124.codfw.wmnet with OS bookworm
- 11:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2124 T354506', diff saved to https://phabricator.wikimedia.org/P54606 and previous config saved to /var/cache/conftool/dbconfig/20240111-114930-marostegui.json
- 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54605 and previous config saved to /var/cache/conftool/dbconfig/20240111-111958-root.json
- 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54604 and previous config saved to /var/cache/conftool/dbconfig/20240111-110453-root.json
- 10:54 moritzm: installing Linux 5.10.205 updates on Bullseye hosts
- 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54603 and previous config saved to /var/cache/conftool/dbconfig/20240111-104948-root.json
- 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54602 and previous config saved to /var/cache/conftool/dbconfig/20240111-103443-root.json
- 10:31 moritzm: installing exim4 security updates
- 10:31 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 10:30 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 10:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: druid::public::worker
- 10:26 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 10:26 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54601 and previous config saved to /var/cache/conftool/dbconfig/20240111-101938-root.json
- 10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: druid::public::worker
- 10:12 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 10:12 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54600 and previous config saved to /var/cache/conftool/dbconfig/20240111-100433-root.json
- 10:04 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 10:03 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 10:03 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 10:00 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 10:00 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 09:58 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 09:53 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 09:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54599 and previous config saved to /var/cache/conftool/dbconfig/20240111-094928-root.json
- 09:39 hashar: Gerrit back up and operational, now running version 3.6.8
- 09:33 hashar: Gerrit restarted and it's reindexing all changes T309870
- 09:23 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 07s)
- 09:23 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
- 09:22 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 27s)
- 09:21 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
- 09:21 hashar: Stopping Gerrit
- 09:10 hashar: gerrit: `ssh -p 29418 gerrit.wikimedia.org gerrit copy-approvals` # T309870
- 09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS bookworm
- 08:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
- 08:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
2024-01-10
- 22:29 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
- 22:05 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
- 21:54 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
- 21:36 Dreamy_Jazz: UTC late deploys done
- 21:33 dreamyjazz@deploy2002: Finished scap: Backport for gerrit:989569 Add comment to clarify which rate limits apply to temporary users (T331576) (duration: 08m 05s)
- 21:28 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
- 21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Continuing with sync
- 21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Backport for gerrit:989569 Add comment to clarify which rate limits apply to temporary users (T331576) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:25 dreamyjazz@deploy2002: Started scap: Backport for gerrit:989569 Add comment to clarify which rate limits apply to temporary users (T331576)
- 21:19 taavi@deploy2002: Finished scap: Backport for gerrit:989262 Disable max width for index namespace (T352162) (duration: 14m 19s)
- 21:12 taavi@deploy2002: toyofuku and taavi: Continuing with sync
- 21:08 taavi@deploy2002: toyofuku and taavi: Backport for gerrit:989262 Disable max width for index namespace (T352162) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:05 taavi@deploy2002: Started scap: Backport for gerrit:989262 Disable max width for index namespace (T352162)
- 20:22 sukhe: enable puppet on lvs2013: T352758
- 19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
- 19:28 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
- 19:26 jhuneidi@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.13 refs T350089 (duration: 07m 58s)
- 19:24 cmooney@cumin1002: START - Cookbook sre.dns.netbox
- 19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.13 refs T350089
- 19:00 topranks: disabling OSPF connection from mr1-codfw to codfw core routers T348164
- 18:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 18:38 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
- 18:37 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
- 18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 18:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 18:35 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for prometheus2005.codfw.wmnet
- 18:35 filippo@cumin1002: START - Cookbook sre.hosts.remove-downtime for prometheus2005.codfw.wmnet
- 18:24 sukhe: stop pybal on lvs2013: T352758
- 17:59 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
- 17:58 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
- 17:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
- 17:47 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 17:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 17:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 17:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 17:40 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
- 17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 17:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 17:28 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
- 17:27 sukhe: enable puppet on lvs2014: T352758
- 17:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
- 17:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
- 17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
- 17:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
- 17:09 cmooney@cumin1002: START - Cookbook sre.dns.netbox
- 16:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
- 16:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
- 16:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
- 16:36 godog: upgrade prometheus on prometheus2006 - T354399
- 16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 16:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
- 16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 16:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
- 16:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
- 16:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
- 16:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
- 16:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 16:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
- 15:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
- 15:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
- 15:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 15:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::data
- 15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
- 15:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
- 15:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
- 15:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
- 15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
- 15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
- 15:34 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
- 15:24 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::data
- 15:24 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
- 15:22 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
- 15:21 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
- 15:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
- 15:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
- 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::collector
- 15:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
- 15:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
- 15:14 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
- 15:13 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
- 15:13 klausman@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-staging2001.codfw.wmnet
- 15:12 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
- 15:07 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbproxy[1018-1019].eqiad.wmnet
- 15:06 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:06 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
- 15:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
- 15:04 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
- 15:03 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
- 15:01 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
- 15:01 sukhe: disable puppet and stop pybal on lvs2014: T352758
- 15:00 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 14:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
- 14:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::collector
- 14:54 topranks: adding vlans to ssw1-a8-codfw to trunk to lvs2014 T352758
- 14:52 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbproxy[1018-1019].eqiad.wmnet
- 14:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
- 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: lvs::balancer
- 14:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 14:39 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 14:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
- 14:27 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: lvs::balancer
- 14:27 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 14:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 14:26 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 14:26 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 14:25 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 14:24 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 14:22 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 14:21 moritzm: installing lapack bugfix updates
- 14:21 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 14:04 moritzm: installing openblas bugfix updates
- 14:03 hashar: Switching operations-puppet-tests-buster-docker Jenkins job from tox v3 to tox v4 | T345152
- 13:56 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 13:54 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 13:15 godog: test prometheus 2.48.1 on prometheus1005 - T354399
- 12:48 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-workers (exit_code=99) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
- 12:47 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
- 12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1006.eqiad.wmnet
- 12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 12:37 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 12:37 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
- 12:37 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
- 12:37 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
- 12:37 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
- 12:35 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
- 12:22 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1006.eqiad.wmnet
- 12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1005.eqiad.wmnet
- 12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 12:20 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 12:18 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
- 12:05 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1005.eqiad.wmnet
- 11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1004.eqiad.wmnet
- 11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 11:54 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
- 11:51 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
- 11:47 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 11:46 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 11:46 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 11:46 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 11:46 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 11:46 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 11:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 11:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 11:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 11:41 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 11:41 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 11:41 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 11:39 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1004.eqiad.wmnet
- 11:37 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 11:37 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 11:36 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 11:36 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 11:03 moritzm: installing PHP 7.3 security updates
- 10:46 moritzm: installing curl security updates
- 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testreduce1001.eqiad.wmnet
- 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:02 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:01 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: sync
- 10:00 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: sync
- 10:00 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
- 10:00 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
- 09:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:55 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: (no justification provided) (duration: 00m 04s)
- 09:55 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: (no justification provided)
- 09:55 moritzm: installing git security updates on deployment hosts
- 09:53 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354 (duration: 00m 06s)
- 09:53 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testreduce1001.eqiad.wmnet
- 09:53 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354
- 09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 09:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133
- 09:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 15133
- 08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13150
- 08:57 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 13150
- 08:47 dcausse@deploy2002: Finished scap: Backport for gerrit:989442 enable page_rerender for 4th batch of wikis (T351503) (duration: 11m 50s)
- 08:42 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 08:41 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
- 08:41 moritzm: installing Exim security updates
- 08:40 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
- 08:37 dcausse@deploy2002: pfischer and dcausse: Backport for gerrit:989442 enable page_rerender for 4th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:35 dcausse@deploy2002: Started scap: Backport for gerrit:989442 enable page_rerender for 4th batch of wikis (T351503)
- 08:12 kartik@deploy2002: Finished scap: Backport for gerrit:988984 testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) (duration: 09m 10s)
- 08:06 kartik@deploy2002: kartik: Continuing with sync
- 08:04 kartik@deploy2002: kartik: Backport for gerrit:988984 testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:03 kartik@deploy2002: Started scap: Backport for gerrit:988984 testwiki: Enable Section translation on WPs with Content Translation available as default (T351882)
- 07:53 moritzm: installing openjdk-8 security updates
- 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS bookworm
- 06:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
- 06:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
- 06:32 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm
2024-01-09
- 21:23 aqu@deploy2002: Finished deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f] (duration: 00m 28s)
- 21:22 aqu@deploy2002: Started deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f]
- 21:21 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f] (duration: 00m 12s)
- 21:21 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f]
- 21:03 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error) (duration: 00m 05s)
- 21:03 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error)
- 21:02 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (duration: 03m 33s)
- 20:59 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c]
- 20:59 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c] (duration: 00m 06s)
- 20:58 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c]
- 20:58 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c] (duration: 09m 06s)
- 20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2019.codfw.wmnet
- 20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2014.codfw.wmnet
- 20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2013.codfw.wmnet
- 20:49 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c]
- 20:48 aqu: about to deploy analytics/refinery - weekly train
- 20:40 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.13 refs T350089
- 20:26 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 23m 33s)
- 20:03 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
- 19:44 mutante: mwmaint1002 - rm -rf 1.42.0-wmf.7 ; mwmaint2002 - rm -rf php-1.39.0-wmf.25
- 19:35 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.40.0-wmf.17
- 19:33 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.39.0-wmf.25 after monitoring alerted about 99% disk usage on /srv
- 19:26 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.42.0-wmf.12 refs T350089
- 19:16 urandom: decommissioning cassandra, restbase2013-{a,b,c} — T352469
- 19:14 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 45m 48s)
- 18:42 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
- 18:40 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
- 18:29 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
- 18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
- 18:02 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
- 18:00 cmooney@cumin1002: START - Cookbook sre.dns.netbox
- 17:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2143']
- 17:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
- 17:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2143']
- 17:21 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
- 17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test2004.codfw.wmnet
- 17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
- 17:14 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
- 17:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 17:06 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test2004.codfw.wmnet
- 17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test[1001-1002].eqiad.wmnet
- 17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
- 17:04 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
- 17:02 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 16:53 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test[1001-1002].eqiad.wmnet
- 16:27 jayme: restart prometheus@k8s on prometheus1005 revert GOGC to 100 (default) - T354604
- 16:22 mutante: phabricator - differential has been disabled (T330797)
- 16:11 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545 (duration: 00m 56s)
- 16:10 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545
- 16:10 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit1003.wikimedia.org
- 16:10 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:10 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
- 16:09 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545 (duration: 00m 55s)
- 16:09 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
- 16:09 mutante: phabricator deployment in progress
- 16:08 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545
- 16:08 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
- 16:08 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
- 16:07 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
- 16:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 15:58 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit1003.wikimedia.org
- 15:54 jayme: restart prometheus@k8s on prometheus1005 with GOGC=60 - T354604
- 15:37 akosiaris: depool and reboot mw1349 for a test T354413
- 15:36 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
- 15:19 sukhe: restart pybal on lvs1019: T336043
- 15:19 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
- 15:16 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 15:16 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 15:15 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 15:14 sukhe: restart pybal on lvs1020: T336043
- 15:06 TheresNoTime: done UTC afternoon backport window
- 15:03 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 15:02 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 15:02 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 15:01 TheresNoTime: `[samtar@mwmaint2002 ~]$ echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikinews-wordmark-zh.svg' | mwscript purgeList.php` T353792
- 15:01 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 15:00 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki bjnwikiquote --add-prefix "BROKEN " --fix` T350235
- 14:59 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki zghwiki --add-prefix "BROKEN " --fix` T350241
- 14:58 samtar@deploy2002: Finished scap: Backport for gerrit:986659 zghwiki: add metanamespace (T350241), gerrit:986660 bjnwikiquote: add metanamespace (T350235) (duration: 12m 10s)
- 14:56 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 14:56 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 14:52 samtar@deploy2002: samtar and anzx: Continuing with sync
- 14:50 samtar@deploy2002: samtar and anzx: Backport for gerrit:986659 zghwiki: add metanamespace (T350241), gerrit:986660 bjnwikiquote: add metanamespace (T350235) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:46 samtar@deploy2002: Started scap: Backport for gerrit:986659 zghwiki: add metanamespace (T350241), gerrit:986660 bjnwikiquote: add metanamespace (T350235)
- 14:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2034.codfw.wmnet with OS bookworm
- 14:44 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
- 14:43 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
- 14:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
- 14:38 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki hewikinews --fix` T349581
- 14:38 samtar@deploy2002: Finished scap: Backport for gerrit:968318 Create draft namespace and add namespaces aliases for hewikinews (T349581) (duration: 10m 05s)
- 14:36 kevinbazira@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:35 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 14:34 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host snapshot1014.eqiad.wmnet
- 14:32 samtar@deploy2002: samtar and anzx: Continuing with sync
- 14:30 samtar@deploy2002: samtar and anzx: Backport for gerrit:968318 Create draft namespace and add namespaces aliases for hewikinews (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:28 samtar@deploy2002: Started scap: Backport for gerrit:968318 Create draft namespace and add namespaces aliases for hewikinews (T349581)
- 14:27 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
- 14:26 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
- 14:26 bking@cumin2002: START - Cookbook sre.wdqs.restart
- 14:26 TheresNoTime: deployed patch for T350739, logging bot not working?
- 14:24 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
- 14:23 samtar@deploy2002: Finished scap: Backport for [[gerrit:972473|[namespaces] Use correct diacritics in Romanian]] (duration: 14m 42s)
- 14:22 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
- 14:21 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
- 14:16 samtar@deploy2002: strainu and samtar: Continuing with sync
- 14:13 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2035.codfw.wmnet
- 14:12 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2035.codfw.wmnet
- 14:12 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2035.codfw.wmnet
- 14:09 samtar@deploy2002: strainu and samtar: Backport for [[gerrit:972473|[namespaces] Use correct diacritics in Romanian]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:08 samtar@deploy2002: Started scap: Backport for [[gerrit:972473|[namespaces] Use correct diacritics in Romanian]]
- 14:04 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
- 14:01 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
- 14:01 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2034.codfw.wmnet with OS bookworm
- 13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2033.codfw.wmnet with OS bookworm
- 13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
- 13:56 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
- 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
- 13:43 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host snapshot1014.eqiad.wmnet with OS bullseye
- 13:41 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
- 13:37 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
- 13:34 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
- 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54575 and previous config saved to /var/cache/conftool/dbconfig/20240109-133327-root.json
- 13:20 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 13:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54574 and previous config saved to /var/cache/conftool/dbconfig/20240109-131822-root.json
- 13:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 13:14 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
- 13:13 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
- 13:10 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
- 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54573 and previous config saved to /var/cache/conftool/dbconfig/20240109-130317-root.json
- 13:00 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
- 13:00 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
- 12:58 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
- 12:57 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
- 12:57 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
- 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54572 and previous config saved to /var/cache/conftool/dbconfig/20240109-124812-root.json
- 12:43 moritzm: imported mwbzutils 0.1.4~wmf-1+deb11u1 for bullseye-wikimedia T325228
- 12:43 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
- 12:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
- 12:39 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
- 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54571 and previous config saved to /var/cache/conftool/dbconfig/20240109-123307-root.json
- 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54570 and previous config saved to /var/cache/conftool/dbconfig/20240109-121802-root.json
- 12:17 stevemunene@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
- 12:10 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
- 12:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
- 12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1180.eqiad.wmnet with OS bookworm
- 12:06 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
- 12:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
- 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54569 and previous config saved to /var/cache/conftool/dbconfig/20240109-120257-root.json
- 12:01 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
- 11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
- 11:50 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
- 11:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
- 11:49 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
- 11:46 cmooney@cumin1002: START - Cookbook sre.dns.netbox
- 11:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
- 11:43 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
- 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
- 11:38 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
- 11:37 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
- 11:37 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
- 11:37 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
- 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
- 11:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
- 11:30 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1180.eqiad.wmnet with OS bookworm
- 11:30 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=mw2394.codfw.wmnet,cluster=jobrunner
- 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180 T354506', diff saved to https://phabricator.wikimedia.org/P54568 and previous config saved to /var/cache/conftool/dbconfig/20240109-112922-root.json
- 11:22 cgoubert@cumin2002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
- 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
- 11:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 11:19 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 11:18 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 11:18 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 11:17 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 11:17 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 11:15 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
- 11:15 taavi@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
- 11:14 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
- 11:05 moritzm: installing exim security updates
- 10:54 godog: restart prometheus@k8s on prometheus1005 to see if labeldrop id will yield expected results - T354604
- 10:45 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2033.codfw.wmnet with OS bookworm
- 10:38 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
- 10:22 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 10:21 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 10:19 btullis@cumin1002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
- 10:11 btullis@cumin1002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
- 10:00 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
- 09:59 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
- 09:54 oblivian@deploy2002: Finished scap: Backport for gerrit:987033 Always process media files via shellbox on k8s (T352515) (duration: 11m 03s)
- 09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
- 09:48 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
- 09:47 oblivian@deploy2002: oblivian: Continuing with sync
- 09:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
- 09:44 oblivian@deploy2002: oblivian: Backport for gerrit:987033 Always process media files via shellbox on k8s (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 09:43 oblivian@deploy2002: Started scap: Backport for gerrit:987033 Always process media files via shellbox on k8s (T352515)
- 09:39 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
- 09:34 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
- 09:27 oblivian@deploy2002: Finished scap: Backport for gerrit:987032 Use shellbox for djvu handling on kubernetes (T352515) (duration: 23m 56s)
- 09:20 oblivian@deploy2002: oblivian: Continuing with sync
- 09:15 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
- 09:14 moritzm: prune obsolete nginx packages from ncredir hosts after migration to new library scheme T329529
- 09:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
- 09:06 arnaudb: upload wmfdb 0.1.4 from https://gitlab.wikimedia.org/repos/sre/wmfdb/-/tree/dgit/bookworm-wikimedia to fix default ca bundle
- 09:05 oblivian@deploy2002: oblivian: Backport for gerrit:987032 Use shellbox for djvu handling on kubernetes (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 09:03 oblivian@deploy2002: Started scap: Backport for gerrit:987032 Use shellbox for djvu handling on kubernetes (T352515)
- 08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45287
- 08:54 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 45287
- 08:54 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
- 08:49 oblivian@deploy2002: Finished scap: Backport for gerrit:987031 Remove throttle exception (T352569) (duration: 09m 01s)
- 08:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 9902
- 08:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 9902
- 08:42 oblivian@deploy2002: oblivian: Continuing with sync
- 08:42 oblivian@deploy2002: oblivian: Backport for gerrit:987031 Remove throttle exception (T352569) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:40 oblivian@deploy2002: Started scap: Backport for gerrit:987031 Remove throttle exception (T352569)
- 08:22 kartik@deploy2002: Finished scap: Backport for gerrit:988493 testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) (duration: 15m 54s)
- 08:21 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2143.codfw.wmnet with OS bookworm
- 08:20 godog: set aside WAL for prometheus@k8s in codfw and restart - T354399
- 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54567 and previous config saved to /var/cache/conftool/dbconfig/20240109-081946-root.json
- 08:11 kartik@deploy2002: kartik: Continuing with sync
- 08:10 kartik@deploy2002: kartik: Backport for gerrit:988493 testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:06 kartik@deploy2002: Started scap: Backport for gerrit:988493 testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510)
- 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After a crash', diff saved to https://phabricator.wikimedia.org/P54566 and previous config saved to /var/cache/conftool/dbconfig/20240109-080558-root.json
- 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54565 and previous config saved to /var/cache/conftool/dbconfig/20240109-080441-root.json
- 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After a crash', diff saved to https://phabricator.wikimedia.org/P54564 and previous config saved to /var/cache/conftool/dbconfig/20240109-075053-root.json
- 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54563 and previous config saved to /var/cache/conftool/dbconfig/20240109-074936-root.json
- 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After a crash', diff saved to https://phabricator.wikimedia.org/P54562 and previous config saved to /var/cache/conftool/dbconfig/20240109-073548-root.json
- 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54561 and previous config saved to /var/cache/conftool/dbconfig/20240109-073431-root.json
- 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After a crash', diff saved to https://phabricator.wikimedia.org/P54560 and previous config saved to /var/cache/conftool/dbconfig/20240109-072043-root.json
- 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54559 and previous config saved to /var/cache/conftool/dbconfig/20240109-071926-root.json
- 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After a crash', diff saved to https://phabricator.wikimedia.org/P54558 and previous config saved to /var/cache/conftool/dbconfig/20240109-070538-root.json
- 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54557 and previous config saved to /var/cache/conftool/dbconfig/20240109-070421-root.json
- 07:01 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm
- 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2151.codfw.wmnet with OS bookworm
- 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After a crash', diff saved to https://phabricator.wikimedia.org/P54556 and previous config saved to /var/cache/conftool/dbconfig/20240109-065033-root.json
- 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54555 and previous config saved to /var/cache/conftool/dbconfig/20240109-064916-root.json
- 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After a crash', diff saved to https://phabricator.wikimedia.org/P54554 and previous config saved to /var/cache/conftool/dbconfig/20240109-063528-root.json
- 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
- 06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
- 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P54553 and previous config saved to /var/cache/conftool/dbconfig/20240109-062806-root.json
- 06:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2151.codfw.wmnet with OS bookworm
- 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2151 T354506', diff saved to https://phabricator.wikimedia.org/P54552 and previous config saved to /var/cache/conftool/dbconfig/20240109-061015-root.json
- 03:11 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 03:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 03:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 03:10 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 03:10 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 03:10 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 01:22 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 01:17 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
2024-01-08
- 23:16 eileen: civicrm upgraded from 16b5417b to c7304245
- 22:58 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 22:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 22:30 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic2087\.codfw\.wmnet
- 22:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 21:50 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:49 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 21:37 cjming: end of UTC late backport window
- 21:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 21:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 21:15 cjming@deploy2002: Finished scap: Backport for gerrit:988714 Remove android.metrics_platform.* stream definitions (T354199) (duration: 08m 17s)
- 21:08 cjming@deploy2002: cjming: Continuing with sync
- 21:08 cjming@deploy2002: cjming: Backport for gerrit:988714 Remove android.metrics_platform.* stream definitions (T354199) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:07 cjming@deploy2002: Started scap: Backport for gerrit:988714 Remove android.metrics_platform.* stream definitions (T354199)
- 19:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 19:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
- 19:27 taavi: make puppet re-generate empty envoy config file on testreduce1002 T345220
- 19:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 19:13 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 19:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 19:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 19:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 19:04 sukhe: running authdns-update for CR 988684: T345220
- 19:04 sukhe: running authdns-update for CR 988684: T336043
- 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:25 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 18:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 17:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 17:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 17:43 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: gerrit:988673 Bumping portals to master (T128546) (duration: 06m 17s)
- 17:36 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: gerrit:988673 Bumping portals to master (T128546) (duration: 06m 21s)
- 17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
- 17:18 godog: wipe prometheus@k8s eqiad WAL and restart - T354399
- 17:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 17:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 17:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 17:14 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 17:14 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
- 17:12 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 08m 01s)
- 17:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 17:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 17:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 17:06 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 17:06 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 17:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 17:04 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
- 17:04 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 12m 24s)
- 17:00 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
- 16:57 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 16:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
- 16:53 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2034.codfw.wmnet
- 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
- 16:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 16:51 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
- 16:49 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 08m 47s)
- 16:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 16:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
- 16:46 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
- 16:44 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
- 16:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 16:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 16:42 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:41 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
- 16:37 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2034.codfw.wmnet
- 16:36 btullis@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dbstore1008.eqiad.wmnet on all recursors
- 16:36 btullis@cumin1002: START - Cookbook sre.dns.wipe-cache dbstore1008.eqiad.wmnet on all recursors
- 16:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
- 16:35 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:35 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
- 16:34 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
- 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2033.codfw.wmnet
- 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
- 16:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
- 16:30 btullis@cumin1002: START - Cookbook sre.dns.netbox
- 16:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 16:25 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
- 16:25 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) (duration: 24m 06s)
- 16:24 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
- 16:23 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
- 16:21 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
- 16:20 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
- 16:18 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2033.codfw.wmnet
- 16:15 taavi: restart pybal on lvs1018 - T346947
- 16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 16:14 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988658 Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 16:09 taavi: restart pybal on lvs1020 - T346947
- 16:01 ladsgroup@deploy2002: Started scap: Backport for gerrit:988658 Undeploy Listings extension part III (T253216)
- 15:59 sfaci@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
- 15:59 sfaci@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
- 15:58 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988655 Undeploy listing extension part II (T253216) (duration: 08m 40s)
- 15:57 sfaci@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
- 15:57 sfaci@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
- 15:52 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 15:51 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988655 Undeploy listing extension part II (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:49 ladsgroup@deploy2002: Started scap: Backport for gerrit:988655 Undeploy listing extension part II (T253216)
- 15:48 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
- 15:48 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
- 15:47 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988654 Undeploy Listings extension, part I (T253216) (duration: 08m 22s)
- 15:46 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 15:46 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 15:41 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 15:40 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988654 Undeploy Listings extension, part I (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:40 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:38 ladsgroup@deploy2002: Started scap: Backport for gerrit:988654 Undeploy Listings extension, part I (T253216)
- 15:35 claime: Draining and cordoning kubestage2002.codfw.wmnet - T352883
- 15:32 krinkle@deploy2002: Finished scap: Backport for gerrit:987999 Fix parsing logic when comments or hidden characters are present (T354385) (duration: 07m 52s)
- 15:26 krinkle@deploy2002: krinkle: Continuing with sync
- 15:26 krinkle@deploy2002: krinkle: Backport for gerrit:987999 Fix parsing logic when comments or hidden characters are present (T354385) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 15:24 krinkle@deploy2002: Started scap: Backport for gerrit:987999 Fix parsing logic when comments or hidden characters are present (T354385)
- 14:46 urbanecm@deploy2002: Finished scap: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297), gerrit:988504 agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680) (duration: 10m 22s)
- 14:40 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Continuing with sync
- 14:37 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297), gerrit:988504 agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680) synce
- 14:35 urbanecm@deploy2002: Started scap: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297), gerrit:988504 agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680)
- 14:34 urbanecm@deploy2002: Sync cancelled.
- 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
- 14:27 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
- 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54548 and previous config saved to /var/cache/conftool/dbconfig/20240108-141717-root.json
- 14:14 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:13 urbanecm@deploy2002: Started scap: Backport for gerrit:987159 Add agent.app_install_id to android.product_metrics.* streams (T353680), gerrit:982467 Remove partial migration of EditAttemptStep instrument (T351335), gerrit:982903 Add new stream names to the config variable (T353297)
- 14:12 urbanecm@deploy2002: Finished scap: Backport for gerrit:988449 enable page_rerender for 3rd batch of wikis (T351503) (duration: 09m 35s)
- 14:06 urbanecm@deploy2002: pfischer and urbanecm: Continuing with sync
- 14:04 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 14:04 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 14:04 urbanecm@deploy2002: pfischer and urbanecm: Backport for gerrit:988449 enable page_rerender for 3rd batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:02 urbanecm@deploy2002: Started scap: Backport for gerrit:988449 enable page_rerender for 3rd batch of wikis (T351503)
- 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54547 and previous config saved to /var/cache/conftool/dbconfig/20240108-140212-root.json
- 14:01 moritzm: installing curl security updates
- 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54546 and previous config saved to /var/cache/conftool/dbconfig/20240108-134707-root.json
- 13:33 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 13:33 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54545 and previous config saved to /var/cache/conftool/dbconfig/20240108-133202-root.json
- 13:32 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 13:31 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54544 and previous config saved to /var/cache/conftool/dbconfig/20240108-133016-root.json
- 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54543 and previous config saved to /var/cache/conftool/dbconfig/20240108-131657-root.json
- 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54542 and previous config saved to /var/cache/conftool/dbconfig/20240108-131511-root.json
- 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54541 and previous config saved to /var/cache/conftool/dbconfig/20240108-130152-root.json
- 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54540 and previous config saved to /var/cache/conftool/dbconfig/20240108-130006-root.json
- 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54539 and previous config saved to /var/cache/conftool/dbconfig/20240108-124647-root.json
- 12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS bookworm
- 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54538 and previous config saved to /var/cache/conftool/dbconfig/20240108-124501-root.json
- 12:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54537 and previous config saved to /var/cache/conftool/dbconfig/20240108-122956-root.json
- 12:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
- 12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
- 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54536 and previous config saved to /var/cache/conftool/dbconfig/20240108-121451-root.json
- 12:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS bookworm
- 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224 T354506', diff saved to https://phabricator.wikimedia.org/P54535 and previous config saved to /var/cache/conftool/dbconfig/20240108-120759-root.json
- 12:03 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45287
- 12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 45287
- 12:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 35847
- 12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 35847
- 12:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9902
- 12:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 9902
- 12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2117.codfw.wmnet with OS bookworm
- 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54534 and previous config saved to /var/cache/conftool/dbconfig/20240108-115946-root.json
- 11:57 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988460 Disable Listings extension everywhere except rowikivoyage (T253216) (duration: 08m 43s)
- 11:50 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 11:50 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988460 Disable Listings extension everywhere except rowikivoyage (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:48 ladsgroup@deploy2002: Started scap: Backport for gerrit:988460 Disable Listings extension everywhere except rowikivoyage (T253216)
- 11:45 taavi@deploy2002: Finished scap: Backport for gerrit:988252 OATHAuthServices: Fix service name (T354505), gerrit:988253 Fix disabling two-factor authentication (T354505) (duration: 09m 21s)
- 11:39 taavi@deploy2002: taavi: Continuing with sync
- 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
- 11:38 taavi@deploy2002: taavi: Backport for gerrit:988252 OATHAuthServices: Fix service name (T354505), gerrit:988253 Fix disabling two-factor authentication (T354505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:36 taavi@deploy2002: Started scap: Backport for gerrit:988252 OATHAuthServices: Fix service name (T354505), gerrit:988253 Fix disabling two-factor authentication (T354505)
- 11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
- 11:29 ladsgroup@deploy2002: Finished scap: Backport for gerrit:988456 Stop writing to the old columns of pagelinks in testwiki (T352010) (duration: 10m 02s)
- 11:23 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 11:20 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:988456 Stop writing to the old columns of pagelinks in testwiki (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:19 ladsgroup@deploy2002: Started scap: Backport for gerrit:988456 Stop writing to the old columns of pagelinks in testwiki (T352010)
- 11:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2117.codfw.wmnet with OS bookworm
- 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2117 T354506', diff saved to https://phabricator.wikimedia.org/P54533 and previous config saved to /var/cache/conftool/dbconfig/20240108-111452-root.json
- 10:36 XioNoX: repool eqsin - T332395
- 10:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:21 ladsgroup@deploy2002: Finished scap: Backport for gerrit:987861 styles: Replace obsolete WikimediaUI Base var with Codex alias (duration: 07m 32s)
- 10:20 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 10:20 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Continuing with sync
- 10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Backport for gerrit:987861 styles: Replace obsolete WikimediaUI Base var with Codex alias synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 10:14 ladsgroup@deploy2002: Started scap: Backport for gerrit:987861 styles: Replace obsolete WikimediaUI Base var with Codex alias
- 10:11 ladsgroup@deploy2002: Finished scap: Backport for gerrit:987657 Set commonswiki pagelinks migration stage to READ NEW (T351237) (duration: 08m 52s)
- 10:05 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 10:04 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:987657 Set commonswiki pagelinks migration stage to READ NEW (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 10:02 ladsgroup@deploy2002: Started scap: Backport for gerrit:987657 Set commonswiki pagelinks migration stage to READ NEW (T351237)
- 09:54 XioNoX: asw1-eqsin> request system reboot - T332395
- 09:32 Emperor: reboot ms-be2074-80 before adding them to the rings T353149
- 09:32 Emperor: reboot ms-be1072-82 before adding them to the rings T353149
- 09:24 XioNoX: start install process on asw1-eqsin - T332395
- 09:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
- 09:04 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
- 09:03 XioNoX: depool eqsin for switch upgrade - T332395
- 08:27 xSavitar: UTC morning backport window done.
- 08:26 derick@deploy2002: Finished scap: Backport for gerrit:974508 wmf-config: Remove unused wgStatsCacheType setting (T336004) (duration: 09m 11s)
- 08:20 derick@deploy2002: derick and d3r1ck01: Continuing with sync
- 08:18 derick@deploy2002: derick and d3r1ck01: Backport for gerrit:974508 wmf-config: Remove unused wgStatsCacheType setting (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:17 derick@deploy2002: Started scap: Backport for gerrit:974508 wmf-config: Remove unused wgStatsCacheType setting (T336004)
2024-01-06
- 22:27 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 22:27 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 22:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
2024-01-05
- 23:49 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy) (duration: 00m 08s)
- 23:49 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy)
- 23:31 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy) (duration: 00m 06s)
- 23:31 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy)
- 23:25 thcipriani: deploying gerrit to remove survey banner https://gerrit.wikimedia.org/r/987995 (no downtime needed)
- 19:29 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2034.codfw.wmnet
- 19:29 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2034.codfw.wmnet
- 19:23 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2034.codfw.wmnet
- 19:07 mutante: vrts1001 - sudo systemctl start clamav-daemon
- 17:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 17:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:43 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 16:42 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 16:40 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 16:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 16:29 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 16:19 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:31 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 15:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 14:50 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 14:50 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 14:45 milimetric@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 14:45 milimetric@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 14:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:38 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:37 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:14 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 14:14 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 13:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 13:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 13:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 13:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 11:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw1379.eqiad.wmnet
- 11:49 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1379.eqiad.wmnet
- 09:26 moritzm: installing 5.10.205 kernels on Bullseye hosts
- 09:15 _joe_: upgrading conftool across the fleet
- 08:01 moritzm: installing 6.1.69 kernels on Bookworm hosts
- 01:27 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=arzwiki --logwiki=metawiki 'WanderingPlaywrite' 'WanderingPlaywright' # T354397
- 00:59 cwhite: restarted prometheus@k8s on prometheus1006 and backed up the wal for OOM loop investigation
- 00:52 cwhite: restarted prometheus@k8s on prometheus1005 and backed up the wal for OOM loop investigation
2024-01-04
- 23:10 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 23:10 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:31 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:31 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:25 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:25 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 22:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 22:22 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 22:22 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 22:22 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 22:21 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 22:21 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 22:21 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 22:00 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 22:00 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 21:38 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 21:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 21:27 brennen: end of utc late backport window
- 21:26 brennen@deploy2002: Finished scap: Backport for gerrit:987738 Ensure all non-okay statuses from ::getImageContents have a message (T354374) (duration: 08m 01s)
- 21:20 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
- 21:19 brennen@deploy2002: brennen and dreamyjazz: Backport for gerrit:987738 Ensure all non-okay statuses from ::getImageContents have a message (T354374) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:18 brennen@deploy2002: Started scap: Backport for gerrit:987738 Ensure all non-okay statuses from ::getImageContents have a message (T354374)
- 21:17 brennen@deploy2002: Finished scap: Backport for gerrit:987734 Check for invalid JSON on a good response from PhotoDNA (T354370) (duration: 07m 57s)
- 21:11 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
- 21:10 brennen@deploy2002: brennen and dreamyjazz: Backport for gerrit:987734 Check for invalid JSON on a good response from PhotoDNA (T354370) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:09 brennen@deploy2002: Started scap: Backport for gerrit:987734 Check for invalid JSON on a good response from PhotoDNA (T354370)
- 20:41 ryankemper: [apifeatureusage] T350703 Restarted `logstash` on `apifeatureusage[1,2]001`
- 20:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.12 refs T350088
- 20:30 mutante: mwmaint2002 - /usr/local/sbin/sync-home-mwmaint after gerrit:987778
- 20:20 dduvall@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.12 refs T350088 (duration: 06m 09s)
- 20:16 ejegg: standalone (payments listener) SmashPig upgraded from fc74ccca to 20d6434e
- 20:13 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.12 refs T350088
- 20:03 mutante: releases2003 - systemctl status rsync-srv-org-wikimedia-releases-releases2003.codfw.wmnet after gerrit:987436
- 20:01 mutante: releases2003 - systemctl start rsync-srv-patches-releases2003.codfw.wmnet after gerrit:987436
- 19:59 brett: restarting pybal on lvs5006 for testing purposes - T353760
- 19:59 mutante: releases1003 - systemctl start rsync-srv-patches-releases-primary after gerrit:987436
- 19:57 dcausse: repooling wdqs1019
- 19:52 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 19:51 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 19:49 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
- 19:47 mutante: deploy1002 - systemctl start rsync-patches_module after gerrit:987436
- 19:32 dduvall@deploy2002: Finished scap: Backport for gerrit:987473 Revise logic for creating compact links button on Vector 2022 (T353850) (duration: 07m 58s)
- 19:26 dduvall@deploy2002: jdlrobson and dduvall: Continuing with sync
- 19:26 dduvall@deploy2002: jdlrobson and dduvall: Backport for gerrit:987473 Revise logic for creating compact links button on Vector 2022 (T353850) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 19:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 19:25 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 19:24 dduvall@deploy2002: Started scap: Backport for gerrit:987473 Revise logic for creating compact links button on Vector 2022 (T353850)
- 19:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 19:04 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 19:04 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 18:46 sukhe: [second time] mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
- 18:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
- 17:57 sukhe: mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
- 17:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 17:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 17:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 17:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 17:35 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 17:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 17:28 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
- 17:10 oblivian@puppetmaster2001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=kubernetes,service=kubesvc,name=mw1377.*
- 16:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:42 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:36 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
- 16:25 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
- 16:00 volans@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mw1378.eqiad.wmnet
- 15:59 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
- 15:58 moritzm: installing libdatetime-timezone-perl updates
- 15:51 moritzm: rolling restart of FPM/apache on mw canaries to pick up curl updates
- 15:48 XioNoX: repool esams - T346779
- 15:46 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
- 15:38 XioNoX: undrain esams-eqiad transport - T346779
- 15:37 XioNoX: re-enable peering/transit on cr1-esams - T346779
- 15:35 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
- 15:30 XioNoX: reboot fpc0 on cr1-esams - T346779
- 15:29 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
- 15:26 XioNoX: disable peering/transit on cr1-esams for linecard reboot - T346779
- 15:19 volans: running sre.hosts.provision for mw1378 - T351074
- 15:19 volans@cumin2002: START - Cookbook sre.hosts.provision for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
- 15:16 XioNoX: drain esams-eqiad transport - T346779
- 15:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:12 moritzm: installing curl security updates
- 15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:08 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
- 15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:05 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:04 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 15:04 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 15:01 XioNoX: depool esams for router work - T346779
- 15:00 tchanders@deploy2002: Finished scap: Backport for gerrit:984810 enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary (duration: 17m 55s)
- 14:59 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
- 14:56 volans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
- 14:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
- 14:54 tchanders@deploy2002: pfischer and tchanders: Continuing with sync
- 14:45 tchanders@deploy2002: pfischer and tchanders: Backport for gerrit:984810 enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:42 tchanders@deploy2002: Started scap: Backport for gerrit:984810 enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary
- 14:40 tchanders@deploy2002: Finished scap: Backport for gerrit:987726 Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 25s)
- 14:34 tchanders@deploy2002: tchanders and dreamyjazz: Continuing with sync
- 14:34 tchanders@deploy2002: tchanders and dreamyjazz: Backport for gerrit:987726 Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:30 tchanders@deploy2002: Started scap: Backport for gerrit:987726 Attempt to send original file to PhotoDNA if no thumbnail (T353854)
- 14:25 tchanders@deploy2002: Finished scap: Backport for gerrit:987485 Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 24s)
- 14:20 tchanders@deploy2002: dreamyjazz and tchanders: Continuing with sync
- 14:20 tchanders@deploy2002: dreamyjazz and tchanders: Backport for gerrit:987485 Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:16 tchanders@deploy2002: Started scap: Backport for gerrit:987485 Attempt to send original file to PhotoDNA if no thumbnail (T353854)
- 14:12 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:03 XioNoX: repool drmrs - T354340
- 14:01 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:00 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:00 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 13:57 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 2686
- 13:56 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 2686
- 13:53 moritzm: installing libssh security updates
- 13:24 dcausse: restarting blazegraph on wdqs1019 (stuck with high thread count)
- 13:07 zabe@deploy2002: Finished scap: Backport for gerrit:987483 Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), gerrit:987482 Revert "Support new block schema" (T354298) (duration: 10m 06s)
- 13:02 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mw1377.eqiad.wmnet
- 13:02 XioNoX: depool drmrs for router work - T354340
- 13:01 zabe@deploy2002: zabe: Continuing with sync
- 13:00 zabe@deploy2002: zabe: Backport for gerrit:987483 Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), gerrit:987482 Revert "Support new block schema" (T354298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 12:56 zabe@deploy2002: Started scap: Backport for gerrit:987483 Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), gerrit:987482 Revert "Support new block schema" (T354298)
- 12:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 63296
- 12:52 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 63296
- 12:10 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1377.eqiad.wmnet
- 12:04 moritzm: installing lua5.3 security updates
- 11:52 moritzm: installing libde265 security updates
- 11:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
- 11:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 11:16 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 11:01 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
- 10:51 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 10:33 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 10:32 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 10:17 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #3
- 10:17 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 09:57 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 09:38 akosiaris: delete mw1377-mw1383 from eqiad wikikube nodes
- 09:38 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #2
- 09:36 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 09:36 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 09:22 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi)
- 09:22 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 08:49 ladsgroup@deploy2002: Finished scap: Backport for gerrit:987134 Update virtual domain for url shortener (duration: 12m 35s)
- 08:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 08:38 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:987134 Update virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:36 ladsgroup@deploy2002: Started scap: Backport for gerrit:987134 Update virtual domain for url shortener
- 08:34 ladsgroup@deploy2002: Finished scap: Backport for gerrit:985160 Add virtual domain config for reading lists extension (T353948) (duration: 09m 05s)
- 08:28 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 08:27 ladsgroup@deploy2002: ladsgroup: Backport for gerrit:985160 Add virtual domain config for reading lists extension (T353948) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 08:25 ladsgroup@deploy2002: Started scap: Backport for gerrit:985160 Add virtual domain config for reading lists extension (T353948)
- 07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS bookworm
- 06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
- 06:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
- 06:28 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS bookworm
- 03:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
2024-01-03
- 23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
- 23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
- 23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
- 23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
- 23:33 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
- 23:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 23:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 23:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
- 23:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
- 23:14 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
- 23:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
- 23:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
- 23:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
- 23:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
- 23:01 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
- 22:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
- 22:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 22:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
- 22:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
- 22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
- 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
- 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
- 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
- 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
- 22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
- 22:40 bking@cumin2002: START - Cookbook sre.wdqs.restart
- 22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
- 22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
- 22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
- 22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
- 22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
- 22:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
- 22:36 bking@cumin2002: START - Cookbook sre.wdqs.restart
- 22:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2087.codfw.wmnet with OS bullseye
- 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
- 21:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
- 21:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
- 21:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: broken reimage
- 21:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: broken reimage
- 21:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
- 21:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 21:34 zabe@deploy2002: Finished scap: Backport for gerrit:986825 Update mediawiki/mediawiki-codesniffer to 42.0.0 (duration: 10m 34s)
- 21:33 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 21:28 zabe@deploy2002: zabe: Continuing with sync
- 21:27 zabe@deploy2002: zabe: Backport for gerrit:986825 Update mediawiki/mediawiki-codesniffer to 42.0.0 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:24 zabe@deploy2002: Started scap: Backport for gerrit:986825 Update mediawiki/mediawiki-codesniffer to 42.0.0
- 21:19 TheresNoTime: UTC late backport window done
- 21:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
- 21:14 samtar@deploy2002: Finished scap: Backport for gerrit:986200 Add "patroller" user group to testwiki (T354063) (duration: 12m 19s)
- 21:08 samtar@deploy2002: novemlinguae and samtar: Continuing with sync
- 21:06 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1383.eqiad.wmnet with OS bullseye
- 21:06 samtar@deploy2002: novemlinguae and samtar: Backport for gerrit:986200 Add "patroller" user group to testwiki (T354063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 21:04 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1382.eqiad.wmnet with OS bullseye
- 21:02 samtar@deploy2002: Started scap: Backport for gerrit:986200 Add "patroller" user group to testwiki (T354063)
- 20:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1381.eqiad.wmnet with OS bullseye
- 20:47 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1380.eqiad.wmnet with OS bullseye
- 20:45 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
- 20:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1378.eqiad.wmnet with OS bullseye
- 20:34 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
- 20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2450.codfw.wmnet with OS bullseye
- 20:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2443.codfw.wmnet with OS bullseye
- 20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2451.codfw.wmnet with OS bullseye
- 20:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2442.codfw.wmnet with OS bullseye
- 20:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
- 19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2436.codfw.wmnet with OS bullseye
- 19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
- 19:57 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
- 19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2440.codfw.wmnet with OS bullseye
- 19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
- 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2437.codfw.wmnet with OS bullseye
- 19:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
- 19:51 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
- 19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
- 19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
- 19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
- 19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
- 19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
- 19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
- 19:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
- 19:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
- 19:39 mutante: root@doc2002: /usr/local/sbin/sync-doc-host-data-sync after gerrit:987406
- 19:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 19:38 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
- 19:36 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
- 19:36 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
- 19:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
- 19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
- 19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
- 19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
- 19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
- 19:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
- 19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2451.codfw.wmnet with OS bullseye
- 19:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
- 19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2450.codfw.wmnet with OS bullseye
- 19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2443.codfw.wmnet with OS bullseye
- 19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
- 19:28 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
- 19:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
- 19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
- 19:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
- 19:22 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
- 19:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
- 19:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2442.codfw.wmnet with OS bullseye
- 19:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2440.codfw.wmnet with OS bullseye
- 19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
- 19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
- 19:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2437.codfw.wmnet with OS bullseye
- 19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2436.codfw.wmnet with OS bullseye
- 18:27 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519 (duration: 00m 27s)
- 18:27 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519
- 18:27 brennen: running an essentially no-op phab2002 deploy
- 18:11 dduvall@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 07m 23s)
- 18:03 dduvall@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088
- 17:06 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
- 16:45 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
- 16:33 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
- 16:27 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
- 16:27 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
- 16:27 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
- 16:26 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
- 16:26 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 16:26 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 16:25 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
- 16:25 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
- 16:24 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
- 16:24 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
- 16:23 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 16:22 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 16:16 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
- 16:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
- 16:10 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
- 15:39 moritzm: rebuild md RAIDs after disk swap T353324
- 14:55 TheresNoTime: UTC afternoon backport window done
- 14:54 samtar@deploy2002: Finished scap: Backport for gerrit:986658 zhwikinews: update wordmark (T353792) (duration: 09m 11s)
- 14:48 samtar@deploy2002: anzx and samtar: Continuing with sync
- 14:46 samtar@deploy2002: anzx and samtar: Backport for gerrit:986658 zhwikinews: update wordmark (T353792) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:45 samtar@deploy2002: Started scap: Backport for gerrit:986658 zhwikinews: update wordmark (T353792)
- 14:43 samtar@deploy2002: Finished scap: Backport for gerrit:985389 aswikiquote: change wordmark and update logo (T353934) (duration: 07m 51s)
- 14:38 samtar@deploy2002: samtar and anzx: Continuing with sync
- 14:37 samtar@deploy2002: samtar and anzx: Backport for gerrit:985389 aswikiquote: change wordmark and update logo (T353934) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:36 samtar@deploy2002: Started scap: Backport for gerrit:985389 aswikiquote: change wordmark and update logo (T353934)
- 14:34 samtar@deploy2002: Finished scap: Backport for gerrit:986662 Edit Recovery: fix typo in expiry field name (T347673) (duration: 07m 46s)
- 14:29 samtar@deploy2002: samtar: Continuing with sync
- 14:28 samtar@deploy2002: samtar: Backport for gerrit:986662 Edit Recovery: fix typo in expiry field name (T347673) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:27 samtar@deploy2002: Started scap: Backport for gerrit:986662 Edit Recovery: fix typo in expiry field name (T347673)
- 14:18 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:18 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:17 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:17 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:11 samtar@deploy2002: Finished scap: Backport for gerrit:985376 zhwikivoyage: Enable block feature for abusefilter (T353604), gerrit:987417 ganwiki: Add transwiki import sources (T354000) (duration: 09m 58s)
- 14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:06 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 14:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 14:05 samtar@deploy2002: samtar and stang: Continuing with sync
- 14:03 moritzm: installing qemu security updates
- 14:02 samtar@deploy2002: samtar and stang: Backport for gerrit:985376 zhwikivoyage: Enable block feature for abusefilter (T353604), gerrit:987417 ganwiki: Add transwiki import sources (T354000) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:01 samtar@deploy2002: Started scap: Backport for gerrit:985376 zhwikivoyage: Enable block feature for abusefilter (T353604), gerrit:987417 ganwiki: Add transwiki import sources (T354000)
- 13:32 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nick Ifeajika out of all services on: 2220 hosts
- 13:31 root@cumin2002: START - Cookbook sre.idm.logout Logging Nick Ifeajika out of all services on: 2220 hosts
- 13:29 moritzm: installing Java 8/11 security updates
- 12:34 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 12:34 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot-master (exit_code=0) rolling restart_daemons on A:maps-master
- 12:28 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot-master rolling restart_daemons on A:maps-master
- 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
- 12:18 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
- 12:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
- 12:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 12:08 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
- 12:02 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 12:02 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 12:01 moritzm: installing gnutls28 security updates on buster
- 11:47 oblivian@deploy2002: Finished scap: Backport for gerrit:987400 Fix timeouts detection on mw on k8s jobrunners (T354229) (duration: 11m 38s)
- 11:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:41 oblivian@deploy2002: oblivian: Continuing with sync
- 11:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:37 oblivian@deploy2002: oblivian: Backport for gerrit:987400 Fix timeouts detection on mw on k8s jobrunners (T354229) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:36 oblivian@deploy2002: Started scap: Backport for gerrit:987400 Fix timeouts detection on mw on k8s jobrunners (T354229)
- 11:31 oblivian@deploy2002: Finished scap: Backport for gerrit:951049 Disable things that don't work on k8s when on k8s (duration: 15m 29s)
- 11:25 oblivian@deploy2002: oblivian: Continuing with sync
- 11:25 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:22 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:22 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 11:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 11:18 oblivian@deploy2002: oblivian: Backport for gerrit:951049 Disable things that don't work on k8s when on k8s synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 11:16 oblivian@deploy2002: Started scap: Backport for gerrit:951049 Disable things that don't work on k8s when on k8s
- 11:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:56 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:53 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:51 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:51 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:48 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:48 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:46 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:11 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 10:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 10:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:57 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:36 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:36 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:33 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:31 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:31 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:10 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 09:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 09:03 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 01:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 01:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 01:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 01:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 00:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 00:55 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
2024-01-02
- 22:42 urbanecm: mwmaint2002: Restart `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
- 22:29 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2087.codfw.wmnet with OS bullseye
- 21:08 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
- 20:52 urbanecm: mwmaint2002: `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
- 20:32 mutante: phab2002 - synced /srv/homes from phab1004 to /srv/homes on phab2002
- 19:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
- 18:29 mutante: confctl select 'name=mw2394.codfw.wmnet' set/pooled=inactive | T354193#9430654 - seems like 2396 was previously depooled instead of this 2394
- 17:29 dancy@deploy2002: Installation of scap version "4.65.1" completed for 566 hosts
- 17:28 dancy@deploy2002: Installing scap version "4.65.1" for 566 hosts
- 17:26 dancy@deploy2002: Installing scap version "4.65.1" for 567 hosts
- 14:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1008.eqiad.wmnet with OS bookworm
- 14:58 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1009.eqiad.wmnet with OS bookworm
- 14:44 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki=csbwiktionary --fix # T354114
- 14:43 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
- 14:40 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
- 14:37 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
- 14:34 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
- 14:32 _joe_: confctl select 'name=mw2396.codfw.wmnet' set/pooled=inactive
- 14:26 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1009.eqiad.wmnet with OS bookworm
- 14:20 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1008.eqiad.wmnet with OS bookworm
- 14:16 urbanecm@deploy2002: Finished scap: Backport for gerrit:985384 cswiki: Grant patrolmarks to autopatrolled (T354004), gerrit:986640 csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) (duration: 13m 46s)
- 14:04 urbanecm@deploy2002: urbanecm: Continuing with sync
- 14:04 urbanecm@deploy2002: urbanecm: Backport for gerrit:985384 cswiki: Grant patrolmarks to autopatrolled (T354004), gerrit:986640 csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 14:02 urbanecm@deploy2002: Started scap: Backport for gerrit:985384 cswiki: Grant patrolmarks to autopatrolled (T354004), gerrit:986640 csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114)
- 10:55 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
- 10:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
- 10:38 vgutierrez: fetching haproxy 2.6.16 for thirdparty/haproxy26 bullseye-wikimedia (apt.wm.o)
- 09:23 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
- 09:23 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
- 09:17 pfischer@deploy2002: Finished scap: Backport for gerrit:987028 configure message_key_fields for update_pipeline (duration: 15m 35s)
- 09:05 pfischer@deploy2002: pfischer: Continuing with sync
- 09:04 pfischer@deploy2002: pfischer: Backport for gerrit:987028 configure message_key_fields for update_pipeline synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
- 09:02 moritzm: installing nodejs security updates on bookworm
- 09:02 pfischer@deploy2002: Started scap: Backport for gerrit:987028 configure message_key_fields for update_pipeline
- 08:33 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 08:27 jayme: restart prometheus@k8s prometheus@k8s-aux in eqiad - T343529
- 08:26 akosiaris@cumin1001: START - Cookbook sre.hosts.provision for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
- 06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS bookworm
- 06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
- 06:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
- 06:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS bookworm
- 05:00 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 56m 48s)
- 04:03 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088
2024-01-01
- 21:38 eileen: config revision changed from 026cf508 to 21b91455
- 21:13 eileen: config revision changed from 3a1a1444 to 026cf508
- 21:13 eileen: fork/mapping-edit-button-fix
- 17:11 joal@deploy2002: Finished deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567] (duration: 00m 31s)
- 17:11 joal@deploy2002: Started deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567]