Date    Service Issue    Notes
5/13/2015 VM Server Emergency Security Patch

Services: All departmental virtual machines will be power cycled once the patches for the VENOM vulnerability are in place.

5/9/2015 Astro Email RAID Mounts Emergency Maintenance

Services: All astro email services, including webmail, will be unavailable while corrective action is taken to scrub the RAID pool for astro inboxes (/var/mail). Status shows the scrub should complete around 12:45 PM today.

12:37 All services restored.

 

3/13/2015 EMERGENCY RAID MAINTENANCE

Services: /opt/local, DHCP (laptop connections), Mathematica, IDL, astro Webmail, etc. NOTE: Regular astro mail should continue to work but may be slow.

14:10 All services restored

12:45 Determined that the system hangs were caused by the failure of one of the two mirrored system disks. Paul Morris worked with Oracle Support and removed the failed disk from the mirror, rebooted, ran integrity tests and brought the RAID back online. A replacement disk is being shipped and will be installed next week.

9:25 RAID (cerberus) failed and went off-line. Oracle technical support is working on the issue.

2/25/15 Astro RAID Mounts Emergency Maintenance

Services: All astro services including email, FTP and webmail will be unavailable for a short time starting at 6:45AM. This work is expected to take from 15 to 20 minutes.

6:50 astro and all related services back online

6:45 astro rebooted.

02/03/15 Astro Email Scheduled Maintenance

Services: All astro email services including webmail will be unavailable for a short time starting at 5:30 PM today. This work is expected to take from 15 to 20 minutes.

5:38 all email services are back online, maintenance completed

5:30 astro email services taken off-line

01/07/15

Astro and Galactica Maintenance: Solaris Updates

Email, Webmail, FTP

/sans/* RAID shares

Services: All astro services, including email, webmail and FTP, along with the /sans/* partition shares, will be unavailable for a short time while astro and galactica are taken off-line to apply regular updates.

9:10 FTP is now online; Maintenance completed.

8:23 All services except FTP back online

7:30 astro and galactica taken off-line

12/16/14 Astro Email - Unscheduled Maintenance

Service: All email services including webmail

7:36 all email services restarted and available

7:30 astro email restarted to clear out zombie processes

12/10/14 Services

Service: printers and calendars.

10:25 - System functional again.

9:52 - Disk I/O errors causing system to hang.

12/9/14 Services

Service: printers and calendars.

14:30 - Temporary system in place.

11:10 - System became unresponsive due to a hardware failure.  Migrating system to new hardware.

11/16/14 cerberus RAID

Service: /opt/local, /san and other cerberus RAID mounts

NOTICE: Server is unreachable.

11:31 - Service restored.

11:05 - Server unreachable. RCA in progress.

11/16/14 Astronomy Web Site

Service: All Astronomy web sites hosted on dept web server, affirmed

NOTICE: Sites unreachable. Server load 257.0 (should be in single digits or low double digits)

11:36 - Service restored.

10:32 - System hung after reboot.

10:20 - Preparing to reboot.

09:59 - Access to www.as.utexas.edu times out. Load excessively high.

11/16/14 Web Mail

Service: Astronomy web mail

NOTICE: Server is unreachable.

11:32 - Service restored.

10:25 - Cannot access webmail.as.utexas.edu. RCA in progress.

11/07/14 Astro Email

Service: Astro IMAP and POP email (reading)

NOTICE: Emergency maintenance: SSL 3.0 patch

07:52 - Patch successfully applied; Service restored

07:50 - Inbound and stored mail reading off-lined

11/05/14 Astro Email

Service: All astro email

NOTICE: SSL 3.0 patch failure.

05/09/14 Astro Email Blacklisting

Service: Outbound astro email

NOTICE: The de-blacklisting of astro may take hours to a few days to propagate

22:38 - Outbound email restored

21:55 - astro scheduled for reboot

21:07 - De-blacklisting request made

20:45 - Root cause identified, cleanup initiated

19:15 - Outbound mail service off-lined

03/14/14

 
Astro Server Upgrade

Service: All astro services including email

03:14 - Mail Services resumed. SSH keys have changed. Please report any problems you might have.

07:00 - System taken off-line

01/01/14 DocuShare Server Emergency Maintenance

Service: All DocuShare services

21:49 - UT ISO Notification of ds Bot/Worm compromise

22:00 - Border Quarantine set by Networking

15:16 - Efforts to reestablish services continue

11/27/13 File Server (cerberus) Scheduled Maintenance

Service: Email and all other astro services

07:00 - cerberus taken offline and began updates

08:30 - cerberus hung on last update; Oracle Support contacted

10:00 - Oracle Support engineer finally gets involved

12:45 - Oracle Support engineer still working to resolve upgrade issues

14:30 - Maintenance completed.

11/02/13 Astro Scheduled Maintenance

Service: Email and all other astro services

07:00 - astro taken offline.

09:45 - Solaris and email updates completed.

09:55 - astro is back online.

08/12/13 Astro IMAP Emergency Maintenance

Service: Mail

13:06 - Services resumed.

11:46 - Services being restarted again. Please do not use your email client.

11:30 - IMAP services being restarted due to system load issue.

08/10/13 Solaris OS Scheduled Maintenance

Service: All Suns running Solaris OS

07:00 - Will begin rebooting systems.

08:48 - All workstation systems are back online.

08:15 - astro is back online.

08/10/13 Astro Emergency Maintenance

Service: Email (astro)

07:00 - Begin applying recommended patches; Email and other services unavailable.

08:15 - Email and all other astro services are back online.

07/01/13 Webmail Host Server Upgrade

Service: Webmail

9:15 - Maintenance finished.

9:00 - Server hosting Webmail will be down for a few minutes for a memory upgrade.

05/28/13 Webserver Emergency Maintenance

Service: Webserver

19:59 - Partial functionality returned.

13:30 - Drive reconfiguration failure.

8:50 - Service functionality returned.

7:00 - Hardware failure.

05/09/13 Astro Emergency Maintenance

Service: Mail

18:10 - Mail service operational.

16:29 - Mail is temporarily offline while allocation issues for the service are being resolved.

14:43 - Troubleshooting mail service errors.

05/03/13 Vault Scheduled Maintenance

Service: Vault

8:00 - Server hosting Vault brought down to rebuild new drives.

04/29/13 Astro Emergency Maintenance

Service: Mail

13:55 - Mail service operational.

10:09 - Mail service interruption

04/09/13 Astro Emergency Maintenance

Service: Mail

11:45 - Mail service operational.

11:15 - Mail service interruption