Current SCC System Status
The Research Computing Systems group will update this status when there are changes.
SCC Updates
- A power event occurred at the MGHPCC datacenter overnight due to a chiller issue, resulting in all SCC compute nodes losing power for a period of time. As a result, any jobs running at the time have failed. The cluster is now operating normally; please review your jobs and resubmit if necessary. Questions and requests for assistance can be sent to help@scc.bu.edu. (Sunday, 4/27/25, 8:54 am)
- The following message was sent to all SCC users on 4/16/25. (Wednesday, 4/16/25)
Dear Researcher,
The Shared Computing Cluster (SCC) will be offline Monday, June 2, 7 am to Wednesday, June 4, 9 am while the MGHPCC, which houses the SCC, undergoes scheduled annual maintenance.
RCS will post updates before, during, and after the downtime here.
- Downtime: Monday, June 2, 7 am to Wednesday, June 4, 9 am.
- Systems Impacted: SCC (SCC OnDemand, login nodes, batch nodes, home directories, project disk space), Linux Virtual Lab, and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond June 2 at 6 am. These jobs will remain pending and will run when the system returns to normal operation on June 4 at 9 am.
Please email help@scc.bu.edu if you have any questions or concerns.
Regards,
Research Computing Services
A version of this document is also posted at: http://scv.bu.edu/text-documents/scc-downtime-jun2025.html.
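For jobs that must finish before a queue-draining deadline like the one above, the relevant setting is the hard run-time limit requested at submission. A minimal sketch, assuming the SCC's SGE-style qsub and a hypothetical batch script my_job.sh, of requesting a limit short enough to dispatch and complete before the downtime:
# request a 24-hour hard run-time limit so the scheduler can fit the job in before the shutdown
qsub -l h_rt=24:00:00 my_job.sh
Jobs submitted without a short enough run-time request simply remain pending and run after the system returns to normal operation.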
- SCC OnDemand was inaccessible and the SCC was generally unresponsive from 7:45 am to 9 am on Monday, April 14 due to filesystem issues. Our systems administrators rebooted two fileservers and the system was back to normal operation by 9 am. (Monday, 4/14/25)
- A power event occurred at the MGHPCC datacenter that houses the SCC, causing all compute nodes to restart. Any jobs running at the time have failed. The cluster is now operating normally; please review your jobs and resubmit if necessary. Questions and requests for assistance can be sent to help@scc.bu.edu. (Wednesday, 2/19/25, 11:02 pm)
- The scheduled maintenance was completed successfully and the system was restored to normal operation at 3:15 pm on Sunday, February 17. (Sunday, 2/17/25)
- The following message was sent to all SCC users on 1/24/25. (Friday, 1/24/25)
Dear Researcher,
The Shared Computing Cluster (SCC) will be offline Friday, February 14, 7 pm to Monday, February 17, noon for scheduled filesystem maintenance.
RCS will post updates before, during, and after the downtime here.
- Downtime: Friday, February 14, 7 pm to Monday, February 17, noon
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab, and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond Friday, February 14, 6 pm. These jobs will remain pending and will run when the system returns to normal operation on Monday, February 17 at noon.
Please email help@scc.bu.edu if you have any questions or concerns.
Regards,
Research Computing Services
A version of this document is also posted at: http://scv.bu.edu/text-documents/scc-downtime-feb2025.html.
- Research Computing staff have completed the repair of the SCC filesystem and all services were restored to normal operation at 5:30pm. (Tuesday, 1/14/2025, 5:40 pm)
- The SCC filesystem is having issues and is not responsive. We are in the process of investigating this issue. (Tuesday, 1/14/2025, 2:45 pm)
- The scc1 login node has been returned to normal operation. (Tuesday, 12/3/2024)
- The scc1 login node is currently unavailable. Please use scc2 during this time. (Sunday, 12/1/2024)
- The following message was sent to all SCC users on 11/18/24. (Monday, 11/18/24)
Dear Researcher,
On Sunday, November 24, from 6 am to 7 am, the SCC login nodes (scc-ondemand1, scc-ondemand2, scc1, scc2, geo, and scc4) will undergo maintenance to update system software. These nodes will be rebooted and interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued and running batch jobs will not be affected.
Please email help@scc.bu.edu if you have any questions or concerns.
Regards,
Research Computing Services
- The following message was sent to all SCC users on 10/3/24. (Monday, 10/7/24)
Dear Researcher,
On Sunday, October 13, from 6 am to 7 am, the SCC login nodes (scc-ondemand1, scc-ondemand2, scc1, scc2, geo, and scc4) will undergo maintenance to update system software. These nodes will be rebooted and interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued and running batch jobs will not be affected.
Please email help@scc.bu.edu if you have any questions or concerns.
Regards,
Research Computing Services
- The following message was sent to all SCC users on 8/15/24. (Thursday, 8/15/24)
Dear Researcher,
On Sunday, August 18, from 6 am to 7 am, the SCC login nodes (scc-ondemand1, scc-ondemand2, scc1, scc2, geo, scc4, and scc-globus) will undergo maintenance to update system software. These nodes will be rebooted and interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued and running batch jobs will not be affected.
Please email help@scc.bu.edu if you have any questions or concerns.
Regards,
Research Computing Services
- The Shared Computing Cluster (SCC) will be offline Wednesday, May 22, 2024, 7 am to Friday, May 24, 9 am while the MGHPCC, which houses the SCC, undergoes scheduled annual maintenance.
RCS will post updates before, during, and after the downtime on this page.
- Downtime: Wednesday, May 22, 7 am to Friday, May 24, 9 am.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab, and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond May 22 at 6 am. These jobs will remain pending and will run when the system returns to normal operation on May 24 at 9 am.
Please email help@scc.bu.edu if you have any questions or concerns.
A version of the above announcement was mailed on April 4 to all SCC users and is also posted at: http://scv.bu.edu/text-documents/may22-2024-downtime.html.
- On Sunday, April 21, 6-7 am, the SCC login nodes (scc1, scc2, geo, and scc4) will undergo maintenance to update system software. These nodes will be rebooted and interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued batch jobs, running batch jobs, and scc-ondemand sessions will not be affected.
Please email help@scc.bu.edu if you have any questions or concerns. (Friday, 4/19/2024, 10:00 am)
- Update: At 1:20 pm connectivity to the SCC was restored. As noted below, it appears that only external access to the cluster was impacted during the network outage; all internal functions, including running jobs, queued jobs, and filesystem operation, continued normally. (Thursday, 3/21/2024, 1:23 pm)
- Update: At 12:08 pm the SCC became accessible for a short period but is now inaccessible again. It appears that only external access to the cluster has been impacted during the network outage, all internal functions including running jobs, queued jobs, and filesystem operation continue normally. (Thursday, 3/21/2024, 12:24 pm)
- At approximately 7:30 am this morning, the SCC became unusable due to an unexpected network outage at the MGHPCC where the SCC is housed. All SCC login and compute nodes are affected by this outage. RCS and MGHPCC staff are actively working to restore the network.
We apologize for any inconvenience that this outage is causing.
Please email help@scc.bu.edu if you have any questions or concerns. (Thursday, 3/21/2024, 10:09 am)
- The SCC slowness issue was resolved by 4:00pm. (Tuesday, 1/23/2024, 4:12 pm)
- RCS staff are investigating the slowness of the SCC system. Updates will be posted here. (Tuesday, 1/23/2024, 2:30 pm)
- The Shared Computing Cluster (SCC) will be offline Wednesday, January 3, 2024, 7 am to 5 pm for networking upgrades.
RCS will post updates before, during, and after the downtime on this page.
- Downtime: Wednesday, January 3, 7 am to 5 pm.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab, and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond January 3 at 6 am. These jobs will remain pending and will run when the system returns to normal operation on January 3 at 5 pm.
Please email help@scc.bu.edu if you have any questions or concerns.
A version of the above announcement was mailed on November 30 to all SCC users and is also posted at: http://scv.bu.edu/text-documents/jan3-2024-downtime.html.
- Monday, December 18, 2023, 10:15 am: The SCC has returned to normal operation.
There was an unplanned power outage at the MGHPCC at 6:39 am. All compute nodes lost power. Login nodes and the SCC filesystem were not affected. Any jobs running at 6:39 am were lost and need to be resubmitted. Jobs waiting in the queue were not affected. If you encounter any additional issues, please contact help@scc.bu.edu for further assistance.
- The SCC is now back to normal operation. Apologies for the interruption to your work and thank you for your patience while we restored the system. Please resubmit any jobs that were running at the time of the outage. Any jobs waiting in the queue at the time of the outage will run when it is their turn. (2:20 pm, 11/18/2023)
There was an unplanned power outage at the MGHPCC at 12:15 pm. All compute nodes lost power, and we are working to restore access now. Login nodes and the SCC filesystem were not affected. All jobs running at 12:15 pm were lost and need to be resubmitted. Jobs waiting in the queue were not affected. If you encounter any additional issues, please contact help@scc.bu.edu for further assistance.
Thank you for your patience and understanding as we work to restore services.
Research Computing Services (Saturday, 11/18/2023, 1:50 pm)
- The following message was sent out to all SCC users at 2:02 pm on Sunday, October 15, 2023.
Subject: SCC is Now Available / Data Center Power Restored
Thank you for all your patience as we worked to restore SCC services. Power at the MGHPCC data center has been restored, and the SCC cluster is now available for use. Due to the extensive power outage, all of the batch jobs that were running when the power went out have failed. Queued batch jobs that had not yet started running remain and will start as resources are available. If you encounter any additional issues, please contact help@scc.bu.edu for further assistance.
Research Computing Services
- Power to the SCC has been restored. Network services have been verified as restored. The SCC filesystem is operational. Most equipment is powered up, but the batch system remains disabled. We will update this page and send an email announcement once services are opened up. (Sunday, 10/15/23, 1:17 pm)
- Power has been restored to all racks in the data center. Our staff are able to reach core SCC gear and are working on assessing and bringing up the cluster. We will send a general email again once systems are back up. Next update: 1:00 pm. (Sunday, 10/15/23, 11:44 am)
- The following message was sent out to all SCC users at 9:05 pm on Saturday, October 14, 2023.
Subject: SCC Not Available Due to Data Center Power Outage
Due to a major data center power outage at the MGHPCC in Holyoke, the Shared Computing Cluster is currently unavailable. This affects the SCC cluster and filesystem, including SCC login nodes, SCC OnDemand, and class.bu.edu. We will work to restore SCC services once the data center has restored power and confirmed stable operation. We estimate that full restoration of services will not occur until tomorrow (Sunday). Updates will be posted on the IS&T Techweb site and on the SCC Status page.
Thank you for your patience and understanding as we work to restore services.
Research Computing Services
- Power at the MGHPCC, where the SCC is housed, went out at 4:33 pm today. The SCC is in an inoperable state at this time. People at the MGHPCC are working to restore the power. We will post updates here. (Saturday, 10/14/23, 6:12 pm)
- SCC OS Upgrade to Alma8
The SCC operating system is being upgraded from the current CentOS 7 to the AlmaLinux 8 (Alma8) operating system on Tuesday, August 1, 2023. Schedule details are described below.
Research Computing Services staff are actively testing software to minimize any disruption to your research activities. Most people should not experience problems or inconvenience. Nevertheless, you should start testing your work soon, especially if you use software packages that you or RCS staff installed in your directories.
- Testing: We have set up an Alma8 testing environment which includes a temporary login node, scc6.bu.edu, and a few compute nodes. Over the next month, we will upgrade more batch nodes to Alma8. To view the list of nodes that have already been upgraded, type the command:
qhost -l alma8
- Web Page: The Alma8 Transition page contains information to help you transition to Alma8, such as testing tips, compiling, and Buy-in node considerations.
- Alma8 Modules Status: Please see the Alma8 Modules Status table for a comprehensive list of modules for CentOS 7 and what their status is for Alma8. If you have questions about specific modules, please let us know.
- Anaconda to be retired under Alma8: The old Anaconda software will not run under Alma8. There are current Python alternatives which you should use instead. Please see the Anaconda 2/3 Module Phaseout page for details and instructions.
- Rolling Upgrade Timing: At 6:00 am on August 1, the login nodes scc1.bu.edu and scc4.bu.edu will be upgraded. All queues running CentOS 7 will be disabled. They will be restarted running Alma8. On Wednesday, August 2 at 6:00 am, scc2.bu.edu and geo.bu.edu will be upgraded. Jobs that run before August 1 at 6:00 am will run under CentOS 7. Jobs that run when the queues are restarted with Alma8 will run under Alma8.
- Note that the data in /scratch on the batch nodes will not be preserved. If you need that data, you should move it elsewhere.
Please email help@scc.bu.edu if you have questions or concerns.
Regards,
Research Computing Services
The above announcement was mailed on June 28, 2023 to all SCC users and is also posted at: https://scv.bu.edu/text-documents/alma8-transition.html.
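The qhost command shown earlier lists the batch nodes that already advertise the alma8 resource. If the same resource tag is also accepted at job submission time (an assumption to verify on the Alma8 Transition page, not something stated in the announcement above), a test job could be steered to an upgraded node with something like:
# hypothetical test submission; assumes the alma8 tag is requestable at submission time
qsub -l alma8 test_job.sh
Here test_job.sh is a placeholder batch script used only for illustration.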
- SCC Annual Downtime and OS Upgrade to Alma8
Dear Researcher,
We are writing to let you know about two important upcoming events which impact the Shared Computing Cluster (SCC) user community:
- Downtime: Monday, June 5, 2023, 6 am to Wednesday, June 7, 2023, 9 am
- OS upgrade to Alma8: July/August 2023 – exact dates TBA
Scheduled Annual Downtime
The SCC will be offline from Monday, June 5, 6 am to Wednesday, June 7, 9 am while the MGHPCC, which houses the SCC, undergoes scheduled annual maintenance.
- Downtime: Monday, June 5, 6 am to Wednesday, June 7, 9 am.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab, and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond June 5 at 5 am. These jobs will remain pending and will run when the system returns to normal operation on June 7 at 9 am.
- Updates: Posted on the SCC Status Page at: www.bu.edu/tech/research/sccupdates/
OS Upgrade to Alma8
The SCC operating system is being upgraded from the current CentOS 7 to the AlmaLinux 8 (Alma8) operating system in July/August 2023. Exact dates TBA.
Research Computing Services staff are actively testing software to minimize any disruption to your research activities. Most people should not experience problems or inconvenience. Nevertheless, you should start testing your work soon, especially if you use software packages that you or RCS staff installed in your directories.
- Testing: We have set up an Alma8 testing environment which includes a temporary login node, scc6.bu.edu, and a few compute nodes.
- Web Page: The Alma8 Transition page contains information to help you transition to Alma8, such as testing tips, compiling, and Buy-in node considerations.
- Alma8 Modules Status: Please see the Alma8 Modules Status table for a comprehensive list of modules for CentOS 7 and what their status is for Alma8. If you have questions about specific modules, please let us know.
Please email help@scc.bu.edu if you have questions or concerns about the Annual Downtime or the OS upgrade to Alma8.
The above announcement was mailed on April 25 to all SCC users and is also posted at: https://scv.bu.edu/text-documents/downtime-alma8-summer23.html.
- The SCC maintenance was successfully concluded and the system restored to normal operation. (Tuesday, 3/7/23, 5:45 pm)
- The Shared Computing Cluster (SCC) will be offline Tuesday, March 7, 2023, 9 am to 5 pm for network maintenance. RCS will post updates before, during, and after the downtime here.
- Downtime: Tuesday, March 7, 9 am to 5 pm.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab, and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond March 7 at 8 am. These jobs will remain pending and will run when the system returns to normal operation on March 7 at 5 pm.
This announcement is also posted at: https://scv.bu.edu/text-documents/scc-downtime-mar2023.html.
- The RCS staff is out the week of December 26 – 30 due to BU’s Intersession. Please don’t expect replies to questions until we return on Tuesday, January 3, 2023. We will however step in if there are any major issues with the SCC. Happy Holidays! (Tuesday, 12/27/22, 11:00 am)
- On Thursday, October 20, from 7 am to 10 am there was a networking issue causing the SCC to be inaccessible. That issue has now been resolved. (Thursday, 10/20/22, 10:00 am)
- The Shared Computing Cluster (SCC) will be offline Saturday, August 13, 2022, 6 am to 5 pm for network maintenance. RCS will post updates before, during, and after the downtime here.
- Downtime: Saturday, August 13, 6 am to 5 pm.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab (scc-lite.bu.edu), and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond August 13 at 5 am. These jobs will remain pending and will run when the system returns to normal operation on August 13 at 5 pm.
This announcement is also posted at: https://scv.bu.edu/text-documents/scc-downtime-aug2022.html.
- The scheduled annual maintenance at the MGHPCC has been successfully completed ahead of schedule and the SCC (login and batch nodes, home directories and project disk space) has returned to normal operation. Queued batch jobs are being dispatched. SCC OnDemand (scc-ondemand.bu.edu), the Linux Virtual Lab (scc-lite.bu.edu), class.bu.edu, and ATLAS are also back in operation. (Wednesday, 5/25/22, 12:30 am)
- Reminder: Starts Monday
SCC/MGHPCC Scheduled Annual Downtime
Monday, May 23, 6 am to Wednesday, May 25, 9 am. See the full announcement below for details. Please email help@scc.bu.edu with any questions or concerns.
(Friday, 5/20/22, 5:00 pm)
- The cooling issue at the MGHPCC has been resolved and nearly the whole SCC is now operating normally. Nodes had shut down to avoid damage after the cooling failed, and jobs that were running on them will need to be resubmitted. (Tuesday, April 12, 2022, 5:55 pm)
A cooling-related outage at the MGHPCC has brought down many SCC compute nodes. The issue is under investigation and we will restore systems as soon as possible. (Tuesday, 4/12/22, 3:30 pm)
- The network connectivity issue has been resolved and the SCC is now back to operating normally. (Wednesday, 3/16/22, 11:53 am)
There is a network connectivity issue to the SCC. You are currently unable to connect to the SCC over SSH or OnDemand, but as far as we know, jobs on the SCC continue to run normally. We will post updates here as we have them. (Wednesday, 3/16/22, 11:42 am)
- The Shared Computing Cluster (SCC) will be offline from Monday, May 23, 2022, 6 am to Wednesday, May 25, 2022, 9 am while the MGHPCC, which houses the SCC, undergoes scheduled annual maintenance. RCS will post updates before, during, and after the downtime here.
- Downtime: Monday, May 23, 6 am to Wednesday, May 25, 9 am.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab (scc-lite.bu.edu), class.bu.edu, and ATLAS.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond May 23 at 5 am. These jobs will remain pending and will run when the system returns to normal operation on May 25 at 9 am.
Please email help@scc.bu.edu if you have any questions or concerns.
This announcement is also posted at: https://scv.bu.edu/text-documents/mghpcc_outage_2022.html.
- MGHPCC unplanned power outage: Last night, starting at 5:29 pm, power sags caused 286 SCC nodes and 108 Atlas nodes to reboot. Any jobs running on those nodes need to be resubmitted. Jobs in the queue were not affected. All nodes are now running again. (Friday, 8/13/21)
- The maintenance was successfully completed and the SCC returned to normal operation at 11:30 pm on Tuesday, August 10, 2021. (Wednesday, 8/11/21, 8:00 am)
- The Shared Computing Cluster (SCC) will be offline from Monday, August 9, 2021, 6 am to Wednesday, August 11, 2021, 9 am while the MGHPCC, which houses the SCC, undergoes scheduled annual maintenance. Note that this downtime occurs during the last week of Summer Session II and may be disruptive to courses finishing assignments and final projects. Please schedule lectures and assignments accordingly. RCS will post updates before, during, and after the downtime here.
- Downtime: Monday, August 9, 6 am to Wednesday, August 11, 9 am.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab (scc-lite.bu.edu), class.bu.edu, and ATLAS.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond August 9 at 5 am. These jobs will remain pending and will run when the system returns to normal operation on August 11 at 9 am.
Please email help@scc.bu.edu if you have any questions or concerns.
This announcement is also posted at: https://scv.bu.edu/text-documents/mghpcc_outage_2021.html.
- The maintenance was successfully completed and the SCC returned to normal operation at 1 pm on Saturday, May 15, 2021. (Monday, 4/5/21, 1 pm)
Announcement:
The Shared Computing Cluster (SCC) will be offline Saturday, May 15, 2021, 7 am to 5 pm while the SCC undergoes system maintenance. Updates will be posted here. Details:
- Downtime: Saturday, May 15, 2021, 7 am to 5 pm
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab (scc-lite.bu.edu), and class.bu.edu.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond May 15 at 6 am. These jobs will remain pending and will run when the system returns to normal operation later that day at 5 pm.
Please email help@scc.bu.edu if you have any questions or concerns.
- Power has been restored to the MGHPCC and the compute nodes are back up. Jobs are running. (Monday, March 29, 2021, 1:00 pm)
There was an unexpected power outage at the MGHPCC. All compute nodes are down. Login nodes and storage are covered by UPS and are fine. RCS and MGHPCC staff are investigating. We will post updates on this page. (Monday, 3/29/21, 12:00 pm)
- As of March 15, 2021, Duo two-factor authentication is required to access the SCC via ssh/scp/sftp. Please consult the linked web page for more information. One note for MobaXterm users: if you are directly connecting to the SCC using the MobaXterm Terminal, you should switch to using the ‘Sessions’ feature as described on our Get Started – Connect (SSH) web page. This avoids some issues related to Duo TFA and is generally easier. (Monday, 3/15/21, 4:30 pm)
- The Shared Computing Cluster (SCC) has been restored to normal operation. However, some batch jobs that were running when the issue occurred may need to be resubmitted. No data was lost. (Friday, February 12, 2021, 10:30 pm)
Dell attempted a hardware repair on elements of the SCC and now parts of the filesystem are not accessible. Dell is working on a plan to get things back up and running again, but it will take some time. Our systems administrators are also working on the problem. (Friday, 2/12/21, 5:00 pm)
- MGHPCC unplanned power outage early this morning.
There was a power outage early this morning that took down all SCC compute nodes and all of the Atlas cluster. All running jobs need to be resubmitted. Jobs in the queue were not affected.
(Tuesday, 1/12/21)
- SCC Login Nodes Maintenance – Sunday, January 10, 6-7 am
On Sunday, January 10, starting at 6 am, the SCC login nodes (scc1, scc2, geo, scc4, scc-lite, and class.bu.edu) will undergo maintenance to update system software. Non-interactive batch jobs will continue to run during this process. Interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued batch jobs and scc-ondemand sessions will not be affected. Although the login nodes might be usable intermittently during the maintenance period, please don’t try to use them until after 7 am; use scc-ondemand instead. This notice and progress updates will be posted on the SCC Status page (this page) at https://www.bu.edu/tech/research/sccupdates/. Please email help@scc.bu.edu if you have any questions or concerns.
(Friday, 1/8/21)
- The SCC system has returned to normal operation.
There was an unplanned power interruption at the MGHPCC at 12:30 am. All compute nodes were down. All jobs running at 12:30 am were lost and need to be resubmitted. Queued jobs were not lost. (Wednesday, 11/4/20, 8:00 am)
- The geo login node is now available. The problem was a network issue with that node and has been resolved. (Friday, 10/23/20, 3:30 pm)
- The geo login node is currently not accessible and is being actively investigated. Please use scc1.bu.edu, scc2.bu.edu, or http://scc-ondemand.bu.edu/ instead. We will post an update here when the problem is resolved. (Thursday, 10/22/20, 4:20 pm)
- The MGHPCC annual maintenance has been successfully completed and the SCC is back to normal operation. (Tuesday, 10/20/20, 11:00 pm)
- The Shared Computing Cluster (SCC) will be offline from Monday, October 19, 2020, 9 am to Wednesday, October 21, 2020, 9 am while the MGHPCC, which houses the SCC, undergoes scheduled annual maintenance. Please note that this annual maintenance is off-cycle because it was impossible to schedule in the spring due to COVID-19. This mid-semester downtime may be disruptive to the courses that use the SCC for their coursework. Please schedule lectures and assignments accordingly. RCS will post updates before, during, and after the downtime here.
- Downtime: Monday, October 19, 9 am to Wednesday, October 21, 9 am.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), SCC OnDemand (scc-ondemand.bu.edu), Linux Virtual Lab (scc-lite.bu.edu), class.bu.edu, and ATLAS.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond October 19 at 7 am. These jobs will remain pending and will run when the system returns to normal operation on October 21 at 9 am.
Please email help@scc.bu.edu if you have any questions or concerns.
- The system has been returned to normal performance. (Thursday, 7/16/20, 1:30 pm)
- Filesystem slowness is being investigated. We will return to normal operation as soon as possible. (Thursday, 7/16/20, 11:30 am)
- The networking issue has been resolved. Connections to the SCC services from off-campus are functioning normally. (Wednesday, 7/15/20, 9:05 am)
- Currently, all SCC services are inaccessible from off campus. This includes ssh to the login nodes and OnDemand. Networking is investigating the problem. Until it is fixed, you should use the VPN, or ssh to scc-lite.bu.edu and then ssh on to the SCC. The SCC system itself is operating normally. (Wednesday, 7/15/20, 6:30 am)
- Using the VPN is no longer required to access SCC Open OnDemand. (Sunday, 3/22/20, 6:00 am)
- SCC Login Nodes Maintenance – Sunday, March 8, 2020, 6-8 am
On Sunday, March 8, starting at 6 am, the SCC login nodes (scc1, scc2, geo, scc4, scc-globus, scc-lite, and class.bu.edu) will undergo maintenance to update system software. Non-interactive batch jobs will continue to run during this process. Interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued batch jobs and scc-ondemand sessions will not be affected. Although the login nodes might be usable intermittently during the maintenance period, please don’t try to use them until after 8 am; use scc-ondemand instead. This was posted on March 5, 2020 at 11:00 am and emailed to all current SCC users.
- Duo required for SCC OnDemand as of March 10, 2020
As of Tuesday, March 10 at 6 am, Duo two-factor authentication will be required to access SCC OnDemand in addition to the Kerberos password authentication already required. The VPN will continue to be required for access from off campus for the immediate future. Note that implementing Duo is the first step in allowing the removal of the VPN requirement in approximately one month. You may have already registered with Duo and be familiar with using two-factor authentication. This is the same authentication mechanism used by BUWorks and the StudentLink. If you have previously used Duo to authenticate to either of those services, all of your registered devices will work with SCC OnDemand. If you have not registered a device for Duo authentication, please go to https://www.bu.edu/tech/support/duo/enroll-device/ to register and set up one or more methods of authenticating. Options include your cell phone, office phone, tablet, and others. If you need assistance registering for or using Duo, please send email to ithelp@bu.edu. If you have any questions about this, please send email to help@scc.bu.edu.
This was posted on March 4, 2020 and emailed to all current SCC OnDemand users.
- Two updates to the SCC went into effect on January 2, 2020. To streamline application usage, we removed a subset of system default applications from the default path. This change only affected older applications that were previously available without the use of the module system. Researchers who require these application versions will now only be able to use them through the module system. If you are already using the module system for these applications, this change will not affect you.
| Application | System Default Version | As a module | Alternative (newer) Versions Available |
| --- | --- | --- | --- |
| charmm | 32b1 | charmm/32b1 | charmm/41.0 |
| gauss | 13.1.1 | gauss/13.1.1 | None |
| gaussian | 09 | None* | gaussian/0.9D, gaussian/16.A.03, gaussian/16.B.01 |
| idl | 8.2 | None* | idl/8.7.1, idl/8.7.2 |
| maple | 16.0 | maple/16.0 | maple/2018 |
| mathematica | 9.0 | mathematica/9.0 | mathematica/11.3.0, mathematica/12.0.0 |
| matlab | 2013a | matlab/2013a | matlab/2011b, matlab/2013b, matlab/2016a, matlab/2017b, matlab/2018b, matlab/2019b |
| R | 2.15.3 | None* | R/3.0.0, R/3.5.1, R/3.6.0 |
| sas | 9.3 | sas/9.3 | sas/9.4 |
| stata | 12 | stata/12 | stata/15, stata/16 |
| vmd | 1.9.1 | vmd/1.9.1 | vmd/1.9.3 |
* Some programs are out of date, unused, or no longer function and will not become modules. Researchers should use a newer version from the module system.
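As an illustration of the change described above, the older defaults are now loaded through the module system rather than picked up from the default path. Using matlab from the table (any listed package works the same way; the version shown is taken from the table, not a recommendation):
# list the MATLAB versions installed as modules
module avail matlab
# load one of the newer versions in place of the old 2013a default
module load matlab/2019b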
To enable additional features in the module system, we also upgraded the underlying module system software. All existing features and functionality remain and the upgrade does not require any changes to the way researchers interact with the modules installed on the SCC. The upgrade increases the speed, enables additional search functions, and adds new capability to individual modules in addition to providing an updated appearance. With this upgrade, we are able to provide greater flexibility and capability of software on the SCC. The Modules help page has been updated to reflect these changes.
Please email help@scc.bu.edu if you have any questions or concerns. (Thursday, 1/2/20, 3:50 pm)
- We have discovered that there was a bug with the script that sent the emails about the incident yesterday, Monday, September 23, and those people with BU login names beginning with letters towards the end of the alphabet (generally N-Z) did not get either of our emails. Our apologies for this. (Tuesday, 9/24/19, 2:55 pm)
- Update at 11:00 AM, Tuesday, September 24. The SCC was returned to operation at 11:45 PM last night, but we waited until now to confirm everything was resolved. We just sent an email message to all SCC users saying the following:
The Shared Computing Cluster (SCC) has been restored to normal operation. An issue with the system that supports filesystem quotas was preventing cluster operations. This issue has been resolved and no files have been affected. However, batch jobs that were running when the issue occurred at 10:15 AM on Monday, September 23, will very likely have failed and need to be resubmitted.
Update at 5:30 PM. The SCC remains inaccessible and we continue actively working with the vendor to resolve this issue. We will post updates here when we have further information. We will send a follow-up email to all SCC users when the issue is resolved.
Update at 12:00 PM. We sent a message to all SCC users notifying them of the issue. We will continue to post updates here and will send a follow-up email to all SCC users when the issue is resolved.
Primary content of the email message:
The Shared Computing Cluster (SCC) is completely inaccessible and has been since 10:15 am today. The entire system is unusable. RCS staff are working to resolve the issue and have called in the vendor for additional help. We do not currently have an estimate for when the cluster will return to operation.
This issue affects the following services: SCC (login and batch nodes, home directories, project disk space, /STASH), Linux Virtual Lab (scc-lite.bu.edu), class.bu.edu, and RCS examples web pages.
We apologize for any inconvenience that this issue causes and appreciate your patience as we work hard to resolve it.
If you have questions please send email to help@scc.bu.edu. (Tuesday, 9/24/19, 12:00 pm)
- The SCC filesystem is currently inaccessible. We will post updates when we have additional information. The vendor has been called in to help with correcting the issue. (Monday, 9/23/19, 10:30 am)
- The MGHPCC data center experienced an unexpected power outage from 6:32 pm – 7:15 pm. The SCC login nodes and filesystem were not affected. However, all SCC compute nodes and batch jobs running during this time were affected by this outage. Queued batch jobs are now being dispatched as we continue the process of bringing the remaining compute nodes back online. (Friday, 9/13/19)
- Recent filesystem slowness experienced by compute nodes was traced to a defective networking device. The device was replaced and we believe the issue has been resolved. Performance on the compute nodes and the scc1 login node has returned to normal. (Friday, 8/30/19, 11:00 am)
- The scc1.bu.edu login node is having intermittent problems which are currently being investigated. (Monday, 8/12/19, 11:45 am)
- The SCC has returned to normal operation now that the scheduled maintenance has been successfully completed. (Sunday, 8/11/19, 11:45 am)
- The SCC has returned to normal operation following the conclusion of the scheduled maintenance. (Sunday, 8/11/19, 11:30 am)
- The SCC batch system is down for scheduled maintenance (see below for details). It will resume operation by noon on Sunday but may be available earlier. When it is available, this page will be updated. The login nodes and filesystem will be intermittently available during the downtime, but you should not do anything that you cannot afford to have interrupted. (Saturday, 8/10/19, 5:00 pm)
- The following message was sent to all SCC researchers on July 11, 2019:
The Shared Computing Cluster (SCC) will be offline August 10-11 while BU’s data center, which provides essential services to the SCC, undergoes scheduled upgrades. A University-wide announcement is forthcoming. When possible, RCS will post SCC-specific updates before, during, and after the downtime here on the SCC Status Page at https://www.bu.edu/tech/support/research/whats-happening/updates/. Please expect interruptions to the availability of this web page during the downtime.
- Downtime: Saturday, August 10 at 5 pm to Sunday, August 11 at noon.
- Services Impacted: SCC (login and batch nodes, home directories, project disk space, /STASH), Linux Virtual Lab (scc-lite.bu.edu), class.bu.edu, RCS examples web pages, and SCC Account Management web forms.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond August 10 at 4 pm. These jobs will remain pending and will run when the system returns to normal operation on August 11.
Please email help@scc.bu.edu if you have any questions or concerns.
Regards,
Research Computing Services staff
(Tuesday, 8/6/19)
- During an electrical storm in Holyoke on Saturday, July 6, at 5:00 PM, there was a brief power interruption to the SCC. As a result, 401 SCC nodes rebooted and the jobs running on those nodes were disrupted and will need to be resubmitted. The login nodes and file system were not affected. (Saturday, 7/6/19, 5:30 pm)
- The SCC and related services have been returned to normal operation ahead of schedule, following the successful completion of the scheduled MGHPCC annual maintenance. (Wednesday, 6/12/19, 2:00 am)
- The Shared Computing Cluster (SCC) is offline for the scheduled annual maintenance at the MGHPCC. We expect the SCC to return to normal operation by noon on Wednesday, June 12. We will post any updates here at https://www.bu.edu/tech/research/sccupdates/ . (Monday, 6/10/19, 9:00 am)
- Letter sent to all SCC users on June 3, 2019 at 2 pm.
The Shared Computing Cluster (SCC) will be offline June 10-12 while the MGHPCC, which houses the SCC, undergoes scheduled annual maintenance. During this time, the SCC operating system will be upgraded to CentOS 7. RCS will hold Walk-in Consulting Hours to assist with migrating your workflow to CentOS 7.
- Downtime: Monday, June 10 at 9 am to Wednesday, June 12 at noon.
- Systems Impacted: SCC (login and batch nodes, home directories and project disk space), Linux Virtual Lab (scc-lite.bu.edu), class.bu.edu, and ATLAS.
- Queue Draining: The SCC scheduler will not dispatch jobs that have specified a runtime that extends beyond June 10 at 9 am. These jobs will remain pending and will run under CentOS 7 when the system returns to normal operation on June 12 at noon.
Upgrade to CentOS 7: Please see the CentOS 7 Transition pages for detailed instructions on what you need to do to make sure your jobs run correctly after the upgrade.
Walk-in Consulting Hours: Wednesday-Friday, June 12-14, 10 am – 5 pm
Locations:
CRC, RCS, 2 Cummington Mall, rm 107
CRC, Kilachand Center, 610 Commonwealth Ave, rm 901A
CRC, Earth & Environment, 725 Commonwealth Ave, rm 334G
BUMC, Crosstown Center, 801 Massachusetts Ave, rm 485
BUMC, Talbot Building, 715 Albany St, rm 302C (Thursday, June 13, 2-4 pm only)
RCS will post updates before, during, and after the downtime here. Please email help@scc.bu.edu if you have any questions or concerns.
- SCC Login Nodes Reboot – Sunday, November 25 at 6:00AM
On Sunday, November 25, at 6:00 AM, the SCC login nodes (scc1, scc2, geo, scc4) will be rebooted in order to install a security update. Non-interactive batch jobs will continue to run during this process. However, interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued batch jobs will not be affected. The reboot will take about 20 minutes. Progress updates will be posted here. Please email help@scc.bu.edu if you have any questions or concerns. -The Research Computing Services staff (Thursday, 11/15/18, 11:00 am)
The reboot happened with no issues on Sunday morning at 6:00 am. (Monday, 11/26/18, 11:15 am)
- The filesystem issue has been resolved and the SCC is back to normal operation. Be aware that code running during the problem period may have experienced issues. (Wednesday, 9/26/18, 11:30 am)
The /usr/local filesystem is currently inaccessible. As such, most commands (including qsub, qstat, and module) cannot be run. We are currently working to fix this problem and will update this page as we get more information. (Wednesday, 9/26/18, 11:00 am)
- At 2:35 PM on Tuesday, July 17, due to a lightning storm in Holyoke, there was a brief power interruption at the SCC. 454 SCC nodes rebooted as a result and the jobs running on those nodes were disrupted and will need to be resubmitted. The login nodes and file system were not affected. This interruption was not related to the significant power event last week. (Wednesday, 7/18/18, 11:30 am)
- The SCC was restored to normal operation at 2:10 PM on July 9.
Batch jobs that were running when the outage occurred will need to be resubmitted. Batch jobs already in the queue, but not yet running, were not affected and will run as expected. The file system was not affected by this outage. (Monday, 7/9/18, 2:10 pm)
- Due to a power event affecting the entire MGHPCC building, the SCC is unavailable at the moment. We will post updates here as they become available. (Monday, 7/9/18, 11:45 am)
- The SCC and related services have been returned to normal operation ahead of schedule, following the completion of the scheduled MGHPCC annual maintenance. (Tuesday, 5/22/18, 8:30 pm)
- Shared Computing Cluster/MGHPCC Outage May 21-23, 2018
In order to perform the annual scheduled maintenance at the MGHPCC, there will be a 48-hour full power outage at the site starting at 12 noon on Monday, May 21 and ending at 12 noon on Wednesday, May 23. This outage impacts the Shared Computing Cluster (SCC) and ATLAS, as well as the Linux Virtual Lab (scc-lite.bu.edu) and class.bu.edu, which use filesystems located at the MGHPCC. All SCC login and compute nodes, home directories, and Project Disk Space will be unavailable during the outage. The planned outage is scheduled to last until Wednesday at 12 noon; we will let you know here if the work is completed early and the systems are available for use sooner. You may continue to submit jobs until the downtime. However, on Friday, April 20 we will start a process of draining the batch queues on the SCC to prevent long-running jobs from starting. Any jobs that would not complete prior to the shutdown will remain pending until the computer systems are returned to normal operations. This process will continue until Monday, May 21, 12 noon, at which time we will shut down all SCC computer systems, including all of the login nodes.
If you would like to run and complete jobs on the batch system over the next four weeks before the downtime, you should adjust your hard run-time limits accordingly. For more details see: https://www.bu.edu/tech/support/research/rcs-archive/system-usage-old/running-jobs/submitting-jobs/#job-resources. We anticipate having all systems returned to production status by 12 noon on Wednesday, May 23, at which time pending jobs will begin to be processed. We will post updates before, during, and after the downtime here. Please email help@scc.bu.edu if you have any additional questions or concerns. (Monday, 4/2/18, 11:30 am)
- On Wednesday morning, March 28, the SCC was behaving very slowly due to user activity. This issue has now been resolved. (Wednesday, 3/28/18, 10:20 am)
- At 1:57 PM on Friday, March 2, there was a 75 millisecond power sag at the MGHPCC that affected some of the SCC compute nodes. Some jobs belonging to a small number of people were disrupted as a result of this issue. We regret any inconvenience this may have caused. (Tuesday, 3/6/18, 1:30 pm)
- The SCC was operating very slowly on late Wednesday morning on February 7. This problem has now been resolved. (Wednesday, 2/7/18, 11:17 am)
- We believe the system performance issues have stabilized, and continue to monitor the situation. (Tuesday, 12/12/17, 3:00 pm)
- The following message was sent out to all SCC users at 5:45 PM on Thursday, December 7, 2017:
Dear Colleague,
Over the past few days you may have noticed performance lags on the Shared Computing Cluster. We are writing to let you know that we are aware of these issues and are continuing to diagnose the problem. The performance lags are complex and intermittent and are proving difficult to diagnose. You may continue to experience periods of degraded performance on the cluster until we have fully resolved the issue. Further updates on the status of the SCC will be posted to this page. We apologize for any inconvenience that these issues may have caused you during the past few days and appreciate your continued patience as we work hard to resolve them. If you have further questions please send email to help@scc.bu.edu.
Regards,
The Research Computing Services staff (Thursday, 12/7/17, 5:45 pm)
- The SCC is experiencing an intermittent problem. We are investigating it and will fix it as soon as possible. (Tuesday, 12/5/17, 1:00 pm)
- The SCC is experiencing an intermittent problem affecting the login nodes. The batch system appears to be operating normally. Our systems administrators are working on the issue and we will post another message when it has been corrected. (Friday, 9/29/17, 10:50 am)
This was corrected by 11:20 AM. (Friday, 9/29/17, 11:20 am)
- On Sunday, September 3, at 6:00 AM, the SCC login nodes (scc1, scc2, geo, scc4, and scc-lite) will be rebooted in order to install a security update. Non-interactive batch jobs will continue to run during this process. However, interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued batch jobs will not be affected. The reboot will take about 20 minutes. (Monday, 8/28/17, 1:00 pm)
- On Sunday, July 2, at 6:00 AM, the SCC login nodes (scc1, scc2, geo, scc4, and scc-lite) will be rebooted in order to install a security update. Non-interactive batch jobs will continue to run during this process. However, interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued batch jobs will not be affected. The reboot will take about 20 minutes. (Wednesday, 6/28/17, 1:00 pm)
- The Shared Computing Cluster (SCC), as well as ATLAS, the Linux Virtual Lab (scc-lite.bu.edu), and class.bu.edu have been restored to normal operation ahead of schedule. (Tuesday, 5/23/17, 10:20 pm)
- As scheduled, the SCC is now down for maintenance through 9:00 am on Wednesday, May 24th, although it is possible it will come up earlier than that. During the downtime, although project maintenance web forms can still be submitted (to add users, request additional SUs, etc.), these forms will not be processed until the SCC is up on Wednesday, so you will not see any changes until then. (Monday, 5/22/17, 12:30 pm)
- There will be an outage at the SCC starting on May 22. Details are below. (Thursday, 4/20/17, 3:30 pm)
Shared Computing Cluster/MGHPCC Outage May 22-24, 2017
In order to perform the annual scheduled maintenance at the MGHPCC, there will be a full power outage at the site starting at 9:00 am on Monday, May 22 and ending at 9:00 am on Wednesday, May 24th. This outage impacts the Shared Computing Cluster (SCC) and ATLAS, as well as the Linux Virtual Lab (scc-lite.bu.edu) and class.bu.edu, which use filesystems located at the MGHPCC. All SCC login and compute nodes, home directories, and Project Disk Space will be unavailable during the outage. While the planned outage is scheduled to last until Wednesday at 9:00 am, we will let you know at https://www.bu.edu/tech/support/research/whats-happening/updates/ if the work is completed early and the systems are available for use sooner. You may continue to submit jobs until the downtime. However, on Friday, April 21 we will start a process of draining the batch queues on the SCC to prevent long-running jobs from starting. Any jobs that would not complete prior to the shutdown will remain pending until the computer systems are returned to normal operations. This process will continue until Monday, May 22, 9:00 am, at which time we will shut down all SCC computer systems, including all of the login nodes. If you would like to run and complete jobs on the batch system over the next four weeks before the downtime, you should adjust your hard run-time limits accordingly. For more details see: https://www.bu.edu/tech/support/research/rcs-archive/system-usage-old/running-jobs/submitting-jobs/#job-resources.
We anticipate having all systems returned to production status by 9:00 am on Wednesday, May 24, at which time pending jobs will begin to be processed. During the downtime, the SCC login nodes (scc1, scc2, geo, and scc4) are also going to be replaced with newer machines with more processors and additional memory. This change will likely not have any noticeable effect on most people; the names of the login nodes remain unchanged. We will post updates before, during, and after the downtime on the SCC status page at https://www.bu.edu/tech/research/sccupdates/. Please email help@scc.bu.edu if you have any additional questions or concerns.
- From approximately noon on Tuesday, April 25 through 2:20 pm on Wednesday, April 26, there was an error in reporting project balances on the RCS Project Management web site for LPIs and Administrative Contacts. This has now been corrected and only affected the web site report. (Wednesday, 4/26/17, 3:30 pm)
- The SCC has been significantly expanded and enhanced, with 86 new nodes added, including state-of-the-art GPUs, larger-memory nodes, and faster-fabric MPI nodes. The announcement letter was sent to all SCC researchers on April 10, 2017. (Monday, 4/10/17, 3:00 pm)
- The SCC login nodes reboot detailed just below was accomplished successfully on Sunday, January 22. (Sunday, 1/22/17, 6:09 am)
- Message sent to all SCC researchers on January 18, 2017 at 4:30 PM:
SCC Login Node Reboot – Sunday, January 22 at 6:00 AM
On Sunday, January 22, at 6:00 AM, the SCC login nodes (scc1, scc2, geo, and scc4) will be rebooted in order to install a security update. Non-interactive batch jobs will continue to run during this process, but interactive logins, VNC sessions, and interactive batch sessions (qrsh, qlogin, qsh) will be terminated. Queued batch jobs will not be affected. The reboot will take about 20 minutes. This notice and progress updates will be posted on the SCC Status page at https://www.bu.edu/tech/research/sccupdates/. Please email help@scc.bu.edu if you have any questions or concerns. -The Research Computing Services staff (Wednesday, 1/18/17, 4:30 pm)
- The SCC and related services have been returned to normal operation ahead of schedule, following the completion of the scheduled MGHPCC annual maintenance. (Wednesday, 5/18/16, 10:00 am)
- There will be an outage at the SCC starting on May 16. Details are below.
In order to perform yearly scheduled maintenance, there will be a full power outage at the MGHPCC tentatively scheduled to start at 12:00pm on Monday, May 16th until 12:00pm on Wednesday, May 18th. This outage impacts the Shared Computing Cluster (SCC) and ATLAS, as well as the on-campus Linux Virtual Lab machine (scc-lite.bu.edu) and websites hosted at class.bu.edu that use data stored on systems at the MGHPCC. All SCC login and compute nodes, home directories, and project and restricted project filesystems will be unavailable during the outage. While the planned outage is scheduled to extend through noon on Wednesday, we anticipate that restoration of services could occur earlier on Wednesday during the overnight hours. We will start a process of draining the batch queues on the Shared Computing Cluster (SCC) on Saturday, April 16th to prevent long-running jobs from starting. This process will continue until 12:00pm Monday, May 16th, at which point all jobs will be stopped and we will shut down all the SCC computer systems, including all the login nodes. Although you may continue submitting jobs between April 16th and the downtime, if your jobs would not complete prior to the shutdown, they will remain pending until the computer systems are returned to normal operations. People submitting jobs to the batch system should plan on adjusting their hard run-time limits accordingly (see https://www.bu.edu/tech/support/research/rcs-archive/system-usage-old/running-jobs/submitting-jobs/#job-resources for more details). We anticipate having all systems returned to production status by 12:00pm on Wednesday, May 18th. (Wednesday, 5/4/16, 3:00 pm)
- SCC performance is back to normal (Thursday, 4/7/16, 12:30 pm)
- Problem with performance on SCC.
We are aware that there is a problem with performance on the SCC cluster; we are working on it now and will resolve it as soon as we can. Apologies in advance for the inconvenience. (Thursday, 4/7/16, 11:15 am)
- SCC performance is back to normal
(Monday, 11/23/15, 6:00 pm)
- System Slowness.
We are working to resolve the slowness issue and bring the system back to normal performance as soon as possible. (Monday, 11/23/15, 3:45 pm)
- System performance is back to normal
(October 15, 2015, 2:20 p.m.)
- System Slowness.
We are working to resolve the slowness issue and bring the system back to normal performance as soon as possible. (Thursday, 10/15/15, 1:30 pm)
- Batch system back to normal.
The issues with the batch system have been resolved and the system is back to normal. (Wednesday, 9/2/15, 5:15 pm)
- Problem with the batch system on SCC.
We are aware that there is a problem with the batch system on the SCC cluster; we are working on it now and will resolve it as soon as we can. The login nodes and Project Disk are not affected and you can continue to use them normally. Apologies for the inconvenience. (Wednesday, 9/2/15, 3:00 pm)
- Eversource has scheduled a power outage at 111 Cummington Mall on Saturday, July 18, 2:00 am to 2:00 pm to perform emergency repairs. During the outage, RCS Account Management services will not be available. No Project and related resource management forms will be processed, and automated messages will not be sent. RCS Account Management services will resume after the power has been restored. The Shared Computing Cluster (SCC) and all other resources located in Holyoke will not be affected.
Other areas affected by the power outage are:
Momentary: 225 BSR (Castle), 226 BSR (HIS), 232 BSR (PLS), 236 BSR (EGL), 264-270 BSR (SSW), 640 Comm Ave. (COM), 675 Comm Ave. (STO), 2 Cummington (BSC), 38 Cummington (SLB), 44 Cummington (BME)
Duration: 704 Comm Ave. (FOB), 708 Comm Ave. (Residence/Commercial), 710 & 712 Comm Ave. (Commercial), 714 Comm Ave. (Residence/Commercial), 722 Comm Ave. (Residence/Commercial), 48 Cummington (ERA), 68-100 Cummington (SOC), 111 Cummington (MCS)
(Friday, 7/17/15, 1:00 pm)
- The unrelated issues involving scc1.bu.edu and the batch system have been resolved. (Friday, 5/22/15, 7:00 pm)
- The scc1.bu.edu login node is having issues which are being actively investigated; the machine was also just rebooted, so current sessions will have been disrupted. We recommend using another login node (scc2.bu.edu, geo.bu.edu, or scc4.bu.edu) for the time being. (May 22, 2015 3:13 PM)
Following the reboot, the system seems to be working normally but we continue to actively monitor it. (Friday, 5/22/15, 3:32 pm)
- The scheduled downtime at the MGHPCC completed early and all SCC login and batch nodes are back online. (Tuesday, 5/19/15, 12:40 am)
- The SCC/MGHPCC will have an outage starting at noon on Sunday, May 17. Details are in the message below, which was just sent to all SCC researchers.
Dear Colleague,
In order to perform yearly scheduled maintenance, there will be a full power outage at the MGHPCC tentatively scheduled from 12:00pm on Sunday, May 17th until 12:00pm on Tuesday, May 19th. This outage impacts the Shared Computing Cluster (SCC) and ATLAS, as well as the on-campus Linux Virtual Lab machine (scc-lite.bu.edu) and websites hosted at class.bu.edu that use data stored on systems at the MGHPCC. All SCC login and compute nodes, home directories, and project and restricted project filesystems will be unavailable during the outage. While the planned outage is scheduled to extend through noon on Tuesday, we anticipate that restoration of services could occur earlier on Tuesday, during the overnight hours.
We will start draining the batch queues on the Shared Computing Cluster (SCC) on Friday, April 17th to prevent long-running jobs from starting. This process will continue until 12:00pm on Sunday, May 17th, at which point all jobs will be stopped and we will shut down all SCC computer systems, including all the login nodes. Although you may continue submitting jobs between April 17th and the downtime, jobs that would not complete prior to the shutdown will remain pending until the systems are returned to normal operation. People submitting jobs to the batch system should plan on adjusting their hard run time limits accordingly (see https://www.bu.edu/tech/support/research/rcs-archive/system-usage-old/running-jobs/submitting-jobs/#job-resources for more details). We anticipate having all systems returned to production status by 12:00pm on Tuesday, May 19th.
We will post updates before and during the downtime on our Shared Computing Cluster status page at https://www.bu.edu/tech/research/sccupdates/. Please email help@scc.bu.edu if you have any additional questions or concerns.
Regards,
The Research Computing Services Group
(Tuesday, 4/14/15, 4:30 pm)
- As of May 4, 2015, we will be disabling Shared batch system access for projects that go over their CPU/SU allocation and remain over for two weeks, despite multiple warnings, without requesting additional resources. We are also reducing the number of messages sent about projects that are over their allocation. A notice was just sent to all SCC users explaining the changes. (Friday, 4/17/15, 4:30 pm)
- The following information was just sent out to all SCC users:
Shared Computing Cluster (SCC) Service Degradation March 14th, 2015
Dear Researcher,
We are writing to inform you of an IS&T Emergency Outage that will affect services on the Shared Computing Cluster (SCC). On Sat. March 14th from 11PM until Sun. March 15th at 8AM, IS&T will be replacing network fiber cabling that was damaged in a recent steam pipe rupture. This outage will affect the following on the SCC, in addition to other University-wide services:
o Users may have intermittent difficulty logging in to the SCC. Existing login sessions should continue to work.
o The archive storage service will be unavailable. Batch jobs that attempt to utilize archive storage space during the outage will hang or fail.
o Interactive or batch sessions that utilize the following licensed products will not work correctly:
MATLAB
Mathematica
Maple
Lumerical
Abaqus
In addition, the above software packages will not work from other computers that utilize the campus license servers during the time of the outage. In an effort to minimize the impact of failed jobs, we will be suspending batch job dispatch during the outage. We apologize for any inconvenience that this outage may cause. If you have further questions, please email help@scc.bu.edu.
Regards,
The Research Computing Services Group
- There was an approximately 20-minute interruption to the Project Disk filesystems. (Friday, 10/17/14, 12:30 am)
- Normal access to the Project Disk filesystems has been restored. (Thursday, 10/2/14, 12:55 am)
- The Project Disk filesystems are inaccessible and staff are working to restore access as quickly as possible. (Thursday, 10/2/14, 10:55 am)
- The MGHPCC maintenance has been completed well ahead of schedule and the system is now operating normally. (Monday, 8/11/14, 3:00 pm)
- The MGHPCC maintenance work is proceeding ahead of schedule. The BU networking work and SCC filesystem maintenance have been completed. The SCC login nodes and filesystems are now on-line using generator power. The batch nodes will remain off-line until power is restored. (Monday, 8/11/14, 12:50 pm)
- The MGHPCC and SCC will be down for scheduled maintenance all day on Monday, August 11. The SCC will be brought down at 10PM on August 10 to prepare for this. More details on this outage are given in this letter to all users from Glenn Bresnahan. (Friday, 7/25/14, 2:30 pm)
- SCC1 seems to be working fine now. We are not sure what caused the access/performance problems, and interruptions could occur again. The other login nodes did not seem to be affected. Please use one of the other login nodes if you experience further problems with SCC1. (Friday, 7/18/14, 2:45 pm)
- The problem with SCC1 is being investigated. Please use any of the other login nodes. (Friday, 7/18/14, 2:30 pm)
- The networking upgrade mentioned in the prior update of May 21 has been indefinitely postponed. We will post a new update when it is rescheduled which may not be for many months. (Monday, 6/2/14, 2:15 pm)
- At some point relatively soon, a significant networking upgrade to BU services will be performed at 881 Commonwealth Avenue; the date and time are not yet set. This will likely cause a disruption of some services on the SCC. License servers will be down, making MATLAB, Abaqus, Lumerical, Mathematica, and Maple inaccessible. Kerberos authentication will also be affected, making it impossible to log in to the SCC with your Kerberos login and password; if you are already logged in, this should not affect you. There may also be other intermittent login and connectivity issues, including issues with wireless authentication and web login.
If you have questions about this disruption, please send them to help@scc.bu.edu. (Wednesday, 5/21/14, 10:45 am)
- The issues with the Project Disk Space storage system (/project, /projectnb, /restricted/project, and /restricted/projectnb) on the Shared Computing Cluster (SCC) have now been resolved. There was no loss of data and the system is now operating normally.
The problem was caused by the failure of two RAID disk controllers, leading to the failure of a RAID disk array. Intervention by the vendor was needed to restore the disk array to operation. (Wednesday, 4/2/14, 7:30 pm)
- All of the Project Disk space partitions (/project, /projectnb, /restricted/project, and /restricted/projectnb) are currently inaccessible from all nodes. This issue is under investigation and we will fix it as soon as we can. We will also post additional updates here as we have them. Home directories remain accessible. (Wednesday, 4/2/14, 2:42 pm)
- Power has been restored to the MGHPCC and the SCC is now fully operational. Some electrical issues remain, but it is believed that these can be resolved without further service interruptions. (Wednesday, 12/11/13, 11:30 pm)
- Replacement power equipment has been installed and is undergoing final testing. The MGHPCC will attempt to restore full power this evening. SCV staff remain on-site to bring the computer systems up as soon as possible once power is restored. (Wednesday, 12/11/13, 6:30 pm)
- The SCC login nodes and filesystems are now accessible using generator power at the MGHPCC. The compute nodes will continue to be unavailable until full power is restored. While the MGHPCC is expecting replacement parts for the main power feed tomorrow morning, we do not currently have an estimate on when full power will be restored. We will continue to post further updates on the SCC status page. (Tuesday, 12/10/13, 9:00 pm)
- Due to equipment failures in the main power path, the MGHPCC was not able to return to operation on the target schedule. Equipment vendors and electric company personnel are on-site assessing the problem. SCV staff are also standing by on-site to return the computing systems to normal operation as soon as possible once power is restored. (Tuesday, 12/10/13, 9:50 am)
- The job queue is being drained in preparation for the power outage at 10:00 PM on 12/8 described below. Jobs that would not complete prior to the shutdown are being held until the system returns on 12/10. (Friday, 12/6/13, 10:00 am)
- December 9: In order to address an exigent issue, there will be a full power outage at the MGHPCC on December 9th. This outage impacts the Shared Computing Cluster and ATLAS, as well as the on-campus Katana and LinGA clusters that use data stored on systems in Holyoke. We anticipate the systems will be down from 10:00 PM on December 8 until 9:00 AM on December 10. More details are available here. (Wednesday, 11/20/13, 4:00 pm)
- VNC is now available on the SCC. Using this software can greatly speed up the performance of GUI/graphical applications running on the SCC. (Wednesday, 10/9/13, 4:00 pm)
- Note that if one login node is responding slowly, you may get better responsiveness by logging in to another. The login nodes are scc1.bu.edu, scc2.bu.edu, geo.bu.edu (for Earth & Environment department users), and scc4.bu.edu (for BU Medical Campus users). (Monday, 8/19/13, 10:30 am)
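For example, switching login nodes is just a matter of opening an SSH session to one of the other machines listed above (replace the placeholder username below with your own BU login name):

    ssh yourBUlogin@scc2.bu.edu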
- Glenn Bresnahan, director of SCV, sent out an update on the SCC Performance Issues to all SCF users. (Wednesday, 8/14/13, 12:30 pm)
- Performance back to normal.
As of 7:30 p.m. the SCC’s performance is back to normal. We are still trying to identify the underlying sources of these problems. (Wednesday, 8/7/13, 7:35 pm)
- Recurring performance problem.
As of approximately 5:30 p.m. we have again been experiencing intermittent performance degradation. The Systems group is working to restore normal performance as soon as possible. They also continue to try to locate and understand the underlying causes of these performance problems in an effort to prevent them from returning. Many apologies for the interruptions to your productivity. (Wednesday, 8/7/13, 6:15 pm)
- File servers hung.
Today at approximately 1:00 p.m. two file servers hung and took down the filesystem. The system was restored at 1:30 p.m. This incident was not related to the previous performance degradation issues. (Wednesday, 8/7/13, 1:30 pm)
- Update: We believe we have resolved the problem below as of around 4:30 pm on August 6. It was unrelated to the issues on the 2nd and 5th. (Tuesday, 8/6/13, 5:30 pm)
- Recurring performance problem.
We are aware that the SCC is having intermittent performance problems again. We are working on it and are trying to fix it as soon as possible. (Tuesday, 8/6/13, 3:00 pm)
- On Friday, August 2nd at approximately 3:30 pm and again on Monday, August 5th at approximately 12:30 pm, the Shared Computing Cluster (SCC) experienced system-wide degradations in performance lasting multiple hours. The SCV systems group has been working to identify the cause of these degradations. At this time we believe that we have identified the issue and continue to work to fully rectify the problem. Users may continue to experience periods of degraded performance on the cluster until we have fully resolved the issue.
We apologize for any inconvenience that these issues may have caused you during the past few days and appreciate your continued patience as we continue to work hard to resolve them. (Tuesday, 8/6/13, 11:00 am)
- Problem with performance on SCC.
We are aware that there is a problem with performance on the SCC cluster and are working on it now and will resolve it as soon as we can. Apologies in advance for the inconvenience. (Monday, 8/5/13, 12:45 pm)
- Performance on the SCC cluster is back to normal. We are still investigating the cause of the problems recently experienced. (Friday, 8/2/13, 5:05 pm)
- Problem with performance on SCC.
We are aware that there is a problem with performance on the SCC cluster and are working on it now and will resolve it as soon as we can. Apologies in advance for the inconvenience. (Friday, 8/2/13, 4:35 pm)
- Charging for usage in Service Units (SUs) begins on July 1, 2013. The compute nodes are charged at an SU factor of 2.6 SUs per CPU hour of usage. Also, note that usage is calculated differently on the SCC than on the Katana Cluster: the SCC charges by wall clock time, as on the Blue Gene, rather than by actual usage as on the Katana Cluster. Thus if you request 12 processors and your code runs for 10 hours, you will be charged for the full 120 hours (multiplied by the SU factor for the node(s) you are running on) even if your actual computation only ran for, say, 30 hours. This change will also apply to the nodes that used to be part of the Katana Cluster and are moving out to become part of the SCC. (Monday, 7/1/13)
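To make the arithmetic in the note above concrete: requesting 12 processors for 10 hours of wall clock time is charged as 12 × 10 = 120 CPU hours, and at an SU factor of 2.6 that comes to 120 × 2.6 = 312 SUs, regardless of how much of that time the computation actually used.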
- During the week of July 8-12, 2013, all of geo.bu.edu and the katana-d*, katana-e*, katana-j*, and katana-k* nodes will move out of the Katana Cluster to become part of the SCC. This includes all of the Buy-In Program nodes. All of these nodes will also be renamed during the transition. Details on this are in this note sent out on July 2.
The schedule is:
July 3rd: 6:00am-6:30am Katana outage to physically relocate
July 7th: 7:00am Disable batch queues on machines that are moving
July 8th: 7:00am Power off all machines that are moving
          8:00am-6:00pm Systems de-installed and moved to Holyoke
          1:00pm "SCC3" becomes an alias for the system name "GEO"
July 9th: 8:00am-6:00pm Reinstallation and cabling of machines in Holyoke
July 10th: 12:00pm Target for GEO nodes in production
July 11th: 12:00pm Target for 2012 Buy-in nodes in production
July 12th: 12:00pm Target for all systems in production
(Tuesday, 6/25/13)
- During the week of June 24, 2013, the BUDGE nodes are being moved out of the Katana Cluster to become part of the SCC. They will be operational again on Friday, June 28 with the new names scc-ha1..scc-he2 and scc-ja1..scc-je2. These nodes each have 8 NVIDIA Tesla M2070 GPU Cards with 6 GB of Memory. (Monday, 6/24/13)
- A bug in the automounter on the SCC systems has been identified that prevents the /net/HOSTNAME/ automount space from working properly for certain servers. There are two known problem servers at this time:
casrs1
nfs-archive
As a workaround, until a proper bug fix becomes available, we have created a new automount space to handle the problem cases. If you experience a problem accessing /net/HOSTNAME/ for some HOSTNAME, look in /auto/. If HOSTNAME appears there, try that path; otherwise report the problem to help@scc.bu.edu. The /auto space is maintained manually, so only the known problem servers can be accessed through that path. All other servers should be accessed through the usual /net path. This problem does not affect the Katana Cluster. (Monday, 6/17/13)
- The SCC officially went into production use on June 10, 2013. However, there are still some transitional things continuing. Not all software packages are yet installed and disk space is still in a transitional state for some projects. (Monday, 6/10/13)
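A brief illustration of the /auto workaround described in the automounter note above, using one of the known problem servers (the subdirectory shown is hypothetical; substitute the path you actually need):

    # The usual automount path may hang or fail for the affected servers:
    ls /net/casrs1/somedir
    # Workaround: use the manually maintained /auto space instead:
    ls /auto/casrs1/somedir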
- You may or may not have noticed that most of the files on the old Project Disk on Katana have been moved to the new Project Disk on the SCC. You can continue to use your files from either system using the same paths that you always have. If you have not been accessing your files, we quietly moved them over the past week. If you have been accessing your files, we are contacting you individually to find a time that is convenient for you to take a break from accessing them while we move your files for you. Projects that did not have directories in /project and /projectnb on the old system now have them on the new system, with 50 GB quotas on each partition. A note for active Blue Gene users: since the compute nodes are on a private network, we will not move your files at this time and will be contacting you over the next few weeks to discuss the details and options. (Friday, 6/7/13)
- We will be hosting a seminar on June 11 from 12-2pm to go over issues related to the migration to the SCC. Please do register; a light lunch will be served. The slides from these talks are posted here. (Monday, 6/3/13)
- MATLAB versions R2012b and R2013a are both available. R2012b is launched by /usr/local/bin/matlab at the moment, but you can access R2013a by running /usr/local/apps/matlab-2013a/bin/matlab. (Thursday, 5/30/13)
- In preparation for the new SCC Project Disk Space file systems going live in mid-June, we are making some changes. The first, which you may notice tomorrow, May 29, in the web forms and reports, is that the primary unit for reporting disk space will be Gigabytes, not Megabytes. In addition, when the SCC goes into production in mid-June, all projects on the SCC will have directories and quotas on both backed-up and not-backed-up Project Disk partitions. For projects that already have directories and quotas on Katana, these will be transferred to the SCC. Directories and quotas will be created for projects that did not already have them on Katana. The default minimum will be 50 GB on both partitions. For projects that need more quota, there is no charge for requests up to a total of 1 TB (200 GB backed up and 800 GB not backed up). Researchers who need more than that should look into the Buy-in options.
- MATLAB version R2013a is now installed on the SCC. (Monday, 5/20/13)
- FFTW, Mathematica, Accelrys CHARMm, Gaussian, Grace, OpenGL/GLUT, and NEdit have all been installed on the SCC. (Friday, 5/17/13)
- Production use of the SCC will begin in mid-June 2013.
- Added a table of available software packages on the SCC. This will be regularly updated during the friendly user period. (Thursday, 5/2/13)
- Made SCC web site live for everyone to access. (Thursday, 5/2/13)
- Friendly User access to the SCC begins. (Friday, 4/26/13)
- Initial elements of the Shared Computing Cluster (SCC) are installed at the Massachusetts Green High Performance Computing Center (MGHPCC). BU is the first institution to install HPC resources at the MGHPCC. (Tuesday, 1/22/13)