Hypera Development - Issues with New York Infrastructure – Incident details

All systems operational

Issues with New York Infrastructure

Resolved
Major outage
Started almost 2 years ago
Lasted 2 days

Affected

API

Major outage from 10:32 PM to 7:40 AM

Axolotl

Major outage from 10:32 PM to 7:40 AM

Maven repository

Major outage from 10:32 PM to 7:40 AM

Application proxy

Major outage from 10:32 PM to 7:40 AM

Updates
  • Resolved

    We have rerouted traffic back to our primary cluster, and all operations are continuing as normal.

    We apologise for any inconvenience this outage has caused.

  • Update

    We are moving our API and our Discord Bot (Axolotl) back to our primary cluster. This could result in a few minutes of downtime; however, the switch should be near-instant and not noticeable.

  • Monitoring

    Power has been restored to the datacenter and our infrastructure is now accessible again. We will be moving traffic back from our temporary infrastructure later today.

  • Update

    Maintenance on our Discord Bot, Axolotl, has been completed.

  • Update

    We are quickly performing maintenance on our Discord Bot, Axolotl, to prevent issues from occurring when the Secaucus, NJ datacenter comes back online.

  • Update

    INAP has posted an incident update:

    We have completed the entire site inspection with the fire marshal and the electrical inspector and utility power has been restored.

    We are now working to restore critical systems and our onsite team has energized the primary electrical equipment that powers the site.

    Concurrently, we are beginning work to bring the mechanical plant online. Additional engineers from other facilities are on site this morning to expedite site turn-up.

    The ETA for bringing up the critical infrastructure systems is approximately 5 hours.

    We are planning for a late afternoon/early evening time frame when clients can come back on-site.

    We will send out additional information regarding access to the facility and remote hands assistance, and we will notify you once client access to the facility is permitted.

  • Update

    We are currently looking for options to temporarily restore our Maven Repository; however, as the majority of our libraries are also hosted on Sonatype OSS, we are not treating this as urgent.

    We are also creating plans to make our infrastructure automated and highly available in the near future, allowing us to stay online when an incident like this occurs.

    We apologise for any inconvenience this outage has caused.

  • Update

    INAP posted an incident update on July 11, 2023 at 16:04 CDT:

    The Secaucus data center remains powered down at this time, per the fire marshal.

    Evocative onsite management met with the fire marshal and electrical inspectors to review the current state of the facility. The fire marshal has reviewed the current level of progress on the cleaning and requested additional remediation actions around the fire system components. The follow up inspection from the fire marshal is 9:00 AM EDT, Wednesday. The Evocative team is bringing on additional personnel and vendors to comply completely with the additional requests by this deadline.

    In preparation for being able to allow clients onsite, the fire marshal has stated that Evocative must perform a full test of the fire/life safety systems. This must be performed after utility power has been restored and fire system components replaced. Vendors have already been scheduled and are standing by to complete this work tomorrow.

    If there are no significant changes in the timeline, the earliest INAP will be allowed back into the site to power up servers will be later in the day on Wednesday. We anticipate the next update of significance will be provided by Evocative Wednesday morning, and will keep you updated as information becomes available.

  • Update

    We have successfully deployed our application proxy and API on our temporary New Jersey infrastructure, restoring api.hypera.dev.

  • Update

    We have set up temporary infrastructure in New Jersey and have restored Axolotl so we can continue providing support via Discord. We will be restoring our application proxy and our API soon.

  • Update

    We are starting up temporary infrastructure to bring some of our essential services back online.

  • Identified

    INAP raised an incident on July 10th, 2023 at 17:32 CDT stating that they received reports of loss of connectivity affecting their New Jersey (NYJ) datacenter, and that they had started working to identify the issue.

    On July 10th, 2023 at 22:21 CDT, INAP released an update stating the following:

    A small electrical failure occurred with one of the redundant UPS devices which began smoking and led to a small fire in the UPS room. The fire department was dispatched and the fire extinguished quickly. The fire department subsequently cut power to the datacenter and disabled the generators while they and the utility company inspected the electrical system.

    The fire marshal is requiring extensive cleaning of the UPS devices and rooms before the site can be reenergized. There is a vendor onsite currently who will be performing the cleanup.

    We do not currently have an ETA for when the datacenter will be back, and therefore are unable to provide an ETA for when we can restore our services.
    We will be looking into alternative solutions in other datacenters until INAP's NYJ datacenter has been restored.

  • Investigating

    We have been notified of issues affecting our New York infrastructure and are investigating the cause.