Ontario MoH HCV Service Outage September 18, 2018 (Resolved)

The MoH Health Card Validation service has not been responding to requests this morning, starting at approximately 9:30AM ET. Kiosks that are configured to check the HCV status of health cards may time out during the check (20s) and report an error to the patient.

For clients using Ocean Kiosks with Ontario HCV, as a short-term measure, you may wish to disable the HCV checking in your tablet settings. To do so, log into your Ocean account, go to the Tablets tab, click on the relevant tablet group "edit" button, and choose "disable" under the HCV tab.
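
For sites that want to reason about the behaviour described above, here is a minimal sketch of a check that gives up after a timeout and is skipped entirely when HCV is disabled. It is an illustration only, not the kiosk's actual code: the function names, the return values, the `enabled` flag and the treatment of an empty response as "unavailable" are all assumptions; only the 20-second timeout comes from this notice.

```python
import concurrent.futures
import time

HCV_TIMEOUT_SECONDS = 20  # timeout quoted in the notice above


def check_health_card(validate, card_number, timeout=HCV_TIMEOUT_SECONDS, enabled=True):
    """Return "valid", "invalid", "skipped" or "unavailable".

    `validate` is a hypothetical callable standing in for the call to the
    validation service. It returns True/False, or None for an empty response.
    """
    if not enabled:
        return "skipped"  # HCV checking disabled in the tablet group settings
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(validate, card_number)
    try:
        result = future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return "unavailable"  # the service did not answer in time
    except Exception:
        return "unavailable"  # the call itself failed
    finally:
        pool.shutdown(wait=False)  # do not block the caller on a stuck request
    if result is None:
        return "unavailable"  # empty response: treat as an outage, not as invalid
    return "valid" if result else "invalid"


def unresponsive_service(card_number):
    """Hypothetical stand-in for a validation service that is slow to answer."""
    time.sleep(0.5)
    return True


if __name__ == "__main__":
    print(check_health_card(unresponsive_service, "0000000000", timeout=0.1))
    print(check_health_card(unresponsive_service, "0000000000", enabled=False))
```

Separating "unavailable" from "invalid" is a design choice in this sketch: it lets a check-in flow tell a Ministry outage apart from a genuinely invalid card.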

We will continue to monitor the status and update this ticket when the Ministry service is back online.

---

Update 11:20 AM ET: The MoH HCV server seems to be responding properly again.

Update 12:40 PM ET: The MoH HCV server is now returning empty, invalid results, so sites will again experience errors on check-in or a flashing screen. We recommend that sites disable HCV as described above until further notice.

Update 5:25 PM ET: We are still seeing invalid responses from the MoH server.

Update 9:45 AM ET: The HCV server seems to be responding properly again.


Issue impacting Cloud Connect Accuro Users (Resolved)

We are aware of an issue with Accuro sites on the Cloud Connect platform.  Some sites received an email with the subject "Ocean Cloud Connect Server Deauthorized" and must now log into Cloud Connect to reauthorize their connection to the Accuro API Server.  It appears that the authentication tokens used to communicate with the Accuro API are expiring unexpectedly.  We are working with QHR support to determine the cause of the issue.

Instructions for reauthorizing your Cloud Connect server can be found here.

---

Update Sept 14: We have found the cause of the deauthorization issue and will be issuing a patch to the Cloud Connect platform today to correct it.  Unfortunately, some clients may be deauthorized when we issue the patch, but we hope to minimize the impact by issuing it as quickly as possible.  We apologize for the inconvenience.

---

We have patched Cloud Connect to fix this issue.  This issue is now resolved.


Ocean Health Map Service Interruption July 5, 2018 (Resolved)

We experienced a service interruption with the Elastic Search engine that backs our health directory from 4:25PM ET until 4:35PM ET. Users running searches in the directory during this window would have encountered an error. The rest of Ocean was unaffected.

The system is performing normally again. We will update this ticket with additional information as it becomes available in the analysis.


Ontario MoH HCV Service Outage July 5, 2018 (RESOLVED)

The MoH Health Card Validation service has been timing out this morning, starting at approximately 10:45AM ET. Kiosks that are configured to check the HCV status of health cards may time out during the check (20s) and report an error to the patient.

For clients using Ocean Kiosks with Ontario HCV, as a short-term measure, you may wish to disable the HCV checking in your tablet settings. To do so, log into your Ocean account, go to the Tablets tab, click on the relevant tablet group "edit" button, and choose "disable" under the HCV tab.

We will continue to monitor the status and update this ticket when the Ministry service is back online.

---

Update 4:10 PM ET: The MoH HCV server seems to be responding properly again. Issue resolved.


Accuro API Issue impacting OceanConnect Users (Resolved)

We have been notified by QHR that there is an issue with the Accuro API affecting Ocean clients using OceanConnect.  Impacted sites seem to be unable to authenticate with the API, preventing note upload and mark arrived.  We will update this article as we receive more information from QHR.

---

Updated July 3, 2018 1:20pm

We have been informed by QHR that the API issue has been resolved.  This issue is now closed.


Ocean Service Interruptions June 25 (resolved)

We have experienced intermittent technical issues in our data centre over the past several hours that have triggered service interruptions on the Ocean Server and health directory. We have stabilized the system and are investigating the impact and root cause.  We will post updates on this article shortly.

--- 

Update June 27, 10:45 AM ET: We have identified the problem and fixed the underlying issue. This issue is now resolved.


Ocean Service Interruptions May 30 (resolved)

We have experienced a number of network issues in our data centre over the past hour that have triggered intermittent service interruptions on the Ocean Server for a minute or two at a time. We are investigating to determine the root cause and will post updates on this ticket shortly.

---

Update 12:30 ET: We've increased the concurrent connection pool capacity on our web tier and will continue to monitor carefully over the next couple of hours.
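
To illustrate why pool capacity matters, the sketch below models a bounded pool in which a request fails if no connection frees up in time. It is a toy model under stated assumptions, not our web tier's configuration: `ConnectionPool`, `simulate` and all of the timings are hypothetical.

```python
import threading
import time


class ConnectionPool:
    """Toy bounded pool: at most `capacity` requests hold a connection at once."""

    def __init__(self, capacity):
        self._slots = threading.BoundedSemaphore(capacity)

    def handle(self, work, wait_seconds):
        """Run `work` if a slot frees up within `wait_seconds`, else report a timeout."""
        if not self._slots.acquire(timeout=wait_seconds):
            return "timed out waiting for a connection"
        try:
            return work()
        finally:
            self._slots.release()


def simulate(capacity, concurrent_requests, hold_seconds=0.5, wait_seconds=0.05):
    """Fire `concurrent_requests` at once and count how many were served."""
    pool = ConnectionPool(capacity)
    outcomes = []
    outcomes_lock = threading.Lock()

    def work():
        time.sleep(hold_seconds)  # the request holds its connection this long
        return "ok"

    def request():
        outcome = pool.handle(work, wait_seconds)
        with outcomes_lock:
            outcomes.append(outcome)

    threads = [threading.Thread(target=request) for _ in range(concurrent_requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcomes.count("ok")


if __name__ == "__main__":
    print("capacity 2:", simulate(2, 4), "of 4 requests served")
    print("capacity 4:", simulate(4, 4), "of 4 requests served")
```

In this model, raising the capacity lets the same burst of requests complete without any of them timing out, at the cost of more simultaneous connections.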

---

Update May 30 11:30 ET: It appears that the capacity changes made yesterday have corrected the problem.  This issue is now closed.


Telus API Issue impacting OceanConnect Users

We have been notified by Telus that there is an issue with the PSS API affecting Ocean clients using OceanConnect.  The impact seems to be limited to marking patients arrived, but we are still investigating.  We will update this article as we receive more information from Telus.

---

Update 1:30PM: TELUS has confirmed an issue with the stability of the API connections to PSS instances. They are investigating.

--- 

Update 1:40PM: TELUS says the API connections have stabilized and the issue should be resolved. We will continue to monitor.

---

Update May 30, 12:10 ET: The issue is recurring intermittently today.  Telus is aware of the issue.

---

Update June 1, 10:10 ET: The issue went away for a day or so but has returned and a number of clinics are experiencing the PSS arrival arrow issue again this morning. We are seeing the same error message from the TELUS API from those PSS instances and have escalated to TELUS again. TELUS confirmed that they are actively working on the issue. No root cause available from TELUS yet, other than general API instability.

---

Update June 1, 3:20 ET: TELUS says the API tunnel is stable again and things should be running normally. We will follow up with TELUS to try to understand the root cause of the API instability.

---

Update June 7, 10:30 AM ET: We are seeing the TELUS API failing across multiple sites again in the same manner as previously documented in this ticket. We are escalating to TELUS again. TELUS has acknowledged continuing issues with their API and is investigating. We will update this ticket as information becomes available.

--- 

Update June 8, 11:30 AM ET: It appears that the TELUS API has been stable since approximately 4:30 PM ET yesterday.

---

Update June 21, 10:50 AM ET: We are seeing the TELUS API failing across multiple sites again in the same manner as previously documented in this ticket. We have escalated to TELUS again.  We will update this ticket as information becomes available.

--- 

Update June 21, 10:00 PM ET: It appears that the TELUS API has stabilized.  TELUS is accelerating technical work to reduce the likelihood that the issue will recur.

--- 

Update June 22, 10:50 AM ET: TELUS says they successfully migrated their API endpoint to a new server last night to address the persistent instability.

 


Ontario MoH HCV Service Outage

The MoH Health Card Validation service has been intermittently unavailable this afternoon, starting at around noon ET, although some errors were seen as early as 10:30AM. Kiosks that are configured to check the HCV status of health cards may time out during the check (20s) and report an error to the patient.

For clients using Ocean Kiosks with Ontario HCV, as a short-term measure, you may wish to disable the HCV checking in your tablet settings. To do so, log into your Ocean account, go to the Tablets tab, click on the relevant tablet group "edit" button, and choose "disable" under the HCV tab.

We will continue to monitor the status and update this ticket when the Ministry service is back online.

---

Update 3:45PM: It appears that the HCV service is back online. HCV-enabled kiosks should be back to normal.


Scheduled Maintenance for May 18, 2018

The Ocean Server will be offline briefly after 9pm, Eastern time (6pm Pacific) on May 18, 2018 for a software upgrade.  We expect the system to be offline for less than 30 minutes, during which a system maintenance page will appear in place of the Ocean Portal.


Ocean Performance Issues related to Friday's Server Migration (Resolved)

Tablets and Ocean Portal users experienced intermittent performance issues this morning as a result of an incorrect configuration setting on the new server.  The configuration error became apparent when server load returned to normal daily volume.  Ocean server was up throughout, but some requests would have failed due to timeouts.  The issue has been corrected as of 11:15am ET and performance has returned to normal.  Please let us know if you continue to experience issues.


Email Delivery Issues Relating to Friday's Server Migration (Resolved)

We have been experiencing some issues with email delivery from our new data centre since Friday night. Emails may fail to be delivered without any error being presented.

In particular, clients using the "from" address in their email settings are affected (under Admin->Site Account), and batch sending may fail on large files due to an unexpected throttling limit from the email service provider. We are working to resolve these issues throughout today. In the meantime, as usual, clinics should rely on Ocean's patient confirmation status to determine whether a message was delivered successfully.

We will post updates to this article as they become available.

---

May 7 5:30PM ET: We've configured an internal mail server to which we will migrate and are now waiting for our data centre provider to make some matching firewall configuration changes. We currently expect to migrate to the internal server tomorrow night.

---

May 8 11:30PM ET: We've updated the Ocean server to use the new internal mail server, akin to our previous configuration at the Toronto data centre. We've tested a number of email send scenarios and it appears to be working, and no error reports have been seen from the system. We believe the email delivery issues described above are resolved and we will close this issue accordingly.

--- 

May 9 4:00PM ET: Our logs indicated that our new server had been flagged by ProofPoint.com, which is a spam blacklist. We requested removal.

---

May 9 4:20PM ET: ProofPoint blacklist removal was completed.


Scheduled Maintenance for May 4, 2018

The Ocean Server will be offline briefly after 9pm, Eastern time (6pm Pacific) on May 4, 2018 for a system migration.  We expect the system to be offline for less than 30 minutes, during which a system maintenance page will appear in place of the Ocean Portal.

The production Ocean servers will be migrated from our existing production data centre in Toronto to our disaster recovery (DR) data centre in Montreal.  Once migration is complete, our Toronto data centre will be designated as our DR data centre and Montreal will become our production data centre.  This migration is being undertaken to increase server resources and provide additional future hosting configuration flexibility.  Connected devices will automatically switch over to the new data centre.  Please contact us at ocean.tips/support if any devices fail to move to the new location.

Update 2018-05-04 11:55PM ET: The migration to the Montreal data centre is complete. Service interruption was approximately 10 minutes shortly after 9:00 ET. Data replication has been reconfigured to allow for failover to our Toronto data centre if needed.


Scheduled Maintenance for April 9, 2018

The Ocean Server will be offline briefly after business hours on April 9, 2018 for a system upgrade.  We expect the system to be offline for less than 5 minutes sometime between 9pm and midnight, Eastern time (6pm-9pm Pacific).


TELUS API Connectivity Issues April 9, 2018

We are seeing errors returned from the TELUS API this morning. TELUS is investigating the issue. TELUS clients that have connected an OceanConnect device to the TELUS API may experience the following:

  • Patients not loaded for kiosk use
  • Patients not marked as arrived
  • Generated Ocean notes not being added to chart
  • Ocean Form Reminders delayed

As a temporary workaround, clinics can download clinical notes as required using the "download" link in the toolbar.

Clinics that are not yet connected to the TELUS API will not be affected.

We will update this ticket as more information becomes available from TELUS.

----

Update 2:55PM from TELUS: "Please note that we continue to experience an issue with the SSH Tunnel service for PS Suite clients. This issue is intermittent and is causing the SSH tunnel connection to drop. This results in a “TELUS EMR Mobile cannot connect to your EMR Server. Please see the following link for more information.” error message from the API. TELUS has a team currently troubleshooting and have made it a high priority."

----

Update 5:30PM: TELUS believes they have isolated the issue to a change in the latest PSS version, and a ripple effect of a handful of clinics upgrading Sunday morning that resulted in Ocean connectivity disruptions for other clinics (that were not upgraded). They are going to disconnect the upgraded clinics from the API tonight, apply a PSS patch and monitor the API tomorrow. TELUS is contacting clinics directly that need to be disconnected, so if you haven't heard from them, there is no action required.

----

Update April 10, 2018 9:45AM: Although we do not yet have confirmation from TELUS, it appears that the maintenance work done by TELUS last night has helped at least some clinics that have reported success with kiosks this morning. Three clinics were disconnected from the API by TELUS; we assume that TELUS has informed those clinics but will be reaching out to confirm. If you are not one of the "disconnected" clinics, please try your kiosks this morning and contact us if you have any issues.

----

Update April 11, 2018 9:45AM: TELUS has confirmed that the issue was related to a PSS upgrade and patches were applied to affected clinics last night to address the issue. All clinics, including those disconnected from the API yesterday, have been reconnected by TELUS and your Ocean Kiosks, Ocean Online and other Ocean technology that interfaces with a TELUS EMR should be back to normal. Please let us know if you find otherwise.

 

 


Important Note for Accuro Clinics Using the New Ocean API

Important note for Accuro clinics using the new Ocean API with the username "OceanAPI":

QHR has identified an issue with the Ocean API that results from multiple clinics using the same username for the API, causing the authorization to be revoked for clinics in what appears to be a random manner. In the past, usernames for different clinics have been routinely set to "OceanAPI".

QHR is working on a fix, but in the meantime, QHR support can change your username from OceanAPI to a name unique to your clinic.

Next Steps:

If your API username is "OceanAPI" and you are on the new API, contact QHR customer support and ask them to change your API username.
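
We do not know how the Accuro API tracks authorizations internally, so the following is purely a toy model of one way a shared username could produce the symptom described above. It assumes a hypothetical store that keeps a single active token per username; `SharedNameTokenStore`, the token format and the clinic names are all invented for illustration.

```python
class SharedNameTokenStore:
    """Hypothetical store that keeps ONE active token per username."""

    def __init__(self):
        self._active = {}
        self._issued = 0

    def authorize(self, username):
        """Issue a new token, replacing any token already held for `username`."""
        self._issued += 1
        token = f"{username}-token-{self._issued}"
        self._active[username] = token
        return token

    def is_valid(self, username, token):
        return self._active.get(username) == token


if __name__ == "__main__":
    store = SharedNameTokenStore()

    # Two clinics that both use the default name: the second authorization
    # silently replaces the first clinic's token.
    clinic_a = store.authorize("OceanAPI")
    clinic_b = store.authorize("OceanAPI")
    print(store.is_valid("OceanAPI", clinic_a), store.is_valid("OceanAPI", clinic_b))

    # With a unique name per clinic, neither authorization disturbs the other.
    clinic_c = store.authorize("OceanAPI-ClinicC")
    clinic_d = store.authorize("OceanAPI-ClinicD")
    print(store.is_valid("OceanAPI-ClinicC", clinic_c),
          store.is_valid("OceanAPI-ClinicD", clinic_d))
```

Under this assumption, which clinic loses access depends only on who authorized most recently, which would look random from the outside. A unique username per clinic avoids the collision.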


System Interruption Jan 24 10:46-10:50 AM EST

We experienced a system interruption between 10:46 and 10:50 AM EST on Jan 24 due to a networking problem.  We are working with our hosting provider to determine the root cause of the outage and will update this ticket when more information is available. We apologize for the inconvenience.




Performance Issue with eRequest Inbox Tab: June 27, 2017 10:01 [Resolved]

June 27, 2017 10AM ET: We are investigating a performance issue with the eRequest inbox tab related to last night's release. Users may see delays of around 5s when loading the page. We will update this post with more information and an ETA for a fix shortly.

Update 12:30pm: We found two independent sources of slowness (unnecessary queries for some hidden inbox folders plus an inefficient database query path) that we will fix tonight in a patch. The system will be offline for approximately 5-10 minutes tonight at 9:30pm ET while we add the new indexes.
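
For readers curious about how an index changes a query path, here is a small, self-contained sketch. It uses SQLite purely as a stand-in (this notice does not describe the Ocean database), and the `erequest` table, its columns and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE erequest (id INTEGER PRIMARY KEY, folder TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO erequest (folder, subject) VALUES (?, ?)",
    [("inbox" if i % 3 == 0 else "archive", f"request {i}") for i in range(1000)],
)

INBOX_QUERY = "SELECT id, subject FROM erequest WHERE folder = ?"


def query_plan(connection, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


# Without an index the planner has to scan every row of the table.
plan_before = query_plan(conn, INBOX_QUERY, ("inbox",))

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_erequest_folder ON erequest (folder)")
plan_after = query_plan(conn, INBOX_QUERY, ("inbox",))

if __name__ == "__main__":
    print("before:", plan_before)
    print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, so the sketch prints it rather than assuming a particular string.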


System Interruption Feb 28 5:15-5:21 PM EST [Resolved]

We are currently experiencing a system interruption on the Ocean application due to excessive system load. We are investigating and will update this ticket shortly as more information is available. We apologize for the inconvenience.

UPDATE 5:24PM: the Ocean system is back online after a system restart. Total outage was six minutes. We will be investigating the root cause based on some captured diagnostic information and will update this ticket.

ROOT CAUSE ANALYSIS 2017-03-02 10PM: After analysis, it appears that the initial source of the excessive load was a series of directory searches that exposed a poorly indexed path. Compounding the problem was some system instrumentation designed to log "performance warnings"; the volume of requests that triggered this logging caused a knock-on effect, forcing a hard restart by CognisantMD operations staff.

We are addressing the issue in three ways:

- We have put in place a measure through a patch release tonight to help protect against such events in the future

- We have some index changes planned for the next major release of the Ocean server

- We are adding protective code to reduce redundant performance warning logging in the event of a system wide slowdown
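
As a rough illustration of the third measure, the sketch below shows one common way to suppress redundant warnings: emit at most one warning per key per interval and report how many were skipped. It is a minimal sketch under our own assumptions, not the code being added to the Ocean server; the class name, the key and the interval are hypothetical.

```python
import time


class ThrottledWarnings:
    """Emit at most one warning per key every `min_interval_seconds`."""

    def __init__(self, min_interval_seconds, clock=time.monotonic):
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._last_emitted = {}  # key -> time the last warning was written
        self._suppressed = {}    # key -> warnings skipped since then

    def warn(self, key, message, sink=print):
        """Write `message` via `sink` unless `key` was logged too recently.

        Returns True if the warning was written, False if it was suppressed.
        """
        now = self._clock()
        last = self._last_emitted.get(key)
        if last is not None and now - last < self._min_interval:
            self._suppressed[key] = self._suppressed.get(key, 0) + 1
            return False
        skipped = self._suppressed.pop(key, 0)
        suffix = f" ({skipped} similar warnings suppressed)" if skipped else ""
        sink(message + suffix)
        self._last_emitted[key] = now
        return True


if __name__ == "__main__":
    warnings = ThrottledWarnings(min_interval_seconds=10)
    for _ in range(1000):
        # Only the first call in each 10-second window writes a log line.
        warnings.warn("slow-directory-search", "performance warning: slow search")
```

The point of the design is that the cost of logging stays roughly constant during a system-wide slowdown instead of growing with the number of slow requests.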


Critical patch release Feb 16 at 12PM EST

We will be deploying a critical patch release between 12:00 and 1PM EST today to address an issue that prevented EMR users without Ocean accounts from being able to send messages. We expect the Ocean system to be unavailable for approximately 30s. We apologize for any inconvenience this may cause.


Service Interruption January 25, 2017 10:41-10:45 AM

The Ocean system was unavailable from 10:41 to 10:45 AM EST (4 minutes) on January 25th due to an unexpected lock related to a database integrity check run manually by the Ocean system operations team. We apologize for any inconvenience it may have caused.


Critical patch release January 19 at 1PM EST

We will be deploying a critical patch release between 12:30 and 1PM EST today to address a couple of critical bugs from last night's release. We expect the Ocean system to be unavailable for approximately 30s. We apologize for any inconvenience this may cause.


Critical bug in tablet settings management [resolved]

We are investigating a bug that affects tablet group setting management introduced as part of last night's release (Dec 13). It will cause favourites layout items and rules to be duplicated in the display. If the user clicks "save", the "doubled" favourites and rules will be saved to the database and may cause the forms to appear twice to the patient.

We are working on a fix and will deploy it tonight and fix any affected settings groups. In the meantime, please be very careful when updating tablet settings groups to ensure you don't save duplicate rules. If you can wait until tomorrow to update tablet settings rules, please do so.

Please let us know if you have any questions or concerns by contacting us at ocean.tips/support, and we apologize for the inconvenience.

Update (2016-12-14 10:15PM): The fix has been deployed to our production data centre and we ran a script to repair a number of duplicated rules and favourites (we contacted those sites directly to let them know, so if you didn't hear from us, you were not affected). Again, we apologize for the inconvenience.


System Interruption October 4 10:41 EST (2 minutes)

The Ocean system was offline for approximately 2 minutes this morning at 10:41 EST. During some routine maintenance (storage increase for backups) on our disaster recovery server, the production database server was accidentally rebooted instead of the disaster recovery database server. We apologize for any inconvenience this may have caused.


System Interruption Sept 14 3:43 EST (2 minutes)

We experienced a brief service interruption on the Ocean server between 3:43 and 3:45 PM EST today for a duration of just under 2 minutes. Some users may have experienced delays or errors during certain system operations.

We are looking into the root cause and will update as more is known.


Ocean on Android 6 Advisory

With Android 6, Google changed the way that a tablet app can identify its hardware by removing access to the wireless "MAC address". Ocean previously identified tablets uniquely in this manner and used the address for session management (like a username), but this no longer works in Android 6.

Fortunately, most tablets do not auto-update to Android 6, so the problem occurs only for certain tablets like newer Samsung and Lenovo tablets. In early July, we released a fix in version 125 of the Ocean tablet app that identifies tablets using a "hardware ID" instead. We highly recommend clients upgrade their tablets to v125 to "get ahead" of any potential Android 6 auto-upgrades.
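
The sketch below illustrates why sessions can get mixed up. It assumes that, on Android 6, every tablet reports the same placeholder value in place of its real MAC address (to our knowledge this is 02:00:00:00:00:00), and that sessions are keyed by whichever identifier the app reports. The data layout, function names and hardware ID values are hypothetical and do not reflect Ocean's actual implementation.

```python
# Placeholder assumed to be reported by every Android 6 tablet instead of a real MAC.
ANDROID6_PLACEHOLDER_MAC = "02:00:00:00:00:00"

tablets = [
    {"name": "Front desk", "mac": ANDROID6_PLACEHOLDER_MAC, "hardware_id": "hw-a1"},
    {"name": "Waiting room", "mac": ANDROID6_PLACEHOLDER_MAC, "hardware_id": "hw-b2"},
]


def session_key(tablet, use_hardware_id):
    """Pick the identifier used to look up a tablet's session."""
    return tablet["hardware_id"] if use_hardware_id else tablet["mac"]


def count_distinct_sessions(tablet_list, use_hardware_id):
    """Count how many separate sessions the server would see."""
    return len({session_key(t, use_hardware_id) for t in tablet_list})


if __name__ == "__main__":
    # Keyed by MAC address, both tablets collapse into a single session.
    print("by MAC address:", count_distinct_sessions(tablets, use_hardware_id=False))
    # Keyed by hardware ID, each tablet keeps its own session.
    print("by hardware ID:", count_distinct_sessions(tablets, use_hardware_id=True))
```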

If your tablets are running Android 6 and an Ocean app below v125, there are a few things that you might notice:

  • General connectivity issues based on the sessions getting mixed up (if you have 2 or more tablets on Android 6 with Ocean below v125)
  • Billing issues: if you notice redundant tablets on your month-end bill, it may be because your tablets changed identifiers and confused Ocean into thinking there were more tablets than in reality.

If you need help with either of the above, please contact ocean.tips/support and we'll help sort things out. You should also review your Tablets tab in the portal to ensure you don't have legacy registrations for tablets on your account.

Finally, in order to get ahead of the issue, next week we'll force upgrade all tablets below v125 to the latest version. If you have a large number of tablets running below v125, you may wish to upgrade them in advance. To learn how to do this, please refer to "Upgrading Your Tablet".


Ocean Studies Data Issue (RESOLVED)

We are investigating a bug that caused Ocean Studies submissions to be lost retroactively when using older versions of Ocean Tablets. The bug was introduced in Thursday night's Ocean server upgrade as a side-effect of the feature to eliminate redundant submissions to an Ocean Study in the same patient session.

We have fixed the bug and we deployed the fix at 10am this morning. We are in the process of recovering data from backups and replication transaction logs. 

At this point, we expect to have full data recovery completed by the end of the day. We apologize for the inconvenience. Please contact us at ocean.tips/support if you have any concerns.

2016-03-23 11:30PM: Issue resolved. We have restored all study submissions and verified against audit logs. Your Ocean Study repositories are back to normal and we do not believe that any data was lost (although some data was unavailable during the day today for export). We apologize for the inconvenience.


Critical system maintenance - Friday, February 19, 2016 at 10pm ET **COMPLETED**

A critical vulnerability has been identified in a library used by the operating system running Ocean servers. The vulnerability was identified by security researchers at Google and Red Hat, and a patch has been made generally available.  We have applied this patch to all Ocean servers; however, a reboot is required to complete the patch application.  As this reboot will impact Ocean services, we are scheduling it outside business hours.  We expect the reboot will take approximately 5 minutes, during which time you may encounter errors while using Ocean tablets or Ocean EMR integrations and the Ocean Portal will be unavailable.  If the outage lasts longer than 5 minutes, please contact CognisantMD support.

Technical details on the vulnerability can be found here.

Although this is a critical vulnerability, we have no reason to believe that any Ocean server has been compromised in any way.  Please contact support if you have any concerns.

10:40pm: reboot process complete, all servers responding