The Event Platform
#Key platform behaviors
Your subscribing service implementation and architecture must deal with the following capabilities and constraints to consume events from the platform at its best.
#At least once
We're not in a "at most once" paradigm but in a at least once
one. This paradigm involves two things your service must compose with:
- Expect duplicates
- Expect un-ordered events
To help identify duplicated events and deal with un-ordered events if it's something critical for your business, you can rely on both fields id
and time
from the event, which provide unique identifiers and publication timestamp for each event.
#Optimized throughput
Our delivery engine will try to deliver events as fast as possible but will adapt the throughput within the limits described in this section.
If your subscription has an HTTPS destination, please respond with 429 Too Many Requests
when your system is overloaded: this way, the delivery engine will slow down, retry undelivered events, and gradually increase the throughput when your system gets back to recovers (i.e., when it responds 200 OK
again).
Please note that if the throughput drops too significantly, the suspension policy will be triggered (see bellow).
#Delivery timeout
Specifically for the HTTPS destination type, the delivery timeout ensures that messages are processed within a specified time frame. Your endpoint is expected to handle requests within 3 seconds
. If processing exceeds this duration, the event will enter the retry process (see bellow).
Under normal circumstances, your HTTPS endpoint must handle the event as fast as possible.
Our recommendation is to put the message in a queuing system
or in a database
for asynchronous processing.
#Retry policy for transient failures
Our retry policy helps ensure event delivery during transient failures, allowing for multiple attempts to deliver messages when temporary issues arise. The retry back-off mechanism is managed internally and is not configurable on a per-tenant basis.
If your destination is unable to ingest an event, we will retry deliver as follow:
- First retry: 5 minutes after the previous attempt.
- Second retry: 10 minutes after the previous attempt.
- Third retry: 20 minutes after the previous attempt.
These retries are on a best-effort basis and apply only to transient errors or timeouts. After three retry attempts, the message is dropped. This type of failure may trigger the suspension policy (see bellow).
#Suspension policy
Our system enforces a strict suspension policy to ensure the integrity and reliability of event delivery. We distinguish two types of suspension conditions: criteria-based suspensions and threshold-based suspensions.
#Criteria-Based Suspension:
Your subscription is immediately suspended if you meet one of these conditions:
- Your HTTPS endpoint responds with a
404 Not Found
HTTP status. - Your HTTPS endpoint responds with
3xx Redirection
HTTP statuses, as we do not support redirections. - We're not authorized to publish message in your Google Cloud Topic.
#Threshold-Based Suspension:
This type of suspension is based on the success rate of your HTTPS endpoint. If the success rate drops below 95% within the last rolling hour, your subscription will be suspended.
Here are the errors type that decrease the success rate:
5XX Server Error
HTTP statuses429 Too Many Requests
HTTP statuses (while the minimum rate threshold is exceeded)4xx Client Error Response
HTTP status- Delivery timeout
When the platform suspends your subscription, a notification will be sent to the technical email address you provided, along with contextual information about the suspension.