ATS, AWS CloudWatch and Slack integration

Discussion in 'App Development' started by fan27, Sep 26, 2019.

  1. fan27


    I have open-sourced the monitoring solution that I use for AlgoTerminal live trading. Although my logger is written in C#, a logger in any language will work as long as it logs structured events (JSON) that include the following properties:

    StrategyId, EventType and EventSubType.
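For anyone writing a logger in another language, here is a minimal Python sketch of what one structured event might look like. Only StrategyId, EventType and EventSubType come from the post above; the Timestamp and Message fields, the make_event helper and the example values are assumptions for illustration, not the actual schema of the C# logger.

```python
import json
from datetime import datetime, timezone


def make_event(strategy_id, event_type, event_sub_type, message=""):
    """Build one structured log event and return it as a single JSON line."""
    event = {
        # Assumed field: a UTC timestamp so events can be ordered later.
        "Timestamp": datetime.now(timezone.utc).isoformat(),
        # The three properties the monitoring setup relies on.
        "StrategyId": strategy_id,
        "EventType": event_type,
        "EventSubType": event_sub_type,
        # Assumed field: free-form text for humans.
        "Message": message,
    }
    return json.dumps(event)


if __name__ == "__main__":
    # Hypothetical strategy and event names.
    print(make_event("MeanReversion01", "HealthCheck", "BarCompleted", "bar closed"))
```

Writing one JSON object per line keeps the log file easy to ship to a log service and easy to filter on those three properties.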

    This solution works with AWS cloud hosting, on-premises hosting or any other hosting provider. Here is how it works.

    1. Both strategy execution and health check events are logged to disk and then posted to AWS CloudWatch. The ATS (Automated Trading System) log file is also posted to CloudWatch.

    2. AWS CloudWatch Alarms are set up for health check events (must be a certain number of events per duration of time) and for the ATS log stream (the alarm is triggered if an error is encountered). If an alarm is triggered, a Lambda function posts the alarm to Slack.

    3. Each strategy can have its own Slack channel and we can filter exactly what kind of events go to the Slack channel.
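The per-strategy channels and event filtering in step 3 could be sketched roughly as below. This is not the code from the repos; the ROUTES table, the route_event function, the channel name and the event type names are all hypothetical, and the actual posting to Slack is left out so the sketch stays self-contained.

```python
import json

# Hypothetical routing table: one entry per StrategyId, with the Slack channel
# to use and the (EventType, EventSubType) pairs that should be forwarded.
ROUTES = {
    "MeanReversion01": {
        "channel": "#mean-reversion-01",
        "allow": {("Execution", "OrderFilled"), ("Execution", "OrderRejected")},
    },
}


def route_event(raw_line, routes=ROUTES):
    """Return a dict with channel and text for the event, or None if filtered out."""
    event = json.loads(raw_line)
    route = routes.get(event.get("StrategyId"))
    if route is None:
        # Unknown strategy: nothing is configured, so nothing is forwarded.
        return None
    key = (event.get("EventType"), event.get("EventSubType"))
    if key not in route["allow"]:
        # Known strategy, but this kind of event is not wanted in Slack.
        return None
    text = f'[{event["StrategyId"]}] {key[0]}/{key[1]}: {event.get("Message", "")}'
    return {"channel": route["channel"], "text": text}
```

Because the filter only reads StrategyId, EventType and EventSubType, changing what reaches a channel is a change to the routing table and not to the trading system.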

    Below are the repos for the logger and the AWS Lambda functions for sending log events to Slack.
    https://github.com/fasterquant

    Here is a lambda for sending the alarms to Slack.
    https://github.com/blueimp/aws-lambda/tree/master/cloudwatch-alarm-to-slack

    Sorry I don't have more information on setting this all up. I plan on writing some blog articles, but there are many pieces in AWS and I have not had the time to document them all and get them into a presentable format.

    Below is my Slack account. All strategy events are going to one channel, but I have since implemented the ability for each strategy to have its own channel.

    AtsSlackIntegration.JPG
     
  2. Why so complex? Why don't you address the Slack API directly from within your application and log to Slack? I have done so successfully with the former HipChat for years. Is it because you don't want to handle message scheduling on your own? Also, what do you mean by "must be a certain number of events per duration of time"? That a single event may not trigger a posting on your Slack channel? Isn't that dangerous?

     
  3. fan27


    Reporting on the "health" of the system requires the notification mechanism to be on another system. For example, each strategy writes a health check event each time a bar completes. I have a metric in AWS CloudWatch that will trigger an alarm if I do not receive X health check events in a five-minute period. If the notification system is tightly coupled to the ATS or even on the same system, an application or system crash will not result in a notification. But to your point, if everything is running smoothly, you can post directly from your ATS. Another advantage of my design is that I have decoupled notifications from my ATS, meaning I can completely change how and what I get notified about without making any updates to my live ATS.
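To make the health check rule concrete, here is a small plain-Python illustration of the logic: raise an alarm when fewer than the expected number of health check events arrived in the last five minutes. It only mimics the rule described above and is not CloudWatch configuration; the health_alarm function and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def health_alarm(event_times, now, min_events, window=timedelta(minutes=5)):
    """Return True (alarm) if fewer than min_events fall within the trailing window."""
    start = now - window
    # Count only events that arrived inside (now - window, now].
    recent = [t for t in event_times if start < t <= now]
    return len(recent) < min_events


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # A crashed system sends no events at all, so the alarm fires.
    print(health_alarm([], now, min_events=1))
```

The key point is that silence counts as a failure: the check runs outside the trading system, so a crash that stops all logging still produces a notification.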
     
  4. Cool, I did not initially catch that you are monitoring system health as well. That makes sense then. Thanks for explaining.

     