Merge pull request #3134 from anthonybocci/2.4

Create documentation about incidents and metrics
2025-03-14 20:39:44 +01:00 · 2018-06-25 08:51:04 +01:00 · 2018-06-25 08:51:04 +01:00 · bb097a1dad
commit bb097a1dad
parent 7544618827 360f163a88
4 changed files with 142 additions and 0 deletions
--- a/docs/component-statuses.md
+++ b/docs/component-statuses.md
@ -0,0 +1,15 @@
+# Component Statuses
+
+Unlike incident statuses, component statuses in Cachet are numbered from 1.
+When creating or updating a component, you need to specify a status for it.
+
+A status can be one of the following:
+
+Status|Name|Description
+------|----|-----------
+1|Operational|The component is working.
+2|Performance Issues|The component is experiencing some slowness.
+3|Partial Outage|The component may not be working for everybody. This could be a geographical issue, for example.
+4|Major Outage|The component is not working for anybody.
+
+
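Note on docs/component-statuses.md: the status table maps naturally onto an enumeration. Below is a minimal Python sketch, assuming client code wants to validate a status code before using it. The names `ComponentStatus` and `parse_component_status` are hypothetical and are not part of Cachet.

```python
from enum import IntEnum


class ComponentStatus(IntEnum):
    """Status codes copied from the table in docs/component-statuses.md."""

    OPERATIONAL = 1
    PERFORMANCE_ISSUES = 2
    PARTIAL_OUTAGE = 3
    MAJOR_OUTAGE = 4


def parse_component_status(value: int) -> ComponentStatus:
    """Return the matching status, rejecting any code outside 1..4.

    Hypothetical helper: component statuses start at 1, so 0 is rejected.
    """
    try:
        return ComponentStatus(value)
    except ValueError:
        raise ValueError(
            f"component status must be between 1 and 4, got {value!r}"
        ) from None


if __name__ == "__main__":
    print(parse_component_status(3).name)
```

Using an `IntEnum` keeps the numeric value available (for example `int(ComponentStatus.MAJOR_OUTAGE)`) while giving each code a readable name.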
--- a/docs/incidents/index.md
+++ b/docs/incidents/index.md
@ -0,0 +1,41 @@
+# Incidents
+
+An incident is something that should not happen but happens anyway.
+
+## What exactly is an incident
+
+On your status page you show the state of some components. A component may be
+a server, a database, or whatever you want.  
+If your database server crashes, that is an incident.
+
+## Why should I create an incident
+
+Having a status page is good; being honest about the state of your
+components is better.  
+A status page is not only there to show a green light; it is also there to
+explain why something bad is happening and when it will be fixed.
+
+So when a component experiences a problem, it is good practice to create an
+incident.
+
+## How to use incidents
+
+During an incident, it's good to keep the status page up to date with what is
+happening in the real world. That's what _incident updates_ are for.
+
+How you manage your incidents is up to you, but if you are unsure where to
+start, you can do the following:
+
+1. An incident happens. While a team works on a fix, someone creates an
+   incident. Be clear about what is happening. At the same time, set the
+   affected component to the right status (_Major Outage_, _Performance Issues_
+   or another).
+2. Once you identify the origin of the problem, add an _incident update_ to
+   explain what the problem is and whether it is serious.
+3. If you think the problem is fixed but are not sure, add an _incident update_
+   to say so. Explain that it should be fixed and that you are watching to
+   check that everything stays healthy.
+4. If it's not fixed, add an _incident update_ as in step 2, because the
+   problem is identified but not fixed. If it's fixed, congratulations! Add an
+   _incident update_ to explain the details and say it's definitely fixed. Do
+   not forget to set the component back to _Operational_.
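Note on docs/incidents/index.md: the four suggested steps can be read as a small state machine. The sketch below is an illustration only. The stage names (`reported`, `identified`, `watching`, `fixed`) and both functions are hypothetical labels for the steps in the list; they are not Cachet's incident status codes or API.

```python
# Hypothetical stage names for the four steps described in the guide.
# Each key maps to the set of stages an incident may move to next.
ALLOWED_NEXT = {
    "reported": {"identified"},            # step 1 -> step 2
    "identified": {"watching"},            # step 2 -> step 3
    "watching": {"identified", "fixed"},   # step 4: back to step 2, or done
    "fixed": set(),                        # terminal stage
}


def advance(current: str, new: str) -> str:
    """Return the new stage if the suggested workflow allows the move."""
    if current not in ALLOWED_NEXT:
        raise ValueError(f"unknown stage: {current!r}")
    if new not in ALLOWED_NEXT[current]:
        raise ValueError(f"cannot go from {current!r} to {new!r}")
    return new


def component_should_be_operational(stage: str) -> bool:
    """The guide says to set the component back to Operational once fixed."""
    return stage == "fixed"


if __name__ == "__main__":
    stage = "reported"
    for next_stage in ["identified", "watching", "fixed"]:
        stage = advance(stage, next_stage)
        print(stage, component_should_be_operational(stage))
```

Each call to `advance` corresponds to posting one _incident update_ in the workflow above.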
--- a/docs/metrics/create-metric.md
+++ b/docs/metrics/create-metric.md
@ -0,0 +1,54 @@
+# Create a metric
+
+This guide walks you through creating a metric.  
+You first need to know [what a metric is][1].
+
+## Filling the form
+
+Creating a metric is as simple as filling in a form. You just need to know what
+the fields mean.
+
+To open the metric creation form, follow these steps:
+
+- Log into your Cachet instance.
+- Once on the Dashboard, click `Metrics` in the sidebar.
+- Click the `Create a metric` button.
+
+And you are there! You should now see the metric form.  
+Here is what each field means:
+
+- `Name`: The name of the metric as it will be shown on the status page.
+  Example: "API response time".
+- `Suffix`: The suffix appended to the value in the tooltip shown when you
+  hover over a point on the metric. It is usually the unit of the raw data.
+  Example: "ms". If you send "42" to the metric, "42ms" is shown in the
+  tooltip.
+- `Description`: A description of the metric. What is the metric used for?
+  What does it measure? Example: "The average response time of our API".
+- `Calculation of metrics`: The computation applied to your data before it is
+  displayed in the metric. It may be either _Sum_ or _Average_. Example:
+  _Average_ to compute the average response time over a given period.
+- `Default view`: The time range shown by default. Viewing data from a year
+  ago is rarely useful, but whether you prefer to see the last hour, 12 hours,
+  week or month is up to you. Example: _Last 12 hours_ because you want to
+  see the last 12 hours of data by default. This is only the default view; it
+  can be changed in a select box.
+- `Decimal places`: The number of decimal places displayed for each point. If
+  you compute the average of something, you will almost certainly get a value
+  with a long fractional part, like 42.424242. Example: 2 to get 42.42 instead
+  of a long number.
+- `How many minutes of threshold between metric points?`: The number of minutes
+  between the points in the metric. Depending on your needs it may be 1, 5 or
+  even 30. It's really up to you. Example: 60 to get one point every hour.
+- `Display chart on the status page?`: If checked, this chart is displayed on
+  the status page. You can also create the metric without displaying it.
+- `Visibility`: Who should be able to see the chart? You have three choices:
+    - `Visible to authenticated users`: Only authenticated users can see the
+      chart. Useful for an internal metric.
+    - `Visible to everybody`: Every user, authenticated or not, can see the
+      chart.
+    - `Always hidden`: Nobody can see the chart.
+
+
+
+[1]: index.md
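Note on docs/metrics/create-metric.md: the `Suffix` and `Decimal places` fields together determine the tooltip text. The Python sketch below only illustrates the examples given in the field list; `format_tooltip` is a hypothetical helper, and the use of rounding (rather than truncation) is an assumption, since the guide does not say how Cachet shortens the number.

```python
def format_tooltip(value: float, suffix: str = "", decimal_places: int = 0) -> str:
    """Render a metric point as shown in the tooltip: number, then suffix.

    Assumption: the value is rounded to `decimal_places` digits.
    """
    if decimal_places < 0:
        raise ValueError("decimal_places must be zero or positive")
    return f"{value:.{decimal_places}f}{suffix}"


if __name__ == "__main__":
    # The two examples from the field descriptions.
    print(format_tooltip(42, "ms"))
    print(format_tooltip(42.424242, "ms", decimal_places=2))
```

With the `Suffix` example, sending `42` gives `42ms`; with the `Decimal places` example, `42.424242` becomes `42.42` followed by the suffix.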
--- a/docs/metrics/index.md
+++ b/docs/metrics/index.md
@ -0,0 +1,32 @@
+# Metrics
+
+This guide explains the basics of metrics.
+
+## What are metrics
+
+When you monitor your services, servers, APIs or other systems, you collect
+raw data. This data may be the response time of a request, the number of
+queries handled in a minute, etc.
+
+Metrics are this raw data. Using [Cachet's API][1] you can send the data
+about what you are monitoring to Cachet.
+
+
+## What metrics can do for you
+
+Having good metrics to show can be valuable for customers or partners.
+
+Do you have a big web service that is under pressure? Then a short response
+time is important. A metric can show your users that the web service is
+responding fast!  
+Imagine you have a metric named "Response time". Every 10 seconds you call your
+web service and send the response time to that metric through Cachet's API. On
+your status page you will then be able to see, for example, the average
+response time per minute.
+
+This way, your users could see that during the last 10 minutes your response
+time was worse than before, and that it is starting to improve.
+
+
+
+[1]: api-documentation.md
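Note on docs/metrics/index.md: the "Response time" example (one sample every 10 seconds, shown as an average per minute) can be reproduced offline to see what the _Sum_ and _Average_ calculations do. This is a sketch under assumptions, not Cachet's implementation: `aggregate`, its parameters, and the `(timestamp, value)` tuple format are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate(points, threshold_minutes=1, calculation="average"):
    """Group (unix_timestamp_seconds, value) samples into fixed-size buckets.

    Returns a dict mapping each bucket's start timestamp to the sum or the
    average of the values that fall inside it.
    """
    if calculation not in ("sum", "average"):
        raise ValueError("calculation must be 'sum' or 'average'")
    bucket_seconds = threshold_minutes * 60
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each sample to the start of its bucket.
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[bucket_start].append(value)
    combine = sum if calculation == "sum" else mean
    return {start: combine(values) for start, values in sorted(buckets.items())}


if __name__ == "__main__":
    # Response times in ms, sampled every 10 seconds for two minutes.
    samples = [(t, 100 + t) for t in range(0, 120, 10)]
    print(aggregate(samples, threshold_minutes=1, calculation="average"))
```

For a response time, _Average_ is the natural choice; _Sum_ suits counters such as the number of queries handled per minute.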