Fix issue #1627 - stats usage #1629

Merged: 36 commits, Aug 7, 2024
Commits
816753d
added stats env variables
rg2011 Jul 26, 2024
f521e22
Merge branch 'master' into fix/issue-1627
rg2011 Jul 26, 2024
14bed96
Update CNR
rg2011 Jul 26, 2024
67674d7
Update admin.md
rg2011 Jul 29, 2024
679b7d1
Update admin.md
rg2011 Jul 29, 2024
0bb40cd
Update admin.md
rg2011 Jul 30, 2024
04bacb1
added prometheus exporter
rg2011 Jul 30, 2024
1664ca3
fix include path
rg2011 Jul 30, 2024
ad0cab2
fix paths
rg2011 Jul 30, 2024
539a6f0
fix stats collection
rg2011 Jul 30, 2024
89a74e7
unconditionally publish global stats
rg2011 Jul 30, 2024
908d6f3
fixed reference to hasOwnProperty
rg2011 Jul 30, 2024
22dfddb
improve metrics format
rg2011 Jul 30, 2024
b7bad16
fix some typos and add all global stats
rg2011 Jul 30, 2024
390a0fc
remove superflouos comments
rg2011 Jul 30, 2024
34b47e7
remove noise from the PR
rg2011 Jul 30, 2024
e549c8c
add support for openmetrics
rg2011 Jul 30, 2024
5142a8b
improve processing of accepts header
rg2011 Jul 30, 2024
3cd0463
Improve content-type negotation for openmetrics
rg2011 Jul 30, 2024
600791f
improve header negotiation
rg2011 Jul 30, 2024
90bf909
optimize not getting stats if header is wrong
rg2011 Jul 30, 2024
e2b07ff
improved negotation
rg2011 Jul 30, 2024
863ab5e
fix typo
rg2011 Jul 30, 2024
d0faa5d
fix accept headers
rg2011 Jul 30, 2024
50f904e
improve negotiation
rg2011 Jul 30, 2024
16da95b
Merge branch 'master' into fix/issue-1627
rg2011 Jul 31, 2024
32b70f8
Merge branch 'fix/issue-1627' into fix/prometheus
rg2011 Jul 31, 2024
836d5e7
improve comments
rg2011 Jul 31, 2024
31f1071
added doc
rg2011 Aug 1, 2024
2d43858
updated documetation
rg2011 Aug 1, 2024
f7eedbe
remove unneeded changes
rg2011 Aug 2, 2024
1b44fe3
reallocate metrics code
rg2011 Aug 2, 2024
14cc8d6
improve docs
rg2011 Aug 2, 2024
683121f
Update CHANGES_NEXT_RELEASE
rg2011 Aug 2, 2024
b770380
Remove push-based stats
rg2011 Aug 2, 2024
2c9772d
Update CHANGES_NEXT_RELEASE
rg2011 Aug 5, 2024
2 changes: 2 additions & 0 deletions CHANGES_NEXT_RELEASE
@@ -1,3 +1,5 @@
- Add: openmetrics-compatible `/metrics` endpoint in northbound API (#1627)
- Remove: push-based stats (including stats section in config file)
- Fix: service header to use uppercase in case of update and delete (#1528)
- Fix: Allow to send to CB batch update for multimeasures for NGSI-LD (#1623)
- Add: new JEXL transformations for including into an array keys that have a certain value: valuePicker and valuePickerMulti
12 changes: 0 additions & 12 deletions doc/admin.md
@@ -5,7 +5,6 @@
 - [loglevel](#loglevel)
 - [contextBroker](#contextbroker)
 - [server](#server)
-- [stats](#stats)
 - [authentication](#authentication)
 - [deviceRegistry](#deviceregistry)
 - [mongodb](#mongodb)
@@ -159,17 +158,6 @@ support nulls or multi-attribute requests if they are encountered.
 }
 ```

-#### `stats`
-
-It configures the periodic collection of statistics. Use `interval` in milliseconds to set the time between stats
-writings.
-
-```javascript
-stats: {
-    interval: 100;
-}
-```
-
 #### `authentication`

 Stores the authentication data, for use in retrieving tokens for devices with a trust token (required in scenarios with
53 changes: 53 additions & 0 deletions doc/api.md
@@ -69,6 +69,8 @@
- [Retrieve log level `GET /admin/log`](#retrieve-log-level-get-adminlog)
- [About operations](#about-operations)
- [List IoTA Information `GET /iot/about`](#list-iota-information-get-iotabout)
- [Metrics](#metrics)
- [Retrieve metrics `GET /metrics`](#retrieve-metrics-get-metrics)

<!-- /TOC -->

@@ -2224,6 +2226,57 @@ Example:
}
```

### Metrics

The IoT Agent Library exposes an [openmetrics-compatible](https://github.com/OpenObservability/OpenMetrics) endpoint for
telemetry collectors to gather application statistics.

#### Retrieve metrics `GET /metrics`

_**Response code**_

- `200` `OK` if successful.
- `406` `Wrong Accept Header` if the format requested in the `Accept` header is not supported.
- `500` `SERVER ERROR` if there was any error not covered above.

_**Response body**_

Returns the current value of the server stats:

- If the `Accept` header contains `application/openmetrics-text`, the response has content-type
`application/openmetrics-text; version=1.0.0; charset=utf-8`.
- Otherwise, if the `Accept` header is missing or accepts `text/plain` (explicitly or via `*/*`), the response has
content-type `text/plain; version=0.0.4; charset=utf-8` (legacy format for [prometheus](https://prometheus.io)).
- In any other case, it returns an error message with `406` status.
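
The negotiation rules listed above can be sketched as a small function. This is only an illustration under assumptions, not the library's implementation: the name `negotiateMetricsContentType` and the two constants are hypothetical, and the parser ignores `q` weights, which a full `Accept` parser would take into account.

```javascript
// Hypothetical helper illustrating the rules listed above; not the library's code.
const OPENMETRICS = 'application/openmetrics-text; version=1.0.0; charset=utf-8';
const PROMETHEUS_TEXT = 'text/plain; version=0.0.4; charset=utf-8';

// Returns the response content-type, or null when the caller should answer 406.
function negotiateMetricsContentType(acceptHeader) {
    // A missing (or empty) Accept header falls back to the legacy text format.
    if (!acceptHeader) {
        return PROMETHEUS_TEXT;
    }
    // Keep only the media type of each entry, dropping parameters such as q or version.
    const mediaTypes = acceptHeader
        .split(',')
        .map((entry) => entry.split(';')[0].trim().toLowerCase());

    if (mediaTypes.includes('application/openmetrics-text')) {
        return OPENMETRICS;
    }
    if (mediaTypes.includes('text/plain') || mediaTypes.includes('*/*')) {
        return PROMETHEUS_TEXT;
    }
    return null;
}
```

A handler built on this sketch would check for `null` first and answer `406` before collecting any stats, so that a wrong header costs as little as possible.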

For the kind of metrics exposed by the application, the payload itself is identical for both content-types and
follows the openmetrics specification, e.g.:

```
# HELP deviceCreationRequests global metric for deviceCreationRequests
# TYPE deviceCreationRequests counter
deviceCreationRequests 0
# HELP deviceRemovalRequests global metric for deviceRemovalRequests
# TYPE deviceRemovalRequests counter
deviceRemovalRequests 0
# HELP measureRequests global metric for measureRequests
# TYPE measureRequests counter
measureRequests 0
# HELP raiseAlarm global metric for raiseAlarm
# TYPE raiseAlarm counter
raiseAlarm 0
# HELP releaseAlarm global metric for releaseAlarm
# TYPE releaseAlarm counter
releaseAlarm 0
# HELP updateEntityRequestsOk global metric for updateEntityRequestsOk
# TYPE updateEntityRequestsOk counter
updateEntityRequestsOk 2
# HELP updateEntityRequestsError global metric for updateEntityRequestsError
# TYPE updateEntityRequestsError counter
updateEntityRequestsError 5
# EOF
```
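
As a rough sketch of how a payload like the one above could be produced from a plain dictionary of counters, consider the following. The function name `renderOpenMetrics` and the input shape are assumptions made for illustration; the library's own handler may be organized differently.

```javascript
// Hypothetical renderer: turns { statName: value } into the text format shown above.
function renderOpenMetrics(stats) {
    const lines = [];
    for (const [name, value] of Object.entries(stats)) {
        lines.push(`# HELP ${name} global metric for ${name}`);
        lines.push(`# TYPE ${name} counter`);
        lines.push(`${name} ${value}`);
    }
    // The example payload ends with an EOF marker, so the sketch emits one too.
    lines.push('# EOF');
    return lines.join('\n') + '\n';
}

console.log(renderOpenMetrics({ updateEntityRequestsOk: 2, updateEntityRequestsError: 5 }));
```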

[1]:
https://czosel.github.io/jexl-playground/#/?context=%7B%0A%20%20%22longitude%22%3A%205%2C%0A%20%20%22latitude%22%3A%2037%2C%0A%20%20%22level%22%3A223%0A%7D&input=%7Bcoordinates%3A%20%5Blongitude%2Clatitude%5D%2C%20type%3A%20'Point'%7D&transforms=%7B%0A%7D
[2]:
2 changes: 2 additions & 0 deletions doc/deprecated.md
@@ -25,6 +25,7 @@ A list of deprecated features and the version in which they were deprecated foll
- Support to legacy expressions (finally removed in 3.2.0)
- Bidirectional plugin (finally removed in 3.4.0)
- appendMode configuration (`IOTA_APPEND_MODE` env var) (finally removed in 3.4.0)
- `config.stats` section, and push-mode statistics.

The use of Node.js v14 is highly recommended.

@@ -57,3 +58,4 @@ The following table provides information about the last iotagent-node-lib versio
| Support to Legacy Expressions | 3.1.0 | April 25th, 2023 |
| bidirectional plugin | 3.3.0 | August 24th, 2023 |
| appendMode configuration (`IOTA_APPEND_MODE` env var) | 3.3.0 | August 24th, 2023 |
| push-mode stats | 4.5.0 | June 11th, 2024 |
16 changes: 9 additions & 7 deletions doc/devel/development.md
@@ -226,18 +226,20 @@ npm run prettier:text

 ### Stats Registry

-The library provides a mechanism for the periodic reporting of stats related to the library's work. In order to activate
-the use of the periodic stats, it must be configured in the config file, as described in the
-[Configuration](../admin.md#configuration) section.
-
-The Stats Registry holds two dictionaries, with the same set of stats. For each stat, one of the dictionaries holds the
-historical global value and the other one stores the value since the last value reporting (or current value).
+The library provides a mechanism for the collection of stats related to the library's work. The Stats Registry holds a
+dictionary with the historical global value of each stat.

 The stats library currently stores only the following values:

 - **deviceCreationRequests**: number of Device Creation Requests that arrived to the API (no matter the result).
 - **deviceRemovalRequests**: number of Removal Device Requests that arrived to the API (no matter the result).
 - **measureRequests**: number of times the ngsiService.update() function has been invoked (no matter the result).
+- **raiseAlarm**: number of times the alarmManagement.raise() function has been invoked.
+- **releaseAlarm**: number of times the alarmManagement.release() function has been invoked.
+- **updateEntityRequestsOk**: number of times the ngsiService.sendUpdateValue() function has been invoked
+successfully.
+- **updateEntityRequestsError**: number of times the ngsiService.sendUpdateValue() function has been invoked and
+failed.

 More values will be added in the future to the library. The applications using the library can add values to the Stats
 Registry just by using the following function:
@@ -247,7 +249,7 @@ iotagentLib.statsRegistry.add('statName', statIncrementalValue, callback);
 ```

 The first time this function is invoked, it will add the new stat to the registry. Subsequent calls will add the value
-to the specified stat both to the current and global measures. The stat will be cleared in each interval as usual.
+to the specified stat.

 ### Alarm module

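To make the Stats Registry semantics described in `doc/devel/development.md` concrete, here is a minimal in-memory sketch with the same `add(name, value, callback)` call shape. It is an illustration under assumptions, not the library's `statsRegistry` module: `createStatsRegistry` and `getAll` are hypothetical names.

```javascript
// Hypothetical in-memory registry mirroring the add(name, value, callback) call shape.
function createStatsRegistry() {
    const globalStats = {};

    return {
        // Creates the stat on first use, then accumulates the increment.
        add(name, value, callback) {
            globalStats[name] = (globalStats[name] || 0) + value;
            callback(null);
        },
        // Returns a copy so callers cannot mutate the registry by accident.
        getAll() {
            return { ...globalStats };
        }
    };
}

const registry = createStatsRegistry();
registry.add('statName', 1, () => {});
registry.add('statName', 2, () => {});
console.log(registry.getAll()); // { statName: 3 }
```

With push-based reporting removed, there is a single dictionary of global values and nothing is cleared periodically, so this sketch keeps only one counter per stat.
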
27 changes: 12 additions & 15 deletions lib/fiware-iotagent-lib.js
@@ -45,22 +45,19 @@ const context = {
     op: 'IoTAgentNGSI.Global'
 };

+/* eslint-disable-next-line no-unused-vars */
 function activateStatLogs(newConfig, callback) {
-    if (newConfig.stats && newConfig.stats.interval) {
-        async.series(
-            [
-                apply(statsRegistry.globalLoad, {
-                    deviceCreationRequests: 0,
-                    deviceRemovalRequests: 0,
-                    measureRequests: 0
-                }),
-                apply(statsRegistry.addTimerAction, statsRegistry.logStats)
-            ],
-            callback
-        );
-    } else {
-        callback();
-    }
+    async.series([
+        apply(statsRegistry.globalLoad, {
+            deviceCreationRequests: 0,
+            deviceRemovalRequests: 0,
+            measureRequests: 0,
+            raiseAlarm: 0,
+            releaseAlarm: 0,
+            updateEntityRequestsOk: 0,
+            updateEntityRequestsError: 0
+        })
+    ], callback);
 }

 /**
5 changes: 1 addition & 4 deletions lib/model/dbConn.js
@@ -190,10 +190,7 @@ function configureDb(callback) {
     /*jshint camelcase:false, validthis:true */
     const currentConfig = config.getConfig();

-    if (
-        (currentConfig.deviceRegistry && currentConfig.deviceRegistry.type === 'mongodb') ||
-        (currentConfig.stats && currentConfig.stats.persistence === true)
-    ) {
+    if (currentConfig.deviceRegistry && currentConfig.deviceRegistry.type === 'mongodb') {

Member: Should the https://github.com/telefonicaid/iotagent-node-lib/blob/master/doc/admin.md#stats section in the documentation be removed, or is it still used?

Contributor Author: It has been removed; it is no longer used.

Member: OK, I missed that. NTC

         if (!currentConfig.mongodb || !currentConfig.mongodb.host) {
             logger.fatal(context, 'MONGODB-003: No host found for MongoDB driver.');
             callback(new errors.BadConfiguration('No host found for MongoDB driver'));
3 changes: 3 additions & 0 deletions lib/services/common/alarmManagement.js
@@ -22,6 +22,7 @@
*/

let alarmRepository = {};
const statsRegistry = require('../stats/statsRegistry');
const logger = require('logops');
const context = {
op: 'IoTAgentNGSI.Alarms'
@@ -41,6 +42,7 @@ function raise(alarmName, description) {
};

logger.error(context, 'Raising [%s]: %j', alarmName, description);
statsRegistry.add('raiseAlarm', 1, function () {});
}
}

@@ -53,6 +55,7 @@ function release(alarmName) {
if (alarmRepository[alarmName]) {
delete alarmRepository[alarmName];
logger.error(context, 'Releasing [%s]', alarmName);
statsRegistry.add('releaseAlarm', 1, function () {});
}
}

4 changes: 3 additions & 1 deletion lib/services/ngsi/ngsiService.js
@@ -26,6 +26,7 @@

 const async = require('async');
 const apply = async.apply;
+const statsRegistry = require('../stats/statsRegistry');
 const intoTrans = require('../common/domain').intoTrans;
 const fillService = require('./../common/domain').fillService;
 const errors = require('../../errors');
@@ -67,7 +68,8 @@ function init() {
  * @param {String} token User token to identify against the PEP Proxies (optional).
  */
 function sendUpdateValue(entityName, attributes, typeInformation, token, callback) {
-    entityHandler.sendUpdateValue(entityName, attributes, typeInformation, token, callback);
+    const newCallback = statsRegistry.withStats('updateEntityRequestsOk', 'updateEntityRequestsError', callback);

Member: At this moment, only this function and the raise/release alarm functions feed statsRegistry. Should it also be added for create/delete groups/devices and so on?

Contributor Author (rg2011, Aug 5, 2024):

I only added metrics for things I want to alert on in our monitoring systems. Currently, I want to create alerts:

  • If the number of failed updates rises well above average,
  • If the number of successful updates falls well below average,
  • If the number of alarm RAISE events is higher than the number of alarm RELEASE events.

So I only added metrics for those three use cases. It might be better to add metrics on demand, when some use case requires them.
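
The three alert conditions described in this comment could be expressed roughly as follows. Everything here is hypothetical: the function name, the shape of the inputs (per-interval increases of the counters plus long-run averages) and the `factor` threshold are assumptions chosen for illustration. In practice such rules would normally live in the monitoring system rather than in application code.

```javascript
// Hypothetical alert evaluation over per-interval counter increases.
// `delta` and `average` use the metric names from this PR; `factor` is an assumed threshold.
function evaluateAlerts(delta, average, factor = 2) {
    const alerts = [];
    if (delta.updateEntityRequestsError > average.updateEntityRequestsError * factor) {
        alerts.push('failed updates well above average');
    }
    if (delta.updateEntityRequestsOk < average.updateEntityRequestsOk / factor) {
        alerts.push('successful updates well below average');
    }
    if (delta.raiseAlarm > delta.releaseAlarm) {
        alerts.push('more alarms raised than released');
    }
    return alerts;
}
```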

+    entityHandler.sendUpdateValue(entityName, attributes, typeInformation, token, newCallback);
 }

 /**
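The `ngsiService.js` change calls `statsRegistry.withStats(okStat, errorStat, callback)`, but the body of that function is not shown in this page. One plausible shape for such a wrapper, assuming Node-style error-first callbacks, is sketched below. The `makeWithStats` factory and the `add` function passed to it are stand-ins introduced for illustration; the actual implementation in the PR may differ.

```javascript
// Hypothetical sketch of a callback wrapper that counts successes and failures.
// `add` stands in for statsRegistry.add(name, value, callback).
function makeWithStats(add) {
    return function withStats(okStat, errorStat, callback) {
        return function (error, ...results) {
            // Error-first convention: a truthy first argument means failure.
            add(error ? errorStat : okStat, 1, function () {});
            callback(error, ...results);
        };
    };
}

// Usage with a throwaway in-memory counter.
const counters = {};
const withStats = makeWithStats((name, value, done) => {
    counters[name] = (counters[name] || 0) + value;
    done();
});
withStats('ok', 'ko', () => {})(null, 'result');
withStats('ok', 'ko', () => {})(new Error('boom'));
console.log(counters); // { ok: 1, ko: 1 }
```

Wrapping the callback keeps the counting out of `entityHandler.sendUpdateValue` itself, which matches how the diff passes `newCallback` in place of `callback`.
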
2 changes: 2 additions & 0 deletions lib/services/northBound/northboundServer.js
@@ -33,6 +33,7 @@ const intoTrans = domainUtils.intoTrans;
const deviceProvisioning = require('./deviceProvisioningServer');
const deviceUpdating = require('./deviceProvisioningServer');
const groupProvisioning = require('./deviceGroupAdministrationServer');
const statsRegistry = require('../stats/statsRegistry');
const logger = require('logops');
const context = {
op: 'IoTAgentNGSI.NorthboundServer'
@@ -83,6 +84,7 @@ function start(config, callback) {
northboundServer.router.get('/version', middlewares.retrieveVersion);
northboundServer.router.put('/admin/log', middlewares.changeLogLevel);
northboundServer.router.get('/admin/log', middlewares.getLogLevel);
northboundServer.router.get('/metrics', statsRegistry.openmetricsHandler);

northboundServer.app.use(baseRoot, northboundServer.router);
contextServer.loadContextRoutes(northboundServer.router);