32 Commits

Author SHA1 Message Date
Costa Tsaousis
cd584e0357
ZSTD and GZIP/DEFLATE streaming support (#16268)
* move compression header to compression.h

* prototype with zstd compression

* updated capabilities

* no need for resetting compression

* left-over reset function

* use ZSTD_compressStream() instead of ZSTD_compressStream2() for backwards compatibility

* remove call to LZ4_decoderRingBufferSize()

* debug signature failures

* fix the buffers of lz4

* fix decoding of zstd

* detect compression based on initialization; prefer ZSTD over LZ4

* allow both lz4 and zstd

* initialize zstd streams

* define missing ZSTD_CLEVEL_DEFAULT

* log zero compressed size

* debug log

* flush compression buffer

* add sender compression statistics

* removed debugging messages

* do not fail if zstd is not available

* cleanup and buildinfo

* fix max message size, use zstd level 1, add compression ratio reporting

* use compression level 1

* fix ratio title

* better compression error logs

* for backwards compatibility use buffers of COMPRESSION_MAX_CHUNK

* switch to default compression level

* additional streaming error conditions detection

* do not expose compression stats when compression is not enabled

* test for the right lz4 functions

* moved lz4 and zstd to their own files

* add gzip streaming compression

* gzip error handling

* added unittest for streaming compression

* eliminate a copy of the uncompressed data during zstd compression

* eliminate not needed zstd allocations

* cleanup

* decode gzip with Z_SYNC_FLUSH

* set the decoding gzip algorithm

* user configuration for compression levels and compression algorithms order

* fix exclusion of not preferred compressions

* remove now obsolete compression define, since gzip is always available

* rename compression algorithms order in stream.conf

* move common checks in compression.c

* cleanup

* backwards compatible error checking
2023-10-27 17:37:34 +03:00
Costa Tsaousis
ce75313de0
systemd-journal plugin (#15363) 2023-08-03 15:42:11 +03:00
vkalintiris
0e230a260e
Revert "Refactor RRD code. (#15423)" (#15723)
This reverts commit 440bd51e08.

dbengine was still being used for non-zero tiers
even on non-dbengine modes.
2023-08-03 13:13:36 +03:00
vkalintiris
440bd51e08
Refactor RRD code. (#15423)
* Storage engine.

* Host indexes to rrdb

* Move globals to rrdb

* Move storage_tiers_backfill to rrdb

* default_rrd_update_every to rrdb

* default_rrd_history_entries to rrdb

* gap_when_lost_iterations_above to rrdb

* rrdset_free_obsolete_time_s to rrdb

* libuv_worker_threads to rrdb

* ieee754_doubles to rrdb

* rrdhost_free_orphan_time_s to rrdb

* rrd_rwlock to rrdb

* localhost to rrdb

* rm extern from func decls

* mv rrd macro under rrd.h

* default_rrdeng_page_cache_mb to rrdb

* default_rrdeng_extent_cache_mb to rrdb

* db_engine_journal_check to rrdb

* default_rrdeng_disk_quota_mb to rrdb

* default_multidb_disk_quota_mb to rrdb

* multidb_ctx to rrdb

* page_type_size to rrdb

* tier_page_size to rrdb

* No storage_engine_id in rrdim functions

* storage_engine_id is provided by st

* Update to fix merge conflict.

* Update field name

* Remove unnecessary macros from rrd.h

* Rm unused type decls

* Rm duplicate func decls

* make internal function static

* Make the rest of public dbengine funcs accept a storage_instance.

* No more rrdengine_instance :)

* rm rrdset_debug from rrd.h

* Use rrdb to access globals in ML and ACLK

Missed due to not having the submodules in the
worktree.

* rm total_number

* rm RRDVAR_TYPE_TOTAL

* rm unused inline

* Rm names from typedef'd enums

* rm unused header include

* Move include

* Rm unused header include

* s/rrdhost_find_or_create/rrdhost_get_or_create/g

* s/find_host_by_node_id/rrdhost_find_by_node_id/

Also, remove duplicate definition in rrdcontext.c

* rm macro used only once

* rm macro used only once

* Reduce rrd.h api by moving funcs into a collector specific utils header

* Remove unused func

* Move parser specific function out of rrd.h

* return storage_number instead of void pointer

* move code related to rrd initialization out of rrdhost.c

* Remove tier_grouping from rrdim_tier

Saves 8 * storage_tiers bytes per dimension.

* Fix rebase

* s/rrd_update_every/update_every/

* Mark functions as static and constify args

* Add license notes and file to build systems.

* Remove remaining non-log/config mentions of memory mode

* Move rrdlabels api to separate file.

Also, move localhost functions that loads
labels outside of database/ and into daemon/

* Remove function decl in rrd.h

* merge rrdhost_cache_dir_for_rrdset_alloc into rrdset_cache_dir

* Do not expose internal function from rrd.h

* Rm NETDATA_RRD_INTERNALS

Only one function decl is covered. We have more
database internal functions that we currently
expose for no good reason. These will be placed
in a separate internal header in follow up PRs.

* Add license note

* Include libnetdata.h instead of aral.h

* Use rrdb to access localhost

* Fix builds without dbengine

* Add header to build system files

* Add rrdlabels.h to build systems

* Move func def from rrd.h to rrdhost.c

* Fix macos build

* Rm non-existing function

* Rebase master

* Define buffer length macro in ad_charts.

* Fix FreeBSD builds.

* Mark functions static

* Rm func decls without definitions

* Rebase master

* Rebase master

* Properly initialize value of storage tiers.

* Fix build after rebase.
2023-07-26 15:30:49 +03:00
Costa Tsaousis
c74bf56ee2
Code reorg and cleanup - enrichment of /api/v2 (#15294)
* claim script now accepts the same params as the kickstart

* rewrote buildinfo to unify all methods

* added cloud unavailable in cloud status

* added all exporters

* renamed httpd to h2o

* rename ENABLE_COMPRESSION to ENABLE_LZ4

* rename global variable

* rename ENABLE_HTTPS to ENABLE_OPENSSL

* fix coverity-scan for openssl

* add lz4 to coverity-scan

* added all plugins and most of the features

* added all plugins and most of the features

* generalize bitmap code so that we can have any size of bitmaps

* cleanup

* fix compilation without protobuf

* fix compilation with other allocators

* fix bitmap

* comprehensive bitmaps unit test

* bitmap as macros

* added developer mode

* added system info to build info

* cloud available/unavailable

* added /api/v2/info

* added units and ni to transitions

* when showing instances and transitions, show only the instances that have transitions

* cleanup

* add missing quotes

* add anchor to transitions

* added more to build info

* calculate retention per tier and expose it to /api/v2/info

* added currently collected metrics

* do not show space and retention when no numbers are available

* fix impossible overflow

* Add function for transitions and execute callback

* In case of error, reset and try next dictionary entry

* Fix error message

* simpler logic to maintain retention per tier

* /api/v2/alert_transitions

* Handle case of recipient null
Convert after and before to usec

* Add classification, type and component

* working /api/v2/alert_transitions

* Fix query to properly handle context and alert name

* cleanup

* Add search with transition

* accept transition in /api/v2/alert_transitions

* totally dynamic facets

* fixed debug info

* restructured facets

* cleanup; removal of options=transitions

* updated alert entries flags

* method to exec

* Return also exec run timestamp
Temp table cleanup only when we don't execute with a transition

* cleanup obsolete anchor parameter

* Add sql_get_alert_configuration function

* added options=config to alert_transitions

* added /api/v2/alert_config

* preliminary work for /api/v2/claim

* initialize variables; do not expose expected retention if no disk space info is available; do not report aclk as initializing when not claimed

* fix claim session key filename

* put a newline into the session key file

* more progress on claiming

* final /api/v2/claim endpoint

* after claiming, refresh our state at the output

* Fix query to fetch config

* Remove debug log

* add configuration objects

* add configuration objects - fixed

* respect the NETDATA_DISABLE_CLOUD env variable

* NETDATA_DISABLE_CLOUD env variable sets the default, but the config sets the final value

* use a new claimed_id on every claiming

* regenerate random key on claiming and wait for online status

* ignore write() return value when writing a newline

* don't show cloud status disabled when claimed_id is missing

* added ctx to alert instances

* cleanup config and transitions from /api/v2/alerts

* fix unused variable

* in /api/v2/alert_config show 1 config without an array

* show alert values conditionally, by appending options=values

* When storing host info if the key value is empty, store unknown

* added options=summary to control when the alerts summary is shown

* increased http_api_v2 to version 5

* claiming random key file is now not world readable

* added local-listeners binary that detects all the listening ports, their IPs and their command lines

---------

Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2023-07-06 01:49:32 +03:00
thiagoftsm
588096c6b6
Debugfs collector (#15017)
Co-authored-by: Fotis Voutsas <fotis@netdata.cloud>
Co-authored-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
Co-authored-by: ilyam8 <ilya@netdata.cloud>
2023-05-15 19:06:26 +03:00
Timotej S
4ff305a4ad
make zlib compulsory dep (#14928)
zlib compulsory
2023-05-10 17:30:58 +02:00
Timotej S
66c47355b4
initial minimal h2o webserver integration (#14585)
Introduces h2o based web server as an alternative
2023-05-10 13:37:44 +02:00
Timotej S
c6f563da74
minor - add trace alloc to buildinfo (#13817)
add trace alloc to buildinfo
2022-10-13 16:20:51 +02:00
Timotej S
971fe35547
Remove aclk_api.[ch] (#13540)
* get rid of aclk_starter middleman
* get rid of aclk_api.[ch]
2022-08-24 10:41:14 +02:00
Timotej S
2df6667153
Cleanup of APIs (#13539)
ACLK related API cleanup
2022-08-19 17:20:25 +02:00
Timotej S
cb13f0787d
Removes Legacy JSON Cloud Protocol Support In Agent (#13111)
* removes old protocol support (cloud removed support already)
2022-06-27 16:03:20 +02:00
Timotej S
4116afa075
minor - fix analytics_build_info (#12354) 2022-03-10 16:15:08 +01:00
Timotej S
f3e8c077cf
adds install method to /api/v1/info as label (#12040)
* install_type to rrdhost_system_info and as labels
2022-02-18 12:35:01 +01:00
Emmanuel Vasilakis
3bb51bc783
remove check for aclk_ng and prometheus in order to assume protobuf in buildinfo (#12168) 2022-02-17 18:45:36 +02:00
Austin S. Hemmelgarn
f70a97206e
Add install type info to -W buildinfo output. (#12010)
* Add install type info to `-W buildinfo` output.

By reading it from the `.install-type` file and presenting it properly.

* Move get_value_from_key from daemon/analytics to libnetdata.

It will be used also in the buildinfo code.

* Restructure install type handling for buildinfo.

* Restructure to make code more reusable.

Allowing for deduplication and also enabling other potential callers.

* Fix incorrect variable name in analytics changes.
2022-01-24 12:25:31 -05:00
avstrakhov
b003e5fd40
Add code for LZ4 streaming data compression (#11821)
* Add code for LZ4 streaming data compression

* Fix LGTM alert

* Add lz4 library for link when compression enabled

* Add LZ4_resetStream_fast presence detection

* Disable compression for older LZ4 libraries

* Correct LZ4 API check

* [Testing Stream Compression] Debug msgs and report.md

* Add LZ4 library version using LZ4_initStream

* Fixed bug in SSL mode

* [Testing compression] - Add compression info messages

* Set compression enabled by default, update doc

* Update streaming/README.md

Co-authored-by: DShreve2 <david@netdata.cloud>

* [Agent Negotiation] Compression as separate capability

* [Agent Negotiation] Compression as separate capability - default compression variable always active

* Add code to negotiate compression

* [Agent Negotiation] Based on stream version

* [Agent Negotiation] Version based - fix compilation error

* [Agent Negotiation] Fix glob var default_compression_enbaled=0 affects all the connections - Handle compression - stream version based

* [Agent Negotiation - Compression] - Add control flag in 1. sender/receiver state & 2. stream.conf per child

* [Agent Negotiation - Compression] Fix stream.conf key, mguid control

* [Agent Negotiate Compression] Fine control on stream.conf per key,mguid for each child

* [Agent Negotiation Compression] Stop destroying compressor for runtime configuration + Update Readme.md

* [Agent Negotiation Compression] Use stream_version 4 if compression is disabled

* Correct child's compression check

* [Agent Negotiation Compression] Create streaming compression section in docs.

* [Agent Negotiation Compression] Remove redundant debug msgs

* [Stream Compression] - integrate compression build info & config info in api/v1/info endpoint.

* [Agent Negotiation] Finalize README.md

* [Agent Stream Compression] Fix buildinfo json, Finalize readme.md

* [Agent Stream Compression] Negotiate compression based on stream version

* [Agent Stream Compression] Stream compression control per child in stream.conf |  per AP_KEY, MACHINE_GUID

* [Agent Stream Compression] Avoid destroying compressor enabling runtime configuration + Update Readme.md

* [Agent Stream Compression] - Provide stream compression build info & config info in api/v1/info endpoint + Update Readme.md

* [Agent Stream Compression] Fix rebase conflicts

* [Agent Stream Compression] Fix more rebase conflicts

* [Agent Stream Compression] 1. Stream version based negotiation 2. per child stream.conf control 3. finalize docs 4. stream compression build info in web api

* [Agent Stream Compression] 1. Stream version based negotiation 2. per child stream.conf control 3. finalize docs 4. stream compression build info in web api

* [Agent Stream Compression] Change unsuccessful buffer check to error

* [Agent Stream Compression] Readme.md proof-read corrections, downgrade to stream_version_clabels, add shields for supported versions, EOF lint

* [Agent Stream Compression] Fix missed lz4 library on Alpine Linux

* Phrasal review

Co-authored-by: odynik <odynik.ee@gmail.com>
Co-authored-by: DShreve2 <david@netdata.cloud>
Co-authored-by: Tina Lüdtke <tina@kickoke.com>
2022-01-19 17:57:49 +02:00
Timotej S
5736b4bcb1
Removes ACLK Legacy (#11841)
* remove legacy from makefiles
* remove ACLK Legacy from installer
* remove ACLK Legacy from configure.ac
* remove legacy from cmake
* aclk api cleanup
* remove legacy files from packaging
* changes for CI from Austin
2022-01-04 10:11:04 +01:00
Austin S. Hemmelgarn
a67c830243
Add protobuf to -W buildinfo output. (#11634)
* Re-sort buildinfo macros.

* Add protobuf to buildinfo output.

* Add info about whether protobuf is from the system or bundled.

* Reorganize checks to be slightly saner.

* Move declaration to the correct place.
2021-11-08 07:49:18 -05:00
vkalintiris
9ed4cea590
Anomaly Detection MVP (#11548)
* Add support for feature extraction and K-Means clustering.

This patch adds support for performing feature extraction and running the
K-Means clustering algorithm on the extracted features.

We use the open-source dlib library to compute the K-Means clustering
centers, which has been added as a new git submodule.

The build system has been updated to recognize two new options:

    1) --enable-ml: build an agent with ml functionality, and
    2) --enable-ml-tests: support running tests with the `-W mltest`
       option in netdata.

The second flag is meant only for internal use. To build tests successfully,
you need to install the GoogleTest framework on your machine.

* Boilerplate code to track hosts/dims and init ML config options.

A new opaque pointer field is added to the database's host and dimension
data structures. The fields point to C++ wrapper classes that will be used
to store ML-related information in follow-up patches.

The ML functionality needs to iterate all tracked dimensions twice per
second. To avoid locking the entire DB multiple times, we use a
separate dictionary to add/remove dimensions as they are created/deleted
by the database.

A global configuration object is initialized during the startup of the
agent. It will allow our users to specify ML-related configuration
options, eg. hosts/charts to skip from training, etc.

* Add support for training and prediction of dimensions.

Every new host spawns a training thread which is used to train the model
of each dimension.

Training of dimensions is done in a non-batching mode in order to avoid
impacting the generated ML model by the CPU, RAM and disk utilization of
the training code itself.

For performance reasons, prediction is done at the time a new value
is pushed in the database. The alternative option, ie. maintaining a
separate thread for prediction, would be ~3-4x slower and would
increase locking contention considerably.

For similar reasons, we use a custom function to unpack storage_numbers
into doubles, instead of long doubles.

* Add data structures required by the anomaly detector.

This patch adds two data structures that will be used by the anomaly
detector in follow-up patches.

The first data structure is a circular bit buffer which is being used to
count the number of set bits over time.

The second data structure represents an expandable, rolling window that
tracks set/unset bits. It is explicitly modeled as a finite-state
machine in order to make the anomaly detector's behaviour easier to test
and reason about.
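
As an illustration of the first data structure, here is a minimal sketch of a circular bit buffer that keeps a running count of set bits. It is not the code from this patch: the `bit_ring_*` names, the one-`bool`-per-sample storage, and the fixed capacity are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type and function names; not taken from the patch. */
typedef struct {
    bool  *bits;      /* ring storage, one entry per sample */
    size_t capacity;  /* window length, in samples */
    size_t size;      /* number of samples currently stored */
    size_t head;      /* index of the next slot to write */
    size_t set_bits;  /* running count of set bits in the window */
} bit_ring_t;

static bit_ring_t *bit_ring_create(size_t capacity) {
    assert(capacity > 0);
    bit_ring_t *r = calloc(1, sizeof(*r));
    if (!r) return NULL;
    r->bits = calloc(capacity, sizeof(bool));
    if (!r->bits) { free(r); return NULL; }
    r->capacity = capacity;
    return r;
}

static void bit_ring_destroy(bit_ring_t *r) {
    if (!r) return;
    free(r->bits);
    free(r);
}

/* Insert one bit, evicting the oldest bit once the window is full.
 * Returns the number of set bits in the window after the insert. */
static size_t bit_ring_insert(bit_ring_t *r, bool bit) {
    if (r->size == r->capacity) {
        /* When full, the slot at head holds the oldest bit. */
        if (r->bits[r->head]) r->set_bits--;
    } else {
        r->size++;
    }
    r->bits[r->head] = bit;
    if (bit) r->set_bits++;
    r->head = (r->head + 1) % r->capacity;
    return r->set_bits;
}
```

Keeping the count incrementally makes each insert O(1), so the number of set bits can be read every second without rescanning the window.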

* Add anomaly detection thread.

This patch creates a new anomaly detection thread per host. Each thread
maintains a BitRateWindow which is updated every second based on the
anomaly status of the correspondent host.

Based on the updated status of the anomaly window, we can identify the
existence/absence of an anomaly event, its start/end time and the
dimensions that participate in it.

* Create/insert/query anomaly events from Sqlite DB.

* Create anomaly event endpoints.

This patch adds two endpoints to expose information about anomaly
events. The first endpoint returns the list of anomalous events within a
specified time range. The second endpoint provides detailed information
about a single anomaly event, ie. the list of anomalous dimensions in
that event along with their anomaly rate.

The `anomaly-bit` option has been added to the `/data` endpoint in order
to allow users to get the anomaly status of individual dimensions per
second.

* Fix build failures on Ubuntu 16.04 & CentOS 7.

These distros do not have toolchains with C++11 enabled by default.
Replacing nullptr with NULL should fix the build problems on these
platforms when the ML feature is not enabled.

* Fix `make dist` to include ML makefiles and dlib sources.

Currently, we add ml/kmeans/dlib to EXTRA_DIST. We might want to
generate an explicit list of source files in the future, in order to
bring down the generated archive's file size.

* Small changes to make the LGTM & Codacy bots happy.

- Cast unused result of function calls to void.
- Pass a const-ref string to Database's constructor.
- Reduce the scope of a local variable in the anomaly detector.

* Add user configuration option to enable/disable anomaly detection.

* Do not log dimension-specific operations.

Training and prediction operations happen every second for each
dimension. In prep for making this PR easier to run anomaly detection
for many charts & dimensions, I've removed logs that would cause log
flooding.

* Reset dimensions' bit counter when not above anomaly rate threshold.

* Update the default config options with real values.

With this patch the default configuration options will match the ones
we want our users to use by default.

* Update conditions for creating new ML dimensions.

1. Skip dimensions with update_every != 1,
2. Skip dimensions that come from the ML charts.

With this filtering in place, any configuration value for the
relevant simple_pattern expressions will work correctly.

* Teach buildinfo{,json} about the ML feature.

* Set --enable-ml by default in the configuration options.

This patch is only meant for testing the building of the ML functionality
on Github. It will be reverted once tests pass successfully.

* Minor build system fixes.

- Add path to json header
- Enable C++ linker when ML functionality is enabled
- Rename ml/ml-dummy.cc to ml/ml-dummy.c

* Revert "Set --enable-ml by default in the configuration options."

This reverts commit 28206952a59a577675c86194f2590ec63b60506c.

We pass all Github checks when building the ML functionality, except for
those that run on CentOS 7 due to not having a C++11 toolchain.

* Check for missing dlib and nlohmann files.

We simply check the single-source files upon which our build system
depends. If they are missing, an error message notifies the user
about missing git submodules which are required for the ML
functionality.

* Allow users to specify the maximum number of KMeans iterations.

* Use dlib v19.10

v19.22 broke compatibility with CentOS 7's g++. Development of the
anomaly detection used v19.10, which is the version used by most Debian and
Ubuntu distribution versions that are not past EOL.

No observable performance improvements/regressions specific to the K-Means
algorithm occur between the two versions.

* Detect and use the -std=c++11 flag when building anomaly detection.

This patch automatically adds the -std=c++11 flag when building netdata
with the ML functionality, if it's supported by the user's toolchain.

With this change we are able to build the agent correctly on CentOS 7.

* Restructure configuration options.

- update default values,
- clamp values to min/max defaults,
- validate and identify conflicting values.

* Add update_every configuration option.

Considering that the MVP does not support per host configuration
options, the update_every option will be used to filter hosts to train.

With this change anomaly detection will be supported on:

    - Single nodes with update_every != 1, and
    - Children nodes with a common update_every value that might differ from
      the value of the parent node.

* Reorganize anomaly detection charts.

This follows Andrew's suggestion to have four charts to show the number
of anomalous/normal dimensions, the anomaly rate, the detector's window
length, and the events that occur in the prediction step.

Context and family values, along with the necessary information in the
dashboard_info.js file, will be updated in a follow-up commit.

* Do not dump anomaly event info in logs.

* Automatically handle low "train every secs" configuration values.

If a user specifies a very low value for the "train every secs", then
it is possible that the time it takes to train a dimension is higher
than its allotted time.

In that case, we want the training thread to:

    - Reduce its CPU usage per second, and
    - Allow the prediction thread to proceed.

We achieve this by limiting the training time of a single dimension to
be equal to half the time allotted to it. This means that the training
thread will never consume more than 50% of a single core.
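
A minimal sketch of one way to arrive at such a 50% bound, under two assumptions that this message does not confirm: the allotted time is an even split of the training period across dimensions, and the thread pauses after each dimension for at least as long as the training took. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each dimension gets an equal share of the training period. */
static uint64_t allotted_usec(uint64_t train_every_usec, uint64_t dimensions) {
    assert(dimensions > 0);
    return train_every_usec / dimensions;
}

/* Pause length after training one dimension.
 * Sleeping for max(remaining slice, elapsed) keeps the thread busy for
 * at most half of any work+pause cycle, i.e. at most 50% of one core. */
static uint64_t pause_after_training_usec(uint64_t allotted, uint64_t elapsed) {
    uint64_t remaining = (elapsed < allotted) ? allotted - elapsed : 0;
    return (remaining > elapsed) ? remaining : elapsed;
}
```

When training is fast the pause just fills the rest of the slice; when it is slow the pause grows with it, which is what lets the prediction thread proceed.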

* Automatically detect if ML functionality should be enabled.

With these changes, we enable ML if:

    - The user has not explicitly specified --disable-ml, and
    - Git submodules have been checked out properly, and
    - The toolchain supports C++11.

If the user has explicitly specified --enable-ml, the build fails if
git submodules are missing, or the toolchain does not support C++11.

* Disable anomaly detection by default.

* Do not update charts in locked region.

* Cleanup code reading configuration options.

* Enable C++ linker when building ML.

* Disable ML functionality for CMake builds.

* Skip LGTM for dlib and nlohmann libraries.

* Do not build ML if libuuid is missing.

* Fix dlib path in LGTM's yaml config file.

* Add chart to track duration of prediction step.

* Add chart to track duration of training step.

* Limit the number of dimensions in an anomaly event.

This will ensure our JSON results won't grow without any limit. The
default ML configuration options train approximately 1700 dimensions
in a newly-installed Netdata agent. The hard-limit is set to 2000
dimensions which:

    - Is well above the default number of dimensions we train,
    - If it is ever reached it means that the user had accidentally set a
      very low anomaly rate threshold, and
    - Considering that we sort the result by anomaly score, the cutoff
      dimensions will be the less anomalous, ie. the least important to
      investigate.
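
The sort-then-cut behaviour described above can be sketched as follows. This is an illustration only: `dim_score_t`, `top_anomalous()`, and the use of `qsort()` are assumptions for the example, not the implementation in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: one dimension and its anomaly rate in the event. */
typedef struct {
    const char *name;
    double      anomaly_rate;
} dim_score_t;

/* qsort comparator: highest anomaly rate first. */
static int by_rate_desc(const void *a, const void *b) {
    double x = ((const dim_score_t *)a)->anomaly_rate;
    double y = ((const dim_score_t *)b)->anomaly_rate;
    return (x < y) - (x > y);
}

/* Sorts dims in place and returns how many entries to report.
 * Entries past the returned count are the least anomalous ones. */
static size_t top_anomalous(dim_score_t *dims, size_t n, size_t limit) {
    assert(dims != NULL || n == 0);
    if (n > 0)
        qsort(dims, n, sizeof(*dims), by_rate_desc);
    return (n < limit) ? n : limit;
}
```

With a limit of 2000, a caller would emit only the first `top_anomalous(dims, n, 2000)` entries into the JSON result.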

* Add information about the ML charts.

* Update family value in ML charts.

This fix will allow us to show the individual charts in the RHS Anomaly
Detection submenu.

* Rename chart type

s/anomalydetection/anomaly_detection/g

* Expose ML feat in /info endpoint.

* Export ML config through /info endpoint.

* Fix CentOS 7 build.

* Reduce the critical region of a host's lock.

Before this change, each host had a single, dedicated lock to protect
its map of dimensions from adding/deleting new dimensions while training
and detecting anomalies. This was problematic because training of a
single dimension can take several seconds in nodes that are under heavy
load.

After this change, the host's lock protects only the insertion/deletion
of new dimensions, and the prediction step. For the training of dimensions
we use a dedicated lock per dimension, which is responsible for protecting
the dimension from deletion while training.

Prediction is fast enough, even on slow machines or under heavy load,
which allows us to use the host's main lock and avoid increasing the
complexity of our implementation in the anomaly detector.

* Improve the way we are tracking anomaly detector's performance.

This change allows us to:

    - track the total training time per update_every period,
    - track the maximum training time of a single dimension per
      update_every period, and
    - export the current number of total, anomalous, normal dimensions
      to the /info endpoint.

Also, now that we use dedicated locks per dimensions, we can train under
heavy load continuously without having to sleep in order to yield the
training thread and allow the prediction thread to progress.

* Use samples instead of seconds in ML configuration.

This commit changes the way we are handling input ML configuration
options from the user. Instead of treating values as seconds, we
interpret all inputs as number of update_every periods. This allows
us to enable anomaly detection on hosts that have update_every != 1
second, and still produce a model for training/prediction & detection
that behaves in an expected way.

Tested by running anomaly detection on an agent with update_every = [1,
2, 4] seconds.

* Remove unnecessary log message in detection thread

* Move ML configuration to global section.

* Update web/gui/dashboard_info.js

Co-authored-by: Andrew Maguire <andrewm4894@gmail.com>

* Fix typo

Co-authored-by: Andrew Maguire <andrewm4894@gmail.com>

* Rebase.

* Use negative logic for anomaly bit.

* Add info for prediction_stats and training_stats charts.

* Disable ML on PPC64EL.

The CI test fails with -std=c++11 and requires -std=gnu++11 instead.
However, it's not easy to quickly append the required flag to CXXFLAGS.
For the time being, simply disable ML on PPC64EL and if any users
require this functionality we can fix it in the future.

* Add comment on why we disable ML on PPC64EL.

Co-authored-by: Andrew Maguire <andrewm4894@gmail.com>
2021-10-27 09:26:21 +03:00
Timotej S
dad48421a6
Makes New Cloud architecture optional for ACLK-NG (#11587)
ACLK-NG supports both new and old cloud protocol. Protobuf and C++ compiler are required only for new cloud protocol.
There is no reason to skip building whole ACLK-NG when protobuf is missing.
2021-09-29 17:53:53 +02:00
Emmanuel Vasilakis
3ef5ba029c
Send correct aclk implementation used by agent to posthog. (#11247) 2021-06-15 17:09:58 +03:00
Timotej S
59af90b08c
Allows ACLK NG and Legacy to coexist (#11225) 2021-06-14 10:38:58 +02:00
Emmanuel Vasilakis
e9ccc75a45
Provide more agent analytics to posthog (#11020)
* Move statistics related functions to analytics.c

* error message change, space added after if

* start an analytics thread

* use heartbeat instead of sleep

* add late environment (after rrdinit) pick of some attributes

* change loop

* re-enable info messages

* remove possible new line

* log and report hits on allmetrics pages. detect if exporting engines are enabled/in use, and report them

* use lowercase for analytics variables

* add collectors

* add buildinfo

* more attributes from late environment

* add new attributes to v1/info

* re-gather meta data before exit. update allmetrics counters to be available in v1/info

* log hits to dashboard

* add mirrored hosts

* added notification methods

* fix spaces, proper JSON naming

* add alerts, charts and metrics count

* more attributes

* keep the thread up, and report a meta event every 2 hours

* small formatting changes. Disable analytics_log_prometheus when unit testing. Add the new attributes to the anonymous-statistics.sh.in script

* applied clang-format

* don't gather data again on exit

* safe buffer length in snprintfz

* add rrdset lock

* remove show_archived

* remove setenv

* calculate lengths during sets
2021-04-27 10:11:20 +03:00
Emmanuel Vasilakis
3d571ebb44
Revert "Provide more agent analytics to posthog (#10887)" (#11011)
This reverts commit a1ce482f3e.
2021-04-21 20:53:12 +03:00
Emmanuel Vasilakis
a1ce482f3e
Provide more agent analytics to posthog (#10887)
* Move statistics related functions to analytics.c

* error message change, space added after if

* start an analytics thread

* use heartbeat instead of sleep

* add late environment (after rrdinit) pick of some attributes

* change loop

* re-enable info messages

* remove possible new line

* log and report hits on allmetrics pages. detect if exporting engines are enabled/in use, and report them

* use lowercase for analytics variables

* add collectors

* add buildinfo

* more attributes from late environment

* add new attributes to v1/info

* re-gather meta data before exit. update allmetrics counters to be available in v1/info

* log hits to dashboard

* add mirrored hosts

* added notification methods

* fix spaces, proper JSON naming

* add alerts, charts and metrics count

* more attributes

* keep the thread up, and report a meta event every 2 hours

* small formatting changes. Disable analytics_log_prometheus when unit testing. Add the new attributes to the anonymous-statistics.sh.in script

* applied clang-format

* don't gather data again on exit

* safe buffer length in snprintfz

* add rrdset lock

* remove show_archived
2021-04-21 18:24:51 +03:00
Timotej S
e7e5d0c372
Adds ACLK-NG as fallback (#10315)
* adds a new implementation of ACLK written almost from scratch
* external dependencies only OpenSSL and JSON-C
* fallback for systems where ACLK Legacy can't build (for technical or philosophical reasons)
* can be forced to build by giving "--aclk-ng" to the installer
2021-03-16 12:38:16 +01:00
Austin S. Hemmelgarn
9014acf01f
Add JSON output option for buildinfo. (#10706)
* Add JSON output option for buildinfo.

* use JSON bools

* fix wrong quoting

Co-authored-by: Timotej Šiškovič <timotej@netdata.cloud>
2021-03-04 13:01:07 -05:00
vkalintiris
ad163251fc
Remove unreachable #else directives in plugins. (#10523)
They are unreachable because Makefile.am will conditionally include the
relevant source files iff the #ifdef's argument is defined in
configure.ac.
2021-02-25 14:29:41 +02:00
Timotej S
593e1b6dbc
allows use of system libwebsockets instead of bundled one (#9984)
* allows usage of system libwebsockets
* fixes problems that were preventing ACLK to work with LWS `4.1.`
* add LWS info to buildinfo

Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
2020-10-30 10:28:28 +01:00
Timotej S
512416f559
adds ACLK DISABLE_CLOUD to -W buildinfo (#9936) 2020-09-16 15:03:18 +02:00
Austin S. Hemmelgarn
5f3a7224ab
Added a way to get build configuration info from the agent. (#9913)
* Add a way to get build configuration info from the agent.

This adds a new option to the `-W` switch called 'buildinfo'. When
invoked with this argument, Netdata will print its version, the
configure options, and a list of optional features and whether they are
enabled or not.

This is intended to serve three purposes:

* It allows developers to more quickly get an idea of how Netdata was
  built when triaging bug reports.
* It provides an easier way to validate changes to the build system that
  affect optional features during the development cycle.
* It provides an easier way to build CI workflows that validate that
  building under a given set of constraints results in a feature being
  enabled or not.

The actual implementation is a bit large but overall exceedingly simple,
consisting of a set of preprocessor directives to extract optional
feature state information from config.h and then a series of printf()
calls to actually report this info (which should end up optimized by
smart compilers due to all the arguments being compile-time constants).
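
The pattern described above can be sketched roughly as below. The macro `HAVE_EXAMPLE_ZLIB` is a hypothetical stand-in for an option from config.h, and `FEAT_ZLIB`, `FEAT_YES_NO`, and `print_build_info()` are illustrative names, not necessarily those in daemon/buildinfo.c.

```c
#include <assert.h>
#include <stdio.h>

/* Map an optional config.h define (hypothetical name) to a constant
 * that is always defined, so the reporting code needs no #ifdef. */
#ifdef HAVE_EXAMPLE_ZLIB
#define FEAT_ZLIB 1
#else
#define FEAT_ZLIB 0
#endif

#define FEAT_YES_NO(x) ((x) ? "YES" : "NO")

/* Print the feature report; returns the number of characters written,
 * or a negative value if writing failed. */
static int print_build_info(FILE *out) {
    assert(out != NULL);
    int total = 0, n;

    n = fprintf(out, "Libraries:\n");
    if (n < 0) return n;
    total += n;

    /* The argument is a compile-time constant, so the compiler can fold it. */
    n = fprintf(out, "    zlib: %s\n", FEAT_YES_NO(FEAT_ZLIB));
    if (n < 0) return n;
    total += n;

    return total;
}
```

Adding a feature to the report then takes one `#ifdef` block and one `fprintf()` line, which matches the "large but simple" shape the message describes.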

* Added zlib to optional libraries.

* Added remaining optional plugins to buildinfo output.

* Changed formatting to be more human friendly.

* Add remaining optional libraries.

* Fix up formatting to be even more human friendly.

* Fix typo in buildinfo output.

* Remove unused variable.

* Fixed spelling of config.h option name.

* Update daemon/buildinfo.c

Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>

* Fix option name mismatch for libcrypto.

* Update daemon/buildinfo.c

Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>

Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>
2020-09-16 07:03:27 -04:00