Metrics
Metrics are essential for understanding how any large system is behaving and performing. Concourse can emit metrics about both the health of the system itself and the builds that it is running. Operators can tap into these metrics to observe the health of the system.
Configuring Metrics
The web node can be configured to emit metrics on start.
Currently supported metrics emitters are InfluxDB, NewRelic, Prometheus, and Datadog. There is also a dummy emitter that
writes the metrics to the logs at DEBUG level, which can be enabled with the --emit-to-logs flag.
Regardless of your metrics emitter, you can set CONCOURSE_METRICS_BUFFER_SIZE to determine how many metrics emissions
are sent at a time. Increasing this number can be helpful if sending metrics is regularly failing (due to rate limiting
or network failures) or if latency is particularly high.
There are various flags for different emitters; run concourse web --help and look for "Metric Emitter" to see what's
available.
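As a rough illustration of that lookup, the sketch below filters help output down to the sections whose heading mentions "Metric Emitter". It assumes that the help text groups flags under unindented headings with the flags indented beneath them; that layout is an assumption and may differ between Concourse versions. The function names are hypothetical, and the `concourse` binary is assumed to be on your PATH.

```python
import subprocess


def metric_emitter_sections(help_text: str) -> list[str]:
    """Return the lines of every help section whose heading mentions "Metric Emitter".

    Assumption: headings are unindented lines and the flags that belong to
    them are indented on the lines that follow.
    """
    selected = []
    in_section = False
    for line in help_text.splitlines():
        # A heading is any non-blank line that starts in the first column.
        is_heading = bool(line.strip()) and not line[0].isspace()
        if is_heading:
            in_section = "Metric Emitter" in line
        if in_section and line.strip():
            selected.append(line)
    return selected


def concourse_web_help() -> str:
    """Run `concourse web --help` and return everything it printed."""
    result = subprocess.run(
        ["concourse", "web", "--help"],
        capture_output=True,
        text=True,
        check=False,  # some CLIs exit non-zero after printing help
    )
    return result.stdout + result.stderr
```

Calling `print("\n".join(metric_emitter_sections(concourse_web_help())))` on a machine with Concourse installed should then print only the emitter-related flags, provided the layout assumption holds.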
What's emitted?
This reference section lists all the metrics that Concourse emits via the Prometheus emitter.
To make this document easy to maintain, Prometheus is used as the "source of truth", primarily because it has help text built in, making this list easy to generate. Treat this list as a reference when looking for the equivalent metric names for your emitter of choice.
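To show why the built-in help text makes such a list easy to generate, here is a minimal sketch that turns a Prometheus text-format scrape into "name: type" entries followed by their help text. It relies only on the `# HELP` and `# TYPE` comment lines of the Prometheus text exposition format. The function names are hypothetical, this is not necessarily how the list in this document was produced, and escape sequences inside help text are left as-is.

```python
def list_metrics(exposition_text: str) -> list[tuple[str, str, str]]:
    """Return sorted (name, type, help) tuples from Prometheus text-format output."""
    help_by_name: dict[str, str] = {}
    type_by_name: dict[str, str] = {}
    for line in exposition_text.splitlines():
        if line.startswith("# HELP "):
            # "# HELP <metric name> <free-form help text>"
            name, _, text = line[len("# HELP "):].partition(" ")
            help_by_name[name] = text
        elif line.startswith("# TYPE "):
            # "# TYPE <metric name> <counter|gauge|histogram|summary|untyped>"
            name, _, kind = line[len("# TYPE "):].partition(" ")
            type_by_name[name] = kind.strip()
    names = sorted(set(help_by_name) | set(type_by_name))
    return [
        (name, type_by_name.get(name, "untyped"), help_by_name.get(name, ""))
        for name in names
    ]


def format_reference(metrics: list[tuple[str, str, str]]) -> str:
    """Render the tuples as alternating "name: type" and help-text lines."""
    lines = []
    for name, kind, text in metrics:
        lines.append(f"{name}: {kind}")
        lines.append(text)
    return "\n".join(lines)
```

Feeding the body of a scrape from the Prometheus emitter's endpoint into `format_reference(list_metrics(body))` yields entries in the same shape as the list that follows.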
concourse_builds_aborted_total: counter
Total number of Concourse builds aborted.
concourse_builds_check_aborted_total: counter
Total number of Concourse check builds aborted.
concourse_builds_check_errored_total: counter
Total number of Concourse check builds errored.
concourse_builds_check_failed_total: counter
Total number of Concourse check builds failed.
concourse_builds_check_finished_total: counter
Total number of Concourse check builds finished.
concourse_builds_check_running: gauge
Number of Concourse check builds currently running.
concourse_builds_check_started_total: counter
Total number of Concourse check builds started.
concourse_builds_check_succeeded_total: counter
Total number of Concourse check builds succeeded.
concourse_builds_errored_total: counter
Total number of Concourse builds errored.
concourse_builds_failed_total: counter
Total number of Concourse builds failed.
concourse_builds_finished_total: counter
Total number of Concourse builds finished.
concourse_builds_running: gauge
Number of Concourse builds currently running.
concourse_builds_started_total: counter
Total number of Concourse builds started.
concourse_builds_succeeded_total: counter
Total number of Concourse builds succeeded.
concourse_caches_get_step_cache_hits: counter
Total number of get steps that hit caches
concourse_caches_streamed_resource_caches: counter
Total number of streamed resource caches
concourse_db_connections: gauge
Current number of Concourse database connections
concourse_db_queries_total: counter
Total number of Concourse database queries
concourse_gc_created_containers_to_be_garbage_collected: counter
Created Containers being garbage collected
concourse_gc_created_volumes_to_be_garbage_collected: counter
Created Volumes being garbage collected
concourse_gc_creating_containers_to_be_garbage_collected: counter
Creating Containers being garbage collected
concourse_gc_destroying_containers_to_be_garbage_collected: counter
Destroying Containers being garbage collected
concourse_gc_destroying_volumes_to_be_garbage_collected: counter
Destroying Volumes being garbage collected
concourse_gc_failed_containers_to_be_garbage_collected: counter
Failed Containers being garbage collected
concourse_gc_failed_volumes_to_be_garbage_collected: counter
Failed Volumes being garbage collected
concourse_gc_gc_artifact_collector_duration: histogram
Duration of gc artifact collector (ms)
concourse_gc_gc_build_collector_duration: histogram
Duration of gc build collector (ms)
concourse_gc_gc_container_collector_duration: histogram
Duration of gc container collector (ms)
concourse_gc_gc_resource_cache_collector_duration: histogram
Duration of gc resource cache collector (ms)
concourse_gc_gc_resource_cache_use_collector_duration: histogram
Duration of gc resource cache use collector (ms)
concourse_gc_gc_resource_config_check_session_collector_duration: histogram
Duration of gc resource config check session collector (ms)
concourse_gc_gc_resource_config_collector_duration: histogram
Duration of gc resource config collector (ms)
concourse_gc_gc_task_cache_collector_duration: histogram
Duration of gc task cache collector (ms)
concourse_gc_gc_volume_collector_duration: histogram
Duration of gc volume collector (ms)
concourse_gc_gc_worker_collector_duration: histogram
Duration of gc worker collector (ms)
concourse_http_responses_duration_seconds: histogram
Response time in seconds
concourse_jobs_scheduled_total: counter
Total number of Concourse jobs scheduled.
concourse_jobs_scheduling: gauge
Number of Concourse jobs currently being scheduled.
concourse_lidar_checks_enqueued_total: counter
Total number of checks enqueued
concourse_lidar_checks_finished_total: counter
Total number of checks finished.
concourse_lidar_checks_started_total: counter
Total number of checks started. With global resources enabled, a check build may not actually run a check, so total checks started should be less than total check builds started.
concourse_locks_held: gauge
Database locks held
concourse_volumes_orphaned_volumes_to_be_deleted: counter
Number of orphaned volumes to be garbage collected.
concourse_volumes_volumes_streamed: counter
Total number of volumes streamed from one worker to the other
concourse_workers_containers: gauge
Number of containers per worker
concourse_workers_registered: gauge
Number of workers per state as seen by the database
concourse_workers_tasks: gauge
Number of active tasks per worker
concourse_workers_unknown_containers: gauge
Number of unknown containers found on worker
concourse_workers_unknown_volumes: gauge
Number of unknown volumes found on worker
concourse_workers_volumes: gauge
Number of volumes per worker
go_gc_duration_seconds: summary
A summary of the wall-time pause (stop-the-world) duration in garbage collection cycles.
go_gc_gogc_percent: gauge
Heap size target percentage configured by the user, otherwise 100. This value is set by the GOGC environment variable, and the runtime/debug.SetGCPercent function. Sourced from /gc/gogc:percent.
go_gc_gomemlimit_bytes: gauge
Go runtime memory limit configured by the user, otherwise math.MaxInt64. This value is set by the GOMEMLIMIT environment variable, and the runtime/debug.SetMemoryLimit function. Sourced from /gc/gomemlimit:bytes.
go_goroutines: gauge
Number of goroutines that currently exist.
go_info: gauge
Information about the Go environment.
go_memstats_alloc_bytes: gauge
Number of bytes allocated in heap and currently in use. Equals to /memory/classes/heap/objects:bytes.
go_memstats_alloc_bytes_total: counter
Total number of bytes allocated in heap until now, even if released already. Equals to /gc/heap/allocs:bytes.
go_memstats_buck_hash_sys_bytes: gauge
Number of bytes used by the profiling bucket hash table. Equals to /memory/classes/profiling/buckets:bytes.
go_memstats_frees_total: counter
Total number of heap objects freed. Equals to /gc/heap/frees:objects + /gc/heap/tiny/allocs:objects.
go_memstats_gc_sys_bytes: gauge
Number of bytes used for garbage collection system metadata. Equals to /memory/classes/metadata/other:bytes.
go_memstats_heap_alloc_bytes: gauge
Number of heap bytes allocated and currently in use, same as go_memstats_alloc_bytes. Equals to /memory/classes/heap/objects:bytes.
go_memstats_heap_idle_bytes: gauge
Number of heap bytes waiting to be used. Equals to /memory/classes/heap/released:bytes + /memory/classes/heap/free:bytes.
go_memstats_heap_inuse_bytes: gauge
Number of heap bytes that are in use. Equals to /memory/classes/heap/objects:bytes + /memory/classes/heap/unused:bytes.
go_memstats_heap_objects: gauge
Number of currently allocated objects. Equals to /gc/heap/objects:objects.
go_memstats_heap_released_bytes: gauge
Number of heap bytes released to OS. Equals to /memory/classes/heap/released:bytes.
go_memstats_heap_sys_bytes: gauge
Number of heap bytes obtained from system. Equals to /memory/classes/heap/objects:bytes + /memory/classes/heap/unused:bytes + /memory/classes/heap/released:bytes + /memory/classes/heap/free:bytes.
go_memstats_last_gc_time_seconds: gauge
Number of seconds since 1970 of last garbage collection.
go_memstats_mallocs_total: counter
Total number of heap objects allocated, both live and gc-ed. Semantically a counter version for go_memstats_heap_objects gauge. Equals to /gc/heap/allocs:objects + /gc/heap/tiny/allocs:objects.
go_memstats_mcache_inuse_bytes: gauge
Number of bytes in use by mcache structures. Equals to /memory/classes/metadata/mcache/inuse:bytes.
go_memstats_mcache_sys_bytes: gauge
Number of bytes used for mcache structures obtained from system. Equals to /memory/classes/metadata/mcache/inuse:bytes + /memory/classes/metadata/mcache/free:bytes.
go_memstats_mspan_inuse_bytes: gauge
Number of bytes in use by mspan structures. Equals to /memory/classes/metadata/mspan/inuse:bytes.
go_memstats_mspan_sys_bytes: gauge
Number of bytes used for mspan structures obtained from system. Equals to /memory/classes/metadata/mspan/inuse:bytes + /memory/classes/metadata/mspan/free:bytes.
go_memstats_next_gc_bytes: gauge
Number of heap bytes when next garbage collection will take place. Equals to /gc/heap/goal:bytes.
go_memstats_other_sys_bytes: gauge
Number of bytes used for other system allocations. Equals to /memory/classes/other:bytes.
go_memstats_stack_inuse_bytes: gauge
Number of bytes obtained from system for stack allocator in non-CGO environments. Equals to /memory/classes/heap/stacks:bytes.
go_memstats_stack_sys_bytes: gauge
Number of bytes obtained from system for stack allocator. Equals to /memory/classes/heap/stacks:bytes + /memory/classes/os-stacks:bytes.
go_memstats_sys_bytes: gauge
Number of bytes obtained from system. Equals to /memory/classes/total:bytes.
go_sched_gomaxprocs_threads: gauge
The current runtime.GOMAXPROCS setting, or the number of operating system threads that can execute user-level Go code simultaneously. Sourced from /sched/gomaxprocs:threads.
go_threads: gauge
Number of OS threads created.
process_cpu_seconds_total: counter
Total user and system CPU time spent in seconds.
process_max_fds: gauge
Maximum number of open file descriptors.
process_network_receive_bytes_total: counter
Number of bytes received by the process over the network.
process_network_transmit_bytes_total: counter
Number of bytes sent by the process over the network.
process_open_fds: gauge
Number of open file descriptors.
process_resident_memory_bytes: gauge
Resident memory size in bytes.
process_start_time_seconds: gauge
Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes: gauge
Virtual memory size in bytes.
process_virtual_memory_max_bytes: gauge
Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight: gauge
Current number of scrapes being served.
promhttp_metric_handler_requests_total: counter
Total number of scrapes by HTTP status code.
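When reading the list above, keep in mind that a counter only ever increases (until the process restarts), while a gauge reports a current value. Counters are therefore usually compared across two scrapes rather than read directly. The sketch below computes a build failure ratio from two such samples. The helper names are hypothetical, each sample is assumed to hold one value per metric already summed across any labels, and failed builds are assumed to be counted among finished builds.

```python
from typing import Optional


def counter_increase(earlier: float, later: float) -> float:
    """Increase of a counter between two scrapes, tolerating a reset."""
    # A counter never decreases on its own, so a smaller later value is
    # treated as a restart after which the counter began again from zero.
    return later if later < earlier else later - earlier


def build_failure_ratio(
    earlier: dict[str, float], later: dict[str, float]
) -> Optional[float]:
    """Fraction of builds finished between two scrapes that failed.

    Returns None when no build finished in the interval.
    """
    finished = counter_increase(
        earlier["concourse_builds_finished_total"],
        later["concourse_builds_finished_total"],
    )
    failed = counter_increase(
        earlier["concourse_builds_failed_total"],
        later["concourse_builds_failed_total"],
    )
    if finished == 0:
        return None
    return failed / finished
```

A gauge such as concourse_builds_running needs no such treatment: its latest sample already is the value of interest.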