## 1.grafana
Grafana is an open-source metrics analysis platform with rich dashboard and panel editing. Configuration: an nginx reverse proxy is used so the Grafana platform can be reached through a domain name (a proxy sketch follows the commands below).
Related commands:
- Start: `service grafana-server start`
- Stop: `service grafana-server stop`
- Restart: `service grafana-server restart`
- Enable start on boot: `chkconfig --add grafana-server`
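A minimal nginx reverse-proxy sketch for exposing Grafana behind a domain. The `server_name`, the plain-HTTP listener, and the upstream address are assumptions (3000 is Grafana's default port), not the exact production config:

```nginx
# Hypothetical reverse proxy for Grafana; adjust server_name and upstream to your environment.
server {
    listen 80;
    server_name grafana.example.com;   # assumed domain

    location / {
        proxy_pass http://127.0.0.1:3000;   # Grafana's default listen port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```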
## 2.prometheus
Prometheus Server scrapes metrics from its targets on a fixed interval; every scrape target must expose an HTTP endpoint that Prometheus pulls from.
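For reference, such an endpoint serves the plain-text Prometheus exposition format; an illustrative node_exporter sample (the value is made up):

```
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
```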
Configuration file: prometheus.yml registers the endpoints of the various exporters so Prometheus can scrape them (additional files can also be created and referenced from prometheus.yml, as the file_sd_configs job below does):
```yaml
# my global config
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["ip:port"]

  - job_name: 'neusoft'
    file_sd_configs:
      - files:
          - /home/neusoft/sdkjg/installfiles/conf/neusoft.json
```
neusoft.json (the target list loaded through file_sd_configs):
```json
[
  {
    "targets": [
      "ip:port"
    ],
    "labels": {
      "group": "neusoft",
      "app": "",
      "hostname": "nacos"
    }
  },
  {
    "targets": [
      "ip:port"
    ],
    "labels": {
      "group": "neusoft",
      "app": "",
      "hostname": "微服务"
    }
  }
]
```
Start command (`--web.listen-address` sets the port, `--storage.tsdb.path` sets the data directory):

```bash
nohup /home/neusoft/sdkjg/installfiles/prometheus-2.42.0.linux-amd64/prometheus --config.file=/home/neusoft/sdkjg/installfiles/prometheus-2.42.0.linux-amd64/prometheus.yml --web.listen-address=':9190' --storage.tsdb.path="/mnt/data/neusoft/sdkjg/app/data" > /var/log/prometheus.log 2>&1 &
```
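A quick check that the server is up and scraping. The endpoints below are standard Prometheus HTTP APIs; the `localhost` host is an assumption, the 9190 port comes from the start command above:

```bash
# Liveness of the Prometheus server itself
curl -s http://localhost:9190/-/healthy

# List discovered scrape targets and their health (up/down)
curl -s http://localhost:9190/api/v1/targets | head
```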
## 3.node_exporter
node_exporter collects host-level metrics such as loadavg, filesystem, and meminfo.
Start command (`--web.listen-address` sets the port):

```bash
nohup /home/neusoft/sdkjg/installfiles/node_exporter --web.listen-address=':9100' > /var/log/node_exporter.log 2>&1 &
```
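A sanity check, assuming the exporter runs on the local host with the default /metrics path; the PromQL expression in the comment is a common way to derive CPU usage from `node_cpu_seconds_total` and is illustrative, not taken from the original setup:

```bash
# Confirm the exporter is serving metrics
curl -s http://localhost:9100/metrics | grep '^node_load1'

# Example PromQL (run in Prometheus/Grafana): CPU usage per instance over 5 minutes
# 100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100
```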
## 4.promtail
Promtail ships the contents of local log files to a Loki instance.
Configuration file (skeleton; a fuller sketch with assumed values follows):
```yaml
positions:
  filename: /tmp/positions.yaml   # where Promtail records read offsets

clients:
  # Loki push endpoint(s)

scrape_configs:
  # log scrape jobs
```
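A fuller sketch of a working Promtail config under assumed values: Loki listening on localhost:3100, Promtail on port 9080 (matching the diagram at the end), and the nginx JSON access log produced by the log_format shown below. The job name, labels, and log path are illustrative, not the original file:

```yaml
server:
  http_listen_port: 9080   # Promtail's own HTTP port (assumed, as in the diagram)
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push   # assumed Loki address

scrape_configs:
  - job_name: nginx
    static_configs:
      - targets: [localhost]
        labels:
          job: nginx
          __path__: /usr/local/nginx/logs/json_access.log   # assumed log path
```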
Format nginx access logs as JSON so the fields can be parsed downstream:
```nginx
log_format json_analytics escape=json '{'
    '"msec": "$msec", ' # request unixtime in seconds with a milliseconds resolution
    '"connection": "$connection", ' # connection serial number
    '"connection_requests": "$connection_requests", ' # number of requests made in connection
    '"pid": "$pid", ' # process pid
    '"request_id": "$request_id", ' # the unique request id
    '"request_length": "$request_length", ' # request length (including headers and body)
    '"remote_addr": "$remote_addr", ' # client IP
    '"remote_user": "$remote_user", ' # client HTTP username
    '"remote_port": "$remote_port", ' # client port
    '"time_local": "$time_local", '
    '"time_iso8601": "$time_iso8601", ' # local time in the ISO 8601 standard format
    '"request": "$request", ' # full path no arguments if the request
    '"request_uri": "$request_uri", ' # full path and arguments if the request
    '"args": "$args", ' # args
    '"status": "$status", ' # response status code
    '"body_bytes_sent": "$body_bytes_sent", ' # the number of body bytes exclude headers sent to a client
    '"bytes_sent": "$bytes_sent", ' # the number of bytes sent to a client
    '"http_referer": "$http_referer", ' # HTTP referer
    '"http_user_agent": "$http_user_agent", ' # user agent
    '"http_x_forwarded_for": "$http_x_forwarded_for", ' # http_x_forwarded_for
    '"http_host": "$http_host", ' # the request Host: header
    '"server_name": "$server_name", ' # the name of the vhost serving the request
    '"request_time": "$request_time", ' # request processing time in seconds with msec resolution
    '"upstream": "$upstream_addr", ' # upstream backend server for proxied requests
    '"upstream_connect_time": "$upstream_connect_time", ' # upstream handshake time incl. TLS
    '"upstream_header_time": "$upstream_header_time", ' # time spent receiving upstream headers
    '"upstream_response_time": "$upstream_response_time", ' # time spend receiving upstream body
    '"upstream_response_length": "$upstream_response_length", ' # upstream response length
    '"upstream_cache_status": "$upstream_cache_status", ' # cache HIT/MISS where applicable
    '"ssl_protocol": "$ssl_protocol", ' # TLS protocol
    '"ssl_cipher": "$ssl_cipher", ' # TLS cipher
    '"scheme": "$scheme", ' # http or https
    '"request_method": "$request_method", ' # request method
    '"server_protocol": "$server_protocol", ' # request protocol, like HTTP/1.1 or HTTP/2.0
    '"pipe": "$pipe", ' # "p" if request was pipelined, "." otherwise
    '"gzip_ratio": "$gzip_ratio", '
    '"http_cf_ray": "$http_cf_ray",'
    '"geoip_country_code": "$http_cf_ipcountry"'
    '}';

access_log logs/json_access.log json_analytics;
```
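Because each access-log line is a single JSON object, Loki's `json` parser can extract the fields at query time in Grafana Explore. Two illustrative LogQL queries, assuming the `job="nginx"` label from the Promtail sketch above:

```
# 5xx responses only
{job="nginx"} | json | status >= 500

# request count per status code over 5 minutes
sum by (status) (count_over_time({job="nginx"} | json [5m]))
```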
Start command:

```bash
nohup /home/neusoft/sdkjg/installfiles/promtail-linux-amd64 -config.file=/home/neusoft/sdkjg/installfiles/promtail-local-config-linux.yaml > /var/log/promtail.log 2>&1 &
```
## 5.loki
Loki, an open-source project from Grafana Labs, is a horizontally scalable, highly available, multi-tenant log aggregation system.
The Loki logging stack consists of three parts:
- loki is the main server, responsible for storing logs and processing queries.
- promtail is the agent, purpose-built for loki, that collects logs and ships them to loki.
- Grafana provides the UI.
Configuration files
- Microservices
```yaml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: ip
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093
```
- nginx
```yaml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: ip
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  ingestion_rate_mb: 30        # per-user ingestion rate limit, i.e. data volume per second; default is 4 MB
  ingestion_burst_size_mb: 15  # per-user ingestion burst size; default is 6 MB

ruler:
  alertmanager_url: http://localhost:9093
```
Start command:

```bash
nohup /home/neusoft/sdkjg/installfiles/loki-linux-amd64 -config.file=/home/neusoft/sdkjg/installfiles/loki-local-config-linux.yaml > /var/log/loki.log 2>&1 &
```
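Once Loki is running, its readiness endpoint and label API give a quick confirmation that it is accepting and indexing pushes; the port comes from the config above, the `localhost` host is an assumption:

```bash
# Returns "ready" once the ingester ring is up
curl -s http://localhost:3100/ready

# List the label names Loki has seen (should include the labels set by Promtail)
curl -s http://localhost:3100/loki/api/v1/labels
```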
Configure each monitoring component as a data source in Grafana and display the monitoring dashboards (a data-source provisioning sketch follows the diagram below).

{% asset_img grafana1.png %}

Server monitoring
{% asset_img grafana2.png %}

Nginx monitoring
{% asset_img grafana3.png %}

Redis monitoring
{% asset_img grafana4.png %}

{% plantuml %}
skinparam rectangle<<behavior>> {
    roundCorner 25
}
sprite $aService jar:archimate/application-service
sprite $bService jar:archimate/business-service

rectangle "Grafana" as Grafana <<$bService>><<behavior>> #Business
rectangle "Loki\nip:3100" as LokiNginx <<$bService>><<behavior>> #Business
rectangle "Loki\nip:3100" as LokiServer <<$bService>><<behavior>> #Business
rectangle "Prometheus\nip:9190" as Prometheus <<$bService>><<behavior>> #Business
rectangle "Redis\nip:6379" as Redis <<$bService>><<behavior>> #Business

Grafana -down- LokiServer
Grafana -down- LokiNginx
Grafana -down- Prometheus
Grafana -down- Redis

rectangle "Promtail(nginx)\nip:9080" as PromtailNginx <<$aService>><<behavior>> #Application
rectangle "Promtail(server)\nip:9080" as PromtailServer <<$aService>><<behavior>> #Application
rectangle "node_export\nip:9100" as nodeExport <<$aService>><<behavior>> #Application
rectangle "none" as none <<$aService>><<behavior>> #Application

archimate #Technology "Servers" as Servers <<technology-device>>

none -up-> Redis
nodeExport -up-> Prometheus
PromtailNginx -up-> LokiNginx
PromtailServer -up-> LokiServer

Servers -up-> none
Servers -up-> PromtailNginx
Servers -up-> PromtailServer
Servers -up-> nodeExport
{% endplantuml %}
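Data sources can be added in the Grafana UI or provisioned from a file. A minimal provisioning sketch, assuming Grafana's default provisioning directory, `localhost` hosts, and the ports used above (9190 for Prometheus, 3100 for Loki):

```yaml
# /etc/grafana/provisioning/datasources/monitoring.yaml (assumed path)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9190
  - name: Loki
    type: loki
    access: proxy
    url: http://localhost:3100
```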