promtail + Loki + Grafana: a log dashboard platform that helps developers locate problems quickly through logs

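One of our company's projects runs a single service across several servers. Whenever developers needed to look at the logs of a PHP or Go program, they had to pull them over FTP, one server at a time, and tracking down a single problem could take ages. We recently cut server costs and released several cloud hosts, keeping one machine with Docker on which Loki and Grafana are installed. Logs are collected with promtail, and its process is managed by supervisor.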

Docker installation: https://zhpengfei.com/install-docker-cluster/

1. Installing Loki and Grafana with docker-compose

1.1 Install the docker-compose command

curl -L https://get.daocloud.io/docker/compose/releases/download/1.21.1/docker-compose-`uname -s`-`uname -m` -o /usr/bin/docker-compose
chmod +x /usr/bin/docker-compose

1.2 Loki directory layout

[root@loki ~]# mkdir loki
[root@loki ~]# tree loki
loki
├── config
│   └── loki
│       ├── config.yaml
│       └── config.yamlbak
└── docker-compose.yaml
[root@loki ~]# cd docker-compose/loki

1.3 Write the docker-compose file for Loki and Grafana

[root@loki loki]# vim docker-compose.yaml
cat docker-compose.yaml
version: "3"

networks:
  loki:

services:
    loki:
        image: grafana/loki:latest
        ports:
          - "3100:3100"
          - "9095:9095"
        command: -config.file=/etc/loki/config.yaml
        volumes:
            - ./config/loki:/etc/loki
            - /data/loki:/loki
        networks:
          - loki
    grafana:
        image: grafana/grafana:latest
        ports:
          - "3000:3000"
        volumes:
            - /data/grafana:/var/lib/grafana
        environment:
            GF_SECURITY_ADMIN_PASSWORD: 123456
            GF_SERVER_HTTP_PORT: 3000
        networks:
          - loki

1.4 Create the Loki configuration file

Create the config/loki directory under the current directory

cd docker-compose/loki
mkdir -p config/loki/
vim config/loki/config.yaml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9095
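  grpc_server_max_recv_msg_size: 1572864000 # maximum gRPC message size the server will receive; default is 4 MiB
  grpc_server_max_send_msg_size: 1572864000 # maximum gRPC message size the server will send; default is 4 MiB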

ingester:
  lifecycler:
    address: 172.19.72.235
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 5m
  chunk_retain_period: 30s
  wal:
    dir: /loki/wal

compactor:
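  working_directory: /loki/persistent      # compactor working directory, usually also used as the persistent directory
  compaction_interval: 10m                 # interval between compactions
  retention_enabled: true                  # enable retention (deletion of expired logs)
  retention_delete_delay: 5m               # how long after expiry the data is deleted
  retention_delete_worker_count: 150       # number of goroutines deleting expired data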
schema_config:
    configs:
      - from: "2023-10-23"
        index:
            period: 24h
            prefix: loki_index_
        object_store: filesystem          # storage backend: local filesystem
        schema: v11
        store: boltdb-shipper

storage_config:
    boltdb_shipper:
        active_index_directory: /loki/boltdb-index    # index directory
        cache_location: /loki/boltdb-cache            # cache directory
    filesystem:
        directory: /loki/chunks                       # chunks directory
limits_config:
  retention_period: 240h                              # logs expire after 240h (10 days)

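Create the data directories and give them 777 permissions

Without 777 permissions, startup fails with the error mkdir /loki/chunks: permission denied, or one of the other directories cannot be created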

mkdir /data/loki
mkdir /data/grafana
chmod 777 /data/loki
chmod 777 /data/grafana
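A narrower alternative to mode 777 is to hand each directory to the user the container runs as. The sketch below assumes that the official images run as UID 10001 (Loki) and UID 472 (Grafana); those values are not part of the original setup, so check them against the image tags you actually pull. show_dir_perms is a hypothetical helper that prints the mode and owner of each directory so the result can be checked.

```shell
# Print "<octal mode> <owner uid> <path>" for each directory given (uses GNU stat).
show_dir_perms() {
  for dir in "$@"; do
    stat -c '%a %u %n' "$dir"
  done
}

# Assumed container UIDs (not from the original article; verify for your images).
LOKI_UID=10001
GRAFANA_UID=472

# Usage, as root on the Docker host:
#   chown -R "$LOKI_UID" /data/loki && chmod 755 /data/loki
#   chown -R "$GRAFANA_UID" /data/grafana && chmod 755 /data/grafana
#   show_dir_perms /data/loki /data/grafana
```

If the containers still report permission denied after this, the assumed UIDs are probably wrong for the image version in use, and falling back to 777 as described above is the simple way out.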

Start Loki and Grafana

docker-compose up -d
docker-compose logs # view the logs
docker-compose ps # view the running containers
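Before moving on, it can help to confirm that both containers actually answer over HTTP. The following is a minimal sketch, assuming curl is installed on the host and the services are reachable on the ports published above. wait_for_ready is a hypothetical helper; /ready (Loki) and /api/health (Grafana) are the services' own health endpoints.

```shell
# Poll a URL until it returns a successful HTTP status, or give up.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_ready() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors such as 503 while the service warms up
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}

# Usage:
#   wait_for_ready http://172.19.72.235:3100/ready
#   wait_for_ready http://172.19.72.235:3000/api/health
```

Loki may report not ready for a short while after startup, which is why the helper retries rather than checking once.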

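2. Installing promtail

The servers whose logs need collecting do not have Docker installed, so download the release package directly and start promtail from the command line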

wget https://github.com/grafana/loki/releases/download/v2.9.2/promtail-linux-amd64.zip # latest release at the time of writing; same version as Loki
unzip promtail-linux-amd64.zip
vim promtail-local-config.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
  grpc_server_max_recv_msg_size: 1572864000
  grpc_server_max_send_msg_size: 1572864000

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://172.19.72.235:3100/loki/api/v1/push

scrape_configs:
- job_name: api
  static_configs:
  - targets:
      - 172.19.72.235
    labels:
      job: api
      __path__: /data/runtime/logs/*/*.log

- job_name: websoket
  static_configs:
  - targets:
      - 172.19.72.235
    labels:
      job: websoket
      __path__: /data/logs/*.log

Start promtail

./promtail-linux-amd64 --config.file=promtail-local-config.yaml
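Once promtail is running, one way to confirm that logs are arriving is to ask Loki which values it has seen for the job label. This is a rough sketch, assuming curl is available and Loki is reachable at the address used in the clients section. label_values_url and response_has_value are hypothetical helpers; the /loki/api/v1/label/<name>/values endpoint is part of Loki's HTTP API.

```shell
LOKI_URL=${LOKI_URL:-http://172.19.72.235:3100}

# Build the URL of Loki's "label values" endpoint for a given label name.
# A trailing slash on the base URL is stripped.
label_values_url() {
  printf '%s/loki/api/v1/label/%s/values\n' "${1%/}" "$2"
}

# Succeed if the JSON response body contains the given value as a quoted string.
# This is a plain substring check, not a JSON parser.
response_has_value() {
  printf '%s' "$1" | grep -F -q -- "\"$2\""
}

# Usage (needs network access to Loki):
#   body=$(curl -fsS "$(label_values_url "$LOKI_URL" job)")
#   response_has_value "$body" api && echo "job=api is being ingested"
```

If the job does not show up, check promtail's own output first: it logs which files it is tailing and any errors pushing to Loki.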

Once it starts without problems, add it to supervisor

[program:promtail-log]
directory=/usr/local/data/promtail
command=/usr/local/data/promtail/promtail-linux-amd64 -config.file=promtail-local-config.yaml
autostart=true
autorestart=true
startsecs=5
priority=1
stopsignal=INT
stopwaitsecs=11
stopasgroup=true
killasgroup=true
[root@api1 promtail]# supervisorctl update
[root@api1 promtail]# supervisorctl status
promtail-log                     RUNNING   pid 11360, uptime 1 days, 0:38:48

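3. Problems encountered

Problem 1: permissions

If loki_loki_1 fails to start, the cause is almost always that /loki/persistent, /loki/wal, /loki/chunks, /loki/boltdb-index or /loki/boltdb-cache cannot be created. These directories are mounted from /data/loki on the local disk, so giving /data/loki 777 permissions and restarting the service is enough.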

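Problem 2: send and receive errors between promtail and Loki

When the log volume is too large, promtail reports the following error, and Loki reports an error on the receiving side as well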

status: 500. message: rpc error: code = ResourceExhausted desc = trying to send message larger than max (5066121 vs. 4194304)

Add the following parameters to the server section of both the Loki and the promtail configuration files, then restart the services

  grpc_server_max_recv_msg_size: 1572864000
  grpc_server_max_send_msg_size: 1572864000

The configuration files shown above already include these two parameters
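For a sense of scale, the numbers in the error message and in the fix convert as follows. This is plain shell arithmetic; bytes_to_mib is a hypothetical helper, not part of Loki or promtail.

```shell
# Convert a byte count to whole MiB (integer division).
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

bytes_to_mib 4194304            # default gRPC limit from the error message
bytes_to_mib 1572864000         # limit configured in the server sections
echo $(( 5066121 - 4194304 ))   # bytes by which the rejected message exceeded the default
```

This prints 4, 1500 and 871817: the rejected message was about 4.8 MiB against a 4 MiB default, and the configured limit is 1500 MiB. A much smaller limit may well be enough, depending on log volume.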

4. Configuring Grafana

4.1 Open Grafana in a browser

http://172.19.72.235:3000/

4.2 Add a data source

http://172.19.72.235:3000/connections/add-new-connection

Add the data source
Test the data source
Filter the logs

At this point Grafana can be handed over to the developers.
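The kind of filter developers run in Grafana can also be issued against Loki directly, which is handy for a quick check from a terminal. The sketch below assumes curl is installed and uses the job label defined in the promtail configuration. logql_contains and query_loki are hypothetical helpers; /loki/api/v1/query_range is Loki's HTTP query endpoint.

```shell
LOKI_URL=${LOKI_URL:-http://172.19.72.235:3100}

# Build a LogQL query: select one job's streams and keep lines containing a substring.
# The arguments are inserted as-is, so they must not contain double quotes.
logql_contains() {
  printf '{job="%s"} |= "%s"\n' "$1" "$2"
}

# Run a query against Loki's query_range endpoint and print the raw JSON.
# Without start/end parameters Loki applies its default time window.
query_loki() {
  curl -fsS -G "${LOKI_URL%/}/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=${2:-20}"
}

# Usage (needs network access to Loki):
#   query_loki "$(logql_contains api error)"
```

The string produced by logql_contains, for example {job="api"} |= "error", can be pasted unchanged into the query box of the Loki data source in Grafana.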
