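Network troubleshooting

Locating packet drops and errors

watch more /proc/net/dev is used to locate packet drops and errors and so find network bottlenecks. Watch the drop counters (packets discarded) and the total volume of traffic, which should stay below the link's capacity: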

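  • The leftmost column is the interface name; Receive covers received packets and Transmit covers sent packets;
  • bytes: number of bytes received or sent;
  • packets: number of packets received or sent correctly;
  • errs: number of packets received or sent with errors;
  • drop: number of packets dropped while receiving or sending;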
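
The fields listed above can also be read programmatically. The following is a minimal sketch, not a definitive tool: it assumes the usual two header lines followed by 16 counters per interface (8 Receive, then 8 Transmit). The function names and the sample numbers are made up for illustration.

```python
from typing import Dict

# Sample text in the /proc/net/dev layout; the numbers are invented.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    2    7    0     0          0         0  3000000    3500    0    1    0     0       0          0
"""


def parse_net_dev(text: str) -> Dict[str, Dict[str, int]]:
    """Return bytes, packets, errs and drop (rx and tx) per interface."""
    stats: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(x) for x in data.split()]
        # Receive occupies the first 8 columns, Transmit the next 8.
        stats[name.strip()] = {
            "rx_bytes": fields[0], "rx_packets": fields[1],
            "rx_errs": fields[2], "rx_drop": fields[3],
            "tx_bytes": fields[8], "tx_packets": fields[9],
            "tx_errs": fields[10], "tx_drop": fields[11],
        }
    return stats


def read_net_dev(path: str = "/proc/net/dev") -> Dict[str, Dict[str, int]]:
    """Parse the live counters on a Linux host."""
    with open(path) as f:
        return parse_net_dev(f.read())


if __name__ == "__main__":
    for iface, s in parse_net_dev(SAMPLE).items():
        print(iface, "rx_drop =", s["rx_drop"], "tx_drop =", s["tx_drop"])
```

The counters are cumulative since boot, so in practice it is the difference between two readings taken a few seconds apart that shows whether drops are still occurring.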

wiki.js

https://github.com/Requarks/wiki?tab=readme-ov-file

install

https://docs.requarks.io/install/docker

MySQL root@localhost:wiki> SHOW VARIABLES like "auto_increment%";
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| auto_increment_increment | 2     |
| auto_increment_offset    | 2     |
+--------------------------+-------+

auto_increment_offset: the starting point (initial value) of AUTO_INCREMENT column values. Range: 1 .. 65535

auto_increment_increment: the increment (step) between successive column values. Default: 1. Range: 1 .. 65535
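
To make the effect of the two variables concrete, here is a small sketch that generates the ids a fresh table would hand out. It is a simplified model rather than MySQL's implementation: it only covers the case offset <= increment (MySQL ignores the offset when it is larger than the increment), and the function name is hypothetical.

```python
from itertools import islice
from typing import Iterator


def auto_increment_values(offset: int, increment: int) -> Iterator[int]:
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    for name, value in (("offset", offset), ("increment", increment)):
        if not 1 <= value <= 65535:
            raise ValueError(f"{name} must be in 1..65535, got {value}")
    if offset > increment:
        # MySQL ignores the offset in this case; not modeled here.
        raise ValueError("offset > increment is not modeled in this sketch")
    value = offset
    while True:
        yield value
        value += increment


if __name__ == "__main__":
    # Settings from the SHOW VARIABLES output above.
    print(list(islice(auto_increment_values(2, 2), 4)))  # [2, 4, 6, 8]
    # MySQL defaults.
    print(list(islice(auto_increment_values(1, 1), 4)))  # [1, 2, 3, 4]
```

With offset 2 and increment 2 the ids skip every odd number, which is not the consecutive numbering that the Wiki.js check on the groups table expects.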

https://github.com/requarks/wiki/issues/1485

Incorrect groups auto-increment configuration! Should start at 0 and increment by 1. Contact your database administrator.

MySQL wikijs@localhost:wiki> truncate groups
You're about to run a destructive command.
Do you want to proceed? (y/n): y
Your call!
(1701, 'Cannot truncate a table referenced in a foreign key constraint (wiki.usergroups, CONSTRAINT usergroups_groupid_foreign)')

#SET FOREIGN_KEY_CHECKS=0;

/etc/my.cnf settings

Using GitLab

Installation

https://docs.gitlab.com/ee/install/docker.html

Volume directories

mkdir  -p  /mnt/oss/gitlab/{config,logs,data}
docker-compose.yml
version: '3.6'
services:
  web:
    image: 'gitlab/gitlab-ce:16.6.2-ce.0'
    restart: always
    container_name: gitlab
    # hostname: 'gitlab-ce' # the hostname key sets the container's host name
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'https://gitlab.ui.k8s.cn:18443'
        letsencrypt['enable'] = false
        gitlab_rails['gitlab_ssh_host'] = 'gitlab.ui.k8s.cn'
        gitlab_rails['gitlab_shell_ssh_port'] = 2224
        gitlab_rails['gravatar_enabled'] = true
        #### For HTTPS
        gitlab_rails['gravatar_ssl_url'] = "https://seccdn.libravatar.org/avatar/%{hash}?s=%{size}&d=identicon"
        #### Use this line instead for HTTP
        # gitlab_rails['gravatar_plain_url'] = "http://cdn.libravatar.org/avatar/%{hash}?s=%{size}&d=identicon"

        # gitlab_rails['initial_root_password'] = 'c123456;'

        # Copy the crt certificate into the mounted directory
        nginx['ssl_certificate'] = "/etc/gitlab/ssl/gitlab.ui.k8s.cn.crt"
        nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/gitlab.ui.k8s.cn.key"
        # Redirect HTTP requests to the HTTPS address automatically
        nginx['redirect_http_to_https'] = true

        nginx['enable'] = true
        nginx['client_max_body_size'] = '250m'
        # Listen on port 443 inside the container, not port 443 on the host
        nginx['listen_port'] = 443

        nginx['ssl_protocols'] = "TLSv1.1 TLSv1.2"
        nginx['logrotate_frequency'] = "weekly"
        nginx['logrotate_rotate'] = 52
        nginx['logrotate_compress'] = "compress"
        nginx['logrotate_method'] = "copytruncate"
        nginx['logrotate_delaycompress'] = "delaycompress"

        nginx['proxy_set_headers'] = {
          "X-Forwarded-Proto" => "https",
          "X-Forwarded-Ssl" => "on",
        }

        nginx['custom_error_pages'] = {
          '404' => {
            'title' => 'Example title',
            'header' => 'Example header',
            'message' => 'Example message'
          }
        }

        # gitlab_rails['smtp_enable'] = true
        # gitlab_rails['smtp_address'] = "smtp.example.com"
        # gitlab_rails['smtp_port'] = 587
        # gitlab_rails['smtp_user_name'] = "no-reply@example.com"
        # gitlab_rails['smtp_password'] = "changeMeToSomethingGood"
        # gitlab_rails['smtp_domain'] = "example.com"
        # gitlab_rails['smtp_authentication'] = "login"
        # gitlab_rails['smtp_enable_starttls_auto'] = true
    ports:
      - '8980:80'
      - '18443:443'
      - '2224:22'
    volumes:
      - '/mnt/oss/gitlab/config:/etc/gitlab'
      - '/mnt/oss/gitlab/logs:/var/log/gitlab'
      - '/mnt/oss/gitlab/data:/var/opt/gitlab'
      - '/mnt/oss/gitlab/certs:/etc/gitlab/ssl'
    shm_size: '256m'

  gitlab-runner:
    image: 'gitlab/gitlab-runner:v16.9.0'
    restart: unless-stopped
    container_name: 'gitlab-runner'
    depends_on:
      - web
    privileged: true
    extra_hosts:
      - "gitlab.ui.k8s.cn:192.168.122.1" # add a host name mapping
    volumes:
      - /mnt/oss/gitlab/runner/config:/etc/gitlab-runner
      - /var/run/docker.sock:/var/run/docker.sock
      - /mnt/oss/gitlab/certs/gitlab.ui.k8s.cn.crt:/home/gitlab-runner/gitlab.ui.k8s.cn.crt

Changing the password

PostgreSQL overview

install

Official site

https://www.postgresql.org/download/
https://www.postgresql.org/ftp/source/

centos

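First install the PostgreSQL YUM repository
yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

Install the server
yum install -y postgresql13-server

Initialize the database
/usr/pgsql-13/bin/postgresql-13-setup initdb

After initialization, PostgreSQL's default data directory is under /var/lib/pgsql

Move the original data directory (the target must match the symlink created in the next step)
mkdir -p /data/databases && mv /var/lib/pgsql /data/databases/

Create a symlink
cd /var/lib && ln -s /data/databases/pgsql pgsql

The default password is set to "postgres"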

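Configuration file

/etc/postgresql/13/main/postgresql.conf (Debian/Ubuntu layout; with the PGDG RPM install above, the file is in the data directory: /var/lib/pgsql/13/data/postgresql.conf)

postgresql.conf
Key parameters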
#connection control
listen_addresses = '*'
max_connections = 2000
superuser_reserved_connections = 10     
tcp_keepalives_idle = 60               
tcp_keepalives_interval = 10         
tcp_keepalives_count = 10        
password_encryption = md5      

#memory management
shared_buffers = 16GB                   # recommended: 1/4 of physical memory
max_prepared_transactions = 2000
work_mem = 8MB
maintenance_work_mem = 2GB
autovacuum_work_mem = 1GB
dynamic_shared_memory_type = posix
max_files_per_process = 24800
effective_cache_size = 32GB             # recommended: 1/2 of physical memory

#write optimization
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 10.0
bgwriter_flush_after = 512kB
effective_io_concurrency = 0
max_worker_processes = 256
max_parallel_maintenance_workers = 6
max_parallel_workers_per_gather = 0
max_parallel_workers = 28

#wal optimization
synchronous_commit = remote_write
full_page_writes = on
wal_compression = on
wal_writer_delay = 10ms
wal_writer_flush_after = 1MB
commit_delay = 10
commit_siblings = 5
checkpoint_timeout = 30min
max_wal_size = 32GB
min_wal_size = 16GB
archive_mode = on
max_wal_senders = 64
wal_keep_segments = 15                  # PostgreSQL 12 and earlier; replaced by wal_keep_size in PostgreSQL 13
wal_sender_timeout = 60s
max_replication_slots = 64
hot_standby_feedback = off

#log optimization
log_destination = 'csvlog'
logging_collector = on
log_directory = '/pg12.4/logs'          # log directory; plan and create it on the system beforehand
log_filename = 'postgresql-%a.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 1GB

#audit settings
log_min_duration_statement = 5s
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose
log_line_prefix = '%m [%p] %q %u %d %a %r %e '
log_statement = 'ddl'
log_timezone = 'PRC'
track_io_timing = on
track_activity_query_size = 2048

#autovacuum
autovacuum = on
vacuum_cost_delay = 0
old_snapshot_threshold = 6h
log_autovacuum_min_duration = 0
autovacuum_max_workers = 8
autovacuum_vacuum_scale_factor = 0.02
autovacuum_analyze_scale_factor = 0.01
autovacuum_freeze_max_age = 1200000000
autovacuum_multixact_freeze_max_age = 1250000000
autovacuum_vacuum_cost_delay = 0ms

#system environment
datestyle = 'iso, mdy'
timezone = 'Asia/Shanghai'
lc_messages = 'en_US.utf8'
lc_monetary = 'en_US.utf8'
lc_numeric = 'en_US.utf8'
lc_time = 'en_US.utf8'
default_text_search_config = 'pg_catalog.english'
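
The two sizing rules noted in the memory section (shared_buffers at 1/4 of physical memory, effective_cache_size at 1/2) can be written down as a small helper. This is only a sketch of that rule of thumb, not a tuning tool; the function name and the 64 GB host in the example are assumptions chosen because they reproduce the 16GB and 32GB values used above.

```python
from typing import Dict


def suggest_memory_settings(total_ram_gb: int) -> Dict[str, str]:
    """Apply the 1/4 and 1/2 rules of thumb to a host's physical memory."""
    if total_ram_gb < 4:
        raise ValueError("this sketch works in whole gigabytes; use >= 4 GB")
    return {
        "shared_buffers": f"{total_ram_gb // 4}GB",        # 1/4 of RAM
        "effective_cache_size": f"{total_ram_gb // 2}GB",  # 1/2 of RAM
    }


if __name__ == "__main__":
    # Hypothetical 64 GB host.
    for key, value in suggest_memory_settings(64).items():
        print(f"{key} = {value}")
```

The remaining memory parameters (work_mem, maintenance_work_mem and so on) depend on workload and connection count, so they are deliberately left out of the helper.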

thanos storage

Documentation

https://github.com/thanos-io/thanos/tree/main

https://thanos.io/v0.33/thanos/getting-started.md/

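Components:

Sidecar: connects to Prometheus and exposes its data to the query gateway (Querier/Query) for real-time queries; it can also upload Prometheus data to cloud storage for long-term retention (in effect, it links the local Prometheus to the Querier)
Query gateway (Querier/Query): implements the Prometheus API and aggregates data from the underlying components (such as the Sidecar or the Store Gateway). It can query data held by a sidecar or data behind the Store Gateway. Some data may still be local because the sidecar has not uploaded it yet; in that case the query is routed to the local sidecar based on the query time range, while data already in remote storage is read through the Store Gateway
Store Gateway: exposes the data held in cloud storage
Compactor: compacts and downsamples the data in cloud storage
Receiver: receives data from Prometheus's remote-write WAL (write-ahead log), then exposes it or uploads it to cloud storage (an alternative approach to the sidecar)
Ruler: evaluates rules against the monitoring data and raises alerts
Bucket: mainly used to show how historical data is stored in object storage, including each source's block compaction level, resolution, time range covered, and duration.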
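
The time-based routing described for the Querier can be pictured with a small model: each store advertises the time range it holds, and a query only fans out to stores whose range overlaps the query window. This is a simplified sketch of the idea, not Thanos code; the Store class, the select_stores function and the two-hour boundary are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Store:
    name: str
    min_time: int  # oldest sample held, unix seconds
    max_time: int  # newest sample held, unix seconds


def select_stores(stores: List[Store], start: int, end: int) -> List[str]:
    """Return the stores whose time range overlaps the window [start, end]."""
    return [s.name for s in stores if s.min_time <= end and s.max_time >= start]


if __name__ == "__main__":
    now = 100_000
    stores = [
        # Assumption: the sidecar still holds the last two hours locally.
        Store("sidecar", now - 7200, now),
        # Assumption: everything older has been uploaded to object storage.
        Store("store-gateway", 0, now - 7200),
    ]
    print(select_stores(stores, now - 1800, now))   # ['sidecar']
    print(select_stores(stores, now - 86400, now))  # ['sidecar', 'store-gateway']
```

A query over the last 30 minutes touches only the sidecar, while a 24-hour query needs both sources and the results are merged.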

Ports

Component        Interface                 Port
Sidecar          gRPC                      10901
Sidecar          HTTP                      10902
Query            gRPC                      10903
Query            HTTP                      10904
Store            gRPC                      10905
Store            HTTP                      10906
Receive          gRPC (store API)          10907
Receive          HTTP (remote write API)   10908
Receive          HTTP                      10909
Rule             gRPC                      10910
Rule             HTTP                      10911
Compact          HTTP                      10912
Query Frontend   HTTP                      10913

From a usage perspective there are two ways to run Thanos: sidecar mode (remote read API; deployed in the same pod or on the same host as the Prometheus server) and receiver mode (Prometheus Remote Write API).

grafana loki logs

Loki

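grafana/loki-stack: monolithic mode
grafana/loki-canary: canary
grafana/loki-distributed: distributed; microservices mode, suited to larger production deployments
grafana/loki-simple-scalable: simple scalable mode with separate read and write paths

Components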

  • Read: {QueryFrontend, Querier},
  • Write: {Ingester, Distributor},
  • Backend: {QueryScheduler, Ruler, Compactor, IndexGateway}

In loki-distributed, the ingester, distributor, querier, and query-frontend components are always installed; the other components are optional

loki-log

k8s alert

Alertmanager

conf

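  • Global: global configuration for shared settings such as the email account, password, SMTP server, and WeChat alerting. Options set in the Global block are visible to every other section of this configuration file, but sub-configurations elsewhere in the file can override them;
  • Templates: where custom notification templates are located;
  • Route: alert routing configuration, used to group and route alerts so that different groups go to different recipients, for example database alerts to the DBAs and server alerts to OPS;
  • Inhibit_rules: alert inhibition, mainly used to cut down the number of notifications and prevent an "alert storm". For example, when a host goes down it can trigger container rebuilds, rescheduling, service unavailability, and other follow-on problems. If every one of these raised its own alert, a flood of notifications would go out at once and get in the way of diagnosing the problem. Inhibition suppresses the alerts caused by the host failure so that only the host-down alert is sent;
  • Receivers: alert recipient configuration. Each receiver has a name; after a route groups and routes an alert it must name a receiver, and that receiver is defined here.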
global:
  # An alert is either "firing" or "resolved": a firing alert that is not updated within this time is marked resolved
  resolve_timeout: 5m
  http_config:
    follow_redirects: true
    enable_http2: true
  smtp_hello: localhost
  smtp_require_tls: true
  # Email
  smtp_smarthost: 'smtp.exmail.qq.com:25'
  smtp_from: 'xxx@xxx.com'
  smtp_auth_username: 'xxx@xxx.com'
  smtp_auth_password: 'xxx'
  # WeCom (Enterprise WeChat)
  # wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/

route:
  # Default receiver: any alert not matched by a child route is sent here
  receiver: default-receiver
  continue: false
  group_wait: 10s
  # Interval before notifying about new alerts added to a group that was already notified
  group_interval: 5m
  # Interval before re-sending a notification that was delivered but whose alerts are still firing
  repeat_interval: 3h
receivers:
  - name: default-receiver
templates:
  - /etc/alertmanager/*.tmpl
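
The inhibition behavior described in the section list (a host-down alert muting the alerts it causes) can be sketched in a few lines. This is a simplified model of how an inhibit rule is evaluated, not Alertmanager's code: a target alert is muted when some other firing alert matches the source matchers and both carry the same values for the labels listed in equal. The alert names HostDown and PodCrashLooping and the node label are hypothetical.

```python
from typing import Dict, List

Alert = Dict[str, str]  # the label set of a firing alert


def matches(alert: Alert, matchers: Dict[str, str]) -> bool:
    """True if the alert carries every label=value pair in matchers."""
    return all(alert.get(k) == v for k, v in matchers.items())


def is_inhibited(target: Alert, firing: List[Alert],
                 source_match: Dict[str, str],
                 target_match: Dict[str, str],
                 equal: List[str]) -> bool:
    """Decide whether target is muted by any firing source alert."""
    if not matches(target, target_match):
        return False
    for source in firing:
        if source is target:
            continue  # an alert does not inhibit itself
        same_labels = all(source.get(l) == target.get(l) for l in equal)
        if matches(source, source_match) and same_labels:
            return True
    return False


if __name__ == "__main__":
    firing = [
        {"alertname": "HostDown", "node": "node1"},
        {"alertname": "PodCrashLooping", "node": "node1"},
        {"alertname": "PodCrashLooping", "node": "node2"},
    ]
    to_notify = [
        a for a in firing
        if not is_inhibited(a, firing,
                            source_match={"alertname": "HostDown"},
                            target_match={"alertname": "PodCrashLooping"},
                            equal=["node"])
    ]
    for alert in to_notify:
        print(alert)
```

Only the HostDown alert for node1 and the PodCrashLooping alert for node2 remain; the pod alert on node1 is suppressed because it shares the node label with the firing host alert.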