Building Highly Available MySQL, Redis, and Kafka Services with Docker

The following is a practice-oriented Docker tutorial showing how to build highly available MySQL, Redis, and Kafka services, with data directories mounted on the host for persistence. It covers both single-node quick starts and production-grade high-availability (HA) setups (with Docker Compose), along with common pitfalls and operational notes.

Notes:

  • All examples use Docker Compose v2 syntax (compose.yaml). Install Docker and Docker Compose first.
  • Adjust directory paths as needed; make sure the host has enough disk space and IOPS.
  • For multi-node HA setups, prepare multiple machines with proper hostnames and network connectivity between them (or use Swarm/Kubernetes).

Contents

  • 1. MySQL high availability (MGR / InnoDB Cluster, primary-replica replication)
  • 2. Redis high availability (Redis Sentinel and Redis Cluster)
  • 3. Kafka high availability (multi-broker + ZooKeeper, or KRaft mode)
  • 4. General operations advice and common issues

1. MySQL High Availability and Persistence
Goals:

  • Quick start: single-node MySQL with persistent data.
  • Production HA: MySQL Group Replication / InnoDB Cluster, or primary-replica replication with automatic failover (MHA/Orchestrator/ProxySQL + Keepalived).

A. Quick Start (Single Node)
Directory layout:

  • /data/mysql/single/{data,conf,logs}

Example compose.yaml:

yaml
services:
  mysql:
    image: mysql:8.0
    container_name: mysql-single
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: StrongRootPassw0rd!
      MYSQL_DATABASE: appdb
      MYSQL_USER: appuser
      MYSQL_PASSWORD: AppPassw0rd!
      TZ: Asia/Shanghai
    command:
      - --character-set-server=utf8mb4
      - --collation-server=utf8mb4_0900_ai_ci
      - --default-authentication-plugin=caching_sha2_password
      - --innodb-buffer-pool-size=1G
      - --log-bin=mysql-bin
      - --server-id=1
    volumes:
      - /data/mysql/single/data:/var/lib/mysql
      - /data/mysql/single/conf:/etc/mysql/conf.d
      - /data/mysql/single/logs:/var/log/mysql
    ports:
      - "3306:3306"

Notes:

  • Data mount: /var/lib/mysql → /data/mysql/single/data.
  • Log mount: /var/log/mysql; adjust to the image's actual log path if needed.
  • Place custom my.cnf fragments in the conf directory.

Start:

  • mkdir -p /data/mysql/single/{data,conf,logs}
  • docker compose up -d
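As a starting point, a minimal my.cnf fragment for the conf directory might look like the following (the values are illustrative, not recommendations; size everything to your host):

```ini
# /data/mysql/single/conf/my.cnf — illustrative overrides
[mysqld]
max_connections        = 500
slow_query_log         = 1
slow_query_log_file    = /var/log/mysql/slow.log
long_query_time        = 1
# durability: flush redo and binlog on every commit
innodb_flush_log_at_trx_commit = 1
sync_binlog            = 1
```

Any file dropped into /data/mysql/single/conf is picked up via the /etc/mysql/conf.d mount, so settings can be changed without rebuilding the compose file.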

B. Production HA Option 1: MySQL Group Replication (MGR) / InnoDB Cluster
Characteristics:

  • Native single-primary/multi-primary replication with automatic membership management. Recommended via InnoDB Cluster (managed with MySQL Shell).
  • Usually requires 3 nodes (an odd number) on the same low-latency network.

Directory layout (per node):

  • /data/mysql/mgr/nodeX/{data,conf,logs}

Simplified example (multiple ports on one machine for demonstration; deploy across machines in production):

yaml
services:
  mysql1:
    image: mysql:8.0
    container_name: mysql1
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: RootPass#1
      TZ: Asia/Shanghai
    command:
      - --server-id=1
      - --log-bin=mysql-bin
      - --binlog-checksum=NONE
      - --gtid-mode=ON
      - --enforce-gtid-consistency=ON
      - --transaction-write-set-extraction=XXHASH64
      - --plugin-load-add=mysql_clone.so
      - --loose-group-replication-group-name=aaaaaaaa-bbbb-cccc-dddd-eeeeeeee0001
      - --loose-group-replication-start-on-boot=off
      - --loose-group-replication-local-address=mysql1:33061
      - --loose-group-replication-group-seeds=mysql1:33061,mysql2:33061,mysql3:33061
      - --loose-group-replication-single-primary-mode=ON
      - --loose-group-replication-enforce-update-everywhere-checks=OFF
    volumes:
      - /data/mysql/mgr/node1/data:/var/lib/mysql
      - /data/mysql/mgr/node1/conf:/etc/mysql/conf.d
      - /data/mysql/mgr/node1/logs:/var/log/mysql
    ports: ["3307:3306"]

  mysql2:
    image: mysql:8.0
    container_name: mysql2
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: RootPass#2
      TZ: Asia/Shanghai
    command:
      - --server-id=2
      - --log-bin=mysql-bin
      - --binlog-checksum=NONE
      - --gtid-mode=ON
      - --enforce-gtid-consistency=ON
      - --transaction-write-set-extraction=XXHASH64
      - --plugin-load-add=mysql_clone.so
      - --loose-group-replication-group-name=aaaaaaaa-bbbb-cccc-dddd-eeeeeeee0001
      - --loose-group-replication-start-on-boot=off
      - --loose-group-replication-local-address=mysql2:33061
      - --loose-group-replication-group-seeds=mysql1:33061,mysql2:33061,mysql3:33061
      - --loose-group-replication-single-primary-mode=ON
      - --loose-group-replication-enforce-update-everywhere-checks=OFF
    volumes:
      - /data/mysql/mgr/node2/data:/var/lib/mysql
      - /data/mysql/mgr/node2/conf:/etc/mysql/conf.d
      - /data/mysql/mgr/node2/logs:/var/log/mysql
    ports: ["3308:3306"]

  mysql3:
    image: mysql:8.0
    container_name: mysql3
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: RootPass#3
      TZ: Asia/Shanghai
    command:
      - --server-id=3
      - --log-bin=mysql-bin
      - --binlog-checksum=NONE
      - --gtid-mode=ON
      - --enforce-gtid-consistency=ON
      - --transaction-write-set-extraction=XXHASH64
      - --plugin-load-add=mysql_clone.so
      - --loose-group-replication-group-name=aaaaaaaa-bbbb-cccc-dddd-eeeeeeee0001
      - --loose-group-replication-start-on-boot=off
      - --loose-group-replication-local-address=mysql3:33061
      - --loose-group-replication-group-seeds=mysql1:33061,mysql2:33061,mysql3:33061
      - --loose-group-replication-single-primary-mode=ON
      - --loose-group-replication-enforce-update-everywhere-checks=OFF
    volumes:
      - /data/mysql/mgr/node3/data:/var/lib/mysql
      - /data/mysql/mgr/node3/conf:/etc/mysql/conf.d
      - /data/mysql/mgr/node3/logs:/var/log/mysql
    ports: ["3309:3306"]

networks:
  default:
    name: mysql-mgr-net

Initialization (example inside the mysql1 container):

  • docker exec -it mysql1 mysql -uroot -p
  • Create the replication user and configure the recovery channel (similar on all 3 nodes):
    • CREATE USER 'repl'@'%' IDENTIFIED BY 'ReplPass!23';
    • GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl'@'%';
    • SET SQL_LOG_BIN=0; CREATE USER 'mysql_innodb_cluster_1'@'%' IDENTIFIED BY 'IcPass!23'; GRANT ALL PRIVILEGES ON *.* TO 'mysql_innodb_cluster_1'@'%'; SET SQL_LOG_BIN=1;
  • Install and start MGR (single-primary mode, first node only bootstraps the group):
    • INSTALL PLUGIN group_replication SONAME 'group_replication.so';
    • CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='ReplPass!23' FOR CHANNEL 'group_replication_recovery';
    • SET GLOBAL group_replication_bootstrap_group=ON; START GROUP_REPLICATION; SET GLOBAL group_replication_bootstrap_group=OFF;
  • On mysql2/mysql3, create the same users and run the same CHANGE MASTER, then run START GROUP_REPLICATION (without setting group_replication_bootstrap_group).
  • Verify: SELECT * FROM performance_schema.replication_group_members;
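Beyond listing the members, it is worth checking that every member is ONLINE and which node currently holds the primary role (column names as in MySQL 8.0's performance_schema):

```sql
-- All members should report MEMBER_STATE = 'ONLINE'
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;

-- Current primary (single-primary mode)
SELECT MEMBER_HOST
FROM performance_schema.replication_group_members
WHERE MEMBER_ROLE = 'PRIMARY';
```

A member stuck in RECOVERING usually means the recovery channel credentials or the clone plugin setup need attention.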

Production recommendations:

  • Use MySQL Shell to create an InnoDB Cluster, which simplifies initialization and topology maintenance.
  • Put ProxySQL or MySQL Router in front for read/write splitting and automatic primary routing.
  • Tuning: innodb_buffer_pool_size, innodb_io_capacity, sync_binlog=1, binlog group commit, semi-sync replication (if using classic primary-replica replication).

C. Production HA Option 2: Primary-Replica Replication + ProxySQL + Keepalived

  • At least 1 primary and 2 replicas; ProxySQL handles routing; Keepalived provides a floating virtual IP.
  • Failover can be automated with Orchestrator, which promotes a replica to primary.
  • Unlike MGR, the focus here is GTID-based primary-replica replication and health checks at the routing layer.
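As a rough sketch of what the routing layer looks like (statements against ProxySQL's admin interface on port 6032; the hostgroup IDs and the `^SELECT` match pattern are arbitrary choices for illustration):

```sql
-- Writer hostgroup 10, reader hostgroup 20
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, 'mysql1', 3306);
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (20, 'mysql2', 3306);
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (20, 'mysql3', 3306);

-- Route SELECTs to the readers, everything else to the writer
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (1, 1, '^SELECT', 20, 1);

LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
```

In a real deployment ProxySQL's health checks (and Orchestrator's promotion hooks) keep the writer hostgroup pointed at whichever node is currently primary.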

2. Redis High Availability and Persistence
Goals:

  • Quick start: single-node Redis with AOF/RDB persistence.
  • Production HA: Redis Sentinel (primary-replica + sentinels) or Redis Cluster (sharding + HA).

A. Quick Start (Single Node)
Directory layout:

  • /data/redis/single/{data,conf}

Example redis.conf (place at /data/redis/single/conf/redis.conf):

bind 0.0.0.0
port 6379
protected-mode yes
requirepass StrongRedisPass!23
appendonly yes
appendfsync everysec
dir /data

compose.yaml:

yaml
services:
  redis:
    image: redis:7.2
    container_name: redis-single
    restart: always
    command: ["redis-server", "/usr/local/etc/redis/redis.conf"]
    volumes:
      - /data/redis/single/conf/redis.conf:/usr/local/etc/redis/redis.conf:ro
      - /data/redis/single/data:/data
    ports:
      - "6379:6379"

Start:

  • mkdir -p /data/redis/single/{data,conf}
  • Write redis.conf
  • docker compose up -d
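A quick smoke test for persistence (the container name and password come from the compose file above):

```shell
# Write a key, restart the container, then read the key back;
# with appendonly yes the write survives the restart
docker exec redis-single redis-cli -a 'StrongRedisPass!23' SET smoke-test ok
docker restart redis-single
docker exec redis-single redis-cli -a 'StrongRedisPass!23' GET smoke-test
```

If GET returns nothing after the restart, check that /data/redis/single/data actually contains the appendonlydir/RDB files and that `dir /data` is in effect.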

B. Redis Sentinel HA (1 Primary, 2 Replicas, 3 Sentinels)
Directory layout:

  • /data/redis/sentinel/{master,slave1,slave2}/data
  • /data/redis/sentinel/sentinel{1,2,3}

compose.yaml (single-machine demonstration):

yaml
services:
  redis-master:
    image: redis:7.2
    command: ["redis-server", "--appendonly", "yes", "--requirepass", "Redis#Pass", "--masterauth", "Redis#Pass"]
    volumes:
      - /data/redis/sentinel/master/data:/data
    ports: ["6380:6379"]
    restart: always

  redis-slave1:
    image: redis:7.2
    command: ["redis-server", "--replicaof", "redis-master", "6379", "--requirepass", "Redis#Pass", "--masterauth", "Redis#Pass", "--appendonly", "yes"]
    depends_on: [redis-master]
    volumes:
      - /data/redis/sentinel/slave1/data:/data
    ports: ["6381:6379"]
    restart: always

  redis-slave2:
    image: redis:7.2
    command: ["redis-server", "--replicaof", "redis-master", "6379", "--requirepass", "Redis#Pass", "--masterauth", "Redis#Pass", "--appendonly", "yes"]
    depends_on: [redis-master]
    volumes:
      - /data/redis/sentinel/slave2/data:/data
    ports: ["6382:6379"]
    restart: always

  sentinel1:
    image: redis:7.2
    command: ["redis-sentinel", "/etc/redis/sentinel.conf"]
    volumes:
      - /data/redis/sentinel/sentinel1:/etc/redis
    ports: ["26379:26379"]
    depends_on: [redis-master, redis-slave1, redis-slave2]
    restart: always

  sentinel2:
    image: redis:7.2
    command: ["redis-sentinel", "/etc/redis/sentinel.conf"]
    volumes:
      - /data/redis/sentinel/sentinel2:/etc/redis
    ports: ["26380:26379"]
    depends_on: [redis-master, redis-slave1, redis-slave2]
    restart: always

  sentinel3:
    image: redis:7.2
    command: ["redis-sentinel", "/etc/redis/sentinel.conf"]
    volumes:
      - /data/redis/sentinel/sentinel3:/etc/redis
    ports: ["26381:26379"]
    depends_on: [redis-master, redis-slave1, redis-slave2]
    restart: always

Write a sentinel.conf for each sentinel (sentinel1 shown; the other two differ only in port):
Path: /data/redis/sentinel/sentinel1/sentinel.conf

port 26379
bind 0.0.0.0
dir /tmp
sentinel monitor mymaster redis-master 6379 2
sentinel auth-pass mymaster Redis#Pass
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel resolve-hostnames yes

  • For sentinel2/sentinel3, change port to 26380/26381.
  • resolve-hostnames (Redis 6.2+) is required here because the master is referenced by container name rather than IP.
  • Verify: when the primary fails, the sentinels elect a replica as the new primary. Clients must implement sentinel discovery: ask a sentinel for the current primary's address.
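The discovery step itself is simple: a client asks any sentinel `SENTINEL get-master-addr-by-name mymaster` and receives a two-element reply of host and port. A minimal sketch of the client-side handling (the reply values are hypothetical; production clients such as redis-py's `Sentinel` class do this, plus retrying across sentinels, for you):

```python
def parse_master_addr(reply):
    """Parse the [host, port] reply of SENTINEL get-master-addr-by-name.

    A sentinel that does not know the service returns a nil reply,
    which we surface as an error so the caller can try another sentinel.
    """
    if not reply or len(reply) != 2:
        raise ValueError("no master known for this service name")
    host, port = reply
    return str(host), int(port)

# Example reply as a sentinel would return it (hypothetical address):
host, port = parse_master_addr(["redis-master", "6379"])
```

After a failover the sentinels start returning the new primary's address, so clients that re-resolve on connection errors follow the topology automatically.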

C. Redis Cluster (Sharding + HA)

  • At least 6 nodes (3 primaries, 3 replicas).
  • Enabled with --cluster-enabled yes.

Example (multiple ports on one machine; use separate machines in production):

yaml
services:
  redis-node-7001:
    image: redis:7.2
    command: ["redis-server", "--port","7001","--cluster-enabled","yes","--cluster-config-file","nodes.conf","--cluster-node-timeout","5000","--appendonly","yes","--requirepass","Redis#Pass","--masterauth","Redis#Pass"]
    volumes:
      - /data/redis/cluster/7001:/data
    ports: ["7001:7001"]
  redis-node-7002:
    image: redis:7.2
    command: ["redis-server", "--port","7002","--cluster-enabled","yes","--cluster-config-file","nodes.conf","--cluster-node-timeout","5000","--appendonly","yes","--requirepass","Redis#Pass","--masterauth","Redis#Pass"]
    volumes:
      - /data/redis/cluster/7002:/data
    ports: ["7002:7002"]
  redis-node-7003:
    image: redis:7.2
    command: ["redis-server", "--port","7003","--cluster-enabled","yes","--cluster-config-file","nodes.conf","--cluster-node-timeout","5000","--appendonly","yes","--requirepass","Redis#Pass","--masterauth","Redis#Pass"]
    volumes:
      - /data/redis/cluster/7003:/data
    ports: ["7003:7003"]
  redis-node-7004:
    image: redis:7.2
    command: ["redis-server", "--port","7004","--cluster-enabled","yes","--cluster-config-file","nodes.conf","--cluster-node-timeout","5000","--appendonly","yes","--requirepass","Redis#Pass","--masterauth","Redis#Pass"]
    volumes:
      - /data/redis/cluster/7004:/data
    ports: ["7004:7004"]
  redis-node-7005:
    image: redis:7.2
    command: ["redis-server", "--port","7005","--cluster-enabled","yes","--cluster-config-file","nodes.conf","--cluster-node-timeout","5000","--appendonly","yes","--requirepass","Redis#Pass","--masterauth","Redis#Pass"]
    volumes:
      - /data/redis/cluster/7005:/data
    ports: ["7005:7005"]
  redis-node-7006:
    image: redis:7.2
    command: ["redis-server", "--port","7006","--cluster-enabled","yes","--cluster-config-file","nodes.conf","--cluster-node-timeout","5000","--appendonly","yes","--requirepass","Redis#Pass","--masterauth","Redis#Pass"]
    volumes:
      - /data/redis/cluster/7006:/data
    ports: ["7006:7006"]

Initialize the cluster (run once, inside any node's container):

  • docker exec -it redis-node-7001 redis-cli -a Redis#Pass --cluster create
    redis-node-7001:7001 redis-node-7002:7002 redis-node-7003:7003
    redis-node-7004:7004 redis-node-7005:7005 redis-node-7006:7006
    --cluster-replicas 1

Notes:

  • For cross-host deployments, set cluster-announce-ip (with cluster-announce-port and cluster-announce-bus-port) so that nodes advertise addresses reachable from the other hosts.
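Cluster routing is deterministic: a key's slot is CRC16(key) mod 16384, and if the key contains a {...} hash tag, only the tag content is hashed, which lets related keys share a slot. A minimal sketch of the slot computation (CRC16-CCITT/XMODEM, the variant Redis uses):

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), the polynomial Redis Cluster uses (0x1021, init 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Hash slot for a key, honoring the first non-empty {...} hash tag."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # empty tags are ignored
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# {user}:profile and {user}:cart land in the same slot, enabling multi-key ops
print(key_slot("foo"), key_slot("{user}:profile") == key_slot("{user}:cart"))
```

This is also why cross-slot multi-key commands fail on a cluster unless the keys are tagged into the same slot.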

3. Kafka High Availability and Persistence
Since 3.3, Kafka supports KRaft (a built-in control plane, no ZooKeeper required); KRaft is recommended for new clusters. A ZooKeeper setup is also provided for compatibility with older versions.

A. Quick Start (Single Node, KRaft)
Directory layout:

  • /data/kafka/single/data

compose.yaml:

yaml
services:
  kafka:
    image: bitnami/kafka:3.7
    container_name: kafka-single
    restart: always
    environment:
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_LOG_DIRS=/bitnami/kafka/data
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=1
      - KAFKA_CFG_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1
      - KAFKA_CFG_TRANSACTION_STATE_LOG_MIN_ISR=1
      - ALLOW_PLAINTEXT_LISTENER=yes
    volumes:
      - /data/kafka/single/data:/bitnami/kafka/data
    ports:
      - "9092:9092"

Start:

  • mkdir -p /data/kafka/single/data
  • docker compose up -d

B. Production HA (KRaft Multi-Node: 3 Controllers + 3 Brokers, Combined or Dedicated Roles)
Option 1: 3 nodes with combined roles (each node is both controller and broker)

yaml
services:
  kafka1:
    image: bitnami/kafka:3.7
    hostname: kafka1
    environment:
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka1:9092
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LOG_DIRS=/bitnami/kafka/data
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=3
      - KAFKA_CFG_DEFAULT_REPLICATION_FACTOR=3
      - KAFKA_CFG_MIN_INSYNC_REPLICAS=2
    volumes:
      - /data/kafka/kraft/kafka1:/bitnami/kafka/data
    ports: ["19092:9092"]

  kafka2:
    image: bitnami/kafka:3.7
    hostname: kafka2
    environment:
      - KAFKA_CFG_NODE_ID=2
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka2:9092
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LOG_DIRS=/bitnami/kafka/data
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=3
      - KAFKA_CFG_DEFAULT_REPLICATION_FACTOR=3
      - KAFKA_CFG_MIN_INSYNC_REPLICAS=2
    volumes:
      - /data/kafka/kraft/kafka2:/bitnami/kafka/data
    ports: ["29092:9092"]

  kafka3:
    image: bitnami/kafka:3.7
    hostname: kafka3
    environment:
      - KAFKA_CFG_NODE_ID=3
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka3:9092
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LOG_DIRS=/bitnami/kafka/data
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=3
      - KAFKA_CFG_DEFAULT_REPLICATION_FACTOR=3
      - KAFKA_CFG_MIN_INSYNC_REPLICAS=2
    volumes:
      - /data/kafka/kraft/kafka3:/bitnami/kafka/data
    ports: ["39092:9092"]

networks:
  default:
    name: kafka-kraft-net

Notes and key points:

  • Each node mounts its own data directory at /bitnami/kafka/data.
  • Use SASL/SSL in production; do not rely on ALLOW_PLAINTEXT_LISTENER.
  • advertised.listeners must match how clients reach the brokers (use routable hostnames or IPs across hosts).
Option 2: Classic ZooKeeper (for older versions; the single zk node below is a demo — run a 3-node ZooKeeper ensemble in production, otherwise zk is a single point of failure)

yaml
services:
  zk:
    image: bitnami/zookeeper:3.9
    environment:
      - ZOO_ENABLE_AUTH=no
      - ZOO_SERVER_ID=1
      - ZOO_SERVERS=0.0.0.0:2888:3888
    ports: ["2181:2181"]
    volumes:
      - /data/zookeeper/data:/bitnami/zookeeper
    restart: always

  kafka1:
    image: bitnami/kafka:3.4
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zk:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka1:9092
      - KAFKA_CFG_LOG_DIRS=/bitnami/kafka/data
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=3
      - KAFKA_CFG_DEFAULT_REPLICATION_FACTOR=3
      - KAFKA_CFG_MIN_INSYNC_REPLICAS=2
    volumes:
      - /data/kafka/zk/kafka1:/bitnami/kafka/data
    ports: ["19092:9092"]
    depends_on: [zk]
  kafka2:
    image: bitnami/kafka:3.4
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zk:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka2:9092
      - KAFKA_CFG_LOG_DIRS=/bitnami/kafka/data
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=3
      - KAFKA_CFG_DEFAULT_REPLICATION_FACTOR=3
      - KAFKA_CFG_MIN_INSYNC_REPLICAS=2
    volumes:
      - /data/kafka/zk/kafka2:/bitnami/kafka/data
    ports: ["29092:9092"]
    depends_on: [zk]
  kafka3:
    image: bitnami/kafka:3.4
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zk:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka3:9092
      - KAFKA_CFG_LOG_DIRS=/bitnami/kafka/data
      - KAFKA_CFG_OFFSETS_TOPIC_REPLICATION_FACTOR=3
      - KAFKA_CFG_DEFAULT_REPLICATION_FACTOR=3
      - KAFKA_CFG_MIN_INSYNC_REPLICAS=2
    volumes:
      - /data/kafka/zk/kafka3:/bitnami/kafka/data
    ports: ["39092:9092"]
    depends_on: [zk]

networks:
  default:
    name: kafka-zk-net

Verification and Basic Operations

  • Create a topic (KRaft/ZK alike, inside a container):
    • kafka-topics.sh --bootstrap-server kafka1:9092 --create --topic test --partitions 3 --replication-factor 3
  • Producer:
    • kafka-console-producer.sh --bootstrap-server kafka1:9092 --topic test
  • Consumer:
    • kafka-console-consumer.sh --bootstrap-server kafka1:9092 --topic test --from-beginning

Persistence and Disks

  • Kafka data directory: /bitnami/kafka/data (or /opt/kafka/logs, depending on the image); mount it on a dedicated disk and tune the disk scheduler and filesystem.
  • Enable automatic log cleanup and configure retention: log.retention.hours / log.retention.bytes.
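With the bitnami image, these retention settings can be passed as environment variables (the KAFKA_CFG_ prefix maps to server.properties keys; the values below are examples, not recommendations):

```yaml
environment:
  - KAFKA_CFG_LOG_RETENTION_HOURS=168          # keep data for 7 days
  - KAFKA_CFG_LOG_RETENTION_BYTES=53687091200  # or cap each partition at ~50 GiB
  - KAFKA_CFG_LOG_SEGMENT_BYTES=1073741824     # roll segments at 1 GiB
```

Note that log.retention.bytes applies per partition, so total disk usage scales with partition count.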

4. General Operations Advice and Common Issues

  • Security
    • Use strong passwords and least-privilege accounts.
    • Enable TLS and authentication in production (MySQL: SSL; Redis: ACLs since Redis 6; Kafka: SASL/SCRAM + TLS).
    • Network isolation: custom Docker networks plus security-group/firewall restrictions.
  • Data and backups
    • Mount every service's data directory on the host or network storage; use dedicated data disks.
    • Back up regularly: MySQL with mysqldump or Percona XtraBackup (xbcloud for offsite upload); Redis by copying AOF/RDB files; Kafka via file-level cold backups or cross-cluster mirroring (MirrorMaker 2).
  • Monitoring and alerting
    • MySQL: Prometheus mysqld_exporter, performance_schema.
    • Redis: redis_exporter.
    • Kafka: JMX Exporter; key metrics include ISR size, UnderReplicatedPartitions, RequestHandlerAvgIdlePercent.
  • Performance and tuning
    • MySQL: tune innodb_buffer_pool_size, redo log, binlog, IOPS; watch out for NUMA memory-interleaving effects on large hosts.
    • Redis: disable transparent huge pages, adjust kernel memory overcommit; tune the AOF rewrite policy.
    • Kafka: leave ample page cache; tune network send/receive buffers; batch.size, linger.ms, compression.type.
  • Common pitfalls
    • Misconfigured advertised.listeners making Kafka unreachable from outside.
    • Connectivity between Redis Cluster/Sentinel nodes (container-name resolution and port mappings).
    • MySQL MGR needs synchronized clocks (NTP) and the right firewall ports open (3306 plus the group communication port, 33061 here).
    • Permission issues after container restarts: host directories must be writable by the in-container user (e.g. chown -R 1001:1001 for bitnami images).