Compare commits

...

9 commits

40 files changed, with 1078 additions and 203 deletions
+2 -2
@@ -1,8 +1,8 @@
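### k3s deployment repository: quickly stand up a highly available k3s cluster with complete production-grade capabilities (monitoring, alerting, protection, load balancing, backup)
#### install: cluster installation
#### Cluster installation
See [install/README.md](install/README.md)
See [ansible/README.md](ansible/README.md)
#### apps: related applications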
+15
@@ -0,0 +1,15 @@
# Environment variable template
# Copy to .env, fill in the actual values, then run: source .env
# Tailscale auth key (required)
export TAILSCALE_AUTH_KEY=""
# K3s HA server URL (needed when adding nodes)
export HA_SERVER_URL="https://k3s.example.com:6443"
# SSH password (required for password authentication; leave empty for key authentication)
export SSH_PASSWORD=""
# SSH public key path (default ~/.ssh/id_rsa.pub)
# export SSH_PUBKEY=""
+17
@@ -0,0 +1,17 @@
# Ansible temporary files
*.retry
# Sensitive files
kubeconfig.yaml
kubeconfig-*.yaml
*.pem
*.key
# Local environment
.env
.env.local
# IDE
.idea/
.vscode/
+159
@@ -0,0 +1,159 @@
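# K3s Ansible automated installation
One-command deployment of a K3s cluster, with support for China mirror acceleration, Tailscale networking, and SSH hardening.
## Directory structure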
```
ansible/
├── ansible.cfg                    # Ansible configuration
├── inventory/
│   ├── hosts.yml                  # Host inventory ⭐ must be edited
│   └── group_vars/all.yml         # Global variables
├── roles/
│   ├── ssh/                       # SSH hardening
│   │   ├── tasks/main.yml
│   │   ├── handlers/main.yml
│   │   └── templates/sshd_config.j2
│   ├── common/                    # Base setup (hostname, sysctl, tailscale)
│   │   ├── tasks/main.yml
│   │   └── handlers/main.yml
│   └── k3s/                       # K3s installation
│       ├── tasks/main.yml
│       └── templates/
│           ├── k3s-server.yaml.j2 # Server config (unified init/join)
│           ├── k3s-agent.yaml.j2  # Agent config
│           └── registries.yaml.j2 # Registry mirrors
└── playbooks/
    └── site.yml                   # Full installation
```
## Quick start
### 1. Configure the host inventory
Edit `inventory/hosts.yml`:
```yaml
masters:
  hosts:
    master1:
      ansible_host: 10.0.0.1
      node_hostname: master1
      cluster_init: true # set to true on the first node
      node_region: cn-sh # region label
      use_mirror: true # use mirror acceleration
      enable_lb: true # enable the LB
      netfilter_mode: "" # set to nodivert on Alibaba Cloud / Huawei Cloud
```
### 2. Set environment variables
```bash
# Required
export TAILSCALE_AUTH_KEY="tskey-auth-xxx"
# First install (SSH hardening)
export SSH_PASSWORD="your-root-password"
```
### 3. Run the installation
```bash
cd k3s/ansible
# Option 1: first install (includes SSH hardening: port changed to 2103, key authentication enabled)
ansible-playbook playbooks/site.yml --tags ssh,common,k3s,status
# Option 2: regular install (SSH keys already configured)
ansible-playbook playbooks/site.yml
# Option 3: install only the first master
ansible-playbook playbooks/site.yml -l first-master-name
# Option 4: add a new node
ansible-playbook playbooks/site.yml -l new-node-name
```
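### 4. Get the kubeconfig
```bash
# Saved automatically to ansible/kubeconfig.yaml after installation
# Note: `sed -i ''` is the BSD/macOS form; with GNU sed, drop the empty '' argument
sed -i '' 's/127.0.0.1/k3s.yourdomain.com/g' kubeconfig.yaml
export KUBECONFIG=$(pwd)/kubeconfig.yaml
kubectl get nodes
```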
## Node variables
| Variable | Type | Default | Description |
|------|------|--------|------|
| `ansible_host` | string | - | Node IP |
| `node_hostname` | string | - | Hostname |
| `cluster_init` | bool | false | Set to true on the first master |
| `node_region` | string | - | Region label (cn-sh/hk/us-west) |
| `use_mirror` | bool | false | Use mirror acceleration |
| `enable_lb` | bool | - | Enable the K3s LB |
| `netfilter_mode` | string | "" | Tailscale netfilter: off/nodivert/on |
| `node_labels` | dict | - | Custom labels |
| `node_taints` | list | - | Node taints (format: key=value:effect) |
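## Environment variables
| Variable | Required | Description |
|------|------|------|
| `TAILSCALE_AUTH_KEY` | ✅ | Tailscale auth key |
| `K3S_TOKEN` | When adding a node on its own | Cluster token (obtained automatically during a full install) |
| `K3S_SERVER_URL` | When adding a node on its own | API server address (set automatically during a full install) |
| `SSH_PASSWORD` | First install | SSH password |
| `SSH_PUBKEY` | - | SSH public key (default ~/.ssh/id_rsa.pub) |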
## Mirror acceleration
Enabled automatically when `use_mirror: true`:
- K3s install script: `rancher-mirror.rancher.cn`
- Mirrors for the common container registries
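## SSH hardening
On first install (`--tags ssh`) the following runs automatically:
1. The port is changed to 2103
2. Password login is disabled
3. Key authentication is enabled
4. The local public key is added automatically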
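## Cluster installation flow
The playbook runs in the following order:
1. **Init node installation**: install the first master node, the one with `cluster_init: true`
2. **Dynamic token retrieval**: read `/var/lib/rancher/k3s/server/node-token` from the init node
3. **Token injection**: set K3S_TOKEN and K3S_SERVER_URL as facts on all nodes
4. **Other master nodes**: join the cluster using the dynamically retrieved token
5. **Agent nodes**: join the cluster using the dynamically retrieved token
This way, installing the whole cluster in one run does not require setting the `K3S_TOKEN` environment variable by hand.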
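The ordering above can be modelled in a few lines of Python. This is only an illustrative sketch, not the playbook itself: `plan_install` and the sample inventory are hypothetical names, and the check for exactly one init node is an assumption added here for clarity (the playbook does not enforce it).

```python
def plan_install(masters, agents, ha_server_url=""):
    """Return (ordered host names, server URL) for a one-shot cluster install.

    masters/agents map a host name to its host vars (hypothetical shape).
    """
    init_nodes = [name for name, hv in masters.items() if hv.get("cluster_init", False)]
    # Assumption for this sketch: exactly one master is flagged as the init node.
    if len(init_nodes) != 1:
        raise ValueError("exactly one master must set cluster_init: true")
    init = init_nodes[0]
    # Prefer the HA URL when it is set; otherwise fall back to the init node address.
    server_url = ha_server_url or f"https://{masters[init]['ansible_host']}:6443"
    other_masters = [name for name in masters if name != init]
    # Init node first, then the remaining masters, then the agents.
    return [init] + other_masters + list(agents), server_url


masters = {
    "master1": {"ansible_host": "10.0.0.1", "cluster_init": True},
    "master2": {"ansible_host": "10.0.0.2"},
}
agents = {"agent1": {"ansible_host": "10.0.0.3"}}
order, url = plan_install(masters, agents)
print(order, url)
```

Passing a non-empty `ha_server_url` makes every joining node use that address instead of the init node's.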
## Common commands
```bash
# Test connectivity
ansible all -m ping
# Run only a specific phase
ansible-playbook playbooks/site.yml --tags common
ansible-playbook playbooks/site.yml --tags k3s
# Target specific nodes
ansible-playbook playbooks/site.yml -l master1,agent1
# Debug mode
ansible-playbook playbooks/site.yml -vvv
# Syntax check
ansible-playbook playbooks/site.yml --syntax-check
```
@@ -1,4 +1,4 @@
## Installation
## Manual installation
Run the following commands on every node. Supported node OS: Debian 11+ or Ubuntu 20.04+.
@@ -39,12 +39,7 @@ sysctl -p /etc/sysctl.d/99-tailscale.conf
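Configuration is done through config.yaml (not environment variables) so that the cluster configuration can be version-controlled in git.
- master-init.config.yaml is the configuration for the first master node
- master.config.yaml is the configuration for secondary master nodes (not needed for a single node)
- agent.config.yaml is the configuration for agent nodes (not needed for a single node)
Note: change tls-san to your own domain. If your node has no domain configured, replace it with the node's IP address.
Replace `YOU_SHOULD_MODIFY_THIS_JOIN_KEY` with the Tailscale auth key you obtained.
Refer to the configuration file templates under roles/k3s/templates.
Depending on the node type, write the contents of the files above here.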
@@ -59,7 +54,7 @@ mkdir -p /etc/rancher/k3s && vim /etc/rancher/k3s/config.yaml
```shell
curl -sfL https://get.k3s.io | \
INSTALL_K3S_VERSION=v1.33.2+k3s1 \
INSTALL_K3S_VERSION=v1.34.2+k3s1 \
sh -s - server
```
@@ -69,7 +64,7 @@ curl -sfL https://get.k3s.io | \
```shell
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | \
INSTALL_K3S_VERSION=v1.33.2+k3s1 \
INSTALL_K3S_VERSION=v1.34.2+k3s1 \
INSTALL_K3S_MIRROR=cn \
sh -s - server
```
+21
@@ -0,0 +1,21 @@
[defaults]
inventory = inventory/hosts.yml
roles_path = roles
host_key_checking = False
retry_files_enabled = False
stdout_callback = default
callbacks_enabled = ansible.builtin.default
interpreter_python = auto_silent
deprecation_warnings = False
[callback_default]
result_format = yaml
[privilege_escalation]
become = True
become_method = sudo
become_user = root
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
+47
@@ -0,0 +1,47 @@
# K3s Ansible global variables
---
# ============================================
# Sensitive values (passed in via environment variables)
# ============================================
tailscale_auth_key: "{{ lookup('env', 'TAILSCALE_AUTH_KEY') }}"
# For an HA cluster, server_url must point at the load balancer address; for a single-node cluster it points at the node itself
ha_server_url: "{{ lookup('env', 'HA_SERVER_URL') | default('', true) }}"
# ============================================
# K3s configuration
# ============================================
# K3s server URL (prefers HA_SERVER_URL; otherwise the init node address is used dynamically)
k3s_server_url: "{{ ha_server_url if (ha_server_url | length > 0) else '' }}"
k3s_version: "v1.34.2+k3s1"
# etcd configuration
etcd_snapshot_retention: 1
etcd_snapshot_schedule_cron: "0 0 * * *"
etcd_snapshot_compress: true
# Disabled components
k3s_disable_components:
  - traefik
# ============================================
# Install source configuration
# ============================================
# China mirror
mirror_k3s_install_url: "https://rancher-mirror.rancher.cn/k3s/k3s-install.sh"
# Official source
global_k3s_install_url: "https://get.k3s.io"
# ============================================
# Registry mirror configuration (enabled when use_mirror: true)
# ============================================
registry_mirrors:
  docker.io:
    - "docker.1ms.run"
    - "docker.m.daocloud.io"
  ghcr.io:
    - "ghcr.m.daocloud.io"
  registry.k8s.io:
    - "k8s.m.daocloud.io"
  quay.io:
    - "quay.m.daocloud.io"
+130
@@ -0,0 +1,130 @@
# K3s cluster host inventory
---
all:
  vars:
    # SSH settings
    ansible_user: root
    # Default port: 22 is used on first install and is dynamically overridden afterwards
    ansible_port: 22
    ansible_password: "{{ lookup('env', 'SSH_PASSWORD') | default(omit, true) }}"
    # SSH hardening settings
    ssh_new_port: 2103
    ssh_pubkey: "{{ lookup('env', 'SSH_PUBKEY') | default(lookup('file', '~/.ssh/id_rsa.pub'), true) }}"
  children:
    # Master nodes (server)
    masters:
      hosts:
        tca:
          ansible_host: tca.node.dev.cm
          node_hostname: tca
          cluster_init: true
          node_region: cn-sh
          use_mirror: true
          node_taints:
            - "node-role.kubernetes.io/control-plane:NoSchedule"
        tcb:
          ansible_host: tcb.node.dev.cm
          node_hostname: tcb
          node_region: cn-sh
          use_mirror: true
          node_taints:
            - "node-role.kubernetes.io/control-plane:NoSchedule"
        tcc:
          ansible_host: tcc.node.dev.cm
          node_hostname: tcc
          node_region: cn-sh
          use_mirror: true
          node_taints:
            - "node-role.kubernetes.io/control-plane:NoSchedule"
    # Agent nodes (worker)
    agents:
      hosts:
        tce:
          ansible_host: tce.node.dev.cm
          node_hostname: tce
          node_region: cn-sh
          use_mirror: true
        tcd:
          ansible_host: tcd.node.dev.cm
          node_hostname: tcd
          node_region: cn-sh
          use_mirror: true
        tchk:
          ansible_host: tchk.node.dev.cm
          node_hostname: tchk
          node_region: cn-hk
          enable_lb: true
        tthk:
          ansible_host: tthk.node.dev.cm
          node_hostname: tthk
          node_region: cn-hk
          enable_lb: true
        alihk:
          ansible_host: alihk.node.dev.cm
          node_hostname: alihk
          node_region: cn-hk
          enable_lb: true
          netfilter_mode: nodivert
        alihka:
          ansible_host: alihka.node.dev.cm
          node_hostname: alihka
          node_region: cn-hk
          netfilter_mode: nodivert
        hwhk:
          ansible_host: hwhk.node.dev.cm
          node_hostname: hwhk
          node_region: cn-hk
          enable_lb: true
          netfilter_mode: nodivert
        hwsg:
          ansible_host: hwsg.node.dev.cm
          node_hostname: hwsg
          node_region: sg-sg
          netfilter_mode: nodivert
        hwa:
          ansible_host: hwa.node.dev.cm
          node_hostname: hwa
          node_region: cn-sh
          use_mirror: true
          netfilter_mode: nodivert
        clawhk:
          ansible_host: clawhk.node.dev.cm
          node_hostname: clawhk
          node_region: cn-hk
        clawjp:
          ansible_host: clawjp.node.dev.cm
          node_hostname: clawjp
          node_region: jp-tyo
        orajpa:
          ansible_host: orajpa.node.dev.cm
          node_hostname: orajpa
          node_region: jp-tyo
        orakra:
          ansible_host: orakra.node.dev.cm
          node_hostname: orakra
          node_region: kr-sel
        orasga:
          ansible_host: orasga.node.dev.cm
          node_hostname: orasga
          node_region: sg-sg
        # The nodes below are on a private network; the VPN must be set up manually before they are reachable
        homea:
          ansible_host: homea
          node_hostname: homea
          node_region: cn-sh
          use_mirror: true
        homeb:
          ansible_host: homeb
          node_hostname: homeb
          node_region: cn-sh
          use_mirror: true
    # Node grouping
    k3s_cluster:
      children:
        masters:
        agents:
+169
@@ -0,0 +1,169 @@
# K3s cluster installation playbook
---
# ============================================
# Phase 0: pre-checks for environment variables and the SSH port
# ============================================
- name: Pre-check Environment and SSH Port
  hosts: k3s_cluster
  gather_facts: false
  tags: [always]
  tasks:
    # Environment validation (run_once ensures it runs only once)
    - name: Check TAILSCALE_AUTH_KEY
      ansible.builtin.fail:
        msg: "Please set: export TAILSCALE_AUTH_KEY='tskey-auth-xxx'"
      when: lookup('env', 'TAILSCALE_AUTH_KEY') | length == 0
      run_once: true
      delegate_to: localhost
    - name: Check SSH credentials
      ansible.builtin.debug:
        msg: |
          {% if lookup('env', 'SSH_PASSWORD') | length > 0 %}
          ✓ Password login takes precedence
          {% else %}
          ✓ Using key-based login
          {% endif %}
      run_once: true
      delegate_to: localhost
    # SSH port probing
    - name: Try new SSH port ({{ ssh_new_port }})
      ansible.builtin.wait_for:
        host: "{{ ansible_host }}"
        port: "{{ ssh_new_port }}"
        timeout: 3
      delegate_to: localhost
      become: false
      register: new_port_check
      ignore_errors: true
    - name: Set SSH port based on availability
      ansible.builtin.set_fact:
        ansible_port: "{{ ssh_new_port if new_port_check is succeeded else 22 }}"
    - name: Display detected SSH port
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }}: using port {{ ansible_port }}"
      when: ansible_verbosity > 0
# ============================================
# Phase 1: SSH hardening (optional, used on first install)
# ============================================
- name: SSH Security Hardening
  hosts: k3s_cluster
  gather_facts: false
  tags: [ssh, never]
  roles:
    - ssh
# ============================================
# Phase 2: base setup
# ============================================
- name: Common Setup
  hosts: k3s_cluster
  gather_facts: true
  tags: [common]
  roles:
    - common
# ============================================
# Phase 3: install K3s (in order: init -> masters -> agents)
# ============================================
- name: Install K3s on init node
  hosts: masters
  gather_facts: true
  serial: 1
  tags: [k3s]
  roles:
    - role: k3s
      when: cluster_init | default(false)
- name: Fetch K3S_TOKEN & K3S_SERVER_URL from init node
  hosts: k3s_cluster
  gather_facts: false
  run_once: true
  tags: [k3s]
  tasks:
    - name: Find init node
      ansible.builtin.set_fact:
        init_node: "{{ item }}"
      loop: "{{ groups['masters'] }}"
      when: hostvars[item].cluster_init | default(false)
    - name: Detect init node SSH port
      ansible.builtin.wait_for:
        host: "{{ hostvars[init_node].ansible_host }}"
        port: "{{ ssh_new_port }}"
        timeout: 3
      delegate_to: localhost
      become: false
      register: init_node_port_check
      ignore_errors: true
    - name: Set init node SSH port
      ansible.builtin.set_fact:
        init_node_port: "{{ ssh_new_port if init_node_port_check is succeeded else 22 }}"
    - name: Read K3S_TOKEN from init node
      ansible.builtin.slurp:
        src: /var/lib/rancher/k3s/server/node-token
      register: k3s_token_content
      delegate_to: "{{ init_node }}"
      vars:
        ansible_port: "{{ hostvars[inventory_hostname].init_node_port }}"
    - name: Determine K3S_SERVER_URL
      ansible.builtin.set_fact:
        # Prefer the HA_SERVER_URL environment variable; otherwise use the init node address
        k3s_server_url: "{{ ha_server_url if (ha_server_url | length > 0) else 'https://' + hostvars[init_node].ansible_host + ':6443' }}"
    - name: Set K3S_TOKEN and K3S_SERVER_URL for target hosts
      ansible.builtin.set_fact:
        k3s_token: "{{ k3s_token_content.content | b64decode | trim }}"
        k3s_server_url: "{{ k3s_server_url }}"
      delegate_to: "{{ item }}"
      delegate_facts: true
      loop: "{{ ansible_play_hosts }}"
- name: Install K3s on other masters
  hosts: masters
  gather_facts: true
  serial: 1
  tags: [k3s]
  roles:
    - role: k3s
      when: not (cluster_init | default(false))
- name: Install K3s on agents
  hosts: agents
  gather_facts: true
  tags: [k3s]
  roles:
    - k3s
# ============================================
# Phase 4: show cluster status
# ============================================
- name: Show cluster status
  hosts: masters
  gather_facts: false
  tags: [status]
  run_once: true
  tasks:
    - name: Get nodes
      ansible.builtin.command: kubectl get nodes -o wide
      environment:
        KUBECONFIG: /etc/rancher/k3s/k3s.yaml
      register: nodes
      changed_when: false
      when: cluster_init | default(false)
    - name: Display nodes
      ansible.builtin.debug:
        msg: |
          ══════════════════════════════════════════════════════════════
          K3s cluster nodes:
          {{ nodes.stdout }}
          ══════════════════════════════════════════════════════════════
      when: cluster_init | default(false)
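The SSH port probe in phase 0 of the playbook (try the hardened port, fall back to 22) can be illustrated outside Ansible with a plain TCP connect. A minimal sketch, assuming a successful TCP connection is a good enough signal that sshd has moved; `detect_ssh_port` is a hypothetical helper, not part of the repository.

```python
import socket


def detect_ssh_port(host, new_port=2103, fallback=22, timeout=3.0):
    """Return new_port if it accepts a TCP connection within timeout, else fallback."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, new_port), timeout=timeout):
            return new_port
    except OSError:
        return fallback
```

As with the `wait_for` task, this only shows that something is listening on the port; it does not verify that the listener is actually sshd.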
+4
@@ -0,0 +1,4 @@
---
- name: Apply sysctl
  ansible.builtin.command: sysctl --system
  changed_when: true
+61
@@ -0,0 +1,61 @@
# Base setup role
# Covers: hostname, sysctl, Tailscale installation
---
- name: Set hostname
  ansible.builtin.hostname:
    name: "{{ node_hostname }}"
  when: node_hostname is defined
- name: Update /etc/hosts
  ansible.builtin.lineinfile:
    path: /etc/hosts
    regexp: '^127\.0\.1\.1'
    line: "127.0.1.1 {{ node_hostname }}"
  when: node_hostname is defined
- name: Configure sysctl for IP forwarding
  ansible.builtin.copy:
    dest: /etc/sysctl.d/99-k3s.conf
    content: |
      net.ipv4.ip_forward = 1
      net.ipv6.conf.all.forwarding = 1
    mode: "0644"
  notify: Apply sysctl
- name: Install dependencies
  ansible.builtin.apt:
    name:
      - curl
      - wget
      - ca-certificates
    state: present
    update_cache: true
- name: Check if Tailscale is installed
  ansible.builtin.command: which tailscale
  register: common_tailscale_check
  failed_when: false
  changed_when: false
- name: Download Tailscale install script
  ansible.builtin.get_url:
    url: https://tailscale.com/install.sh
    dest: /tmp/tailscale-install.sh
    mode: "0755"
  when: common_tailscale_check.rc != 0
- name: Install Tailscale
  ansible.builtin.command: /tmp/tailscale-install.sh
  when: common_tailscale_check.rc != 0
  changed_when: true
- name: Remove Tailscale install script
  ansible.builtin.file:
    path: /tmp/tailscale-install.sh
    state: absent
- name: Enable Tailscale service
  ansible.builtin.systemd:
    name: tailscaled
    enabled: true
    state: started
+115
@@ -0,0 +1,115 @@
# K3s installation role (unified server and agent)
---
- name: Validate TAILSCALE_AUTH_KEY
  ansible.builtin.fail:
    msg: "Please set the environment variable: export TAILSCALE_AUTH_KEY='tskey-auth-xxx'"
  when: (tailscale_auth_key | default('')) | length == 0
- name: Create K3s config directory
  ansible.builtin.file:
    path: /etc/rancher/k3s
    state: directory
    mode: "0755"
# Check installation state
- name: Check if K3s is installed
  ansible.builtin.stat:
    path: /usr/local/bin/k3s
  register: k3s_binary
# Deploy config files (registering whether they changed)
- name: Deploy K3s server config
  ansible.builtin.template:
    src: k3s-server.yaml.j2
    dest: /etc/rancher/k3s/config.yaml
    mode: "0600"
  when: "'masters' in group_names"
  register: k3s_server_config
- name: Deploy K3s agent config
  ansible.builtin.template:
    src: k3s-agent.yaml.j2
    dest: /etc/rancher/k3s/config.yaml
    mode: "0600"
  when: "'agents' in group_names"
  register: k3s_agent_config
- name: Deploy registries.yaml
  ansible.builtin.template:
    src: registries.yaml.j2
    dest: /etc/rancher/k3s/registries.yaml
    mode: "0644"
  when: use_mirror | default(false)
# Decide whether an install/restart is needed
- name: Set K3s installation flag
  ansible.builtin.set_fact:
    k3s_needs_install: "{{ not k3s_binary.stat.exists or (k3s_server_config.changed | default(false)) or (k3s_agent_config.changed | default(false)) }}"
# Set install variables
- name: Set K3s install variables
  ansible.builtin.set_fact:
    k3s_install_url: "{{ mirror_k3s_install_url if (use_mirror | default(false)) else global_k3s_install_url }}"
    k3s_install_mirror: "{{ 'INSTALL_K3S_MIRROR=cn' if (use_mirror | default(false)) else '' }}"
# Download the install script
- name: Download K3s install script
  ansible.builtin.get_url:
    url: "{{ k3s_install_url }}"
    dest: /tmp/k3s-install.sh
    mode: "0755"
  when: k3s_needs_install
# Install K3s
- name: Install K3s server
  ansible.builtin.command:
    cmd: /tmp/k3s-install.sh server
  environment:
    INSTALL_K3S_VERSION: "{{ k3s_version }}"
    INSTALL_K3S_MIRROR: "{{ 'cn' if (use_mirror | default(false)) else '' }}"
  when:
    - "'masters' in group_names"
    - k3s_needs_install
  changed_when: true
- name: Install K3s agent
  ansible.builtin.command:
    cmd: /tmp/k3s-install.sh agent
  environment:
    INSTALL_K3S_VERSION: "{{ k3s_version }}"
    INSTALL_K3S_MIRROR: "{{ 'cn' if (use_mirror | default(false)) else '' }}"
  when:
    - "'agents' in group_names"
    - k3s_needs_install
  changed_when: true
# Clean up the install script
- name: Remove install script
  ansible.builtin.file:
    path: /tmp/k3s-install.sh
    state: absent
# Wait for K3s to be ready (server only)
- name: Wait for K3s server ready
  ansible.builtin.wait_for:
    path: /var/lib/rancher/k3s/server/node-token
    timeout: 120
  when: "'masters' in group_names"
# Save kubeconfig (cluster-init only)
- name: Fetch kubeconfig
  ansible.builtin.fetch:
    src: /etc/rancher/k3s/k3s.yaml
    dest: "{{ playbook_dir }}/../kubeconfig.yaml"
    flat: true
  when: cluster_init | default(false)
- name: Update kubeconfig server address
  ansible.builtin.replace:
    path: "{{ playbook_dir }}/../kubeconfig.yaml"
    regexp: 'server: https://127\.0\.0\.1:6443'
    replace: "server: {{ k3s_server_url }}"
  delegate_to: localhost
  become: false
  when: cluster_init | default(false)
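The install decision in this role comes down to two small rules: reinstall when the binary is missing or a config template changed, and pick the script URL and mirror flag from `use_mirror`. The following sketch restates that logic in Python for readability; `install_plan` is a hypothetical name and the function is a model of the tasks above, not something the role calls.

```python
MIRROR_URL = "https://rancher-mirror.rancher.cn/k3s/k3s-install.sh"
OFFICIAL_URL = "https://get.k3s.io"


def install_plan(binary_exists, config_changed, use_mirror=False, version="v1.34.2+k3s1"):
    """Return (needs_install, script_url, env) mirroring the role's set_fact logic."""
    # Install when the k3s binary is absent or the rendered config changed.
    needs_install = (not binary_exists) or config_changed
    url = MIRROR_URL if use_mirror else OFFICIAL_URL
    env = {
        "INSTALL_K3S_VERSION": version,
        # The role passes an empty string rather than omitting the variable.
        "INSTALL_K3S_MIRROR": "cn" if use_mirror else "",
    }
    return needs_install, url, env
```

One consequence worth noting: because a changed config sets the flag, editing a template causes the install script to run again on that node.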
@@ -0,0 +1,36 @@
# K3s agent configuration template
---
server: "{{ k3s_server_url }}"
token: "{{ k3s_token }}"
# Tailscale VPN
vpn-auth: "name=tailscale,joinKey={{ tailscale_auth_key }}{% if netfilter_mode | default('') %},extraArgs=--netfilter-mode={{ netfilter_mode }}{% endif %}"
# Node labels
node-label:
{% if node_region is defined %}
- "topology.kubernetes.io/region={{ node_region }}"
{% endif %}
{% if enable_lb is defined %}
- "svccontroller.k3s.cattle.io/enablelb={{ enable_lb | string | lower }}"
{% endif %}
{% if node_labels is defined %}
{% for key, value in node_labels.items() %}
- "{{ key }}={{ value }}"
{% endfor %}
{% endif %}
# Node taints
{% if node_taints is defined %}
node-taint:
{% for taint in node_taints %}
- "{{ taint }}"
{% endfor %}
{% endif %}
# Kubelet resource reservation
{% if kubelet_reserved is defined %}
kubelet-arg:
- "kube-reserved={{ kubelet_reserved }}"
{% endif %}
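The `vpn-auth` line in this template is assembled from the auth key plus an optional netfilter argument. A small Python sketch of the same string construction, where `vpn_auth` is a hypothetical helper used only for illustration:

```python
def vpn_auth(join_key, netfilter_mode=""):
    """Build the vpn-auth value the way the template's inline Jinja condition does."""
    value = f"name=tailscale,joinKey={join_key}"
    # An empty or unset netfilter_mode adds nothing, matching default('') in the template.
    if netfilter_mode:
        value += f",extraArgs=--netfilter-mode={netfilter_mode}"
    return value


print(vpn_auth("tskey-auth-xxx"))
print(vpn_auth("tskey-auth-xxx", "nodivert"))
```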
@@ -0,0 +1,56 @@
# K3s server unified configuration template (master-init and master-join)
---
{% if cluster_init | default(false) %}
# First node: initialize the cluster
cluster-init: true
{% else %}
# Join an existing cluster
server: "{{ k3s_server_url }}"
token: "{{ k3s_token }}"
{% endif %}
# TLS SAN: includes the HA address plus the addresses of all master nodes
tls-san:
{% if ha_server_url | default('') | length > 0 %}
- "{{ ha_server_url | regex_replace('^https?://([^:]+)(:[0-9]+)?$', '\\1') }}"
{% endif %}
{% for host in groups['masters'] %}
- "{{ hostvars[host].ansible_host }}"
{% endfor %}
# etcd snapshot configuration
etcd-snapshot-retention: {{ etcd_snapshot_retention }}
etcd-snapshot-schedule-cron: "{{ etcd_snapshot_schedule_cron }}"
etcd-snapshot-compress: {{ etcd_snapshot_compress | lower }}
# Tailscale VPN
vpn-auth: "name=tailscale,joinKey={{ tailscale_auth_key }}{% if netfilter_mode | default('') %},extraArgs=--netfilter-mode={{ netfilter_mode }}{% endif %}"
# Disabled components
disable:
{% for component in k3s_disable_components %}
- {{ component }}
{% endfor %}
# Node labels
node-label:
{% if node_region is defined %}
- "topology.kubernetes.io/region={{ node_region }}"
{% endif %}
{% if enable_lb is defined %}
- "svccontroller.k3s.cattle.io/enablelb={{ enable_lb | string | lower }}"
{% endif %}
{% if node_labels is defined %}
{% for key, value in node_labels.items() %}
- "{{ key }}={{ value }}"
{% endfor %}
{% endif %}
# Node taints
{% if node_taints is defined %}
node-taint:
{% for taint in node_taints %}
- "{{ taint }}"
{% endfor %}
{% endif %}
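The `tls-san` entries in this template are derived by stripping the scheme and port from `ha_server_url` and appending every master's `ansible_host`. A sketch of that derivation in Python, assuming Ansible's `regex_replace` filter behaves like `re.sub` for this pattern; `tls_san_host` and `tls_san_list` are hypothetical names.

```python
import re

# Same pattern as in the template: scheme, host without colons, optional numeric port.
_URL = re.compile(r"^https?://([^:]+)(:[0-9]+)?$")


def tls_san_host(url):
    """Return the host part of url; the input is returned unchanged if it does not match."""
    return _URL.sub(r"\1", url)


def tls_san_list(ha_server_url, master_hosts):
    """HA host first (when configured), then each master address."""
    sans = [tls_san_host(ha_server_url)] if ha_server_url else []
    return sans + list(master_hosts)


print(tls_san_list("https://k3s.example.com:6443", ["10.0.0.1", "10.0.0.2"]))
```

Because the pattern is anchored with `$` directly after the optional port, a URL with a port followed by a path or trailing slash does not match and is passed through as-is, so `HA_SERVER_URL` is best kept in the bare `https://host:port` form.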
@@ -0,0 +1,11 @@
# Registry mirror configuration
---
mirrors:
{% for registry, endpoints in registry_mirrors.items() %}
  "{{ registry }}":
    endpoint:
{% for endpoint in endpoints %}
      - "https://{{ endpoint }}"
{% endfor %}
{% endfor %}
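What this template renders is a mapping from each upstream registry to a list of `https://` endpoints. The transformation can be sketched in Python as below; `render_registries` is a hypothetical helper and the sample input is a subset of the `registry_mirrors` variable from the global vars.

```python
def render_registries(registry_mirrors):
    """Turn {registry: [host, ...]} into the structure of registries.yaml."""
    return {
        "mirrors": {
            registry: {"endpoint": [f"https://{host}" for host in hosts]}
            for registry, hosts in registry_mirrors.items()
        }
    }


sample = {
    "docker.io": ["docker.1ms.run", "docker.m.daocloud.io"],
    "ghcr.io": ["ghcr.m.daocloud.io"],
}
rendered = render_registries(sample)
print(rendered)
```

Endpoint order is preserved, so the first mirror listed for a registry stays first in the rendered file.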
+21
@@ -0,0 +1,21 @@
---
- name: Restart sshd
  ansible.builtin.systemd:
    name: sshd
    state: restarted
  listen: Restart sshd
- name: Update ansible port
  ansible.builtin.set_fact:
    ansible_port: "{{ ssh_new_port }}"
  listen: Update ansible port
- name: Wait for new SSH port
  ansible.builtin.wait_for:
    port: "{{ ssh_new_port }}"
    host: "{{ ansible_host }}"
    delay: 5
    timeout: 60
  delegate_to: localhost
  become: false
  listen: Wait for new SSH port
+33
@@ -0,0 +1,33 @@
# SSH hardening role
# Covers: changing the port, configuring key authentication, disabling password login
---
- name: Ensure .ssh directory exists
  ansible.builtin.file:
    path: /root/.ssh
    state: directory
    mode: "0700"
- name: Add SSH public key
  ansible.builtin.authorized_key:
    user: root
    key: "{{ ssh_pubkey }}"
    state: present
- name: Backup original sshd_config
  ansible.builtin.copy:
    src: /etc/ssh/sshd_config
    dest: /etc/ssh/sshd_config.bak
    remote_src: true
    force: false
    mode: "0600"
- name: Deploy secure sshd_config
  ansible.builtin.template:
    src: sshd_config.j2
    dest: /etc/ssh/sshd_config
    mode: "0600"
    validate: "/usr/sbin/sshd -t -f %s"
  notify:
    - Restart sshd
    - Update ansible port
    - Wait for new SSH port
@@ -0,0 +1,12 @@
# SSH configuration template
Port {{ ssh_new_port }}
PermitRootLogin prohibit-password
PasswordAuthentication no
PubkeyAuthentication yes
ChallengeResponseAuthentication no
UsePAM yes
X11Forwarding no
PrintMotd no
AcceptEnv LANG LC_*
Subsystem sftp /usr/lib/openssh/sftp-server
+5 -46
@@ -1,46 +1,6 @@
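### apps
How to deploy applications
```shell
kubectl apply -f apps/xxx -R
```
Example:
```shell
kubectl apply -f apps/infra/data/redis -R
```
You can deploy all applications to the k8s cluster in one go, but deploying them separately is recommended: apply each folder on its own to avoid errors and performance problems.
Note: before deploying, replace every field in the YAML that starts with YOU_SHOULD_MODIFY_THIS_ with your own value. Some of these values you generate yourself; others you have to request from a provider.
For example, you need to obtain an access key ID, a secret key, and a bucket name from Huawei Cloud yourself.
### Application notes
Apply everything under the ./kube folder. It contains cluster tuning, such as DNS latency optimization.
(patch-affinity.yaml is optional: use it only when you want the k3s built-in system services to run on specific nodes, for example to keep core services on highly available nodes.)
- infra-net: networking applications
  - nginx: load balancing service, replaces the cluster's default ingress (traefik)
  - crowdsec: security protection service
  - tailscale: in-cluster network acceleration; skip it if you have no need for it
- infra-data: data storage applications
  - redis: Redis service
  - postgresql-ha: PostgreSQL service
  - cloudnative: PostgreSQL service, operator version (recommended)
- infra-devops: DevOps applications
  - gitea: Git hosting service
  - cert-manager: certificate management service
  - reflector: secret synchronization service
  - velero: backup service
- infra-monitor: monitoring applications
  - prometheus: monitoring service
  - loki: logging service
- apps: other applications (personal apps)
  - whoami: test service
Applications deployed to the cluster with Helm, including some base services and some business services.
### Debugging in-cluster services: run this command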
@@ -57,14 +17,13 @@ kubectl run -i --tty --rm --restart=Never \
Then use reflector to sync the keys in the secret to other namespaces.
```shell
kubectl -n infra-devops create secret generic s3-devcm-hw \
kubectl -n infra-data create secret generic s3-devcm-hw \
--from-literal=ACCESS_KEY_ID=xxxxx \
--from-literal=ACCESS_SECRET_KEY=xxxxx
kubectl -n infra-devops annotate secret s3-devcm-hw \
kubectl -n infra-data annotate secret s3-devcm-hw \
reflector.v1.k8s.emberstack.com/reflection-allowed=true \
reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces=infra-data \
reflector.v1.k8s.emberstack.com/reflection-auto-enabled=true \
reflector.v1.k8s.emberstack.com/reflection-auto-namespace=infra-data --overwrite
reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces=infra-devops,apps \
reflector.v1.k8s.emberstack.com/reflection-auto-enabled=true --overwrite
```
+5 -1
@@ -41,6 +41,9 @@ spec:
pathType: Prefix
podAnnotations:
backup.velero.io/backup-volumes: halo-data
persistence:
annotations:
helm.sh/resource-policy: keep
metrics:
enabled: true
mysql:
@@ -52,8 +55,9 @@ spec:
host: cnpg17-cluster-hk-rw.infra-data
port: 5432
user: app
password: FybaFtf6NV5jnxhj5bOPpHbO6KypZeHiyiskgAWkM5nioW2j82HtCf6GnW9xVKjE
password: from-secret
database: halo
existingSecret: cnpg17-cluster-hk-app
haloUsername: rohow
haloExternalUrl: https://dev.cm
-28
@@ -1,28 +0,0 @@
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
name: redis-cluster-sh
namespace: infra-data
spec:
chart: oci://registry-1.docker.io/bitnamicharts/redis
targetNamespace: infra-data
version: 20.7.0
valuesContent: |-
global:
redis:
password: ribiPwYQNU6GWxCYR0Nj
master:
nodeAffinityPreset:
type: soft
key: topology.kubernetes.io/region
values:
- cn-sh
replica:
replicaCount: 0
nodeAffinityPreset:
type: soft
key: topology.kubernetes.io/region
values:
- cn-sh
@@ -0,0 +1,21 @@
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: valkey-cluster-sh
  namespace: infra-data
spec:
  chart: oci://registry-1.docker.io/bitnamicharts/valkey-cluster
  targetNamespace: infra-data
  version: 3.0.23
  valuesContent: |-
    image:
      repository: bitnamilegacy/valkey-cluster
    cluster:
      nodes: 1
      replicas: 0
    valkey:
      nodeAffinityPreset:
        type: hard
        key: topology.kubernetes.io/region
        values:
          - cn-sh
@@ -0,0 +1,26 @@
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: cert-manager-webhook-dnspod
  labels:
    app: cert-manager-webhook-dnspod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@dev.cm
    privateKeySecretRef:
      name: cert-manager-webhook-dnspod-letsencrypt
    solvers:
      - dns01:
          cnameStrategy: Follow
          webhook:
            groupName: cert.dev.cm
            solverName: dnspod
            config:
              ttl: 600
              secretIdRef:
                name: dnspod-secret
                key: secretId
              secretKeyRef:
                name: dnspod-secret
                key: secretKey
@@ -9,17 +9,6 @@ spec:
targetNamespace: infra-devops
version: 1.4.5
valuesContent: |-
namespace: infra-devops
certManager:
namespace: infra-devops
groupName: cert.dev.cm
clusterIssuer:
# After deployment, edit the clusterIssuer here and add the following under dns01
# cnameStrategy: Follow
staging: false
email: admin@dev.cm
secretId: AKIDzmKdvDSfonogKip55pIVR6h7ScjaBWcg
secretKey: zudDdtytkPr8HI9oKeniSxIRPCmCe0CD
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
@@ -29,7 +18,12 @@ spec:
operator: In
values:
- "cn-sh"
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
image:
tag: "1.5.2"
namespace: infra-devops
certManager:
namespace: infra-devops
groupName: cert.dev.cm
# Disabled here; the ClusterIssuer is created manually to support cnameStrategy
clusterIssuer:
enabled: false
+5 -16
@@ -1,5 +1,3 @@
# The CRDs must be installed beforehand
# kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.18.2/cert-manager.crds.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
@@ -9,7 +7,7 @@ spec:
repo: https://charts.jetstack.io
chart: cert-manager
targetNamespace: infra-devops
version: v1.19.2
version: v1.19.3
valuesContent: |-
affinity:
nodeAffinity:
@@ -20,10 +18,6 @@ spec:
operator: In
values:
- "cn-sh"
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
webhook:
affinity:
nodeAffinity:
@@ -34,10 +28,6 @@ spec:
operator: In
values:
- "cn-sh"
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
cainjector:
affinity:
nodeAffinity:
@@ -48,14 +38,13 @@ spec:
operator: In
values:
- "cn-sh"
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
crds:
enabled: true
keep: true
# Delete the secret as well when a certificate is deleted
enableCertificateOwnerRef: true
prometheus:
enabled: true
enabled: false
servicemonitor:
enabled: true
interval: 300s
@@ -18,11 +18,3 @@ spec:
operator: In
values:
- "cn-sh"
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- tce
+15 -7
@@ -25,7 +25,9 @@ spec:
- key: kubernetes.io/hostname
operator: In
values:
- homea
- homeb
# upgradeCRDs is temporarily disabled here; re-enable it once upstream ships a fix
upgradeCRDs: false
deployNodeAgent: true
snapshotsEnabled: false
configuration:
@@ -46,13 +48,19 @@ spec:
s3ForcePathStyle: false
s3Url: https://obs.cn-east-3.myhuaweicloud.com
checksumAlgorithm: ""
extraEnvVars:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: s3-devcm-hw
key: ACCESS_KEY_ID
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: s3-devcm-hw
key: ACCESS_SECRET_KEY
credentials:
useSecret: true
secretContents:
cloud: |
[default]
aws_access_key_id = A9RI5BC15F3L9EI8T51T
aws_secret_access_key = ky1n3OlNNu7wjgctVjCqb03HWxjZucRGhvcEBp51
useSecret: false
initContainers:
- name: velero-plugin-for-aws
image: velero/velero-plugin-for-aws:v1.13.0
@@ -0,0 +1,29 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dev-cm-flux-web-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flux-web-admin
subjects:
  - kind: Group
    name: dev.cm:owners
    apiGroup: rbac.authorization.k8s.io
  - kind: Group
    name: dev.cm:admins
    apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dev-cm-flux-web-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flux-web-user
subjects:
  - kind: Group
    name: dev.cm
    apiGroup: rbac.authorization.k8s.io
+18 -2
@@ -10,7 +10,6 @@ spec:
artifact: "oci://ghcr.io/controlplaneio-fluxcd/flux-operator-manifests"
components:
- source-controller
- source-watcher
- kustomize-controller
- helm-controller
- notification-controller
@@ -22,4 +21,21 @@ spec:
domain: "cluster.local"
storage:
class: "local-path"
size: "10Gi"
size: "10Gi"
kustomize:
patches:
- target:
kind: Deployment
patch: |
- op: add
path: /spec/template/spec/affinity
value:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- homea
+10
@@ -8,6 +8,16 @@ spec:
targetNamespace: infra-gitops
version: 0.40.0
valuesContent: |-
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- homea
installCRDs: true
web:
config:
+17 -4
@@ -67,17 +67,13 @@ spec:
HOST: cnpg17-cluster-sh-rw.infra-data:5432
NAME: gitea
USER: app
PASSWD: HueUoQx05DM0ICBPu1GrmBvBXE6NO3poKE6yPqokPv3dPpWvWRLAr3RXSpaL3AZd
SSL_MODE: disable
session:
PROVIDER: redis
PROVIDER_CONFIG: redis://:ribiPwYQNU6GWxCYR0Nj@redis-cluster-sh-master.infra-data:6379/0
cache:
ADAPTER: redis
HOST: redis://:ribiPwYQNU6GWxCYR0Nj@redis-cluster-sh-master.infra-data:6379/0?pool_size=100&idle_timeout=180s
queue:
TYPE: redis
CONN_STR: redis://:ribiPwYQNU6GWxCYR0Nj@redis-cluster-sh-master.infra-data:6379/0
repository:
DEFAULT_REPO_UNITS: repo.code,repo.releases,repo.issues,repo.pulls
actions:
@@ -99,6 +95,23 @@ spec:
ui:
THEMES: gitea-auto, gitea-light, gitea-dark, github-auto, github-light, github-dark, github-soft-dark
DEFAULT_THEME: github-auto
additionalConfigFromEnvs:
- name: GITEA__DATABASE__PASSWD
valueFrom:
secretKeyRef:
name: cnpg17-cluster-sh-app
key: password
- name: REDIS_PASSWORD
valueFrom:
secretKeyRef:
name: valkey-cluster-sh
key: valkey-password
- name: GITEA__SESSION__PROVIDER_CONFIG
value: "redis://:$(REDIS_PASSWORD)@valkey-cluster-sh-headless.infra-data:6379/0?pool_size=100&idle_timeout=180s"
- name: GITEA__CACHE__HOST
value: "redis://:$(REDIS_PASSWORD)@valkey-cluster-sh-headless.infra-data:6379/0?pool_size=100&idle_timeout=180s"
- name: GITEA__QUEUE__CONN_STR
value: "redis://:$(REDIS_PASSWORD)@valkey-cluster-sh-headless.infra-data:6379/0?pool_size=100&idle_timeout=180s"
valkey-cluster:
enabled: false
extraVolumes:
+1 -1
@@ -68,7 +68,7 @@ spec:
host: cnpg17-cluster-sh-rw.infra-data:5432
name: grafana
user: app
password: HueUoQx05DM0ICBPu1GrmBvBXE6NO3poKE6yPqokPv3dPpWvWRLAr3RXSpaL3AZd
password: fYyAc4PNKLrvEB0IfkDm1TMR7sZkAcK1DGp4yqG5Y9aSS0UJMCgSiW6hhrsTztLA
persistence:
type: pvc
enabled: true
+1 -1
@@ -102,7 +102,7 @@ spec:
port: 5432
db_name: crowdsec
user: app
password: FybaFtf6NV5jnxhj5bOPpHbO6KypZeHiyiskgAWkM5nioW2j82HtCf6GnW9xVKjE
password: 4EMiSg9adUSxPAwNWIsHhKd1WZ7lhGuCnNofCFHuU1aQHSho85xeSK6TPcgJ4NU7
sslmode: require
api:
server:
+1 -1
@@ -19,7 +19,7 @@ spec:
nodeSelector:
svccontroller.k3s.cattle.io/enablelb: "true"
tolerations:
- key: "node-role.kubernetes.io/master"
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
labels:
+1 -1
@@ -143,7 +143,7 @@ spec:
operator: "Exists"
containers:
- name: node-cache
image: registry.k8s.io/dns/k8s-dns-node-cache:1.25.0
image: registry.k8s.io/dns/k8s-dns-node-cache:1.26.7
resources:
requests:
cpu: 25m
+1 -1
@@ -9,4 +9,4 @@ spec:
- key: node-role.kubernetes.io/control-plane
operator: In
values:
- "true"
- "true"
-12
@@ -1,12 +0,0 @@
# Worker node
server: "https://k3s.dev.cm:6443"
token: "K1010dd6f0853e824cfaf417117f31a0d797a738aa2d4b9d01cd5972a9b084c81a0::server:e4836f1f469315fadd5b12c07d7fb10e"
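# Networking
# WARN Alibaba Cloud and Huawei Cloud use the 100.x range for internal services such as DNS, so netfilter must be turned off; otherwise iptables rules are added automatically and DNS becomes unreachable
# WARN Add extraArgs=--netfilter-mode=off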
vpn-auth: "name=tailscale,joinKey=tskey-auth-kUMo6hWP9711CNTRL-oo21xakMTxCKJBWK8t9XxComm3fAFUvy,extraArgs=--netfilter-mode=off"
# Node settings
# Reserve node resources; configure differently per node
# kubelet-arg: "kube-reserved=cpu=1000m,memory=1Gi"
-16
@@ -1,16 +0,0 @@
# Server: primary node
cluster-init: true
tls-san:
- "k3s.dev.cm,k3s.fillcode.com"
# Data
etcd-snapshot-retention: "1"
etcd-snapshot-schedule-cron: "0 0 * * *"
etcd-snapshot-compress: true
# Networking
vpn-auth: "name=tailscale,joinKey=tskey-auth-kUMo6hWP9711CNTRL-oo21xakMTxCKJBWK8t9XxComm3fAFUvy"
# Components
disable:
- traefik
-17
@@ -1,17 +0,0 @@
# Server: secondary node
server: "https://tca:6443"
token: "K1010dd6f0853e824cfaf417117f31a0d797a738aa2d4b9d01cd5972a9b084c81a0::server:e4836f1f469315fadd5b12c07d7fb10e"
tls-san:
- "k3s.dev.cm,k3s.fillcode.com"
# 数据相关
etcd-snapshot-retention: "1"
etcd-snapshot-schedule-cron: "0 0 * * *"
etcd-snapshot-compress: true
# 网络相关
vpn-auth: "name=tailscale,joinKey=tskey-auth-kUMo6hWP9711CNTRL-oo21xakMTxCKJBWK8t9XxComm3fAFUvy"
# 组件相关
disable:
- traefik
-15
@@ -1,15 +0,0 @@
mirrors:
"docker.io":
endpoint:
- "docker.1ms.run"
- "docker.m.daocloud.io"
- "mirror.ccs.tencentyun.com"
"ghcr.io":
endpoint:
- "ghcr.m.daocloud.io"
"registry.k8s.io":
endpoint:
- "k8s.m.daocloud.io"
"quay.io":
endpoint:
- "quay.m.daocloud.io"