Containerizing SkyWalking: building the Docker image and deploying to k8s, from testing to usable

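Preface
SkyWalking is a very good APM product, but it has one painful problem in day-to-day use. With Elasticsearch-based storage, as soon as something goes wrong with the ES data, the whole SkyWalking web UI becomes unavailable. You then have to stop the agent on every service, one service at a time, and redeploy each service before things recover. The same problem comes up when upgrading SkyWalking to a new version. APM data is also transient process data that is allowed to be discarded: by default SkyWalking keeps trace records for only 90 minutes. So I decided to containerize the SkyWalking deployment so that deploying and upgrading become a one-step operation. The rest of this post walks through the whole containerization process.
Goal: run the SkyWalking Docker image in a k8s cluster and serve from there.

Building the Docker image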
FROM registry.cn-xx.xx.com/keking/jdk:1.8
# Copy the unpacked SkyWalking distribution into the image
ADD apache-skywalking-apm-incubating/ /opt/apache-skywalking-apm-incubating/
# Set the timezone, make the scripts executable, and append a tail command
# to startup.sh so that the container keeps a foreground process
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
    && echo 'Asia/Shanghai' >/etc/timezone \
    && chmod +x /opt/apache-skywalking-apm-incubating/config/setApplicationEnv.sh \
    && chmod +x /opt/apache-skywalking-apm-incubating/webapp/setWebAppEnv.sh \
    && chmod +x /opt/apache-skywalking-apm-incubating/bin/startup.sh \
    && echo "tail -fn 100 /opt/apache-skywalking-apm-incubating/logs/webapp.log" >> /opt/apache-skywalking-apm-incubating/bin/startup.sh
EXPOSE 8080 10800 11800 12800
# Render the two config files from environment variables, then start SkyWalking
CMD /opt/apache-skywalking-apm-incubating/config/setApplicationEnv.sh \
    && sh /opt/apache-skywalking-apm-incubating/webapp/setWebAppEnv.sh \
    && /opt/apache-skywalking-apm-incubating/bin/startup.sh

There are two questions to think about when writing the Dockerfile. Which SkyWalking settings need to be configured dynamically, that is, set at runtime? And how do you keep the container's process running? SkyWalking's startup.sh behaves like Tomcat's startup.sh: it launches the services in the background and then returns.

application.yml
#cluster:
#  zookeeper:
#    hostPort: localhost:2181
#    sessionTimeout: 100000
naming:
  jetty:
    #OS real network IP(binding required), for agent to find collector cluster
    host: 0.0.0.0
    port: 10800
    contextPath: /
cache:
#  guava:
  caffeine:
remote:
  gRPC:
    # OS real network IP(binding required), for collector nodes communicate with each other in cluster. collectorN --(gRPC) --> collectorM
    host: #real_host
    port: 11800
agent_gRPC:
  gRPC:
    #os real network ip(binding required), for agent to uplink data(trace/metrics) to collector. agent--(grpc)--> collector
    host: #real_host
    port: 11800
    # Set these two setting to open ssl
    #sslCertChainFile: $path
    #sslPrivateKeyFile: $path
    # Set your own token to active auth
    #authentication: xxxxxx
agent_jetty:
  jetty:
    # OS real network IP(binding required), for agent to uplink data(trace/metrics) to collector through HTTP. agent--(HTTP)--> collector
    # SkyWalking native Java/.Net/node.js agents don't use this.
    # Open this for other implementor.
    host: 0.0.0.0
    port: 12800
    contextPath: /
analysis_register:
  default:
analysis_jvm:
  default:
analysis_segment_parser:
  default:
    bufferFilePath: ../buffer/
    bufferOffsetMaxFileSize: 10M
    bufferSegmentMaxFileSize: 500M
    bufferFileCleanWhenRestart: true
ui:
  jetty:
    # Stay in `localhost` if UI starts up in default mode.
    # Change it to OS real network IP(binding required), if deploy collector in different machine.
    host: 0.0.0.0
    port: 12800
    contextPath: /
storage:
  elasticsearch:
    clusterName: #elasticsearch_clusterName
    clusterTransportSniffer: true
    clusterNodes: #elasticsearch_clusterNodes
    indexShardsNumber: 2
    indexReplicasNumber: 0
    highPerformanceMode: true
    # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
    bulkActions: 2000 # Execute the bulk every 2000 requests
    bulkSize: 20 # flush the bulk every 20mb
    flushInterval: 10 # flush the bulk every 10 seconds whatever the number of requests
    concurrentRequests: 2 # the number of concurrent requests
    # Set a timeout on metric data. After the timeout has expired, the metric data will automatically be deleted.
    traceDataTTL: 2880 # Unit is minute
    minuteMetricDataTTL: 90 # Unit is minute
    hourMetricDataTTL: 36 # Unit is hour
    dayMetricDataTTL: 45 # Unit is day
    monthMetricDataTTL: 18 # Unit is month
#storage:
#  h2:
#    url: jdbc:h2:~/memorydb
#    userName: sa
configuration:
  default:
    #namespace: xxxxx
    # alarm threshold
    applicationApdexThreshold: 2000
    serviceErrorRateThreshold: 10.00
    serviceAverageResponseTimeThreshold: 2000
    instanceErrorRateThreshold: 10.00
    instanceAverageResponseTimeThreshold: 2000
    applicationErrorRateThreshold: 10.00
    applicationAverageResponseTimeThreshold: 2000
    # thermodynamic
    thermodynamicResponseTimeStep: 50
    thermodynamicCountOfResponseTimeSteps: 40
    # max collection's size of worker cache collection, setting it smaller when collector OutOfMemory crashed.
    workerCacheMaxSize: 10000
#receiver_zipkin:
#  default:
#    host: localhost
#    port: 9411
#    contextPath: /

webapp.yml
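server:
  port: 8080
collector:
  path: /graphql
  ribbon:
    ReadTimeout: 10000
    listOfServers: #real_host:10800
security:
  user:
    admin:
      password: #skywalking_password

Dynamic configuration: the password, and the host IPs that gRPC and similar listeners must bind to, all have to be set at runtime. So before running SkyWalking's startup.sh, the container first runs two configuration scripts. They replace the placeholders that need dynamic values (#real_host, #elasticsearch_clusterName, #elasticsearch_clusterNodes, #skywalking_password) with environment variables that k8s sets at runtime.
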
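
The placeholder substitution described above can be sketched as a small shell helper. This is only an illustration under stated assumptions, not the script used in this post: `render_placeholder` is a hypothetical name, it assumes a `sed` that supports `-i` (GNU or BusyBox), and it adds two things the simple one-line `sed` calls do not have: it fails when the variable is unset, and it escapes `&`, `\` and the delimiter so that values such as passwords are inserted literally.

```shell
#!/usr/bin/env sh
# Sketch: replace a "#name" placeholder in a config file with the value of
# the environment variable of the same name.
# render_placeholder is a hypothetical helper, not part of SkyWalking.

render_placeholder() {
  name="$1"
  file="$2"
  # Read the variable whose name is stored in $name (empty if unset).
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "missing environment variable: $name" >&2
    return 1
  fi
  # Escape characters that are special in a sed replacement:
  # backslash, ampersand, and the | delimiter used below.
  escaped=$(printf '%s' "$value" | sed 's/[\\&|]/\\&/g')
  sed -i "s|#${name}|${escaped}|g" "$file"
}

# Demo on a throwaway file (the address is made up for illustration).
demo=$(mktemp)
printf 'host: #real_host\nport: 11800\n' > "$demo"
real_host=10.1.2.3
render_placeholder real_host "$demo"
cat "$demo"
```

Failing early on a missing variable matters here because an unreplaced `#real_host` is read by YAML as a comment, which would leave the host setting empty without any error from `sed`.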
setApplicationEnv.sh
#!/usr/bin/env sh
# Fill the placeholders in application.yml from environment variables
sed -i "s/#elasticsearch_clusterNodes/${elasticsearch_clusterNodes}/g" /opt/apache-skywalking-apm-incubating/config/application.yml
sed -i "s/#elasticsearch_clusterName/${elasticsearch_clusterName}/g" /opt/apache-skywalking-apm-incubating/config/application.yml
sed -i "s/#real_host/${real_host}/g" /opt/apache-skywalking-apm-incubating/config/application.yml

setWebAppEnv.sh
#!/usr/bin/env sh
# Fill the placeholders in webapp.yml from environment variables
sed -i "s/#skywalking_password/${skywalking_password}/g" /opt/apache-skywalking-apm-incubating/webapp/webapp.yml
sed -i "s/#real_host/${real_host}/g" /opt/apache-skywalking-apm-incubating/webapp/webapp.yml

Keeping the process alive: appending "tail -fn 100 /opt/apache-skywalking-apm-incubating/logs/webapp.log" to the end of SkyWalking's startup.sh keeps a process running in the foreground and continuously streams webapp.log to the container's output.

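
One thing to keep in mind with the tail approach: if the log file did not exist yet at the moment `tail -f` started, `tail` would exit with an error and the container would stop. A minimal sketch of a guard for that case is shown below. `wait_for_file` is a hypothetical helper and the timeout values are arbitrary; whether the guard is needed depends on whether startup.sh has already created the log file by the time it returns.

```shell
#!/usr/bin/env sh
# Sketch: wait until a log file exists before tailing it.
# wait_for_file is a hypothetical helper, not part of SkyWalking.

wait_for_file() {
  file="$1"
  timeout="${2:-60}"   # seconds to wait before giving up
  waited=0
  while [ ! -f "$file" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "gave up waiting for $file" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Intended use as the last step of the container's start command (not run here):
#   wait_for_file /opt/apache-skywalking-apm-incubating/logs/webapp.log 120 \
#     && exec tail -n 100 -F /opt/apache-skywalking-apm-incubating/logs/webapp.log
```

In the commented usage, `tail -F` keeps following the file by name, so it survives the log being rotated or recreated, and `exec` replaces the shell with `tail` so that `tail` is the process that receives the container's stop signal.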
Deploying to Kubernetes
# The original manifest used extensions/v1beta1, which no longer serves
# Deployments as of Kubernetes 1.16; apps/v1 is the current API group.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skywalking
  namespace: uat
spec:
  replicas: 1
  selector:
    matchLabels:
      app: skywalking
  template:
    metadata:
      labels:
        app: skywalking
    spec:
      imagePullSecrets:
        - name: registry-pull-secret
      nodeSelector:
        apm: skywalking
      containers:
        - name: skywalking
          image: registry.cn-xx.xx.com/keking/kk-skywalking:5.2
          imagePullPolicy: Always
          env:
            - name: elasticsearch_clusterName
              value: elasticsearch
            - name: elasticsearch_clusterNodes
              value: 172.16.16.129:31300
            - name: skywalking_password
              value: xxx
            # Inject the pod's own IP so SkyWalking can bind to it
            - name: real_host
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          resources:
            limits:
              cpu: 1000m
              memory: 4Gi
            requests:
              cpu: 700m
              memory: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  name: skywalking
  namespace: uat
  labels:
    app: skywalking
spec:
  selector:
    app: skywalking
  ports:
    - name: web-a
      port: 8080
      targetPort: 8080
      nodePort: 31180
    - name: web-b
      port: 10800
      targetPort: 10800
      nodePort: 31181
    - name: web-c
      port: 11800
      targetPort: 11800
      nodePort: 31182
    - name: web-d
      port: 12800
      targetPort: 12800
      nodePort: 31183
  type: NodePort

The only thing that needs attention in the Kubernetes manifest is how the pod IP is obtained in env. Several addresses in SkyWalking must be bound to the container's real IP, and the fieldRef on status.podIP passes that IP into the container as the real_host environment variable.

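
Since the deployment depends on real_host holding a concrete address, a startup script could check the value before rendering the configuration. The following is a minimal sketch under stated assumptions: `is_concrete_ipv4` is a hypothetical helper, it only handles IPv4, and the sample addresses are made up for illustration.

```shell
#!/usr/bin/env sh
# Sketch: accept only a concrete IPv4 address (four groups 0-255, not 0.0.0.0).
# is_concrete_ipv4 is a hypothetical helper, not part of SkyWalking or k8s.

is_concrete_ipv4() {
  addr="$1"
  # Reject empty values, non-digit characters, and misplaced dots.
  case "$addr" in
    "" | *[!0-9.]* | .* | *. | *..*) return 1 ;;
  esac
  # Split on dots into the positional parameters.
  old_ifs="$IFS"
  IFS=.
  set -- $addr
  IFS="$old_ifs"
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  # The wildcard address is valid for listening but useless to advertise.
  [ "$addr" != "0.0.0.0" ]
}

# Demo with a few candidate values.
for candidate in 10.244.1.17 0.0.0.0 '#real_host'; do
  if is_concrete_ipv4 "$candidate"; then
    echo "ok: $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```

A check like this would turn an empty or unreplaced real_host into an immediate, visible startup failure instead of a collector that runs but advertises an address agents cannot reach.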
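Closing remarks
Containerizing SkyWalking, from first test to a usable deployment, took about one day. Of that, a bit over an hour went into trying out Tan's skywalking-docker image (https://hub.docker.com/r/wutang/skywalking-docker/). I found that one of its scripts had a permission problem (Tan says this has been fixed, but I have not had time to test it yet), and there were several places I could not easily control myself, so I built my own Docker image.
The biggest problem was network communication inside the cluster. At first I set every service IP in SkyWalking to 0.0.0.0 and exposed the ports through the cluster's nodePort. In that setup an agent could reach the naming service through the cluster IP plus port 31181, but the collector gRPC address it got back from the naming service was 0.0.0.0:11800, and an agent obviously cannot reach the collector at that address. Binding to the pod IP solved this.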

