Author: 万花筒, 2023-01-09 13:06
System Architect, Red Hat Enterprise Open Source Solution Center

Using OpenTelemetry and Jaeger for Distributed Tracing in Your Own Services and Applications


Translated from:
https://cloud.redhat.com/blog/using-opentelemetry-and-jaeger-with-your-own-services/application

In this blog I will show you:
· How to use OpenTelemetry in a Quarkus application
· How to display your OpenTelemetry information in the Jaeger UI

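In this blog I use distributed tracing to instrument my services and gain insight into my service architecture. Distributed tracing is used for monitoring, network profiling, and troubleshooting the interaction between components in modern, cloud-native, microservices-based applications.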

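Using distributed tracing lets you perform the following functions:
· Monitor distributed transactions
· Optimize performance and latency
· Perform root cause analysis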

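Red Hat OpenShift distributed tracing consists of two components:
The first is the Red Hat OpenShift distributed tracing platform, which is based on the open source Jaeger project. The second is Red Hat OpenShift distributed tracing data collection, which is based on the OpenTelemetry project.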

For the latest release at the time of writing, OpenShift 4.11, see the official documentation ( https://docs.openshift.com/container-platform/4.11/distr_tracing/distributed-tracing-release-notes.html ).
Both components are delivered as Operators. Version 4.11 uses Jaeger 1.39 and OpenTelemetry 0.63.1.

OpenTelemetry and Jaeger

The diagram below shows how data flows between the application, OpenTelemetry, and Jaeger.


To keep the demo simple, I use the Jaeger AllInOne image. It runs the collector, query, and Jaeger UI in a single pod and uses in-memory storage by default.

More information can be found at the following links:

Enabling Distributed Tracing

A cluster administrator first has to install the Distributed Tracing Platform and Distributed Tracing Data Collection operators. This only needs to be done once.

In OpenShift, Operators can easily be found and installed from OperatorHub. The following steps are based on OpenShift 4.9. For detailed steps, see Installing the Red Hat OpenShift distributed tracing platform Operator ( https://docs.openshift.com/container-platform/4.9/distr_tracing/distr_tracing_install/distr-tracing-installing.html#distr-tracing-jaeger-operator-install_install-distributed-tracing ).


In this demo we do not install the OpenShift Elasticsearch Operator, because traces are kept in memory only and are not persisted.

Make sure you are logged in as cluster-admin:

After a short while, check that the operator pods are running and that the CRDs have been created:

$ oc get pod -n openshift-operators|grep jaeger
jaeger-operator-bc65549bd-hch9v 1/1 Running 0 10d
$ oc get pod -n openshift-operators|grep opentelemetry
opentelemetry-operator-controller-manager-69f7f56598-nsr5h 2/2 Running 0 10d
$ oc get crd jaegers.jaegertracing.io
NAME CREATED AT
jaegers.jaegertracing.io 2021-12-08T15:51:29Z
$ oc get crd opentelemetrycollectors.opentelemetry.io
NAME CREATED AT
opentelemetrycollectors.opentelemetry.io 2021-12-15T07:57:38Z
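
If you repeat this check often, it can be wrapped in a small function. The following is a minimal sketch and not part of the original article: the function name check_crds is hypothetical, and it relies only on oc get crd returning a non-zero exit status when a CRD does not exist.

```shell
# Hypothetical helper (not from the original article):
# print one status line for each CRD name passed as an argument.
check_crds() {
  for crd in "$@"; do
    if oc get crd "$crd" >/dev/null 2>&1; then
      echo "ok: $crd"
    else
      echo "missing: $crd"
    fi
  done
}

# Usage (requires a logged-in oc session):
# check_crds jaegers.jaegertracing.io opentelemetrycollectors.opentelemetry.io
```

A "missing" line usually just means the operator installation has not finished yet, so it is worth retrying after a short wait.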

Create a New Project

Create a project, for example jaeger-demo, and give a normal user (for example developer) admin rights on the project:

$ oc new-project jaeger-demo
Now using project "jaeger-demo" on server "https://api.yourserver:6443".
You can add applications to this project with the 'new-app' command. For example, try:
oc new-app rails-postgresql-example
to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:
kubectl create deployment hello-node --image=k8s.gcr.io/serve_hostname
$ oc policy add-role-to-user admin developer -n jaeger-demo
clusterrole.rbac.authorization.k8s.io/admin added: "developer"

Log In as the Normal User

$ oc login -u developer
Authentication required for https://api.yourserver:6443 (openshift)
Username: developer
Password:
Login successful.
You have one project on this server: "jaeger-demo"
Using project "jaeger-demo".

Create Jaeger

Create a simple Jaeger instance named my-jaeger:

$ cat <<EOF |oc apply -f -
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger
spec: {}
EOF

jaeger.jaegertracing.io/my-jaeger created

Once the Jaeger instance is up and running, check the services and the route:

$ oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-jaeger-agent ClusterIP None 5775/UDP,5778/TCP,6831/UDP,6832/UDP 73m
my-jaeger-collector ClusterIP 172.30.127.95 9411/TCP,14250/TCP,14267/TCP,14268/TCP 73m
my-jaeger-collector-headless ClusterIP None 9411/TCP,14250/TCP,14267/TCP,14268/TCP 73m
my-jaeger-query ClusterIP 172.30.243.178 443/TCP,16685/TCP 73m
$ oc get route my-jaeger -o jsonpath='{.spec.host}'
my-jaeger-jaeger-demo.apps.rbaumgar.demo.net
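
The route host can be turned into a URL for the browser with a one-line helper. This is a sketch under an assumption: the Jaeger route is taken to be served over HTTPS, which the 443/TCP port on my-jaeger-query suggests but the article does not state explicitly. The function name route_url is hypothetical.

```shell
# Hypothetical helper: turn a route host name into a browser URL.
# Assumption: the route is exposed over HTTPS.
route_url() {
  printf 'https://%s\n' "$1"
}

# Usage (requires a logged-in oc session):
# route_url "$(oc get route my-jaeger -o jsonpath='{.spec.host}')"
```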

Open a new browser window, go to the route URL, and log in with your OpenShift login (the normal user from before, developer).

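Create OpenTelemetry Collector

Create a ConfigMap and then an OpenTelemetry Collector instance named my-otelcol.

The ConfigMap is needed because the Jaeger service requires encryption. Its certificates are issued as TLS web server certificates. The annotation service.beta.openshift.io/inject-cabundle: "true" makes OpenShift inject the service CA bundle into the ConfigMap, and the collector mounts it to verify the Jaeger certificate. For more information, see Understanding service serving certificates ( https://docs.openshift.com/container-platform/4.9/security/certificates/service-serving-certificate.html ).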

$ cat <<EOF |oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    service.beta.openshift.io/inject-cabundle: "true"
  name: my-otelcol-cabundle
---
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: my-otelcol
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:

    exporters:
      logging:
        loglevel: info

      jaeger:
        endpoint: my-jaeger-collector-headless.jaeger-demo.svc:14250
        ca_file: "/etc/pki/ca-trust/source/service-ca/service-ca.crt"

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging,jaeger]
  mode: deployment
  resources: {}
  targetAllocator: {}
  volumeMounts:
  - mountPath: /etc/pki/ca-trust/source/service-ca
    name: cabundle-volume
  volumes:
  - configMap:
      name: my-otelcol-cabundle
    name: cabundle-volume
EOF

configmap/my-otelcol-cabundle created
opentelemetrycollector.opentelemetry.io/my-otelcol created
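
Before looking at the collector log, you may want to confirm that the CA bundle was actually injected into the ConfigMap. The following is a minimal sketch: the function name cabundle_injected is hypothetical, and it assumes the bundle is stored under the key service-ca.crt, which matches the ca_file path used in the collector configuration above.

```shell
# Hypothetical helper: succeed only if the named ConfigMap already
# contains a PEM certificate under the key service-ca.crt.
cabundle_injected() {
  oc get configmap "$1" -o jsonpath='{.data.service-ca\.crt}' \
    | grep -q 'BEGIN CERTIFICATE'
}

# Usage (requires a logged-in oc session):
# cabundle_injected my-otelcol-cabundle && echo "CA bundle present"
```

If the check fails, the collector would find no CA certificate at the mounted path and the TLS connection to Jaeger could not be verified.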

In future versions of the OpenTelemetryCollector (> 0.38), the ConfigMap, the volumes, and the volumeMounts may no longer be needed.

When the OpenTelemetryCollector instance is up and running, check the log.

$ oc logs deployment/my-otelcol-collector
2022-01-03T14:57:21.117Z info service/collector.go:303 Starting otelcol... {"Version": "v0.33.0", "NumCPU": 4}
2022-01-03T14:57:21.117Z info service/collector.go:242 Loading configuration...
2022-01-03T14:57:21.118Z info service/collector.go:258 Applying configuration...
2022-01-03T14:57:21.119Z info builder/exporters_builder.go:264 Exporter was built. {"kind": "exporter", "name": "logging"}
2022-01-03T14:57:21.121Z info builder/exporters_builder.go:264 Exporter was built. {"kind": "exporter", "name": "jaeger"}
2022-01-03T14:57:21.121Z info builder/pipelines_builder.go:214 Pipeline was built. {"pipeline_name": "traces", "pipeline_datatype": "traces"}
2022-01-03T14:57:21.121Z info builder/receivers_builder.go:227 Receiver was built. {"kind": "receiver", "name": "otlp", "datatype": "traces"}
2022-01-03T14:57:21.121Z info service/service.go:143 Starting extensions...
2022-01-03T14:57:21.121Z info service/service.go:188 Starting exporters...
2022-01-03T14:57:21.121Z info builder/exporters_builder.go:93 Exporter is starting... {"kind": "exporter", "name": "logging"}
2022-01-03T14:57:21.121Z info builder/exporters_builder.go:98 Exporter started. {"kind": "exporter", "name": "logging"}
2022-01-03T14:57:21.121Z info builder/exporters_builder.go:93 Exporter is starting... {"kind": "exporter", "name": "jaeger"}
2022-01-03T14:57:21.122Z info jaegerexporter/exporter.go:186 State of the connection with the Jaeger Collector backend{"kind": "exporter", "name": "jaeger", "state": "CONNECTING"}
2022-01-03T14:57:21.123Z info builder/exporters_builder.go:98 Exporter started. {"kind": "exporter", "name": "jaeger"}
2022-01-03T14:57:21.123Z info service/service.go:193 Starting processors...
2022-01-03T14:57:21.123Z info builder/pipelines_builder.go:52 Pipeline is starting... {"pipeline_name": "traces", "pipeline_datatype": "traces"}
2022-01-03T14:57:21.123Z info builder/pipelines_builder.go:63 Pipeline is started. {"pipeline_name": "traces", "pipeline_datatype": "traces"}
2022-01-03T14:57:21.123Z info service/service.go:198 Starting receivers...
2022-01-03T14:57:21.123Z info builder/receivers_builder.go:71 Receiver is starting... {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info otlpreceiver/otlp.go:75 Starting GRPC server on endpoint 0.0.0.0:4317 {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info otlpreceiver/otlp.go:137 Setting up a second GRPC listener on legacy endpoint 0.0.0.0:55680 {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info otlpreceiver/otlp.go:75 Starting GRPC server on endpoint 0.0.0.0:55680 {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info otlpreceiver/otlp.go:93 Starting HTTP server on endpoint 0.0.0.0:4318 {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info otlpreceiver/otlp.go:159 Setting up a second HTTP listener on legacy endpoint 0.0.0.0:55681 {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info otlpreceiver/otlp.go:93 Starting HTTP server on endpoint 0.0.0.0:55681 {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info builder/receivers_builder.go:76 Receiver started. {"kind": "receiver", "name": "otlp"}
2022-01-03T14:57:21.123Z info service/collector.go:206 Setting up own telemetry...
2022-01-03T14:57:21.127Z info service/telemetry.go:99 Serving Prometheus metrics {"address": ":8888", "level": 0, "service.instance.id": "930be080-492b-432b-b5c1-1a6cc0f1b707"}
2022-01-03T14:57:21.127Z info service/collector.go:218 Everything is ready. Begin running and processing data.
2022-01-03T14:57:22.123Z info jaegerexporter/exporter.go:186 State of the connection with the Jaeger Collector backend{"kind": "exporter", "name": "jaeger", "state": "READY"}

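The important part is the last line ("State of the connection..."), which shows that the collector is connected to the Jaeger instance with state READY. If this line is missing or never reaches READY, update the exporters.jaeger.endpoint value inside spec.config of the OpenTelemetry Collector instance. The value should have the form ..svc:14250 (in this demo, my-jaeger-collector-headless.jaeger-demo.svc:14250).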
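
The log check can be scripted so that you do not have to scan the output by eye. This is a minimal sketch, assuming the log line format shown above; the function name last_jaeger_state is hypothetical.

```shell
# Hypothetical helper: read collector log lines on stdin and print the
# most recent Jaeger connection state (for example CONNECTING or READY).
last_jaeger_state() {
  grep 'State of the connection with the Jaeger Collector backend' \
    | sed -n 's/.*"state": *"\([A-Z_]*\)".*/\1/p' \
    | tail -n 1
}

# Usage (requires a logged-in oc session):
# oc logs deployment/my-otelcol-collector | last_jaeger_state
```

An empty result means no connection state line has been logged yet.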

You can modify the OpenTelemetry Collector instance with the following command:

$ oc edit opentelemetrycollector my-otelcol
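
When editing the endpoint, it can help to compute the expected value first. The sketch below is an illustration only: the function name jaeger_endpoint is hypothetical, and it assumes the headless collector service is named after the Jaeger instance with the suffix -collector-headless, as in the oc get svc output of this demo.

```shell
# Hypothetical helper: build the gRPC endpoint of the headless Jaeger
# collector service from the Jaeger instance name and its namespace.
# Assumption: the service is named <instance>-collector-headless,
# as in the "oc get svc" output of this demo.
jaeger_endpoint() {
  printf '%s-collector-headless.%s.svc:14250\n' "$1" "$2"
}

# Usage:
# jaeger_endpoint my-jaeger jaeger-demo
```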

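Sample Application

Deploy a Sample Application

All modern application development frameworks, such as Quarkus, support OpenTelemetry. See Quarkus - USING OPENTELEMETRY ( https://quarkus.io/guides/opentelemetry ).

To keep this blog simple, we use an existing sample application, GitHub - rbaumgar/otelcol-demo-app ( https://github.com/rbaumgar/otelcol-demo-app ), a Quarkus application that demonstrates the integration of OpenTelemetry and Jaeger.

Deploy the application and expose a route: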

$ cat <<EOF |oc apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: otelcol-demo-app
  name: otelcol-demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otelcol-demo-app
  template:
    metadata:
      labels:
        app: otelcol-demo-app
    spec:
      containers:
      - image: quay.io/rbaumgar/otelcol-demo-app-jvm
        imagePullPolicy: IfNotPresent
        name: otelcol-demo-app
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: otelcol-demo-app
  name: otelcol-demo-app
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    name: web
  selector:
    app: otelcol-demo-app
  type: ClusterIP
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  labels:
    app: otelcol-demo-app
  name: otelcol-demo-app
spec:
  path: /
  to:
    kind: Service
    name: otelcol-demo-app
  port:
    targetPort: web
EOF

deployment.apps/otelcol-demo-app created
service/otelcol-demo-app created
route.route.openshift.io/otelcol-demo-app exposed

You can add an environment variable named OTELCOL_SERVER to point the application at a different OpenTelemetry Collector URL. The default is http://my-otelcol-collector:4317
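
One way to set the variable without editing the YAML is oc set env. The following is a minimal sketch; the function name set_collector_url is hypothetical, while the deployment name and the variable name are the ones used in this demo.

```shell
# Hypothetical helper: point the demo deployment at another collector.
# $1 is the collector URL, for example http://my-otelcol-collector:4317
set_collector_url() {
  oc set env deployment/otelcol-demo-app "OTELCOL_SERVER=$1"
}

# Usage (requires a logged-in oc session):
# set_collector_url http://my-otelcol-collector:4317
```

Because this changes the pod template, the deployment rolls out a new pod with the updated value.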

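Test the Sample Application

Call the route URL with the /hello context path and you should see the hello message. The other endpoints shown below (/sayHello and /sayRemote) return a hello message with the name you pass. Repeat the calls several times.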

$ export URL=$(oc get route otelcol-demo-app -o jsonpath='{.spec.host}')
$ curl $URL/hello
hello
$ curl $URL/sayHello/demo1
hello: demo1
$ curl $URL/sayRemote/demo2
hello: demo2 from http://otelcol-demo-app-jaeger-demo.apps.rbaumgar.demo.net/
...
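
Repeating the calls by hand gets tedious, so a small loop can generate the traffic. This is a sketch and not part of the original article: the function name generate_traffic is hypothetical, and it assumes the three endpoints used in the curl calls above. It prints only the HTTP status code of each request.

```shell
# Hypothetical helper: call the demo endpoints several times so that
# multiple traces are produced.
# $1: route host or base URL, $2: number of rounds (default 5)
generate_traffic() (
  base=$1
  rounds=${2:-5}
  i=1
  while [ "$i" -le "$rounds" ]; do
    for path in hello "sayHello/demo$i" "sayRemote/demo$i"; do
      # Discard the response body and keep only the HTTP status code.
      code=$(curl -s -o /dev/null -w '%{http_code}' "$base/$path")
      echo "$path $code"
    done
    i=$((i + 1))
  done
)

# Usage:
# generate_traffic "$URL" 10
```

The function body runs in a subshell, so its variables do not leak into your interactive session.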

Go back to the Jaeger URL and reload the page. Under Service, select my-service, then click Find Traces...

The service name is set in the application.properties file of the demo application (quarkus.application.name). The collector URL is set in the same file (quarkus.opentelemetry.tracer.exporter.otlp.endpoint=http://my-otelcol-collector:4317).

Open one of the trace entries and expand it to see the details.


Done.

If you want more details on how OpenTelemetry is used in Quarkus, see the example on GitHub: GitHub - rbaumgar/otelcol-demo-app: Quarkus demo app to show OpenTelemetry with Jaeger.

Using the OpenTelemetry Collector as a Sidecar Container

By default the OpenTelemetry Collector runs in a separate pod (mode: deployment). If you prefer to run the collector in the same pod as your application, you can define the OpenTelemetry Collector custom resource in sidecar mode by setting spec.mode: sidecar.

$ oc delete opentelemetrycollector my-otelcol
$ cat <<EOF |oc apply -f -
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: my-otelcol
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:

    exporters:
      logging:
        loglevel: info

      jaeger:
        endpoint: my-jaeger-collector-headless.jaeger-demo.svc:14250
        ca_file: "/etc/pki/ca-trust/source/service-ca/service-ca.crt"

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging,jaeger]
  mode: sidecar
  resources: {}
  targetAllocator: {}
  volumeMounts:
  - mountPath: /etc/pki/ca-trust/source/service-ca
    name: cabundle-volume
  volumes:
  - configMap:
      name: my-otelcol-cabundle
    name: cabundle-volume
EOF

opentelemetrycollector.opentelemetry.io/my-otelcol created
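Note that no separate collector pod is started at this point. In sidecar mode the collector is injected into application pods that carry the annotation described below.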

You have to add the annotation sidecar.opentelemetry.io/inject: "true" to the pod template of your deployment. You also have to point the OpenTelemetry Collector URL at localhost (environment variable OTELCOL_SERVER), as follows:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: otelcol-demo-app
  labels:
    app: otelcol-demo-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otelcol-demo-app
  template:
    metadata:
      labels:
        app: otelcol-demo-app
      annotations:
        sidecar.opentelemetry.io/inject: 'true'
    spec:
      containers:
        - name: otelcol-demo-app
          image: quay.io/rbaumgar/otelcol-demo-app-jvm
          env:
            - name: OTELCOL_SERVER
              value: 'http://localhost:4317'
...              
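
If the deployment already exists, the same two changes can be applied from the command line instead of editing the YAML. The following is a minimal sketch: the function name enable_sidecar is hypothetical, and the annotation and environment variable are the ones shown in the deployment above.

```shell
# Hypothetical helper: add the sidecar injection annotation to the pod
# template of a deployment and point the application at localhost.
# $1: deployment name
enable_sidecar() {
  oc patch deployment "$1" --type merge -p \
    '{"spec":{"template":{"metadata":{"annotations":{"sidecar.opentelemetry.io/inject":"true"}}}}}'
  oc set env "deployment/$1" OTELCOL_SERVER=http://localhost:4317
}

# Usage (requires a logged-in oc session):
# enable_sidecar otelcol-demo-app
```

Each of the two commands changes the pod template, so the deployment may roll out twice. Putting both changes into one manifest avoids that.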

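When everything is working, your application pod runs with two containers: one with the application and one with the OpenTelemetryCollector. In the log of the second container you should see the same messages as above, including the connection to Jaeger.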
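
To confirm that the sidecar was injected, you can list the container names of the application pod. This is a sketch with a hypothetical function name, pod_containers. The name of the injected collector container is chosen by the operator and is not stated in the article, so read it from the output rather than assuming it.

```shell
# Hypothetical helper: print the container names of the pods selected
# by a label, one name per line.
# $1: label selector, for example app=otelcol-demo-app
pod_containers() {
  oc get pod -l "$1" -o jsonpath='{.items[*].spec.containers[*].name}' \
    | tr ' ' '\n'
}

# Usage (requires a logged-in oc session):
# pod_containers app=otelcol-demo-app
# oc logs deployment/otelcol-demo-app -c <collector-container-name>
```

If only one name is printed, the injection did not happen. In that case, check that the annotation is on the pod template and that the collector instance in sidecar mode exists in the same project.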

Remove the Demo

$ oc delete deployment,svc,route otelcol-demo-app
$ oc delete opentelemetrycollector my-otelcol
$ oc delete jaeger my-jaeger
$ oc delete cm my-otelcol-cabundle
$ oc delete project jaeger-demo

References:

Github: rbaumgar/otelcol-demo-app https://github.com/rbaumgar/otelcol-demo-app/blob/master/OpenTelemetry.md
