Feature: Add support for accepting OTLP via Kafka #4049

Closed
wants to merge 2 commits
4 changes: 4 additions & 0 deletions cmd/ingester/app/builder/builder.go
@@ -39,6 +39,10 @@ func CreateConsumer(logger *zap.Logger, metricsFactory metrics.Factory, spanWrit
unmarshaller = kafka.NewProtobufUnmarshaller()
case kafka.EncodingZipkinThrift:
unmarshaller = kafka.NewZipkinThriftUnmarshaller()
case kafka.EncodingOtlpJSON:
unmarshaller = kafka.NewOtlpJSONUnmarshaller()
case kafka.EncodingOtlpProto:
unmarshaller = kafka.NewOtlpProtoUnmarshaller()
default:
return nil, fmt.Errorf(`encoding '%s' not recognised, use one of ("%s")`,
options.Encoding, strings.Join(kafka.AllEncodings, "\", \""))
4 changes: 4 additions & 0 deletions plugin/storage/kafka/options.go
@@ -34,6 +34,10 @@ const (
EncodingProto = "protobuf"
// EncodingZipkinThrift is used for spans encoded as Zipkin Thrift.
EncodingZipkinThrift = "zipkin-thrift"
// EncodingOtlpJSON is used for spans encoded as OTLP JSON.
EncodingOtlpJSON = "otlp-json"
// EncodingOtlpProto is used for spans encoded as OTLP Proto.
EncodingOtlpProto = "otlp-proto"
Member
add to L69, so that they will appear in -h output.
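A minimal sketch of that change, assuming the reviewer is pointing at the AllEncodings slice in options.go (it is already referenced by the error message in builder.go and presumably feeds the flag help text):

```go
// AllEncodings lists every value accepted by the encoding flags;
// appending the new constants here surfaces them in -h output.
var AllEncodings = []string{
	EncodingJSON,
	EncodingProto,
	EncodingZipkinThrift,
	EncodingOtlpJSON,  // assumed addition
	EncodingOtlpProto, // assumed addition
}
```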


configPrefix = "kafka.producer"
suffixBrokers = ".brokers"
42 changes: 42 additions & 0 deletions plugin/storage/kafka/unmarshaller.go
@@ -19,9 +19,11 @@ import (

"github.com/gogo/protobuf/jsonpb"
"github.com/gogo/protobuf/proto"
"go.opentelemetry.io/collector/pdata/ptrace/ptraceotlp"

"github.com/jaegertracing/jaeger/model"
"github.com/jaegertracing/jaeger/model/converter/thrift/zipkin"
otlp2jaeger "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/jaeger"
)

// Unmarshaller decodes a byte array to a span
@@ -79,3 +81,43 @@ func (h *ZipkinThriftUnmarshaller) Unmarshal(msg []byte) (*model.Span, error) {
}
return mSpans[0], err
}

// OtlpJSONUnmarshaller decodes OTLP/JSON-encoded spans.
type OtlpJSONUnmarshaller struct{}

// NewOtlpJSONUnmarshaller creates an OtlpJSONUnmarshaller.
func NewOtlpJSONUnmarshaller() *OtlpJSONUnmarshaller {
	return &OtlpJSONUnmarshaller{}
}

// Unmarshal decodes an OTLP/JSON export request and returns the first span.
func (*OtlpJSONUnmarshaller) Unmarshal(buf []byte) (*model.Span, error) {
req := ptraceotlp.NewExportRequest()
err := req.UnmarshalJSON(buf)
if err != nil {
return nil, err
}

batch, err := otlp2jaeger.ProtoFromTraces(req.Traces())
if err != nil {
return nil, err
}
return batch[0].Spans[0], nil
Member
This looks like a problem. Does OTLP Kafka exporter allow writing batches of spans as a single Kafka message?

Author
Can you please help me fix this?

Member
It may be difficult to "fix" if by that you mean implementing support for receiving batches. The current consumer in Jaeger was designed to receive one span per message.

I suggest looking into whether OTEL Collector can be configured to send one span per message (Kafka exporter introduced in open-telemetry/opentelemetry-collector#1439), probably with some pipeline configuration. We would want to reference that in Jaeger docs to make it clear that batch per message is not supported. And we should log an error in the code above if more than one span is found in the batch.
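A hedged sketch of that check, fitted to the Unmarshal code above (errors and fmt imports assumed; the empty-batch guard also avoids the index-out-of-range panic flagged here):

```go
batches, err := otlp2jaeger.ProtoFromTraces(req.Traces())
if err != nil {
	return nil, err
}
if len(batches) == 0 || len(batches[0].Spans) == 0 {
	return nil, errors.New("no spans found in OTLP message")
}
if len(batches) > 1 || len(batches[0].Spans) > 1 {
	// One span per Kafka message is the supported shape; reject batches.
	return nil, fmt.Errorf("expected exactly one span per message, got a batch")
}
return batches[0].Spans[0], nil
```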

Author
@yurishkuro I was busy with my college exams; I will get on it right away.

Member
I've looked into this a bit. I don't think OTEL has an option right now to split a batch into multiple Kafka messages, and I suggest that's what needs to happen. While it may be less efficient, going the other way (i.e. supporting batches in Jaeger) would break an invariant that we currently maintain that all spans from a given trace ID end up in the same Kafka partition. The current OTEL collector code won't be able to maintain that invariant.
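To illustrate the invariant (a sketch, not Jaeger's actual writer code): with sarama, the default partitioner hashes the message key, so keying each message by its trace ID pins all spans of a trace to one partition, while a multi-span OTLP batch has no single trace ID to key by.

```go
// Sketch: one span per message, keyed by its trace ID.
msg := &sarama.ProducerMessage{
	Topic: topic,
	Key:   sarama.StringEncoder(span.TraceID.String()),
	Value: sarama.ByteEncoder(payload),
}
producer.Input() <- msg // async producer
```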

Author
So can you please help me understand what I have to do?

Member
We need to make a change to OTEL Kafka exporter to support a config flag that would force de-batching of the spans into one span per message.
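Roughly, the de-batching itself could look like the sketch below (the flag name and wiring are omitted and hypothetical); it fans a ptrace.Traces out into one single-span Traces per message, preserving the owning Resource and Scope:

```go
import "go.opentelemetry.io/collector/pdata/ptrace"

// splitToSingleSpans returns one Traces value per span so each can be
// produced as its own Kafka message.
func splitToSingleSpans(td ptrace.Traces) []ptrace.Traces {
	var out []ptrace.Traces
	for i := 0; i < td.ResourceSpans().Len(); i++ {
		rs := td.ResourceSpans().At(i)
		for j := 0; j < rs.ScopeSpans().Len(); j++ {
			ss := rs.ScopeSpans().At(j)
			for k := 0; k < ss.Spans().Len(); k++ {
				single := ptrace.NewTraces()
				nrs := single.ResourceSpans().AppendEmpty()
				rs.Resource().CopyTo(nrs.Resource())
				nss := nrs.ScopeSpans().AppendEmpty()
				ss.Scope().CopyTo(nss.Scope())
				ss.Spans().At(k).CopyTo(nss.Spans().AppendEmpty())
				out = append(out, single)
			}
		}
	}
	return out
}
```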

Member
Technically, we could also implement batch handling in the ingester. There are two ways of doing that:

  1. Change the unmarshaller signature a bit to allow returning arrays of spans (see the sketch after this list) and opportunistically try to save them one at a time. If one of them fails, the batch will be partially saved, and if a retry happens the whole batch will be re-saved again. If this happens rarely enough, it might be an acceptable solution. We would need to verify how our metrics work, so that they don't count the whole batch as +1.
  2. [Much larger change] Upgrade our Storage API to allow batches of spans. Just to be clear, I don't think it's worth doing for this ticket, but some storage backends do support batch inserts. E.g., for Elasticsearch there is no particular sharding scheme in place, so it can save a batch of random spans in a single node (with a risk of hot partitions, though, and OOMs if batches are very large). For Cassandra, sending a batch of spans for different trace IDs means the receiving node will become a coordinator, will reshard them as needed, and communicate with the other Cassandra nodes. We tried to avoid this mode in our current implementation by making sure the db client does the sharding upfront, to minimize extra hops. I don't remember how either backend would handle atomicity in case of a batch save.
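For option 1, a minimal sketch of the signature change (the consumer-side loop is only indicative):

```go
// Unmarshaller decodes a byte payload into zero or more spans; the existing
// single-span implementations would return a one-element slice.
type Unmarshaller interface {
	Unmarshal([]byte) ([]*model.Span, error)
}

// Indicative consumer-side handling:
//   spans, err := unmarshaller.Unmarshal(msg.Value)
//   for _, span := range spans {
//       // save one at a time; a mid-batch failure leaves the batch
//       // partially saved until the message is retried
//   }
```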


I am trying the first of your suggestions.

> change the unmarshaller signature a bit to allow returning arrays of spans

But I don't have the authority to commit to this branch.
Is it possible to get it?

Or I can create a new PR.

Member
A new PR is fine. You can cherry-pick the commits from the original PR to give credit to the original author.

}

// OtlpProtoUnmarshaller decodes OTLP proto-encoded spans.
type OtlpProtoUnmarshaller struct{}

// NewOtlpProtoUnmarshaller creates an OtlpProtoUnmarshaller.
func NewOtlpProtoUnmarshaller() *OtlpProtoUnmarshaller {
	return &OtlpProtoUnmarshaller{}
}

// Unmarshal decodes an OTLP proto export request and returns the first span.
func (*OtlpProtoUnmarshaller) Unmarshal(buf []byte) (*model.Span, error) {
req := ptraceotlp.NewExportRequest()
err := req.UnmarshalProto(buf)
if err != nil {
return nil, err
}

batch, err := otlp2jaeger.ProtoFromTraces(req.Traces())
if err != nil {
return nil, err
}
return batch[0].Spans[0], nil
}
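As a sanity check, a hedged round-trip test sketch for the proto path (imports omitted; helper names come from the pdata and testify packages, and the test itself is not part of this PR):

```go
func TestOtlpProtoUnmarshaller_RoundTrip(t *testing.T) {
	td := ptrace.NewTraces()
	span := td.ResourceSpans().AppendEmpty().ScopeSpans().AppendEmpty().Spans().AppendEmpty()
	span.SetName("test-span")

	buf, err := ptraceotlp.NewExportRequestFromTraces(td).MarshalProto()
	require.NoError(t, err)

	jSpan, err := NewOtlpProtoUnmarshaller().Unmarshal(buf)
	require.NoError(t, err)
	assert.Equal(t, "test-span", jSpan.OperationName)
}
```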