Skip to content
#

DataOps

DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.

Here are 110 public repositories matching this topic...

kowl
MaxBoeh
MaxBoeh commented Nov 4, 2021

We are using the protobuf-git configuration as described at https://cloudhut.dev/docs/features/protobuf#git-repository
In our repository the proto-files live within a proto directory, which seems to be very common, and contains 5 levels of nested folders.

Currently KOWL searches only the first 5 levels of the checkout for .proto files, so our last level is not considered.

  • Please
feature good first issue backend
flyte
rubrix
frascuchon
frascuchon commented Oct 18, 2021

The default RubrixLogHTTPMiddleware record mapper for token classification expect a structured including a text field for inputs. This could make prediction model inputs a bit cumbersome. Default mapper could accepts also flat strings as inputs:

def token_classification_mapper(inputs, outputs):
    i
good first issue help wanted

Streaming reference architecture for ETL with Kafka and Kafka-Connect. You can find more on http://lenses.io on how we provide a unified solution to manage your connectors, most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.

  • Updated May 20, 2022
  • Scala
elementary
oravi
oravi commented May 7, 2022

Task Overview

  • Currently timestamp_column is the only configuration that is needed to be configured globally in the model config section (usually it's being configured in the properties.yml under elementary in the config tag).
  • Passing the timestamp_column as a test param will enable running multiple tests with different timestamp columns. For example running a test with updated_at colum
enhancement good first issue
harikrishnakanchi
harikrishnakanchi commented Oct 28, 2021

In golang client, consumers get dynamic message instance after parsing. Add an example in the docs on how to use dynamic message instance to get values of different types in consumer code.

List of protobuf types to cover

  • timestamp
  • duration
  • bytes
  • message type
  • struct
  • map
good first issue docs
Wikipedia
Wikipedia

Related Topics

open-data