mlops

At the moment, the API exposes two useful columns about tasks:

Key	Type	Description
tasks.created_at	datetime	Date and time of task creation.
tasks.updated_at	datetime	Date and time of last update to the task.

However, in tabs (views) there are possible columns tasks:created_at a

Thank you for this great tool!

[Describe the bug
A clear and concise description of what the bug is.]

Broken link in the automatically generated Edit Your Expectation Suite starter notebook: https://docs.greatexpectations.io/en/latest/autoapi/great_expectations/data_asset/index.html?highlight=remove_expectation&utm_source=notebook&utm_medium=edit_expectations#great_expectations.data_

With a config like this

{
    "METAFLOW_DATASTORE_SYSROOT_S3": "s3://mf-test/metaflow/"
}

(note the trailing slash in the value of METAFLOW_DATASTORE_SYSROOT_S3)

metaflow.S3(run=self).put* produces double-slashes like here:

s3://mf-test/metaflow//data/DataLoader/1630978962283843/month=01/data.parquet

The trailing slash in the config shouldn't make a difference.
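One way to make the root insensitive to a trailing slash is to normalize the seams when joining path parts. The sketch below is only an illustration of that idea: `join_s3_path` is a hypothetical helper, not Metaflow's actual implementation, and the key parts are shortened from the example above.

```python
def join_s3_path(root: str, *parts: str) -> str:
    """Join an S3 root and key parts with exactly one slash at each seam.

    Hypothetical helper for illustration; not part of Metaflow.
    """
    # Drop slashes at the seams so a trailing slash in the root is harmless.
    cleaned = [root.rstrip("/")]
    cleaned += [p.strip("/") for p in parts if p.strip("/")]
    return "/".join(cleaned)


with_slash = join_s3_path("s3://mf-test/metaflow/", "data/DataLoader", "month=01/data.parquet")
without_slash = join_s3_path("s3://mf-test/metaflow", "data/DataLoader", "month=01/data.parquet")
print(with_slash == without_slash)
```

With this kind of normalization, both spellings of the root yield the same object URL.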

Description

We need to update our list of community-developed plugins as our ecosystem grows. This task focuses on adding the kedro-airflow-k8s plugin to our Community Developed Plugins page.

Possible Implementation

Follow the process in our [Contribution G

🚨🚨 Feature Request

Related to an existing Issue
A new implementation (Improvement, Extension)

If your feature will improve `HUB`

Need a way to check if a dataset already exists.

hub.empty throws an error if a dataset exists and hub.load throws an error if the dataset does not exist.
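Until such a check exists, a caller could wrap a loader in a small helper that turns the exception into a boolean. This is a rough sketch under stated assumptions: `dataset_exists`, `fake_load`, and the choice of `KeyError` are all hypothetical stand-ins, and the exception type actually raised by the library is not assumed here.

```python
from typing import Callable, Type


def dataset_exists(
    load: Callable[[str], object],
    path: str,
    missing_error: Type[Exception] = Exception,
) -> bool:
    """Return True if load(path) succeeds, False if it raises missing_error."""
    try:
        load(path)
    except missing_error:
        return False
    return True


# Stand-in loader for illustration only; a real check would call the library.
_fake_store = {"user/mnist": object()}


def fake_load(path: str) -> object:
    return _fake_store[path]  # raises KeyError when the dataset is absent


print(dataset_exists(fake_load, "user/mnist", KeyError))
print(dataset_exists(fake_load, "user/missing", KeyError))
```

Passing a narrow `missing_error` avoids hiding unrelated failures such as network or permission errors.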

Need a way to check if a dataset already exists without throwing a

Use case

Polyaxon tracking can be used to automatically log all information generated by Ludwig.

We should probably think about adding support for Polyaxon's tracking module in Ludwig contrib.

For SC Operator it may be a good idea to generate CRD manifests from inside a Docker container.
This should provide a reproducible generation step and avoid "produces different output on my machine" issues.

The linter should also fail if manifest generation produces a diff against the committed version.
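The diff check could look roughly like the sketch below, which compares freshly generated manifests with the committed ones using only the standard library. The function name `manifest_diff`, the flat directory layout, and the `*.yaml` pattern are assumptions made for illustration.

```python
import difflib
from pathlib import Path
from typing import List


def manifest_diff(generated_dir: str, committed_dir: str) -> List[str]:
    """Return unified-diff lines between generated and committed manifests."""
    gen, com = Path(generated_dir), Path(committed_dir)
    # Consider files present on either side so additions and deletions count.
    names = sorted(
        {p.name for p in gen.glob("*.yaml")} | {p.name for p in com.glob("*.yaml")}
    )
    diff: List[str] = []
    for name in names:
        old = (com / name).read_text().splitlines() if (com / name).exists() else []
        new = (gen / name).read_text().splitlines() if (gen / name).exists() else []
        diff.extend(
            difflib.unified_diff(
                old, new, f"committed/{name}", f"generated/{name}", lineterm=""
            )
        )
    return diff
```

A lint step could print the returned lines and exit with a non-zero status whenever the list is non-empty.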

What steps did you take

The code gets stuck in an infinite loop if the SageMaker training job gets stopped (unhandled use case).

What happened:

https://github.com/kubeflow/pipelines/blob/master/components/aws/sagemaker/train/src/sagemaker_training_component.py#L57-L66

The code above only handles the training job statuses Completed and Failed, so if the training job status is marked as `Stopped
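A polling loop that treats every terminal status as an exit condition, and also bounds the number of polls, would avoid this hang. The following is a minimal sketch, not the component's actual code: `wait_for_training_job`, `get_status`, and `max_polls` are hypothetical names, and `get_status` stands in for whatever call returns the job's current status string.

```python
import time
from typing import Callable

# Terminal values of a SageMaker training job status; "InProgress" and
# "Stopping" are the non-terminal ones.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_training_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 1000,
) -> str:
    """Poll until the job reaches a terminal status and return that status."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    # Bounding the loop guards against statuses this sketch does not know about.
    raise TimeoutError(f"Job still not finished after {max_polls} polls")
```

The caller can then decide how to treat each outcome, for example raising an error for both Failed and Stopped.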

When on-demand feature views are specified at retrieval time (e.g. get_X_features), the output feature vectors include extra values such as request data or dependent feature vectors, even if users did not request those features.

Expected Behavior

Non-specified dependent feature values are not returned in output

Current Behavior

Non-specified dependent feature values are in output

Steps to reprodu

Describe the bug

The flytectl register files command doesn't fail when run without --continueOnError. It should fail with a non-zero exit code.

Expected behavior

Flytectl register should fail if there is an error.
Flytectl register should not fail if the user passes --continueOnError.
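The expected behavior can be sketched as a small registration loop that maps errors to an exit code. This is an illustration of the semantics described above, not flytectl's implementation (flytectl is written in Go); `register_all` and `register_one` are hypothetical names.

```python
import sys
from typing import Callable, Iterable


def register_all(
    files: Iterable[str],
    register_one: Callable[[str], None],
    continue_on_error: bool = False,
) -> int:
    """Register each file and return the process exit code."""
    for path in files:
        try:
            register_one(path)
        except Exception as exc:
            print(f"failed to register {path}: {exc}", file=sys.stderr)
            if not continue_on_error:
                # Without the flag, the first error aborts with a failure code.
                return 1
    # Either everything succeeded or the user asked to continue past errors.
    return 0
```

A command-line entry point would pass the returned value to `sys.exit`.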

Additional context to reproduce

No response

Screenshots

No response

Are

Proposed refactoring or deprecation

Output informative messages when a run parameter cannot be set (i.e. its type is not supported by storage).
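As a rough sketch of what such a message could look like, the check below walks a parameter value and names the exact key that cannot be stored. The function `check_param` and the set of supported types are assumptions for illustration, not the storage's real type list.

```python
from typing import Any

# Assumption: an illustrative set of storable scalar types.
SCALARS = (type(None), bool, int, float, str)


def check_param(key: str, value: Any) -> None:
    """Raise a TypeError naming the exact key whose value cannot be stored."""
    if isinstance(value, SCALARS):
        return
    if isinstance(value, (list, tuple)):
        for i, item in enumerate(value):
            check_param(f"{key}[{i}]", item)
        return
    if isinstance(value, dict):
        for k, item in value.items():
            check_param(f"{key}.{k}", item)
        return
    raise TypeError(
        f"Cannot set run parameter '{key}': values of type "
        f"{type(value).__name__} are not supported by storage."
    )
```

Running the check before writing lets the error point at the offending nested key rather than surfacing from deep inside the storage layer.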

Motivation

To give users clear error messages and improve the onboarding experience.

Pitch

When setting a run parameter of an unsupported type, I would like to get a precise error describing what was done wrong instead of a generic Python exception (`No

When running my first test model I ran into the issue where my target field was sometimes empty (signifying zero, or 'no data' for that timeframe). It was a numeric field, so it'd be nice to be able to set a value in my config.json that it defaults to when there is no data per field.
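The requested behavior amounts to a per-field fallback applied while parsing. The sketch below shows the idea only; `parse_numeric` and the `field_defaults` mapping are hypothetical and do not correspond to an existing config.json option.

```python
from typing import Dict


def parse_numeric(field: str, raw: str, field_defaults: Dict[str, float]) -> float:
    """Parse a numeric cell, using the field's configured default when empty."""
    text = raw.strip()
    if text:
        return float(text)
    if field in field_defaults:
        return field_defaults[field]
    # No default configured: report which field was empty.
    raise ValueError(f"Field '{field}' is empty and has no default value")


# Hypothetical defaults, as they might be read from a config file.
defaults = {"target": 0.0}
print(parse_numeric("target", "", defaults))
print(parse_numeric("target", "3.5", defaults))
```

Raising an error that names the field when no default exists also addresses the difficulty of locating the failing value.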

An additional side note but the lack of a --verbose flag to tell me which row it was failing on made it take long

load_dotted_path raises the following error if it is unable to load the module:

Traceback (most recent call last):
  File "/Users/Edu/Desktop/import-error/script.py", line 4, in <module>
    load_dotted_path('tests.quality.fn')
  File "/Users/Edu/dev/ploomber/src/ploomber/util/dotted_path.py", line 128, in load_dotted_path
    module = importlib.import_module(mod)
  File "/Users/

We're using marshmallow to parse the whylogs config from YAML.

However, Pydantic is much more powerful, as it allows users to set config via various mechanisms, from YAML and JSON to environment settings.

We should consider moving to Pydantic.

mlops

Here are 458 public repositories matching this topic...

GokuMohandas / MadeWithML

EthicalML / awesome-production-machine-learning

heartexlabs / label-studio

visenger / awesome-mlops

aws / amazon-sagemaker-examples

great-expectations / great_expectations

Netflix / metaflow

quantumblacklabs / kedro

activeloopai / Hub

bentoml / BentoML

polyaxon / polyaxon

allegroai / clearml

SeldonIO / seldon-core

kubeflow / pipelines

feast-dev / feast

semi-technologies / weaviate

flyteorg / flyte

aimhubio / aim

evidentlyai / evidently

zenml-io / zenml

MLReef / mlreef

ebhy / budgetml

microsoft / MLOps

tangramdotdev / tangram

GokuMohandas / MLOps

kelvins / awesome-mlops

abhishek-ch / around-dataengineering

ploomber / ploomber

microsoft / MLOpsPython

whylabs / whylogs
