feature-engineering
Here are 1,493 public repositories matching this topic...
Updated Jun 23, 2022 - Python
In the check_schema_version utility function, there is custom code to determine whether a saved schema version is older or newer than the current schema version. This comparison could likely be simplified significantly by using the packaging library to perform the version comparison instead of the custom code.
Current code:
current = SCHEMA_VERSION.split(".")
saved = ve
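A minimal sketch of the kind of comparison the issue proposes, assuming the third-party `packaging` package is installed. The function name `compare_schema_versions` and the example version string are hypothetical and are not taken from the project's code.

```python
from packaging.version import Version

# Hypothetical current schema version, for illustration only.
SCHEMA_VERSION = "10.0.2"


def compare_schema_versions(saved: str, current: str = SCHEMA_VERSION) -> str:
    """Return 'older', 'newer', or 'same' for the saved version relative to current."""
    saved_v, current_v = Version(saved), Version(current)
    if saved_v < current_v:
        return "older"
    if saved_v > current_v:
        return "newer"
    return "same"
```

`Version` compares release segments numerically, so "9.9.9" sorts before "10.0.0", which a plain string comparison would get wrong.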
Is your feature request related to a problem? Please describe.
Feast is often hard to install alongside other Python packages that use google-cloud-core. Specifically, Feast sets an upper bound on this library (2.0.0), but the latest version is 2.3.1, and many Python packages set a lower bound of 2.0.0 or above.
Describe the solution you'd like
Remove google-cloud-core fr
Updated Jun 21, 2022 - Java
I trained models on Windows and then tried to use them on Linux, but I could not load them because of incorrect path joining. During model loading, I got learner_path in the following format: experiments_dir/model_1/100_LightGBM\\learner_fold_0.lightgbm. The last two slashes were incorrectly concatenated with the rest of the path. In this regard, I would suggest adding something like `l
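One possible way to handle such paths, sketched under the assumption that a saved path may mix `/` and `\` separators: parse it with `pathlib.PureWindowsPath`, which accepts both, and re-emit it with forward slashes. The helper name `normalize_saved_path` is hypothetical and is not the project's API.

```python
from pathlib import PurePosixPath, PureWindowsPath


def normalize_saved_path(raw: str) -> PurePosixPath:
    """Convert a path that may contain Windows separators into a POSIX-style path."""
    # PureWindowsPath treats both "/" and "\" as separators and never touches the filesystem.
    return PurePosixPath(PureWindowsPath(raw).as_posix())


# The Python literal below contains a single backslash before "learner_fold_0".
saved = "experiments_dir/model_1/100_LightGBM\\learner_fold_0.lightgbm"
print(normalize_saved_path(saved))
```

This approach would also rewrite a legitimate POSIX file name that contains a backslash, so it only fits cases where backslashes are known to be Windows separators.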
Updated Feb 10, 2021 - Python
We have an English version of this demo: https://github.com/4paradigm/OpenMLDB/tree/main/demo/talkingdata-adtracking-fraud-detection
Please translate this doc into Chinese and save it as docs/zh/use_case/talkingdata.md. Don't forget to update the file docs/zh/use_case/index.rst.
Updated Jun 27, 2022 - Scala
Updated Feb 14, 2017 - Jupyter Notebook
Updated Jun 22, 2022 - Python
When a variable is on a logarithmic scale, it might make sense to create the intervals on a log scale instead of a linear scale.
Quote:
"When the numbers span multiple magnitudes, it may be better to group by powers of 10 (or powers of any constant): 0–9, 10–99, 100–999, 1000–9999, etc. The bin widths grow exponentially"
The idea is taken from: Feature Engineering for Machine Lear
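The grouping described in the quote can be sketched in plain Python as follows. This is an illustrative helper for non-negative integers only; the name `power_bin` is hypothetical and does not come from any library mentioned here.

```python
def power_bin(value: int, base: int = 10) -> int:
    """Return the power-of-base bin index: 0 for 0..base-1, 1 for base..base**2-1, and so on."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if base < 2:
        raise ValueError("base must be at least 2")
    index = 0
    # Integer division avoids floating-point rounding issues near exact powers.
    while value >= base:
        value //= base
        index += 1
    return index


samples = [0, 9, 10, 99, 100, 999, 1000, 9999]
bins = [power_bin(v) for v in samples]
```

With base 10, the values 0–9 fall in bin 0, 10–99 in bin 1, 100–999 in bin 2, and 1000–9999 in bin 3, so each bin is ten times wider than the previous one.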
Currently, the get_result_df function offers no way to specify a temporary folder name. It would be useful if this function supported a parameter such as local_folder so that the end user can control where the files are downloaded.
Updated Dec 20, 2017 - Python
Updated Dec 15, 2018 - Jupyter Notebook
The current version of bucketize uses fixed boundaries. If the user doesn't know these boundaries, they need to calculate them using cudf.
We should support splitting continuous variables into buckets based on quantile and uniform splits of the data.
For uniform splits, the statistics-gathering phase needs to compute the min and max of the column and derive the boundaries that create N buckets.
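The boundary computation described above can be sketched with the Python standard library. This is not the cudf-based implementation the issue asks for; the function names are hypothetical, and the data is assumed to fit in an in-memory list.

```python
import bisect
import statistics


def uniform_boundaries(values, n_buckets):
    """Interior boundaries that split [min, max] into n_buckets equal-width buckets."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets
    return [lo + i * width for i in range(1, n_buckets)]


def quantile_boundaries(values, n_buckets):
    """Interior boundaries that put roughly the same number of values in each bucket."""
    return statistics.quantiles(values, n=n_buckets)


def bucketize(values, boundaries):
    """Map each value to the index of the bucket it falls into."""
    return [bisect.bisect_right(boundaries, v) for v in values]


data = [0, 2, 4, 6, 8, 10]
edges = uniform_boundaries(data, 5)
buckets = bucketize(data, edges)
```

Both helpers return N - 1 interior boundaries for N buckets, so the same `bucketize` step works for either split strategy.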
Updated Mar 6, 2022 - Python
Updated Jan 20, 2021 - Python
Updated Jul 4, 2022 - Jupyter Notebook
Updated May 8, 2019 - Python
Updated Jun 2, 2022 - Python
Updated Jun 27, 2022 - Java
Is your feature request related to a problem? Please describe.
The main friction in getting the examples up and running is installing the dependencies. A Docker container with the dependencies already installed would reduce that friction for people getting started with Hamilton.
Describe the solution you'd like
- A docker container, that has different python virtual environments, that has the dependencies t
Updated Jul 1, 2022 - Python
In PR #3133, we marked tests to skip if the environment was a Python 3.9 environment. I don't think all the tests that are being skipped need to be skipped anymore. In working through the PolynomialDetrender tests, it was noted that Python 3.9 environments were skipping these tests, probably due to sktime not being compatible with that version of Py
Updated Oct 26, 2018
Updated Feb 16, 2022 - Jupyter Notebook
Updated Mar 1, 2022 - Jupyter Notebook
Updated Jun 22, 2022 - Python
Bug with GPU Model
Currently, when using pruning methods like TaylorFOWeightPruner, if I use a model on the GPU to compute the metrics (as calculated for getting the masks), it fails on the line that creates the masks. The reason why it fails i