-
Updated
Oct 1, 2021 - Java
data-catalog
Here are 65 public repositories matching this topic...
Upgrade Gradle to v7
The Renovate configuration intentionally prevents automated major version upgrades, as seen in #1639. A contribution is therefore requested to manually upgrade the code to the latest version of Gradle.
The original PR will show the comprehensive change log for Gradle itself, which can be referenced to look for applicable breaking changes. Outside of those, the existing a
@cantzakas created the SQL query necessary to pull metadata in (hyperqueryhq/whale#140); we just have to build the Greenplum extractor scaffolding. This should follow the same shape as the Postgres extractor.
-
Updated
Sep 7, 2021 - Python
It is not surprising that deep and shallow scans show different results. A shallow scan only looks at column names. A deep scan looks at a sample of the data. I've even noticed that two different runs of a deep scan show different results because the sampled rows differ. This is the challenge with not scanning all of the data. It's a trade-off between performance/cost and accuracy. There is no right answer.
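The difference between the two scan modes can be sketched in a few lines. This is a minimal illustration, not the scanner's actual implementation: the patterns, the function names `shallow_scan` and `deep_scan`, and the sample table are all hypothetical.

```python
import random
import re

# Hypothetical patterns, for illustration only; not taken from any real scanner.
NAME_PATTERN = re.compile(r"email|e_mail", re.IGNORECASE)
VALUE_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def shallow_scan(columns):
    """Flag columns whose *name* looks like it holds an email address."""
    return {name for name in columns if NAME_PATTERN.search(name)}


def deep_scan(columns, sample_size, rng):
    """Flag columns where any *sampled value* looks like an email address."""
    flagged = set()
    for name, values in columns.items():
        # Only a random subset of rows is inspected, so results depend on the draw.
        sample = rng.sample(values, min(sample_size, len(values)))
        if any(VALUE_PATTERN.match(str(value)) for value in sample):
            flagged.add(name)
    return flagged


# Made-up table: a misleading column name and a column with one real address.
table = {
    "email": ["n/a", "n/a", "n/a", "n/a"],
    "contact": ["call me", "a@example.com", "none", "none"],
}
```

Here the shallow scan flags only `email` (by name), while a deep scan that samples every row flags only `contact` (by content). With `sample_size=1`, the deep scan flags `contact` only when the draw happens to include the one matching row, which is why two runs can disagree.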
Motivation
As odd-platform already supports Redshift and other sources, it would be awesome to support BigQuery integration as well.
Add more logging in all modules to emit debug signals for easier troubleshooting.
-
Updated
May 13, 2020 - HTML
Intake-esm adds the attribute intake_esm_varname to datasets, and I have encountered cases where that ends up being None (still looking for the exact model).
Zarr does not like that type of metadata:
import xarray as xr
ds_test = xr.DataArray(5).to_dataset(name='test')
ds_test.attrs['test'] = None
ds_test.to_zarr('test.zarr')
gives
------------------------
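One possible workaround, sketched here under the assumption that the `None` values carry no information worth keeping, is to drop them from the attribute mapping before writing. The helper name `drop_none_attrs` is hypothetical and is not part of intake-esm, xarray, or Zarr.

```python
def drop_none_attrs(attrs):
    """Return a copy of an attribute mapping without None-valued entries."""
    return {key: value for key, value in attrs.items() if value is not None}


# Plain-dict stand-in for a dataset's attributes.
attrs = {"intake_esm_varname": None, "title": "test"}
cleaned = drop_none_attrs(attrs)
```

With xarray this would be applied as `ds.attrs = drop_none_attrs(ds.attrs)` before calling `ds.to_zarr(...)`. Attributes on individual variables may need the same treatment.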
Deliverables
- add unit tests
- add extractor
- add README.md in plugins/extractors/mariadb, defining output
- register your extractor in plugins/extractors/populate.go
- add the extractor to the extractor list in docs/reference/extractor.md
Output must contain a Table
Table
| Field | Sample Value |
|---|---|
| urn | `my_database.my_t |
-
Updated
Sep 30, 2021 - Python
-
Updated
Mar 24, 2021 - Java
- pattern= catalog : dataset name : url : comment
- ocean: World Ocean Atlas: https://www.nodc.noaa.gov/OC5/woa18/ : different versions and variables via parameter #15
- global carbon budget with https://github.com/edjdavid/intake-excel #22
- land: precipitation: https://psl.noaa.gov/data/gridded/tables/precipitation.html:
- Mauna Loa CO2 netcdf ftp://aftp.cmdl.noaa.go
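Entries that follow the pattern can be split into fields mechanically. The sketch below is one hypothetical way to do it; the regular expression and `parse_entry` are illustrative assumptions, and entries that do not follow the pattern (such as the free-form ones in the list) simply return `None`.

```python
import re

# catalog : dataset name : url : comment
# The URL is matched by its scheme so the colon in "https://" is not
# mistaken for a field separator. The comment is optional.
ENTRY = re.compile(
    r"^(?P<catalog>[^:]+):\s*"
    r"(?P<name>[^:]+):\s*"
    r"(?P<url>[a-z]+://\S*?)\s*"
    r"(?::\s*(?P<comment>.*))?$"
)


def parse_entry(line):
    """Split one entry into its four fields, or return None if it does not fit."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    return {key: (value or "").strip() for key, value in match.groupdict().items()}


entry = parse_entry(
    "ocean: World Ocean Atlas: https://www.nodc.noaa.gov/OC5/woa18/ "
    ": different versions and variables via parameter"
)
```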
-
Updated
Jul 9, 2021 - Shell
Currently we only support database store publishers (e.g. Neo4j, MySQL, Neptune). But it would be pretty easy to support a message queue publisher (e.g. SQS, Kinesis, Event Hubs, Kafka) using the same interface, which would enable a push ETL model.
There is a PR (amundsen-io/amundsendatabuilder#431) which unfortunately hasn't been merged. The PR could be used as an example of how to support t
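The idea can be sketched roughly as follows. Everything here is an assumption for illustration: the `Publisher` base class is a hypothetical minimal interface, not the project's actual publisher API, and an in-memory `queue.Queue` stands in for a real broker client.

```python
import json
import queue
from abc import ABC, abstractmethod


class Publisher(ABC):
    """Hypothetical minimal publisher interface, for illustration only."""

    @abstractmethod
    def publish(self, records):
        """Send a batch of metadata records to the destination."""


class QueuePublisher(Publisher):
    """Pushes each record as a JSON message onto a queue-like object.

    `message_queue` only needs a `put(message)` method; a real deployment
    would wrap an SQS, Kinesis, Event Hubs, or Kafka client behind it.
    """

    def __init__(self, message_queue):
        self._queue = message_queue

    def publish(self, records):
        count = 0
        for record in records:
            self._queue.put(json.dumps(record, sort_keys=True))
            count += 1
        return count


# In-memory stand-in for a message broker.
broker = queue.Queue()
publisher = QueuePublisher(broker)
sent = publisher.publish([{"table": "users", "owner": "data-eng"}])
```

Because the publisher depends only on a `put` method, swapping the in-memory queue for a thin wrapper around a broker client would not change the publishing code.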