Insights: apache/iceberg
Overview
31 Pull requests merged by 17 people
-
add Aggregate Expressions
#5961 merged
Oct 25, 2022 -
Spark 3.2: Ensure rowStartPosInBatch in ColumnarBatchReader is set correctly
#6041 merged
Oct 25, 2022 -
Spark 3.3: Ensure rowStartPosInBatch in ColumnarBatchReader is set correctly
#6026 merged
Oct 24, 2022 -
Build: Bump mkdocs from 1.3.1 to 1.4.1 in /python
#6033 merged
Oct 24, 2022 -
Core: Increase inferred column metrics limit to 100
#5916 merged
Oct 23, 2022 -
Add section on semantic versioning and deprecations
#6032 merged
Oct 23, 2022 -
Python: Implement S3V4RestSigner
#5969 merged
Oct 21, 2022 -
Python: Implement select
#5966 merged
Oct 21, 2022 -
Python: Visitor to convert Iceberg to PyArrow schema
#5949 merged
Oct 21, 2022 -
Core: Rename TableTestBase.Assertions to not conflict with AssertJ Assertions
#6022 merged
Oct 21, 2022 -
API: Update expression sanitization for relative dates and times
#5944 merged
Oct 21, 2022 -
Replace and ban hamcrest usage
#6030 merged
Oct 21, 2022 -
Replace Assert.fail usage with AssertJ fluent testing
#6029 merged
Oct 21, 2022 -
Hive: Set the Table owner on table creation
#5763 merged
Oct 21, 2022 -
docs:Add an example of CTAS with PARTITIONED BY (rebased, fix #3854)
#6020 merged
Oct 21, 2022 -
Python: Split expressions base
#5987 merged
Oct 21, 2022 -
Closes #5988 - Allow configuration of Hive MetastoreClient using Catalog properties
#5989 merged
Oct 21, 2022 -
Core: Deprecate HTTPClientFactory / Allow configuring ObjectMapper for HTTPClient
#5998 merged
Oct 21, 2022 -
Nessie: no longer push whole metadata JSON to Nessie
#5999 merged
Oct 21, 2022 -
Core: Don't fail scan planning if REST metric reporting fails
#6023 merged
Oct 20, 2022 -
[python_legacy] BOTO_STS_CLIENT lazy initialization
#5930 merged
Oct 20, 2022 -
Core,Spark: Refactor to move "copy-on-write" and "merge-on-read" literals to constants
#6006 merged
Oct 20, 2022 -
Python: Add support for providing SSL config for REST Catalog client.
#6019 merged
Oct 20, 2022 -
Orc: Support row group bloom filters
#5313 merged
Oct 20, 2022 -
Core: Parallelize the determining of reachable manifests during file cleanup
#5981 merged
Oct 19, 2022 -
Spark 3.2: Split SparkScan and SparkBatch
#6014 merged
Oct 19, 2022 -
Core: Fix TestSnapshotUtil time random disorder
#6015 merged
Oct 19, 2022 -
Spark 3.2: Remove redundant imports in SparkScan
#6016 merged
Oct 19, 2022 -
Spark 3.2: Add SparkChangelogTable
#6013 merged
Oct 18, 2022 -
Spark 3.2: Use ScanTaskGroup methods when computing stats
#6011 merged
Oct 18, 2022
18 Pull requests opened by 13 people
-
Core, API: Field metadata support
#6017 opened
Oct 19, 2022 -
[Docs] Update migrate behaviour with respect to drop_table in spark-procedures docs.
#6025 opened
Oct 20, 2022 -
Python: GlueCatalog Full Implementation
#6034 opened
Oct 23, 2022 -
Build: Add gaborkaszab as a collaborator
#6036 opened
Oct 24, 2022 -
Python: Fix Github pages
#6038 opened
Oct 24, 2022 -
AWS: Add AwsKmsClient implementation
#6040 opened
Oct 24, 2022 -
Core: Partial Update
#6043 opened
Oct 25, 2022 -
[iceberg-hive-metastore] Add support for group ownership
#6045 opened
Oct 25, 2022 -
Spark 3.1: Ensure rowStartPosInBatch in ColumnarBatchReader is set correctly
#6046 opened
Oct 25, 2022 -
Core: Replace projected Schema with schemaId/fieldIds/fieldNames in ScanReport
#6047 opened
Oct 25, 2022 -
Docs: Fix broken link for puffin in Spec
#6048 opened
Oct 25, 2022 -
Flink: Add Sink options to override the compression properties of the Table
#6049 opened
Oct 25, 2022 -
Core: Improve collection handling in JsonUtil
#6051 opened
Oct 25, 2022 -
Infra: Update slack invite link
#6052 opened
Oct 25, 2022 -
Build: Let revapi compare API compatibility against apache-iceberg-1.0.0
#6053 opened
Oct 25, 2022 -
Infra: Publish nightly build for Spark-3.3_2.13
#6054 opened
Oct 25, 2022 -
Spark 3.3: Use separate scan during file filtering in copy-on-write operations
#6055 opened
Oct 25, 2022 -
Parquet: Remove the row position since parquet row group has it natively
#6056 opened
Oct 25, 2022
8 Issues closed by 3 people
-
Schema Evolution exception: too many data columns
#4542 closed
Oct 25, 2022 -
Add min sequence number of referenced data files in a position-delete file's manifest entry
#3789 closed
Oct 22, 2022 -
Docs: Add an example of CTAS with PARTITIONED BY
#3854 closed
Oct 21, 2022 -
Allow configuration of HiveMetastoreClient using Catalog Properties
#5988 closed
Oct 21, 2022 -
can not delete while iceberg sql extention set
#6024 closed
Oct 20, 2022 -
iceberg support branch/tag, what is the difference between nessie and iceberg?
#4476 closed
Oct 20, 2022 -
Drop managed table after drop data
#3792 closed
Oct 19, 2022
9 Issues opened by 9 people
-
Spark3.2.2 rewriteDataFiles task yarn driver Stuck
#6050 opened
Oct 25, 2022 -
Column pruning/projection is not happening in correlated queries (e.g Q94, Q16)
#6044 opened
Oct 25, 2022 -
Add delete file information to partitions table
#6042 opened
Oct 24, 2022 -
Nessie: Switch to Nessie API v2
#6031 opened
Oct 21, 2022 -
rewrite datafile OOM bug
#6028 opened
Oct 21, 2022 -
metadata location wrong with hadoop HA on multi clusters
#6027 opened
Oct 21, 2022 -
When I use flink sql to synchronize MySQL data to icerberg (hive catalog), an error is reported.
#6021 opened
Oct 20, 2022 -
What is the expected behavior of expireOlderThan for a table with a tag that has not reached max age?
#6018 opened
Oct 19, 2022
41 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Core, API: Support incremental scanning with branch
#5984 commented on
Oct 25, 2022 • 24 new comments -
Spark: Fix a separate table cache being created for each rewriteFiles
#5392 commented on
Oct 21, 2022 • 17 new comments -
Flink: Support read options in flink source
#5967 commented on
Oct 25, 2022 • 17 new comments -
Core: Add option to combine tasks by partition
#2276 commented on
Oct 20, 2022 • 12 new comments -
API: Add view interfaces
#4925 commented on
Oct 25, 2022 • 12 new comments -
Spark 3.3: Add a procedure to generate table changes
#6012 commented on
Oct 21, 2022 • 10 new comments -
Core: Add file seq number to ManifestEntry
#6002 commented on
Oct 23, 2022 • 8 new comments -
Data: Support reading default values from generic Avro readers
#6004 commented on
Oct 23, 2022 • 7 new comments -
Doc: add assume role session name doc and remove redundant spark shell examples
#5994 commented on
Oct 25, 2022 • 5 new comments -
Python: Fix caching of the PyArrowFileIO
#6010 commented on
Oct 24, 2022 • 5 new comments -
API/Core: Add metadata field to NestedField
#5631 commented on
Oct 24, 2022 • 2 new comments -
Iceberg table maintenance/compaction within AWS
#5997 commented on
Oct 25, 2022 • 2 new comments -
API: Add default value API
#4732 commented on
Oct 21, 2022 • 2 new comments -
Spark Integration to read from Snapshot ref
#5150 commented on
Oct 23, 2022 • 2 new comments -
Core: Use explicit JSON Parser for namespace creation request
#5968 commented on
Oct 24, 2022 • 2 new comments -
Orc : Bug when adding a inner struct field as partition field
#4604 commented on
Oct 19, 2022 • 1 new comment -
Implement rate limiting while reading stream from Iceberg table as Spark3 DSv2 source
#2789 commented on
Oct 19, 2022 • 1 new comment -
Quick start docker-compose demo doesn't work
#5993 commented on
Oct 19, 2022 • 1 new comment -
Add Checkstyle Rule to prevent Map<StructLike, ...> and Set<StructLike>
#4616 commented on
Oct 20, 2022 • 1 new comment -
Is there a full example for Iceberg+Flink+Minio
#3968 commented on
Oct 21, 2022 • 1 new comment -
Delete files not eventually removed if RewriteDataFile run right after delete (when using 'use-starting-sequence-number' default)
#4127 commented on
Oct 21, 2022 • 1 new comment -
IcebergGenerics.read(table) not work for most kinds of metadata tables
#4523 commented on
Oct 22, 2022 • 1 new comment -
pip install pyiceberg on windows require C++ to be installed
#5901 commented on
Oct 22, 2022 • 1 new comment -
Pyflink+Iceberg+Kinesis
#4633 commented on
Oct 24, 2022 • 1 new comment -
missing SetWriteDistributionAndOrdering class for spark sql plan
#4628 commented on
Oct 25, 2022 • 1 new comment -
Nessie: Use unique path for different table with same name
#4826 commented on
Oct 24, 2022 • 1 new comment -
[Core]Add EncryptionManagerFactory to configure encryption via catalog properties and table metadata.
#5539 commented on
Oct 24, 2022 • 1 new comment -
[Flink] Avoid submitting too many empty snapshots
#5561 commented on
Oct 25, 2022 • 1 new comment -
Spark: Check for hive support when using SparkSessionCatalog
#5693 commented on
Oct 20, 2022 • 1 new comment -
Cache dropStats result for ManifestReader iterator
#5836 commented on
Oct 20, 2022 • 1 new comment -
API,Core: Introduce metrics for data files by file format
#5837 commented on
Oct 24, 2022 • 1 new comment -
Spark: Iceberg bug 5935 fix where some methods of Spark3Util do not set current session in spark's threadlocal
#5959 commented on
Oct 23, 2022 • 1 new comment -
Core: Optimize the TableScanContext
#5982 commented on
Oct 21, 2022 • 1 new comment -
Structured Streaming writes to iceberg table with non-identity partition spec breaks with spark extensions enabled
#5625 commented on
Oct 20, 2022 • 0 new comments -
Spark : Spark3Util is not setting the spark session being used as active session when executing sensitive functions
#5935 commented on
Oct 24, 2022 • 0 new comments -
Parquet: Support parquet modular encryption
#2639 commented on
Oct 24, 2022 • 0 new comments -
API: Optionally ignore position deletes in rewrite validation
#4703 commented on
Oct 21, 2022 • 0 new comments -
Encryption integration and test
#5544 commented on
Oct 24, 2022 • 0 new comments -
AWS: Fix catalog names in LakeFormationTestBase
#5767 commented on
Oct 20, 2022 • 0 new comments -
Core: Make TableScanContext immutable
#5985 commented on
Oct 19, 2022 • 0 new comments