Insights: apache/spark
Overview
- 0 Active issues
- 0 Merged pull requests
- 32 Open pull requests
- 0 Closed issues
- 0 New issues
32 Pull requests opened by 25 people
-
[SPARK-53254][PYTHON][TESTS] Skip hanging tests in PyPy daily workflow
#51983 opened
Aug 12, 2025 -
[SPARK-53260] Reducing number of JDBC overhead connections creation
#51991 opened
Aug 12, 2025 -
[SPARK-52844][PYTHON] Update pyyaml to 5.4
#51993 opened
Aug 12, 2025 -
[SPARK-53262][SS] Support schema evolution for streaming dedupe operation
#51996 opened
Aug 12, 2025 -
[SPARK-53012][PYTHON] Support Arrow Python UDTF in Spark Connect
#51998 opened
Aug 12, 2025 -
[SPARK-52482][DOCS][FOLLOW-UP] Mention behavior changes in migration guide
#51999 opened
Aug 13, 2025 -
temp:
#52001 opened
Aug 13, 2025 -
[SPARK-53264][SQL][CATALYST] Incorrect nullability when correlated scalar subquery ge…
#52003 opened
Aug 13, 2025 -
[SPARK-52969][SQL] Support DSv2 OrcScan Dynamic Partition Pruning
#52009 opened
Aug 13, 2025 -
[TEST-ONLY] Test PyPy 7.3.19 with Python 3.10
#52014 opened
Aug 13, 2025 -
[SPARK-52677][SQL] Simplify DataTypeUtils.canWrite and TableOutputResolver
#52016 opened
Aug 13, 2025 -
[SPARK-53265][PYTHON][DOCS] Add Arrow Python UDF Type Coercion Tables in Arrow Python UDF Docs
#52025 opened
Aug 14, 2025 -
[SPARK-53273][CONNECT][SQL] Make RegisterUserDefinedFunction in SparkConnectPlanner side effect free
#52026 opened
Aug 14, 2025 -
[SPARK-52336][CORE] Prepend Spark identifier to GCS user agent
#52027 opened
Aug 14, 2025 -
[SPARK-53275][SQL] Handle stateful expressions when ordering in interpreted mode
#52028 opened
Aug 14, 2025 -
[SPARK-53287][PS] Add ANSI Migration Guide
#52034 opened
Aug 15, 2025 -
[SPARK-53288][SS] Fix assertion error with streaming global limit
#52035 opened
Aug 15, 2025 -
[TEST-ONLY] A branch-3.5 PR to check the CI
#52040 opened
Aug 15, 2025 -
[SPARK-53295][PS] Turn on ANSI by default for Pandas API on Spark
#52045 opened
Aug 15, 2025 -
[SPARK-53293][SQL] Modify exprIdToOrdinal implementation for speedup on queries with wide tables
#52046 opened
Aug 15, 2025 -
[SPARK-53294][SS] Enable StateDataSource with state checkpoint v2 (only batchId option)
#52047 opened
Aug 15, 2025 -
[SPARK-52982][PYTHON] Disallow lateral join with Arrow Python UDTFs
#52048 opened
Aug 15, 2025 -
[SPARK-49872][CORE] Remove jackson JSON string length limitation
#52049 opened
Aug 16, 2025 -
[SPARK-53296][CORE] ESS exit main thread in case boss thread exits
#52050 opened
Aug 16, 2025 -
[SPARK-53298][SQL] Make an isolation to control Shuffle partitionSizeInBytes converted from `REBALANCE` hint
#52052 opened
Aug 17, 2025 -
[SPARK-53301][PYTHON] Differentiate type hints of Pandas UDF and Arrow UDF
#52054 opened
Aug 18, 2025 -
[SPARK-53303][SS][CONNECT] Use the empty state encoder when the initial state is not provided in TWS
#52056 opened
Aug 18, 2025 -
[SPARK-53306][SQL][CONNECT][YARN][TESTS] Fix wrong package statements
#52059 opened
Aug 18, 2025 -
[SPARK-53308][SQL] Don't remove aliases in RemoveRedundantAliases that would cause duplicates
#52060 opened
Aug 18, 2025 -
[SPARK-53311][SQL][PYTHON][CORE] make PullOutNonDeterministic use canonicalized expressions
#52061 opened
Aug 18, 2025
40 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[SPARK-52582][SQL] Improve the memory usage of XML parser
#51287 commented on
Aug 18, 2025 • 72 new comments -
[SPARK-53103][SS] Throw an error if state directory is not empty on batch 0
#51817 commented on
Aug 15, 2025 • 24 new comments -
[SPARK-52991][SQL] Implement MERGE INTO with SCHEMA EVOLUTION for V2 Data Source
#51698 commented on
Aug 17, 2025 • 22 new comments -
[SPARK-53207][SDP] Send Pipeline Event to Client Asynchronously
#51956 commented on
Aug 15, 2025 • 13 new comments -
[SPARK-53108][SQL] Implement the time_diff function in Scala
#51826 commented on
Aug 18, 2025 • 8 new comments -
[SPARK-52777][SQL] Enable shuffle cleanup mode configuration in Spark SQL
#51458 commented on
Aug 17, 2025 • 7 new comments -
[SPARK-53125][TEST] RemoteSparkSession prints whole `spark-submit` command
#51846 commented on
Aug 13, 2025 • 2 new comments -
[SPARK-53230][SQL] Assign a name to error class _LEGACY_ERROR_TEMP_1011
#51955 commented on
Aug 15, 2025 • 2 new comments -
[SPARK-53127][SQL] Enable LIMIT ALL to override recursion row limit
#51847 commented on
Aug 14, 2025 • 2 new comments -
[SPARK-52621][SQL] Cast TIME to/from VARIANT
#51553 commented on
Aug 11, 2025 • 2 new comments -
[SPARK-53030][PYTHON] Support Arrow writer for streaming Python data sources
#51757 commented on
Aug 12, 2025 • 1 new comment -
[SPARK-53174][CORE] Add TMPDIR environment variable with the value of java.io.tmpdir
#51902 commented on
Aug 13, 2025 • 1 new comment -
[SPARK-53193][DOCS] Add advanced JVM optimization parameters to tuning guide
#51920 commented on
Aug 12, 2025 • 1 new comment -
[SPARK-53198][CORE] Support terminating driver JVM after SparkContext is stopped
#51929 commented on
Aug 15, 2025 • 1 new comment -
[SPARK-53212][PYTHON] improve error handling for scalar Pandas UDFs
#51937 commented on
Aug 12, 2025 • 1 new comment -
[SPARK-53209][YARN] Add ActiveProcessorCount JVM option to YARN executor and AM
#51948 commented on
Aug 16, 2025 • 1 new comment -
[SPARK-53148][CONNECT][SQL] Make SqlCommand in SparkConnectPlanner side effect free
#51903 commented on
Aug 15, 2025 • 0 new comments -
[IN PROGRESS][K8S] Introduce pending pod limit per ResourceProfile
#51913 commented on
Aug 15, 2025 • 0 new comments -
[SPARK-53156][CORE] Track Driver Memory Metrics when the Application ends
#51882 commented on
Aug 14, 2025 • 0 new comments -
[SPARK-53144][CONNECT][SQL] Make CreateViewCommand in SparkConnectPlanner side effect free
#51874 commented on
Aug 18, 2025 • 0 new comments -
[SPARK-53143][SQL] Fix self join in DataFrame API - Join is not the only expected output from analyzer
#51873 commented on
Aug 12, 2025 • 0 new comments -
[SPARK-53142][SQL] Support dynamic expression addition in SemanticComparator
#51871 commented on
Aug 12, 2025 • 0 new comments -
[SPARK-53182][PYTHON][DOCS] Fix broken and missing links in PySpark DataFrames user guide
#51851 commented on
Aug 13, 2025 • 0 new comments -
[WIP][TESTS] Upgrade pypy to 3.11
#51966 commented on
Aug 13, 2025 • 0 new comments -
[SPARK-53105][Structured Streaming] Fix tests for checkpoint v2 in RocksDBSuite
#51834 commented on
Aug 11, 2025 • 0 new comments -
[SPARK-53112][SQL][PYTHON][CONNECT] Support TIME in the make_timestamp_ntz and try_make_timestamp_ntz functions in PySpark
#51831 commented on
Aug 18, 2025 • 0 new comments -
[SPARK-53109][SQL] Support TIME in the make_timestamp_ntz and try_make_timestamp_ntz functions in Scala
#51828 commented on
Aug 18, 2025 • 0 new comments -
[SPARK-42360][SQL] Rule to convert Left Outer Join with suitable filter to Left Anti Join
#51762 commented on
Aug 11, 2025 • 0 new comments -
[SPARK-52844][PYTHON] Update protobuf to 5.29.5
#51747 commented on
Aug 15, 2025 • 0 new comments -
[SPARK-53015][BUILD] Upgrade log4j to 2.25.1
#51719 commented on
Aug 18, 2025 • 0 new comments -
[SPARK-52844][PYTHON][TESTS] Update black to 24.3.0
#51687 commented on
Aug 13, 2025 • 0 new comments -
[SPARK-33737][K8S] Support getting pod state using Informers + Listers
#51396 commented on
Aug 15, 2025 • 0 new comments -
[SPARK-50603][SQL] Respect user-provided basePath for streaming file source reads without glob
#51267 commented on
Aug 15, 2025 • 0 new comments -
[SPARK-51168][BUILD] Test Hadoop 3.4.2
#51127 commented on
Aug 18, 2025 • 0 new comments -
[SPARK-52041][CORE] Add better support for integrating with external cluster manager
#50770 commented on
Aug 18, 2025 • 0 new comments -
[SPARK-49547][SQL][PYTHON] Add iterator of `RecordBatch` API to `applyInArrow`
#49005 commented on
Aug 15, 2025 • 0 new comments -
[SPARK-22876][YARN] Respect YARN AM failure validity interval
#42570 commented on
Aug 13, 2025 • 0 new comments -
[SPARK-44639][SS][YARN] Use Java tmp dir for local RocksDB state storage on Yarn
#42301 commented on
Aug 15, 2025 • 0 new comments -
[SPARK-37019][SQL] Add codegen support to array higher-order functions
#34558 commented on
Aug 13, 2025 • 0 new comments -
[SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions
#32987 commented on
Aug 15, 2025 • 0 new comments