Pulse · apache/spark · GitHub

May 13, 2019 – May 20, 2019

Overview

34 Active Pull Requests
- 0 Merged Pull Requests
- 34 Proposed Pull Requests

0 Active Issues
- 0 Closed Issues
- 0 New Issues

34 Pull requests proposed by 29 people

Proposed #24593 [SPARK-27692][SQL] Add new optimizer rule to evaluate the deterministic scala udf only once if all inputs are literals May 13, 2019
Proposed #24597 [SPARK-27698][SQL] Add new method for getting pushed down filters in Parquet file reader May 14, 2019
Proposed #24599 [SPARK-27701][SQL] Extend NestedColumnAliasing to more nested field cases May 14, 2019
Proposed #24601 [SPARK-27702][K8S] Allow using some alternatives for service accounts May 14, 2019
Proposed #24603 [SPARK-27706][SQL][WEBUI] Add SQL metrics of numOutputRows for BroadcastExchangeExec May 14, 2019
Proposed #24605 [SPARK-27711][CORE] Unset InputFileBlockHolder at the end of tasks May 14, 2019
Proposed #24609 [SPARK-27715][SQL][UI] SQL query details in UI dose not show in correct format. May 15, 2019
Proposed #24610 [SPARK-27716][SQL] Complete the transactions support for part of jdbc datasource operations. May 15, 2019
Proposed #24611 [SPARK-27717][SS] support UNION in continuous processing May 15, 2019
Proposed #24613 [SPARK-27549][SS] Add support for committing kafka offsets per batch for supporting external tooling May 15, 2019
Proposed #24615 [SPARK-27488][CORE] Driver interface to support GPU resources May 15, 2019
Proposed #24616 [SPARK-27726] [Core] Fix performance of ElementTrackingStore deletes when using InMemoryStore under high loads May 15, 2019
Proposed #24617 [SPARK-27732][SQL] Add v2 CreateTable implementation. May 15, 2019
Proposed #24618 [SPARK-27734][CORE][SQL][WIP] Add memory based thresholds for shuffle spill May 15, 2019
Proposed #24620 [SPARK-27737][SQL] Upgrade to 2.3.5 for Hive Metastore Client 2.3 May 15, 2019
Proposed #24623 [SPARK-27739][SQL] df.persist should save stats from optimized plan May 16, 2019
Proposed #24624 [SPARK-27743][SQL] alter table bucket May 16, 2019
Proposed #24625 [SPARK-27744][SQL] preserve spark properties on async subquery tasks May 16, 2019
Proposed #24626 [SPARK-27747][SQL] add a logical plan link in the physical plan May 16, 2019
Proposed #24627 [SPARK-27748][SS] Kafka consumer/producer password/token redaction. May 16, 2019
Proposed #24628 [SPARK-27749][SQL][test-hadoop3.2][test-maven] hadoop-3.2 support hive-thriftserver May 16, 2019
Proposed #24631 [SPARK-27774][CORE][EXAMPLES][MLLIB] Avoid hardcoded configs May 17, 2019
Proposed #24634 [SPARK-27361][YARN] YARN support for GPU-aware scheduling May 17, 2019
Proposed #24635 [SPARK-27762][SQL] Support user provided avro schema for writing fields with different ordering May 17, 2019
Proposed #24636 [SPARK-27684][SQL] Avoid conversion overhead for primitive types May 18, 2019
Proposed #24637 [SPARK-27707][SQL] Prune unnecessary nested fields from Generate to address performance issue in explode May 18, 2019
Proposed #24639 [SPARK-27699][FOLLOW-UP][SQL][test-hadoop3.2][test-maven] Fix hadoop-3.2 test error May 19, 2019
Proposed #24640 [SPARK-27770] [SQL] [TEST] Port AGGREGATES.sql [Part 1] May 19, 2019
Proposed #24643 [SPARK-26412][PySpark][SQL][WIP] Allow Pandas UDF to take an iterator of pd.Series or an iterator of tuple of pd.Series May 19, 2019
Proposed #24644 [SPARK-27402][INFRA][FOLLOW-UP] Exclude 'hive-thriftserver' in modules to test for hadoop3.2 for now May 20, 2019
Proposed #24645 [SPARK-27773][Shuffle] add metrics for number of exceptions caught in shuffle service's TransportChannelHandler May 20, 2019
Proposed #24646 [SPARK-27757][CORE] Bump Jackson to 2.9.9 May 20, 2019
Proposed #24647 [SPARK-27776][SQL]Avoid duplicate Java reflection in DataSource. May 20, 2019
Proposed #24648 [SPARK-27777][ML] Eliminate uncessary sliding job in AreaUnderCurve May 20, 2019

47 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

39 new comments Open #24533 [SPARK-27637][Shuffle] For nettyBlockTransferService, if IOException occurred while fetching data, check whether relative executor is alive before retry May 20, 2019
36 new comments Open #24565 [SPARK-27665][Core] Split fetch shuffle blocks protocol from OpenBlocks May 17, 2019
23 new comments Open #24499 [SPARK-27677][Core] Serve local disk persisted blocks by the external service after releasing executor by dynamic allocation May 18, 2019
21 new comments Open #23546 [SPARK-23153][K8s] Support client dependencies with a Hadoop Compatible File System May 17, 2019
21 new comments Open #24221 [SPARK-27248][SQL] `refreshTable` should recreate cache with same cache name and storage level May 20, 2019
18 new comments Open #24569 [SPARK-23191][CORE] Warn rather than terminate when duplicate worker register happens May 15, 2019
13 new comments Open #23992 [SPARK-27074][SQL] Hive 3.1 metastore support HiveClientImpl.runHive May 20, 2019
13 new comments Open #24523 [SPARK-27631][SQL] Avoid repeating calculate table statistics May 16, 2019
12 new comments Open #20430 [SPARK-23263][TEST] CTAS should update stat if autoUpdate statistics is enabled May 20, 2019
12 new comments Open #24556 [SPARK-27641][CORE] Fix MetricsSystem to remove unregistered source correctly May 16, 2019
11 new comments Open #24374 [WIP][SPARK-27366][CORE] Support GPU Resources in Spark job scheduling May 15, 2019
10 new comments Open #24233 [SPARK-26356][SQL] remove SaveMode from data source v2 May 20, 2019
9 new comments Open #21586 [SPARK-24586][SQL] Upcast should not allow casting from string to other types May 20, 2019
9 new comments Open #24575 [SPARK-27670][SQL]Add HA for HiveThriftServer2 based on HiveServer2. May 15, 2019
8 new comments Open #24497 [SPARK-27630][CORE]Stage retry causes totalRunningTasks calculation to be negative May 20, 2019
7 new comments Open #24219 [SPARK-27258][K8S]Deal with the k8s resource names that don't match their own regular expression May 16, 2019
5 new comments Open #24566 [SPARK-27667][SQL]get the current database from spark catalog instead of querying the Hive May 17, 2019
5 new comments Open #24585 [Spark-27664][SQL] Performance issue while listing large number of files on an object store. May 19, 2019
4 new comments Open #23767 [SPARK-26329][CORE] Faster polling of executor memory metrics. May 17, 2019
4 new comments Open #24044 [WIP][test-hadoop3.2] Test Hadoop 3.2 on jenkins May 17, 2019
4 new comments Open #24490 [SPARK-27300][GRAPH][test-maven] Add Spark Graph modules and dependencies May 18, 2019
4 new comments Open #24530 [SPARK-27520][CORE][WIP] Introduce a global config system to replace hadoopConfiguration May 16, 2019
3 new comments Open #24011 [SPARK-27071][CORE] Expose additional metrics in status.api.v1.StageData May 16, 2019
3 new comments Open #24068 [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion May 15, 2019
3 new comments Open #24335 [SPARK-27425][SQL] Add count_if functions May 20, 2019
3 new comments Open #24372 [SPARK-27462][SQL] Enhance insert into hive table that could choose some columns in target table flexibly. May 16, 2019
3 new comments Open #24525 [SPARK-27633][SQL] Remove redundant aliases in NestedColumnAliasing May 15, 2019
2 new comments Open #19096 [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. May 13, 2019
2 new comments Open #24367 [SPARK-27457][SQL] modify bean encoder to support avro objects May 18, 2019
2 new comments Open #24559 [SPARK-27658][SQL] Add FunctionCatalog API May 16, 2019
1 new comment Open #17862 [SPARK-20602] [ML]Adding LBFGS optimizer and Squared_hinge loss for LinearSVC May 17, 2019
1 new comment Open #17864 [SPARK-20604][ML] Allow imputer to handle numeric types May 15, 2019
1 new comment Open #18784 [SPARK-21559][Mesos] remove mesos fine-grained mode May 16, 2019
1 new comment Open #19410 [SPARK-22184][CORE][GRAPHX] GraphX fails in case of insufficient memory and checkpoints enabled May 17, 2019
1 new comment Open #20303 [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL May 14, 2019
1 new comment Open #22138 [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer May 13, 2019
1 new comment Open #23267 [SPARK-25401] [SQL] Reorder join predicates to match child outputOrdering May 15, 2019
1 new comment Open #23634 [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows May 13, 2019
1 new comment Open #24076 [SPARK-27142] Provide REST API for SQL level information May 16, 2019
1 new comment Open #24327 [WIP][SPARK-27418][SQL] Migrate Parquet to File Data Source V2 May 14, 2019
1 new comment Open #24344 [SPARK-27440][SQL] Optimize uncorrelated predicate subquery May 16, 2019
1 new comment Open #24382 [SPARK-27330][SS] support task abort in foreach writer May 15, 2019
1 new comment Open #24405 [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas May 13, 2019
1 new comment Open #24442 [SPARK-27547][SQL] Fix DataFrame self-join problems May 16, 2019
1 new comment Open #24553 [SPARK-27604][SQL] Enhance constant propagation May 13, 2019
1 new comment Open #24560 [SPARK-27661][SQL] Add SupportsNamespaces API May 15, 2019
0 new comments Open #24299 [SPARK-27388][SQL] encoder for objects defined by properties (ie. Avro) May 18, 2019