[SPARK-35142][PYTHON][ML] Fix incorrect return type for rawPredictionUDF in OneVsRestModel
#32245
@@ -3151,7 +3151,7 @@ def func(predictions):
        predArray.append(x)
    return Vectors.dense(predArray)

rawPredictionUDF = udf(func)
harupy
Apr 20, 2021
Author
Contributor
Should I add a test here to ensure that the rawPrediction column is no longer a string?
spark/python/pyspark/ml/tests/test_algorithms.py, lines 108 to 117 in 0494dc9
HyukjinKwon
Apr 20, 2021
Member
Yeah, I think we had better add a test if possible.
harupy
Apr 20, 2021
Author
Contributor
Got it, added a test.
WeichenXu123
Apr 20, 2021
Contributor
@HyukjinKwon
Why does only transformed_df.head() trigger this error?
Does it indicate a bug in the PySpark SQL UDF?
HyukjinKwon
Apr 21, 2021
Member
Seems like pred.show() triggers an exception too? What does it return with other methods?
ok to test
add to whitelist
cc @WeichenXu123 FYI
Test build #137665 has finished for PR 32245 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #137666 has finished for PR 32245 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #137668 has finished for PR 32245 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
LGTM
Test build #137708 has finished for PR 32245 at commit
Kubernetes integration test unable to build dist. exiting with code: 1
Test build #137713 has finished for PR 32245 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
LGTM
Looks good. @harupy, would you mind filling in the PR description per the template?
@viirya, are you preparing the Spark 2.4 RC now? This fix should go into Spark 2.4 too, but it isn't a regression, so it doesn't block the release. It's just nice to have, so if you're already preparing the RC, it's fine not to backport it.
BTW, the tests passed at https://github.com/harupy/spark/actions/runs/769366516. GitHub Actions didn't link that run properly for some reason. I will leave it to @WeichenXu123 then.
…nUDF` in `OneVsRestModel`

### What changes were proposed in this pull request?

Fixes incorrect return type for `rawPredictionUDF` in `OneVsRestModel`.

### Why are the changes needed?

Bugfix

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit test.

Closes #32245 from harupy/SPARK-35142.

Authored-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
(cherry picked from commit b6350f5)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Backporting to branch-3.1 causes conflicts.
@WeichenXu123 Opened a PR: #32269
I don't see a backport to 2.4. Do you plan to backport it, @WeichenXu123 @harupy?
@viirya Got it. I'll open another PR for 2.4. Wait, does spark/python/pyspark/ml/classification.py Lines 1964 to 2009 in 1630d64
Okay, looks like we can skip Spark 2.4.
Thanks for confirming. @harupy @HyukjinKwon
What changes were proposed in this pull request?
Fixes incorrect return type for rawPredictionUDF in OneVsRestModel.
Why are the changes needed?
Bugfix
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit test.