[REVIEW] fixing documentation issues in tf example notebook #267

Merged
benfred merged 1 commit into NVIDIA:main on Sep 3, 2020

Conversation

@alecgunny
Collaborator

@alecgunny alecgunny commented Sep 3, 2020

Addressing #258

@alecgunny alecgunny requested a review from benfred Sep 3, 2020
@nvidia-merlin-bot
Collaborator

@nvidia-merlin-bot nvidia-merlin-bot commented Sep 3, 2020

Click to view CI Results
GitHub pull request #267 of commit 36e465bef64e93d5ac80ac8916c6a0f2fc37d5a6, no merge conflicts.
Running as SYSTEM
Setting status of 36e465bef64e93d5ac80ac8916c6a0f2fc37d5a6 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/756/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/267/*:refs/remotes/origin/pr/267/* # timeout=10
 > git rev-parse 36e465bef64e93d5ac80ac8916c6a0f2fc37d5a6^{commit} # timeout=10
Checking out Revision 36e465bef64e93d5ac80ac8916c6a0f2fc37d5a6 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 36e465bef64e93d5ac80ac8916c6a0f2fc37d5a6 # timeout=10
Commit message: "fixing issues in tf example notebook"
 > git rev-list --no-walk 1cc040afe9807c1865356d7baec631ec270ae4bb # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins8029572781434215054.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.1.1
    Uninstalling nvtabular-0.1.1:
      Successfully uninstalled nvtabular-0.1.1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
61 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, hypothesis-5.28.0, asyncio-0.12.0, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 429 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 11%]
.......... [ 13%]
tests/unit/test_io.py .................................................. [ 25%]
............................ [ 32%]
tests/unit/test_notebooks.py .... [ 33%]
tests/unit/test_ops.py ................................................. [ 44%]
........................................................................ [ 61%]
...................................... [ 70%]
tests/unit/test_s3.py .. [ 70%]
tests/unit/test_tf_dataloader.py ............ [ 73%]
tests/unit/test_torch_dataloader.py ............... [ 76%]
tests/unit/test_workflow.py ............................................ [ 87%]
....................................................... [100%]

=============================== warnings summary ===============================
/opt/conda/envs/rapids/lib/python3.7/site-packages/pandas/util/__init__.py:12
/opt/conda/envs/rapids/lib/python3.7/site-packages/pandas/util/__init__.py:12: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
import pandas.util.testing

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43027 instead
http_address["port"], self.http_server.port

tests/unit/test_torch_dataloader.py::test_empty_cols[parquet]
tests/unit/test_torch_dataloader.py::test_empty_cols[parquet]
tests/unit/test_torch_dataloader.py::test_empty_cols[parquet]
tests/unit/test_torch_dataloader.py::test_empty_cols[parquet]
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:660: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:76: UserWarning: Row group size 28392 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:76: UserWarning: Row group size 29960 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:76: UserWarning: Row group size 29568 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:76: UserWarning: Row group size 60480 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:97: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 7 0 0 0 100%
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 2 4 1 81% 31->32, 32, 37
nvtabular/io/dataset.py 99 9 46 8 88% 94->95, 95, 107->108, 108, 116->117, 117, 125->137, 130->135, 135-137, 212->213, 213, 227->228, 228-229, 247->248, 248
nvtabular/io/dataset_engine.py 12 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 153 5 50 5 95% 100->102, 102-104, 112->114, 114, 140->141, 141, 236->238, 244->249
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 119 9 42 2 92% 29, 46, 70->71, 71, 109, 112, 173->174, 174, 195-197
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 207 15 56 4 92% 86, 98-106, 134->135, 135, 181, 194, 269->271, 284->285, 285, 308->309, 309-310
nvtabular/loader/tensorflow.py 109 16 46 10 82% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 244-253, 268->269, 269, 288->289, 289, 296->297, 297, 298->301, 301, 306->307, 307
nvtabular/loader/tf_utils.py 51 7 20 5 83% 13->16, 16->18, 23->25, 26->27, 27, 34-35, 40->48, 43-48
nvtabular/loader/torch.py 33 0 4 0 100%
nvtabular/ops/__init__.py 20 0 0 0 100%
nvtabular/ops/categorify.py 356 54 188 37 81% 147->148, 148, 156->161, 161, 171->172, 172, 216->217, 217, 260->261, 261, 264->270, 340->341, 341-343, 345->346, 346, 347->348, 348, 366->369, 369, 380->381, 381, 387->390, 413->414, 414-415, 417->418, 418-419, 421->422, 422-438, 440->444, 444, 448->449, 449, 450->451, 451, 458->459, 459, 462->465, 465->466, 466, 469->473, 473-476, 486->487, 487, 489->492, 494->511, 511-514, 537->538, 538, 541->542, 542, 543->544, 544, 551->552, 552, 553->556, 556, 663->664, 664, 665->666, 666, 687->702, 727->732, 730->731, 731, 741->738, 746->738
nvtabular/ops/clip.py 25 3 10 4 80% 50->51, 51, 59->60, 60, 64->66, 66->67, 67
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 21 1 4 1 92% 71->72, 72
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 55->56, 56, 85->86, 86
nvtabular/ops/filter.py 17 1 2 1 89% 43->44, 44
nvtabular/ops/groupby_statistics.py 79 3 30 3 94% 145->146, 146, 150->172, 179->180, 180, 204
nvtabular/ops/hash_bucket.py 30 4 16 2 83% 31->32, 32-34, 35->38, 38
nvtabular/ops/join_external.py 66 4 26 5 90% 81->82, 82, 83->84, 84, 98->101, 101, 114->118, 155->156, 156
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 24 2 8 2 88% 49->50, 50, 51->52, 52
nvtabular/ops/logop.py 17 1 4 1 90% 37->38, 38
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 33 1 2 0 97% 60
nvtabular/ops/normalize.py 49 4 14 4 84% 49->50, 50, 57->56, 90->91, 91, 100->102, 102-103
nvtabular/ops/operator.py 19 1 8 2 89% 43->42, 45->46, 46
nvtabular/ops/stat_operator.py 9 0 0 0 100%
nvtabular/ops/target_encoding.py 70 2 10 2 95% 150->151, 151-152, 202->205
nvtabular/ops/transform_operator.py 41 3 10 2 90% 42-46, 69->71, 88->89, 89
nvtabular/utils.py 17 3 6 3 74% 22->23, 23, 25->26, 26, 33->34, 34
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 420 38 230 24 89% 99->103, 103, 109->110, 110-114, 144->exit, 160->exit, 176->exit, 192->exit, 245->247, 295->296, 296, 375->378, 378, 403->404, 404, 410->413, 413, 476->477, 477, 495->497, 497-506, 517->516, 566->571, 571, 574->575, 575, 610->611, 611, 658->649, 724->735, 735, 758-788, 816->817, 817, 830->833, 863->864, 864-866, 870->871, 871, 904->905, 905
setup.py 2 2 0 0 0% 18-20

TOTAL 2612 225 996 148 88%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 88.50%
================= 429 passed, 17 warnings in 515.54s (0:08:35) =================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5262642054674194196.sh

@benfred
benfred approved these changes Sep 3, 2020
@benfred benfred merged commit 3cab153 into NVIDIA:main Sep 3, 2020
1 check passed
Jenkins Unit Test Run Success
@benfred benfred mentioned this pull request Sep 3, 2020
3 participants