gpu
Here are 2,025 public repositories matching this topic...
At the moment the relu_layer op does not allow configuring a threshold, whereas the legacy RELU op does.
We should add a threshold configuration option to relu_layer.
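As a rough illustration of what a configurable threshold means, here is a minimal pure-Python sketch. The function name `thresholded_relu` and the `threshold` parameter are hypothetical, and the convention used (values at or below the threshold become zero) is an assumption; the legacy RELU op may define its cutoff differently.

```python
def thresholded_relu(values, threshold=0.0):
    """Pass x through when x > threshold, otherwise emit 0.0.

    Assumption: this is one common convention for a thresholded ReLU,
    not necessarily the one used by the legacy RELU op.
    """
    return [x if x > threshold else 0.0 for x in values]


if __name__ == "__main__":
    data = [-1.0, 0.5, 2.0]
    print(thresholded_relu(data))                 # default threshold of 0.0
    print(thresholded_relu(data, threshold=1.0))  # stricter cutoff
```

With the default threshold this reduces to the ordinary ReLU, so an added option along these lines would not change existing behaviour.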
Problem: the approximate method can still be slow for many trees
catboost version: master
Operating System: ubuntu 18.04
CPU: i9
GPU: RTX2080
It would be good to be able to specify how many trees to use for Shapley values. The model.predict and prediction_type versions allow this, as do lgbm and xgb.
As seen in openwall/john#4530 (comment):
Benchmarking: sspr-opencl, NetIQ SSPR / Adobe AEM [MD5/SHA1/SHA2 OpenCL]... Warning: binary() returned misaligned pointer
DONE
This is because opencl_sspr_fmt_plug.c wrongly has:
#define BINARY_ALIGN MEM_ALIGN_WORD
whereas the code only guarantees alignment appropriate for
Hi,
I have tried both loss.backward() and model_engine.backward(loss) in my code and observed several subtle differences. For one, retain_graph=True does not work with model_engine.backward(loss). This is a problem because, for some reason, buffers are not being retained each time I run the code.
Please look into this if you can.
Is your feature request related to a problem? Please describe.
It might be useful to have a single clean and performant way to check whether all the columns of a dataframe have the same dtype, such as a DataFrame property _is_homogeneous. This comes up in a lot of places, such as where we might want to dispatch to a cupy matrix implementation (Transpose, some row-wise reductions I believe
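The check being requested can be sketched without any dataframe library. The example below is only an analogy: it models a frame as a dict of stdlib `array.array` columns and compares their typecodes. The function name `is_homogeneous` and this dict-of-arrays layout are assumptions for illustration, not the library's API.

```python
from array import array


def is_homogeneous(columns):
    """Return True when every column shares one element type.

    `columns` is a hypothetical stand-in for a dataframe: a dict mapping
    column names to array.array objects. An empty frame counts as
    homogeneous.
    """
    typecodes = {col.typecode for col in columns.values()}
    return len(typecodes) <= 1


if __name__ == "__main__":
    floats_only = {"a": array("d", [1.0, 2.0]), "b": array("d", [3.0, 4.0])}
    mixed = {"a": array("d", [1.0, 2.0]), "b": array("i", [3, 4])}
    print(is_homogeneous(floats_only))
    print(is_homogeneous(mixed))
```

A property like this lets callers decide once whether the whole frame can be treated as a single typed matrix, instead of re-deriving that at each call site.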
The current implementation of join can be improved by performing the operation in a single call to the backend kernel instead of multiple calls.
This is a fairly easy kernel and may be a good issue for someone getting to know CUDA/ArrayFire internals. Ping me if you want additional info.
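The idea of joining in one pass rather than through repeated pairwise calls can be illustrated on the CPU with plain Python lists. This is a one-dimensional sketch under stated assumptions, not ArrayFire or CUDA code; `join_pairwise` and `join_single_pass` are hypothetical names.

```python
def join_pairwise(arrays):
    """Join by repeated two-input concatenation.

    Each step copies everything accumulated so far, which mirrors issuing
    one backend call per additional input.
    """
    result = []
    for a in arrays:
        result = result + list(a)  # allocates and copies a new buffer
    return result


def join_single_pass(arrays):
    """Join by sizing the output once and copying each input exactly once."""
    total = sum(len(a) for a in arrays)
    out = [None] * total
    offset = 0
    for a in arrays:
        out[offset:offset + len(a)] = a  # same-length slice assignment
        offset += len(a)
    return out


if __name__ == "__main__":
    parts = [[1, 2], [3], [4, 5, 6]]
    print(join_single_pass(parts))
```

Both functions produce the same result; the single-pass version touches each input element once, which is the property a single kernel launch would aim for.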
We would like to forward a particular 'key' column, which is part of the features, so that it appears alongside the predictions. This makes it possible to identify which set of features a particular prediction belongs to. Here is an example of predictions output using the tensorflow.contrib.estimator.multi_class_head:
{"classes": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
"scores": [0.068196
The names map and input are mistakenly swapped. Judging by the Preconditions paragraph, I suppose they should be exchanged, because there is no problem when map and result coincide (in the current context).
Add support for torch.max with:
Motivation
Currently, torch.max has support for CUDA float16:
But all three other combinations of CPU/CUDA and float16/bfloat16 are not supported: