gpu
Here are 2,066 public repositories matching this topic...
At the moment the relu_layer op doesn't allow threshold configuration, while the legacy RELU op does. We should add a configuration option to relu_layer.
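As a sketch of what the requested option would compute, here is one plausible semantics for a thresholded ReLU in plain Python (the `threshold` parameter name and the "zero out values at or below the threshold" behavior are assumptions, not the op's confirmed spec):

```python
def relu(x, threshold=0.0):
    """Thresholded ReLU sketch: keep values strictly above `threshold`,
    zero out everything else. With threshold=0.0 this is the standard ReLU."""
    return [v if v > threshold else 0.0 for v in x]

print(relu([-1.0, 0.5, 2.0]))                 # -> [0.0, 0.5, 2.0]
print(relu([-1.0, 0.5, 2.0], threshold=1.0))  # -> [0.0, 0.0, 2.0]
```

An alternative semantics (shifting the output by the threshold rather than gating on it) is also conceivable, so the configuration option should document which one it implements.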
Problem: the approximate method can still be slow for many trees
catboost version: master
Operating System: Ubuntu 18.04
CPU: i9
GPU: RTX 2080
It would be good to be able to specify how many trees to use for Shapley value computation. model.predict with prediction_type allows this, and LightGBM/XGBoost allow it as well.
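Because a boosted ensemble is additive over its trees, limiting the computation to the first k trees is just truncating a sum, which is why the parameter is cheap to support. A minimal sketch with a toy ensemble (the `ntree_limit` name is borrowed from XGBoost-style APIs and is hypothetical here):

```python
# Toy additive ensemble: each "tree" maps x to a contribution.
trees = [lambda x: 0.5 * x, lambda x: 1.0, lambda x: -0.25 * x]

def predict(x, ntree_limit=None):
    """Sum contributions from the first `ntree_limit` trees (all by default)."""
    used = trees if ntree_limit is None else trees[:ntree_limit]
    return sum(t(x) for t in used)

print(predict(4.0))                 # all 3 trees: 2.0 + 1.0 - 1.0 = 2.0
print(predict(4.0, ntree_limit=2))  # first 2 trees: 2.0 + 1.0 = 3.0
```

The same prefix-truncation applies per-tree to SHAP contributions, since they are likewise summed across trees.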
As seen in openwall/john#4530 (comment):
Benchmarking: sspr-opencl, NetIQ SSPR / Adobe AEM [MD5/SHA1/SHA2 OpenCL]... Warning: binary() returned misaligned pointer
DONE
This is because opencl_sspr_fmt_plug.c wrongly has:
#define BINARY_ALIGN MEM_ALIGN_WORD
whereas the code only guarantees alignment appropriate for
Hi,
I have tried out both loss.backward() and model_engine.backward(loss) for my code. There are several subtle differences that I have observed; for one, retain_graph=True does not work with model_engine.backward(loss). This is creating a problem, since buffers are not being retained each time I run the code for some reason.
Please look into this if you could.
Currently, aggregation APIs (groupby, reductions, rolling, etc.) are scattered across multiple files, and there are inconsistencies between the directory structures in cpp/include/, cpp/src/, cpp/tests/, and cpp/benchmarks/. For example:
cpp/include/:
- include/cudf/aggregation.hpp
- include/cudf/groupby.hpp
- include/cudf/rolling.hpp
- ....
cpp/src/:
- src/aggregati
The current implementation of join can be improved by performing the operation in a single call to the backend kernel instead of multiple calls.
This is a fairly easy kernel and may be a good issue for someone getting to know CUDA/ArrayFire internals. Ping me if you want additional info.
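The difference between the two launch patterns can be sketched in plain Python, with list concatenation standing in for the backend kernel (the function names and call-counting are illustrative, not ArrayFire's actual API):

```python
def join_pairwise(arrays):
    """Current pattern: N-1 separate 'kernel calls', one per pairwise join."""
    out, calls = arrays[0], 0
    for a in arrays[1:]:
        out = out + a   # each concatenation stands in for one backend launch
        calls += 1
    return out, calls

def join_single(arrays):
    """Proposed pattern: one call that writes every input into the output."""
    out = [x for a in arrays for x in a]
    return out, 1

arrays = [[1, 2], [3], [4, 5]]
print(join_pairwise(arrays))  # -> ([1, 2, 3, 4, 5], 2)
print(join_single(arrays))    # -> ([1, 2, 3, 4, 5], 1)
```

Both produce the same result; the single-call version avoids the per-launch overhead and the intermediate allocations of the pairwise version.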
We would like to forward a particular 'key' column, which is part of the features, so that it appears alongside the predictions - this is to be able to identify which set of features a particular prediction belongs to. Here is an example of predictions output using the tensorflow.contrib.estimator.multi_class_head:
{"classes": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
"scores": [0.068196
PR NVIDIA/cub#218 fixes this in CUB's radix sort. We should:
- Check whether Thrust's other backends handle this case correctly.
- Provide a guarantee of this in the stable_sort documentation.
- Add regression tests to enforce this on all backends.
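The shape of such a stability regression test can be sketched in Python (Python's `sorted` is stable, standing in here for a Thrust stable_sort backend): tag each record with its input position and check that, within each group of equal keys, positions stay in increasing order.

```python
# Stability regression-test sketch: records with equal keys must keep
# their original relative order after sorting.
records = [("b", 0), ("a", 1), ("b", 2), ("a", 3)]  # (key, original position)
out = sorted(records, key=lambda r: r[0])

# For each key, the original positions must appear in increasing order.
for key in {"a", "b"}:
    positions = [pos for k, pos in out if k == key]
    assert positions == sorted(positions), f"stability violated for key {key}"
print(out)  # -> [('a', 1), ('a', 3), ('b', 0), ('b', 2)]
```

The same check, written against each Thrust backend with many-duplicate keys, would catch regressions like the one the CUB PR fixes.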
According to
https://github.com/pytorch/pytorch/blob/64847c7f0b3559c6edc40f001619b80c7dc68ef7/c10/util/Exception.h#L479
all usages of AT_ERROR("msg") should be replaced with
TORCH_CHECK(false, "msg")
There are currently 29 instances of AT_ERROR being used in the c10 codebase: