avx

Just a suggestion to add "--help" support to benchdnn.
As of today, benchdnn does not seem to support --help; the documentation is only available online on this page:
https://github.com/intel/mkl-dnn/tree/master/tests/benchdnn
The proposal is to add a minimal "print usage" message to the benchdnn CLI tool.

Environ

Hello,
Do you know about Conan?
Conan is a modern dependency manager for C++. It would be great if your library were available through a package manager for other developers.

Here you can find an example of how to create a package for the library.

If you have any questions, j

I am sorry but I don't know how to put it into an Issue format, so I have to explain it plainly.

Dear contributors,

It seems that every source file in the Vc library has the following copyright notice:

/*  This file is part of the Vc library. {{{
Copyright © 2009-2015 XXX <xxx@example.com>
Redistribution and use in source and binary forms, with or without
modification, are permitte

At least on x86, the fastest intrinsics for shuffling the contents of a vector or blending data from two vectors take an immediate operand, which must be a compile-time constant. So there would be a use case for a compile-time version of xsimd::select(), as it could use these faster instructions.

An example of prior art for this is the shuffle() instruction family in bSIMD:

https://develop
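To make the request above concrete, here is a minimal portable sketch of what a compile-time select could look like. It does not use the xsimd API: `bool_mask` and `select_const` are hypothetical names, and `std::array` stands in for a SIMD register. The point is only that the mask travels as template arguments, so an implementation could in principle turn it into the immediate operand of a blend instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical tag type carrying the per-lane mask as template arguments.
template <bool... Mask>
struct bool_mask {};

// Hypothetical compile-time select: lane i comes from a if Mask[i] is true, else from b.
// This scalar loop is only a stand-in for an immediate-operand blend.
template <class T, std::size_t N, bool... Mask>
std::array<T, N> select_const(bool_mask<Mask...>,
                              const std::array<T, N>& a,
                              const std::array<T, N>& b) {
    static_assert(sizeof...(Mask) == N, "mask length must match the lane count");
    constexpr bool mask[] = {Mask...};  // known at compile time
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = mask[i] ? a[i] : b[i];
    }
    return out;
}

// Usage: take lanes 0 and 3 from a, lanes 1 and 2 from b.
inline void select_const_usage() {
    const std::array<int, 4> a{{1, 2, 3, 4}};
    const std::array<int, 4> b{{10, 20, 30, 40}};
    const auto r = select_const(bool_mask<true, false, false, true>{}, a, b);
    assert(r[0] == 1 && r[1] == 20 && r[2] == 30 && r[3] == 4);
}
```

A mismatched mask length is rejected by the `static_assert` at compile time, which a runtime mask could not offer.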

The docs say this about the dest param:
"You can pass all parameter same (this is similar to m1 *= m1), you can pass dest as m1 or m2 (this is similar to m1 *= m2)" (c)

But some functions behave differently. For example:

vec3 A = {0,0,2};
vec3 B = {2,0,0};
glm_vec3_cross(A, B, A); // actual: A = {0, 4, -8}; expected: A = {0, 4, 0}
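One plausible explanation for the result above is that the components are written into dest one at a time while the inputs are still being read. The sketch below reproduces the reported numbers with hypothetical stand-in functions (`cross_naive` and `cross_safe` are not cglm's code, and this is an assumption about the cause, not a reading of the library source).

```cpp
#include <cassert>

// Hypothetical stand-in, NOT cglm's implementation.
// Writes each component straight into dest, so it is unsafe when dest aliases a or b.
void cross_naive(const float* a, const float* b, float* dest) {
    assert(a && b && dest);
    dest[0] = a[1] * b[2] - a[2] * b[1];
    dest[1] = a[2] * b[0] - a[0] * b[2];  // a[0] is already overwritten if dest == a
    dest[2] = a[0] * b[1] - a[1] * b[0];  // a[0] and a[1] are overwritten if dest == a
}

// Alias-safe variant: compute all three components into locals, then store.
void cross_safe(const float* a, const float* b, float* dest) {
    assert(a && b && dest);
    const float x = a[1] * b[2] - a[2] * b[1];
    const float y = a[2] * b[0] - a[0] * b[2];
    const float z = a[0] * b[1] - a[1] * b[0];
    dest[0] = x;
    dest[1] = y;
    dest[2] = z;
}
```

With A = {0, 0, 2} and B = {2, 0, 0}, calling `cross_naive(A, B, A)` leaves A = {0, 4, -8}, while `cross_safe(A, B, A)` leaves A = {0, 4, 0}.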

We should add WASM simd128 implementations of as many SSE/SSE2/etc. functions as possible.

Some of the functions won't see much, if any, improvements since we already have GCC-style vector extension and OpenMP SIMD support. The real benefit will be for the functions that can't use GCC-style vectors. For example, saturated operations, min/max, etc. And of course there will be a lot of cases w

Raising scalars and vectors to integer powers is very common (e.g. llvm.powi), so I think the following APIs would be very useful:

double Sleef_ipow_u10(double x, int32_t y);
__m128d Sleef_ipowd2_u10(__m128d x, int32_t y);
__m256d Sleef_ipowd4_u10(__m256d x, int32_t y);
__m512d Sleef_ipowd8_u10(__m512d x, int32_t y);
floa
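For reference, the scalar case of the proposal can be sketched with exponentiation by squaring. This is not SLEEF code: `ipow_sketch` is a hypothetical name, and the sketch makes no claim about the error bound that the proposed `_u10` suffix implies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar reference, NOT part of SLEEF.
// Computes x^y for an integer exponent using exponentiation by squaring.
double ipow_sketch(double x, std::int32_t y) {
    // Widen before negating so that INT32_MIN does not overflow.
    const std::int64_t wide = y;
    std::uint64_t n = static_cast<std::uint64_t>(wide < 0 ? -wide : wide);

    double result = 1.0;
    double base = x;
    while (n != 0) {
        if (n & 1u) {
            result *= base;  // this bit of the exponent is set
        }
        base *= base;        // square for the next bit
        n >>= 1;
    }
    // Negative exponents are handled by taking the reciprocal at the end.
    return y < 0 ? 1.0 / result : result;
}

// Usage with exactly representable values.
inline void ipow_usage() {
    assert(ipow_sketch(2.0, 10) == 1024.0);
    assert(ipow_sketch(2.0, -2) == 0.25);
    assert(ipow_sketch(5.0, 0) == 1.0);
}
```

The loop runs once per bit of the exponent, so at most 32 iterations for an `int32_t` exponent, which is what makes a vectorized version with a shared scalar exponent attractive.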

Hi there,
I am an experienced C++ programmer, but I'm completely lost when it comes to SIMD operations. I have been trying your library for over a week and I still cannot figure out how to make it perform better than the straightforward approach.

In my particular case, I am trying to create a SAXPY operation according to BLAS standard using SIMD operations. My vectors are huge and still th

Hi

So, following the instructions, I ran cmake

 cmake .. -DSIMD=AVX2 -DDEV=1 -DBOOST_ROOT=/**/boost_1_72_0 -GNinja

It says:

CMake Warning:
  Manually-specified variables were not used by the project:

    BOOST_ROOT
    DEV

Then: ninja -j1 update - unknown target

However, running:

ninja -j 4 tests
ctest

Has worked, apparently successfully: `10

We have a bitmap image saving function, void Save( const std::string & path, const penguinV::Image & image, uint32_t startX, uint32_t startY, uint32_t width, uint32_t height ), which is located in the src/file/bmp_image.h and src/file/bmp_image.cpp files.

During file saving we purposely copy a line of the image to a temporary array and then write the array into the file. The reason behind this is that

Here are 135 public repositories matching this topic...

oneapi-src / oneDNN

ermig1979 / Simd

mind / wheels

VcDevel / Vc

kfrlib / kfr

xtensor-stack / xsimd

microsoft / DirectXMath

recp / cglm

minio / sha256-simd

hfp / libxsmm

simd-everywhere / simde

shibatch / sleef

agenium-scale / boost.simd

tlk00 / BitMagic

aff3ct / MIPP

altimesh / hybridizer-basic-samples

VcDevel / std-simd

fiigii / PacketTracer

manodeep / Corrfunc

agenium-scale / nsimd

lemire / despacer

powturbo / Turbo-Base64

ihhub / penguinV

BBuf / Image-processing-algorithm-Speed

PoC-Consortium / engraver

edanor / umesimd

EgorBo / IntrinsicsPlayground

swojtasiak / fcml-lib

VectorChief / UniSIMD-assembler

Erkaman / sse-avx-rasterization
