#blas


Even now, Thrust as a dependency is one of the main reasons why we have a #CUDA backend, a #HIP / #ROCm backend and a pure #CPU backend in #GPUSPH, but not a #SYCL or #OneAPI backend (which would allow us to extend hardware support to #Intel GPUs). <doi.org/10.1002/cpe.8313>

This is also one of the reasons why we implemented our own #BLAS routines when we introduced the semi-implicit integrator. A side effect of this choice is that it allowed us to develop the improved #BiCGSTAB that I've had the opportunity to mention before <doi.org/10.1016/j.jcp.2022.111>. Sometimes I do wonder if it would be appropriate to “excorporate” it into its own library for general use, since it's something that would benefit others. OTOH, this one was developed specifically for GPUSPH and it's tightly integrated with the rest of it (including its support for multi-GPU), and refactoring to turn it into a library like cuBLAS is

a. too much effort
b. probably not worth it.

Again, following @eniko's original thread, it's really not that hard to roll your own, and probably less time consuming than trying to wrangle your way through an API that may or may not fit your needs.
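To give a rough sense of what "rolling your own" can look like, here is a minimal sketch in Python, assuming small dense matrices stored as lists of rows. The function names (`dot`, `axpy`, `nrm2`, `gemv`, `bicgstab`) are illustrative only. The solver is the plain textbook, unpreconditioned BiCGSTAB, not the improved variant developed for GPUSPH, and it has none of that code's GPU or multi-GPU support.

```python
import math


def dot(x, y):
    """BLAS-style dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha, x, y):
    """Return alpha*x + y as a new list (real BLAS axpy overwrites y)."""
    return [alpha * a + b for a, b in zip(x, y)]


def nrm2(x):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(x, x))


def gemv(A, x):
    """Dense matrix-vector product, with A given as a list of rows."""
    return [dot(row, x) for row in A]


def bicgstab(A, b, tol=1e-10, max_iter=100):
    """Textbook unpreconditioned BiCGSTAB for A x = b, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # initial residual b - A*x with x = 0
    r_hat = list(r)      # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = nrm2(b) or 1.0

    for _ in range(max_iter):
        if nrm2(r) <= tol * b_norm:
            return x
        rho_new = dot(r_hat, r)
        if rho_new == 0.0:
            raise ArithmeticError("BiCGSTAB breakdown: rho is zero")
        beta = (rho_new / rho) * (alpha / omega)
        # p = r + beta * (p - omega * v)
        p = axpy(beta, axpy(-omega, v, p), r)
        v = gemv(A, p)
        alpha = rho_new / dot(r_hat, v)
        s = axpy(-alpha, v, r)            # intermediate residual
        if nrm2(s) <= tol * b_norm:
            return axpy(alpha, p, x)      # half-step already converged
        t = gemv(A, s)
        omega = dot(t, s) / dot(t, t)     # stabilizing step length
        x = axpy(alpha, p, axpy(omega, s, x))
        r = axpy(-omega, t, s)
        rho = rho_new
    return x


if __name__ == "__main__":
    # Small nonsymmetric system used purely as an illustration.
    print(bicgstab([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0]))
```

Everything the solver needs is three vector primitives and one matrix-vector product, which is why owning those routines makes it practical to modify the solver itself.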

6/

Time for an #introduction!
I'm a young Canuck with interests/experience in #HPC, #Linux, #BLAS, #SYCL, #C, #AVX512, #Rust, heterogeneous compute & other such things.

Currently my personal projects are bringing #FP16 to the #OpenBLAS library, working to standardize what Complex domain BLAS FP16 kernels/implementations should look like, and making sure #SYCL is available everywhere.

I also write every now and again. Here's the tale of AVX512 FP16 on Alder Lake
gist.github.com/FCLC/56e4b3f4a

On AVX512 FP16, Alder Lake, custom kernels, and how "Mistakes were made" has never rang so true - A not so brief discussion of Alder Lake, the new AVX512 FP 16 extensions, Sapphire Rapids...