How to Install Tensorflow-GPU with Keras in R – A manual that worked on 2021.02.20 (and will likely work in the future)

A brief instruction:
0. Update your Nvidia graphics card driver (just the driver; you need NOT install/update CUDA, but make sure that your card has CUDA compute capability >= 3.5)
1. install Anaconda (release Anaconda3-2020.11 from anaconda.org)
2. open anaconda prompt and run
>conda create -n tfgpu210p37 python==3.7
>conda activate tfgpu210p37
>conda install cudatoolkit=10.1 cudnn=7.6 -c=conda-forge
>conda install -c anaconda tensorflow-gpu
3. in R run
>install.packages("keras")
>reticulate::use_condaenv("tfgpu210p37", required = TRUE)
>library(keras)
4. If you wanna understand what is going on under the hood, read further

The web is full of obsolete manuals on how to install keras in R. Our own earlier note, install.packages("keras"); library(keras); install_keras(), is outdated as well: it worked before, but now it results in
1: In normalizePath(path.expand(path), winslash, mustWork) :
path[1]="C:\Users\vasily\AppData\Local\r-miniconda/python.exe": The system cannot find the file specified
2: In normalizePath(path.expand(path), winslash, mustWork) :
path[1]="C:\Users\vasily\AppData\Local\r-miniconda\envs\r-reticulate/python.exe": The system cannot find the file specified

Thus one needs to delve a bit deeper into the DevOps domain.

Let us start with hardware: try to avoid outdated components (CPUs and GPUs). I myself had a pretty exotic problem: I had two graphics cards, a GeForce GT 730 and a GeForce GTX 1060. I was aware of the CUDA compute capability and driver version requirements, so I did not even install the driver for the GeForce GT 730. However, Windows 10 did it for me automatically, downgrading the driver version for the GTX 1060 as a side effect and making CUDA 10 non-runnable on my machine! So I had to remove the GT 730 "surgically".
[Image: my good old PC, on which I have been experimenting with GPUs since 2011]

An outdated CPU can also cause trouble: generally AVX support is assumed (although TensorFlow 2.1.0 surprisingly did run on my old AMD Phenom II X6 1055T).
Of course, if you are a masochist (that is, a passionate DevOps), you can use precompiled non-AVX binaries or even build TensorFlow from source ... but I assume that you are (like me) a data scientist who just wants to get his tools running.

So, assuming that your hardware is more or less current, check the Nvidia driver version and update the driver if necessary.
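As a rough aid for this check, here is a minimal Python sketch that asks nvidia-smi for the installed driver version and compares it with a minimum. The threshold 418.96 is an assumption taken from my reading of the CUDA 10.1 release notes for Windows and should be verified against NVIDIA's table; the helper names are made up for illustration.

```python
import shutil
import subprocess

# Assumption: minimum Windows driver for CUDA 10.1 as listed in the CUDA
# release notes; verify against NVIDIA's current table before relying on it.
MIN_DRIVER_CUDA_10_1 = (418, 96)


def parse_driver_version(text):
    """Turn a driver string such as '461.40' into a tuple of ints (461, 40)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(version_text, minimum=MIN_DRIVER_CUDA_10_1):
    """Return True if the reported driver is at least the minimum version."""
    return parse_driver_version(version_text) >= minimum


def query_gpus():
    """Return (gpu_name, driver_version) pairs, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    rows = []
    for line in result.stdout.splitlines():
        if "," in line:
            # Split on the last comma: the driver version is the final field.
            name, version = line.rsplit(",", 1)
            rows.append((name.strip(), version.strip()))
    return rows


if __name__ == "__main__":
    for name, version in query_gpus():
        status = "OK" if driver_is_sufficient(version) else "too old for CUDA 10.1"
        print(f"{name}: driver {version} ({status})")
```

With two cards in one machine, each card gets its own line, so a driver that was silently downgraded shows up immediately.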

Next, download and install Anaconda (I used the release 2020.11). Anaconda is a package and environment manager, to some extent like pip in Python, but Anaconda manages not only Python libraries but also Python versions and other software packages (in particular, it will take over the cumbersome matching of the proper CUDA and cuDNN versions to the tensorflow binaries).
From the program menu, run Anaconda Prompt (or the PowerShell prompt).

Run
conda create -n tfgpu210p37 python==3.7, which creates a conda environment with Python 3.7.0 (you may also try a later patch release such as python==3.7.5, but, at least on 2021.02.20, it is important to stick to the 3.7 series).
conda activate tfgpu210p37 (the software packages that we are going to install will be put into the currently active environment).
conda install cudatoolkit=10.1 cudnn=7.6 -c=conda-forge (you can check which CUDA and cuDNN versions match which TensorFlow version on the tensorflow webpage)
conda install -c anaconda tensorflow-gpu Note that -c means channel, i.e. a repository. You might ask how I guessed that the latest available version of tensorflow-gpu on this channel is 2.1.0. Well, just go to the channel and browse the available files; you may also watch the installation process or check the Anaconda Navigator (to see the TF version ex post).
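To make the version matching explicit, the following sketch keeps the two combinations used in this post in a small table and builds the corresponding conda command lines from it. The table and the function name are illustrative only; the combinations themselves should be checked against the tested build configurations on the tensorflow webpage.

```python
# Version combinations used in this post; treat them as a starting point
# and verify them on the tensorflow webpage, they are not a guarantee.
TESTED_BUILDS = {
    "2.1.0": {"python": "3.7", "cudatoolkit": "10.1", "cudnn": "7.6"},
    "2.4.1": {"python": "3.8", "cudatoolkit": "11.0", "cudnn": "8.0"},
}


def conda_commands(tf_version, env_name):
    """Build the environment setup commands for one TensorFlow version."""
    build = TESTED_BUILDS[tf_version]
    return [
        f"conda create -n {env_name} python=={build['python']}",
        f"conda activate {env_name}",
        f"conda install cudatoolkit={build['cudatoolkit']} "
        f"cudnn={build['cudnn']} -c conda-forge",
    ]


if __name__ == "__main__":
    for command in conda_commands("2.1.0", "tfgpu210p37"):
        print(command)
```

Keeping the Python, CUDA and cuDNN versions in one place makes it harder to update one of them and forget the others.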

Finally, start RStudio and run

>install.packages("keras") #obviously run only once :)
>reticulate::use_condaenv("tfgpu210p37", required = TRUE) #this shall be called once per R-session
>library(keras)

reticulate is a package that bridges Python and R. Note that you will likely have to specify the condaenv explicitly each time you start R (but not each time you call keras functionality). And yes, the conda tensorflow package already contains keras (and tensorboard too).
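If it is unclear which Python a conda environment actually provides, a small script run inside the activated environment can report it. This is only a sketch: the function name is made up, and it relies on conda setting the CONDA_DEFAULT_ENV variable on activation.

```python
import os
import sys


def describe_interpreter():
    """Collect the facts needed to point reticulate or RStudio at this Python."""
    return {
        "executable": sys.executable,  # full path of the running python
        "prefix": sys.prefix,          # root folder of the environment
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # conda sets CONDA_DEFAULT_ENV on activation; it is None outside conda
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

The reported prefix is the folder that the environment name passed to use_condaenv should resolve to.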


Update on 29.12.2023
If you still want this old version of tensorflow-gpu, you now need to specify the version explicitly, i.e. conda install -c anaconda tensorflow-gpu=2.1.0 (otherwise conda will not be able to solve the dependencies).
It will likely install numpy-1.21.5, which you need to downgrade to 1.19.5 by means of conda install -c anaconda numpy=1.19.5 (otherwise you will likely get an error like NotImplementedError: Cannot convert a symbolic Tensor (2nd_target:0) to a numpy array).
Moreover, in RStudio you may need to specify the path to tfgpu210p37 explicitly: go to Tools - Global Options - Python - Select the conda environment
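The numpy constraint can be checked before starting R. The sketch below compares a numpy version string with a cut-off. The cut-off 1.20.0 is an assumption: this post only establishes that 1.19.5 works and 1.21.5 does not. All function names are hypothetical.

```python
def version_tuple(text):
    """Convert '1.19.5' into (1, 19, 5); non-numeric suffixes like 'rc1' are dropped."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def numpy_ok_for_tf210(numpy_version, first_bad=(1, 20, 0)):
    """Return True if the numpy version lies below the assumed cut-off."""
    return version_tuple(numpy_version) < first_bad


def installed_numpy_version():
    """Return the installed numpy version string, or None if numpy is missing."""
    try:
        import numpy
    except ImportError:
        return None
    return numpy.__version__


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("numpy is not installed in this environment")
    else:
        verdict = "fine" if numpy_ok_for_tf210(found) else "downgrade to 1.19.5"
        print(f"numpy {found}: {verdict}")
```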


Note: if you want to install the latest Tensorflow-GPU version (2.4.1 at the time of writing), do the following (courtesy: TheCodingBug)

conda create -n tf_gpu python==3.8
conda activate tf_gpu
conda install cudatoolkit=11.0 cudnn=8.0 -c=conda-forge
pip install --upgrade tensorflow-gpu==2.4.1

[To check whether TF-GPU works, run the Python interpreter and type]
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()  # deprecated in TF 2.x; tf.config.list_physical_devices('GPU') is the replacement

However, as you can see, this approach mixes conda and pip installations, which is not recommended. Moreover, this Tensorflow/Keras configuration will NOT work in R, since pip overwrites some conda packages, in particular h5py (in RStudio you will get an error message like Error in load_model_hdf5... : The h5py Python package is required to save and load models).
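Before switching to R, it may help to check from Python whether the packages that keras needs are still present after such a mixed installation. The sketch below is an illustration with made-up function names. Note that a package being found does not prove that it imports cleanly, so a package overwritten by pip can still fail at import time.

```python
import importlib.util


def missing_packages(names=("tensorflow", "h5py", "numpy")):
    """Return the names from the list that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def visible_gpus():
    """Return the GPU devices TensorFlow can see, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # tf.config.list_physical_devices is the non-deprecated GPU query in TF 2.x
    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    print("missing packages:", missing_packages())
    print("visible GPUs:", visible_gpus())
```

An empty list of missing packages together with at least one visible GPU is what you want to see before you call library(keras).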


Also note that although in theory conda completely isolates environments from each other, in practice this is not always the case. In particular, if you mix conda and pip (as above), it will affect all your conda environments with Python 3.8 (I did not check whether it hurts only Python 3.8.0 or Python 3.8.[any]). But ok, mixing conda and pip is at your own risk anyway.

But even more surprising (and irritating) is the following: first I successfully installed tensorflow-gpu on one of my machines as described above and then tried the same with Python 3.8, i.e. with conda create -n tf python==3.8. Not only did it not work, it also broke my previously working Python 3.7 condaenv; now I am getting the following error message (although the NN training does start and runs several steps)
########################################################
2021-02-19 09:13:33.057968: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-02-19 09:13:33.728950: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.729602: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.734070: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.734489: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.740939: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.741305: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.742512: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.743083: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.743452: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.743821: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.744320: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.747101: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.762326: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2021-02-19 09:13:33.762947: W tensorflow/stream_executor/stream.cc:2041] attempting to perform BLAS operation using StreamExecutor without BLAS support
2021-02-19 09:13:33.763842: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Internal: Blas GEMM launch failed : a.shape=(500, 4), b.shape=(4, 8), m=500, n=8, k=4
[[{{node sequential/gru/while/body/_1/MatMul_1}}]]
2021-02-19 09:13:33.765274: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Cancelled: Iterator was cancelled
[[{{node IteratorGetNext}}]]
Train on 640000 samples, validate on 160000 samples
Epoch 1/400
500/640000 [..............................] - ETA: 1:10:50Error in py_call_impl(callable, dots$args, dots$keywords) :
InternalError: Blas GEMM launch failed : a.shape=(500, 4), b.shape=(4, 8), m=500, n=8, k=4
[[{{node sequential/gru/while/body/_1/MatMul_1}}]] [Op:__inference_distributed_function_2490]

Function call stack:
distributed_function
In addition: Warning message:
Error in py_call_impl(callable, dots$args, dots$keywords) :
InternalError: Blas GEMM launch failed : a.shape=(500, 4), b.shape=(4, 8), m=500, n=8, k=4
[[{{node sequential/gru/while/body/_1/MatMul_1}}]] [Op:__inference_distributed_function_2490]

Function call stack:
distributed_function
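CUBLAS_STATUS_ALLOC_FAILED is commonly reported when cuBLAS cannot allocate GPU memory, for instance because TensorFlow or another process has already reserved it. Whether that applies to the case above is not established, but one widely used mitigation is to let TensorFlow allocate GPU memory on demand via the TF_FORCE_GPU_ALLOW_GROWTH environment variable. A minimal sketch, with a hypothetical function name:

```python
import os


def request_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once.

    This must run before TensorFlow initializes the GPU, i.e. before
    `import tensorflow`; setting it afterwards has no effect.
    """
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return os.environ["TF_FORCE_GPU_ALLOW_GROWTH"]


if __name__ == "__main__":
    print("TF_FORCE_GPU_ALLOW_GROWTH =", request_gpu_memory_growth())
```

From R, the same variable would have to be set (for example with Sys.setenv) before library(keras) starts Python.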


On another PC (with an old processor) I first got the following
>>> import tensorflow as tf
Traceback (most recent call last):
File "C:\Users\agandoWin10\anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 64, in <module>
from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\agandoWin10\anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\__init__.py", line 41, in <module>
from tensorflow.python.tools import module_util as _module_util
File "C:\Users\agandoWin10\anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\__init__.py", line 39, in <module>
from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow
File "C:\Users\agandoWin10\anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 83, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "C:\Users\agandoWin10\anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 64, in <module>
from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.

Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
>>>
I thought that this was due to missing AVX, but then it suddenly started working.


Last but not least: first think about whether you really need GPU acceleration or whether your model can be trained on the CPU. And even if you need more speed, do not expect that the GPU will bring you [too] much of it. I myself got a speed-up factor of x3 for my model. How disappointing this is compared to my previous experience with CUDA, where I got a x100 faster Monte-Carlo on a commodity graphics card and x1000 faster on a Tesla K20!
So if you don't really need the GPU, just run

conda create -n tfp37 python==3.7
conda activate tfp37
conda install -c anaconda tensorflow #(from here https://anaconda.org/anaconda/tensorflow don't mess with this https://anaconda.org/conda-forge/tensorflow or this https://anaconda.org/anaconda/keras)
>reticulate::use_condaenv("tfp37", required = TRUE)
>library(keras)
