How to Get Current Available GPUs in TensorFlow?
Discover available GPUs in TensorFlow by invoking the function tf.config.list_physical_devices('GPU'). This function provides a list of GPU devices accessible to TensorFlow. You can use this handy function to get an instantaneous count of available GPUs:
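```python
import tensorflow as tf

# List the GPU devices TensorFlow can see, then count them with len()
gpus = tf.config.list_physical_devices('GPU')
print("Num GPUs Available:", len(gpus))
```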
Running this snippet will display the number of GPUs TensorFlow has detected. By applying the len() function, you can effortlessly count the total number of accessible GPUs.
Avoiding a GPU Memory Allocation Fiasco
When it comes to GPUs in TensorFlow, you have to ensure judicious memory utilization. By default, TensorFlow tries to allocate all available GPU memory, leading to a potential clash with other apps or subsequent model runs. Here's how to prevent that:
- Configure memory growth for GPUs. By setting allow_growth to True, TensorFlow allocates only the required amount of GPU memory and increases it as required.
In TensorFlow 1.x, configure the memory growth within a session like so:
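```python
import tensorflow as tf  # TensorFlow 1.x

# One possible sketch: enable allow_growth on the session config so TensorFlow
# claims GPU memory incrementally instead of grabbing it all up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
```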
For TensorFlow 2.0 and up, apply the following configuration:
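```python
import tensorflow as tf  # TensorFlow 2.x

# One possible sketch: memory growth must be set before any GPU memory is
# allocated, so run this right after importing TensorFlow.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```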
Ways around Invisible GPUs
Sometimes, due to certain constraints, TensorFlow may not be able to access GPUs directly. When this happens, set the CUDA_VISIBLE_DEVICES environment variable to control GPU visibility. For example, launching your script with CUDA_VISIBLE_DEVICES=0 permits TensorFlow to detect only the first GPU.
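As a rough sketch (the script name train.py below is just a placeholder), you can set the variable from the shell or from Python, as long as it happens before TensorFlow touches the GPUs:

```python
# From the shell:
#   CUDA_VISIBLE_DEVICES=0 python train.py
#
# Or from Python, before TensorFlow is imported:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only the first GPU

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))  # should list at most one GPU
```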
For occasions where you need GPU details without going through TensorFlow at all, the good news is NVIDIA's nvidia-smi comes to the rescue! You can drive it with Python's built-in subprocess module:
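```python
import subprocess

# A minimal sketch: ask nvidia-smi for the GPU names and count the lines it
# returns. This assumes the NVIDIA driver (and thus nvidia-smi) is installed
# and on the PATH.
output = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
    encoding="utf-8",
)
gpu_names = [line for line in output.splitlines() if line.strip()]
print(f"{len(gpu_names)} GPU(s) found:", gpu_names)
```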
But hold your horses! This approach isn't without its caveats (you're parsing raw tool output and depending on the NVIDIA driver being present), so tread carefully.
Cracking the Code behind Device Details
A mere list of accessible GPUs may not suffice. For times when you need comprehensive device specifics, TensorFlow offers the DeviceAttributes protocol buffer, which exposes details such as each device's name, type, memory limit, and physical device description:
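```python
from tensorflow.python.client import device_lib

# A sketch using device_lib, which is not part of the public API but returns
# DeviceAttributes protos describing every local device.
for device in device_lib.list_local_devices():
    if device.device_type == 'GPU':
        print(device.name)                  # e.g. /device:GPU:0
        print(device.memory_limit)          # memory available to TensorFlow, in bytes
        print(device.physical_device_desc)  # GPU model, PCI bus id, compute capability
```

In TensorFlow 2.x, tf.config.experimental.get_device_details() offers a public alternative that returns a dictionary with entries such as 'device_name' and 'compute_capability'.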
When working in a distributed ecosystem, you may want to query GPU information across machines or processes. For such applications, you could consider TensorFlow's impressive suite of distribution strategies, or synchronise device querying manually across your cluster.
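As a rough single-machine illustration, a distribution strategy already reports how many replicas (and therefore devices) it is coordinating:

```python
import tensorflow as tf

# Sketch: MirroredStrategy mirrors computation across the local GPUs it finds;
# num_replicas_in_sync tells you how many devices are actually in play.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)
```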