
How to prevent TensorFlow from allocating the totality of a GPU's memory?

python
memory-management
gpu-optimization
tensorflow
by Alex Kataev · Feb 19, 2025
TLDR

Manage GPU memory in TensorFlow by enabling memory growth. This setting allows for dynamic allocation, thereby ensuring that not all GPU memory is hogged at once. Use the tf.config.experimental.set_memory_growth method:

import tensorflow as tf

# List GPUs and get them to only grow memory usage when asked politely
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)

Here, TensorFlow's GPU memory consumption becomes incremental rather than preemptive, resulting in more efficient memory usage.
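
The same behavior can be switched on without touching model code, via the TF_FORCE_GPU_ALLOW_GROWTH environment variable that TensorFlow reads at startup; a minimal sketch:

import os

# Must be set before TensorFlow initializes the GPU
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

import tensorflow as tf  # import only after setting the variable
print(tf.config.experimental.list_physical_devices('GPU'))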

Tight control with memory fraction

To specify an upper limit on GPU memory, assign a fraction of the total memory to TensorFlow:

import tensorflow as tf

# Maybe 40%? Not too selfish, not too generous.
gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=0.4)
config = tf.compat.v1.ConfigProto(gpu_options=gpu_options)
session = tf.compat.v1.Session(config=config)

This caps TensorFlow at 40% of the total GPU memory, an approach particularly useful when a GPU is shared among several users or processes.
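
On a 16 GB card, for instance, a fraction of 0.4 reserves roughly 6.4 GB. To make the capped session the one Keras actually trains with, you can register it explicitly; a minimal sketch, assuming the compat.v1 Keras backend:

import tensorflow as tf

gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=0.4)
config = tf.compat.v1.ConfigProto(gpu_options=gpu_options)
session = tf.compat.v1.Session(config=config)

# Register the capped session so Keras training respects the 40% limit
tf.compat.v1.keras.backend.set_session(session)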

Explicit memory limit with virtual devices

For finer control, explicitly define the amount of GPU memory for TensorFlow:

import tensorflow as tf

# 4GB should do the trick? Fingers crossed!
memory_limit = 4096
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=memory_limit)]
    )

Here, we cap the first GPU at 4 GB (memory_limit is expressed in megabytes, hence 4096).
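
The same mechanism can carve one physical GPU into several logical ones, handy for simulating multi-GPU setups on a single card. A sketch, assuming a GPU with at least 4 GB free:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Split the first physical GPU into two 2 GB logical devices
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048),
         tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)]
    )
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(logical_gpus), "logical GPUs")  # expect 2

Note that virtual device configuration must happen before the GPU is initialized, so run it at program start, ahead of any op that touches the device.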

Co-living: sharing a GPU among users

In multi-user environments, efficient resource utilization is critical. TensorFlow can be configured to claim only as much GPU memory as its tasks actually require:

import tensorflow as tf

# Play nice: claim GPU memory only when tasks actually need it
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)

This incremental allocation strategy allows better resource sharing and GPU memory efficiency.
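
In addition to growing memory on demand, a shared box often benefits from pinning each process to a single card. A sketch using set_visible_devices (the per-GPU pinning is my addition, not part of the snippet above):

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Expose only the first GPU to this process; the rest stay free for others
    tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
    tf.config.experimental.set_memory_growth(gpus[0], True)

Setting the CUDA_VISIBLE_DEVICES environment variable before launch achieves the same isolation from outside Python.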

How to dance with older versions

When using TensorFlow versions below 2.0, the dynamics change a bit: memory behavior is configured through a ConfigProto() passed into a tf.Session:

import tensorflow as tf # "ConfigProto, will you be my date for the TensorFlow prom?" config = tf.compat.v1.ConfigProto() config.gpu_options.allow_growth = True session = tf.compat.v1.Session(config=config)

Matching the API to your TensorFlow version is key to avoiding compatibility issues.
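
If a script has to run under both 1.x and 2.x, a small version gate keeps it portable. A rough sketch, assuming TF 1.13+ where tf.compat.v1 is available:

import tensorflow as tf

if tf.__version__.startswith('1.'):
    # TF 1.x: configure the session by hand
    config = tf.compat.v1.ConfigProto()
    config.gpu_options.allow_growth = True
    session = tf.compat.v1.Session(config=config)
else:
    # TF 2.x: flip memory growth on each physical GPU
    for gpu in tf.config.experimental.list_physical_devices('GPU'):
        tf.config.experimental.set_memory_growth(gpu, True)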

Gauge your model's appetite

Fully understanding the memory consumption of your model is paramount. Allocate with caution: under-allocation leads to out-of-memory errors, while over-allocation leaves memory idle that other processes could use.
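
One way to gauge that appetite, assuming TensorFlow 2.5 or newer where get_memory_info is available, is to inspect current and peak usage after running a representative batch:

import tensorflow as tf

# Run a representative training step first, then inspect usage
info = tf.config.experimental.get_memory_info('GPU:0')
print("current:", info['current'], "bytes")
print("peak:   ", info['peak'], "bytes")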

Fragmentation issues

When setting per_process_gpu_memory_fraction, leave headroom for memory fragmentation: the allocator cannot pack tensors perfectly into the reserved region, so a conservative starting value like 0.4 gives fragmented allocations room to breathe.
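
As a back-of-the-envelope aid, you can derive a fraction from a memory budget plus a headroom margin. The helper below is purely illustrative, not a TensorFlow API:

def fraction_for_budget(budget_mb, total_mb, headroom=0.15):
    """Illustrative heuristic: turn a tensor-memory budget into a
    per_process_gpu_memory_fraction, keeping headroom for fragmentation."""
    usable = total_mb * (1.0 - headroom)
    return min(budget_mb / usable, 1.0 - headroom)

# e.g. a 6 GB budget on a 16 GB card -> roughly 0.44
print(round(fraction_for_budget(6144, 16384), 2))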