How to validate a url in Python? (Malformed or not)

python

url-validation

python-stdlib

urlparse

byAlex Kataev·Feb 22, 2025

To validate a URL in Python efficiently, you can leverage the validators package url method:

# No Validators? No party! Get them: pip install validators
import validators

# Check for URL validity
is_valid = validators.url("http://www.example.com")

# URL validation status: True if valid, False otherwise
print("Is your URL up for the party? :", is_valid)

This checks if your URL complies with the standard format, returning True for valid URLs without any hassle.

URL validation strategies

The straightforward validators package method is just one way to validate URLs in Python. Here, we crack the code for more extensive methods, suited to different necessities and frameworks.

Django for the win

Calling all Django users! Your framework comes equipped with the powerful URLValidator. This handy tool can verify URL format and even check if a URL path exists:

from django.core.validators import URLValidator
from django.core.exceptions import ValidationError

validator = URLValidator(verify_exists=True)

try:
    validator("https://www.example.com")
    print("URL is no impostor!")
except ValidationError as e:
    print("This URL is trying to fool you:", e)

Befriend Python's urlparse

If you are seeking a solution that is ingrained in the standard Python library, say hello to urlparse:

from urllib.parse import urlparse

def is_valid_url(url):
    parsed_url = urlparse(url)
    return all([parsed_url.scheme, parsed_url.netloc])

url = "http://www.example.com"
print("Is your URL playing by the rules? :", is_valid_url(url))

This method scrutinizes your URL and ensures both the scheme and network location are not absent, basics for a valid URL.

Anticipating varieties and complexities

URLs love to play dress-up and often appear in various forms and structures. Here's a quick game-plan:

Ports in disguise: Check for these sneaky guys. They follow the network location and are separated by a ":".
The Query Param Mystery: Don't overlook URLs containing query strings or URL-encoded parameters.
Testing the waters: Always check your URL validation code with different URL formats.

Unraveling URL validation

URL validation requires a keen eye for details, as it often extends beyond a mere format check.

Regex the rescue

For absolute control, who better to rely on than custom regex patterns:

import re

regex_pattern = re.compile(
    r"^(?:http|ftp)s?://"  # http:// or https://
    r"(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?$)"  # domain
    r"(?::\d+)?"  # optional port
    r"(?:/?|[/?]\S+)$", re.IGNORECASE)

def is_valid_url(url, pattern=regex_pattern):
    # If Regex was a superhero, he'd be The Flash!
    return re.match(pattern, url) is not None 

print("Did your URL outsmart Regex?:", is_valid_url("http://localhost:8000/test"))

Check before you wreck... an HTTP request

Before making an HTTP request, validate that URL to dodge unnecessary errors:

import requests

url = "http://www.example.com"
if is_valid_url(url):
    response = requests.get(url)
    # Keep calm and carry on with the response
else:
    print("Your URL does not get a green signal.")

Data integrity: Deal or no deal?

In data processing applications, especially ones involving user-input or external data sources, validating each URL is non-negotiable to preserve data integrity and steer clear of processing errors or security vulnerabilities.

explain-codes / Python / How to validate a url in Python? (Malformed or not)

Linked

Error: "dictionary update sequence element #0 has length 1; 2 is required" on Django 1.4



Check if a JavaScript string is a URL



Extract protocol and host name from URL



How to replace whitespaces with underscore?



How to read html from a url in python 3

