Hub Python Library documentation

HF URIs


A HF URI is a URI-like string that identifies a location on the Hugging Face Hub. Throughout the library and the CLI, hf://... strings are used to point at:

  • a model, dataset, space or kernel repository (optionally pinned at a revision);
  • a file or sub-folder inside such a repository;
  • a bucket or a sub-folder inside a bucket.

A HF mount wraps a HF URI with a local mount path and an optional :ro / :rw flag, used by Spaces and Jobs volumes.

This page documents the canonical syntax of HF URIs and HF mounts. The same parser is used everywhere in the library, so a URI that is valid in one context (e.g. HfFileSystem) is parsed identically in another.

parse_hf_uri() also accepts Hugging Face web URLs (e.g. https://huggingface.co/datasets/my-org/my-dataset/blob/main/train.csv) so you can copy-paste a link from the website. See Web URLs below.

HF URI syntax

hf://[<TYPE>/]<ID>[@<REVISION>][/<PATH>]
| Component | Required | Allowed values |
|---|---|---|
| hf:// | yes | Literal protocol prefix. |
| <TYPE>/ | no | models/, datasets/, spaces/, kernels/, buckets/ (plural). |
| <ID> | yes | <namespace>/<name> |
| @<REVISION> | no | Branch, tag, commit SHA, or special ref (refs/pr/N, refs/convert/...). Repos only. |
| /<PATH> | no | Path inside the repo or bucket. |
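
To make the grammar concrete, here is a minimal sketch of how such a string could be split into its components. It is not the library's parser: the names SketchUri and sketch_parse are hypothetical, the special-ref pattern is an assumption based on the refs/pr/N and refs/convert/... forms listed above, and most of the library's validation is left out.

```python
import re
from typing import NamedTuple, Optional

# Plural type prefixes from the table above, mapped to singular type names.
TYPE_PREFIXES = {
    "models": "model",
    "datasets": "dataset",
    "spaces": "space",
    "kernels": "kernel",
    "buckets": "bucket",
}

# Assumed pattern: special refs contain "/" and are tried before a plain
# single-segment revision (branch, tag, or commit SHA).
REVISION = re.compile(r"refs/pr/\d+(?=/|$)|refs/convert/[^/]+|[^/]+")


class SketchUri(NamedTuple):  # hypothetical stand-in, not the library's HfUri
    type: str
    id: str
    revision: Optional[str]
    path: str


def sketch_parse(uri: str) -> SketchUri:
    """Split an hf:// string into its grammar components (illustrative only)."""
    if not uri.startswith("hf://"):
        raise ValueError("missing hf:// prefix")
    rest = uri[len("hf://"):]

    # Optional <TYPE>/ prefix; without it the URI is treated as a model.
    first, _, remainder = rest.partition("/")
    if first in TYPE_PREFIXES:
        type_, rest = TYPE_PREFIXES[first], remainder
    else:
        type_ = "model"

    # <ID> is always namespace/name.
    namespace, _, tail = rest.partition("/")
    name_match = re.match(r"[^/@]+", tail)
    if not namespace or name_match is None:
        raise ValueError("<ID> must be namespace/name")
    name = name_match.group()
    after = tail[name_match.end():]

    # Optional @<REVISION>, allowed for repositories only.
    revision = None
    if after.startswith("@"):
        if type_ == "bucket":
            raise ValueError("buckets do not support a revision")
        rev_match = REVISION.match(after, 1)
        if rev_match is None:
            raise ValueError("empty revision after @")
        revision = rev_match.group()
        after = after[rev_match.end():]

    # Whatever remains is "/<PATH>" or nothing.
    return SketchUri(type_, f"{namespace}/{name}", revision, after[1:])
```

With this sketch, sketch_parse("hf://datasets/my-org/my-dataset@refs/pr/10/data.csv") keeps refs/pr/10 together as the revision and returns data.csv as the path.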

HF mount syntax

hf://[<TYPE>/]<ID>[@<REVISION>][/<PATH>]:<MOUNT_PATH>[:ro|:rw]

A mount is a HF URI followed by :<MOUNT_PATH> and an optional :ro / :rw flag.

| Component | Required | Allowed values |
|---|---|---|
| <MOUNT_PATH> | yes | Absolute mount path (must start with /). |
| :ro / :rw | no | Read-only / read-write flag. |
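
As a rough illustration of the mount grammar, the sketch below peels the optional flag and the mount path off the end of the string and leaves the HF URI untouched. The function name sketch_split_mount is hypothetical and this is not how the library is implemented; it assumes that mount paths contain no colon, and it rejects a bare / because hf://org/m:/ is listed among the invalid inputs on this page.

```python
from typing import Optional, Tuple


def sketch_split_mount(spec: str) -> Tuple[str, str, Optional[bool]]:
    """Split a mount spec into (hf_uri, mount_path, read_only). Illustrative only."""
    # Peel off the optional trailing flag first.
    read_only: Optional[bool] = None
    if spec.endswith(":ro"):
        spec, read_only = spec[:-3], True
    elif spec.endswith(":rw"):
        spec, read_only = spec[:-3], False

    if not spec.startswith("hf://"):
        raise ValueError("missing hf:// prefix")

    # Skip the colon of "hf://", then split on the last remaining colon.
    body = spec[len("hf://"):]
    source, sep, mount_path = body.rpartition(":")
    if not sep or not source:
        raise ValueError("missing mount path")
    if not mount_path.startswith("/") or mount_path == "/":
        raise ValueError("mount path must be a non-empty absolute path")

    return "hf://" + source, mount_path, read_only
```

The first element of the result is a plain HF URI, so it can be handed to whatever URI parser is in use. When no flag is given, the third element is None rather than a default.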

What is a HF URI

The following are all valid HF URIs:

# Models (type prefix is optional, but the id is always 'namespace/name')
hf://my-org/my-model                            # implicit type prefix
hf://models/my-org/my-model                     # explicit type prefix
hf://models/my-org/my-model/config.json         # file inside a model repo
hf://models/my-org/my-model@v1.0/config.json    # pinned to a revision

# Datasets, Spaces, Kernels (type prefix is required)
hf://datasets/my-org/my-dataset
hf://datasets/my-org/my-dataset@dev/train.csv
hf://spaces/my-user/my-space
hf://kernels/my-org/my-kernel

# Special revisions (preserved as-is)
hf://datasets/my-org/my-dataset@refs/pr/10/data.csv
hf://datasets/my-org/my-dataset@refs/convert/parquet/data.parquet

# Buckets (always 'namespace/name', no revision)
hf://buckets/my-org/my-bucket
hf://buckets/my-org/my-bucket/sub/folder

The following are valid HF mounts (volume specifications):

hf://my-org/my-model:/data
hf://datasets/my-org/my-dataset:/mnt:ro
hf://datasets/my-org/my-dataset/train:/mnt:rw    # mount a sub-folder
hf://buckets/my-org/my-bucket:/storage:rw

What is not a HF URI

The parser is strict on purpose. The following are rejected:

| Invalid URI | Reason |
|---|---|
| my-org/my-model, ./local/path | Missing hf:// prefix and not a recognized Hugging Face URL. |
| hf://dataset/org/m, hf://model/org/m | Singular type forms are forbidden; use the plural (datasets/, …). |
| hf://datasets, hf://buckets/ | A type prefix alone is not a valid URI; an <ID> is required. |
| hf://gpt2, hf://datasets/squad | Canonical repos (without a namespace) are not supported. |
| hf://buckets/single-segment | Buckets must always be namespace/name. |
| hf://buckets/org/b@v1 | Buckets do not support a revision marker. |
| hf://org/m@, hf://datasets/foo/bar@/x | Empty revision after @. |
| hf://a/b/c@v1 | A repo id must be namespace/name; extra segments are paths. |
| hf://org/m:/ | Mount path must be a non-empty absolute path. |

Web URLs

For convenience, parse_hf_uri() also accepts Hugging Face web URLs that you copy-paste from your browser. They are normalized to the canonical hf:// form, so you can paste a URL straight into the CLI or the library and “it just works”:

>>> from huggingface_hub import parse_hf_uri
>>> parse_hf_uri("https://huggingface.co/datasets/my-org/my-dataset/blob/main/train.csv")
HfUri(type='dataset', id='my-org/my-dataset', revision='main', path_in_repo='train.csv')

URLs from huggingface.co (and its hf.co short domain, the staging host, and the host of a custom HF_ENDPOINT) are recognized, with or without the https:// scheme. Query strings (?download=true) and fragments (#L10) are ignored.
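
The host handling described above can be approximated with the standard library. This is only a sketch: sketch_url_path is a hypothetical helper, and the host set is reduced to the two public domains named here (the staging host and a custom HF_ENDPOINT host are omitted).

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed host list for illustration only.
KNOWN_HOSTS = {"huggingface.co", "hf.co"}


def sketch_url_path(url: str) -> Optional[str]:
    """Return the path of a recognized Hub URL, or None if the host is unknown."""
    if "://" not in url:
        url = "https://" + url  # tolerate a missing scheme
    parts = urlsplit(url)
    if parts.hostname not in KNOWN_HOSTS:
        return None
    # urlsplit keeps the query string and fragment separate, so returning
    # only the path is what drops "?download=true" and "#L10".
    return parts.path
```

A string such as my-org/my-model yields None here, because its first segment is read as a host that is not in the recognized set.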

Supported URL formats

The table below lists the supported routes. For brevity, only the URL path is shown (the part after the host, e.g. huggingface.co); the recognized host is implied.

| Points at | URL path |
|---|---|
| Model repository | /<ns>/<name> |
| Dataset repository (same for spaces/, kernels/, models/) | /datasets/<ns>/<name> |
| Folder inside a repo, pinned at <rev> | /<ns>/<name>/tree/<rev>[/<path>] |
| File inside a repo (file viewer route) | /<ns>/<name>/blob/<rev>/<path> |
| File inside a repo (download route) | /<ns>/<name>/resolve/<rev>/<path> |
| File inside a repo (raw route) | /<ns>/<name>/raw/<rev>/<path> |
| File inside a repo (blame route) | /<ns>/<name>/blame/<rev>/<path> |
| Bucket | /buckets/<ns>/<name> |
| File inside a bucket (buckets are not versioned) | /buckets/<ns>/<name>/resolve/<path> |
| Folder inside a bucket | /buckets/<ns>/<name>/tree/<path> |

The revision is taken from the single segment right after blob/resolve/raw/tree/blame. Special refs (refs/pr/N, refs/convert/...) are matched eagerly even though they contain /; any other branch/tag name containing / must be URL-encoded (feature%2Ffoo).
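
The revision rule can be sketched as follows. The helper sketch_split_revision and its regular expression are assumptions made for illustration, not the library's code; the input is the part of the URL path that follows blob/, resolve/, raw/, tree/ or blame/, and the path is returned without further decoding.

```python
import re
from typing import Tuple
from urllib.parse import unquote

# Assumed pattern for the special refs, which are allowed to contain "/".
SPECIAL_REF = re.compile(r"refs/pr/\d+(?=/|$)|refs/convert/[^/]+")


def sketch_split_revision(after_route: str) -> Tuple[str, str]:
    """Split the text that follows the route keyword into (revision, path)."""
    match = SPECIAL_REF.match(after_route)
    if match:
        # Special refs are matched eagerly, slashes included.
        revision = match.group()
        path = after_route[match.end():].lstrip("/")
    else:
        # Any other revision is exactly one segment.
        revision, _, path = after_route.partition("/")
    # A percent-encoded slash is decoded only after the split.
    return unquote(revision), path
```

Under this sketch, feature%2Ffoo/README.md gives the revision feature/foo, whereas the unencoded feature/foo/README.md gives the revision feature and the path foo/README.md, which is why the encoding is needed.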

URLs that are not parsed

When a URL is ambiguous or does not point at a concrete Hub location, it is rejected; the parser never guesses. Paths below are again shown without the host, except for the last row, where the host itself is the reason for rejection.

| Reason | URL path |
|---|---|
| Single-segment URL (user/org page, listing page, or canonical repo) is ambiguous. | /<username> |
| Listing page, no <ns>/<name>. | /datasets |
| Canonical repos (without a namespace) are not supported. | /gpt2 |
| Not a file/folder location (same for commits, discussions, settings, …). | /<ns>/<name>/commit/<rev> |
| Collections are not repositories. | /collections/<ns>/<slug> |
| Host is not a recognized Hugging Face host. | https://example.com/<ns>/<name> |

Rendering a web URL

HfUri.to_url() is the inverse of parsing a URL: it renders the browsable web URL for a HF URI.

>>> uri = parse_hf_uri("hf://datasets/my-org/my-dataset@v1/train.csv")
>>> uri.to_url()
'https://huggingface.co/datasets/my-org/my-dataset/blob/v1/train.csv'

It points at the repository / bucket landing page when no path or revision is set, at the folder viewer (/tree/<rev>) when only a revision is set, at the file viewer (/blob/<rev>/<path>, revision defaulting to main) for repository files, and at the tree route (/tree/<path>) for bucket files. Special characters in the path (spaces, #, …) are percent-encoded. Pass endpoint=... to target a custom host.
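
The routing rules in the previous paragraph can be written out as a small decision function. This is a sketch under assumptions, not the method's source: sketch_to_url is a hypothetical name, the revision is inserted without any encoding, and every repository path is sent to the file viewer.

```python
from typing import Optional
from urllib.parse import quote


def sketch_to_url(
    type_: str,
    id_: str,
    revision: Optional[str] = None,
    path: str = "",
    endpoint: str = "https://huggingface.co",
) -> str:
    """Pick the web route for a parsed URI (illustrative only)."""
    # Models live at the root of the site; other types use a plural prefix.
    prefix = "" if type_ == "model" else f"{type_}s/"
    base = f"{endpoint}/{prefix}{id_}"

    # quote() keeps "/" but percent-encodes spaces, "#", and similar characters.
    encoded_path = quote(path)

    if type_ == "bucket":
        # Buckets are not versioned: landing page or tree route.
        return f"{base}/tree/{encoded_path}" if path else base
    if path:
        # Repository file: file viewer, with the revision defaulting to main.
        return f"{base}/blob/{revision or 'main'}/{encoded_path}"
    if revision:
        # Revision only: folder viewer.
        return f"{base}/tree/{revision}"
    return base
```

For example, sketch_to_url("model", "my-org/my-model", path="my file#1.txt") ends in /blob/main/my%20file%231.txt.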

Parsing in Python

Parsing URIs

parse_hf_uri() is the centralized URI parser. It is a pure string parser (no network calls) and returns a frozen HfUri dataclass.

>>> from huggingface_hub import parse_hf_uri
>>> parse_hf_uri("hf://datasets/my-org/my-dataset@refs/pr/3/train.json")
HfUri(type='dataset', id='my-org/my-dataset', revision='refs/pr/3', path_in_repo='train.json')

HfUri is round-trippable via HfUri.to_uri(), which always emits the canonical form (with an explicit type prefix):

>>> uri = parse_hf_uri("hf://my-org/my-model@v1/config.json")
>>> uri.to_uri()
'hf://models/my-org/my-model@v1/config.json'

Read the type and id fields directly to find out what a URI points at. When you only need to tell repository URIs from bucket URIs, use the boolean properties is_repo and is_bucket.
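
To illustrate what a frozen, round-trippable value object of this kind looks like, here is a minimal stand-in. SketchUri is hypothetical and is not the library's HfUri; in particular, defining is_repo as "anything that is not a bucket" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchUri:  # hypothetical stand-in, not the library's HfUri
    type: str
    id: str
    revision: Optional[str] = None
    path_in_repo: str = ""

    @property
    def is_bucket(self) -> bool:
        return self.type == "bucket"

    @property
    def is_repo(self) -> bool:
        # Assumption: every non-bucket type is a repository.
        return not self.is_bucket

    def to_uri(self) -> str:
        """Render the canonical form, always with an explicit plural type prefix."""
        out = f"hf://{self.type}s/{self.id}"
        if self.revision:
            out += f"@{self.revision}"
        if self.path_in_repo:
            out += f"/{self.path_in_repo}"
        return out


uri = SketchUri("model", "my-org/my-model", "v1", "config.json")
canonical = uri.to_uri()
```

Because the dataclass is frozen, assigning to a field after construction raises an error, so a parsed value can be shared safely between callers.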

Parsing mounts

parse_hf_mount() parses a mount specification (a HF URI with a local mount path and an optional :ro/:rw flag) and returns a frozen HfMount dataclass. It uses parse_hf_uri() under the hood.

>>> from huggingface_hub import parse_hf_mount
>>> parse_hf_mount("hf://buckets/my-org/my-bucket/sub/dir:/mnt:ro")
HfMount(source=HfUri(type='bucket', id='my-org/my-bucket', revision=None, path_in_repo='sub/dir'), mount_path='/mnt', read_only=True)

HfMount is round-trippable via HfMount.to_uri():

>>> mount = parse_hf_mount("hf://my-org/my-model:/data:ro")
>>> mount.to_uri()
'hf://models/my-org/my-model:/data:ro'

Reference

class huggingface_hub.HfUri

( type: typing.Literal['model', 'dataset', 'space', 'kernel', 'bucket'] id: str revision: str | None = None path_in_repo: str = '' _raw: str | None = None )

Parameters

  • type (str) — One of ‘model’, ‘dataset’, ‘space’, ‘kernel’ or ‘bucket’.
  • id (str) — The repository id (‘namespace/name’, e.g. ‘my-org/my-model’) for repo URIs, or the bucket id (‘namespace/name’) for bucket URIs.
  • revision (str, optional) — The revision specified after ‘@’ in the URI, URL-decoded. ‘None’ if no revision was specified, or for bucket URIs (which never carry a revision). Special refs like ‘refs/pr/10’ and ‘refs/convert/parquet’ are preserved as-is.
  • path_in_repo (str) — The path inside the repo or bucket. Empty string if the URI points at the root.

Parsed representation of a Hugging Face Hub URI (‘hf://…’).

to_uri

( )

Render the URI as a canonical ‘hf://’ string.

The type prefix is always written explicitly (e.g. ‘hf://models/my-org/my-model’).

to_url

( endpoint: str | None = None ) str

Returns

str

the web URL.

Render the URI as a Hugging Face web URL (the kind you open in a browser).

This is the inverse of parsing a URL with parse_hf_uri(). The returned URL points at:

  • the repository / bucket landing page when no path or revision is set;
  • the folder viewer (‘/tree/<revision>’) when only a revision is set;
  • the file viewer (‘/blob/<revision>/<path>’) for repository files (revision defaults to ‘main’);
  • the tree route (‘/tree/<path>’) for bucket files (buckets are not versioned).

Example:

>>> from huggingface_hub import parse_hf_uri
>>> parse_hf_uri("hf://datasets/my-org/my-dataset@v1/train.csv").to_url()
'https://huggingface.co/datasets/my-org/my-dataset/blob/v1/train.csv'

huggingface_hub.parse_hf_uri

( uri: str ) HfUri

Returns

HfUri

the parsed URI.

Raises

HfUriError

  • HfUriError — If the URI is malformed (missing prefix, invalid type, missing id, unsupported URL route, etc.).

Parse a Hugging Face Hub URI (‘hf://…’) or a Hugging Face web URL.

A HF URI is a URI-like string identifying a location on the Hugging Face Hub. The full grammar is:

hf://[<TYPE>/]<ID>[@<REVISION>][/<PATH>]

For convenience, Hugging Face web URLs (the ones you copy-paste from the website) are also accepted and normalized to the canonical ‘hf://’ form, e.g. ‘https://huggingface.co/datasets/my-org/my-dataset/blob/main/train.csv’. Only unambiguous URLs (repository / bucket pages and file/folder viewer routes) are accepted; any other route is rejected.

See ‘docs/source/en/package_reference/hf_uris.md’ for the full specification.

Examples:

>>> from huggingface_hub import parse_hf_uri
>>> parse_hf_uri("hf://my-org/my-model")
HfUri(type='model', id='my-org/my-model', revision=None, path_in_repo='')
>>> parse_hf_uri("hf://datasets/my-org/my-dataset@refs/pr/3/train.json")
HfUri(type='dataset', id='my-org/my-dataset', revision='refs/pr/3', path_in_repo='train.json')
>>> parse_hf_uri("https://huggingface.co/datasets/my-org/my-dataset/blob/main/train.csv")
HfUri(type='dataset', id='my-org/my-dataset', revision='main', path_in_repo='train.csv')

class huggingface_hub.utils.HfMount

( source: HfUri mount_path: str read_only: bool | None = None _raw: str | None = None )

Parameters

  • source (HfUri) — The parsed HF URI identifying the Hub resource to mount.
  • mount_path (str) — The local mount path (always starts with ‘/’).
  • read_only (bool, optional) — True if the mount ends with ‘:ro’, False if it ends with ‘:rw’, ‘None’ if no flag was provided.

A HF URI paired with a local mount path and optional read-only flag.

Used by Spaces and Jobs to describe volume mounts. The full syntax is:

hf://[<TYPE>/]<ID>[@<REVISION>][/<PATH>]:<MOUNT_PATH>[:ro|:rw]

to_uri

( )

Render the mount as a canonical ‘hf://’ string.

Example: ‘hf://models/my-org/my-model:/data:ro’

huggingface_hub.utils.parse_hf_mount

( mount_str: str ) HfMount

Parameters

  • mount_str (str) — The mount string to parse. Must start with ‘hf://’ and contain a ‘:’ segment.

Returns

HfMount

the parsed mount.

Raises

HfUriError

  • HfUriError — If the mount string is malformed (missing mount path, invalid URI, etc.).

Parse a HF mount specification (‘hf://…:<MOUNT_PATH>[:ro|:rw]’).

A mount specification is a HF URI followed by a local mount path and an optional read-only/read-write flag.

The full grammar is:

hf://[<TYPE>/]<ID>[@<REVISION>][/<PATH>]:<MOUNT_PATH>[:ro|:rw]

See ‘docs/source/en/package_reference/hf_uris.md’ for the full specification.

Examples:

>>> from huggingface_hub.utils import parse_hf_mount
>>> parse_hf_mount("hf://my-org/my-model:/data:ro")
HfMount(source=HfUri(type='model', id='my-org/my-model', revision=None, path_in_repo=''), mount_path='/data', read_only=True)
>>> parse_hf_mount("hf://buckets/my-org/my-bucket/sub/dir:/mnt:rw")
HfMount(source=HfUri(type='bucket', id='my-org/my-bucket', revision=None, path_in_repo='sub/dir'), mount_path='/mnt', read_only=False)