# pyrfs

> Pythonic filesystem ergonomics inspired by R's fs — tidy paths, typed values, chainable, pandas-friendly

pyrfs is a Python filesystem library porting the UX of R's fs package: consistent noun_verb naming (path_*, file_*, dir_*, link_*), tidy paths, typed self-describing values (FsPath, Bytes, Perms), explicit failure, and three interchangeable surfaces — functional, fluent FsPath chaining, and a pandas Series accessor with typed DataFrame columns.

# Start here

# pyrfs

**Pythonic filesystem ergonomics, inspired by R's [fs](https://fs.r-lib.org).**

Tidy paths, typed self-describing values, explicit failure — chainable, and pandas-native. Pure Python ≥ 3.10, zero hard dependencies.

```
import pyrfs as fs

fs.dir_ls("src", recurse=True, glob="*.py")   # [FsPath('src/app.py'), ...]
fs.file_size("data.csv") > "10MB"             # True — sizes compare to literals
fs.file_copy("a.txt", "backup/")              # FsPath('backup/a.txt'), refuses to clobber
```

## Install

Not yet on PyPI — install from GitHub:

```
pip install "pyrfs @ git+https://github.com/Lightbridge-KS/pyrfs"
# with the pandas integration:
pip install "pyrfs[pandas] @ git+https://github.com/Lightbridge-KS/pyrfs"
```

## One engine, three surfaces

Every operation is implemented once and reachable three ways: functional calls, fluent `FsPath` chaining, and the pandas accessor (shown below in that order). Pick per task, mix freely:

```
import pyrfs as fs

fs.path("foo", "bar", "a", ext="txt")    # FsPath('foo/bar/a.txt')
fs.dir_ls("data", glob="*.csv")
fs.file_copy("a.txt", "b.txt")           # -> FsPath('b.txt')
```

Closest to R's fs — the `noun_verb` names transfer directly. See [Coming from R's fs](https://pyrfs.netlify.app/coming-from-r/index.md).

```
from pyrfs import FsPath

(FsPath("data") / "raw.csv").with_ext("parquet").copy_to("clean/")
FsPath("project").mkdir().touch_file("README.md")
FsPath("logs").ls(glob="*.log")
```

`FsPath` **is a `str`** — it drops into `open()`, `pd.read_csv()`, any API that takes a path.

```
import pandas as pd
import pyrfs as fs

(fs.dir_info("src", recurse=True)
   .query("size > '10KB' and type == 'file'")   # typed columns!
   .sort_values("size", ascending=False))

df = pd.DataFrame({"path": fs.dir_ls("src", recurse=True)})
df["path"].fs.ext()        # vectorized over a column
```

`size` and `permissions` are real ExtensionDtypes — string literals work inside `.query()`.

## Where next

- **[Coming from R's fs](https://pyrfs.netlify.app/coming-from-r/index.md)** — the translation table.
- **[The three surfaces](https://pyrfs.netlify.app/guides/three-surfaces/index.md)** — when to use which.
- **[Typed values](https://pyrfs.netlify.app/guides/typed-values/index.md)** — `Bytes('444.5K')`, `Perms('rw-r--r--')`.
- **[Tour notebook](https://pyrfs.netlify.app/tour/pyrfs-tour/index.md)** — everything, runnable.
- **[API reference](https://pyrfs.netlify.app/api/paths/index.md)** — by family: `path_*`, `file_*`, `dir_*`, `link_*`.

# Coming from R's fs

pyrfs keeps fs's **UX contract** — consistent `noun_verb` naming, tidy paths, predictable typed returns, explicit failure — expressed in idiomatic Python. If you know fs, your muscle memory transfers: the functional names are identical.

## The four families

| Prefix  | Domain                                           | Examples                                                      |
| ------- | ------------------------------------------------ | ------------------------------------------------------------- |
| `path_` | construct & manipulate path strings (**no I/O**) | `path()`, `path_dir()`, `path_ext_set()`, `path_rel()`        |
| `file_` | operate on files                                 | `file_create()`, `file_copy()`, `file_info()`, `file_chmod()` |
| `dir_`  | operate on directories                           | `dir_create()`, `dir_ls()`, `dir_info()`, `dir_tree()`        |
| `link_` | operate on links                                 | `link_create()`, `link_path()`, `link_copy()`                 |

Plus predicates (`is_file`, `is_dir`, …), `user_ids`/`group_ids`, and temp helpers (`file_temp`, `path_temp`, `file_temp_push/pop`) — all as in fs.

## Translation table

| R fs                          | pyrfs functional              | pyrfs fluent                                |
| ----------------------------- | ----------------------------- | ------------------------------------------- |
| `path("a", "b", ext = "txt")` | `path("a", "b", ext="txt")`   | `FsPath("a") / "b"` then `.with_ext("txt")` |
| `dir_ls("d", recurse = TRUE)` | `dir_ls("d", recurse=True)`   | `FsPath("d").ls(recurse=True)`              |
| `dir_info("d")`               | `dir_info("d")` → DataFrame   | —                                           |
| `file_copy("a", "b")`         | `file_copy("a", "b")`         | `FsPath("a").copy_to("b")`                  |
| `file_size("a")`              | `file_size("a")` → `Bytes`    | `FsPath("a").size()`                        |
| `path_ext_set("a.txt", "md")` | `path_ext_set("a.txt", "md")` | `FsPath("a.txt").with_ext("md")`            |
| `path_rel("a/b", "a")`        | `path_rel("a/b", "a")`        | `FsPath("a/b").rel_to("a")`                 |
| `dir_tree("d")`               | `dir_tree("d")`               | `FsPath("d").tree()`                        |
| `fs_bytes("10MB")`            | `Bytes("10MB")`               | —                                           |
| `fs_perms("644")`             | `Perms("644")`                | —                                           |
| `x %>% file_delete()`         | loop / `df.pipe(...)`         | `FsPath(x).delete()`                        |

## The headline demo, ported

```
# R
dir_info("src", recurse = FALSE) |>
  filter(type == "file", size > "10KB") |>
  arrange(desc(size))
```

```
# Python (with the pandas extra)
(fs.dir_info("src")
   .query("size > '10KB' and type == 'file'")
   .sort_values("size", ascending=False))
```

`size` and `permissions` are real pandas ExtensionDtypes, so comparisons against human literals work inside `.query()` — same trick as fs's `fs_bytes`/`fs_perms` tibble columns.

## Vectorization

fs is vectorized end to end; Python is scalar-by-default. pyrfs functions are **polymorphic on the first argument**:

```
fs.path_ext("a.txt")              # 'txt'                 (scalar -> scalar)
fs.path_ext(["a.txt", "b.md"])    # ['txt', 'md']         (list -> list)
fs.path_ext(df["path"])           # pandas Series          (Series -> Series)
df["path"].fs.ext()               # the idiomatic column form
```
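One way this kind of first-argument polymorphism could be built is a small decorator. The sketch below illustrates the technique only and is not pyrfs's implementation: `vectorize_first` and `ext_of` are hypothetical names, and the pandas Series branch is left out to keep it stdlib-only.

```python
import functools
import os.path
from collections.abc import Iterable


def vectorize_first(fn):
    """Let `fn` accept one path or an iterable of paths (hypothetical helper)."""
    @functools.wraps(fn)
    def wrapper(first, *args, **kwargs):
        # A str is itself iterable, so treat it (and PathLike objects) as a scalar.
        if isinstance(first, (str, os.PathLike)) or not isinstance(first, Iterable):
            return fn(first, *args, **kwargs)
        return [fn(item, *args, **kwargs) for item in first]
    return wrapper


@vectorize_first
def ext_of(path):
    """Return the extension without its leading dot (stand-in for path_ext)."""
    return os.path.splitext(os.fspath(path))[1].lstrip(".")


print(ext_of("a.txt"))            # scalar in, scalar out
print(ext_of(["a.txt", "b.md"]))  # list in, list out
```

The key design point is the explicit `str` check: without it a single path would be iterated character by character.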

## What's different (on purpose)

- **Errors are Python-native.** `FileExistsError`/`FileNotFoundError`/`PermissionError` instead of classed `fs_error` conditions; `FsValueError` for pyrfs-level validation. `tryCatch` → `try/except`.
- **`recurse` defaults match fs** (`False` for listing, `True` for `dir_create`), and the argument accepts an `int` depth, exactly like fs.
- **Byte units are 1024-based across the board** — `Bytes("10MB") == Bytes("10MiB")`, matching `fs_bytes`.
- **`is_file`/`is_dir` classify the entry itself** (lstat): a symlink answers `True` only to `is_link` — fs semantics, not `os.path.isdir` semantics.
- **No `dir_move()`** — directories move via `file_move()`, same as fs.
- **`FsPath` is a `str`, not a `pathlib.Path`.** Best interop and pandas round-tripping; call `.as_pathlib()` when you want pathlib semantics. The `/` join concatenates then tidies — an absolute right-hand side does *not* reset the path (unlike `os.path.join`).
- **The split method is `parts()`** — `str.split()` is left untouched so `FsPath` never surprises code that treats it as a string.
- **`dir_walk()` is a lazy generator** rather than a callback walker — the Pythonic spin; `dir_ls()`/`dir_map()` are built on it.
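
The `/` join rule in the list above can be contrasted with `os.path.join` using only the standard library. This is a sketch of the described semantics, not pyrfs code: `tidy_join` is a hypothetical helper, and it assumes that tidying behaves like `posixpath.normpath` (which also resolves `..`, something pyrfs may or may not do).

```python
import posixpath


def tidy_join(left, right):
    """Concatenate with '/' and then normalise (hypothetical stand-in for FsPath's `/`)."""
    return posixpath.normpath(f"{left}/{right}")


# os.path.join semantics: an absolute right-hand side discards the left side.
print(posixpath.join("data", "/raw.csv"))   # /raw.csv

# Concatenate-then-tidy semantics: the doubled slash collapses instead.
print(tidy_join("data", "/raw.csv"))        # data/raw.csv
```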

# Guides

# Safety & errors

pyrfs inherits fs's stance: **explicit failure, destructive actions opt-in**. Nothing silently returns `False`; nothing clobbers unless you ask.

## Safe defaults (learn once)

| Argument          | Meaning                                           | Default                         | On                      |
| ----------------- | ------------------------------------------------- | ------------------------------- | ----------------------- |
| `overwrite`       | allow clobbering an existing target               | `False`                         | copy/move               |
| `recurse`         | `True` = fully, `False` = no, `int` = to depth    | `False` listing / `True` create | `dir_*`                 |
| `all`             | include hidden dotfiles                           | `False`                         | `dir_ls`, `dir_map`, …  |
| `type`            | filter by entry type (`"file"`, `"directory"`, …) | `"any"`                         | traversals              |
| `glob` / `regexp` | filter listings (mutually exclusive)              | `None`                          | `dir_ls`, `path_filter` |
| `fail`            | raise vs warn on unreadable entries               | `True`                          | traversals              |

Behavior flags are keyword-only, so call sites are self-documenting: `file_copy(a, b, overwrite=True)`.

## The error model

```
fs.file_copy("a.txt", "b.txt")        # FileExistsError if b.txt exists
fs.dir_ls("nope")                     # FileNotFoundError
fs.path_filter(ps, glob="*.py", regexp=r"\.py$")
                                      # FsValueError: cannot set both
```

- **OS-level failures raise native `OSError` subclasses** — `FileNotFoundError`, `FileExistsError`, `PermissionError` — familiar and `except`-able.
- **pyrfs-level validation raises `FsError`** (usually the `FsValueError` subclass): conflicting arguments, bad size/permission literals, deleting a non-symlink with `link_delete`.
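
To make the validation side of the error model concrete, here is a minimal stdlib sketch of a glob/regexp filter that rejects conflicting arguments. The classes and `filter_paths` are local stand-ins, not pyrfs's code; in particular, whether the real `FsValueError` also subclasses `ValueError` is an assumption.

```python
import fnmatch
import re


class FsError(Exception):
    """Stand-in for the library's base error (hierarchy assumed)."""


class FsValueError(FsError, ValueError):
    """Stand-in for argument-validation failures."""


def filter_paths(paths, *, glob=None, regexp=None, invert=False):
    """Keep paths matching `glob` or `regexp`; the two are mutually exclusive."""
    if glob is not None and regexp is not None:
        raise FsValueError("cannot set both `glob` and `regexp`")

    def matches(p):
        if glob is not None:
            return fnmatch.fnmatch(p, glob)
        if regexp is not None:
            return re.search(regexp, p) is not None
        return True

    # `invert` flips the predicate, keeping the non-matching paths.
    return [p for p in paths if matches(p) != invert]


paths = ["src/app.py", "src/notes.md"]
print(filter_paths(paths, glob="*.py"))

try:
    filter_paths(paths, glob="*.py", regexp=r"\.py$")
except FsError as err:  # catching the base class also catches the subclass
    print(type(err).__name__, err)
```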

## Softening traversals: `fail=False`

One unreadable entry shouldn't abort a whole directory walk:

```
fs.dir_ls("/var", recurse=True, fail=False)
# UserWarning: skipping unreadable directory: ...
# -> returns everything it *could* read
```

This is a direct port of fs's `fail` knob, and applies to `dir_ls`, `dir_walk`, `dir_map`, and `dir_info`.
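The raise-or-warn switch can be sketched with `os.scandir` and `warnings`. This is an illustration of the idea under simplifying assumptions, not the library's traversal engine: `walk_entries` is a hypothetical name, and it softens any `OSError` raised while reading a directory (including a missing root), which may be broader than what pyrfs does.

```python
import os
import warnings


def walk_entries(root, *, fail=True):
    """Yield entry paths under `root` depth-first; soften read errors when fail=False."""
    try:
        with os.scandir(root) as it:
            entries = sorted(it, key=lambda e: e.name)
    except OSError as err:
        if fail:
            raise
        # Warn and carry on with whatever else is readable.
        warnings.warn(f"skipping unreadable directory: {root} ({err})")
        return
    for entry in entries:
        yield entry.path
        if entry.is_dir(follow_symlinks=False):
            yield from walk_entries(entry.path, fail=fail)
```

Because the function is a generator, nothing is read (and nothing is raised or warned) until the caller starts iterating.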

## Destination resolution (copy/move)

Copying or moving **into an existing directory** targets `dir/basename` — shell `cp`/`mv` semantics — and the `overwrite` guard applies to that *resolved* target:

```
fs.file_copy("report.pdf", "archive/")     # -> FsPath('archive/report.pdf')
fs.file_copy("report.pdf", "archive/")     # FileExistsError
```
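The resolve-then-guard order described above can be sketched with `shutil`. `copy_file` is a hypothetical stand-in written for illustration; it returns a plain `str` rather than an `FsPath` and is not the library's implementation.

```python
import os
import shutil


def copy_file(path, new_path, *, overwrite=False):
    """Copy `path`, resolving a directory destination to dir/basename first."""
    target = new_path
    if os.path.isdir(new_path):
        target = os.path.join(new_path, os.path.basename(path))
    # The guard checks the resolved target, not the directory that was passed in.
    if os.path.lexists(target) and not overwrite:
        raise FileExistsError(f"target already exists: {target!r} (pass overwrite=True)")
    shutil.copy2(path, target)
    return target
```

Resolving first is what makes the second `file_copy("report.pdf", "archive/")` fail: the directory existing is fine, but `archive/report.pdf` already existing is not.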

There is no `dir_move()`: directories are files at the OS level, so `file_move()` moves them — same deliberate choice as fs.

# The three surfaces

Every pyrfs operation is implemented **once** in a pure-stdlib engine; the three user-facing surfaces are thin delegates. They interoperate freely — `dir_ls()` returns `FsPath`s you can chain methods on or drop into a DataFrame column.

## A — Functional: scripts and R muscle memory

```
import pyrfs as fs

files = fs.dir_ls("data", recurse=True, glob="*.csv")
fs.dir_create("backup")
for f in files:
    fs.file_copy(f, "backup/", overwrite=True)
```

Names mirror R's fs exactly — see [Coming from R's fs](https://pyrfs.netlify.app/coming-from-r/index.md). Functions are polymorphic on the first argument (scalar → scalar, list → list, Series → Series).

## B — Fluent `FsPath`: OO-style chaining

```
from pyrfs import FsPath

report = (FsPath("analysis") / "draft.md").with_ext("html")
work = FsPath("project").mkdir().touch_file("README.md").touch_file("setup.py")
big_logs = [p for p in FsPath("logs").walk(glob="*.log") if p.size() > "5MB"]
```

Because `FsPath` subclasses `str`:

- `open(p)`, `pd.read_csv(p)`, `json.dump(..., open(p, "w"))` all just work;
- every `str` method behaves normally (`p.startswith("src/")`, `p.split("/")`);
- it serializes cleanly (JSON, parquet, databases) as a plain string.

Mutating verbs return the resulting path, so chains read top-to-bottom like R pipes.
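Why a `str` subclass interoperates so easily can be seen in a toy version. `TidyPath` below is a hypothetical class written for this illustration, not `FsPath` itself, and it assumes tidying is equivalent to `posixpath.normpath`.

```python
import json
import posixpath


class TidyPath(str):
    """A str subclass that tidies itself on construction (hypothetical, not FsPath)."""

    def __new__(cls, value):
        # normpath collapses doubled slashes and drops a trailing slash.
        return super().__new__(cls, posixpath.normpath(value))

    def __truediv__(self, other):
        return TidyPath(f"{self}/{other}")

    def with_ext(self, ext):
        stem, _ = posixpath.splitext(self)
        return TidyPath(f"{stem}.{ext}")


p = (TidyPath("analysis//") / "draft.md").with_ext("html")
print(p)                       # analysis/draft.html
print(isinstance(p, str))      # True
print(json.dumps({"out": p}))  # {"out": "analysis/draft.html"}
```

Each method returns a new `TidyPath`, which is what lets calls chain while every intermediate value remains usable as an ordinary string.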

## C — pandas: columns and frames

Requires the extra: `pip install "pyrfs[pandas]"`.

```
import pandas as pd
import pyrfs as fs

# typed frame in, typed frame out
big = (fs.dir_info("src", recurse=True)
         .query("size > '10KB' and type == 'file'")
         .sort_values("size", ascending=False))

# vectorized path algebra over a column
df = pd.DataFrame({"path": fs.dir_ls("src", recurse=True, type="file")})
df.assign(
    ext=df["path"].fs.ext(),
    dir=df["path"].fs.dir(),
    size=df["path"].fs.size(),     # a real 'bytes'-dtype column
)
```

Without pandas installed, the core works unchanged and `*_info` returns `list[dict]` rows carrying the same typed scalars.

## Choosing

| Situation                                       | Reach for        |
| ----------------------------------------------- | ---------------- |
| Shell-script-like automation, R habits          | **A** functional |
| Building paths through transformations, OO code | **B** fluent     |
| Filtering/aggregating many files as data        | **C** pandas     |

# Typed values

The heart of fs's charm: values that *know what they are* and print for humans. pyrfs ships three — each subclasses a builtin, so it still behaves like its base type everywhere.

## `Bytes` ⊂ `int`

```
from pyrfs import Bytes

Bytes("10MB")                        # Bytes(10485760)
str(Bytes(455200))                   # '444.5K'
Bytes(455200) < "1MB"                # True — comparisons parse literals
sum([Bytes("1MB"), Bytes("500KB")])  # Bytes -> '1.49M' (arithmetic stays typed)
```

**All units are 1024-based.** `"10MB"`, `"10MiB"` and `"10M"` all mean `10 * 1024**2`, matching R's `fs_bytes`. `repr()` stays exact (`Bytes(455200)`); `str()`/`format()` humanize.

`file_size()` returns `Bytes`, so `fs.file_size("x.bin") > "10KB"` reads like the question you're asking.
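The mechanics behind a self-describing size can be sketched as an `int` subclass that parses literals and coerces the other operand in comparisons. `Size` is a hypothetical class for illustration, not the real `Bytes`; its display rounding (one decimal place) is a simplification and will not match the library's output in every case.

```python
import re

_UNITS = "KMGTPE"  # each step is a factor of 1024


class Size(int):
    """An int subclass that parses and prints human sizes (hypothetical, not Bytes)."""

    def __new__(cls, value):
        if isinstance(value, str):
            m = re.fullmatch(r"\s*([\d.]+)\s*([KMGTPE]?)(?:i?B)?\s*", value, re.IGNORECASE)
            if m is None:
                raise ValueError(f"not a size literal: {value!r}")
            number, unit = float(m.group(1)), m.group(2).upper()
            power = _UNITS.index(unit) + 1 if unit else 0
            value = int(number * 1024 ** power)
        return super().__new__(cls, value)

    def __str__(self):
        size, unit = float(self), ""
        for unit_name in _UNITS:
            if size < 1024:
                break
            size, unit = size / 1024, unit_name
        return f"{size:.1f}".rstrip("0").rstrip(".") + unit

    # Comparisons coerce the other operand, so string literals work.
    def __lt__(self, other):
        return int(self) < int(Size(other))

    def __gt__(self, other):
        return int(self) > int(Size(other))


print(str(Size(455200)))      # 444.5K
print(Size(455200) < "1MB")   # True
```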

## `Perms` ⊂ `int`

```
from pyrfs import Perms

Perms("644")                  # Perms('rw-r--r--')
Perms("644") == "rw-r--r--"   # True
Perms("644") == "u=rw,go=r"   # True — symbolic forms parse too
Perms("644") | "u+x"          # Perms('rwxr--r--') — mode algebra stays typed
```

`file_chmod()` accepts all the same forms, and symbolic modes apply **relative to the current mode** (chmod semantics): `fs.file_chmod("run.sh", "u+x")`.
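What "relative to the current mode" means can be shown with a small symbolic-mode interpreter. This is a simplified sketch, not the parser pyrfs uses: `apply_symbolic` and `rwx` are hypothetical helpers that handle only `u`/`g`/`o`/`a`, the operators `+`, `-`, `=`, and the `r`/`w`/`x` bits.

```python
import stat

_WHO_SHIFT = {"u": 6, "g": 3, "o": 0}
_PERM_BITS = {"r": 4, "w": 2, "x": 1}


def apply_symbolic(mode, spec):
    """Apply chmod-style clauses such as 'u+x' or 'u=rw,go=r' to an integer mode."""
    for clause in spec.split(","):
        for op in "+-=":
            if op in clause:
                who, _, perms = clause.partition(op)
                break
        else:
            raise ValueError(f"bad clause: {clause!r}")
        who = "ugo" if who in ("", "a") else who
        bits = sum(_PERM_BITS[p] for p in perms)
        for w in who:
            shifted = bits << _WHO_SHIFT[w]
            if op == "+":
                mode |= shifted
            elif op == "-":
                mode &= ~shifted
            else:  # '=' replaces that class's three bits
                mode = (mode & ~(0o7 << _WHO_SHIFT[w])) | shifted
    return mode


def rwx(mode):
    """Render the nine permission bits, e.g. 0o644 -> 'rw-r--r--'."""
    return stat.filemode(stat.S_IFREG | mode)[1:]


print(rwx(apply_symbolic(0o644, "u+x")))   # rwxr--r--
```

`+` and `-` only touch the named bits and leave the rest of the current mode alone, whereas `=` overwrites the whole class; that is the difference between `"u+x"` and `"u=x"`.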

## `FsPath` ⊂ `str`

```
from pyrfs import FsPath

FsPath("src//a.txt/")         # FsPath('src/a.txt')  — tidied on construction
FsPath("a") / "b" / "c.md"    # FsPath('a/b/c.md')
```

Tidy form: always `/` separators, no doubled or trailing slashes. In a terminal, the repr is coloured by on-disk type via `LS_COLORS` (degrades automatically on non-TTY or `NO_COLOR`).

## In pandas columns

With the `[pandas]` extra these become real ExtensionDtypes — `"bytes"`, `"perms"`, `"path"` — so whole columns display humanized, sort correctly, compare against literals inside `.query()`, and `sum()`/`min()`/`max()` return typed scalars:

```
s = pd.Series(["1K", "10MB", "455"], dtype="bytes")
s > "1K"          # [False, True, False]
s.sum()           # Bytes -> '10M'
```
# API reference

# Directories — `dir_*`

All traversals share the fs filter set: `all`, `recurse` (bool or depth), `type`, `glob`/`regexp` (mutually exclusive), `invert`, `fail`.

## pyrfs.dir_create

```
dir_create(path: str, *, mode: int | str = 493, recurse: bool = True) -> FsPath
```

Create a directory (parents too when `recurse`); existing dirs are fine.

Vectorized: also accepts an iterable or pandas Series of paths.

Parameters:

| Name      | Type              | Description                                                                                                                         | Default    |
| --------- | ----------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ---------- |
| `path`    | `str or PathLike` | The directory to create.                                                                                                            | *required* |
| `mode`    | `int or str`      | Permissions for newly created directories (default 0o755); subject to the process umask.                                            | `493`      |
| `recurse` | `bool`            | Create missing parents too (default True, matching fs — note this differs from the recurse=False default of the listing functions). | `True`     |

Returns:

| Type     | Description                |
| -------- | -------------------------- |
| `FsPath` | The created path (chains). |

See Also

- `file_create`: The file counterpart.
- `FsPath.mkdir`: Fluent equivalent.

Examples:

```
>>> dir_create("out/plots")
FsPath('out/plots')
>>> dir_exists("out/plots")
True
```

## pyrfs.dir_exists

```
dir_exists(path: str) -> bool
```

Whether the path exists and is a directory (follows symlinks).

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

pyrfs.is_dir : Entry-itself (lstat) semantics — a symlink to a directory answers `False` there but `True` here.
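
The difference between the two semantics is the difference between `stat()` and `lstat()`, which the standard library exposes directly. The snippet below uses only stdlib calls to illustrate the distinction; it does not call pyrfs.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "real")
    link = os.path.join(tmp, "link")
    os.mkdir(real)
    os.symlink(real, link)  # may need extra privileges on Windows

    follows = os.path.isdir(link)                        # stat(): follows the link
    entry_itself = stat.S_ISDIR(os.lstat(link).st_mode)  # lstat(): the link itself
    is_link = os.path.islink(link)

print(follows, entry_itself, is_link)   # True False True
```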

## pyrfs.dir_ls

```
dir_ls(path: PathInput = '.', *, all: bool = False, recurse: bool | int = False, type: str | Iterable[str] = 'any', glob: str | None = None, regexp: str | None = None, invert: bool = False, fail: bool = True) -> list[FsPath]
```

List directory entries with the full fs filter set.

The eager form of `dir_walk` — same parameters, returns a sorted list.

Parameters:

| Name      | Type                     | Description                                                                                     | Default |
| --------- | ------------------------ | ----------------------------------------------------------------------------------------------- | ------- |
| `path`    | `str or PathLike`        | Directory to list (default: the working directory).                                             | `'.'`   |
| `all`     | `bool`                   | Include hidden dotfiles.                                                                        | `False` |
| `recurse` | `bool or int`            | True = full recursion, False = this level only, an int limits depth (1 = one level below path). | `False` |
| `type`    | `str or iterable of str` | Keep only these entry types ("file", "directory", "symlink", ...); "any" keeps all.             | `'any'` |
| `glob`    | `str`                    | Keep entries whose path matches this glob pattern (mutually exclusive with regexp).             | `None`  |
| `regexp`  | `str`                    | Keep entries whose path matches this regular expression (mutually exclusive with glob).         | `None`  |
| `invert`  | `bool`                   | Keep entries that do not match glob/regexp.                                                     | `False` |
| `fail`    | `bool`                   | Raise on unreadable entries (True) or warn and skip (False).                                    | `True`  |

Returns:

| Type             | Description                                             |
| ---------------- | ------------------------------------------------------- |
| `list of FsPath` | Entry paths, prefixed by path, siblings sorted by name. |

Raises:

| Type           | Description                                                           |
| -------------- | --------------------------------------------------------------------- |
| `FsValueError` | If both glob and regexp are set, or type names an unknown entry type. |

See Also

- `dir_walk`: The lazy (generator) form.
- `dir_info`: The same listing as typed stat rows / DataFrame.
- `pyrfs.path_filter`: The same glob/regexp filter for in-memory lists.

Examples:

```
>>> from pyrfs import file_touch
>>> _ = dir_create("proj/sub")
>>> _ = file_touch(["proj/a.py", "proj/b.txt"])
>>> dir_ls("proj")
[FsPath('proj/a.py'), FsPath('proj/b.txt'), FsPath('proj/sub')]
>>> dir_ls("proj", glob="*.py")
[FsPath('proj/a.py')]
>>> dir_ls("proj", type="directory")
[FsPath('proj/sub')]
```
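How a `recurse` argument that is either a bool or a depth might be interpreted is sketched below with `os.scandir`. `list_entries` is a hypothetical helper, and the reading of "1 = one level below path" (direct children plus the children of immediate subdirectories) is an assumption about the documented behaviour.

```python
import os


def list_entries(path=".", *, recurse=False):
    """Return entry paths under `path`; `recurse` is a bool or a depth limit."""
    # Normalise: False -> depth 0, True -> unlimited, int -> that many levels down.
    if recurse is True:
        depth = float("inf")
    elif recurse is False:
        depth = 0
    else:
        depth = recurse
    out = []
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        out.append(entry.path)
        if depth > 0 and entry.is_dir(follow_symlinks=False):
            child_depth = True if depth == float("inf") else depth - 1
            out.extend(list_entries(entry.path, recurse=child_depth))
    return out
```

The identity checks (`is True`, `is False`) matter because `True == 1` in Python; comparing with `==` would silently turn `recurse=1` into full recursion.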

## pyrfs.dir_walk

```
dir_walk(path: PathInput = '.', *, all: bool = False, recurse: bool | int = False, type: str | Iterable[str] = 'any', glob: str | None = None, regexp: str | None = None, invert: bool = False, fail: bool = True) -> Iterator[FsPath]
```

Lazily yield directory entries, with the full fs filter set.

Parameters:

| Name      | Type                     | Description                                                                                     | Default |
| --------- | ------------------------ | ----------------------------------------------------------------------------------------------- | ------- |
| `path`    | `str or PathLike`        | Directory to walk.                                                                              | `'.'`   |
| `all`     | `bool`                   | Include hidden dotfiles.                                                                        | `False` |
| `recurse` | `bool or int`            | True = full recursion, False = this level only, an int limits depth (1 = one level below path). | `False` |
| `type`    | `str or iterable of str` | Keep only these entry types ("file", "directory", "symlink", ...); "any" keeps all.             | `'any'` |
| `glob`    | `str`                    | Keep entries whose path matches this glob pattern (mutually exclusive with regexp).             | `None`  |
| `regexp`  | `str`                    | Keep entries whose path matches this regular expression (mutually exclusive with glob).         | `None`  |
| `invert`  | `bool`                   | Keep entries that do not match glob/regexp.                                                     | `False` |
| `fail`    | `bool`                   | Raise on unreadable entries (True) or warn and skip (False).                                    | `True`  |

Yields:

| Type     | Description                                             |
| -------- | ------------------------------------------------------- |
| `FsPath` | Entry paths, prefixed by path, siblings sorted by name. |

Raises:

| Type           | Description                                                           |
| -------------- | --------------------------------------------------------------------- |
| `FsValueError` | If both glob and regexp are set, or type names an unknown entry type. |

See Also

- `dir_ls`: The eager (list-returning) form.
- `dir_map`: Apply a function to each entry.

Examples:

```
>>> from pyrfs import file_touch
>>> _ = dir_create("logs")
>>> _ = file_touch("logs/a.log")
>>> walker = dir_walk("logs")  # nothing read yet — it's a generator
>>> next(walker)
FsPath('logs/a.log')
```

## pyrfs.dir_map

```
dir_map(path: PathInput, fn: Callable[[FsPath], object], *, all: bool = False, recurse: bool | int = False, type: str | Iterable[str] = 'any', glob: str | None = None, regexp: str | None = None, invert: bool = False, fail: bool = True) -> list[object]
```

Apply `fn` to each entry and collect the results.

Takes the same filter arguments as `dir_ls`.

See Also

dir_walk : Iterate lazily instead of collecting.

Examples:

```
>>> from pyrfs import file_touch
>>> _ = dir_create("d")
>>> _ = file_touch(["d/a.py", "d/b.py"])
>>> dir_map("d", lambda p: p.ext())
['py', 'py']
```

## pyrfs.dir_copy

```
dir_copy(path: str, new_path: PathInput, *, overwrite: bool = False) -> FsPath
```

Copy a directory tree to `new_path` (a name, or an existing directory).

Same destination resolution and `overwrite` guard as `file_copy`: copying into an existing directory targets `new_path/basename` (shell `cp -r` semantics). With `overwrite=True` an existing destination is *replaced*, not merged. Symlinks are copied as symlinks.

Parameters:

| Name        | Type              | Description                                                 | Default    |
| ----------- | ----------------- | ----------------------------------------------------------- | ---------- |
| `path`      | `str or PathLike` | Source directory.                                           | *required* |
| `new_path`  | `str or PathLike` | Destination name, or an existing directory to copy into.    | *required* |
| `overwrite` | `bool`            | Replace an existing (resolved) destination (default False). | `False`    |

Returns:

| Type     | Description               |
| -------- | ------------------------- |
| `FsPath` | The root of the new copy. |

Raises:

| Type                 | Description                                                  |
| -------------------- | ------------------------------------------------------------ |
| `NotADirectoryError` | If path is not a directory.                                  |
| `FileExistsError`    | If the (resolved) destination exists and overwrite is False. |

See Also

- `file_copy`: Single files.
- `file_move`: Directories move via `file_move` (there is no dir_move).

Examples:

```
>>> _ = dir_create("src/sub")
>>> dir_copy("src", "backup")
FsPath('backup')
>>> dir_exists("backup/sub")
True
```

## pyrfs.dir_delete

```
dir_delete(path: str) -> FsPath
```

Delete a directory and everything below it (recursive, like `rm -rf`).

Vectorized: also accepts an iterable or pandas Series of paths.

Returns:

| Type     | Description       |
| -------- | ----------------- |
| `FsPath` | The deleted path. |

See Also

- `file_delete`: Single files and symlinks.
- `FsPath.rmdir`: Fluent equivalent.

Examples:

```
>>> _ = dir_create("scratch/deep")
>>> dir_delete("scratch")
FsPath('scratch')
>>> dir_exists("scratch")
False
```

## pyrfs.dir_tree

```
dir_tree(path: PathInput = '.', *, recurse: bool | int = True, all: bool = False) -> None
```

Print a box-drawing tree of the directory, like the Unix `tree`.

Entries are coloured by type via `LS_COLORS` in a capable terminal (plain on non-TTY or `NO_COLOR`). Hidden files are skipped unless `all=True`; `recurse` limits depth as in `dir_ls`.

Examples:

```
>>> from pyrfs import file_touch
>>> _ = dir_create("proj/src")
>>> _ = file_touch("proj/README.md")
>>> dir_tree("proj")
proj
├── README.md
└── src
```
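The box-drawing layout itself takes only a few lines to produce. The sketch below is a hypothetical `tree_lines` helper using `os.listdir`, shown to explain the prefix bookkeeping; it omits colouring and depth limits and is not the library's renderer.

```python
import os


def tree_lines(path, prefix=""):
    """Return box-drawing lines for the entries under `path` (dotfiles skipped)."""
    names = sorted(n for n in os.listdir(path) if not n.startswith("."))
    lines = []
    for i, name in enumerate(names):
        last = i == len(names) - 1
        lines.append(prefix + ("└── " if last else "├── ") + name)
        child = os.path.join(path, name)
        if os.path.isdir(child) and not os.path.islink(child):
            # Children of the last entry get blank padding, others a continuing bar.
            lines.extend(tree_lines(child, prefix + ("    " if last else "│   ")))
    return lines
```

Printing the root name followed by `"\n".join(tree_lines(root))` gives output in the same shape as the example above.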

# Files — `file_*`

Mutating verbs return the (new) path so calls chain; `overwrite=False` on an existing target raises `FileExistsError`. Copy/move into an existing directory targets `dir/basename`.

## pyrfs.file_create

```
file_create(path: str, *, mode: int | str = 420) -> FsPath
```

Create a new file (an existing file is left unchanged).

Vectorized: also accepts an iterable or pandas Series of paths.

Parameters:

| Name   | Type              | Description                                                                                                                                     | Default    |
| ------ | ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |
| `path` | `str or PathLike` | The file to create. The parent directory must exist.                                                                                            | *required* |
| `mode` | `int or str`      | Permissions for a newly created file — octal string ("644"), symbolic ("u=rw,go=r"), or raw bits (default 0o644); subject to the process umask. | `420`      |

Returns:

| Type     | Description                |
| -------- | -------------------------- |
| `FsPath` | The created path (chains). |

See Also

- `file_touch`: Also update timestamps when the file exists.
- `pyrfs.dir_create`: The directory counterpart.

Examples:

```
>>> file_create("notes.txt")
FsPath('notes.txt')
```

## pyrfs.file_touch

```
file_touch(path: str) -> FsPath
```

Update access/modification times, creating the file if needed.

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

file_create : Create without updating timestamps of an existing file.

Examples:

```
>>> file_touch("stamp.txt")
FsPath('stamp.txt')
```

## pyrfs.file_copy

```
file_copy(path: str, new_path: PathInput, *, overwrite: bool = False) -> FsPath
```

Copy a file to `new_path` (a file name, or an existing directory).

Vectorized: copy many files into one directory with `file_copy([a, b], "dir")`.

Parameters:

| Name        | Type              | Description                                                                                               | Default    |
| ----------- | ----------------- | --------------------------------------------------------------------------------------------------------- | ---------- |
| `path`      | `str or PathLike` | Source file.                                                                                              | *required* |
| `new_path`  | `str or PathLike` | Destination file name, or an existing directory to copy into (the target then becomes new_path/basename). | *required* |
| `overwrite` | `bool`            | Allow clobbering an existing destination (default False).                                                 | `False`    |

Returns:

| Type     | Description               |
| -------- | ------------------------- |
| `FsPath` | The path of the new copy. |

Raises:

| Type              | Description                                                  |
| ----------------- | ------------------------------------------------------------ |
| `FileExistsError` | If the (resolved) destination exists and overwrite is False. |

See Also

- `file_move`: Move instead of copy.
- `pyrfs.dir_copy`: Copy a directory tree.
- `FsPath.copy_to`: Fluent equivalent.

Examples:

```
>>> src = file_create("a.txt")
>>> file_copy(src, "b.txt")
FsPath('b.txt')
>>> file_copy(src, "b.txt")
Traceback (most recent call last):
    ...
FileExistsError: target already exists: FsPath('b.txt') (pass overwrite=True)
```

## pyrfs.file_move

```
file_move(path: str, new_path: PathInput, *, overwrite: bool = False) -> FsPath
```

Move (rename) a file or a directory.

Same destination resolution and `overwrite` guard as `file_copy`. There is deliberately no `dir_move`, matching fs.

Parameters:

| Name        | Type              | Description                                               | Default    |
| ----------- | ----------------- | --------------------------------------------------------- | ---------- |
| `path`      | `str or PathLike` | Source file or directory.                                 | *required* |
| `new_path`  | `str or PathLike` | Destination name, or an existing directory to move into.  | *required* |
| `overwrite` | `bool`            | Allow clobbering an existing destination (default False). | `False`    |

Returns:

| Type     | Description       |
| -------- | ----------------- |
| `FsPath` | The new location. |

Raises:

| Type              | Description                                                  |
| ----------------- | ------------------------------------------------------------ |
| `FileExistsError` | If the (resolved) destination exists and overwrite is False. |

See Also

- `file_copy`: Copy instead of move.
- `FsPath.move_to`: Fluent equivalent.

Examples:

```
>>> _ = file_create("a.txt")
>>> file_move("a.txt", "b.txt")
FsPath('b.txt')
```

## pyrfs.file_delete

```
file_delete(path: str) -> FsPath
```

Delete a file or symlink (for directories use `dir_delete`).

Vectorized: also accepts an iterable or pandas Series of paths.

Returns:

| Type     | Description       |
| -------- | ----------------- |
| `FsPath` | The deleted path. |

Raises:

| Type                | Description                 |
| ------------------- | --------------------------- |
| `FileNotFoundError` | If the file does not exist. |

See Also

- `pyrfs.dir_delete`: Recursive directory deletion.
- `pyrfs.link_delete`: Symlink-only deletion (refuses non-links).

Examples:

```
>>> p = file_create("scrap.txt")
>>> file_delete(p)
FsPath('scrap.txt')
>>> file_exists(p)
False
```

## pyrfs.file_exists

```
file_exists(path: str) -> bool
```

Whether the path exists — a broken symlink counts as existing.

Uses `lexists` (the entry itself), matching fs. Vectorized: also accepts an iterable or pandas Series of paths.

See Also

- `pyrfs.dir_exists`: Directory-specific test (follows symlinks).
- `pyrfs.is_file`, `pyrfs.is_dir`, `pyrfs.is_link`: Type predicates.

Examples:

```
>>> _ = file_create("here.txt")
>>> file_exists(["here.txt", "gone.txt"])
[True, False]
```
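
The difference between testing the entry itself and testing its target can be seen with the standard library alone. The sketch below is not pyrfs code; it assumes a POSIX system where symlinks can be created and uses `os.path.lexists`, the call named in the description above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "dangling")
    # Point the link at a target that was never created.
    os.symlink(os.path.join(tmp, "missing-target"), link)

    follows_link = os.path.exists(link)    # follows the link; the target is missing
    entry_itself = os.path.lexists(link)   # looks only at the link entry
```

A check built on `os.path.exists` reports the dangling link as absent, while the `lexists` check still sees the entry.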

## pyrfs.file_access

```
file_access(path: str, mode: str = 'exists') -> bool
```

Test access to a path for the current process.

Vectorized: also accepts an iterable or pandas Series of paths.

Parameters:

| Name   | Type                                     | Description                  | Default    |
| ------ | ---------------------------------------- | ---------------------------- | ---------- |
| `path` | `str or PathLike`                        | The path to test.            | *required* |
| `mode` | `('exists', 'read', 'write', 'execute')` | The kind of access to check. | `"exists"` |

Raises:

| Type           | Description                                     |
| -------------- | ----------------------------------------------- |
| `FsValueError` | If mode is not one of the four accepted values. |

Examples:

```
>>> p = file_create("data.txt")
>>> file_access(p, "read")
True
```
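
For readers who know `os.access`, the four mode names line up naturally with its flags. The mapping and the helper `can_access` below are assumptions made for illustration, not the pyrfs implementation, and the sketch raises a plain `ValueError` where pyrfs documents `FsValueError`.

```python
import os
import tempfile

# Hypothetical mapping from the documented mode names to os.access flags.
ACCESS_FLAGS = {
    "exists": os.F_OK,
    "read": os.R_OK,
    "write": os.W_OK,
    "execute": os.X_OK,
}

def can_access(path, mode="exists"):
    """Illustrative helper: reject unknown modes, then ask the OS."""
    if mode not in ACCESS_FLAGS:
        raise ValueError(f"mode must be one of {sorted(ACCESS_FLAGS)}, got {mode!r}")
    return os.access(path, ACCESS_FLAGS[mode])

with tempfile.NamedTemporaryFile() as fh:
    exists = can_access(fh.name)
    readable = can_access(fh.name, "read")

# The temporary file is removed when the block exits.
missing = can_access(fh.name)
```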

## pyrfs.file_size

```
file_size(path: str) -> Bytes
```

File size as a `pyrfs.Bytes` value (compares against literals).

Vectorized: also accepts an iterable or pandas Series of paths.

Returns:

| Type    | Description                                                                                           |
| ------- | ----------------------------------------------------------------------------------------------------- |
| `Bytes` | The size — an int subclass that displays humanized (444.5K) and compares against strings like "10KB". |

See Also

pyrfs.Bytes : The typed scalar. file_info : Size together with the full stat row.

Examples:

```
>>> p = file_create("two-bytes.bin")
>>> with open(p, "wb") as fh:
...     _ = fh.write(b"hi")
>>> file_size(p)
Bytes(2)
>>> file_size(p) < "1KB"
True
```

## pyrfs.file_chmod

```
file_chmod(path: str, mode: int | str) -> FsPath
```

Change permissions; symbolic modes apply relative to the current mode.

Vectorized: also accepts an iterable or pandas Series of paths.

Parameters:

| Name   | Type              | Description                                                                                                                                             | Default    |
| ------ | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |
| `path` | `str or PathLike` | The file to change.                                                                                                                                     | *required* |
| `mode` | `int or str`      | Octal string ("644"), display form ("rw-r--r--"), or raw bits — all absolute; symbolic clauses ("u+x") modify the current mode, like the chmod command. | *required* |

See Also

pyrfs.Perms : The typed permission scalar. FsPath.chmod : Fluent equivalent.

Examples:

```
>>> p = file_create("run.sh", mode="644")
>>> _ = file_chmod(p, "u+x")
>>> file_access(p, "execute")
True
```
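
The absolute versus relative distinction can be reproduced with `os.chmod` and the `stat` module. This is a standard-library sketch of what a `"u+x"` clause means, assuming a POSIX filesystem; it does not show how pyrfs parses symbolic modes.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "run.sh")
    open(script, "w").close()

    # Absolute: every permission bit is replaced.
    os.chmod(script, 0o644)

    # Relative ("u+x"): read the current bits, then add only the owner-execute bit.
    current = stat.S_IMODE(os.stat(script).st_mode)
    os.chmod(script, current | stat.S_IXUSR)

    after = stat.S_IMODE(os.stat(script).st_mode)
    display = stat.filemode(os.stat(script).st_mode)
```

Here `after` is `0o744`: the group and other bits set by the absolute call are left alone by the relative one.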

## pyrfs.file_chown

```
file_chown(path: str, user: str | int | None = None, group: str | int | None = None) -> FsPath
```

Change owner and/or group (names or numeric ids; POSIX only).

Parameters:

| Name    | Type              | Description              | Default    |
| ------- | ----------------- | ------------------------ | ---------- |
| `path`  | `str or PathLike` | The file to change.      | *required* |
| `user`  | `str or int`      | New owner (name or uid). | `None`     |
| `group` | `str or int`      | New group (name or gid). | `None`     |

Raises:

| Type           | Description                         |
| -------------- | ----------------------------------- |
| `FsValueError` | If neither user nor group is given. |

## pyrfs.file_show

```
file_show(path: str) -> FsPath
```

Open a file in the OS default application (`open`/`xdg-open`).

Examples:

```
>>> file_show("report.pdf")
FsPath('report.pdf')
```

# FsPath

## pyrfs.FsPath

Bases: `str`

A tidy filesystem path string — the fluent pyrfs surface.

Construction normalizes the path (`/` separators, no doubled or trailing slashes). The `/` operator joins; methods chain because each returns an `FsPath`. Inherited `str` behavior is untouched — `p.split("/")`, `p.startswith(...)`, `open(p)` all work as on any string (the *split-into-components* method is `parts`, so `str.split` is never shadowed). In a capable terminal the repr is coloured by on-disk type via `LS_COLORS`.

See Also

pyrfs.path : Functional construction with an `ext=` option. as_pathlib : Convert when you want `pathlib` semantics.

Examples:

```
>>> FsPath("src//a.txt/")  # tidied on construction
FsPath('src/a.txt')
>>> (FsPath("foo") / "bar" / "a.txt").with_ext("md")
FsPath('foo/bar/a.md')
>>> FsPath("a/b").startswith("a")  # still a str
True
```

### __truediv__

```
__truediv__(other: str | PathLike[str]) -> FsPath
```

Join with `other`: `FsPath('a') / 'b'` -> `FsPath('a/b')`.

Concatenation + tidy: an absolute right-hand side does *not* reset the path (unlike `pathlib`/`os.path.join`).
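
To make the contrast concrete, the sketch below compares the reset behaviour of `pathlib` and `posixpath.join` with a plain concatenate-then-tidy join. `tidy_join` is a hypothetical helper written for this comparison, not the `FsPath` implementation.

```python
import posixpath
from pathlib import PurePosixPath

def tidy_join(left, right):
    """Illustrative helper: concatenate with '/', then collapse repeated slashes."""
    joined = f"{left}/{right}"
    while "//" in joined:
        joined = joined.replace("//", "/")
    return joined.rstrip("/") or "/"

pathlib_result = str(PurePosixPath("a") / "/b")   # the absolute part wins
os_path_result = posixpath.join("a", "/b")        # same reset behaviour
concat_result = tidy_join("a", "/b")              # left-hand side is kept
```

Both standard-library joins yield `'/b'`, while the concatenating join yields `'a/b'`, which is the behaviour described above.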

### __rtruediv__

```
__rtruediv__(other: str | PathLike[str]) -> FsPath
```

Support `'a' / FsPath('b')` joining from a plain string.

### ext

```
ext() -> str
```

Extension without the dot (`''` if none) — `pyrfs.path_ext`.

### with_ext

```
with_ext(ext: str) -> FsPath
```

Replace (or add) the extension; `''` removes it — `pyrfs.path_ext_set`.

Examples:

```
>>> (FsPath("data") / "raw.csv").with_ext("parquet")
FsPath('data/raw.parquet')
```

### dir

```
dir() -> FsPath
```

Directory part of the path (`'.'` if none) — `pyrfs.path_dir`.

### name

```
name() -> FsPath
```

File name — the last path component — `pyrfs.path_file`.

### parts

```
parts() -> list[str]
```

Path components (a leading root stays `'/'`) — `pyrfs.path_split`.

Named `parts` (as in `pathlib`) so `str.split` keeps its normal string behavior.

Examples:

```
>>> FsPath("/usr/bin").parts()
['/', 'usr', 'bin']
```

### rel_to

```
rel_to(start: str | PathLike[str]) -> FsPath
```

This path expressed relative to `start` — `pyrfs.path_rel`.

### has_parent

```
has_parent(parent: str | PathLike[str]) -> bool
```

Whether this path sits at or below `parent` — `pyrfs.path_has_parent`.

### expand

```
expand() -> FsPath
```

Expand a leading `~` to the home directory — `pyrfs.path_expand`.

### norm

```
norm() -> FsPath
```

Normalize `.` and `..` lexically — `pyrfs.path_norm`.

### abs

```
abs() -> FsPath
```

Absolute form (links unresolved) — `pyrfs.path_abs`.

### real

```
real() -> FsPath
```

Canonical form, symlinks resolved — `pyrfs.path_real`.

### copy_to

```
copy_to(new_path: str | PathLike[str], *, overwrite: bool = False) -> FsPath
```

Copy this file to `new_path` — `pyrfs.file_copy`.

Copying into an existing directory targets `new_path/basename`; an existing destination raises `FileExistsError` unless `overwrite=True`. Returns the new copy's path (chains).

### move_to

```
move_to(new_path: str | PathLike[str], *, overwrite: bool = False) -> FsPath
```

Move (rename) this file or directory — `pyrfs.file_move`.

Same destination resolution and `overwrite` guard as `copy_to`.

### create

```
create(*, mode: int | str = 420) -> FsPath
```

Create this file (existing files untouched) — `pyrfs.file_create`.

The default `mode` of `420` is the decimal form of octal `0o644` (`rw-r--r--`).

### touch

```
touch() -> FsPath
```

Update timestamps, creating the file if needed — `pyrfs.file_touch`.

### delete

```
delete() -> None
```

Delete this file or symlink — `pyrfs.file_delete`.

Returns `None`: a deleted path has nothing to chain onto. For directories use `rmdir`.

### exists

```
exists() -> bool
```

Whether this path exists (broken symlinks count) — `pyrfs.file_exists`.

### access

```
access(mode: str = 'exists') -> bool
```

Test `"exists"`/`"read"`/`"write"`/`"execute"` — `pyrfs.file_access`.

### size

```
size() -> Bytes
```

File size as a `pyrfs.Bytes` value — `pyrfs.file_size`.

Examples:

```
>>> FsPath("notes.txt").create().size() == 0
True
```

### chmod

```
chmod(mode: int | str) -> FsPath
```

Change permissions — `pyrfs.file_chmod`.

Symbolic modes (`"u+x"`) apply to the *current* mode; octal and display forms are absolute. Returns this path (chains).

### info

```
info() -> dict[str, object]
```

Stat this path into one row of typed values — `pyrfs.file_info`.

Returns a single `dict` (use the functional `pyrfs.file_info` / `pyrfs.dir_info` for tables).

### mkdir

```
mkdir(*, mode: int | str = 493, recurse: bool = True) -> FsPath
```

Create this directory (parents too when `recurse`) — `pyrfs.dir_create`.

The default `mode` of `493` is the decimal form of octal `0o755` (`rwxr-xr-x`).

Examples:

```
>>> FsPath("proj").mkdir().touch_file("README.md").ls()
[FsPath('proj/README.md')]
```

### rmdir

```
rmdir() -> None
```

Delete this directory and everything below it — `pyrfs.dir_delete`.

Recursive (`rm -rf` semantics), despite the `os.rmdir`-like name. Returns `None`: nothing left to chain onto.

### touch_file

```
touch_file(name: str | PathLike[str]) -> FsPath
```

Create a child file and return *this directory* (keeps chaining).

Returning the directory (not the new file) lets several `touch_file` calls chain; use `(p / name).touch()` when you want the file's path back.

### ls

```
ls(*, all: bool = False, recurse: bool | int = False, type: str | Iterable[str] = 'any', glob: str | None = None, regexp: str | None = None, invert: bool = False, fail: bool = True) -> list[FsPath]
```

List entries of this directory — `pyrfs.dir_ls` (same filters).

### walk

```
walk(*, all: bool = False, recurse: bool | int = True, type: str | Iterable[str] = 'any', glob: str | None = None, regexp: str | None = None, invert: bool = False, fail: bool = True) -> Iterator[FsPath]
```

Lazily yield entries below this directory — `pyrfs.dir_walk`.

Unlike the functional default, `recurse` defaults to `True` here, because walking a whole tree is the common fluent use.

### tree

```
tree(*, recurse: bool | int = True, all: bool = False) -> None
```

Print a box-drawing tree of this directory — `pyrfs.dir_tree`.

### is_file

```
is_file() -> bool
```

Whether this is a regular file (lstat; symlinks answer `False`) — `pyrfs.is_file`.

### is_dir

```
is_dir() -> bool
```

Whether this is a directory (lstat; symlinks answer `False`) — `pyrfs.is_dir`.

### is_link

```
is_link() -> bool
```

Whether this is a symlink — `pyrfs.is_link`.

### as_pathlib

```
as_pathlib() -> pathlib.Path
```

This path as a `pathlib.Path`, when you want pathlib semantics.

Examples:

```
>>> FsPath("a/b").as_pathlib()
PosixPath('a/b')
```

# Info, temp & errors

`*_info` returns a typed DataFrame when pandas is installed, otherwise `list[dict]` rows carrying the same typed scalars.

## pyrfs.file_info

```
file_info(path: PathInput | Iterable[PathInput], *, follow: bool = False) -> pd.DataFrame | list[dict[str, object]]
```

Stat path(s) into a typed table.

Returns a DataFrame with typed columns (`path`/`size`/`permissions` as pyrfs dtypes) when pandas is installed, else `list[dict]` rows of the same typed scalars.

Parameters:

| Name     | Type                                    | Description                                                           | Default    |
| -------- | --------------------------------------- | --------------------------------------------------------------------- | ---------- |
| `path`   | `str, os.PathLike, or iterable of them` | Path(s) to stat.                                                      | *required* |
| `follow` | `bool`                                  | Stat symlink targets instead of the links themselves (default False). | `False`    |

See Also

dir_info : Stat a directory's entries. pyrfs.FsPath.info : One row, as a plain dict.

Examples:

```
>>> file_info("pyproject.toml")
             path  type    size permissions ...
0  pyproject.toml  file    1.7K   rw-r--r-- ...
```

## pyrfs.dir_info

```
dir_info(path: PathInput = '.', *, all: bool = False, recurse: bool | int = False, type: str | Iterable[str] = 'any', glob: str | None = None, regexp: str | None = None, invert: bool = False, fail: bool = True) -> pd.DataFrame | list[dict[str, object]]
```

Stat directory entries into a typed table (same filters as `dir_ls`).

Returns a DataFrame with typed columns when pandas is installed, else `list[dict]` rows. This is the headline feature carried over from fs: because the columns are typed, size literals such as `'10KB'` work directly inside `.query()`.

See Also

file_info : Stat explicit path(s). pyrfs.dir_ls : The underlying listing and its filter arguments.

Examples:

```
>>> (dir_info("pyrfs", recurse=True)
...     .query("size > '10KB' and type == 'file'")
...     .sort_values("size", ascending=False))
```

## pyrfs.has_pandas

```
has_pandas() -> bool
```

Whether pandas is importable (cached; decides the `*_info` shape).

Examples:

```
>>> has_pandas() in (True, False)
True
```

## pyrfs.file_temp

```
file_temp(pattern: str = 'file', tmp_dir: PathInput | None = None, ext: str = '') -> FsPath
```

Return a unique temp path (a *name* only — the file is not created).

If names were queued with `file_temp_push`, the oldest queued name is returned instead — deterministic mode, fs's trick for reproducible examples, docs, and tests.

Parameters:

| Name      | Type              | Description                                                   | Default  |
| --------- | ----------------- | ------------------------------------------------------------- | -------- |
| `pattern` | `str`             | Filename prefix (default "file").                             | `'file'` |
| `tmp_dir` | `str or PathLike` | Directory for the name (default: the session temp directory). | `None`   |
| `ext`     | `str`             | Extension, with or without the leading dot.                   | `''`     |

See Also

file_temp_push, file_temp_pop : The deterministic-name queue. pyrfs.path_temp : The temp *directory* itself.

Examples:

```
>>> file_temp(ext="csv")
FsPath('/tmp/file2bf36b4eb5d8.csv')
>>> _ = file_temp_push("/tmp/demo.csv")
>>> file_temp()  # deterministic: returns the queued name
FsPath('/tmp/demo.csv')
```

## pyrfs.file_temp_push

```
file_temp_push(path: PathInput | Iterable[PathInput]) -> list[FsPath]
```

Queue deterministic path(s) for subsequent `file_temp` calls.

Returns:

| Type             | Description                    |
| ---------------- | ------------------------------ |
| `list of FsPath` | The queued paths (FIFO order). |

Examples:

```
>>> file_temp_push(["/tmp/one", "/tmp/two"])
[FsPath('/tmp/one'), FsPath('/tmp/two')]
>>> file_temp(), file_temp()
(FsPath('/tmp/one'), FsPath('/tmp/two'))
```

## pyrfs.file_temp_pop

```
file_temp_pop() -> FsPath | None
```

Remove and return the oldest queued temp path (`None` if empty).

Examples:

```
>>> _ = file_temp_push("/tmp/queued")
>>> file_temp_pop()
FsPath('/tmp/queued')
>>> file_temp_pop() is None
True
```
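
The push/pop queue behind deterministic temp names is a plain first-in, first-out structure. The sketch below models it with `collections.deque`; the names `temp_push`, `temp_pop` and `temp_name`, and the use of `uuid` for the random part, are assumptions for illustration rather than the pyrfs internals.

```python
import os
import tempfile
import uuid
from collections import deque

_queued = deque()  # hypothetical module-level queue of deterministic names

def temp_push(*paths):
    """Queue names to be handed out, oldest first."""
    _queued.extend(paths)
    return list(paths)

def temp_pop():
    """Remove and return the oldest queued name, or None if the queue is empty."""
    return _queued.popleft() if _queued else None

def temp_name(pattern="file", ext=""):
    """Return a queued name if there is one, else a fresh random name (no file is created)."""
    queued = temp_pop()
    if queued is not None:
        return queued
    suffix = f".{ext.lstrip('.')}" if ext else ""
    return os.path.join(tempfile.gettempdir(), f"{pattern}{uuid.uuid4().hex[:12]}{suffix}")

temp_push("/tmp/one", "/tmp/two")
first, second = temp_name(), temp_name()   # queue drained in FIFO order
random_name = temp_name(ext="csv")         # queue empty, so a random name
```

Once the queue is drained, calls fall back to random names, so tests that push names get reproducible output without changing the code under test.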

## pyrfs.FsError

Bases: `Exception`

Base class for all pyrfs validation errors.

## pyrfs.FsValueError

Bases: `FsError`, `ValueError`

An argument value (or combination of arguments) is invalid.

# Links — `link_*`

`link_create(path, new_path)` creates `new_path` pointing *to* `path` (the fs argument order). Symbolic links are the default.

## pyrfs.link_create

```
link_create(path: str, new_path: PathInput, *, symbolic: bool = True) -> FsPath
```

Create a link at `new_path` pointing to `path`.

Note the argument order (fs's): target first, link name second.

Parameters:

| Name       | Type              | Description                                                  | Default    |
| ---------- | ----------------- | ------------------------------------------------------------ | ---------- |
| `path`     | `str or PathLike` | What the link points to (need not exist for symbolic links). | *required* |
| `new_path` | `str or PathLike` | Where to create the link.                                    | *required* |
| `symbolic` | `bool`            | Symbolic link (default) or hard link.                        | `True`     |

Returns:

| Type     | Description          |
| -------- | -------------------- |
| `FsPath` | The new link's path. |

Raises:

| Type              | Description                 |
| ----------------- | --------------------------- |
| `FileExistsError` | If new_path already exists. |

See Also

link_path : Read where a symlink points.

Examples:

```
>>> from pyrfs import file_touch
>>> _ = file_touch("big.csv")
>>> link_create("big.csv", "latest.csv")
FsPath('latest.csv')
>>> link_path("latest.csv")
FsPath('big.csv')
```
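
The argument order is the same as `os.symlink(src, dst)` in the standard library: what the link points to comes first, the link's own name second. A short stdlib sketch, assuming a POSIX system where symlinks can be created:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "big.csv")
    link = os.path.join(tmp, "latest.csv")
    open(target, "w").close()

    os.symlink(target, link)            # target first, link name second
    points_to = os.readlink(link)
    is_link = os.path.islink(link)

    try:
        os.symlink(target, link)        # the link name is already taken
        clobbered = True
    except FileExistsError:
        clobbered = False
```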

## pyrfs.link_path

```
link_path(path: str) -> FsPath
```

Return the target a symlink points to (`OSError` if not a symlink).

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

pyrfs.path_real : Fully resolve a path through all links.

## pyrfs.link_exists

```
link_exists(path: str) -> bool
```

Whether the path is a symlink (its target need not exist).

Equivalent to `pyrfs.is_link`. Vectorized: also accepts an iterable or pandas Series of paths.

## pyrfs.link_copy

```
link_copy(path: str, new_path: PathInput, *, overwrite: bool = False) -> FsPath
```

Copy a symlink itself (the new link points to the same target).

The target is *not* copied — use `pyrfs.file_copy` to copy what the link points to.

Parameters:

| Name        | Type              | Description                                               | Default    |
| ----------- | ----------------- | --------------------------------------------------------- | ---------- |
| `path`      | `str or PathLike` | An existing symlink.                                      | *required* |
| `new_path`  | `str or PathLike` | Where to create the duplicate link.                       | *required* |
| `overwrite` | `bool`            | Allow clobbering an existing destination (default False). | `False`    |

Raises:

| Type              | Description                                       |
| ----------------- | ------------------------------------------------- |
| `FileExistsError` | If the destination exists and overwrite is False. |
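
Copying a link rather than its target amounts to reading where the link points and creating a second link to the same place. The standard-library sketch below shows that idea only; it is an assumption about the approach, not the pyrfs source, and it omits the `overwrite` handling.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data.txt")
    link = os.path.join(tmp, "ln.txt")
    duplicate = os.path.join(tmp, "ln-copy.txt")
    with open(target, "w") as fh:
        fh.write("payload")
    os.symlink(target, link)

    # Recreate the link itself; the file it points to is not read or copied.
    os.symlink(os.readlink(link), duplicate)

    duplicate_is_link = os.path.islink(duplicate)
    same_target = os.readlink(duplicate) == os.readlink(link)
```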

## pyrfs.link_delete

```
link_delete(path: str) -> FsPath
```

Delete a symlink — the target is untouched; non-links are refused.

Raises:

| Type           | Description                                                                          |
| -------------- | ------------------------------------------------------------------------------------ |
| `FsValueError` | If path is not a symlink (use pyrfs.file_delete for files, pyrfs.dir_delete for directories). |

Examples:

```
>>> from pyrfs import file_exists, file_touch
>>> _ = file_touch("real.txt")
>>> _ = link_create("real.txt", "ln.txt")
>>> _ = link_delete("ln.txt")
>>> file_exists("real.txt")  # target survives
True
```

# Path algebra — `path_*`

Pure path-string manipulation, no filesystem I/O (except the few that resolve against the running process, as documented). All functions accept a scalar, list, or pandas Series as the first argument and return tidy [`FsPath`](https://pyrfs.netlify.app/api/fspath/index.md) values.

## pyrfs.path

```
path(*parts: PathInput, ext: str = '') -> FsPath
```

Construct a tidy path from parts, optionally adding an extension.

Parts are joined with `/` and tidied. The join is pure concatenation — an absolute later part does *not* reset the path, unlike `os.path.join`.

Parameters:

| Name     | Type              | Description                                                                                  | Default |
| -------- | ----------------- | -------------------------------------------------------------------------------------------- | ------- |
| `*parts` | `str or PathLike` | Path components to join.                                                                     | `()`    |
| `ext`    | `str`             | Extension to append, with or without the leading dot (one dot is guaranteed, never doubled). | `''`    |

Returns:

| Type     | Description            |
| -------- | ---------------------- |
| `FsPath` | The joined, tidy path. |

See Also

path_join : Join components given as a list (inverse of `path_split`). `FsPath.__truediv__` : The fluent `/` join operator.

Examples:

```
>>> path("foo", "bar", "a", ext="txt")
FsPath('foo/bar/a.txt')
>>> path("a/", "/b")  # concatenation, not os.path.join reset
FsPath('a/b')
```

## pyrfs.path_wd

```
path_wd() -> FsPath
```

Return the current working directory as a tidy path.

See Also

path_abs : Anchor a relative path to the working directory.

## pyrfs.path_abs

```
path_abs(path: str) -> FsPath
```

Make a path absolute against the working directory (links unresolved).

A leading `~` is expanded first. Vectorized: also accepts an iterable or pandas Series of paths.

See Also

path_real : Also resolve symlinks (canonical form). path_norm : Lexical `.`/`..` normalization only.

Examples:

```
>>> path_abs("data").startswith("/")
True
```

## pyrfs.path_real

```
path_real(path: str) -> FsPath
```

Canonicalize a path, resolving symlinks (touches the filesystem).

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

path_abs : Absolute form without resolving links.

## pyrfs.path_norm

```
path_norm(path: str) -> FsPath
```

Normalize `.` and `..` components lexically (no filesystem access).

Vectorized: also accepts an iterable or pandas Series of paths.

Examples:

```
>>> path_norm("a/../b/./c")
FsPath('b/c')
```
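
Lexical normalization and symlink resolution can give different answers for the same input, which is why `path_norm` and `path_real` are separate functions. The stdlib sketch below uses `posixpath.normpath` and `os.path.realpath` as stand-ins, assuming a POSIX system; it does not call pyrfs.

```python
import os
import posixpath
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    tmp = os.path.realpath(tmp)  # avoid surprises if the temp dir is itself a link
    os.makedirs(os.path.join(tmp, "real", "sub"))
    os.symlink(os.path.join(tmp, "real", "sub"), os.path.join(tmp, "link"))

    via_link = os.path.join(tmp, "link", "..")
    lexical = posixpath.normpath(via_link)    # cancels "link/.." as text
    resolved = os.path.realpath(via_link)     # follows the link, then goes up
```

The lexical result is the temp directory itself, while the resolved result is its `real` subdirectory, because `..` is applied after the link has been followed.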

## pyrfs.path_rel

```
path_rel(path: str, start: PathInput = '.') -> FsPath
```

Return the path expressed relative to `start`.

Vectorized: also accepts an iterable or pandas Series of paths.

Parameters:

| Name    | Type              | Description                                            | Default    |
| ------- | ----------------- | ------------------------------------------------------ | ---------- |
| `path`  | `str or PathLike` | The path to re-express.                                | *required* |
| `start` | `str or PathLike` | The anchor directory (default: the working directory). | `'.'`      |

See Also

path_has_parent : Test containment instead of computing the relation. FsPath.rel_to : Fluent equivalent.

Examples:

```
>>> path_rel("/a/b/c", "/a")
FsPath('b/c')
>>> path_rel("/a/b", "/a/d")
FsPath('../b')
```
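
The two documented results coincide with what `posixpath.relpath` returns for the same arguments, which may help when porting existing code. This is a comparison with the standard library, not a statement about how `path_rel` is implemented.

```python
import posixpath

below = posixpath.relpath("/a/b/c", "/a")      # path sits under start
sibling = posixpath.relpath("/a/b", "/a/d")    # path is next to start
same = posixpath.relpath("/a", "/a")           # path equals start
```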

## pyrfs.path_expand

```
path_expand(path: str) -> FsPath
```

Expand a leading `~` to the user's home directory.

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

path_home : Build paths under the home directory directly.

## pyrfs.path_home

```
path_home(*parts: PathInput) -> FsPath
```

Return the user's home directory, optionally joined with `parts`.

Examples:

```
>>> path_home("data").endswith("/data")
True
```

## pyrfs.path_temp

```
path_temp(*parts: PathInput) -> FsPath
```

Return the session temp directory, optionally joined with `parts`.

See Also

pyrfs.file_temp : A unique temp *file name* (not just the directory).

## pyrfs.path_tidy

```
path_tidy(path: str) -> FsPath
```

Tidy a path: `/` separators, no doubled or trailing slashes.

Every pyrfs function already returns tidy paths; use this to normalize paths from elsewhere. Vectorized: also accepts an iterable or pandas Series of paths.

Examples:

```
>>> path_tidy("src//a.txt/")
FsPath('src/a.txt')
>>> path_tidy("C:\\data\\x")
FsPath('C:/data/x')
```
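
The tidying rules can be approximated in a few lines. `tidy` below is a hypothetical helper that covers only the three rules stated above (forward slashes, no doubled slashes, no trailing slash); the real function may handle more cases, such as UNC prefixes, that this sketch does not.

```python
import re

def tidy(path):
    """Illustrative helper: '/' separators, no doubled or trailing slashes."""
    path = path.replace("\\", "/")       # Windows separators become '/'
    path = re.sub(r"/{2,}", "/", path)   # collapse runs of slashes
    if len(path) > 1:
        path = path.rstrip("/")          # keep a lone root '/' intact
    return path

unix_style = tidy("src//a.txt/")
windows_style = tidy("C:\\data\\x")
root = tidy("/")
```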

## pyrfs.path_split

```
path_split(path: str) -> list[str]
```

Split a tidy path into components (a leading root stays `'/'`).

Vectorized: a list of paths yields a list of component lists.

See Also

path_join : The inverse operation. FsPath.parts : Fluent equivalent.

Examples:

```
>>> path_split("/usr/bin")
['/', 'usr', 'bin']
>>> path_split("a/b")
['a', 'b']
```

## pyrfs.path_join

```
path_join(parts: Iterable[PathInput | Iterable[PathInput]]) -> FsPath | list[FsPath]
```

Join split components back into path(s) — the inverse of `path_split`.

Parameters:

| Name    | Type       | Description                                                                            | Default    |
| ------- | ---------- | -------------------------------------------------------------------------------------- | ---------- |
| `parts` | `iterable` | Either one sequence of components, or a sequence of such sequences (joining each one). | *required* |

See Also

path : Variadic construction with an optional extension.

Examples:

```
>>> path_join(["/", "usr", "bin"])
FsPath('/usr/bin')
>>> path_join([["a", "b"], ["c", "d"]])
[FsPath('a/b'), FsPath('c/d')]
```

## pyrfs.path_file

```
path_file(path: str) -> FsPath
```

Return the file name — the last path component.

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

path_dir : The complementary directory part. FsPath.name : Fluent equivalent.

Examples:

```
>>> path_file("a/b/c.txt")
FsPath('c.txt')
```

## pyrfs.path_dir

```
path_dir(path: str) -> FsPath
```

Return the directory part of a path (`'.'` if there is none).

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

path_file : The complementary file-name part. FsPath.dir : Fluent equivalent.

Examples:

```
>>> path_dir("a/b/c.txt")
FsPath('a/b')
>>> path_dir("c.txt")
FsPath('.')
```

## pyrfs.path_ext

```
path_ext(path: str) -> str
```

Return the extension without the dot (`''` if none).

Dotfiles like `.gitignore` count as having no extension. Vectorized: also accepts an iterable or pandas Series of paths.

See Also

path_ext_set, path_ext_remove

Examples:

```
>>> path_ext("a.tar.gz")
'gz'
>>> path_ext(".gitignore")
''
```
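
The same two conventions (only the last suffix counts, and a leading dot does not start an extension) are what `posixpath.splitext` follows, so the documented examples can be reproduced with it. `ext_without_dot` is a hypothetical helper for this comparison.

```python
import posixpath

def ext_without_dot(path):
    """Illustrative helper: last extension with the dot removed."""
    return posixpath.splitext(path)[1][1:]

last_suffix = ext_without_dot("a.tar.gz")       # only the final suffix
dotfile = ext_without_dot(".gitignore")         # leading dot is not an extension
stem = posixpath.splitext("d/a.tar.gz")[0]      # what removing the extension leaves
```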

## pyrfs.path_ext_remove

```
path_ext_remove(path: str) -> FsPath
```

Remove the extension (dotfiles like `.gitignore` are left intact).

Vectorized: also accepts an iterable or pandas Series of paths.

Examples:

```
>>> path_ext_remove("d/a.tar.gz")
FsPath('d/a.tar')
```

## pyrfs.path_ext_set

```
path_ext_set(path: str, ext: str) -> FsPath
```

Replace (or add) the extension; an empty `ext` removes it.

Vectorized: also accepts an iterable or pandas Series of paths.

Parameters:

| Name   | Type              | Description                                                                       | Default    |
| ------ | ----------------- | --------------------------------------------------------------------------------- | ---------- |
| `path` | `str or PathLike` | The path to modify.                                                               | *required* |
| `ext`  | `str`             | New extension, with or without the leading dot; "" removes the current extension. | *required* |

See Also

FsPath.with_ext : Fluent equivalent.

Examples:

```
>>> path_ext_set("report.md", "html")
FsPath('report.html')
>>> path_ext_set(["a.txt", "b"], "py")
[FsPath('a.py'), FsPath('b.py')]
```

## pyrfs.path_common

```
path_common(paths: Iterable[PathInput]) -> FsPath
```

Return the longest common path prefix of `paths`.

Parameters:

| Name    | Type                             | Description                                      | Default    |
| ------- | -------------------------------- | ------------------------------------------------ | ---------- |
| `paths` | `iterable of str or os.PathLike` | At least one path; all absolute or all relative. | *required* |

Raises:

| Type           | Description                                             |
| -------------- | ------------------------------------------------------- |
| `FsValueError` | If paths is empty or mixes absolute and relative paths. |

Examples:

```
>>> path_common(["a/b/c", "a/b/d"])
FsPath('a/b')
```
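
A common *path* prefix is computed per component, not per character. The stdlib sketch below contrasts `posixpath.commonpath` with the character-wise `posixpath.commonprefix`; it illustrates the concept and the mixed absolute/relative restriction, and note that the standard library raises `ValueError` where pyrfs documents `FsValueError`.

```python
import posixpath

common = posixpath.commonpath(["a/b/c", "a/b/d"])

# Character-wise prefixes can stop in the middle of a component.
by_characters = posixpath.commonprefix(["a/bc/x", "a/bd/y"])
by_components = posixpath.commonpath(["a/bc/x", "a/bd/y"])

try:
    posixpath.commonpath(["/abs/path", "rel/path"])
    mixed_rejected = False
except ValueError:
    mixed_rejected = True
```

`by_characters` is `'a/b'`, which names no directory shared by the inputs, while `by_components` is `'a'`.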

## pyrfs.path_filter

```
path_filter(paths: Iterable[PathInput], glob: str | None = None, regexp: str | None = None, *, invert: bool = False) -> list[FsPath]
```

Filter paths by a glob or a regular expression (mutually exclusive).

Parameters:

| Name     | Type                             | Description                                                                                     | Default    |
| -------- | -------------------------------- | ----------------------------------------------------------------------------------------------- | ---------- |
| `paths`  | `iterable of str or os.PathLike` | Paths to filter.                                                                                | *required* |
| `glob`   | `str`                            | Wildcard pattern matched against the whole path (e.g. "\*.py"); mutually exclusive with regexp. | `None`     |
| `regexp` | `str`                            | Regular expression searched within the path; mutually exclusive with glob.                      | `None`     |
| `invert` | `bool`                           | Keep the paths that do not match.                                                               | `False`    |

Raises:

| Type           | Description                      |
| -------------- | -------------------------------- |
| `FsValueError` | If both glob and regexp are set. |

See Also

pyrfs.dir_ls : Directory listing with the same filter arguments.

Examples:

```
>>> path_filter(["a.py", "b.txt", "src/c.py"], glob="*.py")
[FsPath('a.py'), FsPath('src/c.py')]
>>> path_filter(["a.py", "b.txt"], glob="*.py", invert=True)
[FsPath('b.txt')]
```
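
The whole-path glob match in the first example (where `*.py` also keeps `src/c.py`) behaves like `fnmatch`, whose `*` crosses `/`. The sketch below reproduces the filters with `fnmatch` and `re.search`; that pyrfs uses these modules internally is an assumption.

```python
import fnmatch
import re

paths = ["a.py", "b.txt", "src/c.py"]

# Glob: matched against the whole path; '*' also matches '/'.
by_glob = [p for p in paths if fnmatch.fnmatchcase(p, "*.py")]

# Regular expression: searched anywhere within the path.
by_regexp = [p for p in paths if re.search(r"^src/", p)]

# Invert: keep what the pattern does not match.
inverted = [p for p in paths if not fnmatch.fnmatchcase(p, "*.py")]
```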

## pyrfs.path_has_parent

```
path_has_parent(path: str, parent: PathInput) -> bool
```

Return whether `path` sits at or below `parent`.

Both are anchored to the working directory before comparing, so relative and absolute forms compare consistently. Vectorized: also accepts an iterable or pandas Series of paths.

See Also

path_rel : Compute the relative path instead of testing containment.

Examples:

```
>>> path_has_parent("/x/y", "/x")
True
>>> path_has_parent("/xy/z", "/x")
False
```
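
The second example is the case a plain string prefix test gets wrong. The sketch below compares `str.startswith` with a component-wise test built on `pathlib.PurePosixPath`; `sits_at_or_below` is a hypothetical helper, and unlike the documented function it does not anchor relative paths to the working directory first.

```python
from pathlib import PurePosixPath

def sits_at_or_below(path, parent):
    """Illustrative helper: compare whole components, not characters."""
    path, parent = PurePosixPath(path), PurePosixPath(parent)
    return path == parent or parent in path.parents

naive = "/xy/z".startswith("/x")            # character prefix matches by accident
by_component = sits_at_or_below("/xy/z", "/x")
child = sits_at_or_below("/x/y", "/x")
itself = sits_at_or_below("/x", "/x")       # "at or below" includes the parent
```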

## pyrfs.path_sanitize

```
path_sanitize(filename: str, replacement: str = '') -> str
```

Turn an untrusted string into a filename safe on all major OSes.

Removes control characters, characters illegal in filenames (`/\?<>:*|"`), trailing dots/spaces, and Windows-reserved device names; truncates to 255 characters. Operates on a *filename*, not a path — separators are stripped, not preserved.

Parameters:

| Name          | Type  | Description                                                   | Default    |
| ------------- | ----- | ------------------------------------------------------------- | ---------- |
| `filename`    | `str` | The untrusted string.                                         | *required* |
| `replacement` | `str` | What to substitute for removed characters (default: nothing). | `''`       |

Examples:

```
>>> path_sanitize("rep/ort:2026*")
'report2026'
>>> path_sanitize("a/b", "_")
'a_b'
```
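
The rules listed above can be sketched as a chain of regular-expression substitutions. `sanitize` and its patterns are assumptions written from that description, not the library's code, so details such as the exact reserved-name list or the order of the steps may differ in pyrfs.

```python
import re

_ILLEGAL = re.compile(r'[/\\?<>:*|"]')            # characters illegal in filenames
_CONTROL = re.compile(r"[\x00-\x1f\x7f-\x9f]")    # control characters
_RESERVED = re.compile(                           # Windows device names
    r"^(con|prn|aux|nul|com[1-9]|lpt[1-9])(\..*)?$", re.IGNORECASE
)
_TRAILING = re.compile(r"[. ]+$")                 # trailing dots and spaces

def sanitize(filename, replacement=""):
    """Illustrative helper: apply each rule in turn, then truncate."""
    name = _ILLEGAL.sub(replacement, filename)
    name = _CONTROL.sub(replacement, name)
    name = _RESERVED.sub(replacement, name)
    name = _TRAILING.sub(replacement, name)
    return name[:255]

stripped = sanitize("rep/ort:2026*")
replaced = sanitize("a/b", "_")
device = sanitize("CON")
truncated = sanitize("x" * 300)
```

Because separators are removed like any other illegal character, the result is always a single filename; joining it onto a trusted directory is left to the caller.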

# Predicates & ids

Type predicates classify the entry itself (lstat): a symlink answers `True` only to `is_link`, matching fs.

## pyrfs.is_file

```
is_file(path: str) -> bool
```

Whether the path is a regular file (symlinks answer `False`).

Classifies the entry itself (lstat), matching fs — unlike `os.path.isfile`, which follows symlinks. Vectorized: also accepts an iterable or pandas Series of paths.

See Also

is_link : The predicate a symlink answers `True` to. pyrfs.file_exists : Existence regardless of type.

Examples:

```
>>> from pyrfs import file_touch, link_create
>>> _ = file_touch("data.txt")
>>> _ = link_create("data.txt", "ln.txt")
>>> is_file("data.txt"), is_file("ln.txt"), is_file("missing")
(True, False, False)
```
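
The contrast with `os.path.isfile` can be checked with `os.lstat` directly. `is_regular_entry` below is a hypothetical stdlib helper that classifies the entry itself; the sketch assumes a POSIX system and does not call pyrfs.

```python
import os
import stat
import tempfile

def is_regular_entry(path):
    """Illustrative helper: classify the entry itself, without following links."""
    try:
        return stat.S_ISREG(os.lstat(path).st_mode)
    except OSError:
        return False  # missing paths are not regular files

with tempfile.TemporaryDirectory() as tmp:
    data = os.path.join(tmp, "data.txt")
    link = os.path.join(tmp, "ln.txt")
    open(data, "w").close()
    os.symlink(data, link)

    follows = os.path.isfile(link)          # follows the link to the file
    entry = is_regular_entry(link)          # the entry is a link, not a file
    plain = is_regular_entry(data)
    missing = is_regular_entry(os.path.join(tmp, "missing"))
```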

## pyrfs.is_dir

```
is_dir(path: str) -> bool
```

Whether the path is a directory (symlinks answer `False`).

Classifies the entry itself (lstat), matching fs — unlike `os.path.isdir` and `pyrfs.dir_exists`, which follow symlinks. Vectorized: also accepts an iterable or pandas Series of paths.

See Also

pyrfs.dir_exists : Follow-symlink directory test.

## pyrfs.is_link

```
is_link(path: str) -> bool
```

Whether the path is a symlink (its target need not exist).

Vectorized: also accepts an iterable or pandas Series of paths.

See Also

pyrfs.link_path : Read where the link points.

## pyrfs.is_file_empty

```
is_file_empty(path: str) -> bool
```

Whether the file exists and has size zero.

Missing paths answer `False` (they are not empty files). Vectorized: also accepts an iterable or pandas Series of paths.

## pyrfs.is_dir_empty

```
is_dir_empty(path: str) -> bool
```

Whether the directory exists and has no entries (hidden included).

Missing paths answer `False`. Vectorized: also accepts an iterable or pandas Series of paths.

Examples:

```
>>> from pyrfs import dir_create
>>> _ = dir_create("empty")
>>> is_dir_empty("empty")
True
```
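One way to express "exists and has no entries, hidden included" with the standard library is sketched below. `dir_is_empty` is a hypothetical name, and treating a non-directory as `False` is an assumption of this sketch rather than documented pyrfs behaviour.

```python
import os
import tempfile


def dir_is_empty(path):
    """True only for an existing directory with no entries at all."""
    try:
        with os.scandir(path) as entries:
            # scandir yields dotfiles too, but never "." or ".."
            return next(entries, None) is None
    except (FileNotFoundError, NotADirectoryError):
        return False


with tempfile.TemporaryDirectory() as tmp:
    empty_before = dir_is_empty(tmp)
    open(os.path.join(tmp, ".hidden"), "w").close()
    empty_after = dir_is_empty(tmp)
    missing = dir_is_empty(os.path.join(tmp, "missing"))
```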

## pyrfs.is_absolute_path

```
is_absolute_path(path: str) -> bool
```

Whether the path is absolute (a leading `~` counts, as in fs).

Pure string test — no filesystem access. Vectorized: also accepts an iterable or pandas Series of paths.

Examples:

```
>>> is_absolute_path(["/usr", "~/data", "rel/path"])
[True, True, False]
```

## pyrfs.user_ids

```
user_ids() -> list[dict[str, object]]
```

All known users as rows of `{"user_id", "user_name"}`.

Returns an empty list on platforms without `pwd` (Windows).
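The platform fallback described above could look roughly like this. `list_users` is a hypothetical stand-in, and the row keys simply follow the shape documented for `user_ids`.

```python
def list_users():
    """Rows of user ids and names, or an empty list where `pwd` is unavailable."""
    try:
        import pwd  # POSIX only; the import fails on Windows
    except ImportError:
        return []
    return [
        {"user_id": entry.pw_uid, "user_name": entry.pw_name}
        for entry in pwd.getpwall()
    ]


users = list_users()
```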

## pyrfs.group_ids

```
group_ids() -> list[dict[str, object]]
```

All known groups as rows of `{"group_id", "group_name"}`.

Returns an empty list on platforms without `grp` (Windows).

# Bytes & Perms

Typed scalars that subclass `int` — see the [typed values guide](https://pyrfs.netlify.app/guides/typed-values/index.md). With the `[pandas]` extra, columns of these become the `"bytes"`/`"perms"`/`"path"` ExtensionDtypes.

## pyrfs.Bytes

Bases: `int`

A byte count that parses and displays human-readable sizes.

All units are 1024-based (`"10MB"` == `"10MiB"` == `10 * 1024**2`), matching R's fs.

Examples:

```
>>> Bytes("10MB")
Bytes(10485760)
>>> str(Bytes(455200))
'444.5K'
>>> Bytes(455200) < "1MB"
True
>>> str(Bytes("1MB") + "500KB")
'1.49M'
```

Notes

`repr` stays exact (`Bytes(455200)`); `str`/`format` humanize. With the `[pandas]` extra, columns of these use the `"bytes"` ExtensionDtype, so the same comparisons work in `DataFrame.query()`.

See Also

pyrfs.file_size : Returns sizes as `Bytes`.
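To make the idea of an `int` subclass that parses and humanizes sizes concrete, here is a minimal standalone sketch. `SizeSketch`, `parse_size`, and `humanize` are hypothetical names, and the rounding rule (two decimals below 10, one decimal otherwise, trailing zeros dropped) is an assumption chosen to reproduce the examples above, not necessarily what `Bytes` does internally.

```python
import re

_UNITS = "KMGTP"
_SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([KMGTP]?)(?:i?B)?", re.IGNORECASE)


def parse_size(text):
    """Parse '10MB', '1.5GiB', '500K', ... using 1024-based units."""
    match = _SIZE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"cannot parse size {text!r}")
    number, unit = match.groups()
    power = _UNITS.index(unit.upper()) + 1 if unit else 0
    return int(float(number) * 1024 ** power)


def humanize(n):
    """Scale down by 1024 until the value fits, then format it compactly."""
    value, unit = float(n), ""
    for candidate in _UNITS:
        if value < 1024:
            break
        value /= 1024
        unit = candidate
    text = f"{value:.2f}" if value < 10 else f"{value:.1f}"
    return text.rstrip("0").rstrip(".") + unit


class SizeSketch(int):
    def __new__(cls, value):
        if isinstance(value, str):
            value = parse_size(value)
        return super().__new__(cls, value)

    def __repr__(self):  # exact
        return f"SizeSketch({int(self)})"

    def __str__(self):  # human-readable
        return humanize(self)

    def __eq__(self, other):
        return int(self) == int(SizeSketch(other))

    def __lt__(self, other):
        return int(self) < int(SizeSketch(other))

    def __gt__(self, other):
        return int(self) > int(SizeSketch(other))

    __hash__ = int.__hash__  # defining __eq__ would otherwise remove hashing

    def __add__(self, other):
        return SizeSketch(int(self) + int(SizeSketch(other)))

    __radd__ = __add__  # lets sum([...]) start from the int 0
```

Because the right-hand operand is coerced through the same constructor, comparisons and addition accept either numbers or size literals such as `"1MB"`.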

## pyrfs.Perms

Bases: `int`

Unix permission bits that parse and display `rwxr-xr-x` style.

Construct from octal (`"644"`), symbolic (`"u+rw,go+r"`), display (`"rw-r--r--"`) strings, or raw mode bits.

Examples:

```
>>> Perms("644")
Perms('rw-r--r--')
>>> Perms("644") == "rw-r--r--"
True
>>> str(Perms("644") | "u+x")
'rwxr--r--'
```

Notes

Symbolic strings here build from a base of `0` (so `"u+rw"` == `"u=rw"`); `pyrfs.file_chmod` applies symbolic modes to the file's *current* mode instead, like the `chmod` command.
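The base-of-zero versus current-mode distinction is easier to see in code. The parser below is a simplified sketch (only `r`, `w`, `x` and the `u`, `g`, `o`, `a` classes); `parse_mode` and `format_mode` are hypothetical helpers, not pyrfs functions.

```python
import re
import stat

_PERM_BITS = {"r": 4, "w": 2, "x": 1}
_WHO_SHIFT = {"u": 6, "g": 3, "o": 0}


def parse_mode(text, base=0):
    """Parse octal ('644'), display ('rw-r--r--'), or symbolic ('u+rw,go+r') text."""
    if re.fullmatch(r"[0-7]{3,4}", text):
        return int(text, 8)
    if re.fullmatch(r"(?:[r-][w-][x-]){3}", text):
        return sum(1 << (8 - i) for i, ch in enumerate(text) if ch != "-")
    mode = base
    for clause in text.split(","):
        match = re.fullmatch(r"([ugoa]*)([+=-])([rwx]*)", clause)
        if match is None:
            raise ValueError(f"cannot parse permission clause {clause!r}")
        who, op, perms = match.groups()
        who = "ugo" if not who or "a" in who else who
        mask = sum(_PERM_BITS[p] for p in set(perms))
        for w in who:
            shifted = mask << _WHO_SHIFT[w]
            if op == "+":
                mode |= shifted
            elif op == "-":
                mode &= ~shifted
            else:  # "=" replaces all three bits of that class
                mode = (mode & ~(7 << _WHO_SHIFT[w])) | shifted
    return mode


def format_mode(mode):
    # stat.filemode() prefixes a file-type character; drop it
    return stat.filemode(mode)[1:]


from_zero = parse_mode("u+rw,go+r")              # built from a base of 0
on_current = parse_mode("u+x", base=0o644)       # applied to an existing mode
replaced = parse_mode("u=x", base=0o644)         # "=" discards the old user bits
```

With a base of `0`, `+` and `=` give the same result; they only diverge once there is an existing mode to preserve or replace.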

See Also

pyrfs.file_chmod : Apply permissions to files.

# Design notes

# pyrfs — UX Design

> A Pythonic port of R's [`fs`](https://fs.r-lib.org) · Status: **design draft** · Last updated: 2026-06-11 · Companion: [`pyrfs-architecture.md`](https://pyrfs.netlify.app/design/pyrfs-architecture/index.md) (how it's built)

This document defines the **feel** of pyrfs — names, return values, chaining, and the pandas workflow. The guiding goal: an ex-R user who knows `fs` should feel at home immediately, and a Python user should find it idiomatic and pipeable.

______________________________________________________________________

## 1. UX thesis

> **Every function takes path(s) in, and gives a predictable, path-carrying value back — or raises.** The same operation is reachable three ways: as a function, as a method on a path, or as a vectorized column operation in pandas.

```
flowchart LR
    in["path(s) in<br/>str · FsPath · list · Series"] --> op["pyrfs operation"]
    op -->|success| out["typed result<br/>FsPath · Bytes · Perms · bool · DataFrame"]
    op -->|failure| err["raises (OSError / FsError)"]
    out -.->|chains into| op
```

We inherit the five `fs` promises — **consistent naming · vectorization · predictable returns · explicit failure · tidy UTF-8 paths** — and add a sixth: **three interchangeable surfaces**.

______________________________________________________________________

## 2. Naming — the four families (kept from `fs`)

Functions are grouped by the **noun** they act on, `noun_verb`, snake_case. Type `dir_` + Tab and you see every directory operation.

| Prefix  | Domain                                           | Examples                                                              |
| ------- | ------------------------------------------------ | --------------------------------------------------------------------- |
| `path_` | construct & manipulate path strings (**no I/O**) | `path()`, `path_dir()`, `path_ext_set()`, `path_rel()`, `path_norm()` |
| `file_` | operate on files                                 | `file_create()`, `file_copy()`, `file_info()`, `file_chmod()`         |
| `dir_`  | operate on directories                           | `dir_create()`, `dir_ls()`, `dir_info()`, `dir_tree()`                |
| `link_` | operate on links                                 | `link_create()`, `link_path()`, `link_copy()`                         |

Plus predicates (`is_file`, `is_dir`, `is_link`, `is_file_empty`, `is_dir_empty`, `is_absolute_path`), id helpers (`user_ids`, `group_ids`), and temp helpers (`file_temp`, `path_temp`, `file_temp_push/pop`).

The create/copy/delete/exists verbs repeat with identical shapes across `file_`/`dir_`/`link_` — a predictable matrix you learn once.

______________________________________________________________________

## 3. The three surfaces (same engine, your choice of style)

### Surface A — Functional (closest to R `fs`)

```
import pyrfs as fs

fs.path("foo", "bar", "a", ext="txt")     # FsPath('foo/bar/a.txt')
fs.dir_ls("pyrfs", recurse=True, glob="*.py")
fs.file_copy("a.txt", "b.txt")            # -> FsPath('b.txt')
fs.path_ext_set("report.md", "html")      # FsPath('report.html')
```

### Surface B — Fluent `FsPath` (Pythonic chaining)

```
from pyrfs import FsPath

(FsPath("foo") / "bar" / "a.txt")         # FsPath('foo/bar/a.txt')   <- '/' operator
(FsPath("data") / "raw.csv").with_ext("parquet").copy_to("clean/")
FsPath("project").mkdir().touch_file("README.md")
FsPath("logs").ls(glob="*.log")           # [FsPath, FsPath, ...]
```

`FsPath` **is a `str`** (subclass), so it works anywhere a path string is expected — `open(p)`, `pd.read_csv(p)`, `os.fspath(p)` — no conversion needed.

### Surface C — pandas `.fs` accessor + DataFrame returns

```
import pandas as pd
import pyrfs as fs

df = pd.DataFrame({"path": fs.dir_ls("pyrfs", recurse=True)})

df.assign(
    ext = df["path"].fs.ext(),            # vectorized over the column
    dir = df["path"].fs.dir(),
    ok  = df["path"].fs.exists(),
)
```

```
flowchart TD
    eng["pyrfs engine (one implementation)"]
    eng --> A["A. functional<br/>fs.file_copy()"]
    eng --> B["B. fluent<br/>FsPath().copy_to()"]
    eng --> C["C. pandas<br/>Series.fs.* / dir_info()"]
```

Pick per task: scripts lean A, OO code leans B, dataframe pipelines lean C. They interoperate — `dir_ls()` returns `FsPath`s you can drop straight into a DataFrame column.

______________________________________________________________________

## 4. Predictable, typed return values

Every function returns one of a small, learnable set of shapes — and it always conveys the path.

| Return              | Type                             | Produced by                                         |
| ------------------- | -------------------------------- | --------------------------------------------------- |
| a path              | `FsPath` (⊂ `str`)               | `path()`, `file_copy()`, `dir_create()`, most verbs |
| many paths          | `list[FsPath]` / `Series[path]`  | `dir_ls()`, vectorized calls                        |
| existence/type test | `bool` / `list`/`Series` of bool | `file_exists()`, `is_dir()`                         |
| a size              | `Bytes` (⊂ `int`)                | `file_size()`                                       |
| permissions         | `Perms` (⊂ `int`)                | `file_info()["permissions"]`                        |
| a table             | `DataFrame` (or `list[dict]`)    | `file_info()`, `dir_info()`                         |

**Mutating verbs return the new path**, enabling chains and pipes:

```
(fs.file_temp()
   .pipe(... )   # any callable
)
# fluent equivalent:
FsPath(fs.file_temp()).mkdir().touch_file("a").touch_file("b")
```

______________________________________________________________________

## 5. Typed values that read like a human

The heart of `fs`'s charm — values that *know what they are* and print accordingly.

| You have       | pyrfs shows                  | And you can write                     |
| -------------- | ---------------------------- | ------------------------------------- |
| `455200` bytes | `444.5K`                     | `fs.file_size("x") > "10KB"` → `True` |
| mode `0o644`   | `rw-r--r--`                  | `perms == "u=rw,go=r"` → `True`       |
| `"src//a.txt"` | `src/a.txt` (tidy, coloured) | `FsPath("src") / "a.txt"`             |

```
from pyrfs import Bytes, Perms

Bytes("10MB")              # Bytes(10485760)  -> displays '10M'
Bytes(455200) < "1MB"      # True
sum([Bytes("1MB"), Bytes("500KB")])   # Bytes -> '1.49M'

Perms("644")               # Perms -> 'rw-r--r--'
Perms("644") & "u+r"       # Perms (bitwise), still prints rwx
Perms("644") == "rw-r--r--"  # True
```

In pandas these become **real column dtypes** (ExtensionArrays), so the R headline demo ports almost verbatim:

```
(fs.dir_info("pyrfs", recurse=False)
   .query("size > '10KB' and type == 'file'")     # Bytes column compares to a string
   .sort_values("size", ascending=False)
   .loc[:, ["path", "permissions", "size", "modification_time"]])
#                  path  permissions    size      modification_time
#   pyrfs/_engine/dirops.py  rw-r--r--   12.4K  2026-06-11 13:35:54
#   ...
```

______________________________________________________________________

## 6. The pandas pipe workflow (a first-class use case)

pyrfs is built to flow inside `.pipe()` chains because `*_info` returns a DataFrame and the `.fs` accessor vectorizes path algebra over columns.

```
import pyrfs as fs

big_modules = (
    fs.dir_info("pyrfs", recurse=True)
      .query("type == 'file'")
      .assign(stem=lambda d: d["path"].fs.name())
      .pipe(lambda d: d[d["path"].fs.ext() == "py"])
      .assign(dir=lambda d: d["path"].fs.dir())      # derive the parent directory
      .groupby("dir")                                # group by directory
      .agg(total=("size", "sum"), n=("path", "size"))
      .sort_values("total", ascending=False)
)
```

Reading many files into one frame — `dir_ls()` returns paths you tag by source, the pandas analogue of R's named-vector `map_df(.id=)` trick:

```
import pandas as pd
import pyrfs as fs

files = fs.dir_ls("data", glob="*.tsv")
frame = pd.concat(
    {p.name(): pd.read_csv(p, sep="\t") for p in files},
    names=["file"],
)
```

______________________________________________________________________

## 7. Safe defaults & argument conventions (learn once)

| Argument          | Meaning                                                        | Default                         | On                      |
| ----------------- | -------------------------------------------------------------- | ------------------------------- | ----------------------- |
| `overwrite`       | allow clobbering an existing target                            | `False` (safe)                  | copy/move               |
| `recurse`         | recurse fully (`True`), not (`False`), or to depth (`int`)     | `False` listing / `True` create | `dir_*`                 |
| `all`             | include hidden dotfiles                                        | `False`                         | `dir_ls`, `dir_map`     |
| `type`            | filter by entry type (`"file"`, `"directory"`, `"symlink"`, …) | `"any"`                         | `dir_ls`, `dir_info`    |
| `glob` / `regexp` | filter listings (mutually exclusive → `FsError` if both)       | `None`                          | `dir_ls`, `path_filter` |
| `fail`            | raise (`True`) vs warn (`False`) on inaccessible entries       | `True`                          | directory traversals    |

- **Destructive actions opt-in.** `overwrite=False` and bounded `recurse` mean nothing surprising gets deleted or walked.
- **Keyword-only where it aids clarity** — flags like `overwrite`, `recurse`, `all` are keyword-only (`*,`) so call sites read self-documenting: `file_copy(a, b, overwrite=True)`.

______________________________________________________________________

## 8. Explicit failure (Pythonic)

pyrfs raises rather than silently returning a falsy value:

```
fs.file_copy("a.txt", "b.txt")            # raises FileExistsError if b.txt exists
fs.file_copy("a.txt", "b.txt", overwrite=True)   # ok

fs.dir_ls("nope")                         # raises FileNotFoundError
fs.path_filter(paths, glob="*.py", regexp=r"\.py$")   # raises pyrfs.FsError: cannot set both

# soften a traversal when some entries are unreadable:
fs.dir_ls("/var", recurse=True, fail=False)   # warns + skips, returns what it could read
```

- Native `OSError` subclasses (`FileNotFoundError`, `FileExistsError`, `PermissionError`) for OS-level failures — familiar, `try/except`-able.
- `pyrfs.FsError` (with subclasses) for pyrfs validation — friendly, actionable messages.
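The split between OS-level errors and argument validation can be illustrated with a small filter that rejects `glob` and `regexp` together. `filter_paths` and `FilterArgumentError` are hypothetical stand-ins for `path_filter` and an `FsError` subclass; the matching rules (glob matched against the whole string via `fnmatch`, regexp searched anywhere) are assumptions of this sketch.

```python
import fnmatch
import re


class FilterArgumentError(Exception):
    """Hypothetical stand-in for a pyrfs.FsError subclass."""


def filter_paths(paths, glob=None, regexp=None):
    if glob is not None and regexp is not None:
        # A validation failure, not an OS failure, so it gets its own class.
        raise FilterArgumentError("set either `glob` or `regexp`, not both")
    if glob is not None:
        # fnmatch.translate turns the glob into an equivalent regular expression
        pattern = re.compile(fnmatch.translate(glob))
        return [p for p in paths if pattern.match(p)]
    if regexp is not None:
        pattern = re.compile(regexp)
        return [p for p in paths if pattern.search(p)]
    return list(paths)


paths = ["src/app.py", "README.md", "setup.py"]
python_files = filter_paths(paths, glob="*.py")
markdown_files = filter_paths(paths, regexp=r"\.md$")

try:
    filter_paths(paths, glob="*.py", regexp=r"\.py$")
    both_rejected = False
except FilterArgumentError:
    both_rejected = True
```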

______________________________________________________________________

## 9. R `fs` → pyrfs translation

| R `fs`                        | pyrfs functional              | pyrfs fluent                                |
| ----------------------------- | ----------------------------- | ------------------------------------------- |
| `path("a", "b", ext = "txt")` | `path("a", "b", ext="txt")`   | `FsPath("a") / "b"` then `.with_ext("txt")` |
| `dir_ls("d", recurse = TRUE)` | `dir_ls("d", recurse=True)`   | `FsPath("d").ls(recurse=True)`              |
| `dir_info("d")`               | `dir_info("d")` → DataFrame   | `FsPath("d").info()`                        |
| `file_copy("a", "b")`         | `file_copy("a", "b")`         | `FsPath("a").copy_to("b")`                  |
| `file_size("a")`              | `file_size("a")` → `Bytes`    | `FsPath("a").size()`                        |
| `path_ext_set("a.txt", "md")` | `path_ext_set("a.txt", "md")` | `FsPath("a.txt").with_ext("md")`            |
| `path_rel("a/b", "a")`        | `path_rel("a/b", "a")`        | `FsPath("a/b").rel_to("a")`                 |
| `dir_tree("d")`               | `dir_tree("d")`               | `FsPath("d").tree()`                        |
| `x %>% file_delete()`         | `df.pipe(...)` / loop         | `FsPath(x).delete()`                        |

Naming is intentionally identical on the functional surface so muscle memory transfers; the fluent surface adds Pythonic method names for OO-style chaining.

______________________________________________________________________

## 10. Small touches (ported from `fs`)

- **`dir_tree()`** prints a coloured box-drawing tree (`├──`, `└──`), like Unix `tree`.
- **`file_show()`** opens a file in the OS default app (cross-platform).
- **`path(..., ext=)`** builds extensions correctly (one dot, no doubling).
- **`path_sanitize()`** turns untrusted strings into safe filenames.
- **`path_rel()` / `path_common()`** — relative paths and longest common dir (no stdlib one-liner).
- **`file_temp_push()/pop()`** — deterministic temp names for reproducible docs/tests.
- **Colour degrades** automatically on non-TTY / `NO_COLOR`.

______________________________________________________________________

## 11. Sharp edges (honest notes)

- **Stricter than stdlib in places.** `file_copy` refuses to overwrite by default — porting loose scripts may surface `FileExistsError`. Opt in with `overwrite=True`.
- **No `dir_move`.** Directories move via `file_move` (dirs are files), matching `fs`.
- **`FsPath` is a `str`, not a `pathlib.Path`.** Great for interop and pandas; if you want `pathlib` semantics call `.as_pathlib()` (helper) — we don't pretend to be `Path`.
- **pandas-only features fail gracefully.** Without the `[pandas]` extra, `dir_info()` returns `list[dict]` and the `.fs` accessor is unavailable; the docstring says so.
- **ExtensionDtype edge cases.** Some exotic pandas ops on `Bytes`/`Perms` columns may need `.astype(int)` first in v1; comparisons, sorting, and `sum/min/max` are supported from the start.

______________________________________________________________________

## 12. Cheat-sheet

```
NOUN_VERB(path, ...)              families: path_ file_ dir_ link_   (+ is_*, *_ids, *_temp)
  ├─ path(s) in                  str · FsPath · list · pandas.Series  (vectorized)
  ├─ tidy FsPath out             always '/', no '//' or trailing '/', UTF-8
  ├─ typed result                FsPath · Bytes('445.2K') · Perms('rwxr-xr-x') · DataFrame
  └─ raises on failure           OSError subclasses · pyrfs.FsError ; fail=False to soften

three surfaces:  fs.file_copy(a,b)  ·  FsPath(a).copy_to(b)  ·  df['p'].fs.ext()
pandas pipe:     dir_info(d).query("size > '10KB'").sort_values('size')
safe defaults:   overwrite=False · recurse=False(list)/True(create) · all=False · fail=True
from R fs:       same functional names; fluent adds Pythonic methods
```

# pyrfs — Architecture

> A Pythonic port of R's [`fs`](https://fs.r-lib.org) · Status: **design draft** · Last updated: 2026-06-11 · Companion: [`pyrfs-ux.md`](https://pyrfs.netlify.app/design/pyrfs-ux/index.md) (user-facing design)

______________________________________________________________________

## 1. Purpose & non-goals

**Purpose.** Give Python the same file-system *ergonomics* that R users enjoy from `fs`: consistent `noun_verb` naming families, tidy paths, predictable path-carrying return values, explicit failure, and **typed self-describing values** (human-readable sizes, `rwxr-xr-x` permissions) — while being **chainable/pipeable** and integrating natively with **pandas**.

**What pyrfs is.** A thin, ergonomic, fully-typed wrapper over the Python standard library (`pathlib`, `shutil`, `os`, `stat`, `pwd`/`grp`) plus an optional pandas integration layer.

**Non-goals.**

- *Not* a new filesystem abstraction over remote/cloud backends (that's `fsspec`/`PyFilesystem2`).
- *Not* a C/native extension. R's `fs` needed **libuv** for cross-platform syscalls; Python's stdlib already abstracts that, so **pyrfs is pure Python** — no build step, trivial install.
- *Not* a 1:1 transliteration. We keep `fs`'s *UX contract*, expressed in idiomatic Python.

______________________________________________________________________

## 2. Core principle — *one engine, three surfaces*

Every filesystem operation is implemented **once** in a pure-stdlib `_engine`. The three user-facing surfaces are thin delegations — no logic is duplicated across them.

```
flowchart TD
    subgraph surfaces["User-facing surfaces"]
        fn["Functional API<br/>file_copy(a, b)<br/>dir_ls(p) · path_ext(p)"]
        fp["Fluent FsPath<br/>FsPath(a).copy_to(b)<br/>(FsPath(p) / 'x').with_ext('md')"]
        acc["pandas .fs accessor<br/>df['path'].fs.ext()<br/>dir_info(p) -> DataFrame"]
    end

    eng["pyrfs._engine<br/>(pure stdlib, no pandas)<br/>paths · fileops · dirops · linkops · ids · temp"]

    std[("Python stdlib<br/>pathlib · shutil · os · stat · pwd/grp")]

    fn --> eng
    fp --> eng
    acc --> eng
    eng --> std
```

**Why this matters:** `fs` itself uses this idea — high-level R verbs compose from a small set of C primitives. pyrfs applies it in pure Python: the fluent object and the pandas accessor are *presentation layers*, and correctness lives in one place.

______________________________________________________________________

## 3. System context

```
flowchart LR
    user([Python user / data scientist])

    subgraph pyrfs["pyrfs"]
        core["core API + FsPath + typed values"]
        pdx["optional pandas layer"]
    end

    pandas{{"pandas (optional extra)"}}
    std[("OS filesystem via stdlib")]

    user -->|"file_*/dir_*/path_* · FsPath · Series.fs"| pyrfs
    core --> std
    core -.->|"lazily, if installed"| pdx
    pdx --> pandas
```

- **Inbound:** scripts, notebooks, and packages call pyrfs.
- **Hard dependency:** none beyond the standard library (Python ≥ 3.10).
- **Optional:** pandas — enables `*_info` DataFrames, the `.fs` Series accessor, and the ExtensionDtypes. Absent pandas, the core still works and `*_info` returns `list[dict]`.

______________________________________________________________________

## 4. Package layout (flat layout)

The importable package sits at the **top level** (`pyrfs/pyrfs/`), not under `src/`.

```
pyrfs/                         # repo root
├── pyproject.toml            # setuptools backend, [project], optional-deps, tooling
├── docs/                     # these design docs
├── pyrfs/                     # the importable package
│   ├── __init__.py           # PUBLIC re-exports (functions + FsPath/Bytes/Perms + FsError)
│   ├── py.typed              # PEP 561 marker (ships type info)
│   ├── errors.py             # FsError hierarchy (validation)
│   ├── fspath.py             # FsPath(str) — fluent, chainable           [PUBLIC]
│   ├── values.py             # Bytes(int), Perms(int) — typed scalars     [PUBLIC]
│   ├── display.py            # humanize bytes · perms→rwx · LS_COLORS · tidy
│   ├── _engine/              # pure-stdlib core (NEVER imports pandas)
│   │   ├── paths.py          # path_* algebra
│   │   ├── fileops.py        # file_*
│   │   ├── dirops.py         # dir_*  (ls/map/walk/info/tree/create/copy/delete)
│   │   ├── linkops.py        # link_*
│   │   ├── ids.py            # user_ids/group_ids
│   │   ├── temp.py           # file_temp stack · path_temp
│   │   └── vectorize.py      # polymorphic scalar|iterable dispatch
│   └── _pandas/              # OPTIONAL integration (imported only if pandas present)
│       ├── __init__.py       # registers .fs accessor + ExtensionDtypes
│       ├── dtypes.py         # BytesDtype, PermsDtype, PathDtype
│       ├── arrays.py         # BytesArray, PermsArray, PathArray
│       ├── accessor.py       # @register_series_accessor("fs")
│       └── frames.py         # build *_info DataFrames with typed columns
└── tests/                    # pytest mirror of the package
```

### Module responsibilities

| Module                 | Responsibility                                                                                        | Depends on                          |
| ---------------------- | ----------------------------------------------------------------------------------------------------- | ----------------------------------- |
| `_engine/paths.py`     | Pure path string algebra (`path`, `path_dir`, `path_ext*`, `path_rel`, `path_norm`, …)                | `pathlib`, `os.path`                |
| `_engine/fileops.py`   | `file_create/copy/move/delete/touch/show/chmod/chown/info/size/access`                                | `shutil`, `os`, `stat`              |
| `_engine/dirops.py`    | `dir_create/copy/delete/ls/map/walk/info/tree`, recursion & filtering                                 | `os.scandir`, `pathlib`             |
| `_engine/linkops.py`   | `link_create/copy/delete/exists/path`                                                                 | `os`                                |
| `_engine/ids.py`       | `user_ids/group_ids` (POSIX; empty lists on Windows)                                                  | `pwd`, `grp`                        |
| `_engine/temp.py`      | `file_temp` deterministic stack, `path_temp`                                                          | `tempfile`                          |
| `_engine/vectorize.py` | Decorator mapping scalar funcs over iterables/Series                                                  | —                                   |
| `fspath.py`            | `FsPath(str)` fluent object; methods delegate to `_engine`                                            | `_engine`, `display`                |
| `values.py`            | `Bytes(int)`, `Perms(int)` typed scalars                                                              | `display`                           |
| `display.py`           | Formatting/parsing: `humanize_bytes`, `parse_bytes`, `perms_to_str`, `parse_perms`, `tidy`, LS_COLORS | stdlib                              |
| `_pandas/*`            | ExtensionDtypes/arrays, `.fs` accessor, DataFrame builders                                            | `pandas`, reuses `display`/`values` |

**Invariant:** `_engine` and `values`/`display` must never `import pandas`. The optional layer depends inward on them, never the reverse — a classic dependency-inversion boundary.

______________________________________________________________________

## 5. The three surfaces in detail

### 5.1 Functional API (R-`fs` faithful)

Mirrors `fs`'s families and names exactly: `path_*` (pure, no I/O), `file_*`, `dir_*`, `link_*`, predicates (`is_file`, `is_dir`, `is_link`, …), `user_ids`/`group_ids`, temp helpers.

- **Predictable returns:** verbs return `FsPath` (or a list/Series of them); predicates return `bool` or a vectorized mapping; `file_size` → `Bytes`; `*_info` → DataFrame (or `list[dict]`).
- **Safe defaults** ported verbatim: `overwrite=False`, `recurse` defaults matching `fs` (`False` for listing, `True` for `dir_create`), `all=False`, `fail=True`.
- **`recurse: bool | int`** overload — `True`/`False`/depth, exactly like `fs`.
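The `bool | int` overload has one subtlety in Python: `bool` is a subclass of `int`, so `True` and `False` have to be recognised before the value is treated as a depth. A minimal sketch, assuming a depth of `n` means "descend `n` levels below the starting directory" (`list_entries` and `_walk` are hypothetical names):

```python
import math
import os
import tempfile


def _walk(path, depth):
    found = []
    with os.scandir(path) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            found.append(entry.path)
            if depth > 0 and entry.is_dir(follow_symlinks=False):
                found.extend(_walk(entry.path, depth - 1))
    return found


def list_entries(path, recurse=False):
    # Identity checks first: isinstance(True, int) is also true.
    if recurse is True:
        depth = math.inf
    elif recurse is False:
        depth = 0
    else:
        depth = int(recurse)
    return _walk(path, depth)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    for name in ("top.txt", os.path.join("a", "b", "c.txt")):
        open(os.path.join(tmp, name), "w").close()

    def rel(found):
        return [os.path.relpath(p, tmp) for p in found]

    shallow = rel(list_entries(tmp))
    one_level = rel(list_entries(tmp, recurse=1))
    everything = rel(list_entries(tmp, recurse=True))
```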

### 5.2 Fluent `FsPath`

`FsPath` **subclasses `str`** — the same choice as R's `fs_path ⊂ character` and the `path` library. Because an `FsPath` *is* a string, it drops into any stdlib or third-party API that expects a path, and serializes cleanly into pandas.

```
classDiagram
    class str {
        <<builtin>>
    }
    class FsPath {
        +__truediv__(other) FsPath
        +ext() str
        +with_ext(ext) FsPath
        +dir() FsPath
        +name() FsPath
        +abs() FsPath
        +real() FsPath
        +exists() bool
        +is_dir() bool
        +copy_to(dst) FsPath
        +move_to(dst) FsPath
        +touch() FsPath
        +delete() None
        +mkdir(recurse) FsPath
        +ls(...) list~FsPath~
        +info() DataFrame
    }
    str <|-- FsPath
    FsPath ..> _engine : delegates
```

Methods return `FsPath` (or lists thereof) so calls chain: `(FsPath("a") / "b").with_ext("txt").copy_to("c")`.
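A stripped-down illustration of why subclassing `str` makes chaining and interop cheap is given below. `PathSketch` is a hypothetical class with only two operations, and its tidy rule (collapse repeated `/`, drop a trailing `/`) is a simplification of whatever normalization `FsPath` actually applies.

```python
import os
import re


class PathSketch(str):
    def __new__(cls, value):
        tidy = re.sub(r"/+", "/", os.fspath(value))
        if len(tidy) > 1:
            tidy = tidy.rstrip("/")
        return super().__new__(cls, tidy)

    def __truediv__(self, other):
        # Returning the same type is what lets calls chain.
        return PathSketch(f"{self}/{other}")

    def with_ext(self, ext):
        root, _ = os.path.splitext(self)
        return PathSketch(f"{root}.{ext.lstrip('.')}" if ext else root)

    def __repr__(self):
        return f"PathSketch({str.__repr__(self)})"


joined = PathSketch("foo") / "bar" / "a.txt"
converted = (PathSketch("data") / "raw.csv").with_ext("parquet")
tidied = PathSketch("src//a.txt/")
```

Since every instance is still a `str`, functions such as `os.path.basename` or `open` accept it without conversion.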

### 5.3 pandas `.fs` accessor + DataFrame returns

- A registered Series accessor gives **vectorized path algebra over a column**: `df["path"].fs.ext()`, `.dir()`, `.with_ext("md")`, `.exists()`, `.is_dir()`.
- `dir_info()`/`file_info()` return a DataFrame whose `path`/`size`/`permissions` columns use the ExtensionDtypes, so the R headline demo translates directly:

```
(dir_info("pyrfs", recurse=False)
     .query("size > '10KB' and type == 'file'")
     .sort_values("size", ascending=False))
```

______________________________________________________________________

## 6. Typed value system

Two cooperating tiers, sharing one set of parse/format functions in `display.py`.

```
flowchart TD
    subgraph fmt["display.py — single source of truth"]
        hb["humanize_bytes / parse_bytes"]
        pp["perms_to_str / parse_perms"]
        ti["tidy (path normalizer)"]
    end

    subgraph scalars["values.py + fspath.py (always available)"]
        b["Bytes(int)"]
        p["Perms(int)"]
        fpath["FsPath(str)"]
    end

    subgraph arrays["_pandas/arrays.py (optional)"]
        ba["BytesArray / BytesDtype"]
        pa["PermsArray / PermsDtype"]
        pta["PathArray / PathDtype"]
    end

    hb --> b --> ba
    pp --> p --> pa
    ti --> fpath --> pta
```

### Scalar wrappers (pure stdlib, always present)

| Type     | Subclass of | Construct from                             | Displays as                      | Overloads                                             |
| -------- | ----------- | ------------------------------------------ | -------------------------------- | ----------------------------------------------------- |
| `Bytes`  | `int`       | `int`, `"10MB"`, `"1.5GiB"`                | `445.2K`                         | `<,>,==` parse string RHS; arithmetic returns `Bytes` |
| `Perms`  | `int`       | octal `"644"`, symbolic `"u+rw,go+r"`, int | `rw-r--r--`                      | `& \| ~` return `Perms`; `==` parses string RHS       |
| `FsPath` | `str`       | any path-like                              | tidy path (coloured in terminal) | `/` for join                                          |

Subclassing the builtins mirrors `fs`'s S3-over-atomic-vector design (`fs_bytes ⊂ numeric`, `fs_perms ⊂ integer`, `fs_path ⊂ character`): a value still behaves like its base type but *remembers what it is* and prints for humans.

### pandas ExtensionArrays (optional)

For each scalar there is a real `ExtensionArray`/`ExtensionDtype` so DataFrame columns are first-class typed:

- `BytesDtype` (`name="bytes"`, backing `int64`) — elements show `445.2K`; native `>`/`<`/`==` against strings inside `.query()`; `sum`/`min`/`max` reductions.
- `PermsDtype` (`name="perms"`) — elements show `rwxr-xr-x`.
- `PathDtype` (`name="path"`, backing object of `FsPath`) — tidy display, `<fs::path>`-style repr.

Implemented with the standard protocol (`_from_sequence`, `__getitem__`, `__len__`, `isna`, `take`, `copy`, `_concat_same_type`) plus `ExtensionScalarOpsMixin` for operators, registered via `@register_extension_dtype`. **They call the same `display.py` functions as the scalars** — no duplicated formatting logic.

______________________________________________________________________

## 7. Vectorization model

R's `fs` is vectorized end to end. Python is scalar-by-default; pyrfs bridges this with a small `@vectorized` decorator in `_engine/vectorize.py`:

```
input type              → output type
-------------------------------------
str | PathLike | FsPath → scalar (FsPath/Bytes/bool)
list | tuple | set      → list
pandas.Series           → pandas.Series   (only if pandas importable)
```

This gives `file_exists(["a", "b"])` → `[bool, bool]` and `path_ext(series)` → `Series`, while a single path returns a single value. The `.fs` accessor is the *idiomatic* vectorized-over-column surface; the decorator makes the bare functions polymorphic too.

```
flowchart LR
    inp["caller input"] --> dec{"@vectorized<br/>dispatch on type"}
    dec -->|scalar| s["f(x) -> scalar"]
    dec -->|iterable| l["[f(x) for x] -> list"]
    dec -->|Series| ser["x.map(f) -> Series"]
```
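The dispatch in the diagram can be sketched as a decorator in a few lines. This is an illustrative version under the stated input rules, not the code in `_engine/vectorize.py`; `ext_of` is a hypothetical scalar function used only to exercise it.

```python
import functools
import os

try:
    import pandas as pd
except ImportError:  # pandas is optional
    pd = None


def vectorized(func):
    """Make a scalar path function accept a scalar, an iterable, or a Series."""

    @functools.wraps(func)
    def wrapper(paths, *args, **kwargs):
        # A str is iterable too, so scalars must be recognised first.
        if isinstance(paths, (str, os.PathLike)):
            return func(paths, *args, **kwargs)
        if pd is not None and isinstance(paths, pd.Series):
            return paths.map(lambda p: func(p, *args, **kwargs))
        return [func(p, *args, **kwargs) for p in paths]

    return wrapper


@vectorized
def ext_of(path):
    return os.path.splitext(os.fspath(path))[1].lstrip(".")


single = ext_of("report.md")
many = ext_of(["a.txt", "b.tar.gz"])
from_tuple = ext_of(("x.py",))
```

Checking for `str` before iterating is the important design point: without it, a single path would be mapped character by character.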

______________________________________________________________________

## 8. Error model

`fs`'s promise is **explicit failure** (throw, never a silent `FALSE`). Python's stdlib already honors this — `os`/`shutil`/`pathlib` raise `OSError` subclasses. pyrfs's policy:

- **Reuse native exceptions** where they fit: `FileNotFoundError`, `FileExistsError`, `PermissionError` (all `OSError`). `overwrite=False` on an existing target → `FileExistsError` (matches `fs`).
- **Add `pyrfs.FsError(Exception)`** for pyrfs-level validation that has no native equivalent — e.g. `glob` and `regexp` both set, recycling length mismatch, bad permission/size literal. Subclasses (`FsValueError`, …) let callers `except` precisely, mirroring `fs`'s classed `fs_error`/`invalid_argument`.
- **`fail=False`** softens directory traversals (`dir_ls`/`dir_map`/`dir_info`) from error to warning when a single entry is inaccessible — a direct port of `fs`'s `fail` knob.

```
flowchart TD
    op["pyrfs operation"] --> k{failure?}
    k -->|"OS-level"| oserr["raise FileNotFoundError /<br/>FileExistsError / PermissionError"]
    k -->|"bad argument"| fserr["raise pyrfs.FsError subclass"]
    k -->|"traversal entry, fail=False"| warn["warnings.warn(), skip entry"]
    k -->|none| ok["return typed value (FsPath/Bytes/bool/DataFrame)"]
```
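The `fail` branch of the diagram might be implemented along these lines: a traversal that either re-raises the `OSError` or downgrades it to a warning and skips the entry. `list_tree` is a hypothetical function, and a missing directory stands in here for an inaccessible entry.

```python
import os
import tempfile
import warnings


def list_tree(path, fail=True):
    found = []
    try:
        with os.scandir(path) as entries:
            children = sorted(entries, key=lambda e: e.name)
    except OSError as exc:
        if fail:
            raise  # explicit failure is the default
        warnings.warn(f"skipping {path!r}: {exc}", stacklevel=2)
        return found
    for entry in children:
        found.append(entry.path)
        if entry.is_dir(follow_symlinks=False):
            found.extend(list_tree(entry.path, fail=fail))
    return found


with tempfile.TemporaryDirectory() as tmp:
    unreadable = os.path.join(tmp, "missing")

    try:
        list_tree(unreadable)
        strict_raised = False
    except FileNotFoundError:
        strict_raised = True

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        partial = list_tree(unreadable, fail=False)
    warning_count = len(caught)
```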

______________________________________________________________________

## 9. Optional-dependency strategy

pandas is an **extra** (`pip install pyrfs[pandas]`). The mechanism:

- `_engine` and `values`/`display` never import pandas → core is import-safe without it.
- `pyrfs/__init__.py` attempts `import pyrfs._pandas` inside a `try/except ImportError`; success registers the `.fs` accessor and the ExtensionDtypes.
- `*_info` functions check a cached `has_pandas()` flag: return a typed **DataFrame** when present, else a plain **`list[dict]`** (still useful, still typed scalars in each row).

This mirrors the dependency philosophy of R's `fs`: minimal hard dependencies (`Imports: methods`), with rich integrations listed under *Suggests* (`pillar`, `vctrs`) and wired up lazily in `.onLoad`.
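
One way the cached flag and the two return shapes might look, as a sketch only. `has_pandas` is named in the list above; the helper `rows_to_result` is hypothetical, and using `importlib.util.find_spec` for the check is an assumption (the text only says the flag is cached).

```python
import functools
import importlib.util
from typing import Any


@functools.lru_cache(maxsize=None)
def has_pandas() -> bool:
    # find_spec reports availability without actually importing pandas.
    return importlib.util.find_spec("pandas") is not None


def rows_to_result(rows: list[dict[str, Any]]) -> Any:
    """Hypothetical helper: DataFrame if pandas is installed, else the rows."""
    if has_pandas():
        import pandas as pd  # deferred import keeps the core import-safe

        return pd.DataFrame(rows)
    return rows
```

Callers that need a fixed return type can branch on `has_pandas()` themselves; the engine rows are the same in both cases.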

______________________________________________________________________

## 10. Build & tooling

- **Backend:** setuptools (`[build-system] requires = ["setuptools>=68"]`).
- **Layout:** flat — `[tool.setuptools.packages.find] where = ["."]`, `include = ["pyrfs*"]`.
- **Env/locking:** `uv` (`uv sync`, `uv run …`).
- **Python:** `requires-python = ">=3.10"`.
- **Extras:** `pandas = ["pandas>=2.0"]`, optional `color`, `dev = ["pytest","ruff","mypy"]`.
- **Quality gates:** `ruff` (lint+format), `mypy --strict` (no `Any`, `py.typed` shipped), `pytest` (pandas tests guarded by `importorskip`, run with and without the extra).
- **Docstrings:** NumPy style on the public API.

______________________________________________________________________

## 11. Representative flow — `file_copy("a.txt", dest_dir)`

```
sequenceDiagram
    participant U as caller
    participant F as file_copy (functional API)
    participant V as vectorize
    participant E as _engine.fileops
    participant S as shutil/os
    participant D as display.tidy

    U->>F: file_copy("a.txt", "out/")
    F->>V: dispatch on input shape
    V->>E: _copy_one("a.txt", "out/", overwrite=False)
    E->>E: resolve dir target -> "out/a.txt"; check exists
    alt exists and not overwrite
        E-->>U: raise FileExistsError
    else
        E->>S: shutil.copy2("a.txt", "out/a.txt")
        E->>D: tidy("out/a.txt")
        D-->>F: FsPath("out/a.txt")
        F-->>U: FsPath
    end
```

The same `_engine._copy_one` backs `FsPath.copy_to` and any `.fs`-accessor copy — *one engine, three surfaces*.
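
The engine step in the diagram can be approximated with the standard library alone. This is a simplified stand-in, not the real `_copy_one`: the name `copy_one_sketch` is hypothetical, and it returns a plain `str` where pyrfs would return an `FsPath`.

```python
import os
import shutil
import tempfile


def copy_one_sketch(src: str, dest: str, overwrite: bool = False) -> str:
    # Copying into an existing directory targets dir/basename.
    if os.path.isdir(dest):
        dest = os.path.join(dest, os.path.basename(src))
    # Refuse to clobber unless explicitly asked.
    if os.path.lexists(dest) and not overwrite:
        raise FileExistsError(dest)
    shutil.copy2(src, dest)
    # Tidy step: always forward slashes.
    return dest.replace(os.sep, "/")


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "a.txt")
    out = os.path.join(tmp, "out")
    os.mkdir(out)
    with open(src, "w") as fh:
        fh.write("hello")
    first = copy_one_sketch(src, out)  # resolves to .../out/a.txt
    try:
        copy_one_sketch(src, out)  # second copy hits the existing target
        clobbered = True
    except FileExistsError:
        clobbered = False
```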

______________________________________________________________________

## 12. Open questions & notes

- **Path display colour.** `FsPath.__repr__` colouring via `LS_COLORS` is deferred to a late phase (P6); it must degrade cleanly on non-TTY / `NO_COLOR`. Default plan: plain until P6.
- **ExtensionArray scope.** Full operator/reduction coverage on `BytesArray` is the heaviest piece; v1 targets comparisons + `sum/min/max`. Edge cases (groupby aggregations, `astype` round-trips) to be pinned down with tests in P5.
- **Windows specifics.** `user_ids`/`group_ids` return empty frames (no `pwd`/`grp`); symlink creation may require privilege. Tidy paths always use `/`. To be verified on a Windows runner.
- **`path_expand` semantics.** `fs` distinguishes `path_expand` vs `path_expand_r`; pyrfs maps the former to `os.path.expanduser` and will document any divergence rather than hide it.
- **`dir_move`.** Like `fs`, pyrfs intentionally has no `dir_move`; directories move via `file_move`.

# Project

# Changelog

All notable changes to **pyrfs** are documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased](https://github.com/Lightbridge-KS/pyrfs/compare/v0.1.0...HEAD)

## [0.1.0](https://github.com/Lightbridge-KS/pyrfs/releases/tag/v0.1.0) - 2026-06-11

Initial release — a Pythonic port of the UX of R's [fs](https://fs.r-lib.org) package.

### Added

- **Path algebra** (`path_*`, no I/O): `path()` with `ext=`, `path_dir`/`path_file`/`path_ext*`, `path_rel`, `path_common`, `path_filter` (glob/regexp, mutually exclusive), `path_split`/`path_join`, `path_has_parent`, `path_sanitize`, `path_expand`/`path_home`/`path_temp`, `path_tidy`.
- **`FsPath`** — a tidy path that subclasses `str`: `/` join operator, chainable methods delegating to the engine, `LS_COLORS`-coloured repr (degrades on non-TTY / `NO_COLOR`), `as_pathlib()` escape hatch.
- **Typed scalars**: `Bytes ⊂ int` (parses `"10MB"`, displays `444.5K`, compares against literals, arithmetic stays typed — all units 1024-based) and `Perms ⊂ int` (octal/symbolic/`rw-r--r--` forms, mode algebra).
- **File operations** (`file_*`): create/touch/copy/move/delete/exists/access/size/chmod/chown/show/info. Mutating verbs return the new path; `overwrite=False` raises `FileExistsError` on an existing target; copy/move into an existing directory targets `dir/basename`; symbolic chmod applies to the current mode.
- **Directory operations** (`dir_*`): create/copy/delete/exists, lazy `dir_walk` generator with the full fs filter set (`all`, `recurse: bool | int`, `type`, `glob`/`regexp`, `invert`, `fail=False` → warn-and-skip), `dir_ls`, `dir_map`, `dir_info`, and a box-drawing, coloured `dir_tree`. No `dir_move` by design — use `file_move`.
- **Link operations** (`link_*`): symbolic (default) and hard creation, `link_path`, `link_exists`, `link_copy`, `link_delete` (refuses non-links).
- **Predicates & ids**: `is_file`/`is_dir`/`is_link` (lstat semantics — a symlink is only `is_link`), `is_file_empty`, `is_dir_empty`, `is_absolute_path`; `user_ids`/`group_ids` (POSIX).
- **Vectorization**: every path-taking function is polymorphic over a scalar, list/tuple/set, or pandas Series (without the engine importing pandas).
- **pandas layer** (optional `[pandas]` extra): `bytes`/`perms`/`path` ExtensionDtypes lifting the scalar semantics onto columns (`size > "10KB"` works in `.query()`), the `Series.fs` accessor, and `file_info`/`dir_info` returning typed DataFrames (plain engine rows when pandas is not installed).
- **Temp helpers**: `file_temp` with a deterministic `file_temp_push`/`pop` stack for reproducible docs and tests.
- **Errors**: native `OSError` subclasses for OS failures; `FsError`/`FsValueError` for pyrfs-level validation.
- **Docs**: MkDocs Material site at <https://pyrfs.netlify.app> with llms.txt/llms-full.txt, an executed tour notebook, and a Quarto-rendered README kept fresh by CI.