Fast, async, stream-based link checker written in Rust.
Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
Lychee is an async, stream-based link checker designed to identify broken URLs and email addresses in Markdown, HTML, reStructuredText, websites, and more. Built with Rust, it efficiently processes large volumes of links while maintaining low memory usage.
Key Features:
Async and stream-based architecture for fast performance
Cross-platform support including Windows, macOS, and Linux
Integrates seamlessly as a command-line tool or library
GitHub Actions integration for automated link checking in CI/CD pipelines
Configurable concurrency limits to prevent server overload
Modern TLS support with optional native or Rustls implementations
Email verification capabilities
Flexible configuration options via TOML files
Audience & Benefit:
Ideal for developers, technical writers, and DevOps teams seeking reliable link validation. Lychee helps maintain content integrity by detecting broken links early in the development process, reducing downtime and user frustration. Its scalability makes it suitable for large projects while its CI/CD integration streamlines workflow automation.
README
⚡ A fast, async, stream-based link checker written in Rust.
Finds broken hyperlinks and mail addresses inside Markdown, HTML,
reStructuredText, or any other text file or website!
Available as a command-line utility, a library and a GitHub Action.
After installing Rust, use Cargo for building and testing.
On Linux, the OpenSSL package is required to compile reqwest, a dependency of lychee.
For Nix, we provide a flake so you can use nix develop and nix build.
# available for Alpine Edge in testing repositories
apk add lychee
Chocolatey (Windows)
choco install lychee
Conda
conda install lychee -c conda-forge
Pre-built binaries
We provide binaries for Linux, macOS, and Windows for every release.
You can download them from the releases page.
Cargo
Build dependencies
On APT/dpkg-based Linux distros (e.g. Debian, Ubuntu, Linux Mint and Kali Linux)
the following commands will install all required build dependencies, including
the Rust toolchain and cargo:
1 Other machine-readable formats like CSV are supported.
Commandline usage
Recursively check all links in supported files inside the current directory
lychee .
You can also specify various types of inputs:
# check links in specific local file(s):
lychee README.md
lychee test.html info.txt
# check links on a website:
lychee https://endler.dev
# check links in directory but block network requests
lychee --offline path/to/directory
# check links in a remote file:
lychee https://raw.githubusercontent.com/lycheeverse/lychee/master/README.md
# check links in local files via shell glob:
lychee ~/projects/*/README.md
# check links in local files (lychee supports advanced globbing and ~ expansion):
lychee "~/projects/big_project/**/README.*"
# ignore case when globbing and check result for each link:
lychee --glob-ignore-case "~/projects/**/[r]eadme.*"
# check links from epub file (requires atool: https://www.nongnu.org/atool)
acat -F zip {file.epub} "*.xhtml" "*.html" | lychee -
lychee parses other file formats as plaintext and extracts links using linkify.
This generally works well if there are no format or encoding specifics,
but in case you need dedicated support for a new file format, please consider creating an issue.
Docker Usage
Here's how to mount a local directory into the container and check some input
with lychee.
The --init parameter is passed so that lychee can be stopped from the terminal.
We also pass -it to start an interactive terminal, which is required to show the progress bar.
The --rm flag removes the container from the host once the run has finished (self-cleanup).
The -w /input option sets /input as the default working directory inside the container.
The -v $(pwd):/input option mounts the current local directory at /input so that lychee can access it.
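The flags described above can be combined into a single invocation. The following is a minimal sketch rather than an official script: the helper name lychee_docker is hypothetical, and it assumes the default image is published as lycheeverse/lychee (the same name as the Alpine tag, without the tag suffix).

```shell
# Hypothetical helper (not shipped with lychee): wraps the Docker flags
# described above. Assumes the default image is named lycheeverse/lychee.
lychee_docker() {
  # --init  lets lychee be stopped from the terminal
  # -it     interactive terminal, needed for the progress bar
  # --rm    remove the container after the run
  # -w      use /input as the working directory inside the container
  # -v      mount the current directory at /input
  docker run --init -it --rm \
    -w /input \
    -v "$(pwd)":/input \
    lycheeverse/lychee "$@"
}
```

With this helper defined, lychee_docker README.md would check README.md from the current directory inside the container.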
> By default a Debian-based Docker image is used. If you want to run an Alpine-based image, use the latest-alpine tag.
> For example, lycheeverse/lychee:latest-alpine
To avoid getting rate-limited while checking GitHub links, you can optionally
set an environment variable with your GitHub token like so GITHUB_TOKEN=xxxx,
or use the --github-token CLI option. It can also be set in the config file.
Here is an example config file.
The token can be generated on your GitHub account settings page.
A personal access token with no extra permissions is enough to be able to check public repo links.
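The two command-line ways of supplying the token can be sketched as follows. The helper names are hypothetical, and xxxx stands for your own token as in the text above.

```shell
# Hypothetical helpers (not part of lychee) showing both documented ways
# of passing a GitHub token. Usage: <helper> TOKEN INPUT...

# Variant 1: pass the token through the GITHUB_TOKEN environment variable.
check_with_env_token() {
  token=$1
  shift
  GITHUB_TOKEN="$token" lychee "$@"
}

# Variant 2: pass the token through the --github-token CLI option.
check_with_cli_token() {
  token=$1
  shift
  lychee --github-token "$token" "$@"
}
```

For example, check_with_env_token xxxx README.md runs lychee on README.md with GITHUB_TOKEN set only for that one invocation.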
For more scalable organization-wide scenarios you can consider a GitHub App.
It has a higher rate limit than personal access tokens but requires additional configuration steps in your GitHub workflow.
Please follow the GitHub App Setup example.
Commandline Parameters
There is an extensive list of command line parameters to customize the behavior.
See below for a full list.
A fast, async link checker
Finds broken URLs and mail addresses inside Markdown, HTML, `reStructuredText`, websites and more!
Usage: lychee [OPTIONS] ...
Arguments:
...
The inputs (where to get links to check from). These can be: files (e.g. `README.md`), file globs (e.g. `"~/git/*/README.md"`), remote URLs (e.g. `https://example.com/README.md`) or standard input (`-`). NOTE: Use `--` to separate inputs from options that allow multiple arguments
Options:
-c, --config
Configuration file to use
[default: lychee.toml]
-v, --verbose...
Set verbosity level; more output per occurrence (e.g. `-v` or `-vv`)
-q, --quiet...
Less output per occurrence (e.g. `-q` or `-qq`)
-n, --no-progress
Do not show progress bar.
This is recommended for non-interactive shells (e.g. for continuous integration)
--extensions
Test the specified file extensions for URIs when checking files locally.
Multiple extensions can be separated by commas. Note that if you want to check filetypes,
which have multiple extensions, e.g. HTML files with both .html and .htm extensions, you need to
specify both extensions explicitly.
[default: md,mkd,mdx,mdown,mdwn,mkdn,mkdown,markdown,html,htm,txt]
--cache
Use request cache stored on disk at `.lycheecache`
--max-cache-age
Discard all cached requests older than this duration
[default: 1d]
--cache-exclude-status
A list of status codes that will be ignored from the cache
The following exclude range syntax is supported: [start]..[[=]end]|code. Some valid
examples are:
- 429 (excludes the 429 status code only)
- 500.. (excludes any status code >= 500)
- ..100 (excludes any status code < 100)
- 500..=599 (excludes any status code from 500 to 599 inclusive)
- 500..600 (excludes any status code from 500 to 600 excluding 600, same as 500..=599)
Use "lychee --cache-exclude-status '429, 500..502' ..." to provide a comma-separated
list of excluded status codes. This example will not cache results with a status code of 429, 500
and 501.
[default: ]
--dump
Don't perform any link checking. Instead, dump all the links extracted from inputs that would be checked
--dump-inputs
Don't perform any link extraction and checking. Instead, dump all input sources from which links would be collected
--archive
Specify the use of a specific web archive. Can be used in combination with `--suggest`
[possible values: wayback]
--suggest
Suggest link replacements for broken links, using a web archive. The web archive can be specified with `--archive`
-m, --max-redirects
Maximum number of allowed redirects
[default: 5]
--max-retries
Maximum number of retries per request
[default: 3]
--min-tls
Minimum accepted TLS Version
[possible values: TLSv1_0, TLSv1_1, TLSv1_2, TLSv1_3]
--max-concurrency
Maximum number of concurrent network requests
[default: 128]
-T, --threads
Number of threads to utilize. Defaults to number of cores available to the system
-u, --user-agent
User agent
[default: lychee/x.y.z]
-i, --insecure
Proceed for server connections considered insecure (invalid TLS)
-s, --scheme
Only test links with the given schemes (e.g. https). Omit to check links with any other scheme. At the moment, we support http, https, file, and mailto
--offline
Only check local files and block network requests
--include
URLs to check (supports regex). Has preference over all excludes
--exclude
Exclude URLs and mail addresses from checking. The values are treated as regular expressions
--exclude-file
Deprecated; use `--exclude-path` instead
--exclude-path
Exclude paths from getting checked. The values are treated as regular expressions
-E, --exclude-all-private
Exclude all private IPs from checking.
Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`
--exclude-private
Exclude private IP address ranges from checking
--exclude-link-local
Exclude link-local IP address range from checking
--exclude-loopback
Exclude loopback IP address range and localhost from checking
--include-mail
Also check email addresses
--remap
Remap URI matching pattern to different URI
--fallback-extensions
When checking locally, attempts to locate missing files by trying the given
fallback extensions. Multiple extensions can be separated by commas. Extensions
will be checked in order of appearance.
Example: --fallback-extensions html,htm,php,asp,aspx,jsp,cgi
Note: This option takes effect on `file://` URIs which do not exist and on
`file://` URIs pointing to directories which resolve to themself (by the
--index-files logic).
--index-files
When checking locally, resolves directory links to a separate index file.
The argument is a comma-separated list of index file names to search for. Index
names are relative to the link's directory and attempted in the order given.
If `--index-files` is specified, then at least one index file must exist in
order for a directory link to be considered valid. Additionally, the special
name `.` can be used in the list to refer to the directory itself.
If unspecified (the default behavior), index files are disabled and directory
links are considered valid as long as the directory exists on disk.
Example 1: `--index-files index.html,readme.md` looks for index.html or readme.md
and requires that at least one exists.
Example 2: `--index-files index.html,.` will use index.html if it exists, but
still accept the directory link regardless.
Example 3: `--index-files ''` will reject all directory links because there are
no valid index files. This will require every link to explicitly name
a file.
Note: This option only takes effect on `file://` URIs which exist and point to a directory.
-H, --header
Set custom header for requests
Some websites require custom headers to be passed in order to return valid responses.
You can specify custom headers in the format 'Name: Value'. For example, 'Accept: text/html'.
This is the same format that other tools like curl or wget use.
Multiple headers can be specified by using the flag multiple times.
-a, --accept
A List of accepted status codes for valid links
The following accept range syntax is supported: [start]..[[=]end]|code. Some valid
examples are:
- 200 (accepts the 200 status code only)
- ..204 (accepts any status code < 204)
- ..=204 (accepts any status code <= 204)
- 200..=204 (accepts any status code from 200 to 204 inclusive)
- 200..205 (accepts any status code from 200 to 205 excluding 205, same as 200..=204)
Use "lychee --accept '200..=204, 429, 500' ..." to provide a comma-separated
list of accepted status codes. This example will accept 200, 201,
202, 203, 204, 429, and 500 as valid status codes.
[default: 100..=103,200..=299]
--include-fragments
Enable the checking of fragments in links
-t, --timeout
Website timeout in seconds from connect to response finished
[default: 20]
-r, --retry-wait-time
Minimum wait time in seconds between retries of failed requests
[default: 1]
-X, --method
Request method
[default: get]
--base
Deprecated; use `--base-url` instead
-b, --base-url
Base URL to use when resolving relative URLs in local files. If specified,
relative links in local files are interpreted as being relative to the given
base URL.
For example, given a base URL of `https://example.com/dir/page`, the link `a`
would resolve to `https://example.com/dir/a` and the link `/b` would resolve
to `https://example.com/b`. This behavior is not affected by the filesystem
path of the file containing these links.
Note that relative URLs without a leading slash become siblings of the base
URL. If, instead, the base URL ended in a slash, the link would become a child
of the base URL. For example, a base URL of `https://example.com/dir/page/` and
a link of `a` would resolve to `https://example.com/dir/page/a`.
Basically, the base URL option resolves links as if the local files were hosted
at the given base URL address.
--root-dir
Root directory to use when checking absolute links in local files. This option is
required if absolute links appear in local files, otherwise those links will be
flagged as errors. This must be an absolute path (i.e., one beginning with `/`).
If specified, absolute links in local files are resolved by prefixing the given
root directory to the requested absolute link. For example, with a root-dir of
`/root/dir`, a link to `/page.html` would be resolved to `/root/dir/page.html`.
This option can be specified alongside `--base-url`. If both are given, an
absolute link is resolved by constructing a URL from three parts: the domain
name specified in `--base-url`, followed by the `--root-dir` directory path,
followed by the absolute link's own path.
--basic-auth
Basic authentication support. E.g. `http://example.com username:password`
--github-token
GitHub API token to use when checking github.com links, to avoid rate limiting
[env: GITHUB_TOKEN]
--skip-missing
Skip missing input files (default is to error if they don't exist)
--no-ignore
Do not skip files that would otherwise be ignored by '.gitignore', '.ignore', or the global ignore file
--hidden
Do not skip hidden directories and files
--include-verbatim
Find links in verbatim sections like `pre`- and `code` blocks
--glob-ignore-case
Ignore case when expanding filesystem path glob inputs
-o, --output
Output file of status report
--mode
Set the output display mode. Determines how results are presented in the terminal
[default: color]
[possible values: plain, color, emoji, task]
-f, --format
Output format of final status report
[default: compact]
[possible values: compact, detailed, json, markdown, raw]
--require-https
When HTTPS is available, treat HTTP links as errors
--cookie-jar
Tell lychee to read cookies from the given file. Cookies will be stored in the cookie jar and sent with requests. New cookies will be stored in the cookie jar and existing cookies will be updated
--include-wikilinks
Check WikiLinks in Markdown files
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
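The range syntax shared by --accept and --cache-exclude-status ([start]..[[=]end]|code) can be easier to follow with a small model of it. The shell functions below are only an illustration of the semantics described in the help text, not lychee's own parser; the names status_matches and status_in_list are made up for this sketch.

```shell
# Sketch of the documented range syntax: [start]..[[=]end]|code.
# This is NOT lychee's parser, only an illustration of the semantics.

# status_matches CODE SPEC: succeeds if CODE falls inside a single SPEC.
status_matches() {
  code=$1
  spec=$2
  case "$spec" in
    *..=*) start=${spec%%..=*}; end=${spec##*..=}; inclusive=1 ;;
    *..*)  start=${spec%%..*};  end=${spec##*..};  inclusive=0 ;;
    *)     [ "$code" -eq "$spec" ]; return ;;
  esac
  # An empty start means "no lower bound".
  if [ -n "$start" ] && [ "$code" -lt "$start" ]; then return 1; fi
  # An empty end means "no upper bound".
  [ -n "$end" ] || return 0
  if [ "$inclusive" -eq 1 ]; then
    [ "$code" -le "$end" ]
  else
    [ "$code" -lt "$end" ]
  fi
}

# status_in_list CODE LIST: succeeds if CODE matches any comma-separated SPEC.
status_in_list() {
  list_code=$1
  # Turn commas into spaces and rely on word splitting to drop whitespace.
  for part in $(printf '%s' "$2" | tr ',' ' '); do
    status_matches "$list_code" "$part" && return 0
  done
  return 1
}
```

With these helpers, status_in_list 203 '200..=204, 429, 500' succeeds, while status_in_list 502 '429, 500..502' fails, because the end of a range written without = is exclusive.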
Exit codes
0 for success (all links checked successfully or excluded/skipped as configured)
1 for missing inputs and any unexpected runtime failures or config errors
2 for link check failures (if any non-excluded link failed the check)
3 for errors in the config file
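In a script or CI job, the exit codes listed above can be used to tell broken links apart from setup problems. The following is a minimal sketch; run_link_check is a hypothetical helper name, and the messages are paraphrased from the list above.

```shell
# Hypothetical CI helper (not part of lychee): runs lychee and reports
# what its exit code means, then returns the same code to the caller.
run_link_check() {
  exit_code=0
  # --no-progress is recommended for non-interactive shells such as CI.
  lychee --no-progress "$@" || exit_code=$?
  case "$exit_code" in
    0) echo "success: all links passed, or were excluded/skipped" ;;
    1) echo "error: missing inputs, unexpected runtime failure, or config error" ;;
    2) echo "failure: at least one non-excluded link failed the check" ;;
    3) echo "error: problem in the config file" ;;
    *) echo "unknown exit code: $exit_code" ;;
  esac
  return "$exit_code"
}
```

Calling run_link_check . checks the current directory and passes lychee's exit code through, so a surrounding pipeline can still fail on broken links.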
Ignoring links
You can exclude links from getting checked by specifying regex patterns
with --exclude (e.g. --exclude example\.(com|org)).
Here are some examples:
# Exclude LinkedIn URLs (note that we match on the full URL, including the scheme, to avoid false positives)
lychee --exclude '^https://www\.linkedin\.com' .
# Exclude LinkedIn and Archive.org URLs
lychee --exclude '^https://www\.linkedin\.com' --exclude '^https://web\.archive\.org/web/' .
# Exclude all links to PDF files
lychee --exclude '\.pdf$' .
# Exclude links to specific domains
lychee --exclude '(facebook|twitter|linkedin)\.com' .
# Exclude links with certain URL parameters
lychee --exclude '\?utm_source=' .
# Exclude all mailto links
lychee --exclude '^mailto:' .
To exclude files or directories from being scanned, set exclude_path
in lychee.toml:
exclude_path = ["some/path", "*/dev/*"]
If a file named .lycheeignore exists in the current working directory, its
contents are excluded as well. The file allows you to list multiple regular
expressions for exclusion (one pattern per line).
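As an illustration, the snippet below writes a small .lycheeignore that reuses three of the --exclude patterns shown earlier. It is only a sketch: the choice of patterns is arbitrary, and running it overwrites any existing .lycheeignore in the current directory.

```shell
# Write an example .lycheeignore into the current directory.
# One regular expression per line; the patterns are taken from the
# --exclude examples above. The quoted 'EOF' keeps \ and $ literal.
cat > .lycheeignore <<'EOF'
^https://www\.linkedin\.com
\.pdf$
^mailto:
EOF
```

A subsequent lychee . in the same directory should then skip LinkedIn URLs, links to PDF files, and mailto links.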
For more advanced usage and detailed explanations, check out our comprehensive guide on excluding links.
Caching
If the --cache flag is set, lychee will cache responses in a file called
.lycheecache in the current directory. If the file exists and the flag is set,
then the cache will be loaded on startup. This can greatly speed up future runs.
Note that by default lychee will not store any data on disk.
Library usage
You can use lychee as a library for your own projects!
Here is a "hello world" example:
use lychee_lib::Result;

#[tokio::main]
async fn main() -> Result<()> {
    let response = lychee_lib::check("https://github.com/lycheeverse/lychee").await?;
    println!("{response}");
    Ok(())
}
This is equivalent to the following snippet, in which we build our own client:
use lychee_lib::{ClientBuilder, Result};

#[tokio::main]
async fn main() -> Result<()> {
    // Build a client with default options, then check a single link.
    let client = ClientBuilder::default().client()?;
    let response = client.check("https://github.com/lycheeverse/lychee").await?;
    assert!(response.status().is_success());
    Ok(())
}
All options that you set will be used for all link checks.
See the builder documentation
for all options. For more information, check out the examples
directory. The examples can be run with cargo run --example <name>, where <name> is the name of an example.
GitHub Action Usage
A GitHub Action that uses lychee is available as a separate repository: lycheeverse/lychee-action
which includes usage instructions.
If you are using lychee for your project, please add it here.
Credits
The first prototype of lychee was built in episode 10 of Hello
Rust. Thanks to all GitHub and Patreon sponsors
for supporting the development since the beginning. Also, thanks to all the
great contributors who have since made this project more mature.