A command line tool to identify and filter plain text files based on their programming language. This can be useful for automated formatting and linting of code as part of continuous integration.
The language is identified by first checking the file for a shebang line with a known interpreter; if none is found, the file name itself is checked against known glob patterns. This two-step process is more accurate than a naive search for file extensions, because it correctly classifies executable scripts, which often have no extension at all.
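The two-step check described above can be sketched in Rust as follows. This is a minimal illustration, not taxonate's actual implementation: the function names, the lookup tables, and the plain suffix match (standing in for real glob patterns) are all assumptions made for the example.

```rust
// Hypothetical lookup tables for illustration only; the real tool loads its
// language definitions from the data/ directory.
const INTERPRETERS: &[(&str, &str)] = &[
    ("python3", "Python"),
    ("python", "Python"),
    ("node", "JavaScript"),
];

// A plain suffix match stands in for the glob patterns the tool uses.
const SUFFIXES: &[(&str, &str)] = &[
    (".rs", "Rust"),
    (".toml", "TOML"),
    (".md", "Markdown"),
    (".js", "JavaScript"),
];

/// Extract the interpreter name from a shebang line, if the line is one.
fn shebang_interpreter(first_line: &str) -> Option<&str> {
    let rest = first_line.strip_prefix("#!")?;
    let mut words = rest.split_whitespace();
    let program = words.next()?;
    // Keep only the last path component, e.g. "/usr/bin/env" -> "env".
    let name = program.rsplit('/').next()?;
    if name == "env" {
        // `#!/usr/bin/env python3`: the interpreter is the next word.
        words.next()
    } else {
        Some(name)
    }
}

/// Identify a language from a file name and the file's contents.
fn identify(file_name: &str, contents: &str) -> &'static str {
    // Step 1: look for a shebang line naming a known interpreter.
    let first_line = contents.lines().next().unwrap_or("");
    if let Some(interpreter) = shebang_interpreter(first_line) {
        if let Some((_, language)) = INTERPRETERS
            .iter()
            .find(|(name, _)| *name == interpreter)
        {
            return *language;
        }
    }

    // Step 2: fall back to matching the file name.
    SUFFIXES
        .iter()
        .find(|(suffix, _)| file_name.ends_with(*suffix))
        .map(|(_, language)| *language)
        .unwrap_or("Unknown")
}
```

With these tables, an extensionless script named `deploy` that starts with `#!/usr/bin/env python3` is classified by step 1, while `src/lib.rs` has no shebang and is classified by step 2. An extension-only lookup would miss the first case.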
taxonate [FLAGS] [OPTIONS] [PATH]...
Run taxonate --help for detailed information on all features.
In its simplest form, point taxonate at a file to identify its language:
$ taxonate src/lib.rs
src/lib.rs: Rust
File names can also be read from STDIN if you specify a dash (-) as the path:
$ find . -name "main*" | taxonate -
./target/doc/main.js: JavaScript
./src/bin/taxonate/main.rs: Rust
Instead of pointing to individual files, you can point to a directory to
recursively identify all files within it (respecting .gitignore patterns):
$ taxonate .
./LICENSE-MIT: Unknown
./Cargo.toml: TOML
./src/bin/taxonate/main.rs: Rust
./src/bin/taxonate/app.rs: Rust
./src/lib.rs: Rust
./src/languages.rs: Rust
./src/config.rs: Rust
./data/languages.cue: CUE
./data/languages.json: JSON
./data/README.md: Markdown
./data/dump_tool.cue: CUE
./LICENSE-APACHE: Unknown
./Cargo.lock: Unknown
./CHANGELOG.md: Markdown
./README.md: Markdown
NOTE: If no path is provided, taxonate will default to recursively identifying files within the current directory.
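If you consume the default output from another program, each line can be split into a path and a language name. The following is a rough sketch, assuming the `path: Language` layout shown in the examples above and assuming language names never contain `: `; the helper names `parse_line` and `paths_for` are hypothetical and not part of taxonate.

```rust
/// Split one line of output into (path, language).
/// Splitting at the last ": " keeps paths that themselves contain ": " intact.
fn parse_line(line: &str) -> Option<(&str, &str)> {
    line.rsplit_once(": ")
}

/// Collect the paths whose identified language equals `language`.
/// Lines that do not contain ": " are skipped.
fn paths_for<'a>(output: &'a str, language: &str) -> Vec<&'a str> {
    output
        .lines()
        .filter_map(parse_line)
        .filter(|(_, lang)| *lang == language)
        .map(|(path, _)| path)
        .collect()
}
```

Splitting from the right is a deliberate choice here: a file name may legitimately contain a colon, whereas the language name is always the final field on the line.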
To filter the output so it only displays files identified as a specific
language, use the --language option:
$ taxonate --language rust
./src/bin/taxonate/main.rs: Rust
./src/bin/taxonate/app.rs: Rust
./src/lib.rs: Rust
./src/languages.rs: Rust
./src/config.rs: Rust
To display just the file names without the identified language (e.g. if you want
to pipe the output elsewhere), add the --filename-only flag:
$ taxonate --language rust --filename-only
./src/bin/taxonate/main.rs
./src/bin/taxonate/app.rs
./src/lib.rs
./src/languages.rs
./src/config.rs
You can display a list of the supported languages (in key: name format) by
running the following command:
$ taxonate --list-languages
Here, key is the value you should provide to the --language option when
filtering. See the data/ directory for more details on how the languages are
defined.
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.