The DwCA library processes Darwin Core Archive files


gnames/dwca


DwCA is an app and a Go library to deal with Darwin Core Archive files.

Fast reader and writer of Darwin Core Archive Files. For now only checklist files are supported.

Installation

Homebrew on Mac OS X, Linux, and Linux on Windows (WSL2)

TLDR:

```bash
brew tap gnames/gn
brew install dwca
```

Homebrew is a popular package manager for Open Source software originally developed for Mac OS X. Now it is also available on Linux, and can easily be used on MS Windows 10 or 11, if Windows Subsystem for Linux (WSL) is installed.

Note that Homebrew requires some other programs to be installed, such as curl, Git, and a compiler (GCC on Linux, Xcode on Mac). If that is too much, go to the Linux and Mac without Homebrew section.

  1. Install Homebrew according to their instructions.

  2. Install dwca with:

    brew tap gnames/gn
    brew install dwca
    # to upgrade
    brew upgrade dwca

Manual Install

dwca consists of just one executable file, so it is easy to install by hand. To do that, download the binary executable for your operating system from the latest release.

Linux and Mac without Homebrew

Move the dwca executable somewhere in your PATH (for example, /usr/local/bin):

    sudo mv path_to/dwca /usr/local/bin

Go

Install Go v1.22 or higher.

    git clone git@github.com:/gnames/dwca
    cd dwca
    make tools
    make install

Configuration

When you run the `dwca -V` command for the first time, it creates a `dwca.yml` configuration file.

The file is created in one of the following locations, depending on the operating system:

MS Windows: C:\Users\AppData\Roaming\dwca.yml

Mac OS: $HOME/.config/dwca.yml

Linux: $HOME/.config/dwca.yml

This file lets you set options that modify the behaviour of dwca according to your needs. It spares you from entering the same command line flags again and again.

Command line flags override the settings in the configuration file.

It is also possible to set up environment variables. They override both the configuration file settings and the command line flags.

| Settings                 | Environment variables           |
| ------------------------ | ------------------------------- |
| RootPath                 | DWCA_ROOT_PATH                  |
| OutputArchiveCompression | DWCA_OUTPUT_ARCHIVE_COMPRESSION |
| OutputCSVType            | DWCA_OUTPUT_CSV_TYPE            |
| JobsNum                  | DWCA_JOBS_NUM                   |
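
The precedence described above (configuration file, then flags, then environment variables) can be sketched for a single setting. This is only an illustration: `resolveJobsNum` is a hypothetical helper, not part of dwca, and the rule that invalid or non-positive values are ignored is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// resolveJobsNum is a hypothetical helper (not dwca's actual code).
// It starts from the configuration file value, lets a command line flag
// override it, and finally lets DWCA_JOBS_NUM override both.
// lookupEnv has the same shape as os.LookupEnv so it can be swapped out.
func resolveJobsNum(fromConfig int, fromFlag *int, lookupEnv func(string) (string, bool)) int {
	jobs := fromConfig
	if fromFlag != nil {
		// The flag was given on the command line.
		jobs = *fromFlag
	}
	if raw, ok := lookupEnv("DWCA_JOBS_NUM"); ok {
		// Assumption: values that are not positive integers are ignored.
		if n, err := strconv.Atoi(raw); err == nil && n > 0 {
			jobs = n
		}
	}
	return jobs
}

func main() {
	flagValue := 100 // as if `-j 100` had been passed
	fmt.Println(resolveJobsNum(4, &flagValue, os.LookupEnv))
}
```

Passing the lookup function as a parameter keeps the sketch easy to exercise without touching the real environment.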

Usage

Usage as a command line app

To see flags and usage:

    dwca --help
    # or just
    dwca

To see the version of its binary:

    dwca -V

Normalizing DwCA file

    dwca normalize input_file.zip  <output.zip>
    ## change number of concurrent jobs
    dwca normalize -j 100 input_file.zip  <output.zip>
    ## change to comma-separated format for the output
    dwca normalize -c csv input_dwca.zip
    ## change to a `tar.gz` archive
    dwca normalize -a tar input_dwca.zip
    ## to skip or process rows with wrong number of fields in CSV files
    dwca normalize -w skip input_dwca.zip
    dwca normalize --wrong-fields-num process input_dwca.zip

If the output path is not given, the output is written to {input file name}.norm.zip or, for a `tar.gz` archive, to {input file name}.norm.tar.gz
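
The default naming rule can be written down as a small sketch. `normalizedName` is a hypothetical function, not dwca's API. Whether the input file name keeps its original extension is not stated here, so the sketch takes the name as given. The `zip` and `tar` values mirror the `-a` flag shown above.

```go
package main

import "fmt"

// normalizedName is a hypothetical helper (not part of dwca).
// inputName is whatever dwca treats as "{input file name}"; archive is
// the archive type, where "tar" selects a tar.gz archive and any other
// value falls back to zip (an assumption of this sketch).
func normalizedName(inputName, archive string) string {
	if archive == "tar" {
		return inputName + ".norm.tar.gz"
	}
	return inputName + ".norm.zip"
}

func main() {
	fmt.Println(normalizedName("input_dwca", "zip"))
	fmt.Println(normalizedName("input_dwca", "tar"))
}
```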

Development

To install the latest dwca:

    git clone git@github.com:/gnames/dwca
    cd dwca
    make tools
    make install

Testing

To avoid conflicts in the filesystem, run the tests sequentially:

    go test -p 1 -count=1 ./...