
Short 2

Andrea Telatin edited this page Dec 4, 2019 · 8 revisions

Text files and bioinformatics

The cat command (short for "concatenate") prints the content of one or more text files.

Let's start by locating some ".txt" files. First, we move to the "course" directory we created in our home:

cd ~/course

From there, we can use find to locate files under a given path (here the learn_bash directory, also in our home). find accepts several criteria; the -name option filters by filename and understands wildcards:

find ../learn_bash/ -name "*.txt"
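To see how the wildcard filter behaves without depending on the course files, here is a minimal sketch run in a throwaway directory. The directory and the file names (notes.txt, intro.txt, reads.fasta) are made up for illustration and are not part of learn_bash.

```shell
# Hypothetical sandbox: a temporary directory, not the course's learn_bash folder
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/files"
touch "$demo_dir/notes.txt" "$demo_dir/files/intro.txt" "$demo_dir/files/reads.fasta"

# Quote the pattern so that find, not the shell, expands the wildcard
txt_files=$(find "$demo_dir" -name "*.txt")
echo "$txt_files"

# -type f keeps only regular files; wc -l counts the matching paths
# (tr strips the padding some wc implementations add)
txt_count=$(find "$demo_dir" -type f -name "*.txt" | wc -l | tr -d ' ')
echo "Found $txt_count .txt files"
```

Only the two .txt files are listed; reads.fasta is skipped because it does not match the pattern. The temporary directory is only for illustration: the course files are still the ones under ../learn_bash/.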

Choose one of those files and try cat. For example:

cat ../learn_bash/files/introduction.txt
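The name "concatenate" makes more sense when cat receives more than one file. A small sketch, assuming two made-up files (a.txt and b.txt) in a temporary directory:

```shell
# Hypothetical files created only for this example
demo_dir=$(mktemp -d)
printf 'line A1\nline A2\n' > "$demo_dir/a.txt"
printf 'line B1\n' > "$demo_dir/b.txt"

# With several arguments, cat prints the files one after another, in the order given
combined=$(cat "$demo_dir/a.txt" "$demo_dir/b.txt")
echo "$combined"

# Redirecting the output saves the concatenation to a new file
cat "$demo_dir/a.txt" "$demo_dir/b.txt" > "$demo_dir/all.txt"
```

The output is the two lines of a.txt followed by the single line of b.txt.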

If the file is long, we don't want to flood our terminal with the whole thing. Often a preview of the first (or last) lines is enough. The head and tail commands do this: each prints 10 lines by default, and -n NUMBER sets a different count:

head ~/learn_bash/files/introduction.txt
tail -n 3 ~/learn_bash/files/introduction.txt
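The effect of the default and of -n is easier to check on a file whose content we know. The sketch below assumes a made-up 20-line file holding the numbers 1 to 20 (generated with seq), not the course's introduction.txt.

```shell
# Hypothetical test file: one number per line, from 1 to 20
demo_file=$(mktemp)
seq 1 20 > "$demo_file"

# Without options, head prints the first 10 lines (1 to 10)
first_default=$(head "$demo_file")
echo "$first_default"

# -n sets the number of lines: here the last three (18, 19, 20)
last_three=$(tail -n 3 "$demo_file")
echo "$last_three"

# Chaining the two with a pipe extracts a slice: lines 5 to 7
middle=$(head -n 7 "$demo_file" | tail -n 3)
echo "$middle"
```

The last command is a common trick: head keeps the first 7 lines, then tail keeps the last 3 of those.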

Extracting matching lines

The grep command prints only the lines that match a pattern. For example:

grep Darwin ~/learn_bash/files/introduction.txt

will print one line, while:

grep Newton ~/learn_bash/files/introduction.txt

will print nothing, because no line matches.
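The same behaviour can be reproduced on a small file of our own. This is a sketch under stated assumptions: the three sentences below are invented for the example and are not the content of introduction.txt.

```shell
# Hypothetical file with three lines, two of which mention Darwin
demo_file=$(mktemp)
printf 'Charles Darwin published in 1859.\nMendel studied peas.\nDarwin sailed on the Beagle.\n' > "$demo_file"

# Print every line containing the pattern
matches=$(grep Darwin "$demo_file")
echo "$matches"

# -c prints the number of matching lines instead of the lines themselves
match_count=$(grep -c Darwin "$demo_file")
echo "Lines mentioning Darwin: $match_count"

# grep returns a non-zero exit status when nothing matches;
# -q suppresses the output so that only the status is used
if grep -q Newton "$demo_file"; then
    newton_found=yes
else
    newton_found=no
fi
echo "Newton found: $newton_found"
```

The exit status is what makes grep usable in if statements and scripts, even when we do not care about the matching text itself.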

Counting lines

The wc command reports how many lines, words and bytes a text file contains (for plain ASCII text, the number of bytes equals the number of characters):

wc ~/learn_bash/files/introduction.txt

To only print the number of lines:

wc -l ~/learn_bash/files/introduction.txt
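When the count has to be reused, for example in a script, the filename that wc prints after the number gets in the way. A minimal sketch, assuming a made-up two-line file: reading the file from standard input makes wc print the number alone.

```shell
# Hypothetical file: 2 lines, 3 words, 15 bytes
demo_file=$(mktemp)
printf 'ACGT ACGT\nTTGA\n' > "$demo_file"

# Default output: lines, words, bytes, then the filename
wc "$demo_file"

# Reading from standard input (<) omits the filename;
# tr strips the padding some wc implementations add
line_count=$(wc -l < "$demo_file" | tr -d ' ')
word_count=$(wc -w < "$demo_file" | tr -d ' ')
byte_count=$(wc -c < "$demo_file" | tr -d ' ')
echo "lines=$line_count words=$word_count bytes=$byte_count"
```

Note that wc -l counts newline characters, so a last line without a trailing newline is not included in the total.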
