Linux Text Manipulation Commands
awk is suitable for processing small amounts of data.
- Works line by line like sed, but splits each line into fields for processing.
- The default field separator is whitespace (spaces or [tab]).
awk 'condition1{action1} condition2{action2} ...' filename
- Use $n to access the nth field, counting from 1:
last -n 5 | awk '{print $1 "\t" $3}'
- $0 represents the entire line.
- awk also provides built-in variables:
  - NF: number of fields on the current line
  - NR: number of the current line
  - FS: current field separator
cat /etc/passwd | awk 'BEGIN {FS=":"} $3 < 10 {print $1 "\t " $3}'
- Conditions can produce different output for different lines, e.g. a header for the first line and a computed total for every later line:
awk 'NR==1{printf "%10s %10s %10s %10s %10s\n",$1,$2,$3,$4,"Total" } NR>=2{total = $2 + $3 + $4; printf "%10s %10d %10d %10d %10.2f\n", $1, $2, $3, $4, total}'
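The awk points above can be tried on predictable input. This is a minimal sketch: the sample files and their contents are made up for illustration, in place of /etc/passwd and the output of last.

```shell
# Hypothetical sample data: passwd-style lines (name:password:uid).
tmpdir=$(mktemp -d)
printf 'root:x:0\ndaemon:x:1\nalice:x:1000\n' > "$tmpdir/users.txt"

# BEGIN runs before any input is read, so FS is ":" for every line.
# The condition $3 < 10 selects lines whose third field is below 10.
low_uids=$(awk 'BEGIN {FS=":"} $3 < 10 {print NR ": " $1 " (" NF " fields)"}' "$tmpdir/users.txt")
echo "$low_uids"

# Hypothetical score table: a header line followed by one data line.
printf 'Name 1st 2nd 3rd\nalice 10 20 30\n' > "$tmpdir/scores.txt"

# NR==1 handles the header; NR>=2 sums fields 2 to 4 of each data line.
totals=$(awk 'NR==1 {print $1, "Total"} NR>=2 {total = $2 + $3 + $4; print $1, total}' "$tmpdir/scores.txt")
echo "$totals"

rm -r "$tmpdir"
```

Setting FS inside BEGIN matters: if it were set in an ordinary action, the first line would already have been split with the default separator.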
col does simple processing of a text file, such as converting [tab] to spaces.
cut extracts part of each line of text, such as a field from a log line.
- -d: separator character
- -f: nth field
- -c: range of characters
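A small sketch of the three cut options, using a made-up log line as an assumption so the result is easy to check:

```shell
# Hypothetical log line; fields are separated by single spaces.
line='2024-01-15 ERROR disk full'

# -d sets the separator, -f picks the field (counted from 1).
level=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# -c picks a range of characters, here the first 10 (the date).
day=$(printf '%s\n' "$line" | cut -c 1-10)

echo "$level on $day"
```

cut treats every single separator as a field boundary, so input padded with runs of spaces produces empty fields; awk is usually a better fit for that kind of input.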
diff compares two plain-text files or directories and outputs the differences.
diff [-bBi] from-file to-file
- -b: ignore differences in the amount of whitespace, so "about me" and "about   me" are treated as the same
- -B: ignore blank lines
- -i: ignore case differences
- diff can also be run on two directories to show which files differ
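The effect of the three options can be seen by comparing two files that differ only in spacing, blank lines, and case. This is a sketch with made-up file names and contents; it relies on diff exiting with status 0 when it finds no differences.

```shell
tmpdir=$(mktemp -d)
# Hypothetical inputs: same text, but with extra spaces, a blank line,
# and a case difference.
printf 'about me\n\nHello\n' > "$tmpdir/from.txt"
printf 'about   me\nhello\n' > "$tmpdir/to.txt"

# Plain diff reports the files as different.
if diff "$tmpdir/from.txt" "$tmpdir/to.txt" > /dev/null; then
    strict=same
else
    strict=different
fi

# With -b, -B and -i the remaining differences are all ignored.
if diff -bBi "$tmpdir/from.txt" "$tmpdir/to.txt" > /dev/null; then
    relaxed=same
else
    relaxed=different
fi

echo "strict: $strict, relaxed: $relaxed"
rm -r "$tmpdir"
```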
grep supports regex; it scans a block of text and outputs the lines that contain a match.
grep [-A n] [-B n] [--color=auto] 'search_regex' filename
- -A n: also display n lines after each matching line
- -B n: also display n lines before each matching line
- -n: show line numbers
- -v: invert the condition (show non-matching lines)
- Extended regex (grep -E, or egrep):
  - '|' means OR, e.g. grep -v '^$' file | grep -v '^#' gives the same result as grep -v -E '^$|^#' file: both show the file without empty lines and comment lines
  - '()' groups alternatives, e.g. egrep -n 'g(la|oo)d' file finds lines containing 'good' or 'glad'
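The grep options and extended-regex patterns above can be checked against a small file. The file name and contents below are made up for illustration, and grep -E is used as the option form of egrep.

```shell
tmpdir=$(mktemp -d)
# Hypothetical config file: a comment, an empty line, three settings.
printf '# settings\n\nmood=glad\nfood=good\nword=gold\n' > "$tmpdir/sample.conf"

# '|' as OR: drop empty lines and comment lines in a single pass.
active=$(grep -v -E '^$|^#' "$tmpdir/sample.conf")

# '()' grouping: match "glad" or "good" (but not "gold"), with line numbers.
matches=$(grep -n -E 'g(la|oo)d' "$tmpdir/sample.conf")

# -A 1: show the matching line plus one line after it.
context=$(grep -A 1 'glad' "$tmpdir/sample.conf")

echo "$active"; echo "$matches"; echo "$context"
rm -r "$tmpdir"
```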
join, paste, expand
- join merges two files by comparing a key field and combining only the lines whose key appears in both files.
- Both files should be sorted on that field before running join.
- paste is simpler: it connects corresponding lines of the files with a [tab].
- expand converts each [tab] to a number of spaces.
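A short sketch contrasting the three commands. The two files and their contents are assumptions made up for the example; both are already sorted on the first field, which join uses as its key by default.

```shell
tmpdir=$(mktemp -d)
# Hypothetical inputs keyed by an id in the first field.
printf '1 alice\n2 bob\n3 carol\n' > "$tmpdir/names.txt"
printf '1 admin\n3 guest\n' > "$tmpdir/roles.txt"

# join keeps only ids present in both files (id 2 is dropped).
joined=$(join "$tmpdir/names.txt" "$tmpdir/roles.txt")

# paste glues line N of each file together with a tab, whatever the content.
pasted=$(paste "$tmpdir/names.txt" "$tmpdir/roles.txt" | head -n 1)

# expand replaces the tab with spaces; -t 4 puts tab stops every 4 columns.
expanded=$(printf 'a\tb\n' | expand -t 4)

echo "$joined"; echo "$pasted"; echo "$expanded"
rm -r "$tmpdir"
```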
patch
- Use diff to generate a difference file, then apply that file to the old file to bring it up to date.
diff -Naur passwd.old passwd.new > passwd.patch
- patch -pN < patch_file: apply the patch (N is the number of leading path components to strip from the file names in the patch)
- patch -R -pN < patch_file: restore the old file from the patched one
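The diff-then-patch round trip can be sketched as follows. The file names and contents are made up, and the target file is named explicitly on the patch command line, so no -pN option is needed in this particular example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Hypothetical old and new versions of a file.
printf 'alice\nbob\n' > list.old
printf 'alice\nbob\ncarol\n' > list.new

# diff exits with status 1 when the files differ, which is expected here.
diff -Naur list.old list.new > list.patch || true

# Apply the patch to the old file.
patch list.old < list.patch > /dev/null
if cmp -s list.old list.new; then after_apply=updated; else after_apply=unchanged; fi

# -R applies the patch in reverse and restores the original content.
patch -R list.old < list.patch > /dev/null
restored=$(cat list.old)

echo "$after_apply"; echo "$restored"
cd - > /dev/null
rm -r "$workdir"
```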
pr formats plain text so that it is ready for printing.
printf formats output into aligned columns so it is easier to read.
sed is useful for analyzing input and for replacing, deleting, appending, or extracting text and lines.
sed [-nefri] ['action'] [filename]
- -n: silent mode; only lines selected by the action are output
- -e [script]: add the script to the commands to be executed
- -f [filename]: read the script from a file
- -r: use extended regex
- -i: modify the file in place instead of printing the result
- [action] has the form [n1[,n2]]function, where function is one of:
  - a: append a line after, e.g. nl /etc/passwd | sed '2a drink tea'
  - c: replace the lines between n1 and n2
  - d: delete the matched lines, e.g. nl /etc/passwd | sed '2,5d'
  - i: insert a line before
  - p: print the selected lines to stdout, e.g. nl /etc/passwd | sed -n '5,7p' gives the same result as nl /etc/passwd | head -n 7 | tail -n 3
  - s: find and replace within a line, e.g. 1,20s/old_phrase/new_phrase/g; the old_phrase part is a regex
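The p, d and s functions can be tried on a five-line file. This is a minimal sketch with made-up input; it leaves out -i so nothing is modified in place.

```shell
tmpdir=$(mktemp -d)
# Hypothetical five-line input file.
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmpdir/nums.txt"

# -n suppresses the default output, so p prints only lines 2 to 3.
middle=$(sed -n '2,3p' "$tmpdir/nums.txt")

# d deletes lines 2 to 4 and passes the rest through.
kept=$(sed '2,4d' "$tmpdir/nums.txt")

# s replaces every "o" with "0", but only on lines 1 to 2;
# "four" on line 4 is left untouched.
swapped=$(sed '1,2s/o/0/g' "$tmpdir/nums.txt")

echo "$middle"; echo "$kept"; echo "$swapped"
rm -r "$tmpdir"
```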
sort arranges lines of text in the order we want.
- -f: ignore case differences
- -b: ignore leading blanks
- -M: sort by month name
- -n: sort numerically
- -r: reverse the order
- -u: unique lines only (filter out repeated lines)
- -t: field separator character; by default fields are separated by the transition from non-blank to blank characters
- -k[n]: sort on the nth field
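A sketch combining several of the sort options on a made-up colon-separated file (the file name and contents are assumptions for illustration):

```shell
tmpdir=$(mktemp -d)
# Hypothetical name:score lines, with one repeated line.
printf 'carol:30\nalice:100\nbob:9\nbob:9\n' > "$tmpdir/scores.txt"

# -t ':' splits on colons, -k 2,2 sorts on the second field only,
# -n compares it as a number, -u drops the repeated line.
by_score=$(sort -t ':' -k 2,2 -n -u "$tmpdir/scores.txt")

# Adding -r reverses the order, so the highest score comes first.
highest=$(sort -t ':' -k 2,2 -n -r "$tmpdir/scores.txt" | head -n 1)

echo "$by_score"; echo "$highest"
rm -r "$tmpdir"
```

Without -n the scores would be compared as text, and "100" would sort before "30" and "9".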
split is useful for splitting a large file into smaller ones by size or by number of lines.
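As a rough illustration of splitting by line count, the sketch below cuts a made-up five-line file into two-line pieces. The part_ prefix is an arbitrary choice; split appends aa, ab, ac and so on to it.

```shell
tmpdir=$(mktemp -d)
# Hypothetical "large" file, kept tiny so the result is easy to follow.
printf '1\n2\n3\n4\n5\n' > "$tmpdir/big.txt"

# -l 2 puts at most two lines in each output file.
split -l 2 "$tmpdir/big.txt" "$tmpdir/part_"

# Count the pieces, then concatenate them in order to rebuild the original.
pieces=$(ls "$tmpdir" | grep -c '^part_')
rebuilt=$(cat "$tmpdir"/part_*)

echo "$pieces pieces"
rm -r "$tmpdir"
```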
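tee writes a copy of its input to a file and passes the same data on to standard output, so a pipeline can save an intermediate result and keep processing it.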
last | tee last.list | cut -d " " -f1
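The same pattern with predictable input: the two made-up lines below stand in for the output of last, and full.list is an arbitrary file name.

```shell
tmpdir=$(mktemp -d)

# tee saves the complete stream to full.list while cut keeps only field 1.
first_fields=$(printf 'alice pts/0\nbob pts/1\n' | tee "$tmpdir/full.list" | cut -d ' ' -f 1)

# The saved file still holds the unmodified lines.
saved=$(cat "$tmpdir/full.list")

echo "$first_fields"; echo "$saved"
rm -r "$tmpdir"
```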
tr deletes or replaces characters within a block of text.
last | tr 'a-z' 'A-Z' replaces all lowercase letters with uppercase.
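A small sketch of both uses of tr, replacing and deleting, on made-up input strings:

```shell
# Replace: each character in the first set maps to the character at the
# same position in the second set.
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')

# Delete: -d removes every occurrence of the listed characters.
no_colons=$(printf 'root:x:0\n' | tr -d ':')

echo "$upper"; echo "$no_colons"
```

The two sets should be written the same way: pairing '[a-z]' with 'A-Z' shifts the mapping by one character, because the brackets count as ordinary characters in the first set.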
wc shows text statistics such as the number of lines, words, and characters.
- -l: lines
- -w: words
- -m: characters
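uniq collapses adjacent repeated lines so that each appears only once; sort the input first so that duplicates end up next to each other.
- -c: show how many times each line occurred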
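A sketch of the usual sort-then-count pipeline on made-up input. The trailing awk step is only there to normalise the padded output of uniq -c into name=count form.

```shell
# sort groups identical lines together; uniq -c then counts each group.
counts=$(printf 'bob\nalice\nbob\nbob\n' | sort | uniq -c | awk '{print $2 "=" $1}')

echo "$counts"
```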
xargs reads items from standard input and passes them as arguments to a command, which lets commands that do not read standard input take part in a pipeline.
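As an illustration, echo ignores its standard input, but xargs can turn piped words into arguments for it. The input words and the got: label are made up for the example.

```shell
# All words from stdin become arguments of a single echo call.
all_at_once=$(printf 'a b c\n' | xargs echo got:)

# -n 1 runs the command once per word instead.
one_per_call=$(printf 'a b c\n' | xargs -n 1 echo got:)

echo "$all_at_once"; echo "$one_per_call"
```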
xclip copies output piped from other commands into an X selection (the primary selection by default; use -selection clipboard for the clipboard); the macOS equivalent is pbcopy.