format greps and formats lines using regex and printf-like syntax

format

format greps lines from stdin, parses them via regex and reformats them according to a given format string.

Input pattern

The input pattern describes the format in which the lines are parsed from stdin. This pattern is a regular expression according to Go's regexp spec.

By default, the input pattern is applied to each line individually. When using multiline patterns, you can specify how many lines are matched at once with the command line argument -n followed by an integer.

Use subgroups for extracting specific parts of the input line.

Provide your custom input pattern with the command line argument -i '<pattern>'.

The default value is ^.*?$ which simply matches the whole line.

Output pattern

The output pattern describes the format in which lines are generated for stdout using data from the input pattern.

The default value is {0}, which always refers to the full match of the input pattern.

Capturing groups

Use the {<group_index>} syntax to use a specific capturing group.

  • {0} always matches the whole line.
  • {1} matches the first capturing group
  • {2} matches the second capturing group
  • and so on
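The substitution of {<group_index>} references can be sketched in Go as follows. This is a minimal illustration, not format's implementation: the helper expand is hypothetical, and leaving references to missing groups untouched is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// placeholder matches plain group references such as {0} or {12}.
var placeholder = regexp.MustCompile(`\{(\d+)\}`)

// expand is a hypothetical helper. It replaces every {n} in the
// output pattern with groups[n]. References to groups that do not
// exist are left untouched (an assumption made for this sketch).
func expand(output string, groups []string) string {
	return placeholder.ReplaceAllStringFunc(output, func(ref string) string {
		index, err := strconv.Atoi(ref[1 : len(ref)-1])
		if err != nil || index >= len(groups) {
			return ref
		}
		return groups[index]
	})
}

func main() {
	input := regexp.MustCompile(`(\w+)@(\w+)`)
	groups := input.FindStringSubmatch("alice@example")
	fmt.Println(expand("user: {1} | host: {2} | raw: {0}", groups))
}
```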

Formatting

When referencing capturing groups, you can add a specific format for some data types as well using a simplified printf syntax.

Use the following syntax: {1:%d}. It parses the first capturing group as an integer, which gets rid of leading zeros. You can also pad the value with leading zeros: {1:%03d} produces a total length of 3 digits.

The same method can also be applied to %f to further format floating point values. See a full list of formatting options at Go's fmt spec.

Currently, only %s, %d, %f and %g are supported.
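To show what such a format does to a captured value, here is a small Go sketch. The helper formatGroup is hypothetical, and choosing the target type from the last character of the format is an assumption made for this example, not necessarily how format does it.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatGroup is a hypothetical helper. It parses the captured text
// into the type implied by the verb and formats it with fmt.Sprintf.
func formatGroup(value, format string) (string, error) {
	if format == "" {
		return value, nil
	}
	switch format[len(format)-1] {
	case 'd':
		n, err := strconv.Atoi(value)
		if err != nil {
			return "", err
		}
		return fmt.Sprintf(format, n), nil
	case 'f', 'g':
		x, err := strconv.ParseFloat(value, 64)
		if err != nil {
			return "", err
		}
		return fmt.Sprintf(format, x), nil
	case 's':
		return fmt.Sprintf(format, value), nil
	default:
		return "", fmt.Errorf("unsupported format %q", format)
	}
}

func main() {
	cases := [][2]string{{"007", "%d"}, {"7", "%03d"}, {"3.14159", "%.2f"}}
	for _, c := range cases {
		out, err := formatGroup(c[0], c[1])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```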

Mutators

Mutators are a simple way of manipulating numeric values like integers and floats using a simple math-like expression.

You can provide a mutator using the syntax: {1:%d:+1}. It parses the first capturing group as an integer, adds 1 and then formats the result using the printf format %d.

A mutator always consists of an operator and a value. In the example above + is the operator and 1 is the value.

The following operators are supported for %d, %f and %g formats:

  • +
  • -
  • *
  • /

It is possible to add multiple mutators by just concatenating them: {1:%d:*2+1}.

Multiple mutators do not follow any order of operations. They are simply applied from left to right! For example, {1:%d:+1*2} adds 1 first and then doubles the result.

Furthermore, you can reference other capturing groups as mutator values. A referenced group is parsed as the same type as the group being formatted. This is done via the following syntax: {1:%d:+(2)}. It parses the first and second capturing groups as integers and adds them.
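The left-to-right evaluation described above can be sketched in Go. This is an illustration under stated assumptions rather than format's implementation: applyMutators is a hypothetical name, all values are treated as float64, negative literals are not handled, and text that does not look like a mutator is silently ignored.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// step matches one operator followed by either a group reference
// such as (2) or a literal number such as 3 or 1.5.
var step = regexp.MustCompile(`([+\-*/])(?:\((\d+)\)|(\d+(?:\.\d+)?))`)

// applyMutators is a hypothetical helper. It applies each mutator in
// the order written, without any operator precedence.
func applyMutators(value float64, mutators string, groups []string) (float64, error) {
	for _, m := range step.FindAllStringSubmatch(mutators, -1) {
		operand := m[3]
		if m[2] != "" {
			// The value comes from another capturing group.
			index, err := strconv.Atoi(m[2])
			if err != nil || index >= len(groups) {
				return 0, fmt.Errorf("invalid group reference %q", m[0])
			}
			operand = groups[index]
		}
		x, err := strconv.ParseFloat(operand, 64)
		if err != nil {
			return 0, err
		}
		switch m[1] {
		case "+":
			value += x
		case "-":
			value -= x
		case "*":
			value *= x
		case "/":
			value /= x
		}
	}
	return value, nil
}

func main() {
	a, _ := applyMutators(3, "*2+1", nil)
	b, _ := applyMutators(3, "+1*2", nil)
	c, _ := applyMutators(5, "+(2)", []string{"5 7", "5", "7"})
	fmt.Println(a, b, c)
}
```

Swapping the order of the two mutators changes the result, which is the behaviour to keep in mind when chaining them.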

Handling unmatched lines

By default, lines which do not match the input pattern will be dropped. Use -k to keep them unchanged. They will be copied into stdout.

Examples

Copying

Input:

1
2
3
4

Command:

format

Output:

1
2
3
4

Filtering

Only keep lines which contain an i.

Input:

one
two
three
four
five
six
seven
eight
nine
ten

Command:

format -i '.*i.*'

Output:

five
six
eight
nine

Removing leading zeros

Input:

001
002
003
04

Command:

format -i '\d+' -o '{0:%d}'

Output:

1
2
3
4

Extracting dates

Input:

2022-04-18
1970-01-01
2006-01-02

Command:

format -i '(\d{4})-(\d{2})-(\d{2})' -o 'day: {3:%d} | month: {2:%d} | year: {1}'

Output:

day: 18 | month: 4 | year: 2022
day: 1 | month: 1 | year: 1970
day: 2 | month: 1 | year: 2006

Applying multiple formats

Every format process can only apply a single input pattern. Use -k to keep unmatched lines so that the next format instance can apply another input pattern to them.

Input:

2022-04-18
1970-01-01
02.01.2006
02.02.1962

Command:

format -i '(\d{4})-(\d{2})-(\d{2})' -o 'day: {3:%d} | month: {2:%d} | year: {1}' -k | format -i '(\d{2})\.(\d{2})\.(\d{4})' -o 'day: {1:%d} | month: {2:%d} | year: {3}' -k

Output:

day: 18 | month: 4 | year: 2022
day: 1 | month: 1 | year: 1970
day: 2 | month: 1 | year: 2006
day: 2 | month: 2 | year: 1962

Parsing multi-line patterns

Input:

year: 2022
month: 04
day: 18
year: 1970
month: 01
day: 01
year: 2006
month: 01
day: 02
year: 1962
month: 02
day: 02

Command:

format -n 3 -i '^year: (\d{4})\nmonth: (\d{2})\nday: (\d{2})$' -o 'day: {3:%d} | month: {2:%d} | year: {1}'

Output:

day: 18 | month: 4 | year: 2022
day: 1 | month: 1 | year: 1970
day: 2 | month: 1 | year: 2006
day: 2 | month: 2 | year: 1962
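One way to picture multi-line matching is to join consecutive lines before applying the pattern. The Go sketch below assumes non-overlapping chunks of n lines joined by a newline. Whether format groups lines this way or uses another strategy is not stated here, so treat chunkLines as a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// chunkLines is a hypothetical helper. It joins every n consecutive
// lines into one string. The last chunk may be shorter.
func chunkLines(lines []string, n int) []string {
	if n < 1 {
		n = 1
	}
	var chunks []string
	for start := 0; start < len(lines); start += n {
		end := start + n
		if end > len(lines) {
			end = len(lines)
		}
		chunks = append(chunks, strings.Join(lines[start:end], "\n"))
	}
	return chunks
}

func main() {
	lines := []string{
		"year: 2022", "month: 04", "day: 18",
		"year: 1970", "month: 01", "day: 01",
	}
	pattern := regexp.MustCompile(`^year: (\d{4})\nmonth: (\d{2})\nday: (\d{2})$`)

	for _, chunk := range chunkLines(lines, 3) {
		if groups := pattern.FindStringSubmatch(chunk); groups != nil {
			// Groups are printed as captured, without number formatting.
			fmt.Printf("day: %s | month: %s | year: %s\n", groups[3], groups[2], groups[1])
		}
	}
}
```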

Adding 2 values together

Input:

5 7
3 2
10 152
-15 3.7

Command:

format -i '(-?\d+) (-?\d+(?:\.\d+)?)' -o '{1} + {2} = {1:%g:+(2)}'

Output:

5 + 7 = 12
3 + 2 = 5
10 + 152 = 162
-15 + 3.7 = -11.3

Bulk renaming files

Rename a bunch of files at once using format.

Output of ls:

000.jpg
001.jpg
002.jpg

Command:

ls | format -i '(\d+)\.jpg' -o 'mv "{0}" "{1:%d:+1}.jpg"' | xargs -0 sh -c

Output of ls afterwards:

1.jpg
2.jpg
3.jpg

To further automate this, I made my own custom script called bulkrename and put it in my $PATH.

Content:

#!/usr/bin/env sh

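# With "exec" as the third argument, run the generated mv commands.
# Otherwise, only print them for review.
if [ "$3" = "exec" ]; then
	command ls | format -i "$1" -o "mv \"{0}\" \"$2\"" | xargs -0 -P 4 sh -c
else
	command ls | format -i "$1" -o "mv \"{0}\" \"$2\""
	echo
	echo "execute commands with 'bulkrename $* exec'"
fi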

There are a few things to consider when using this script:

  • You can't use -i and -o. The first argument is the input pattern, the second is the output pattern.
  • To prevent unwanted file modifications, it prints the mv commands it generates to stdout by default. Once you have checked that all files will be renamed as desired, add exec as the third argument.
  • It can only move items which are in your working directory, but it can move them outside of the working directory using relative or absolute paths.

Example usage of this script:

Output of ls:

000.jpg
001.jpg
002.jpg

Command:

bulkrename '(\d+)\.jpg' '{1:%d:+1}.jpg'

Output:

mv "000.jpg" "1.jpg"
mv "001.jpg" "2.jpg"
mv "002.jpg" "3.jpg"

execute commands with 'bulkrename (\d+)\.jpg {1:%d:+1}.jpg exec'

Command:

bulkrename '(\d+)\.jpg' '{1:%d:+1}.jpg' exec

Output of ls afterwards:

1.jpg
2.jpg
3.jpg