From 6121dbe0ebcacc98e098e519e218c338d37e2082 Mon Sep 17 00:00:00 2001
From: Timon Ringwald
Date: Mon, 18 Apr 2022 20:18:48 +0200
Subject: [PATCH] improved README

---
 README.md | 273 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 272 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2d083f4..f892078 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,274 @@
 # format
 
-format greps and formats lines using regex and printf-like syntax
\ No newline at end of file
+`format` greps lines from stdin, parses them with a regex and reformats them according to a given format string.
+
+## Input pattern
+
+The input pattern describes the format in which the lines are parsed from stdin.
+This pattern is a regular expression according to [Go's regexp spec](https://pkg.go.dev/regexp).
+
+By default, the input pattern is applied to each line individually.
+For multi-line patterns, use the command line argument `-n` followed by an integer to set how many lines the pattern is applied to at once.
+
+Use capturing groups to extract specific parts of the input line.
+
+Provide a custom input pattern with the command line argument `-i ''`, placing the pattern between the quotes.
+
+The default value is `^.*?$`, which simply matches the whole line.
+
+## Output pattern
+
+The output pattern describes the format in which lines are generated for stdout using data from the input pattern. It is passed with `-o` in the examples below.
+
+The default value is `{0}`, which always refers to the full match of the input pattern.
+
+### Capturing groups
+
+Use the `{}` syntax to reference a specific capturing group.
+- `{0}` always refers to the whole line
+- `{1}` refers to the first capturing group
+- `{2}` refers to the second capturing group
+- and so on
+
+### Formatting
+
+When referencing a capturing group, you can also specify a format for some data types using a simplified printf syntax.
+
+The syntax is `{1:%d}`. This parses the first capturing group as an integer, which removes leading zeros. To pad with leading zeros instead, use `{1:%03d}` for a total length of 3 digits.
+
+The same method can also be applied to `%f` to format floating point values.
+See the full list of formatting options in [Go's fmt spec](https://pkg.go.dev/fmt).
+
+Currently, only `%s`, `%d` and `%f` are supported.
+
+### Mutators
+
+Mutators manipulate numeric values (integers and floats) using a simple math-like expression.
+
+Provide a mutator using the following syntax: `{1:%d:+1}`. This parses the first capturing group as an integer, adds 1 and then formats the result using the printf format `%d`.
+
+A mutator always consists of an operator and a value. In the example above, `+` is the operator and `1` is the value.
+
+The following operators are supported for `%d` and `%f` formats:
+- `+`
+- `-`
+- `*`
+- `/`
+
+It is possible to add multiple mutators by concatenating them: `{1:%d:*2+1}`.
+
+Multiple mutators do not follow the usual order of operations. They are simply applied from left to right, so `+1*2` turns 3 into 8, not 5.
+
+Furthermore, a mutator value can reference another capturing group, which is parsed as the same type. The syntax is `{1:%d:+(2)}`. This parses the first and second capturing groups as integers and adds them.
+
+## Handling unmatched lines
+
+By default, lines that do not match the input pattern are dropped.
+Use `-k` to keep them. They are copied to stdout unchanged.
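The left-to-right mutator rule described above can be sketched in Go. This is a minimal illustration under stated assumptions, not the code `format` actually uses: `applyMutators` is a hypothetical helper written for this note, it handles only the `%d` (integer) case, and it accepts only unsigned literals or `(n)` group references as values.

```go
package main

import (
	"fmt"
	"strconv"
)

// applyMutators is a hypothetical helper, not code taken from format itself.
// It applies a mutator string such as "*2+1" or "+(2)" to value strictly
// from left to right, ignoring the usual order of operations.
// groups holds the capturing groups of the current match; a "(n)" value
// is replaced by groups[n] parsed as an integer.
func applyMutators(value int, mutators string, groups []string) (int, error) {
	for i := 0; i < len(mutators); {
		op := mutators[i]
		i++

		// A value is either a group reference "(n)" or an unsigned literal.
		isRef := i < len(mutators) && mutators[i] == '('
		if isRef {
			i++
		}
		start := i
		for i < len(mutators) && mutators[i] >= '0' && mutators[i] <= '9' {
			i++
		}
		operand, err := strconv.Atoi(mutators[start:i])
		if err != nil {
			return 0, fmt.Errorf("bad value in %q: %w", mutators, err)
		}
		if isRef {
			if i >= len(mutators) || mutators[i] != ')' {
				return 0, fmt.Errorf("missing ')' in %q", mutators)
			}
			i++
			if operand >= len(groups) {
				return 0, fmt.Errorf("no capturing group %d", operand)
			}
			// Parse the referenced group as the same type (integer).
			if operand, err = strconv.Atoi(groups[operand]); err != nil {
				return 0, err
			}
		}

		switch op {
		case '+':
			value += operand
		case '-':
			value -= operand
		case '*':
			value *= operand
		case '/':
			if operand == 0 {
				return 0, fmt.Errorf("division by zero in %q", mutators)
			}
			value /= operand
		default:
			return 0, fmt.Errorf("unknown operator %q", op)
		}
	}
	return value, nil
}

func main() {
	groups := []string{"3 4", "3", "4"} // full match, group 1, group 2
	for _, m := range []string{"+1", "+1*2", "*2+1", "+(2)"} {
		result, err := applyMutators(3, m, groups)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("{1:%%d:%s} -> %d\n", m, result)
	}
}
```

With group 1 parsed as 3 and group 2 as 4, the sketch yields 4, 8, 7 and 7 for the four mutator strings. The second and third results show that only the position of each mutator matters, not operator precedence.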
+
+## Examples
+
+### Copying
+Input:
+```
+1
+2
+3
+4
+```
+
+Command: `format`
+
+Output:
+```
+1
+2
+3
+4
+```
+
+### Filtering
+
+Only keep lines that contain an `i`.
+
+Input:
+```
+one
+two
+three
+four
+five
+six
+seven
+eight
+nine
+ten
+```
+
+Command: `format -i '.*i.*'`
+
+Output:
+```
+five
+six
+eight
+nine
+```
+
+
+### Removing leading zeros
+
+Input:
+```
+001
+002
+003
+04
+```
+
+Command: `format -i '\d+' -o '{0:%d}'`
+
+Output:
+```
+1
+2
+3
+4
+```
+
+### Extracting dates
+
+Input:
+```
+2022-04-18
+1970-01-01
+2006-01-02
+```
+
+Command: `format -i '(\d{4})-(\d{2})-(\d{2})' -o 'day: {3:%d} | month: {2:%d} | year: {1}'`
+
+Output:
+```
+day: 18 | month: 4 | year: 2022
+day: 1 | month: 1 | year: 1970
+day: 2 | month: 1 | year: 2006
+```
+
+### Applying multiple formats
+
+Each `format` process can apply only a single pattern. Use `-k` to keep unmatched lines so that the next `format` instance can apply another input pattern to them.
+
+Input:
+```
+2022-04-18
+1970-01-01
+02.01.2006
+02.02.1962
+```
+
+Command: `format -i '(\d{4})-(\d{2})-(\d{2})' -o 'day: {3:%d} | month: {2:%d} | year: {1}' -k | format -i '(\d{2})\.(\d{2})\.(\d{4})' -o 'day: {1:%d} | month: {2:%d} | year: {3}' -k`
+
+Output:
+```
+day: 18 | month: 4 | year: 2022
+day: 1 | month: 1 | year: 1970
+day: 2 | month: 1 | year: 2006
+day: 2 | month: 2 | year: 1962
+```
+
+### Parsing multi-line patterns
+
+Input:
+```
+year: 2022
+month: 04
+day: 18
+year: 1970
+month: 01
+day: 01
+year: 2006
+month: 01
+day: 02
+year: 1962
+month: 02
+day: 02
+```
+
+Command: `format -n 3 -i '^year: (\d{4})\nmonth: (\d{2})\nday: (\d{2})$' -o 'day: {3:%d} | month: {2:%d} | year: {1}'`
+
+Output:
+```
+day: 18 | month: 4 | year: 2022
+day: 1 | month: 1 | year: 1970
+day: 2 | month: 1 | year: 2006
+day: 2 | month: 2 | year: 1962
+```
+
+### Bulk renaming files
+
+Rename a bunch of files at once using `format`.
+
+Output of `ls`:
+```
+000.jpg
+001.jpg
+002.jpg
+```
+
+Command: `ls | format -i '(\d+)\.jpg' -o 'mv "{0}" "{1:%d:+1}.jpg"' | xargs -0 sh -c`
+
+Output of `ls` afterwards:
+```
+1.jpg
+2.jpg
+3.jpg
+```
+
+To further automate this, I made my own custom script called `bulkrename` and put it in my `$PATH`.
+
+Content:
+```sh
+#!/usr/bin/env sh
+
+if [ "$3" = "exec" ]; then
+    command ls | format -i "$1" -o "mv \"{0}\" \"$2\"" | xargs -0 -P 4 sh -c
+else
+    command ls | format -i "$1" -o "mv \"{0}\" \"$2\""
+    echo
+    echo "execute commands with 'bulkrename $* exec'"
+fi
+```
+
+A few things to consider when using this script:
+- The `-i` and `-o` flags are not used. The first argument is the input pattern, the second is the output pattern
+- To prevent unwanted file modifications, it prints the generated `mv` commands to stdout by default. After checking that all files will be renamed as desired, add `exec` as the third argument to run them
+- It can only move items that are in your working directory, but it can move them outside of the working directory using relative or absolute paths
+
+Example usage of this script:
+
+Output of `ls`:
+```
+000.jpg
+001.jpg
+002.jpg
+```
+
+Command: `bulkrename '(\d+)\.jpg' '{1:%d:+1}.jpg'`
+
+Output:
+```
+mv "000.jpg" "1.jpg"
+mv "001.jpg" "2.jpg"
+mv "002.jpg" "3.jpg"
+
+execute commands with 'bulkrename (\d+)\.jpg {1:%d:+1}.jpg exec'
+```
+
+Command: `bulkrename '(\d+)\.jpg' '{1:%d:+1}.jpg' exec`
+
+Output of `ls` afterwards:
+```
+1.jpg
+2.jpg
+3.jpg
+```
\ No newline at end of file