# format
## Source code
You can find the source code here: https://git.milar.in/milarin/format
## Installation
If you have Go installed, you can install the program with `go install git.milar.in/milarin/format@latest`.

There are also pre-compiled executables for various platforms on the repository's [releases page](https://git.milar.in/milarin/format/releases).
## License
Distributed under the MIT License. See [LICENSE.md](https://git.milar.in/milarin/format/src/branch/main/LICENSE.md)
## Usage
### Input pattern
The input pattern describes the format in which lines are parsed from stdin.
It is a regular expression following [Go's regexp syntax](https://pkg.go.dev/regexp).

By default, the input pattern is applied to each line individually.
For multi-line patterns, pass the command line argument `-n` followed by the number of lines that should be fed into the pattern at once.

Use capturing groups to extract specific parts of the input.
Provide a custom input pattern with the command line argument `-i '<pattern>'`.
The default value is `^.*?$`, which simply matches the whole line.
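
How `-n` interacts with the pattern can be pictured with a small Go sketch. It assumes that every `n` consecutive lines are joined with `\n` before matching, which is consistent with the multi-line example further down but is not taken from the tool's source. `chunkLines` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// chunkLines is a hypothetical helper: it joins every n consecutive lines
// with "\n" so that a single pattern can span several lines.
// A trailing chunk with fewer than n lines is kept as-is (an assumption).
func chunkLines(lines []string, n int) []string {
	if n < 1 {
		n = 1 // guard against a non-positive chunk size
	}
	var chunks []string
	for start := 0; start < len(lines); start += n {
		end := start + n
		if end > len(lines) {
			end = len(lines)
		}
		chunks = append(chunks, strings.Join(lines[start:end], "\n"))
	}
	return chunks
}

func main() {
	lines := []string{
		"year: 2022", "month: 04", "day: 18",
		"year: 1970", "month: 01", "day: 01",
	}
	pattern := regexp.MustCompile(`^year: (\d{4})\nmonth: (\d{2})\nday: (\d{2})$`)

	for _, chunk := range chunkLines(lines, 3) {
		if m := pattern.FindStringSubmatch(chunk); m != nil {
			fmt.Printf("year=%s month=%s day=%s\n", m[1], m[2], m[3])
		}
	}
}
```

Without the `(?m)` flag, `^` and `$` in Go's regexp syntax anchor to the start and end of the whole chunk, which is why the pattern has to spell out the `\n` between lines.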
### Output pattern
The output pattern describes the format in which lines are written to stdout, using data extracted by the input pattern.
Provide an output pattern via `-o '<pattern>'`.
The default value is `{0}`, which refers to the full match of the input pattern.
#### Capturing groups
Use the `{<group_index>}` syntax to use a specific capturing group.
- `{0}` always matches the whole line.
- `{1}` matches the first capturing group
- `{2}` matches the second capturing group
- and so on
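
This numbering mirrors Go's own `regexp` package, where index 0 of a match holds the entire match and the following indices hold the capturing groups. The sketch below only demonstrates that standard-library behavior. `groups` is a hypothetical helper and not part of `format`.

```go
package main

import (
	"fmt"
	"regexp"
)

// groups returns the full match at index 0 followed by all capturing groups,
// or nil if the line does not match the pattern.
func groups(pattern, line string) []string {
	return regexp.MustCompile(pattern).FindStringSubmatch(line)
}

func main() {
	m := groups(`(\d{4})-(\d{2})-(\d{2})`, "2022-04-18")
	for i, g := range m {
		// Print each index in the same {n} notation the output pattern uses.
		fmt.Printf("{%d} = %s\n", i, g)
	}
}
```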
#### Coloring
Coloring a reference to a capturing group can be done via `{1:<color>}`.
`<color>` can be one of the following values:
- `black`
- `red`
- `green`
- `yellow`
- `blue`
- `magenta`
- `cyan`
- `white`

Leaving the color argument empty results in the default color.
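
Terminal colors like these are commonly produced with ANSI SGR escape sequences, where the eight names above correspond to the foreground codes 30 to 37. Whether `format` emits exactly these sequences is an assumption. `colorize` and `ansiCodes` are hypothetical names used for illustration only.

```go
package main

import "fmt"

// ansiCodes maps the color names to the standard ANSI SGR foreground codes.
var ansiCodes = map[string]int{
	"black": 30, "red": 31, "green": 32, "yellow": 33,
	"blue": 34, "magenta": 35, "cyan": 36, "white": 37,
}

// colorize wraps text in an ANSI color sequence followed by a reset.
// An empty or unknown color leaves the text unchanged (assumed behavior).
func colorize(text, color string) string {
	code, ok := ansiCodes[color]
	if !ok {
		return text
	}
	return fmt.Sprintf("\x1b[%dm%s\x1b[0m", code, text)
}

func main() {
	fmt.Println(colorize("error", "red"))
	fmt.Println(colorize("plain", ""))
}
```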
#### Formatting
When referencing capturing groups, you can specify a format for some data types using a simplified printf syntax.

The syntax is `{1::%d}`. It parses the first capturing group as an integer, which removes leading zeros. You can also pad the value with leading zeros: `{1::%03d}` produces a total width of 3 digits.

The same method can be applied to `%f` to format floating point values.
See the full list of formatting options in [Go's fmt documentation](https://pkg.go.dev/fmt).
Currently, only `%s`, `%d`, `%f` and `%g` are supported.
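
Because the format verbs come from Go's `fmt` package, the effect of `%d` and `%03d` can be reproduced with `strconv` and `fmt.Sprintf`. This is a sketch of the described behavior, assuming the group is parsed as a base-10 integer. `formatInt` is a hypothetical helper, not the tool's code.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatInt parses a captured string as a base-10 integer and formats it
// with the given printf verb. Parsing is what drops the leading zeros.
func formatInt(group, verb string) (string, error) {
	n, err := strconv.Atoi(group)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf(verb, n), nil
}

func main() {
	for _, verb := range []string{"%d", "%03d", "%05d"} {
		out, err := formatInt("007", verb)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-5s -> %s\n", verb, out)
	}
}
```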
#### Mutators
Mutators are a simple way of manipulating numeric values such as integers and floats using a simple math-like expression.

You can provide a mutator using the syntax `{1::%d:+1}`. It parses the first capturing group as an integer, adds 1 and then formats the result using the printf format `%d`.

A mutator always consists of an operator and a value. In the example above, `+` is the operator and `1` is the value.

The following operators are supported for the `%d`, `%f` and `%g` formats:
- `+`
- `-`
- `*`
- `/`

It is possible to apply multiple mutators by concatenating them: `{1::%d:*2+1}`.

**Multiple mutators do not follow any order of operations. They are simply applied from left to right!**

Furthermore, a mutator value can reference another capturing group, which is parsed as the same type. This is done via the following syntax: `{1::%d:+(2)}`. It parses the first and second capturing groups as integers and adds them.
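
The left-to-right rule can be made concrete with a short Go sketch. It is not the tool's implementation: `applyMutators` is a hypothetical helper that handles only unsigned integer literals (no `(2)` group references), but it shows why `*2+1` and `+1*2` give different results.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// applyMutators applies a chain such as "*2+1" to value strictly from left
// to right, without operator precedence.
func applyMutators(value int, chain string) (int, error) {
	for len(chain) > 0 {
		op := chain[0]
		rest := chain[1:]
		// The operand runs until the next operator or the end of the chain.
		end := strings.IndexAny(rest, "+-*/")
		if end == -1 {
			end = len(rest)
		}
		operand, err := strconv.Atoi(rest[:end])
		if err != nil {
			return 0, fmt.Errorf("invalid mutator %q: %w", chain, err)
		}
		switch op {
		case '+':
			value += operand
		case '-':
			value -= operand
		case '*':
			value *= operand
		case '/':
			if operand == 0 {
				return 0, fmt.Errorf("division by zero in %q", chain)
			}
			value /= operand
		default:
			return 0, fmt.Errorf("unknown operator %q", op)
		}
		chain = rest[end:]
	}
	return value, nil
}

func main() {
	a, _ := applyMutators(3, "*2+1") // evaluated as (3*2)+1
	b, _ := applyMutators(3, "+1*2") // evaluated as (3+1)*2
	fmt.Println(a, b)
}
```

With conventional precedence, `3+1*2` would be 5. Applying the chain from left to right gives 8 instead, which is the behavior the warning above describes.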
### Handling unmatched lines
By default, lines that do not match the input pattern are dropped.
Use `-k` to keep them. They are then copied to stdout unchanged.
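
The difference between the two modes can be sketched in Go as a filter over lines. `process` and its `keepUnmatched` parameter are hypothetical names that stand in for the behavior of `-k`. The sketch illustrates the described semantics and is not the tool's source.

```go
package main

import (
	"fmt"
	"regexp"
)

// process rewrites every line that matches pattern using rewrite.
// Unmatched lines are dropped unless keepUnmatched is true (mirroring -k).
func process(lines []string, pattern string, keepUnmatched bool, rewrite func(m []string) string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, line := range lines {
		m := re.FindStringSubmatch(line)
		switch {
		case m != nil:
			out = append(out, rewrite(m))
		case keepUnmatched:
			out = append(out, line) // copied through unchanged
		}
	}
	return out
}

func main() {
	lines := []string{"2022-04-18", "02.01.2006"}
	iso := `(\d{4})-(\d{2})-(\d{2})`
	rewrite := func(m []string) string { return "year: " + m[1] }

	fmt.Println(process(lines, iso, false, rewrite)) // unmatched line dropped
	fmt.Println(process(lines, iso, true, rewrite))  // unmatched line kept
}
```

Keeping unmatched lines is what makes it possible to chain several `format` processes, each handling a different input pattern.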
## Examples
### Copying
Copying is the default behavior of `format`.

Input:
```txt
1
2
3
4
```
Command:
```sh
format
```
Output:
```txt
1
2
3
4
```
### Filtering
Only keep lines that contain an `i`.

Input:
```txt
one
two
three
four
five
six
seven
eight
nine
ten
```
Command:
```sh
format -i '.*i.*'
```
Output:
```txt
five
six
eight
nine
```
### Removing leading zeros
Use printf syntax on a capturing group in the output pattern.

Input:
```txt
001
002
003
04
```
Command:
```sh
format -i '\d+' -o '{0::%d}'
```
Output:
```txt
1
2
3
4
```
### Extracting dates
Input:
```txt
2022-04-18
1970-01-01
2006-01-02
```
Command:
```sh
format -i '(\d{4})-(\d{2})-(\d{2})' -o 'day: {3::%d} | month: {2::%d} | year: {1}'
```
Output:
```txt
day: 18 | month: 4 | year: 2022
day: 1 | month: 1 | year: 1970
day: 2 | month: 1 | year: 2006
```
### Applying multiple formats
Every `format` process can only apply a single pattern. Use `-k` to keep unmatched lines so that the next `format` instance can apply another input pattern to them.

Input:
```txt
2022-04-18
1970-01-01
02.01.2006
02.02.1962
```
Command:
```sh
format -i '(\d{4})-(\d{2})-(\d{2})' -o 'day: {3::%d} | month: {2::%d} | year: {1}' -k |
format -i '(\d{2})\.(\d{2})\.(\d{4})' -o 'day: {1::%d} | month: {2::%d} | year: {3}' -k
```
Output:
```txt
day: 18 | month: 4 | year: 2022
day: 1 | month: 1 | year: 1970
day: 2 | month: 1 | year: 2006
day: 2 | month: 2 | year: 1962
```
### Parsing multi-line patterns
Use `-n` to change the number of lines fed into the input pattern.

Input:
```txt
year: 2022
month: 04
day: 18
year: 1970
month: 01
day: 01
year: 2006
month: 01
day: 02
year: 1962
month: 02
day: 02
```
Command:
```sh
format -n 3 -i '^year: (\d{4})\nmonth: (\d{2})\nday: (\d{2})$' -o 'day: {3::%d} | month: {2::%d} | year: {1}'
```
Output:
```txt
day: 18 | month: 4 | year: 2022
day: 1 | month: 1 | year: 1970
day: 2 | month: 1 | year: 2006
day: 2 | month: 2 | year: 1962
```
### Adding 2 values together
Use mutators to apply simple arithmetic to capturing groups.

Input:
```txt
5 7
3 2
10 152
-15 3.7
```
Command:
```sh
format -i '(-?\d+) (-?\d+(?:\.\d+)?)' -o '{1} + {2} = {1::%g:+(2)}'
```
Output:
```txt
5 + 7 = 12
3 + 2 = 5
10 + 152 = 162
-15 + 3.7 = -11.3
```
### Bulk renaming files
Rename a bunch of files at once using `format`.

Output of `ls`:
```txt
000.jpg
001.jpg
002.jpg
```
Command:
```sh
ls | format -i '(\d+)\.jpg' -o 'mv "{0}" "{1::%d:+1}.jpg"' | xargs -0 sh -c
```
Output of `ls` afterwards:
```txt
1.jpg
2.jpg
3.jpg
```
To further automate this, I wrote a small script called `bulkrename` and put it in my `$PATH`.

Content:
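```sh
#!/usr/bin/env sh
# Usage: bulkrename '<input pattern>' '<output pattern>' [exec]
if [ "$3" = "exec" ]; then
    # Execute the generated mv commands.
    command ls | format -i "$1" -o "mv \"{0}\" \"$2\"" | xargs -0 -P 4 sh -c
else
    # Dry run: only print the generated mv commands.
    command ls | format -i "$1" -o "mv \"{0}\" \"$2\""
    echo
    echo "execute commands with 'bulkrename $* exec'"
fi
```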
There are a few things to consider when using this script:
- It does not take `-i` and `-o`. The first argument is the input pattern, the second is the output pattern.
- To prevent unwanted file modifications, it only prints the generated `mv` commands to stdout by default. After you have checked that all files will be renamed as desired, add `exec` as the third argument.
- It can only move items that are in your working directory. However, it can move them out of the working directory using relative or absolute paths.

Example usage of this script:

Output of `ls`:
```txt
000.jpg
001.jpg
002.jpg
```
Command:
```sh
bulkrename '(\d+)\.jpg' '{1::%d:+1}.jpg'
```
Output:
```txt
mv "000.jpg" "1.jpg"
mv "001.jpg" "2.jpg"
mv "002.jpg" "3.jpg"

execute commands with 'bulkrename (\d+)\.jpg {1::%d:+1}.jpg exec'
```
Command:
```sh
bulkrename '(\d+)\.jpg' '{1::%d:+1}.jpg' exec
```
Output of `ls` afterwards:
```txt
1.jpg
2.jpg
3.jpg
```