A console app that removes diacritics from stdin, from file contents, and from file/directory names.
The method for removing diacritics was chosen based on this benchmark: https://github.com/jakubsuchybio/Benchmark-Diacritics
Time - everything is O(n), where n is the size of the input
Memory - the stdin and content removers use O(n) memory, where n is the size of the input
Memory - the file-names remover uses O(n) memory, where n is the size of the directory tree (names)
Memory consumption could definitely be improved by buffering, but I didn't consider it important for my use case.
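To illustrate the buffering idea, here is a minimal Python sketch, not the tool's actual implementation. It assumes diacritics are removed by Unicode NFD decomposition followed by dropping nonspacing marks (category `Mn`), which may differ from the method picked in the benchmark. The names `strip_diacritics` and `strip_stream` are hypothetical. Because each character decomposes independently, processing the input chunk by chunk gives the same result as processing it whole, while memory stays bounded by the chunk size.

```python
import io
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the nonspacing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def strip_stream(reader, writer, chunk_size: int = 64 * 1024) -> None:
    """Copy reader to writer in fixed-size text chunks, stripping diacritics.

    Only one chunk is held in memory at a time, so memory use is
    O(chunk_size) instead of O(n).
    """
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        writer.write(strip_diacritics(chunk))


# Demo with in-memory streams; a tiny chunk size forces many chunk boundaries.
source = io.StringIO("Příliš žluťoučký kůň úpěl ďábelské ódy\n" * 3)
target = io.StringIO()
strip_stream(source, target, chunk_size=7)
print(target.getvalue(), end="")
```

For real stdin/stdout use, `strip_stream(sys.stdin, sys.stdout)` would be the equivalent call, assuming both streams are opened in text mode with the right encoding.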
> .\remove-diacritics.exe help
remove-diacritics 1.0.16+Branch.master.Sha.301c567a42c94a9b8e8f824d9470b07975623407
Copyright (C) 2019 remove-diacritics
stdin Removes diacritics from stdin and outputs it to stdout
file-names Removes diacritics from all files and directories
content Removes diacritics inside files
help Display more information on a specific command.
version Display version information.
> .\remove-diacritics.exe help stdin
remove-diacritics 1.0.16+Branch.master.Sha.301c567a42c94a9b8e8f824d9470b07975623407
Copyright (C) 2019 remove-diacritics
--debug Forces program to run in debug mode
--help Display this help screen.
--version Display version information.
> .\remove-diacritics.exe help file-names
remove-diacritics 1.0.16+Branch.master.Sha.301c567a42c94a9b8e8f824d9470b07975623407
Copyright (C) 2019 remove-diacritics
-d, --directories Required. Input directories where will diacritics be recursively removed in file/dir names
--debug Forces program to run in debug mode
--help Display this help screen.
--version Display version information.
> .\remove-diacritics.exe help content
remove-diacritics 1.0.16+Branch.master.Sha.301c567a42c94a9b8e8f824d9470b07975623407
Copyright (C) 2019 remove-diacritics
-f, --files Required. Input files where the diacritics be removed inside the file's content
--debug Forces program to run in debug mode
--help Display this help screen.
--version Display version information.
input:
echo ÁČĎÉĚÍŇÓŘŠŤÚŮÝŽáčďéěíňóřšťúůýž | remove-diacritics.exe stdin
output:
ACDEEINORSTUUYZacdeeinorstuuyz
input:
echo > ÁČĎÉĚÍŇÓŘŠŤÚŮÝŽáčďéěíňóřšťúůýž.txt
remove-diacritics.exe file-names --directories .
output: (the console doesn't log diacritics correctly...)
13:13:38.9467384 +02:00 [INF] [1] Renaming .\ACDÉEINORSTUUYZácdéeínórstúuyz.txt -> .\ACDEEINORSTUUYZacdeeinorstuuyz.txt
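A recursive rename like the one above can be sketched in Python as follows. This is an illustration under assumptions, not the tool's code: `strip_diacritics` and `rename_tree` are hypothetical names, the NFD-based stripping is an assumed technique, and skipping a rename when the target name already exists is a choice made for this sketch. The tree is walked bottom-up so that a directory is renamed only after everything inside it has been handled.

```python
import os
import tempfile
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the nonspacing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def rename_tree(root: str) -> list:
    """Rename every file and directory under root; return (old, new) pairs."""
    renamed = []
    # topdown=False yields a directory after its subdirectories, so renaming
    # an entry never invalidates a path that still has to be visited.
    for parent, dirs, files in os.walk(root, topdown=False):
        for name in dirs + files:
            clean = strip_diacritics(name)
            if clean == name:
                continue
            src = os.path.join(parent, name)
            dst = os.path.join(parent, clean)
            if os.path.exists(dst):
                continue  # assumption: skip instead of overwriting
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "složka"))
    open(os.path.join(root, "složka", "žluťoučký.txt"), "w").close()
    pairs = rename_tree(root)
    top = sorted(os.listdir(root))
    inner = sorted(os.listdir(os.path.join(root, "slozka")))

print(top)    # ['slozka']
print(inner)  # ['zlutoucky.txt']
```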
input:
echo ÁČĎÉĚÍŇÓŘŠŤÚŮÝŽáčďéěíňóřšťúůýž > test.txt
remove-diacritics.exe content --files test.txt
test.txt file content:
ÁČĎÉĚÍŇÓŘŠŤÚŮÝŽáčďéěíňóřšťúůýž
output:
13:32:45.5279019 +02:00 [INF] [1] Processed file test.txt
nodiacritics_test.txt file content:
ACDEEINORSTUUYZacdeeinorstuuyz
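The content example can be approximated with the Python sketch below. The `nodiacritics_` output prefix is taken from the example above; everything else is an assumption for illustration: the function names are hypothetical, the files are assumed to be UTF-8, and the NFD-based stripping may not match the tool's method. The file is processed line by line, so only one line is held in memory at a time.

```python
import os
import tempfile
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the nonspacing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def strip_file(path: str, encoding: str = "utf-8") -> str:
    """Write a diacritics-free copy next to path and return the new path."""
    directory, name = os.path.split(path)
    out_path = os.path.join(directory, "nodiacritics_" + name)
    # newline="" keeps the original line endings untouched.
    with open(path, encoding=encoding, newline="") as src, \
            open(out_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(strip_diacritics(line))
    return out_path


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "test.txt")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("ÁČĎÉĚÍŇÓŘŠŤÚŮÝŽáčďéěíňóřšťúůýž\n")
    result_path = strip_file(source)
    result_name = os.path.basename(result_path)
    with open(result_path, encoding="utf-8") as handle:
        result_text = handle.read()

print(result_name)          # nodiacritics_test.txt
print(result_text, end="")  # ACDEEINORSTUUYZacdeeinorstuuyz
```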