1. Additional Notes.md
Back up files before deleting or moving anything.

The MS Windows CMD terminal window will not work with non-Latin characters; use the MS Windows Terminal app instead.

Foreign language characters: Always import CSV files into MS Excel. Opening a CSV with "Open with", or with Excel as the default app, results in foreign language characters being lost.

NOTES for those who want to use MS Excel or LibreOffice Calc to make the task a lot simpler.

  1. MANDATORY: Any action list (move or delete) created in Excel or Calc on MS Windows must be saved as a *.txt file and converted to Linux format (LF line endings) using the dos2unix command.
    • Run dos2unix <file name>
  2. A removal or move list can be created from either report
    • duplicate_files1_yymmdd-hhmm.csv or
    • duplicate_files2_yymmdd-hhmm.csv
  3. Test with a small sample.
  4. Progress can be monitored by following the all_files CSV report, which is created first. Once it is complete, the creation of the other reports can be monitored in the same way.
    • Run tail -f all_files_yymmdd-hhmm.csv
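
As a rough illustration of step 1, the sketch below builds a small list with Windows (CR LF) line endings, counts the lines that carry a carriage return, converts the file, and counts again. The name FILE_LIST.txt matches the one used later in these notes. The sample paths are made up, and the tr fallback is an assumption for systems where dos2unix is not installed, not part of the original procedure.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Simulate a list saved on MS Windows: every line ends in CR LF.
# The two paths are hypothetical sample data.
printf '/data/a.txt\r\n/data/b.txt\r\n' > FILE_LIST.txt

# Count the lines that contain a carriage return.
cr=$(printf '\r')
before=$(grep -c "$cr" FILE_LIST.txt || true)
echo "lines with CR before conversion: $before"

# Convert in place. The tr fallback is an assumption for systems
# without dos2unix; it simply deletes every carriage return.
if command -v dos2unix >/dev/null 2>&1; then
    dos2unix FILE_LIST.txt
else
    tr -d '\r' < FILE_LIST.txt > FILE_LIST.tmp && mv FILE_LIST.tmp FILE_LIST.txt
fi

# grep -c exits non-zero when nothing matches, hence the "|| true".
after=$(grep -c "$cr" FILE_LIST.txt || true)
echo "lines with CR after conversion: $after"
```

If the second count is not zero, the list still has Windows line endings, and every path in it would carry an invisible trailing character.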

Bulk moving of duplicates

Import duplicate_files1_yymmdd-hhmm.csv into Excel or Calc, then:

  1. Delete all the rows belonging to files you don't want to move.
  2. Delete the SHA256 and Directory columns and save the result as a text file, FILE_LIST.txt.
  3. If you are on MS Windows (WSL, MSYS2 or Git Bash), you must convert FILE_LIST.txt to Linux format using the dos2unix command.
  4. Generate the MOVE_LIST where each line is a command to move each file individually.
    • Run the following command, where TARGET_DIRECTORY is the directory the files are to be moved to, something like "/tmp/duplicates". The directory path must be in double quotes.
      awk -v destination="TARGET_DIRECTORY" '{ printf "mv -f \x27%s\x27 \x27%s\x27\n",$0,destination}' < FILE_LIST.txt > MOVE_LIST

    • The MOVE_LIST will have lines like: mv -f '/file/source/directory/path/file_name' '/file/destination/path/'

  5. To move files run: bash MOVE_LIST
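
The move steps above can be tried on a small sample first. The following is a minimal sketch under stated assumptions: the scratch directory, the sample file names, and the check for single quotes are illustrative additions, and the awk line uses the octal escape \047 in place of \x27 (both stand for a single quote) purely for portability across awk implementations.

```shell
# Scratch area with two sample files, one with a space in its name.
# All names here are hypothetical sample data.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/duplicates"
touch "$workdir/src/photo 1.jpg" "$workdir/src/notes.txt"

# FILE_LIST.txt: one full path per line.
printf '%s\n' "$workdir/src/photo 1.jpg" "$workdir/src/notes.txt" \
    > "$workdir/FILE_LIST.txt"

# A path containing a single quote would break the quoting in MOVE_LIST,
# so count such paths before generating anything.
quoted=$(grep -c "'" "$workdir/FILE_LIST.txt" || true)
echo "paths containing a single quote: $quoted"

# Same idea as the awk command above, with \047 instead of \x27.
awk -v destination="$workdir/duplicates" \
    '{ printf "mv -f \047%s\047 \047%s\047\n", $0, destination }' \
    < "$workdir/FILE_LIST.txt" > "$workdir/MOVE_LIST"

# Review the generated commands, then run them and list the result.
cat "$workdir/MOVE_LIST"
bash "$workdir/MOVE_LIST"
ls "$workdir/duplicates"
```

Reading MOVE_LIST before running it is the cheapest safety check: each line should be a complete mv command with both paths wrapped in single quotes.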

Bulk deletion of duplicates

The steps are the same as for bulk moving, except that the command to generate the list is slightly different:

awk '{ printf "rm -f \x27%s\x27\n",$0}' < FILE_LIST.txt > REMOVE_LIST

The REMOVE_LIST will have lines like: rm -f '/file/source/directory/path/file_name'

To remove files run: bash REMOVE_LIST
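
Because rm -f stays silent when a path does not exist, a damaged entry in FILE_LIST.txt (for example one that still has a Windows line ending) would go unnoticed. The sketch below, which is an optional extra check and not part of the original procedure, reports listed paths that are not on disk before generating and running the list. The scratch directory and file names are hypothetical, and \047 is used in place of \x27 for portability.

```shell
# Scratch area with two sample duplicates (hypothetical names).
workdir=$(mktemp -d)
touch "$workdir/dup one.txt" "$workdir/dup_two.txt"

# The third entry deliberately points at a file that does not exist.
printf '%s\n' "$workdir/dup one.txt" "$workdir/dup_two.txt" "$workdir/gone.txt" \
    > "$workdir/FILE_LIST.txt"

# Report every listed path that is not present on disk.
missing=0
while IFS= read -r path; do
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        missing=$((missing + 1))
    fi
done < "$workdir/FILE_LIST.txt"
echo "missing entries: $missing"

# Generate the removal commands, review them, then run them.
awk '{ printf "rm -f \047%s\047\n", $0 }' \
    < "$workdir/FILE_LIST.txt" > "$workdir/REMOVE_LIST"
cat "$workdir/REMOVE_LIST"
bash "$workdir/REMOVE_LIST"
```

A non-zero count of missing entries is a reason to stop and inspect FILE_LIST.txt before anything is deleted.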

NOTE: If you want to be cautious, replace -f with -i in the commands. You will then be prompted before every delete, and before any move that would overwrite an existing file.