update #3

Merged: 4 commits, Oct 13, 2023
40 changes: 40 additions & 0 deletions .github/workflows/PDF-OCR.yml
@@ -0,0 +1,40 @@
# This is a basic workflow to help you get started with Actions

name: CI

# Controls when the workflow will run
on:
  schedule:
    - cron: '0 23 * * *'

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v3
      - uses: awalsh128/cache-apt-pkgs-action@latest
        with:
          packages: ocrmypdf
          version: 1.0
      - name: Installing Write
        run: |
          wget -O write.tar.gz https://www.styluslabs.com/download/write-tgz
          mkdir -p "$HOME/Write"
          tar -xvzf write.tar.gz -C "$HOME/Write" --strip-components 1

      - name: Creating PDFs
        run: |
          sudo chmod u+x PdfScript/pdfcreator.sh
          ./PdfScript/pdfcreator.sh

      - name: Creating OCR layer
        run: find . -name '*-DavideTonelli.pdf' -exec ocrmypdf --language eng --output-type pdf --verbose 1 {} {} \;
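Review note: the OCR step passes the same path as input and output, so each matching PDF is overwritten in place. A minimal dry-run sketch for checking which files would be touched before enabling the step; the function name `print_ocr_commands` is hypothetical and not part of this PR, and it only prints commands rather than calling `ocrmypdf`.

```shell
# Hypothetical dry run (not part of the PR): print the ocrmypdf command for each
# matching PDF instead of executing it. The pattern is quoted so that find,
# not the calling shell, expands the wildcard.
print_ocr_commands() {
  find "$1" -name '*-DavideTonelli.pdf' -exec sh -c '
    for f do
      printf "ocrmypdf --language eng --output-type pdf %s %s\n" "$f" "$f"
    done' sh {} +
}
```

Running `print_ocr_commands .` from the repository root lists one command per file whose name ends in `-DavideTonelli.pdf`.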
14 changes: 14 additions & 0 deletions PdfScript/pdfcreator.sh
@@ -0,0 +1,14 @@
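#!/usr/bin/env bash
# Find every *-DT directory, then collect the .svgz files inside each one.
readarray -d '' dir_array < <(find . -type d -name "*-DT" -print0)
array=()
for i in "${dir_array[@]}"
do
  readarray -d '' -O "${#array[@]}" array < <(find "$i" -name "*.svgz" -print0)
done

for i in "${array[@]}"
do
  out="${i%.svgz}.pdf"
  "$HOME/Write/Write" --exit --out "$out" "$i"
done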
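Review note: the script derives each output name with `${i%.svgz}.pdf`, which strips the `.svgz` suffix and appends `.pdf`. A minimal sketch of that mapping as a dry run, assuming the same `*-DT` directory convention; the names `svgz_to_pdf` and `list_conversions` are hypothetical and not part of this PR, and nothing here invokes Write.

```shell
# Hypothetical helper (not part of the PR): map an .svgz path to its .pdf path.
svgz_to_pdf() {
  printf '%s\n' "${1%.svgz}.pdf"
}

# Hypothetical dry run: list "source -> output" for every .svgz file that sits
# somewhere below a directory whose name ends in -DT. The inline expansion is
# the same suffix replacement as svgz_to_pdf; find hands paths over as
# arguments, so names containing spaces stay intact.
list_conversions() {
  find "$1" -path '*-DT/*' -name '*.svgz' -exec sh -c '
    for f do
      printf "%s -> %s\n" "$f" "${f%.svgz}.pdf"
    done' sh {} +
}
```

Running `list_conversions .` from the repository root shows which PDFs the conversion loop would try to produce.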
Binary file added Year_1/First_semester/MCM/MCM-DT/test.pdf
Binary file not shown.