Commit 7e9407a5 authored by Amrita Deb

Merge branch 'master' into 'rohlfing-patch-remove-pypdf2'

# Conflicts:
#   README.md
parents 56e999e1 7130fa3f
Pipeline #459578 passed with stage
in 1 minute and 58 seconds
 # exam-scan
-Preparing exam scans for ship out: Adding watermarks, encryption and preparing upload to Moodle.
+Exam-Scan is a command-line tool that helps to watermark and/or encrypt feedback files/scanned exams/additional exam materials and prepare them in Moodle uploadable format.
+The tool is designed to handle zipped submissions downloadable from Moodle as well as scanned PDFs from your local scanner.
 ## Contents
-* `handlemoodlesubmissions.py` unzips files from submission zip file downloadable from Moodle(in case there are no scans) and renames it accordingly in the format (`<Matriculation number>_<Lastname>`)
+* `handlemoodlesubmissions.py` (*for downloadable PDFs*) unzips files from the submission zip file downloadable from Moodle (in case there are no scans) and renames them in the format *<Matriculation number>_<Lastname/first letter of Lastname>*
-* `supplements.py` renames and create copies of sample solutions(if any) for every student
+* `renamescans.py` (*for scanned PDFs*) renames scanned PDFs, assuming the scan order equals the alphabetical order of students in the Moodle grading sheet, such that the file name is in the format *<Matriculation number>_<first letter of Lastname>*. This only works if exams were scanned in alphabetical order. Optionally, each scanned PDF is searched for barcodes/QR codes containing the matriculation number as a double check.
+* `supplements.py` renames and creates copies of sample solutions (if any)/additional exam materials for every student.
 * `watermark.py` watermarks each page of PDFs containing exam scans with matriculation number of the respective student
-* `encrypt.py` encrypts PDF with password either with a common password(passed as an argument) or a randomly generated password(when there is no argument)
+* `encrypt.py` encrypts PDFs either with a common password (passed as an argument) or a randomly generated password (when there is no argument)
-* `preparemoodleupload.py` prepares for uploading PDFs to moodle via assign module as feedback file for each student
+* `preparemoodleupload.py` zips PDFs in the format acceptable for Moodle upload via assign module as feedback file for each student
 * `batch.py` executes all three programs as a singular batch job
-Please note that the three scripts `watermark.py`, `encrypt.py`, and `preparemoodleupload.py`do not depend on each other.
+Please note that `handlemoodlesubmissions.py`, `supplements.py`, `watermark.py`, `encrypt.py`, and `preparemoodleupload.py` do not depend on each other.
-If you want to use only a subset (or one) of the scripts, you can find it [here](Dependancies.md).
+If you want to use only a subset (or one) of the scripts, you can do so after installing the corresponding script's [dependencies](Dependancies.md).
 Exemplary outputs can be downloaded:
 * [moodle_feedbacks.zip](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/jobs/artifacts/master/raw/out/moodle_feedbacks.zip?job=test): The zip-Archive to be uploaded to Moodle containing the watermarked and encrypted PDFs for each student.
 * [passwords.csv](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/jobs/artifacts/master/raw/out/passwords.csv?job=test): CSV file containing passwords for each PDF.
+For more info please refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/)
 ## Instructions
 ### Prerequisites
@@ -29,20 +33,37 @@ Exemplary outputs can be downloaded:
 * **Create PDFs corresponding to each exam**
 * Scan the exams and save the scans as PDFs (each page should be A4). For most copy machines, you can save an A3 scan (double page of an exam) as two A4 pages.
-* The filename of each PDF should start with the student's matriculation number (e.g. `123456_Nachname.pdf`).
-* Place all PDFs in a folder, e.g. `pdfs`.
-**OR: Download submission zip file**
-* Download the submission zip file from (Assignment Main Page->View all submissions->Download all submissions)
+* You can either
+1. Order the scans following the same order as the grading worksheet
+1. Or rename the scans in the format `<Matriculation number>_<Lastname/first letter of Lastname>` (e.g. `123456_Nachname.pdf`/`123456_L.pdf`).
 * **OR: Download submission zip file**
 * Download the submission zip file from (Assignment Main Page->View all submissions->Download all submissions)
-* **Optional: Create Sample Solutions (Refer [here](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/issues/3))**
+* **Optional: Create sample solutions/additional exam materials (Refer [here](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/issues/3))**
-* Scan the sample solutions and save the scans as PDFs (each page should be A4). For most copy machines, you can save an A3 scan (double page of an exam) as two A4 pages.
+* Scan the sample solutions/additional exam materials and save the scans as PDFs (each page should be A4). For most copy machines, you can save an A3 scan (double page of an exam) as two A4 pages.
 * Place all PDFs in a folder, e.g. `supplements`.
 * **Install the software dependancies**
+The current version of the code was tested on Windows 10, Ubuntu 20.04.1 LTS and macOS 10.14 Mojave to ensure platform independence. The code has the following software dependencies, which need to be installed before the programs can be run successfully.
+* ImageMagick (version: 7.0.10-33)
+* Ghostscript (version: 9.53.3)
+* Python (version: 3.8/3.9)
+* PIP (version: 21.0.1)
+* Additional Python modules:
+* wand (version 0.6.5)
+* pillow (version: 8.1.0)
+* PyPDF2 (version 1.26.0)
+* pwgen (version: 0.8.2.post0)
+* pikepdf (version 2.5.0)
+* zip (version 0.02)
+Instructions to install software dependencies based on your operating system:
+* Windows 10 : [Installation of Software Dependencies](swdependencies_win.md)
+* MacOS : [Installation of Software Dependencies](swdependencies_mac.md)
+* Linux : [Installation of Software Dependencies](swdependencies_linux.md)
 If you are an experienced user and familiar with the python `venv` (virtual environments) module and after having installed both ImageMagick (beware the `Policy Error` fix in [FAQs]) and Ghostscript you can install the python dependencies in the virtual environment via pip with
 ```bash
@@ -84,105 +105,84 @@ docker build -t examscan:latest .
 docker run --name examscan --rm -v $(pwd):$(pwd) -w $(pwd) examscan:latest batch.py --help
 ```
-## Scripts and how to run them
-### Process
-Run `handlemoodlesubmissions.py` (if you have submissions in a zip and not as scans), `supplements.py` (if you want to add watermarked sample solutions as well), `watermark.py`, `encrypt.py`, and `preparemoodleupload.py` (or run `batch.py` which runs all) as described in the sections below. In summary, these steps will
-1. unzip all PDF files from the zip and rename them according to the schema `<Matriculation number>_<Lastname>`
-1. prepare sample solution for each students
-1. watermark each page of each PDF with the corresponding matriculation number,
-1. encrypt each PDF with a password (global or per-student) and
-1. construct a zip-archive enabling batch upload and assignment of each PDF to each student in Moodle.
-Upload `moodle_feedbacks.zip` to Moodle
-* `Alle Angaben anzeigen` &#8594; `Bewertungsvorgang` &#8594; `Mehrere Feedbackdateien in einer Zip-Datei hochladen`
-* Moodle will check for consistency and prompt errors.
-### Commands
-Please note that the following commands may only work for you with either `python` or `python3`.
+## Commands
+#### Unzip submission files and rename them
+Assuming that `./tests/assets/submissions.zip` is the zip file containing all submissions and `./tests/assets/Grades.csv` is the grading worksheet
+```bash
+python handlemoodlesubmissions.py ./tests/assets/submissions.zip ./tests/assets/Grades.csv ./tests/assets/pdfs
+```
+For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/handlemoodlesubmissions.html)
-#### Unzip submission files from and rename them
-Assuming that ./submissions.zip is the zip file containing all submissions and ./Bewertungen.csv is the grading worksheet
+#### Rename scanned PDFs
+Assuming that `./tests/assets/pdfs_scan` is the folder containing all scans, `./tests/assets/Grades.csv` is the grading worksheet and `./tests/assets/pdfs` is the output folder where all the renamed scans will be placed
 ```bash
-python handlemoodlesubmissions.py --inzip ./submissions.zip --outfolder ./pdfs --csv ./Bewertungen.csv
+python renamescans.py ./tests/assets/pdfs_scan ./tests/assets/Grades.csv ./tests/assets/pdfs
 ```
+For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/renamescans.html)
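The renaming rule described for `renamescans.py` (scan order equals the alphabetical order of students in the grading sheet) can be illustrated with a minimal Python sketch. This is not the script's actual implementation: the function name `plan_renames` and the `(matriculation number, last name)` input tuples are assumptions made for illustration, sorting by last name stands in for the order of the Moodle grading sheet, and reading the sheet itself is left out.

```python
from typing import List, Tuple


def plan_renames(scans: List[str],
                 students: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Pair scans with students, both taken in alphabetical order.

    scans:    scanned PDF file names (hypothetical input)
    students: (matriculation number, last name) tuples (hypothetical input)
    """
    if len(scans) != len(students):
        raise ValueError("number of scans and students differ")

    ordered_scans = sorted(scans)
    # Sort students by last name, case-insensitively
    ordered_students = sorted(students, key=lambda s: s[1].lower())

    renames = []
    for scan, (matnum, lastname) in zip(ordered_scans, ordered_students):
        # Target format: <Matriculation number>_<first letter of Lastname>.pdf
        renames.append((scan, f"{matnum}_{lastname[0].upper()}.pdf"))
    return renames


if __name__ == "__main__":
    plan = plan_renames(
        ["scan_002.pdf", "scan_001.pdf"],
        [("123002", "Zimmer"), ("123001", "Adam")],
    )
    for old, new in plan:
        print(old, "->", new)
```

The sketch only computes the mapping and renames nothing on disk, which makes it easy to inspect the plan before touching any file. A mismatch between the number of scans and students is treated as an error because the positional pairing would otherwise silently shift.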
 #### Prepare copies of Sample Solutions for each student (Optional)
-We assume that the folder `./supplements` holds the scans of the sample solution.
+Assuming that the folder `./tests/assets/supplements` holds the scans of the sample solution/additional materials, `./tests/assets/Grades.csv` is the grading worksheet and `./tests/assets/pdfs` is the output folder where all the renamed materials will be placed
 ```bash
-python supplements.py
+python supplements.py ./tests/assets/supplements ./tests/assets/Grades.csv ./tests/assets/pdfs
 ```
-Folder `supplements_out` contains copies of the sample solutions for each student.
+For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/supplements.html)
 #### Watermark the submissions
-We assume that the folder `./pdfs` holds the scans of the exams and .
-The filename of each PDF should start with the matriculation number of the student, e.g. `./pdfs/123456_Lastname.pdf`.
+Assuming that the folder `./tests/assets/pdfs` holds the scans of the exams and filenames follow the format *<Matriculation number>_<Lastname/first letter of Lastname>*, `./tests/assets/pdfs_watermarked` is the output folder where all the watermarked PDFs will be placed and `--cores 2` sets the number of cores for parallel processing
 ```bash
-python watermark.py --in ./pdfs --out ./pdfs_watermarked --cores 2
+python watermark.py ./tests/assets/pdfs ./tests/assets/pdfs_watermarked --cores 2 --dpi 150 --quality 75
 ```
-Folder `pdfs_watermarked` contains watermarked PDFs, with each page watermarked with the matriculation number of the student.
 **TIP:** Play around with `dpi` and `quality` parameters according to your requirements. Higher values for these two will result in high resolution PDFs of bigger size (ideal for when the number of files is low). Lower values will result in PDFs having lower file size and low resolution (ideal when the number of files is high)
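To get a feeling for the `dpi` parameter: the pixel dimensions of a rasterized A4 page (210 mm x 297 mm) grow linearly with the dpi value, so the number of pixels grows quadratically. The following Python sketch only shows this arithmetic. The helper name `a4_pixels` is made up for illustration and is not part of `watermark.py`, which may rasterize pages differently.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # width, height of an A4 page in millimetres


def a4_pixels(dpi: int) -> tuple:
    """Return (width, height) in pixels of an A4 page rasterized at `dpi`."""
    return tuple(round(side / MM_PER_INCH * dpi) for side in A4_MM)


if __name__ == "__main__":
    for dpi in (75, 150, 300):
        width, height = a4_pixels(dpi)
        print(f"{dpi} dpi: {width} x {height} px "
              f"({width * height / 1e6:.1f} megapixels)")
```

At 150 dpi an A4 page is 1240 x 1754 pixels; doubling to 300 dpi gives 2480 x 3508 pixels, roughly four times as many, which is why file size rises quickly with higher `dpi` values.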
-#### Watermark Sample solution copies
-We assume that the folder `./supplements_out` holds the copies for every students
-```bash
-python watermark.py --in ./supplements_out --out ./pdfs_watermarked --cores 2
-```
+For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/watermark.html)
 #### Encrypt the files
-Use either a global password by specifying it with the `--password` option or per-student passwords by omitting `--password`.
+Assuming that the folder `./tests/assets/pdfs_watermarked` holds the files to encrypt, `./tests/assets/pdfs_encrypted` is the output folder where all the encrypted PDFs will be placed and a common password for all encrypted PDFs is provided with the `--password` option
 ```bash
-python encrypt.py --in ./pdfs_watermarked --out ./pdfs_encrypted --password ganzgeheim
+python encrypt.py ./tests/assets/pdfs_watermarked ./tests/assets/pdfs_encrypted --password ganzgeheim
 ```
+**TIP:** If you omit the `--password` option, a randomly generated password is set for each PDF.
-Folder `./pdfs_encrypted` contains all encrypted PDFs as well as `passwords.csv`, mapping the password of each PDF to the matriculation number.
+For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/encrypt.html)
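The choice between a common password and per-student random passwords can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of `encrypt.py`: it uses the standard library `secrets` module instead of the `pwgen` package listed in the dependencies, the function names are hypothetical, and the two-column CSV layout is modelled on the sample `passwords.csv` rows in this commit (for example `123001,u4KZmFV9`). The PDF encryption step itself is left out.

```python
import csv
import io
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_password(length: int = 8) -> str:
    """Return a random alphanumeric password of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def password_table(matnums, common_password=None):
    """Map each matriculation number to a password.

    With `common_password` set, every student gets the same password;
    otherwise each student gets an individual random one.
    """
    return {m: common_password or random_password() for m in matnums}


def to_csv(table) -> str:
    """Serialize the table as `<matriculation number>,<password>` rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for matnum, password in table.items():
        writer.writerow([matnum, password])
    return buffer.getvalue()


if __name__ == "__main__":
    # Per-student random passwords (no common password given)
    print(to_csv(password_table(["123001", "123002"])), end="")
    # One common password for everybody
    print(to_csv(password_table(["123001"], "ganzgeheim")), end="")
```

`secrets` is used here because it draws from a cryptographically strong random source, which is a reasonable default for passwords that protect exam scans.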
 #### Prepare for Moodle batch upload
-This step prepares the PDFs for upload to Moodle. First, the grading table `Bewertungen.csv` has to be downloaded from Moodle via:
-`Alle Angaben anzeigen` &#8594; `Bewertungsvorgang` &#8594; `Bewertungstabelle herunterladen`.
-This step is needed since Moodle does not only need matriculation number, but also last and first name as well as an internal user id, which is stored in `Bewertungen.csv`.
+Assuming that the folder `./tests/assets/pdfs_encrypted` holds the scans to be uploaded to Moodle, `./tests/assets/Grades.csv` is the grading worksheet and `./tests/assets/out` is the output folder where the zip archive will be placed
 ```bash
-python preparemoodleupload.py --in ./pdfs_encrypted --csv ./Bewertungen.csv --out ./moodle_feedbacks.zip
+python preparemoodleupload.py ./tests/assets/pdfs_encrypted ./tests/assets/Grades.csv ./tests/assets/out
 ```
 Then, you can upload `moodle_feedbacks.zip` in Moodle:
 `Alle Angaben anzeigen` &#8594; `Bewertungsvorgang` &#8594; `Mehrere Feedbackdateien in einer Zip-Datei hochladen`
-Further remarks:
-* Exemplary zip archive `moodle_feedbacks.zip` can be downloaded [here](https://git.rwth-aachen.de/IENT/exam-scan/-/jobs/artifacts/master/download?job=test).
-* You can also conduct a dry run (neither folders nor zip file are created) via `./preparemoodleupload.sh --dry [...]`
+For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/preparemoodleupload.html)
 #### Batch job
-Or do everything in one step
+You can run all of the above steps with a single script as follows:
 ```bash
-python batch.py --in ./pdfs --out ./out --cores 2 --password ganzgeheim --csv ./Bewertungen.csv --supinfolder ./supplements --sup 1 --zip ./submissions.zip
+python batch.py ./tests/assets/pdfs ./tests/assets/Grades.csv ./tests/assets/out --cores 2 --password ganzgeheim --suppinfolder ./tests/assets/supplements --supp
 ```
-with folder `out` containing `passwords.csv` and `moodle_feedbacks.zip`.
+In this case, `./tests/assets/out` contains both `passwords.csv` and the zip archive.
+For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/batch.html)
 ## Original Authors
......
@@ -87,6 +87,10 @@ def main(args):
     starttime = time.time()
+    # Check folders
+    if not os.path.exists(outfolder):
+        os.makedirs(outfolder)
     # Unzip submissions if provided zip archive
     if inzip != "0":
         if not os.path.exists(infolder):
......
@@ -16,4 +16,8 @@ Note that all Python packages are listed in the file `requirements.txt`.
 ## `preparemoodleupload.py`
-* zip
+* zipfile (version 0.02)
+## `renamescans.py`
+* pyzbar (version: 0.1.8)
\ No newline at end of file
@@ -87,6 +87,10 @@ def main(args):
     tmp_folder = args.tmp
     extracted_folder = os.path.join(tmp_folder, "extracted_from_moodle")
+    # Check folders
+    if not os.path.exists(outfolder):
+        os.makedirs(outfolder)
     # Print status
     starttime = time.time()
     num_students = moodle.get_student_number(sheet_csv=sheet_csv,
......
123001,u4KZmFV9
123002,9nnhC4nv
123010,AjH1vhi2
123011,n79WZpm4
@@ -192,6 +192,12 @@ def main(args):
     csv_enc = args.csvenc
     size_limit = int(args.moodleuploadlimit)  # Moodle upload size limit in MiB
+    # Check folders
+    zip_dir= outzip.rsplit('/',1)[0]
+    if not os.path.exists(zip_dir):
+        os.makedirs(zip_dir)
     # Print status
     starttime = time.time()
     num_students = moodle.get_student_number(sheet_csv=sheet_csv,
......
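A note on the folder check in the hunk above: `outzip.rsplit('/', 1)[0]` returns the whole string when `outzip` contains no `/` (a bare file name, or a Windows path written with backslashes), so in that case a directory named like the zip file would be created. A hedged sketch of a more portable variant using only the standard library follows. The function name `ensure_parent_dir` is hypothetical and not part of the repository.

```python
import os
import tempfile


def ensure_parent_dir(path: str) -> str:
    """Create the parent directory of `path` if needed and return it."""
    parent = os.path.dirname(os.path.abspath(path))
    # exist_ok=True avoids a separate os.path.exists() check
    os.makedirs(parent, exist_ok=True)
    return parent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        outzip = os.path.join(tmp, "out", "moodle_feedbacks.zip")
        print(ensure_parent_dir(outzip))
        # A bare file name resolves to the current working directory
        print(ensure_parent_dir("moodle_feedbacks.zip"))
```

`os.path.dirname` handles the platform's path separator, and `os.makedirs(..., exist_ok=True)` would also shorten the `if not os.path.exists(...)` checks added to the other scripts in this commit.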
@@ -75,6 +75,10 @@ def main(args):
     csv_enc = args.csvenc
     check_qr = args.checkqr
+    # Check folders
+    if not os.path.exists(outfolder):
+        os.makedirs(outfolder)
     # Print status with total number of lines
     starttime = time.time()
     dryout = ""
......
@@ -125,6 +125,10 @@ def main(args):
     csv_quote = args.csvquote
     dry = args.dry
+    # Check folders
+    if not os.path.exists(output_dir):
+        os.makedirs(output_dir)
     # Decide whether PDF folder or CSV file was given
     csvfilename = pdf_dir = ""
     ext = os.path.splitext(prefixinfo)[1].lower()
......