
Commit 7e9407a5 authored by Amrita Deb

Merge branch 'master' into 'rohlfing-patch-remove-pypdf2'

# Conflicts:
#   README.md
parents 56e999e1 7130fa3f
Pipeline #459578 passed with stage
in 1 minute and 58 seconds
# exam-scan
Preparing exam scans for ship out: Adding watermarks, encryption and preparing upload to Moodle.
Exam-Scan is a command-line tool that helps to watermark and/or encrypt feedback files/scanned exams/additional exam materials and prepare them in Moodle uploadable format.
The tool is designed to handle zipped submissions downloadable from Moodle as well as scanned PDFs from your local scanner.
## Contents
* `handlemoodlesubmissions.py` unzips files from the submission zip file downloadable from Moodle (in case there are no scans) and renames them in the format (`<Matriculation number>_<Lastname>`)
* `supplements.py` renames and creates copies of sample solutions (if any) for every student
* `handlemoodlesubmissions.py` (*for downloadable PDFs*) unzips files from the submission zip file downloadable from Moodle (in case there are no scans) and renames them in the format *<Matriculation number>_<Lastname/first letter of Lastname>*
* `renamescans.py` (*for scanned PDFs*) renames scanned PDFs, assuming the scan order equals the alphabetical order of students in the Moodle grading sheet, such that each file name is in the format *<Matriculation number>_<first letter of Lastname>*. This only works if the exams were scanned in alphabetical order. Optionally, each scanned PDF is searched for barcodes/QR codes containing the matriculation number as a double check.
* `supplements.py` renames and creates copies of sample solutions (if any)/additional exam materials for every student.
* `watermark.py` watermarks each page of PDFs containing exam scans with the matriculation number of the respective student
* `encrypt.py` encrypts each PDF with a password, either a common password (passed as an argument) or a randomly generated one (when there is no argument)
* `preparemoodleupload.py` prepares PDFs for upload to Moodle via the assign module as a feedback file for each student
* `encrypt.py` encrypts PDFs either with a common password (passed as an argument) or a randomly generated password (when there is no argument)
* `preparemoodleupload.py` zips PDFs in the format accepted by Moodle for upload via the assign module as a feedback file for each student
* `batch.py` executes all three programs as a single batch job
Please note that the three scripts `watermark.py`, `encrypt.py`, and `preparemoodleupload.py` do not depend on each other.
If you want to use only a subset (or one) of the scripts, you can find it [here](Dependancies.md).
Please note that `handlemoodlesubmissions.py`, `supplements.py`, `watermark.py`, `encrypt.py`, and `preparemoodleupload.py` do not depend on each other.
If you want to use only a subset (or one) of the scripts, you can do so after installing the corresponding script's [dependencies](Dependancies.md).
Exemplary outputs can be downloaded:
* [moodle_feedbacks.zip](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/jobs/artifacts/master/raw/out/moodle_feedbacks.zip?job=test): The zip-Archive to be uploaded to Moodle containing the watermarked and encrypted PDFs for each student.
* [passwords.csv](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/jobs/artifacts/master/raw/out/passwords.csv?job=test): CSV file containing passwords for each PDF.
For more info please refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/)
## Instructions
### Prerequisites
......@@ -29,21 +33,38 @@ Exemplary outputs can be downloaded:
* **Create PDFs corresponding to each exam**
* Scan the exams and save the scans as PDFs (each page should be A4). For most copy machines, you can save an A3 scan (double page of an exam) as two A4 pages.
* The filename of each PDF should start with the student's matriculation number (e.g. `123456_Nachname.pdf`).
* Place all PDFs in a folder, e.g. `pdfs`.
**OR: Download submission zip file**
* Download the submission zip file from (Assignment Main Page->View all submissions->Download all submissions)
* You can either
1. Order the scans following the same order as the grading worksheet
1. Or rename the scans in the format `<Matriculation number>_<Lastname/first letter of Lastname>` (e.g. `123456_Nachname.pdf`/`123456_L.pdf`).
* **OR: Download submission zip file**
* Download the submission zip file from (Assignment Main Page->View all submissions->Download all submissions)
* **Optional: Create Sample Solutions (Refer [here](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/issues/3))**
* Scan the sample solutions and save the scans as PDFs (each page should be A4). For most copy machines, you can save an A3 scan (double page of an exam) as two A4 pages.
* **Optional: Create sample solutions/additional exam materials (Refer [here](https://git.rwth-aachen.de/rwthmoodle/exam-scan/-/issues/3))**
* Scan the sample solutions/additional exam materials and save the scans as PDFs (each page should be A4). For most copy machines, you can save an A3 scan (double page of an exam) as two A4 pages.
* Place all PDFs in a folder, e.g. `supplements`.
* **Install the software dependencies**
If you are an experienced user familiar with the Python `venv` (virtual environments) module and have installed both ImageMagick (beware the `Policy Error` fix in [FAQs]) and Ghostscript, you can install the Python dependencies in a virtual environment via pip with
The current version of the code was tested on Windows 10, Ubuntu 20.04.1 LTS and macOS 10.14 Mojave to ensure platform independence. The code has the following software dependencies, which need to be installed before the programs can run successfully.
* ImageMagick (version: 7.0.10-33)
* Ghostscript (version: 9.53.3)
* Python (version: 3.8/3.9)
* pip (version: 21.0.1)
* Additional Python modules:
* wand (version 0.6.5)
* pillow (version: 8.1.0)
* PyPDF2 (version 1.26.0)
* pwgen (version: 0.8.2.post0)
* pikepdf (version 2.5.0)
* zip (version 0.02)
Instructions to install software dependencies based on your operating system:
* Windows 10 : [Installation of Software Dependencies](swdependencies_win.md)
* macOS : [Installation of Software Dependencies](swdependencies_mac.md)
* Linux : [Installation of Software Dependencies](swdependencies_linux.md)
If you are an experienced user familiar with the Python `venv` (virtual environments) module and have installed both ImageMagick (beware the `Policy Error` fix in [FAQs]) and Ghostscript, you can install the Python dependencies in a virtual environment via pip with
```bash
python -m venv venv
......@@ -84,105 +105,84 @@ docker build -t examscan:latest .
docker run --name examscan --rm -v $(pwd):$(pwd) -w $(pwd) examscan:latest batch.py --help
```
## Scripts and how to run them
### Process
Run `handlemoodlesubmissions.py` (if you have submissions in a zip and not as scans), `supplements.py` (if you want to add watermarked sample solutions as well), `watermark.py`, `encrypt.py`, and `preparemoodleupload.py` (or run `batch.py` which runs all) as described in the sections below. In summary, these steps will
1. unzip all PDF files from the zip and rename them according to the schema `<Matriculation number>_<Lastname>`
1. prepare a copy of the sample solution for each student,
1. watermark each page of each PDF with the corresponding matriculation number,
1. encrypt each PDF with a password (global or per-student) and
1. construct a zip-archive enabling batch upload and assignment of each PDF to each student in Moodle.
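
The last three steps map onto one command per script. The following is a minimal sketch, assuming the positional-argument form used in the commands of this README and placeholder folder names. `build_pipeline` is a hypothetical helper, not part of exam-scan, and it only assembles the command lines without running anything:

```python
import sys
from pathlib import Path


def build_pipeline(pdfs, csv, work, password=None, cores=2):
    """Return one argv list per step; nothing is executed here."""
    work = Path(work)
    watermarked = work / "pdfs_watermarked"  # placeholder folder name
    encrypted = work / "pdfs_encrypted"      # placeholder folder name
    encrypt = [sys.executable, "encrypt.py", str(watermarked), str(encrypted)]
    if password is not None:
        # Without --password, encrypt.py generates per-student passwords
        encrypt += ["--password", password]
    return [
        [sys.executable, "watermark.py", str(pdfs), str(watermarked),
         "--cores", str(cores)],
        encrypt,
        [sys.executable, "preparemoodleupload.py", str(encrypted), str(csv),
         str(work / "out")],
    ]


for cmd in build_pipeline("./pdfs", "./Grades.csv", "./work", password="ganzgeheim"):
    print(" ".join(cmd))
```

Each list could then be passed to `subprocess.run(cmd, check=True)` from the repository root, so that a failing step stops the chain.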
## Commands
Upload `moodle_feedbacks.zip` to Moodle
#### Unzip submission files and rename them
* `Alle Angaben anzeigen` &#8594; `Bewertungsvorgang` &#8594; `Mehrere Feedbackdateien in einer Zip-Datei hochladen`
* Moodle will check for consistency and prompt errors.
Assuming that `./tests/assets/submissions.zip` is the zip file containing all submissions and `./tests/assets/Grades.csv` is the grading worksheet
### Commands
```bash
python handlemoodlesubmissions.py ./tests/assets/submissions.zip ./tests/assets/Grades.csv ./tests/assets/pdfs
```
Please note that the following commands may only work for you with either `python` or `python3`.
For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/handlemoodlesubmissions.html)
#### Unzip submission files and rename them
#### Rename scanned PDFs
Assuming that ./submissions.zip is the zip file containing all submissions and ./Bewertungen.csv is the grading worksheet
Assuming that `./tests/assets/pdfs_scan` is the folder containing all scans, `./tests/assets/Grades.csv` is the grading worksheet and `./tests/assets/pdfs` is the output folder where all the renamed scans will be placed
```bash
python handlemoodlesubmissions.py --inzip ./submissions.zip --outfolder ./pdfs --csv ./Bewertungen.csv
python renamescans.py ./tests/assets/pdfs_scan ./tests/assets/Grades.csv ./tests/assets/pdfs
```
For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/renamescans.html)
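
The renaming rule (scan order equals alphabetical student order) can be sketched as follows. This is an illustration only: `plan_renames` is a hypothetical helper, the student list is passed in directly instead of being read from the grading worksheet, and sorting by last name stands in for whatever order the worksheet actually uses.

```python
from pathlib import Path


def plan_renames(scans, students):
    """Pair scans (sorted by file name) with students (sorted by last name).

    students: iterable of (matriculation_number, last_name) tuples.
    Returns a list of (old_name, new_name) pairs; no files are touched.
    """
    scans = sorted(scans)
    students = sorted(students, key=lambda s: s[1].lower())
    if len(scans) != len(students):
        raise ValueError("number of scans and students differ")
    return [
        (scan, f"{matnum}_{lastname[0].upper()}{Path(scan).suffix}")
        for scan, (matnum, lastname) in zip(scans, students)
    ]


pairs = plan_renames(
    ["scan_002.pdf", "scan_001.pdf"],
    [("123002", "Zimmer"), ("123001", "Albers")],
)
print(pairs)
```

Refusing to proceed when the counts differ is a deliberate choice in this sketch: a single missing scan would otherwise shift every later exam to the wrong student.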
#### Prepare copies of Sample Solutions for each student (Optional)
We assume that the folder `./supplements` holds the scans of the sample solution.
Assuming that the folder `./tests/assets/supplements` holds the scans of the sample solution/additional materials, `./tests/assets/Grades.csv` is the grading worksheet and `./tests/assets/pdfs` is the output folder where all the renamed materials will be placed
```bash
python supplements.py
python supplements.py ./tests/assets/supplements ./tests/assets/Grades.csv ./tests/assets/pdfs
```
Folder `supplements_out` contains copies of the sample solutions for each student.
For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/supplements.html)
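
Creating one copy of each supplement per student can be pictured roughly like this. The sketch is not the code of `supplements.py`: `copy_supplements` is a hypothetical helper, the students are given as a plain list, and the output naming scheme is an assumption made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_supplements(supp_dir, students, out_dir):
    """Copy every PDF in supp_dir once per student into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for matnum, lastname in students:
        for src in sorted(Path(supp_dir).glob("*.pdf")):
            # Assumed naming scheme: <matnum>_<lastname>_<original name>
            dst = out_dir / f"{matnum}_{lastname}_{src.name}"
            shutil.copyfile(src, dst)
            created.append(dst.name)
    return created


with tempfile.TemporaryDirectory() as tmp:
    supp = Path(tmp, "supplements")
    supp.mkdir()
    (supp / "solution.pdf").write_bytes(b"%PDF-1.4\n")  # dummy file
    names = copy_supplements(
        supp, [("123001", "Albers"), ("123002", "Zimmer")], Path(tmp, "pdfs")
    )
    print(names)
```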
#### Watermark the submissions
We assume that the folder `./pdfs` holds the scans of the exams.
The filename of each PDF should start with the matriculation number of the student, e.g. `./pdfs/123456_Lastname.pdf`.
Assuming that the folder `./tests/assets/pdfs` holds the scans of the exams and the filenames follow the format *<Matriculation number>_<Lastname/first letter of Lastname>*, `./tests/assets/pdfs_watermarked` is the output folder where all the watermarked PDFs will be placed, and `--cores 2` sets the number of cores used for parallel processing
```bash
python watermark.py --in ./pdfs --out ./pdfs_watermarked --cores 2
python watermark.py ./tests/assets/pdfs ./tests/assets/pdfs_watermarked --cores 2 --dpi 150 --quality 75
```
Folder `pdfs_watermarked` contains watermarked PDFs, with each page watermarked with the matriculation number of the student.
**TIP:** Play around with `dpi` and `quality` parameters according to your requirements. Higher values for these two will result in high resolution PDFs of bigger size (ideal for when the number of files is low). Lower values will result in PDFs having lower file size and low resolution (ideal when the number of files is high)
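
To get a feeling for the `dpi` trade-off, the pixel dimensions of an A4 page can be computed directly. This is a back-of-the-envelope sketch, assuming that `--dpi` is the resolution at which each A4 page is rasterised; `a4_pixels` is a hypothetical helper and the resulting file size additionally depends on `--quality`.

```python
A4_MM = (210, 297)  # ISO 216 A4, width x height in millimetres
MM_PER_INCH = 25.4


def a4_pixels(dpi):
    """Pixel dimensions of one A4 page rasterised at the given dpi."""
    return tuple(round(side / MM_PER_INCH * dpi) for side in A4_MM)


for dpi in (150, 300):
    width, height = a4_pixels(dpi)
    print(dpi, width, height, width * height)
```

Doubling `dpi` roughly quadruples the number of pixels per page, which is why large batches benefit from lower values.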
#### Watermark Sample solution copies
We assume that the folder `./supplements_out` holds the copies for every student
```bash
python watermark.py --in ./supplements_out --out ./pdfs_watermarked --cores 2
```
For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/watermark.html)
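
Since each file name has to start with the matriculation number, a quick check before watermarking can catch misnamed files. The following is a minimal sketch with a hypothetical helper `matriculation_number`; it is not the parser used by `watermark.py`.

```python
import re

# Leading digits up to the first underscore, e.g. "123456_Lastname.pdf"
MATNUM_RE = re.compile(r"^(\d+)_")


def matriculation_number(filename):
    """Return the matriculation number prefix, or None if it is missing."""
    match = MATNUM_RE.match(filename)
    return match.group(1) if match else None


for name in ("123456_Lastname.pdf", "123456_L.pdf", "notes.pdf"):
    print(name, matriculation_number(name))
```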
#### Encrypt the files
Use either a global password by specifying it with the `--password` option or per-student passwords by omitting `--password`.
Assuming that the folder `./tests/assets/pdfs_watermarked` holds the files to encrypt, `./tests/assets/pdfs_encrypted` is the output folder where all the encrypted PDFs will be placed and providing a common password for all encrypted PDFs with the `--password` option
```bash
python encrypt.py --in ./pdfs_watermarked --out ./pdfs_encrypted --password ganzgeheim
python encrypt.py ./tests/assets/pdfs_watermarked ./tests/assets/pdfs_encrypted --password ganzgeheim
```
**TIP:** If you omit the `--password` option, a randomly generated password is set for each PDF.
Folder `./pdfs_encrypted` contains all encrypted PDFs as well as `passwords.csv`, mapping the password of each PDF to the matriculation number.
For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/encrypt.html)
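
The per-student password mapping can be illustrated with the standard library alone. Note the assumptions: the dependency list names `pwgen` for the real tool, whereas this sketch swaps in Python's `secrets` module; the length of 8 and the alphanumeric alphabet are guesses based on the sample passwords; `random_password` and `passwords_csv` are hypothetical helpers.

```python
import csv
import io
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # assumed character set


def random_password(length=8):
    """Draw a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def passwords_csv(matnums, length=8):
    """Return CSV text with one 'matriculation number,password' row each."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for matnum in matnums:
        writer.writerow([matnum, random_password(length)])
    return buffer.getvalue()


print(passwords_csv(["123001", "123002"]))
```

Using `secrets` rather than `random` matters here because the passwords protect exam data and should not be predictable.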
#### Prepare for Moodle batch upload
This step prepares the PDFs for upload to Moodle. First, the grading table `Bewertungen.csv` has to be downloaded from Moodle via:
`Alle Angaben anzeigen` &#8594; `Bewertungsvorgang` &#8594; `Bewertungstabelle herunterladen`.
This step is needed since Moodle does not only need matriculation number, but also last and first name as well as an internal user id, which is stored in `Bewertungen.csv`.
Assuming that the folder `./tests/assets/pdfs_encrypted` holds the scans to be uploaded to Moodle, `./tests/assets/Grades.csv` is the grading worksheet and `./tests/assets/out` is the output folder where the zip archive will be placed
```bash
python preparemoodleupload.py --in ./pdfs_encrypted --csv ./Bewertungen.csv --out ./moodle_feedbacks.zip
python preparemoodleupload.py ./tests/assets/pdfs_encrypted ./tests/assets/Grades.csv ./tests/assets/out
```
Then, you can upload `moodle_feedbacks.zip` in Moodle:
`Alle Angaben anzeigen` &#8594; `Bewertungsvorgang` &#8594; `Mehrere Feedbackdateien in einer Zip-Datei hochladen`
Further remarks:
For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/preparemoodleupload.html)
* Exemplary zip archive `moodle_feedbacks.zip` can be downloaded [here](https://git.rwth-aachen.de/IENT/exam-scan/-/jobs/artifacts/master/download?job=test).
* You can also conduct a dry run (neither folders nor zip file are created) via `./preparemoodleupload.sh --dry [...]`
#### Batch job
Or do everything in one step
You can run all of the above steps with a single script as follows:
```bash
python batch.py --in ./pdfs --out ./out --cores 2 --password ganzgeheim --csv ./Bewertungen.csv --supinfolder ./supplements --sup 1 --zip ./submissions.zip
python batch.py ./tests/assets/pdfs ./tests/assets/Grades.csv ./tests/assets/out --cores 2 --password ganzgeheim --suppinfolder ./tests/assets/supplements --supp
```
with folder `out` containing `passwords.csv` and `moodle_feedbacks.zip`.
In this case, the folder `./tests/assets/out` contains both `passwords.csv` and the zip archive.
For more info on the scripts and additional arguments refer to the [Documentation](https://rwthmoodle.pages.rwth-aachen.de/exam-scan/batch.html)
## Original Authors
......
......@@ -87,6 +87,10 @@ def main(args):
    starttime = time.time()
    # Check folders
    if not os.path.exists(outfolder):
        os.makedirs(outfolder)
    # Unzip submissions if provided zip archive
    if inzip != "0":
        if not os.path.exists(infolder):
......
......@@ -16,4 +16,8 @@ Note that all Python packages are listed in the file `requirements.txt`.
## `preparemoodleupload.py`
* zip
* zipfile (version 0.02)
## `renamescans.py`
* pyzbar (version: 0.1.8)
\ No newline at end of file
......@@ -87,6 +87,10 @@ def main(args):
    tmp_folder = args.tmp
    extracted_folder = os.path.join(tmp_folder, "extracted_from_moodle")
    # Check folders
    if not os.path.exists(outfolder):
        os.makedirs(outfolder)
    # Print status
    starttime = time.time()
    num_students = moodle.get_student_number(sheet_csv=sheet_csv,
......
123001,u4KZmFV9
123002,9nnhC4nv
123010,AjH1vhi2
123011,n79WZpm4
......@@ -192,6 +192,12 @@ def main(args):
    csv_enc = args.csvenc
    size_limit = int(args.moodleuploadlimit)  # Moodle upload size limit in MiB
    # Check folders
    # os.path.dirname is portable and returns "" when outzip has no folder part
    zip_dir = os.path.dirname(outzip)
    if zip_dir and not os.path.exists(zip_dir):
        os.makedirs(zip_dir)
    # Print status
    starttime = time.time()
    num_students = moodle.get_student_number(sheet_csv=sheet_csv,
......
......@@ -75,6 +75,10 @@ def main(args):
    csv_enc = args.csvenc
    check_qr = args.checkqr
    # Check folders
    if not os.path.exists(outfolder):
        os.makedirs(outfolder)
    # Print status with total number of lines
    starttime = time.time()
    dryout = ""
......
......@@ -125,6 +125,10 @@ def main(args):
    csv_quote = args.csvquote
    dry = args.dry
    # Check folders
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    # Decide whether PDF folder or CSV file was given
    csvfilename = pdf_dir = ""
    ext = os.path.splitext(prefixinfo)[1].lower()
......