Manual grading of programming assignments was time-consuming and inconsistent, and it delayed feedback for students. Tasks were previously designed without a clear structure. There was a need for consistent exercises and an automated evaluation workflow to improve learning efficiency and scalability.
I designed and implemented programming exercises for a Python course at Rhine-Waal University of Applied Sciences and developed automated assessment workflows using CodeRunner in Moodle. My focus was on creating clear, practical tasks that help students learn programming concepts efficiently while ensuring reliable and fair automatic grading.
Key results of my work:
With the automatic grader, I saved 15 working hours per week (165 hours per semester) previously spent on manual grading.
CodeRunner reduced feedback time for students from one week to instant.
The exam pass rate went up by 16%.
Consistent formulation of exercises helped students solve tasks faster, which made it possible to fit up to 10 tasks into a single class.
Designed original Python programming exercises for university-level courses.
Structured tasks progressively from fundamental syntax to applied problem solving.
Focused on clarity, reproducibility, and practical relevance of exercises.
Ensured tasks support both independent study and guided exercise sessions.
Step-by-step learning progression in Python programming.
Immediate feedback through automatic grading.
Clear task formulations and expected outputs.
Practice with real programming workflows rather than isolated theory.
Algorithmic Thinking & Problem Structuring
Python Fundamentals
Data Types and Data Processing
Control Structures and Functions
Sequential Data and Text Processing
Numerical Computing and Visualization
Matrices, Nested Loops and Numerical Algorithms
Recursion and Advanced Problem Solving
Object-Oriented Programming Basics
Debugging and Code Reliability
Keeping in mind that first-year students may lack prior experience, I formulated tasks in a clear and structured way, with concrete expected inputs and outputs. Later, when designing the automatic grading system, this avoidance of abstract formulations helped me set up the grading templates faster.
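For illustration, a hypothetical task in this style (invented here, not taken from the protected exercise PDF), together with a reference solution:

Task: Write a function count_vowels(text) that returns the number of vowels (a, e, i, o, u) in the string text. Expected: count_vowels("Moodle") returns 3.

# Reference solution (illustrative)
def count_vowels(text):
    return sum(1 for ch in text.lower() if ch in "aeiou")

print(count_vowels("Moodle"))  # 3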
The PDF file with the exercises I designed is available below.
It is protected by Rhine-Waal University of Applied Sciences regulations on unauthorized use.
Setting up automatic grading was part of the modernization of the programming curriculum. Previously, student submissions were uploaded and reviewed manually by tutors or teachers, which was time-consuming, inconsistent, and delayed feedback for students. Introducing an integrated code-execution and grading system streamlined this process and improved the overall learning workflow.
The CodeRunner plugin in Moodle enabled automatic execution of Python programs directly within the learning platform. It supports various evaluation approaches, including output comparison, unit-style testing, and custom grading logic, allowing flexible assessment depending on the task type. This integration made it possible to provide immediate feedback to students, ensure more consistent grading standards, and reduce administrative workload for teaching staff while maintaining a structured and reliable evaluation process.
Direct comparison grading simply compares a student's output with a pre-defined expected output. The disadvantage of this approach is that the output must match exactly, which could cost students points when their code produced slightly different results from the expected ones (e.g., due to using different packages for math functions).
An example of direct comparison grading:
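A minimal sketch of such a test case (the function name squares is illustrative, not from the course material): CodeRunner runs the student's submission followed by the test code, then compares the printed text character by character with the expected output.

# Test code (runs after the student's submission):
print(squares(4))

# Expected output (must match exactly):
[1, 4, 9, 16]

Even a correct solution that returns a tuple or prints extra whitespace would fail this comparison, which is exactly the rigidity described above.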
Template grading offers a variety of automatic grading techniques of increasing complexity and requires more time to set up. Its key advantage over direct comparison is that the student's printed output can be ignored entirely. Instead, the template compares the variables created by the student's code to pre-defined "solution" variables and provides step-by-step feedback on the given code.
An example of template grading for the following task:
Write a function named diagonal_matrix(n) with an integer input n. It should construct an n × n matrix whose diagonal contains the numbers from 1 to n and whose other elements are all 0.
Define the required matrix as an n × n NumPy array of zeros using np.zeros(), then fill it in with values by applying A[i][j] = ....
import subprocess, json, sys

student_func = """{{ STUDENT_ANSWER | e('py') }}"""
code = student_func

# Require use of NumPy
if "import numpy" not in code and "np." not in code:
    output = "Expected solution using NumPy (e.g., import numpy as np)."
    result = {'got': output, 'fraction': 0}
    print(json.dumps(result))
    sys.exit(0)

# Forbid shortcuts for diagonal matrices
forbidden_snippets = [
    "np.diag",
    "np.eye"
]
for snippet in forbidden_snippets:
    if snippet in code:
        output = (
            f"Forbidden construct detected in solution: '{snippet}'. "
            "You must construct the matrix manually (e.g., with a loop)."
        )
        result = {'got': output, 'fraction': 0}
        print(json.dumps(result))
        sys.exit(0)

# Require at least one for-loop
if "for " not in code:
    output = "Expected solution that uses at least one for-loop to fill the matrix."
    result = {'got': output, 'fraction': 0}
    print(json.dumps(result))
    sys.exit(0)

# Build a test program: the student's code plus checks against a reference solution
test_program = """
import numpy as np

{{ STUDENT_ANSWER | e('py') }}

ok = True
if "diagonal_matrix" in globals():
    print("Function diagonal_matrix exists.")
else:
    ok = False
    print("Function diagonal_matrix(n) is not defined.")

# Reference implementation to compare against
def diagonal_matrix_ref(n):
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        A[i, i] = i + 1
    return A

test_values = [3, 5, 10]
for n in test_values:
    if not ok:
        break
    print(f"\\nTesting n = {n}...")
    try:
        A = diagonal_matrix(n)
    except Exception as e:
        ok = False
        print(f"Error when calling diagonal_matrix({n}): {e}")
        break
    try:
        A_np = np.array(A)
    except Exception as e:
        ok = False
        print(f"Could not convert result of diagonal_matrix({n}) to a NumPy array: {e}")
        break
    A_r = diagonal_matrix_ref(n)
    # Shape check
    if A_np.shape == A_r.shape:
        print(f"Shape for n={n} is OK.")
    else:
        ok = False
        print(f"Shape for n={n} incorrect. Expected: {A_r.shape}, got: {A_np.shape}.")
        break
    # Value check
    if np.array_equal(A_np, A_r):
        print(f"Values for n={n} are OK.")
    else:
        ok = False
        print(f"Values for n={n} incorrect.\\nExpected:\\n{A_r}\\nGot:\\n{A_np}")
        break

if ok:
    print("\\nAll good!")
else:
    print("\\nNot all tests have been successful!")
"""

# Run the assembled test program and capture everything it prints
try:
    with open('code.py', 'w') as fout:
        fout.write(test_program)
    output = subprocess.check_output(
        ['python3', 'code.py'],
        stderr=subprocess.STDOUT,
        universal_newlines=True
    )
except subprocess.CalledProcessError as e:
    output = e.output

# Full marks only if every check passed
mark = 1 if "All good!" in output.strip() else 0
result = {'got': output, 'fraction': mark}
print(json.dumps(result))
An example student submission for this task:

import numpy as np

def diagonal_matrix(n):
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i == j:
                A[i][j] = i + 1
    return A
The template grader makes it possible to check for the existence of variables and functions, as well as to run multiple test cases in a loop.
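For the correct submission above, the grader's feedback, reconstructed here from the grader logic (the original screenshot is not reproduced), would read approximately:

Function diagonal_matrix exists.

Testing n = 3...
Shape for n=3 is OK.
Values for n=3 are OK.

Testing n = 5...
Shape for n=5 is OK.
Values for n=5 are OK.

Testing n = 10...
Shape for n=10 is OK.
Values for n=10 are OK.

All good!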
The complexity and variety of the checks depend on the task and on which properties need verification. While implementing the exercises, I developed and used a range of different tools and templates.
Template grading also supports checking graphics and objects, which was vital for the corresponding topics.
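As a sketch of the idea (an assumed illustration, not the original template): a plot-producing submission can be checked by running it under a non-interactive backend and then inspecting the resulting Axes object instead of any printed output.

import matplotlib
matplotlib.use('Agg')  # headless backend: no display needed on the grading server
import matplotlib.pyplot as plt
import numpy as np

# The student's code would be substituted here via {{ STUDENT_ANSWER | e('py') }};
# for illustration, assume the task was to plot y = x**2 on [0, 10]:
x = np.linspace(0, 10, 50)
plt.plot(x, x ** 2)

# Inspect what was actually drawn
ax = plt.gca()
lines = ax.get_lines()
if not lines:
    print("No line was plotted.")
elif np.allclose(lines[0].get_ydata(), np.asarray(lines[0].get_xdata()) ** 2):
    print("All good!")
else:
    print("Plotted values do not match y = x**2.")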
Developing Python programming exercises and implementing automated assessment in Moodle involved both technical and educational challenges. One key difficulty was designing tasks suitable for students with very different skill levels while keeping a clear learning progression. Exercises needed to be precise enough for automatic grading but still flexible enough to allow different valid solution approaches.
Another major challenge was implementing reliable automatic checking with CodeRunner. Test cases had to account for formatting differences, numerical precision, and alternative coding styles while still assessing correctness fairly. Ensuring that automated feedback remained helpful for learning rather than purely evaluative was also important.
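A minimal sketch of the kind of tolerant check this requires (the helper outputs_match is illustrative, not taken from the original tests):

import math

def outputs_match(student_line, expected_value, rel_tol=1e-6):
    # Parse the student's printed number and compare within a relative
    # tolerance instead of demanding an exact string match.
    try:
        student_value = float(student_line.strip())
    except ValueError:
        return False
    return math.isclose(student_value, expected_value, rel_tol=rel_tol)

print(outputs_match("3.1415927\n", math.pi))  # True despite rounding in the last digit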
Finally, maintaining consistency between exercise design, Moodle integration, and Python environments required continuous testing and adjustments to ensure smooth and reliable operation.
Exercises, solutions, and scripts were created by Grigorii Fediakov, a student of Rhine-Waal University of Applied Sciences, under the supervision of Prof. Dr. Matthias Krauledat, Rhine-Waal University of Applied Sciences.
LinkedIn | +1 (917) 916 4549 | Brooklyn, NY 11226 | feduakov17@gmail.com