-
Notifications
You must be signed in to change notification settings - Fork 4
Releases: 0.9.0, 0.9.1 & 0.9.2 - merge into main
#311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* fix: Improve logging when feedback generation fails * chore: Remove an unused dataclass, fix some typos and type annotations * fix: Fix a potential security issue when running Git commands in a subprocess When running multiple commands with `sh -c ...`, it would be easier to construct the variable "message" so that it contained some executable code. * fix: Fix a typo in the log message after a successful autograde job * fix: Do not try to generate feedback for notebooks created by students If the instructor specifies '*.ipynb' in the whitelist of an assignment, additional notebooks created by students are also copied over. The feedback generator should not try to generate feedback for them - they are not in the gradebook and an attempt to do so would fail. Refs: #272 * fix: Improve error handling and logging in GitBaseHandler.git_reponse()
* refactor: changed load_roles dict entries to lists * chore: added new load_roles style to dev env * Update a docstring --------- Co-authored-by: Natalia Maniakowska <[email protected]>
…290) * chore: Update the openapi specification to make it less out-of-date * chore: Regenerate the API models * fix: Remove (unused) 'grading_failed' from ManualStatus * chore: Use enums for submission statuses in the code * fix: Use StrEnum for status enums (makes them json-serialisable) * chore: DB migration to use submission status enums
* chore: Add migration: switch from using user.name as PK to user.id (WIP) * refactor: Use sqlalchemy ORM (not raw SQL) to update values in migration * chore: Update schema and recreate models: user id * chore: Update the migration to also add user_id to `api_token`, `oauth_code` tables * chore: Update alembic migration README * chore: Update some docstrings to make them more explanatory * chore: Add user_id to ORM classes for APIToken and OAuthCode and update their usage * chore: Update ORM classes for User, Submission, Role: add user_id * chore: [WIP] Replace usage of `username` with `user_id` in the code +cleanup: remove a few unused test helper functions * chore: [WIP] Replace some more usages of username with user_id * chore: Fix some further user_id usages, split a few too long lines * chore: Fix some tests that I broke, refactor a test helper function * chore: Fix even more user.id and username usages * style: Fix typos, string formatting, and a copy-paste mistake in an error msg * chore: Fix failing tests, add a helper function for creating a student * chore: Small improvements and cleanup in submissions handler tests * chore: Add type annotations to authentication-related methods in base handler * refactor: Extract a csv-writting method in submission handler * chore: Remove unused role check in submission handler The appropriate role is already enforced by the @Authorise decorator. * chore: Fix the LectureStudentsHandler (get name from User), add a test Also: rewrite an inacurrately named test for a submission handler * chore: Fix user name usages in autograding, values.yaml, and the docs * refactor: Slightly simplify SubmissionHandler.post * fix: Set the submission user to student when instructor creates a submission for a student * refactor: Use BaseHandler.construct_git_dir where possible, validate the constructed path * fix: Use GitRepoType instead of str in Git base handler * style: Improve parsing of the pathlets in GitBaseHandler.gitlookup * fix: Do not autograde submissions created for student by instructor This fixes an error in LocalAutogradeExecutor where the `git checkout 000...0" failed for submissions newly created by an instructor. * fix: Fix failing tests * fix: Retrieve the submission more carefully in gitlookup() * chore: Write tests for submission creation with POST request * fix: Fix a bug when a repo is edited by an instructor * fix: Update submission post body in OpenAPI schema * fix: Replace the use of username in submission's delete(), fix a test
… on running tests (#300) * chore: Write tests for the edit handler (create or reset edit repo) * fix: Fix a failing test; configure git in GH actions * chore: Produce coverage report when running the tests
* WIP: updated migrations; postgres does not work yet with data inside * WIP: added db migration tests * ci: added docker network * fix: implemented PR feedback * chore: changed gitignore to allow grader config files * fix: added new config * fix: changed db init name * test: automatically clean postgresql db after test to avoid side effects * test: added data integrity test * Update grader_service/tests/migrate/test_migrate.py Co-authored-by: Natalia Maniakowska <[email protected]> --------- Co-authored-by: nadja-jovancevic <[email protected]> Co-authored-by: Natalia Maniakowska <[email protected]>
* Fix display_name in token and base_handler * Simplifying display_name * Allow display_name in load_roles * Update docstring for load_roles
* fix: make celery lti task synchonous * fix: removed placeholder aud from bearer token func
| raise HTTPError(404, reason="Assignment not found") | ||
|
|
||
| # create directories once we know they exist in the database | ||
| lecture_path = os.path.abspath(os.path.join(self.gitbase, lect_code)) |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
This path depends on a user-provided value.
This path depends on a user-provided value.
This path depends on a user-provided value.
This path depends on a user-provided value.
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
To fix this problem, we should restrict lect_code – used in the filesystem path – to contain only safe characters and never allow path traversal or special characters that could escape the intended directory structure.
For this, the best practice is to validate and sanitize lect_code immediately after retrieving it from the database, before using it to construct any filesystem path. Restricting lecture codes to alphanumerics, underscores, and dashes is a suitable, minimal set. If the application needs more, this set can be modified as required.
- Introduce a sanitization/helper function, e.g.,
sanitize_fs_component(name: str) -> str, or simply perform an in-place validation with a regex such as^[A-Za-z0-9_\-]+$. - If the value does not match this pattern, return a 404 or 400 error.
- Apply this check before line 135, ideally right after extracting or looking up
lect_code. - Import the
remodule to use regular expressions for the validation.
-
Copy modified line R12 -
Copy modified lines R130-R133
| @@ -9,6 +9,7 @@ | ||
| from pathlib import Path | ||
| from string import Template | ||
| from typing import List, Optional | ||
| import re | ||
|
|
||
| from sqlalchemy.orm.exc import MultipleResultsFound, NoResultFound | ||
| from tornado.ioloop import IOLoop | ||
| @@ -126,6 +127,10 @@ | ||
| role = self.session.get(Role, (self.user.id, lecture.id)) | ||
| self._check_git_repo_permissions(rpc, role, pathlets) | ||
|
|
||
| # Ensure lect_code is safe to use as a filesystem component | ||
| if not re.match(r"^[A-Za-z0-9_\-]+$", lect_code): | ||
| raise HTTPError(400, reason="Invalid lecture code") | ||
|
|
||
| try: | ||
| assignment = self.get_assignment(lecture.id, int(assign_id)) | ||
| except ValueError: |
|
|
||
| # create directories once we know they exist in the database | ||
| lecture_path = os.path.abspath(os.path.join(self.gitbase, lect_code)) | ||
| assignment_path = os.path.abspath(os.path.join(lecture_path, assign_id)) |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
This path depends on a user-provided value.
This path depends on a user-provided value.
This path depends on a user-provided value.
This path depends on a user-provided value.
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
To prevent path traversal attacks, ensure that any constructed file system path that includes client-supplied data is not able to escape a defined root directory. After constructing the assignment_path by joining the root directory (lecture_path) with the assign_id, normalize the path (using os.path.abspath(), as already done) and then verify that the resulting path begins with the intended base path (in this case, lecture_path). If the constructed path does not start with the intended base path, raise an error or reject the operation.
Specifically, after assigning assignment_path, add a check:
if not assignment_path.startswith(lecture_path + os.sep):
raise HTTPError(403, "Invalid assignment path")This ensures that any manipulation of assign_id or other variables cannot result in a file being created outside of the designated location.
- Edit
grader_service/handlers/git/server.py, in thegitlookupmethod. - After constructing
assignment_path, add the validation check shown above. - Ensure to import
os(already present).
-
Copy modified lines R137-R139
| @@ -134,6 +134,9 @@ | ||
| # create directories once we know they exist in the database | ||
| lecture_path = os.path.abspath(os.path.join(self.gitbase, lect_code)) | ||
| assignment_path = os.path.abspath(os.path.join(lecture_path, assign_id)) | ||
| # Ensure that the constructed assignment_path is contained within the intended lecture_path. | ||
| if not assignment_path.startswith(lecture_path + os.sep): | ||
| raise HTTPError(403, "Invalid assignment path") | ||
| if not os.path.exists(lecture_path): | ||
| os.mkdir(lecture_path) | ||
| if not os.path.exists(assignment_path): |
|
@florian-jaeger please merge - then we can create the |
* refactor: Start refactoring the LocalAutogradeExecutor class Split methods into smaller ones, improve logging and error handling * refactor: [WIP] Prepare GenerateFeedbackExecutor for refactoring Reflect some of the changes from the base class, LocalAutogradeExecutor, and improve error handling etc. * fix: Fix a circular import * chore: Make all helper methods in kube grader 'private' (for consistency) * cleanup: Remove redundant recreation of output dirs in graders * refactor: [WIP] Extract git-related logic from LocalAutogradeExecutor and derrived classes Also: - move getting all whitelist patterns to a method in Assignment (seems to fit there better), - remove some duplicated code from the child classes of the AutogradeExecutor, sometimes making their behaviour more consistent with the parent, - add missing type annotations and improve a few clearly outdated docstrings, etc. * tests: Add tests for the LocalAutogradeExecutor * refactor: Remove explicit gradebook file deletion (is done in _cleanup) * fix: Fix writing gradebook in GenerateFeedbackProcessExecutor * fix: Remove unused attributes from GenerateFeedbackExecutor * test: Write tests for local feedback generation executors * fix: Use the full commit hash when creating a submission in tests setup * fix: Fix reading stderr in the process executors process.stderr is a str already, no need to read().decode() it - it fails * fix: Remove unexpected (and unused) param from GenerateFeedbackApp.start() * cleanup: Clean up fixtures for autograding tests * fix: Re-add missing updating of submission logs; log process errors more consistently * tests: Add tests for LocalProcessAutogradeExecutor, improve tests for local feedback * refactor: Simplify autograding/feedback log collection * refactor: Improve running subprocesses, adjust and add tests * refactor: Pass git commands to subprocess.run as lists, not strings We execute the git commands with variables potentially provided by users (filenames, usernames, etc.). Passing them as strings and then splitting, potentially could enable constructing these variables in such a way so as to execute malicious commands. An alternative would be to use `shlex.quote` with any variables, but it seems more cumbersome. * refactor: Small tweaks in the local grader and local feedback, update docstrings * tests: Add tests for the collect_logs context manager * fix: Feedback repo path, do not save feedback logs in submission.logs Until now feedback logs have also not been appended to submission.logs. * chore: Small improvements in tests * fix: Only commit html files when generating feedback Previously, all student-supplied files were included in the feedback, even though feedback is actually only generated for the jupyter notebooks that are included in the gradebook. Now only the generated html files will be included in the feedback files. * fix: Protect against fabricated usernames in git repo paths in autograders * refactor: Move git manager to a separate module * chore: Small cleanup in autograde executors * chore: Make autograding/feedback executor class names more consistent * fix: Do not write `self.grading_logs` in local feedback executors We don't use these logs anywhere, so it's not necessary to collect them. All the logs are still logged with the regular `self.log` logger.
* WIP: added groups attribute to AssignmentSettings * fix: add deleted lines back * fix: add missing attribute (group) in test_base_handler.py
* chore: Small improvements in the docker-compose setup - remove the network used by all containers (dc creates it by default) - set default DATABASE_TYPE to sqlite => simpler dc up commands - change restart policy to unless-stopped => containers don't restart after system restart (which was annoying in local development) * fix: Fix the initial SQL data import for local docker-compose setup * fix: added psycog2 package to init container and set grader network * fix: Remove postgres-specific syntax from the initial data file `E'\n'` is invalid syntax in sqlite. It was used in `replace(<very long logs as a string>, '\n', E'\n')` and `replace(<very long logs as a string>, '\012', E'\n')`. I just replaced these values (`\n` and `\012`) with newlines. * cleaup: Remove psycopg from Dockerfile (not needed, after all) * fix: Fix database initialisation in docker compose for sqlite and postgres * fix: update docker compose docs * fix: Fix a comment * chore: Use uv to install python packages in containers in Docker Compose setup * chore: Make the various init bash scripts for containers more consistent * chore: Also use uv in jupyterhub Dockerfile * cleanup: Remove a redundant `expose` from docker compose file * fix: Remove submission data from the initial db import We don't include the submission files and repos in the initial setup, and creating the database objects without them doesn't make much sense, because one cannot do anything with such submissions in grader service. * fix: Simplify python deps installation in Dockerfile-Service * fix: Only load one assignment as initial db data in docker compose setup * fix: Set the initial data assignment deadline to a future date * fix: Small adjustment in the init assignment data * chore: Add initial 'source' repo for docker compose setup * chore: Add initial 'release' repo for docker compose setup * chore: Make initial repos available in the docker compose setup * fix: Fix celery worker container command * fix: Add `test` dependency group to pyproject.toml, upgrade dev and test deps. * fix: Replace deprecated `set-output` with $GITHUB_OUTPUT * chore: Delete unused .gitlab-ci.yml file * fix: Tiny improvements in docker compose setup (better caching, get rid of a warning) * fix: Fix github action check for Dockerfile changes Previously, the check would only output directory names, so the check for the "Dockerfile" name would never match. I added a separate step that checks specifically if the Dockerfile has changed. --------- Co-authored-by: florian.jaeger <[email protected]>
No description provided.