Commit 19ab6a2

committed: changes to the report
1 parent 379ff28 commit 19ab6a2

File tree

2 files changed: +5 −4 lines


content/authors/alghali/_index.md

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ authors:
 superuser: false
 
 # Role/position
-role: "underaduate Computer Science student at The University of Khartoum"
+role: "undergraduate Computer Science student at The University of Khartoum"
 
 # Organizations/Affiliations
 organizations:
@@ -20,7 +20,7 @@ organizations:
 
 
 # Short bio (displayed in user profile at end of posts)
-bio: Ahmed Alghali is an undergraduate in Computer Science at the University of Khartoum with interest in applied machine learning and data platforms.
+bio: Ahmed Alghali is an undergraduate Computer Science student at the University of Khartoum with interest in applied machine learning and data platforms.
 
 
 # Social/Academic Networking

content/report/osre25/ucsc/06212025-alghali/index.md

Lines changed: 3 additions & 2 deletions
@@ -39,13 +39,14 @@ The same way the famous paper about the [repoducibility crisis in science](https
 
 The lack of software dependency management, proper version control, log tracking, and effective artifacts sharing made it very difficult to reproduce research in machine learning.
 
-Reproducibility in ML is largely driven by well-established MLOps practices.However, in academic settings reproducibility remains a great challenge, the adaptation and standardization of these practices progress slowly, the best way to ensure is to seamleas experience with MLOps, is to make these capabilities are easily accessible to the researchers' workflow. by developing a tool that steamlines the process of provisioning resources, enviornment setup, model training and artifacts tracking, that ensures reproducible results.
+Reproducibility in machine learning is largely supported by MLOps practices. This is the case in industry, where most researchers are backed by software engineers who set up experimental environments or build tools that streamline the workflow. In academic settings, however, reproducibility remains a great challenge: researchers prefer to focus on coding and worry little about the complexities involved in configuring their experimental environment. As a result, the adoption and standardization of MLOps practices in academia progress slowly. The best way to ensure a seamless experience with MLOps is to make these capabilities easily accessible from the researchers' workflow, by developing a tool that streamlines resource provisioning, environment setup, model training, and artifact tracking, and thereby ensures reproducible results.
+
 
 ### Proposed Solution
 
 ![Solution Architecture](Design.png)
 
-We want researcher to spin up ML research instances/bare metal on Chameleon testbed while keeping the technical complexity involved in configuring and stitching everything together abstracted, the user answers basic questions about the project info, frameworks, tools, features and integrations if there are any and have a full generated project that is reproducible. it contains a provisioning/infrastracture config layer for provisioning resources on the cloud, a dockerfile to spin up services and presistent storage for data,the ML code at its core is backed by ML tracking server system that logs the artifacts, metadata, environment configuration, system specification (GPUs type) and Git status using Mlflow, powered by a postgresSQL for storing metadata and a S3 Minio bucket to store artifacts.
+We want researchers to spin up ML research instances/bare metal on the Chameleon testbed while abstracting away the technical complexity involved in configuring and stitching everything together. Users simply answer a few questions about their project info, frameworks, tools, features, and integrations (if any), and receive a fully generated, reproducible project. It contains a provisioning/infrastructure configuration layer for provisioning resources on the cloud, a Dockerfile to spin up services, and persistent storage for data. The ML code at its core is a containerized training environment backed by an ML tracking server that logs artifacts, metadata, environment configuration, system specification (GPU type), and Git status using MLflow, powered by PostgreSQL for storing metadata and an S3 MinIO bucket for storing artifacts.
 persistent storage for the artifacts generated from the experiment and the datasets, and containerization of all of these to ensure reproducibility. We aim to make the cloud experience easier by handling the configuration needed to set up the environment with a third-party framework, so that seamless access to benchmark datasets or any other necessary components from services like Hugging Face and GitHub is available directly from the container. For more technical details about the solution you can read my proposal [here](https://docs.google.com/document/d/1ilm-yMEq-UTiJPGMl8tQc3Anl5cKM5RD2sUGInLjLbU).
 
 