This project is the solution to the SENASoft 2025 challenge "Leonardo está incompleto" ("Leonardo is incomplete"). It is a custom Spring Boot backend that extends the capabilities of Leonardo (a ChatGPT/OpenAI Assistant) with new, detailed metrics about SENA apprentices.
Leonardo could already consume basic metrics, but needed more detailed endpoints to answer specific questions about:
- ✅ Number of registered apprentices by training center
- ✅ Recommended instructors by training center
- ✅ Registered apprentices by center and training program (4 programs allowed)
- ✅ Apprentices by Colombian department
- ✅ Apprentices with GitHub accounts
- ✅ Apprentices with B1 or B2 English level by center
This is a complete REST API backend that provides detailed metrics and analytics for SENA apprentices, training centers, and educational programs across Colombia. The API is designed to be consumed by Leonardo and other AI assistants.
- 🏗️ Well-Structured Code with proper separation of concerns
- 🔒 Type-Safe Database Projections to avoid runtime errors
- 🔐 API Key Authentication for secure access control
- 📝 API Versioning for future compatibility (`/api/v1/`)
- 📚 Interactive Documentation with Swagger UI
- 🐳 Full Docker Support with multi-stage builds and complete containerization
- ⚡ Performance Tuning with optimized queries and connection pooling
- 🤖 Leonardo Integration via included OpenAI Action schema
- 📊 Sample Data preloaded for immediate testing
- 🚨 Spring Boot Error Handling with standard error responses (temporary solution for Swagger compatibility)
- 🧪 Comprehensive Testing with 78+ unit, integration, and security tests
- 🛡️ Production Security with rate limiting, secure logging, and API key validation
The backend now includes production-ready security measures that protect against common attacks and ensure secure operation:
- Comprehensive Validation: API keys must meet minimum security requirements (32+ characters)
- Weak Pattern Detection: Automatically rejects insecure patterns like repeated words or only-numeric keys
- Startup Validation: Application fails to start if security requirements aren't met, preventing deployment with weak keys
- Log Flooding Prevention: Limits failed authentication attempts to 5 per IP address per minute
- Intelligent Log Suppression: Reduces log noise while maintaining security audit trails
- IP-based Tracking: Monitors and limits attempts per client IP address
- Automatic Reset: Rate limits reset after successful authentication
- URI Sanitization: Removes query strings to prevent sensitive parameter exposure in logs
- Consistent Format: All security logs use standardized format for better analysis
- Audit Trail: Maintains comprehensive security logs while protecting sensitive information
- Attack Prevention: Protects against brute force and log flooding attacks
- Compliance Ready: Meets security requirements for production deployments
- Monitoring Friendly: Provides clear security metrics and audit trails
- Scalable Protection: Rate limiting scales with application load
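The rate limiting rules above (a cap on failed attempts per IP per time window, with a reset after a successful authentication) can be illustrated with a minimal fixed-window sketch. The class and method names here are hypothetical and are not taken from the project's source. Log suppression is not modeled, and timestamps are passed in explicitly so the behavior is easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical fixed-window limiter for failed authentication attempts, keyed by client IP. */
final class FailedAuthRateLimiter {

    /** Immutable snapshot of one client's current window. */
    private static final class Window {
        final long startMs;
        final int failures;

        Window(long startMs, int failures) {
            this.startMs = startMs;
            this.failures = failures;
        }
    }

    private final int maxAttempts;
    private final long windowMs;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    FailedAuthRateLimiter(int maxAttempts, long windowMs) {
        if (maxAttempts < 1 || windowMs < 1) {
            throw new IllegalArgumentException("maxAttempts and windowMs must be positive");
        }
        this.maxAttempts = maxAttempts;
        this.windowMs = windowMs;
    }

    /** Records a failed attempt and returns true while the client is still within the limit. */
    boolean recordFailure(String clientIp, long nowMs) {
        Window updated = windows.compute(clientIp, (ip, old) -> {
            // Start a fresh window for a new client or once the previous window has expired.
            if (old == null || nowMs - old.startMs >= windowMs) {
                return new Window(nowMs, 1);
            }
            return new Window(old.startMs, old.failures + 1);
        });
        return updated.failures <= maxAttempts;
    }

    /** Clears the counter after a successful authentication. */
    void recordSuccess(String clientIp) {
        windows.remove(clientIp);
    }
}
```

With `new FailedAuthRateLimiter(5, 60_000L)`, the first five failures from one IP within a minute return `true` and the sixth returns `false`. Other IPs are tracked independently.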
- Backend: Spring Boot 3.5.5 with Java 17
- Database: MySQL 8.0
- Documentation: OpenAPI 3.1 / Swagger UI
- Build: Maven
- Containers: Docker & Docker Compose for full-stack deployment
- Monitoring: Spring Boot Actuator
- ORM: Spring Data JPA with Hibernate
- Mapping: Project Lombok to reduce boilerplate
Note: The custom error handling system is temporarily disabled due to a compatibility issue between Spring Boot 3.5.5 and SpringDoc 2.2.0 that prevents Swagger from generating documentation.
- ✅ All API endpoints work perfectly
- ✅ Swagger documentation is fully functional
- ✅ Leonardo can access complete API documentation
- ⚠️ Error responses use Spring Boot defaults (less elegant but fully functional)
- See SWAGGER_COMPATIBILITY_ISSUE.md for complete details
- This is a temporary solution that maintains full functionality
- The application is production-ready despite this limitation
The API implements comprehensive security measures including API Key Authentication, rate limiting, and secure logging to prevent attacks and ensure production-ready security.
- 🔑 API Key Authentication: All endpoints (except Swagger and health checks) require a valid API key
- 🛡️ Rate Limiting: Prevents log flooding attacks by limiting failed authentication attempts
- 📝 Secure Logging: Sanitizes URIs to prevent sensitive information exposure in logs
- ✅ API Key Validation: Comprehensive validation ensuring minimum security requirements (32+ characters, no weak patterns)
- 🌍 Environment Variable Configuration: API keys configured via environment variables for security
- ⚙️ Flexible Security: Can be enabled/disabled via configuration
- 🚀 Production Ready: Secure by default, suitable for AWS deployment
- Minimum Length: API keys must be at least 32 characters long
- Weak Pattern Detection: Rejects common insecure patterns (repeated words, only numbers, etc.)
- Early Failure: Application fails to start if validation fails, preventing deployment with weak keys
- Failed Authentication Limits: Maximum 5 failed attempts per IP address per minute
- Log Suppression: Reduces log flooding by suppressing repeated failed attempts
- IP-based Tracking: Monitors attempts per client IP address
- Automatic Reset: Rate limits reset after successful authentication
- URI Sanitization: Removes query strings to prevent sensitive parameter exposure
- Consistent Format: All logs use sanitized format: `METHOD /path`
- Audit Trail: Maintains security logs while protecting sensitive information
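As an illustration of the URI sanitization described above, the following sketch drops the query string and produces a `METHOD /path` log fragment. `LogSanitizer` is a hypothetical name used only for this example; the project's actual filter may be structured differently.

```java
/** Hypothetical helper that builds a log-safe "METHOD /path" string. */
final class LogSanitizer {

    private LogSanitizer() {
    }

    /** Removes everything from the first '?' onward so query parameters never reach the logs. */
    static String sanitize(String method, String requestUri) {
        String uri = requestUri == null ? "" : requestUri;
        int queryStart = uri.indexOf('?');
        String path = queryStart >= 0 ? uri.substring(0, queryStart) : uri;
        return method + " " + path;
    }
}
```

For example, `LogSanitizer.sanitize("GET", "/api/v1/metrics/scalar?token=secret")` yields `GET /api/v1/metrics/scalar`.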
Create a `.env` file in the project root:

```bash
# API Security Configuration
API_KEY=your_api_key_here_minimum_32_characters_long
API_SECURITY_ENABLED=true
# Database Configuration
DB_USERNAME=leonardo_user
DB_PASSWORD=your_password_here
MYSQL_ROOT_PASSWORD=your_root_password
```

Note: Rate limiting configuration is now handled in `application.properties` and is required for the application to start.
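Option 1: Load manually

```bash
# set -a exports every variable defined while sourcing, so child processes such as ./mvnw can see them
set -a
source .env
set +a
```

Option 2: Use the setup script (Recommended)

```bash
./scripts/setup-local-env.sh
```

This script will:
- Load all variables from your `.env` file
- Set default values if `.env` is missing
- Show you what variables are configured
- Provide helpful verification commands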
Set environment variables directly on the server:

```bash
export API_KEY=your_production_api_key_minimum_32_characters_long
export API_SECURITY_ENABLED=true
export DB_USERNAME=leonardo_user
export DB_PASSWORD=your_production_password
```

Note: Rate limiting configuration for production should be set in `application.properties` on the server.
The rate limiting behavior can be customized through configuration properties in different profile files:

```properties
# Rate limiting configuration for local development
# 5 attempts per window
api.security.rate-limit.max-attempts=5
# 1 minute window
api.security.rate-limit.window-ms=60000
# 5 minute log suppression
api.security.rate-limit.log-suppression-ms=300000
```

```properties
# Rate limiting configuration for AWS production (stricter)
# 3 attempts per window
api.security.rate-limit.max-attempts=3
# 5 minute window
api.security.rate-limit.window-ms=300000
# 10 minute log suppression
api.security.rate-limit.log-suppression-ms=600000
```

You can modify these values in the respective profile files for different environments:
```properties
# More lenient for development
# 10 attempts per window
api.security.rate-limit.max-attempts=10
# 30 second window
api.security.rate-limit.window-ms=30000
# 2 minute log suppression
api.security.rate-limit.log-suppression-ms=120000
```

```properties
# Stricter for production
# 3 attempts per window
api.security.rate-limit.max-attempts=3
# 5 minute window
api.security.rate-limit.window-ms=300000
# 10 minute log suppression
api.security.rate-limit.log-suppression-ms=600000
```

Note: These properties are required and must be defined in the appropriate profile file. The application will not start without them.
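To make the "required properties" rule concrete, here is a rough sketch of a fail-fast check over the three rate-limit keys. It uses plain `java.util.Properties` rather than Spring's own property binding, and `RateLimitSettings` is a hypothetical class, so treat it as an illustration of the idea rather than the project's implementation. It also shows why comments in a `.properties` file belong on their own line: text after `#` on a value line is parsed as part of the value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Properties;

/** Hypothetical holder for the three required rate-limit settings. */
final class RateLimitSettings {

    static final String MAX_ATTEMPTS = "api.security.rate-limit.max-attempts";
    static final String WINDOW_MS = "api.security.rate-limit.window-ms";
    static final String LOG_SUPPRESSION_MS = "api.security.rate-limit.log-suppression-ms";
    static final List<String> REQUIRED_KEYS = List.of(MAX_ATTEMPTS, WINDOW_MS, LOG_SUPPRESSION_MS);

    final int maxAttempts;
    final long windowMs;
    final long logSuppressionMs;

    private RateLimitSettings(int maxAttempts, long windowMs, long logSuppressionMs) {
        this.maxAttempts = maxAttempts;
        this.windowMs = windowMs;
        this.logSuppressionMs = logSuppressionMs;
    }

    /** Parses .properties text and fails fast if a required key is missing or not numeric. */
    static RateLimitSettings parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED_KEYS) {
            if (props.getProperty(key) == null) {
                throw new IllegalStateException("Missing required property: " + key);
            }
        }
        // A value such as "5 # comment" is kept verbatim by Properties, so parsing it
        // throws NumberFormatException instead of silently yielding 5.
        return new RateLimitSettings(
                Integer.parseInt(props.getProperty(MAX_ATTEMPTS).trim()),
                Long.parseLong(props.getProperty(WINDOW_MS).trim()),
                Long.parseLong(props.getProperty(LOG_SUPPRESSION_MS).trim()));
    }
}
```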
Include the API key in your requests:

```bash
curl -H "X-API-Key: your_api_key_here" \
  http://localhost:8080/api/v1/metrics/scalar
```

For security, API keys must meet these requirements:
- Minimum Length: 32 characters
- Pattern Validation: No repeated words, no only-numeric patterns
- Format: Alphanumeric with special characters allowed
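The requirements above could be checked at startup along the following lines. This is a sketch under assumptions: `ApiKeyValidator` is a hypothetical name, and "repeated words" is interpreted here as a key made entirely of one chunk repeated back to back, which may be narrower or broader than the project's actual rule.

```java
import java.util.regex.Pattern;

/** Hypothetical startup check for the API key rules listed above. */
final class ApiKeyValidator {

    static final int MIN_LENGTH = 32;

    private static final Pattern ONLY_DIGITS = Pattern.compile("\\d+");
    // Assumption: a "repeated word" key is a single chunk repeated two or more times.
    private static final Pattern REPEATED_CHUNK = Pattern.compile("(.+?)\\1+");

    private ApiKeyValidator() {
    }

    /** Throws IllegalStateException so that a weak key stops the application from starting. */
    static void validate(String apiKey) {
        if (apiKey == null || apiKey.length() < MIN_LENGTH) {
            throw new IllegalStateException("API key must be at least " + MIN_LENGTH + " characters long");
        }
        if (ONLY_DIGITS.matcher(apiKey).matches()) {
            throw new IllegalStateException("API key must not be numeric only");
        }
        if (REPEATED_CHUNK.matcher(apiKey).matches()) {
            throw new IllegalStateException("API key must not be a repeated pattern");
        }
    }
}
```

Calling `validate` from a configuration class constructor (or any code that runs during context startup) gives the early-failure behavior: a key such as `passwordpasswordpasswordpassword` is long enough but is still rejected.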
For detailed information about API key generation, rotation, and best practices, see:
- API Key Management Guide: Complete guide for managing API keys
- Key Generation: Use the included script `./scripts/generate-api-key.sh`
- Key Rotation: Recommended every 90 days for production
- Security: Never commit API keys to Git; use environment variables
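If the shell script is not available, a key that satisfies the length requirement can also be produced from Java. The sketch below is not what `generate-api-key.sh` does (its contents are not shown here); it is just one reasonable approach that uses `SecureRandom` and URL-safe Base64, with the hypothetical class name `ApiKeyGenerator`.

```java
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical generator for random, URL-safe API keys. */
final class ApiKeyGenerator {

    private static final SecureRandom RANDOM = new SecureRandom();

    private ApiKeyGenerator() {
    }

    /** Encodes numBytes random bytes as unpadded URL-safe Base64; 24 bytes already give 32 characters. */
    static String generate(int numBytes) {
        if (numBytes < 24) {
            throw new IllegalArgumentException("Use at least 24 random bytes to reach 32 characters");
        }
        byte[] bytes = new byte[numBytes];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

`ApiKeyGenerator.generate(36)` returns a 48-character key made of letters, digits, `-`, and `_`, which can then be stored in the `API_KEY` environment variable.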
- Java 17+
- Docker and Docker Compose
- Maven 3.6+ (or use the included wrapper)
To run the full stack in Docker:

1. Clone and navigate to the project

   ```bash
   git clone https://github.com/AlexanderIglesias/leonardo-backend
   cd leonardo-backend
   ```

2. Start the complete stack with Docker Compose

   ```bash
   cd src/main/docker
   docker-compose up -d
   ```

3. Check it's working
   - API Documentation: http://localhost:8080/swagger-ui.html
   - Health Check: http://localhost:8080/actuator/health

To run only the database in Docker and the application locally:

1. Clone and navigate to the project

   ```bash
   git clone https://github.com/AlexanderIglesias/leonardo-backend
   cd leonardo-backend
   ```

2. Start only the MySQL database container

   ```bash
   cd src/main/docker
   docker-compose up -d mysql
   ```

3. Run the Spring Boot application locally (from the project root, where the Maven wrapper lives)

   ```bash
   ./mvnw spring-boot:run
   ```

4. Check it's working
   - API Documentation: http://localhost:8080/swagger-ui.html
   - Health Check: http://localhost:8080/actuator/health
All endpoints are available under /api/v1/metrics and designed to answer the SENASoft challenge questions:
| Endpoint | SENASoft Question Addressed | Description |
|---|---|---|
| `GET /scalar` | General overview | Total apprentices, backend profiles %, training centers count, average English proficiency |
| `GET /by-center` | Apprentices by training center + Recommended instructors + GitHub users + B1/B2 English by center | Complete metrics grouped by SENA training centers |
| `GET /by-program` | Apprentices by center and training program | Metrics by training center and program (limited to 4 programs) |
| `GET /by-department` | Apprentices by Colombian department | Geographic distribution of apprentices who responded to the survey |
| `GET /github-users` | Apprentices with GitHub accounts | Specific metrics for GitHub users per training center with percentages |
| `GET /english-level` | Apprentices with B1/B2 English level | Specific metrics for English proficiency per training center with percentages |
| `GET /apprentice-count` | Apprentice count by training center | Simple count of apprentices per center without additional metrics |
| `GET /recommended-instructors` | Recommended instructors by training center | Specific list of recommended instructors per center with counts |
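A Java client for any of the endpoints in the table might build its request as sketched below, using the JDK's `java.net.http` API (Java 11+). `MetricsRequests` is a hypothetical helper, and the base URL assumes the local setup described in this README; adjust it for other environments.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical helper that builds authenticated GET requests for the metrics API. */
final class MetricsRequests {

    // Assumption: the application is running locally on the default port.
    static final String BASE_URL = "http://localhost:8080/api/v1/metrics";

    private MetricsRequests() {
    }

    /** Builds a GET request for an endpoint such as "/scalar" with the X-API-Key header set. */
    static HttpRequest get(String endpoint, String apiKey) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + endpoint))
                .header("X-API-Key", apiKey)
                .header("Accept", "application/json")
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, reading the key from the `API_KEY` environment variable rather than hard-coding it.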
Example response for `GET /api/v1/metrics/scalar`:

```json
[
  {
    "description": "# Aprendices inscritos únicos",
    "value": 775
  },
  {
    "description": "% de perfiles DEV Backend",
    "value": "43.5%"
  }
]
```

Error handling example:

```json
{
  "status": 500,
  "message": "Error processing metrics request",
  "details": "Database connection failed",
  "timestamp": "2025-08-22T11:21:37",
  "path": "/api/v1/metrics/scalar",
  "validationErrors": []
}
```

```
src/main/java/com/alphanet/products/leonardobackend/
├── config/                     # App configuration & data initialization
├── controller/                 # REST endpoints with Spring Boot error handling
├── dto/                        # Data transfer objects including error responses
│   └── projection/             # Database projections for optimized queries
├── entity/                     # Database entities (Department, TrainingCenter, Program, Instructor)
├── repository/                 # Data access with custom queries
├── service/                    # Business logic layer
│   ├── impl/                   # Service implementations
│   └── mapper/                 # DTO mapping utilities
└── openai.action.schema.json   # Leonardo/OpenAI integration schema

src/test/java/com/alphanet/products/leonardobackend/
├── config/                     # Configuration tests
├── controller/                 # Controller tests
├── service/                    # Service layer tests
│   ├── impl/                   # Service implementation tests
│   └── mapper/                 # Mapper utility tests
└── integration/                # End-to-end integration tests
```
Note: Due to compatibility issues between Spring Boot 3.5.5 and SpringDoc 2.2.0, the custom error handling system has been temporarily disabled. See SWAGGER_COMPATIBILITY_ISSUE.md for complete details.
- Spring Boot Default Responses: Uses standard Spring Boot error handling
- Functional API: All endpoints work perfectly despite simplified error responses
- Swagger Compatibility: Full API documentation is available and functional
- Production Ready: Application is fully operational in both local and AWS environments
- 400 Bad Request - Invalid parameters or malformed requests
- 404 Not Found - Requested data not available
- 500 Internal Server Error - Server-side errors with standard Spring Boot responses
Once the SpringDoc compatibility issue is resolved, the backend will implement:
- Custom exception handling with structured error responses
- Domain-specific exceptions for better error categorization
- Consistent error format across all endpoints
- Enhanced logging for debugging and monitoring
- Server URLs: Both local development and production endpoints
- Operation IDs: Specific function names for each endpoint
- Response Schemas: Detailed data structures for AI understanding
- Examples: Sample responses for better AI context
1. Copy the content of `src/main/java/com/alphanet/products/leonardobackend/openai.action.schema.json`
2. In ChatGPT, go to "Actions" and create a new action
3. Paste the schema content
4. Configure the appropriate server URL (local or production)
5. Test with questions like: "¿Cuántos aprendices hay por centro de formación?"
With the new granular endpoints, Leonardo can now answer specific questions without returning unnecessary data:
- "¿Cuántos aprendices tienen GitHub por centro?" → Uses `/github-users` endpoint
- "¿Cuántos aprendices tienen nivel B1/B2 de inglés?" → Uses `/english-level` endpoint
- "¿Cuántos aprendices hay en total por centro?" → Uses `/apprentice-count` endpoint
- "¿Qué instructores son recomendados por centro?" → Uses `/recommended-instructors` endpoint
This provides better performance and more focused responses for Leonardo's AI capabilities.
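```bash
# Compile
./mvnw clean compile
# Run all tests
./mvnw test

# Setup environment variables (recommended)
./scripts/setup-local-env.sh
# Or load manually (set -a exports the variables to child processes)
set -a
source .env
set +a

# Generate new API key
./scripts/generate-api-key.sh
# View current API key (if set)
echo $API_KEY

# Run only the integration tests
./mvnw test -Dtest=MetricsApiIntegrationTest
# Build the application JAR
./mvnw clean package
# Run with a specific profile (spring-boot.run.profiles is the Maven plugin's profile property)
./mvnw spring-boot:run -Dspring-boot.run.profiles=dev
```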
### Docker Setup
The application uses a multi-container setup with Docker Compose:
#### Services
- **leonardo-app**: Spring Boot application container (built from Dockerfile)
- **mysql**: MySQL 8.0 database container
#### Network & Health Checks
- Custom bridge network: `leonardo-network`
- Health checks for both services with proper dependency management
- MySQL health check ensures database is ready before starting the app
#### Environment Variables in Docker
The `docker-compose.yml` uses environment variables for configuration:
```yaml
environment:
  MYSQL_ROOT_PASSWORD: $${MYSQL_ROOT_PASSWORD:"SenaSoft2024@Leonardo"}
  MYSQL_USER: $${DB_USERNAME:leonardo_user}
  MYSQL_PASSWORD: $${DB_PASSWORD:"L30n4rd0_S3n4S0ft_2024"}
```

Note: Variables are escaped with `$$` for Docker Compose compatibility.
- Database: `leonardo_senasoft`
- User: `leonardo_user`
- Exposed ports: `3306` (MySQL), `8080` (Application)
- Persistent volume: `mysql_data`
The application uses a main configuration file with environment-specific profiles:

```properties
# Core application configuration
spring.datasource.url=jdbc:mysql://localhost:3306/leonardo_senasoft
spring.datasource.username=leonardo_user
spring.datasource.password=${DB_PASSWORD}
# API Security Configuration
api.security.enabled=${API_SECURITY_ENABLED:true}
api.key=${API_KEY}
# Rate Limiting Configuration (Required)
api.security.rate-limit.max-attempts=5
api.security.rate-limit.window-ms=60000
api.security.rate-limit.log-suppression-ms=300000
```

```properties
# AWS Production Profile Configuration
api.security.enabled=${API_SECURITY_ENABLED:true}
api.key=${API_KEY}
# Rate Limiting Configuration (Stricter for production)
api.security.rate-limit.max-attempts=3
api.security.rate-limit.window-ms=300000
api.security.rate-limit.log-suppression-ms=600000
# Database Configuration (uses environment variables)
spring.datasource.username=${DB_USERNAME}
spring.datasource.password=${DB_PASSWORD}
# Production-specific settings
spring.jpa.hibernate.ddl-auto=validate
spring.jpa.show-sql=false
logging.level.root=WARN
```

Note: The AWS profile includes production-ready optimizations:
- 🔄 Connection Pool: Optimized for t2.micro instances (1GB RAM)
- 🛡️ Security: SSL/TLS for database connections, rate limiting
- 📊 Monitoring: Health checks for Application Load Balancer
- ⚡ Performance: JVM optimizations and batch processing
- 📝 Logging: Production-appropriate log levels
- 🔒 Safety: DDL validation only, no data initialization
For different environments, you can:
- Use `application.properties` for local development
- Use `application-aws.properties` for AWS production deployment
- Override specific values using environment variables
- Create custom profiles if needed for complex deployments
For containerized deployment, the same application.properties is used, but database connection details are provided via environment variables in the container.
Available monitoring endpoints:
- `/actuator/health` - Application health
- `/actuator/info` - App information
- `/actuator/metrics` - Performance metrics
The application automatically initializes with realistic sample data that allows Leonardo to answer all SENASoft challenge questions:
- 4 Colombian Departments: Cundinamarca, Bogotá D.C., Antioquia, Valle del Cauca
- 4 Training Centers: Each with realistic apprentice counts, GitHub users, and English proficiency data
- 12 Training Programs: Including "Análisis y Desarrollo de Software", "Gestión de Redes", etc.
- 10 Instructors: With recommendation status per center
- Department - Colombian geographical departments
- TrainingCenter - SENA training facilities with metrics (total apprentices, GitHub users, English B1/B2)
- Program - Educational programs with apprentice counts per center
- Instructor - Teaching staff with recommendation status
- Total Apprentices: 766 across all centers
- Backend Profiles: ~43.5% of total apprentices
- GitHub Users: Varies by center (78-180 users)
- English B1/B2: Varies by center (78-156 apprentices)
The backend includes a comprehensive test suite with enhanced security testing covering all functionality and security measures:
- Unit Tests: Service layer, mappers, and business logic
- Integration Tests: Controller endpoints with MockMvc
- Security Tests: API key validation, rate limiting, and secure logging
- Error Handling Tests: Global exception handler and error scenarios
- Database Tests: Repository layer and data access
```bash
# Run all tests
./mvnw test
# Run specific test categories
./mvnw test -Dtest=MetricsServiceImplTest          # Service layer tests
./mvnw test -Dtest=MetricsApiIntegrationTest       # Controller integration tests
./mvnw test -Dtest="*Security*"                    # Security configuration tests
./mvnw test -Dtest=ApiKeyAuthenticationFilterTest  # Authentication filter tests
```

- Total Tests: 78 ✅ (including new security tests)
- Test Classes: 8+ (including security test classes)
- Coverage: 100% of critical functionality and security measures
- Execution Time: ~5-7 seconds
- API Key Validation: Length requirements, weak pattern detection
- Rate Limiting: Failed authentication attempt limits and log suppression
- Secure Logging: URI sanitization and sensitive information protection
- Authentication Filter: Constructor validation and filter behavior
- Security Configuration: API key validation during startup
```bash
# Run all security-related tests
./mvnw test -Dtest="*Security*"
# Run specific security test classes
./mvnw test -Dtest=SecurityConfigTest              # Security configuration tests
./mvnw test -Dtest=ApiKeyAuthenticationFilterTest  # Authentication filter tests
./mvnw test -Dtest=SecurityConfigValidationTest    # API key validation tests
```