37 commits
3f6f5ea
Initial plan
Copilot Dec 12, 2025
86acef7
Implement TODO fixes: optimize database insertion, add output size li…
Copilot Dec 12, 2025
3fb60dd
Fix bisect ordering to avoid string comparison issues for programs wi…
Copilot Dec 12, 2025
8613852
Optimize bisect to avoid temporary list creation, add clarifying comment
Copilot Dec 12, 2025
5046c5e
Add clarifying comments about bisect implementation and Python versio…
Copilot Dec 12, 2025
80ba089
Add comprehensive optimization guide and future enhancement recommend…
Copilot Dec 12, 2025
10349ba
Add generic run.sh template for all projects in problems directory
Copilot Dec 12, 2025
f091def
Add comprehensive README with detailed config reference and recommend…
Copilot Dec 12, 2025
a373596
Add API key configuration support in run scripts with multiple secure…
Copilot Dec 12, 2025
85e0fb3
Merge pull request #1 from mragan2/copilot/find-and-apply-todos
mragan2 Dec 12, 2025
b45759b
Add scaffold for F_time problem
mragan2 Dec 12, 2025
1a27769
Merge pull request #2 from mragan2/codex/create-new-f_time-project-an…
mragan2 Dec 12, 2025
4629c52
Add config for F_time problem
mragan2 Dec 12, 2025
b2c86cc
Merge pull request #3 from mragan2/codex/create-new-f_time-project-an…
mragan2 Dec 12, 2025
ba4f8a7
Add F_time setup guide for local runs
mragan2 Dec 12, 2025
19f42e0
Merge pull request #4 from mragan2/codex/fix-input-directory-setup-fo…
mragan2 Dec 12, 2025
93314b3
Fix F_time runner and diff fallback
Dec 12, 2025
dbc1e34
Update F_time configs, scripts, and add helper files
mragan2 Dec 12, 2025
d3c5ced
aa
mragan2 Dec 12, 2025
3cc765f
Streamline runner config selection and outputs
mragan2 Dec 12, 2025
054cde6
Add ephemeral API key prompts to runners
mragan2 Dec 12, 2025
7ff4a00
Update scripts/interactive_launcher.py
mragan2 Dec 12, 2025
0aab4fe
Merge pull request #6 from mragan2/codex/create-interactive-setup-script
mragan2 Dec 12, 2025
8383e00
Initial plan
Copilot Dec 12, 2025
8632f54
Apply black and isort formatting to all Python files
Copilot Dec 12, 2025
e980710
Merge pull request #8 from mragan2/copilot/fix-code-issues
mragan2 Dec 12, 2025
1fbdfb1
Update config path and improve env var handling in script
mragan2 Dec 12, 2025
5fd28e6
asdv
mragan2 Dec 12, 2025
326dce7
Add adversarial multi-population coevolution
mragan2 Dec 13, 2025
baee9a7
Add seasonal climate modifiers for thermal resilience
mragan2 Dec 13, 2025
6f298cb
Add async batch evaluation helper
mragan2 Dec 13, 2025
89ffcc8
Add parallel evaluation test
mragan2 Dec 13, 2025
5e0f74e
Merge pull request #12 from mragan2/codex/implement-parallel-evaluati…
mragan2 Dec 13, 2025
838d360
Fix: Resolve evaluator, tuly, and initialization issues
Dec 13, 2025
0e771b8
Merge pull request #13 from mragan2/manus/fix-evaluator-tuly-init
mragan2 Dec 13, 2025
040bd94
j
Dec 13, 2025
547df58
Merge branch 'codex/incorporate-novel-agent-into-codeevolve' of https…
Dec 13, 2025
16 changes: 15 additions & 1 deletion .gitignore
@@ -210,4 +210,18 @@ __marimo__/

# debug
debug/
mock/

# CodeEvolve run outputs / local env
experiments/
.conda/

# API Keys and Secrets
# NEVER commit API keys or credentials
.api_keys
*api_keys
*.api_keys
**/api_keys.sh
**/.api_keys
problems/.api_keys
.codeevolve_api_keys
215 changes: 215 additions & 0 deletions OPTIMIZATIONS.md
@@ -0,0 +1,215 @@
# CodeEvolve Optimizations and Future Enhancements

This document summarizes the optimizations implemented and provides suggestions for future improvements to make CodeEvolve a world-class code evolution framework.

## Implemented Optimizations

### 1. Database Performance (database.py)

**Problem**: The original implementation performed a full O(N log N) sort on every program insertion, which becomes a bottleneck as the population grows.

**Solution**: Implemented incremental cache updates using the `bisect` module for O(log N) insertions:
- Added `_incremental_update_cache()` method that uses binary search to find insertion points
- Maintains a sorted list of `(-fitness, pid)` tuples
- Only updates ranks for affected programs (those at or after the insertion point)

**Impact**: Replaces the full O(N log N) re-sort on every insertion with an O(log N) binary search plus a single list insert, significantly improving performance for large populations.

**Code Location**: `src/codeevolve/database.py:397-421`
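
A minimal sketch of the incremental update (the helper name and list layout are illustrative; the actual logic lives in `_incremental_update_cache`):

```python
import bisect

def insert_sorted(cache, fitness, pid):
    """Insert into a list kept sorted by (-fitness, pid).

    Negating fitness keeps higher-fitness programs first, and the
    numeric key sidesteps string-comparison ordering issues.
    """
    entry = (-fitness, pid)
    idx = bisect.bisect_left(cache, entry)  # O(log N) search for the slot
    cache.insert(idx, entry)                # shift in place; no full re-sort
    return idx  # first rank that changed; earlier ranks are untouched

cache = []
for fitness, pid in [(0.5, 1), (0.9, 2), (0.7, 3)]:
    insert_sorted(cache, fitness, pid)
assert cache == [(-0.9, 2), (-0.7, 3), (-0.5, 1)]
```

Only programs at or after the returned index need their rank refreshed, which is what keeps the update incremental.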

### 2. Memory Management (evaluator.py)

**Problem**: Program stdout/stderr can be very large, potentially causing memory issues in long-running evolutionary processes.

**Solution**: Added optional output size limits:
- New `max_output_size` parameter in Evaluator constructor
- Truncates output to specified size when enabled
- Default behavior (no storage) preserved for backward compatibility

**Impact**: Prevents memory exhaustion while maintaining debugging capability when needed.

**Code Location**: `src/codeevolve/evaluator.py:79, 276-283`
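
The truncation behavior can be sketched as follows (hypothetical helper; the real logic sits in the Evaluator at the cited lines):

```python
def capture_output(stdout: bytes, max_output_size=None) -> bytes:
    """Truncate captured output to max_output_size bytes.

    None preserves the backward-compatible default of storing nothing.
    """
    if max_output_size is None:
        return b""  # default: don't retain output at all
    return stdout[:max_output_size]

assert capture_output(b"x" * 1000, max_output_size=16) == b"x" * 16
assert capture_output(b"x" * 1000) == b""
```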

### 3. Build System Compatibility

**Problem**: Python version requirement was too restrictive (>=3.13.5), preventing installation on most systems.

**Solution**: Relaxed requirement to >=3.10, which is widely available and supports all features used in the codebase.

**Code Location**: `pyproject.toml:10`

## Documentation Improvements

### Enhanced TODOs with Implementation Guidance

1. **Sandboxing Enhancement** (evaluator.py:26-31)
- Documented options: Firejail, Docker, systemd-nspawn, seccomp
- Current implementation uses subprocess isolation with resource limits

2. **Local LM Support** (lm.py:25-31)
- Documented integration strategies for open-source models
- Suggested frameworks: llama-cpp-python, vllm, HuggingFace, Ollama

3. **Async Migration** (islands.py:255-263)
- Explained benefits of asynchronous migration without barriers
- Documented implementation considerations and tradeoffs

## Recommended Future Optimizations

### High Priority

#### 1. Parallel Program Evaluation
**Current State**: Programs are evaluated sequentially within each island.

**Optimization**: Implement parallel evaluation using `asyncio` or `multiprocessing`:
```python
# Pseudo-code sketch: spawn one subprocess per program, then await them all.
async def evaluate_batch(programs: List[Program], evaluator: Evaluator):
    procs = await asyncio.gather(
        *(asyncio.create_subprocess_exec(...) for prog in programs)
    )
    return await asyncio.gather(*(proc.communicate() for proc in procs))
```

**Expected Impact**: 2-10x speedup depending on available CPU cores.

#### 2. LLM Request Batching
**Current State**: LLM requests are made one at a time.

**Optimization**: Batch multiple LLM requests when possible:
- Collect multiple programs needing evolution
- Send batch requests to LLM API
- Most APIs support parallel processing of multiple prompts

**Expected Impact**: Reduced API latency, better token efficiency, 1.5-3x throughput improvement.
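
The batching pattern can be sketched with a stand-in coroutine in place of a real async API client (`llm_complete` is a placeholder, not an existing function):

```python
import asyncio

async def llm_complete(prompt: str) -> str:
    # Placeholder for a real async API call (e.g. an async OpenAI client).
    await asyncio.sleep(0)  # stands in for network latency
    return f"response to {prompt!r}"

async def complete_batch(prompts):
    # Issue all requests concurrently: total latency is roughly the
    # slowest single request instead of the sum of all of them.
    return await asyncio.gather(*(llm_complete(p) for p in prompts))

results = asyncio.run(complete_batch(["evolve A", "evolve B"]))
```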

#### 3. Caching and Memoization
**Current State**: No caching of previously evaluated programs or LLM responses.

**Optimization**: Implement caching layers:
- **Program Cache**: Hash program code and cache evaluation results
- **LLM Cache**: Cache LLM responses for identical prompts
- **Embedding Cache**: Cache embeddings for program similarity computations

**Expected Impact**: 30-50% reduction in redundant computations.
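
A minimal program cache along these lines (illustrative class, not an existing API):

```python
import hashlib

class ProgramCache:
    """Cache evaluation results keyed by a hash of the program code."""

    def __init__(self):
        self._cache = {}

    @staticmethod
    def key(code: str) -> str:
        # Strip surrounding whitespace so trivial edits still hit the cache.
        return hashlib.sha256(code.strip().encode()).hexdigest()

    def get(self, code):
        return self._cache.get(self.key(code))

    def put(self, code, result):
        self._cache[self.key(code)] = result

cache = ProgramCache()
cache.put("print('hi')", {"fitness": 1.0})
assert cache.get("print('hi')  ") == {"fitness": 1.0}  # whitespace-insensitive hit
```

The same hash-and-lookup shape applies to the LLM and embedding caches, with the prompt or program text as the key.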

### Medium Priority

#### 4. Database Indexing
**Current State**: Linear search for certain operations.

**Optimization**: Add indexes for common queries:
- Fitness-based queries
- Parent-child relationships
- Feature space lookups in MAP-Elites

**Expected Impact**: Faster query times, especially for large databases.
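
One lightweight way to index parent-child relationships (illustrative structure, not the current database schema):

```python
from collections import defaultdict

class IndexedDB:
    """Keep a secondary index updated on insert so lookups avoid a scan."""

    def __init__(self):
        self.programs = {}
        self.by_parent = defaultdict(list)  # parent id -> child ids

    def add(self, pid, parent, fitness):
        self.programs[pid] = {"parent": parent, "fitness": fitness}
        self.by_parent[parent].append(pid)  # O(1) index maintenance

    def children(self, parent):
        return self.by_parent[parent]  # O(1) instead of a linear scan

db = IndexedDB()
db.add(1, None, 0.5)
db.add(2, 1, 0.7)
assert db.children(1) == [2]
```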

#### 5. Adaptive Population Sizing
**Current State**: Fixed population size per island.

**Optimization**: Dynamically adjust population size based on:
- Convergence rate
- Diversity metrics
- Available computational resources

**Expected Impact**: Better resource utilization, faster convergence.

#### 6. Smart Migration Strategy
**Current State**: Fixed migration interval and strategy.

**Optimization**: Implement adaptive migration:
- Migrate based on diversity metrics rather than fixed intervals
- Select migrants based on novelty, not just fitness
- Use gradient-based migration patterns

**Expected Impact**: Improved exploration, better solution diversity.

### Lower Priority (Polish)

#### 7. Profiling and Monitoring
**Optimization**: Add built-in profiling:
- Token usage tracking per operation
- Time spent in each evolutionary operator
- Memory usage patterns
- Success rates for different strategies

**Expected Impact**: Better observability, easier optimization identification.

#### 8. Checkpoint Compression
**Current State**: Checkpoints may be large for big populations.

**Optimization**: Compress checkpoints using gzip or similar:
```python
import gzip
import pickle

def save_checkpoint_compressed(data, path):
    with gzip.open(path, 'wb') as f:
        pickle.dump(data, f)
```

**Expected Impact**: Reduced storage requirements, faster I/O.

#### 9. Type Hints and Validation
**Current State**: Some functions lack complete type hints.

**Optimization**: Add comprehensive type hints and use `mypy` for static type checking:
- Better IDE support
- Catch type errors early
- Improved code documentation

## Code Quality Improvements

### 1. Error Handling
- Add specific exception types for different error conditions
- Implement retry logic with exponential backoff for API calls
- Better error messages with context
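
A sketch of retry with exponential backoff plus jitter for API calls (hypothetical wrapper, not an existing helper):

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0):
    """Retry `call`, doubling the delay each attempt and adding jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
```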

### 2. Logging
- Structured logging with JSON format for better parsing
- Configurable log levels per component
- Log aggregation support for distributed runs
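
A minimal JSON formatter using the standard `logging` module (sketch; field names are a suggestion):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object per line for easy parsing."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "component": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("codeevolve.islands")  # per-component logger
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("migration complete")
```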

### 3. Testing
- Add integration tests for the full evolutionary loop
- Performance regression tests
- Stress tests with large populations

### 4. Documentation
- Add inline examples in docstrings
- Create tutorial notebooks
- Document configuration parameters with examples

## Performance Benchmarks

To track optimization progress, consider implementing benchmarks for:

1. **Insertion Time**: Measure time to add programs to database at different population sizes
2. **Evolution Throughput**: Programs evolved per minute
3. **Memory Usage**: Peak memory usage during runs
4. **Convergence Speed**: Epochs to reach target fitness
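
The insertion-time benchmark, for instance, could be as simple as (sketch; `bisect.insort` stands in for the database's insert path):

```python
import bisect
import time

def bench_insertions(n):
    """Time n sorted insertions into an initially empty cache."""
    cache = []
    start = time.perf_counter()
    for pid in range(n):
        # Cycle fitness values so insertions land at varied positions.
        bisect.insort(cache, (-float(pid % 100), pid))
    return time.perf_counter() - start

for n in (1_000, 5_000):
    print(f"{n} insertions: {bench_insertions(n):.4f}s")
```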

## Architecture Considerations

### Distributed Computing
For large-scale deployments, consider:
- Ray or Dask for distributed computation
- Redis for shared state management
- Message queues (RabbitMQ, Kafka) for asynchronous communication

### Cloud Optimization
- Use spot instances for cost savings
- Implement checkpointing for fault tolerance
- Auto-scaling based on workload

## Conclusion

The implemented optimizations provide a solid foundation for performance. The recommended future optimizations, prioritized by impact and implementation complexity, can further improve CodeEvolve's efficiency and scalability.

Focus areas for maximum impact:
1. Parallel evaluation (highest ROI)
2. LLM request batching
3. Intelligent caching
4. Better monitoring and profiling

These optimizations align with the project's goal of being a transparent, reproducible, and community-driven framework for LLM-driven algorithm discovery.
8 changes: 8 additions & 0 deletions README.md
@@ -26,6 +26,14 @@ conda activate codeevolve
```
The command-line version of codeevolve is implemented in `src/codeevolve/cli.py`, and `scripts/run.sh` contains a bash script for running codeevolve on a given benchmark. The most important variables to define in this file are the `API_KEY` and `API_BASE` environment variables for connecting to an LLM provider.

CodeEvolve now also supports an optional **NovelAgent** that injects exploratory prompt updates. Enable it by adding a `NOVEL_AGENT` block to your config (see `problems/problem_template/configs/config_mp.yaml`), which will occasionally replace the standard meta-prompting step with a more diversity-focused proposal.

For competitive experiments, you can enable **adversarial islands** via the `ADVERSARIAL` block in the same config. Islands are partitioned into teams (e.g., red vs blue), each evolving independently with MAP-Elites while periodically cross-evaluating candidates against the rival team's current champions. Fitness can be based on win rate, Elo, or a hybrid score, and cross-play can be scheduled every _k_ epochs or alternated between teams to synchronize coevolutionary phases.

You can also inject a lighthearted **climate pressure** by enabling the `CLIMATE` block. Each epoch belongs to a season (choose a single perpetual season or a 4-season cycle), and a small set of Python helpers are randomly tagged as "heat-tolerant" or "cold-resilient." Programs using functions aligned with the current season earn a configurable fitness multiplier, making heat-resistant code more likely to survive during hotter phases.

For a concrete example, see the [F_time setup guide](problems/F_time/SETUP.md) for step-by-step instructions to clone the repository under `/home/rag/Projects`, configure the conda environment, and run the bundled benchmark script.

More comprehensive tutorials will be released soon.

## Next steps
32 changes: 32 additions & 0 deletions problems/.api_keys.example
@@ -0,0 +1,32 @@
# Example API Keys Configuration File
#
# USAGE:
# 1. Copy this file: cp .api_keys.example .api_keys
# 2. Add your actual API keys to .api_keys
# 3. Source in your run.sh: source problems/.api_keys
# 4. Add .api_keys to .gitignore (already done)
#
# SECURITY:
# - NEVER commit the actual .api_keys file to git
# - This .example file shows the format only
# - Keep your keys secret!

# OpenAI / Azure OpenAI
export API_KEY="sk-your-openai-api-key-here"
export API_BASE="https://api.openai.com/v1"

# Google Gemini
# export API_KEY="your-google-api-key-here"
# export API_BASE="https://generativelanguage.googleapis.com/v1beta"

# Azure OpenAI (custom endpoint)
# export API_KEY="your-azure-key-here"
# export API_BASE="https://your-resource.openai.azure.com/openai/deployments/your-deployment"

# Anthropic Claude
# export API_KEY="sk-ant-your-anthropic-key-here"
# export API_BASE="https://api.anthropic.com/v1"

# Custom / Self-hosted
# export API_KEY="your-custom-key"
# export API_BASE="http://localhost:8080/v1"