| 1 | +# SCER Rating for AI Models (LLM) - Reference Implementation |
| 2 | + |
| 3 | +A reference implementation of the [SCER for LLM Draft Specification](https://github.com/Green-Software-Foundation/scer/blob/dev/use_cases/SCER_FOR_LLM/SCER_For_LLM_Specification.md). This tool evaluates and compares Large Language Models based on their carbon efficiency, providing transparent ratings to help organizations make sustainable AI deployment decisions. |
| 4 | + |
| 5 | +**🌐 Live Demo**: [https://green-software-foundation.github.io/scer/scer-llm-tool/](https://green-software-foundation.github.io/scer/scer-llm-tool/) |
| 6 | + |
| 7 | +## 🌟 Features |
| 8 | + |
| 9 | +- **Multiple Data Sources**: Choose from ML.ENERGY Leaderboard (recommended), Hugging Face LLM-Perf, or Sample Dataset |
| 10 | +- **Real-time Data**: Fetches live energy and performance metrics from external sources |
| 11 | +- **SCER Rating System**: A-E scale (similar to Nutri-Score) for carbon efficiency |
| 12 | +- **Interactive Leaderboard**: Sort and filter models by SCER rating, efficiency, performance, and size |
| 13 | +- **Detailed Model Analytics**: Click any model for comprehensive metrics including: |
| 14 | + - Energy efficiency (tokens/kWh) |
| 15 | + - Carbon footprint (CO₂e per 1k tokens) |
| 16 | + - Performance benchmarks |
| 17 | + - Hardware configuration |
| 18 | + - Environmental impact calculations |
| 19 | +- **Data Source Transparency**: Click the data source name to view source details and URLs |
| 20 | +- **Smart Data Refresh**: Reload data from the current source with visual feedback |
| 21 | +- **Automatic Fallback**: Falls back to sample data if the selected source is unavailable |
| 22 | +- **Modern Clean UI**: Professional interface with excellent readability and accessibility |
| 23 | +- **Responsive Design**: Works seamlessly on desktop, tablet, and mobile devices |
| 24 | +- **About & Methodology Pages**: Learn about the SCER framework and its calculation methods |
| 25 | + |
| 26 | +## 🚀 Quick Start |
| 27 | + |
| 28 | +### Prerequisites |
| 29 | +- Modern web browser with JavaScript enabled |
| 30 | +- Local web server (for development) |
| 31 | + |
| 32 | +### Installation |
| 33 | + |
| 34 | +1. **Clone the repository** |
| 35 | + ```bash |
| 36 | + git clone https://github.com/Green-Software-Foundation/scer.git |
| 37 | + cd scer/scer-llm-tool |
| 38 | + ``` |
| 39 | + |
| 40 | +2. **Start a local server** |
| 41 | + ```bash |
| 42 | + # Using Python 3 (use python3 if python is not on your PATH) |
| 43 | + python -m http.server 8000 |
| 44 | + |
| 45 | + # Using Node.js (pin the port so it matches the URL in step 3) |
| 46 | + npx serve -l 8000 . |
| 47 | + |
| 48 | + # Using PHP |
| 49 | + php -S localhost:8000 |
| 50 | + ``` |
| 51 | + |
| 52 | +3. **Open in browser** |
| 53 | + Navigate to `http://localhost:8000` |
| 54 | + |
| 55 | +## 📊 SCER Rating System |
| 56 | + |
| 57 | +The SCER rating evaluates models based on tokens generated per kilowatt-hour: |
| 58 | + |
| 59 | +- **A (Excellent)**: ≥2,500 tokens/kWh |
| 60 | +- **B (Good)**: 1,800-2,499 tokens/kWh |
| 61 | +- **C (Average)**: 1,200-1,799 tokens/kWh |
| 62 | +- **D (Poor)**: 600-1,199 tokens/kWh |
| 63 | +- **E (Very Poor)**: <600 tokens/kWh |
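
A minimal sketch of how the thresholds above could map a tokens/kWh value to a letter. The names `SCER_THRESHOLDS` and `ratingFromTokensPerKwh` are hypothetical and not taken from `rating-algorithm.js`. The sketch also assumes each band is half-open, so a fractional value such as 2,499.5 falls into B:

```javascript
// Lower bounds copied from the rating list above, highest band first.
// Assumption: bands are half-open, e.g. B covers [1800, 2500).
const SCER_THRESHOLDS = [
  { rating: "A", minTokensPerKwh: 2500 },
  { rating: "B", minTokensPerKwh: 1800 },
  { rating: "C", minTokensPerKwh: 1200 },
  { rating: "D", minTokensPerKwh: 600 },
];

// Hypothetical helper: returns the letter for a tokens/kWh value.
function ratingFromTokensPerKwh(tokensPerKwh) {
  if (!Number.isFinite(tokensPerKwh) || tokensPerKwh < 0) {
    throw new RangeError("tokensPerKwh must be a non-negative finite number");
  }
  // The first band whose lower bound is met wins; anything below 600 is E.
  const band = SCER_THRESHOLDS.find((t) => tokensPerKwh >= t.minTokensPerKwh);
  return band ? band.rating : "E";
}

console.log(ratingFromTokensPerKwh(2500)); // "A"
console.log(ratingFromTokensPerKwh(1500)); // "C"
console.log(ratingFromTokensPerKwh(599));  // "E"
```

Keeping the thresholds in a data table rather than a chain of `if` statements makes it easier to adjust the bands if the specification changes.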
| 64 | + |
| 65 | +For complete details on the SCER framework, see the [official SCER for LLM Specification](https://github.com/Green-Software-Foundation/scer/blob/dev/use_cases/SCER_FOR_LLM/SCER_For_LLM_Specification.md). |
| 66 | + |
| 67 | +## 🏗️ Project Structure |
| 68 | + |
| 69 | +``` |
| 70 | +scer-llm-tool/ |
| 71 | +├── index.html # Main HTML structure |
| 72 | +├── css/ |
| 73 | +│ ├── main.css # Main styles and responsive design |
| 74 | +│ └── rating-labels.css # SCER rating badge styles |
| 75 | +├── js/ |
| 76 | +│ ├── rating-algorithm.js # SCER calculation engine |
| 77 | +│ ├── huggingface-adapter.js # Hugging Face data integration (CSV) |
| 78 | +│ ├── mlenergy-adapter.js # ML.ENERGY data integration (stub) |
| 79 | +│ └── main.js # Main application logic |
| 80 | +├── data/ |
| 81 | +│ └── sample-models.json # Sample model data (fallback) |
| 82 | +├── assets/ # Images and static assets |
| 83 | +└── README.md # This file |
| 84 | +``` |
| 85 | + |
| 86 | +## 🔧 Technical Details |
| 87 | + |
| 88 | +### SCER Algorithm |
| 89 | + |
| 90 | +The rating calculation uses the following formulas: |
| 91 | + |
| 92 | +```javascript |
| 93 | +// Tokens per kWh |
| 94 | +tokensPerKwh = totalTokens / energyConsumedKwh |
| 95 | + |
| 96 | +// CO2e per 1k tokens |
| 97 | +co2ePer1kTokens = (energyConsumedKwh * emissionFactor) / (totalTokens / 1000) |
| 98 | + |
| 99 | +// Composite score (70% efficiency + 30% performance) |
| 100 | +compositeScore = (efficiencyScore * 0.7) + (performanceScore * 0.3) |
| 101 | +``` |
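
As a worked example of the formulas above, the sketch below wraps them in two small functions. The function names are illustrative, `emissionFactor` is assumed to be in kg CO₂e per kWh, and the efficiency and performance scores are assumed to share a 0-100 scale. None of these assumptions is confirmed by the tool's source:

```javascript
// Hypothetical wrapper around the two per-model formulas.
// Assumption: emissionFactor is expressed in kg CO2e per kWh.
function scerMetrics({ totalTokens, energyConsumedKwh, emissionFactor }) {
  if (totalTokens <= 0 || energyConsumedKwh <= 0) {
    throw new RangeError("totalTokens and energyConsumedKwh must be positive");
  }
  const tokensPerKwh = totalTokens / energyConsumedKwh;
  const co2ePer1kTokens =
    (energyConsumedKwh * emissionFactor) / (totalTokens / 1000);
  return { tokensPerKwh, co2ePer1kTokens };
}

// Composite score: 70% efficiency + 30% performance.
// Assumption: both inputs are already normalized to the same scale.
function compositeScore(efficiencyScore, performanceScore) {
  return efficiencyScore * 0.7 + performanceScore * 0.3;
}

// 5,000 tokens generated with 2 kWh at 0.5 kg CO2e/kWh:
const example = scerMetrics({
  totalTokens: 5000,
  energyConsumedKwh: 2,
  emissionFactor: 0.5,
});
console.log(example.tokensPerKwh);    // 2500
console.log(example.co2ePer1kTokens); // 0.2
```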
| 102 | + |
| 103 | +### Data Sources |
| 104 | + |
| 105 | +The tool supports **multiple data sources** that can be selected via dropdown: |
| 106 | + |
| 107 | +#### 1. ML.ENERGY Leaderboard ⭐ (Recommended - Most Current) |
| 108 | +- **Website**: [ml.energy/leaderboard](https://ml.energy/leaderboard/) |
| 109 | +- **Data Source**: [GitHub Repository](https://github.com/ml-energy/leaderboard/tree/master/data/llm_text_generation/chat) |
| 110 | +- **Access Method**: Direct JSON file fetching from GitHub |
| 111 | +- **Metrics**: Energy per request (joules), throughput (tokens/s), time per output token (TPOT), batch sizes |
| 112 | +- **Hardware**: NVIDIA A100-SXM4-40GB (default) |
| 113 | +- **Models**: Latest LLMs including Llama 3.1, Gemma 2, Mistral, Phi-3 |
| 114 | +- **Update Frequency**: Actively maintained (2025 data) |
| 115 | +- **Age**: ✅ **Current** (regularly updated) |
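
Because this source reports energy per request in joules while the SCER rating is expressed in tokens/kWh, a unit conversion is needed (1 kWh = 3,600,000 J). A minimal sketch, assuming hypothetical inputs `energyPerRequestJoules` and `tokensPerRequest`; the field names actually used by `mlenergy-adapter.js` may differ:

```javascript
const JOULES_PER_KWH = 3.6e6; // 1 kWh = 3.6 MJ

// Hypothetical helper: converts a per-request energy reading in joules
// and a per-request token count into tokens per kWh.
function tokensPerKwhFromJoules(energyPerRequestJoules, tokensPerRequest) {
  if (!(energyPerRequestJoules > 0)) {
    throw new RangeError("energyPerRequestJoules must be positive");
  }
  const energyKwh = energyPerRequestJoules / JOULES_PER_KWH;
  return tokensPerRequest / energyKwh;
}

// A request that used exactly 1 kWh (3.6 MJ) to produce 2,500 tokens:
console.log(tokensPerKwhFromJoules(3.6e6, 2500)); // 2500
```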
| 116 | + |
| 117 | +#### 2. Hugging Face LLM-Perf Leaderboard |
| 118 | +- **Website**: [HF Space](https://huggingface.co/spaces/optimum/llm-perf-leaderboard) |
| 119 | +- **Dataset**: [optimum-benchmark/llm-perf-leaderboard](https://huggingface.co/datasets/optimum-benchmark/llm-perf-leaderboard) |
| 120 | +- **Access Method**: Direct CSV file download from repository |
| 121 | +- **File Used**: `perf-df-pytorch-cuda-unquantized-1xA100.csv` |
| 122 | +- **Metrics**: Latency, throughput, energy consumption, memory usage |
| 123 | +- **Size**: ~3MB, 863 model configurations |
| 124 | +- **Age**: ⚠️ Last updated December 2024 (~10 months old at the time of writing) |
| 125 | + |
| 126 | +#### 3. Sample Dataset |
| 127 | +- **Size**: 3 models (Phi-3 Mini, Llama 3 8B, Gemma 2B) |
| 128 | +- **Purpose**: Offline use, testing, fallback |
| 129 | +- **Always Available**: Yes |
| 130 | +- **Age**: Static example data |
| 131 | + |
| 132 | +### How Data Source Selection Works |
| 133 | + |
| 134 | +1. Use the **"Data Source"** dropdown at the top of the leaderboard |
| 135 | +2. Choose from: **ML.ENERGY Leaderboard** (recommended), Hugging Face LLM-Perf, or Sample Dataset |
| 136 | +3. The data refreshes automatically when the source changes |
| 137 | +4. **Click on the data source name** to see a popup with: |
| 138 | + - **Clickable Website URL**: Opens the leaderboard website |
| 139 | + - **Clickable Data URL**: Opens the data repository |
| 140 | + - **Error details**: If something went wrong (shown in yellow) |
| 141 | +5. **Color indicators**: |
| 142 | + - 🔵 **Blue text** = Working source with data |
| 143 | + - 🟡 **Yellow text** = Error occurred, using fallback |
| 144 | +6. If the selected source is unavailable, the tool automatically falls back to sample data |
| 145 | +7. Click anywhere outside the popup to close it |
| 146 | + |
| 147 | +### How Data Refresh Works |
| 148 | + |
| 149 | +1. Click the **"Refresh"** button in the leaderboard |
| 150 | +2. The application fetches fresh data from the selected source |
| 151 | +3. Raw benchmark data is transformed into the SCER format |
| 152 | +4. Models are re-evaluated and their ratings recalculated |
| 153 | +5. The leaderboard updates with the new data |
| 154 | +6. If the source fails, the tool automatically falls back to sample data |
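
The refresh-with-fallback flow described in the steps above can be sketched as follows. This is an illustration only: `loadModelsWithFallback` and its two loader arguments are hypothetical names, and the real logic in `main.js` may be organized differently. The loaders are passed in as functions so the flow can be exercised without network access:

```javascript
// Hypothetical helper: tries the selected source first and falls back
// to the bundled sample data if it fails or returns nothing.
async function loadModelsWithFallback(loadSelectedSource, loadSampleData) {
  try {
    const models = await loadSelectedSource();
    if (!Array.isArray(models) || models.length === 0) {
      throw new Error("Selected source returned no models");
    }
    return { models, usedFallback: false, error: null };
  } catch (error) {
    // Keep the error so the UI can show it alongside the fallback data.
    const models = await loadSampleData();
    return { models, usedFallback: true, error };
  }
}
```

In the browser, `loadSelectedSource` might wrap a `fetch` call to the chosen data URL, and `loadSampleData` might fetch `data/sample-models.json`. The `usedFallback` flag could then drive the blue/yellow source indicator.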
| 155 | + |
| 156 | +## 🎨 Design System |
| 157 | + |
| 158 | +### Color Palette |
| 159 | +- **Primary**: Green-to-blue gradient for branding |
| 160 | +- **Background**: Clean white/light gray (#f8f9fa) |
| 161 | +- **Text**: Dark gray (#212529) for excellent readability |
| 162 | +- **Accents**: Green (efficiency), Blue (performance), Purple (composite) |
| 163 | +- **SCER Badges**: Color-coded A-E ratings (Green=A, Red=E) |
| 164 | + |
| 165 | +### Components |
| 166 | +- **Clean Card Design**: White cards with subtle borders and shadows |
| 167 | +- **SCER Badges**: Circular color-coded rating indicators with gradients |
| 168 | +- **Interactive Tables**: Sortable, filterable, clickable rows |
| 169 | +- **Modal Dialogs**: Full model details with organized sections |
| 170 | +- **Responsive Grid**: Mobile-first layout system |
| 171 | + |
| 172 | +## 📱 Browser Support |
| 173 | + |
| 174 | +- Chrome 90+ |
| 175 | +- Firefox 88+ |
| 176 | +- Safari 14+ |
| 177 | +- Edge 90+ |
| 178 | + |
| 179 | +## 🤝 Contributing |
| 180 | + |
| 181 | +We welcome contributions to improve the SCER LLM tool! This is a community-driven project under the Green Software Foundation. |
| 182 | + |
| 183 | +### How to Contribute |
| 184 | + |
| 185 | +1. **Fork the repository** |
| 186 | + ```bash |
| 187 | + # Fork https://github.com/Green-Software-Foundation/scer |
| 188 | + git clone https://github.com/YOUR-USERNAME/scer.git |
| 189 | + cd scer/scer-llm-tool |
| 190 | + ``` |
| 191 | + |
| 192 | +2. **Create a feature branch** |
| 193 | + ```bash |
| 194 | + git checkout -b feature/scer-llm-tool-improvement |
| 195 | + ``` |
| 196 | + |
| 197 | +3. **Make your changes** |
| 198 | + - Add features, fix bugs, or improve documentation |
| 199 | + - Test thoroughly on different browsers and devices |
| 200 | + - Follow existing code style and patterns |
| 201 | + |
| 202 | +4. **Commit your changes** |
| 203 | + ```bash |
| 204 | + git add . |
| 205 | + git commit -m "feat(scer-llm-tool): Add amazing feature" |
| 206 | + ``` |
| 207 | + Use the Conventional Commits format: `feat:`, `fix:`, `docs:`, `style:`, `refactor:`, `test:` |
| 208 | + |
| 209 | +5. **Push to your fork** |
| 210 | + ```bash |
| 211 | + git push origin feature/scer-llm-tool-improvement |
| 212 | + ``` |
| 213 | + |
| 214 | +6. **Open a Pull Request** |
| 215 | + - Go to [Green Software Foundation SCER](https://github.com/Green-Software-Foundation/scer) |
| 216 | + - Click "New Pull Request" |
| 217 | + - Select your fork and branch |
| 218 | + - Provide a clear description of your changes |
| 219 | + - Reference any related issues |
| 220 | + |
| 221 | +### Contribution Areas |
| 222 | + |
| 223 | +- 🔍 **Data Sources**: Help integrate new credible benchmark sources |
| 224 | +- 🐛 **Bug Fixes**: Report and fix issues |
| 225 | +- 📊 **Features**: Propose and implement new features |
| 226 | +- 📖 **Documentation**: Improve docs, add examples |
| 227 | +- 🎨 **UI/UX**: Enhance design and accessibility |
| 228 | +- ✅ **Testing**: Add tests, improve coverage |
| 229 | + |
| 230 | +### Questions? |
| 231 | + |
| 232 | +- Open an [Issue](https://github.com/Green-Software-Foundation/scer/issues) for bugs or feature requests |
| 233 | +- Join [Discussions](https://github.com/Green-Software-Foundation/scer/discussions) for questions |
| 234 | +- See [SCER Specification](https://github.com/Green-Software-Foundation/scer/blob/dev/use_cases/SCER_FOR_LLM/SCER_For_LLM_Specification.md) for framework details |
| 235 | + |
| 236 | +## 📋 Key Challenges |
| 237 | + |
| 238 | +The tool is functional, but the main challenge is **getting more credible, current data**. Three research areas stand out: |
| 239 | + |
| 240 | +### 1. 🔍 Better Data Sources |
| 241 | +- **Current Issue**: ML.ENERGY covers only ~7 models, and the Hugging Face data is ~10 months old |
| 242 | +- **Research Needed**: |
| 243 | + - Contact Hugging Face team about API access or data updates |
| 244 | + - Find alternative sources (MLPerf, academic benchmarks, cloud providers) |
| 245 | + - Improve ML.ENERGY integration (model discovery, multi-GPU support) |
| 246 | + |
| 247 | +### 2. 🤖 AI-Powered Data Extraction |
| 248 | +- **Opportunity**: Use LLMs to extract energy/carbon metrics from public sources |
| 249 | +- **Potential**: |
| 250 | + - Automatically parse model cards, papers, technical documentation |
| 251 | + - Extract metrics from research publications and benchmarks |
| 252 | + - Verify and validate AI-extracted data against known sources |
| 253 | + - Scale data collection without manual effort |
| 254 | + |
| 255 | +### 3. 👥 Crowdsourced Data |
| 256 | +- **Opportunity**: Community-contributed measurements and evaluations |
| 257 | +- **Potential**: |
| 258 | + - User-submitted benchmark results (with verification) |
| 259 | + - Community voting/validation for data quality |
| 260 | + - Distributed measurement efforts |
| 261 | + - Incentive structure for contributors |
| 262 | + |
| 263 | +## 📄 License |
| 264 | + |
| 265 | +This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. |
| 266 | + |
| 267 | +## 🙏 Acknowledgments |
| 268 | + |
| 269 | +- [Green Software Foundation](https://greensoftware.foundation/) for the SCER framework |
| 270 | +- [SCER for LLM Specification](https://github.com/Green-Software-Foundation/scer/blob/dev/use_cases/SCER_FOR_LLM/SCER_For_LLM_Specification.md) - Official specification document |
| 271 | +- [Hugging Face](https://huggingface.co/) for model performance data |
| 272 | +- [ML.ENERGY Leaderboard](https://ml.energy/leaderboard/) for energy benchmark data |
| 273 | +- [CodeCarbon](https://codecarbon.io/) for energy measurement methodology |
| 274 | + |
| 275 | +## 📞 Contact |
| 276 | + |
| 277 | +- Project Repository: [Green Software Foundation SCER](https://github.com/Green-Software-Foundation/scer) |
| 278 | +- Project Issues: [GitHub Issues](https://github.com/Green-Software-Foundation/scer/issues) |
| 279 | +- Discussions: [GitHub Discussions](https://github.com/Green-Software-Foundation/scer/discussions) |
| 280 | + |
| 281 | +--- |
| 282 | + |
| 283 | +**Promoting sustainable AI development through transparent carbon efficiency rating.** 🌱 |