This repository has been archived by the owner on Jun 9, 2024. It is now read-only.
v0.0.3
What's Changed
- safety challenges, adaptability challenges, suite same_task by @SilenNaihin in #177
- Beat more challenges in Auto-GPT by @merwanehamadi in #187
- Uninstall agbenchmark then reinstall by @merwanehamadi in #188
- Fix helicone MITM by @merwanehamadi in #189
- Add api keys by @merwanehamadi in #190
- hotfix reports by @SilenNaihin in #191
- Update Scores Benchmark by @merwanehamadi in #192
- fix suite dependencies by @SilenNaihin in #194
- Add safety suite by @merwanehamadi in #196
- report # bug, adding submodule challenges by @SilenNaihin in #193
- Add llm eval by @merwanehamadi in #197
- ci update by @SilenNaihin in #198
- Add helicone dynamic headers by @merwanehamadi in #199
- Add dynamic headers using environment variables by @merwanehamadi in #200
- added new script to fix dynamic headers by @chitalian in #202
- Delete reports by @merwanehamadi in #201
- Use beebot autopackai by @merwanehamadi in #203
- Benchmark all test by @merwanehamadi in #204
- Fix tests not being run by @merwanehamadi in #207
- Retry push until successful by @merwanehamadi in #208
- Advanced LLM Evaluation Implementation by @SilenNaihin in #205
- returning scores by @SilenNaihin in #210
- Update submodules by @merwanehamadi in #212
- Use Auto-GPT master by @merwanehamadi in #213
- Fix export to gdrive by @merwanehamadi in #214
- Add timeout to agbenchmark by @merwanehamadi in #215
- Add timeout that allows teardown by @merwanehamadi in #216
- Delete incorrect report by @merwanehamadi in #217
- Feature: Visualize Test Results by @SilenNaihin in #211
- Fix timeout not working by @merwanehamadi in #218
- Update submodule by @merwanehamadi in #219
- Get helicone costs by @merwanehamadi in #220
- working bar and radar charts by @SilenNaihin in #221
- Fix f-string get_data_from_helicone.py by @chitalian in #223
- Fix BeeBot link by @MrBrain295 in #224
- Fix send to gdrive and tracking the wrong challenge name by @merwanehamadi in #225
- Refactoring for TDD by @SilenNaihin in #222
- Fix costs helicone by @merwanehamadi in #226
- Fix reports by @merwanehamadi in #227
- Return none as fallback Helicone by @merwanehamadi in #228
- Only run mini-agi on push and PR by @merwanehamadi in #230
- Reverse skip based on agent by @merwanehamadi in #231
- Only run mini-agi on tests by @merwanehamadi in #232
- Fix reports and add commit sha by @merwanehamadi in #233
- Send commit sha and cost to gdrive by @merwanehamadi in #234
- Remove high costs by @merwanehamadi in #235
- Remove mock reports by @merwanehamadi in #236
- Remove mock reports by @merwanehamadi in #237
- Update beebot and Auto-GPT by @merwanehamadi in #238
- Update autogpt back to where it was by @merwanehamadi in #239
- Update python-dotenv by @erik-megarad in #240
- Update Auto-GPT and allow 1 specific agent to be run by @merwanehamadi in #241
- Add attempted metrics by @merwanehamadi in #244
- Correct agent and benchmark commit sha by @merwanehamadi in #245
- fix-linter by @merwanehamadi in #246
- Fix typing by @merwanehamadi in #247
- Add Test Suite to gdrive by @merwanehamadi in #248
- Release 0.0.3 by @merwanehamadi in #249
New Contributors
- @chitalian made their first contribution in #202
- @MrBrain295 made their first contribution in #224
Full Changelog: v0.0.2...v0.0.3