This repository has been archived by the owner on Jun 9, 2024. It is now read-only.

v0.0.3

github-actions released this 03 Aug 23:48
· 528 commits to master since this release
02dd294

What's Changed

  • safety challenges, adaptability challenges, suite same_task by @SilenNaihin in #177
  • Beat more challenges in Auto-GPT by @merwanehamadi in #187
  • Uninstall agbenchmark then reinstall by @merwanehamadi in #188
  • Fix helicone MITM by @merwanehamadi in #189
  • Add api keys by @merwanehamadi in #190
  • hotfix reports by @SilenNaihin in #191
  • Update Scores Benchmark by @merwanehamadi in #192
  • fix suite dependencies by @SilenNaihin in #194
  • Add safety suite by @merwanehamadi in #196
  • report # bug, adding submodule challenges by @SilenNaihin in #193
  • Add llm eval by @merwanehamadi in #197
  • ci update by @SilenNaihin in #198
  • Add helicone dynamic headers by @merwanehamadi in #199
  • Add dynamic headers using environment variables by @merwanehamadi in #200
  • added new script to fix dynamic headers by @chitalian in #202
  • Delete reports by @merwanehamadi in #201
  • Use beebot autopackai by @merwanehamadi in #203
  • Benchmark all test by @merwanehamadi in #204
  • Fix tests not being run by @merwanehamadi in #207
  • Retry push until successful by @merwanehamadi in #208
  • Advanced LLM Evaluation Implementation by @SilenNaihin in #205
  • returning scores by @SilenNaihin in #210
  • Update submodules by @merwanehamadi in #212
  • Use Auto-GPT master by @merwanehamadi in #213
  • Fix export to gdrive by @merwanehamadi in #214
  • Add timeout to agbenchmark by @merwanehamadi in #215
  • Add timeout that allows teardown by @merwanehamadi in #216
  • Delete incorrect report by @merwanehamadi in #217
  • Feature: Visualize Test Results by @SilenNaihin in #211
  • Fix timeout not working by @merwanehamadi in #218
  • Update submodule by @merwanehamadi in #219
  • Get helicone costs by @merwanehamadi in #220
  • working bar and radar charts by @SilenNaihin in #221
  • Fix f-string get_data_from_helicone.py by @chitalian in #223
  • Fix BeeBot link by @MrBrain295 in #224
  • Fix send to gdrive and tracking the wrong challenge name by @merwanehamadi in #225
  • Refactoring for TDD by @SilenNaihin in #222
  • Fix costs helicone by @merwanehamadi in #226
  • Fix reports by @merwanehamadi in #227
  • Return none as fallback Helicone by @merwanehamadi in #228
  • Only run mini-agi on push and PR by @merwanehamadi in #230
  • Reverse skip based on agent by @merwanehamadi in #231
  • Only run mini-agi on tests by @merwanehamadi in #232
  • Fix reports and add commit sha by @merwanehamadi in #233
  • Send commit sha and cost to gdrive by @merwanehamadi in #234
  • Remove high costs by @merwanehamadi in #235
  • Remove mock reports by @merwanehamadi in #236
  • Remove mock reports by @merwanehamadi in #237
  • Update beebot and Auto-GPT by @merwanehamadi in #238
  • Update autogpt back to where it was by @merwanehamadi in #239
  • Update python-dotenv by @erik-megarad in #240
  • Update Auto-GPT and allow 1 specific agent to be run by @merwanehamadi in #241
  • Add attempted metrics by @merwanehamadi in #244
  • Correct agent and benchmark commit sha by @merwanehamadi in #245
  • fix-linter by @merwanehamadi in #246
  • Fix typing by @merwanehamadi in #247
  • Add Test Suite to gdrive by @merwanehamadi in #248
  • Release 0.0.3 by @merwanehamadi in #249

New Contributors

Full Changelog: v0.0.2...v0.0.3