OSPP: Smart Coding benchmark suite: built on KubeEdge-lanvs #159
The proposals are not needed in this implementation PR. Shall move the proposal to the proposal PR, i.e., #120
Besides, all the content written in Chinese should be translated into English to facilitate international understanding and usage.
@@ -0,0 +1,172 @@
# Background
The proposals are not needed in this implementation PR. Shall keep the proposal in the proposal PR, i.e., #120
Besides, all the content written in Chinese should be translated into English to facilitate international understanding and usage.
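@@ -0,0 +1,129 @@
# Background
Large language models (LLMs) have demonstrated strong capabilities in tasks such as code generation, automated programming, and code analysis. However, these models are usually trained on general-purpose code data and often fail to make full use of the collaboration and feedback of software engineers in real-world scenarios. To build a more intelligent and efficient code ecosystem, collaborative code datasets and evaluation benchmarks are needed to promote close collaboration between LLMs and software engineers. This project aims to build an alignment dataset and evaluation benchmark for LLM collaborative coding agents based on the open-source edge computing framework KubeEdge-Ianvs. The dataset will include software engineers' behavior trajectories, feedback, and iteration processes during development, as well as the related code versions and comments. With this data, we will design evaluation metrics and benchmarks to measure LLM performance on tasks such as code generation, recommendation, and analysis, and to promote collaboration between LLMs and software engineers.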
The proposals are not needed in this implementation PR. Shall keep the proposal in the proposal PR, i.e., #120
Besides, all the content written in Chinese should be translated into English to facilitate international understanding and usage.
def extract_comprehensive_score(input_str):
    # Use a regular expression to match the comprehensive score and its value
In comment/testenv/llm_judgement.py, the content written in Chinese should be translated into English to facilitate international understanding and usage.
def extract_comprehensive_score(input_str):
    # Use a regular expression to match the comprehensive score and its value
In issue/testenv/llm_judgement.py, the content written in Chinese should be translated into English to facilitate international understanding and usage.
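For readers without the diff at hand, the function quoted above could look roughly like the sketch below. The body of `extract_comprehensive_score` is not shown in this conversation, so the pattern and the assumed judge output format (a phrase such as "Comprehensive score: 85") are hypothetical and only illustrate the regex-based approach the code comment describes.

```python
import re
from typing import Optional

# Hypothetical format: the judge model's reply is assumed to contain a phrase
# like "Comprehensive score: 85" or "Comprehensive score: 7.5/10".
# Both the ASCII colon and the full-width colon are accepted.
SCORE_PATTERN = re.compile(
    r"comprehensive\s+score\s*[::]\s*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_comprehensive_score(input_str: str) -> Optional[float]:
    """Return the first comprehensive score found in input_str, or None."""
    match = SCORE_PATTERN.search(input_str)
    if match is None:
        return None
    return float(match.group(1))
```

Returning `None` when nothing matches lets the caller decide how to treat a reply that does not follow the expected format, rather than silently scoring it as zero.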
Besides, there are also CI issues that remain to be solved, see https://github.com/kubeedge/ianvs/actions/runs/11524309022/job/32095357988?pr=159
For example
Run if [ "3.7" = "3.9" ]; then
************* Module core.testenvmanager.dataset.dataset
core/testenvmanager/dataset/dataset.py:22:0: C0301: Line too long (106/100) (line-too-long)
core/testenvmanager/dataset/dataset.py:22:0: E0611: No name 'JsonlDataParse' in module 'sedna.datasources' (no-name-in-module)
core/testenvmanager/dataset/dataset.py:22:0: E0611: No name 'JSONMetaDataParse' in module 'sedna.datasources' (no-name-in-module)
core/testenvmanager/dataset/dataset.py:28:0: R0902: Too many instance attributes (9/7) (too-many-instance-attributes)
core/testenvmanager/dataset/dataset.py:19:0: W0611: Unused import json (unused-import)
-----------------------------------
Your code has been rated at 9.92/10
Error: Process completed with exit code 30.
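The exit code 30 in the log above is consistent with the message categories it lists. Pylint's exit status is a bit field (1 fatal, 2 error, 4 warning, 8 refactor, 16 convention, 32 usage error), so 30 = 2 + 4 + 8 + 16, matching the E0611, W0611, R0902, and C0301 messages. A minimal sketch for decoding such a status follows; the helper name `decode_pylint_exit_code` is hypothetical and is not part of Pylint or Ianvs.

```python
from typing import List

# Pylint's exit status is a bit field; each bit marks one message category.
PYLINT_EXIT_BITS = {
    1: "fatal",
    2: "error",
    4: "warning",
    8: "refactor",
    16: "convention",
    32: "usage error",
}


def decode_pylint_exit_code(code: int) -> List[str]:
    """List the message categories encoded in a Pylint exit status."""
    return [name for bit, name in PYLINT_EXIT_BITS.items() if code & bit]


print(decode_pylint_exit_code(30))
# → ['error', 'warning', 'refactor', 'convention']
```

The high 9.92/10 rating does not help here: the status is non-zero as soon as any message in any category is emitted.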
Already revised according to https://github.com/kubeedge/ianvs/actions/runs/11524309022/job/32095357988?pr=159
New CI errors see https://github.com/kubeedge/ianvs/actions/runs/11529775773/job/32099094184?pr=159
Run if [ "3.7" = "3.9" ]; then
************* Module core.testenvmanager.dataset.dataset
core/testenvmanager/dataset/dataset.py:468:0: C0304: Final newline missing (missing-final-newline)
-----------------------------------
Your code has been rated at 9.99/10
Error: The operation was canceled.
Understood. It looks like my end-of-line format setting was wrong, an operating-system formatting issue. I am fixing it.
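The C0304 (missing-final-newline) message means the last line of the file does not end with a newline character, which can happen when an editor or operating system uses different line-ending settings. The sketch below shows one way to normalize a file's text before committing; the function name `normalize_line_endings` is hypothetical, and the fix actually applied in this PR is not shown in the conversation.

```python
def normalize_line_endings(text: str) -> str:
    """Convert CRLF/CR line endings to LF and ensure a final newline."""
    # Replace Windows (CRLF) endings first, then any remaining bare CR.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Pylint's C0304 fires when a non-empty file lacks a trailing newline.
    if text and not text.endswith("\n"):
        text += "\n"
    return text
```

When applying this to a file on disk, reading and writing in binary mode (or opening with `newline=""`) prevents Python from translating line endings on its own.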
The proposals are not needed in this implementation PR. Shall move the proposal to the proposal PR, i.e., #120
Kindly remind: 1) need to remove proposals in implementation PR; 2) translate words in Chinese. See comments in #159 (review)
The related proposal has been deleted; subsequent commits will only update the code.
This pull request contains 10 commits, which might make maintenance difficult, considering the number of contributors, pull requests, and their commits in KubeEdge Ianvs recently.
After all the comments are tackled, in the final stage, @safe-b can squash the commits into one using rebase techniques.
@@ -0,0 +1,129 @@
# Background
As mentioned, proposals shall be kept in the proposal PR. If it is needed in the test reports, then this document shall be written as test reports like this link.
@@ -0,0 +1,129 @@
# Background
In test reports, all the content written in Chinese should be translated into English to facilitate international understanding and usage.
Comments include
- fix all statements written in Chinese
- This branch has conflicts that must be resolved. Use the web editor or the command line to resolve conflicts. Conflicting files: core/testenvmanager/dataset/dataset.py
- This pull request contains 10 commits, which might make maintenance difficult, considering the number of contributors, pull requests, and their commits in KubeEdge Ianvs recently. After all the comments are tackled, in the final stage, @safe-b can squash the commits into one using rebase techniques.
English issues have been fixed.
- This branch has conflicts that must be resolved. Use the web editor or the command line to resolve conflicts. Conflicting files: core/testenvmanager/dataset/dataset.py
- This pull request contains 17 commits, which might make maintenance difficult, considering the number of contributors, pull requests, and their commits in KubeEdge Ianvs recently. After all the comments are tackled, in the final stage, @safe-b can squash the commits into one using rebase techniques.
- The proposal PR "add a proposal of Smart Coding benchmark suite" #120 still needs to be merged.
Signed-off-by: boX <442572328@qq.com>
update and improve the proposal Improve the architecture diagram
Signed-off-by: boX <442572328@qq.com>
update and improve the proposal
Signed-off-by: boX <442572328@qq.com>
update and improve the proposal
Signed-off-by: boX <442572328@qq.com>
update and improve the proposal
Signed-off-by: boX <442572328@qq.com>
updated smart_coding large model benchmark
Signed-off-by: boX <442572328@qq.com>
fix pylint check problem and updated smart_coding large model benchmark
Signed-off-by: boX <442572328@qq.com>
delete Chinese proposal
Signed-off-by: boX <442572328@qq.com>
fix pylint check problem and updated smart_coding large model benchmark
Signed-off-by: boX <442572328@qq.com>
delete Chinese proposal
Signed-off-by: boX <442572328@qq.com>
fix pylint check problem
Signed-off-by: boX <442572328@qq.com>
fix pylint check problem and updated smart_coding large model benchmark
Signed-off-by: boX <442572328@qq.com>
fix pylint check problem and updated smart_coding large model benchmark
Signed-off-by: boX <442572328@qq.com>
/lgtm
/lgtm
/approve
All concerns are fixed. Well done! @safe-b
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: hsj576, MooreZheng
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
This PR is the implementation of #98