Add Docker executor (#733)

Co-authored-by: Touseef Ahmad <touseefahmed9669@gmail.com>
Co-authored-by: Parteek <parteekkamboj112@gmail.com>
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Co-authored-by: sysradium <sysradium@users.noreply.github.com>
huggingface · Feb 26, 2025 · d0c3f43
1 parent 9498094
commit d0c3f43
Showing 21 changed files with 537 additions and 293 deletions.
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -78,9 +78,9 @@ jobs:
           uv run pytest ./tests/test_local_python_executor.py
         if: ${{ success() || failure() }}
 
-      - name: E2B executor tests
+      - name: Remote executor tests
         run: |
-          uv run pytest ./tests/test_e2b_executor.py
+          uv run pytest ./tests/test_remote_executors.py
         if: ${{ success() || failure() }}
 
       - name: Search tests

diff --git a/docs/source/en/guided_tour.mdx b/docs/source/en/guided_tour.mdx
@@ -186,7 +186,8 @@ agent.run("Could you get me the title of the page at url 'https://huggingface.co
 
 The execution will stop at any code trying to perform an illegal operation or if there is a regular Python error with the code generated by the agent.
 
-You can also use [E2B code executor](https://e2b.dev/docs#what-is-e2-b) instead of a local Python interpreter by first [setting the `E2B_API_KEY` environment variable](https://e2b.dev/dashboard?tab=keys) and then passing `use_e2b_executor=True` upon agent initialization.
+You can also use [E2B code executor](https://e2b.dev/docs#what-is-e2-b) or Docker instead of a local Python interpreter. For E2B, first [set the `E2B_API_KEY` environment variable](https://e2b.dev/dashboard?tab=keys) and then pass `executor_type="e2b"` upon agent initialization. For Docker, pass `executor_type="docker"` during initialization.
+
 
 > [!TIP]
 > Learn more about code execution [in this tutorial](tutorials/secure_code_execution).

diff --git a/docs/source/en/tutorials/secure_code_execution.mdx b/docs/source/en/tutorials/secure_code_execution.mdx
@@ -48,7 +48,7 @@ One could argue that on the [spectrum of agency](../conceptual_guides/intro_agen
 So you need to be mindful of security.
 
 To add a first layer of security, code execution in `smolagents` is not performed by the vanilla Python interpreter.
-We have re-built a more secure `LocalPythonInterpreter` from the ground up.
+We have re-built a more secure `LocalPythonExecutor` from the ground up.
 
 To be precise, this interpreter works by loading the Abstract Syntax Tree (AST) from your Code and executes it operation by operation, making sure to always follow certain rules:
 - By default, imports are disallowed unless they have been explicitly added to an authorization list by the user.
@@ -90,11 +90,11 @@ pip install 'smolagents[e2b]'
 
 #### Running your agent in E2B: mono agents
 
-We provide a simple way to use an E2B Sandbox: simply add `use_e2b_executor=True` to the agent initialization, like:
+We provide a simple way to use an E2B Sandbox: simply add `executor_type="e2b"` to the agent initialization, like:
 ```py
 from smolagents import HfApiModel, CodeAgent
 
-agent = CodeAgent(model=HfApiModel(), tools=[], use_e2b_executor=True)
+agent = CodeAgent(model=HfApiModel(), tools=[], executor_type="e2b")
 
 agent.run("Can you give me the 100th Fibonacci number?")
 ```

diff --git a/docs/source/hi/guided_tour.mdx b/docs/source/hi/guided_tour.mdx
@@ -124,7 +124,7 @@ agent.run("Could you get me the title of the page at url 'https://huggingface.co
 
 एक्जीक्यूशन किसी भी कोड पर रुक जाएगा जो एक अवैध ऑपरेशन करने का प्रयास करता है या यदि एजेंट द्वारा जनरेट किए गए कोड में एक रेगुलर पायथन एरर है।
 
-आप [E2B कोड एक्जीक्यूटर](https://e2b.dev/docs#what-is-e2-b) का उपयोग लोकल पायथन इंटरप्रेटर के बजाय कर सकते हैं, पहले [`E2B_API_KEY` एनवायरनमेंट वेरिएबल सेट करके](https://e2b.dev/dashboard?tab=keys) और फिर एजेंट इनिशियलाइजेशन पर `use_e2b_executor=True` पास करके।
+आप [E2B कोड एक्जीक्यूटर](https://e2b.dev/docs#what-is-e2-b) या Docker का उपयोग लोकल पायथन इंटरप्रेटर के बजाय कर सकते हैं। E2B के लिए, पहले [`E2B_API_KEY` एनवायरनमेंट वेरिएबल सेट करें](https://e2b.dev/dashboard?tab=keys) और फिर एजेंट इनिशियलाइजेशन पर `executor_type="e2b"` पास करें। Docker के लिए, इनिशियलाइजेशन के दौरान `executor_type="docker"` पास करें।
 
 > [!TIP]
 > कोड एक्जीक्यूशन के बारे में और जानें [इस ट्यूटोरियल में](tutorials/secure_code_execution)।

diff --git a/docs/source/hi/tutorials/secure_code_execution.mdx b/docs/source/hi/tutorials/secure_code_execution.mdx
@@ -41,7 +41,7 @@ rendered properly in your Markdown viewer.
 ### लोकल पायथन इंटरप्रेटर
 
 डिफ़ॉल्ट रूप से, `CodeAgent` LLM-जनरेटेड कोड को आपके एनवायरनमेंट में चलाता है।
-यह एक्जीक्यूशन वैनिला पायथन इंटरप्रेटर द्वारा नहीं किया जाता: हमने एक अधिक सुरक्षित `LocalPythonInterpreter` को शुरू से फिर से बनाया है।
+यह एक्जीक्यूशन वैनिला पायथन इंटरप्रेटर द्वारा नहीं किया जाता: हमने एक अधिक सुरक्षित `LocalPythonExecutor` को शुरू से फिर से बनाया है।
 यह इंटरप्रेटर सुरक्षा के लिए डिज़ाइन किया गया है:
  - इम्पोर्ट्स को उपयोगकर्ता द्वारा स्पष्ट रूप से पास की गई सूची तक सीमित करना
  - इनफिनिट लूप्स और रिसोर्स ब्लोटिंग को रोकने के लिए ऑपरेशंस की संख्या को कैप करना
@@ -64,7 +64,7 @@ rendered properly in your Markdown viewer.
 
 अब आप तैयार हैं!
 
-कोड एक्जीक्यूटर को E2B पर सेट करने के लिए, बस अपने `CodeAgent` को इनिशियलाइज़ करते समय `use_e2b_executor=True` फ्लैग पास करें।
+कोड एक्जीक्यूटर को E2B पर सेट करने के लिए, बस अपने `CodeAgent` को इनिशियलाइज़ करते समय `executor_type="e2b"` फ्लैग पास करें।
 ध्यान दें कि आपको `additional_authorized_imports` में सभी टूल की डिपेंडेंसीज़ जोड़नी चाहिए, ताकि एक्जीक्यूटर उन्हें इंस्टॉल करे।
 
 ```py
@@ -73,7 +73,7 @@ agent = CodeAgent(
     tools = [VisitWebpageTool()],
     model=HfApiModel(),
     additional_authorized_imports=["requests", "markdownify"],
-    use_e2b_executor=True
+    executor_type="e2b"
 )
 
 agent.run("What was Abraham Lincoln's preferred pet?")

diff --git a/docs/source/zh/guided_tour.mdx b/docs/source/zh/guided_tour.mdx
@@ -134,7 +134,7 @@ agent.run("Could you get me the title of the page at url 'https://huggingface.co
 
 如果生成的代码尝试执行非法操作或出现常规 Python 错误，执行将停止。
 
-您也可以使用 [E2B 代码执行器](https://e2b.dev/docs#what-is-e2-b) 而不是本地 Python 解释器，首先 [设置 `E2B_API_KEY` 环境变量](https://e2b.dev/dashboard?tab=keys)，然后在初始化 agent 时传递 `use_e2b_executor=True`。
+您也可以使用 [E2B 代码执行器](https://e2b.dev/docs#what-is-e2-b) 或 Docker 而不是本地 Python 解释器。对于 E2B，首先 [设置 `E2B_API_KEY` 环境变量](https://e2b.dev/dashboard?tab=keys)，然后在初始化 agent 时传递 `executor_type="e2b"`。对于 Docker，在初始化时传递 `executor_type="docker"`。
 
 > [!TIP]
 > 在 [该教程中](tutorials/secure_code_execution) 了解更多关于代码执行的内容。

diff --git a/docs/source/zh/tutorials/secure_code_execution.mdx b/docs/source/zh/tutorials/secure_code_execution.mdx
@@ -41,7 +41,7 @@ rendered properly in your Markdown viewer.
 ### 本地 Python 解释器
 
 默认情况下，`CodeAgent` 会在你的环境中运行 LLM 生成的代码。
-这个执行不是由普通的 Python 解释器完成的：我们从零开始重新构建了一个更安全的 `LocalPythonInterpreter`。
+这个执行不是由普通的 Python 解释器完成的：我们从零开始重新构建了一个更安全的 `LocalPythonExecutor`。
 这个解释器通过以下方式设计以确保安全：
   - 将导入限制为用户显式传递的列表
   - 限制操作次数以防止无限循环和资源膨胀
@@ -64,7 +64,7 @@ rendered properly in your Markdown viewer.
 
 现在你已经准备好了！
 
-要将代码执行器设置为 E2B，只需在初始化 `CodeAgent` 时传递标志 `use_e2b_executor=True`。
+要将代码执行器设置为 E2B，只需在初始化 `CodeAgent` 时传递标志 `executor_type="e2b"`。
 请注意，你应该将所有工具的依赖项添加到 `additional_authorized_imports` 中，以便执行器安装它们。
 
 ```py
@@ -73,7 +73,7 @@ agent = CodeAgent(
     tools = [VisitWebpageTool()],
     model=HfApiModel(),
     additional_authorized_imports=["requests", "markdownify"],
-    use_e2b_executor=True
+    executor_type="e2b"
 )
 
 agent.run("What was Abraham Lincoln's preferred pet?")

diff --git a/examples/e2b_example.py b/examples/e2b_example.py
diff --git a/examples/sandboxed_execution.py b/examples/sandboxed_execution.py
@@ -0,0 +1,12 @@
+from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel
+
+
+model = HfApiModel()
+
+agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model, executor_type="docker")
+output = agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
+print("Docker executor result:", output)
+
+agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model, executor_type="e2b")
+output = agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
+print("E2B executor result:", output)
diff --git a/pyproject.toml b/pyproject.toml
@@ -33,8 +33,8 @@ audio = [
   "smolagents[torch]",
 ]
 docker = [
-  "docker",
-  "python-dotenv>=1.0.1",
+  "docker>=7.1.0",
+  "websocket-client",
 ]
 e2b = [
   "e2b-code-interpreter>=1.0.3",
@@ -68,7 +68,7 @@ transformers = [
   "smolagents[torch]",
 ]
 all = [
-  "smolagents[audio,e2b,gradio,litellm,mcp,openai,telemetry,transformers]",
+  "smolagents[audio,docker,e2b,gradio,litellm,mcp,openai,telemetry,transformers]",
 ]
 quality = [
   "ruff>=0.9.0",

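The `docker` extra above now pins `docker>=7.1.0` and adds `websocket-client`. As a rough illustration of how a caller might check that both are importable before asking for `executor_type="docker"`, here is a minimal sketch. `missing_modules` is a hypothetical helper that is not part of this commit, and it assumes the import names `docker` and `websocket` (the module that `websocket-client` installs).

```python
import importlib.util
from typing import Iterable, List

# Assumed import names for the two packages listed in the `docker` extra:
# `docker` (Docker SDK for Python) and `websocket` (from websocket-client).
DOCKER_EXTRA_MODULES = ("docker", "websocket")


def missing_modules(module_names: Iterable[str] = DOCKER_EXTRA_MODULES) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("Try: pip install 'smolagents[docker]'")
    else:
        print("The docker extra looks installed.")
```

The check only confirms that the modules can be found; it does not verify that a Docker daemon is running.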
diff --git a/src/smolagents/__init__.py b/src/smolagents/__init__.py
@@ -19,12 +19,12 @@
 from .agent_types import *  # noqa: I001
 from .agents import *  # Above noqa avoids a circular dependency due to cli.py
 from .default_tools import *
-from .e2b_executor import *
 from .gradio_ui import *
 from .local_python_executor import *
 from .memory import *
 from .models import *
 from .monitoring import *
+from .remote_executors import *
 from .tools import *
 from .utils import *
 from .cli import *
diff --git a/src/smolagents/agents.py b/src/smolagents/agents.py
@@ -38,12 +38,7 @@
 
 from .agent_types import AgentAudio, AgentImage, AgentType, handle_agent_output_types
 from .default_tools import TOOL_MAPPING, FinalAnswerTool
-from .e2b_executor import E2BExecutor
-from .local_python_executor import (
-    BASE_BUILTIN_MODULES,
-    LocalPythonInterpreter,
-    fix_final_answer_code,
-)
+from .local_python_executor import BASE_BUILTIN_MODULES, LocalPythonExecutor, PythonExecutor, fix_final_answer_code
 from .memory import ActionStep, AgentMemory, PlanningStep, SystemPromptStep, TaskStep, ToolCall
 from .models import (
     ChatMessage,
@@ -56,6 +51,7 @@
     LogLevel,
     Monitor,
 )
+from .remote_executors import DockerExecutor, E2BExecutor
 from .tools import Tool
 from .utils import (
     AgentError,
@@ -316,7 +312,8 @@ def run(
         self.memory.steps.append(TaskStep(task=self.task, task_images=images))
 
         if getattr(self, "python_executor", None):
-            self.python_executor.update_tools({**self.tools, **self.managed_agents})
+            self.python_executor.send_variables(variables=self.state)
+            self.python_executor.send_tools({**self.tools, **self.managed_agents})
 
         if stream:
             # The steps are returned as they are executed through a generator to iterate on.
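The hunk above replaces the single `update_tools` call with `send_variables` and `send_tools`, both issued once at the start of `run`. The sketch below shows one way an executor with that call shape could behave. `ToyExecutor` is a hypothetical stand-in, not the `PythonExecutor` from this commit; it runs code with plain `exec` (so it is not sandboxed), and reading the output from a variable named `result` is an assumption made only for this example.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToyExecutor:
    """Hypothetical stand-in that mirrors the call shape used in `run` and `step`.

    Not sandboxed: it uses plain `exec`, so it only illustrates the protocol.
    """

    def __init__(self) -> None:
        self.namespace: Dict[str, Any] = {}

    def send_variables(self, variables: Dict[str, Any]) -> None:
        # State is pushed once, up front, instead of being passed on every call.
        self.namespace.update(variables)

    def send_tools(self, tools: Dict[str, Callable[..., Any]]) -> None:
        # Tools become callable names inside the execution namespace.
        self.namespace.update(tools)

    def __call__(self, code_action: str) -> Tuple[Any, str, bool]:
        # Returns (output, execution_logs, is_final_answer), as unpacked in `step`.
        logs: List[str] = []
        # Capture print output as logs instead of writing to stdout.
        self.namespace["print"] = lambda *args: logs.append(" ".join(map(str, args)))
        exec(code_action, self.namespace)
        # Assumption for this sketch: the snippet stores its output in `result`.
        return self.namespace.get("result"), "\n".join(logs), False


executor = ToyExecutor()
executor.send_variables({"x": 20})
executor.send_tools({"double": lambda n: 2 * n})
output, execution_logs, is_final_answer = executor("result = double(x) + 2\nprint('done')")
```

Because variables and tools live in the executor after the two `send_*` calls, the per-step call only needs the code string, which matches the later change to `self.python_executor(code_action)`.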
@@ -665,7 +662,6 @@ def replay(self, detailed: bool = False):
 
     def __call__(self, task: str, **kwargs):
         """Adds additional prompting for the managed agent, runs it, and wraps the output.
-
         This method is called only by a managed agent.
         """
         full_task = populate_template(
@@ -840,8 +836,9 @@ def to_dict(self) -> Dict[str, Any]:
         }
         if hasattr(self, "authorized_imports"):
             agent_dict["authorized_imports"] = self.authorized_imports
-        if hasattr(self, "use_e2b_executor"):
-            agent_dict["use_e2b_executor"] = self.use_e2b_executor
+        if hasattr(self, "executor_type"):
+            agent_dict["executor_type"] = self.executor_type
+            agent_dict["executor_kwargs"] = self.executor_kwargs
         if hasattr(self, "max_print_outputs_length"):
             agent_dict["max_print_outputs_length"] = self.max_print_outputs_length
         return agent_dict
@@ -938,7 +935,8 @@ def from_folder(cls, folder: Union[str, Path], **kwargs):
         )
         if cls.__name__ == "CodeAgent":
             args["additional_authorized_imports"] = agent_dict["authorized_imports"]
-            args["use_e2b_executor"] = agent_dict["use_e2b_executor"]
+            args["executor_type"] = agent_dict["executor_type"]
+            args["executor_kwargs"] = agent_dict["executor_kwargs"]
             args["max_print_outputs_length"] = agent_dict["max_print_outputs_length"]
         args.update(kwargs)
         return cls(**args)
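`from_folder` above reads `agent_dict["executor_type"]` and `agent_dict["executor_kwargs"]` directly, so an agent dictionary saved before this commit, which only carries `use_e2b_executor`, would presumably fail with a `KeyError`. One possible shim is sketched below. `upgrade_agent_dict` is a hypothetical helper that does not appear in this commit.

```python
from typing import Any, Dict


def upgrade_agent_dict(agent_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with the legacy `use_e2b_executor` flag mapped to the new keys."""
    upgraded = dict(agent_dict)
    legacy_flag = upgraded.pop("use_e2b_executor", None)
    if "executor_type" not in upgraded:
        # The old boolean could only choose between E2B and the local executor.
        upgraded["executor_type"] = "e2b" if legacy_flag else "local"
    upgraded.setdefault("executor_kwargs", {})
    return upgraded


legacy = {"authorized_imports": ["requests"], "use_e2b_executor": True}
print(upgrade_agent_dict(legacy))
```

The helper copies its input, so the dictionary loaded from disk is left untouched and a dictionary that already has `executor_type` keeps its value.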
@@ -1131,7 +1129,8 @@ class CodeAgent(MultiStepAgent):
         grammar (`dict[str, str]`, *optional*): Grammar used to parse the LLM output.
         additional_authorized_imports (`list[str]`, *optional*): Additional authorized imports for the agent.
         planning_interval (`int`, *optional*): Interval at which the agent will run a planning step.
-        use_e2b_executor (`bool`, default `False`): Whether to use the E2B executor for remote code execution.
+        executor_type (`str`, default `"local"`): Which executor type to use between `"local"`, `"e2b"`, or `"docker"`.
+        executor_kwargs (`dict`, *optional*): Additional arguments to pass to initialize the executor.
         max_print_outputs_length (`int`, *optional*): Maximum length of the print outputs.
         **kwargs: Additional keyword arguments.
 
@@ -1145,13 +1144,13 @@ def __init__(
         grammar: Optional[Dict[str, str]] = None,
         additional_authorized_imports: Optional[List[str]] = None,
         planning_interval: Optional[int] = None,
-        use_e2b_executor: bool = False,
+        executor_type: str = "local",
+        executor_kwargs: Optional[Dict[str, Any]] = None,
         max_print_outputs_length: Optional[int] = None,
         **kwargs,
     ):
         self.additional_authorized_imports = additional_authorized_imports if additional_authorized_imports else []
         self.authorized_imports = list(set(BASE_BUILTIN_MODULES) | set(self.additional_authorized_imports))
-        self.use_e2b_executor = use_e2b_executor
         self.max_print_outputs_length = max_print_outputs_length
         prompt_templates = prompt_templates or yaml.safe_load(
             importlib.resources.files("smolagents.prompts").joinpath("code_agent.yaml").read_text()
@@ -1169,22 +1168,26 @@ def __init__(
                 "Caution: you set an authorization for all imports, meaning your agent can decide to import any package it deems necessary. This might raise issues if the package is not installed in your environment.",
                 0,
             )
-
-        if use_e2b_executor and len(self.managed_agents) > 0:
-            raise Exception(
-                f"You passed both {use_e2b_executor=} and some managed agents. Managed agents is not yet supported with remote code execution."
-            )
-
-        if use_e2b_executor:
-            self.python_executor = E2BExecutor(
-                self.additional_authorized_imports,
-                self.logger,
-            )
-        else:
-            self.python_executor = LocalPythonInterpreter(
-                self.additional_authorized_imports,
-                max_print_outputs_length=max_print_outputs_length,
-            )
+        self.executor_type = executor_type
+        self.executor_kwargs = executor_kwargs or {}
+        self.python_executor = self.create_python_executor(executor_type, self.executor_kwargs)
+
+    def create_python_executor(self, executor_type: str, kwargs: Dict[str, Any]) -> PythonExecutor:
+        match executor_type:
+            case "e2b" | "docker":
+                if self.managed_agents:
+                    raise Exception("Managed agents are not yet supported with remote code execution.")
+                if executor_type == "e2b":
+                    return E2BExecutor(self.additional_authorized_imports, self.logger, **kwargs)
+                else:
+                    return DockerExecutor(self.additional_authorized_imports, self.logger, **kwargs)
+            case "local":
+                return LocalPythonExecutor(
+                    self.additional_authorized_imports,
+                    max_print_outputs_length=self.max_print_outputs_length,
+                )
+            case _:  # if applicable
+                raise ValueError(f"Unsupported executor type: {executor_type}")
 
     def initialize_system_prompt(self) -> str:
         system_prompt = populate_template(
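`create_python_executor` above dispatches on `executor_type` with a `match` statement, which requires Python 3.10 or later. The same control flow could also be written as a lookup table, as in the sketch below. This is an alternative formulation, not the code from the commit: the `Stub*` classes are hypothetical placeholders for the real executors, and `some_option` is a made-up keyword used only to show forwarding.

```python
from typing import Any, Dict, List, Optional, Type


class StubExecutor:
    """Hypothetical placeholder; the real executors take different arguments."""

    def __init__(self, authorized_imports: List[str], **kwargs: Any) -> None:
        self.authorized_imports = authorized_imports
        self.kwargs = kwargs


class StubLocal(StubExecutor):
    pass


class StubE2B(StubExecutor):
    pass


class StubDocker(StubExecutor):
    pass


REMOTE_TYPES = {"e2b", "docker"}
EXECUTORS: Dict[str, Type[StubExecutor]] = {
    "local": StubLocal,
    "e2b": StubE2B,
    "docker": StubDocker,
}


def create_executor(
    executor_type: str,
    authorized_imports: List[str],
    managed_agents: Dict[str, Any],
    executor_kwargs: Optional[Dict[str, Any]] = None,
) -> StubExecutor:
    if executor_type not in EXECUTORS:
        raise ValueError(f"Unsupported executor type: {executor_type}")
    if executor_type in REMOTE_TYPES and managed_agents:
        raise Exception("Managed agents are not yet supported with remote code execution.")
    # As in the diff, extra kwargs are forwarded only to the remote executors.
    kwargs = (executor_kwargs or {}) if executor_type in REMOTE_TYPES else {}
    return EXECUTORS[executor_type](authorized_imports, **kwargs)


docker_executor = create_executor("docker", ["requests"], {}, {"some_option": 1})
local_executor = create_executor("local", [], {}, {"some_option": 1})
```

A table makes adding a fourth executor a one-line change, while the `match` version keeps the remote-only check visually next to the cases it applies to.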
@@ -1250,7 +1253,7 @@ def step(self, memory_step: ActionStep) -> Union[None, Any]:
         self.logger.log_code(title="Executing parsed code:", content=code_action, level=LogLevel.INFO)
         is_final_answer = False
         try:
-            output, execution_logs, is_final_answer = self.python_executor(code_action, self.state)
+            output, execution_logs, is_final_answer = self.python_executor(code_action)
             execution_outputs_console = []
             if len(execution_logs) > 0:
                 execution_outputs_console += [