Hi there, I am trying to write a script to visualise the cartpole swingup task under the cart sparse policy, but I am a little confused and an error occurs when I run it. Could you show me a simple script to visualise the task, please? Here is my code; the 'cart_sparse_policy' is generated by cart.py:

import numpy as np
import matplotlib.pyplot as plt
from ray.rllib.env.wrappers.dm_control_wrapper import DMCEnv
import sindy_rl.policy
from sindy_rl.sindy_utils import build_optimizer
from pysindy import PolynomialLibrary
import pickle
env = DMCEnv('cartpole', task_name='swingup', height=480, width=480)
env.reset()
cam_id = 0
plt.ion()
fig, ax = plt.subplots(figsize=(6, 6))
img = ax.imshow(np.zeros((480, 480, 3), dtype=np.uint8))
alpha = 1e-6
thresh = 1e-5
n_models = 20
poly_deg = 3
include_bias = False
optimizer_config = {
    'base_optimizer': {
        'name': 'STLSQ',
        'kwargs': {
            'alpha': alpha,
            'threshold': thresh,
        }
    },
    'ensemble': {
        'bagging': True,
        'library_ensemble': True,
        'n_models': n_models,
    }
}
optimizer = build_optimizer(optimizer_config)
with open('cart_sparse_policy.pkl', 'rb') as f:
    cart_sparse_policy = pickle.load(f)
print(cart_sparse_policy)
# Polynomial Library
feature_library = PolynomialLibrary(degree=poly_deg, include_bias=include_bias, include_interaction=True)
n_control = env.action_space.shape[0]
SparseEnsemblePolicy = cart_sparse_policy
for episode in range(100):
    print(f"Episode {episode + 1}")
    obs = env.reset()
    # obs = env.step(0.0 * env.action_space.sample())
    print("Observation shape:", np.array(obs).shape)
    done = False
    total_reward = 0
    while not done:
        pixels = env.render(camera_id=cam_id)
        print("Rendered Image Shape:", pixels.shape)
        print("Pixel values range:", pixels.min(), pixels.max())
        if pixels is None or pixels.size == 0:
            print("Warning: Rendered image is empty!")
            continue
        img.set_data(pixels)
        plt.draw()
        plt.pause(0.01)
        observation = obs['observation'] if isinstance(obs, dict) and 'observation' in obs else np.array(obs)
        action = SparseEnsemblePolicy.compute_action(observation)
        # action = random_policy.compute_action(observation)
        step_output = env.step(action)
        print(f"Step output: {step_output}")
        obs = step_output[0]        # observation
        reward = step_output[1]     # reward
        done = step_output[2]       # done
        truncated = step_output[3]  # truncated
        info = step_output[4]       # info
        total_reward += reward
        print(f"Total Reward so far: {total_reward}")
    print(f"Episode {episode + 1} finished. Total Reward: {total_reward}")
Hi there @Vcbby! So sorry for the delay, this flew under my radar. What error are you getting? And do you mind letting me know what OS you're running the code on (e.g. Windows, Linux, macOS) and whether you're using Docker? I've noticed that dm_control can act kind of weird in Docker (and on machines you ssh into) because it searches for a way to access a GUI and can't find one. I think there are some tricks to fixing this, but I'm less experienced with them. In theory, something like the following should work; I just got this to run locally in a Jupyter notebook on an M2 MacBook Air with Python=3.9.13:

import numpy as np
import matplotlib.pyplot as plt
from ray.rllib.env.wrappers.dm_control_wrapper import DMCEnv
import pickle
import os
# setup environment
cam_id = 0
env = DMCEnv('cartpole', task_name='swingup', height=480, width=480)
# load policy
policy_path = '/path/to/policy'
with open(policy_path, 'rb') as f:
    cart_sparse_policy = pickle.load(f)
print(cart_sparse_policy)
num_steps = 1000
obs_list = []
pixel_list = []
obs = env.reset()
# obs, info = env.reset() <--- Depends on the version of ray/gymnasium you have installed
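# Note (untested sketch): if you're unsure which API version you have installed,
# one defensive pattern is to only unpack when reset() returns a (obs, info) tuple:
# reset_out = env.reset()
# obs = reset_out[0] if isinstance(reset_out, tuple) else reset_out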
for step_idx in range(num_steps):
    # query action
    action = cart_sparse_policy.compute_action(obs)
    # step in environment
    result = env.step(action)
    obs = result[0]
    obs_list.append(obs)
    # extract pixels to render later
    pixels = env.render(camera_id=cam_id)
    pixel_list.append(pixels)
# render pixels
plt.imshow(pixel_list[-1])
plt.show()
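If you are running headless (e.g. inside Docker or over ssh) and the render call fails, one trick I've seen (sketched below, not tested in this exact setup) is to force MuJoCo's off-screen renderer through the MUJOCO_GL environment variable before anything dm_control-related gets imported:

import os
# MUJOCO_GL must be set before MuJoCo/dm_control is first imported.
os.environ['MUJOCO_GL'] = 'egl'  # or 'osmesa' if EGL drivers aren't available

from ray.rllib.env.wrappers.dm_control_wrapper import DMCEnv

env = DMCEnv('cartpole', task_name='swingup', height=480, width=480)
obs = env.reset()
pixels = env.render(camera_id=0)  # (480, 480, 3) uint8 array, rendered off-screen
print(pixels.shape)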
@Vcbby, glad you were able to solve this! I can't claim to be a roboticist, so I'm afraid I'm not going to be very helpful here. MuJoCo is just a physics engine, but people have been using it to create custom robotics models (which I think can be defined using XML?) and propagate the physics/constraints within it. It looks like there have been previous attempts at building Cassie models in MuJoCo [1,2].
gymnasium, on the other hand, is just a convenient API for wrapping simulators and is commonly accepted by many of the RL packages out there; I tried to make my code compliant with it. It looks like [1] tried to do this for Cassie by wrapping the MuJoCo environment. I believe that dm_control
…
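(In case it's useful: MuJoCo models are indeed defined in its MJCF XML format, and dm_control can load an XML string directly. A minimal sketch, not Cassie-specific and just to illustrate the idea:)

from dm_control import mujoco

# A tiny MJCF (MuJoCo XML) model: a single free-falling box.
MINIMAL_MJCF = '''
<mujoco>
  <worldbody>
    <body name='box' pos='0 0 1'>
      <joint type='free'/>
      <geom type='box' size='0.1 0.1 0.1'/>
    </body>
  </worldbody>
</mujoco>
'''

physics = mujoco.Physics.from_xml_string(MINIMAL_MJCF)
for _ in range(100):
    physics.step()            # propagate the physics forward in time
print(physics.data.qpos)      # generalized positions after 100 steps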