Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support AIX #605

Closed
wiggin15 opened this issue Mar 8, 2015 · 36 comments
Closed

support AIX #605

wiggin15 opened this issue Mar 8, 2015 · 36 comments

Comments

@wiggin15
Copy link
Collaborator

wiggin15 commented Mar 8, 2015

I see it's in the TODO file, filed under "lower priority", but I wanted a ticket to be open because we actually encountered this:

root@aixio003 (AIX 7.1.0.0) ➜  ~  easy_install psutil                                                                                                                                                                                                         15-03-08 17:46
Searching for psutil
Reading http://pypi.python.org/simple/psutil/
Best match: psutil 2.2.1
Downloading https://pypi.python.org/packages/source/p/psutil/psutil-2.2.1.tar.gz#md5=1a2b58cd9e3a53528bb6148f0c4d5244
Processing psutil-2.2.1.tar.gz
Writing /tmp/easy_install-GaoOfH/psutil-2.2.1/setup.cfg
Running psutil-2.2.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-GaoOfH/psutil-2.2.1/egg-dist-tmp-9X2hV0
error: Setup script exited with platform aix7 is not supported
@giampaolo
Copy link
Owner

giampaolo commented Mar 8, 2015

Yes, AIX is not supported. Considering that this is a niche platform and that I don't have access to any AIX box it's very unlikely this is gonna get pushed forward, unless somebody steps in and decides to contribute a patch.

@wiggin15
Copy link
Collaborator Author

I started working on this here: Infinidat@2d037eb
There are still some missing functions, but if I open a pull request will you consider merging?

@mffrench
Copy link

will be interested on that too... :)

@giampaolo
Copy link
Owner

@wiggin15 what's the status of that branch of yours? What APIs are not implemented?

@wiggin15
Copy link
Collaborator Author

We're using it here at my company. I think it is mostly stable. There are 10 API functions not implemented (5 of them are methods of the Process class) but I added documentation and disabled them for AIX in the latest commit. See https://github.com/Infinidat/psutil/commits/aix

(Please note that the AIX branch is also rebased on top of the branch in pull request #610 which is also not merged yet)

@giampaolo
Copy link
Owner

giampaolo commented Jul 21, 2015

I'm sorry but IMO 10 unsupported APIs means this is not acceptable/ready for inclusion yet. It would mean over complicating the existent code and tests with ifs, unittest.skipIf(AIX) etc., and make the doc less readable.
Please note that it is also OK to parse the output of CLI tools if the functionality is too hard to implement in C. We currently do this to provide swap_memory stats on Solaris, if I'm not mistaken.

@giampaolo
Copy link
Owner

@wiggin15: looking back at this. I wonder if it's possible to install AIX on a laptop / virtualbox so that I can try to take a look at this myself. https://github.com/Infinidat/psutil/blob/aix/psutil/_psutil_aix.c looks similar to the Solaris implementation, which is encouraging.

@wiggin15
Copy link
Collaborator Author

wiggin15 commented Sep 6, 2015

I'm afraid AIX runs only on PowerPC architecture, and virtualization technologies like VirtualBox/VMWare etc. can't run it (IBM have their own virtualization solution called VIOS, but it requires a PowerPC server). I don't know how I can help with that.

@wiggin15
Copy link
Collaborator Author

I updated the branch and it's now down to only two missing methods.
I'm not sure it's ready for inclusion yet, though. Some tests are failing (most because AIX behaves differently than the tests expect but not only), and it should be run and tested on other environments than what I'm running - AIX environments are sometimes very different.

@giampaolo
Copy link
Owner

What are the 2 missing methods? Could you paste a test result?

@wiggin15
Copy link
Collaborator Author

I committed this file with the missing methods and a list of failing tests:
https://github.com/Infinidat/psutil/blob/aix/psutil/TODO.aix
Here's a full output of a test run from now:
https://gist.github.com/wiggin15/fa7469f5638bb52acc8e

@giampaolo
Copy link
Owner

This is very good progress. Kudos to you!
Some comments.

    def test_children(self):
        p = psutil.Process()
>       self.assertEqual(p.children(), [])
E       AssertionError: Lists differ: [<psutil.Process(pid=19267672,... != []

This is weird as there shouldn't be any subprocess (parent of os.getpid()) at the beginning of the test, meaning reap_children() function, which is run before any uniitest, didn't work as expected.
Also the PID is very high (19267672) which suggests that Process.ppid() implementation might be the issue (Process.children() relies entirely on it).
The fact that reap_children (which also relies on children()) didn't work looks like a confirmation of that fact. Are we sure this return Py_BuildValue("KKKdiiiK", (unsigned long long) info.pr_ppid, ... is correct in terms of types? Is pr_ppid an unsigned long long? Try to search its definition under /usr/include.

TestProcess.test_cmdline long args are cut from cmdline in /proc/pid/psinfo and getargs (flaky)

Does ps show the full cmdline? If there's no way to obtain the full cmdline I would add a // XXX cmdline is truncated to 15 chars comment in the C source code and update doc in accordance.

E AttributeError: 'Process' object has no attribute 'memory_maps'

If there's no way to implement memory_maps() I suggest to just remove support for it, skip the test as in @unittest.skipIf(AIX, "not supported on AIX") and update the doc.

E AttributeError: 'Process' object has no attribute 'num_ctx_switches'

Same here: remove support, skip test and update doc.

    def test_pid_0(self):
         ...
>       self.assertTrue(p.name())
E       AssertionError: '' is not true

Does ps show a name? If so you can enforce that name in pure python as in if self.pid == 0: return "name" otherwise it is fine to skip the test.

def test_prog_w_funky_name(self):

You can simply skip this test.

@wiggin15
Copy link
Collaborator Author

Thanks for the feedback.

  1. I also noticed that reap_children isn't working and that this causes most of the "flaky" test failures. I started to debug and noticed that the children aren't stopped because they return False from is_running even though they are running - because of this check in is_running:
    return self == Process(self.pid)
    the new process instance ident doesn't match the one we have. One of them (I don't remember which) has 0 in the creation time (which is read from procfs). I don't know how that can be. I double checked the procfs struct definitions, they're ok. The pids are correct (most pids on the system are very high - 6-8 digits). I didn't investigate further. AIX is one weird system.
  2. ps doesn't show the full command line (and procfs and getargs don't either). It's not exactly that the command line is truncated. Instead, very long arguments (like python code we pass after "-c") are "skipped". I can document this and skip this test.
  3. pid 0 has a name in ps ("swapper") but it doesn't show in procfs and other API functions I tried. You're right, maybe I can just check this in the pure python code and return that name there.
  4. test_prog_w_funky_name - I can skip this test. This needs to be skipped in Solaris too, by the way. It doesn't work there either.
  5. I can skip the 3 tests that fail because of the two missing methods too. I already updated the docs about what's missing and more relevant notes.

@giampaolo
Copy link
Owner

I started to debug and noticed that the children aren't stopped because they return False from
is_running even though they are running - because of this check in is_running:
return self == Process(self.pid)
the new process instance ident doesn't match the one we have.

So you may want to look at how you determine process creation time because that is what is used to identify a process uniquely over time. It may be that creation time has a C overflow / typing issue similarly to ppid.

One of them (I don't remember which) has 0 in the creation time (which is read from procfs).

Maybe it's PID 0? In that case I recommend to return the boot time (instead of 0) in pure python as in if self.pid == 0: return boot_time()

The pids are correct (most pids on the system are very high - 6-8 digits). I didn't investigate further.
AIX is one weird system.

I see. OK.

test_prog_w_funky_name - I can skip this test. This needs to be skipped in Solaris too, by the way. It
doesn't work there either.

In this case skip the tests on both platforms.

@dmurphy18
Copy link

@wiggin15 wondering if there is any further work for support of AIX or has the work been dropped ?

@wiggin15
Copy link
Collaborator Author

wiggin15 commented Mar 9, 2017

Hi @dmurphy18 . We're still using the AIX branch at our company and I'm rebasing it over master from time to time, but I stopped working on the tests. All the tests left to fix are due to quirky behavior on AIX so I don't know how to proceed with those.

@giampaolo I don't think there's a fix to the tests that fail, they just won't work on AIX, so the options are:

  1. skip all the failing tests on AIX (there's 18 of them!).
  2. merge the branch as-is and the tests on AIX will not pass completely.
  3. do nothing and not merge the AIX branch

@giampaolo
Copy link
Owner

@wiggin15 can you please re-base from latest master and paste the output failures? If somehow you could give me access to the AIX box via SSH I could attempt to fix them.
Also, what are the missing features in terms of APIs?

@wiggin15
Copy link
Collaborator Author

wiggin15 commented Mar 9, 2017

I rebased the branch on top of latest master: https://github.com/infinidat/psutil/tree/aix

Here is the latest test run output. Apparently there are some new tests that fail :(
https://gist.github.com/wiggin15/16a1f7134efe13ddfae22137c6ba9280
explanations for some of the failing tests can be found here:
https://github.com/Infinidat/psutil/blob/aix/psutil/TODO.aix
(the "flaky" test failures are caused by a race condition in procfs, where the creation time is not reflected correctly when the process starts, causing new test processes to be not found)

Regarding missing API: only two methods of the Process class are not implemented:

psutil.Process.memory_maps
psutil.Process.num_ctx_switches

This is documented in the README on the branch.
In addition, the oneshot context manager is empty on AIX and does not implement any optimization.

I'm afraid it's not possible to grant ssh access to our AIX machines, as they are in the internal company network and not internet-facing. (It may be possible to arrange WebEx sessions with screen sharing for very limited time access, e.g. if there are specific issues to debug...)

@dmurphy18
Copy link

@giampaolo @wiggin15 FYI - There is an IBM Partner world access program whereby you can get access for a limited time (up to 2 weeks) to AIX machine's on IBM's infrastructure, see url https://www-356.ibm.com/partnerworld/wps/servlet/ContentHandler/stg_com_sys_power-development-platform

SaltStack leverages this to develop on AIX currently, systems are pretty basic but you get full control of them. Also able to get PowerLinux variants too via that program.

@bthery
Copy link

bthery commented Apr 6, 2017

Hi,

For some development done here, we're interested in having the AIX support merged in the psutil module.

For starting we will use @wiggin15's repository, but that would be great to have it included in the official repo as this would ease deployment a lot: ie. we will benefit from the automatic installation from pip when psutil is a dependency.

We will do some testing, and report any error we may find.

@jomann09
Copy link
Contributor

I am also using @wiggin15's repo for AIX builds of psutil. Unfortunately users aren't working properly for me, I had to edit https://github.com/Infinidat/psutil/blob/bc5b3ad6fb28d7225cece9697a91d47056260e1d/psutil/_psaix.py#L167 in order to get it to work by adding the pid like most others have. If the very last value is missing it throws an error, at least on the AIX box I am using.

@giampaolo
Copy link
Owner

If Process.memory_maps and Process.num_ctx_switches are the only missing APIs this is definitively good to merge in terms of mere functionality. What's missing is a rebase with latest 5.3.0 release (code changed quite a bit), see the situation of the unittests and make a PR.

@wiggin15 wiggin15 mentioned this issue Sep 5, 2017
@wiggin15
Copy link
Collaborator Author

wiggin15 commented Sep 5, 2017

I rebased the branch and (finally) opened a PR.
The situation of the unittests is currently not so great. Here is the output of the latest run from today after the rebase: https://gist.github.com/wiggin15/52ecddd4a9729ca187fe2ac76bb39c2a
There are probably some tests that can be fixed but I didn't look into it again yet. There are some tests that should probably be skipped on AIX due to different behavior on this platform. See TODO.aix in the pull request.

@giampaolo
Copy link
Owner

This is great work, thanks. Most connections-related failures are due to address not being a namedtuple. Just do this in order to fix it:
https://github.com/Infinidat/psutil/blob/c895a4aa1a08e7dac7213ab1c2458b0a28cea343/psutil/_psbsd.py#L721-L723
I will start commenting about other failures on the PR itself.
In the meantime let me ask you some questions.
This is the first time I'm going to add support for a platform I'll likely never be able to test myself and it makes me a bit nervous. As such I'm going to clearly state in the doc that AIX support is experimental for now. Are you willing to officially take over maintenance of the AIX platform?
Also, what happens if you quit your job? Is there somebody else who can take over maintenance in the future?

@giampaolo
Copy link
Owner

giampaolo commented Sep 5, 2017

Also, just to be clear, keep in mind that if in case AIX support in psutil becomes unstable due to lack of active maintaintance I will likely drop it at some point, the same way I dropped support for Windows XP. Including AIX support now that psutil is stable(5.3.0) would still be a win though, as we can point future AIX users to this psutil version (in the doc).

@giampaolo
Copy link
Owner

First review is done. Overall, really great work. This really looks promising.

@wiggin15
Copy link
Collaborator Author

wiggin15 commented Sep 7, 2017

@jomann09 I fixed the users function on the AIX branch.

@jomann09
Copy link
Contributor

jomann09 commented Sep 7, 2017

Awesome, that's what I did to fix it too @wiggin15

@giampaolo
Copy link
Owner

New psutil 5.4.0 is finally out with AIX support included! Thanks to @wiggin15 who made this possible (see #1123).

@oxben
Copy link

oxben commented Oct 12, 2017

Great. Thank you.

@giampaolo
Copy link
Owner

@wiggin15
Copy link
Collaborator Author

Wow. Thanks for the compliments @giampaolo 😊

@giampaolo
Copy link
Owner

@wiggin15 they were well deserved. =)
I was thinking about publicizing this a bit on some "official" AIX mailing list or something. I bumped into this one:
http://lists.princeton.edu/cgi-bin/wa?A0=aix-l
...but it seems pretty dead. Are you aware of any other AIX discussion forum?

@wiggin15
Copy link
Collaborator Author

I'm not aware of any AIX discussion forum or mailing list. Maybe it's one of the reasons why information about AIX is so scarce :)

@bthery
Copy link

bthery commented Oct 19, 2017

I can probably help a bit here.
I'll point this post to some people at IBM who may be interested.

@dmurphy18
Copy link

dmurphy18 commented Oct 24, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants