Package environment #142

Closed
zenon opened this issue Jun 3, 2020 · 61 comments · Fixed by #844
Labels
backend Concerning the julia server and runtime


@zenon

zenon commented Jun 3, 2020

Hi,

I try and use many different libraries, and only some of them I keep. So, to start a project that uses a certain set of libraries, I enter a new directory, and in Julia I say

] activate "."

And now, I have a clean environment.

Pluto, on the other hand, starts its Notebooks in the standard environment (thus, for me, doesn't find any library).

So I added some cells before my "using XXX" lines:

using Pkg
Pkg.activate("myPath")

Side note: This isn't very portable. If there'd be a better way, I'm happy.

When writing the notebook, everything went fine. When reloading it, the ordering by Pluto kicked in, and all packages were loaded up front, and thus not found (or so I interpret what I saw).
This reordering seems sensible to me in all cases except when working with environments :-)

I have no good idea where to put such meta information. Somehow I either

  • need to be able to tell Pluto that some code lines must get priority (Initializer code)
  • directly have a syntax for activating an environment.

Both make Pluto more complex, I'm afraid.

As "using XXX" has to be on the top level, I don't see a way to put this into functions, like, I mean

# not syntactically correct julia
function initialize()
    using Pkg
    Pkg.activate("myDir")
end

x = do
    initialize()
    using XXX
    using YYY
end

Any ideas?

Kind greetings, z.

@fonsp
Owner

fonsp commented Jun 3, 2020

Hi z,

Very good point! I have been thinking a lot about this lately - I think that every Pluto notebook should start in its own package environment, and there should be a GUI for loading packages, or this can be done with using XXX directly.

The other thing here is that a Pluto notebook should contain that package info! That way a .jl is completely reproducible (as far as packages go). Before going too much into detail - what do you think about this idea?

@fonsp
Owner

fonsp commented Jun 3, 2020

Also, you can use

let
    import Pkg
    Pkg.activate(".")
    Pkg.add("XXX")
end

as your first cell, and it should also be the first cell to run. Does that work?

@zenon
Author

zenon commented Jun 4, 2020

Ah!

Thank you, Fons! I gave up when reading the error message that 'using' must be top level; I didn't expect to find that there are more top levels than I thought. :-)

Yes, a let/end or begin/end block seems to do it. (Assuming that my single experiment showed characteristic behavior.) Thank you!

I'll comment with thoughts on a general solution in the next hours.

Kind greetings, z

@zenon
Author

zenon commented Jun 4, 2020

Some thoughts.

Creating a directory (to put the environment in), and installing packages is potentially dangerous.
I wouldn't do it without asking.

What we can do is check for the right content of the current environment.
Pkg.installed() gives a dictionary packageName => versionString.

So, a possible way, not thinking about environments yet:
Pluto adds the result of Pkg.installed() in a comment at the end,
and, when starting, checks, whether that's given, and warns if not.
(And offers to do the install.)

An environment is a directory with configuration files, or at least that's all I know
about it currently.
Hm. I don't know whether it has the implication, that the packages mentioned in the
config files really are installed.

In a way, using environments is the more heavy duty way to use Julia. So I think it better
to make that optional for Pluto. When an environment is used, it needs an identifier, and
it starts getting difficult.

Maybe I continue above, where I said "offers to do the install". Pluto can ask for a
directory, looks whether that's empty / create it when it doesn't exist.
If exists & empty, activate it as environment-directory, and start installing.

Or: Just print a note how to create an environment, and what to install there, and how to
build it; and at the end ask the user to provide the directory.
That would be a start!

Note, one reason why I am reluctant to install: I regularly have difficulties
installing Julia packages. Most often because they try to install something non-Julia
(Conda/Python, DLLs, all that). I'd refuse to take responsibility about that.
(That's the reason why I added the "how to build it" phrase to the paragraph above, as
often the building fails.)

Kind greetings, z.
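
The check described in this comment (compare what a notebook needs against what the environment declares, and warn about anything missing) could be sketched as follows. This is only an illustration under assumptions: it reads the `[deps]` table of a Project.toml with the TOML standard library (Julia 1.6 or later) rather than calling Pkg.installed(), which later Julia versions deprecated. The helper name `missing_packages`, the temporary directory, and the file contents are made up for the example; this is not how Pluto does it.

```julia
import TOML  # standard library since Julia 1.6

# Hypothetical helper: names in `required` that the project file does not declare.
function missing_packages(project_file::AbstractString, required)
    project = TOML.parsefile(project_file)
    deps = get(project, "deps", Dict{String,Any}())
    return [name for name in required if !haskey(deps, name)]
end

# Illustrative project file in a throwaway directory.
env_dir = mktempdir()
project_file = joinpath(env_dir, "Project.toml")
write(project_file, """
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
""")

needed = ["Example", "Plots"]   # what the notebook would like to load
absent = missing_packages(project_file, needed)
isempty(absent) || @warn "Packages not declared in this environment" absent
```

Note that a package being listed in Project.toml only means it is declared; whether it is actually downloaded and built is a separate question, which is the concern raised above.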

@fonsp
Owner

fonsp commented Jun 4, 2020

My thoughts were that a notebook will get a directory in /tmp (gets deleted on boot), which is activated as the environment, i.e. there will be a Manifest.toml in it.

The first time you use using or import, Pluto will ask you whether you want to use an existing package environment - you can choose its path - or whether to make the notebook self-contained.

In the existing Pkg environment case, it is exactly like it is today.

A self-contained environment means that you never interact with the package manager - calling using XXX will install the package and add it to the manifest, removing it will delete it from the manifest. The manifest (with version info) is included in the notebook file, either through comments or through Pkg.something commands that are hidden to the user.

When you open a self-contained notebook, Pluto will create a clean env and 'install' everything needed. Remember that Julia has a global cache! If you add a package to a new environment, it only downloads and builds that package if it has never been used on your machine before. Otherwise, it just adds one line to Manifest.toml. That's why I think that creating a clean environment for each notebook is a cool idea :)

Oh and you said that it could be dangerous, what do you mean by that? Isn't the dangerous part the running-arbitrary-julia-code-from-a-notebook, not the package install?

@zenon
Author

zenon commented Jun 5, 2020

Isn't the dangerous part the running-arbitrary-julia-code-from-a-notebook, not the package install?

Hahaha, right.

Hm.
Indeed right. Why didn't I think of that?
I want a possibility to load a notebook without immediately executing it. I.e. I want to read it first.
This should even be the default.

What I was pondering, rather technically than cyber security wise, is notebooks installing things that don't really work. Like for me in the last days, where Plots/GR only worked after some massage.
(On another computer, the Pluto install failed because some of the downloads seemingly needed admin rights.) Starting a Julia project rarely works without hassles.
Maybe that's because I often use Windows, or computers without admin rights. But that's my situation.

So your plan sounds good for an ideal situation. I just rarely see it :-)

@fonsp
Owner

fonsp commented Jun 5, 2020

Hm 😕 do you also have those issues when starting from an empty package environment? And is it just the crazy python packages that break?

I would love to know more about the Pluto-admin-install problem! Just the system info would also be helpful! It only has 2 dependencies and doesn't do anything wild to install, so perhaps it's the required HTTP.jl package (or installing any package) that didn't work.

(It would be great if you could report these issues to the Plots package!)

@grero

grero commented Jun 8, 2020

Is the solution suggested above still expected to work? When I run this as my first cell

begin
	using Pkg
	Pkg.activate(".")
end

I kind of expected the files Project.toml and Manifest.toml to be created in the directory containing the current notebook. However, no such files are created.

@zenon
Author

zenon commented Jun 8, 2020

Hi Roger,

  1. What does "." refer to? I mean, are you sure you are looking in the right directory? Try pwd()
  2. I think I saw the same behavior some days ago. I just assumed that I did something wrong, and tried something else. (In particular, I didn't follow advice 1 :-) )

Kind greetings, z.

@grero

grero commented Jun 8, 2020

Hi,
My expectation was that Pkg.activate(".") would activate a new environment in the current working directory, which I thought was the location of the notebook. After running pwd() from the notebook, though, I realise that I am actually in the folder from which Pluto was launched. In that case, I would expect the notebook itself to show up in that directory, rather than in a temporary directory. This is probably just my familiarity with Jupyter talking, though.
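
A small sketch of the two effects discussed here, assuming a throwaway directory from mktempdir() as a stand-in for the notebook's folder (the variable name `notebook_dir` is illustrative). It shows that "." is resolved against pwd(), and that activating a new environment by itself does not write Project.toml or Manifest.toml; those files only appear once a package is added.

```julia
import Pkg

notebook_dir = mktempdir()   # stand-in for the folder containing the notebook

cd(notebook_dir) do
    println("working directory: ", pwd())
    Pkg.activate(".")        # "." is resolved against pwd(), not the notebook file
    println("active project:    ", Base.active_project())
    println("Project.toml written yet? ", isfile("Project.toml"))
end
```

Checking pwd() first, as suggested above, is the quickest way to see which directory "." actually refers to.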

@fonsp
Owner

fonsp commented Jun 8, 2020

Yep, what to do with the working directory is another issue... (In particular, after changing the notebook path - some cells can implicitly depend on the wd, but there is no way to detect that using syntax analysis). I guess that an okay solution is to always cd to the notebook's path?

@fonsp fonsp added the reactivity The Pluto programming paradigm label Jun 9, 2020
@fonsp
Owner

fonsp commented Jun 10, 2020

By the way, here is a more complicated example of setting up an environment

begin
	cd("/mnt/c/dev/julia/margo_tests/") # see edit below
	import Pkg
	Pkg.activate(".")
	using ClimateMARGO
	using Plots
	using LaTeXStrings
	using Colors
end

This works - but of course this should not be necessary - it's on the todo list!

When figuring out the "run all" order, cells that contain a using statement always run before cells that don't. This is why, for example, Pkg.activate(...) needs to be inside the same block as the using statements - otherwise it would run afterwards.

Edit: Pluto now does cd("path to notebook file") automatically. So if Project.toml is in the same directory as your notebook, you don't need the manual cd line.
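
The ordering rule described in this comment is why activation and imports share one cell. A minimal sketch that runs without network access, assuming a throwaway environment from mktempdir() and the standard library Dates standing in for a real dependency such as Plots:

```julia
begin
    import Pkg
    env_dir = mktempdir()
    Pkg.activate(env_dir)   # runs before the import below because both sit in one block
    using Dates             # stdlib stand-in for a package from the environment
end

println("active project: ", Base.active_project())
println("today: ", today())
```

If the Pkg.activate call lived in its own cell, the rule above says the cell containing `using` would be scheduled first, so the import would be resolved against the previously active environment.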

@ToucheSir

In the interest of reproducibility, it would be ideal if we could still provide a Project.toml (with compat if needed) and Manifest.toml. This is especially helpful in "mixed" environments where one has application/library code in addition to Pluto notebooks (ML comes to mind). Otherwise, keeping both sets of dependencies in sync is rather painful and somewhat of a dealbreaker.

@fonsp
Owner

fonsp commented Jun 17, 2020

Thanks for pointing it out - I'm thinking that Pluto should detect whether you are in a package environment (by going up the file tree and looking for Project.toml) and use that one instead. This would be the you-know-what-you-are-doing mode
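
The detection idea (walk up the file tree until a Project.toml is found) could be sketched like this. The function name `find_project_file` is hypothetical and this is not Pluto's implementation; it only illustrates the search, using a temporary directory tree as test data.

```julia
# Hypothetical helper: search `start_dir` and its parents for a project file.
# Returns the path of the first one found, or `nothing` if the root is reached.
function find_project_file(start_dir::AbstractString)
    dir = abspath(start_dir)
    while true
        for name in ("JuliaProject.toml", "Project.toml")
            candidate = joinpath(dir, name)
            isfile(candidate) && return candidate
        end
        parent = dirname(dir)
        parent == dir && return nothing   # reached the filesystem root
        dir = parent
    end
end

# Illustrative layout: <root>/Project.toml and a notebook folder two levels down.
project_root = mktempdir()
write(joinpath(project_root, "Project.toml"), "")
notebook_dir = mkpath(joinpath(project_root, "notebooks", "drafts"))

found = find_project_file(notebook_dir)
println("project file: ", found)
```

Julia's own `--project=@.` performs a similar upward search, which is the behaviour the next comments compare against.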

@fonsp
Owner

fonsp commented Jun 27, 2020

That would also mirror IJulia's behavior: JuliaLang/IJulia.jl#820

@fonsp fonsp changed the title Initializer, e.g. for using environments? Package environment Jul 11, 2020
@Roger-luo
Contributor

I find that Pluto currently uses whatever environment Julia is using when Pluto.run() is called in the REPL. Is that true?

@fonsp
Owner

fonsp commented Jul 30, 2020

Not sure, maybe it depends on the Julia version. Currently, the best way to set up an environment is to copy the setup from the PlutoUI sample, but I want to make this a lot smoother soon!

@ToucheSir

@Roger-luo how did you get Pluto to pick up on a local project environment? I just tested with 0.11.0 and it still defaults to the global env?

@Roger-luo
Contributor

@ToucheSir I'm not sure, I just start it using JULIA_PROJECT

@ToucheSir

Perfect, that did the trick! Not sure why, but julia --project doesn't work the same way.

@Roger-luo
Contributor

I think this is because when you pass --project, the subprocess that Pluto spawns does not inherit that flag, but it does inherit the environment variables.
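
The inheritance of environment variables can be observed outside Pluto with a plain child process. This sketch is an assumption-laden illustration, not Pluto's process-spawning code: it starts a second Julia with JULIA_PROJECT set to a temporary directory and asks the child which project it considers active.

```julia
env_dir = mktempdir()   # illustrative environment directory

# Child process command: no --project flag is passed on the command line.
child = `$(Base.julia_cmd()) --startup-file=no -e 'print(Base.active_project())'`

# The child inherits the parent's environment variables, including JULIA_PROJECT.
child_project = withenv("JULIA_PROJECT" => env_dir) do
    read(child, String)
end

println("child process sees: ", child_project)
```

A command-line flag, by contrast, only reaches a child process if the parent explicitly adds it to the command it spawns.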

@fonsp fonsp added this to the 🛶 v0.13 milestone Aug 12, 2020
@lungben
Contributor

lungben commented Aug 14, 2020

I really like the idea to include the environment into the notebook file!

Some suggestions:

  • it would be good to have both Project.toml and Manifest.toml (the manifest may be optional)
  • the standard Julia format for Project.toml and Manifest.toml should be used
  • the simplest solution would be to include the content of both files as a block comment in the notebook.jl file. Even better would be to store it in a Julia data structure (e.g. String) together with some code so that the Project.toml / Manifest.toml files are automatically generated and the environment is activated when running the notebook.jl file directly in Julia (without Pluto).
  • there should be a possibility to do Pkg.update() to update the dependencies in the Manifest
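
The third suggestion (keep the TOML in a Julia String and regenerate the files when notebook.jl is run as a plain script) might look roughly like this. The constant name `PROJECT_TOML`, the temporary directory, and the single Example dependency are assumptions for illustration; Pkg.instantiate() is left commented out because it needs network access.

```julia
import Pkg

# Hypothetical embedded project description; a Manifest could be stored the same way.
const PROJECT_TOML = """
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
"""

# Recreate the environment files from the embedded string and activate them.
env_dir = mktempdir()
write(joinpath(env_dir, "Project.toml"), PROJECT_TOML)
Pkg.activate(env_dir)
# Pkg.instantiate()   # would download and install the declared packages

println("active project: ", Base.active_project())
```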

@lungben
Contributor

lungben commented Sep 1, 2020

I played around a bit with Pkg and have a suggestion (or rather a rough draft).
When the following function instantiate_env() is executed automatically when a notebook is opened, a notebook specific Pkg environment (named notebook.jl.env) is activated.

  • If it already exists, it is instantiated using the Project.toml / Manifest.toml files in the notebook.jl.env directory.
  • If it does not exist, it should ideally not do anything yet - currently the function below creates the dir+ empty Project / Manifest files, but I'll try to get rid of it.
  • If the user wants to add any package, she can Pkg.add them directly into the notebook (or from Julia REPL). The Pkg.add cells could / should be deleted afterwards, the dependencies are stored in the toml files. Same for updates, removals, etc.
  • Open: when moving / renaming a notebook, the corresponding .env directory should be moved, too.

get_env_name() = string(split(basename(@__FILE__), '#')[1], ".env")
function instantiate_env()
	@eval import Pkg
	env_name = get_env_name()
	Pkg.activate(env_name)
	Pkg.instantiate() # maybe do this only if .env dir exists?
end

A further enhancement could be the option to integrate the .env directory (the content of the toml files) directly into the notebook.jl file and to extract them from the notebook.jl file again.

I think this behavior should be opt-in, e.g. in a keyword argument to Pluto.run().

What do you think?

@fonsp
Owner

fonsp commented Sep 1, 2020

Right now, I always include the following inside the notebook:

begin
	import Pkg
	Pkg.activate(mktempdir())
end

in one cell (mktempdir docs), and whenever I need Example, I write

begin
	Pkg.add("Example")
	import Example
end

or if I want to specify the version

begin
	Pkg.add(Pkg.PackageSpec(name="Example", version=v"1.2.3"))
	import Example
end

This way, running the notebook always starts in a clean package environment (no state!). So it's just like what you proposed, except we purposely don't save the environment in a separate file, but completely describe the environment inside the notebook!

This is pretty much what I want Pluto to do automatically in the future - the first cell will be built in, and you get the third cell when typing:

import Example @ 1.2.3

with nice autocomplete to help you. (Just import Example is no longer allowed)

@fonsp
Owner

fonsp commented Sep 1, 2020

I was working on a design doc to get feedback on the import Example @ 1.2.3 idea, but it slipped a little. More in a couple of weeks!

@Roger-luo
Contributor

Roger-luo commented Oct 9, 2020

@garrison just an update: in v0.12, you can specify which project to use via the project keyword argument of Pluto.run, e.g.

Pluto.run(;project="@.")

will look for the closest Julia Project.toml instead of the global one. You can also pass a path to specify which Project.toml or Manifest.toml you want to use.

There are also per-notebook environment variable settings built in internally, but they are not exposed publicly.


@fonsp I'm wondering how you currently serialize this into the notebook file. Should we just have inline TOML in the comments? I can experiment with this feature for normal Julia scripts in IonCLI as ion run script.jl

One more thing: I guess it's actually possible to skip creating a Project.toml in a tempdir and instead create an in-memory Pkg.Context directly from the inline TOML of the script, which I think would make things a bit faster.

@j-fu
Contributor

j-fu commented Oct 10, 2020

For inline TOML see also #421 ...

@c42f
Contributor

c42f commented Oct 14, 2020

There will be the option to do that, but it will not be the default - Pluto notebooks should be reproducible as a single file by default. Ideally, correctly understanding, managing and sharing a package environment should not be a prerequisite for your notebooks being reproducible.

💯 I feel very strongly that this is the correct way forward. In fact, I investigated doing this very thing for normal Julia scripts in https://github.com/c42f/CodeEnvironments.jl (though at the time I implemented that, I seemed unable to convince anyone that it was very worthwhile!). (BTW, the encoding of the Manifest in CodeEnvironments is not inherent, but partly a workaround for UI considerations in jupyter, and in text editors.)

I know I'm just echoing what @fonsp said already! But I feel that managing environments (let alone understanding git) adds intolerable accidental complexity for casual users. Also that having the manifest and project separate from the notebook file will immediately lead to these files getting separated as they're copied around.

@fonsp
Owner

fonsp commented Nov 10, 2020

@oschulz

oschulz commented Mar 13, 2021

Also that having the manifest and project separate from the notebook file will immediately lead to these files getting separated as they're copied around

I'm just curious - why does Pluto "disrespect" the currently active Julia project and start the notebook in the default environment? I'm aware that there may be a well thought-out reason behind this, but it did surprise me quite a bit (not being in the env I thought I was).

I would actually love Pluto to pick up an environment (Project.toml and Manifest.toml) found in the same directory as the notebook automatically (like IJulia does with defaults). [Edit: Pluto does this now, see below.]

@oschulz

oschulz commented Mar 13, 2021

One thing regarding Pluto.run(;project=...): When using Pluto.run(;project="@myenv"), Pluto will currently activate an environment literally named "@myenv" in the current directory, instead of activating "myenv" in the default environment directory (like the package console does). Would it be possible to change that? Using absolute paths can be so tedious. :-)

@Roger-luo
Contributor

@myenv does mean an environment named @myenv in all Julia programs. It's not a Pluto-only thing. Do you actually want @.?

@oschulz

oschulz commented Mar 13, 2021

Well, on the Julia package management console, "@myenv" means "myenv" in the default environments directory.

@Roger-luo
Contributor

That's not the package API. Unless Pkg.jl changes its API, I don't think Pluto should use a different convention.

@oschulz

oschulz commented Mar 14, 2021

See here, though: JuliaLang/julia#35354

@Roger-luo
Contributor

If that PR is merged, it will automatically work on the Pluto side. Pluto does nothing extra; it just forwards the project path to the Julia compiler API.

@oschulz

oschulz commented Mar 14, 2021

Oh, right! :-)

@oschulz

oschulz commented Mar 14, 2021

Sorry, didn't consider that.

@oschulz

oschulz commented Mar 14, 2021

Regarding default environment, though - I'd like to plead for Pluto to use the environment it's been started with for notebooks as well, instead of starting them in the default environment. Novice users often won't use environments, so for them it won't matter - they'll start Pluto in the default environment. But if a user explicitly starts Pluto from another environment, wouldn't the intuitive expectation be that notebooks will run in the same env?

@oschulz

oschulz commented Mar 14, 2021

@oschulz: I would actually love Pluto to pick up an environment (Project.toml and Manifest.toml) found in the same directory as the notebook automatically (like IJulia does with defaults)

I have to apologize, I didn't realize that Pluto does this already, via --project=@.

But if a user explicitly starts Pluto from another environment, wouldn't the intuitive expectation be that notebooks will run in the same env?

Maybe not, in that case - respecting either the current environment or picking up the environment of the notebook's directory would lead to muddled semantics.

@fonsp
Owner

fonsp commented Mar 15, 2021

@oschulz Please take a couple of minutes time before writing a quick reply, and consider switching to email/slack/zulip if you want to chat.

Take a look at #844. This is where we are headed, we try to make the notebook the unit of reproducibility, and the environment used to launch Pluto is not used on purpose (unless PlutoPkg is disabled using Pkg.activate, see the PR).

@oschulz

oschulz commented Mar 15, 2021

Please take a couple of minutes time before writing a quick reply, and consider switching to email/slack/zulip if you want to chat.

My apologies - I actually didn't intend to write a quick reply: I did propose that Pluto should pick up environments next to the notebook or use the current default env. I had overlooked that Pluto already uses --project="@." (my mistake, of course) and so I thought I should correct my suggestion - it had received a thumbs-up so I thought it would be better to publicly retract it, instead of just deleting it.

Sorry for the noise, I will be more careful with Pluto GitHub issues in the future.

@fonsp
Owner

fonsp commented Mar 15, 2021

@oschulz No worries, thanks for understanding, I am sure you mean well! I definitely do not want to discourage you from contributing in the future! I always try to moderate slightly too early, instead of slightly too late. You are right that it's difficult to correct an earlier comment in github issues.

Definitely take a look at #844 and take that future behaviour into account, and perhaps we should open a new Discussion about the semantics of keyword arguments passed to the Pluto server vs command line arguments passed to its Julia process. (I also have some questions about this.)

@fonsp fonsp linked a pull request Mar 15, 2021 that will close this issue
66 tasks
@ToucheSir

Hi @oschulz,

As the person who gave your comment a thumbs-up, I should probably take responsibility for the retraction :) Having read through (and politely debated with @fonsp about) package environments on GH and Zulip, I think this could've been avoided if #844 had been linked earlier. For example, I'm subscribed to this thread but not to Pluto in general, and wasn't aware that the PR existed (and, more importantly, provided an escape hatch).

@oschulz

oschulz commented Mar 16, 2021

Yes, I read #844 now and I do like the concept a lot (simplicity, but being able to opt out and use full environments when necessary).

@fonsp
Owner

fonsp commented Jun 11, 2021

This issue was closed by releasing the Built-in package manager! To leave feedback, please open a new Issue or Discussion.

🎉

Repository owner locked as too heated and limited conversation to collaborators Jun 11, 2021