Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide for worker-configuration parameters #350

Merged
merged 2 commits into from
Jan 11, 2021

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Jan 11, 2021

@frankmcsherry, as discussed on Slack! No promises that any of this is appealing!

Tweak the execution API so that worker configuration parameters can be
specified along with communication fabric configuration parameters. In
particular, this permits the DEFAULT_PROGRESS_MODE environment variable
to be replaced with a typed ProgressMode enum, and for its value to be
controllable with a command-line option.

The new WorkerConfig struct also admits arbitrary typed configuration
parameters, which downstream libraries (e.g., differential) can use to
store their own worker configuration variables.

Tweak the execution API so that worker configuration parameters can be
specified along with communication fabric configuration parameters. In
particular, this permits the DEFAULT_PROGRESS_MODE environment variable
to be replaced with a typed ProgressMode enum, and for its value to be
controllable with a command-line option.

The new WorkerConfig struct also admits arbitrary typed configuration
parameters, which downstream libraries (e.g., differential) can use to
store their own worker configuration variables.
pub enum Configuration {
pub enum CommunicationConfig {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the "recommended style" is leaving this as Configuration but then importing it as CommunicationConfig in timely. I don't think it matters too much outside of this crate (as it will be CommunicationConfig by timely users). Any thoughts?

/// The progress mode to use.
pub(crate) progress_mode: ProgressMode,
/// A map from parameter name to typed parameter values.
registry: HashMap<String, Arc<dyn Any + Send + Sync>>,
Copy link
Member

@frankmcsherry frankmcsherry Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a non-obvious reason that this is an Arc instead of a Box?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternately (if it is like, setting up the configs for everyone and I've missed that) is the Send required for Arc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, what we actually need is a clonable Box<dyn Any>, but I don't know how to make that happen easily types besides Arc<dyn Any + Send + Sync>, which is a bit overspecific. Specifically, I don't think Sync is important, but Arc<T> doesn't implement Send unless the underlying type implements Sync too.

I'm sure you could make something work with https://docs.rs/dyn-clone/1.0.4/dyn_clone/, but it's always a big pain in the butt in my experience.

where
A: AllocateBuilder+'static,
T: Send+'static,
F: Fn(&mut Worker<<A as AllocateBuilder>::Allocator>)->T+Send+Sync+'static {
initialize_from(builders, others, move |allocator| {
let mut worker = Worker::new(allocator);
let mut worker = Worker::new(worker_config.clone(), allocator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the Arc moment!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

@@ -269,8 +302,65 @@ pub fn execute_from_args<I, T, F>(iter: I, func: F) -> Result<WorkerGuards<T>,St
where I: Iterator<Item=String>,
T:Send+'static,
F: Fn(&mut Worker<Allocator>)->T+Send+Sync+'static, {
let configuration = Configuration::from_args(iter)?;
execute(configuration, func)
let mut opts = getopts::Options::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The opts parsing .. I was sort of imagining as part of ExecuteConfig. That way e.g. someone could parse up some arguments, and then maybe flip a few, read out some and print them on screen, stuff like that. Is there a reason to stash this logic here, vs in some constructor/initialization for ExecuteConfig?

Add some methods to the various config structs that makes it possible to
assemble a getopts::Options out of various different config structs.
@benesch
Copy link
Contributor Author

benesch commented Jan 11, 2021

I overhauled the naming of these structs so that everything is called Config where it is defined, and gets a more specific prefix only when re-exported at the root of the timely crate. I also reworked the option parsing so that it's possible to parse just the timely::CommunicationConfig from arguments or just the timely::WorkerConfig from arguments. This requires some cooperation to make sure that none of the command-line options conflict, but I think that's ok.

@frankmcsherry
Copy link
Member

Thanks very much! I think this all looks great!

@benesch
Copy link
Contributor Author

benesch commented Jan 11, 2021

Woohoo! I think all that's missing then is documentation of what the different progress modes actually do.

@frankmcsherry
Copy link
Member

I'm happy to add that in post, if that works for you.

@frankmcsherry frankmcsherry merged commit a7bca8e into TimelyDataflow:master Jan 11, 2021
@benesch benesch deleted the worker-config branch January 11, 2021 20:24
@benesch
Copy link
Contributor Author

benesch commented Jan 11, 2021

It most certainly does!

@frankmcsherry
Copy link
Member

frankmcsherry commented Jan 11, 2021

Documentation added in bafcd7d


initialize_from(allocators, other, move |allocator| {

let mut worker = Worker::new(allocator);
let mut worker = Worker::new(WorkerConfig::default(), allocator);
Copy link
Contributor

@sdht0 sdht0 Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@frankmcsherry Should this take the value from config.worker instead? I tried configuring idle_merge_effort in DD as below, but it does not look like the config is actually set in Worker.

let mut config = timely::Config::from_args(std::env::args()).expect("error");
differential_dataflow::configure(&mut config.worker, &differential_dataflow::Config::default().idle_merge_effort(Some(100000)));
timely::execute(config, move |worker| {
// code
}

Or is this not how it's meant to be used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably! Lemme look further. Sorry, this stuff is definitely under-exercised.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be fixed now, with a test in place. Thanks very much for noticing!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, cheers!

@github-actions github-actions bot mentioned this pull request Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants