-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add eos-vm-oc-enable auto mode #1322
Conversation
/** | ||
* @class wasm_interface_collection manages the active wasm_interface to use for execution. | ||
*/ | ||
class wasm_interface_collection { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no functionality change here, just a refactor out of controller.
@@ -355,6 +355,9 @@ namespace eosio { namespace chain { | |||
void replace_producer_keys( const public_key_type& key ); | |||
void replace_account_keys( name account, name permission, const public_key_type& key ); | |||
|
|||
void set_producer_node(bool is_producer_node); | |||
bool is_producer_node()const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of sad about this change. Until now, the core of leap had no need to know if it was a producer or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, I guess depending how much of a purist you want to be here, maybe you could make controller take a std::function<bool(apply_context)> tier_up_with_oc
, either as a ctor parameter or a config or something. That way chain_plugin
can decide how to set this up based on the various configs. And libtester just always passes in a return false;
for example, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that seems a bit too far.
* "eosio.any" -> "eosio" | ||
* "eosio" -> "eosio" | ||
*/ | ||
constexpr name prefix() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation copied from CDT
v = boost::any(wasm_interface::vm_oc_enable::oc_auto); | ||
} else if (s == "all" || s == "true" || s == "on") { | ||
v = boost::any(wasm_interface::vm_oc_enable::oc_all); | ||
} else if (s == "none" || s == "false" || s == "off") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other previously valid choices were 1
/0
and yes
/no
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
…ime to match genesis used in other tests. This allows for setcode in slow ci/cd env.
throw validation_error(validation_error::invalid_option_value); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To have fewer options in the future, we could add a deprecation note that only auto
, all
, and none
will be allowed, all others could be deprecated in future releases.
std::unordered_map<std::thread::id, std::unique_ptr<wasm_interface>> threaded_wasmifs; // one for each read-only thread, used by eos-vm and eos-vm-jit | ||
}; | ||
|
||
} // eosio::chain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this refactoring. It hides those details from the controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some tests for auto
, all
, and none
options to make sure OC is enabled/disabled as configured?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any ideas on how? Unfortunately, oc is not setup to be used in unittests because the destructor of tester
-> controller
destroys oc and it can't be restarted because it forks the process on start. Seems like it would require a major refactor to allow oc to be used repeatably in unittests.
We could add some logging and look for a particular log in some integration tests. Let me see what I can do for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's surprising to hear -- it was definitely originally designed so that multiple controller
s using OC can exist simultaneously: we have unittests that do that.
But, that was for the non-tier-up usage of OC. So maybe there was a shortcoming of the original impl in this area. Or maybe we've regressed some how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integration test added.
…me to be found in logs on slow ci/cd machine
EOS_ASSERT(!is_on_main_thread(), misc_exception, "init_thread_local_data called on the main thread"); | ||
if (is_eos_vm_oc_enabled()) { | ||
// EOSVMOC needs further initialization of its thread local data | ||
wasmif.init_thread_local_data(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lost an #ifdef EOSIO_EOS_VM_OC_RUNTIME_ENABLED
on these couple lines during the refactor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if most of the #ifdef EOSIO_EOS_VM_OC_RUNTIME_ENABLED
could be removed to simplify differences. For now, I'll just revert back to having them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restored #ifdef
s
@@ -158,6 +162,10 @@ namespace eosio { namespace chain { | |||
return it->module; | |||
} | |||
|
|||
bool should_always_oc_tierup()const { | |||
return wasm_runtime_time == wasm_interface::vm_type::eos_vm_oc || eosvmoc_tierup == wasm_interface::vm_oc_enable::oc_all; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think wasm_runtime_time == wasm_interface::vm_type::eos_vm_oc
should be in here. wasm_runtime_time == wasm_interface::vm_type::eos_vm_oc
is meant to be "OC in non-tier up mode" and all the code back here
leap/libraries/chain/wasm_interface.cpp
Line 93 in 6008c72
if(my->eosvmoc && (my->should_always_oc_tierup() || context.should_use_eos_vm_oc())) { |
should be skipped entirely in non-tier up mode. It's not clear what the ramifications are for having both a synchronous and asynchronous code_cache at the same time (which is what I think might be happening in this case.. not sure.. regardless it def wasn't originally intended to operate this way)
I think the line above should be something more on the order of
if(my->eosvmoc &&
(my->eosvmoc_tierup == wasm_interface::vm_oc_enable::oc_all || context.should_use_eos_vm_oc())) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I missed that eos_vm_oc
didn't flow throw here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a non issue from a functional standpoint -- I remember now that my->eosvmoc
isn't created unless in oc_auto
or oc_all
mode,
leap/libraries/chain/include/eosio/chain/wasm_interface_private.hpp
Lines 93 to 96 in a8f20ce
if(eosvmoc_tierup != wasm_interface::vm_oc_enable::oc_none) { | |
EOS_ASSERT(vm != wasm_interface::vm_type::eos_vm_oc, wasm_exception, "You can't use EOS VM OC as the base runtime when tier up is activated"); | |
eosvmoc.emplace(data_dir, eosvmoc_config, d); | |
} |
so nothing wonky would have happened since
my->eosvmoc &&
would have always failed out. But yeah def better now with the tweak.
@@ -405,6 +405,8 @@ namespace eosio { namespace testing { | |||
cfg.eosvmoc_config.cache_size = 1024*1024*8; | |||
|
|||
for(int i = 0; i < boost::unit_test::framework::master_test_suite().argc; ++i) { | |||
// don't use auto tier up for tests, since the point is to test diff vms | |||
cfg.eosvmoc_tierup = chain::wasm_interface::vm_oc_enable::oc_none; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this is inside the for()
loop instead of just with the cfg.
items a few lines above.
v = boost::any(wasm_interface::vm_oc_enable::oc_auto); | ||
} else if (s == "all" || s == "true" || s == "on" || s == "yes" || s == "1") { | ||
v = boost::any(wasm_interface::vm_oc_enable::oc_all); | ||
} else if (s == "none" || s == "false" || s == "off" || s == "no" || s == "0") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw, it appears boost was doing a case insensitive compare; so False
, oN
, etc was also accepted 😈 depending on how accommodating you want to be...
("eos-vm-oc-enable", bpo::bool_switch(), "Enable EOS VM OC tier-up runtime") | ||
("eos-vm-oc-enable", bpo::value<chain::wasm_interface::vm_oc_enable>()->default_value(chain::wasm_interface::vm_oc_enable::oc_auto), | ||
"Enable EOS VM OC tier-up runtime ('auto', 'all', 'none').\n" | ||
"'auto' - EOS VM OC tier-up is enabled for eosio.* accounts and read-only trxs.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know Lin already touched on it some, but I do feel like we ought to try and find a short concise way to sneak in that it's also enabled for applying blocks unless a BP. That aspect is completely missing in this description even without the extra "non BP rule".
Maybe something like
'auto' - EOS VM OC tier-up is enabled for eosio.* accounts, read-only trxs, and applying blocks (excluding on producers).
We already have plenty more wordy config items in read-mode
, peer-log-format
, etc.. so I feel like we can spend 5 words.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops somehow I was on an older revision when I made that comment. I see now that we did add commentary around applying blocks. Still not sure penny pinching 3 words and this avoiding documenting the producer exclusion is worth it... but what we have in the latest commit is fine to me since at least it includes block application
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of:
'auto' - EOS VM OC tier-up is enabled for eosio.* accounts, read-only trxs, and except on producers applying blocks.
. Putting it as a paratheses at the end made it seem like producer excluded all cases.
@@ -0,0 +1,2 @@ | |||
#define BOOST_TEST_MODULE state_history_plugin | |||
#include <boost/test/included/unit_test.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're using the header-only variant now, you should remove linking to the library from here
target_link_libraries(test_state_history state_history_plugin eosio_testing eosio_chain_wrap Boost::unit_test_framework) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good
@@ -105,6 +105,8 @@ namespace eosio { namespace chain { | |||
once_is_enough = true; | |||
} | |||
if(cd) { | |||
if (!context.is_applying_block()) | |||
tlog("${a} speculatively executing ${h} with eos vm oc", ("a", context.get_receiver())("h", code_hash)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this meant to be left in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tlog
is trace log, but no, that was for testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, yes, it is needed for the read_only_trx_test.py test. I'll add a comment about that.
@@ -306,6 +306,12 @@ code_cache_base::code_cache_base(const std::filesystem::path data_dir, const eos | |||
} | |||
|
|||
void code_cache_base::set_on_disk_region_dirty(bool dirty) { | |||
// tests can remove directory before destructor is called |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels pretty weird to me.. it makes it sound like there are cases where fc::temp_dir
is dtor'ed before the controller
is dtor'ed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'll fix the tests and remove this.
Short Description
In order to enhance the performance of EOS and EOS EVM, we are introducing the use of OC (Optimizing Compiler) for the EOS VM. This initiative aims to improve the overall efficiency and scalability of the platform.
What
The primary goal of this initiative is to leverage OC for the EOS VM by implementing an Auto OC Mode as the default setting. This new mode,
eos-vm-oc-enable=auto
, will enhance performance. OC will automatically be used based on different contexts, such as building blocks, applying blocks, executing transactions from HTTP or P2P, and various transaction types. OC will also be used for all contracts oneosio.*
accounts. This includes:eosio
,eosio.token
,eosio.ibc
, andeosio.evm
.Why
By utilizing OC, we can significantly boost the performance of EOS and EOS EVM. With the Auto OC Mode as the default, the system will automatically apply OC when appropriate, optimizing the execution of transactions and improving overall throughput. The introduction of OC for specific contexts, such as building blocks and speculative transaction executions, further enhances the efficiency of the platform.
Auto OC Mode as default
eos-vm-oc-enable=auto
is the new default. Scripts that use--eos-vm-oc-enable
will need to be updated to use--eos-vm-oc-enable all
to enable OC for all contexts.eos-vm-oc-enable=true
=>eos-vm-oc-enable=all
(eos-vm-oc-enable=true
continues to work and is mapped toall
)eos-vm-oc-enable=false
=>eos-vm-oc-enable=none
(eos-vm-oc-enable=false
continues to work and is mapped tonone
)--eos-vm-oc-enable
will not now work, replace with--eos-vm-oc-enable all
Baseline / OC selection
In Auto OC Mode, leverage EOS VM OC, as follows:
if producer: baseline, but OC if account is eosio.*
💡 Although enabling OC for P2P would enable quicker validation of speculative trxs as they traverse the network, this PR does not add OC for P2P to help prevent overload on BP nodes. A separate option to enable OC for P2P when in
auto
is being considered as a follow on enhancement; see #1333.Resolves #1251