Use HandleOrRuntime to allow alloydb/ethersdb to hold a custom runtime #1576

Merged: 6 commits, Jun 30, 2024

Contributor

@wtdcode wtdcode commented Jun 28, 2024

Following advice from @DaniPopes, this PR introduces HandleOrRuntime for both EthersDB and AlloyDB to avoid creating a runtime for every call. Compared to #1557, this implementation also lets synchronous code keep using both implementations by calling AlloyDB::with_runtime(..., ..., new_current_thread()), which reduces breaking changes.

This PR should address @DaniPopes's concerns while keeping compatibility.

  • If we are already in a tokio runtime, like foundry, the handle is valid and no runtime is created.
  • If we are in synchronous code, users are responsible for creating the runtime only once and passing it to with_runtime.

Note that with this PR, we can now also support the current-thread runtime.

Contributor Author

wtdcode commented Jun 28, 2024

@rakita Can we have this merged before #1574?

{
    match self {
        Self::Handle(handle) => tokio::task::block_in_place(move || handle.block_on(f)),
        Self::Runtime(rt) => rt.block_on(f),
    }
}

Collaborator

this will panic if called within async execution context

Contributor Author

@wtdcode wtdcode Jun 29, 2024

Yes, that's intended. Passing a runtime is mostly meant for sync code. Users who are within an async execution context should call new directly instead.

Contributor Author

As stated above, users in an async context should call new, as #1557 suggests. Users in a sync context should provide a runtime, typically a current_thread runtime.

Users who intend to mix the two contexts need to be careful on their own, since doing so is known to be bad practice and causes problems that are hard to track down.

pub fn with_runtime(
    client: Arc<M>,
    block_number: Option<BlockId>,
    runtime: Runtime,

Collaborator

so the caller is responsible for creating a new runtime?
in which case they could also pass in the handle?

Contributor Author

Makes sense, we can add a new function for that and document the behavior.

Collaborator

@DaniPopes DaniPopes left a comment

Makes sense

crates/revm/src/db/alloydb.rs (outdated review thread, resolved)
Member

@rakita rakita left a comment

lgtm

@rakita rakita merged commit 1fedacd into bluealloy:main Jun 30, 2024
26 checks passed
@github-actions github-actions bot mentioned this pull request Jun 30, 2024

use crate::primitives::{AccountInfo, Address, Bytecode, B256, U256};
use crate::{Database, DatabaseRef};

use super::utils::HandleOrRuntime;

#[derive(Debug, Clone)]

@wtdcode would you explain why Clone has been removed? I'm fairly new to Rust and want to learn the technical reasoning behind this decision.

Contributor Author

Because Runtime doesn't implement Clone, and cloning the db doesn't make much sense anyway. Alternatively, you could wrap it in an Rc or Arc.
What's your use case?
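
As a minimal sketch of the point above: deriving Clone only compiles when every field is itself Clone, so a struct that owns a non-Clone runtime cannot derive it. The types below (FakeRuntime, NonCloneDb) and the helper shared_handles are hypothetical stand-ins, not revm or tokio types. Wrapping the whole value in an Arc gives cheap, cloneable handles that all point at one shared instance.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a runtime-like resource that is not `Clone`.
#[derive(Debug)]
pub struct FakeRuntime {
    pub id: u32,
}

/// Hypothetical stand-in for a database that owns such a resource.
/// Adding `Clone` to this derive would fail to compile, because
/// `FakeRuntime` does not implement `Clone`.
#[derive(Debug)]
pub struct NonCloneDb {
    pub runtime: FakeRuntime,
}

/// Wrap the database in an `Arc` and hand out `n` handles to it.
/// Cloning an `Arc` copies a pointer and bumps a reference count;
/// the database itself is never duplicated.
pub fn shared_handles(db: NonCloneDb, n: usize) -> Vec<Arc<NonCloneDb>> {
    let shared = Arc::new(db);
    (0..n).map(|_| Arc::clone(&shared)).collect()
}
```

An Arc only provides shared read access; mutating the value through it additionally needs something like a Mutex, which is where the locking cost discussed later in this thread comes from.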


Thanks.
In my case, I define an EthersDB instance and build a CacheDB on top of it. I add some base data to the CacheDB (deploy a contract, change some slots, etc.) and then clone it to send identical copies to multiple threads.

For now, cloning the CacheDB fails with an error complaining that EthersDB (the inner object of the CacheDB) cannot be cloned.

Because I need to write to the db inside separate threads, I tested sharing it through Arc/Mutex instead of cloning, but the performance was worse than with cloning.

Contributor Author

@wtdcode wtdcode Jun 30, 2024

It doesn't really make sense to clone AlloyDB or EthersDB. I added Clone myself in a previous PR, but I recently realized it was a mistake. The main reasons are:

  1. Both Provider and Middleware are only safe to clone within the same runtime. If you send them across runtimes (or threads), many internals and assumptions could break.
  2. Even if you make sure you stay in the same async context all the time, or in purely sync code, cloning the EthersDB doesn't make sense because it saves nothing.

For your use case, which is similar to mine, my suggestions are:

  1. Share a single Arc<Mutex<CacheDB<EthersDB>>> across all threads. This avoids duplicate caching in each thread and can speed up your application overall.
  2. Clone the CacheDB members (note that they are public) except the EthersDB, and create a new EthersDB instead. This is essentially a manual "Clone" implementation.
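
The two suggestions above can be sketched with standard-library types only. Backend and Cache below are hypothetical stand-ins for EthersDB and CacheDB (the real CacheDB has more fields), and run_shared and fork_cache are illustrative names, so treat this as a shape to adapt rather than revm's API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a remote-backed db such as EthersDB; not `Clone`.
pub struct Backend {
    pub endpoint: String,
}

impl Backend {
    pub fn connect(endpoint: &str) -> Self {
        Self { endpoint: endpoint.to_string() }
    }
}

/// Hypothetical stand-in for CacheDB: public cached state plus the inner db.
pub struct Cache<B> {
    pub accounts: HashMap<u64, u64>,
    pub db: B,
}

/// Option 1: one cache behind `Arc<Mutex<..>>`, written to by every worker.
/// Returns the number of cached accounts once all workers have finished.
pub fn run_shared(cache: Cache<Backend>, workers: u64) -> usize {
    let shared = Arc::new(Mutex::new(cache));
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut guard = shared.lock().expect("cache mutex poisoned");
                guard.accounts.insert(100 + i, i);
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    // Bind the result first so the lock guard is released before returning.
    let total = shared.lock().expect("cache mutex poisoned").accounts.len();
    total
}

/// Option 2: a manual "clone" that copies the cached state but opens a
/// fresh backend instead of cloning the existing one.
pub fn fork_cache(base: &Cache<Backend>) -> Cache<Backend> {
    Cache {
        accounts: base.accounts.clone(),
        db: Backend::connect(&base.db.endpoint),
    }
}
```

Option 1 keeps a single cache that every thread benefits from, at the cost of lock contention. Option 2 gives each thread an independent copy, so writes made in one fork are not visible in the others.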

@jafar75 jafar75 Jun 30, 2024

Thanks. I have a challenge with using Arc.

[image not preserved]

To create the EVM and then call transact(), I must hold the lock. But nearly all of my performance overhead is in these steps, so it effectively turns my program into a single-threaded system.
Do you have any suggestions for this case?

Contributor Author

Note that with_db accepts borrows.

@jafar75 jafar75 Jun 30, 2024

You mean with_ref_db? I want to update the cache on each transact from multiple threads.
