
Added support for using Consul 0.6.0 as a Membership Provider #1267

Merged
gabikliot merged 1 commit into dotnet:master from PaulNorth:ConsulProvider on Jan 29, 2016

Conversation

PaulNorth
Contributor

Config Usage:

<Globals>
  <SystemStore SystemStoreType="Consul"
               DataConnectionString="http://localhost:8500/"
               DeploymentId="TestDeployment" />
</Globals>

@dnfclas

dnfclas commented Jan 11, 2016

Hi @PaulNorth, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla2.dotnetfoundation.org.

TTYL, DNFBOT;

@PaulNorth
Contributor Author

This is the code that was discussed in #1226. In that discussion @gabikliot raised some points which I will address here. For the record, I am not affiliated with Consul or its creator HashiCorp, and the following is simply a dump of my experience of using Consul.

Consul is a distributed, highly available, datacenter-aware service discovery solution (https://consul.io/). Consumers register their services with a string key, which clients can then discover in a number of ways, including via HTTP by appending the key to Consul's root URI. This PR uses the DeploymentId as the Consul key.
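For illustration, a minimal HTTP lookup could look like the sketch below. It assumes Consul's standard /v1/catalog/service/<name> endpoint and the DeploymentId from the sample config above; the class and variable names are purely illustrative, not part of this PR.

using System;
using System.Net.Http;

class ConsulLookupSketch
{
    static void Main()
    {
        // Assumes a local Consul agent listening on its default HTTP port.
        using (var http = new HttpClient { BaseAddress = new Uri("http://localhost:8500/") })
        {
            var json = http.GetStringAsync("v1/catalog/service/TestDeployment")
                           .GetAwaiter().GetResult();
            Console.WriteLine(json); // JSON array of the nodes/addresses providing the service
        }
    }
}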

Consul works by adding nodes to a cluster which uses a Gossip protocol to maintain awareness of node availability and distribute the service directory around the cluster. To do this it uses two other projects: https://github.com/hashicorp/raft and https://github.com/hashicorp/serf. Raft ensures there is consensus within the cluster when nodes are added or services are registered, and Serf optimises the cluster connection and inter-communication (Gossip). Although it is feasible that Orleans could incorporate these technologies directly, that would require a separate and more complex development effort. We, and I suspect most others, have non-Orleans requirements for service discovery, so it made sense to use Consul directly rather than reimplement Raft and Serf in Orleans.

To support geo-distribution, Consul simply includes the concept of each node being in a "datacenter" (https://consul.io/docs/guides/datacenters.html). By having each node take a -dc parameter at startup, nodes in the cluster can be grouped into the same datacenter, and this information is used both to optimise the cluster and to route clients to the closest resource. Registered services are assumed to be co-located with the node in the cluster that they registered with, so a client connecting to an Orleans silo would include the DeploymentId (and in theory a datacenter) in the client SystemStore config element. The list of silos Consul returns is then ordered based on their proximity to the client. Please note, datacenter support is not implemented in this PR as this would break the abstraction that the SystemStore places over existing membership options (and possibly conflict with Orleans' own geo-distribution effort?).

Example Consul service registration for a 2-silo Orleans cluster (running on the same machine, for dev only) where each silo has been stopped and restarted once:
cluster.fabric.orleans.json.txt

Let me know if this answers the questions you had.

@gabikliot
Contributor

Thank you very much @PaulNorth !
I have the following questions:

  1. I would like to understand a bit better what the advantages are of using Orleans with Consul. Is the only benefit hosting/deployment infrastructure? Basically, since you deploy on prem you don't want to just use scripts (like we have here), so you are using Consul for your "micro-services" platform? Is there anything else in it, for our scenario with Orleans (beyond maybe future geo-distribution), that you are using?
    @rore/@shayhatsor - did you guys consider Consul in Gigya?

@PaulNorth, you can write a page about using Orleans with Consul and submit it as a PR to https://github.com/dotnet/orleans/tree/gh-pages. A good place would be under http://dotnet.github.io/orleans/Advanced-Concepts/.

  2. Membership table: it was not clear to me what you are using in Consul for the Membership table. Is it the key-value store? Are you using its CAS support? What about transactions? UpdateRow must mutate 2 rows atomically and in isolation (see here a discussion about the suitability of Cassandra for this: Apache Cassandra MembershipTable implementation? #1218). You basically need multi-row (batch) writes. Does kv support that? If you are using kv and it does not support multi-row (batch) writes, don't despair. We can discuss what the implications will be. Our MBR protocol will still work and most of its features will work correctly. But we do need to clearly understand what we are getting with Consul.

  3. It looks like you are using synchronous APIs to access Consul. This is a big no-no in the Orleans world: only async APIs inside a silo. If there are no async APIs (which is VERY bad) you should offload the sync IO calls to the thread pool. See http://dotnet.github.io/orleans/Advanced-Concepts/External-Tasks-and-Grains on how to do that.

  4. Configuration: when we added the ZK and SQL system stores we did not yet have a dynamic way to configure the system store. Since then we added it via ReminderTableAssembly and MembershipTableAssembly in SystemStore Config #648. So basically you don't need to add a new system store in any of the config code; instead use LivenessProviderType.Custom. In the documentation page on http://dotnet.github.io/orleans/Advanced-Concepts/ you can provide the sample configuration. You can even (it is actually suggested) use our programmatic API to configure the custom system store; that can be demonstrated in the unit tests.
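Roughly, the programmatic route could look like this. This is a sketch only, assuming the ClusterConfiguration/SiloHost API and the SystemStore values from the sample configuration at the top of this PR; the silo name and assembly name are illustrative.

using Orleans.Runtime.Configuration;
using Orleans.Runtime.Host;

class CustomSystemStoreSketch
{
    static void Main()
    {
        var config = new ClusterConfiguration();
        config.Globals.DeploymentId = "TestDeployment";
        config.Globals.DataConnectionString = "http://localhost:8500/";
        // LivenessProviderType.Custom plus MembershipTableAssembly replaces adding
        // a new SystemStoreType to the configuration code (see #648).
        config.Globals.LivenessType = GlobalConfiguration.LivenessProviderType.Custom;
        config.Globals.MembershipTableAssembly = "OrleansConsulUtils";

        using (var silo = new SiloHost("silo-1", config))
        {
            silo.InitializeOrleansSilo();
            silo.StartOrleansSilo();
        }
    }
}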

@gabikliot gabikliot self-assigned this Jan 12, 2016
@shayhatsor
Member

AFAIK, we haven't considered Consul in Gigya; @rore can probably provide a definitive answer. I personally wasn't aware of its existence till now.
Also, @gabikliot said:

Is it the key-value store? Are you using its CAS support? What about transactions? UpdateRow must mutate 2 rows atomically and in isolation (see here a discussion about the suitability of Cassandra for this: #1218). You basically need multi-row (batch) writes. Does kv support that? If you are using kv and it does not support multi-row (batch) writes, don't despair. We can discuss what the implications will be. Our MBR protocol will still work and most of its features will work correctly.

I might be (probably am) wrong here, so add "it seems to me that" to every sentence:
I'm not sure the MBR protocol will work correctly if you don't have conditional, atomic and isolated updates of rows, because without that, data can be overwritten by any silo with no version checks upon updates. The table will not move forward as expected.
It seems to me that the problem is that we essentially have two different protocols for membership state - Consul and Orleans. The IMembershipTable interface mandates that the implementation conforms to the Orleans protocol. When working with a service like Consul, you've already accepted its guarantees and limitations, and would want to utilize it fully. Therefore, there should be a lower-level interface like IMembershipProvider that has basically two methods, Update and Query, which communicate with the externally managed membership state. The IMembershipTable interface will then be used by OrleansMembershipProvider.
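A rough sketch of the shape I have in mind (the interface name and method signatures are hypothetical, not an existing Orleans type):

using System.Collections.Generic;
using System.Threading.Tasks;
using Orleans.Runtime;

public interface IMembershipProvider
{
    // Push this silo's latest state to the externally managed membership service.
    Task Update(SiloAddress silo, SiloStatus status);

    // Query the externally managed service for the current cluster view.
    Task<IReadOnlyDictionary<SiloAddress, SiloStatus>> Query();
}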

@PaulNorth
Contributor Author

@jthelin Thanks, I have now read that page and will make the required change.

@gabikliot

  1. You are pretty much correct with your assumption, but to elaborate: we have project goals of being able to install our (micro)service products either in Azure or on-premise at customers where the incumbent IT provider controls the infrastructure. Our product needs to support thousands of users located in multiple offices across the globe, some with limited, very slow or no internet access, and not all with Domain access. We are often treated as hostile by the incumbent IT provider (as their parent company will often market a competing product, or they don't understand the technology and won't support it), so we have to deliver products that are very simple to install, deploy and update. We considered a hybrid Azure solution, but the industries we work with are also very anti-cloud due to a risk-averse culture and a lack of trust in new(ish) technology. So Consul allows us to have cloud-like service discovery on-premise (or proxied for external users), neatly wrapped up inside our service products. We are also moving toward NoSQL databases and none of our existing customers use ZooKeeper, so Consul was the most logical choice.

  2. We are registering each unique silo+generation as a new service registration (https://www.consul.io/intro/getting-started/services.html) but I understand your points and will consider further how to achieve the required consistency with Consul before proposing a solution.

  3. The Consul.net code does not currently support async; I have asked the owners of that project if they intend to add support. In the meantime I will switch to using async calls with HttpClient.

  4. I must have missed this change to the SystemStore while merging with the latest master, I will fix that up.

Thanks for the excellent feedback.

@rore

rore commented Jan 12, 2016

Just a small comment about Consul - we looked at it as a possibility for doing service discovery; we're not sure yet whether we will actually use it. But I believe it will be a great addition as a membership provider alongside ZooKeeper, as it is gaining adoption and becoming a common tool.

@dnfclas

dnfclas commented Jan 12, 2016

@PaulNorth, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR.

Thanks, DNFBOT;

@alfeg

alfeg commented Jan 12, 2016

@PaulNorth, just my 2 cents

Because the Consul API is pure HTTP REST, maybe it's better to use HttpClient directly for all communication; then async will be a free bonus. As far as I can see from the PR, not many Consul API endpoints are used, so not a lot of changes would be needed.

THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS
OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
Contributor

We recently removed all license verbiage from source files and left it only in https://github.com/dotnet/orleans/blob/master/LICENSE. Please remove it from the sources to reduce noise.

@sergeybykov
Contributor

It seems to me it should be trivial to make Consul async - by adding an async method (ExecuteInternalAsync?) to Consul.Query that instead of WebRequest.GetResponse() on https://github.com/PlayFab/consuldotnet/blob/master/Consul/Client.cs#L385 would call WebRequest.GetResponseAsync().
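Something along these lines. This is a sketch only: the method name comes from my comment above and is not an existing Consul.NET API, and it assumes the library's synchronous path bottoms out in WebRequest.GetResponse().

using System.Net;
using System.Threading.Tasks;

static class AsyncRequestSketch
{
    public static async Task<WebResponse> ExecuteInternalAsync(WebRequest request)
    {
        // GetResponseAsync() is the built-in async counterpart of GetResponse() (available since .NET 4.5).
        return await request.GetResponseAsync().ConfigureAwait(false);
    }
}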

@sergeybykov
Contributor

I think I can craft a PR for https://github.com/PlayFab/consuldotnet if somebody could help testing it.

@PaulNorth
Contributor Author

@sergeybykov You may want to hold fire for a while, as they have a PR open which switches to using HttpClient (PlayFab/consuldotnet#31); after I asked the question (PlayFab/consuldotnet#33) it looks like the author is merging it in. He also mentioned preferring to create a separate AsyncClient. It's up to you, but as @alfeg and I have commented, it may be cleaner to just use HttpClient async calls directly as I am only using a small subset of Consul commands.

@sergeybykov
Contributor

I'm almost done with the PR. I might submit it anyway. :-)

@gabikliot
Contributor

OK, let's discuss the options for implementing membership table integration. Here are all the options I see:

  1. Orleans Membership table implemented by registering each silo as a Consul service. That is what you do now in this PR, by using https://www.consul.io/intro/getting-started/services.html. As @shayhatsor pointed out, I think this is wrong. My understanding of this implementation is that you just ask to update the registration with the latest data, and the last update will always overwrite the previous one, without checking versions.
    This will not work correctly, since there is no concurrency control on concurrent registrations.
    Therefore, in Orleans's membership protocol, if 2 silos suspect silo 3 and concurrently write their suspicions into the table, in your implementation the 2nd one may overwrite the 1st one. So, in short, it just does not work. I mean, I can think hard about how maybe it will somehow eventually "settle in" (in the next round they will suspect again), but really this would be a different membership protocol. It would not be what is described here, and thus I prefer not to think about it right now.

  2. Orleans Membership table implemented via Consul kv. This is very similar to what I recently advised for Cassandra and also basically similar to Azure Table. KV supports CAS, so you will use it for concurrency control (see the sketch at the end of this comment). I also hope this kv table is persistent, or at least somehow reliable. One problem with kv is that, as far as I could tell from the docs, it does not support write-batch transactions, which are needed for Orleans' membership table. But the good news is that even without write-batch transactions, the Orleans membership protocol will work correctly. You will be missing some minor features, but you can really run in prod without those. I will provide more low-level details separately about how Orleans's membership will work if used on the MBR table without write txs. But for now, let's just say it will work, no problem.

  3. Use some other built-in Consul mechanism to build the Membership table. I see they have distributed semaphores and locks, so maybe it is even easier to build the MBR table with semaphores than with kv. I have no preference. It boils down to the reliability level and failure modes of those: what happens to the safety and liveness of each (semaphores and kv) when there are failures/crashes/network partitions.

  4. Orleans Membership table + service registration: in this option you combine 2/3 with 1. You both implement the Orleans Membership table with either 2 or 3, and also register each silo as a Consul service. Registration as a Consul service is not for Orleans: Orleans will not use it in any way, and it will not be done as part of the Orleans Membership table impl. You can register silos as a Consul service in the silo host process, let's say after the silo has successfully started. That way, if you have some other, non-Orleans service that needs to be aware of Orleans for something (like if you want to monitor all your services in a unified way, or watchdog them, or just iterate over them, or discover them for some other purpose), you can have it.

  5. Consul-based Membership integrated in Orleans - in this option you don't implement a membership table, but rather substitute the whole Orleans membership protocol with Consul's. That is what @shayhatsor suggested above as IMembershipProvider. Luckily, we have that already. It is called ISiloStatusOracle. It is a local interface that the membership Oracle (our current implementation of the MBR protocol) exposes to the rest of the silo runtime. All silo components interact with membership via that interface only. That means you would need to re-implement this interface in terms of Consul: basically delegate some calls to Consul, and when Consul MBR tells you another silo is down you notify all local silo components, and so on. Basically, this is a layer that exposes membership information locally to the silo. Will it work? Yes! I know it will work, since we have already done that in the past: we had a fully working integration with the Service Fabric Naming Service (which is Service Fabric membership) done this way. That is also why this interface is pretty polished and I know for sure it does not "leak" anywhere.
    This is definitely more complicated integration work than option 2. First, we would need to add hooks via configuration and appropriate factories (or DI) so you can plug it in. Second, there are just many more integration points and more complexity, so you need to be more careful. The advantage of this may be, as @shayhatsor pointed out, "When working with a service like Consul, you've already accepted its guarantees and limitations, and would want to utilize it fully." So it would be a "more natural/native" integration. Of course, the devil is in the details (in service semantics and guarantees). How does the other MBR protocol behave in the face of network partitions? What are its failure detection accuracy and completeness guarantees? What are the timing guarantees? Will uniform agreement on the membership be provided, and how timely will it be? For example (and I have worked a LOT in the past with gossip-based MBRs), they usually provide a VERY eventual, very-far-into-the-future point at which all alive servers agree on the set of dead servers. This is usually inherent to the random nature of those epidemic-style gossip protocols. It may take a long time. I don't know how timely Consul's Serf is, but if it takes a long time the consequences may be pretty bad for a system like Orleans, which relies on all alive silos agreeing very quickly (usually seconds or even faster).
    I also don't know how good/robust Consul's protocol is. As you can read from our description, our MBR protocol was specifically adapted for the cloud and hardened by years of production usage. For example, if you look at the last year of issues on GH, I think you will not find a single case of people reporting bugs in the MBR protocol. There is a reason for that.
    But 5 is definitely an option, especially if you have satisfactory answers to all those semantics and guarantees questions.

If I had to choose out of all those options, I would start with 2. It is the easiest to implement and the safest, since it has already been done at least 3 times (Azure Table, ZK, SQL).
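To make the CAS point in option 2 concrete, here is a small sketch against Consul's standard KV HTTP API, where PUT /v1/kv/<key>?cas=<ModifyIndex> only succeeds if the key's ModifyIndex still matches what the writer last read. The class and method names are illustrative, not part of this PR.

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

static class ConsulCasSketch
{
    // Returns true if the check-and-set write succeeded, false if another writer won the race.
    // Assumes the HttpClient's BaseAddress points at the Consul agent (e.g. http://localhost:8500/).
    public static async Task<bool> TryCasWrite(HttpClient http, string key, string value, ulong lastModifyIndex)
    {
        var response = await http.PutAsync(
            "v1/kv/" + key + "?cas=" + lastModifyIndex,
            new StringContent(value, Encoding.UTF8));
        var body = await response.Content.ReadAsStringAsync();
        return bool.Parse(body.Trim()); // Consul answers a CAS put with "true" or "false"
    }
}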

@PaulNorth
Contributor Author

Thanks for your thoughts on this @gabikliot, it's very useful information. We were working on this yesterday and have been doing a PoC on a solution best described as a variation on options 2+4.

Correct me if I'm wrong, but it is my interpretation that the only silo information that is subject to contention is: TableVersion, TableVersionEtag, SiloStatus, SiloEtag, SuspectingSilos? Building on that, my assumption is that no silo will ever try to update a different Silo's other details; in particular its IAmAliveTime.

Our solution is based on this assumption and registers the basic silo information as a Service Registration. The assumption is that only the silo itself will ever need to update this registration when it calls UpdateIAmAlive(), and it is further assumed that the silo will not make multiple competing calls to UpdateIAmAlive().

All the data subject to contention (as above) is stored in a KV entry, and writes are kept consistent by using Consul's CAS. This partitioning has the benefits of minimising the length of the KV value (it is limited to 512KB), using Consul's Service Registration, which is a natural fit for the silo endpoint information, and reducing contention on the KV entry due to silos updating their IAmAlive.

On read, this solution requires one call to get the Service Registration (which will return all known silos) and a second call to get the cluster "Liveness" KV; the ConsulMembershipProvider then rehydrates this data into the format that Orleans expects.

What are your thoughts on this hybrid approach?

We do have some issues at present:

  1. I've never seen it happen, so when does DeleteMembershipTableEntries() get called? Is there a periodic cleanup of rows for dead silos and their generations?
  2. The KV value is a Base64-encoded list of liveness information; if nothing cleans out the dead silos then, as well as being inefficient, it will eventually hit the KV value length limit. We could investigate splitting the silo liveness into one KV pair per silo and one for the TableVersion, but that would mean Consul couldn't support atomic updates when incrementing the Version and Etag.
  3. A side issue, the async changes pushed to Consul.net have updated Newtonsoft.Json to version 7.0.0.0. I have read the External References guidance but defer to the Orleans team as to whether you are happy to upgrade Newtonsoft.Json at this time.

@PaulNorth
Contributor Author

I have updated the PR which addresses every issue raised and implements the approach I proposed today.

Known Limitations / Issues (Incorporates & supersedes my previous list of issues :) )

  1. The growth of dead silo information will eventually exceed the KV value limit of 512KB. This is not an issue if the dead silo rows can get cleaned up.
  2. I have had to add AutoGenerateBindingRedirects & GenerateBindingRedirectsOutputType to the OrleansConsulUtils.csproj to work around the fact that Consul.net upgraded to version 7.0.1 of Newtonsoft.Json. Please can you advise on how best to correct this, as I know you prefer to keep all external references at the same version.
  3. Using a config file to set the Custom Membership Provider does not work, as the GlobalConfiguration Load() method also expects the ReminderProvider to be implemented as Custom; I'm not sure if this is by design. It does, however, work if you configure it through code as suggested.
  4. The Consul provider will only work with a single Consul datacenter, as Consul does not replicate KV values between datacenters. There is a separate project (https://github.com/hashicorp/consul-replicate) to address this, but I have not yet investigated it.
  5. I will write a help page once the approach with Consul is agreed.

@highlyunavailable
Contributor

About item 2: there's an issue open for that (PlayFab/consuldotnet#28) but no real progress has been made on it. The problem is that the version doesn't actually matter (it uses an extremely simple subset of Newtonsoft.Json), but there's no good way that I know of to tell .NET "use any Newtonsoft.Json.dll over version 4.0.0.0, whatever you've got is fine".

@gabikliot
Contributor

Hi @PaulNorth. Great work, but let me suggest a slightly different approach. I don't like the single kv entry for ALL silo data (even if it is only the dynamic part that is subject to contention). It limits the size of the deployment and creates the need to shorten/clean it up. Let me suggest an alternative.

Use one kv entry per silo and just don't implement Table version at all. As you saw in the protocol description, Table version is only used in the extended protocol. The basic protocol will work without it, just like that. What you will lose is really not much:

  1. You won't have monotonically increasing version numbers attached to membership views, and membership views themselves would not be totally ordered. But we don't actually use this anywhere in Orleans. Totally ordered membership views are a nice feature to have, and we had plans to use them in other protocols in the directory later on, but those plans have not materialized yet. As of now, nothing in Orleans (except point 2 below) relies on that, so you won't lose anything without it.
  2. Total order guarantees that silos join one by one, never concurrently. We use that fact to validate two-way connectivity between any pair of silos (the JOINING silo state). Without total order, what could happen is that, let's say, you have 5 active silos and 2 join simultaneously. Each of the 2 will validate two-way connectivity with each of the 5, but not with each other. They will still join fine and all will work as usual; we will just do a partial instead of a full connectivity check. But stepping back, you don't need this connectivity check at all. I am actually not aware of any distributed system that performs connectivity validation; Orleans goes above and beyond here. It is usually assumed to be provided by the underlying network/infra. In our case we saw some very weird cases of misconfigured firewalls on Azure, so we had to add this check. But if you trust your network (and don't misconfigure your firewalls), you don't need Orleans to be smart about checking your network.

So my suggestion is to completely ignore Table version. Upon update/insert just don't write it anywhere at all, and only write the silo kv row. Upon read, just always return Table version 0 with the same etag. That way, in the future, if we do decide to use it somewhere else, it will be very clear that Consul membership does not support Table version and we will be able to look for alternative solutions (use a semaphore, use your approach, ...). I would not be worried about that.
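In code, the read path I am describing would look roughly like this. This is a sketch only, assuming Orleans' IMembershipTable types (MembershipTableData, MembershipEntry, TableVersion); the class name and the Consul read helper are hypothetical.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Orleans;

public class ConsulBasedMembershipTableSketch
{
    public async Task<MembershipTableData> ReadAll()
    {
        // One kv entry per silo; this helper is hypothetical and out of scope here.
        List<Tuple<MembershipEntry, string>> silosWithEtags = await ReadAllSiloEntriesFromConsulKv();

        // Table version is deliberately not implemented: always report version 0 with a
        // constant etag, so any future dependency on it becomes immediately obvious.
        var notUsed = new TableVersion(0, "0");
        return new MembershipTableData(silosWithEtags, notUsed);
    }

    private Task<List<Tuple<MembershipEntry, string>>> ReadAllSiloEntriesFromConsulKv()
    {
        throw new NotImplementedException();
    }
}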

Now, if you agree with the above, you should not be concerned about kv size. That means you can further simplify your protocol and only use kv, without registering the service. I actually prefer you do it this way, since by utilizing both you have created too tight a coupling between those 2 systems (kv and registration), and this may be fragile. For example, what will happen if service registration thinks a silo is down, but Orleans membership does not think so? You may not be able to read all silos' info from service registration. Also, from the standpoint of taking strong dependencies, it is better to take a dependency on only one system and not on 2. It will also simplify debugging/testing/troubleshooting if something does not work.

If you still want silos to be registered as a service, do it like in option 4, but don't make Orleans membership depend on 2 different systems. That is at least my intuition.


DeleteMembershipTableEntries is only called in tests. We don't automate deleting in prod, simply because we use very little storage space and it is so cheap, so we just leave it there. Plus it's good for later diagnostics to keep it there. If you still want to delete, you could run a cron job, or better: use an Orleans reminder that will poke a grain once a day, and this grain can delete old entries. Deleting is not trivial since you need to be 100% sure you are not deleting something prematurely.
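A rough sketch of that reminder-driven cleanup, assuming Orleans' grain and reminder APIs; the grain interface, class and reminder names are illustrative only.

using System;
using System.Threading.Tasks;
using Orleans;
using Orleans.Runtime;

public interface IMembershipCleanupGrain : IGrainWithIntegerKey { }

public class MembershipCleanupGrain : Grain, IMembershipCleanupGrain, IRemindable
{
    public override async Task OnActivateAsync()
    {
        // Poke this grain roughly once a day.
        await RegisterOrUpdateReminder("cleanup-dead-silos", TimeSpan.FromMinutes(1), TimeSpan.FromDays(1));
        await base.OnActivateAsync();
    }

    public Task ReceiveReminder(string reminderName, TickStatus status)
    {
        // Delete entries for long-dead silos here, being careful not to delete anything prematurely.
        return TaskDone.Done;
    }
}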

I think you can update Newtonsoft.Json in the whole Orleans solution. Just submit it as a separate PR. Don't bundle it with this PR.

I will look at the configuration. I think we should support different system stores for liveness and reminders.

@shayhatsor
Member

I agree with everything @gabikliot said. I just wanted to add something about:

Deleting is not trivial since you need to be 100% sure you are not deleting something prematurely.

A safe table cleanup can be implemented as follows:

  1. read all table entries and group by ip:port
  2. in each group - find the biggest generation and delete all other entries
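In code, roughly (a sketch only; the entry type, its properties and the delete callback are illustrative, not actual Orleans or Consul APIs):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Illustrative stand-in for a membership entry's identifying fields.
class SiloEntry { public string Address; public int Port; public int Generation; }

static class TableCleanupSketch
{
    public static async Task CleanUp(IEnumerable<SiloEntry> allEntries, Func<SiloEntry, Task> deleteEntry)
    {
        var stale = allEntries
            .GroupBy(e => new { e.Address, e.Port })                 // 1. group by ip:port
            .SelectMany(g => g.OrderByDescending(e => e.Generation)  // 2. keep the newest generation;
                              .Skip(1));                             //    everything older is safe to remove
        foreach (var entry in stale)
            await deleteEntry(entry);
    }
}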

@PaulNorth
Contributor Author

@gabikliot I will give this approach a go today.

Is there any policy on upgrading Newtonsoft.Json to the same version as Consul (7.0.1) or to the latest current version (8.0.2)?

@shayhatsor
Member

Is there any policy on upgrading Newtonsoft.Json to the same version as Consul (7.0.1) or to the latest current version (8.0.2)?

that's probably a question for @jthelin

@sergeybykov
Contributor

I see no reason for not upgrading to 8.0.2.

@jthelin
Contributor

jthelin commented Jan 15, 2016

Let's look at the data:
Newtonsoft.Json v8 is still very new (Dec 29), and most of the installed base (~91%) looks to be still on v7
https://www.nuget.org/packages/newtonsoft.json/

Version Downloads Last updated
Json.NET 8.0.2 (this version) 70,445 Saturday, January 9, 2016
Json.NET 8.0.1 114,019 Tuesday, December 29, 2015
Json.NET 7.0.1 1,968,636 Monday, June 22, 2015

Also, a "patch" release within 10 days of v8 availability is not an encouraging sign (too much "Christmas Cheer" maybe?) and suggests to me that v8 is maybe not completely "stable" just yet?

I would favor staying at the tried and trusted 7.0.1 stability point for now.

}
catch (Exception ex)
{
_logger.Verbose("ConsulMembershipProvider failed to insert registration for silo {0}; {1}.", entry.SiloAddress, ex);
Contributor

I recommend logging this and all exceptions at least at Info level, maybe even Warning here.

Contributor Author

I have gone with Info level for exceptions, for two reasons: 1. The exception is rethrown anyway, and the Orleans MembershipOracle looks to be already logging these exceptions as errors. 2. Logger.Warn() requires an ErrorCode, and every example I can find uses the Orleans.ErrorCode enum, which I assume I shouldn't modify for the benefit of an extension or arbitrarily overload an existing error code.

@gabikliot
Contributor

Insert and Update look good now.
Agree on the b option for UpdateIAmAlive.

We are very close now!

@PaulNorth
Contributor Author

Notes:

  1. ReadRow and ReadAll now read 2 KVs per silo in a single HTTP GET and then filter by key to deserialize to a MembershipEntry.
  2. If IAmAlive is not found (e.g. it hasn't been written yet) then the silo StartTime is used as the default.
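For reference, the single GET in note 1 can be done with something like Consul's standard "?recurse" KV query, which returns every key under a prefix in one response; this is a sketch only, and the class and parameter names are illustrative.

using System.Net.Http;
using System.Threading.Tasks;

static class ConsulRecurseReadSketch
{
    // Assumes the HttpClient's BaseAddress points at the Consul agent (e.g. http://localhost:8500/).
    public static Task<string> ReadAllKvForDeployment(HttpClient http, string deploymentId)
    {
        // Returns a JSON array of KV pairs (values Base64-encoded); the provider then filters
        // by key (membership entry vs. its IAmAlive sub-key), deserializes each MembershipEntry,
        // and falls back to the silo StartTime when the IAmAlive sub-key is missing.
        return http.GetStringAsync("v1/kv/" + deploymentId + "?recurse");
    }
}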

@shayhatsor
Member

👍

@gabikliot
Contributor

Looks good!
Please squash and rebase to master and I will be ready to merge.

@PaulNorth
Contributor Author

@gabikliot We are just starting to work with GitHub here and have never had to rebase and squash changes. So, following guidance, I have pulled and merged the latest from orleans/master into PaulNorth/master and used "Rebase...Interactively" in SourceTree on my branch, choosing to squash the commits during the rebase. This hasn't created the commit history I was expecting to see, so can you advise further if it is not what you wanted?

@gabikliot
Contributor

I am not a GIT expert either, there might be other ways, but the way I do it is the following:

  1. First squash all commits into one. There are multiple ways to do that, like git merge --squash, or others. A lot of the time I would not even bother and would just manually create a new branch and copy all my changes there in one commit. Let's say this new local branch is called X. X now has all your changes in one commit, but maybe not all commits from master.
  2. Rebase this branch on master. That will cause the new commit to be the last one in branch X, after all the commits that are already in master.
  3. Now you need to force-push branch X to the remote branch we were working on (this branch, ConsulProvider, in your remote repo). You do it with:

git push origin +HEAD:Y
This pushes the currently checked-out branch (branch X) to a remote branch with name Y (ConsulProvider).

@PaulNorth
Contributor Author

Cheers for the advice, the commit should be what you are after now.

gabikliot pushed a commit that referenced this pull request Jan 29, 2016
Added support for using Consul 0.6.0 as a Membership Provider
@gabikliot gabikliot merged commit c1ff255 into dotnet:master Jan 29, 2016
@gabikliot
Contributor

Great!
Thank you very much @PaulNorth for your contribution!

It would be great if you could write a documentation page, where you can:
a) describe the general benefits of using Consul to host Orleans on prem (basically, what we discussed in the beginning of this issue).
b) the impl. details: how you implemented the membership table via kv, the fact that we didn't implement table version for now, how you map data to kv, the sub-row for IAmAlive, ...
c) Configuration guide: how to configure Orleans to use Consul membership.

You can put this page at: https://github.com/dotnet/orleans/tree/gh-pages/Runtime-Implementation-Details/Consul

@sergeybykov
Contributor

Big thanks to @PaulNorth and everyone who helped shape this into another important option for running Orleans anywhere you need it, the way you need! Shows the true power of OSS collaboration and leveraging other people's work.

@PaulNorth PaulNorth deleted the ConsulProvider branch January 29, 2016 09:37
@PaulNorth
Contributor Author

Glad I could help. I will keep an eye on Consul development and update the provider when a stable release supporting atomic operations is available.

@armon

armon commented Apr 6, 2016

@PaulNorth we have multi-key transactions slated for the upcoming 0.7 major release!

@highlyunavailable
Contributor

At some point a package update should be done for the Consul/Orleans integration, because I've fixed quite a few bugs in Consul.NET and cleaned up a ton of things - see the Changelog (everything from 2016-02-07 and newer is not in the Consul/Orleans package). I'll look into making a PR for the update myself, but it may be a lot faster/cleaner if @PaulNorth does a PR when he has time, since the CLA has been signed and all that jazz. It should only involve updating the src/OrleansConsulUtils/packages.config file to point to version 0.6.4.1, though the ConsulClient is now IDisposable so it might make sense to move some stuff into using blocks. It's not 100% required though, since no native resources are used; it's just a way to pass the Dispose() call on to the HttpClient.
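For example, the using-block change is just something like the following sketch, assuming Consul.NET 0.6.4's ConsulClient (now IDisposable); the class and method names are illustrative.

using Consul;

class ConsulClientUsageSketch
{
    void Example()
    {
        using (var client = new ConsulClient())
        {
            // ... issue the provider's KV / catalog calls through the client as before ...
        }   // Dispose() here simply forwards to the underlying HttpClient
    }
}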

Of course I'll also be updating the .NET API for Consul 0.7 whenever it comes out. Cool to hear that multi-key transactions are going to be a thing, @armon.

@sergeybykov
Contributor

@PaulNorth If you'd like to update OrleansConsulUtils, now is the time, so that the changes will be included in 1.2.0.

@PaulNorth
Contributor Author

PaulNorth commented Apr 22, 2016

@sergeybykov It appears that the Orleans master branch already uses the latest Consul.net client (0.6.4.1) (thanks @highlyunavailable), and HashiCorp have not yet released version 0.7 of the server. The changes to take advantage of multi-key transactions will have to wait until the next release.

@sergeybykov
Contributor

👍

@armon

armon commented May 21, 2016

@PaulNorth @sergeybykov Just a heads up that Consul now supports multi-key transactions in master.

@gabikliot
Contributor

Nice!
