This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

Port Phoenix to 0.96 HBase #349

Open
jtaylor-sfdc opened this issue Jul 29, 2013 · 16 comments

@jtaylor-sfdc
Contributor

Here's what I can think of that'll be required:

  1. Protobuf MetaDataProtocol endpoint coprocessor
  2. Protobuf HashCacheProtocol endpoint coprocessor
  3. Deal with any namespace issues. Phoenix uses HBase table names of the form "<schema_name>.<table_name>", so I'm not sure if this leads to issues with the new namespace scheme.

There'll likely be other issues too of which I'm not aware. One enhancement I'd like to take advantage of is the new way of modifying HBase tables without having to disable and re-enable them.

Once things are working, it'd be interesting to try to swap in the new type system to measure the performance impact. This would likely be a bigger task.

@ghost ghost assigned tedyu Aug 1, 2013
@ghost ghost assigned jeffreyz88 Nov 4, 2013
@jtaylor-sfdc
Contributor Author

One other issue that @jyates brought up is that the WALEditCodec will need to be different for 0.96.

Not sure if this is feasible, but it would be ideal if Phoenix could work on both 0.94 and 0.96 with the same code base by having some kind of thin shim layer.

@ndimiduk
Contributor

ndimiduk commented Nov 4, 2013

Might I suggest tracking the type conversion as a separate ticket?

@jeffreyz88

"Deal with any namespace issues. Phoenix uses HBase table names of the form "<schema_name>.<table_name>", so I'm not sure if this leads to issues with the new namespace scheme."
The above won't be an issue because 0.96 namespaces use ':' as the delimiter.
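
To illustrate why the '.' in Phoenix names should not collide with the namespace delimiter, here is a minimal sketch. `QualifiedName` is a hypothetical helper written for this example, not HBase's own parsing code, and the fallback to a namespace called `default` is an assumption about how unprefixed names are treated.

```java
// Hypothetical helper for illustration only; this is not HBase's own name-parsing code.
final class QualifiedName {
    final String namespace;
    final String qualifier;

    private QualifiedName(String namespace, String qualifier) {
        this.namespace = namespace;
        this.qualifier = qualifier;
    }

    // Splits on the first ':' only; a '.' is treated as an ordinary character.
    static QualifiedName parse(String fullName) {
        int idx = fullName.indexOf(':');
        if (idx < 0) {
            // Assumption: names without a namespace prefix fall into "default".
            return new QualifiedName("default", fullName);
        }
        return new QualifiedName(fullName.substring(0, idx), fullName.substring(idx + 1));
    }
}
```

Under these assumptions, `QualifiedName.parse("MY_SCHEMA.MY_TABLE")` keeps the whole string as the qualifier, so the Phoenix schema prefix is never mistaken for a namespace.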

"Might I suggest tracking the type conversion as a separate ticket?"

That's a good idea. I think the scope of the current work is to run Phoenix on 0.96 without any enhancements. Future enhancements that build on new 0.96 features can be added case by case later, once we have Phoenix running on 0.96.

@jtaylor-sfdc
Contributor Author

Created #521 as a separate issue for using the new type system. Interested, @ndimiduk?

@ndimiduk
Contributor

ndimiduk commented Nov 5, 2013

We shall see ;)

@jyates
Contributor

jyates commented Nov 13, 2013

We could make the port be backwards compatible by supporting a couple of different shims.

First, clearly, would be the incompatible changes between 0.96 and 0.94 for HBase (specifically, I'm thinking of things like CP and the WALEditCodec for indexes). These would go behind a compatibility layer, like HBase uses, that includes the right dependencies at compile time.

The other side is dealing with Hadoop1 and Hadoop2 compatibility. Right now, we don't have anything that doesn't work when you compile with the right dependencies (e.g. against hadoop2 where HBase has also been compiled against hadoop2 in the local .m2 repo). However, over in tracing, I'm going through the same kind of pain that Elliot Clark did for metrics2, where the interfaces change between Hadoop1 and Hadoop2 (which would need its own shim layer too).

We have the option of doing a "poor man's shim" (PMS...an unfortunate acronym) by doing the same thing that HBase 0.94 does for security - adding in a new source directory when the correct profile is selected. However, dealing with all these different versions seems to say that we really should consider doing a full-blown, multi-module approach to support these different pieces working together.

Maybe this is too many things to support at once though? Just doing hadoop1/2 could be supported by a PMS in either 0.94 or 0.96, but doing 0.94 and 0.96 against each might start to be a nightmare. It could just be too much overhead and we just say that Phoenix 3.X is 0.96 and Phoenix 2.X is 0.94 and evolve from there.

Note, the above doesn't solve the problem of compiling against HBase 0.94 on Hadoop2, as there isn't a public 0.94-Hadoop2 artifact yet; we would need to rely on the existing process of having that version stored in the local Maven repository when building.

@jeffreyz88

Ideally we could create a shim interface and build 4 concrete implementations (as different Maven modules) for the combinations of HBase 0.94, HBase 0.96, hadoop-1 and hadoop-2 at compile time, like Hive does for different Hadoop versions. Unit tests would have to run 4 iterations though, once against each shim implementation.
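
A minimal sketch of what such a shim interface could look like. All names here (`CompatShim`, `Shim094`, `Shim096`, `ShimLoader`) are hypothetical and do not come from Phoenix, HBase or Hive. For brevity the sketch covers only the HBase axis and picks the implementation at run time from a version string, whereas the proposal above selects separate Maven modules at build time. The only version-specific fact it encodes is the WAL reader class name discussed in this thread.

```java
// Hypothetical shim interface: callers depend on this, never on a specific HBase line.
interface CompatShim {
    // Simple name of the WAL reader class used by this HBase line.
    String defaultWalReader();
}

// Hypothetical implementation for the 0.94 line.
final class Shim094 implements CompatShim {
    @Override
    public String defaultWalReader() {
        return "SequenceFileLogReader";
    }
}

// Hypothetical implementation for the 0.96 line.
final class Shim096 implements CompatShim {
    @Override
    public String defaultWalReader() {
        return "ProtobufLogReader";
    }
}

final class ShimLoader {
    // True when version is exactly the given line or a release within it (e.g. "0.96.2").
    private static boolean onLine(String version, String line) {
        return version.equals(line) || version.startsWith(line + ".");
    }

    // Assumption: the caller supplies the HBase version string it is running against.
    static CompatShim forHBaseVersion(String version) {
        if (onLine(version, "0.94")) {
            return new Shim094();
        }
        if (onLine(version, "0.96")) {
            return new Shim096();
        }
        throw new IllegalArgumentException("No shim for HBase " + version);
    }
}
```

With a build-time variant, each implementation would live in its own module and only the matching one would end up on the classpath; the interface itself would stay the same.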

The shim layer work is significant (even HBase 0.96 doesn't have a magic shim) and can possibly be done later. So for now I'm using the simplest approach, the "Phoenix 3.X is 0.96 and Phoenix 2.X is 0.94 and evolve from there" that you mentioned above.

@jeffreyz88

I'm currently migrating the WAL-related code. One issue is that in 0.96 we changed SequenceFileLogReader#WALReader to private, so IndexedHLogReader can't compile because it accesses the private class.

Since 0.96 the WAL is protobuf-based (ProtobufLogReader); SequenceFileLogReader is left in place for reading legacy WALs, which mainly happens during an upgrade.

My questions are:

  1. Do we want to support reading legacy WALs from 0.94?
  2. We can either make SequenceFileLogReader#WALReader non-private in HBase 0.96 so that Phoenix can access it, or copy the code into the Phoenix code base, as this part is unlikely to change.

What do you think? Thanks.

@jyates
Contributor

jyates commented Dec 4, 2013

In the upgrade from 0.94 -> 0.96, you need to do a full shutdown. I don't think we need to support the SequenceFileLogReader and instead can just use the ProtobufLogReader as all the WALs will be removed on a clean shutdown for the upgrade.

If you don't do a clean shutdown (so you have outstanding WALs) then it's your own **** fault - no warranty :)

@akarray

akarray commented Dec 11, 2013

Really impressed by your work. Please let us know if there is a branch tested with hbase 0.96.
Thanks.

@jeffreyz88

Not yet. I'm still porting, because Phoenix uses endpoint coprocessors, the WAL, filters and other areas that HBase 0.96 changes significantly. Once the porting work is done, I'll check the changes into the port-0.96 branch and then test with others. The original plan was to complete the code around the middle of Dec; that has since been pushed out to the middle of Feb. Thanks.

@nehalecky

Do I understand correctly that this port won't be around until the middle of February? If so, is there a development branch that is a WIP?

Thanks much, this is an exciting project!

@jeffreyz88

Yes, the first code drop should happen around the middle of Jan and the code will be in the port-0.96 branch. A usable version (after tests and more code refactoring) should follow around the middle of Feb. Thanks.

@SatyaNarayan1

Hi Jeffrey,

Any progress on this? I checked port-0.96 and the code is 2 months old.
Do you have a working version? Can you please update the branch?

Thanks in advance.

@jeffreyz88

The migration work is "done" (#688 (comment)), though most unit tests still fail. Over the following weeks I will resolve the unit test failures, rebase the code onto the master branch, and do more testing.

The current development branch is at https://github.com/jeffreyz88/phoenix/tree/port-0.96, and the work will move to the Apache Phoenix git repo once it's created, within the next two weeks.

Thanks.

@jeffreyz88

Just a heads up. Recently I found an incompatible change in HBase 0.96 (https://issues.apache.org/jira/browse/HBASE-10366). Therefore, the minimum supported HBase version will be 0.96.2 or 0.98; alternatively, you can apply the HBASE-10366 fix in your deployment. Thanks.
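
As a rough illustration of that version floor, the sketch below compares an HBase version string against 0.96.2. `MinVersionCheck` is a hypothetical helper, not part of Phoenix or HBase, and it assumes version strings of the form `major.minor.patch` with an optional `-suffix`. It cannot detect a deployment where the HBASE-10366 fix was applied by hand.

```java
// Hypothetical helper for illustration; not part of Phoenix or HBase.
final class MinVersionCheck {
    // Minimum taken from the comment above: HBase 0.96.2.
    private static final int[] MIN = {0, 96, 2};

    // Parses the first three numeric components, ignoring a suffix such as "-hadoop2".
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] out = new int[3]; // missing components default to 0
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // True when the version is 0.96.2 or later, which also covers 0.98.x.
    static boolean isSupported(String version) {
        int[] v = parse(version);
        for (int i = 0; i < 3; i++) {
            if (v[i] != MIN[i]) {
                return v[i] > MIN[i];
            }
        }
        return true; // exactly the minimum version
    }
}
```

A check like this could run at start-up to fail early on a 0.96.0 or 0.96.1 cluster, with an override for patched deployments.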
