Possible git usage #24

Open
Dr-Emann opened this issue Jul 17, 2012 · 5 comments

@Dr-Emann
Contributor

I was wondering how large the try-haxe folder is getting, with auto-branching. If space becomes a problem, I thought of a solution: use git as a repository to store all of the saved examples, and check them out into a fixed number of folders.

Pros

  • Provides a SHA-1 hash which could be used to identify the example (the first 6 digits should be plenty to remain unique)
  • Compression based on the difference to the parent example
  • Reduce redundancy
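
As a rough check on the 6-digit claim, the birthday approximation puts the expected number of colliding pairs among n uniformly random 6-hex-digit prefixes at n(n-1)/2 divided by 16^6. The helper name `expected_collisions` is made up for this sketch, and it assumes `awk` is available.

```shell
# expected_collisions <n> [digits]
# Expected number of colliding pairs among n random hex prefixes of the
# given length (default 6), using the birthday approximation.
expected_collisions() {
  awk -v n="$1" -v digits="${2:-6}" \
    'BEGIN { printf "%.4f\n", n * (n - 1) / 2 / (16 ^ digits) }'
}

expected_collisions 1000    # a thousand saved examples
expected_collisions 10000   # ten thousand saved examples
```

This comes out to roughly 0.03 for 1,000 saved examples and roughly 3 for 10,000, so 6 digits looks comfortable for a few thousand examples but not indefinitely. `git rev-parse --short=6` lengthens the abbreviation when 6 characters would be ambiguous, which may be a safer way to derive the identifier.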

Cons

  • Requires the server to have git
  • More complex than current strategy
  • Might be premature optimization; if space is not an issue, it is pointless

If you think this would be a worthwhile idea, I can start work on forking and working on an implementation, but I'd rather not start work on something that's not worthwhile/won't ever be used.

@clemos
Owner

clemos commented Jul 17, 2012

Hi Dr Emann,

I've actually been thinking about it too.
It could also be fun to allow people to clone their project for further experiments, like gists for instance.
This could really be an awesome project.
My server has git installed so it should be fine.

Now personally, while it's probably a fun and interesting idea to work on, it's just not something I felt was really urgent, so I just "forgot" it somehow.
Actually, space is not a real problem. The current archive is 1.5G on try.haxe.org, which is not really big (I still have 43G left).

Just feel free to do it, I'll totally merge what you'll come up with.

@Dr-Emann
Contributor Author

Heh, yea, no big rush for it, then. Still would be interesting. You could even write a script to make tags for all the existing ones, and push them into it as well, keeping the same hash.

I thought of two possible implementations.

Separate repositories, pushing to a central, master repo

This will work because git hard-links objects when cloning locally, and you can run git relink periodically to keep new objects linked as well.

Linked git dir

Using git init --separate-git-dir '../master.git' would allow all of the folders to share the same git directory, which gives the maximum space savings. However, all folders would then share the same HEAD and index, so we would have to use the git plumbing commands, combined with a per-folder index file (by setting the GIT_INDEX_FILE environment variable to a file local to the folder, e.g. .local_index), to stage and commit changes manually, e.g.:

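export GIT_INDEX_FILE=.local_index   # per-folder index instead of the shared one
git read-tree "$parent"              # $parent: hash of the example being loaded
git checkout-index -a -f             # write the indexed files into this folder

# Do work
# (.local_index sits in the work tree; exclude it, e.g. in ../master.git/info/exclude)
git add -A .                         # stage the changes into .local_index
tree=$(git write-tree)               # prints the hash of the new tree
git commit-tree "$tree" -p "$parent" -m "nothing"   # prints the new commit hash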

@clemos
Owner

clemos commented Jul 17, 2012

Well this is quite beyond my git knowledge actually...

The second solution seems more natural, though, at least the part that says "all the folders share the same git directory".
It seems good, but then the operations required to update it are obscure to me.
Wouldn't it be possible to achieve similar behaviour the other way round, with one bare repository and checkouts to external dirs? Each directory name could correspond to a commit hash, which would make it easier to manage commits.
I don't know, because I still don't see how you can manage several people committing at the same time to several branches...
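
A minimal sketch of the "one bare repository, checkouts to external dirs" idea, under assumptions: `checkout_example` is a hypothetical helper, and the directory layout (`<base-dir>/<short-hash>`) is invented for illustration. It uses `git archive` to export one commit's files, which only reads objects and so does not touch any shared HEAD or index.

```shell
# checkout_example <bare-repo> <commit-ish> <base-dir>
# Export the files of one commit into <base-dir>/<short-hash> and print
# that path. Paths are assumed to be absolute.
checkout_example() {
  short=$(git --git-dir="$1" rev-parse --short=6 "$2^{commit}") || return 1
  dest="$3/$short"
  mkdir -p "$dest" &&
    git --git-dir="$1" archive "$2" | tar -xf - -C "$dest" &&
    printf '%s\n' "$dest"
}
```

For example, `checkout_example /srv/master.git try-haxe_123abc /srv/checkouts` would populate a directory named after the commit's abbreviated hash (the paths here are placeholders). Concurrent saves would still need care on the writing side; this sketch covers only the read path.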

OTOH, since space is not really an issue, maybe we should focus on the actual features git could provide besides saving space.

As already said, I've been thinking about allowing users to clone their "project" like a gist.
Ideally, the user would get only the branch / history that leads directly to their version.
I don't know if this implies generating a separate "clean" repo, or if branches are enough to achieve this.
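
One way branches alone might be enough, sketched under assumptions: each saved example has a branch named `try-haxe_<id>` (the naming proposed later in this thread), and `clone_example` is a hypothetical helper. `git clone --single-branch --branch <name>` fetches only that branch, so the resulting clone's history is the chain of commits leading to that version.

```shell
# clone_example <central-repo-or-url> <id> <dest>
# Clone only the branch for one saved example. Other examples' branches
# are not fetched into the new clone.
clone_example() {
  git clone --quiet --single-branch --branch "try-haxe_$2" "$1" "$3"
}
```

After `clone_example <repo> 123abc my-copy`, running `git log --oneline` inside `my-copy` would list only the ancestors of that example. Note that a clone from a local path may still hard-link unrelated objects; over a network transport only reachable objects are transferred.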

In the same spirit, being able to navigate through versions directly on the website would be fun.

I'm quite lost, actually :p

@Dr-Emann
Contributor Author

Yea, I was up late last night reading up on the internals of git.

I think the first option would be easier, and safer. It sounds almost exactly like what you suggested, actually: have one bare repository, and have each directory pull from the central repository, then push new commits in. Each new commit would require a new branch (so that it doesn't get garbage collected). The way I see it, it would work like this:

  • User navigates to try.haxe.org
  • User hacks on code, saves an example
    • git makes a new commit, based off a pre-made initial commit that contains the default try-haxe code (this will be the root of all commits)
    • git makes a branch called try-haxe_123abc (the first few digits of the SHA-1 hash)
  • User is presented with a link to try.haxe.org/#123abc
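
The first-save steps above could be sketched roughly as follows. Everything named here is an assumption for illustration: the helpers `create_central` and `save_first_example`, the root branch name `try-haxe_root`, the commit identity, and the use of throwaway temporary clones. All paths are assumed to be absolute.

```shell
# Placeholder identity so commits work on a server without git config.
export GIT_AUTHOR_NAME="try-haxe" GIT_AUTHOR_EMAIL="try-haxe@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# create_central <central.git> <template-dir>
# Create the bare central repo with one root commit holding the default code.
create_central() {
  central=$1; template=$2
  git init --quiet --bare "$central" 2>/dev/null || return 1
  seed=$(mktemp -d) || return 1
  ( cd "$seed" &&
    git init --quiet . 2>/dev/null &&
    cp -R "$template"/. . &&
    git add -A . &&
    git commit --quiet -m "default try-haxe code" &&
    git push --quiet "$central" HEAD:refs/heads/try-haxe_root )
  status=$?
  rm -rf "$seed"
  [ "$status" -eq 0 ] || return "$status"
  # Point the bare repo's HEAD at the root branch so plain clones check it out.
  git --git-dir="$central" symbolic-ref HEAD refs/heads/try-haxe_root
}

# save_first_example <central.git> <code-dir>
# Commit the user's files on top of the root commit, publish the commit as
# branch try-haxe_<short-hash>, and print the short hash for the URL.
# Files are copied over the root's files; deletions are not handled here.
save_first_example() {
  central=$1; code=$2
  work=$(mktemp -d) || return 1
  ( git clone --quiet "$central" "$work/repo" &&
    cd "$work/repo" &&
    git checkout --quiet --force --detach origin/try-haxe_root &&
    cp -R "$code"/. . &&
    git add -A . &&
    git commit --quiet -m "saved example" &&
    short=$(git rev-parse --short=6 HEAD) &&
    git push --quiet origin "HEAD:refs/heads/try-haxe_$short" &&
    printf '%s\n' "$short" )
  status=$?
  rm -rf "$work"
  return "$status"
}
```

The printed value is what would go after the `#` in the link. If the submitted code is identical to the default code, `git commit` reports nothing to commit and the helper fails; a real implementation would need to decide what to return in that case.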

Then later:

  • User2 goes to try.haxe.org/#123abc
    • We go into the oldest directory (the one with the oldest last-modified time)
    • We fetch from the central repository all new commits
    • We checkout branch try-haxe_123abc
  • User2 hacks new code on top of the example
  • User2 chooses to save an example
    • We commit (get new SHA1 of 234bcd)
    • We make a branch (try-haxe_234bcd)
    • We push to the main repo
  • User2 gets a link to try.haxe.org/#234bcd
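
A rough sketch of the load-and-resave steps above, assuming the working directory is an existing clone of the central repository with `origin` pointing at it. The helper names `load_example` and `save_derived_example` and the commit identity are invented for illustration. One deliberate deviation from the literal steps: the example is checked out as a detached HEAD rather than as the branch itself, so that committing does not move `try-haxe_123abc`.

```shell
# Placeholder identity so commits work on a server without git config.
export GIT_AUTHOR_NAME="try-haxe" GIT_AUTHOR_EMAIL="try-haxe@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# load_example <workdir> <id>
# Fetch new commits from the central repo and put example <id> into the
# working directory, discarding whatever the previous user left there.
load_example() {
  ( cd "$1" &&
    git fetch --quiet origin &&
    git checkout --quiet --force --detach "origin/try-haxe_$2" &&
    git clean -q -fdx )
}

# save_derived_example <workdir>
# Commit the current contents on top of the loaded example, publish the
# commit as branch try-haxe_<short-hash>, and print the short hash.
save_derived_example() {
  ( cd "$1" &&
    git add -A . &&
    git commit --quiet -m "saved example" &&
    short=$(git rev-parse --short=6 HEAD) &&
    git push --quiet origin "HEAD:refs/heads/try-haxe_$short" &&
    printf '%s\n' "$short" )
}
```

Because the new commit's parent is the loaded example, `git log try-haxe_<new-id>` in the central repository would show the whole chain of examples it was built on.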

Because each commit is based off of the example that the user started hacking from, the history would automatically include the chain of examples that led to it. A "clone" would be implied by starting at an existing example, and saving something new.

@clemos
Owner

clemos commented Aug 7, 2012

Sounds good, except maybe the part with "oldest last-modified", which I don't really get.
This said, I have no real idea as to the amount of work to implement this.
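
One possible reading of "oldest last-modified", stated here as an assumption since the thread does not spell it out: the server keeps a fixed pool of working directories and, for each request, reuses the one that was touched longest ago (least recently used). The helpers `oldest_dir` and `claim_dir` are hypothetical, and the sketch assumes directory names without newlines.

```shell
# oldest_dir <pool-dir>
# Print the subdirectory of <pool-dir> with the oldest modification time.
# ls -t sorts newest first; -r reverses that, so the oldest comes first.
oldest_dir() {
  ls -dtr "$1"/*/ | head -n 1
}

# claim_dir <pool-dir>
# Pick the oldest subdirectory, mark it as just used, and print its path.
# A directory's mtime does not change when files inside it are edited,
# so the timestamp is bumped explicitly.
claim_dir() {
  dir=$(oldest_dir "$1") || return 1
  [ -n "$dir" ] || return 1
  touch "$dir" && printf '%s\n' "$dir"
}
```

With a pool of, say, ten directories, each request would call `claim_dir` and then check the requested example out into the returned directory. Two simultaneous requests could still pick the same directory; some form of locking would be needed for that, which this sketch leaves out.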
