-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jkarni/feature/chunkedupload #674
Conversation
@@ -36,6 +37,9 @@ | |||
|
|||
__all__ = ['asset_index', 'upload_asset', 'import_course', 'generate_export_course', 'export_course'] | |||
|
|||
MAX_UP_LENGTH = 20000352 | |||
|
|||
CONTENT_RE = re.compile(r"(?P<start>\d{1,11})-(?P<stop>\d{1,11})/(?P<end>\d{1,11})") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment indicating that this is intended to parse the HTTP_CONTENT_RANGE header? Regular expressions are always difficult to understand, and comments can go a long way towards making it easier.
High level question/comment before I start reading the code. Does this approach use multiple POST backs to the server, with each POST sending one chunk? If so, we can't guarantee that each POST will be routed to the same production server through the ELB. |
Ugh... yeah, that's exactly what it does. Is there a way of getting the requests routed to the same server? The other option I can think of is to use the DB instead of the filesystem for temporary storage (of the tar file), but that looks like it'll be a lot uglier... |
Can you ping DevOps with this question? I can't remember if we have sticky ELB sessions or not. Even if we do, it's not guaranteed. |
Looks like we do. Waiting to see if stickiness lasts for the entire
|
dirpath = None | ||
|
||
coursexmls = ((d, f) for d, _, f in os.walk(course_dir) | ||
if f.count('course.xml') > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still seems confusing/unclear. I'd suggest something more like:
def get_all_files(directory):
"""
For each file in the directory, yield a 2-tuple of (file-name, directory-path)
"""
for dirpath, dirnames, filenames in os.walk(directory):
for filename in filenames:
yield (filename, dirpath)
def get_dir_for_fname(directory, filename):
"""
Returns the dirpath for the first file found in the directory with the given name.
If there is no file in the directory with the specified name, return None.
"""
for fname, dirpath in get_all_files(directory):
if fname == filename:
return dirpath
return None
fname = "course.xml"
dirpath = get_dir_for_fname(course_dir, fname)
if not dirpath:
# return JsonResponse
# continue
It's longer, but it's much clearer. Also, if you move those two functions outside of the view function, they're reusable and the view is easier to unit test.
This might be out-of-scope to the immediate work here, but could you: Separate out the import/export functions in assets.py into a new view file, say import_export.py (be sure to also import in init.py in that folder and change the _ all _ variable in both assets.py and the new import_export.py). This separation should have been done when we initially spit up the view files, and it must have been an oversight that these functions are grouped with the asset library. If you'll do this I can close a PR which did this, but unfortunately wasn't merged before you did this work. |
Yeah, that makes sense, and should be a quick fix.
|
|
||
MAX_UP_LENGTH = 20000352 # Max chunk size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this line (and CONTENT_RE below) be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using content_re in the multiple assets pr, but could delete both here
and then add again afterwards so the pr makes sense atomically
On Aug 19, 2013 10:57 AM, "Christina Roberts" notifications@github.com
wrote:
In cms/djangoapps/contentstore/views/assets.py:
+MAX_UP_LENGTH = 20000352 # Max chunk size
Shouldn't this line (and CONTENT_RE below) be deleted?
—
Reply to this email directly or view it on GitHubhttps://github.com/edx/edx-platform/pull/674/files#r5844227
.
I hit a strange case when I tried uploading something that wasn't a tar.gz file. I got a couple errors (both notification bar "Studio couldn't save" bar and the error dialog) when I selected a proper tar.gz file AFTER I had selected a file with a different extension. The JavaScript/'HTML changes in here are straightforward. I wanted to ensure that when an import fails, things don't "blow up". Julian says he has tested that case. |
@cahrens that was definitely a regression - am pushing a fix now. |
A couple odd things--
|
@cahrens - looks like it was because I wasn't removing the submit callbacks. New commit fixes the issue. But just to be clear - you get one "your upload was successful" and one "your upload failed", right? (2) also fixed. |
@jkarni This looks good-- 👍 for interactive testing and JavaScript/HTML, though note that I never tested when the upload actually failed (I only tested unsupported extensions). I assume you have tested the case of the upload failing, with the final version of the code. Please update the BDD manal testing spec as appropriate, including failure cases: https://edx-wiki.atlassian.net/wiki/display/STU/Import @chrisndodge We still need your Python code review, and testing this on staging would be splendid. |
I can push to stage tonight. Since the release is out migrations have already been run there. Sent from my iPhone On Aug 20, 2013, at 2:54 PM, Christina Roberts notifications@github.com wrote:
|
rebased to pick up the test fix |
|
||
# Regex to capture Content-Range header ranges. | ||
CONTENT_RE = re.compile(r"(?P<start>\d{1,11})-(?P<stop>\d{1,11})/(?P<end>\d{1,11})") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used? It seems like this should be in the new import_export.py file...
Not sure why this is getting marked as unstable. There's 18 code lines in violation. |
@chrisndodge switched to a with-context so that all the files are sure to be deleted. |
That with-context is a nifty trick! +1 |
Also adds lockfile for asynchronous status updates.
Comments, httpresponse -> jsonresponse; make sure errors get relayed.
Signed-off-by: Julian Arni <julian@edx.org>
…-html Add more tests and replace ':' with '_' in html_id
Update course metrics grades leaders API to allow filtering of users by groups
* po/django-admin-extrainfo-obj: Make ExtraInfo list user-friendly in Django Admin Make `user` field read-only in `ExtraInfo` detail
UPD: translations for recap and freetext FIX: protecting agnostic block, if not installed, ignore ENH: honor to honour
…dx#674) prevents confusing exam errors in non-exam situations MST-905
…e-translation Update 'View Course' arabic translation.
* feat: Add Library Collections REST endpoints * test: Add tests for Collections REST APIs * chore: Add missing __init__ files * feat: Add events emitting for Collections * feat: Add REST endpoints to update Components in a Collections (temp) (#674) * feat: add/remove components to/from a collection * docs: Add warning about unstable REST APIs * chore: updates openedx-events==9.14.0 * chore: updates openedx-learning==0.11.4 * fix: assert collection doc have unique id --------- Co-authored-by: Jillian <jill@opencraft.com> Co-authored-by: Chris Chávez <xnpiochv@gmail.com> Co-authored-by: Rômulo Penido <romulo.penido@gmail.com>
Adds ability to import courses over 100MB.