Seemingly Random 500 POST #284

Closed
LegacydbAdmin opened this issue Jan 23, 2020 · 8 comments

@LegacydbAdmin

Hello! I recently switched from resumable.js to flow.js and, as far as I can tell, I've made the necessary adjustments to my code. Sometimes files upload perfectly, but other times a chunk POST comes back with a 500 error and the upload quits. I'm having trouble finding the pattern or reason behind the failures... any help would be great!

My backend is Flask.

```python
import os

from flask import Blueprint, request, render_template, abort, make_response
from flask import current_app as app
from config import Config
uploads = Blueprint('upload', __name__, template_folder='templates')



# landing page
@uploads.route("/upload")
def upload():

    return render_template("upload/upload.html")


# resumable.js uses a GET request to check if it uploaded the file already.
# NOTE: your validation here needs to match whatever you do in the POST (otherwise it will NEVER find the files)
@uploads.route("/resumable-home/", methods=['GET'])
def resumable():
    resumableIdentfier = request.args.get('flowIdentifier', type=str)
    resumableFilename = request.args.get('flowFilename', type=str)
    resumableChunkNumber = request.args.get('flowChunkNumber', type=int)

    if not resumableIdentfier or not resumableFilename or not resumableChunkNumber:
        # Parameters are missing or invalid
        abort(500, 'Parameter error')

    # chunk folder path based on the parameters
    temp_dir = os.path.join(Config.DATA, resumableIdentfier)

    # chunk path based on the parameters
    chunk_file = os.path.join(temp_dir, get_chunk_name(resumableFilename, resumableChunkNumber))
    app.logger.debug('Getting chunk: %s', chunk_file)

    if os.path.isfile(chunk_file):
        # Let resumable.js know this chunk already exists
        return 'OK'
    else:
        # Let resumable.js know this chunk does not exist and needs to be uploaded
        abort(404, 'Not found')


# if it didn't already upload, resumable.js sends the file here
@uploads.route("/resumable-home/", methods=['POST'])
def resumable_post():
    resumableTotalChunks = request.form.get('flowTotalChunks', type=int)
    resumableChunkNumber = request.form.get('flowChunkNumber', default=1, type=int)
    resumableFilename = request.form.get('flowFilename', default='error', type=str)
    resumableIdentfier = request.form.get('flowIdentifier', default='error', type=str)

    # get the chunk data
    chunk_data = request.files['file']

    # make our temp directory
    temp_dir = os.path.join(Config.DATA, resumableIdentfier)
    if not os.path.isdir(temp_dir):
        os.makedirs(temp_dir, 0o777)

    # save the chunk data
    chunk_name = get_chunk_name(resumableFilename, resumableChunkNumber)
    chunk_file = os.path.join(temp_dir, chunk_name)
    chunk_data.save(chunk_file)
    app.logger.debug('Saved chunk: %s', chunk_file)

    # check if the upload is complete
    chunk_paths = [os.path.join(temp_dir, get_chunk_name(resumableFilename, x)) for x in
                   range(1, resumableTotalChunks + 1)]
    upload_complete = all([os.path.exists(p) for p in chunk_paths])

    # combine all the chunks to create the final file
    if upload_complete:
        target_file_name = os.path.join(Config.DATA, resumableFilename)
        with open(target_file_name, "ab") as target_file:
            for p in chunk_paths:
                stored_chunk_file_name = p
                stored_chunk_file = open(stored_chunk_file_name, 'rb')
                target_file.write(stored_chunk_file.read())
                stored_chunk_file.close()
                os.unlink(stored_chunk_file_name)
        target_file.close()
        os.rmdir(temp_dir)
        app.logger.debug('File saved to: %s', target_file_name)

    return 'OK'


def get_chunk_name(uploaded_filename, chunk_number):
    return uploaded_filename + "_part_%03d" % chunk_number
```

JS:

```js
var draggable = $('#fileDropBox'),
		results = $('#results'),
		fullProgressBar = $('#fullProgressBar');

	var progressBar = new ProgressBar($('#upload-progress'));

	var r = new Flow({
		target: '/resumable-home/',
		query: {},
		maxChunkRetries: 3,
        prioritizeFirstAndLastChunk: true,
		maxFiles: undefined,
		simultaneousUploads: 4,
		chunkSize: 1 * 1024 * 1024,
		testChunks: true,
        successStatuses: [200],
        permanentErrors: [415, 500, 501]
	});

	// if resumable is not supported aka IE
	if (!r.support) location.href = 'http://browsehappy.com/';

	r.assignBrowse(document.getElementById('add-file-btn'), false, false);
	r.assignDrop(draggable);

	r.on('fileAdded', function (file, event) {
		var template =
			'<div data-uniqueid="' + file.uniqueIdentifier + '" class="d-flex flex-column justify-content-between w-100 pt-0 pb-3"><div class="d-flex justify-content-between">' +
			'<div class="fileName p-2">' +
			file.name +
			'</div>' +
			'<div class="ml-auto pb-1"></div>' +
			'<div class="d-flex">' +
			'<div class="ml-auto pb-1" id="group-' + file.uniqueIdentifier + '"><button type="button" class="deleteFile btn btn-link" id="delete_btn">' +
			'<span class="fas fa-ban" style="color: red" aria-hidden="true"></span></button></div> <div class="progress">' +
			'</div></div> </div>' +
			'   <div class="progress">' +
			'       <span class="progress-bar progress-bar-success progress-bar" style="width:0%;"></span>' +
			'   </div>' +
			'<small><span data-uniqueid="rem-' + file.uniqueIdentifier + '" class="text-sm-left"></span></small>' +
			'</div>' +

			'</div></div>';

		results.append(template);
		/* {
			fullProgressBar.append(progressTemplate)

		} */

		var group_id = $("#group-" + file.uniqueIdentifier);


		if (checkJobNumberMatch(file) === false) {
			// ADD FUNCTION TO MAKE FILE RED, VERIFY JOB NUMBER
			//modalWarning("Job Number Mismatch", "One or more of the files have Job Numbers that do NOT match the job you have selected.")
			//var modalWarning = new modalWarning($('#modal-warning'));
			// alert('One of more of the files have Job Numbers that do NOT match the job you have selected.')
		}
		;

		let setFileGroupResults = setFileGroup(file);
		let fileGroup = setFileGroupResults[0];
		let extensionType = setFileGroupResults[1];
		let group_dropdown = dropdown_type(fileGroup, extensionType);
		group_id.before(group_dropdown);


		if (warningTypes.includes(fileGroup)) {
			$('[data-uniqueId=' + file.uniqueIdentifier + ']').addClass('alert alert-warning');
		}


	});


	$(document).on('click', '.deleteFile', function () {
		var self = $(this),
			parent = self.closest("[data-uniqueid]"),
			identifier = parent.data('uniqueid'),
			file = r.getFromUniqueIdentifier(identifier);

		r.removeFile(file);
		parent.remove();
	});

	$('#start-upload-btn').click(function () {
		if (results.children().length > 0) {
			r.upload();
		} else {
			nothingToUpload.fadeIn();
			setTimeout(function () {
				nothingToUpload.fadeOut();
			}, 3000);
		}


	});

	$('#pause-upload-btn').click(function () {
		if (r.files.length > 0) {
			if (r.isUploading()) {
				return r.pause();
			}
			return r.upload();
		}
	});

	r.on('fileProgress', function (file) {
		var progress = Math.floor(file.progress() * 100);
		//var timeRemaining = calculateRemainigUploadTime()
		var timeRemaining = timeLeft(r.timeRemaining());
		var avgUploadSpeed = avgSpeed(file.averageSpeed);
		$('[data-uniqueId=' + file.uniqueIdentifier + ']').find('.progress-bar').css('width', progress + '%');
		$('[data-uniqueId=' + file.uniqueIdentifier + ']').find('.progress-bar').html('&nbsp;' + progress + '%');
		$('[data-uniqueId=rem-' + file.uniqueIdentifier + ']').html("" + avgUploadSpeed + "/s ... " + timeRemaining);

	});

	r.on('fileAdded', function (file, event) {
		progressBar.fileAdded();

	});

	r.on('fileSuccess', function (file, message) {
		// progressBar.finish();
		$('[data-uniqueId=' + file.uniqueIdentifier + ']').addClass('alert alert-success');
		$('[data-uniqueId=rem-' + file.uniqueIdentifier + ']').text("")
	});

	r.on('uploadStart', function () {
		$('.alert-box').text('Uploading....');
		$('.deleteFile').addClass('hide');
		document.getElementById("delete_btn").style.visibility = "hidden";
	});


	r.on('complete', function () {
		$('.alert-box').text('Done Uploading');
	});


	r.on('progress', function () {
		progressBar.uploading(r.progress() * 100);

		$('#pause-upload-btn').removeClass('start-upload-btn').addClass('pause-upload-btn');
	});

	r.on('pause', function () {
		$('#pause-upload-btn').removeClass('pause-upload-btn').addClass('start-upload-btn');
	});

	function ProgressBar(ele) {
		this.thisEle = $(ele);

		this.fileAdded = function () {
			(this.thisEle).removeClass('hide').find('.progress-bar').css('width', '0%');
		};

		this.uploading = function (progress) {
			(this.thisEle).find('.progress-bar').attr('style', "width:" + progress + '%');
			(this.thisEle).find('.progress-bar').html('&nbsp;' + Math.floor(progress) + "%");
		};
	}

	function timeLeft(time) {
		// Hours, minutes and seconds
		if (time === 0) {
			return ""
		}
		var hrs = ~~(time / 3600);
		var mins = ~~((time % 3600) / 60);
		var secs = ~~time % 60;

		// Output like "1:01" or "4:03:59" or "123:03:59"
		var ret = "";
		if (hrs > 0) {
			ret += "" + hrs + "h " + (mins < 10 ? "0" : "");
		}
		ret += "" + mins + "m " + (secs < 10 ? "0" : "");
		ret += "" + secs + "s remaining";
		return ret;
	}

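	function avgSpeed(bytes) {
		// Guard against a zero or missing speed, where Math.log gives -Infinity or NaN
		if (!bytes || bytes < 1) {
			return '0 B';
		}
		var i = Math.floor(Math.log(bytes) / Math.log(1024)),
			sizes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'];
		return (bytes / Math.pow(1024, i)).toFixed(2) * 1 + ' ' + sizes[i];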

	}
```
@command-tab

This is right up my alley — I often use flow.js and Flask together.

The 500 is a response code from the Flask backend, so this is very likely to be a server-side issue, not an issue with flow.js. Could you post the example code, along with a minimal version of templates/upload/upload.html to a GitHub repo that could be cloned and run to attempt to reproduce the error?

@LegacydbAdmin
Author

LegacydbAdmin commented Jan 23, 2020

Thanks! I'll work on the minimal version, but I think you're right - it's failing in these two places:

os.unlink(stored_chunk_file_name) with:

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process.

    if upload_complete:
        target_file_name = os.path.join(Config.DATA, resumableFilename)
        with open(target_file_name, "ab") as target_file:
            for p in chunk_paths:
                stored_chunk_file_name = p
                stored_chunk_file = open(stored_chunk_file_name, 'rb')
                target_file.write(stored_chunk_file.read())
                stored_chunk_file.close()
                os.unlink(stored_chunk_file_name)

and then sometimes:

    if not os.path.isdir(temp_dir):
        os.makedirs(temp_dir, 0o777)

with
FileExistsError: [WinError 183] Cannot create a file when that file already exists

Neither of these happened with resumable.js, which is why I was leaning towards the issue being with flow.js, especially since this code is basically the same as what I had with resumable.js.

@command-tab

I wonder if the issue has to do with the number of chunks being uploaded simultaneously. Since you're using "ab" mode to append bytes, maybe a second, simultaneous chunk upload is causing a file create or append to happen at the same time. Does it work more reliably if you set the flow.js simultaneousUploads option to 1 instead of the default 3?
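
A side note on the FileExistsError: if two chunk requests for the same file arrive close together, both can pass the `os.path.isdir` check before either has created the directory, and the slower one then fails in `os.makedirs`. Here is a minimal sketch of a race-tolerant version, assuming Python 3.2+ (where `os.makedirs` accepts `exist_ok`). `ensure_chunk_dir` is a hypothetical helper name, not part of Flask or flow.js:

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def ensure_chunk_dir(base_dir, identifier):
    """Create the per-upload chunk directory; safe to call concurrently."""
    temp_dir = os.path.join(base_dir, identifier)
    # exist_ok=True turns "directory already exists" into a no-op instead of
    # raising FileExistsError when another request created it first.
    os.makedirs(temp_dir, exist_ok=True)
    return temp_dir


if __name__ == "__main__":
    # Simulate several chunk requests for the same upload arriving at once.
    with tempfile.TemporaryDirectory() as base:
        with ThreadPoolExecutor(max_workers=4) as pool:
            dirs = list(pool.map(lambda _: ensure_chunk_dir(base, "demo-id"), range(8)))
        print("all requests got the same directory:", len(set(dirs)) == 1)
```

This only addresses the directory creation; it does nothing about two requests merging the chunks at the same time.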

@command-tab

Oh, hmm, that might only adjust the number of simultaneous file uploads, not simultaneous chunk uploads. I suppose chunk uploads happen sequentially, but I don't see anything about that in the docs. If there's any overlap, you'll get those file open and file write OS errors. I'm curious whether you'd hit the same issue if you wrote the chunks to separate files, just as a test.
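
If overlapping requests are the cause, one option is to let only a single request perform the merge. Below is a rough sketch of that idea, assuming the chunks live in a per-upload directory as in the code above. The helper names and the `.merging` marker file are hypothetical. The sketch relies on `open(..., "x")`, which creates a file exclusively and raises FileExistsError if it already exists:

```python
import os
import shutil
import tempfile


def try_claim_merge(temp_dir):
    """Return True for the first caller per upload directory, False afterwards."""
    marker = os.path.join(temp_dir, ".merging")  # hypothetical marker file name
    try:
        # Mode "x" is an exclusive create: it fails if the marker already exists.
        with open(marker, "x"):
            pass
        return True
    except FileExistsError:
        return False


def merge_chunks(chunk_paths, target_path):
    """Concatenate the chunk files, in the given order, into target_path."""
    with open(target_path, "wb") as target:
        for path in chunk_paths:
            with open(path, "rb") as chunk:
                shutil.copyfileobj(chunk, target)


def finish_upload(temp_dir, chunk_paths, target_path):
    """Merge and clean up if this request won the claim; otherwise do nothing."""
    if not try_claim_merge(temp_dir):
        return False  # another request is already merging this upload
    merge_chunks(chunk_paths, target_path)
    shutil.rmtree(temp_dir)  # removes the chunks and the marker together
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        temp_dir = os.path.join(base, "demo-id")
        os.makedirs(temp_dir)
        paths = []
        for number, data in enumerate([b"first-", b"second-", b"third"], start=1):
            path = os.path.join(temp_dir, "demo.bin_part_%03d" % number)
            with open(path, "wb") as handle:
                handle.write(data)
            paths.append(path)
        print("merged:", finish_upload(temp_dir, paths, os.path.join(base, "demo.bin")))
```

One caveat: a chunk file can exist on disk while another request is still writing it, so an existence check alone may report "complete" too early. Saving each chunk under a temporary name and renaming it once it is fully written would be one way to narrow that window.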

@LegacydbAdmin
Author

I think simultaneousUploads is the issue! I changed it to 1 and ran a few dozen files with no problem; when I changed it to 3, the first file froze.

Hate to ask, but would you have any recommendations on my Flask end to help with this? I'd really like for that function to work, if possible.

Thanks again for all the help!

@command-tab

In the past, I've written each chunk to a separate file. When the last chunk arrived, I merged them together with something like:

import shutil

# Append each chunk file, in order, to the destination file
with open(path, 'wb') as destination:
    for chunk in chunks:
        with open(chunk.path, 'rb') as source:
            shutil.copyfileobj(source, destination)

Depending on the total file size, though, combining the chunks can be time consuming. In that case, processing uploaded files might be best handed off to something that can process them separately (like a Dramatiq task queue) so your final chunk HTTP request doesn't take too long and keep the user waiting.
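
To illustrate the hand-off idea without depending on any particular queue, here is a sketch that uses the standard library's `concurrent.futures.ThreadPoolExecutor` as a stand-in for a real task queue. This is not Dramatiq's API, and `merge_pool`, `merge_chunks`, and `schedule_merge` are hypothetical names. The final-chunk request would queue the merge and return right away:

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a task queue: a single worker thread inside the web process.
# A real queue would run the job in a separate worker process instead.
merge_pool = ThreadPoolExecutor(max_workers=1)


def merge_chunks(chunk_paths, target_path):
    """Concatenate the chunk files, in order, and return the target path."""
    with open(target_path, "wb") as target:
        for path in chunk_paths:
            with open(path, "rb") as chunk:
                shutil.copyfileobj(chunk, target)
    return target_path


def schedule_merge(chunk_paths, target_path):
    """Queue the merge and return a Future without waiting for it to finish."""
    return merge_pool.submit(merge_chunks, list(chunk_paths), target_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        paths = []
        for number, data in enumerate([b"hello ", b"world"], start=1):
            path = os.path.join(base, "demo.txt_part_%03d" % number)
            with open(path, "wb") as handle:
                handle.write(data)
            paths.append(path)
        future = schedule_merge(paths, os.path.join(base, "demo.txt"))
        # The request handler could return here; result() is only for the demo.
        print("merged into:", future.result(timeout=10))
```

With a single worker, merges run one at a time, which also avoids two merges touching the same files concurrently.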

Some object storage APIs like SwiftStack can accept chunks of files (even directly from flow.js!) and present them as a single, unified file without the time overhead of actually merging the chunks:
https://docs.openstack.org/swift/latest/overview_large_objects.html

Alternatively, if setting simultaneousUploads to 1 doesn't impact performance much, leave it at that.

@LegacydbAdmin
Author

Awesome, SwiftStack looks interesting. Thanks so much for all the help!

@drzraf
Collaborator

drzraf commented Jun 4, 2020

For reference (about Swift + flow.js):
