
Large file upload failing with "The Content-Range header length does not match the provided number of bytes." #295

Closed
jasonjoh opened this issue Jun 7, 2024 · 20 comments · Fixed by #304
Assignees
Labels
type:bug A broken experience

Comments

@jasonjoh
Member

jasonjoh commented Jun 7, 2024

Here's my code; I'm running this on Ubuntu with Go 1.21.5. largeFile is /home/jasonjoh/vacation.gif, and itemPath is Documents/vacation.gif. I've attached vacation.gif in case the file is important.
[attachment: vacation.gif]

// Imports for this snippet; error handling is intentionally omitted in the repro.
import (
	"context"
	"fmt"
	"os"

	graph "github.com/microsoftgraph/msgraph-sdk-go"
	"github.com/microsoftgraph/msgraph-sdk-go/drives"
	"github.com/microsoftgraph/msgraph-sdk-go/models"
	"github.com/microsoftgraph/msgraph-sdk-go-core/fileuploader"
)

func UploadFileToOneDrive(graphClient *graph.GraphServiceClient, largeFile string, itemPath string) {
	itemUploadProperties := models.NewDriveItemUploadableProperties()
	itemUploadProperties.SetAdditionalData(map[string]any{"@microsoft.graph.conflictBehavior": "replace"})
	uploadSessionRequestBody := drives.NewItemItemsItemCreateuploadsessionCreateUploadSessionPostRequestBody()
	uploadSessionRequestBody.SetItem(itemUploadProperties)
	myDrive, _ := graphClient.Me().Drive().Get(context.Background(), nil)

	uploadSession, _ := graphClient.Drives().
		ByDriveId(*myDrive.GetId()).
		Items().
		ByDriveItemId("root:/"+itemPath+":").
		CreateUploadSession().
		Post(context.Background(), uploadSessionRequestBody, nil)

	maxSliceSize := int64(320 * 512)
	byteStream, _ := os.Open(largeFile)
	fileUploadTask := fileuploader.NewLargeFileUploadTask[models.DriveItemable](
		graphClient.RequestAdapter,
		uploadSession,
		byteStream,
		maxSliceSize,
		models.CreateDriveItemFromDiscriminatorValue,
		nil)

	progress := func(progress int64, total int64) {
		fmt.Printf("Uploaded %d of %d bytes", progress, total)
	}

	uploadResult := fileUploadTask.Upload(progress)

	if uploadResult.GetUploadSucceeded() {
		fmt.Printf("Upload complete, item ID: %s", *uploadResult.GetItemResponse().GetId())
	} else {
		fmt.Print("Upload failed.")
	}
}

Using a debugging middleware, I see that the headers look right, but the slices are sent in a random order (i.e. the first one isn't the 0-163839 range), which causes the failure. Per the docs, slices must be sent sequentially or you will get an error.
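For reference, here is the Content-Range sequence a correct upload should produce. The `contentRanges` helper below is purely illustrative (it is not part of the SDK); it just computes the in-order header values the docs call for:

```go
package main

import "fmt"

// contentRanges computes the Content-Range header value for each slice of a
// sequential upload: slices must cover the file in order, back to back, with
// no gaps or reordering.
func contentRanges(totalSize, sliceSize int64) []string {
	var ranges []string
	for start := int64(0); start < totalSize; start += sliceSize {
		end := start + sliceSize - 1
		if end > totalSize-1 {
			end = totalSize - 1
		}
		ranges = append(ranges, fmt.Sprintf("bytes %d-%d/%d", start, end, totalSize))
	}
	return ranges
}

func main() {
	// 320*512-byte slices as in the repro, for a hypothetical 400000-byte file.
	for _, r := range contentRanges(400000, 320*512) {
		fmt.Println(r)
	}
	// Prints:
	// bytes 0-163839/400000
	// bytes 163840-327679/400000
	// bytes 327680-399999/400000
}
```

The first PUT must carry the 0-163839 range; sending any other slice first trips the service-side validation.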

@buechele

Upvoting this issue.

I'm running into the same issue using similar code:

// Imports for this snippet (package paths per the msgraph-sdk-go / kiota modules).
import (
	"context"
	"log"
	"os"

	abstractions "github.com/microsoft/kiota-abstractions-go"
	graph "github.com/microsoftgraph/msgraph-sdk-go"
	"github.com/microsoftgraph/msgraph-sdk-go/drives"
	"github.com/microsoftgraph/msgraph-sdk-go/models"
	"github.com/microsoftgraph/msgraph-sdk-go/models/odataerrors"
	"github.com/microsoftgraph/msgraph-sdk-go-core/fileuploader"
)

func SampleFileUpload(graphClient *graph.GraphServiceClient, driveID string) {

	filename := "<FILENAME>"
	driveItemId := "root:/" + "<ITEMPATH>" + ":"

	// Pre 1.44.0:
	// requestBody := drives.NewItemItemsItemCreateUploadSessionPostRequestBody()
	// 1.44.0 or later:
	requestBody := drives.NewItemItemsItemCreateuploadsessionCreateUploadSessionPostRequestBody()

	uploadProperties := models.NewDriveItemUploadableProperties()
	uploadProperties.SetAdditionalData(map[string]any{"@microsoft.graph.conflictBehavior": "replace"})
	requestBody.SetItem(uploadProperties)

	uploadSessionReqBuilder := graphClient.Drives().ByDriveId(driveID).Items().ByDriveItemId(driveItemId).CreateUploadSession()
	uploadSession, err := uploadSessionReqBuilder.Post(context.Background(), requestBody, nil)
	if err != nil {
		log.Fatal(err)
	}

	errorMapping := abstractions.ErrorMappings{
		"4XX": odataerrors.CreateODataErrorFromDiscriminatorValue,
		"5XX": odataerrors.CreateODataErrorFromDiscriminatorValue,
	}

	fileDesc, _ := os.Open(filename)
	maxSlice := int64(320 * 1024)

	uploadTask := fileuploader.NewLargeFileUploadTask[models.DriveItemable](graphClient.GetAdapter(), uploadSession, fileDesc, maxSlice, models.CreateDriveItemFromDiscriminatorValue, errorMapping)

	callback := fileuploader.ProgressCallBack(func(current int64, total int64) {
		log.Println("Progress: ", current, total)
	})

	uploadResult := uploadTask.Upload(callback)
	if !uploadResult.GetUploadSucceeded() {
		log.Fatal(uploadResult.GetResponseErrors())
	}
}

Result:

[The Content-Range header length does not match the provided number of bytes.]

@andrueastman andrueastman added the type:bug A broken experience label Jul 2, 2024
@andrueastman
Member

> Using a debugging middleware I see the headers look right, but the slices are sent in a random order (i.e. the first one isn't the 0-163839 range), which causes the failure. Per the docs, slices must be sent sequentially or you will get an error.

@rkodev Any chance you can take a look at this? As stated, the slices should be sent in order, and the use of goroutines here will not give that assurance...

@buechele

buechele commented Jul 11, 2024

@rkodev @andrueastman Since you are having problems with the release, I copied your fixed fileuploader code into my code base.
The issue remains, and I'm still getting the error "The Content-Range header length does not match the provided number of bytes."

This is happening with the above code and also with only one slice.

@andrueastman
Member

Re-opening for now.

> This is happening with the above code and also with only one slice.

@buechele Any chance you can confirm the size of the file being uploaded? Does this mean that the task fails if the file is smaller than a single chunk?

@andrueastman andrueastman reopened this Jul 12, 2024
@buechele

buechele commented Jul 12, 2024

@andrueastman It fails if the file size is below the maxSlice size and also if it is above.
I tried several sizes (bigger and smaller than maxSlice) and everything failed.
I checked the Content-Range header created within the fileuploader package and it seems to be correct.
Currently I'm clueless about the root cause. I checked the Java SDK, and there it worked seamlessly.

@andrueastman
Member

Does this mean that this happens for all files you try to upload?
Any chance you're in a position to grab a trace using a logging tool like Fiddler and capture a snapshot of the request/response cycle?
Listing all the headers as well as capturing the request body would help identify where the issue is.

@buechele

@andrueastman Yes, it fails for all files.
I took a screenshot from the PUT request. Maybe the gzip compression is the problem. The file size in this case is 4 bytes. If I inspect the body of the request it is uncompressed. So the error from the API is correct.
Fiddler-1

@buechele

@andrueastman Short update:
The gzip compression is not handled correctly. If I hardcode "bytes 0-27/28" into the above example, it works.
So the content is compressed, but the Content-Range header does not take this into account and sticks with the uncompressed values.
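A minimal stdlib sketch of why the service rejects the request (illustrative only, not the SDK's code): compressing the body changes its length on the wire, while the Content-Range header still describes the uncompressed bytes:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipLen returns the length of data after gzip compression.
func gzipLen(data []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data)
	zw.Close()
	return buf.Len()
}

func main() {
	body := []byte("test") // a 4-byte file, as in the Fiddler trace above
	// The upload task advertises "bytes 0-3/4" in Content-Range, but the
	// middleware then gzips the body, so the wire payload is longer than
	// 4 bytes and the service reports a length mismatch.
	fmt.Printf("Content-Range says %d bytes, wire body is %d bytes\n", len(body), gzipLen(body))
}
```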

@buechele

@andrueastman It's even worse: the originally uncompressed file, which now uploads without an error (after manipulating the Content-Range header), ends up gzip-compressed in the destination folder on the Microsoft side.

@andrueastman
Member

Thanks for looking into this @buechele and getting to the root cause here.

A similar issue has been raised at microsoftgraph/msgraph-sdk-go#747. The correct behavior is to send data uncompressed to the server (I don't believe the Graph API supports receiving compressed content yet), while the client should still try to decompress content from the server, since it sends the Accept-Encoding header.

@buechele Are you able to make this work by initializing the graph client with the CompressionHandler removed?

@rkodev Any chance you can look into updating the default middleware to avoid using the default compression handler?

func GetDefaultMiddlewaresWithOptions(options *GraphClientOptions) []khttp.Middleware {
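For illustration, the kind of filtering being suggested could look like this. The `Middleware` interface and the handler types below are simplified stand-ins, not the actual kiota-http-go API:

```go
package main

import "fmt"

// Simplified stand-ins for the kiota-http-go middleware types (assumptions,
// not the real API).
type Middleware interface{ Name() string }

type RetryHandler struct{}

func (RetryHandler) Name() string { return "RetryHandler" }

type CompressionHandler struct{}

func (CompressionHandler) Name() string { return "CompressionHandler" }

// withoutCompression rebuilds a middleware chain with the compression
// handler dropped, which is what a client without body compression needs.
func withoutCompression(chain []Middleware) []Middleware {
	var out []Middleware
	for _, m := range chain {
		if _, isCompression := m.(CompressionHandler); isCompression {
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	chain := []Middleware{RetryHandler{}, CompressionHandler{}}
	for _, m := range withoutCompression(chain) {
		fmt.Println(m.Name()) // RetryHandler
	}
}
```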

@RomanSter

> Thanks for looking into this @buechele and getting to the root cause here. […]
> @rkodev Any chance you can look into updating the default middleware to avoid using the default compression handler?

I can confirm this: I uploaded a "Microsoft Word 2007+" file, then downloaded it from SharePoint, checked the file type, and it was "gzip compressed data, original size modulo 2^32 11954".

I appended the .gz extension to the filename and ran "gzip -d test.docx.gz", which produced a valid docx file that I was able to open.

@buechele

@andrueastman I was not able to create a graph client without compression; gzip was still on.

But I tried this:
[screenshot, 2024-07-15 15:14]

And now I'm getting:
The size of the provided stream is not known. Make sure the request is not chunked, and the Content-Length header is specified

Seems like the Content-Length header is missing this time:
[screenshot, 2024-07-15 15:18]

@buechele

@andrueastman I was now able to create a graph client without compression.
I tried both ways: replacing the CompressionHandler with one constructed with enableCompression = false, and not adding the CompressionHandler to the middleware at all.
Both approaches lead to the same error as mentioned in my last comment:
The size of the provided stream is not known. Make sure the request is not chunked, and the Content-Length header is specified

@baywet
Member

baywet commented Jul 15, 2024

Hi everyone,
This one sent me down a rabbit hole of RFCs for Content-Range and Content-Encoding.
Long story short: the compression middleware should not compress request bodies when a Content-Range header with the bytes range unit is present.
I tried to detail things a bit further in microsoft/kiota#598 and in the associated design pull request.
I believe adding the extra condition around here should fix the issue end to end.
Anybody willing to submit a pull request for that?

@buechele

@baywet Maybe I'm misunderstanding something here, but I removed the entire CompressionHandler from the pipeline, as @andrueastman suggested, and the result was that the Content-Length header is now missing.
Replacing the CompressionHandler in the pipeline with enableCompression = false also leads to a missing Content-Length header, and in that case you end up in the if clause you are pointing at for an extra condition.
I think something more is wrong, because of the missing Content-Length header.

@baywet
Member

baywet commented Jul 15, 2024

It's strange that you're not seeing a Content-Length header without the compression middleware, as it's being set here.
Are you seeing the Content-Range header when you disable the compression handler?
Can you also try disabling the redirect handler, for sanity?

@buechele

buechele commented Jul 15, 2024

@baywet Yes, I ran the application in debug mode, and the header is set correctly in the code but missing in the actual HTTP request.
The Content-Range header is included with the compression handler disabled:
[screenshot, 2024-07-15 21:59]
With both the redirect handler and the compression handler disabled, the Content-Length header is still missing, but Content-Range is included:
[screenshot, 2024-07-15 22:04]

@baywet
Member

baywet commented Jul 15, 2024

Thanks for confirming. It might be that, since this header has a dedicated field on the request object, it's not being read from the headers collection.
We might ALSO need to set the field, like we do in the compression handler, in the request information translation here.

@baywet
Member

baywet commented Jul 15, 2024

Now that we have all the pieces in place, would you like to submit a pull request to:

  • not compress the body when the range header is set
  • set the content length field when it's present in the abstract headers?

@baywet
Member

baywet commented Jul 16, 2024

Fixed by microsoft/kiota-http-go#174
Thanks @buechele for the pull request!
Closing.

@baywet baywet closed this as completed Jul 16, 2024
@andrueastman andrueastman assigned buechele and unassigned rkodev Jul 16, 2024