Resumable uploads failing #122
Comments
@BillPetti wrote: I'm facing what I think is a similar issue, but in my case the upload is actually failing. I am not asking for it to find a resumable upload, but when I try to upload an updated file it appears to find one and hangs after reading about half of the file, then I get this message:
And here's my original call:
@BillPetti could you rerun the upload that fails with more verbose logging turned on? Also, what type of file is being uploaded - is it a big file and/or an R list or data.frame?
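A minimal sketch of re-running a failing upload with more detail, assuming googleCloudStorageR still routes requests through googleAuthR and honours its googleAuthR.verbose option (lower values print more request/response detail); the file and bucket names are placeholders:

library(googleCloudStorageR)

# Print request/response detail while debugging (assumption: lower value = more verbose)
options(googleAuthR.verbose = 2)

# Re-run the failing upload; "my_data.rds" and "my-bucket" are hypothetical
gcs_upload("my_data.rds", bucket = "my-bucket", name = "my_data.rds")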
I have a similar issue with a large RDS file (9 GB) - whenever I try to upload it, I get
Rerunning it with
Given that I am trying to save from Google Compute Engine within the same region, I thought I would give simple upload a go - however, that fails because the option needs to be specified as an integer. Any other suggestions?
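For context, a sketch of what forcing a simple upload looks like, assuming the size cut-off is governed by the googleCloudStorageR.upload_limit option and that gcs_upload() accepts an upload_type argument (both assumptions based on this thread; check the package docs). It also shows why a 9 GB limit cannot be expressed as an R integer:

library(googleCloudStorageR)

# R integers max out at .Machine$integer.max (2147483647), so a 9 GB
# threshold cannot be stored as an integer - consistent with the report above.
options(googleCloudStorageR.upload_limit = 2000000000L)  # ~2 GB, near the integer ceiling

# Hypothetical 9 GB file and bucket; with the limit above, a simple upload
# would likely still be rejected, which is the problem being described.
gcs_upload("big_model.rds", bucket = "my-bucket", upload_type = "simple")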
Apparently, for me the issue was that I did not choose "Fine-grained: Object-level ACLs enabled" when creating the bucket. With that, the upload seems to work now. Not sure if that is a general limitation, or because of how I created the JSON ... but all seems well for now (even though it might be worth documenting this, in case it is a common mistake people make?). Many thanks for this helpful package (and I will be back if the issue reappears :)).
Thanks @LukasWallrich - this is a tricky one to pin down as I need to find a failing example to replicate. I think in your case you were missing the new predefinedAcl parameter defined in #111:

gcs_upload(mtcars, bucket = "mark-bucketlevel-acl",
           predefinedAcl = "bucketLevel")

Perhaps I can use this to test the above retry issue :)
@MarkEdmondson1234 I'm having the same or similar issue when I upload a batch of pdf files. I have a list of 500 pdf files that I upload via a for loop. Each time I do this a different subset of files will fail, so I don't think it is an issue with the files. You'll see in my script that I log which ones fail, then run the loop again on just those; many of them upload fine on round 2, then I do a round 3. I'll also include the logs so you can see the errors. Upload script with three rounds of uploads:
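A minimal sketch of the kind of loop described - upload each PDF, log the failures, then retry only the failed ones in later rounds. The folder, bucket name, and number of rounds are hypothetical:

library(googleCloudStorageR)

pdf_files <- list.files("pdfs", pattern = "\\.pdf$", full.names = TRUE)
bucket    <- "my-bucket"

to_upload <- pdf_files
for (round in 1:3) {
  failed <- character(0)
  for (f in to_upload) {
    ok <- tryCatch({
      gcs_upload(f, bucket = bucket, name = basename(f))
      TRUE
    }, error = function(e) {
      message(sprintf("Round %d: %s failed: %s", round, basename(f), conditionMessage(e)))
      FALSE
    })
    if (!ok) failed <- c(failed, f)
  }
  if (length(failed) == 0) break   # everything uploaded, stop retrying
  to_upload <- failed              # only retry the failures next round
}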
Logs
Log 1
Log 2
Log 3
This upload should work; I should at least add better logging, such as the status code (you could see this via more verbose logging). Do you need the PDFs uploaded as separate files? Just to work around your particular issue, you could look at zipping them into one file and uploading that.
I'll try the more verbose logging. I'll try a zip file, too.
I've also been plagued by my uploads hanging. I finally found the solution today. It seems that when you upload a file with a particular predefinedAcl, the resumable upload gets tied to that ACL. Once this happens, you cannot overwrite the file (or at least I couldn't). For me, every attempt to overwrite the file using the same call to gcs_upload() would hang. The trick was to delete the file, then re-upload it with a predefinedAcl that matches the bucket's access control.
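A sketch of that delete-then-reupload workaround, assuming the bucket uses uniform (bucket-level) access control; the object and bucket names are hypothetical:

library(googleCloudStorageR)

bucket <- "my-bucket"
obj    <- "model.rds"

# Remove the existing object so no stale resumable session or mismatched ACL can conflict
gcs_delete_object(obj, bucket = bucket)

# Re-upload with an ACL that matches the bucket's access control
gcs_upload("model.rds", bucket = bucket, name = obj,
           predefinedAcl = "bucketLevel")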
Ooooh thanks, that makes sense - so the resumable upload needs to have the same ACL permissions as the original upload, which would explain an uptick of these reports when GCS brought in bucket-level ACLs vs object-level. Is there a change in the code that can be made to make this easier to avoid?
Perhaps
I finally got a situation where I could make it fail and found a bug in the retry check, so it should at least attempt a retry now.
Looks like I've run into this issue while using targets. Most targets were succeeding with
Setup
Standard bucket, europe-west2-b region, Uniform access control, no public access, no versioning. CentOS 7 in GCP.
Reprex
readRenviron('my_gcs.env')
library(googleCloudStorageR)
#> ✔ Setting scopes to https://www.googleapis.com/auth/devstorage.full_control and https://www.googleapis.com/auth/cloud-platform
#> ✔ Successfully auto-authenticated via my-server-key.json
#> ✔ Set default bucket name to 'my-default-bucket'
my_bucket <- "my-default-bucket"
# Create 5.7MB csv file
payload <- as.data.frame(matrix(rep(1, 3e6), nrow = 1e3))
write.csv(payload, tmpfile <- tempfile())
googleCloudStorageR::gcs_upload(tmpfile, bucket = my_bucket)
#> ℹ 2022-07-22 16:09:20 > File size detected as 5.7 Mb
#> ℹ 2022-07-22 16:09:20 > Found resumeable upload URL: https://storage.googleapis.com/upload/storage/v1/b/my-default-bucket/o/?uploadType=resumable&name=tmpfile&predefinedAcl=private&upload_id=ADPycdu_o6vVcIQm5iH3g4JtJV5g6LCGPD3b6R9F5y2aZdUl7azw6ovQb1Af9xh4qMIyCapT-GhoRuN-S5-Iep4-h95tS68RC1C7
#> ℹ 2022-07-22 16:09:21 > File upload failed, trying to resume...
#> ℹ 2022-07-22 16:09:21 > Retry 3 of 3
#> Error in value[[3L]](cond): Couldn't get upload status
Created on 2022-07-22 by the reprex package (v2.0.1)
Comments
This looks like it may be a long-standing problem, so perhaps it is tough to resolve for all use-cases? What would be a reasonable resolution? Should the default predefinedAcl change?
The default should be bucket level I think, since it's by far the most convenient; I think the GCP interface nudges you in that direction when creating the bucket. That level of access is newer, though, which is why it wasn't the default before. There is some logic to retry the fetch with bucket-level permissions upon failure since it's so common, so I wonder why it hasn't triggered in your case. I'm finishing writing a book at the moment so am behind on issues.
Ok, the logic to retry with bucket-level permissions is only in place for getting objects, not putting them in.
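Until upload-side retries are in place, a user-level fallback is one way round it: try the default upload and, if it errors, retry once with bucket-level ACLs. This is only a sketch; the wrapper name is made up and the error handling is deliberately crude:

library(googleCloudStorageR)

upload_with_fallback <- function(file, bucket, name = basename(file)) {
  tryCatch(
    gcs_upload(file, bucket = bucket, name = name),
    error = function(e) {
      message("Default upload failed, retrying with predefinedAcl = 'bucketLevel'")
      gcs_upload(file, bucket = bucket, name = name,
                 predefinedAcl = "bucketLevel")
    }
  )
}

# e.g. upload_with_fallback("report.csv", bucket = "my-bucket")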
I'll take a look at the retry logic and see if I can figure it out. It's the least I can do for all the hard work you put into the package.
Much appreciated! And glad to see the targets integration with GCP being used in the wild.
Using predefinedAcl option
As reported in #120