Support returning versionId on s3 `write` #282

hannahilea · 2023-06-01T21:01:50Z

When writing objects to s3, we regularly want to capture the stored object's versionId.

From
https://github.com/JuliaCloud/AWS.jl/blob/a7dae3daa06923754dad420a00f1b0a76ae3ddd2/src/services/s3.jl#L5607
this information is returned on s3_put:

If you enable versioning for a bucket, Amazon S3 automatically generates a unique version ID for the object being stored. Amazon S3 returns this ID in the response.

How could we access that information on write? A few options:

A. Update default write behavior: have write return a versioned path instead of current return value (which is the parsed returned buffer stream, which seems to default to an empty string??)
B. Add option to current write function to toggle return value to be versioned path instead of default
C. Update default write behavior to additionally return either the versioned path or the full response header
D. Add separate write function that returns a versioned path

Option D is nice in that it doesn't change any default behavior, but A (or C) would be nicer for just getting response versions "for free" on all write operations, which seems like it could be widely desirable (for definitions of "widely" that are maybe just "my team", hah).

While it's possible that folks depend on the current output of write, there are currently no tests in this testset for output (type or content). 😢

In any case, a stub of this requested output would be to swap out

AWSS3.jl/src/AWSS3.jl

Line 811 in 76b3a28

return parse(S3.put_object(bucket, path, args; aws_config=aws, kwargs...))

with something more like

r = S3.parse(S3.put_object(bucket, path, args; aws_config=aws, kwargs...))
version = get(Dict(r.headers), "x-amz-version-id", nothing)
return S3Path(path; version)

(as implemented by @ericphanson in a non-public repo)
or

r = S3.put_object(path.bucket, path.key, params; aws_config)
version = get(Dict(r.headers), "x-amz-version-id", nothing)
return parse(r), r.headers

The text was updated successfully, but these errors were encountered:

hannahilea · 2023-06-05T15:13:21Z

Personal preference for option A (or a bastardized version thereof): have put_object return just the version string OR the versioned path

hannahilea · 2023-06-05T21:23:15Z

In some private code, my current (hopefully short-term...) workaround is to pirate with
(EDIT: updated since previous workaround did not work!)

@service S3 use_response_type = true

function AWSS3.s3_put(
    aws::AbstractAWSConfig,
    bucket,
    path,
    data::Union{String,Vector{UInt8}},
    data_type="",
    encoding="";
    acl::AbstractString="",
    metadata::AWSS3.SSDict=AWSS3.SSDict(),
    tags::AbstractDict=AWSS3.SSDict(),
    kwargs...,
)
    headers = Dict{String,Any}(["x-amz-meta-$k" => v for (k, v) in metadata])

    if isempty(data_type)
        data_type = "application/octet-stream"
        ext = splitext(path)[2]
        for (e, t) in [
            (".html", "text/html"),
            (".js", "application/javascript"),
            (".pdf", "application/pdf"),
            (".csv", "text/csv"),
            (".txt", "text/plain"),
            (".log", "text/plain"),
            (".dat", "application/octet-stream"),
            (".gz", "application/octet-stream"),
            (".bz2", "application/octet-stream"),
        ]
            if ext == e
                data_type = t
                break
            end
        end
    end

    headers["Content-Type"] = data_type

    if !isempty(tags)
        headers["x-amz-tagging"] = AWS.URIs.escapeuri(tags)
    end

    if !isempty(acl)
        headers["x-amz-acl"] = acl
    end

    if !isempty(encoding)
        headers["Content-Encoding"] = encoding
    end

    args = Dict("body" => data, "headers" => headers)
    r = S3.put_object(bucket, path, args; aws_config=aws, kwargs...)
    version = get(Dict(r.headers), "x-amz-version-id", nothing)
    return S3Path(bucket, path; version)
end

function Base.open(f::Function, path::S3Path, args...; kwargs...)
    buffer = open(path, args...; kwargs...)
    try
        f(buffer)
    finally
        versioned_path = if iswritable(buffer)
            seekstart(buffer)
            write(buffer.path, read(buffer.io))
        else
            nothing
        end
        close(buffer.io)
        return versioned_path
    end
end

hannahilea · 2023-06-14T18:52:57Z

sketchy notes from convo w/ @omus, @kleinschmidt, @a-cakir:

@hannahilea will take first pass at drafting changes.

raw flag -> into s3_put (return_raw)
inside write call, pass in raw flag “get versioned s3 path” out -> return a new versioned s3 path (return_path flag)

COULD but not going to (now) also add same flag to base.open; curt has grand vision for multipart buffer writing that would be nicer, that we will not do now but should file for future. :)
If we also wanted to add an S3Buffer (which we do, just not quite yet), would look something like:

copy https://github.com/rofinn/FilePathsBase.jl/blob/master/src/buffer.jl
change "FileBuffer" -> S3Buffer that contains FileBuffer and passes through to FB when we don't need the specialized function
- all fields inside private (so could be swapped out in future)

hannahilea · 2023-06-16T20:33:15Z

Closing, as the original issue is now addressed. We can open further issues as needed for supporting an S3Buffer etc.

hannahilea mentioned this issue Jun 14, 2023

Support returning [versioned] S3Path on Base.write #293

Merged

omus assigned hannahilea Jun 15, 2023

hannahilea closed this as completed Jun 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support returning versionId on s3 `write` #282

Support returning versionId on s3 `write` #282

hannahilea commented Jun 1, 2023

hannahilea commented Jun 5, 2023

hannahilea commented Jun 5, 2023 •

edited

Loading

hannahilea commented Jun 14, 2023

hannahilea commented Jun 16, 2023

Support returning versionId on s3 write #282

Support returning versionId on s3 write #282

Comments

hannahilea commented Jun 1, 2023

hannahilea commented Jun 5, 2023

hannahilea commented Jun 5, 2023 • edited Loading

hannahilea commented Jun 14, 2023

hannahilea commented Jun 16, 2023

Support returning versionId on s3 `write` #282

Support returning versionId on s3 `write` #282

hannahilea commented Jun 5, 2023 •

edited

Loading