Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StorageAPI for Images and Files #526

Closed
grabbou opened this issue Aug 10, 2014 · 53 comments
Closed

StorageAPI for Images and Files #526

grabbou opened this issue Aug 10, 2014 · 53 comments

Comments

@grabbou
Copy link
Contributor

grabbou commented Aug 10, 2014

Draft

To store Images, we have to use CloudinaryImage if we are on Heroku and so on or if we like ability to resize or manipulate our image. Not everybody is happy with paying monthly-fee for having only those features. Although Cloudinary is great - not everyone is using all of its features. That's why we need to kick our Image and Files to next level and make it like in other CMS systems.

Storage API

REPOSITORY: https://github.com/grabbou/StorageAPI/

Will be released as stand-alone NPM package and after that, integrated into Keystone.

Deliver Keystone.StorageAPI for developers to easily use storage (when available) in their projects. Storage API can be used either in fields or in our own views and modules. It's public. Define once, use multiple times.

Directory structure

providers/
providers/storage/index.js // this is place for our 'class' **Storage**
providers/storage/....
providers/storage/amazon.js //name 'amazon' now is available as keystone.set('storage')
providers/storage/azure.js

// once we finish, it's time to write mailApi
providers/mail/index.js // mail resolver, works like storage one as well (obtain() method)
providers/mail/postmark.js
providers/mail/mandrill.js
providers/mail/smtp.js

Storage

  • obtain(options) returns current storage provider based on keystone.set('storage') specified. We can optionally pass argument provider here in case our custom field allows to set different provider for it (for example - files in S3, images in Cloudinary). Automatically checks whether given provider is available (file with it exists in our providers folder).

StorageClient

  • @constructor - registers new client based on keystone.set('storage config'). This should be an object, specific for different providers. Can accept either single object (then we assume it's config for keystone.set('storage') or object array (in case we use multiple storage providers inside our app):
keystone.set('storage', {
   'azure': {},
   'amazon': {}
});
  • upload(options), delete(options), exists(options) - easy manipulation
  • _ensureContainer() - ensures that container exists - based on keystone.set('storage container'). We do not have any default value here as container name should be unique across cloud platform

Please note, that async methods are wrapped within promises (Q library) to avoid callback chain and provider better functionality

List of supported providers to start with

  • Everything provided by pkgcloud - Amazon, Azure, HP, Rackspace, OpenShift
  • Filesystem
  • FTP / SFTP

Pros:

  • Use core, consistent storage layer to access file systems. Developers can now easily write their own modifications and fields using our API without thinking whether config is supported or not. Easier in maintaining. If API changes - only one file to edit, not 12 plugins that are interacting with them.
  • No more hard-coded supported providers. We simply add new file under providers/storage and users can start using it automatically. Our Storage checks whether keystone.set('storage') provider is available, if not - throws new Error('Unsupported provider').
  • Users can easily create their own PR with new providers (or we can even include them as dependencies or think of plug & play architecture for the future).

Fields

Removing multiple fields that are doing the same job and replacing them with following ones. We are going to delete ui-azure.js, ui-s3file.js and many many more (with CSS3 stylesheets as well! Going simple now!)

File

Simple field for handling files. Accepts any storage (uses default one specified in the project). Returns false if uploaded file doesn't match allowedTypes.

Schema:

  • url {String} - full_res url
  • file_name {String}
  • provider {String}
  • container {String} - might be path in case we used FTP/SFTP.

Accepted fields:

  • required - if file is required. If yes and no file provided, returns an error.
  • provider - overwrites provider for that field. Remember to specify configuration. If array specified, multiple actions are performed.
  • size - maximum size - if wrong, returns an error
  • allowedTypes

Available hooks:

  • pre: upload
  • post: upload

Image extends Field

Generic ImageField for uploading & handling images. Deletes CloudinaryImage as it's now built-in within this field. The same with other image-related fields.

Schema:

  • inherited

Accepted fields:

  • inherited

Underscored methods

  • Full Cloudinary API support (if Cloudinary is not default provider or provider for that field, use our wrapper for gm). Key feature is to mimic Cloudinary functionality (the same method calls, almost the same effects). Our Node API simply checks whether image matching options specified is present on server, if not - we generate it and return a link. Please note that we are keeping ful-res image in the cloud storage just to have the ability to resize them in the future.

Available hooks:

  • inherited

Wysiwyg

Removes hard-coded CloudinarySupport. Allows to upload file to default storage provider when available. Allows overwriting that by specifying keystone.set('wysiwyg storage'). To enable image upload - call keystone.set('wysiwyg image upload'). It means you can now upload your wysiwyg images easily wherever you want. Either to FTP or Amazon S3. Want to resize your images while uploading? Easy! Just set keystone.set('wysiwyg image options') and we will adjust your input using ImageField undescored methods. No more resizing before uploading high-res photos.

Breaking changes

  • Removed all file fields and image fields. Replaced with those new ones.
  • Removed lots of unnecessary keystone.set configs, like cloudinary and so on.

Updates

Update 1

  • After writing new field - we should move post/pre and other methods to basic Field class and implement them where possible. In most cases, they are all the same. In others, just make them abstract by throwing new Error('Method not implemented') so one can easily overwrite them and implement. Useful when you are writing everything from scratch and want to make sure that you declared everything needed to work.

Update 2

  • updateHandler needs to be rewritten. For now, we can add new fields to that case, but for the future, we should have in our field method returning value. Based on that value, we can decide what action we should take in ActionQueue. Something like {Field}.getUploadHandlerType. No need to modify core files after that.
@grabbou grabbou changed the title [Types.Image] Complete refactoring [Types.Image] [Types.Field] Complete refactoring Aug 10, 2014
@jamlen
Copy link
Contributor

jamlen commented Aug 10, 2014

One feature that I use from Cloudinary is upload a PDF and they generate a thumbnail. This is really useful.

@grabbou
Copy link
Contributor Author

grabbou commented Aug 10, 2014

Thanks @jamlen for sharing your idea! Actually I agree that's pretty useful. Maybe we can create 'middlewares' for File Uploaders, if option generateThumbnail is set to true, we will try to generate image thumb for our file and save it along the original one.

@estilles
Copy link

@grabbou and @JedWatson, does this mean we're dropping support for Cloudinary in 0.3.x?

@grabbou
Copy link
Contributor Author

grabbou commented Aug 10, 2014

@JohnnyEstilles not AFAIK. I am only developing new ImageField that will handle Local & Cloud images (upload/get/resize). A kind of free alternative if you already got hosting. In the future, we will be able to add ImageFile.FTP as provider as well. Cloudinary will remain untouched :)

As I have a deadline for tomorrow with that feature, will try to finish at least Amazon integration. You can check out my fork (branch feature-imagefile) for progress. For now, I've copied AmazonFile files - quite unhappy with the fact that I had to redeclare jQuery and templates as well. Probably in the final case, I will use the same as Amazon/S3 that are almost the same.

@estilles
Copy link

@grabbou thanks for the update. I'll check out your branch. BTW, I love the concept of Types.Image being an extension of a simpler Types.File. The idea is brilliant.

@grabbou
Copy link
Contributor Author

grabbou commented Aug 11, 2014

@JohnnyEstilles thanks! Great to hear that!

@yinear
Copy link

yinear commented Aug 11, 2014

We have a large number of pictures and don't want to pay cloudinary for that.
I wish local images can be added in wysiwyg and managed as a Field.
Thanks for your effort.

@grabbou
Copy link
Contributor Author

grabbou commented Aug 11, 2014

Hi @yinear, although my current branch is a bit awkward, all those capabilities are handled now by Amazon (we had to migrate our CMS before today). I will try to finish that module today/tomorrow and in final version, it's going to work right that :)

@JedWatson
Copy link
Member

Not sure if you saw #165, there were a few other issues where this has been discussed but you're basically nailing it, @grabbou. I need to run for a couple of hours but will write down a full summary of my thoughts on this tonight.

@grabbou
Copy link
Contributor Author

grabbou commented Aug 11, 2014

Cool, thanks. I haven't heard of them, it's good to close few of them by accident! We can catch up later today. I will start working on that probably around 8-12pm. It seems that we are gonna have busy time with writing new modules, if you can mail me using my email address listed in my profile - that might be speed up our work.

Important
Topic updated, removed my comments and moved everything to first comment, so it should be clearer now.

@grabbou grabbou changed the title [Types.Image] [Types.Field] Complete refactoring StorageAPI for Images and Files Aug 11, 2014
@JedWatson JedWatson added this to the 0.3.0 milestone Aug 11, 2014
@JedWatson
Copy link
Member

Great spec, @grabbou. My ideas / questions:

Images

How are you planning to implement the cloudinary-like resizing features? I love the idea, but had planned to do "upload-time" generation of images from the source, based on a definition that would live with the field (e.g. multiple keys w/ sizes, crop methods, etc).

Emulating the on-the-fly image resizing from source (when missing) would definitely be cooler, but would also (I imagine) require either middleware to route the request for the image via a generator that integrates with the storage engine while serving the request, or a callback in the underscore methods (as checking for the size, and generating it if necessary, would be async (which the cloudinary methods are not, as all they do is generate a URL).

I've got more ideas on this, but first wanted to get a sense of how you're planning for it to work end-to-end.

Cloudinary

I was originally going to leave the CloudinaryImage and CloudinaryImages field types in place, because I do expect our internal Image API to diverge from Cloudinary's. I'm happy to have one or two exceptions like this; I think "this other service processes your images" vs "keystone processes your image and stores it somewhere else (fs / s3 / etc)" are fundamentally different problems to solve.

Especially because CloudinaryImage/s fields are so widely used at the moment, I recommend we build this alongside them, and focus on getting our generic Image and File fields correct. (remember, we'll have to do both single and multi-file/image variants).

Promises

Please note, that async methods are wrapped within promises (Q library) to avoid callback chain and provider better functionality

Generally, without taking either side of the async / promise preferences, I've kept all external Keystone APIs async and follow standard node conventions (error first callbacks, etc), using the async library as necessary to keep code clean.

I'd be happy to offer a dual-api (as Mongoose does, where you can fire off async functions by passing a callback argument, but they also return a promise) but don't want to split the API into promises / async.

My main reasoning for this is that it's quite easy to promisify async methods anyway, if authors prefer that api, and this keeps Keystone more consistent with core node / express APIs, so there are less things to learn for newcomers.

I've found new users have trouble distinguishing the boundaries between what's Keystone, what's Mongoose and what's Express, which has reinforced the importance of a consistent API for me (something I'm a bit guilty of breaking with some early decisions, as concepts have evolved).

Dealing with Breaking Changes

Given the breaking changes in this, we'll need to publish it as a minor version bump (technically anything can break on 0.x.x versions in semver, but I like to go with "patch version bumps fix / add, minor version bumps break". So ideally we can line up these changes with the move to React, which I was going to publish as 0.3.0 (since some people will be running custom builds / hacks that reworking the Admin UI may interfere with).

In the meantime, I'd like to keep master safe for updates to the 0.2.x versions, so I'm aggressively merging changes in master into my react branch and trying to limit refactoring in react until it's ready to merge.

Alternatively we could focus on adding new File and Image types, then Files and Images, merge them in the 0.2.x / master branch as we go, then remove the redundant field types in the 0.3.0 cleanup. Might be less disruptive that way, and the two tasks won't be dependent on each other.

I suspect the File and Image fields will be quite divergent (except for wrapping the underlying StorageClient layer) with File being much simpler... do you think it is worth splitting this into two tasks / issues, and tackling File first to get the basics down? Or do you want to tackle it all at once?

Finally on breaking changes, with the options that we're replacing (like s3 config), I'd like to deprecate these but have them pass the config through to the new system anyway (maybe with a warning to console.log). Much like @JohnnyEstilles with the List.addPattern deprecation in #523. Remember we'll want to handle compatibility with process.env keys.

Refactoring the UpdateHandler

This has been on my mind for a while, I've opened a new issue for it specifically (#533) because it can / should be done independently of this. I'll jump into it soon, if you don't get to it first ;)

Finally, this is something I've been wanting to do to Keystone for quite a while, thanks for tackling it so comprehensively! :D

@grabbou
Copy link
Contributor Author

grabbou commented Aug 11, 2014

Hi!

So, basically, in answer to your question:

Images

My initial idea was to provide users with underscored methods, let's say _resize(400, 500);
The workflow would be (synchronous):

  1. In jade template, there is our _resize method, jade calls it.
  2. Our method checks using Storage API whether image exists on the filesystem.
  3. If yes - it returns that URL
  4. If not - it downloads the image (if necessary) / resizes it and uploads it again
  5. Returns new URL

I quite don't like the point 4 and the fact that we will have to download high-res photo, but the initial idea is to save them in the cloud in untouched way, so then, we can modify them without uploading them again. They might be prefixed with _original<name>.jpg or sth.

Upload-time manipulation is better in a way that the first page load will be faster, but in case we are not satisfied with the process, we will have to re-upload it again, which might be confusing if we got dozens of images already uploaded.

Fact, I agree that we can leave Cloudinary untouched. This will give us more flexibility in terms of implementing our own image manipulation methods without breaking current images.

Files

Ok, that's not a problem. We can follow that pattern (at least I am a huge fan of err,callback pattern as well). After giving it a thought for a long, I think that's actually better.

Breaking changes

Well, personally, I'd rather prefer the style of doing everything all together. Of course -> we will split that in smaller steps with very small informal deadlines. The initial process will be probably like this:

  1. Storage
  2. StorageClient (Amazon as I got credentials)
  3. File
  4. Connecting Field & Storage (writing obtain() method)
  5. Testing if File works
  6. Image as extended File
  7. Wysiwyg tweaks
  8. Underscored image methods

Not sure if they will be divergent, I am going to use File as a base, as you can see from my spec, Image inherits almost everything from File :)

OK, so -> we are changing keystone.set methods, but keeping proccess.env variables the same. We can then add wrapper in Storage checkDeprecatedConfig that will run our set-up but with warnings.

Ideally, I would love to have at least kind of 'limited' repo access, so I can create branch here, called feature-storage and work on it continuously. After that -> we are merging with yours React? I think that's easier and more safe. If we are going to have breaking changes anyway, let's to that only once.

Cool, thanks for that issue. I will probably help you on refactoring that.

What about (for the future) moving few fields and methods to Field (making them 'abstract') as I pointed before?

You're welcome! Thanks for that superb CMS. I felt in love with it from the very beginning and it's fun that I can finally contribute to something I really like.

I am going to start writing points 1 & 2 today, so the most important thing is to sort everything out before that :) Probably we will have few issues and suggestions while writing that :)

@webteckie
Copy link
Contributor

I'm having a hard time following through the requirements set forth here that will be addressed via the storage API. Simple question: currently there is support for LocalFile, S3File, and AzureFile. Ideally I would like to be blind about the underlying storage system and just deal with a File. That way, when I'm testing things locally it will use my local file system and when I deploy out there to some cloud it will use whatever file system transparently. Will that be possible with what's being defined here?

@grabbou
Copy link
Contributor Author

grabbou commented Aug 14, 2014

Yes, @webteckie, basically, StorageAPI works like the way you'd like it to work.

It's all about configuration. You can use default available provider by just accessing one Storage method that will give you current client instance.

If you want to use LocalSystem on your localhost and e.g. S3File on your deployment server -> simply configure cloud credentials there and leave them empty on your localhost. Further details will be announced soon.

@webteckie
Copy link
Contributor

Excellent! And, will there also be an embedded wysiwyg property as part of it so that when you view the file in the Admin UI a preferred wysiwyg editor is used to view its content?

@sebmck
Copy link
Contributor

sebmck commented Jan 7, 2015

@grabbou Keystone has come pretty far in the last few months, we've since had our first community meeting #812, would be great to see you at the next one.

@JedWatson
Copy link
Member

@grabbou nice to see you back, hope you're better! we're nearly ready for the major new release with React and Express 4 so we'll pick this up after that goes out :)

@grabbou
Copy link
Contributor Author

grabbou commented Jan 7, 2015

@sebmck Yeah, I can see that. I am glad it keeps developing. I think I'll speak with one of contributors on Slack to get an overview what's going on at the moment. I am in for our next Hangouts meeting.

@JedWatson Great work on React! Sorry for being away. Let's start contributing once again! :)

@okjake
Copy link

okjake commented Aug 17, 2015

Any further progress on this? Can I help with it?

@JedWatson
Copy link
Member

@okjake I'm so glad you asked. I've been busily wrapping up the rebuild of the Admin UI and this is the other major blocker to the new version, I'd really love someone to take it on and help with it.

Things have moved on quite a bit since @grabbou's work, and are encapsulated in the file-fields branch. It's quite out of date so I'll have to get it brought up to speed, and figure out what's required to complete it. @creynders may be able to help organise the effort as well, he did quite a bit of work with it last. Let me know your email so I can add you into our Slack and we can chat.

@okjake
Copy link

okjake commented Aug 17, 2015

OK great! Just added my email to my profile.

@JedWatson
Copy link
Member

Great, you're invited!

@mahnunchik
Copy link
Contributor

Hi @JedWatson

Any progress on this?

I need to implement something like this:

  • validation that the image is not less than required size
  • resize if image is larger than required size

It is quite difficult to implement it on the current implementation of LocalFile.

@christroutner
Copy link
Contributor

@JedWatson I'd also love to help with this. @okjake If you can take a half hour to bring me up to speed with what you've accomplished, I can dedicate a couple hours a week to help with the integration of this. My email is available in my GitHub profile.

@jeffreypriebe
Copy link
Contributor

@mahnunchik @christroutner Happy to help with this effort as well. A little more discussion over on the Google Group post.
I currently have a branch that does some integration of cloundinary & s3 images/files into the main editor and want to see the file & image management in Keystone grow into a more robust system.
(My work uses React and pre-integration Elemental and isn't setup as something to be folded into the main Keystone project ATM)

Let's get on the Google Group Post for continued discussion.
cc: @okjake @JedWatson

@rnjailamba
Copy link

@jeffreypriebe @christroutner @JedWatson @grabbou
How are things going for this feature ? :)

@rnjailamba
Copy link

@jamlen
a code snippet of using cloudinary pdf would be VERY helpful!!!

@jamlen
Copy link
Contributor

jamlen commented Mar 9, 2016

@rnjailamba from memory I think the main bit that does it is presentation.src({ transformation: 'mottram-thumb'}), where I am referencing a predefined transformation, but you could just specify the transformation details.

I use this in my model:

Sermon.add({
//...
presentation: { type: Types.CloudinaryImage, collapse: true, allowedTypes:['application/pdf'], autoCleanup : true }
//....
});

Sermon.schema.virtual('thumbnail').get(function() {
    if (!this._.presentation.src()) {
        return {
            img: null,
            isPdf: false
        };
    }
    return {
        img: this._.presentation.src({ transformation: 'mottram-thumb'}),
        isPdf: this.presentation.format.toLowerCase() === 'pdf'
    };
});

and then this in the jade:

if sermon.presentation.exists
    if sermon.thumbnail.isPdf
        a(href=sermon.presentation.url).sermonImg
            img(src=sermon.thumbnail.img)
    else
        img(src=sermon.thumbnail.img)
else
    img(src='/images/nopdf.gif')

@LuganOcode
Copy link

Hi, I hope this is the right place to add this, I'm curious about whether or not it is possible to setup the S3 storage so that actual file uploads are performed not by keystone in a:

  • Client → KeystoneJS → S3 flow
    but instead:
  1. Client (meta data and signing request) → Keystone
  2. Keystone → sends timeout signed PUT request back to client with data size, etc. Client
  3. Client → directly uploads actual file to S3

The application I am building performs a lot of file uploads, and in the past I have used this signing strategy in other application frameworks with good success.

Below is a more concrete snippet of what this means.

In my experience this can quite significantly reduce load at no loss, especially since Keystone only stores the metadata anyway.

I'm sure I can tinker my way to this, but maybe there is already some kind of support for it that I am missing?

@JedWatson
Copy link
Member

@LuganOcode this is something I'd like to look into for all the file-type fields once we've consolidated them, but that's looking like a post-0.4 thing. We need to get the foundations fixed before we build more features on top of them.

What will be available though is the ability to pass the metadata directly to Keystone so in the meantime (with 0.4) you could handle the upload to S3 directly with an extension to your app then submit the metadata to Keystone.

@LuganOcode
Copy link

@JedWatson Thanks for the tip. We are already using 0.4 beta - the React UI stuff was just too nice to pass up. So one more hint then, if we go in the direction that you suggest, would this imply essentially not using the 'S3FileType', and instead just just using a custom model object, or you would instead mean using the standard S3FileType and just processing submissions with the metadata but without the actual files?

Since this does seem to be the/an appropriate place, I'm going to point out one more small hiccup that we ran into with file uploads and the updateHandler. In our application we are asking users to create A/V files. This forced us to custom-configure the _uploads because:

  1. Client side javascript does not permit us to add a filetype input to an HTML form object programmatically. The user must do it. The only alternatives we could think of were to:
    1. Use a FormData object, and submit with an XmlHTTPRequest which means we screw up context handling and flash messages, among other things.
    2. Append the file as a blob in a hidden input.
  2. The Keystone form processing API does not know how to process a blobfile that has been passed via a hidden input, I think because it is not in the 'files' section of the request.

In the end we just moved this work into the view controller, and replicated the work otherwise handled by the updatehandler (and I think ExpressJS is also maybe handling the temporary storage), but this is probably not the ultimate recommended way to handle such issues.

BTW this is a quite amazing thing!

@wmertens
Copy link
Contributor

wmertens commented Aug 3, 2016

See also keystonejs/keystone#3209 (comment) for my take on how Storage Adapters are responsible for handling files, the rest of Keystone should just work with File objects (the bags of data that the Storage Adapter uses to represent the file).

@peterjohansson92
Copy link

Maby im in the wrong discussion for this question, but what is the status about using s3 instead of cloudinary in the wysiwyg editor for image uploading? Currently we use "wysiwyg cloudinary images" in the init, but is it possible to use s3 instead? If so, i would be very pleased if someone could tell me how.

@gautamsi
Copy link
Member

Keystone 4 is going in maintenance mode. No major changes expected. see #4913 for details.

I see a PR was done for S3 image upload for wysiwyg in v4, see #1614

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests