Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to re-map comments to different URLs #412

Closed
umputun opened this issue Aug 14, 2019 · 16 comments
Closed

Add ability to re-map comments to different URLs #412

umputun opened this issue Aug 14, 2019 · 16 comments

Comments

@umputun
Copy link
Owner

umputun commented Aug 14, 2019

Sometimes users may need to change linkage between posts and comments. One case we already had (see #411). Another valid reason - blog/site changed domain or url pattern.

To address it we need some way to alter the location. I don't think we want to extend storage, store.service and the api with such functionality, but rather perform the export-mapper-import sequence.

The mapper will accept input (file? request body?) as a list of "from-url:to-url" pairs and will change all locationw as a part of migrator run. Not sure if we need it in exportrer, importer or maybe in both

@umputun
Copy link
Owner Author

umputun commented Aug 14, 2019

@akosourov - pinging, in case you have time and want to deal with this one. Should be close to the other things you helped with.

@akosourov
Copy link
Contributor

@umputun Yes, that's interesting. I'm going to dive into current version of remark to see how it could be implemented (from my point of view) and return with suggestions

@akosourov
Copy link
Contributor

My thoughts about implementation details:

  1. With changing linkage between comments and posts we should also change PostMetaData (readonly flag). Because admin may forbid commenting for specific url and after such migration this changed url should stay readonly too.

  2. In my opinion this re-map procedure should be near with export and import implementations. So I'd suggest to add API POST /api/v1/admin/convert?site=side-id with request body as you suggested with list of "from-url:to-url" pairs.
    2.1. This list could be simplified for users with pattern matching feature ("old-domain:new-domain" or something similar)

  3. Add new entity Converter (or similar) which changes old url to new url if rule for changing found in request body. It could be extended in future for changing other possible things.
    type Converter interface { ConvertURL(url string) string }
    I'm not sure should the converter be interface or concrete type but my idea is to pass it to Exporter or Importer with possibility to be not set (to be nil).
    Extend Import function in Importers like
    Import(r io.Reader, siteID string, converter Converter) (int, error)
    or extend Exporter the same way
    Export(w io.Writer, siteID string, converter Converter) (int, error)
    I'd suggest to extend Exporter because it has only one realisation, e.x. for importers we want extend only Native Importer, not need to extend others.
    And in Import or Export procedure convert url if caller passed converter.

@umputun What do you think about this ideas?

If it is not clear I can write PR to show what I mean :)

@umputun
Copy link
Owner Author

umputun commented Aug 26, 2019

we should also change PostMetaData

Sure. If we do it as a part of the importer, it should be mapped before n.DataStore.SetMetas call it does.

This list could be simplified for users with pattern matching feature ("old-domain:new-domain" or something similar

Actually, it should be both. I.e. ability to map individual urls as well as ability to remap with patterns, like domain. Probably can be done with some simple pattern matching, and I think we can even go crazy and support regex, but practically example.com/* -> example2.com/* should be sufficient as the only pattern we care about.

Add new entity Converter (or similar) which changes old url to new ur

I'd call such interface UrlMapper, but if you like Converter - up to you.

Extend Import function in Importers like Import(r io.Reader, siteID string, converter Converter) (int, error)

Not sure why would we want to do it. To me, it sounds like the all we need is to inject UrlMapper into Native field and leave Import (or Export) signatures as-is. In other words, the mapper we going to add won't be a part of "specification" but the only part of the particular implementation of Native.Import. Note: I mentioned importer everywhere but the same can be done for Exporter, should be not much different.

I'm not sure should the converter be interface or concrete type

I don't see a reason why not to make it an interface. Will make testing easier, and we could mock it in any way we like.

Generally speaking, this new mapper controller will be very similar to the current Migrator.exportCtrl, the only difference it will be POST instead of GET as we need to pass the file, as we do for importFormCtrl and importCtrl.

@akosourov
Copy link
Contributor

Not sure why would we want to do it. To me, it sounds like the all we need is to inject UrlMapper into Native field and leave Import (or Export) signatures as-is

It's not clear for me how to inject UrlMapper into NativeImporter in new controller when request is being handled without changing interface (or without adding a new interface which defines possibility to import & url mapping). If I understand correctly your suggestion is to build UrlMapper once when application objects are build (on prepare app step) and inject it to NativeImporter and pass it to Migrator struct (which holds controllers). But Migrator struct defines NativeImporter as Importer interface. When url-mapping request will being handled I need a way to build UrlMapper from request body, pass it to NativeImporter and call NativeImporter.Import.
But NativeImporter defines here as Importer interface which doesn't declare a way to pass url mapper.
So I don't see how to do it without changing interface.

I can suggest to define new interface which declares import & url mapping possibility, something like

type MapImporter interface {
	MapImport(r io.Reader, siteID string, mapper UrlMapper) (int, error)
}

declare it in Migrator.NativeImporter field and extend NativeImporter to implement both Import and MapImporter interface

@umputun
Copy link
Owner Author

umputun commented Sep 1, 2019

your suggestion is to build UrlMapper once when application objects are build (on prepare app step) and inject it to NativeImporter and pass it to Migrator struct (which holds controllers)

Kind of. My suggestion was to initialize NativeExporter with the mapper, i.e.:

	migr := &api.Migrator{
		Cache:             loadingCache,
		NativeImporter:    &migrator.Native{DataStore: dataService},
		DisqusImporter:    &migrator.Disqus{DataStore: dataService},
		WordPressImporter: &migrator.WordPress{DataStore: dataService},
		NativeExporter:    &migrator.Native{DataStore: dataService, Mapper: urlMappe},
		KeyStore:          adminStore,
	}

NativeImporter still implements Importer interface, but internally it has UrlMapper field. The real question is how to set mapping rules dynamically, i.e. how the controller can pass those rules to the importer.

However, after some quick thinking, I believe we can do it even simpler. Currently, both Exporter and Importer accept io.Reader and io.Writer as a part of the signature. It should be possible to wrap/decorate either one of them with an adopter doing url rewrite for the data passed in.

I mean, smth like migrator.WithMapper(r io.Reader, m UrlMapper) io.Reader should be added to migrator package and the controller will have UrlMapper as a part of the statically initialized struct, i.e.

migr := &api.Migrator{
		Cache:             loadingCache,
                URLMapper: urlMapper,
                .....
}

pls note: the thinking process was really qucik, so i could be missing something. Feel free to object/critisize this aproach

@akosourov
Copy link
Contributor

I'll try to do via migrator.WithMapper as you suggested without touching interface. Thank for idea.
This is only for showing what I mean in previous example.

@akosourov
Copy link
Contributor

akosourov commented Sep 8, 2019

@umputun I've added /convert controller which performs possible url remap ability.
This approach doesn't touch importer/exporter interfaces but adds migrator.WithMapper decorator which wraps reader for importer. If this approach seems to be better I'll continue to do (add tests, add command for it (if needed), add pattern matching for rules).
UrlMapper still created in request, reading from request body - but maybe with pattern matching feature could be better to creates it on start (as you suggested) - but for me now it seems ok to build it dynamically in every convert request

@umputun
Copy link
Owner Author

umputun commented Sep 8, 2019

If this approach seems to be better I'll continue

Looking good, much simpler and easier to understand, very nice.

it seems ok to build it dynamically in every convert request

The reason why I suggested to pass the mapper into migrator and not to create it on demand is the ability to change/mock the implementation and, generally speaking, avoid tight coupling between controller and mapper implementation. However, if you don't see a simple way to construct/initialize the mapper with a concrete reader I don't have a big problem with the current implementation and can live with this.

One note - convert controller seems to create a temporary export file in order to run mapped importer. Probably it can be eliminated as they all are just readers/writers and io.Pipe can be used here as well.

add command for it

This would be nice to have. I think you will also need 'convert/wait' for this

@akosourov
Copy link
Contributor

The reason why I suggested to pass the mapper into migrator and not to create it on demand is the ability to change/mock the implementation and, generally speaking, avoid tight coupling between controller and mapper implementation

Ok, I got it why it is better. I'll change it

convert controller seems to create a temporary export file in order to run mapped importer. Probably it can be eliminated as they all are just readers/writers and io.Pipe can be used here as well

I thought about it but if I understand correctly this stream-like approach won't work because when exporter will read portion of data from source (from db) and push it to pipe (to importer) the importer delete all data from source. And exporter will be stopped. This delete action defined here.
So I think such file buffer is needed.

@umputun
Copy link
Owner Author

umputun commented Sep 9, 2019

the importer delete all data from source

yeah, makes sense.

@akosourov
Copy link
Contributor

@umputun Could you see pr #431 ? It is not already finished yet, but shows main approach. I've moved mapper creation to migrator struct (from build it on demand on every request to build it once on start).
I see that I should add more tests because coverage decreased a little.

I think you will also need 'convert/wait' for this

What do you think about use /import/wait for /convert procedure too? My point is that /convert also uses shared Busy flag and finishes with import procedure. And also users which will run /convert should understand that it is based on export/import procedures so run /import/wait seems not confused.

And I have another question about naming. I named /convert API because probably this API could do more complex comment changes. But I'm not sure is it ok to name it so broad, maybe /remap or similar would better.

If rest is ok I'll add command for run /convert.

@umputun
Copy link
Owner Author

umputun commented Sep 15, 2019

Could you see pr

I have added comments. Hopefully, you can see it, I never know if github shows them to the author or just to me

What do you think about use /import/wait

It is fine, but the path will be misleading in this case. Probably we could change it to just /wait and document the nature, i.e. "used to wait for completion for any async migration ops"

@umputun
Copy link
Owner Author

umputun commented Sep 15, 2019

maybe /remap or similar would better.

yeah, remap sounds better to me

@akosourov
Copy link
Contributor

I have added comments. Hopefully, you can see it, I never know if github shows them to the author or just to me

Yes, I see comments and get email notifications, thanks for feedback

@umputun
Copy link
Owner Author

umputun commented Sep 30, 2019

done with #431

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants