Mirroring #1081
I had a look at the work needed for mirroring yesterday and think that I have a pretty good idea of what to change. We essentially need to keep track of two URLs: the canonical one and the mirror. I'll push a PR with changes later today, or I can push into this PR. |
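To make the "two URLs" idea concrete, here is a minimal sketch. It is not the code from the eventual PR: the class name `SiteUrls`, its methods, and the canonical base URL used below are all hypothetical placeholders, and the split between "citable page address" and "where files are served from" is an assumption about how the two URLs might be used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteUrls:
    """Hypothetical pair of base URLs: the canonical site and the mirror."""

    canonical: str  # permanent, citable base URL (assumed value below)
    mirror: str     # base URL of the host this build is deployed to

    def canonical_url(self, anthology_id: str) -> str:
        # Page address that should stay stable even if hosting moves.
        return f"{self.canonical.rstrip('/')}/{anthology_id}/"

    def pdf_url(self, anthology_id: str) -> str:
        # File address on whichever host actually serves this build.
        return f"{self.mirror.rstrip('/')}/{anthology_id}.pdf"


# Both values are illustrative; only the mirror host appears in this thread.
urls = SiteUrls(
    canonical="https://aclweb.org/anthology",
    mirror="http://anthology.aclweb.org/",
)
```

Keeping both values in one object means templates never have to guess which base to use; they ask for a canonical link or a file link explicitly.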
Sounds great. Directory-depth invariance will be important to have working, since it seems we are actually going to be moving hosting to |
While I like subdomains in principle, do we really want to move away from the canonical URLs that we have? This seems to be a prime source of confusion, even if we create redirects everywhere. The Anthology URLs are nearly the only URLs on the aclweb.org site that have a very good reason to stay the way they are; why not migrate future CPU-heavy dynamic stuff somewhere else? This is not a complete argument; please just let us discuss this before creating facts on the ground! |
We can defer discussion about permanent hosting. In the meantime, I think it'd be good to:
Note that we continue to have problems with Bluehost; for example, the most recent build failed to deploy with this:
Run rsync -aze "ssh -o StrictHostKeyChecking=accept-new" --delete build/anthology/ $PUBLISH_TARGET
ssh: connect to host aclweb.org port 22: Connection timed out
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(235) [sender=3.1.2]
Error: Process completed with exit code 12. |
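Nobody in this thread proposed it, but one common way to soften transient connection timeouts like the one above is to retry the deploy step a few times with a growing pause. The sketch below is only an illustration under that assumption: `retry` and `deploy` are hypothetical helper names, the attempt count and delay are arbitrary, and the rsync arguments are copied from the log line above.

```python
import subprocess
import time


def retry(action, attempts=3, delay=30.0, sleep=time.sleep):
    """Call action() until it returns 0 or attempts run out.

    action is any callable returning an exit code; the last code is returned.
    """
    code = 1
    for attempt in range(1, attempts + 1):
        code = action()
        if code == 0:
            return 0
        if attempt < attempts:
            # Linear backoff: wait longer after each failed attempt.
            sleep(delay * attempt)
    return code


def deploy(target):
    """Run the rsync command from the log above and return its exit code."""
    cmd = [
        "rsync", "-aze", "ssh -o StrictHostKeyChecking=accept-new",
        "--delete", "build/anthology/", target,
    ]
    return subprocess.run(cmd).returncode
```

A caller would wrap the deploy as `retry(lambda: deploy(target))`, where `target` is the value of `$PUBLISH_TARGET`. Retrying only hides short outages; it does not help if the host is unreachable for the whole build.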
Yeah, I saw that. Maybe let's have a chat about infrastructure stuff some time in the near future (I'm available at UTC+1 times). I think I have finished all the necessary changes to host a mirror (currently testing) and now only need the code that actually mirrors the PDFs. That should not be too much work as nearly everything is already in place for that. |
Short update: the URL logic for mirroring works as intended; I will push some stuff next week. |
Closed in favor of #1124. |
This PR is for discussion of mirroring (#295).
The minor modifications here are in place at http://anthology.aclweb.org/, which was pretty easy to set up. However, a few minor problems remain:
- The /anthology subdirectory appears to be hard-coded in some places (I had to add a symlink from anthology → . to get things working)
- bin/anthology/data.py may be a bit too hidden. Also, if it's empty, PDFs (strangely) link to domains, e.g., http://2020.emnlp-main.1/
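Both problems come down to how links are built from a configurable base. The sketch below shows one possible guard, not the repository's actual code: `site_url` is a hypothetical helper that takes the whole base (host plus any subdirectory) as a single prefix, so nothing about /anthology is hard-coded, and it rejects an empty or relative prefix up front instead of letting malformed links reach the generated pages.

```python
from urllib.parse import urlsplit


def site_url(prefix: str, path: str) -> str:
    """Join a configured base URL and a site-relative path.

    Raises ValueError if prefix is empty or not an absolute URL.
    """
    parts = urlsplit(prefix)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"prefix must be an absolute URL, got {prefix!r}")
    # Normalise slashes so the result is the same at any directory depth.
    return f"{prefix.rstrip('/')}/{path.lstrip('/')}"
```

With this shape, a mirror at the domain root and a site under a subdirectory differ only in the configured prefix, and an empty setting fails the build loudly.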