Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent responses when PUTting media #1481

Open
mjordan opened this issue Apr 6, 2020 · 32 comments
Open

Inconsistent responses when PUTting media #1481

mjordan opened this issue Apr 6, 2020 · 32 comments
Labels
Type: question asks for support (asks a question)

Comments

@mjordan
Copy link
Contributor

mjordan commented Apr 6, 2020

Creating media via HTTP PUT (docs) sometimes returns a 201 and sometimes returns a 204. Does anybody know why this is? One practical implication of this is that when Drupal responds with a 201, a location header is returned:

{'Date': 'Mon, 06 Apr 2020 15:38:20 GMT', 'Server': 'Apache/2.4.29 (Ubuntu)', 'X-Powered-By': 'PHP/7.2.29-1+ubuntu18.04.1+deb.sury.org+1', 'Cache-Control': 'must-revalidate, no-cache, private', 'Location': 'http://localhost:8000/media/193', 'X-UA-Compatible': 'IE=edge', 'Content-language': 'en', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'SAMEORIGIN', 'Expires': 'Sun, 19 Nov 1978 05:00:00 GMT', 'Vary': '', 'X-Generator': 'Drupal 8 (https://www.drupal.org)', 'Content-Length': '0', 'Keep-Alive': 'timeout=5, max=100', 'Connection': 'Keep-Alive', 'Content-Type': 'text/html; charset=UTF-8'}

But when a 204 is returned, there is no location:

{'Date': 'Mon, 06 Apr 2020 15:38:24 GMT', 'Server': 'Apache/2.4.29 (Ubuntu)', 'X-Powered-By': 'PHP/7.2.29-1+ubuntu18.04.1+deb.sury.org+1', 'Cache-Control': 'must-revalidate, no-cache, private', 'X-UA-Compatible': 'IE=edge', 'Content-language': 'en', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'SAMEORIGIN', 'Expires': 'Sun, 19 Nov 1978 05:00:00 GMT', 'Vary': '', 'X-Generator': 'Drupal 8 (https://www.drupal.org)', 'Keep-Alive': 'timeout=5, max=100', 'Connection': 'Keep-Alive'}

In both cases, the media appears to be created.

Some poking around in source code has led me to https://github.com/Islandora/islandora/blob/8.x-1.x/src/Controller/MediaSourceController.php#L143 as the source of this behavior but I am not clear on what triggers success or non-success. In both cases, the media appear to be created.

@dannylamb can you shed some light on this? Or any other @Islandora/8-x-committers ?

@whikloj
Copy link
Member

whikloj commented Apr 6, 2020

When doing a PUT normally (or my experience is) 201 CREATED means the resource did not exist and has been created, whereas 204 NO CONTENT means the resource did already exist and has been updated.

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

@whikloj that is my understanding too but I am seeing this when creating new media.They don't already exists.

@dannylamb
Copy link
Contributor

Then that's a bug 😞 The behaviour @whikloj described is the expected behaviour.

The code checks to see if the entity already exists here: https://github.com/Islandora/islandora/blob/8.x-1.x/src/MediaSource/MediaSourceService.php#L255. So somehow that's getting fooled. Or its some sort of awful timing thing. What exactly are you doing to trigger it?

@seth-shaw-unlv
Copy link
Contributor

It's possible that we are running into timing issues; where two items are trying to be created at the same time (media being uploaded and a derivative of another item).

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

It's inconsistent (at least I can't reliably replicate it). I can collect a bit more info if want (that is, try to replicate it) and report back. Anything I should be looking for in particular?

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

@dannylamb

What exactly are you doing to trigger it?

Running Workbench to create new nodes and associated media. I wanted to get the new media's URL, which is in the location response header. Some media didn't have a location, which is why I dug in and noticed the 201/204 pattern.

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

Is it reasonable to assume that if the controller generates a 204, the media already has a URL (i.e., a /media/xxx URL?

@dannylamb
Copy link
Contributor

@mjordan Yes, exactly. You've already got what you seek.

@dannylamb
Copy link
Contributor

Still worried @seth-shaw-unlv may be right, though. Are you hammering it? Exactly how big of a batch are you making?

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

Let me log at https://github.com/Islandora/islandora/blob/8.x-1.x/src/Controller/MediaSourceController.php#L176 and run workbench and see what's there.

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

Hammering in the sense that I am creating one after the other, within parts of a second. But not hammering it the sense that I am only creating 2 nodes/media in one run, during debugging.

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

@dannylamb you start your transaction at https://github.com/Islandora/islandora/blob/8.x-1.x/src/Controller/MediaSourceController.php#L163 , but do you need to explicitly close it?

@dannylamb
Copy link
Contributor

@mjordan IIRC the transaction autocommits when the variable goes out of scope. Let me go double check that 😅

@dannylamb
Copy link
Contributor

According to Drupal docs, yes. They commit when the variable goes out of scope. But I don't see any harm in committing the transaction earlier sooner than later if we're running into timing issues.

@dannylamb
Copy link
Contributor

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

I'm not sure if it will matter, I just noticed that it wasn't being explicitly closed.

I'll do some logging to see if I can come up with any more information.

@mjordan
Copy link
Contributor Author

mjordan commented Apr 6, 2020

Ok, got something, I think: I've created about 100 media and didn't get any 204s but I did get a 500, so looked in my apache error log. Found these:

sudo tail -f  /var/log/apache2/error.log
[Mon Apr 06 20:36:18.096001 2020] [php7:notice] [pid 8883] [client 127.0.0.1:45908] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:36:52.072608 2020] [php7:notice] [pid 9828] [client 127.0.0.1:46514] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:38:04.491148 2020] [php7:notice] [pid 9933] [client 127.0.0.1:47504] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:38:22.265251 2020] [php7:notice] [pid 10494] [client 127.0.0.1:47988] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:38:54.089959 2020] [php7:notice] [pid 9758] [client 127.0.0.1:48798] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:39:19.224328 2020] [php7:notice] [pid 6980] [client 127.0.0.1:49276] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:39:22.099831 2020] [php7:notice] [pid 9758] [client 127.0.0.1:49398] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:40:28.160379 2020] [php7:notice] [pid 9933] [client 10.0.2.2:49474] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:40:45.711832 2020] [php7:notice] [pid 9758] [client 127.0.0.1:51126] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:41:13.480497 2020] [php7:notice] [pid 11151] [client 10.0.2.2:49554] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:41:49.633085 2020] [php7:notice] [pid 11136] [client 127.0.0.1:52186] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192
[Mon Apr 06 20:41:57.430145 2020] [php7:notice] [pid 10047] [client 127.0.0.1:52274] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "'field_config' entity with ID 'media.fits_technical_metadata.fits_exiftool_exposure_time' already exists." at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192

Strange thing is, they don't correspond to 500s I was seeing as a result of my REST PUT calls, but they are all being thrown at MediaSourceController.php line 192.

@dannylamb
Copy link
Contributor

For sure it's FITS. I get a bunch of those every time I debug it. If things aren't just so in the media_presave hook, you get these scary transaction errors.

@mjordan
Copy link
Contributor Author

mjordan commented Oct 6, 2020

Still getting these errors frequently (maybe 20% of the time using a sample of 5 media doing some dev on Workbench) when PUTing media:

[Tue Oct 06 04:06:17.215598 2020] [php7:notice] [pid 4269] [client 10.0.2.2:44470] Uncaught PHP Exception Symfony\\Component\\HttpKernel\\Exception\\HttpException: "SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT savepoint_1 does not exist: ROLLBACK TO SAVEPOINT savepoint_1; Array\n(\n)\n" at /var/www/html/drupal/web/modules/contrib/islandora/src/Controller/MediaSourceController.php line 192

Would love to make REST more reilable.... anything we can do? I'm using a current Vagrant.

@mjordan
Copy link
Contributor Author

mjordan commented Oct 7, 2020

As discussed in the 2020-10-07 tech call, I will comment out the transaction code to see if this particular error disappears. At least that will localize the problem and we can go from there.

@wgilling
Copy link
Contributor

wgilling commented Oct 7, 2020

During the I8 call, I didn't want to take us on another tangent but if each POST is getting processed by an individual apache client thread, and a rollback can take a longer time (depending on what all had changed), but it got me wondering if the rollback names all being the same between these processes is a potential issue. Perhaps I do not entirely know what overhead it can be to use transactions in the first place (does it lock any records, does the system handle duplicate rollback names in parallel without confusing them, etc). While this would be a separate issue (and not likely behind the reason for an initial POST to fail), it seems to me that each thread would need its own database transaction (distinct name) for what changes happened during its processing until the commit can discard the transaction information.

Initially my thought was that the first process that had to perform the rollback must have proceeded and causing the subsequent failed processes to NOT be able to rollback to their "savepoint_1" because the previous process disposed of it.

@mjordan
Copy link
Contributor Author

mjordan commented Oct 19, 2020

@wgilling I don't know enough about how Drupal handles transactions to comment on your ideas. But here is the results of some testing I did:

  1. Created 100 objects (node + PNG image) with Islandora Workbench and got 8 failed transactions.
  2. I commented out the transaction code in the Islandora module's MediaSourceController and created two more batches of 100 objects (total 200 objects without the transaction code).
  3. All my media were created successfully.

@mjordan
Copy link
Contributor Author

mjordan commented Dec 2, 2020

@dannylamb up in #1481 (comment) you mentioned that "I don't see any harm in committing the transaction earlier sooner than later if we're running into timing issues." I'd be happy to test this. Would be good to track down this issue once and for all. Any suggestions?

@mjordan
Copy link
Contributor Author

mjordan commented Dec 9, 2020

Decision at 2020-12-09 Tech call was that it's easier to change Workbench to use Drupal's file and media REST resources than to monkey with Islandora's MediaSourceController, which would be unnecessarily disruptive and non-trivial. Workbench issue is linked ^^. Thanks for the disuccsion everyone.

@dannylamb I can close this if you want.

@elizoller
Copy link
Member

i'm hitting this in our production environment without any use of the workbench module.

@mjordan
Copy link
Contributor Author

mjordan commented Apr 21, 2021

@elizoller during a Migration?

@elizoller
Copy link
Member

yes

@kstapelfeldt kstapelfeldt added the Type: question asks for support (asks a question) label Sep 25, 2021
@aOelschlager
Copy link
Contributor

I am getting these errors with my migrations. I'm using isle-dc and workbench. It's triggered by both fits derivatives and thumbnail derivatives. I will be running more migrations next week and I will test if the suggestion in this issue fixes the problem.

@rebeccapower
Copy link
Contributor

Was discussed in the Open Tech Call–a complex issue that could possibly be bumped up to a working group. Possible first steps to tackling: come up with a test that reproduces the issue.
These links were provided during the discussion:
https://www.drupal.org/project/drupal/issues/1803886
https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html

@rosiel
Copy link
Member

rosiel commented Jun 15, 2023

TAG met and decided to dedicate some of August's Open Meeting to a session where we figure out a way to fix this.

@whikloj
Copy link
Member

whikloj commented Jun 15, 2023

Reading through the discussion at https://www.drupal.org/project/drupal/issues/1803886
and I would say this issue is very relevant and validates the discussion in Islandora-Devops/islandora-playbook#250 and anyone using MySQL should probably be making the same change. This may or may not resolve the response issue, but it might help actually diagnose the issue.

I'd also note this comment related to the savepoint_1 issue and their suggestion that this might be a red herring disguising the real issue.

@rosiel
Copy link
Member

rosiel commented Jun 30, 2023

from @ruebot on the ICC call:

That error goes away if you set 'SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED' on the database.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: question asks for support (asks a question)
Projects
Development

No branches or pull requests

10 participants