Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: migrate to utf8mb4 #3094

Merged
merged 1 commit into from
Aug 30, 2018

Conversation

jniles
Copy link
Collaborator

@jniles jniles commented Aug 29, 2018

This commit migrates the database to UTF8MB4 in preparation for MySQL 8 which defaults to this. It also produces better results for diverse character sets.

Closes #2989.

Overview
The motivation for this Pull Request is two-fold:

  1. Make sure no one had break the server by putting valid characters that cannot be represented in MySQL's UTF8 character set.
  2. Ensure we are set up for future releases of MySQL and future languages by migrating to UTF8MB4. MySQL 8 will support this out of the box.

And additional benefit continues the work started by @mbayopanda in his work on Travis CI changes in #3061 to document and standardize the character sets and collations in our database. Previously, we just relied on having "the right set up", with some of our Stored Procedures in the server's default encoding, functions called in the mysqljs's default encoding, etc... Now, we define at creation time and connection time the character sets and collations we are working with.

Notes
I had to skip one test. It was the link between debtor groups and patients. I'm pretty sure this is due to the fact that the order data was inserted in changed somehow. Either way, the test is non-specific and should be re-written to reference the group by its name.

@jniles
Copy link
Collaborator Author

jniles commented Aug 30, 2018

@mbayopanda can I get a review?

@jmcameron
Copy link
Collaborator

If a separate manual step is necessary to upgrade the database (apart from those done automatically by 'yarn dev'), please make a note of that somewhere so I'll know what to do. Thanks.

-Jonathan C

@jniles
Copy link
Collaborator Author

jniles commented Aug 30, 2018

@jmcameron good point! I'm not confident it works on anyone else's machine but my own (plus Travis), so I'm hoping @mbayopanda can help contribute those insights when he tests on Windows. Once we are confident that it works with yarn dev, I'll jot down some notes to include with this PR.

Copy link
Collaborator

@mbayopanda mbayopanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

I leave some comments which cannot block the merge this PR

@@ -32,7 +31,7 @@ class DatabaseConnector {
user : process.env.DB_USER,
password : process.env.DB_PASS,
database : process.env.DB_NAME,
charset : 'utf8_unicode_ci',
charset : 'UTF8MB4_UNICODE_CI',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 it is important to comment that the charset UTF8MB4_UNICODE_CI must be in uppercase.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Will do.

@@ -115,8 +114,8 @@ class DatabaseConnector {
one(sql, params, id, entity = 'record') {
return this.exec(sql, params)
.then(rows => {
const errorMessage =
`Expected ${entity} to contain a single record with id ${id}, but ${rows.length} were found!`;
// eslint-disable-next-line max-len
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄 is it allowed ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, it's either that or figure our a shorter representation ;)

@@ -134,7 +134,7 @@ BEGIN
SET accountType = 'expense';
END IF;

SET accountTypeId = (SELECT id FROM account_type WHERE `type` = accountType COLLATE utf8_unicode_ci LIMIT 1);
SET accountTypeId = (SELECT id FROM account_type WHERE `type` = accountType COLLATE utf8mb4_unicode_ci LIMIT 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to get rid also of all these COLLATE utf8mb4_unicode_ci ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... good question. Let me try!

SET existAccountType = (SELECT IF((SELECT COUNT(*) AS total FROM `account_type` WHERE `type` = accountType COLLATE utf8_unicode_ci) > 0, 1, 0));
SET accountTypeId = (SELECT id FROM `account_type` WHERE `type` = accountType COLLATE utf8_unicode_ci LIMIT 1);
SET existAccountType = (SELECT IF((SELECT COUNT(*) AS total FROM `account_type` WHERE `type` = accountType COLLATE utf8mb4_unicode_ci) > 0, 1, 0));
SET accountTypeId = (SELECT id FROM `account_type` WHERE `type` = accountType COLLATE utf8mb4_unicode_ci LIMIT 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can remove all these COLLATE utf8mb4_unicode_ci it will be very nice

@@ -16,38 +16,38 @@ set +a
TIMEOUT=${BUILD_TIMEOUT:-8}

# build the test database
mysql -u $DB_USER -p$DB_PASS -e "DROP DATABASE IF EXISTS $DB_NAME ;" &> /dev/null
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we redirect the output to /dev/null now ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - this was a feature requested by @jmcameron. We basically eliminate all ability to debug by sending all errors into /dev/null/. I figured this was a bit too much. Do you think we should put it back?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set up the script to have an argument that determines whether the output goes to /dev/null or to regular output?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, I can take a cut at setting up the script with an optional argument for debugging:

sh/build-database.sh debug

would force output to output to go to stdout (if I can figure out how to do that...)

-Jonathan C

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a test bash script working for the debug option here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testo.txt

Download, remove .txt and chmod +x and try it with:
./testo
./testo debug

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, that is kind of nice! I'll incorporate it. 👍

@@ -16,7 +16,7 @@ describe('Check Inter-Registry Links', () => {
GU.expectRowCount('patient-registry', 2);
});

it('Checks the link between Debtor Groups -> Patient Registry', () => {
it.skip('Checks the link between Debtor Groups -> Patient Registry', () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you leave a comment about the reason of skipping this test ?

This commit migrates the database to UTF8MB4 in preparation for MySQL 8
which defaults to this.  It also produces better results for diverse
character sets.

Closes Third-Culture-Software#2989.
@jniles
Copy link
Collaborator Author

jniles commented Aug 30, 2018

@mbayopanda, this is ready for a second review.

@jniles
Copy link
Collaborator Author

jniles commented Aug 30, 2018

@sfount want to weigh in here? Did you test it locally?

@sfount
Copy link
Contributor

sfount commented Aug 30, 2018

@jniles I've tested this branch locally and it passes all tests. Running the most up to date version of MySQL 5.6 on linux.

@jniles
Copy link
Collaborator Author

jniles commented Aug 30, 2018

@sfount, thanks for you feedback. Glad this didn't break anything!

@jniles
Copy link
Collaborator Author

jniles commented Aug 30, 2018

Alright, merging!

bors r+

bors bot added a commit that referenced this pull request Aug 30, 2018
3041: refractor(session service) using $state instead of $location r=jniles a=jeremielodi

closes #3030

3094: fix: migrate to utf8mb4 r=jniles a=jniles

This commit migrates the database to UTF8MB4 in preparation for MySQL 8 which defaults to this.  It also produces better results for diverse character sets.

Closes #2989.

**Overview**
The motivation for this Pull Request is two-fold:
 1. Make sure no one had [break the server](https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434) by putting valid characters that [cannot be represented in MySQL's UTF8 character set](https://stackoverflow.com/questions/30074492/what-is-the-difference-between-utf8mb4-and-utf8-charsets-in-mysql#30074553).
 2. Ensure we are set up for future releases of MySQL and future languages by migrating to UTF8MB4.  MySQL 8 will support this out of the box.

And additional benefit continues the work started by @mbayopanda in his work on Travis CI changes in #3061 to document and standardize the character sets and collations in our database.  Previously, we just relied on having "the right set up", with some of our Stored Procedures in the server's default encoding, functions called in the mysqljs's default encoding, etc...  Now, we define at creation time and connection time the character sets and collations we are working with.

**Notes**
I had to [skip one test](https://github.com/IMA-WorldHealth/bhima-2.X/blob/master/test/end-to-end/verificationLinks/verificationLinks.spec.js#L19).  It was the link between debtor groups and patients.  I'm pretty sure this is due to the fact that the order data was inserted in changed somehow.  Either way, the test is non-specific and should be re-written to reference the group by its name.



Co-authored-by: jeremielodi <[email protected]>
Co-authored-by: Jonathan Niles <[email protected]>
@jniles
Copy link
Collaborator Author

jniles commented Aug 30, 2018

@jmcameron - to answer your earlier question: it's a long an laborious process and depends a lot on the database. This SO answer is a good overview of the basic procedure, to be adapted to whatever databases are needed. Additionally, the Stored Procedures and Functions may have to be recreated in a connection whose CHARSET is utf8mb4 for the parameters to be correct.

For our production sites, we'll need to write a custom script to migrate their datasets. Since we don't have a standard way of pushing updates, we'll need to write scripts for each site's database in particular.

EDIT:
Percona has a great article on migrating existing databases in a sane fashion. Depends on our hardware though.

@bors
Copy link
Contributor

bors bot commented Aug 30, 2018

Build failed (retrying...)

bors bot added a commit that referenced this pull request Aug 30, 2018
3094: fix: migrate to utf8mb4 r=jniles a=jniles

This commit migrates the database to UTF8MB4 in preparation for MySQL 8 which defaults to this.  It also produces better results for diverse character sets.

Closes #2989.

**Overview**
The motivation for this Pull Request is two-fold:
 1. Make sure no one had [break the server](https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434) by putting valid characters that [cannot be represented in MySQL's UTF8 character set](https://stackoverflow.com/questions/30074492/what-is-the-difference-between-utf8mb4-and-utf8-charsets-in-mysql#30074553).
 2. Ensure we are set up for future releases of MySQL and future languages by migrating to UTF8MB4.  MySQL 8 will support this out of the box.

And additional benefit continues the work started by @mbayopanda in his work on Travis CI changes in #3061 to document and standardize the character sets and collations in our database.  Previously, we just relied on having "the right set up", with some of our Stored Procedures in the server's default encoding, functions called in the mysqljs's default encoding, etc...  Now, we define at creation time and connection time the character sets and collations we are working with.

**Notes**
I had to [skip one test](https://github.com/IMA-WorldHealth/bhima-2.X/blob/master/test/end-to-end/verificationLinks/verificationLinks.spec.js#L19).  It was the link between debtor groups and patients.  I'm pretty sure this is due to the fact that the order data was inserted in changed somehow.  Either way, the test is non-specific and should be re-written to reference the group by its name.



3104: fix fiscal year list pagination r=jniles a=mbayopanda

This PR fix the pagination in the fiscal year list module by using the uib-pagination attribute instead of the element.

Co-authored-by: Jonathan Niles <[email protected]>
Co-authored-by: mbayopanda <[email protected]>
@bors
Copy link
Contributor

bors bot commented Aug 30, 2018

Timed out (retrying...)

bors bot added a commit that referenced this pull request Aug 30, 2018
3094: fix: migrate to utf8mb4 r=jniles a=jniles

This commit migrates the database to UTF8MB4 in preparation for MySQL 8 which defaults to this.  It also produces better results for diverse character sets.

Closes #2989.

**Overview**
The motivation for this Pull Request is two-fold:
 1. Make sure no one had [break the server](https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434) by putting valid characters that [cannot be represented in MySQL's UTF8 character set](https://stackoverflow.com/questions/30074492/what-is-the-difference-between-utf8mb4-and-utf8-charsets-in-mysql#30074553).
 2. Ensure we are set up for future releases of MySQL and future languages by migrating to UTF8MB4.  MySQL 8 will support this out of the box.

And additional benefit continues the work started by @mbayopanda in his work on Travis CI changes in #3061 to document and standardize the character sets and collations in our database.  Previously, we just relied on having "the right set up", with some of our Stored Procedures in the server's default encoding, functions called in the mysqljs's default encoding, etc...  Now, we define at creation time and connection time the character sets and collations we are working with.

**Notes**
I had to [skip one test](https://github.com/IMA-WorldHealth/bhima-2.X/blob/master/test/end-to-end/verificationLinks/verificationLinks.spec.js#L19).  It was the link between debtor groups and patients.  I'm pretty sure this is due to the fact that the order data was inserted in changed somehow.  Either way, the test is non-specific and should be re-written to reference the group by its name.



Co-authored-by: Jonathan Niles <[email protected]>
@bors
Copy link
Contributor

bors bot commented Aug 30, 2018

Build succeeded

@bors bors bot merged commit f428f0e into Third-Culture-Software:master Aug 30, 2018
@jniles jniles deleted the fix-manual-collations branch August 30, 2018 16:11
mbayopanda pushed a commit to mbayopanda/bhima that referenced this pull request Dec 15, 2018
3015: Update karma to the latest version 🚀 r=jniles a=greenkeeper[bot]




## Version **3.0.0** of **karma** was just published.

<table>
  <tr>
    <th align=left>
      Dependency
    </th>
    <td>
      <a target=_blank href=https://github.com/karma-runner/karma>karma</a>
    </td>
  </tr>
  <tr>
      <th align=left>
       Current Version
      </th>
      <td>
        2.0.5
      </td>
    </tr>
  <tr>
    <th align=left>
      Type
    </th>
    <td>
      devDependency
    </td>
  </tr>
</table>



The version **3.0.0** is **not covered** by your **current version range**.

If you don’t accept this pull request, your project will work just like it did before. However, you might be missing out on a bunch of new features, fixes and/or performance improvements from the dependency update.

It might be worth looking into these changes and trying to get this project onto the latest version of karma.

If you have a solid test suite and good coverage, a passing build is a strong indicator that you can take advantage of these changes directly by merging the proposed change into your project. If the build fails or you don’t have such unconditional trust in your tests, this branch is a great starting point for you to work on the update.


---


<details>
<summary>Release Notes</summary>
<strong>v3.0.0</strong>

<h3>Bug Fixes</h3>
<ul>
<li><strong>config:</strong> wait 20s for browser activity. (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3087">#3087</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/88b977f">88b977f</a>)</li>
<li><strong>config:</strong> Wait 30s for browser activity per Travis. (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3091">#3091</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/f6d2f0e">f6d2f0e</a>)</li>
<li><strong>init:</strong> add "ChromeHeadless" to the browsers' options (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3096">#3096</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/56fda53">56fda53</a>)</li>
<li><strong>server:</strong> Exit clean on unhandledRejections. (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3092">#3092</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/02f54c6">02f54c6</a>), closes <a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3064">#3064</a></li>
<li><strong>travis:</strong> Up the socket timeout 2-&gt;20s. (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3103">#3103</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/732396a">732396a</a>), closes <a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3102">#3102</a></li>
<li><strong>travis:</strong> use the value not the key name. (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3097">#3097</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/90f5546">90f5546</a>)</li>
<li><strong>travis:</strong> validate TRAVIS_COMMIT if TRAVIS_PULL_REQUEST_SHA is not set. (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3094">#3094</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/fba5d36">fba5d36</a>)</li>
<li><strong>travis:</strong> Validate TRAVIS_PULL_REQUEST_SHA rather than TRAVIS_COMMIT. (<a href="https://urls.greenkeeper.io/karma-runner/karma/issues/3093">#3093</a>) (<a href="https://urls.greenkeeper.io/karma-runner/karma/commit/a58fa45">a58fa45</a>)</li>
</ul>
</details>

<details>
<summary>Commits</summary>
<p>The new version differs by 17 commits.</p>
<ul>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/a4d5bdcbe36cbe12974fa9936a224a7837c5f0a4"><code>a4d5bdc</code></a> <code>chore: release v3.0.0</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/75f466dc172223cd8aa6fd7eec9bee810c982341"><code>75f466d</code></a> <code>chore: release v2.0.6</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/5db939907cabae3d0ef77ec08ae04563177116c2"><code>5db9399</code></a> <code>chore: update contributors</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/eb3b1b4ce8e0545832676289deb6e48bde5465fd"><code>eb3b1b4</code></a> <code>chore(deps): update mime -&gt; 2.3.1 (Third-Culture-Software#3107)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/732396a087c6dddeea2cf7f7493bf148a508725d"><code>732396a</code></a> <code>fix(travis): Up the socket timeout 2-&gt;20s. (Third-Culture-Software#3103)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/173848e3a0659bceea9d87d4eeee6c8a11c0f1a1"><code>173848e</code></a> <code>Remove erroneous change log entries for 2.0.3</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/10025695c4af3ea983c2c1316ec4ef078e056ebc"><code>1002569</code></a> <code>chore(ci): drop node 9 from travis tests (Third-Culture-Software#3100)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/02f54c6c57f5a3e0be3a44e8e5ca1db98b8dbc8f"><code>02f54c6</code></a> <code>fix(server): Exit clean on unhandledRejections. (Third-Culture-Software#3092)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/0fdd8f9e91b6039c07c857e11a5d7f7b3205cf01"><code>0fdd8f9</code></a> <code>chore(deps): update socket.io -&gt; 2.1.1 (Third-Culture-Software#3099)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/90f5546a9f88199f8118eae506922d4e8ee38945"><code>90f5546</code></a> <code>fix(travis): use the value not the key name. (Third-Culture-Software#3097)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/fba5d365146bad122d54af75bf191ad0b6091dd0"><code>fba5d36</code></a> <code>fix(travis): validate TRAVIS_COMMIT if TRAVIS_PULL_REQUEST_SHA is not set. (Third-Culture-Software#3094)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/56fda53ec19a1a691cd80342fef9b23d9f9fe4d2"><code>56fda53</code></a> <code>fix(init): add "ChromeHeadless" to the browsers' options (Third-Culture-Software#3096)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/f6d2f0ea5a3323c5e359e26fe5be9fbf68db819f"><code>f6d2f0e</code></a> <code>fix(config): Wait 30s for browser activity per Travis. (Third-Culture-Software#3091)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/a58fa45c1df08ff4e74f9e75379f74c1311073c3"><code>a58fa45</code></a> <code>fix(travis): Validate TRAVIS_PULL_REQUEST_SHA rather than TRAVIS_COMMIT. (Third-Culture-Software#3093)</code></li>
<li><a href="https://urls.greenkeeper.io/karma-runner/karma/commit/88b977fcada5d08ae8d5bba9bc8eefc8404eff82"><code>88b977f</code></a> <code>fix(config): wait 20s for browser activity. (Third-Culture-Software#3087)</code></li>
</ul>
<p>There are 17 commits in total.</p>
<p>See the <a href="https://urls.greenkeeper.io/karma-runner/karma/compare/00189471d383600a95415ea526b152dd556ce751...a4d5bdcbe36cbe12974fa9936a224a7837c5f0a4">full diff</a></p>
</details>

<details>
  <summary>FAQ and help</summary>

  There is a collection of [frequently asked questions](https://greenkeeper.io/faq.html). If those don’t help, you can always [ask the humans behind Greenkeeper](https://github.com/greenkeeperio/greenkeeper/issues/new).
</details>

---


Your [Greenkeeper](https://greenkeeper.io) bot 🌴



Co-authored-by: greenkeeper[bot] <greenkeeper[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants