[ja] cs-229-machine-learning-tips-and-tricks #99

Merged
merged 11 commits into from
Jun 30, 2020

Conversation

@umu1729 (Contributor) commented Nov 16, 2018

No description provided.

@shervinea shervinea changed the title fix ja [ja] Machine learning tips and tricks Nov 16, 2018
@shervinea shervinea added the in progress Work in progress label Nov 16, 2018
@shervinea (Owner)

Thanks for your ongoing work @umu1729! I only kept the cheatsheet you were translating, so that this PR does not conflict with #96.

@shervinea shervinea mentioned this pull request Jun 3, 2019
@shervinea (Owner)

Just realized that your work was already ready to be reviewed @umu1729! Please feel free to invite friends to take a look at it.

@shervinea shervinea added reviewer wanted Looking for a reviewer and removed in progress Work in progress labels Sep 11, 2019
@ytknzw commented Oct 28, 2019

@shervinea
This cheat sheet has been translated and reviewed in the following PR. Even though it's a different translation from umu1729's, I think we could call this cheat sheet 'done'.
#164 (comment)

@shervinea (Owner)

Hi @ytknzw, thanks for your message. It is my understanding that PR #164 is not the same as the current one, right? (deep learning tips and tricks as opposed to machine learning tips and tricks)

@ytknzw commented Oct 29, 2019

Hi @shervinea, yes, sorry! My misunderstanding!

@shervinea (Owner)

No worries @ytknzw, thanks for helping out in seeing what can be already done! By the way, would you be interested in reviewing the translation?

@ytknzw commented Oct 30, 2019

> By the way, would you be interested in reviewing the translation?

Yes!

@shervinea (Owner)

@ytknzw, thanks for proposing your help!
Hi @umu1729, please feel free to invite anyone else who you think could be interested in this process. Looking forward to the final version of this translation!

@ytknzw left a comment

Reviewed 1 - 11.

@@ -0,0 +1,285 @@
**1. Machine Learning tips and tricks cheatsheet**

⟶ 機械学習チップ&トリック チートシート

Suggested change
⟶ 機械学習チップ&トリック チートシート
⟶機械学習のアドバイスやコツのチートシート

To be consistent with translation of "Deep learning tips and tricks cheatsheet":
https://stanford.edu/~shervine/l/ja/teaching/cs-230/cheatsheet-deep-learning-tips-and-tricks


**2. Classification metrics**

⟶ 分類評価指標

OK


**3. In a context of a binary classification, here are the main metrics that are important to track in order to assess the performance of the model.**

⟶ 二値分類の文脈では,次のようなモデルの性能を評価するための重要な評価指標があります.

Suggested change
⟶ 二値分類の文脈では,次のようなモデルの性能を評価するための重要な評価指標があります.
⟶二値分類の場面では、モデルの性能を評価するために追うべき主な指標として次のものがあります。

For "の文脈では" vs. "の場面では", how about "を背景に"?

@hrkmr-tech (Contributor) Nov 21, 2019

二値分類において、モデルの性能を評価するための主要な指標として次のものがあります.
二値分類において、モデルの性能を評価する際の主要な指標として次のものがあります.

I think the latter is better. I intentionally omitted the phrase "that are important to track" in the translation; translating every word literally makes the sentence a bit redundant.


**4. Confusion matrix ― The confusion matrix is used to have a more complete picture when assessing the performance of a model. It is defined as follows:**

⟶ 混同行列 - 混同行列はモデルの性能を評価する際に,より完全な描像を得るために用いられます.

Suggested change
⟶ 混同行列 - 混同行列はモデルの性能を評価する際に,より完全な描像を得るために用いられます.
⟶ 混同行列 - 混同行列はモデルの性能を評価する際に、より完全に理解するために用いられます。次のように定義されます:

@hrkmr-tech (Contributor) Nov 21, 2019

I suppose "a complete picture" should be translated as '全体像' in Japanese. We use the confusion matrix to assess the model as a whole, so the part in question should read something like:

混同行列はモデルの性能を評価する際に、より全体像を把握するために用いられます.

or something like

混同行列はモデルの性能をより全体から評価するために用いられます.

The "more" in the sentence compares the confusion matrix with other metrics, which focus on more detailed parts of the assessment.


**5. [Predicted class, Actual class]**

⟶ [予測したクラス, 実際のクラス]

OK


**7. [Metric, Formula, Interpretation]**

⟶ [評価指標,式,解釈]

OK


**8. Overall performance of model**

⟶ モデルの全体的な性能

OK


**9. How accurate the positive predictions are**

⟶ 正と判断された予測の正答率
@ytknzw Nov 17, 2019

Suggested change
⟶ 正と判断された予測の正答率
⟶陽性判定の正解率(陽性的中率)

Cf. https://ja.wikipedia.org/wiki/%E9%99%BD%E6%80%A7%E9%81%A9%E4%B8%AD%E7%8E%87

@tianjianjiang Nov 18, 2019

IMHO, the source phrase in English is trying to put the interpretation in layman's terms. Perhaps something similar to yet simpler than "陽性と判定された場合に、真の陽性である確率" from the aforementioned Wikipedia page.
For example, "陽性判定は、どれくらい正確ですか."


**10. Coverage of actual positive sample**

⟶ 実際には正であるサンプルを正しく正と予測した割合
@ytknzw Nov 17, 2019

Suggested change
⟶ 実際には正であるサンプルを正しく正と予測した割合
⟶ 本当に陽性であるサンプルに対する正解率(真陽性率)

Cf. https://ja.wikipedia.org/wiki/%E9%99%BD%E6%80%A7%E9%81%A9%E4%B8%AD%E7%8E%87#%E5%8F%82%E8%80%83

Contributor

How about 実際に陽性であるサンプル ?


**11. Coverage of actual negative sample**

⟶ 実際には負であるサンプルを正しく負と予測した割合
@ytknzw Nov 17, 2019

Suggested change
⟶ 実際には負であるサンプルを正しく負と予測した割合
⟶ 本当に陰性であるサンプルに対する正解率(真陰性率)

Cf. https://ja.wikipedia.org/wiki/%E9%99%BD%E6%80%A7%E9%81%A9%E4%B8%AD%E7%8E%87#%E5%8F%82%E8%80%83

Contributor

ok

Contributor

How about 実際に陰性であるサンプル ?


**12. Hybrid metric useful for unbalanced classes**

⟶ 不均衡データに対する有用な複合指標
Contributor

OK
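
For reviewers checking the terminology in items 8 to 12, the metrics under discussion can be sketched from the four confusion-matrix counts. This is a minimal illustration, not part of the cheatsheet: the function name `classification_metrics` and the example counts are assumptions made up for this note.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives. Assumes every denominator is non-zero.
    """
    total = tp + fp + fn + tn
    return {
        # Item 8: overall performance of the model
        "accuracy": (tp + tn) / total,
        # Item 9: how accurate the positive predictions are
        "precision": tp / (tp + fp),
        # Item 10: coverage of actual positive samples (recall / sensitivity)
        "recall": tp / (tp + fn),
        # Item 11: coverage of actual negative samples
        "specificity": tn / (tn + fp),
        # Item 12: hybrid metric useful for unbalanced classes
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts, chosen only for illustration.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and recall share the numerator TP and differ only in the denominator, which is why the thread distinguishes 陽性的中率 (precision) from 真陽性率 (recall).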


**13. ROC ― The receiver operating curve, also noted ROC, is the plot of TPR versus FPR by varying the threshold. These metrics are are summed up in the table below:**

⟶ ROC曲線 - 受信者動作特性曲線(ROC)は閾値を変えていく際のFPRに対するTPRのグラフです.
@hrkmr-tech (Contributor) Nov 21, 2019

Please add the next line:

これらの指標は下表の通りまとめられます.

<br>

**13. ROC ― The receiver operating curve, also noted ROC, is the plot of TPR versus FPR by varying the threshold. These metrics are are summed up in the table below:**

Contributor

NOTE

One "are" should be removed from the original ("are are").


**14. [Metric, Formula, Equivalent]**

&#10230; [評価指標,式,等価な指標]
Contributor

I'm not sure which one is best, but here are two more options:

  • 同等の指標
  • 同義の指標


**15. AUC ― The area under the receiving operating curve, also noted AUC or AUROC, is the area below the ROC as shown in the following figure:**

&#10230; AUC - ROC曲線下面積(AUC,AUROC)は次の図のようにROC曲線の下側の面積のことです.
@hrkmr-tech (Contributor) Nov 21, 2019

Suggested change
&#10230; AUC - ROC曲線下面積(AUC,AUROC)は次の図のようにROC曲線の下側の面積のことです.
&#10230; AUC - ROC曲線下面積(AUC,AUROC)は次の図に示される通りROC曲線の下側面積のことです
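
As a reference for items 13 and 15, here is a minimal sketch of how the ROC points and the area under the curve could be computed by sweeping the threshold. It is an illustration only: the function name `roc_auc`, the trapezoidal integration, and the example labels and scores are assumptions, not taken from the cheatsheet.

```python
def roc_auc(labels, scores):
    """Return (ROC points, AUC) for binary labels (1 = positive, 0 = negative).

    The threshold is lowered one distinct score at a time; each step adds a
    (FPR, TPR) point. The area is integrated with the trapezoidal rule.
    Assumes at least one positive and one negative label.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Samples sharing the same score cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j

    auc = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return points, auc


# Hypothetical labels and scores, chosen only for illustration.
points, auc = roc_auc([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(points)
print(f"AUC = {auc:.4f}")
```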


**16. [Actual, Predicted]**

&#10230; [実際,予測]
Contributor

OK


**18. [Total sum of squares, Explained sum of squares, Residual sum of squares]**

&#10230; [総平方和,説明された平方和,残差平方和]
@hrkmr-tech (Contributor) Nov 21, 2019

Suggested change
&#10230; [総平方和,説明された平方和,残差平方和]
&#10230; [全平方和,回帰平方和,残差平方和]

@scrambleegg7 (Contributor) left a comment

I made small changes to the last parts of the article.


**45. [Classification metrics, confusion matrix, accuracy, precision, recall, F1 score, ROC]**

&#10230; [分類評価指標,混同行列,正解率,適合率,再現率,F値,ROC]
Contributor

Suggested change
&#10230; [分類評価指標,混同行列,正解率,適合率,再現率,F値,ROC]
&#10230; [分類評価指標,混同行列,正解率,適合率,再現率,F値,ROC曲線]


**43. Ablative analysis ― Ablative analysis is analyzing the root cause of the difference in performance between the current and the baseline models.**

&#10230; アブレーション分析 - アブレーション分析は,ベースラインとするモデルと現在のモデル間の性能差の主要な要因を分析することです.
Contributor

Suggested change
&#10230; アブレーション分析 - アブレーション分析は,ベースラインとするモデルと現在のモデル間の性能差の主要な要因を分析することです
&#10230; アブレーション分析 - アブレーション分析は,ベースライン・モデルと改良されたモデル間で発生したパフォーマンスの差異の原因を分析することです

Contributor

I think "the current" here should be '現在のモデル'.


**46. [Regression metrics, R squared, Mallow's CP, AIC, BIC]**

&#10230; [回帰評価指標,R二乗,マローズのCp,AIC,BIC]
Contributor

OK


**47. [Model selection, cross-validation, regularization]**

&#10230; [モデル選択,交差検証,正則化]
Contributor

Suggested change
&#10230; [モデル選択,交差検証,正則化]
&#10230; [モデルの選択,交差検証,正則化]


**48. [Diagnostics, Bias/variance tradeoff, error/ablative analysis]**

&#10230; [分析,バイアス・バリアンストレードオフ,エラー・アブレーション分析]
Contributor

Suggested change
&#10230; [分析,バイアス・バリアンストレードオフ,エラー・アブレーション分析]
&#10230; [解析(分析),バイアス・バリアンストレードオフ,エラー・アブレーション分析]

Contributor

I think 診断 or 診断方法 is better. We need to discuss this topic.
https://github.com/shervinea/cheatsheet-translation/pull/99/files#r349865518

Contributor

Yes, I think "診断" might be OK. However, when I search Google for "機械学習" together with "診断", most of the headlines are about medical topics in Japan, whereas "機械学習" with "解析" or "分析" returns headlines about machine learning. My way of looking for the best translation may not be ideal, but what do you think?

@hrkmr-tech (Contributor) Nov 25, 2019

Yes, you're right. "診断" is a bit odd. This section is about troubleshooting when your model doesn't work as expected. But "分析" sounds like "データ分析". How about "問題分析"?

Contributor

Thank you for your nice comment. I agree with "問題分析". It definitely fits the nuance of the original English.

@tianjianjiang Nov 25, 2019

As far as I know, "打診" in Japanese went through a word-sense derivation similar to "diagnostics" in English: from medical usage to common usage.
However, when I see "問題分析", I would assume the English source phrase was "error analysis."


**44. Regression metrics**

&#10230; 回帰評価指標
Contributor

Suggested change
&#10230; 回帰評価指標
&#10230; 回帰分析の精度評価指標


**17. Basic metrics ― Given a regression model f, the following metrics are commonly used to assess the performance of the model:**

&#10230; [基本的な評価指標] 回帰モデルfが与えられたとき,次のようなよう化指標がモデルの性能を評価するために一般的に用いられます.
@hrkmr-tech (Contributor) Nov 21, 2019

Suggested change
&#10230; [基本的な評価指標] 回帰モデルfが与えられたとき,次のようなよう化指標がモデルの性能を評価するために一般的に用いられます
&#10230; [基本的な評価指標] 回帰モデルfが与えられたとき,モデルの性能を評価するために次のような指標が一般的に用いられます


**19. Coefficient of determination ― The coefficient of determination, often noted R2 or r2, provides a measure of how well the observed outcomes are replicated by the model and is defined as follows:**

&#10230; 決定係数 - よくR2やr2と書かれる決定係数は,実際の結果がモデルによってどの程度よく再現されているかを測る評価指標であり,次のように定義される.
Contributor

OK
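
To make the terms in items 18 and 19 concrete, the coefficient of determination can be sketched from the total and residual sums of squares. This is a minimal illustration under the usual definition R² = 1 − SS_res / SS_tot; the function name `r_squared` and the sample values are assumptions made up for this note.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot.

    SS_tot is the total sum of squares around the mean of y_true;
    SS_res is the residual sum of squares of the predictions.
    Assumes y_true is not constant (SS_tot > 0).
    """
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot


# Hypothetical observations and predictions, chosen only for illustration.
print(r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```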


**20. Main metrics ― The following metrics are commonly used to assess the performance of regression models, by taking into account the number of variables n that they take into consideration:**

&#10230; 主要な評価指標 - 次の評価指標は説明変数の数を考慮して回帰モデルの性能を評価するために,一般的に用いられています.
Contributor

OK


**21. where L is the likelihood and ˆσ2 is an estimate of the variance associated with each response.**

&#10230; ここでLは尤度であり,ˆσ2は各応答に対する誤差分散の推定値です.
Contributor

OK


**22. Model selection**

&#10230; モデル選択
Contributor

OK


**23. Vocabulary ― When selecting a model, we distinguish 3 different parts of the data that we have as follows:**

&#10230; 用語 - モデルを選択するときには,次のように,データの種類を異なる3つに区別します.
Contributor

Suggested change
&#10230; 用語 - モデルを選択するときには,次のように,データの種類を異なる3つに区別します
&#10230; 用語 - モデルを選択するときには,次のようにデータの種類を異なる3つに区別します


**31. [Generally k=5 or 10, Case p=1 is called leave-one-out]**

&#10230; [k=5か10が一般的,p=1の場合はLeave-one-out cross validation法と呼ばれる.]
Contributor

Suggested change
&#10230; [k=5か10が一般的,p=1の場合はLeave-one-out cross validation法と呼ばれる.]
&#10230; [一般的にはk=5または10,p=1の場合は一個抜き交差検証と呼ばれます]


**32. The most commonly used method is called k-fold cross-validation and splits the training data into k folds to validate the model on one fold while training the model on the k−1 other folds, all of this k times. The error is then averaged over the k folds and is named cross-validation error.**

&#10230; 最も一般的に用いられている方法はk交差検証法であり,データセットをk群に分け,1群を検証に,残りのk-1群を学習に用います.これをk回繰り返します.求められた検証誤差はk群全てにわたって平均化され,これは交差検証誤差と呼ばれています.
@hrkmr-tech (Contributor) Nov 23, 2019

Suggested change
&#10230; 最も一般的に用いられている方法はk交差検証法であり,データセットをk群に分け,1群を検証に,残りのk-1群を学習に用います.これをk回繰り返します.求められた検証誤差はk群全てにわたって平均化され,これは交差検証誤差と呼ばれています
&#10230; 最も一般的に用いられている方法はk交差検証法です.データセットをk群に分けた後,1群を検証に使用し残りのk-1群を学習に使用するという操作を順番にk回繰り返します.求められた検証誤差はk群すべてにわたって平均化されます.この平均された誤差のことを交差検証誤差と呼びます
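
The procedure described in item 32 can be sketched as follows, which may help when checking that the translation keeps the order of the steps (split into k folds, validate on one fold, train on the other k−1, repeat k times, average). This is a minimal sketch, not a reference implementation: the name `k_fold_cv_error`, the interleaved fold assignment, the squared-error loss, and the mean-predictor example are all assumptions made for illustration.

```python
def k_fold_cv_error(xs, ys, k, fit, predict):
    """Average validation error over k folds (mean squared error per fold).

    fit(train_xs, train_ys) returns a model; predict(model, x) returns a
    prediction. Assumes 2 <= k <= len(xs) so every fold is non-empty.
    """
    n = len(xs)
    # Interleaved split: sample i goes to fold i % k.
    folds = [list(range(start, n, k)) for start in range(k)]

    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(n) if i not in held]
        # Train on the k-1 other folds.
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        # Validate on the held-out fold.
        mse = sum((predict(model, xs[i]) - ys[i]) ** 2 for i in held_out) / len(held_out)
        fold_errors.append(mse)

    # The cross-validation error is the average over the k folds.
    return sum(fold_errors) / k


# Hypothetical model: always predict the mean of the training targets.
def fit_mean(train_xs, train_ys):
    return sum(train_ys) / len(train_ys)


def predict_mean(model, x):
    return model


xs = [0, 1, 2, 3, 4, 5]
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(k_fold_cv_error(xs, ys, k=3, fit=fit_mean, predict=predict_mean))
```

Setting k equal to the number of samples gives the leave-one-out case mentioned in item 31.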


**33. Regularization ― The regularization procedure aims at avoiding the model to overfit the data and thus deals with high variance issues. The following table sums up the different types of commonly used regularization techniques:**

&#10230; 正則化 - 正則化はモデルが過学習するのを避ける目的としており,したがってバリアンスが大きくなる問題に対処します.次の表は一般的に使用されるいくつかの正則化法をまとめたものです.
Contributor

Suggested change
&#10230; 正則化 - 正則化はモデルが過学習するのを避ける目的としており,したがってバリアンスが大きくなる問題に対処します.次の表は一般的に使用されるいくつかの正則化法をまとめたものです.
&#10230; 正則化 - 正則化はモデルの過学習状態を回避することが目的であり,したがってハイバリアンス問題(オーバーフィット問題)に対処できます. 一般的に使用されるいくつかの正則化法を下表にまとめました.


**34. [Shrinks coefficients to 0, Good for variable selection, Makes coefficients smaller, Tradeoff between variable selection and small coefficients]**

&#10230; [係数を0にする,変数選択に適する,係数を小さくする,変数選択と係数を小さくすることのトレードオフ]
Contributor

OK


**35. Diagnostics**

&#10230; 分析
Contributor

I think 診断 or 診断方法 is better, because "diagnostics" is a metaphor that pairs with the symptoms and remedies in the later part.


**36. Bias ― The bias of a model is the difference between the expected prediction and the correct model that we try to predict for given data points.**

&#10230; バイアス - モデルのバイアスとは,予測するあるデータ点における,予測した結果の期待値と正しいモデルによる結果との差です.
Contributor

Suggested change
&#10230; バイアス - モデルのバイアスとは,予測するあるデータ点における,予測した結果の期待値と正しいモデルによる結果との差です
&#10230; バイアス - モデルのバイアスとは,ある標本値群を予測する際の期待値と正しいモデルの結果との差異のことです


**37. Variance ― The variance of a model is the variability of the model prediction for given data points.**

&#10230; バリアンス - モデルのバリアンスとは,予測するあるデータ点における,予測した結果の分散です.
Contributor

Suggested change
&#10230; バリアンス - モデルのバリアンスとは,予測するあるデータ点における,予測した結果の分散です
&#10230; バリアンス - モデルのバリアンスとは,ある標本値群に対するモデルの予測値のばらつきのことです


**38. Bias/variance tradeoff ― The simpler the model, the higher the bias, and the more complex the model, the higher the variance.**

&#10230; バイアス・バリアンストレードオフ - よりシンプルなモデルではバイアスが高くなり,より複雑なモデルはバリアンスが高くなります.
Contributor

OK


**39. [Symptoms, Regression illustration, classification illustration, deep learning illustration, possible remedies]**

&#10230; [症状,回帰モデルでの図,分類モデルでの図,深層学習での図,可能な解決策]
Contributor

OK


**40. [High training error, Training error close to test error, High bias, Training error slightly lower than test error, Very low training error, Training error much lower than test error, High variance]**

&#10230; [高い訓練誤差,訓練誤差がテスト誤差に近い,高いバイアス,訓練誤差がテスト誤差より少しだけ小さい,極端に小さい訓練誤差,訓練誤差がテスト誤差に比べて非常に小さい,高いバリアンス]
Contributor

OK


**41. [Complexify model, Add more features, Train longer, Perform regularization, Get more data]**

&#10230; [より複雑なモデルを試す,特徴量を増やす,より長く学習する,正則化を導入する,データ数を増やす]
Contributor

OK


**42. Error analysis ― Error analysis is analyzing the root cause of the difference in performance between the current and the perfect models.**

&#10230; エラー分析 - エラー分析は完璧なモデルと現在のモデル間の性能差の主要な要因を分析することです.
Contributor

Suggested change
&#10230; エラー分析 - エラー分析は完璧なモデルと現在のモデル間の性能差の主要な要因を分析することです
&#10230; エラー分析 - エラー分析は現在のモデルと完璧なモデル間の性能差の主要な要因を分析することです


@scrambleegg7 (Contributor) left a comment

I mostly agree with Hiroki san's Japanese translation.


@scrambleegg7 (Contributor) left a comment

I think there are no major issues with the translation into Japanese. Thank you, Hiroki san, for the great review.


@yoshiyukinakai (Contributor)

Hello @umu1729, a team from Machine Learning Tokyo completed reviewing your translation and added some suggestions. Could you check and incorporate our suggestions?

Here is how to incorporate suggestions:
https://help.github.com/ja/github/collaborating-with-issues-and-pull-requests/incorporating-feedback-in-your-pull-request

@shervinea (Owner)

Hi @umu1729, would there be anything I could do to help out in finishing the reviewing process?

@yoshiyukinakai (Contributor)

Hi @shervinea, is it possible for you to incorporate suggestions and merge? I think this translation is as good as other completed translations.

@shervinea (Owner)

Thanks @yoshiyukinakai for pinging me on this PR. I went ahead and manually merged in all relevant reviewers' suggestions. Hopefully it should be good now. Thanks again to everyone who helped on this translation, really appreciate your time and work!

@shervinea shervinea merged commit c74c255 into shervinea:master Jun 30, 2020
@shervinea shervinea changed the title [ja] Machine learning tips and tricks [ja] cs-229-machine-learning-tips-and-tricks Oct 6, 2020
Labels
reviewer wanted Looking for a reviewer
7 participants