[ja] cs-229-unsupervised-learning #173
Conversation
@shervinea I'm translating the Unsupervised Learning cheatsheet into Japanese, could you add the in-progress label for me? Many thanks.
Thank you @tuananhhedspibk for working on it! Yes, of course.
Started reviewing this cheatsheet.
**2. Introduction to Unsupervised Learning**
⟶教師なし学習のはじめに
Suggested change:
⟶教師なし学習の概要
**3. Motivation ― The goal of unsupervised learning is to find hidden patterns in unlabeled data {x(1),...,x(m)}.**
⟶モチベーション - 教師なし学習の目的はラベルんなしデータ{x(1),...,x(m)}の中の隠されたパターンを探す。
Suggested change:
⟶モチベーション - 教師なし学習の目的はラベルのないデータ{x(1),...,x(m)}に隠されたパターンを探すことです。
**4. Jensen's inequality ― Let f be a convex function and X a random variable. We have the following inequality:**
⟶ジェンセンの不平等 - fを凸関数とし、Xをランダム変数。次の不平等がある:
Suggested change:
⟶イェンセンの不等式 - fを凸関数、Xを確率変数とすると、次の不等式が成り立ちます:
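For reviewers checking this item against the source, the inequality the cheatsheet refers to is the standard statement below (added here for reference only):

```latex
% Jensen's inequality: for a convex function f and a random variable X
\mathbb{E}[f(X)] \geq f(\mathbb{E}[X])
```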
**6. Expectation-Maximization**
⟶EM
Suggested change:
⟶期待値最大化
**7. Latent variables ― Latent variables are hidden/unobserved variables that make estimation problems difficult, and are often denoted z. Here are the most common settings where there are latent variables:**
⟶潜在変数 - 潜在変数は推定問題を困難にする隠される変数であり、zで示される。潜在変数がある最も一般的な設定はこれ:
Suggested change:
⟶潜在変数 - 潜在変数は推定問題を困難にする隠れた/観測されていない変数であり、多くの場合zで示されます。潜在変数がある最も一般的な設定は次のとおりです:
**9. [Mixture of k Gaussians, Factor analysis]**
⟶[kガウス分布の混合, 因子分析]
Suggested change:
⟶[k個のガウス分布の混合, 因子分析]
**11. E-step: Evaluate the posterior probability Qi(z(i)) that each data point x(i) came from a particular cluster z(i) as follows:**
⟶E-ステップ: 次のように各データポイントx(i)が特定クラスターz(i)に由来する事後確率Qi(z(i))を評価する:
Suggested change:
⟶E-ステップ: 各データポイントx(i)が特定クラスターz(i)に由来する事後確率Qi(z(i))を次のように評価します:
**12. M-step: Use the posterior probabilities Qi(z(i)) as cluster specific weights on data points x(i) to separately re-estimate each cluster model as follows:**
⟶M-ステップ: 次のように各クラスターモデル別途再見積もりのためデータポイントx(i)のクラスター固有の重みとして事後確率Qi(z(i))を使う:
Suggested change:
⟶M-ステップ: 事後確率Qi(z(i))をデータポイントx(i)のクラスター固有の重みとして使い、次のように各クラスターモデルを個別に再推定します:
**13. [Gaussians initialization, Expectation step, Maximization step, Convergence]**
⟶[ガウス分布初期化, 期待ステップ, 最大化ステップ, 収束]
Suggested change:
⟶[ガウス分布初期化, 期待値ステップ, 最大化ステップ, 収束]
**10. Algorithm ― The Expectation-Maximization (EM) algorithm gives an efficient method at estimating the parameter θ through maximum likelihood estimation by repeatedly constructing a lower-bound on the likelihood (E-step) and optimizing that lower bound (M-step) as follows:**
⟶アルゴリズム - EMアルゴリズムは尤度の下限(E-ステップ)を繰り返し構築し、その下限(M-ステップ)次の通りに最適することにより、最尤推定を通じてパラメーターθを推定する効率な方法を共有する:
Suggested change:
⟶アルゴリズム - EMアルゴリズムは次のように尤度の下限の構築(E-ステップ)と、その下限の最適化(M-ステップ)を繰り返し行うことによる最尤推定によりパラメーターθを推定する効率的な方法を提供します:
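As a reference while checking the terminology in items 10-13, here is a minimal NumPy/SciPy sketch of one EM iteration for a mixture of k Gaussians; the function and variable names are my own and do not come from the cheatsheet:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_step(X, phi, mu, sigma):
    """One EM iteration for a mixture of k Gaussians.
    X: (m, n) data, phi: (k,) mixture weights, mu: (k, n) means, sigma: (k, n, n) covariances."""
    m, k = len(X), len(phi)

    # E-step: posterior probability Q[i, j] that data point x(i) came from cluster j
    Q = np.zeros((m, k))
    for j in range(k):
        Q[:, j] = phi[j] * multivariate_normal.pdf(X, mu[j], sigma[j])
    Q /= Q.sum(axis=1, keepdims=True)

    # M-step: re-estimate each cluster model with Q[:, j] as per-point weights
    for j in range(k):
        w = Q[:, j]
        phi[j] = w.mean()
        mu[j] = w @ X / w.sum()
        diff = X - mu[j]
        sigma[j] = (w[:, None] * diff).T @ diff / w.sum()
    return phi, mu, sigma
```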
Reviewed from 14. to 20.
**14. k-means clustering**
⟶k-meansクラスタリング
Suggested change:
⟶k平均法
**15. We note c(i) the cluster of data point i and μj the center of cluster j.**
⟶クラスタのデータポイントiをc(i)、クラスタjのセンターをμjで表示する。
Suggested change:
⟶データポイントiのクラスタをc(i)、クラスタjの中心をμjと表記します。
**16. Algorithm ― After randomly initializing the cluster centroids μ1,μ2,...,μk∈Rn, the k-means algorithm repeats the following step until convergence:**
⟶アルゴリズム - クラスターのセンターポイントμ1,μ2,...,μk∈Rnを偶然初期化後、k-meansアルゴリズムが次のようなステップを収束まで繰り返す:
Suggested change:
⟶アルゴリズム - クラスターの重心μ1,μ2,...,μk∈Rnをランダムに初期化後、k-meansアルゴリズムが収束するまで次のようなステップを繰り返します:
**17. [Means initialization, Cluster assignment, Means update, Convergence]**
⟶ [Means初期化, クラスター割り立て, Means更新, 収束]
Suggested change:
⟶ [平均の初期化, クラスター割り当て, 平均の更新, 収束]
**18. Distortion function ― In order to see if the algorithm converges, we look at the distortion function defined as follows:**
⟶ディストーション関数 - アルゴリズムが収束するかどうかを確認するため、次のように定義されたディストーション関数を参照する:
Suggested change:
⟶ひずみ関数 - アルゴリズムが収束するかどうかを確認するため、次のように定義されたひずみ関数を参照します:
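For reference on items 14-18, a minimal NumPy sketch of the k-means loop and the distortion function it monitors; this is an illustration only, not code from the cheatsheet, and it assumes no cluster becomes empty:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means on X of shape (m, n); returns assignments c and centroids mu."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # random centroid initialization
    for _ in range(n_iter):
        # Cluster assignment: each point goes to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
        c = d.argmin(axis=1)
        # Means update: recompute each centroid (assumes every cluster stays non-empty)
        mu = np.array([X[c == j].mean(axis=0) for j in range(k)])
    return c, mu

def distortion(X, c, mu):
    """J(c, mu) = sum_i ||x(i) - mu_{c(i)}||^2, watched to check convergence."""
    return float(((X - mu[c]) ** 2).sum())
```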
**20. Algorithm ― It is a clustering algorithm with an agglomerative hierarchical approach that build nested clusters in a successive manner.**
⟶アルゴリズム - これは入れ子クラスタを連続で構築する凝集階層アプローチによるクラスタリングアルゴリズムだ。
Suggested change:
⟶アルゴリズム - これは入れ子になったクラスタを逐次的に構築する凝集階層アプローチによるクラスタリングアルゴリズムです。
**30. Principal component analysis**
⟶
Suggested change:
⟶ 主成分分析
ok
**31. It is a dimension reduction technique that finds the variance maximizing directions onto which to project the data.**
⟶
Suggested change:
⟶ これはデータを投影する方向で、分散を最大にする方向を見つける次元削減手法です。
**54. Reviewed by X, Y and Z**
⟶ X, Y, Zによるレビューされた
Suggested change:
⟶ X, Y, Zによるレビュー
**53. Translated by X, Y and Z**
⟶ X, Y, Zによる翻訳された
Suggested change:
⟶X, Y, Zによる翻訳
**57. [Dimension reduction, PCA, ICA]**
⟶
Suggested change:
⟶[次元削減, PCA, ICA]
**55. [Introduction, Motivation, Jensen's inequality]**
⟶
Suggested change:
⟶[導入, 動機, イェンセンの不等式]
**51. The Machine Learning cheatsheets are now available in [target language].**
⟶
Suggested change:
⟶機械学習チートシートは日本語で読めます。
**49. Write the log likelihood given our training data {x(i),i∈[[1,m]]} and by noting g the sigmoid function as:**
⟶
Suggested change:
⟶学習データを{x(i),i∈[[1,m]]}、シグモイド関数をgとし、対数尤度を次のように表します:
**48. Write the probability of x=As=W−1s as:**
⟶
Suggested change:
⟶x=As=W−1sの確率を次のように表します:
**47. Bell and Sejnowski ICA algorithm ― This algorithm finds the unmixing matrix W by following the steps below:**
⟶
Suggested change:
⟶ベルとシノスキーのICAアルゴリズム ― このアルゴリズムは非混合行列Wを次のステップによって見つけます:
Pronunciation of "Sejnowski": https://www.youtube.com/watch?v=CzAHlQheVfs
**46. The goal is to find the unmixing matrix W=A−1.**
⟶
Suggested change:
⟶非混合行列W=A−1を見つけることが目的です。
**45. Assumptions ― We assume that our data x has been generated by the n-dimensional source vector s=(s1,...,sn), where si are independent random variables, via a mixing and non-singular matrix A as follows:**
⟶
Suggested change:
⟶仮定 ― 混合かつ非特異行列Aを通じて、データxはn次元の元となるベクトルs=(s1,...,sn)から次のように生成されると仮定します。ただしsiは独立でランダムな変数です:
**32. Eigenvalue, eigenvector ― Given a matrix A∈Rn×n, λ is said to be an eigenvalue of A if there exists a vector z∈Rn∖{0}, called eigenvector, such that we have:**
⟶
Suggested change:
⟶ 固有値、固有ベクトル - 行列 A∈Rn×nが与えられたとき、次の式で固有ベクトルと呼ばれるベクトルz∈Rn∖{0}が存在した場合に、λはAの固有値と呼ばれる。
Translation for 41-44
**44. It is a technique meant to find the underlying generating sources.**
⟶
Suggested change:
⟶隠れた生成源を見つけることを意図した技術です。
**43. Independent component analysis**
⟶
Suggested change:
⟶独立成分分析
**42. [Data in feature space, Find principal components, Data in principal components space]**
⟶
Suggested change:
⟶[特徴空間内のデータ, 主成分を見つける, 主成分空間内のデータ]
**41. This procedure maximizes the variance among all k-dimensional spaces.**
⟶
Suggested change:
⟶この過程は全てのk次元空間の間の分散を最大化します。
**33. Spectral theorem ― Let A∈Rn×n. If A is symmetric, then A is diagonalizable by a real orthogonal matrix U∈Rn×n. By noting Λ=diag(λ1,...,λn), we have:**
⟶
Suggested change:
⟶ スペクトル定理 - A∈Rn×nとする。Aが対称のとき、Aは実直交行列U∈Rn×nを用いて対角化可能である。Λ=diag(λ1,...,λn)と表記することで、次の式を得る。
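As a quick sanity check on the notation in items 32-33, a small NumPy example (the matrix is made up) showing a symmetric matrix diagonalized by a real orthogonal matrix:

```python
import numpy as np

# A symmetric matrix A is diagonalizable as A = U diag(λ1, ..., λn) U^T
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
eigvals, U = np.linalg.eigh(A)           # eigh handles symmetric (Hermitian) matrices
Lambda = np.diag(eigvals)

assert np.allclose(U @ Lambda @ U.T, A)  # A = U Λ U^T
assert np.allclose(U.T @ U, np.eye(2))   # U is orthogonal
```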
Translated from 21. to 29.
**21. Types ― There are different sorts of hierarchical clustering algorithms that aims at optimizing different objective functions, which is summed up in the table below:**
⟶
Suggested change:
⟶種類 ― 様々な目的関数を最適化するための様々な種類の階層クラスタリングアルゴリズムが以下の表にまとめられています。
ok
**22. [Ward linkage, Average linkage, Complete linkage]**
⟶
Suggested change:
⟶[Ward linkage, Average linkage, Complete linkage]
Keep them as they are in English
Suggested change:
⟶[ウォードリンケージ, 平均リンケージ, 完全リンケージ]
or
⟶[ウォード連結法, 平均連結法, 完全連結法]
**23. [Minimize within cluster distance, Minimize average distance between cluster pairs, Minimize maximum distance of between cluster pairs]**
⟶
Suggested change:
⟶[クラスター内の距離最小化、クラスターペア間の平均距離の最小化、クラスターペア間の最大距離の最小化]
ok
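For anyone cross-checking the linkage terms in items 21-23, a small SciPy sketch of agglomerative hierarchical clustering; the toy data and the choice of cutting the tree into 3 clusters are my own assumptions:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.default_rng(0).normal(size=(20, 2))  # toy data

# Ward, average and complete linkage correspond to the three objectives in item 23
for method in ("ward", "average", "complete"):
    Z = linkage(X, method=method)                    # builds nested clusters successively
    labels = fcluster(Z, t=3, criterion="maxclust")  # cut the hierarchy into 3 flat clusters
```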
**24. Clustering assessment metrics**
⟶
Suggested change:
⟶クラスタリング評価指標
ok
**25. In an unsupervised learning setting, it is often hard to assess the performance of a model since we don't have the ground truth labels as was the case in the supervised learning setting.**
⟶
Suggested change:
⟶教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが難しい場合が多いです。
ok
**26. Silhouette coefficient ― By noting a and b the mean distance between a sample and all other points in the same class, and between a sample and all other points in the next nearest cluster, the silhouette coefficient s for a single sample is defined as follows:**
⟶
Suggested change:
⟶シルエット係数 ― サンプルと同じクラスタ内のその他全ての点との平均距離をa、最も近いクラスタ内の全ての点との平均距離をbと表記すると、サンプルのシルエット係数sは次のように定義されます:
Suggested change:
⟶ シルエット係数 ― ある1つのサンプルと同じクラスタ内のその他全ての点との平均距離をa、そのサンプルに最も近いクラスタ内の全ての点との平均距離をbと表記すると、そのサンプルのシルエット係数sは次のように定義されます:
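To make the a/b notation in item 26 concrete, a minimal NumPy helper for the silhouette coefficient of a single sample (the function name is my own, not from the cheatsheet):

```python
import numpy as np

def silhouette_sample(x, same_cluster, nearest_cluster):
    """s = (b - a) / max(a, b) for one sample x.
    same_cluster: the other points in x's cluster; nearest_cluster: the points in the next nearest cluster."""
    a = np.linalg.norm(same_cluster - x, axis=1).mean()     # mean intra-cluster distance
    b = np.linalg.norm(nearest_cluster - x, axis=1).mean()  # mean distance to the next nearest cluster
    return (b - a) / max(a, b)
```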
**27. Calinski-Harabaz index ― By noting k the number of clusters, Bk and Wk the between and within-clustering dispersion matrices respectively defined as**
⟶
Suggested change:
⟶Calinski-Harabazインデックス ― クラスタの数をkと表記すると、クラスタ間およびクラスタ内の分散行列であるBkおよびWkはそれぞれ以下のように定義されます。
ok
**29. Dimension reduction**
⟶
Suggested change:
⟶次元削減
ok
**28. the Calinski-Harabaz index s(k) indicates how well a clustering model defines its clusters, such that the higher the score, the more dense and well separated the clusters are. It is defined as follows:**
⟶
Suggested change:
⟶Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。スコアが高いほど、各クラスタはより密で、十分に分離されています。 それは次のように定義されます:
Suggested change:
⟶ Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。つまり、スコアが高いほど、各クラスタはより密で、十分に分離されています。 それは次のように定義されます:
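For completeness on items 27-28, a small NumPy sketch of the index s(k) = [Tr(Bk)/Tr(Wk)] × [(N−k)/(k−1)]; this is an assumed helper for illustration, not part of the cheatsheet:

```python
import numpy as np

def calinski_harabaz(X, labels):
    """Higher s(k) means denser, better separated clusters."""
    N, k = len(X), len(np.unique(labels))
    overall_mean = X.mean(axis=0)
    tr_B = tr_W = 0.0
    for j in np.unique(labels):
        Xj = X[labels == j]
        mu_j = Xj.mean(axis=0)
        tr_B += len(Xj) * np.sum((mu_j - overall_mean) ** 2)  # between-cluster dispersion Tr(Bk)
        tr_W += np.sum((Xj - mu_j) ** 2)                      # within-cluster dispersion Tr(Wk)
    return (tr_B / tr_W) * (N - k) / (k - 1)
```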
**34. diagonal**
⟶
Suggested change:
⟶ 対角
**35. Remark: the eigenvector associated with the largest eigenvalue is called principal eigenvector of matrix A.**
⟶
Suggested change:
⟶ 注釈: 最大固有値に対応する固有ベクトルは行列Aの第1固有ベクトルと呼ばれる。
Translation for 36-40
**40. Step 4: Project the data on spanR(u1,...,uk).**
⟶
Suggested change:
⟶ステップ4:データをspanR(u1,...,uk)に射影します。
**39. Step 3: Compute u1,...,uk∈Rn the k orthogonal principal eigenvectors of Σ, i.e. the orthogonal eigenvectors of the k largest eigenvalues.**
⟶
Suggested change:
⟶ステップ3:k個のΣの対角主値固有ベクトルu1,...,uk∈Rn、すなわちk個の最大の固有値の対角固有ベクトルを計算します。
**38. Step 2: Compute Σ=1mm∑i=1x(i)x(i)T∈Rn×n, which is symmetric with real eigenvalues.**
⟶
Suggested change:
⟶ステップ2:実固有値に関して対称であるΣ=1mm∑i=1x(i)x(i)T∈Rn×nを計算します。
**37. Step 1: Normalize the data to have a mean of 0 and standard deviation of 1.**
⟶
Suggested change:
⟶ステップ1:平均が0で標準偏差が1となるようにデータを正規化します。
**36. Algorithm ― The Principal Component Analysis (PCA) procedure is a dimension reduction technique that projects the data on k dimensions by maximizing the variance of the data as follows:**
⟶
Suggested change:
⟶アルゴリズム ― 主成分分析 (PCA)の過程は、次のようにデータの分散を最大化することによりデータをk次元に射影する次元削減の技術である。
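Tying items 36-40 together, a minimal NumPy sketch of the four PCA steps (normalize, compute Σ, take the k principal eigenvectors, project); function and variable names are my own:

```python
import numpy as np

def pca(X, k):
    """Steps 1-4 of the PCA procedure described above. X: (m, n) data, k: target dimension."""
    # Step 1: normalize the data to mean 0 and standard deviation 1
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: Sigma = (1/m) sum_i x(i) x(i)^T, symmetric with real eigenvalues
    Sigma = Z.T @ Z / len(Z)
    # Step 3: the k orthogonal principal eigenvectors (largest eigenvalues)
    eigvals, eigvecs = np.linalg.eigh(Sigma)       # eigenvalues in ascending order
    U = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # keep the top-k eigenvectors
    # Step 4: project the data on span(u1, ..., uk)
    return Z @ U
```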
Reviewed from 30. to 57.
**30. Principal component analysis**
⟶
ok
**31. It is a dimension reduction technique that finds the variance maximizing directions onto which to project the data.**
⟶ これはデータを投影する方向で、分散を最大にする方向を見つける次元削減手法です。
Suggested change:
⟶ これは分散を最大にするデータの射影方向を見つける次元削減手法です。
**32. Eigenvalue, eigenvector ― Given a matrix A∈Rn×n, λ is said to be an eigenvalue of A if there exists a vector z∈Rn∖{0}, called eigenvector, such that we have:**
⟶ 固有値、固有ベクトル - 行列 A∈Rn×nが与えられたとき、次の式で固有ベクトルと呼ばれるベクトルz∈Rn∖{0}が存在した場合に、λはAの固有値と呼ばれる。
Suggested change:
⟶ 固有値、固有ベクトル - 行列 A∈Rn×nが与えられたとき、次の式で固有ベクトルと呼ばれるベクトルz∈Rn∖{0}が存在した場合に、λはAの固有値と呼ばれます。
**33. Spectral theorem ― Let A∈Rn×n. If A is symmetric, then A is diagonalizable by a real orthogonal matrix U∈Rn×n. By noting Λ=diag(λ1,...,λn), we have:**
⟶ スペクトル定理 - A∈Rn×nとする。Aが対称のとき、Aは実直交行列U∈Rn×nを用いて対角化可能である。Λ=diag(λ1,...,λn)と表記することで、次の式を得る。
Suggested change:
⟶ スペクトル定理 - A∈Rn×nとする。Aが対称のとき、Aは実直交行列U∈Rn×nを用いて対角化可能です。Λ=diag(λ1,...,λn)と表記することで、次の式を得ます。
**34. diagonal**
⟶ 対角
Suggested change:
⟶ diagonal
Keep it as it is in English since it appears in a mathematical formula?
**36. Algorithm ― The Principal Component Analysis (PCA) procedure is a dimension reduction technique that projects the data on k dimensions by maximizing the variance of the data as follows:**
⟶ アルゴリズム ― 主成分分析 (PCA)の過程は、次のようにデータの分散を最大化することによりデータをk次元に射影する次元削減の技術である。
Suggested change:
⟶ アルゴリズム ― 主成分分析 (PCA)の過程は、次のようにデータの分散を最大化することによりデータをk次元に射影する次元削減の技術です。
**50. Therefore, the stochastic gradient ascent learning rule is such that for each training example x(i), we update W as follows:**
⟶ そのため、確率的勾配上昇法の学習規則は、学習サンプルx(i)に対して次のようにwを更新するものです:
Suggested change:
⟶ そのため、確率的勾配上昇法の学習規則は、学習サンプルx(i)に対して次のようにWを更新するものです:
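As background for items 45-50, a rough NumPy sketch of the Bell and Sejnowski update W ← W + α([1 − 2g(Wx)]xᵀ + (Wᵀ)⁻¹) with g the sigmoid; this is my paraphrase of the rule for illustration, not code from the cheatsheet:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def bell_sejnowski_ica(X, alpha=0.01, n_epochs=10, seed=0):
    """Estimate the unmixing matrix W for x = As (so s ≈ Wx) by stochastic gradient ascent."""
    m, n = X.shape
    rng = np.random.default_rng(seed)
    W = np.eye(n)
    for _ in range(n_epochs):
        for i in rng.permutation(m):
            x = X[i].reshape(n, 1)
            # Stochastic gradient ascent step on the log likelihood of one training example
            W += alpha * ((1.0 - 2.0 * sigmoid(W @ x)) @ x.T + np.linalg.inv(W.T))
    return W
```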
**53. Translated by X, Y and Z**
⟶ X, Y, Zによる翻訳
Suggested change:
⟶ X・Y・Z 訳
**54. Reviewed by X, Y and Z**
⟶ X, Y, Zによるレビュー
Suggested change:
⟶ X・Y・Z 校正
Reviewed translations 21 - 29.
21: #173 (comment)
22: #173 (comment)
23: #173 (comment)
24: #173 (comment)
25: #173 (comment)
26: #173 (comment)
27: #173 (comment)
28: #173 (comment)
29: #173 (comment)
Just my 5 cents on the translation. Checked 21 - 29 mostly, according to a request in the MLT Slack.
**56. [Clustering, Expectation-Maximization, k-means, Hierarchical clustering, Metrics]**
⟶[クラスタリング, EM, k-means, 階層クラスタリング, 指標]
Expectation-Maximization = 期待値最大化法 (according to https://ja.wikipedia.org/wiki/EM%E3%82%A2%E3%83%AB%E3%82%B4%E3%83%AA%E3%82%BA%E3%83%A0 )
⟶ [Ward linkage, Average linkage, Complete linkage]
Ward = ウォード
Linkage = リンケージ or 連動
suggestion:
[ウォード連動、 平均連動、完全連動]
Thanks for your review, but I think keeping it in katakana リンケージ is easier to understand.
**25. In an unsupervised learning setting, it is often hard to assess the performance of a model since we don't have the ground truth labels as was the case in the supervised learning setting.**
⟶ 教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが難しい場合が多いです。
If formal language: "モデルの性能を評価することが困難な場合が多いです。" could also work
Review accepted!
**26. Silhouette coefficient ― By noting a and b the mean distance between a sample and all other points in the same class, and between a sample and all other points in the next nearest cluster, the silhouette coefficient s for a single sample is defined as follows:**
⟶ シルエット係数 ― サンプルと同じクラスタ内のその他全ての点との平均距離をa、最も近いクラスタ内の全ての点との平均距離をbと表記すると、サンプルのシルエット係数sは次のように定義されます:
最も近いクラスタ内 could maybe be changed to サンプルから最も近いクラスタ内 to make it clear that it's the cluster closest from the sample (and not cluster closest from the cluster class of sample), If I've interpreted the English line correctly.
Review accepted - I think サンプルから最も近いクラスタ内 is a better fit for "the next nearest cluster".
**29. Dimension reduction**
⟶ 次元削減
削減 kind of implies "erasing" or "cutting" alongside reduction.
Another option could be 縮小 (curtailment, "making it smaller") or 減少 (decrease, decrement, diminution).
I think "縮小" means "shrink" so I will keep it as 削減
@tuananhhedspibk @shervinea
Reviewed. All changes are OK.
@@ -126,7 +126,7 @@
**22. [Ward linkage, Average linkage, Complete linkage]**
⟶ [Ward linkage, Average linkage, Complete linkage]
⟶ [ウォードリンケージ, 平均リンケージ, 完全リンケージ]
OK
@@ -144,13 +144,13 @@
**25. In an unsupervised learning setting, it is often hard to assess the performance of a model since we don't have the ground truth labels as was the case in the supervised learning setting.**
⟶ 教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが難しい場合が多いです。
⟶ 教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが困難な場合が多いです。
OK
**26. Silhouette coefficient ― By noting a and b the mean distance between a sample and all other points in the same class, and between a sample and all other points in the next nearest cluster, the silhouette coefficient s for a single sample is defined as follows:**
⟶ シルエット係数 ― サンプルと同じクラスタ内のその他全ての点との平均距離をa、最も近いクラスタ内の全ての点との平均距離をbと表記すると、サンプルのシルエット係数sは次のように定義されます:
⟶ シルエット係数 ― ある1つのサンプルと同じクラス内のその他全ての点との平均距離をa、そのサンプルから最も近いクラスタ内の全ての点との平均距離をbと表記すると、そのサンプルのシルエット係数sは次のように定義されます:
OK
@@ -162,7 +162,7 @@
**28. the Calinski-Harabaz index s(k) indicates how well a clustering model defines its clusters, such that the higher the score, the more dense and well separated the clusters are. It is defined as follows:**
⟶ Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。スコアが高いほど、各クラスタはより密で、十分に分離されています。 それは次のように定義されます:
⟶ Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。つまり、スコアが高いほど、各クラスタはより密で、十分に分離されています。 それは次のように定義されます:
OK
@@ -330,7 +330,7 @@
**56. [Clustering, Expectation-Maximization, k-means, Hierarchical clustering, Metrics]**
⟶[クラスタリング, EM, k-means, 階層クラスタリング, 指標]
⟶[クラスタリング, 期待値最大化法, k-means, 階層クラスタリング, 指標]
OK
@ytknzw Many thanks for your help.
@Harimus @yoshiyukinakai Could you review the content one more time for me? If it's OK, we can request to merge it.
Looks good!
@tuananhhedspibk I also agree to accept the suggestions above. Please go ahead.
Thank you everyone for your thorough work on the translation and the review! Merging the PR right now.