[ja] cs-229-unsupervised-learning #173

tuananhhedspibk · 2019-09-07T03:28:50Z

No description provided.

tuananhhedspibk · 2019-09-07T03:30:47Z

@shervinea I'm translating the Unsupervised Learning cheatsheet into Japanese. Could you add the in progress label for me? Many thanks.

shervinea · 2019-09-07T03:44:29Z

Thank you @tuananhhedspibk for working on it! Yes, of course.

yoshiyukinakai

Started reviewing this cheatsheet.

yoshiyukinakai · 2019-09-09T11:12:10Z

ja/cheatsheet-unsupervised-learning.md

+
+**2. Introduction to Unsupervised Learning**
+
+&#10230;教師なし学習のはじめに


Suggested change

⟶教師なし学習のはじめに

⟶教師なし学習の概要

yoshiyukinakai · 2019-09-09T11:14:57Z

ja/cheatsheet-unsupervised-learning.md

+
+**3. Motivation ― The goal of unsupervised learning is to find hidden patterns in unlabeled data {x(1),...,x(m)}.**
+
+&#10230;モチベーション - 教師なし学習の目的はラベルんなしデータ{x(1),...,x(m)}の中の隠されたパターンを探す。


Suggested change

⟶モチベーション - 教師なし学習の目的はラベルんなしデータ{x(1),...,x(m)}の中の隠されたパターンを探す。

⟶モチベーション - 教師なし学習の目的はラベルのないデータ{x(1),...,x(m)}に隠されたパターンを探すことです。

yoshiyukinakai · 2019-09-09T11:16:50Z

ja/cheatsheet-unsupervised-learning.md

+
+**4. Jensen's inequality ― Let f be a convex function and X a random variable. We have the following inequality:**
+
+&#10230;ジェンセンの不平等 - fを凸関数とし、Xをランダム変数。次の不平等がある:


Suggested change

⟶ジェンセンの不平等 - fを凸関数とし、Xをランダム変数。次の不平等がある:

⟶イェンセンの不等式 - fを凸関数、Xを確率変数とすると、次の不等式が成り立ちます:
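For reference while checking item 4: the string ends with a colon because the cheatsheet follows it with a formula. The standard form of Jensen's inequality for a convex f and a random variable X is written below in LaTeX. The formula is not part of the translated string, so nothing here needs translating.

```latex
E[f(X)] \geq f(E[X])
```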

yoshiyukinakai · 2019-09-09T11:21:04Z

ja/cheatsheet-unsupervised-learning.md

+
+**6. Expectation-Maximization**
+
+&#10230;EM


Suggested change

⟶EM

⟶期待値最大化

yoshiyukinakai · 2019-09-09T11:26:27Z

ja/cheatsheet-unsupervised-learning.md

+
+**7. Latent variables ― Latent variables are hidden/unobserved variables that make estimation problems difficult, and are often denoted z. Here are the most common settings where there are latent variables:**
+
+&#10230;潜在変数 - 潜在変数は推定問題を困難にする隠される変数であり、zで示される。潜在変数がある最も一般的な設定はこれ:


Suggested change

⟶潜在変数 - 潜在変数は推定問題を困難にする隠される変数であり、zで示される。潜在変数がある最も一般的な設定はこれ:

⟶潜在変数 - 潜在変数は推定問題を困難にする隠れた/観測されていない変数であり、多くの場合zで示されます。潜在変数がある最も一般的な設定は次のとおりです:

yoshiyukinakai · 2019-09-09T11:35:23Z

ja/cheatsheet-unsupervised-learning.md

+
+**9. [Mixture of k Gaussians, Factor analysis]**
+
+&#10230;[kガウス分布の混合, 因子分析]


Suggested change

⟶[kガウス分布の混合, 因子分析]

⟶[k個のガウス分布の混合, 因子分析]

yoshiyukinakai · 2019-09-09T11:44:15Z

ja/cheatsheet-unsupervised-learning.md

+
+**11. E-step: Evaluate the posterior probability Qi(z(i)) that each data point x(i) came from a particular cluster z(i) as follows:**
+
+&#10230;E-ステップ: 次のように各データポイントx(i)が特定クラスターz(i)に由来する事後確率Qi(z(i))を評価する:


Suggested change

⟶E-ステップ: 次のように各データポイントx(i)が特定クラスターz(i)に由来する事後確率Qi(z(i))を評価する:

⟶E-ステップ: 各データポイントx(i)が特定クラスターz(i)に由来する事後確率Qi(z(i))を次のように評価します:

yoshiyukinakai · 2019-09-09T11:45:48Z

ja/cheatsheet-unsupervised-learning.md

+
+**12. M-step: Use the posterior probabilities Qi(z(i)) as cluster specific weights on data points x(i) to separately re-estimate each cluster model as follows:**
+
+&#10230;M-ステップ: 次のように各クラスターモデル別途再見積もりのためデータポイントx(i)のクラスター固有の重みとして事後確率Qi(z(i))を使う:


Suggested change

⟶M-ステップ: 次のように各クラスターモデル別途再見積もりのためデータポイントx(i)のクラスター固有の重みとして事後確率Qi(z(i))を使う:

⟶M-ステップ: 事後確率Qi(z(i))をデータポイントx(i)のクラスター固有の重みとして使い、次のように各クラスターモデルを個別に再推定します:
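For reference while checking items 11 and 12: both strings end with "as follows" and refer to formulas that are not shown in this thread. The textbook form of the two EM steps is sketched below in LaTeX. The notation is an assumption chosen to match the symbols Qi(z(i)), x(i) and θ used in the strings; the cheatsheet's own rendering may differ in detail.

```latex
\begin{align*}
\text{E-step:}\quad & Q_i(z^{(i)}) = P(z^{(i)} \mid x^{(i)}; \theta) \\
\text{M-step:}\quad & \theta = \underset{\theta}{\operatorname{argmax}}
  \sum_{i} \int_{z^{(i)}} Q_i(z^{(i)})
  \log\frac{P(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}\, dz^{(i)}
\end{align*}
```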

yoshiyukinakai · 2019-09-09T11:51:52Z

ja/cheatsheet-unsupervised-learning.md

+
+**13. [Gaussians initialization, Expectation step, Maximization step, Convergence]**
+
+&#10230;[ガウス分布初期化, 期待ステップ, 最大化ステップ, 収束]


Suggested change

⟶[ガウス分布初期化, 期待ステップ, 最大化ステップ, 収束]

⟶[ガウス分布初期化, 期待値ステップ, 最大化ステップ, 収束]

onixwr · 2019-09-26T10:41:02Z

ja/cheatsheet-unsupervised-learning.md

+
+**10. Algorithm ― The Expectation-Maximization (EM) algorithm gives an efficient method at estimating the parameter θ through maximum likelihood estimation by repeatedly constructing a lower-bound on the likelihood (E-step) and optimizing that lower bound (M-step) as follows:**
+
+&#10230;アルゴリズム - EMアルゴリズムは尤度の下限(E-ステップ)を繰り返し構築し、その下限(M-ステップ)次の通りに最適することにより、最尤推定を通じてパラメーターθを推定する効率な方法を共有する:


Suggested change

⟶アルゴリズム - EMアルゴリズムは尤度の下限(E-ステップ)を繰り返し構築し、その下限(M-ステップ)次の通りに最適することにより、最尤推定を通じてパラメーターθを推定する効率な方法を共有する:

⟶アルゴリズム - EMアルゴリズムは次のように尤度の下限の構築(E-ステップ)と、その下限の最適化(M-ステップ)を繰り返し行うことによる最尤推定によりパラメーターθを推定する効率的な方法を提供します:

yoshiyukinakai

Reviewed from 14. to 20.

yoshiyukinakai · 2019-09-26T10:15:52Z

ja/cheatsheet-unsupervised-learning.md

+
+**14. k-means clustering**
+
+&#10230;k-meansクラスタリング


Suggested change

⟶k-meansクラスタリング

⟶k平均法

yoshiyukinakai · 2019-09-26T10:23:55Z

ja/cheatsheet-unsupervised-learning.md

+
+**15. We note c(i) the cluster of data point i and μj the center of cluster j.**
+
+&#10230;クラスタのデータポイントiをc(i)、クラスタjのセンターをμjで表示する。


Suggested change

⟶クラスタのデータポイントiをc(i)、クラスタjのセンターをμjで表示する。

⟶データポイントiのクラスタをc(i)、クラスタjの中心をμjと表記します。

yoshiyukinakai · 2019-09-26T10:30:32Z

ja/cheatsheet-unsupervised-learning.md

+
+**16. Algorithm ― After randomly initializing the cluster centroids μ1,μ2,...,μk∈Rn, the k-means algorithm repeats the following step until convergence:**
+
+&#10230;アルゴリズム - クラスターのセンターポイントμ1,μ2,...,μk∈Rnを偶然初期化後、k-meansアルゴリズムが次のようなステップを収束まで繰り返す:


Suggested change

⟶アルゴリズム - クラスターのセンターポイントμ1,μ2,...,μk∈Rnを偶然初期化後、k-meansアルゴリズムが次のようなステップを収束まで繰り返す:

⟶アルゴリズム - クラスターの重心μ1,μ2,...,μk∈Rnをランダムに初期化後、k-meansアルゴリズムが収束するまで次のようなステップを繰り返します:

yoshiyukinakai · 2019-09-26T10:34:56Z

ja/cheatsheet-unsupervised-learning.md

+
+**17. [Means initialization, Cluster assignment, Means update, Convergence]**
+
+&#10230; [Means初期化, クラスター割り立て, Means更新, 収束]


Suggested change

⟶ [Means初期化, クラスター割り立て, Means更新, 収束]

⟶ [平均の初期化, クラスター割り当て,平均の更新, 収束]

yoshiyukinakai · 2019-09-26T10:40:14Z

ja/cheatsheet-unsupervised-learning.md

+
+**18. Distortion function ― In order to see if the algorithm converges, we look at the distortion function defined as follows:**
+
+&#10230;ディストーション関数 - アルゴリズムが収束するかどうかを確認するため、次のように定義されたディストーション関数を参照する:


Suggested change

⟶ディストーション関数 - アルゴリズムが収束するかどうかを確認するため、次のように定義されたディストーション関数を参照する:

⟶ひずみ関数 - アルゴリズムが収束するかどうかを確認するため、次のように定義されたひずみ関数を参照します:
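For context on the terms reviewed in items 16 to 18 (centroid initialization, cluster assignment, means update, convergence, distortion), here is a minimal Python sketch of k-means. It is an illustration under stated assumptions, not code from the cheatsheet: the function names `k_means` and `distortion`, the `seed` and `max_iters` parameters, and the choice to initialize centroids by sampling k data points are all hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iters=100, seed=0):
    """Return (assignments, centroids) for a list of equal-length tuples.

    Assumption: centroids are initialized by sampling k distinct data points.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iters):
        # Cluster assignment: c(i) is the index of the closest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Convergence: stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Means update: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, c in zip(points, assignments) if c == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


def distortion(points, assignments, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[c]) for p, c in zip(points, assignments))
```

The distortion value never increases from one iteration to the next, which is why item 18 describes it as the quantity to watch when checking convergence.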

onixwr · 2019-09-26T11:08:05Z

ja/cheatsheet-unsupervised-learning.md

+
+**20. Algorithm ― It is a clustering algorithm with an agglomerative hierarchical approach that build nested clusters in a successive manner.**
+
+&#10230;アルゴリズム - これは入れ子クラスタを連続で構築する凝集階層アプローチによるクラスタリングアルゴリズムだ。


Suggested change

⟶アルゴリズム - これは入れ子クラスタを連続で構築する凝集階層アプローチによるクラスタリングアルゴリズムだ。

⟶アルゴリズム - これは入れ子になったクラスタを逐次的に構築する凝集階層アプローチによるクラスタリングアルゴリズムです。

onixwr · 2019-09-26T11:10:39Z

ja/cheatsheet-unsupervised-learning.md

+
+**30. Principal component analysis**
+
+&#10230;


Suggested change

⟶

⟶ 主成分分析

onixwr · 2019-09-26T11:19:50Z

ja/cheatsheet-unsupervised-learning.md

+
+**31. It is a dimension reduction technique that finds the variance maximizing directions onto which to project the data.**
+
+&#10230;


Suggested change

⟶

⟶ これはデータを投影する方向で、分散を最大にする方向を見つける次元削減手法です。

ytknzw · 2019-09-26T10:41:39Z

ja/cheatsheet-unsupervised-learning.md

+
+**54. Reviewed by X, Y and Z**
+
+&#10230; X, Y, Zによるレビューされた


Suggested change

⟶ X, Y, Zによるレビューされた

⟶ X, Y, Zによるレビュー

ytknzw · 2019-09-26T10:42:05Z

ja/cheatsheet-unsupervised-learning.md

+
+**53. Translated by X, Y and Z**
+
+&#10230; X, Y, Zによる翻訳された


Suggested change

⟶ X, Y, Zによる翻訳された

⟶X, Y, Zによる翻訳

ytknzw · 2019-09-26T10:42:22Z

ja/cheatsheet-unsupervised-learning.md

+
+**57. [Dimension reduction, PCA, ICA]**
+
+&#10230;


Suggested change

⟶

⟶[次元削減, PCA, ICA]

ytknzw · 2019-09-26T10:42:38Z

ja/cheatsheet-unsupervised-learning.md

+
+**55. [Introduction, Motivation, Jensen's inequality]**
+
+&#10230;


Suggested change

⟶

⟶[導入, 動機, イェンセンの不等式]

ytknzw · 2019-09-26T10:46:48Z

ja/cheatsheet-unsupervised-learning.md

+
+**51. The Machine Learning cheatsheets are now available in [target language].**
+
+&#10230;


Suggested change

⟶

⟶機械学習チートシートは日本語で読めます。

ytknzw · 2019-09-26T10:53:11Z

ja/cheatsheet-unsupervised-learning.md

+
+**49. Write the log likelihood given our training data {x(i),i∈[[1,m]]} and by noting g the sigmoid function as:**
+
+&#10230;


Suggested change

⟶

⟶学習データを{x(i),i∈[[1,m]]}、シグモイド関数をgとし、対数尤度を次のように表します：

ytknzw · 2019-09-26T10:54:01Z

ja/cheatsheet-unsupervised-learning.md

+
+**48. Write the probability of x=As=W−1s as:**
+
+&#10230;


Suggested change

⟶

⟶x=As=W−1sの確率を次のように表します：

ytknzw · 2019-09-26T11:12:36Z

ja/cheatsheet-unsupervised-learning.md

+
+**47. Bell and Sejnowski ICA algorithm ― This algorithm finds the unmixing matrix W by following the steps below:**
+
+&#10230;


Suggested change

⟶

⟶ベルとシノスキーのICAアルゴリズム ― このアルゴリズムは非混合行列Wを次のステップによって見つけます：

Pronunciation of "Sejnowski"
https://www.youtube.com/watch?v=CzAHlQheVfs

ytknzw · 2019-09-26T11:13:46Z

ja/cheatsheet-unsupervised-learning.md

+
+**46. The goal is to find the unmixing matrix W=A−1.**
+
+&#10230;


Suggested change

⟶

⟶非混合行列W=A−1を見つけることが目的です。

ytknzw · 2019-09-26T11:23:44Z

ja/cheatsheet-unsupervised-learning.md

+
+**45. Assumptions ― We assume that our data x has been generated by the n-dimensional source vector s=(s1,...,sn), where si are independent random variables, via a mixing and non-singular matrix A as follows:**
+
+&#10230;


Suggested change

⟶

⟶仮定 ― 混合かつ非特異行列Aを通じて、データxはn次元の元となるベクトルs=(s1,...,sn)から次のように生成されると仮定します。ただしsiは独立でランダムな変数です：

onixwr · 2019-09-26T11:34:50Z

ja/cheatsheet-unsupervised-learning.md

+
+**32. Eigenvalue, eigenvector ― Given a matrix A∈Rn×n, λ is said to be an eigenvalue of A if there exists a vector z∈Rn∖{0}, called eigenvector, such that we have:**
+
+&#10230;


Suggested change

⟶

⟶ 固有値、固有ベクトル - 行列 A∈Rn×nが与えられたとき、次の式で固有ベクトルと呼ばれるベクトルz∈Rn∖{0}が存在した場合に、λはAの固有値と呼ばれる。

ytknzw

Translation for 41-44

ytknzw · 2019-09-26T11:33:05Z

ja/cheatsheet-unsupervised-learning.md

+
+**44. It is a technique meant to find the underlying generating sources.**
+
+&#10230;


Suggested change

⟶

⟶隠れた生成源を見つけることを意図した技術です。

ytknzw · 2019-09-26T11:33:36Z

ja/cheatsheet-unsupervised-learning.md

+
+**43. Independent component analysis**
+
+&#10230;


Suggested change

⟶

⟶独立成分分析

ytknzw · 2019-09-26T11:34:27Z

ja/cheatsheet-unsupervised-learning.md

+
+**42. [Data in feature space, Find principal components, Data in principal components space]**
+
+&#10230;


Suggested change

⟶

⟶[特徴空間内のデータ, 主成分を見つける, 主成分空間内のデータ]

ytknzw · 2019-09-26T11:35:16Z

ja/cheatsheet-unsupervised-learning.md

+
+**41. This procedure maximizes the variance among all k-dimensional spaces.**
+
+&#10230;


Suggested change

⟶

⟶この過程は全てのk次元空間の間の分散を最大化します。

onixwr · 2019-09-26T11:41:23Z

ja/cheatsheet-unsupervised-learning.md

+
+**33. Spectral theorem ― Let A∈Rn×n. If A is symmetric, then A is diagonalizable by a real orthogonal matrix U∈Rn×n. By noting Λ=diag(λ1,...,λn), we have:**
+
+&#10230;


Suggested change

⟶

⟶ スペクトル定理 - A∈Rn×nとする。Aが対称のとき、Aは実直交行列U∈Rn×nを用いて対角化可能である。Λ=diag(λ1,...,λn)と表記することで、次の式を得る。

yoshiyukinakai

Translated from 21. to 29.

yoshiyukinakai · 2019-09-26T10:53:40Z

ja/cheatsheet-unsupervised-learning.md

+
+**21. Types ― There are different sorts of hierarchical clustering algorithms that aims at optimizing different objective functions, which is summed up in the table below:**
+
+&#10230;


Suggested change

⟶

⟶種類 ― 様々な目的関数を最適化するための様々な種類の階層クラスタリングアルゴリズムが以下の表にまとめられています。

yoshiyukinakai · 2019-09-26T10:58:16Z

ja/cheatsheet-unsupervised-learning.md

+
+**22. [Ward linkage, Average linkage, Complete linkage]**
+
+&#10230;


Suggested change

⟶

⟶[Ward linkage, Average linkage, Complete linkage]

Keep them as they are in English

Suggested change

⟶

⟶[ウォードリンケージ, 平均リンケージ, 完全リンケージ]

or

Suggested change

⟶

⟶[ウォード連結法, 平均連結法, 完全連結法]

yoshiyukinakai · 2019-09-26T11:00:25Z

ja/cheatsheet-unsupervised-learning.md

+
+**23. [Minimize within cluster distance, Minimize average distance between cluster pairs, Minimize maximum distance of between cluster pairs]**
+
+&#10230;


Suggested change

⟶

⟶[クラスター内の距離最小化、クラスターペア間の平均距離の最小化、クラスターペア間の最大距離の最小化]

yoshiyukinakai · 2019-09-26T11:00:57Z

ja/cheatsheet-unsupervised-learning.md

+
+**24. Clustering assessment metrics**
+
+&#10230;


Suggested change

⟶

⟶クラスタリング評価指標

yoshiyukinakai · 2019-09-26T11:11:12Z

ja/cheatsheet-unsupervised-learning.md

+
+**25. In an unsupervised learning setting, it is often hard to assess the performance of a model since we don't have the ground truth labels as was the case in the supervised learning setting.**
+
+&#10230;


Suggested change

⟶

⟶教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが難しい場合が多いです。

yoshiyukinakai · 2019-09-26T11:21:35Z

ja/cheatsheet-unsupervised-learning.md

+
+**26. Silhouette coefficient ― By noting a and b the mean distance between a sample and all other points in the same class, and between a sample and all other points in the next nearest cluster, the silhouette coefficient s for a single sample is defined as follows:**
+
+&#10230;


Suggested change

⟶

⟶シルエット係数 ― サンプルと同じクラスタ内のその他全ての点との平均距離をa、最も近いクラスタ内の全ての点との平均距離をbと表記すると、サンプルのシルエット係数sは次のように定義されます:

Suggested change

⟶

⟶ シルエット係数 ― ある1つのサンプルと同じクラスタ内のその他全ての点との平均距離をa、そのサンプルに最も近いクラスタ内の全ての点との平均距離をbと表記すると、そのサンプルのシルエット係数sは次のように定義されます:
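To make the roles of a and b in item 26 concrete, here is a small Python sketch of the silhouette coefficient for one sample, using the usual definition s = (b - a) / max(a, b). The function and parameter names are hypothetical, and Euclidean distance is an assumption; the source string does not fix a distance measure.

```python
import math


def mean_distance(sample, others):
    """Mean Euclidean distance from `sample` to every point in `others`."""
    return sum(math.dist(sample, o) for o in others) / len(others)


def silhouette(sample, same_cluster_others, nearest_cluster_points):
    """Silhouette coefficient of a single sample.

    a: mean distance to the other points of the sample's own cluster.
    b: mean distance to the points of the next nearest cluster.
    Assumes both lists are non-empty and that a and b are not both zero.
    """
    a = mean_distance(sample, same_cluster_others)
    b = mean_distance(sample, nearest_cluster_points)
    return (b - a) / max(a, b)
```

Both distances are measured from the same sample, which is the point the longer suggestion makes explicit with そのサンプル.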

yoshiyukinakai · 2019-09-26T11:36:03Z

ja/cheatsheet-unsupervised-learning.md

+
+**27. Calinski-Harabaz index ― By noting k the number of clusters, Bk and Wk the between and within-clustering dispersion matrices respectively defined as**
+
+&#10230;


Suggested change

⟶

⟶Calinski-Harabazインデックス ― クラスタの数をkと表記すると、クラスタ間およびクラスタ内の分散行列であるBkおよびWkはそれぞれ以下のように定義されます。

yoshiyukinakai · 2019-09-26T11:37:13Z

ja/cheatsheet-unsupervised-learning.md

+
+**29. Dimension reduction**
+
+&#10230;


Suggested change

⟶

⟶次元削減

yoshiyukinakai · 2019-09-26T11:39:14Z

ja/cheatsheet-unsupervised-learning.md

+
+**28. the Calinski-Harabaz index s(k) indicates how well a clustering model defines its clusters, such that the higher the score, the more dense and well separated the clusters are. It is defined as follows:**
+
+&#10230;


Suggested change

⟶

⟶Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。スコアが高いほど、各クラスタはより密で、十分に分離されています。それは次のように定義されます:

Suggested change

⟶

⟶ Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。つまり、スコアが高いほど、各クラスタはより密で、十分に分離されています。それは次のように定義されます:

onixwr · 2019-09-26T11:44:21Z

ja/cheatsheet-unsupervised-learning.md

+
+**34. diagonal**
+
+&#10230;


Suggested change

⟶

⟶ 対角

onixwr · 2019-09-26T11:53:07Z

ja/cheatsheet-unsupervised-learning.md

+
+**35. Remark: the eigenvector associated with the largest eigenvalue is called principal eigenvector of matrix A.**
+
+&#10230;


Suggested change

⟶

⟶ 注釈: 最大固有値に対応する固有ベクトルは行列Aの第1固有ベクトルと呼ばれる。

ytknzw

Translation for 36-40

ytknzw · 2019-09-26T11:37:20Z

ja/cheatsheet-unsupervised-learning.md

+
+**40. Step 4: Project the data on spanR(u1,...,uk).**
+
+&#10230;


Suggested change

⟶

⟶ステップ4：データをspanR(u1,...,uk)に射影します。

ytknzw · 2019-09-26T11:47:18Z

ja/cheatsheet-unsupervised-learning.md

+
+**39. Step 3: Compute u1,...,uk∈Rn the k orthogonal principal eigenvectors of Σ, i.e. the orthogonal eigenvectors of the k largest eigenvalues.**
+
+&#10230;


Suggested change

⟶

⟶ステップ3：k個のΣの対角主値固有ベクトルu1,...,uk∈Rn、すなわちk個の最大の固有値の対角固有ベクトルを計算します。

ytknzw · 2019-09-26T11:49:38Z

ja/cheatsheet-unsupervised-learning.md

+
+**38. Step 2: Compute Σ=1mm∑i=1x(i)x(i)T∈Rn×n, which is symmetric with real eigenvalues.**
+
+&#10230;


Suggested change

⟶

⟶ステップ2：実固有値に関して対称であるΣ=1mm∑i=1x(i)x(i)T∈Rn×nを計算します。

ytknzw · 2019-09-26T11:50:37Z

ja/cheatsheet-unsupervised-learning.md

+
+**37. Step 1: Normalize the data to have a mean of 0 and standard deviation of 1.**
+
+&#10230;


Suggested change

⟶

⟶ステップ1：平均が0で標準偏差が1となるようにデータを正規化します。

ytknzw · 2019-09-26T11:52:59Z

ja/cheatsheet-unsupervised-learning.md

+**36. Algorithm ― The Principal Component Analysis (PCA) procedure is a dimension reduction technique that projects the data on k
+dimensions by maximizing the variance of the data as follows:**
+
+&#10230;


Suggested change

⟶

⟶アルゴリズム ― 主成分分析 (PCA)の過程は、次のようにデータの分散を最大化することによりデータをk次元に射影する次元削減の技術である。
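For reference, items 36 to 40 above read more easily in numerical order, and the expression in item 38 appears flattened in the source string. A LaTeX sketch of the four steps follows. The symbols μ_j and σ_j for the per-feature mean and standard deviation are assumptions introduced here; the remaining symbols come from the source strings.

```latex
\begin{align*}
\text{Step 1:}\quad & x_j^{(i)} \leftarrow \frac{x_j^{(i)} - \mu_j}{\sigma_j},
  \qquad \mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)},
  \qquad \sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\bigl(x_j^{(i)} - \mu_j\bigr)^2 \\
\text{Step 2:}\quad & \Sigma = \frac{1}{m}\sum_{i=1}^{m} x^{(i)} {x^{(i)}}^{T}
  \in \mathbb{R}^{n \times n} \\
\text{Step 3:}\quad & \Sigma u_j = \lambda_j u_j,\quad j = 1,\dots,k,
  \qquad \lambda_1 \geq \dots \geq \lambda_k \text{ the } k \text{ largest eigenvalues} \\
\text{Step 4:}\quad & \text{project each } x^{(i)} \text{ onto }
  \operatorname{span}_{\mathbb{R}}(u_1,\dots,u_k)
\end{align*}
```

In Step 2 the matrix Σ is symmetric and therefore has real eigenvalues, and in Step 3 the eigenvectors u_1,...,u_k are orthogonal to one another.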

yoshiyukinakai

Reviewed from 30. to 57.

yoshiyukinakai · 2019-09-26T11:49:03Z

ja/cheatsheet-unsupervised-learning.md

+
+**30. Principal component analysis**
+
+&#10230;


yoshiyukinakai · 2019-09-29T10:35:01Z

ja/cheatsheet-unsupervised-learning.md

+
+**31. It is a dimension reduction technique that finds the variance maximizing directions onto which to project the data.**
+
+&#10230; これはデータを投影する方向で、分散を最大にする方向を見つける次元削減手法です。


Suggested change

⟶ これはデータを投影する方向で、分散を最大にする方向を見つける次元削減手法です。

⟶ これは分散を最大にするデータの射影方向を見つける次元削減手法です。

yoshiyukinakai · 2019-09-29T10:36:39Z

ja/cheatsheet-unsupervised-learning.md

+
+**32. Eigenvalue, eigenvector ― Given a matrix A∈Rn×n, λ is said to be an eigenvalue of A if there exists a vector z∈Rn∖{0}, called eigenvector, such that we have:**
+
+&#10230; 固有値、固有ベクトル - 行列 A∈Rn×nが与えられたとき、次の式で固有ベクトルと呼ばれるベクトルz∈Rn∖{0}が存在した場合に、λはAの固有値と呼ばれる。


Suggested change

⟶ 固有値、固有ベクトル - 行列 A∈Rn×nが与えられたとき、次の式で固有ベクトルと呼ばれるベクトルz∈Rn∖{0}が存在した場合に、λはAの固有値と呼ばれる。

⟶ 固有値、固有ベクトル - 行列 A∈Rn×nが与えられたとき、次の式で固有ベクトルと呼ばれるベクトルz∈Rn∖{0}が存在した場合に、λはAの固有値と呼ばれます。

yoshiyukinakai · 2019-09-29T10:38:16Z

ja/cheatsheet-unsupervised-learning.md

+
+**33. Spectral theorem ― Let A∈Rn×n. If A is symmetric, then A is diagonalizable by a real orthogonal matrix U∈Rn×n. By noting Λ=diag(λ1,...,λn), we have:**
+
+&#10230; スペクトル定理 - A∈Rn×nとする。Aが対称のとき、Aは実直交行列U∈Rn×nを用いて対角化可能である。Λ=diag(λ1,...,λn)と表記することで、次の式を得る。


Suggested change

⟶ スペクトル定理 - A∈Rn×nとする。Aが対称のとき、Aは実直交行列U∈Rn×nを用いて対角化可能である。Λ=diag(λ1,...,λn)と表記することで、次の式を得る。

⟶ スペクトル定理 - A∈Rn×nとする。Aが対称のとき、Aは実直交行列U∈Rn×nを用いて対角化可能です。Λ=diag(λ1,...,λn)と表記することで、次の式を得ます。

yoshiyukinakai · 2019-09-29T10:48:57Z

ja/cheatsheet-unsupervised-learning.md

+
+**34. diagonal**
+
+&#10230; 対角


Suggested change

⟶ 対角

⟶ diagonal

Keep it as it is in English since it appears in a mathematical formula?

yoshiyukinakai · 2019-09-29T10:52:06Z

ja/cheatsheet-unsupervised-learning.md

+
+**36. Algorithm ― The Principal Component Analysis (PCA) procedure is a dimension reduction technique that projects the data on k dimensions by maximizing the variance of the data as follows:**
+
+&#10230; アルゴリズム ― 主成分分析 (PCA)の過程は、次のようにデータの分散を最大化することによりデータをk次元に射影する次元削減の技術である。


Suggested change

⟶ アルゴリズム ― 主成分分析 (PCA)の過程は、次のようにデータの分散を最大化することによりデータをk次元に射影する次元削減の技術である。

⟶ アルゴリズム ― 主成分分析 (PCA)の過程は、次のようにデータの分散を最大化することによりデータをk次元に射影する次元削減の技術です。

yoshiyukinakai · 2019-09-29T11:12:39Z

ja/cheatsheet-unsupervised-learning.md

+
+**50. Therefore, the stochastic gradient ascent learning rule is such that for each training example x(i), we update W as follows:**
+
+&#10230; そのため、確率的勾配上昇法の学習規則は、学習サンプルx(i)に対して次のようにwを更新するものです：


Suggested change

⟶ そのため、確率的勾配上昇法の学習規則は、学習サンプルx(i)に対して次のようにwを更新するものです：

⟶ そのため、確率的勾配上昇法の学習規則は、学習サンプルx(i)に対して次のようにWを更新するものです：

yoshiyukinakai · 2019-09-29T11:13:50Z

ja/cheatsheet-unsupervised-learning.md

+
+**53. Translated by X, Y and Z**
+
+&#10230; X, Y, Zによる翻訳


Suggested change

⟶ X, Y, Zによる翻訳

⟶ X・Y・Z 訳

yoshiyukinakai · 2019-09-29T11:14:11Z

ja/cheatsheet-unsupervised-learning.md

+
+**54. Reviewed by X, Y and Z**
+
+&#10230; X, Y, Zによるレビュー


Suggested change

⟶ X, Y, Zによるレビュー

⟶ X・Y・Z 校正

ytknzw

Reviewed translations 21 - 29.

21: #173 (comment)
22: #173 (comment)
23: #173 (comment)
24: #173 (comment)
25: #173 (comment)
26: #173 (comment)
27: #173 (comment)
28: #173 (comment)
29: #173 (comment)

Harimus

Just my five cents on the translation. I mostly checked 21 - 29, following a request in the MLT Slack.

Harimus · 2019-10-03T02:47:18Z

ja/cheatsheet-unsupervised-learning.md

+**56. [Clustering, Expectation-Maximization, k-means, Hierarchical clustering, Metrics]**
+
+&#10230;[クラスタリング, EM, k-means, 階層クラスタリング, 指標]


Expectation-Maximization = 期待値最大化法 (according to https://ja.wikipedia.org/wiki/EM%E3%82%A2%E3%83%AB%E3%82%B4%E3%83%AA%E3%82%BA%E3%83%A0 )

Harimus · 2019-10-03T02:58:57Z

ja/cheatsheet-unsupervised-learning.md

+&#10230; [Ward linkage, Average linkage, Complete linkage]
+
+<br>


Ward = ウォード
Linkage = リンケージ　or 連動

suggestion:
[ウォード連動、　平均連動、完全連動]

Thanks for your review, but I think keeping it in katakana (リンケージ) is easier to understand.

Harimus · 2019-10-03T03:05:05Z

ja/cheatsheet-unsupervised-learning.md

+
+**25. In an unsupervised learning setting, it is often hard to assess the performance of a model since we don't have the ground truth labels as was the case in the supervised learning setting.**
+
+&#10230; 教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが難しい場合が多いです。


For more formal language, "モデルの性能を評価することが困難な場合が多いです。" could also work.

Review accepted!

Harimus · 2019-10-03T03:09:47Z

ja/cheatsheet-unsupervised-learning.md

+
+**26. Silhouette coefficient ― By noting a and b the mean distance between a sample and all other points in the same class, and between a sample and all other points in the next nearest cluster, the silhouette coefficient s for a single sample is defined as follows:**
+
+&#10230; シルエット係数 ― サンプルと同じクラスタ内のその他全ての点との平均距離をa、最も近いクラスタ内の全ての点との平均距離をbと表記すると、サンプルのシルエット係数sは次のように定義されます:


最も近いクラスタ内 could be changed to サンプルから最も近いクラスタ内 to make it clear that it is the cluster closest to the sample (and not the cluster closest to the sample's own cluster), if I've interpreted the English line correctly.

Review accepted. I think サンプルから最も近いクラスタ内 is a better match for "the next nearest cluster".

Harimus · 2019-10-03T03:16:09Z

ja/cheatsheet-unsupervised-learning.md

+
+**29. Dimension reduction**
+
+&#10230; 次元削減


削減 kind of implies "erasing" or "cutting" along with reduction.
Another option could be 縮小 (curtailment, "making it smaller") or 減少 (decrease, decrement, diminution).

I think "縮小" means "shrink", so I will keep it as 削減.

ytknzw

@tuananhhedspibk @shervinea
Reviewed. All changes are OK.

ytknzw · 2019-10-28T13:05:59Z

ja/cheatsheet-unsupervised-learning.md

@@ -126,7 +126,7 @@

 **22. [Ward linkage, Average linkage, Complete linkage]**

-&#10230; [Ward linkage, Average linkage, Complete linkage]
+&#10230; [ウォードリンケージ, 平均リンケージ, 完全リンケージ]


ytknzw · 2019-10-28T13:06:19Z

ja/cheatsheet-unsupervised-learning.md

@@ -144,13 +144,13 @@

 **25. In an unsupervised learning setting, it is often hard to assess the performance of a model since we don't have the ground truth labels as was the case in the supervised learning setting.**

-&#10230; 教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが難しい場合が多いです。
+&#10230; 教師なし学習では、教師あり学習の場合のような正解ラベルがないため、モデルの性能を評価することが困難な場合が多いです。


ytknzw · 2019-10-28T13:06:46Z

ja/cheatsheet-unsupervised-learning.md


 <br>

 **26. Silhouette coefficient ― By noting a and b the mean distance between a sample and all other points in the same class, and between a sample and all other points in the next nearest cluster, the silhouette coefficient s for a single sample is defined as follows:**

-&#10230; シルエット係数 ― サンプルと同じクラスタ内のその他全ての点との平均距離をa、最も近いクラスタ内の全ての点との平均距離をbと表記すると、サンプルのシルエット係数sは次のように定義されます:
+&#10230; シルエット係数 ― ある1つのサンプルと同じクラス内のその他全ての点との平均距離をa、そのサンプルから最も近いクラスタ内の全ての点との平均距離をbと表記すると、そのサンプルのシルエット係数sは次のように定義されます:



ytknzw · 2019-10-28T13:07:10Z

ja/cheatsheet-unsupervised-learning.md

@@ -162,7 +162,7 @@

 **28. the Calinski-Harabaz index s(k) indicates how well a clustering model defines its clusters, such that the higher the score, the more dense and well separated the clusters are. It is defined as follows:**

-&#10230; Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。スコアが高いほど、各クラスタはより密で、十分に分離されています。 それは次のように定義されます:
+&#10230; Calinski-Harabazインデックスs(k)はクラスタリングモデルが各クラスタをどの程度適切に定義しているかを示します。つまり、スコアが高いほど、各クラスタはより密で、十分に分離されています。 それは次のように定義されます:


ytknzw · 2019-10-28T13:07:22Z

ja/cheatsheet-unsupervised-learning.md

@@ -330,7 +330,7 @@

 **56. [Clustering, Expectation-Maximization, k-means, Hierarchical clustering, Metrics]**

-&#10230;[クラスタリング, EM, k-means, 階層クラスタリング, 指標]
+&#10230;[クラスタリング, 期待値最大化法, k-means, 階層クラスタリング, 指標]



tuananhhedspibk · 2019-10-28T13:23:32Z

@ytknzw Many thanks for your help.

tuananhhedspibk · 2019-10-28T13:24:27Z

@Harimus @yoshiyukinakai Could you guys review its content one more time for me? If it's OK, we can request to merge it.

Harimus · 2019-10-29T00:44:32Z

Looks good!

yoshiyukinakai · 2019-10-29T00:52:52Z

@tuananhhedspibk I also agree to accept the suggestions above. Please go ahead.

shervinea · 2019-10-29T05:40:08Z

Thank you everyone for your thorough work in the translation and the review! Merging the PR right now.

[ja] Cheatsheet Unsupervised learning

7b93432

shervinea added the in progress label Sep 7, 2019

shervinea changed the title ~~[ja] Cheatsheet Unsupervised learning~~ [ja] Unsupervised learning Sep 7, 2019

yoshiyukinakai reviewed Sep 9, 2019

onixwr reviewed Sep 26, 2019

yoshiyukinakai reviewed Sep 26, 2019

onixwr reviewed Sep 26, 2019

ytknzw reviewed Sep 26, 2019

onixwr reviewed Sep 26, 2019

ytknzw reviewed Sep 26, 2019

onixwr reviewed Sep 26, 2019

yoshiyukinakai reviewed Sep 26, 2019

onixwr reviewed Sep 26, 2019

ytknzw reviewed Sep 26, 2019

[ja] Cheatsheet Unsupervised learning

b61342f

yoshiyukinakai reviewed Sep 29, 2019

[ja] Cheatsheet Unsupervised learning

a6411bc

ytknzw reviewed Oct 3, 2019

Harimus reviewed Oct 3, 2019

[ja] Cheatsheet Unsupervised learning

d3eda78

shervinea added the reviewer wanted label and removed the in progress label Oct 6, 2019

ytknzw reviewed Oct 28, 2019

shervinea merged commit 85de599 into shervinea:master Oct 29, 2019

shervinea changed the title ~~[ja] Unsupervised learning~~ [ja] cs-229-unsupervised-learning Oct 6, 2020

