ALS is a way of solving an matrix equivalent of an regular linear least 
squares problem. In a regular least sqaures problem we solve for Ax=b, 
i.e., ||b - Ax|| In the case of ALS we try to solve ||R - UM || in 
frobenius norm. Now we try to reduce ||R - UM||, to the kind ||b - Ax||.
 
Example:
Let us consider the first row of R, which will have all the ratings given by the
fist user. Let us assume the user has seen movies 1, 213, 315, 1744, 2133, 7044 and
12344, so in a total of 7 movies. Forming a vector of the known ratings for the first
user we have [1 213 315 1744 2133 7044 12344] , let this be denoted by b.
We initialize the M matrix in a particular way, and it has n_m(total number of movies) rows and n_f(reduced factors) columns. The matrix U has n_u(total number of users) rows and n_f columns. For the case of the first user we can form a sub-matrix of M , denote this by M_u . This matrix M_u will have all the rows from M corresponding to the movies seen by the user u, the first user in this case. Let u be the row vector from U corresponding to the considered user, .i.e, first user. Now we have three components b,M_u and u, so we have a regular least squares problem for ||b - Mu||. This is for the first user, we do this for all the users, hence forming the U matrix. since this cant be the final U we need to alternatively continue to solve for U and M.
You could try to just do an SVD on R, and initialize M=S*V', and use this M in ALS, in this case you might not need many iterations as we dont initialize with random values in M. Any approximation method involving random initial values needs a certain minimum number of iterations to tend towards the right values, as we in each iteration minimize the error to a certain extent only. We cant reduce the whole error in just one iteration.
Example:
Let us consider the first row of R, which will have all the ratings given by the
fist user. Let us assume the user has seen movies 1, 213, 315, 1744, 2133, 7044 and
12344, so in a total of 7 movies. Forming a vector of the known ratings for the first
user we have [1 213 315 1744 2133 7044 12344] , let this be denoted by b.
We initialize the M matrix in a particular way, and it has n_m(total number of movies) rows and n_f(reduced factors) columns. The matrix U has n_u(total number of users) rows and n_f columns. For the case of the first user we can form a sub-matrix of M , denote this by M_u . This matrix M_u will have all the rows from M corresponding to the movies seen by the user u, the first user in this case. Let u be the row vector from U corresponding to the considered user, .i.e, first user. Now we have three components b,M_u and u, so we have a regular least squares problem for ||b - Mu||. This is for the first user, we do this for all the users, hence forming the U matrix. since this cant be the final U we need to alternatively continue to solve for U and M.
You could try to just do an SVD on R, and initialize M=S*V', and use this M in ALS, in this case you might not need many iterations as we dont initialize with random values in M. Any approximation method involving random initial values needs a certain minimum number of iterations to tend towards the right values, as we in each iteration minimize the error to a certain extent only. We cant reduce the whole error in just one iteration.
 
No comments:
Post a Comment